Tutorial on “Orange: An Open Source Data Mining Package”

                                               Prepared By:

                           Mr. KISHOJ BAJRACHARYA (ID No: 111224)
                                       kishoj@gmail.com


                  Department of Computer Science and Information Management
                             School of Engineering and Technology
                                 Asian Institute of Technology

                                             October 12, 2011



Contents

1 Orange
   1.1   Introduction
   1.2   Features of Orange
   1.3   Installing Orange-Canvas
         1.3.1   Installing on Windows
         1.3.2   Installing on Ubuntu
   1.4   Python Scripting

2 Python Scripting Code Examples
   2.1   Using Python Code
   2.2   Support, Confidence and Lift for Association Rule
   2.3   Naive Bayes Classifier
   2.4   Regression
   2.5   K-Means Clustering Algorithm

3 References




1 Orange

1.1   Introduction

Orange is a collection of Python-based modules that sit on top of a core library of C++ objects and routines,
which handles the machine learning and data mining algorithms. It is an open source data mining package built
on Python, wrapped C, C++ and Qt.

Orange widgets provide a graphical user interface to Orange's data mining and machine learning methods.
They include widgets for data entry and pre-processing, data visualization, classification, regression,
association rules and clustering, a set of widgets for model evaluation and visualization of evaluation
results, and widgets for exporting the models into a decision support system. Orange widgets and Orange
Canvas are all written in Python using the Qt graphical user interface library. This allows Orange to run on
various platforms, including MS Windows and Linux.


1.2   Features of Orange

   1. Open and free software: Orange is a free, open source data mining tool.

   2. Platform-independent software: Orange is supported on various versions of Linux, Microsoft
      Windows, and Apple's Mac.

   3. Programming support: Orange supports visual programming for data mining: users can
      design a data analysis process visually. Orange provides different visualizations such as
      bar diagrams, scatter plots, trees and networks.

   4. Scripting interface: Orange provides Python scripting. Programmers can test new algorithms
      and data analysis procedures using Python scripts.

   5. Support for other components: Orange provides support for machine learning, bioinformatics, text
      mining, etc.


1.3   Installing Orange-Canvas

Orange-Canvas can be installed on any platform. Browse the URI http://orange.biolab.si/nightly_builds.html
for more information on installing Orange-Canvas on different platforms. Here we focus
on two platforms: Windows 7 and Ubuntu.


1.3.1 Installing on Windows

   1. Browse the URI http://orange.biolab.si/nightly_builds.html.

   2. Download “orange-win-w-python-snapshot-2011-09-11-py2.7.exe”.

   3. Install the software by double-clicking the file “orange-win-w-python-snapshot-2011-09-11-py2.7.exe”.

The installation steps are shown in the figures below:

Fig 1: License Agreement




Fig 2: Completion of Installation




Fig 3: Locating Orange-Canvas




Fig 4: Orange-Canvas GUI


1.3.2 Installing on Ubuntu

   1. Browse the URI http://orange.biolab.si/download/archive/.

   2. Download the compressed file “orange-2.0-20101215svn.zip”.

   3. Extract all the files from the file “orange-2.0-20101215svn.zip”.
          unzip orange-2.0-20101215svn.zip

   4. Type the following commands in a Linux terminal:
          python setup.py build
          sudo python setup.py install
      Alternatively, to install for the current user only (no sudo needed):
          python setup.py install --user


1.4   Python Scripting

Orange can be used from the Python scripting language by importing the module orange with the following code.
# Import module orange for python scripting
import orange
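
Before running the examples, it can help to confirm that the module is importable. The following is a minimal sketch that uses only the Python standard library; the helper name module_available is hypothetical and is not part of Orange.

```python
import importlib


def module_available(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# "orange" is the module name used throughout this tutorial
print("orange available: %s" % module_available("orange"))
```

If the check reports False, the installation steps in Section 1.3 are the first thing to revisit.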




Fig 5(a): Using Python Scripting on Windows




Fig 5(b): Using Python Scripting on Ubuntu




2 Python Scripting Code Examples

Orange provides a scripting interface for the Python programming language. Programmers can test new
algorithms and data analysis procedures using Python scripts.


2.1    Using Python Code

The following code, in the file test.py, tests Python scripting for Orange. It is a simple Python
program that imports data from an external file and exercises the data access mechanism.

    # test.py
    # Import the Orange library for Python
    import orange

    # Import data from the file named "test.tab"
    data = orange.ExampleTable("test")

    # Print the attributes of the table
    print "Attributes:"
    print data.domain.attributes

    # List of attribute names
    attributeList = []

    # Print the name of each attribute and collect it in the list
    for i in data.domain.attributes:
        attributeList.append(i.name)
        print i.name
    attributeList.append(data.domain.classVar.name)

    # Class name
    print "Class:", data.domain.classVar.name

    # Display the attribute names (including the class)
    print attributeList
    print

    # Display the first 14 data items of the table
    print "Data items:"
    for i in range(14):
        print data[i]


Let the data table for the above code be as shown in Fig 6.




                                                  Fig 6: Data Table 1


The output of the above program is shown in Fig 7.




                                              Fig 7: Output of test.py
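
As a rough illustration of the kind of table test.py reads, the sketch below parses a small tab-delimited text with Python's csv module. It is written for Python 3 and does not use Orange. It assumes a three-line header (attribute names, attribute types, and flags such as "class"), which is how Orange's .tab files are commonly laid out. The table contents and the helper name load_tab are hypothetical.

```python
import csv
import io

# Hypothetical table: three header lines (names, types, flags), then data rows
SAMPLE = (
    "outlook\ttemperature\tplay\n"
    "discrete\tcontinuous\tdiscrete\n"
    "\t\tclass\n"
    "sunny\t85\tno\n"
    "overcast\t83\tyes\n"
    "rainy\t70\tyes\n"
)


def load_tab(text):
    """Split a tab-delimited table into attribute names, class name and rows."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    names, flags = rows[0], rows[2]
    class_index = flags.index("class")  # column flagged as the class
    attributes = [n for i, n in enumerate(names) if i != class_index]
    return attributes, names[class_index], rows[3:]


attributes, class_name, examples = load_tab(SAMPLE)
print("Attributes:", attributes)
print("Class:", class_name)
print("Data items:")
for example in examples:
    print(example)
```

The printed sections mirror those of test.py: the attribute names, the class name, and then the data items.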


2.2   Support, Confidence and Lift for Association Rule

The following “example2.py” shows how a scripting language like Python can be used to get the support,
confidence and lift of all the association rules induced from the data in the imported file “association.tab”.

    # example2.py
    # Import the modules orange and orngAssoc
    import orange, orngAssoc

    # Import data from a file named association.tab
    data = orange.ExampleTable("association")

    # Data preprocessing: discretize continuous attributes into 4 intervals
    data = orange.Preprocessor_discretize(data, method=orange.EquiNDiscretization(numberOfIntervals=4))

    # Data selection: keep only the first two attributes
    data = data.select(range(2))

    # List of minimum supports to try
    iList = [0.1, 0.2, 0.3, 0.4]

    for x in iList:
        # Induce association rules with Orange
        rules = orange.AssociationRulesInducer(data, support=x)

        if len(rules) == 0:
            # No association rule was found
            print "No association rules for support = %5.3f" % (x)
        else:
            # At least one association rule exists
            print "%i rules with support = %5.3f found.\n" % (len(rules), x)
            orngAssoc.sort(rules, ["support", "confidence", "lift"])
            orngAssoc.printRules(rules, ["support", "confidence", "lift"])
            print


The output of the above program is shown in Fig 8.

Fig 8: Output of example2.py
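
To make the three measures concrete, the sketch below computes them by hand for a single rule, using the usual textbook definitions: support is the fraction of transactions containing both sides of the rule, confidence is that support divided by the support of the left side, and lift is the confidence divided by the support of the right side. It is written for Python 3 and does not use Orange. The transactions and the function names are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def rule_measures(transactions, left, right):
    """Return (support, confidence, lift) for the rule left -> right."""
    both = support(transactions, set(left) | set(right))
    confidence = both / support(transactions, left)
    lift = confidence / support(transactions, right)
    return both, confidence, lift


# Hypothetical market-basket data, for illustration only
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s, c, l = rule_measures(transactions, {"bread"}, {"milk"})
print("support=%.3f confidence=%.3f lift=%.3f" % (s, c, l))
# prints: support=0.500 confidence=0.667 lift=0.889
```

A lift below 1, as here, indicates that the two sides occur together less often than they would if they were independent.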


2.3   Naive Bayes Classifier

Using Python, we build a naive Bayesian classifier from the voting data set, i.e. “voting.tab”, and
use it to classify the first five instances of this data set.

    # classifier.py
    import orange

    # Load the data and build a naive Bayesian classifier from it
    data = orange.ExampleTable("voting")
    classifier = orange.BayesLearner(data)

    # Classify the first five instances and compare with the original class
    for i in range(5):
        c = classifier(data[i])
        print "%d: %s (originally %s)" % (i+1, c, data[i].getclass())


The script loads the data, uses it to construct a classifier using the naive Bayesian method, and then classifies
the first five instances of the data set. Naive Bayes made a mistake on the third instance, but otherwise predicted
correctly, as shown in the figure below.




Fig 9: Output of classifier.py
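
For readers who want to see what a naive Bayesian classifier computes, here is a minimal sketch for discrete attributes, written for Python 3 without Orange. It multiplies the class prior by one conditional probability per attribute and picks the class with the highest score. Laplace (add-one) smoothing is an assumption of this sketch, and Orange's BayesLearner may estimate probabilities differently. The data and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes and per-class attribute values from the training data."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            values_seen[j].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the class with the highest prior-times-likelihood score."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # class prior
        for j, value in enumerate(row):
            # Laplace-smoothed estimate of P(value | class)
            seen = value_counts[(j, label)][value]
            score *= (seen + 1) / (count + len(values_seen[j]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical votes on two issues ("y"/"n") and the party of each voter
rows = [("y", "n"), ("y", "y"), ("n", "n"), ("n", "y"), ("y", "n")]
labels = ["democrat", "democrat", "republican", "republican", "democrat"]

model = train_naive_bayes(rows, labels)
for i, row in enumerate(rows):
    print("%d: %s (originally %s)" % (i + 1, predict(model, row), labels[i]))
```

As in classifier.py, the classifier is evaluated on the data it was trained on, so the comparison only shows how well the model fits that data.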


2.4   Regression

The following example uses both regression trees and k-nearest neighbors, and also uses a majority learner,
which for regression simply returns the average value of the learning data set.

    # regression2.py
    import orange, orngTree, orngTest, orngStat

    # Load the data and split it randomly into two halves: training and test
    data = orange.ExampleTable("housing.tab")
    selection = orange.MakeRandomIndices2(data, 0.5)
    train_data = data.select(selection, 0)
    test_data = data.select(selection, 1)

    # Majority learner: predicts the average value of the training data
    maj = orange.MajorityLearner(train_data)
    maj.name = "default"

    # Regression tree
    rt = orngTree.TreeLearner(train_data, measure="retis", mForPruning=2, minExamples=20)
    rt.name = "reg. tree"

    # k-nearest neighbors
    k = 5
    knn = orange.kNNLearner(train_data, k=k)
    knn.name = "k-NN (k=%i)" % k

    regressors = [maj, rt, knn]

    # Print the table header
    print "\n%10s " % "original",
    for r in regressors:
        print "%10s " % r.name,
    print

    # Print the original value and each prediction for the first 10 test instances
    for i in range(10):
        print "%10.1f " % test_data[i].getclass(),
        for r in regressors:
            print "%10.1f " % r(test_data[i]),
        print


The output of the above program is shown in Fig 10.




                                         Fig 10: Output of regression.py
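
The two simplest regressors in the example can be sketched in a few lines of plain Python 3, without Orange. The baseline always returns the mean of the training targets, and the k-NN regressor averages the targets of the k closest training points. Euclidean distance and an unweighted average are assumptions of this sketch, and Orange's kNNLearner may weight neighbors differently. The data and the function names are hypothetical.

```python
def mean_regressor(targets):
    """Baseline: always predict the mean of the training targets."""
    mean = sum(targets) / len(targets)
    return lambda point: mean


def knn_regressor(points, targets, k):
    """Predict the unweighted mean target of the k nearest training points."""
    def predict(point):
        def sq_dist(i):
            # Squared Euclidean distance to training point i
            return sum((a - b) ** 2 for a, b in zip(points[i], point))
        nearest = sorted(range(len(points)), key=sq_dist)[:k]
        return sum(targets[i] for i in nearest) / len(nearest)
    return predict


# Hypothetical one-dimensional training data
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
targets = [0.0, 1.0, 2.0, 10.0]

baseline = mean_regressor(targets)
knn = knn_regressor(points, targets, k=2)

query = (1.2,)
print("default: %.2f  k-NN: %.2f" % (baseline(query), knn(query)))
# prints: default: 3.25  k-NN: 1.50
```

The baseline ignores the query point entirely, which is why it serves as a reference for judging the other regressors.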

2.5    K-Means Clustering Algorithm

Let us use Python to implement the K-means clustering algorithm for the problem solved in class, i.e. K = 2
and array = [1,2,3,4,8,9,10,11].

    # test3.py
    import math

    # Array of elements to be clustered
    iArray = [1.0, 2.0, 3.0, 4.0, 8.0, 9.0, 10.0, 11.0]

    # Returns the mean of the elements of an array
    def meanArray(aArray):
        return sum(aArray) / len(aArray)

    Count = len(iArray)

    # Use the last two elements as the initial centroids
    c1 = iArray[Count-2]
    c2 = iArray[Count-1]

    # Class1 starts with a dummy value so that the loop runs at least once
    Class1 = [0.0]
    Class2 = []
    oldClass1 = []
    i = 1

    # Repeat until the members of Class1 stop changing
    while (oldClass1 != Class1):
        print "Iteration: " + str(i)
        oldClass1 = Class1
        Class1 = []
        Class2 = []

        # Assign each element to the nearer centroid
        for x in iArray:
            if math.fabs(c1 - x) < math.fabs(c2 - x):
                Class1.append(x)
            else:
                Class2.append(x)

        # Recompute each centroid as the mean of its class
        print "Class1: " + str(Class1)
        c1 = round(meanArray(Class1),1)
        print "c1 = " + str(c1)

        print "Class2: " + str(Class2)
        c2 = round(meanArray(Class2),1)
        print "c2 = " + str(c2)
        print
        i = i + 1




Fig 11: Output of K-Means Clustering Algorithm
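
The same procedure can be packaged as a reusable function. The sketch below is written for Python 3. It alternates the assignment and update steps until no value changes cluster, and starts from the same two initial centroids (10 and 11). Unlike the script above, it does not round the centroids and it keeps the old centroid if a cluster becomes empty. The function name kmeans_1d and its parameters are hypothetical.

```python
def kmeans_1d(values, centroids, max_iters=100):
    """Cluster one-dimensional values around the given initial centroids."""
    centroids = list(centroids)
    assignment = None
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid for each value
        new_assignment = [
            min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            for v in values
        ]
        if new_assignment == assignment:
            break  # no value changed cluster, so the result is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members
        for j in range(len(centroids)):
            members = [v for v, a in zip(values, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = sum(members) / len(members)
    clusters = [
        [v for v, a in zip(values, assignment) if a == j]
        for j in range(len(centroids))
    ]
    return clusters, centroids


values = [1.0, 2.0, 3.0, 4.0, 8.0, 9.0, 10.0, 11.0]
clusters, centroids = kmeans_1d(values, centroids=[10.0, 11.0])
print("clusters:", clusters)
print("centroids:", centroids)
# clusters: [[1.0, 2.0, 3.0, 4.0], [8.0, 9.0, 10.0, 11.0]]
# centroids: [2.5, 9.5]
```

Passing the initial centroids as an argument makes it easy to check how the starting points affect the final clusters.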

Using Orange, we can easily run the K-Means clustering algorithm and plot the result with the following
code.

    import orange
    import orngClustering
    import pylab

    # Plot the 2D points and the current centroids
    def plot_scatter(data, km, attx, atty, filename="kmeans-scatter", title=None):
        """plot a data scatter plot with the position of centroids"""
        pylab.rcParams.update({'font.size': 8, 'figure.figsize': [4,3]})

        # For the points, colored by cluster
        x = [float(d[attx]) for d in data]
        y = [float(d[atty]) for d in data]
        colors = ["c", "b"]
        cs = "".join([colors[c] for c in km.clusters])
        pylab.scatter(x, y, c=cs, s=10)

        # For the centroid points
        xc = [float(d[attx]) for d in km.centroids]
        yc = [float(d[atty]) for d in km.centroids]
        pylab.scatter(xc, yc, marker="x", c="k", s=200)

        pylab.xlabel(attx)
        pylab.ylabel(atty)
        if title:
            pylab.title(title)
        # Save one image per iteration
        pylab.savefig("%s-%03d.png" % (filename, km.iteration))
        pylab.close()

    # Called after every iteration of the clustering
    def in_callback(km):
        print "Iteration: %d, changes: %d" % (km.iteration, km.nchanges)
        plot_scatter(data, km, "X", "Y", title="Iteration %d" % km.iteration)

    # Read the data from the table and run K-Means with 2 clusters
    data = orange.ExampleTable("data")
    km = orngClustering.KMeans(data, 2, minscorechange=-1, maxiters=10, inner_callback=in_callback)




The output of this program is shown below:




                                      Fig 12(a): During iteration 0




                                      Fig 12(b): During iteration 1




3 References

The following references were used in preparing this manual:

   1. http://en.wikipedia.org/wiki/Orange_(software)

   2. http://orange.biolab.si/

   3. http://orange.biolab.si/nightly_builds.html

   4. http://orange.biolab.si/doc/ofb-rst/genindex.html





How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17Celine George
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...Amil baba
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 

Dernier (20)

On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 

Manual orange

  • 2. 1 Orange

    1.1 Introduction

    Orange is a collection of Python-based modules that sit on top of a core library of C++ objects and routines implementing machine learning and data mining algorithms. It is an open source data mining package built on Python, wrapped C, C++ and Qt.

    Orange widgets provide a graphical user interface to Orange's data mining and machine learning methods. They include widgets for data entry and pre-processing, data visualization, classification, regression, association rules and clustering; a set of widgets for model evaluation and visualization of evaluation results; and widgets for exporting models into a decision support system.

    Orange widgets and Orange Canvas are written in Python using the Qt graphical user interface library. This allows Orange to run on various platforms, including MS Windows and Linux.

    1.2 Features of Orange

    1. Open and free software: Orange is an open source and free data mining tool.
    2. Platform-independent software: Orange is supported on various versions of Linux, Microsoft Windows, and Apple's Mac.
    3. Programming support: Orange supports visual programming for data mining, so users can design a data analysis process visually. Orange provides different visualizations such as bar diagrams, scatter plots, trees, networks, etc.
    4. Scripting interface: Orange provides Python scripting. Programmers can test new algorithms and data analyses using Python scripts.
    5. Support for other components: Orange provides support for machine learning, bioinformatics, text mining, etc.

    1.3 Installing Orange-Canvas

    Orange-Canvas can be installed on any of the supported platforms. Browse the URI http://orange.biolab.si/nightly_builds.html for more information on installing Orange-Canvas on different platforms. Here we focus on two platforms: Windows 7 and Ubuntu.

    1.3.1 Installing on Windows

    1. Browse the URI http://orange.biolab.si/nightly_builds.html.
    2. Download “orange-win-w-python-snapshot-2011-09-11-py2.7.exe”.
    3. Install the software by double-clicking the file “orange-win-w-python-snapshot-2011-09-11-py2.7.exe”.

    The installation steps are shown in the figures below.
  • 3. Fig 1: License Agreement
    Fig 2: Completion of Installation
    Fig 3: Locating Orange-Canvas
  • 4. Fig 4: Orange-Canvas GUI

    1.3.2 Installing on Ubuntu

    1. Browse the URI http://orange.biolab.si/download/archive/.
    2. Download the compressed file “orange-2.0-20101215svn.zip”.
    3. Extract all the files from “orange-2.0-20101215svn.zip”:

           unzip orange-2.0-20101215svn.zip

    4. Type the following commands in a Linux terminal, from the extracted directory that contains setup.py. The two install commands are alternatives: the first installs Orange system-wide, the second installs it for the current user only.

           python setup.py build
           sudo python setup.py install
           python setup.py install --user

    1.4 Python Scripting

    Once Orange is installed, the module orange can be imported in a Python script with the following code.

        # Import module orange for Python scripting
        import orange
  • 5. Fig 5(a): Using Python Scripting on Windows
    Fig 5(b): Using Python Scripting on Ubuntu
  • 6. 2 Python Scripting Code Examples

    Orange provides a scripting interface in the Python programming language. Programmers can test new algorithms and data analyses using Python scripts.

    2.1 Using Python Code

    The following code in the file test.py tests Python scripting for Orange. It is a simple program that imports data from an external file and exercises the data access mechanism.

        # test.py
        # Import the Orange library for Python
        import orange

        # Load data from the file named "test.tab"
        data = orange.ExampleTable("test")

        # Print the attributes of the table
        print "Attributes:"
        print data.domain.attributes

        # Collect and print the attribute names, then add the class name
        attributeList = []
        for i in data.domain.attributes:
            attributeList.append(i.name)
            print i.name
        attributeList.append(data.domain.classVar.name)

        # Class name
        print "Class:", data.domain.classVar.name

        # Display the collected names
        print attributeList
        print

        # Display every data item in the table
        print "Data items:"
        for i in range(len(data)):
            print data[i]

    Let the data table for the above code be as shown in Fig 6.

    Fig 6: Data Table 1
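The contents of test.tab appear only as a figure, so the following is a rough sketch of the layout that Orange 2.x tab-delimited files generally follow: one line of attribute names, one line of types, one line of optional flags, then the data rows. The column names and rows below are hypothetical and are not the tutorial's data; the helper names `write_tab` and `read_tab` are also made up for this illustration, and the code deliberately avoids importing orange so it runs anywhere.

```python
import os
import tempfile

# Hypothetical sample; NOT the contents of the tutorial's test.tab.
HEADER = [
    ["outlook", "windy", "play"],   # line 1: attribute names
    ["d", "d", "d"],                # line 2: types (d = discrete, c = continuous)
    ["", "", "class"],              # line 3: flags; "class" marks the class variable
]
ROWS = [
    ["sunny", "no", "yes"],
    ["rainy", "yes", "no"],
]

def write_tab(path, header, rows):
    """Write the header lines and data rows as tab-separated text."""
    with open(path, "w") as handle:
        for row in header + rows:
            handle.write("\t".join(row) + "\n")

def read_tab(path):
    """Return (names, types, flags, rows) from a file with three header lines."""
    with open(path) as handle:
        lines = [line.rstrip("\n").split("\t") for line in handle]
    return lines[0], lines[1], lines[2], lines[3:]

path = os.path.join(tempfile.mkdtemp(), "sample.tab")
write_tab(path, HEADER, ROWS)
names, types, flags, rows = read_tab(path)
class_name = names[flags.index("class")]

print("attributes: %s" % ", ".join(names))
print("class: %s" % class_name)
print("rows: %d" % len(rows))
```

A file written this way should be loadable with orange.ExampleTable in the same way as test.tab, assuming the header conventions above match the installed Orange version.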
  • 7. The output of the above program is shown in Fig 7.

    Fig 7: Output of test.py

    2.2 Support, Confidence and Lift for Association Rule

    The following “example2.py” shows how a scripting language like Python can be used to obtain the support, confidence and lift of all the association rules induced from the data in the imported file “association.tab”.

        # example2.py
        # Import the modules orange and orngAssoc
        import orange, orngAssoc

        # Load data from a file named association.tab
        data = orange.ExampleTable("association")

        # Data preprocessing: discretize continuous attributes into 4 intervals
        data = orange.Preprocessor_discretize(data, method=orange.EquiNDiscretization(numberOfIntervals=4))

        # Data selection: keep only the first two attributes
        data = data.select(range(2))

        # List of minimum supports to try
        iList = [0.1, 0.2, 0.3, 0.4]

        for x in iList:
            # Induce association rules with Orange
            rules = orange.AssociationRulesInducer(data, support=x)

            # If there is no association rule
            if len(rules) == 0:
                print "No association rules found for support = %5.3f" % (x)
            # If at least one association rule exists
            else:
                print "%i rules with support = %5.3f found.\n" % (len(rules), x)
                orngAssoc.sort(rules, ["support", "confidence", "lift"])
                orngAssoc.printRules(rules, ["support", "confidence", "lift"])
            print

    The output of the above program is shown in Fig 8.
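To make the three measures printed by example2.py concrete, here is a minimal sketch that computes support, confidence and lift by hand for a single rule. It uses the standard textbook definitions rather than Orange's internals, and the transactions and function names are hypothetical, chosen only for illustration.

```python
# Hypothetical market-basket transactions, used only for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / float(len(transactions))

def confidence(lhs, rhs, transactions):
    """Estimate of P(rhs | lhs): support(lhs and rhs) / support(lhs)."""
    return support(lhs | rhs, transactions) / support(lhs, transactions)

def lift(lhs, rhs, transactions):
    """Confidence divided by support(rhs); a value of 1 means independence."""
    return confidence(lhs, rhs, transactions) / support(rhs, transactions)

# Rule: bread -> milk
lhs, rhs = {"bread"}, {"milk"}
print("support    = %.3f" % support(lhs | rhs, transactions))
print("confidence = %.3f" % confidence(lhs, rhs, transactions))
print("lift       = %.3f" % lift(lhs, rhs, transactions))
```

In this toy data the lift is below 1, which means bread and milk occur together slightly less often than they would if they were independent.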
  • 8. Fig 8: Output of example2.py

    2.3 Naive Bayes Classifier

    Using Python, we build a naive Bayesian classifier from the voting data set (“voting.tab”) and use it to classify the first five instances of that data set.

        # classifier.py
        import orange

        # Load the voting data set
        data = orange.ExampleTable("voting")

        # Train a naive Bayesian classifier on the whole data set
        classifier = orange.BayesLearner(data)

        # Classify the first five instances and compare with the true class
        for i in range(5):
            c = classifier(data[i])
            print "%d: %s (originally %s)" % (i+1, c, data[i].getclass())

    The script loads the data, uses it to construct a classifier with the naive Bayesian method, and then classifies the first five instances of the data set. Naive Bayes made a mistake on the third instance, but otherwise predicted correctly, as shown in the figure below.
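As a rough illustration of what a naive Bayesian classifier computes, the sketch below multiplies the class prior by the per-attribute conditional probabilities and picks the class with the largest product. The training rows and the function names `fit` and `predict` are hypothetical, and Laplace smoothing is an assumption made for this sketch; it is not necessarily how orange.BayesLearner estimates probabilities.

```python
from collections import Counter, defaultdict

# Hypothetical training rows: (tuple of discrete attribute values, class label).
train = [
    (("y", "n"), "democrat"),
    (("y", "y"), "democrat"),
    (("n", "y"), "republican"),
    (("n", "n"), "republican"),
    (("y", "n"), "democrat"),
]

def fit(rows):
    """Count classes and, for each (attribute index, class), attribute values."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)
    values = defaultdict(set)
    for features, label in rows:
        for j, v in enumerate(features):
            value_counts[(j, label)][v] += 1
            values[j].add(v)
    return class_counts, value_counts, values

def predict(model, features):
    """Return the class maximizing prior * product of smoothed likelihoods."""
    class_counts, value_counts, values = model
    total = float(sum(class_counts.values()))
    best_label, best_score = None, -1.0
    for label, n in sorted(class_counts.items()):
        score = n / total
        for j, v in enumerate(features):
            # Laplace smoothing (an assumption of this sketch) avoids zero factors
            score *= (value_counts[(j, label)][v] + 1.0) / (n + len(values[j]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = fit(train)
for features, label in train[:3]:
    print("%s -> %s (originally %s)" % (features, predict(model, features), label))
```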
  • 9. Fig 9: Output of classifier.py

    2.4 Regression

    The following example uses both regression trees and k-nearest neighbors. It also uses a majority learner, which for regression simply returns the average class value of the learning data set.

        # regression2.py
        import orange, orngTree, orngTest, orngStat

        # Load the data and split it randomly into two halves
        data = orange.ExampleTable("housing.tab")
        selection = orange.MakeRandomIndices2(data, 0.5)
        train_data = data.select(selection, 0)
        test_data = data.select(selection, 1)

        # Baseline: always predicts the mean of the training class values
        maj = orange.MajorityLearner(train_data)
        maj.name = "default"

        # Regression tree
        rt = orngTree.TreeLearner(train_data, measure="retis", mForPruning=2, minExamples=20)
        rt.name = "reg. tree"

        # k-nearest neighbors
        k = 5
        knn = orange.kNNLearner(train_data, k=k)
        knn.name = "k-NN (k=%i)" % k

        regressors = [maj, rt, knn]

        # Header row: true value followed by one column per regressor
        print "\n%10s " % "original",
        for r in regressors:
            print "%10s " % r.name,
        print

        # Predictions for the first ten test instances
        for i in range(10):
            print "%10.1f " % test_data[i].getclass(),
            for r in regressors:
                print "%10.1f " % r(test_data[i]),
            print

    The output of the above program is shown in Fig 10.

    Fig 10: Output of regression2.py
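The idea behind two of the regressors above can be sketched without Orange: a baseline that always returns the training mean, and a k-nearest-neighbors regressor that averages the targets of the k closest training points. The data and function names below are hypothetical, and the plain unweighted average over Euclidean neighbors is a simplification that may differ from what orange.kNNLearner does.

```python
# Hypothetical (feature vector, target value) pairs, for illustration only.
train = [
    ((1.0, 1.0), 10.0),
    ((2.0, 1.0), 12.0),
    ((3.0, 2.0), 20.0),
    ((8.0, 9.0), 50.0),
    ((9.0, 8.0), 54.0),
]

def mean_regressor(rows):
    """Baseline: ignore the input and return the mean training target."""
    mean = sum(y for _, y in rows) / float(len(rows))
    return lambda x: mean

def knn_regressor(rows, k):
    """Average the targets of the k training points nearest to x."""
    def predict(x):
        def sq_dist(row):
            # Squared Euclidean distance; ordering matches true distance
            return sum((a - b) ** 2 for a, b in zip(row[0], x))
        nearest = sorted(rows, key=sq_dist)[:k]
        return sum(y for _, y in nearest) / float(len(nearest))
    return predict

baseline = mean_regressor(train)
knn = knn_regressor(train, k=2)

query = (2.0, 2.0)
print("default: %.1f" % baseline(query))
print("k-NN:    %.1f" % knn(query))
```

For the query point, the two nearest neighbors are the second and third rows, so k-NN predicts their average, while the baseline returns the overall mean regardless of the query.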
  • 10. 2.5 K-Means Clustering Algorithm

    Let us use Python to implement the K-means clustering algorithm for the problem solved in class, i.e. K = 2 and array = [1,2,3,4,8,9,10,11].

        # test3.py
        import math

        # Array of elements to be clustered
        iArray = [1.0, 2.0, 3.0, 4.0, 8.0, 9.0, 10.0, 11.0]

        # Return the mean of the elements of an array
        def meanArray(aArray):
            icount = len(aArray)
            iSum = 0
            for x in aArray:
                iSum = iSum + x
            return (iSum/icount)

        Count = len(iArray)

        # Use the last two elements as the initial centroids
        c1 = iArray[Count-2]
        c2 = iArray[Count-1]

        # Initial state: Class1 holds a placeholder so the loop runs at least once
        Class1 = [0.0]
        Class2 = []
        oldClass1 = []
        i = 1

        # Repeat until the membership of Class1 stops changing
        while (oldClass1 != Class1):
            print "Iteration: " + str(i)
            oldClass1 = Class1
            Class1 = []
            Class2 = []
            # Assign each element to the nearer centroid
            for x in iArray:
                if math.fabs(c1 - x) < math.fabs(c2 - x):
                    Class1.append(x)
                else:
                    Class2.append(x)
            # Recompute the centroids as the class means
            print "Class1: " + str(Class1)
            c1 = round(meanArray(Class1),1)
            print "c1 = " + str(c1)

            print "Class2: " + str(Class2)
            c2 = round(meanArray(Class2),1)
            print "c2 = " + str(c2)
            print
            i = i + 1
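The same loop can be packaged as a function that works for any number of initial centroids. This is only a sketch under a few assumptions: the name `kmeans_1d` is hypothetical, ties go to the lower-numbered cluster, centroids are not rounded, and an empty cluster keeps its previous centroid, so intermediate values may differ slightly from the script above.

```python
def kmeans_1d(points, centroids, max_iters=100):
    """Cluster a list of numbers; returns (centroids, clusters)."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iters):
        new_clusters = [[] for c in centroids]
        for p in points:
            # Index of the nearest centroid; ties go to the lowest index
            nearest = min(range(len(centroids)),
                          key=lambda j: abs(centroids[j] - p))
            new_clusters[nearest].append(p)
        # Stop when the assignment no longer changes
        if new_clusters == clusters:
            break
        clusters = new_clusters
        # Recompute each centroid as its cluster mean (keep it if the cluster is empty)
        centroids = [sum(c) / float(len(c)) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 2.0, 3.0, 4.0, 8.0, 9.0, 10.0, 11.0]
# Same starting centroids as the script above: the last two elements
centroids, clusters = kmeans_1d(points, [10.0, 11.0])
print("centroids: %s" % centroids)
print("clusters:  %s" % clusters)
```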
  • 11. Fig 11: Output of K-Means Clustering Algorithm

    Using Orange, we can run K-means clustering and plot the clusters at each iteration with the following code.

        import orange
        import orngClustering
        import pylab

        # Plot the 2D points and the centroids
        def plot_scatter(data, km, attx, atty, filename="kmeans-scatter", title=None):
            """Plot a data scatter plot with the position of centroids."""
            pylab.rcParams.update({'font.size': 8, 'figure.figsize': [4,3]})

            # The data points, colored by cluster
            x = [float(d[attx]) for d in data]
            y = [float(d[atty]) for d in data]
            colors = ["c", "b"]
            cs = [colors[c] for c in km.clusters]
            pylab.scatter(x, y, c=cs, s=10)

            # The centroid points
            xc = [float(d[attx]) for d in km.centroids]
            yc = [float(d[atty]) for d in km.centroids]
            pylab.scatter(xc, yc, marker="x", c="k", s=200)

            pylab.xlabel(attx)
            pylab.ylabel(atty)
            if title:
                pylab.title(title)
            # Save one image per iteration
            pylab.savefig("%s-%03d.png" % (filename, km.iteration))
            pylab.close()

        # Called by KMeans after every iteration
        def in_callback(km):
            print "Iteration: %d, changes: %d" % (km.iteration, km.nchanges)
            plot_scatter(data, km, "X", "Y", title="Iteration %d" % km.iteration)

        # Read the data from the table and cluster it into 2 groups
        data = orange.ExampleTable("data")
        km = orngClustering.KMeans(data, 2, minscorechange=-1, maxiters=10, inner_callback=in_callback)
  • 12. The output of this program is shown below:

    Result of test.py
    Fig 12(a): During iteration 0
    Fig 12(b): During iteration 1
  • 13. 3 References

    The following references were used in preparing this manual:

    1. http://en.wikipedia.org/wiki/Orange_(software)
    2. http://orange.biolab.si/
    3. http://orange.biolab.si/nightly_builds.html
    4. http://orange.biolab.si/doc/ofb-rst/genindex.html