RUBY AND R


Chang Sau Sheong
Director, Applied Research, HP Labs Singapore


1   © Copyright 2010 Hewlett-Packard Development Company, L.P.
About HP Labs



HP LABS
– Exploratory and advanced
  research group for Hewlett-Packard
– Global organization that tackles
  complex challenges facing our
  customers and society over the next
  decade
– Pushes the frontiers of fundamental
  science
– HQ Palo Alto



HP LABS AROUND THE WORLD

– Palo Alto, Bristol, St. Petersburg, Haifa, Beijing, Bangalore, Singapore

HP LABS SINGAPORE
– Set up in February 2010
– Focus on Cloud Computing
– Research
     • Exploratory research
     • Researchers
     • Change the state of the art
     • Working closely with the academic community
– Applied Research
     • Applied research
     • Innovators
     • Take the research to the next stage
     • Work closely with customers and business units

Ruby and R



R: programming language and platform for statistical computing, licensed under the GPL

Strengths in statistical processing and data visualization

Extensive library of statistical computing packages (CRAN) written by statisticians

Statistics is not just for statisticians

– Recommendation engine
– Speech recognition
– Fingerprint identification
– Spam detection
– Card fraud detection
– Financial forecasting
– Face recognition
– Data mining
– OCR
– Credit scoring

CRAN
– Almost 2000 packages, mostly created by
  statisticians
     • BiodiversityR – GUI for biodiversity and community ecology analysis
     • Emu – analyze speech patterns
     • GenABEL – study human genome
     • Quantmod – quantitative financial modeling framework
     • Ftrading – technical trading analysis
     • Cyclones – cyclone identification
     • DOSim – disease analysis toolkit for gene set
     • Agricolae – statistical procedures for agricultural research

EXAMPLE R CODE
– EPL data from football-data.co.uk
– Show home/away goals distribution for the 2011 season

Why Ruby and R?



Stand on the shoulders of giants

–Ruby
     • Human-focused programming!
     • Better general-purpose programming capabilities
     • Great frameworks!
     • Great libraries (20,000+ gems in RubyGems)
–R
     • Focus on statistical computing/crunching
     • Lots of packages written by domain experts/statisticians
     • Great graphing libraries

Ruby and R integration

RINRUBY
– 100% Ruby
– Uses pipes to send commands and evals
– Uses TCP/IP Sockets to send and retrieve data
– Pros:
     •   Doesn't require anything but R
     •   Works flawlessly on Windows
     •   Works with Ruby 1.8, 1.9 and JRuby 1.5
     •   All API tested

– Cons:
     •   VERY SLOW in assigning
     •   Very limited datatypes: only Vector and Matrix
     •   Not released since 2009
     •   Poor documentation


RSRUBY
– C Extension for Ruby, linked to R's shared library
– Pros:
     •   Blazing speed! 5-10 times faster than Rserve and 100-1000 times faster than RinRuby.
     •   Seamless integration with Ruby. Every method and object is treated like a Ruby object

– Cons:
     •   Transformations between R and Ruby types aren't trivial
     •   Dependent on operating system, Ruby implementation and R version
     •   Not available for alternative implementations of Ruby (eg JRuby)
     •   Not released since 2009
     •   Poor documentation




RSERVE
– 100% Ruby
– Uses TCP/IP sockets to interchange data and commands
– Requires Rserve installed on the server machine
– Access from Ruby uses the Rserve-Ruby-Client library
– Pros:
     •   Works with Ruby 1.8, 1.9 and JRuby 1.5
     •   Sessions allow data to be processed asynchronously
     •   Fast: 5-10 times faster than RinRuby
     •   Most recently updated (Jan 2011)

– Cons:
     •   Requires Rserve
     •   Limited features on Windows
     •   Poor documentation
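
A minimal sketch of the Rserve round trip described on this slide, assuming the rserve-client gem (the Rserve-Ruby-Client project) is installed and an Rserve daemon is listening on localhost. The helper names r_vector and r_mean are illustrative and not part of the gem; the gem is loaded lazily so the pure helper also runs on a machine without R.

```ruby
# Sketch only: assumes the rserve-client gem and a running Rserve daemon.

# Build an R vector literal such as "c(1.0, 2.5)" from a Ruby array of numbers.
def r_vector(values)
  "c(#{values.map { |v| Float(v) }.join(', ')})"
end

# Send one command to R over the Rserve socket and convert the reply
# to a plain Ruby value. (Hypothetical helper, not part of the gem.)
def r_mean(values)
  require 'rserve'                     # loaded lazily; only needed for the socket call
  connection = Rserve::Connection.new  # connects to localhost by default
  connection.eval("mean(#{r_vector(values)})").to_ruby
ensure
  connection.close if connection
end

puts r_vector([1, 2.5])
```

With Rserve running, r_mean([1, 2, 3]) sends the command text to R and returns R's answer converted to a Ruby value. Building the command as a string keeps the example short; for larger data the gem's assign call is probably the better fit.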



RAPACHE/RRACK
– Web service based
– Run R scripts as web services, consumed by Ruby front-end apps
– Pros:
     •   Modular and separate (no direct integration)
     •   Can be scalable, ‘cloud’-ready

– Cons:
     •   Requires rApache/rRack
     •   rRack is very new (not accepted by CRAN yet, as of today!) and requires R 2.13 (released just a few weeks ago)
     •   rApache is specific to the Apache web server
     •   Communications overhead for smaller integrations




Let’s look at some code! (I’m going to use Rserve)

Text classification



TEXT CLASSIFICATION
–Automatically sorting a set of documents into different categories from a predefined set
–Classic uses:
     • Spam filtering
     • Email prioritization
–Diagram: training data and test data go into the classifier, which outputs a category

TEXT CLASSIFIER CODE

 Prepare




Train classifier by counting frequency of
each word in the document




Get word count
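
The code shown on this slide is not in the transcript. As a rough sketch of the idea (not the speaker's implementation), word counting in plain Ruby could look like the following. The stop-word list and the method name word_count are assumptions, and the sketch skips stemming, which the talk's output appears to apply (for example "googl" and "experi").

```ruby
# Hypothetical stop-word list; a real classifier would use a much longer one.
STOP_WORDS = %w[a an and the of to in is it that for on was with].freeze

# Count how often each non-stop word occurs in a document.
# Punctuation is stripped before splitting, so "stopbadware.org"
# becomes the single token "stopbadwareorg".
def word_count(text)
  counts = Hash.new(0)
  text.downcase.gsub(/[^a-z0-9\s]/, '').split.each do |word|
    next if STOP_WORDS.include?(word)
    counts[word] += 1
  end
  counts
end

p word_count("Google search: users search Google, and the search engine flags a site.")
```

A Hash with a default of 0 keeps the counting loop to one line per word and yields the word => count shape used in the rest of the pipeline.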




What you get
     {"check"=>1, "result"=>3, "marissa"=>1, "experi"=>1,
     "click"=>1, "engin"=>1, "simpli"=>1, "mistakenli"=>1,
     "pick"=>1, "prevent"=>1, "40"=>1, "regularli"=>1, "place"=>1,
     "user"=>5, "prefer"=>1, "malevol"=>1, "access"=>1,
     "robust"=>1, "servic"=>1, "fault"=>1, "malici"=>1, "list"=>2,
     "hand"=>1, "internet"=>1, "attribut"=>1, "instal"=>1,
     "file"=>1, "unabl"=>1, "vice"=>1, "stopbadwareorg"=>2,
     "merit"=>1, "decid"=>1, "flag"=>2, "saturdai"=>2, "hit"=>2,
     "offici"=>1, "error"=>3, "work"=>1, "site"=>5, "happen"=>2,
     "incid"=>1, "technic"=>1, "advis"=>1, "put"=>1, "human"=>3,
     "harm"=>2, "softwar"=>1, "ms"=>1, "affect"=>1, "carefulli"=>1,
     "product"=>1, "presid"=>1, "complaint"=>1, "potenti"=>2,
     "googl"=>6, "comput"=>2, "peopl"=>1, "investig"=>2,
     "consum"=>1, "danger"=>2, "period"=>1, "wrote"=>2,
     "search"=>7, "ascertain"=>1, "blog"=>1, "warn"=>2,
     "problem"=>1, "updat"=>2, "minut"=>1, "mayer"=>2}




Generate training data for prediction




Training data



category,googl,report,search,user,review,court,mckinnon,year,internet,microsoft,site,softwar,warn,browser,oper,expert,rise,lawyer,digit,extradit,sharpli,error,group,result,system,rebel,econom,presid,crisi,find,year,accus,global,obama,china,civilian,shrink,hous,wall,street,quarter,white,heavi,lehman,economi,session,ey,time,davo,human
not_interesting,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
not_interesting,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,5,0,2,0,0,0,3,0,0,0,3,1,0,0,0,0,0,3,0,0,0,0,0,0,2
not_interesting,0,1,0,0,0,0,0,2,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,3,0,3,1,2,0,2,0,0,0,0,0,0,0,0,0,0,3,1,3,1,0,2,0
not_interesting,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
not_interesting,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,2,0,0,1,2,1,4,0,0,2,0,0,0,2,0,0,0,0,2,0,1,0
not_interesting,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,0,0,0,3,3,0,0,0,0,0,0,0,2,0,0
not_interesting,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,2,0,0,2,0,0,2,1,0,0,2,1,0,0,2,0,0,1,0,0
interesting,6,0,7,5,0,0,0,0,1,0,5,1,2,0,0,0,0,0,0,0,0,3,0,3,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
interesting,0,7,0,0,2,0,0,0,0,0,0,0,1,0,0,1,0,0,3,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
interesting,0,1,0,0,0,0,0,3,3,1,0,1,1,1,0,3,3,0,1,0,3,0,1,0,2,0,1,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,3,0
interesting,0,0,0,0,3,5,5,0,0,0,0,0,0,0,0,0,1,4,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
interesting,6,0,1,1,0,0,0,0,0,0,0,1,0,0,4,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
interesting,0,0,0,2,0,0,0,2,1,4,0,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0

– Header row: the top 25 most frequent words in the training dataset
– Each line represents one training document
– Categories set when the classifier is created
– Number indicates the number of times the word appears in that document
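
The training file format described on these slides (a header of frequent words, then one line of counts per document, led by its category) can be sketched as below. This is an illustration under assumptions, not the speaker's code: the names vocabulary and training_csv are made up, and it uses one global top-N word list, whereas the repeated "year" column in the slides suggests the original may have built its list per category.

```ruby
# docs: array of [category, word_counts] pairs, where word_counts is a
# Hash of word => occurrences for one document.

# Pick the top_n words by total count across all documents.
def vocabulary(docs, top_n)
  totals = Hash.new(0)
  docs.each { |_category, counts| counts.each { |word, n| totals[word] += n } }
  # Sort by descending total, breaking ties alphabetically for a stable order.
  totals.sort_by { |word, n| [-n, word] }.first(top_n).map(&:first)
end

# Emit a header row plus one row per document: category, then the count
# of each vocabulary word in that document (0 when the word is absent).
def training_csv(docs, top_n)
  words = vocabulary(docs, top_n)
  rows = [['category', *words].join(',')]
  docs.each do |category, counts|
    rows << [category, *words.map { |w| counts.fetch(w, 0) }].join(',')
  end
  rows.join("\n") + "\n"
end

docs = [
  ['interesting',     { 'googl' => 6, 'search' => 7, 'user' => 5 }],
  ['not_interesting', { 'econom' => 3, 'crisi' => 3, 'user' => 1 }]
]
puts training_csv(docs, 3)
```

A test row for prediction can be produced the same way from a single document, with a placeholder in the category column.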


Test data



category,googl,report,search,user,review,court,mckinnon,year,internet,microsoft,site,softwar,warn,browser,oper,expert,rise,lawyer,digit,extradit,sharpli,error,group,result,system,rebel,econom,presid,crisi,find,year,accus,global,obama,china,civilian,shrink,hous,wall,street,quarter,white,heavi,lehman,economi,session,ey,time,davo,human
category,0,0,0,2,0,0,0,2,1,4,0,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0

Using different classification models

NAÏVE BAYES




SVM




RANDOM FOREST




NEURAL NETWORKS
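
The model code from these four slides is missing from the transcript. The sketch below shows one way a Ruby program might build the R commands for each model and send them through Rserve. Everything specific here is an assumption: the R packages (e1071, randomForest, nnet), the hidden-layer size, the file paths, and the helper names r_script and classify are illustrative and may differ from what the talk used.

```ruby
# R package and fitting call for each model named on the slides (assumed).
MODELS = {
  naive_bayes:   ['e1071',        'naiveBayes(category ~ ., data = train)'],
  svm:           ['e1071',        'svm(category ~ ., data = train)'],
  random_forest: ['randomForest', 'randomForest(category ~ ., data = train)'],
  neural_net:    ['nnet',         'nnet(category ~ ., data = train, size = 5)']
}.freeze

# Build the R script: load the package, read both CSV files, fit the model
# on the training data, and predict the category of the test rows.
def r_script(model, train_path, test_path)
  library, fit = MODELS.fetch(model)
  predict = model == :neural_net ? 'predict(model, test, type = "class")' : 'predict(model, test)'
  <<~R
    library(#{library})
    train <- read.csv("#{train_path}", stringsAsFactors = TRUE)
    test  <- read.csv("#{test_path}")
    model <- #{fit}
    as.character(#{predict})
  R
end

# Run the script on an Rserve daemon (requires the rserve-client gem,
# a running Rserve, and the R packages above).
def classify(model, train_path, test_path)
  require 'rserve'
  connection = Rserve::Connection.new
  connection.eval(r_script(model, train_path, test_path)).to_ruby
ensure
  connection.close if connection
end

puts r_script(:naive_bayes, 'train.csv', 'test.csv')
```

Keeping the script builder separate from the socket call lets the generated R code be inspected, or pasted into an R console, before anything is sent to the server.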




Using the classifier



RESOURCES
– HP Labs Worldwide
http://www.hpl.hp.com/
– R Project
http://www.r-project.org/
– RsRuby
https://github.com/alexgutteridge/rsruby
– RinRuby
http://rinruby.ddahl.org/
– Rserve
http://www.rforge.net/Rserve/
– Rserve-Ruby-Client
https://github.com/clbustos/Rserve-Ruby-client
– rApache
http://rapache.net/index.html
– rRack
https://github.com/jeffreyhorner/rRack/

Thank you

 sausheong@hp.com
 http://twitter.com/sausheong
 http://blog.saush.com
47   © Copyright 2010 Hewlett-Packard Development Company, L.P.

Contenu connexe

Similaire à Ruby and R

Python course in hyderabad
Python course in hyderabadPython course in hyderabad
Python course in hyderabadRevathiUppala
 
Introduction to pig
Introduction to pigIntroduction to pig
Introduction to pigRavi Mutyala
 
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian FrankHP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian FrankBeMyApp
 
Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Revolution Analytics
 
A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...DataWorks Summit
 
Mrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big DataMrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big DataPatrickCrompton
 
Pilot Project Highlights: Ruby on Rails - November 2006
Pilot Project Highlights: Ruby on Rails - November 2006Pilot Project Highlights: Ruby on Rails - November 2006
Pilot Project Highlights: Ruby on Rails - November 2006juliannacole
 
Helion meetup-2014
Helion meetup-2014Helion meetup-2014
Helion meetup-2014Bruno Cornec
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack EuropeHortonworks
 
Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop Inside Analysis
 
Trafodion – an enterprise class sql based on hadoop
Trafodion – an enterprise class sql based on hadoopTrafodion – an enterprise class sql based on hadoop
Trafodion – an enterprise class sql based on hadoopKrishna-Kumar
 
2019 DSA 105 Introduction to Data Science Week 4
2019 DSA 105 Introduction to Data Science Week 42019 DSA 105 Introduction to Data Science Week 4
2019 DSA 105 Introduction to Data Science Week 4Ferdin Joe John Joseph PhD
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Mac Moore
 

Similaire à Ruby and R (20)

Evented programming
Evented programmingEvented programming
Evented programming
 
Python course in hyderabad
Python course in hyderabadPython course in hyderabad
Python course in hyderabad
 
Introduction to pig
Introduction to pigIntroduction to pig
Introduction to pig
 
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian FrankHP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
 
Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics?
 
Revolution Analytics Podcast
Revolution Analytics PodcastRevolution Analytics Podcast
Revolution Analytics Podcast
 
A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...
 
Reason To learn & use r
Reason To learn & use rReason To learn & use r
Reason To learn & use r
 
Mrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big DataMrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big Data
 
iKariera 2015
iKariera 2015iKariera 2015
iKariera 2015
 
Pilot Project Highlights: Ruby on Rails - November 2006
Pilot Project Highlights: Ruby on Rails - November 2006Pilot Project Highlights: Ruby on Rails - November 2006
Pilot Project Highlights: Ruby on Rails - November 2006
 
Helion meetup-2014
Helion meetup-2014Helion meetup-2014
Helion meetup-2014
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack Europe
 
Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop
 
Trafodion – an enterprise class sql based on hadoop
Trafodion – an enterprise class sql based on hadoopTrafodion – an enterprise class sql based on hadoop
Trafodion – an enterprise class sql based on hadoop
 
2019 DSA 105 Introduction to Data Science Week 4
2019 DSA 105 Introduction to Data Science Week 42019 DSA 105 Introduction to Data Science Week 4
2019 DSA 105 Introduction to Data Science Week 4
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache Hadoop
 
Pig programming is fun
Pig programming is funPig programming is fun
Pig programming is fun
 
Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015
 
HP and linux
HP and linuxHP and linux
HP and linux
 

Dernier

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 

Ruby and R