SlideShare a Scribd company logo
1 of 27
Download to read offline
Data-driven modeling
                                           APAM E4990


                                           Jake Hofman

                                          Columbia University


                                         January 30, 2012




Jake Hofman   (Columbia University)        Data-driven modeling   January 30, 2012   1 / 23
Outline



      1 Digit recognition



      2 Image classification



      3 Acquiring image data




Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   2 / 23
Digit recognition

          Classification is an supervised learning task by which we aim to
             predict the correct label for an example given its features




                                               ↓
                                  0 5 4 1 4 9
              e.g. determine which digit {0, 1, . . . , 9} is in depicted in each
                                         image


Jake Hofman    (Columbia University)    Data-driven modeling            January 30, 2012   3 / 23
Digit recognition

          Determine which digit {0, 1, . . . , 9} is in depicted in each image




Jake Hofman   (Columbia University)   Data-driven modeling          January 30, 2012   4 / 23
Images as arrays


               Grayscale images ↔ 2-d arrays of M × N pixel intensities




Jake Hofman   (Columbia University)   Data-driven modeling       January 30, 2012   5 / 23
Images as arrays

               Grayscale images ↔ 2-d arrays of M × N pixel intensities




          Represent each image as a “vector of pixels”, flattening the 2-d
                          array of pixels to a 1-d vector

Jake Hofman   (Columbia University)   Data-driven modeling       January 30, 2012   5 / 23
k-nearest neighbors classification
         k-nearest neighbors: memorize training examples, predict labels
                   using labels of the k closest training points




                          Intuition: nearby points have similar labels

Jake Hofman   (Columbia University)      Data-driven modeling            January 30, 2012   6 / 23
k-nearest neighbors classification




              Small k gives a complex boundary, large k results in coarse
                                     averaging


Jake Hofman    (Columbia University)   Data-driven modeling      January 30, 2012   7 / 23
k-nearest neighbors classification




                Evaluate performance on a held-out test set to assess
                                generalization error


Jake Hofman   (Columbia University)   Data-driven modeling      January 30, 2012   7 / 23
Digit recognition


       Simple digit classifer with k=1 nearest neighbors
          ./ classify_digits . py




Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   8 / 23
Outline



      1 Digit recognition



      2 Image classification



      3 Acquiring image data




Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   9 / 23
Image classification


                      Determine if an image is a landscape or headshot




                                            ↓                 ↓
                                       ’landscape’        ’headshot’

              Represent each image with a binned RGB intensity histogram




Jake Hofman    (Columbia University)         Data-driven modeling      January 30, 2012   10 / 23
Images as arrays
          Color images ↔ 3-d arrays of M × N × 3 RGB pixel intensities




       import matplotlib . image as mpimg
       I = mpimg . imread ( ' chairs . jpg ')

Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   11 / 23
Images as arrays
          Color images ↔ 3-d arrays of M × N × 3 RGB pixel intensities




       import matplotlib . image as mpimg
       I = mpimg . imread ( ' chairs . jpg ')


Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   11 / 23
Intensity histograms

        Disregard all spatial information, simply count pixels by intensities
               (e.g. lots of pixels with bright green and dark blue)




Jake Hofman   (Columbia University)   Data-driven modeling       January 30, 2012   12 / 23
Intensity histograms

                                How many bins for pixel intensities?




         Too many bins gives a noisy, overly complex representation of
                                  the data,
           while using too few bins results in an overly simple one


Jake Hofman   (Columbia University)        Data-driven modeling        January 30, 2012   13 / 23
Image classification

       Classify
           ./ classify_flickr . py 16 9
               flickr_headshot flickr_landscape




       Change in performance on test set with number of neighbors
       k      =   1,    accuracy      =   0.7125
       k      =   3,    accuracy      =   0.7425
       k      =   5,    accuracy      =   0.7725
       k      =   7,    accuracy      =   0.7650
       k      =   9,    accuracy      =   0.7500


Jake Hofman   (Columbia University)       Data-driven modeling   January 30, 2012   14 / 23
Outline



      1 Digit recognition



      2 Image classification



      3 Acquiring image data




Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   15 / 23
Simple screen scraping




       One-liner to download ESL digit data
          wget - Nr -- level =1 --no - parent http :// www -
            stat . stanford . edu /~ tibs / ElemStatLearn /
            datasets / zip . digits




Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   16 / 23
Simple screen scraping


       One-liner to scrape images from a webpage
          wget -O - http :// bit . ly / zxy0jN |
            tr ' ' ' ' " = ' 'n ' |
            egrep '^ http .*( png | jpg | gif ) ' |
            xargs wget




Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   17 / 23
Simple screen scraping


       One-liner to scrape images from a webpage
          wget -O - http :// bit . ly / zxy0jN |
            tr ' ' ' ' " = ' 'n ' |
            egrep '^ http .*( png | jpg | gif ) ' |
            xargs wget



              • get page source
              • translate quotes and = to newlines
              • match urls with image extensions
              • download qualifying images



Jake Hofman    (Columbia University)   Data-driven modeling   January 30, 2012   17 / 23
“cat flickr xargs wget”?




Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   18 / 23
Flickr API




Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   19 / 23
YQL: SELECT * FROM Internet1




                                   http://developer.yahoo.com/yql




              1
                  http://oreillynet.com/pub/e/1369
Jake Hofman       (Columbia University)      Data-driven modeling   January 30, 2012   20 / 23
YQL: Console




                       http://developer.yahoo.com/yql/console


Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   21 / 23
YQL + Python
       Python function for public YQL queries
        def yql_public ( query , env = False ):
            # build dictionary of GET parameters
            params = { 'q ': query , ' format ': ' json '}
            if env :
                params [ ' env '] = env

                 # escape query
                 query_str = urlencode ( params )

                 # fetch results
                 url = '% s ?% s ' % ( YQL_PUBLIC , query_str )
                 result = urlopen ( url )

                 # parse json and return
                 return json . load ( result )[ ' query ' ][ ' results ']
Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   22 / 23
YQL + Python + Flickr


       Fetch info for “interestingness” photos
          ./ simpleyql . py ' select * from flickr . photos
            . interestingness (20) where api_key = " ... " '



       Download thumbnails for photos tagged with “vivid”
          ./ download_flickr . py vivid 500 < api_key >




Jake Hofman   (Columbia University)   Data-driven modeling   January 30, 2012   23 / 23

More Related Content

Viewers also liked

Chapter 1 market & marketing
Chapter 1 market & marketingChapter 1 market & marketing
Chapter 1 market & marketingHo Cao Viet
 
Maherprofessional c vbio216
Maherprofessional c vbio216Maherprofessional c vbio216
Maherprofessional c vbio216Pat Maher
 
Verkiezing schaduwburgemeester dorpscafé malden ruud van gisteren d.d. 3 3-20...
Verkiezing schaduwburgemeester dorpscafé malden ruud van gisteren d.d. 3 3-20...Verkiezing schaduwburgemeester dorpscafé malden ruud van gisteren d.d. 3 3-20...
Verkiezing schaduwburgemeester dorpscafé malden ruud van gisteren d.d. 3 3-20...Rijksdienst voor Ondernemend Nederland
 
1172. su.-.pps_angelines
1172.  su.-.pps_angelines1172.  su.-.pps_angelines
1172. su.-.pps_angelinesfilipj2000
 
K state candidacy presentation
K state candidacy presentationK state candidacy presentation
K state candidacy presentationTeague Allen
 
Cost Reductions Process - Results
Cost Reductions Process - ResultsCost Reductions Process - Results
Cost Reductions Process - Resultsoferyuval
 
Successful Employee Engagement through Sharepoint OOTB tools
Successful Employee Engagement through Sharepoint OOTB toolsSuccessful Employee Engagement through Sharepoint OOTB tools
Successful Employee Engagement through Sharepoint OOTB toolsPrageeth Sandakalum
 
Towers of today
Towers of todayTowers of today
Towers of todayfilipj2000
 
Letter to arvind kejriwal dec 31 13 water issue
Letter to arvind kejriwal dec 31 13   water issueLetter to arvind kejriwal dec 31 13   water issue
Letter to arvind kejriwal dec 31 13 water issueMadhukar Varshney
 
PropSafe - property renting & mgmt solution
PropSafe - property renting & mgmt solutionPropSafe - property renting & mgmt solution
PropSafe - property renting & mgmt solutionRatnesh1979
 
Praias παραλίες
Praias   παραλίεςPraias   παραλίες
Praias παραλίεςfilipj2000
 
Internet Message
Internet MessageInternet Message
Internet Messagebearobo
 

Viewers also liked (20)

Chapter 1 market & marketing
Chapter 1 market & marketingChapter 1 market & marketing
Chapter 1 market & marketing
 
Orosz fotosok
Orosz fotosokOrosz fotosok
Orosz fotosok
 
Maherprofessional c vbio216
Maherprofessional c vbio216Maherprofessional c vbio216
Maherprofessional c vbio216
 
产品早期市场推广探路实践 by XDash
产品早期市场推广探路实践 by XDash产品早期市场推广探路实践 by XDash
产品早期市场推广探路实践 by XDash
 
Verkiezing schaduwburgemeester dorpscafé malden ruud van gisteren d.d. 3 3-20...
Verkiezing schaduwburgemeester dorpscafé malden ruud van gisteren d.d. 3 3-20...Verkiezing schaduwburgemeester dorpscafé malden ruud van gisteren d.d. 3 3-20...
Verkiezing schaduwburgemeester dorpscafé malden ruud van gisteren d.d. 3 3-20...
 
Creativity
CreativityCreativity
Creativity
 
Reuters2
Reuters2Reuters2
Reuters2
 
1172. su.-.pps_angelines
1172.  su.-.pps_angelines1172.  su.-.pps_angelines
1172. su.-.pps_angelines
 
K state candidacy presentation
K state candidacy presentationK state candidacy presentation
K state candidacy presentation
 
Cost Reductions Process - Results
Cost Reductions Process - ResultsCost Reductions Process - Results
Cost Reductions Process - Results
 
Successful Employee Engagement through Sharepoint OOTB tools
Successful Employee Engagement through Sharepoint OOTB toolsSuccessful Employee Engagement through Sharepoint OOTB tools
Successful Employee Engagement through Sharepoint OOTB tools
 
Towers of today
Towers of todayTowers of today
Towers of today
 
Letter to arvind kejriwal dec 31 13 water issue
Letter to arvind kejriwal dec 31 13   water issueLetter to arvind kejriwal dec 31 13   water issue
Letter to arvind kejriwal dec 31 13 water issue
 
PropSafe - property renting & mgmt solution
PropSafe - property renting & mgmt solutionPropSafe - property renting & mgmt solution
PropSafe - property renting & mgmt solution
 
Tearsofawoman
TearsofawomanTearsofawoman
Tearsofawoman
 
Praias παραλίες
Praias   παραλίεςPraias   παραλίες
Praias παραλίες
 
Grecia linda
Grecia lindaGrecia linda
Grecia linda
 
sibgrapi2015
sibgrapi2015sibgrapi2015
sibgrapi2015
 
Imp local
Imp localImp local
Imp local
 
Internet Message
Internet MessageInternet Message
Internet Message
 

More from jakehofman

Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2jakehofman
 
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1jakehofman
 
Modeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: NetworksModeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: Networksjakehofman
 
Modeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: ClassificationModeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: Classificationjakehofman
 
Modeling Social Data, Lecture 7: Model complexity and generalization
Modeling Social Data, Lecture 7: Model complexity and generalizationModeling Social Data, Lecture 7: Model complexity and generalization
Modeling Social Data, Lecture 7: Model complexity and generalizationjakehofman
 
Modeling Social Data, Lecture 6: Regression, Part 1
Modeling Social Data, Lecture 6: Regression, Part 1Modeling Social Data, Lecture 6: Regression, Part 1
Modeling Social Data, Lecture 6: Regression, Part 1jakehofman
 
Modeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at ScaleModeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at Scalejakehofman
 
Modeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in RModeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in Rjakehofman
 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Countingjakehofman
 
Modeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: OverviewModeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: Overviewjakehofman
 
Modeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive BayesModeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive Bayesjakehofman
 
Modeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at ScaleModeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at Scalejakehofman
 
Modeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case StudiesModeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case Studiesjakehofman
 
NYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social ScienceNYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social Sciencejakehofman
 
Computational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: ClassificationComputational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: Classificationjakehofman
 
Computational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: RegressionComputational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: Regressionjakehofman
 
Computational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online ExperimentsComputational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online Experimentsjakehofman
 
Computational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data WranglingComputational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data Wranglingjakehofman
 
Computational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIComputational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIjakehofman
 
Computational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part IComputational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part Ijakehofman
 

More from jakehofman (20)

Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
 
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
 
Modeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: NetworksModeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: Networks
 
Modeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: ClassificationModeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: Classification
 
Modeling Social Data, Lecture 7: Model complexity and generalization
Modeling Social Data, Lecture 7: Model complexity and generalizationModeling Social Data, Lecture 7: Model complexity and generalization
Modeling Social Data, Lecture 7: Model complexity and generalization
 
Modeling Social Data, Lecture 6: Regression, Part 1
Modeling Social Data, Lecture 6: Regression, Part 1Modeling Social Data, Lecture 6: Regression, Part 1
Modeling Social Data, Lecture 6: Regression, Part 1
 
Modeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at ScaleModeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at Scale
 
Modeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in RModeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in R
 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Counting
 
Modeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: OverviewModeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: Overview
 
Modeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive BayesModeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive Bayes
 
Modeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at ScaleModeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at Scale
 
Modeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case StudiesModeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case Studies
 
NYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social ScienceNYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social Science
 
Computational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: ClassificationComputational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: Classification
 
Computational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: RegressionComputational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: Regression
 
Computational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online ExperimentsComputational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online Experiments
 
Computational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data WranglingComputational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data Wrangling
 
Computational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIComputational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part II
 
Computational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part IComputational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part I
 

Recently uploaded

An Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptxAn Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptxCeline George
 
Financial Accounting IFRS, 3rd Edition-dikompresi.pdf
Financial Accounting IFRS, 3rd Edition-dikompresi.pdfFinancial Accounting IFRS, 3rd Edition-dikompresi.pdf
Financial Accounting IFRS, 3rd Edition-dikompresi.pdfMinawBelay
 
Capitol Tech Univ Doctoral Presentation -May 2024
Capitol Tech Univ Doctoral Presentation -May 2024Capitol Tech Univ Doctoral Presentation -May 2024
Capitol Tech Univ Doctoral Presentation -May 2024CapitolTechU
 
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdfDanh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdfQucHHunhnh
 
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...Nguyen Thanh Tu Collection
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽中 央社
 
Exploring Gemini AI and Integration with MuleSoft | MuleSoft Mysore Meetup #45
Exploring Gemini AI and Integration with MuleSoft | MuleSoft Mysore Meetup #45Exploring Gemini AI and Integration with MuleSoft | MuleSoft Mysore Meetup #45
Exploring Gemini AI and Integration with MuleSoft | MuleSoft Mysore Meetup #45MysoreMuleSoftMeetup
 
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxslides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxCapitolTechU
 
Features of Video Calls in the Discuss Module in Odoo 17
Features of Video Calls in the Discuss Module in Odoo 17Features of Video Calls in the Discuss Module in Odoo 17
Features of Video Calls in the Discuss Module in Odoo 17Celine George
 
ppt your views.ppt your views of your college in your eyes
ppt your views.ppt your views of your college in your eyesppt your views.ppt your views of your college in your eyes
ppt your views.ppt your views of your college in your eyesashishpaul799
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjMohammed Sikander
 
How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17Celine George
 
IATP How-to Foreign Travel May 2024.pdff
IATP How-to Foreign Travel May 2024.pdffIATP How-to Foreign Travel May 2024.pdff
IATP How-to Foreign Travel May 2024.pdff17thcssbs2
 
Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17Celine George
 
How to Manage Closest Location in Odoo 17 Inventory
How to Manage Closest Location in Odoo 17 InventoryHow to Manage Closest Location in Odoo 17 Inventory
How to Manage Closest Location in Odoo 17 InventoryCeline George
 
Discover the Dark Web .pdf InfosecTrain
Discover the Dark Web .pdf  InfosecTrainDiscover the Dark Web .pdf  InfosecTrain
Discover the Dark Web .pdf InfosecTraininfosec train
 
Behavioral-sciences-dr-mowadat rana (1).pdf
Behavioral-sciences-dr-mowadat rana (1).pdfBehavioral-sciences-dr-mowadat rana (1).pdf
Behavioral-sciences-dr-mowadat rana (1).pdfaedhbteg
 

Recently uploaded (20)

An Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptxAn Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptx
 
Financial Accounting IFRS, 3rd Edition-dikompresi.pdf
Financial Accounting IFRS, 3rd Edition-dikompresi.pdfFinancial Accounting IFRS, 3rd Edition-dikompresi.pdf
Financial Accounting IFRS, 3rd Edition-dikompresi.pdf
 
Capitol Tech Univ Doctoral Presentation -May 2024
Capitol Tech Univ Doctoral Presentation -May 2024Capitol Tech Univ Doctoral Presentation -May 2024
Capitol Tech Univ Doctoral Presentation -May 2024
 
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdfDanh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
 
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
 
Exploring Gemini AI and Integration with MuleSoft | MuleSoft Mysore Meetup #45
Exploring Gemini AI and Integration with MuleSoft | MuleSoft Mysore Meetup #45Exploring Gemini AI and Integration with MuleSoft | MuleSoft Mysore Meetup #45
Exploring Gemini AI and Integration with MuleSoft | MuleSoft Mysore Meetup #45
 
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxslides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
 
Operations Management - Book1.p - Dr. Abdulfatah A. Salem
Operations Management - Book1.p  - Dr. Abdulfatah A. SalemOperations Management - Book1.p  - Dr. Abdulfatah A. Salem
Operations Management - Book1.p - Dr. Abdulfatah A. Salem
 
“O BEIJO” EM ARTE .
“O BEIJO” EM ARTE                       .“O BEIJO” EM ARTE                       .
“O BEIJO” EM ARTE .
 
Features of Video Calls in the Discuss Module in Odoo 17
Features of Video Calls in the Discuss Module in Odoo 17Features of Video Calls in the Discuss Module in Odoo 17
Features of Video Calls in the Discuss Module in Odoo 17
 
ppt your views.ppt your views of your college in your eyes
ppt your views.ppt your views of your college in your eyesppt your views.ppt your views of your college in your eyes
ppt your views.ppt your views of your college in your eyes
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
 
How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17
 
IATP How-to Foreign Travel May 2024.pdff
IATP How-to Foreign Travel May 2024.pdffIATP How-to Foreign Travel May 2024.pdff
IATP How-to Foreign Travel May 2024.pdff
 
Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17
 
How to Manage Closest Location in Odoo 17 Inventory
How to Manage Closest Location in Odoo 17 InventoryHow to Manage Closest Location in Odoo 17 Inventory
How to Manage Closest Location in Odoo 17 Inventory
 
Discover the Dark Web .pdf InfosecTrain
Discover the Dark Web .pdf  InfosecTrainDiscover the Dark Web .pdf  InfosecTrain
Discover the Dark Web .pdf InfosecTrain
 
Word Stress rules esl .pptx
Word Stress rules esl               .pptxWord Stress rules esl               .pptx
Word Stress rules esl .pptx
 
Behavioral-sciences-dr-mowadat rana (1).pdf
Behavioral-sciences-dr-mowadat rana (1).pdfBehavioral-sciences-dr-mowadat rana (1).pdf
Behavioral-sciences-dr-mowadat rana (1).pdf
 

Data-driven modeling: Lecture 02

  • 1. Data-driven modeling APAM E4990 Jake Hofman Columbia University January 30, 2012 Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 1 / 23
  • 2. Outline 1 Digit recognition 2 Image classification 3 Acquiring image data Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 2 / 23
  • 3. Digit recognition Classification is an supervised learning task by which we aim to predict the correct label for an example given its features ↓ 0 5 4 1 4 9 e.g. determine which digit {0, 1, . . . , 9} is in depicted in each image Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 3 / 23
  • 4. Digit recognition Determine which digit {0, 1, . . . , 9} is in depicted in each image Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 4 / 23
  • 5. Images as arrays Grayscale images ↔ 2-d arrays of M × N pixel intensities Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 5 / 23
  • 6. Images as arrays Grayscale images ↔ 2-d arrays of M × N pixel intensities Represent each image as a “vector of pixels”, flattening the 2-d array of pixels to a 1-d vector Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 5 / 23
  • 7. k-nearest neighbors classification k-nearest neighbors: memorize training examples, predict labels using labels of the k closest training points Intuition: nearby points have similar labels Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 6 / 23
  • 8. k-nearest neighbors classification Small k gives a complex boundary, large k results in coarse averaging Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 7 / 23
  • 9. k-nearest neighbors classification Evaluate performance on a held-out test set to assess generalization error Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 7 / 23
  • 10. Digit recognition Simple digit classifer with k=1 nearest neighbors ./ classify_digits . py Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 8 / 23
  • 11. Outline 1 Digit recognition 2 Image classification 3 Acquiring image data Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 9 / 23
  • 12. Image classification Determine if an image is a landscape or headshot ↓ ↓ ’landscape’ ’headshot’ Represent each image with a binned RGB intensity histogram Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 10 / 23
  • 13. Images as arrays Color images ↔ 3-d arrays of M × N × 3 RGB pixel intensities import matplotlib . image as mpimg I = mpimg . imread ( ' chairs . jpg ') Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 11 / 23
  • 14. Images as arrays Color images ↔ 3-d arrays of M × N × 3 RGB pixel intensities import matplotlib . image as mpimg I = mpimg . imread ( ' chairs . jpg ') Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 11 / 23
  • 15. Intensity histograms Disregard all spatial information, simply count pixels by intensities (e.g. lots of pixels with bright green and dark blue) Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 12 / 23
  • 16. Intensity histograms How many bins for pixel intensities? Too many bins gives a noisy, overly complex representation of the data, while using too few bins results in an overly simple one Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 13 / 23
  • 17. Image classification Classify ./ classify_flickr . py 16 9 flickr_headshot flickr_landscape Change in performance on test set with number of neighbors k = 1, accuracy = 0.7125 k = 3, accuracy = 0.7425 k = 5, accuracy = 0.7725 k = 7, accuracy = 0.7650 k = 9, accuracy = 0.7500 Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 14 / 23
  • 18. Outline 1 Digit recognition 2 Image classification 3 Acquiring image data Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 15 / 23
  • 19. Simple screen scraping One-liner to download ESL digit data wget - Nr -- level =1 --no - parent http :// www - stat . stanford . edu /~ tibs / ElemStatLearn / datasets / zip . digits Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 16 / 23
  • 20. Simple screen scraping One-liner to scrape images from a webpage wget -O - http :// bit . ly / zxy0jN | tr ' ' ' ' " = ' 'n ' | egrep '^ http .*( png | jpg | gif ) ' | xargs wget Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 17 / 23
  • 21. Simple screen scraping One-liner to scrape images from a webpage wget -O - http :// bit . ly / zxy0jN | tr ' ' ' ' " = ' 'n ' | egrep '^ http .*( png | jpg | gif ) ' | xargs wget • get page source • translate quotes and = to newlines • match urls with image extensions • download qualifying images Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 17 / 23
  • 22. “cat flickr xargs wget”? Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 18 / 23
  • 23. Flickr API Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 19 / 23
  • 24. YQL: SELECT * FROM Internet1 http://developer.yahoo.com/yql 1 http://oreillynet.com/pub/e/1369 Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 20 / 23
  • 25. YQL: Console http://developer.yahoo.com/yql/console Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 21 / 23
  • 26. YQL + Python Python function for public YQL queries def yql_public ( query , env = False ): # build dictionary of GET parameters params = { 'q ': query , ' format ': ' json '} if env : params [ ' env '] = env # escape query query_str = urlencode ( params ) # fetch results url = '% s ?% s ' % ( YQL_PUBLIC , query_str ) result = urlopen ( url ) # parse json and return return json . load ( result )[ ' query ' ][ ' results '] Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 22 / 23
  • 27. YQL + Python + Flickr Fetch info for “interestingness” photos ./ simpleyql . py ' select * from flickr . photos . interestingness (20) where api_key = " ... " ' Download thumbnails for photos tagged with “vivid” ./ download_flickr . py vivid 500 < api_key > Jake Hofman (Columbia University) Data-driven modeling January 30, 2012 23 / 23