Story of reCAPTCHA

Naga Chokkanathan
Remember This?
                                               • CAPTCHA
                                                 –   Completely
                                                 –   Automated
                                                 –   Public
                                                 –   Turing test to tell
                                                 –   Computers and
                                                 –   Humans
                                                 –   Apart
                                                             • Security for the website, Agreed
                                                                      • But for the real users?
                                                                               • BORING task
                                                                               • Waste of time


          Story of reCAPTCHA                                                          www.crmit.com
© Copyright 2013 CRMIT. All rights reserved.
CAPTCHA
           • Yahoo! popularized it first
           • Later, almost every website started using CAPTCHA to
             avoid automated attacks
           • Effective : the word / image puzzles are designed to be
             easy for people and hard for programs
           • But, it is a waste of time too
                    – Assuming you spend 10 seconds on a CAPTCHA
                    – Multiplied by 200 Million CAPTCHAs every day
                    – More than 500,000 hours spent on CAPTCHAs every day
           • Can something be done about this? (1)
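The estimate above can be checked with a few lines of arithmetic. This is a minimal sketch using only the two figures from the slide (10 seconds per CAPTCHA, 200 million CAPTCHAs per day); the function and constant names are illustrative.

```python
# Figures taken from the slide; both are rough assumptions, not measurements.
SECONDS_PER_CAPTCHA = 10
CAPTCHAS_PER_DAY = 200_000_000


def hours_spent_per_day(seconds_each: int, count: int) -> float:
    """Total human time spent per day, converted from seconds to hours."""
    return seconds_each * count / 3600


hours = hours_spent_per_day(SECONDS_PER_CAPTCHA, CAPTCHAS_PER_DAY)
print(f"{hours:,.0f} hours per day")  # roughly 555,556 hours per day
```

At these figures the total is well over half a million hours of human effort every day.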


Another Problem
           • Digitizing Books
           • Process:
                    – Stage 1
                            • Scan
                            • Convert to image
                            • Save
                    – Stage 2
                            • Use OCR to convert
                              images to text
                            • Searchable Text




OCR
           •      Optical Character Recognition
           •      Wonderful technology
           •      But not always reliable
           •      Especially with old text (due to old typefaces,
                  damage, stains, etc.)




           • Can something be done about this? (2)

Possible Solutions
           • Manual Corrections
                    – Near Impossible
                    – VERY Expensive
           • Using multiple OCR Programs
                    – They will still make mistakes
                    – But not the same mistakes
                    – Hopefully!
           • Can something be done about this? (3)




Crowd Sourcing
           • Assume each book contains 25,000 words
                    –    Can we split them among 25 people, each correcting 1000 words?
                    –    Or 50 people, each 500 words?
                    –    Or 100 people, each 250 words?
                    –    Or 2500 people, each 10 words?
                    –    Or 25000 people, each 1 word?
           • Sounds Stupid?
                    – Think again!
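The splits listed above follow from dividing the assumed book length by the number of volunteers. A small sketch, with the 25,000-word book taken from the slide and the function name chosen here for illustration:

```python
BOOK_WORDS = 25_000  # assumed book length, as on the slide


def words_per_person(total_words: int, people: int) -> int:
    """Words each person must correct (rounded up so no word is left over)."""
    return -(-total_words // people)  # ceiling division


for people in (25, 50, 100, 2_500, 25_000):
    print(f"{people:>6} people -> {words_per_person(BOOK_WORDS, people)} words each")
```

The more people take part, the smaller each person's share becomes, down to a single word.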




Dr. Luis von Ahn




           •      Associate Professor @ Carnegie Mellon University
           •      Co-coined the term CAPTCHA
           •      Pioneer in the field of crowdsourcing
           •      Founder of the company reCAPTCHA (later acquired
                  by Google)

reCAPTCHA




reCAPTCHA Process
           • Step 1 : Using multiple OCR Programs
                    – Accept Matching Words
                    – Use Dictionary
                    – Flag “Problematic” Words
           • Step 2 : reCAPTCHA
                    – Millions of users on various websites fill reCAPTCHA forms
                            • Proving they are not robots
                            • Proofreading text, one word at a time
                    – Entries from different users are compared before the final word is accepted
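The two steps above can be sketched in code. This is an illustrative sketch only, not the actual reCAPTCHA implementation: the function names, the rule that all OCR programs must agree, and the `min_votes` threshold are assumptions made for the example.

```python
from collections import Counter
from typing import List, Optional, Set, Tuple


def reconcile_ocr(readings: List[str], dictionary: Set[str]) -> Tuple[Optional[str], bool]:
    """Step 1: accept a word only if every OCR program agrees and the
    word is in the dictionary; otherwise flag it as problematic.

    Returns (accepted_word, flagged).
    """
    if len(set(readings)) == 1 and readings[0].lower() in dictionary:
        return readings[0], False
    return None, True


def resolve_by_crowd(entries: List[str], min_votes: int = 3) -> Optional[str]:
    """Step 2: compare what users typed for a flagged word and accept the
    most common entry once it has at least `min_votes` (assumed threshold).
    """
    counts = Counter(entry.strip().lower() for entry in entries)
    if not counts:
        return None
    word, votes = counts.most_common(1)[0]
    return word if votes >= min_votes else None


# Two hypothetical OCR programs disagree, so the word is flagged...
print(reconcile_ocr(["morning", "rnorning"], {"morning"}))
# ...and later resolved from what several users typed.
print(resolve_by_crowd(["morning", "Morning", "morning ", "rnorning"]))
```

Normalizing case and whitespace before counting keeps trivially different entries from splitting the vote.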




How It Works

                   Flagged Word                          Control Word
                                                       (Real CAPTCHA)




           Remember “25000 people, Proof Reading 1 Word at a time”?
                          Not “Stupid” Anymore!
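The pairing of a control word with a flagged word can be sketched as follows. This is a hedged illustration of the idea on the slide, not Google's code: the `Challenge` and `VoteStore` classes and `check_response` are hypothetical names. The control word has a known answer and decides whether the user passes; only then is the user's reading of the flagged word recorded as a vote.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Challenge:
    control_answer: str   # known answer (the "real CAPTCHA")
    flagged_word_id: str  # identifier of the word OCR could not read


@dataclass
class VoteStore:
    votes: Dict[str, List[str]] = field(default_factory=dict)

    def record(self, word_id: str, reading: str) -> None:
        self.votes.setdefault(word_id, []).append(reading)


def check_response(challenge: Challenge, control_typed: str,
                   flagged_typed: str, store: VoteStore) -> bool:
    """Pass the user only if the control word is right; if so, keep the
    reading of the flagged word as one vote toward its final text."""
    if control_typed.strip().lower() != challenge.control_answer.lower():
        return False  # failed the known word: discard the other reading
    store.record(challenge.flagged_word_id, flagged_typed.strip().lower())
    return True


store = VoteStore()
challenge = Challenge(control_answer="overlooks", flagged_word_id="w1")
print(check_response(challenge, "Overlooks", "inquiry", store))  # True
print(store.votes)  # {'w1': ['inquiry']}
```

Because the user cannot tell which word is the control, an honest attempt at both is the easiest way to pass.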


A Few Statistics
           • 100M+ reCAPTCHAs every day
           • 96000+ Websites
                    – Most major websites use it
                            • Facebook, Twitter, CNN, etc.
           • Security concerns exist!




What We Can Do
           • Use reCAPTCHA instead of CAPTCHA on your
             websites, wherever required
                    – Registration forms, blogs, forums, etc.
                    – Easy-to-use widgets
           • Be proud when filling a reCAPTCHA form
                    – You are helping Google preserve books ☺




Applying Crowd Sourcing
           • Can it solve some of your existing problems?




References, Image Credits
           •      https://www.youtube.com/watch?v=VoybhowC4LE
           •      http://www.nytimes.com/2011/03/29/science/29recaptcha.html?_r=1&
           •      http://techie-buzz.com/tech-news/recaptcha-crowdsourcing-ocr-google-books.html
           •      http://www.google.com/recaptcha
           •      http://drupal.org/project/captcha
           •      http://www.captcha.net/
           •      http://www.brothersoft.com/cuneiform-ocr-4384.html
           •      http://www.compzets.com/view-upload.php?id=166&action=view
           •      http://en.wikipedia.org/




Thank you




