Accentuate Us!
  Michael Schade
 December 2, 2010
http://accentuate.us/talks
Keyboard Input
• Many languages lack appropriate input methods
• Electronic texts often entered as plain ASCII
   o Transliteration       Cherokee   ᎦᎸᏉᏗᏳ → galvquodiyu
   o Omitting diacritics   Lingala    likɔngá → likonga
   o Ad hoc approaches     Irish      béal → be/al
• Diacritics matter!
• Omission leads to ambiguities, misunderstandings
   o leite vs. léite
Statistical Machine Learning
• Classification problem
• Machine learning
• Never-before-seen words
   o French: "cera" vs. "cerc," "cabl" vs. "cabo"
   o Under-resourced languages
• 114 trained languages!
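The classification idea can be sketched in a few lines of Python. This is a toy illustration only, not the Accentuate.us model: the class name `TrigramLifter`, the single 3-character window centered on each character, and the fallback to per-character counts are all assumptions made for the example.

```python
import unicodedata
from collections import Counter, defaultdict


def strip_diacritics(text: str) -> str:
    """Remove combining marks, e.g. 'léite' -> 'leite'."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(c for c in decomposed if not unicodedata.combining(c))
    return unicodedata.normalize("NFC", kept)


class TrigramLifter:
    """Hypothetical per-character classifier keyed on a 3-character window."""

    def __init__(self) -> None:
        # window of plain characters -> counts of the accented character seen
        self.by_window = defaultdict(Counter)
        # single plain character -> counts of the accented character seen
        self.by_char = defaultdict(Counter)

    def train(self, accented_text: str) -> None:
        target = unicodedata.normalize("NFC", accented_text)
        plain = strip_diacritics(target)
        # Assumption: every accented character has a precomposed form,
        # so plain and target align one-to-one.
        if len(plain) != len(target):
            raise ValueError("text contains characters this sketch cannot align")
        padded = f" {plain} "
        for i, gold in enumerate(target):
            window = padded[i:i + 3]  # previous, current, next character
            self.by_window[window][gold] += 1
            self.by_char[plain[i]][gold] += 1

    def lift(self, plain_text: str) -> str:
        padded = f" {plain_text} "
        out = []
        for i, ch in enumerate(plain_text):
            window = padded[i:i + 3]
            # Unseen window: fall back to the most common form of the character.
            counts = self.by_window.get(window) or self.by_char.get(ch)
            out.append(counts.most_common(1)[0][0] if counts else ch)
        return "".join(out)
```

After training on the Irish word béal, calling `lift("beal")` on this sketch returns béal; a word never seen in training is still handled character by character through the window and fallback counts.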
API
• Protocol: JSON
• Calls
  o langs
  o lift
  o feedback
• Sample Call
   o   { "call": "charlifter.lift"
         , "lang": "ht"
         , "text": "Bon, la fe sa apre demen pito, le la we mwen andey."
         , "locale": "ht"
       }
• Full documentation at http://accentuate.us/api
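A minimal sketch of building the sample call in Python, using only the standard library. The helper names `build_lift_call` and `build_request` are hypothetical, the URL scheme and path are assumptions (only the host `ht.api.accentuate.us:8080` appears in the talk), and the request is constructed but never sent.

```python
import json
import urllib.request


def build_lift_call(lang: str, text: str, locale: str) -> bytes:
    """Serialize a charlifter.lift call as compact UTF-8 JSON."""
    payload = {
        "call": "charlifter.lift",
        "lang": lang,
        "text": text,
        "locale": locale,
    }
    # Compact separators reproduce the body layout shown in the talk.
    return json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def build_request(lang: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request for the given language."""
    # Assumption: plain HTTP, root path, one host name per language.
    url = f"http://{lang}.api.accentuate.us:8080/"
    headers = {
        "Content-Type": "application/json; charset=utf-8",
        # The version after the slash is a placeholder for a real client version.
        "User-Agent": "Accentuate.us/0.9b3 example-client",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


body = build_lift_call("ht", "Bon, la fe sa apre demen pito, le la we mwen andey.", "ht")
request = build_request("ht", body)
```

For the Haitian Creole sample text, the compact body comes to 113 bytes, which matches the `Content-Length` in the HTTP slides.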
Service Architecture
[Diagram: Clients → Load-Balancing Proxy → API Servers]
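One way to picture the proxy's job is a per-language pool of API servers. The sketch below assumes round-robin selection, which the talk does not specify; the class name `LanguageBalancer` and the server names are hypothetical.

```python
from itertools import cycle
from typing import Dict, List


class LanguageBalancer:
    """Round-robin over the API servers registered for each language."""

    def __init__(self, servers_by_lang: Dict[str, List[str]]) -> None:
        # One endless iterator per language that has at least one server.
        self._pools = {
            lang: cycle(servers)
            for lang, servers in servers_by_lang.items()
            if servers
        }

    def pick(self, lang: str) -> str:
        """Return the next server for lang, or raise if none is registered."""
        pool = self._pools.get(lang)
        if pool is None:
            raise LookupError(f"no API server registered for language {lang!r}")
        return next(pool)


# Hypothetical server names, for illustration only.
balancer = LanguageBalancer({
    "ht": ["ht-1.example.org", "ht-2.example.org"],
    "fr": ["fr-1.example.org"],
})
```

Because the server list lives in the proxy rather than in DNS, adding or removing a server in this design is a change to one mapping.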
HTTP Communication (Proxy)
Cache-Control: no-cache
Connection: keep-alive
Pragma: no-cache
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Accept-Encoding: gzip,deflate
Accept-Language: en-us,en;q=0.5
Host: ht.api.accentuate.us:8080
User-Agent: Accentuate.us/0.9b3 Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.1
Content-Length: 113
Content-Type: application/json; charset=utf-8
Keep-Alive: 115

{"call":"charlifter.lift","lang":"ht","text":"Bon, la fe sa apre demen pito, le la we mwen andey.","locale":"ht"}
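A rough sketch of how a proxy could route on headers alone, without parsing the JSON body. Taking the language from the first label of `Host` is an assumption based on the host name in the request above, and the function name `route_info` is hypothetical.

```python
import re
from typing import Mapping, Tuple

# Clients identify themselves with a User-Agent starting "Accentuate.us/<version>".
UA_PATTERN = re.compile(r"^Accentuate\.us/(\S+)")


def route_info(headers: Mapping[str, str]) -> Tuple[str, str]:
    """Return (language, client version) read from the request headers.

    Assumes header names are spelled exactly as in the talk; a real proxy
    would compare them case-insensitively.
    """
    match = UA_PATTERN.match(headers.get("User-Agent", ""))
    if match is None:
        raise ValueError("User-Agent must start with 'Accentuate.us/<version>'")
    hostname = headers.get("Host", "").split(":", 1)[0]  # drop the port
    lang = hostname.split(".", 1)[0]                      # first DNS label
    if not lang:
        raise ValueError("Host header is missing or empty")
    return lang, match.group(1)
```

For the request above, this sketch yields the language `ht` and the client version `0.9b3`.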
HTTP Communication (API)
Cache-Control: no-cache
Connection: close
Pragma: no-cache
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Accept-Encoding: gzip,deflate
Accept-Language: en-us,en;q=0.5
Host: ht
User-Agent: Accentuate.us/distribution
Content-Length: 113
Content-Type: application/json; charset=utf-8

{"call":"charlifter.lift","lang":"ht","text":"Bon, la fe sa apre demen pito, le la we mwen andey.","locale":"ht"}
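Comparing the two requests, the proxy changes only a handful of headers before forwarding. The sketch below reproduces exactly those differences; the function name `forward_headers` is hypothetical, and the real proxy may do more than this.

```python
from typing import Dict, Mapping


def forward_headers(client_headers: Mapping[str, str], lang: str) -> Dict[str, str]:
    """Rewrite client headers into the form the API server receives."""
    forwarded = dict(client_headers)
    # The proxy-to-API connection is not kept alive.
    forwarded.pop("Keep-Alive", None)
    forwarded["Connection"] = "close"
    # Mask the public host name and the client's real User-Agent.
    forwarded["Host"] = lang
    forwarded["User-Agent"] = "Accentuate.us/distribution"
    return forwarded
```

Since the API server also sees the proxy's IP address rather than the client's, none of UA, Host, or IP identifies the original client.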
Demos
(and a sneak preview of 1.0!)
Thank You!

Accentuate Us!: Lightning Talk


Editor's notes

  1. Name
     19-year-old entrepreneur, student at Saint Louis University
     Co-founded Spearhead with mom, Accentuate.us with Kevin Scannell of SLU

  2. Expanded, 45-minute version online
     Going to start with background, architecture, and finally some demos

  3. - 90% loss!
     - Irrevocable loss
     - Each language is a repository of the culture, traditions, and world view
     - Akin to extinction of animal or plant species
     - They're looking to the Internet and technology for that.
     So, let's help!

  4. - Even Unicode-encoded languages often lack appropriate input methods
     - Identified problem: keyboard input

  5. - Every character that allows a diacritic is a classification problem
     - Trained with a corpus of texts with diacritics
     - Never-before-seen words: statistics of 3-character sequences in a neighborhood of the character in question

  6. Simple: only three calls
     - Langs: get languages & localizations
     - Lift: accentuate text (legacy)
     - Feedback: add to corpora, improve models

  7. Clients send requests to the load-balancing proxy ("distribution center")
     Proxy
         - Load balances across same-language API servers
         - Allows quick management of servers (no DNS propagation time!)
         - Increases privacy (masks real UA, IP)
     API servers run by language communities!
         - Makes keeping it free doable
         - Helps learn technology
         - Distributed to language hot spots (French servers for French-using zones, etc.)

  8. Firefox API request
     Blue text is most important to proxy server!
     Information in headers so we don't unpack body
     UA must start with "Accentuate.us/version"
         - Analytics
         - Mismatch resolution
         - Spam prevention

  9. Accentuate the differences: API server receives less information!
     Client is not identifiable based on:
         - UA
         - Host
         - IP
     Blue parts are what is different from API request

  10. Emacs users: stand your ground!
      Version 1.0: early alpha; will
         - Grab context words
         - Modularize processing