Sheldon Kreger
Web Engineer
Five Algorithms Every Web Developer Can Use and
Understand
Make state-of-the-art algorithms
accessible and discoverable by
everyone.
Sample algorithms
● Text Analysis: summarizer, sentence tagger, profanity detection
● Machine Learning: digit recognizer, recommendation engines
● Web: crawler, scraper, PageRank, emailer, HTML to text
● Computer Vision: image similarity, face detection, smile detection
● Audio & Video: speech recognition, sound filters, file conversions
● Computation: linear regression, spike detection, Fourier filter
● Graph: traveling salesman, maze generator, theta star
● Utilities: parallel for-each, geographic distance, email validator
A marketplace for algorithms...
We host algorithms
Anyone can turn their algorithms into scalable web services
Typical users: scientists, academics, domain experts
We make them discoverable
Anyone can use and integrate these algorithms into their solutions
Typical users: businesses, data scientists, app developers, IoT makers
We make them monetizable
Users of algorithms pay for algorithms they use
Typical scenarios: heavy-load use cases with large user base
+ CLIENTS
Why?
Create something bigger
Easily combine algorithms like building blocks, regardless of language
Growing Catalogue of Algorithms
New algorithms every day, made usable by software developers
Make applications smarter
Smarter algorithms = cooler toys
The Five Algorithms
- Sentiment Analysis
- Language Detection
- PageRank
- Nudity Detection
- Term Frequency-Inverse Document Frequency (TF-IDF)
Sentiment Analysis - Practical Applications
Businesses frequently seek consumer feedback
on the quality of their products, and large
volumes of reviews take too much time to
read manually.
Sentiment data can also be used in various
forecasting applications, such as
political elections.
Sentiment Analysis - The Math
Basic sentiment analysis uses natural language
processing (NLP), via a “bag of words” model, to
spot keywords that signal strong emotion.
Once spotted, these keywords are used to
classify a document as positive, negative, or
neutral.
Statements can be dual in nature, such as “I
loved the food, BUT hated the service”.
Separating the two sentiments requires more
advanced algorithms.
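The keyword-spotting idea can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm used by any particular service: the word lists and the function name `classify_sentiment` are hypothetical, and a real lexicon would be far larger.

```python
import re

# Hypothetical, tiny keyword lists; a real sentiment lexicon is much larger.
POSITIVE = {"love", "loved", "great", "excellent", "good"}
NEGATIVE = {"hate", "hated", "terrible", "awful", "bad"}

def classify_sentiment(text):
    """Classify text by counting positive and negative keywords (bag of words)."""
    # Word order is ignored: the text is reduced to a bag of lowercase words.
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With these word lists, the dual statement “I loved the food, BUT hated the service” scores one positive and one negative keyword and comes out neutral, which shows why mixed statements need a more advanced approach.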
PageRank - Practical Applications
The assumption is that the more inbound links
a page has across the web, the more valid its
content.
The most famous application of PageRank is the
Google search engine. Its initial success was
based largely on the success of PageRank.
PageRank is not limited to web pages. Any
data that can be modeled as a directed graph
can be ranked with PageRank.
PageRank - Graph Terminology
Node (vertex): Item in the graph.
Edge: Relationship between two nodes.
Directionality: Property of an edge indicating
which way the relationship points.
PageRank - The Math
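A minimal sketch of the iterative computation commonly used for PageRank, written against the graph terms above. It assumes a damping factor of 0.85 and spreads the rank of pages with no outbound links evenly over all pages; the function name, parameters, and graph layout are illustrative and not taken from the talk.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank nodes of a directed graph given as {node: [nodes it links to]}."""
    # Collect every node, including ones that only appear as link targets.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)

    # Start with rank shared equally among all nodes.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node keeps a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node passes its rank along its outbound edges.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a node with no outbound edges shares with everyone.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

For example, with `{"a": ["c"], "b": ["c"], "c": ["a"]}`, node `c` has two inbound links and ends up with the highest rank, while `b`, which nothing links to, ends up with the lowest.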
Nudity Detection - Practical Applications
Nudity detection algorithms minimize the need for manual moderation and deletion
of malicious content.
In a CMS, this algorithm can help prevent pornography from being uploaded by
users.
Nudity Detection - The Math
1. Detect skin-colored pixels in the image.
2. Locate skin regions based on the detected pixels.
3. Detect a face in the image.
4. Calculate the ratio of skin-toned to non-skin-toned pixels in the image, taking into
account the size of the face.
5. Classify the image as nude or not.
More information at: https://algorithmia.com/algorithms/sfw/NudityDetection
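Steps 1, 4, and 5 can be sketched as follows. This is a rough illustration under stated assumptions, not the algorithm at the link above: the RGB skin test is a commonly cited rule of thumb, the 0.5 threshold is hypothetical, the ratio here is skin pixels over all pixels, and region finding and face detection (steps 2 and 3) are left out entirely.

```python
def is_skin(r, g, b):
    """Rule-of-thumb RGB skin test (an assumption, not the linked algorithm)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)

def skin_ratio(pixels):
    """Fraction of (r, g, b) pixels that pass the skin test."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_skin(*pixel) for pixel in pixels) / len(pixels)

def looks_nude(pixels, threshold=0.5):
    """Classify by skin ratio alone; the threshold is a hypothetical value."""
    return skin_ratio(pixels) > threshold
```

A pixel-ratio test like this misclassifies close-up portraits, which is why the steps above also take the size of the detected face into account.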
Nudity Detection - Vanilla PHP
Nudity Detection PHP
TF/IDF - Practical Applications
- Keyword extraction is used in search engines and content
categorization algorithms.
- Creates great content recommendations!
- https://drupal.org/project/algorithmia
- https://wordpress.org/plugins/algorithmia
- https://algorithmia.com/recommends
TF/IDF - The Math
TF-IDF computes a weight that reflects the
importance of a term inside a document by
comparing its frequency in that document with
its frequency across the document set.
The more often a term appears in a document,
and the rarer it is across the set, the higher
its importance becomes.
Thanks toothpastefordinner.com for the comic.
TF/IDF - The Math
Assume you have a 100-word blog post with the word "JavaScript" in it 5 times.
Term Frequency = 5/100 = 0.05
Also assume your entire collection of blog posts has 10,000 documents, and the
word "JavaScript" appears at least once in 100 of these.
Inverse Document Frequency = log10(10,000/100) = 2
For this document, this gives us the score:
TF-IDF = 0.05 * 2 = 0.1
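The same arithmetic as a short Python sketch. It assumes a base-10 logarithm, which is what makes log(10,000/100) equal 2; the function and parameter names are illustrative.

```python
import math

def term_frequency(term_count, total_words):
    """Share of the document's words that are the term."""
    return term_count / total_words

def inverse_document_frequency(total_docs, docs_with_term):
    """Base-10 log of how rare the term is across the collection."""
    return math.log10(total_docs / docs_with_term)

def tf_idf(term_count, total_words, total_docs, docs_with_term):
    """Weight of a term in one document relative to the whole collection."""
    return (term_frequency(term_count, total_words)
            * inverse_document_frequency(total_docs, docs_with_term))

# The blog post example: 5 of 100 words, found in 100 of 10,000 documents.
score = tf_idf(term_count=5, total_words=100, total_docs=10_000, docs_with_term=100)
```

A term that appears in every document gets an inverse document frequency of log10(1) = 0, so common words such as "the" score 0 no matter how often they occur in a single post.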
Language Detection - Practical Applications
Applications
- Web searching, as engines bring up sites in dozens of languages.
- May be required in conjunction with other Natural Language Processing (NLP)
algorithms. Data sets may include documents in other languages. Some
algorithms will only work in their natural language due to their training data.
- Spam filtering services, so they can properly filter out specific languages and areas
of origin.
Language Detection - The Math
Each language is characterized by a corpus, from which a profile of components that
identifies the language is built.
Profiling algorithms often pick a core set of words to identify each language.
The problem is that not all text is long enough to contain those words.
Instead, the 3-gram algorithm, available via the Algorithmia API with one HTTP request,
detects the language by looking at groups of 3 letters.
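The 3-gram idea can be sketched locally in Python. This is not the Algorithmia API or its implementation: the one-sentence training samples, the overlap score, and all names are hypothetical, and a usable detector would build its profiles from a much larger corpus per language.

```python
from collections import Counter

def trigrams(text):
    """Count every group of 3 consecutive characters in the text."""
    # Normalize case and whitespace, and pad so word edges form trigrams too.
    text = " " + " ".join(text.lower().split()) + " "
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

# Hypothetical, tiny training samples; real profiles need far more text.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and the cat "
               "is in the house with the other animals",
    "french": "le renard brun saute par dessus le chien paresseux et le chat "
              "est dans la maison avec les autres animaux",
    "spanish": "el zorro marron salta sobre el perro perezoso y el gato "
               "esta en la casa con los otros animales",
}
PROFILES = {language: trigrams(sample) for language, sample in SAMPLES.items()}

def detect_language(text):
    """Return the language whose trigram profile overlaps the text the most."""
    grams = trigrams(text)

    def overlap(profile):
        # Count shared trigrams, capped by how often each occurs in the text.
        return sum(min(count, profile[gram]) for gram, count in grams.items())

    return max(PROFILES, key=lambda language: overlap(PROFILES[language]))
```

Because trigrams are much shorter than words, even a short phrase yields a dozen or more of them, which is what lets this approach work on text that is too short for word-based profiles.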
Algorithmia Credits
Sign up at https://algorithmia.com
Use code: FiveAlgorithmsBook
10,000k additional free API credits.
