SlideShare une entreprise Scribd logo
1  sur  16
Télécharger pour lire hors ligne
Building NLP applications

with Transformers
Julien Simon
Chief Evangelist, Hugging Face
julsimon@huggingface.co
2
Deep Learning 1.0
Neural networks
A few open 

datasets
GPUs
Expert tools
Free images from pngset.com
A typical project (pretending waterfall is agile)
Collect
data
Clean data
Label data
Experiment
Train
Evaluate
Engineer
features
Optimize
performance
Deploy model
Demo
Deep Learning 1.0: how it’s going
87% of data science
projects never


make it into
production
https://venturebeat.com/2019/07/19/
why-do-87-of-data-science-projects-
never-make-it-into-production
Only 25% of
companies report
widespread
adoption
https://www.pwc.com/us/en/tech-
effect/ai-analytics/ai-
predictions.html
Deep Learning 2.0
5
Transformers
Transfer

Learning
ML Hardware
Developer

tools
6
"Transformers are emerging
as a general-purpose
architecture for ML"



https://www.stateof.ai/
RNN and CNN usage down, 

Transformers usage up



https://www.kaggle.com/
kaggle-survey-2021
Transformers: one of the fastest-growing open source projects 

https://github.com/huggingface/transformers/
Over 1 million
model downloads
daily
Transfer Learning
7
• Identify the task type for your business problem
• Pick and test a pre-trained model
• No need to prepare a large dataset
• Only a couple of lines of code
• Optionally, fine-tune the model on your data
• Much less data is required
• No need to train for long periods of time
• Less than 50 lines of code
Example: Translation + Part of Speech Tagging
8
Multilingual voice queries on financial documents
• Speech-to-text in 21 languages (Facebook wav2vec2 300M)
• Semantic search on SEC filings (Sentence Transformers)
https://huggingface.co/spaces/juliensimon/voice-queries

https://www.youtube.com/watch?v=YPme-gR0f80
Demo: pretrained models
10
• A new generation of chips specially designed for ML
• Faster training increases agility and productivity
• Faster inference decreases latency and increases throughput
• Get more work done with less infrastructure and at lower cost
• Hugging Face is partnering with ML hardware innovators
• Training: Habana Labs, Graphcore,
• Inference: Intel, Qualcomm, AWS Inferentia
• Minimal code changes thanks to 

https://github.com/huggingface/optimum
Machine Learning Hardware
fine-tune BERT Large on GLUE MRPC

with Habana Gaudi on AWS
https://huggingface.co/blog/getting-started-habana
Demo: accelerating Transformer training jobs
Developer tools
Optimum
Model in
production
4,000+ datasets on
the hub
40,000+


pre-trained models


on the hub


No-code AutoML
HW-accelerated


managed API
HW-accelerated


inference
Hosted


ML applications
HW-accelerated


training


Train and deploy on


Amazon SageMaker
Optimum
Datasets


Models


Transformers
Train and deploy a Hugging Face model
on Amazon SageMaker

https://huggingface.co/juliensimon/reviews-sentiment-analysis
Demo: from the hub to AWS and back
14
• ML is complicated because we love to make it complicated
• Make sure to focus on the right things
1. Find an pre-trained model that fits your business use case
2. Identify a business KPI that shows success
3. Measure the model on real-life data
4. Good enough? Done!
5. Need a bit more accuracy? Fine-tune on your data
6. Optimize prediction latency and deploy in production
7. Move to the next project
• Tools, platforms, and infrastructure are here: no need to reinvent them
Key Takeaways
15
• Join our community

https://huggingface.co
• New to Transformers?

https://huggingface.co/course 

https://discuss.huggingface.co
• Need help? Ask about our Expert Acceleration Program (EAP) 

https://huggingface.co/support
• Need more privacy and compliance? Ask about a private hub deployment

https://huggingface.co/platform
Getting started with Hugging Face
julsimon@huggingface.co
@julsimon
https://youtube.com/juliensimonfr/
https://julsimon.medium.com/
response = translator(">>hun<< Thank you very much!")


response[0]['generated_text']


'Nagyon köszönöm!'

Contenu connexe

Tendances

Tendances (20)

LanGCHAIN Framework
LanGCHAIN FrameworkLanGCHAIN Framework
LanGCHAIN Framework
 
A Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptxA Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptx
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMs
 
Journey of Generative AI
Journey of Generative AIJourney of Generative AI
Journey of Generative AI
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
 
Thomas Wolf "Transfer learning in NLP"
Thomas Wolf "Transfer learning in NLP"Thomas Wolf "Transfer learning in NLP"
Thomas Wolf "Transfer learning in NLP"
 
Generative AI
Generative AIGenerative AI
Generative AI
 
Intro to LLMs
Intro to LLMsIntro to LLMs
Intro to LLMs
 
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
 
BERT Finetuning Webinar Presentation
BERT Finetuning Webinar PresentationBERT Finetuning Webinar Presentation
BERT Finetuning Webinar Presentation
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
 
LLMs Bootcamp
LLMs BootcampLLMs Bootcamp
LLMs Bootcamp
 
BERT - Part 1 Learning Notes of Senthil Kumar
BERT - Part 1 Learning Notes of Senthil KumarBERT - Part 1 Learning Notes of Senthil Kumar
BERT - Part 1 Learning Notes of Senthil Kumar
 
Stable Diffusion path
Stable Diffusion pathStable Diffusion path
Stable Diffusion path
 
Let's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchersLet's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchers
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
 
Reinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersReinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face Transformers
 
How to fine-tune and develop your own large language model.pptx
How to fine-tune and develop your own large language model.pptxHow to fine-tune and develop your own large language model.pptx
How to fine-tune and develop your own large language model.pptx
 
gpt3_presentation.pdf
gpt3_presentation.pdfgpt3_presentation.pdf
gpt3_presentation.pdf
 
Large Language Models Bootcamp
Large Language Models BootcampLarge Language Models Bootcamp
Large Language Models Bootcamp
 

Similaire à Building NLP applications with Transformers

MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
Provectus
 
Managers guide to effective building of machine learning products
Managers guide to effective building of machine learning productsManagers guide to effective building of machine learning products
Managers guide to effective building of machine learning products
Gianmario Spacagna
 
Julien Simon, Principal Technical Evangelist at Amazon - Machine Learning: Fr...
Julien Simon, Principal Technical Evangelist at Amazon - Machine Learning: Fr...Julien Simon, Principal Technical Evangelist at Amazon - Machine Learning: Fr...
Julien Simon, Principal Technical Evangelist at Amazon - Machine Learning: Fr...
Codiax
 
Machine Learning is more than Algorithms - A Consultant's Perspective on the ...
Machine Learning is more than Algorithms - A Consultant's Perspective on the ...Machine Learning is more than Algorithms - A Consultant's Perspective on the ...
Machine Learning is more than Algorithms - A Consultant's Perspective on the ...
Niklas Haas
 

Similaire à Building NLP applications with Transformers (20)

MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
 
Democratize ai with google cloud
Democratize ai with google cloudDemocratize ai with google cloud
Democratize ai with google cloud
 
From Notebook to production with Amazon SageMaker
From Notebook to production with Amazon SageMakerFrom Notebook to production with Amazon SageMaker
From Notebook to production with Amazon SageMaker
 
Advanced Machine Learning with Amazon SageMaker
Advanced Machine Learning with Amazon SageMakerAdvanced Machine Learning with Amazon SageMaker
Advanced Machine Learning with Amazon SageMaker
 
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google CloudMongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
 
Integrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an HourIntegrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an Hour
 
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
 
Managers guide to effective building of machine learning products
Managers guide to effective building of machine learning productsManagers guide to effective building of machine learning products
Managers guide to effective building of machine learning products
 
Amazon SageMaker (December 2018)
Amazon SageMaker (December 2018)Amazon SageMaker (December 2018)
Amazon SageMaker (December 2018)
 
Julien Simon, Principal Technical Evangelist at Amazon - Machine Learning: Fr...
Julien Simon, Principal Technical Evangelist at Amazon - Machine Learning: Fr...Julien Simon, Principal Technical Evangelist at Amazon - Machine Learning: Fr...
Julien Simon, Principal Technical Evangelist at Amazon - Machine Learning: Fr...
 
IIBA® Sydney Unlocking the Power of Low Code No Code: Why BAs Hold the Key
IIBA® Sydney Unlocking the Power of Low Code No Code: Why BAs Hold the KeyIIBA® Sydney Unlocking the Power of Low Code No Code: Why BAs Hold the Key
IIBA® Sydney Unlocking the Power of Low Code No Code: Why BAs Hold the Key
 
Which Application Modernization Pattern Is Right For You?
Which Application Modernization Pattern Is Right For You?Which Application Modernization Pattern Is Right For You?
Which Application Modernization Pattern Is Right For You?
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
 
Digitizing and automating HR workflows with DronaHQ
Digitizing and automating HR workflows with DronaHQ Digitizing and automating HR workflows with DronaHQ
Digitizing and automating HR workflows with DronaHQ
 
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
 
Machine Learning is more than Algorithms - A Consultant's Perspective on the ...
Machine Learning is more than Algorithms - A Consultant's Perspective on the ...Machine Learning is more than Algorithms - A Consultant's Perspective on the ...
Machine Learning is more than Algorithms - A Consultant's Perspective on the ...
 
Sviluppa, addestra e distribuisci modelli di Machine learning su qualsiasi scala
Sviluppa, addestra e distribuisci modelli di Machine learning su qualsiasi scalaSviluppa, addestra e distribuisci modelli di Machine learning su qualsiasi scala
Sviluppa, addestra e distribuisci modelli di Machine learning su qualsiasi scala
 
Powerful Google developer tools for immediate impact! (2023-24 A)
Powerful Google developer tools for immediate impact! (2023-24 A)Powerful Google developer tools for immediate impact! (2023-24 A)
Powerful Google developer tools for immediate impact! (2023-24 A)
 
An Introduction to Amazon SageMaker (October 2018)
An Introduction to Amazon SageMaker (October 2018)An Introduction to Amazon SageMaker (October 2018)
An Introduction to Amazon SageMaker (October 2018)
 
ML-Ops: Philosophy, Best-Practices and Tools
ML-Ops:Philosophy, Best-Practices and ToolsML-Ops:Philosophy, Best-Practices and Tools
ML-Ops: Philosophy, Best-Practices and Tools
 

Plus de Julien SIMON

Plus de Julien SIMON (20)

Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)
 
Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)
 
Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)
 
An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)
 
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
 
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
 
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
 
A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)
 
Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)
 
Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)
 
The Future of AI (September 2019)
The Future of AI (September 2019)The Future of AI (September 2019)
The Future of AI (September 2019)
 
Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)
 
Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...
 
Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)
 
Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)
 
Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)
 
Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)
 
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
 
Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)
 
Scaling Machine Learning from zero to millions of users (May 2019)
Scaling Machine Learning from zero to millions of users (May 2019)Scaling Machine Learning from zero to millions of users (May 2019)
Scaling Machine Learning from zero to millions of users (May 2019)
 

Dernier

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Dernier (20)

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

Building NLP applications with Transformers

  • 1. Building NLP applications
 with Transformers Julien Simon Chief Evangelist, Hugging Face julsimon@huggingface.co
  • 2. 2 Deep Learning 1.0 Neural networks A few open 
 datasets GPUs Expert tools Free images from pngset.com
  • 3. A typical project (pretending waterfall is agile) Collect data Clean data Label data Experiment Train Evaluate Engineer features Optimize performance Deploy model Demo
  • 4. Deep Learning 1.0: how it’s going 87% of data science projects never 
 make it into production https://venturebeat.com/2019/07/19/ why-do-87-of-data-science-projects- never-make-it-into-production Only 25% of companies report widespread adoption https://www.pwc.com/us/en/tech- effect/ai-analytics/ai- predictions.html
  • 6. 6 "Transformers are emerging as a general-purpose architecture for ML"
 
 https://www.stateof.ai/ RNN and CNN usage down, 
 Transformers usage up
 
 https://www.kaggle.com/ kaggle-survey-2021 Transformers: one of the fastest-growing open source projects 
 https://github.com/huggingface/transformers/ Over 1 million model downloads daily
  • 7. Transfer Learning 7 • Identify the task type for your business problem • Pick and test a pre-trained model • No need to prepare a large dataset • Only a couple of lines of code • Optionally, fine-tune the model on your data • Much less data is required • No need to train for long periods of time • Less than 50 lines of code
  • 8. Example: Translation + Part of Speech Tagging 8
  • 9. Multilingual voice queries on financial documents • Speech-to-text in 21 languages (Facebook wav2vec2 300M) • Semantic search on SEC filings (Sentence Transformers) https://huggingface.co/spaces/juliensimon/voice-queries
 https://www.youtube.com/watch?v=YPme-gR0f80 Demo: pretrained models
  • 10. 10 • A new generation of chips specially designed for ML • Faster training increases agility and productivity • Faster inference decreases latency and increases throughput • Get more work done with less infrastructure and at lower cost • Hugging Face is partnering with ML hardware innovators • Training: Habana Labs, Graphcore, • Inference: Intel, Qualcomm, AWS Inferentia • Minimal code changes thanks to 
 https://github.com/huggingface/optimum Machine Learning Hardware
  • 11. fine-tune BERT Large on GLUE MRPC
 with Habana Gaudi on AWS https://huggingface.co/blog/getting-started-habana Demo: accelerating Transformer training jobs
  • 12. Developer tools Optimum Model in production 4,000+ datasets on the hub 40,000+ pre-trained models on the hub 
 No-code AutoML HW-accelerated 
 managed API HW-accelerated 
 inference Hosted 
 ML applications HW-accelerated 
 training 
 Train and deploy on 
 Amazon SageMaker Optimum Datasets 
 Models 
 Transformers
  • 13. Train and deploy a Hugging Face model on Amazon SageMaker
 https://huggingface.co/juliensimon/reviews-sentiment-analysis Demo: from the hub to AWS and back
  • 14. 14 • ML is complicated because we love to make it complicated • Make sure to focus on the right things 1. Find an pre-trained model that fits your business use case 2. Identify a business KPI that shows success 3. Measure the model on real-life data 4. Good enough? Done! 5. Need a bit more accuracy? Fine-tune on your data 6. Optimize prediction latency and deploy in production 7. Move to the next project • Tools, platforms, and infrastructure are here: no need to reinvent them Key Takeaways
  • 15. 15 • Join our community
 https://huggingface.co • New to Transformers?
 https://huggingface.co/course 
 https://discuss.huggingface.co • Need help? Ask about our Expert Acceleration Program (EAP) 
 https://huggingface.co/support • Need more privacy and compliance? Ask about a private hub deployment
 https://huggingface.co/platform Getting started with Hugging Face