GPT and other Text Transformers:
Black Swans and Stochastic Parrots
Grigory Sapunov, Konstantin Savenkov (ks@inten.to)
Intento (https://inten.to)
#GALACONNECTED
• Machines in the Content Creation Chain
• Pre-Trained Transformers
• Practical Applications
Agenda
#GALACONNECTED
Translation - one of the final steps of content creation: many capable
models, extensive tool support for post-editing (PE) workflows, rapidly
expanding adoption.
OCR, ASR, TTS - a decent number of good models, relatively poor
tooling for PE workflows.
Content authoring, quality improvement and stylization - no good
models until very recently, little to no support for post-editing workflows
(only sentence completion and predictive input in some text editors).
Machines in the Content Creation Chain
#GALACONNECTED
Pre-Trained Transformers
#GALACONNECTED
Transformers
A new type of neural network
The Transformer is a type of neural network that emerged in 2017. It is based on
attention mechanisms.
Transformers replaced the previously dominant recurrent neural networks
(RNN/LSTM/GRU) in Machine Translation and many other NLP tasks.
Compared with recurrent neural networks, the Transformer is much more
computationally efficient and can exploit modern parallel hardware
(GPU/TPU). This allows training much larger models on huge amounts of data.
We won’t dive into the inner workings of the Transformer here. For that, read the excellent
article by Jay Alammar: http://jalammar.github.io/illustrated-transformer/
#GALACONNECTED
Transformers
Large models and data
http://faculty.washington.edu/ebender/papers/Stochastic_Parrots.pdf
#GALACONNECTED
Pre-Trained Transformers
Large models trained on huge amounts of text
Transformers can be pre-trained on huge amounts of text in an unsupervised/self-supervised manner. That
means they don’t require human-labeled data, just plain text.
Examples of pre-training tasks:
● Language modeling (LM): predict the next word/token based on the history of previous words/tokens.
Used by GPT-like models.
● Masked language modeling (MLM): some words/tokens in an input sequence are corrupted (replaced
by random tokens or some special tokens) and the goal is to recover these tokens using the remaining
context. Used by BERT-like models.
● There are some other tasks: Multilingual MLM (MMLM), Translation language modeling (TLM), Next
sentence prediction (NSP), Cross-Lingual contrast (XLCO), etc.
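The two main objectives above can be sketched in a few lines. This is a toy illustration only: it works on whole words rather than subword tokens, the `[MASK]` string and the function names are placeholders chosen for this example, and it always substitutes the mask symbol instead of mixing in random tokens.

```python
import random

MASK = "[MASK]"  # placeholder symbol; real tokenizers define their own mask token


def lm_examples(tokens):
    """Language modeling: every prefix of the sequence predicts the next token."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def mlm_example(tokens, mask_prob=0.15, seed=0):
    """Masked language modeling: hide some tokens, keep the originals as targets."""
    rng = random.Random(seed)
    corrupted, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            targets[position] = token  # the model must recover this token
        else:
            corrupted.append(token)
    return corrupted, targets


if __name__ == "__main__":
    tokens = "the cat sat on the mat".split()

    # GPT-style training pairs: (history, next token)
    for context, target in lm_examples(tokens):
        print(context, "->", target)

    # BERT-style training pair: (corrupted sequence, positions to recover)
    print(mlm_example(tokens, mask_prob=0.3))
```

Neither function needs labels: both the inputs and the targets come from the plain text itself, which is what makes the pre-training self-supervised.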
#GALACONNECTED
Language modeling (LM)
https://jalammar.github.io/how-gpt3-works-visualizations-animations/
#GALACONNECTED
Masked language modeling (MLM)
https://jalammar.github.io/illustrated-bert/
#GALACONNECTED
Generally, there are three modes:
● Train from scratch: train a transformer model for your own task on your own dataset.
Very expensive.
● Fine-tune: fine-tune a pretrained model for your own task with a much smaller dataset. Requires much
less data and computational resources, but still requires a significant amount of AI, software and
infrastructure expertise.
● Use as is: use a pretrained model without any training (possibly through an API). The model results
might be useful as-is, e.g. BERT embeddings (outputs of the model) can be used for similarity search,
or GPT-3 can be used to continue your texts (prompts).
Pre-Trained Transformers
How to use?
#GALACONNECTED
Transformers have a limited context size (the number of tokens the model can consume as input). Typically,
BERT-like models have a limit of 512 tokens, and GPT-3 has a limit of 2048 tokens. A token is typically a subword
unit.
Caveats
https://colab.research.google.com/github/huggingface/transformers/blob/master/notebooks/01-training-tokenizers.ipynb
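One common way to live with a fixed context size is to split a long input into windows that each fit the limit. The sketch below assumes the text has already been converted to a list of token ids by some tokenizer; `split_into_windows` and its `overlap` parameter are names invented for this illustration, not part of any library.

```python
def split_into_windows(token_ids, max_tokens=512, overlap=0):
    """Split a token sequence into windows of at most `max_tokens` tokens.

    `overlap` tokens are repeated at the start of each following window so
    that context is not cut off abruptly at the boundary.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    step = max_tokens - overlap
    windows = []
    for start in range(0, len(token_ids), step):
        windows.append(token_ids[start:start + max_tokens])
        if start + max_tokens >= len(token_ids):
            break  # the last window already reaches the end of the input
    return windows


if __name__ == "__main__":
    fake_ids = list(range(1200))  # stand-in for real tokenizer output
    for window in split_into_windows(fake_ids, max_tokens=512, overlap=12):
        print(len(window), window[0], window[-1])
```

Because a token is a subword unit, the number of tokens is usually larger than the number of words, so the limit should be checked on tokenizer output rather than on a word count.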
#GALACONNECTED
As you remember, GPT-like models perform
the language modeling task:
predict the next word.
The model predicts (produces) an answer
given:
● A natural language description
of the task
● Zero, one or a few examples of the task
This is called zero-, one- or few-shot
in-context learning.
There is NO model training on the user side at all.
GPT-3
#GALACONNECTED
The goal is to design a prompt (task description + examples) that solves your problem.
In effect, you need to “convince” the model to produce the expected result. This is called “prompt engineering”.
The model still has a limited context of 2048 tokens, which caps the total size of prompt + generated answer.
More on GPT-3: https://blog.inten.to/gpt-3-language-models-are-few-shot-learners-a13d1ae8b1f9
GPT-3
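A prompt of the "task description + examples" kind can be assembled mechanically. The sketch below is an assumption-laden illustration: the `Input:`/`Output:` layout is just one possible convention, the example pairs are made up, and `rough_token_count` counts words as a crude stand-in for a real tokenizer (real token counts are typically higher).

```python
CONTEXT_LIMIT = 2048  # GPT-3 context size quoted on the slide


def rough_token_count(text):
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def build_prompt(task_description, examples, query):
    """Assemble task description + solved examples + the unsolved query."""
    parts = [task_description, ""]
    for source, target in examples:
        parts += [f"Input: {source}", f"Output: {target}", ""]
    parts += [f"Input: {query}", "Output:"]
    return "\n".join(parts)


def max_answer_tokens(prompt, limit=CONTEXT_LIMIT):
    """Tokens left for the generated answer once the prompt is counted."""
    remaining = limit - rough_token_count(prompt)
    if remaining <= 0:
        raise ValueError("the prompt alone exceeds the context limit")
    return remaining


if __name__ == "__main__":
    prompt = build_prompt(
        "Rewrite the message in plain, correct English.",
        [("u r gr8", "You are great."), ("c u l8r", "See you later.")],  # made-up examples
        "thx 4 ur help",
    )
    print(prompt)
    print("room left for the answer:", max_answer_tokens(prompt))
```

Passing an empty `examples` list gives a zero-shot prompt, one pair gives one-shot, and several pairs give few-shot; in every case nothing is trained, only the prompt changes.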
#GALACONNECTED
These models can produce high-quality word/token (sometimes sentence) representations, which can be
used for different purposes:
● Single sentence classification
● Sentence pair classification
● Similarity search
● Sentence tagging tasks
● ...
In some cases you can use the model as-is (e.g. for similarity search), but in many cases you need to
fine-tune the model for your specific task.
These models are typically not suited for content generation.
DeBERTa is being integrated into the next version of the Microsoft Turing natural language representation
model (Turing NLRv4).
BERT/RoBERTa/DeBERTa/etc
http://jalammar.github.io/illustrated-bert/
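Similarity search over sentence representations usually comes down to comparing vectors. The sketch below uses cosine similarity on tiny hand-written vectors; in practice the vectors would come from a BERT-like model and have hundreds of dimensions. The sentences, vector values, and function names are all made up for the illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def most_similar(query_vector, indexed_vectors, top_k=1):
    """Return the `top_k` keys whose vectors are closest to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), key)
        for key, vector in indexed_vectors.items()
    ]
    scored.sort(reverse=True)
    return [(key, score) for score, key in scored[:top_k]]


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings"; real ones come from the model.
    index = {
        "How do I cancel my subscription?": [0.9, 0.1, 0.0],
        "Where is my luggage?": [0.0, 0.8, 0.6],
        "What time does the flight land?": [0.1, 0.3, 0.9],
    }
    query = [0.8, 0.2, 0.1]  # pretend embedding of "I want to unsubscribe"
    print(most_similar(query, index, top_k=2))
```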
#GALACONNECTED
BERT/RoBERTa/DeBERTa/etc
#GALACONNECTED
In T5, all NLP tasks are reframed into
a unified text-to-text format where
the input and output are always text
strings.
T5 can be fine-tuned for many NLP
tasks, including machine translation,
document summarization, question
answering, and classification tasks
(e.g., sentiment analysis).
mT5 is a multilingual version of the
T5 transformer.
T5/mT5
https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html
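The text-to-text idea means that the task itself is written into the input string, and even class labels come back as text. The sketch below shows one way to do that formatting. The prefixes are modelled on the style used in the T5 paper, but the exact strings, the `to_text_to_text` helper, and its field names should be treated as illustrative assumptions rather than a specification.

```python
def to_text_to_text(task, **fields):
    """Turn a task plus its inputs into a single input string for the model."""
    if task == "translate":
        return (
            f"translate {fields['source_lang']} to {fields['target_lang']}: "
            f"{fields['text']}"
        )
    if task == "summarize":
        return f"summarize: {fields['text']}"
    if task == "sentiment":
        return f"sst2 sentence: {fields['text']}"
    raise ValueError(f"unknown task: {task}")


if __name__ == "__main__":
    # Every task becomes (input text -> output text), including classification,
    # where the expected output is a label written out as a word.
    training_pairs = [
        (to_text_to_text("translate", source_lang="English",
                         target_lang="German", text="That is good."),
         "Das ist gut."),
        (to_text_to_text("sentiment", text="A charming and often affecting journey."),
         "positive"),
    ]
    for model_input, expected_output in training_pairs:
        print(repr(model_input), "->", repr(expected_output))
```

Because inputs and outputs share one format, the same model, loss, and decoding procedure can be reused across all of these tasks.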
#GALACONNECTED
T5/mT5
#GALACONNECTED
Practical Applications
● Finding catchy titles for long descriptions
● Explaining complex things to second-graders
● Writing descriptions based on keywords
● Script generation
● Paraphrasing
● Source quality improvement for MT and chatbots
● Image captions for accessibility
#GALACONNECTED
Basic Copywriting
Finding a title that fits description
What other titles can fit this description?
GPT and other Text Transformers: Who's Leading Whom.. and to Where?
Over the last year, we see increasingly more performant Text Transformers models, such as GPT-3 from OpenAI, Turing from
Microsoft, and T5 from Google. They are capable of transforming the text in very creative and unexpected ways, like generating a
summary of an article, explaining complex concepts in a simple language, or synthesizing realistic datasets for AI training. Unlike
more traditional Machine Learning models, they do not require vast training datasets and can start based on just a few examples.
In this talk, we will make a short overview of such models, share the first experimental results and ask questions about the future
of the content creation process. Are those models ready for prime time? What will happen to the professional content creators?
Will they be able to compete against such powerful models? Will we see GPT post-editing similar to MT post-editing? We will
share some answers we have based on the extensive experimenting and the first production projects that employ this new
technology.
#GALACONNECTED
“Transformer Models in Context of Content Creation”
“The Content Creator’s New Nemesis: A Text Transformer in Your Pocket”
“Generative Text Transformers: The Future of Content Creation?”
“Text Transformers: Artificial Intelligence for Content Creation”
“What Will Happen to Content Creation? – Deep Learning Models Are Changing the Rules”
“Text Transformers for Content Creation: The Good, Bad, and Ugly”
No cherry-picking. No edits.
Basic Copywriting
Finding a title that fits description
#GALACONNECTED
Text simplification
What is this talk really about?
“Computers that can write stories like Mommy. And Daddy.”
Let’s explain it for a second grader!
Over the last year, we see increasingly more performant Text Transformers models, such as GPT-3 from OpenAI, Turing from Microsoft, and T5 from Google. They are
capable of transforming the text in very creative and unexpected ways, like generating a summary of an article, explaining complex concepts in a simple language, or
synthesizing realistic datasets for AI training. Unlike more traditional Machine Learning models, they do not require vast training datasets and can start based on just a
few examples.
In this talk, we will make a short overview of such models, share the first experimental results and ask questions about the future of the content creation process. Are
those models ready for prime time? What will happen to the professional content creators? Will they be able to compete against such powerful models? Will we see
GPT post-editing similar to MT post-editing? We will share some answers we have based on the extensive experimenting and the first production projects that
employ this new technology.
#GALACONNECTED
This is a half face mask that covers the mouth and the
lower part of the face. It is made of cotton and is black
in color. It has a cartoonish design of a mouth with teeth.
Unisex, Cosplay, Masquerade,
Cotton, Black, Cartoon, Funny,
Teeth, Half, Face, Mouth, Mask
Basic Copywriting
Write a description based on keywords
#GALACONNECTED
Agent: Good afternoon! Where are you flying to today?
John: New York.
Agent: May I have your passport, please?
John: Here you go.
Agent: Are you checking any bags?
John: I have these two.
Agent: OK, please place your bags on the scale.
John: I have a stopover in Salt Lake City for 6 hours– do I need to pick up my
luggage there?
Agent: No, it’ll go straight through to New York. Here are your boarding passes
– your flight leaves from gate 19B and it’ll begin boarding at 6:30. Your seat
number is 11B.
John: Thanks.
Script writing
Let humans focus on diversity and subject matter
Vessel: Aircraft
Customer: John
Destination: New York
Bags: 2
#GALACONNECTED
Agent: Good afternoon! Where are you flying to today?
Ivan: Moscow.
Agent: May I have your passport, please?
Ivan: Here you go.
Agent: Are you checking any bags?
Ivan: No, I just need a boarding pass.
Agent: OK, here are your boarding passes. Your flight leaves from gate 2A and it’ll
begin boarding at 1:30. Your seat number is 21B.
Ivan: What time does the flight land?
Agent: It arrives in Moscow at 10:30 in the morning.
Ivan: Thanks.
Vessel: Aircraft
Customer: Ivan
Destination: Moscow
Bags: none
Script writing
Let humans focus on diversity and subject matter
#GALACONNECTED
Vessel: Bus
Customer: Mary
Destination: Sacramento
Bags: 1
Script writing
Let humans focus on diversity and subject matter
Agent: Good afternoon! Where are you traveling from today?
Mary: I am traveling from Sacramento.
Agent: May I have your ticket, please?
Mary: Here you go.
Agent: Thank you for choosing Greyhound. Your bus will arrive from the west at gate
2. You can pick up your luggage at the baggage claim.
Mary: Will my luggage be safe if I don’t pick it up at the baggage claim?
Agent: Yes, your luggage will be safe. You can pick it up at the baggage claim after
you arrive at your destination.
Mary: OK.
Agent: Have a great trip!
#GALACONNECTED
“Please, clarify for me: am I required to pick up my luggage in <location>Salt Lake
City</location> if I have a stopover there for <duration>6 hours</duration>?”
Paraphrasing
Getting more intent-specific utterances
“I have a stopover in <location>Salt Lake City</location> for <duration>6
hours</duration>– do I need to pick up my luggage there?”
“I have a stopover in <location>Salt Lake City</location> for <duration>6
hours</duration> and I wonder if I need to pick up my luggage there?”
“Could you please tell me if I have to pick up my luggage in <location>Salt Lake
City</location>? I have a stopover there for <duration>6 hours</duration>.”
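When paraphrases are generated for intent training data, it helps to check automatically that every tagged entity survived the rewrite. The sketch below does this with a regular expression over the `<location>` / `<duration>` style tags shown above. The function names are invented for this example, and the check assumes tags are simple, non-nested, and have no attributes.

```python
import re

# Matches <tag>value</tag> for simple, non-nested tags without attributes.
TAG_PATTERN = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def tagged_entities(utterance):
    """Return the sorted (tag, value) pairs found in an utterance.

    Whitespace inside a value is normalized so that a line break in the
    middle of "Salt Lake City" does not count as a difference.
    """
    return sorted(
        (tag, " ".join(value.split()))
        for tag, value in TAG_PATTERN.findall(utterance)
    )


def keeps_entities(original, paraphrase):
    """True if the paraphrase carries exactly the same tagged entities."""
    return tagged_entities(original) == tagged_entities(paraphrase)


if __name__ == "__main__":
    original = (
        "I have a stopover in <location>Salt Lake City</location> for "
        "<duration>6 hours</duration>, do I need to pick up my luggage there?"
    )
    good = (
        "Could you please tell me if I have to pick up my luggage in "
        "<location>Salt Lake City</location>? I have a stopover there for "
        "<duration>6 hours</duration>."
    )
    bad = "Do I need to pick up my luggage in <location>Salt Lake City</location>?"
    print(keeps_entities(original, good))
    print(keeps_entities(original, bad))
```

A paraphrase that fails the check can be sent to a human reviewer or discarded before it reaches the training set.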
#GALACONNECTED
Source quality improvement
Help MT to understand your audience
Original source:
IDK y ppl lik txt slng bc its rlly hrd 2 rd + cnfsing 2
MT (EN > ES) of the original source:
IDK y ppl como txt slng bc its rlly hrd 2 rd + cnfsing 2
Improved source:
I don't know why people like text-speak because it's really hard to read and confusing to comprehend.
MT (EN > ES) of the improved source:
No sé por qué a la gente le gusta el lenguaje de texto porque es realmente difícil de leer y confuso de comprender.
#GALACONNECTED
“I wantto cansel my subscptn to your service nd i would of
done it myself if could of figuredouthow to find the link that
doesnt work when i click on it can u tell me how to get this
done so I can just move on and not keep having it chargd to
my friggin creditcard ok? do u get it? Huh?”
Source quality improvement
Helping chatbots to understand your audience
Do you think your chatbot is happy with
this? Neither is your human support agent!
#GALACONNECTED
Better, huh?
“I would like to cancel my subscription to your service, and I
would have done it myself if I could have figured out how to
find the link that doesn't work when I click on it. Can you tell
me how to get this done so I can just move on and not keep
having it charged to my credit card”
Source quality improvement
Helping chatbots to understand your audience
#GALACONNECTED
Girl reading a book next to
a fireplace on Christmas
Eve.
Accessibility
Generate image captions automatically
#GALACONNECTED
• The generated output must be reviewed and edited by
humans to make sure it does not contain harmful biases.
• Review also matters because the output may end up published on the Internet and
used for further training.
• Much like with MT, effective post-editing requires workforce
re-training, workflows and tools.
• Commercial / production applications of GPT-3 must be
approved by OpenAI’s Safety Board.
Some Remarks
There must be humans in the loop
#GALACONNECTED
• Transformers are capable of authoring content at near-human
quality (especially compared to cheap outsourcing)
• They can deviate from the task (much like humans)
• They may expose biases and lack of expertise (much like humans)
• “All things post-editing” is coming.
• Proper post-editing tools, workflows, and business models still need to be
developed.
Conclusions
#GALACONNECTED
Q&A
Grigory Sapunov, Konstantin Savenkov
(ks@inten.to)
Intento (https://inten.to)

Cloud Artificial Intelligence LandscapeCloud Artificial Intelligence Landscape
Cloud Artificial Intelligence LandscapeKonstantin Savenkov
 
State of the Machine Translation by Intento (stock engines, Jun 2019)
State of the Machine Translation by Intento (stock engines, Jun 2019)State of the Machine Translation by Intento (stock engines, Jun 2019)
State of the Machine Translation by Intento (stock engines, Jun 2019)Konstantin Savenkov
 
State of the Machine Translation by Intento (stock engines, Jan 2019)
State of the Machine Translation by Intento (stock engines, Jan 2019)State of the Machine Translation by Intento (stock engines, Jan 2019)
State of the Machine Translation by Intento (stock engines, Jan 2019)Konstantin Savenkov
 
State of the Domain-Adaptive Machine Translation by Intento (November 2018)
State of the Domain-Adaptive Machine Translation by Intento (November 2018)State of the Domain-Adaptive Machine Translation by Intento (November 2018)
State of the Domain-Adaptive Machine Translation by Intento (November 2018)Konstantin Savenkov
 
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...Konstantin Savenkov
 
Improving the Demand Side of the AI Economy (API World 2018)
Improving the Demand Side of the AI Economy (API World 2018)Improving the Demand Side of the AI Economy (API World 2018)
Improving the Demand Side of the AI Economy (API World 2018)Konstantin Savenkov
 
Сравнительный анализ систем машинного перевода
Сравнительный анализ систем машинного переводаСравнительный анализ систем машинного перевода
Сравнительный анализ систем машинного переводаKonstantin Savenkov
 
State of the Machine Translation by Intento (July 2018)
State of the Machine Translation by Intento (July 2018)State of the Machine Translation by Intento (July 2018)
State of the Machine Translation by Intento (July 2018)Konstantin Savenkov
 
Cloud Sentiment Analysis - Vendor Overview (April 2018)
Cloud Sentiment Analysis - Vendor Overview (April 2018)Cloud Sentiment Analysis - Vendor Overview (April 2018)
Cloud Sentiment Analysis - Vendor Overview (April 2018)Konstantin Savenkov
 
State of the Machine Translation by Intento (March 2018)
State of the Machine Translation by Intento (March 2018)State of the Machine Translation by Intento (March 2018)
State of the Machine Translation by Intento (March 2018)Konstantin Savenkov
 
State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)Konstantin Savenkov
 
NLU / Intent Detection Benchmark by Intento, August 2017
NLU / Intent Detection Benchmark by Intento, August 2017NLU / Intent Detection Benchmark by Intento, August 2017
NLU / Intent Detection Benchmark by Intento, August 2017Konstantin Savenkov
 
Intento Machine Translation Benchmark, July 2017
Intento Machine Translation Benchmark, July 2017Intento Machine Translation Benchmark, July 2017
Intento Machine Translation Benchmark, July 2017Konstantin Savenkov
 

Plus de Konstantin Savenkov (20)

Dodging AI biases in future-proof Machine Translation solutions
Dodging AI biases in future-proof Machine Translation solutionsDodging AI biases in future-proof Machine Translation solutions
Dodging AI biases in future-proof Machine Translation solutions
 
Building Multi-Purpose MT Portfolio
Building Multi-Purpose MT PortfolioBuilding Multi-Purpose MT Portfolio
Building Multi-Purpose MT Portfolio
 
Machine Translation Insights
Machine Translation InsightsMachine Translation Insights
Machine Translation Insights
 
Как выбрать и приручить машинный перевод / How to choose and tame the Machine...
Как выбрать и приручить машинный перевод / How to choose and tame the Machine...Как выбрать и приручить машинный перевод / How to choose and tame the Machine...
Как выбрать и приручить машинный перевод / How to choose and tame the Machine...
 
Progress in Commercial Machine Translation Systems
Progress in Commercial Machine Translation SystemsProgress in Commercial Machine Translation Systems
Progress in Commercial Machine Translation Systems
 
Cloud Artificial Intelligence Landscape
Cloud Artificial Intelligence LandscapeCloud Artificial Intelligence Landscape
Cloud Artificial Intelligence Landscape
 
Intento Enterprise MT Hub
Intento Enterprise MT HubIntento Enterprise MT Hub
Intento Enterprise MT Hub
 
State of the Machine Translation by Intento (stock engines, Jun 2019)
State of the Machine Translation by Intento (stock engines, Jun 2019)State of the Machine Translation by Intento (stock engines, Jun 2019)
State of the Machine Translation by Intento (stock engines, Jun 2019)
 
State of the Machine Translation by Intento (stock engines, Jan 2019)
State of the Machine Translation by Intento (stock engines, Jan 2019)State of the Machine Translation by Intento (stock engines, Jan 2019)
State of the Machine Translation by Intento (stock engines, Jan 2019)
 
State of the Domain-Adaptive Machine Translation by Intento (November 2018)
State of the Domain-Adaptive Machine Translation by Intento (November 2018)State of the Domain-Adaptive Machine Translation by Intento (November 2018)
State of the Domain-Adaptive Machine Translation by Intento (November 2018)
 
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
 
Improving the Demand Side of the AI Economy (API World 2018)
Improving the Demand Side of the AI Economy (API World 2018)Improving the Demand Side of the AI Economy (API World 2018)
Improving the Demand Side of the AI Economy (API World 2018)
 
Сравнительный анализ систем машинного перевода
Сравнительный анализ систем машинного переводаСравнительный анализ систем машинного перевода
Сравнительный анализ систем машинного перевода
 
State of the Machine Translation by Intento (July 2018)
State of the Machine Translation by Intento (July 2018)State of the Machine Translation by Intento (July 2018)
State of the Machine Translation by Intento (July 2018)
 
Cloud Sentiment Analysis - Vendor Overview (April 2018)
Cloud Sentiment Analysis - Vendor Overview (April 2018)Cloud Sentiment Analysis - Vendor Overview (April 2018)
Cloud Sentiment Analysis - Vendor Overview (April 2018)
 
State of the Machine Translation by Intento (March 2018)
State of the Machine Translation by Intento (March 2018)State of the Machine Translation by Intento (March 2018)
State of the Machine Translation by Intento (March 2018)
 
State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)
 
NLU / Intent Detection Benchmark by Intento, August 2017
NLU / Intent Detection Benchmark by Intento, August 2017NLU / Intent Detection Benchmark by Intento, August 2017
NLU / Intent Detection Benchmark by Intento, August 2017
 
Intento Machine Translation Benchmark, July 2017
Intento Machine Translation Benchmark, July 2017Intento Machine Translation Benchmark, July 2017
Intento Machine Translation Benchmark, July 2017
 
Building a Data Driven Business
Building a Data Driven BusinessBuilding a Data Driven Business
Building a Data Driven Business
 

Dernier

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Dernier (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

GPT and other Text Transformers: Black Swans and Stochastic Parrots

  • 1. GPT and other Text Transformers: Black Swans and Stochastic Parrots Grigory Sapunov, Konstantin Savenkov (ks@inten.to) Intento (https://inten.to)
  • 2. #GALACONNECTED •Machines in Content Creation Chain •Pre-Trained Transformers •Practical Applications Agenda
  • 3. #GALACONNECTED Translation - one of the final steps of content creation, many capable models, extensive tool support for post-editing (PE) workflows, rapidly expanding adoption. OCR, ASR, TTS - a decent number of good models, relatively poor toolchain for PE workflows. Content authoring, quality improvement and stylization - no good models until very recently, little to no support for post-editing workflows (sentence completion and predictive input in some text editors). Machines in the Content Creation Chain
  • 5. #GALACONNECTED Transformers The new type of neural networks The Transformer is a type of neural network that emerged in 2017. It is based on attention mechanisms. Transformers replaced the previously dominant recurrent neural networks (RNN/LSTM/GRU) in Machine Translation and many other NLP tasks. Compared to recurrent neural networks, the Transformer has much higher computational efficiency and can efficiently exploit modern parallel hardware (GPU/TPU). This allows training much larger models on huge amounts of data. We won’t dive into the inner workings of the Transformer; for that, read the beautiful article by Jay Alammar: http://jalammar.github.io/illustrated-transformer/
  • 6. #GALACONNECTED Transformers Large models and data http://faculty.washington.edu/ebender/papers/Stochastic_Parrots.pdf
  • 7. #GALACONNECTED Pre-Trained Transformers Large model trained on a huge amount of texts Transformers can be pre-trained on huge amounts of text in an unsupervised/self-supervised manner. That means they don’t require human-labeled data, just plain text. Examples of tasks: ● Language modeling (LM): predict the next word/token based on the history of previous words/tokens. Used by GPT-like models. ● Masked language modeling (MLM): some words/tokens in an input sequence are corrupted (replaced by random tokens or some special tokens) and the goal is to recover these tokens using the remaining context. Used by BERT-like models. ● There are some other tasks: Multilingual MLM (MMLM), Translation language modeling (TLM), Next sentence prediction (NSP), Cross-Lingual contrast (XLCO), etc.
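To make the two pre-training tasks on this slide concrete, here is a minimal sketch of how training examples could be derived from plain text. It is a simplification, not how any specific model is trained: tokens are whole words rather than subword units, and the function names, the `[MASK]` placeholder string and the masking probability are assumptions chosen for illustration.

```python
import random

def next_token_examples(tokens):
    """Language modeling: every prefix of the sequence predicts the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Masked language modeling: hide some tokens and remember the originals as targets."""
    rng = random.Random(seed)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(mask_token)  # the model sees the placeholder...
            targets[i] = tok              # ...and must recover the original token
        else:
            corrupted.append(tok)
    return corrupted, targets

tokens = "the cat sat on the mat".split()
lm_pairs = next_token_examples(tokens)                   # (history, next token) pairs
corrupted, targets = mask_tokens(tokens, mask_prob=0.5)  # high rate only for the demo
```

In both cases the labels come from the text itself, which is why no human annotation is needed.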
  • 9. #GALACONNECTED Masked language modeling (MLM) https://jalammar.github.io/illustrated-bert/
  • 10. #GALACONNECTED Generally, there are three modes: ● Train from scratch: train a transformer model for your own task on your own dataset. Very expensive. ● Fine-tune: fine-tune a pretrained model for your own task with a much smaller dataset. Requires much less data and computational resources, but still requires a significant amount of AI, software and infrastructure expertise. ● Use as is: use a pretrained model without any training (possibly through an API). The model results might be useful as-is, e.g. BERT embeddings (outputs of the model) can be used for similarity search, or GPT-3 can be used to continue your texts (prompts). Pre-Trained Transformers How to use?
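The "use as is" mode with embeddings can be sketched as a similarity search over vectors. The example below assumes the embeddings have already been produced by some pretrained model; the three-dimensional vectors and the sample sentences are made up for illustration, and cosine similarity is one common choice of metric rather than the only one.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vec, corpus):
    """Return (text, score) pairs sorted from most to least similar to the query."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical 3-dimensional embeddings; real models output hundreds of dimensions.
corpus = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "What time does the flight land?": [0.0, 0.2, 0.9],
    "I forgot my login details": [0.8, 0.3, 0.1],
}
query = [0.85, 0.2, 0.05]  # stands in for the embedding of a password-related question
ranking = most_similar(query, corpus)
```

No training happens here: the pretrained model only supplies the vectors, and the search itself is plain arithmetic.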
  • 11. #GALACONNECTED Transformers have a limited context size (the number of tokens they can consume as input). Typically, BERT-like models have a limit of 512 tokens, and GPT-3 has a limit of 2048 tokens. A token is typically a subword unit. Caveats https://colab.research.google.com/github/huggingface/transformers/blob/master/notebooks/01-training-tokenizers.ipynb
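A rough illustration of what "subword unit" and "limited context" mean in practice. The greedy longest-match splitter below is a toy stand-in for a real tokenizer (real ones learn their vocabulary from data), and the six-entry vocabulary is an assumption made up for this sketch.

```python
def subword_tokenize(word, vocab):
    """Greedy longest-match split of one word into subword units from vocab."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it is a known subword.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:  # nothing matched: fall back to a single character
            end = start + 1
        pieces.append(word[start:end])
        start = end
    return pieces

def truncate_to_context(tokens, max_tokens):
    """Keep only the first max_tokens tokens so the input fits the model's context."""
    return tokens[:max_tokens]

vocab = {"trans", "form", "er", "s", "token", "ize"}  # toy vocabulary (assumption)
tokens = [piece
          for word in "transformers tokenize".split()
          for piece in subword_tokenize(word, vocab)]
fitted = truncate_to_context(tokens, max_tokens=4)
```

Because one word can become several tokens, a 512- or 2048-token limit usually corresponds to fewer words than the number suggests.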
  • 12. #GALACONNECTED As you remember, GPT-like models perform the language modeling task: predicting the next word. The model predicts (produces) an answer given: ● A natural language description of the task ● No/one/a few examples of the task So, it is called zero-, one- or few-shot in-context learning. There is NO model training on the user's side at all. GPT-3
  • 13. #GALACONNECTED The goal is to design a prompt (task description + examples) to solve your problem. In effect, you need “to convince” the model to produce the expected result. This is called “prompt engineering”. The model still has a limited context of 2048 tokens, which limits the total size of the prompt plus the generated answer. More on GPT-3: https://blog.inten.to/gpt-3-language-models-are-few-shot-learners-a13d1ae8b1f9 GPT-3
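Since the prompt and the generated answer share the same 2048-token budget, one practical consequence is that few-shot examples may need to be dropped to leave room for the answer. A minimal sketch of that bookkeeping follows; the word-count "tokenizer" is a crude assumption (real token counts are usually higher), and `fit_examples` is a hypothetical helper, not part of any API.

```python
CONTEXT_LIMIT = 2048  # GPT-3 limit quoted in the slides (prompt + answer)

def rough_token_count(text):
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())

def fit_examples(task_description, examples, query, answer_budget, limit=CONTEXT_LIMIT):
    """Drop examples from the end until prompt + reserved answer space fit the limit."""
    kept = list(examples)
    while True:
        prompt = "\n\n".join([task_description, *kept, query])
        fits = rough_token_count(prompt) + answer_budget <= limit
        if fits or not kept:
            return prompt, kept
        kept.pop()  # sacrifice the last example and try again

prompt, kept = fit_examples(
    task_description="Rewrite the sentence politely.",
    examples=["Give me that. => Could you pass me that, please?",
              "Move. => Excuse me, may I get through?"],
    query="Shut the door. =>",
    answer_budget=20,
)
```

With the default limit both examples fit; with a tight limit the helper falls back to fewer examples, down to a zero-shot prompt.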
  • 14. #GALACONNECTED These models can produce high-quality word/token (sometimes sentence) representations, which can be used for different purposes: ● Single sentence classification ● Sentence pair classification ● Similarity search ● Sentence tagging tasks ● ... In some cases you can use the model as-is (e.g. for similarity search), but in many cases you need to fine-tune the model for your specific task. Typically, these models are not suited for content generation. DeBERTa is being integrated into the next version of the Microsoft Turing natural language representation model (Turing NLRv4). BERT/RoBERTa/DeBERTa/etc http://jalammar.github.io/illustrated-bert/
  • 16. #GALACONNECTED In T5 all NLP tasks are reframed into a unified text-to-text-format where the input and output are always text strings. Can be fine-tuned to many NLP tasks, including machine translation, document summarization, question answering, and classification tasks (e.g., sentiment analysis). mT5 is a multilingual version of the T5 transformer. T5/mT5 https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html
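The text-to-text framing can be sketched as plain string formatting: the task is named in a prefix on the input, and the expected result is also a string. The two prefixes below follow the convention described for T5; the helper names and the sample sentences are assumptions for illustration, and no model is called.

```python
# Task prefixes in the style used by T5; the dictionary keys are hypothetical names.
TASK_PREFIXES = {
    "translate_en_de": "translate English to German: ",
    "summarize": "summarize: ",
}

def to_text_to_text(task, text):
    """Turn a (task, input) pair into the single input string the model would read."""
    return TASK_PREFIXES[task] + text

def make_training_example(task, source, target):
    """Both sides of a training example are strings, whatever the task is."""
    return {"input": to_text_to_text(task, source), "target": target}

example = make_training_example("translate_en_de", "That is good.", "Das ist gut.")
request = to_text_to_text("summarize", "Transformers replaced recurrent networks in many NLP tasks.")
```

Because every task shares this shape, one model and one training procedure can cover translation, summarization and classification alike.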
  • 18. #GALACONNECTED Practical Applications ● Finding catchy titles for long descriptions ● Explaining complex things to second-graders ● Writing descriptions based on keywords ● Script generation ● Paraphrasing ● Source quality improvement for MT and chatbots ● Image captions for accessibility
  • 19. #GALACONNECTED Basic Copywriting Finding a title that fits description What other titles can fit this description? GPT and other Text Transformers: Who's Leading Whom.. and to Where? Over the last year, we see increasingly more performant Text Transformers models, such as GPT-3 from OpenAI, Turing from Microsoft, and T5 from Google. They are capable of transforming the text in very creative and unexpected ways, like generating a summary of an article, explaining complex concepts in a simple language, or synthesizing realistic datasets for AI training. Unlike more traditional Machine Learning models, they do not require vast training datasets and can start based on just a few examples. In this talk, we will make a short overview of such models, share the first experimental results and ask questions about the future of the content creation process. Are those models ready for prime time? What will happen to the professional content creators? Will they be able to compete against such powerful models? Will we see GPT post-editing similar to MT post-editing? We will share some answers we have based on the extensive experimenting and the first production projects that employ this new technology.
  • 20. #GALACONNECTED “Transformer Models in Context of Content Creation” No cherry-picking. No edits. Basic Copywriting Finding a title that fits description
  • 21. #GALACONNECTED “Transformer Models in Context of Content Creation” “The Content Creator’s New Nemesis: A Text Transformer in Your Pocket” No cherry-picking. No edits. Basic Copywriting Finding a title that fits description
  • 22. #GALACONNECTED “Transformer Models in Context of Content Creation” “The Content Creator’s New Nemesis: A Text Transformer in Your Pocket” “Generative Text Transformers: The Future of Content Creation?” No cherry-picking. No edits. Basic Copywriting Finding a title that fits description
  • 23. #GALACONNECTED “Transformer Models in Context of Content Creation” “The Content Creator’s New Nemesis: A Text Transformer in Your Pocket” “Generative Text Transformers: The Future of Content Creation?” “Text Transformers: Artificial Intelligence for Content Creation” No cherry-picking. No edits. Basic Copywriting Finding a title that fits description
  • 24. #GALACONNECTED “Transformer Models in Context of Content Creation” “The Content Creator’s New Nemesis: A Text Transformer in Your Pocket” “Generative Text Transformers: The Future of Content Creation?” “Text Transformers: Artificial Intelligence for Content Creation” “What Will Happen to Content Creation? – Deep Learning Models Are Changing the Rules” No cherry-picking. No edits. Basic Copywriting Finding a title that fits description
  • 25. #GALACONNECTED “Transformer Models in Context of Content Creation” “The Content Creator’s New Nemesis: A Text Transformer in Your Pocket” “Generative Text Transformers: The Future of Content Creation?” “Text Transformers: Artificial Intelligence for Content Creation” “What Will Happen to Content Creation? – Deep Learning Models Are Changing the Rules” “Text Transformers for Content Creation: The Good, Bad, and Ugly” No cherry-picking. No edits. Basic Copywriting Finding a title that fits description
  • 26. #GALACONNECTED Text simplification What is this talk really about? Let’s explain it for a second grader! Over the last year, we see increasingly more performant Text Transformers models, such as GPT-3 from OpenAI, Turing from Microsoft, and T5 from Google. They are capable of transforming the text in very creative and unexpected ways, like generating a summary of an article, explaining complex concepts in a simple language, or synthesizing realistic datasets for AI training. Unlike more traditional Machine Learning models, they do not require vast training datasets and can start based on just a few examples. In this talk, we will make a short overview of such models, share the first experimental results and ask questions about the future of the content creation process. Are those models ready for prime time? What will happen to the professional content creators? Will they be able to compete against such powerful models? Will we see GPT post-editing similar to MT post-editing? We will share some answers we have based on the extensive experimenting and the first production projects that employ this new technology.
  • 27. #GALACONNECTED Text simplification What is this talk really about? “Computers that can write stories like Mommy. And Daddy.” Let’s explain it for a second grader! Over the last year, we see increasingly more performant Text Transformers models, such as GPT-3 from OpenAI, Turing from Microsoft, and T5 from Google. They are capable of transforming the text in very creative and unexpected ways, like generating a summary of an article, explaining complex concepts in a simple language, or synthesizing realistic datasets for AI training. Unlike more traditional Machine Learning models, they do not require vast training datasets and can start based on just a few examples. In this talk, we will make a short overview of such models, share the first experimental results and ask questions about the future of the content creation process. Are those models ready for prime time? What will happen to the professional content creators? Will they be able to compete against such powerful models? Will we see GPT post-editing similar to MT post-editing? We will share some answers we have based on the extensive experimenting and the first production projects that employ this new technology.
  • 28. #GALACONNECTED Unisex, Cosplay, Masquerade, Cotton, Black, Cartoon, Funny, Teeth, Half, Face, Mouth, Mask Basic Copywriting Write a description based on keywords
  • 29. #GALACONNECTED This is a half face mask that covers the mouth and the lower part of the face. It is made of cotton and is black in color. It has a cartoonish design of a mouth with teeth. Unisex, Cosplay, Masquerade, Cotton, Black, Cartoon, Funny, Teeth, Half, Face, Mouth, Mask Basic Copywriting Write a description based on keywords
  • 30. Script writing: Let humans focus on diversity and subject matter. Input: Vessel: Aircraft; Customer: John; Destination: New York; Bags: 2. Script: Agent: Good afternoon! Where are you flying to today? John: New York. Agent: May I have your passport, please? John: Here you go. Agent: Are you checking any bags? John: I have these two. Agent: OK, please place your bags on the scale. John: I have a stopover in Salt Lake City for 6 hours – do I need to pick up my luggage there? Agent: No, it’ll go straight through to New York. Here are your boarding passes – your flight leaves from gate 19B and it’ll begin boarding at 6:30. Your seat number is 11B. John: Thanks.
  • 31. Script writing: Let humans focus on diversity and subject matter. Input: Vessel: Aircraft; Customer: Ivan; Destination: Moscow; Bags: none. Script: Agent: Good afternoon! Where are you flying to today? Ivan: Moscow. Agent: May I have your passport, please? Ivan: Here you go. Agent: Are you checking any bags? Ivan: No, I just need a boarding pass. Agent: OK, here are your boarding passes. Your flight leaves from gate 2A and it’ll begin boarding at 1:30. Your seat number is 21B. Ivan: What time does the flight land? Agent: It arrives in Moscow at 10:30 in the morning. Ivan: Thanks.
  • 32. Script writing: Let humans focus on diversity and subject matter. Input: Vessel: Bus; Customer: Mary; Destination: Sacramento; Bags: 1. Script: Agent: Good afternoon! Where are you traveling from today? Mary: I am traveling from Sacramento. Agent: May I have your ticket, please? Mary: Here you go. Agent: Thank you for choosing Greyhound. Your bus will arrive from the west at gate 2. You can pick up your luggage at the baggage claim. Mary: Will my luggage be safe if I don’t pick it up at the baggage claim? Agent: Yes, your luggage will be safe. You can pick it up at the baggage claim after you arrive at your destination. Mary: OK. Agent: Have a great trip!
  • 33. Paraphrasing: Getting more intent-specific utterances. Source: “I have a stopover in <location>Salt Lake City</location> for <duration>6 hours</duration> – do I need to pick up my luggage there?”
  • 34. Paraphrasing: Getting more intent-specific utterances. Source: “I have a stopover in <location>Salt Lake City</location> for <duration>6 hours</duration> – do I need to pick up my luggage there?” Paraphrases: “Please, clarify for me: am I required to pick up my luggage in <location>Salt Lake City</location> if I have a stopover there for <duration>6 hours</duration>?” “I have a stopover in <location>Salt Lake City</location> for <duration>6 hours</duration> and I wonder if I need to pick up my luggage there?” “Could you please tell me if I have to pick up my luggage in <location>Salt Lake City</location>? I have a stopover there for <duration>6 hours</duration>.”
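In the paraphrasing example above, every paraphrase keeps the `<location>` and `<duration>` spans of the source utterance. The talk does not describe how this is verified; one possible automatic check, sketched here with hypothetical helper names, compares the tagged spans of the source with those of each generated paraphrase and flags mismatches for a human post-editor.

```python
import re
from collections import Counter

# Matches <tag>text</tag> for flat (non-nested) tags such as <location>.
TAG_SPAN = re.compile(r"<(\w+)>(.*?)</\1>")


def tagged_spans(utterance: str) -> Counter:
    """Count each (tag, text) pair, e.g. ('location', 'Salt Lake City')."""
    return Counter(TAG_SPAN.findall(utterance))


def keeps_entities(source: str, paraphrase: str) -> bool:
    """True if the paraphrase carries exactly the tagged spans of the source."""
    return tagged_spans(source) == tagged_spans(paraphrase)


SOURCE = (
    "I have a stopover in <location>Salt Lake City</location> for "
    "<duration>6 hours</duration> – do I need to pick up my luggage there?"
)
GOOD = (
    "Could you please tell me if I have to pick up my luggage in "
    "<location>Salt Lake City</location>? I have a stopover there for "
    "<duration>6 hours</duration>."
)
# Hypothetical faulty paraphrase: the duration tag was dropped.
BAD = (
    "Do I need to pick up my luggage in <location>Salt Lake City</location> "
    "during my 6 hour stopover?"
)

print(keeps_entities(SOURCE, GOOD))
print(keeps_entities(SOURCE, BAD))
```

The comparison ignores the order of the spans, since a paraphrase may legitimately move them around, as the first paraphrase on the slide does.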
  • 35. Source quality improvement: Help MT to understand your audience. Source: IDK y ppl lik txt slng bc its rlly hrd 2 rd + cnfsing 2
  • 36. Source quality improvement: Help MT to understand your audience. Source: IDK y ppl lik txt slng bc its rlly hrd 2 rd + cnfsing 2. MT (EN > ES) of the source: IDK y ppl como txt slng bc its rlly hrd 2 rd + cnfsing 2
  • 37. Source quality improvement: Help MT to understand your audience. Source: IDK y ppl lik txt slng bc its rlly hrd 2 rd + cnfsing 2. MT (EN > ES) of the source: IDK y ppl como txt slng bc its rlly hrd 2 rd + cnfsing 2. Improved source: I don't know why people like text-speak because it's really hard to read and confusing to comprehend.
  • 38. Source quality improvement: Help MT to understand your audience. Source: IDK y ppl lik txt slng bc its rlly hrd 2 rd + cnfsing 2. MT (EN > ES) of the source: IDK y ppl como txt slng bc its rlly hrd 2 rd + cnfsing 2. Improved source: I don't know why people like text-speak because it's really hard to read and confusing to comprehend. MT (EN > ES) of the improved source: No sé por qué a la gente le gusta el lenguaje de texto porque es realmente difícil de leer y confuso de comprender.
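The example above amounts to a two-step pipeline: rewrite the noisy source into standard language, then translate the rewritten text. A minimal sketch of that ordering follows. The `rewrite` and `translate` callables are placeholders for real models; the stand-ins here only look up the strings from the slide, and all names are hypothetical.

```python
from typing import Callable

TextFn = Callable[[str], str]


def translate_with_cleanup(text: str, rewrite: TextFn, translate: TextFn) -> str:
    """Rewrite noisy source into standard language, then pass it to MT."""
    return translate(rewrite(text))


# Strings taken from the slide.
SLANG = "IDK y ppl lik txt slng bc its rlly hrd 2 rd + cnfsing 2"
CLEAN = (
    "I don't know why people like text-speak because it's really hard "
    "to read and confusing to comprehend."
)
SPANISH = (
    "No sé por qué a la gente le gusta el lenguaje de texto porque es "
    "realmente difícil de leer y confuso de comprender."
)


def fake_rewrite(text: str) -> str:
    """Stand-in for a text transformer that normalizes the source."""
    return {SLANG: CLEAN}.get(text, text)


def fake_translate(text: str) -> str:
    """Stand-in for an MT engine; unknown input passes through unchanged."""
    return {CLEAN: SPANISH}.get(text, text)


print(translate_with_cleanup(SLANG, fake_rewrite, fake_translate))
```

Passing the two steps in as functions keeps the pipeline independent of any particular model or MT provider, so either step can be swapped or skipped.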
  • 39. Source quality improvement: Helping chatbots to understand your audience. Customer message: “I wantto cansel my subscptn to your service nd i would of done it myself if could of figuredouthow to find the link that doesnt work when i click on it can u tell me how to get this done so I can just move on and not keep having it chargd to my friggin creditcard ok? do u get it? Huh?” Do you think your chatbot is happy with this? Neither is your human support agent!
  • 40. Source quality improvement: Helping chatbots to understand your audience. Better, huh? Rewritten message: “I would like to cancel my subscription to your service, and I would have done it myself if I could have figured out how to find the link that doesn't work when I click on it. Can you tell me how to get this done so I can just move on and not keep having it charged to my credit card”
  • 42. Accessibility: Generate image captions automatically. Caption: Girl reading a book next to a fireplace on Christmas Eve.
  • 44. Some Remarks: There must be humans in the loop. • The generated output must be reviewed and edited by humans to make sure it does not contain harmful biases. • Review also matters because the output may end up published on the Internet and be used for further training. • Much like with MT, effective post-editing requires workforce re-training, workflows, and tools. • Commercial / production applications of GPT-3 must be approved by OpenAI’s Safety Board.
  • 45. Conclusions: • Transformers are capable of authoring content at near-human quality (especially compared to cheap outsourcing). • They can deviate from the task (much like humans). • They may expose biases and lack of expertise (much like humans). • “All things post-editing” is coming. • Proper post-editing tools, workflows, and business models are yet to be developed.
  • 46. Q&A: Grigory Sapunov, Konstantin Savenkov (ks@inten.to), Intento (https://inten.to)