AI Mid-Term Outlook
Ethan Holland
Vice-President
Short Term (bulk of discussion): Generative AI Models (GPT, Llama, PaLM); ENPS -> Web Articles; Box Scores -> Articles; Audio Creation (ElevenLabs); Media (MidJourney, Adobe Firefly, HeyGen); ChatGPT PlugIns; Advanced Data Analysis (spreadsheets, insights, code writing); Speech to Text (Whisper, Otter.ai, Captions)
Mid Term (6-24 months): THIS IS WHAT I WANT TO TALK ABOUT
Long Term
Short Term - The novelty/blooper phase
Mid Term – Determine our AI Acumen
• Can you name a few LLMs other than ChatGPT?
• You don’t need to know them… but be aware.
• Llama, PaLM, etc.
Mid Term – Generative AI is a LOT more than ChatGPT
[Figure. Legend: Google’s are light blue, Meta’s are dark blue. Label: Proficiency]
Mid Term – Determine our AI Acumen
• Can you name a few LLMs other than ChatGPT?
• You don’t need to know them… but be aware.
• Llama, PaLM, etc.
• Do you have ChatGPT Plus?
• If so, have you used Code Interpreter/Advanced Data Analysis?
• Have you experimented with PlugIns?
• That’s the difference between a “chit chat bot” and AGENCY, where the language model is an interface to data that you speak to conversationally
Short Term: Generative AI Models (GPT, Llama, PaLM); ENPS -> Web Articles; Box Scores -> Articles; Audio Creation (ElevenLabs); Media (MidJourney, Adobe Firefly, HeyGen); ChatGPT PlugIns; Advanced Data Analysis (spreadsheets, insights, code writing); Speech to Text (Whisper, Otter.ai, Captions)
Mid Term: Agency
Long Term
Mid Term – Agency
• Generative AI bots are so good that we THINK THEY ARE ANSWER-BOTS
• On their own… they can pass IQ tests, take the bar exam, etc.
• GPT-4 does so well on some tests that we cannot create a test to measure it!
• Like a punching bag game that goes to 1000… GPT-4 is scoring 1000… but if we can’t measure anything stronger, we don’t know if its strength is 1000, 10,000 or 100,000
Mid Term – Agency
• Generative AI bots are so good that we THINK THEY ARE ANSWER-BOTS
• On their own… they can pass IQ tests, take the bar exam, etc.
• However, when we use them as INTERFACES, connect them with DATA, and give them the ability to reflect and refine (avoid hallucinations, check their work), access the internet, book flights, change hotel rooms… that’s an AGENT. We’re already using them, but they will get more powerful.
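To make the "interface plus data plus checking" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `stub_model` is a hypothetical stand-in for a real LLM call, `search_scripts` is an invented tool, and the two archive lines are made-up sample data, not real ENPS scripts. It shows one possible shape of an agent loop, not how any particular product works.

```python
from typing import Callable, Dict, List

# Invented sample data standing in for years of newsroom scripts.
ARCHIVE: List[str] = [
    "1998-03-02 6pm: City council approves new stadium funding.",
    "2005-08-29 11pm: Storm closes the regional airport overnight.",
]

def search_scripts(query: str) -> str:
    """Hypothetical tool: return archive lines containing every query word."""
    words = query.lower().split()
    hits = [line for line in ARCHIVE if all(w in line.lower() for w in words)]
    return "\n".join(hits) if hits else "NO RESULTS"

TOOLS: Dict[str, Callable[[str], str]] = {"search_scripts": search_scripts}

def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM; a real agent would call a model here."""
    if "TOOL RESULT:" in prompt:
        evidence = prompt.split("TOOL RESULT:", 1)[1].strip()
        return f"ANSWER: {evidence}"
    return "TOOL: search_scripts | airport"

def run_agent(question: str, model: Callable[[str], str], max_steps: int = 3) -> str:
    """Let the model request tools, feed results back, and check the final answer."""
    prompt = f"QUESTION: {question}"
    evidence: List[str] = []
    for _ in range(max_steps):
        reply = model(prompt)
        if reply.startswith("TOOL:"):
            name, arg = [p.strip() for p in reply[len("TOOL:"):].split("|", 1)]
            tool = TOOLS.get(name)
            result = tool(arg) if tool else "UNKNOWN TOOL"
            evidence.append(result)
            prompt += f"\nTOOL RESULT: {result}"
        elif reply.startswith("ANSWER:"):
            answer = reply[len("ANSWER:"):].strip()
            # "Check their work": accept only answers backed by retrieved text.
            if any(answer in e for e in evidence):
                return answer
            prompt += "\nCHECK FAILED: answer not supported by tool results."
    return "No answer within the step limit."

print(run_agent("When did the airport close?", stub_model))
```

The check step is deliberately crude (a substring match against what the tool returned). The point is only that an agent's answer can be tied back to data, which a plain chat reply is not.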
This is the agent
The Agents are Coming… via Enterprise LLM
“Everything That Can Be LLM-ed, Will Be”
Imagine ‘talking with’ 30 years of ENPS scripts…
This is already happening. In other words, “mid-term” is now.
Mid Term – Agency – Death of the Page View
“The advertising model as we know it — getting people to go to your website and view it
— that’s going to break…I don’t see how that sustains” –Jeremiah Owyang
As consumers increasingly use automated tools to go through the marketing “funnel,”
marketers and creators need to consider something that many might think is
counterintuitive: That is, you actually want (no, need) LLMs to train on your data.
“If I was a journalist, I would want my articles ingested by all of the LLMs,” he
explained, adding that more and more chatbots are including citations, including Bing,
You.com and Perplexity. “So when people search for that information, I show up first —
it’s the same as SEO strategy,” he said — cautioning that this would not apply to gated
content, which employs a different business model.
https://venturebeat.com/ai/forget-seo-why-ai-engine-optimization-may-be-the-future/
Never leave the chat = Death of the browser
Elon, Gates, Zuck, Andreessen agree.
Short Term: Generative AI Models (GPT, Llama, PaLM); ENPS -> Web Articles; Box Scores -> Articles; Audio Creation (ElevenLabs); Media (MidJourney, Adobe Firefly, HeyGen); ChatGPT PlugIns; Advanced Data Analysis (spreadsheets, insights, code writing); Speech to Text (Whisper, Otter.ai, Captions)
Mid Term: Agency; Multimodal
Long Term
Mid Term – Multimodal
• Currently, Large Language Models are thought of as single modes:
• ChatGPT = text
• MidJourney = images
• Etc.
• Next gen LLMs will be multimodal:
• Multimodal is the ability to work with and understand ANY media:
• Text, images, video, data,
• Image to speech, text to video, etc.
• ChatGPT AGENTS like Code Interpreter can use tools like OCR to read PDFs or write Python to create tables and graphs
• That’s not the same as being multimodal.
Mid Term – Multimodal
Actual interaction with Google DeepMind’s Flamingo in April 2022 (17 months ago)
• How many people are in this photo?
• “There are at least five people in this photo”
• How many mirrors are in this photo?
• “There are at least two mirrors”
• What is the person on the scale doing?
• “He is looking at the scale”
• Where is Obama’s foot positioned?
• “Obama’s foot is on the right side of the scale”
• What happens as a result?
• “The scale shows a higher weight”
• Is the person on the scale aware of it?
• “He is not aware of it.”
https://www.youtube.com/watch?v=zRYcKhkAsk4
Mid Term – Multimodal
• Currently, Large Language Models are thought of as single modes:
• ChatGPT = text to text
• MidJourney = text to images
• Next gen LLMs will be Multimodal:
• ChatGPT Plus Code Interpreter can use tools like OCR to read PDFs or write Python to create tables and graphs
• That’s not the same.
• Multimodal is the ability to understand ANY media:
• Text, images, video, data,
• Image to speech, text to video, etc.
• No more need for metadata, keywords? The AP already has the ability to search its entire archive this way, e.g. “Find all of the photos of Churchill on a roof smoking a cigar” (actual example)
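One common way to search media without keywords is to compare numeric "embeddings" of the query and of each photo. The sketch below illustrates that general idea only; it is not a description of the AP's system. The file names and the three-number vectors are invented for illustration (the axes loosely mean Churchill, rooftop, cigar), and a real system would get its vectors from a multimodal model rather than by hand.

```python
import math
from typing import Dict, List, Tuple

# Invented toy embeddings; axes loosely mean (Churchill, rooftop, cigar).
PHOTO_VECTORS: Dict[str, List[float]] = {
    "photo_001.jpg": [0.9, 0.8, 0.9],  # Churchill on a roof with a cigar
    "photo_002.jpg": [0.9, 0.0, 0.1],  # Churchill at a desk
    "photo_003.jpg": [0.0, 0.9, 0.0],  # an empty rooftop
}

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vector: List[float], top_k: int = 2) -> List[Tuple[str, float]]:
    """Rank photos by similarity to the query vector, best match first."""
    scored = [(name, cosine(query_vector, vec)) for name, vec in PHOTO_VECTORS.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hypothetical embedding of "Churchill on a roof smoking a cigar".
for name, score in search([1.0, 1.0, 1.0]):
    print(name, round(score, 3))
```

No photo here carries a caption or tag; the ranking comes entirely from how close each vector sits to the query, which is why metadata becomes less important in this style of search.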
Short Term: Generative AI Models (GPT, Llama, PaLM); ENPS -> Web Articles; Box Scores -> Articles; Audio Creation (ElevenLabs); Media (MidJourney, Adobe Firefly, HeyGen); ChatGPT PlugIns; Advanced Data Analysis (spreadsheets, insights, code writing); Speech to Text (Whisper, Otter.ai, Captions)
Mid Term: Agency; Multimodal; Embodiment
Long Term
Mid Term – Embodiment
• Using a Large Language Model as an interface to accomplish tasks on a network is AGENCY
• Using a Large Language Model as an interface with a physical robot or machine is EMBODIMENT
Mid Term – Embodiment
The robot’s program does not need to know what rice chips are… nor where the drawer is… the LLM works with the video sensors and derives it… and commands it.
Again... “mid-term” is now.
“According to Google, when given a high-level command, such as bring me the rice chips from the drawer, PaLM-E can generate a plan of action for a mobile robot and execute the actions by itself.” – March 2023
Mid Term – Embodiment
• Using a Large Language Model as an interface to accomplish tasks online is AGENCY
• Using a Large Language Model as an interface with a robot or machine is EMBODIMENT
• The human says a command in plain English.
• The LLM translates the command into the code required to execute.
• If you had a Big Trak or used the LOGO programming educational tool, imagine just talking to “the turtle” as opposed to having to program it line by line.
• Recently, a large language model embedded in a robot dog was able to INTUITIVELY figure out the dog’s programming code and guide the dog to do plain English tasks like “Go to the living room and bring me the red sock off of the couch”
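The "talk to the turtle" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions: `translate` is a hypothetical stand-in for the LLM (here just a pattern match, where a real model would handle far looser phrasing), and `execute` is a tiny invented turtle simulator, not the LOGO language or any robot's real control code.

```python
import math
import re
from typing import List, Tuple

Step = Tuple[str, float]

def translate(command: str) -> List[Step]:
    """Hypothetical stand-in for the LLM: plain English in, turtle steps out."""
    pattern = r"(forward|back|left|right)\s+(\d+)"
    return [(verb, float(amount)) for verb, amount in re.findall(pattern, command.lower())]

def execute(steps: List[Step]) -> Tuple[float, float]:
    """Run the steps on a simulated turtle and return its final (x, y)."""
    x = y = 0.0
    heading = 90.0  # degrees; 90 means facing "up" the page
    for op, value in steps:
        if op == "left":
            heading += value
        elif op == "right":
            heading -= value
        else:
            sign = 1.0 if op == "forward" else -1.0
            x += sign * value * math.cos(math.radians(heading))
            y += sign * value * math.sin(math.radians(heading))
    return round(x, 6), round(y, 6)

steps = translate("Go forward 10, turn right 90, then forward 5")
print(steps)
print(execute(steps))
```

The split matters: the human only ever writes the sentence, the translator produces the program, and the machine only ever sees the program.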
Short Term: Generative AI Models (GPT, Llama, PaLM); ENPS -> Web Articles; Box Scores -> Articles; Audio Creation (ElevenLabs); Media (MidJourney, Adobe Firefly, HeyGen); ChatGPT PlugIns; Advanced Data Analysis (spreadsheets, insights, code writing); Speech to Text (Whisper, Otter.ai, Captions)
Mid Term: Agency; Multimodal; Embodiment; Artificial General Intelligence (more toward “long term”)
Long Term
Long Term – Artificial General Intelligence (AGI)
• OpenAI defines AGI as “highly autonomous systems that outperform humans at most economically valuable work”
• A combination of multimodality, intelligence, agency, embodiment… and exponential improvement.
• AGI Predictions
• DeepMind CEO: A few years
• Alan Thompson: 2 years
• Elon Musk: 5 years
• Ray Kurzweil: 6 years
• Geoffrey Hinton: 5-10 years
• Sam Altman: <10 years
10 Months of “otter on a plane using wifi”
• October 2022
• November 2022
• March 2023
Let’s revisit short term, the novelty stage…
Source: MidJourney & Ethan Mollick
Long term sneaks up on us…
• 1964: 1 megaflop = $5M (CDC 6600)
• 1985: 1.9 gigaflops = $15M (Cray-2)
• 1997: 1 teraflop = $46M (ASCI Red)
• 2023: 35 teraflops = $999 (iPhone 15)
Source: Ethan Mollick
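The scale of that drop is easier to see as a price per unit of compute. The short Python sketch below takes the four figures on the slide at face value (prices as listed, with no inflation adjustment, which is an assumption of this example) and converts each to dollars per gigaflop.

```python
from typing import List, Tuple

# (machine, year, flops, price in dollars), copied from the slide as listed.
MACHINES: List[Tuple[str, int, float, float]] = [
    ("CDC 6600", 1964, 1e6, 5_000_000),
    ("Cray-2", 1985, 1.9e9, 15_000_000),
    ("ASCI Red", 1997, 1e12, 46_000_000),
    ("iPhone 15", 2023, 35e12, 999),
]

def dollars_per_gigaflop(flops: float, price: float) -> float:
    """Price divided by performance expressed in gigaflops (1e9 flops)."""
    return price / (flops / 1e9)

for name, year, flops, price in MACHINES:
    print(f"{year} {name}: ${dollars_per_gigaflop(flops, price):,.2f} per gigaflop")
```

On these numbers the price of a gigaflop falls from billions of dollars in 1964 to a few cents in 2023.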
Short Term: Generative AI Models (GPT, Llama, PaLM); ENPS -> Web Articles; Box Scores -> Articles; Audio Creation (ElevenLabs); Media (MidJourney, Adobe Firefly, HeyGen); ChatGPT PlugIns; Advanced Data Analysis (spreadsheets, insights, code writing); Speech to Text (Whisper, Otter.ai, Captions)
Mid Term: Agency; Multimodal; Embodiment
Long Term: Artificial General Intelligence
These are my foundations for thinking about AI
Thank you.


AI Mid-Term Outlook: National Association of Broadcasters


Editor's Notes

  1. • Partnering with RTDNA on policy and ethics
     • Met with the Associated Press AI team
     • Leading BLOX product advisory board
     • Presenting to the ITG in September
     • Created an in-house group to test and learn: EMPOWER
     • New models check their work and reflect before responding, catch errors, and suggest improvements. It’s jarring.
     • Written: transferring existing captions/scripts into articles, grammar, clarity, script critique, summaries/localization of documents
     • Sports: raw box scores into scripts
     • Audio: promotion voice track automation
     • Visual: contest and promotional clip art
     • Data: large dataset manipulation, insights, graphs, tables
     • Code: Excel, Python, HTML, JavaScript