Large Language Models - Chat AI
David Rostcheck
Data Scientist & Cross-Disciplinary Technology Leader, Independent Contractor Developer
5/9/2023
Large Language Model Chat AI
Where are we and where are we going?
This talk is:
• A broad general overview of the current state of generative AI
• Deep dives into specific technology areas as needed (ex.
Retrieval Augmented Generation pattern)
Agenda
▪ Current landscape
▪ Human vs. AI intelligence
▪ Can AIs really “think”?
▪ Social and market implications
▪ Use of Generative AI at work
▪ What is coming?
About me
▪ Education: Physics
▪ Background: AI, data science, cognitive science
▪ Past roles: software engineer, architect, evangelist, data
scientist
Note: on areas representing frontiers of AI and cognitive science research issues,
assessments and opinions are my own unless otherwise cited and do not reflect those
of any current, past, or future employer
What’s going on?
• We’re in a technology breakout
of generative AI
• Current areas: text, visual
images, audio generation
• ChatGPT reached 1M users
in 5 days – fastest breakout
ever for a foundational
technology
What are LLMs good at?
▪ Creative tasks (especially writing but also other tasks)
▪ Writing code
▪ Summarizing text
▪ Strategic analysis
▪ Understanding legal and tax codes
▪ General advice
▪ Evaluating candidates / writing resumes
Tech: what’s driving the breakout
▪ Transformer (attention-based) architecture, originally applied to text
processing, turns out to be generally applicable to other modalities (speech,
vision, music)
▪ Training on the body of language bootstraps ability to process information
(intelligence) generally
▪ Understanding of data scaling laws (roughly 20 training tokens per parameter
currently seems ideal)
▪ Open source breakout: models training other models (LLaMA)
▪ Weight quantization to 8- or 4-bit integers allows inference on laptop-class
machines (gpt4all)
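The last two bullets reduce to simple arithmetic. A minimal sketch, assuming the ~20-tokens-per-parameter ratio cited above (illustrative, not a precise scaling-law fit):

```python
# Back-of-envelope arithmetic for the scaling-law and quantization bullets.
# The 20:1 token-to-parameter ratio is the rough figure from the slide.

def optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Compute-optimal training tokens under a 20-tokens-per-parameter rule."""
    return n_params * tokens_per_param

def approx_size_gb(n_params: float, bits: int) -> float:
    """Approximate weight storage in GB at a given quantization width."""
    return n_params * bits / 8 / 1e9

# A 7B-parameter model: ~140B training tokens; 3.5 GB at 4-bit vs 28 GB at fp32.
print(optimal_tokens(7e9))      # 140000000000.0
print(approx_size_gb(7e9, 4))   # 3.5
print(approx_size_gb(7e9, 32))  # 28.0
```

The size estimate shows why 4-bit quantization puts a 7B model within laptop RAM.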
AI intelligence – how smart are LLMs
The best LLMs have extremely broad and high intelligence
Yeah, but can they really think?
▪ Like, do common-sense reasoning problems? Solve tests that require theory of
mind? Yes
▪ Don’t LLMs just follow the association between words, i.e. autocomplete on
steroids? Yes
▪ But then aren’t they just “stochastic parrots” that don’t really understand what they
are saying? No
▪ Most of human thinking is following language-based associations (much of human
intelligence is in the language). LLMs inherit that
▪ AIs do have underlying conceptual representations (higher-level features) of the
ideas they are dealing with, and that can be tested
▪ See Microsoft Research “Sparks of AGI” paper, talk
Difference between human and AI cognition
LLMs mimic human thinking (cognition) patterns but not emotional patterns. Most
of what humans do behaviorally, though, is not thinking in the sense of explicit
cognition; we mostly act out heuristics and/or respond to emotional drives. LLMs
do not have emotional drive, because they lack the mammalian emotional drive
circuitry (Panksepp's 7 circuits). They also currently have no long-term goals,
memory, or planning, although that is changing (w/ Auto-GPT & Retrieval
Augmented Generation).
LLMs do think like humans, because they inherit the same bootstrapped
knowledge structures via language. However the lack of emotional drive, memory,
higher goal, and long-term planning differentiate them.
What are LLM AIs bad at (and how do you fix it)?
▪ They are not search engines and not fact-based. They recall information
from memory the way humans do
▪ “Hallucination” and being “confidently wrong”
▪ These are really confabulation (a memory error) – filling in a plausible story where the
model believes it knows something, but doesn’t really
▪ Examples: song lyrics (removed from training data set)
▪ Tech: Ways to fix this:
▪ Fine-tuning on data sets: adjusts the model’s weights and teaches it new tasks. But
usually not the right solution
▪ Feed context into the prompt: generally applicable and works well
▪ More generally: Retrieval-Augmented Generation pattern: first search data, then give it to
the LLM to digest (Bing Search, Perplexity.ai)
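The Retrieval-Augmented Generation pattern above can be sketched in a few lines. The retriever here is a toy word-overlap ranker standing in for a real vector store, and the final LLM call is omitted; the function names are invented for illustration, not any product's API:

```python
# RAG sketch: search the data first, then feed the hits into the prompt so the
# model digests retrieved text instead of recalling (and confabulating) from
# memory. A real system would use embeddings and then send `prompt` to an LLM.
import re

def tokens(text: str) -> set[str]:
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"[a-z]+", text.lower()))

def search(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query; keep the top k."""
    q = tokens(query)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Ground the model in retrieved text rather than its own memory."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}\nAnswer:"

docs = [
    "The transformer architecture is attention-based.",
    "Confabulation is a memory error.",
    "Quantization shrinks model weights.",
]
prompt = build_prompt("What is confabulation?",
                      search("What is confabulation?", docs))
print(prompt)
```

This is the same first-search-then-digest flow Bing Search and Perplexity.ai use, just with the retrieval step reduced to word overlap.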
Good news – human/AI synergy
Social and market implications
▪ The internet was disruptive; human-level AI is much more disruptive
▪ Expect serious, sustained disruption, primarily in creative jobs (coding,
copywriting) and knowledge-based jobs (technical support, law,
accounting), both positive and negative
▪ Higher-end workers who are able to leverage AI will become much more
productive
▪ Mediocre creative and knowledge-based workers face pressure
▪ Educational models face great pressure to restructure
▪ New fields being created (ex. prompt engineering)
▪ Some organizations are leaning aggressively into AI (ex. consulting), others
are lagging
LLMs at work (general)
▪ Fishbowl survey (11,793 professionals, 1/30/23):
▪ 43% using ChatGPT or other AI tools at work
▪ 32% with management awareness (since then more orgs have AI policies)
▪ ResumeBuilder survey (2/27/23):
▪ 49% of companies currently use ChatGPT; 30% plan to
▪ 48% of companies using ChatGPT say it’s replaced workers
▪ 25% of companies using ChatGPT have already saved $75k+
▪ 93% of current users say they plan to expand their use of ChatGPT
▪ 90% of business leaders say ChatGPT experience is a beneficial skill for job seekers
▪ Samsung confidential information breach via ChatGPT (TechRadar article)
What’s coming next for LLMs? Short term
▪ Longer-term memory augmentation
▪ Incorporating web search (already in Bing Search, perplexity.ai)
▪ Strategy planning (AutoGPT) becomes widespread
▪ LLMs using tools (ex. Wolfram Language to solve math problems)
▪ LLMs out of containment (real-time internet access)
▪ Embodiment (robots)
▪ Explosion of AI models of different sizes and capabilities
▪ LLM-class AI on laptops and phones
(these are all technologies that are already here but not widespread yet)
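The tool-use bullet works by letting the model emit a structured request that the host program executes, feeding the result back into the conversation. A minimal sketch of that dispatch loop, with the model side mocked (the JSON format and tool names here are invented for illustration):

```python
# Tool-use dispatch sketch: the LLM emits a structured tool request, the host
# runs the tool, and the result goes back to the model as text. Only the
# dispatch logic is real; the model's output is mocked.
import json

# Toy math tool; a production system might call out to Wolfram Language here.
TOOLS = {"calculator": lambda expr: eval(expr, {"__builtins__": {}})}

def handle(model_output: str) -> str:
    """Run the requested tool if the output is a tool call, else pass through."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # plain answer, no tool requested
    result = TOOLS[call["tool"]](call["input"])
    return f"Tool result: {result}"

# A mocked model response asking for arithmetic help:
print(handle('{"tool": "calculator", "input": "17 * 23"}'))  # Tool result: 391
```

This is why tool use fixes the math weakness: the model only has to decide *when* to call the calculator, not compute the answer itself.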
What’s coming next for Generative AI? Medium term
▪ Creativity explosion
▪ Intense human/AI collaboration in work and art
▪ Economic disruption, with underlying strong growth bias
▪ Explosion of AI actors leading to a complex landscape
▪ Governments and social systems challenged to align to change
▪ LLM hacking and security tools (security arms race)
▪ LLM impersonation of humans becomes a serious issue for identity verification
(scams, bank validation, etc.)
(these are extensions of existing trends)
What’s coming next for Generative AI? Longer term
We could possibly see:
▪ An increasingly hybrid human/AI society
▪ Complex human psychological responses
▪ Autonomous robots entering society at scale
▪ Narrative conflict (ex. generative AI video indistinguishable from reality)
▪ Nation-states and cultures pursuing different AI training goals
▪ The rise of institutions new and different from those of today?
(these are completely speculative but informed by research)
AI Safety and Alignment
▪ “safety” – in practice has primarily focused on etiquette
▪ “alignment” – much more serious: assuring effective human/AI cooperation
▪ Industry focus has been on safety; alignment now gaining prominence
▪ Future of Life Institute – tech leaders issue letter calling for 6-month pause in training more
advanced AI
▪ Seems unlikely to happen
▪ Nations searching for regulatory structures:
▪ Reactive: Italy bans ChatGPT
▪ Proactive: UK national advanced AI initiative
▪ There are control points (ex. restriction of foundational models, GPUs) but they are being
rapidly overcome (ex. open source training sets, Dolly 2)
How do I stay up to date?
▪ State of the AI space:
▪ Newsletters (free and paid) ex. Lifearchitect.ai/memo
▪ AI Explained channel on YouTube
▪ Tech & learning to build:
▪ YouTube: James Briggs, David Shapiro
▪ AI Research – my archive of experiments and notes