This is the slideshow for a presentation I gave as part of my graduate coursework at the Institute for Innovation and Public Purpose at University College London (UCL IIPP). Drawing on the work of IIPP professors including Carlota Perez (techno-economic paradigms), Mariana Mazzucato (“The Entrepreneurial State”), and Tim O’Reilly, I evaluate the innovation trajectory of Deep Neural Networks as a method of machine learning. I trace the history of machine learning to the present day and conclude that while Deep Neural Networks have not yet reached technological maturity, they are already starting to encounter barriers to exponential growth and innovation. These slides were designed to be read independently of the spoken portion. If you found this useful or interesting, please message me on LinkedIn! - Justin Beirold
1. Institute for Innovation
and Public Purpose
Presentation
Deep Neural Networks for Machine Learning
Justin Beirold
Module 1: Public Value and Public Purpose
Professor Antonio Andreoni
18 December 2020
2. DEFINITIONS
What is Machine Learning?
• Artificial Intelligence (AI): “A broad discipline with the goal of
creating intelligent machines, as opposed to the natural intelligence
that is demonstrated by humans and animals. It has become a
somewhat catch-all term that nonetheless captures the long-term
ambition of the field to build machines that emulate and then exceed
the full range of human cognition.” (Benaich and Hogarth, 2020)
• Artificial General Intelligence (AGI): also known as the
“singularity”, an AGI is an “AI that is as capable of learning
intellectual tasks as humans are…” (Berruti/McKinsey and Co.,
2020)
• Machine Learning (ML): “Machine-learning algorithms use statistics to
find patterns in massive amounts of data. And data, here,
encompasses a lot of things—numbers, words, images, clicks, what
have you. If it can be digitally stored, it can be fed into a machine-
learning algorithm… Machine learning is the process that powers
many of the services we use today— recommendation systems like
those on Netflix, YouTube, and Spotify; search engines like Google
and Baidu; social-media feeds like Facebook and Twitter; voice
assistants like Siri and Alexa. The list goes on.” (Hao/MIT Tech Review,
2018)
3. DEFINITIONS
What are Deep Neural Networks?
• Artificial Neural Networks (ANN): Loosely
modeled after the human brain, a Neural Network is
“…a computing system made up of a number of
simple, highly interconnected processing elements,
which process information by their dynamic state
response to external inputs.” (Caudill, 1987)
• Deep Learning/Deep Neural Networks: “Deep
learning is machine learning on steroids: it uses a
technique that gives machines an enhanced ability
to find—and amplify—even the smallest patterns.
This technique is called a Deep Neural Network
[or Convolutional Neural Network]—deep because
it has many, many layers of simple computational
nodes that work together to munch through data
and deliver a final result in the form of the
prediction.” (Hao/MIT Tech Review, 2018)
Source: Author; Adapted from Donahue, 2018
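The “many, many layers of simple computational nodes” in Hao’s definition can be made concrete with a toy sketch. This is purely illustrative (the weights and layer sizes are invented for the example, not taken from any real model): each layer multiplies its inputs by weights, sums them, and squashes the result through an activation function, and a “deep” network is simply such layers stacked so that one layer’s output becomes the next layer’s input.

```python
import math

def layer(inputs, weights, biases):
    """One layer of simple nodes: weighted sum of inputs + sigmoid activation."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(1 / (1 + math.exp(-total)))  # sigmoid squashes to (0, 1)
    return outputs

# "Deep" just means stacked: each layer feeds the next.
# All weights here are arbitrary example values.
hidden = layer([0.5, -1.2], weights=[[0.8, 0.2], [-0.5, 0.9]], biases=[0.1, -0.3])
output = layer(hidden, weights=[[1.5, -2.0]], biases=[0.4])

print(output)  # a single prediction between 0 and 1
```

Real networks differ only in scale: millions of nodes, weights learned from data rather than hand-picked, and more efficient matrix arithmetic.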
4. MACHINE LEARNING
VS.
TRADITIONAL PROGRAMMING
- The opposite of the traditional programming approach
- Feed the model a massive amount of data, and it will
recognize patterns to make inferences and predictions
- Highly useful in cases where large datasets are available
- Superior when the “rules” of a system are difficult or impossible
for humans to define and write as code.
- Deep Neural Networks are the preferred ML model because
they become increasingly accurate as they consume more data
- Three types of learning models
- Supervised Learning – pre-labeled data
- Unsupervised Learning – unlabeled data
- Transfer Learning – using knowledge from one
domain to improve performance in others.
Source: (Google I/O, 2019)
Source: (Dossman, 2018)
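The contrast above can be sketched in a few lines of pure Python. Taking the simple rule y = 2x − 1 as a stand-in (a toy example of my own, in the spirit of the cited Google I/O talk): in traditional programming a human writes the rule; in machine learning the machine infers it from labeled examples by repeatedly predicting, measuring its error, and adjusting.

```python
# Traditional programming: the human writes the rule explicitly.
def rule(x):
    return 2 * x - 1

# Machine learning (supervised): the machine infers the rule from examples.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [rule(x) for x in xs]           # pre-labeled training data

w, b = 0.0, 0.0                      # the model starts knowing nothing
lr = 0.01                            # learning rate
for _ in range(5000):                # repeat: predict, measure error, adjust
    for x, y in zip(xs, ys):
        error = (w * x + b) - y      # prediction minus ground truth
        w -= lr * error * x          # nudge the weight against the error
        b -= lr * error              # nudge the bias against the error

print(round(w, 2), round(b, 2))      # the learned parameters approach 2 and -1
```

This is supervised learning in miniature: the data is pre-labeled with the correct answers, and the model converges on the rule without anyone writing it down.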
5. MACHINE LEARNING
VS.
TRADITIONAL PROGRAMMING
“In a traditional approach to building an algorithmic system for recognizing and sorting data, the programmer
identifies the attributes to be examined, the acceptable values and the action to be taken… Using a machine-
learning approach, a system is shown many, many examples of good and bad data in order to train a model of
what good and bad looks like. The programmer may not always know entirely what features of the data
the machine-learning model is relying on; the programmer knows only that it serves up results that
appear to match or exceed human judgment against a test data set. Then the system is turned loose on
real-world data. After the initial training, the system can be designed to continue to learn.” (O’Reilly, 2020)
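The train-then-test workflow O’Reilly describes can be sketched with a deliberately tiny, hypothetical “good vs. bad” classifier (the scores and labels below are invented for illustration): fit a decision rule on training examples, then measure accuracy on a held-out test set before trusting the system on real-world data.

```python
# Labeled examples: (score, label). Values are invented for illustration.
train = [(0.2, "bad"), (0.3, "bad"), (0.4, "bad"),
         (0.7, "good"), (0.8, "good"), (0.9, "good")]
test  = [(0.1, "bad"), (0.6, "good"), (0.95, "good")]

# "Training": place the threshold midway between the two classes' means.
bad_scores  = [s for s, label in train if label == "bad"]
good_scores = [s for s, label in train if label == "good"]
threshold = (sum(bad_scores) / len(bad_scores) +
             sum(good_scores) / len(good_scores)) / 2

def model(score):
    return "good" if score >= threshold else "bad"

# Evaluation: accuracy on data the model never saw during training.
accuracy = sum(model(s) == label for s, label in test) / len(test)
print(f"threshold={threshold:.2f}, test accuracy={accuracy:.0%}")
```

As O’Reilly notes, the programmer never wrote a definition of “good” or “bad”; the only evidence the rule works is its performance against the test set.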
6. A FEW
EXAMPLES
OF MACHINE
LEARNING
ADVANTAGES
• Radiology
• Traditional Approach: Write software which tells the
computer to look at a brain scan and determine whether
a tumor is cancerous. Requires you to contrive a formula
for detecting cancer. Very expensive with low accuracy.
• ML Approach: Feed the model millions of brain scan
images. The model finds patterns in the images and
makes its own rules for recognizing cancer. Ask the
model to analyze a new scan. If it gets an answer wrong,
the error is “backpropagated” through the network,
adjusting the relevant neurons so that it is less likely to
repeat the same mistake. The ML approach is far superior
to traditional programming, and now nearly as accurate as
the best human radiologists. (Do, Song, and Chung,
2020)
• Shredded Document Reconstruction
• Traditional Approach: Meticulously comb through a
mountain of shredded documents and piece them
together by hand. Can take several months.
• ML Approach: Scan the shredded pieces into a
computer database. The ML model recognizes word
and sentence fragments and re-assembles the
documents within hours. Some models are capable of
inferring gaps if pieces are missing. Orders of
magnitude faster + more accurate. (Paixão et al, 2020)
• Legal Research
• Traditional Approach: Attorney hires a paralegal to
conduct a thorough review of case law, legal opinions,
litigation outcomes, etc. Can take weeks or months
depending on the case.
• ML Approach: ML model with access to a legal
database finds every relevant piece of legal research in
minutes. (Donahue, 2018)
*Note: In each case,
the AI does not
replace humans, but
can do the boring and
monotonous tasks
dramatically more
efficiently. Humans
are still required to
analyze and interpret
the results (for now).
Radiologists and
attorneys are not out
of a job, but radiologic
technicians and
paralegals will be.
Source: Brooklyn 99/NBC
Source: The-Scientist.com
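The “backpropagation” step from the radiology example can be sketched on the smallest possible network: one input, one hidden node, one output node. This is a hedged illustration only (the numbers and network are invented, not any real radiology model): when the prediction is wrong, the error is carried backwards through the layers by the chain rule, and each weight is nudged in the direction that reduces it.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

x, target = 1.5, 1.0        # one input feature, desired output ("cancerous" = 1)
w1, w2 = 0.2, -0.4          # input->hidden and hidden->output weights (arbitrary)
lr = 0.5                    # learning rate

for step in range(200):
    # Forward pass: the input flows through the two layers of nodes.
    h = sigmoid(w1 * x)
    y = sigmoid(w2 * h)

    # Backward pass: the error is propagated back layer by layer.
    d_y = (y - target) * y * (1 - y)   # error signal at the output node
    d_h = d_y * w2 * h * (1 - h)       # error propagated to the hidden node
    w2 -= lr * d_y * h                 # adjust the output weight
    w1 -= lr * d_h * x                 # adjust the hidden weight

final = sigmoid(w2 * sigmoid(w1 * x))
print(round(final, 2))      # the prediction has moved toward the target of 1.0
```

Each pass makes the same kind of adjustment a real network makes across millions of weights, which is why accuracy improves as more labeled scans are consumed.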
7. TIMELINE: HISTORY OF NEURAL NETWORKS
Source: Author; inspiration and data from
Foote (2019); Mayo et al (2018)
8. HAVE DEEP NEURAL NETWORKS
REACHED MATURITY?
Source: Carlota Perez IIPP Lecture
Source: The Atlas, 2018
9. Source: Author; adapted from Carlota Perez IIPP lecture
HAVE DEEP
NEURAL
NETWORKS
REACHED
MATURITY?
- Open question, but I would argue
no
- Certainly not for Machine Learning
in general
- We can see hints of the limits to
DNNs, but they are still dominant
+ experiencing incremental
innovation at a rapid pace. (see
Slide 12 for more on limitations)
10. THE ENTREPRENEURIAL STATE
AND NEURAL NETWORKS
• 1957: The first operational Artificial Neural Network, the
Perceptron, was created at the Cornell Aeronautical Lab,
funded by the US Office of Naval Research. (Rosenblatt, 1957)
• 1989: First Artificial Neural Network for optical character
recognition developed by Bell Labs researchers based on
data from US Postal Service Office of Advanced Technology
in 1989 (LeCun et al, 1989)
• 2007: Apple’s Siri developed by the Stanford Research
Institute through funding from DARPA (Mazzucato, 2013)
• 2020: In DARPA’s AlphaDogfight Trials, an AI agent
beats a human F-16 pilot in a simulated dogfight. A trial
with real fighter jets is planned for 2024.
(DARPA, 2020)
Rosenblatt’s Perceptron (1960s), funded by the US Navy
Source: Cornell University
11. THE PRIVATE SECTOR’S DOMINANCE
• Large firms in the private sector
dominate in Neural Network research
and development because they have the
most data
• Silicon Valley firms, led by Google and
Facebook, have massive systems for
collecting data, which improves neural
network performance.
• Empirical scaling laws mean that the
larger the model, the more computing
power (including specialized hardware)
is required. (Benaich and Hogarth, 2020)
• Some estimates suggest that Google spent
over $10 million to train its Google Translate
model alone.
• ML makes the extraction of
algorithmic rents by tech platforms
more efficient. (Mazzucato,
Ryan-Collins, and Gouzoulis, 2020)
(O’Reilly, 2020)
Source: (Dossman, 2018)
12. THE LIMITS
OF DEEP
NEURAL
NETWORKS
• “Black Box”
• A neural network cannot explain why it
made a certain decision (hidden layers)
• Highly problematic if we are using their
predictions to make life or death
decisions.
• When an error occurs (such as racial
bias), it can be difficult to fix.
• “Last mile problem”
• Key to public trust/ accountability/
acceptability
• Underspecification
• Google researchers recently revealed that
their models can be far less accurate in the
“real world” than on training data.
(D’Amour/Google, 2020)
• No clear solutions to this, suggesting
innovation may slow
• Massive compute cost
• Only tech giants can really compete
14. BIBLIOGRAPHY
• AlphaDogfight Trials Go Virtual for Final Event. Defense Advanced Research Projects Agency. (2020, July). https://www.darpa.mil/news-events/2020-08-07.
• AlphaFold: a solution to a 50-year-old grand challenge in biology. Deepmind. (2020, November 30). https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-
grand-challenge-in-biology.
• AlphaGo: The story so far. Deepmind. (2020). https://deepmind.com/research/case-studies/alphago-the-story-so-far.
• Benaich, N., & Hogarth, I. (2020). State of AI Report 2020. https://www.stateof.ai/.
• Berruti, F., Nel, P., & Whiteman, R. (2020, October 20). An executive primer on artificial general intelligence. McKinsey & Company. https://www.mckinsey.com/business-
functions/operations/our-insights/an-executive-primer-on-artificial-general-intelligence.
• Caudill, M. (1987). Neural networks primer, part I. AI Expert.
• D'Amour, A., Heller, K., & Google . (2020, November). Underspecification Presents Challenges for Credibility in Modern Machine Learning. https://arxiv.org/pdf/2011.03395.pdf.
• Do, S., Song, K. D., & Chung, J. W. (2020). Basics of Deep Learning: A Radiologist's Guide to Understanding Published Radiology Articles on Deep Learning. Korean Journal of
Radiology, 21(1), 33–41. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6960318/
• Donahue, L. (2018, January 3). A Primer on Using Artificial Intelligence in the Legal Profession. Harvard Journal of Law & Technology. https://jolt.law.harvard.edu/digest/a-
primer-on-using-artificial-intelligence-in-the-legal-profession.
• Dossman, C. (2018, October 15). Deep Learning Performance Cheat Sheet. Medium. https://towardsdatascience.com/deep-learning-performance-cheat-sheet-21374b9c4f45.
• Foote, K. D. (2019, March 13). A Brief History of Machine Learning. DATAVERSITY. https://www.dataversity.net/a-brief-history-of-machine-learning/.
• Hao, K. (2018, November 17). What is machine learning? MIT Technology Review. https://www.technologyreview.com/2018/11/17/103781/what-is-machine-learning-we-drew-
you-another-flowchart/.
• Hopfield, J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences of the United
States of America, 2554–2558. https://doi.org/10.1073/pnas.79.8.2554
• LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., … Bell Labs. (1989). Backpropagation Applied to Handwritten Zip Code Recognition. Neural
Computation/Massachusetts Institute of Technology, 1, 541–551. https://doi.org/10.1162/neco.1989.1.4.541
• Mayo, H., Punchihewa, H., Emile, J., & Morrison, J. (2018). History of Machine Learning. Imperial College London. https://www.doc.ic.ac.uk/~jce317/history-
machine-learning.html.
• Mazzucato, M. (2013). The Entrepreneurial State: Debunking Public vs. Private Sector Myths. Anthem Press.
• Mazzucato, M., Ryan-Collins, J., & Gouzoulis, G. (2020, June 29). Theorising and mapping modern economic rents. UCL Institute for Innovation and Public Purpose Working
Paper Series . https://www.ucl.ac.uk/bartlett/public-purpose/sites/public-purpose/files/final_iipp-wp2020-13-theorising-and-mapping-modern-economic-rents_8_oct.pdf.
• McCulloch, W., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115–133.
15. BIBLIOGRAPHY
• Minsky, M., & Papert, S. (1969). Perceptrons: an introduction to computational geometry. The MIT Press.
• Moroney, L., Allison , K., & Google I/O. (2019, May). Machine Learning Zero to Hero. YouTube. [Video] https://www.youtube.com/watch?v=VwVg9jCtqaU&t=136s.
• O'Reilly, T. (2020, August 18). We Have Already Let The Genie Out of The Bottle. The Rockefeller Foundation. https://www.rockefellerfoundation.org/blog/we-
have-already-let-the-genie-out-of-the-bottle/.
• Paixão, T. M., Berriel, R. F., Boeres, M. C. S., Koerich, A. L., Badue, C., De Souza, A. F., & Oliveira-Santos, T. (2020). Fast(er) Reconstruction of Shredded
Text Documents via Self-Supervised Deep Asymmetric Metric Learning. In Conference on Computer Vision and Pattern Recognition. Seattle, Washington.
• Perez, C. (2002). Technological revolutions and financial capital: the dynamics of bubbles and golden ages. Edward Elgar Publishing, Inc.
• Quartz. (2018, November 16). The dramatic rise of the term "deep learning" in research. Atlas. https://theatlas.com/charts/ByhdcCsp7.
• Rosenblatt, F. (1958). The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Psychological Review, 65(6), 386–408.
https://doi.org/10.1037/h0042519
• Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533–536.
https://doi.org/10.1038/323533a0
• Turing, A. (1950). Computing Machinery and Intelligence. Mind, LIX(236), 433–460. https://doi.org/10.1093/mind/LIX.236.433