SlideShare une entreprise Scribd logo
1  sur  50
Télécharger pour lire hors ligne
Copyright © 2014, 2015 John L. Gustafson
AN ENERGY-EFFICIENTAND
MASSIVELY PARALLELAPPROACH
TO VALID NUMERICS
John L. Gustafson, Ph.D.
CTO, Ceranovo
Director, Massively Parallel Technologies, Etaphase,
Clustered Systems
Copyright © 2014, 2015 John L. Gustafson
Big problems facing computing
•  Too much energy and power needed per calculation
•  More hardware parallelism than we know how to use
•  Not enough bandwidth (the “memory wall”)
•  Rounding errors more treacherous than people realize
•  Rounding errors prevent use of parallel methods
•  Sampling errors turn physics simulations into guesswork
•  Numerical methods are hard to use, require experts
•  IEEE floats give different answers on different platforms
Copyright © 2014, 2015 John L. Gustafson
The ones vendors care most about
•  Too much energy and power needed per calculation
•  More hardware parallelism than we know how to use
•  Not enough bandwidth (the “memory wall”)
•  Rounding errors more treacherous than people realize
•  Rounding errors prevent use of parallel methods
•  Sampling errors turn physics simulations into guesswork
•  Numerical methods are hard to use, require experts
•  IEEE floats give different answers on different platforms
Copyright © 2014, 2015 John L. Gustafson
Too much power and heat needed
• Huge heat sinks
• 20 MW limit for exascale
• Data center electric bills
• Mobile device battery life
• Heat intensity means bulk
• Bulk increases latency
• Latency limits speed
Copyright © 2014, 2015 John L. Gustafson
More parallel hardware than we can use
•  Huge clusters usually partitioned into 10s, 100s of cores
•  Few algorithms exploit millions of cores except LINPACK
•  Capacity is not a substitute for capability!
Copyright © 2014, 2015 John L. Gustafson
Not enough bandwidth (“Memory wall”)
Operation Energy
consumed
Time
needed
64-bit multiply-add 200 pJ 1 nsec
Read 64 bits from cache 800 pJ 3 nsec
Move 64 bits across chip 2000 pJ 5 nsec
Execute an instruction 7500 pJ 1 nsec
Read 64 bits from DRAM 12000 pJ 70 nsec
Notice that 12000 pJ @ 3 GHz = 36 watts!
One-size-fits-all overkill 64-bit precision wastes energy, storage bandwidth
Copyright © 2014, 2015 John L. Gustafson
Happy 100th Birthday, Floating Point
1914: Torres proposes automatic computing with a fraction and a scaling factor.
2014: We still use a format designed for World War I hardware capabilities!
Copyright © 2014, 2015 John L. Gustafson
Floats designed for visible scratch work
•  OK for manual calculations
•  Operator sees, remembers errors
•  Can head off overflow, underflow
•  Automatic math hides all that
•  No one sees processor “flags”
•  Disobeys algebraic laws
•  Wastes bit patterns as NaNs
•  IEEE 754 “standard” is really the
IEEE 754 guideline; optional
rules spoil consistent results
Copyright © 2014, 2015 John L. Gustafson
Analogy: Printing in 1970 vs. 2014
1970: 30 sec per page 2013: 30 sec per page
Faster technology is for better prints,
not thousands of low-quality prints per second.
Why not do the same thing with computer arithmetic?
Copyright © 2014, 2015 John L. Gustafson
This is just… sad.
Copyright © 2014, 2015 John L. Gustafson
Floats prevent use of parallelism
•  No associative property for floats
•  (a + b) + (c + d) (parallel) ≠ ((a + b) + c) + d (serial)
•  Looks like a “wrong answer”
•  Programmers trust serial, reject parallel
•  IEEE floats report rounding, overflow, underflow in
processor register bits that no one ever sees.
Copyright © 2014, 2015 John L. Gustafson
A New Number Format: The Unum
•  Universal numbers
•  Superset of IEEE types,
both 754 and 1788
•  Integersèfloatsèunums
•  No rounding, no overflow to
∞, no underflow to zero
•  They obey algebraic laws!
•  Safe to parallelize
•  Fewer bits than floats
•  But… they’re new
•  Some people don’t like new
“You can’t boil the ocean.”
—Former Intel exec, when shown the unum idea
Copyright © 2014, 2015 John L. Gustafson
Three ways to express a big number
Avogadro’s number: ~6.022×1023 atoms or molecules
Unum (29 bits):
0 11001101 111111100001 1 111 1011
sign exp. frac. ubit exp. size frac. size
Self-descriptive “utag” bits track
and manage uncertainty, exponent
size, and fraction size
utag
IEEE Standard Float (64 bits):
0 10001001101 1111111000010101010011110100010101111110101000010011
sign exponent (scale) fraction
Sign-Magnitude Integer (80 bits):
0 1111111100001010101001111010001010111111010100001001010011000000000000000000000
sign Lots of digits
Copyright © 2014, 2015 John L. Gustafson
Why unums use fewer bits than floats
•  Exponent smaller by about 5 – 10 bits, typically
•  Trailing zeros in fraction compressed away, saves ~2 bits
•  Shorter strings for more common values
•  Cancellation removes bits and the need to store them
Unum (29 bits):
IEEE Standard Float (64 bits):
0 10001001101 1111111000010101010011110100010101111110101000010011
0 11001101 111111100001 1 111 1011
Copyright © 2014, 2015 John L. Gustafson
Open ranges, as well as exact points
Bit string meanings
using IEEE Float rules
Bit string meanings
in unum format
Complete representation of all real numbers using a finite number of bits
Copyright © 2014, 2015 John L. Gustafson
Ubounds are the hull of 1 or 2 unums
• Includes closed, open, half-open intervals
• Includes ±∞, empty set, quiet and signaling NaN
• Unlike traditional intervals, ubounds are closed
and lossless under set operations
Copyright © 2014, 2015 John L. Gustafson
The three layers of computing
Grammar
rules for exact,
inexact
High-efficiency
floats and real
intervals
Limited set of
fused
operations
Copyright © 2014, 2015 John L. Gustafson
The Warlpiri unums
Before the aboriginal Warlpiri of
Northern Australia had contact with
other civilizations, their counting
system was “One, two, many.”
Maybe they were onto something.
Copyright © 2014, 2015 John L. Gustafson
Fixed-size unums: faster than floats
•  Warlpiri ubounds are one byte, but closed system for reals
•  Unpacked unums pre-decode exception bits, hidden bit
Circuit required for
“IEEE half-precision
float = ∞?” Circuit required for
“unum = ∞?”
(any precision)
Copyright © 2014, 2015 John L. Gustafson
Floating Point II: The Wrath of Kahan
•  Berkeley professor William Kahan is the father of modern IEEE
Standard floats
•  Also the authority on their many dangers
•  Every idea to fix floats faces his tests that expose how new idea is
even worse
Working unum environment
completed August 13, 2013.
Can unums survive the
wrath of Kahan?
Copyright © 2014, 2015 John L. Gustafson
A Typical Kahan Challenge
•  Correct answer: (1, 1, 1, 1).
•  IEEE 32-bit: (0, 0, 0, 0) FAIL
•  IEEE 64-bit: (0, 0, 0, 0) FAIL
•  Myth: “Getting the same answer with increased precision means the
answer is correct.”
•  IEEE 128-bit: (0, 0, 0, 0) FAIL
•  Extended precision math packages: (0, 0, 0, 0) FAIL
•  Interval arithmetic: Um, somewhere between –∞ and ∞. EPIC FAIL
•  Unums, 6-bit average size: (1, 1, 1, 1) CORRECT
I have been unable to find a problem that “breaks” unum math.
Copyright © 2014, 2015 John L. Gustafson
Kahan’s “Smooth Surprise”
Find minimum of log(|3(1–x)+1|)/80 + x2 + 1 in 0.8 ≤ x ≤ 2.0
Plot, test using half a million
double-precision IEEE floats.
Shows minimum at x = 0.8.
FAIL
Plot, test using a few dozen
very low-precision unums.
Shows minimum where
x spans 4/3.
CORRECT
Copyright © 2014, 2015 John L. Gustafson
Kahan on the computation of powers
5.96046447753906250.875 = 4.76837158203125
I sent Kahan my fixed-cost, fixed-storage method, and he said it looked “impractical.”
I asked if he had a method that shows the following computation is exact:
Have not heard from him since.
Copyright © 2014, 2015 John L. Gustafson
Two can play this game, Professor K.
Unums can do anything floats can do, through explicit use of the guess function.
•  Stable fixed point found by floats, not by traditional intervals
•  Unums find both stable point, and unstable point at origin
•  Finding exact stable point is mathematically incorrect!
•  Adding a tiny wobble sin(x)/x destroys floats, but not unums.
Copyright © 2014, 2015 John L. Gustafson
Rump’s Royal Pain
•  Using IBM (pre-IEEE Standard) floats, Rump got
•  1.172603 in 32-bit precision
•  1.1726039400531 in 64-bit precision
•  1.172603940053178 in 128-bit precision
•  Using IEEE double precision: 1.18059x1021
•  Correct answer: –0.82739605994682136…!
Didn’t even get sign right
Compute 333.75y6 + x2(11x2y2 – y6 – 121y4 – 2) + 5.5y8 + x/(2y)
where x = 77617, y = 33096.
Unums: Correct answer to 23 decimals using an average
of only 75 bits per number. Not even IEEE 128-bit precision
can do that. Precision, range adjust automatically.
Copyright © 2014, 2015 John L. Gustafson
Some fundamental principles
Bound the answer as tightly as possible within
the numerical environment, or admit defeat.
•  No more guessing
•  No more “the error is O(hn)” type estimates
•  The smaller the bound, the greater the information
•  Performance is information per second
•  Maximize information per bit
•  Fused operations are always explicitly distinct from their
non-fused versions and results are identical across
platforms
Copyright © 2014, 2015 John L. Gustafson
Polynomials: bane of classic intervals
Dependency and closed endpoints lose information (amber)
Unum polynomial evaluator
loses no information.
Copyright © 2014, 2015 John L. Gustafson
Polynomial evaluation solved at last
Mathematicians have sought this for at least 60 years.
“Dependency Problem” creates sloppy
range when input is an interval
Unum evaluation refines answer to
limits of the environment precision
Copyright © 2014, 2015 John L. Gustafson
Uboxes and solution sets
•  A ubox is a multidimensional unum
•  Exact or ULP-wide in each dimension
•  Sets of uboxes constitute a solution set
•  One dimension per degree of freedom in solution
•  Solves the main problem with interval arithmetic
•  Super-economical for bit storage
•  Data parallel in general
Copyright © 2014, 2015 John L. Gustafson
Calculus considered harmful
•  Computers are discrete
•  Calculus is continuous
•  Ensures sampling errors
•  Changes problem to fit the tool
Copyright © 2014, 2015 John L. Gustafson
Deeply Unsatisfying Error Bounds
Error ≤ (b – a) h2 |f ʹ′ʹ′(ξ)| / 24
•  Classical numerical texts
teach this “error bound”:
•  What is f ʹ′ʹ′? Where is ξ?
What is the bound??
•  Bound is often infinite, which means no bound at all
•  “Whatever it is, it’s four times better if we make h half as
big” creates demand for supercomputing that cannot be
satisfied.
4x
Copyright © 2014, 2015 John L. Gustafson
Two “ubox methods”, both mindless
•  Paint bucket: find one solution point, test neighbors and classify as
solution or fail until no more neighbors to test
•  Works if solution is known to be a connected set
•  Requires a starting point “seed”
•  Wave front of trial uboxes can be computed in parallel
•  Try the universe: Use Warlpiri uboxes (4-bit precision) to tile all of n-
space; increment exponent and fraction size automatically
•  13n things to do in parallel (!)
•  Finds every solution, no matter what, since all of n-space is represented
•  Detects ill-posed problems and solves them anyway
•  Parallelism adjusts from 3 to trillions
Copyright © 2014, 2015 John L. Gustafson
Quarter-circle example
•  Suppose all we know is x2 + y2 = 1, and x and y are ≥ 0
•  Suppose we have at most 2 bits exponent, 4 bits fraction
Task:
Bound the quarter circle area.
Bound the value of π.
Copyright © 2014, 2015 John L. Gustafson
Set is connected; need a seed
•  We know x = 0, y = 1 works
•  Find its eight ubox
neighbors in the plane
•  Test x2 + y2 = 1, x ≥ 0, y ≥ 0
•  Solution set is green
•  Trial set is amber
•  Failure set is red
•  Stop when no more trials
Copyright © 2014, 2015 John L. Gustafson
Exactly one neighbor passes
•  Unum math automatically
excludes cases that floats
would accept
•  Trials are neighbors of new
solutions that
•  Are not already failures
•  Are not already solutions
•  Note: no calculation of
Not part of the unit circle
y = 1− x2
Copyright © 2014, 2015 John L. Gustafson
The new trial set
•  Five trial uboxes to test
•  Perfect, easy parallelism
for multicore
•  Each ubox takes only
15 to 23 bits
•  Ultra-fast operations
•  Skip to the final result…
Copyright © 2014, 2015 John L. Gustafson
The complete quarter circle
•  The complete solution, to
this finite precision level
•  Information is reciprocal of
green area
•  Use to find area under arc,
bounded above and below
•  Proves value of π to an
accuracy of 3%
•  No calculus needed, or
divides, or square roots
Copyright © 2014, 2015 John L. Gustafson
Compressed Final Result
•  Coalesce uboxes to largest
possible ULP values
•  Lossless compression
•  Total data set: 603 bits!
•  6x faster graphics than
current methods
Instead of ULPs being the
source of error, they are the
atomic units of computation
Copyright © 2014, 2015 John L. Gustafson
Fifth-degree polynomial roots
•  Analytic solution: There isn’t one.
•  Numerical solution: Huge errors from rounding
•  Unums: quickly return
x = –1, x = 2 as the exact
solutions. No rounding.
No underflow. Just…
the correct answer.
With as few as 4 bits
for the operands!
Copyright © 2014, 2015 John L. Gustafson
The power of open-closed endpoints
Root-finding
just works.
Copyright © 2014, 2015 John L. Gustafson
Classical Numerical Analysis
•  Time steps
•  Use position to estimate force
•  Use force to estimate acceleration
•  Update the velocity
•  Update the position
•  Lather, rinse, repeat
•  Accumulates rounding
and sampling error, both unknown
•  Cannot be done in parallel
start
Δt
M
Δt
Δt
Δt
Copyright © 2014, 2015 John L. Gustafson
A New Type of Parallelism
•  Space steps, not time steps
•  Acceleration, velocity bounded
in any given space interval
•  Find traversal time as a function
of space step (2D ubox)
•  Massively parallel!
•  No rounding error
•  No sampling error
•  Obsoletes existing
ODE methods
Copyright © 2014, 2015 John L. Gustafson
Pendulums Done Right
•  Physics teaches us it’s a
harmonic oscillator with
period
•  Force-fits nonlinear ODE
into linear ODE for which
calculus works.
•  WRONG answer
2π
g
L
Copyright © 2014, 2015 John L. Gustafson
Physical Truth vs. Force-Fit Solution
Bends the problem to fit solution methods
Copyright © 2014, 2015 John L. Gustafson
Uboxes for linear solvers
0.74 0.75 0.76 0.77 0.78
x
0.64
0.65
0.66
0.67
0.68
0.69
0.70
y
• If the A and b values in Ax=b
are rounded, the “lines” have
width from uncertainty
• Apply a standard solver, and
get the red dot as “the answer”,
x. A pair of floating-point
numbers.
• Check it by computing Ax and
see if it rigorously contains b.
Yes, it does.
• Hmm… are there any other
points that also work?
Copyright © 2014, 2015 John L. Gustafson
Float, Interval, and Ubox Solutions
0.7544 0.7546 0.7548 0.7550
x
0.6610
0.6612
0.6614
0.6616
0.6618
y
• Point solution (black dot) just gives
one of many solutions; disguises
answer instability
• Interval method (gray box) yields a
bound too loose to be useful
• The ubox set (green) is the best
you can do for a given precision
• Uboxes reveal ill-posed nature…
yet provide solution anyway
• Works equally well on nonlinear
problems!
Copyright © 2014, 2015 John L. Gustafson
Other Apps with Ubox Solutions
Imagine having provable bounds on
answers for the first time, yet with
easier programming, less storage, less
bandwidth use, less energy/power
demands, and abundant parallelism.
•  Photorealistic computer
graphics
•  N-body problems
•  Structural analysis
•  Laplace’s equation
•  Perfect gas models without
statistical mechanics
Copyright © 2014, 2015 John L. Gustafson
Revisiting the Big Challenges-1
•  Too much energy and power needed per calculation
•  Unums cut the main energy hog by about 50%
•  More hardware parallelism than we know how to use
•  Uboxes reveal vast sources of data parallelism, the easiest kind
•  Not enough bandwidth (the “memory wall”)
•  More use of CPU transistors, fewer bits moved to/from memory
•  Rounding errors more treacherous than people realize
•  Unums eliminate rounding error, automate precision choice
•  Rounding errors prevent use of multicore methods
•  Unums restore algebraic laws, eliminating the deterrent
Copyright © 2014, 2015 John L. Gustafson
Revisiting the Big Challenges-2
•  Sampling errors turn physics simulations into guesswork
•  Uboxes produce provable bounds on physical behavior
•  Numerical methods are hard to
use, require expertise
•  “Paint bucket” and “Try the universe” are brute force general
methods that need no expertise… not even calculus
Copyright © 2014, 2015 John L. Gustafson
The End of Error
•  A complete text on unums
and uboxes is available from
CRC Press as of this month:
http://www.crcpress.com/product/isbn/
9781482239867
•  Aimed at general reader;
mathematicians will hate its
casual style
•  Complete prototype
environment is available as
Mathematica notebook
through publisher
Thank you!

Contenu connexe

Tendances

Basics of embedded system design
Basics of embedded system designBasics of embedded system design
Basics of embedded system designK Senthil Kumar
 
introduction to Embedded System
introduction to Embedded Systemintroduction to Embedded System
introduction to Embedded SystemAnkur Soni
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networkPavanpreetKaur1
 
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Ono Shigeru
 
Eye interface technology, Electrooculography – A Proposed Model
Eye interface technology, Electrooculography – A Proposed ModelEye interface technology, Electrooculography – A Proposed Model
Eye interface technology, Electrooculography – A Proposed ModelSneha Joshi
 
Use of Artificial intelligence for chip design.pptx
Use of Artificial intelligence for chip design.pptxUse of Artificial intelligence for chip design.pptx
Use of Artificial intelligence for chip design.pptxowaisnaik2015
 
Final thesis presentation on bci
Final thesis presentation on bciFinal thesis presentation on bci
Final thesis presentation on bciRedwan Islam
 
what is neural network....???
what is neural network....???what is neural network....???
what is neural network....???Adii Shah
 
Classification of EEG Signals for Brain-Computer Interface
Classification of EEG Signals for Brain-Computer InterfaceClassification of EEG Signals for Brain-Computer Interface
Classification of EEG Signals for Brain-Computer InterfaceAzoft
 
Kernels in convolution
Kernels in convolutionKernels in convolution
Kernels in convolutionRevanth Kumar
 
Brain computer interfaces_useful
Brain computer interfaces_usefulBrain computer interfaces_useful
Brain computer interfaces_usefulTahir Zemouri
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature surveyAkshay Hegde
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural NetworkManasa Mona
 
Electroencephalography (EEG): an electrophysiological monitoring method to re...
Electroencephalography (EEG): an electrophysiological monitoring method to re...Electroencephalography (EEG): an electrophysiological monitoring method to re...
Electroencephalography (EEG): an electrophysiological monitoring method to re...Habtemariam Mulugeta
 

Tendances (20)

Basics of embedded system design
Basics of embedded system designBasics of embedded system design
Basics of embedded system design
 
introduction to Embedded System
introduction to Embedded Systemintroduction to Embedded System
introduction to Embedded System
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
 
Electromyogram
ElectromyogramElectromyogram
Electromyogram
 
Eye interface technology, Electrooculography – A Proposed Model
Eye interface technology, Electrooculography – A Proposed ModelEye interface technology, Electrooculography – A Proposed Model
Eye interface technology, Electrooculography – A Proposed Model
 
Use of Artificial intelligence for chip design.pptx
Use of Artificial intelligence for chip design.pptxUse of Artificial intelligence for chip design.pptx
Use of Artificial intelligence for chip design.pptx
 
Final thesis presentation on bci
Final thesis presentation on bciFinal thesis presentation on bci
Final thesis presentation on bci
 
chapter4.ppt
chapter4.pptchapter4.ppt
chapter4.ppt
 
what is neural network....???
what is neural network....???what is neural network....???
what is neural network....???
 
ASIC
ASICASIC
ASIC
 
Classification of EEG Signals for Brain-Computer Interface
Classification of EEG Signals for Brain-Computer InterfaceClassification of EEG Signals for Brain-Computer Interface
Classification of EEG Signals for Brain-Computer Interface
 
Vlsi
VlsiVlsi
Vlsi
 
Embedded
EmbeddedEmbedded
Embedded
 
Kernels in convolution
Kernels in convolutionKernels in convolution
Kernels in convolution
 
Brain computer interfaces_useful
Brain computer interfaces_usefulBrain computer interfaces_useful
Brain computer interfaces_useful
 
Real-Time Operating Systems
Real-Time Operating SystemsReal-Time Operating Systems
Real-Time Operating Systems
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
 
Electroencephalography (EEG): an electrophysiological monitoring method to re...
Electroencephalography (EEG): an electrophysiological monitoring method to re...Electroencephalography (EEG): an electrophysiological monitoring method to re...
Electroencephalography (EEG): an electrophysiological monitoring method to re...
 

En vedette

Beyond Floating Point – Next Generation Computer Arithmetic
Beyond Floating Point – Next Generation Computer ArithmeticBeyond Floating Point – Next Generation Computer Arithmetic
Beyond Floating Point – Next Generation Computer Arithmeticinside-BigData.com
 
Getting started with libfabric
Getting started with libfabricGetting started with libfabric
Getting started with libfabricJianxin Xiong
 
Open Networking Revolution - Cambodia - Why?
Open Networking Revolution - Cambodia - Why?Open Networking Revolution - Cambodia - Why?
Open Networking Revolution - Cambodia - Why?KHNOG
 
La Social TV: caratteristiche ed evoluzione sul second screen
La Social TV: caratteristiche ed evoluzione sul second screenLa Social TV: caratteristiche ed evoluzione sul second screen
La Social TV: caratteristiche ed evoluzione sul second screenEmanuela Zaccone
 
Martin Buber's concept of Religion
Martin Buber's concept of ReligionMartin Buber's concept of Religion
Martin Buber's concept of ReligionMathew Jose
 
Fifty Year Of Microprocessor
Fifty Year Of MicroprocessorFifty Year Of Microprocessor
Fifty Year Of MicroprocessorAli Usman
 
ttec / transtec | IBM NeXtScale
ttec / transtec | IBM NeXtScale ttec / transtec | IBM NeXtScale
ttec / transtec | IBM NeXtScale Marco van der Hart
 
If the data cannot come to the algorithm...
If the data cannot come to the algorithm...If the data cannot come to the algorithm...
If the data cannot come to the algorithm...Robert Burrell Donkin
 
Lessons Learned From Four Years of API Management Implementation Success at Unum
Lessons Learned From Four Years of API Management Implementation Success at UnumLessons Learned From Four Years of API Management Implementation Success at Unum
Lessons Learned From Four Years of API Management Implementation Success at UnumCA Technologies
 
Open Ethernet: an open-source approach to modern network design
Open Ethernet: an open-source approach to modern network designOpen Ethernet: an open-source approach to modern network design
Open Ethernet: an open-source approach to modern network designAlexander Petrovskiy
 
Mediating Mature Services, ESBs and APIs: Lessons Learned from Five Years of ...
Mediating Mature Services, ESBs and APIs: Lessons Learned from Five Years of ...Mediating Mature Services, ESBs and APIs: Lessons Learned from Five Years of ...
Mediating Mature Services, ESBs and APIs: Lessons Learned from Five Years of ...CA Technologies
 
The Evolution Of Computer
The Evolution Of ComputerThe Evolution Of Computer
The Evolution Of ComputerShravan Kumar
 
An Enterprise Agile Transformation: Lessons Learned at Unum - Miljan Bajic, A...
An Enterprise Agile Transformation: Lessons Learned at Unum - Miljan Bajic, A...An Enterprise Agile Transformation: Lessons Learned at Unum - Miljan Bajic, A...
An Enterprise Agile Transformation: Lessons Learned at Unum - Miljan Bajic, A...agilemaine
 
Genesis & Progression of Processors in CPU
Genesis & Progression of Processors in CPUGenesis & Progression of Processors in CPU
Genesis & Progression of Processors in CPUAnkita Jangir
 
SC16 Student Cluster Competition Configurations & Results
SC16 Student Cluster Competition Configurations & ResultsSC16 Student Cluster Competition Configurations & Results
SC16 Student Cluster Competition Configurations & Resultsinside-BigData.com
 

En vedette (20)

Beyond Floating Point – Next Generation Computer Arithmetic
Beyond Floating Point – Next Generation Computer ArithmeticBeyond Floating Point – Next Generation Computer Arithmetic
Beyond Floating Point – Next Generation Computer Arithmetic
 
The Future of the OS
The Future of the OSThe Future of the OS
The Future of the OS
 
Getting started with libfabric
Getting started with libfabricGetting started with libfabric
Getting started with libfabric
 
WebSockets in JEE 7
WebSockets in JEE 7WebSockets in JEE 7
WebSockets in JEE 7
 
Open Networking Revolution - Cambodia - Why?
Open Networking Revolution - Cambodia - Why?Open Networking Revolution - Cambodia - Why?
Open Networking Revolution - Cambodia - Why?
 
La Social TV: caratteristiche ed evoluzione sul second screen
La Social TV: caratteristiche ed evoluzione sul second screenLa Social TV: caratteristiche ed evoluzione sul second screen
La Social TV: caratteristiche ed evoluzione sul second screen
 
Martin Buber's concept of Religion
Martin Buber's concept of ReligionMartin Buber's concept of Religion
Martin Buber's concept of Religion
 
Fifty Year Of Microprocessor
Fifty Year Of MicroprocessorFifty Year Of Microprocessor
Fifty Year Of Microprocessor
 
ttec / transtec | IBM NeXtScale
ttec / transtec | IBM NeXtScale ttec / transtec | IBM NeXtScale
ttec / transtec | IBM NeXtScale
 
If the data cannot come to the algorithm...
If the data cannot come to the algorithm...If the data cannot come to the algorithm...
If the data cannot come to the algorithm...
 
Apostila lpt
Apostila lptApostila lpt
Apostila lpt
 
Lessons Learned From Four Years of API Management Implementation Success at Unum
Lessons Learned From Four Years of API Management Implementation Success at UnumLessons Learned From Four Years of API Management Implementation Success at Unum
Lessons Learned From Four Years of API Management Implementation Success at Unum
 
Open Ethernet: an open-source approach to modern network design
Open Ethernet: an open-source approach to modern network designOpen Ethernet: an open-source approach to modern network design
Open Ethernet: an open-source approach to modern network design
 
Mediating Mature Services, ESBs and APIs: Lessons Learned from Five Years of ...
Mediating Mature Services, ESBs and APIs: Lessons Learned from Five Years of ...Mediating Mature Services, ESBs and APIs: Lessons Learned from Five Years of ...
Mediating Mature Services, ESBs and APIs: Lessons Learned from Five Years of ...
 
processors
processorsprocessors
processors
 
The Evolution Of Computer
The Evolution Of ComputerThe Evolution Of Computer
The Evolution Of Computer
 
An Enterprise Agile Transformation: Lessons Learned at Unum - Miljan Bajic, A...
An Enterprise Agile Transformation: Lessons Learned at Unum - Miljan Bajic, A...An Enterprise Agile Transformation: Lessons Learned at Unum - Miljan Bajic, A...
An Enterprise Agile Transformation: Lessons Learned at Unum - Miljan Bajic, A...
 
Genesis & Progression of Processors in CPU
Genesis & Progression of Processors in CPUGenesis & Progression of Processors in CPU
Genesis & Progression of Processors in CPU
 
SC16 Student Cluster Competition Configurations & Results
SC16 Student Cluster Competition Configurations & ResultsSC16 Student Cluster Competition Configurations & Results
SC16 Student Cluster Competition Configurations & Results
 
2016 IDC HPC Market Update
2016 IDC HPC Market Update2016 IDC HPC Market Update
2016 IDC HPC Market Update
 

Similaire à Unum Computing: An Energy Efficient and Massively Parallel Approach to Valid Numerics

Beating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit ArithmeticBeating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit Arithmeticinside-BigData.com
 
Smaller and Easier: Machine Learning on Embedded Things
Smaller and Easier: Machine Learning on Embedded ThingsSmaller and Easier: Machine Learning on Embedded Things
Smaller and Easier: Machine Learning on Embedded ThingsNUS-ISS
 
04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbersYutaka Kawai
 
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...huguk
 
Introduction to Deep learning and H2O for beginner's
Introduction to Deep learning and H2O for beginner'sIntroduction to Deep learning and H2O for beginner's
Introduction to Deep learning and H2O for beginner'sVidyasagar Bhargava
 
Running & Scaling Large Elasticsearch Clusters
Running & Scaling Large Elasticsearch ClustersRunning & Scaling Large Elasticsearch Clusters
Running & Scaling Large Elasticsearch ClustersFred de Villamil
 
Storm presentation
Storm presentationStorm presentation
Storm presentationShyam Raj
 
DEF CON 27 - ANDREAS BAUMHOF - are quantum computers really a threat to crypt...
DEF CON 27 - ANDREAS BAUMHOF - are quantum computers really a threat to crypt...DEF CON 27 - ANDREAS BAUMHOF - are quantum computers really a threat to crypt...
DEF CON 27 - ANDREAS BAUMHOF - are quantum computers really a threat to crypt...Felipe Prado
 
08 neural networks
08 neural networks08 neural networks
08 neural networksankit_ppt
 
Presentation at SMI 2023
Presentation at SMI 2023Presentation at SMI 2023
Presentation at SMI 2023Joaquim Jorge
 
Erich Elsen, Research Scientist, Baidu Research at MLconf NYC - 4/15/16
Erich Elsen, Research Scientist, Baidu Research at MLconf NYC - 4/15/16Erich Elsen, Research Scientist, Baidu Research at MLconf NYC - 4/15/16
Erich Elsen, Research Scientist, Baidu Research at MLconf NYC - 4/15/16MLconf
 
Introduction to Convolutional Neural Networks
Introduction to Convolutional Neural NetworksIntroduction to Convolutional Neural Networks
Introduction to Convolutional Neural NetworksHannes Hapke
 
Scaling Ceph at CERN - Ceph Day Frankfurt
Scaling Ceph at CERN - Ceph Day Frankfurt Scaling Ceph at CERN - Ceph Day Frankfurt
Scaling Ceph at CERN - Ceph Day Frankfurt Ceph Community
 
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...Cloudera, Inc.
 
[CLASS 2014] Palestra Técnica - Jason Larsen
[CLASS 2014] Palestra Técnica - Jason Larsen[CLASS 2014] Palestra Técnica - Jason Larsen
[CLASS 2014] Palestra Técnica - Jason LarsenTI Safe
 
Design for Test [DFT]-1 (1).pdf DESIGN DFT
Design for Test [DFT]-1 (1).pdf DESIGN DFTDesign for Test [DFT]-1 (1).pdf DESIGN DFT
Design for Test [DFT]-1 (1).pdf DESIGN DFTjayasreenimmakuri777
 
PR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
PR-108: MobileNetV2: Inverted Residuals and Linear BottlenecksPR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
PR-108: MobileNetV2: Inverted Residuals and Linear BottlenecksJinwon Lee
 
Chalmers microprocessor sept 2010
Chalmers microprocessor sept 2010Chalmers microprocessor sept 2010
Chalmers microprocessor sept 2010parallellabs
 
Echo State Hoeffding Tree Learning
Echo State Hoeffding Tree LearningEcho State Hoeffding Tree Learning
Echo State Hoeffding Tree LearningDiego Marrón Vida
 

Similaire à Unum Computing: An Energy Efficient and Massively Parallel Approach to Valid Numerics (20)

Beating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit ArithmeticBeating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit Arithmetic
 
Smaller and Easier: Machine Learning on Embedded Things
Smaller and Easier: Machine Learning on Embedded ThingsSmaller and Easier: Machine Learning on Embedded Things
Smaller and Easier: Machine Learning on Embedded Things
 
04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers
 
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
 
Introduction to Deep learning and H2O for beginner's
Introduction to Deep learning and H2O for beginner'sIntroduction to Deep learning and H2O for beginner's
Introduction to Deep learning and H2O for beginner's
 
Running & Scaling Large Elasticsearch Clusters
Running & Scaling Large Elasticsearch ClustersRunning & Scaling Large Elasticsearch Clusters
Running & Scaling Large Elasticsearch Clusters
 
Storm presentation
Storm presentationStorm presentation
Storm presentation
 
DEF CON 27 - ANDREAS BAUMHOF - are quantum computers really a threat to crypt...
DEF CON 27 - ANDREAS BAUMHOF - are quantum computers really a threat to crypt...DEF CON 27 - ANDREAS BAUMHOF - are quantum computers really a threat to crypt...
DEF CON 27 - ANDREAS BAUMHOF - are quantum computers really a threat to crypt...
 
08 neural networks
08 neural networks08 neural networks
08 neural networks
 
Presentation at SMI 2023
Presentation at SMI 2023Presentation at SMI 2023
Presentation at SMI 2023
 
Erich Elsen, Research Scientist, Baidu Research at MLconf NYC - 4/15/16
Erich Elsen, Research Scientist, Baidu Research at MLconf NYC - 4/15/16Erich Elsen, Research Scientist, Baidu Research at MLconf NYC - 4/15/16
Erich Elsen, Research Scientist, Baidu Research at MLconf NYC - 4/15/16
 
Introduction to Convolutional Neural Networks
Introduction to Convolutional Neural NetworksIntroduction to Convolutional Neural Networks
Introduction to Convolutional Neural Networks
 
Scaling Ceph at CERN - Ceph Day Frankfurt
Scaling Ceph at CERN - Ceph Day Frankfurt Scaling Ceph at CERN - Ceph Day Frankfurt
Scaling Ceph at CERN - Ceph Day Frankfurt
 
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
 
[CLASS 2014] Palestra Técnica - Jason Larsen
[CLASS 2014] Palestra Técnica - Jason Larsen[CLASS 2014] Palestra Técnica - Jason Larsen
[CLASS 2014] Palestra Técnica - Jason Larsen
 
Design for Test [DFT]-1 (1).pdf DESIGN DFT
Design for Test [DFT]-1 (1).pdf DESIGN DFTDesign for Test [DFT]-1 (1).pdf DESIGN DFT
Design for Test [DFT]-1 (1).pdf DESIGN DFT
 
PR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
PR-108: MobileNetV2: Inverted Residuals and Linear BottlenecksPR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
PR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
 
Everybody Lies
Everybody LiesEverybody Lies
Everybody Lies
 
Chalmers microprocessor sept 2010
Chalmers microprocessor sept 2010Chalmers microprocessor sept 2010
Chalmers microprocessor sept 2010
 
Echo State Hoeffding Tree Learning
Echo State Hoeffding Tree LearningEcho State Hoeffding Tree Learning
Echo State Hoeffding Tree Learning
 

Plus de inside-BigData.com

Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networksinside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...inside-BigData.com
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecastsinside-BigData.com
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Updateinside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODinside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Erainside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Clusterinside-BigData.com
 

Plus de inside-BigData.com (20)

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networks
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecasts
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Update
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Era
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Cluster
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 

Dernier

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 

Dernier (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 

Unum Computing: An Energy Efficient and Massively Parallel Approach to Valid Numerics

  • 1. Copyright © 2014, 2015 John L. Gustafson AN ENERGY-EFFICIENTAND MASSIVELY PARALLELAPPROACH TO VALID NUMERICS John L. Gustafson, Ph.D. CTO, Ceranovo Director, Massively Parallel Technologies, Etaphase, Clustered Systems
  • 2. Copyright © 2014, 2015 John L. Gustafson Big problems facing computing •  Too much energy and power needed per calculation •  More hardware parallelism than we know how to use •  Not enough bandwidth (the “memory wall”) •  Rounding errors more treacherous than people realize •  Rounding errors prevent use of parallel methods •  Sampling errors turn physics simulations into guesswork •  Numerical methods are hard to use, require experts •  IEEE floats give different answers on different platforms
  • 3. Copyright © 2014, 2015 John L. Gustafson The ones vendors care most about •  Too much energy and power needed per calculation •  More hardware parallelism than we know how to use •  Not enough bandwidth (the “memory wall”) •  Rounding errors more treacherous than people realize •  Rounding errors prevent use of parallel methods •  Sampling errors turn physics simulations into guesswork •  Numerical methods are hard to use, require experts •  IEEE floats give different answers on different platforms
  • 4. Copyright © 2014, 2015 John L. Gustafson Too much power and heat needed • Huge heat sinks • 20 MW limit for exascale • Data center electric bills • Mobile device battery life • Heat intensity means bulk • Bulk increases latency • Latency limits speed
  • 5. Copyright © 2014, 2015 John L. Gustafson More parallel hardware than we can use •  Huge clusters usually partitioned into 10s, 100s of cores •  Few algorithms exploit millions of cores except LINPACK •  Capacity is not a substitute for capability!
  • 6. Copyright © 2014, 2015 John L. Gustafson Not enough bandwidth (“Memory wall”) Operation Energy consumed Time needed 64-bit multiply-add 200 pJ 1 nsec Read 64 bits from cache 800 pJ 3 nsec Move 64 bits across chip 2000 pJ 5 nsec Execute an instruction 7500 pJ 1 nsec Read 64 bits from DRAM 12000 pJ 70 nsec Notice that 12000 pJ @ 3 GHz = 36 watts! One-size-fits-all overkill 64-bit precision wastes energy, storage bandwidth
  • 7. Copyright © 2014, 2015 John L. Gustafson Happy 100th Birthday, Floating Point 1914: Torres proposes automatic computing with a fraction and a scaling factor. 2014: We still use a format designed for World War I hardware capabilities!
  • 8. Copyright © 2014, 2015 John L. Gustafson Floats designed for visible scratch work •  OK for manual calculations •  Operator sees, remembers errors •  Can head off overflow, underflow •  Automatic math hides all that •  No one sees processor “flags” •  Disobeys algebraic laws •  Wastes bit patterns as NaNs •  IEEE 754 “standard” is really the IEEE 754 guideline; optional rules spoil consistent results
  • 9. Copyright © 2014, 2015 John L. Gustafson Analogy: Printing in 1970 vs. 2014 1970: 30 sec per page 2013: 30 sec per page Faster technology is for better prints, not thousands of low-quality prints per second. Why not do the same thing with computer arithmetic?
  • 10. Copyright © 2014, 2015 John L. Gustafson This is just… sad.
  • 11. Copyright © 2014, 2015 John L. Gustafson Floats prevent use of parallelism •  No associative property for floats •  (a + b) + (c + d) (parallel) ≠ ((a + b) + c) + d (serial) •  Looks like a “wrong answer” •  Programmers trust serial, reject parallel •  IEEE floats report rounding, overflow, underflow in processor register bits that no one ever sees.
  • 12. Copyright © 2014, 2015 John L. Gustafson A New Number Format: The Unum •  Universal numbers •  Superset of IEEE types, both 754 and 1788 •  Integersèfloatsèunums •  No rounding, no overflow to ∞, no underflow to zero •  They obey algebraic laws! •  Safe to parallelize •  Fewer bits than floats •  But… they’re new •  Some people don’t like new “You can’t boil the ocean.” —Former Intel exec, when shown the unum idea
  • 13. Copyright © 2014, 2015 John L. Gustafson Three ways to express a big number Avogadro’s number: ~6.022×1023 atoms or molecules Unum (29 bits): 0 11001101 111111100001 1 111 1011 sign exp. frac. ubit exp. size frac. size Self-descriptive “utag” bits track and manage uncertainty, exponent size, and fraction size utag IEEE Standard Float (64 bits): 0 10001001101 1111111000010101010011110100010101111110101000010011 sign exponent (scale) fraction Sign-Magnitude Integer (80 bits): 0 1111111100001010101001111010001010111111010100001001010011000000000000000000000 sign Lots of digits
  • 14. Copyright © 2014, 2015 John L. Gustafson Why unums use fewer bits than floats •  Exponent smaller by about 5 – 10 bits, typically •  Trailing zeros in fraction compressed away, saves ~2 bits •  Shorter strings for more common values •  Cancellation removes bits and the need to store them Unum (29 bits): IEEE Standard Float (64 bits): 0 10001001101 1111111000010101010011110100010101111110101000010011 0 11001101 111111100001 1 111 1011
  • 15. Copyright © 2014, 2015 John L. Gustafson Open ranges, as well as exact points Bit string meanings using IEEE Float rules Bit string meanings in unum format Complete representation of all real numbers using a finite number of bits
  • 16. Copyright © 2014, 2015 John L. Gustafson Ubounds are the hull of 1 or 2 unums • Includes closed, open, half-open intervals • Includes ±∞, empty set, quiet and signaling NaN • Unlike traditional intervals, ubounds are closed and lossless under set operations
  • 17. Copyright © 2014, 2015 John L. Gustafson The three layers of computing Grammar rules for exact, inexact High-efficiency floats and real intervals Limited set of fused operations
  • 18. Copyright © 2014, 2015 John L. Gustafson The Warlpiri unums Before the aboriginal Warlpiri of Northern Australia had contact with other civilizations, their counting system was “One, two, many.” Maybe they were onto something.
  • 19. Copyright © 2014, 2015 John L. Gustafson Fixed-size unums: faster than floats •  Warlpiri ubounds are one byte, but closed system for reals •  Unpacked unums pre-decode exception bits, hidden bit Circuit required for “IEEE half-precision float = ∞?” Circuit required for “unum = ∞?” (any precision)
  • 20. Copyright © 2014, 2015 John L. Gustafson Floating Point II: The Wrath of Kahan •  Berkeley professor William Kahan is the father of modern IEEE Standard floats •  Also the authority on their many dangers •  Every idea to fix floats faces his tests that expose how new idea is even worse Working unum environment completed August 13, 2013. Can unums survive the wrath of Kahan?
  • 21. Copyright © 2014, 2015 John L. Gustafson A Typical Kahan Challenge •  Correct answer: (1, 1, 1, 1). •  IEEE 32-bit: (0, 0, 0, 0) FAIL •  IEEE 64-bit: (0, 0, 0, 0) FAIL •  Myth: “Getting the same answer with increased precision means the answer is correct.” •  IEEE 128-bit: (0, 0, 0, 0) FAIL •  Extended precision math packages: (0, 0, 0, 0) FAIL •  Interval arithmetic: Um, somewhere between –∞ and ∞. EPIC FAIL •  Unums, 6-bit average size: (1, 1, 1, 1) CORRECT I have been unable to find a problem that “breaks” unum math.
  • 22. Copyright © 2014, 2015 John L. Gustafson Kahan’s “Smooth Surprise” Find minimum of log(|3(1–x)+1|)/80 + x2 + 1 in 0.8 ≤ x ≤ 2.0 Plot, test using half a million double-precision IEEE floats. Shows minimum at x = 0.8. FAIL Plot, test using a few dozen very low-precision unums. Shows minimum where x spans 4/3. CORRECT
  • 23. Copyright © 2014, 2015 John L. Gustafson Kahan on the computation of powers 5.96046447753906250.875 = 4.76837158203125 I sent Kahan my fixed-cost, fixed-storage method, and he said it looked “impractical.” I asked if he had a method that shows the following computation is exact: Have not heard from him since.
  • 24. Copyright © 2014, 2015 John L. Gustafson Two can play this game, Professor K. Unums can do anything floats can do, through explicit use of the guess function. •  Stable fixed point found by floats, not by traditional intervals •  Unums find both stable point, and unstable point at origin •  Finding exact stable point is mathematically incorrect! •  Adding a tiny wobble sin(x)/x destroys floats, but not unums.
  • 25. Copyright © 2014, 2015 John L. Gustafson Rump’s Royal Pain •  Using IBM (pre-IEEE Standard) floats, Rump got •  1.172603 in 32-bit precision •  1.1726039400531 in 64-bit precision •  1.172603940053178 in 128-bit precision •  Using IEEE double precision: 1.18059x1021 •  Correct answer: –0.82739605994682136…! Didn’t even get sign right Compute 333.75y6 + x2(11x2y2 – y6 – 121y4 – 2) + 5.5y8 + x/(2y) where x = 77617, y = 33096. Unums: Correct answer to 23 decimals using an average of only 75 bits per number. Not even IEEE 128-bit precision can do that. Precision, range adjust automatically.
  • 26. Copyright © 2014, 2015 John L. Gustafson Some fundamental principles Bound the answer as tightly as possible within the numerical environment, or admit defeat. •  No more guessing •  No more “the error is O(hn)” type estimates •  The smaller the bound, the greater the information •  Performance is information per second •  Maximize information per bit •  Fused operations are always explicitly distinct from their non-fused versions and results are identical across platforms
  • 27. Copyright © 2014, 2015 John L. Gustafson Polynomials: bane of classic intervals Dependency and closed endpoints lose information (amber) Unum polynomial evaluator loses no information.
  • 28. Copyright © 2014, 2015 John L. Gustafson Polynomial evaluation solved at last Mathematicians have sought this for at least 60 years. “Dependency Problem” creates sloppy range when input is an interval Unum evaluation refines answer to limits of the environment precision
  • 29. Copyright © 2014, 2015 John L. Gustafson Uboxes and solution sets •  A ubox is a multidimensional unum •  Exact or ULP-wide in each dimension •  Sets of uboxes constitute a solution set •  One dimension per degree of freedom in solution •  Solves the main problem with interval arithmetic •  Super-economical for bit storage •  Data parallel in general
  • 30. Copyright © 2014, 2015 John L. Gustafson Calculus considered harmful •  Computers are discrete •  Calculus is continuous •  Ensures sampling errors •  Changes problem to fit the tool
  • 31. Copyright © 2014, 2015 John L. Gustafson Deeply Unsatisfying Error Bounds Error ≤ (b – a) h2 |f ʹ′ʹ′(ξ)| / 24 •  Classical numerical texts teach this “error bound”: •  What is f ʹ′ʹ′? Where is ξ? What is the bound?? •  Bound is often infinite, which means no bound at all •  “Whatever it is, it’s four times better if we make h half as big” creates demand for supercomputing that cannot be satisfied. 4x
  • 32. Copyright © 2014, 2015 John L. Gustafson Two “ubox methods”, both mindless •  Paint bucket: find one solution point, test neighbors and classify as solution or fail until no more neighbors to test •  Works if solution is known to be a connected set •  Requires a starting point “seed” •  Wave front of trial uboxes can be computed in parallel •  Try the universe: Use Warlpiri uboxes (4-bit precision) to tile all of n- space; increment exponent and fraction size automatically •  13n things to do in parallel (!) •  Finds every solution, no matter what, since all of n-space is represented •  Detects ill-posed problems and solves them anyway •  Parallelism adjusts from 3 to trillions
  • 33. Copyright © 2014, 2015 John L. Gustafson Quarter-circle example •  Suppose all we know is x2 + y2 = 1, and x and y are ≥ 0 •  Suppose we have at most 2 bits exponent, 4 bits fraction Task: Bound the quarter circle area. Bound the value of π.
  • 34. Copyright © 2014, 2015 John L. Gustafson Set is connected; need a seed •  We know x = 0, y = 1 works •  Find its eight ubox neighbors in the plane •  Test x2 + y2 = 1, x ≥ 0, y ≥ 0 •  Solution set is green •  Trial set is amber •  Failure set is red •  Stop when no more trials
  • 35. Copyright © 2014, 2015 John L. Gustafson Exactly one neighbor passes •  Unum math automatically excludes cases that floats would accept •  Trials are neighbors of new solutions that •  Are not already failures •  Are not already solutions •  Note: no calculation of Not part of the unit circle y = 1− x2
  • 36. Copyright © 2014, 2015 John L. Gustafson The new trial set •  Five trial uboxes to test •  Perfect, easy parallelism for multicore •  Each ubox takes only 15 to 23 bits •  Ultra-fast operations •  Skip to the final result…
  • 37. Copyright © 2014, 2015 John L. Gustafson The complete quarter circle •  The complete solution, to this finite precision level •  Information is reciprocal of green area •  Use to find area under arc, bounded above and below •  Proves value of π to an accuracy of 3% •  No calculus needed, or divides, or square roots
  • 38. Copyright © 2014, 2015 John L. Gustafson Compressed Final Result •  Coalesce uboxes to largest possible ULP values •  Lossless compression •  Total data set: 603 bits! •  6x faster graphics than current methods Instead of ULPs being the source of error, they are the atomic units of computation
  • 39. Copyright © 2014, 2015 John L. Gustafson Fifth-degree polynomial roots •  Analytic solution: There isn’t one. •  Numerical solution: Huge errors from rounding •  Unums: quickly return x = –1, x = 2 as the exact solutions. No rounding. No underflow. Just… the correct answer. With as few as 4 bits for the operands!
  • 40. Copyright © 2014, 2015 John L. Gustafson The power of open-closed endpoints Root-finding just works.
  • 41. Copyright © 2014, 2015 John L. Gustafson Classical Numerical Analysis •  Time steps •  Use position to estimate force •  Use force to estimate acceleration •  Update the velocity •  Update the position •  Lather, rinse, repeat •  Accumulates rounding and sampling error, both unknown •  Cannot be done in parallel start Δt M Δt Δt Δt
  • 42. Copyright © 2014, 2015 John L. Gustafson A New Type of Parallelism •  Space steps, not time steps •  Acceleration, velocity bounded in any given space interval •  Find traversal time as a function of space step (2D ubox) •  Massively parallel! •  No rounding error •  No sampling error •  Obsoletes existing ODE methods
  • 43. Copyright © 2014, 2015 John L. Gustafson Pendulums Done Right •  Physics teaches us it’s a harmonic oscillator with period •  Force-fits nonlinear ODE into linear ODE for which calculus works. •  WRONG answer 2π g L
  • 44. Copyright © 2014, 2015 John L. Gustafson Physical Truth vs. Force-Fit Solution Bends the problem to fit solution methods
  • 45. Copyright © 2014, 2015 John L. Gustafson Uboxes for linear solvers 0.74 0.75 0.76 0.77 0.78 x 0.64 0.65 0.66 0.67 0.68 0.69 0.70 y • If the A and b values in Ax=b are rounded, the “lines” have width from uncertainty • Apply a standard solver, and get the red dot as “the answer”, x. A pair of floating-point numbers. • Check it by computing Ax and see if it rigorously contains b. Yes, it does. • Hmm… are there any other points that also work?
  • 46. Copyright © 2014, 2015 John L. Gustafson Float, Interval, and Ubox Solutions 0.7544 0.7546 0.7548 0.7550 x 0.6610 0.6612 0.6614 0.6616 0.6618 y • Point solution (black dot) just gives one of many solutions; disguises answer instability • Interval method (gray box) yields a bound too loose to be useful • The ubox set (green) is the best you can do for a given precision • Uboxes reveal ill-posed nature… yet provide solution anyway • Works equally well on nonlinear problems!
  • 47. Copyright © 2014, 2015 John L. Gustafson Other Apps with Ubox Solutions Imagine having provable bounds on answers for the first time, yet with easier programming, less storage, less bandwidth use, less energy/power demands, and abundant parallelism. •  Photorealistic computer graphics •  N-body problems •  Structural analysis •  Laplace’s equation •  Perfect gas models without statistical mechanics
  • 48. Copyright © 2014, 2015 John L. Gustafson Revisiting the Big Challenges-1 •  Too much energy and power needed per calculation •  Unums cut the main energy hog by about 50% •  More hardware parallelism than we know how to use •  Uboxes reveal vast sources of data parallelism, the easiest kind •  Not enough bandwidth (the “memory wall”) •  More use of CPU transistors, fewer bits moved to/from memory •  Rounding errors more treacherous than people realize •  Unums eliminate rounding error, automate precision choice •  Rounding errors prevent use of multicore methods •  Unums restore algebraic laws, eliminating the deterrent
  • 49. Copyright © 2014, 2015 John L. Gustafson Revisiting the Big Challenges-2 •  Sampling errors turn physics simulations into guesswork •  Uboxes produce provable bounds on physical behavior •  Numerical methods are hard to use, require expertise •  “Paint bucket” and “Try the universe” are brute force general methods that need no expertise… not even calculus
  • 50. Copyright © 2014, 2015 John L. Gustafson The End of Error •  A complete text on unums and uboxes is available from CRC Press as of this month: http://www.crcpress.com/product/isbn/ 9781482239867 •  Aimed at general reader; mathematicians will hate its casual style •  Complete prototype environment is available as Mathematica notebook through publisher Thank you!