1
Where Did All The Errors Go?
European Dependable Computing Conference
http://conferences.ncl.ac.uk/edcc2014/
Newcastle, 13-16 May 2014
Prof. Ian Phillips
Principal Staff Engineer
ARM Ltd
ian.phillips@arm.com
Visiting Prof. at ...
Contribution to
Industry Award 2008
Opinions expressed are my own ...
Links to Pdf and SlideCast @ http://ianp24.blogspot.com
2
When we think of Computing we think of ...
 HPC and Mainframe
... maybe Desktop
... but not really Laptop or (Heaven forbid) Pocketable?
3
The Visible Face of Computing Today
Essential but not Vital ... All want Reliable
4
The Invisible Face of Computing Today
Unrecognised but Vital ... All need Dependable
5
So What is Computing ...
A mechanism for the algebraic manipulation of Data ...
... State (s) and Time (t) are usually factors in this.
 It can include phenomena ranging from human thinking to calculations with a narrower meaning.
... Wikipedia
 Usually used to animate analogies (models) of real-world situations
... frequently fast enough to be used as a stabilising factor in a loop (Real-time).
... Not prescriptive about the choice of Implementation Technology!
... Nor prescriptive about Programmability!
IN (x): Enumerated Phenomena => y = F(x, t, s) => OUT (y): Processed Data/Information
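A minimal sketch of y = F(x, t, s), assuming a simple smoothing filter as the function F. The names FilterState, compute and tau are hypothetical, chosen only for this illustration; the point is that the output depends on the input x, the time t and the stored state s.

```python
from dataclasses import dataclass
import math


@dataclass
class FilterState:
    """State s: the previous output and the time it was produced."""
    y_prev: float = 0.0
    t_prev: float = 0.0


def compute(x: float, t: float, s: FilterState, tau: float = 1.0) -> float:
    """y = F(x, t, s): smooth input x using elapsed time and stored state.

    tau is a hypothetical time constant chosen for this illustration.
    """
    dt = t - s.t_prev
    alpha = 1.0 - math.exp(-dt / tau)      # weight given to the new sample
    y = s.y_prev + alpha * (x - s.y_prev)  # move the old output towards x
    s.y_prev, s.t_prev = y, t              # update the state for the next call
    return y


state = FilterState()
outputs = [compute(1.0, t, state) for t in (1.0, 2.0, 3.0)]
```

The same input (x = 1.0) gives a different output on each call, because t and s have changed.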
6
Hipparchos’s Antikythera - c87BC
Early-Mechanical
Computation
Hipparchos
c.190 BC – c.120 BC.
Ancient Greek
Astronomer, Philosopher
and Mathematician
 A Machine for Calculating Planetary Positions
 Technology: Metal, Hand-Cut Gears, Analogue
 Found in the Mediterranean in 1900 (It is believed there may have been tens of them)
7
Orrery c1700 ... Planet Motion Computer
 Inventor: George Graham (1674-1751). English Clock-Maker.
 Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!)
Mechanical
Technology
8
 A Machine for Computing Polynomial Tables
 Technology: Metal, Precision Gears, Digital (base 10)
 Beyond the gear-cutting technology of the day
Babbage's Difference Engine - 1837
Constructed 2000
Late-Mechanical
Computation
9
Amsler’s Planimeter - c1856
Mechanical
Computation
Planimeter 2014 !
 A Machine for Calculating Area of an arbitrary 2D shape
 Technology: Precision Mechanics, Analogue
 Available today ... Electronically enhanced
10
 General Purpose (Programmable) Computing Machine
 Technology: Electronics (valves), Digital (base 2)
 Available today ... Micro-Electronically enhanced (Mainframe <=> Laptop)
University of Manchester’s “Baby” - 1948 (Reconstruction)
Electronic
Computation
11
 Digital Electronics
 Software
 Memory
 Optics
 Analogue Electronic
 Sensors/Transducers
 Mechanics
 Micro-Motors
 Displays
 Discharge Tube
 Robotic Assembly
 Plastic, Metal, Glass
Image Input => Compute (Image Processing) => Data-File
... Many Technologies working, seamlessly, to Enhance Human Memory
Electronic System (Cyber-physical System) - 2014
1: aka; Cyber-Physical System (Geek-Talk!)
Incorporating DIGIC5+ (ARM)
System-Level
Computation
‘Classic’
Computer
12
 They sell things that Customers want to buy
 Supporting the End-Customer’s needs ... who may be several ‘layers’ above their business.
 Focus on their Core Competencies in a Globally Competitive Market
 Avoid Commoditisation by Differentiation
 Cost and Quality (by improving Process) ..and..
 Improved Business-Models (which make the Money) ..and..
 New/Improved Technology (which are Expensive and/or Risky)
 Product Development is a Cost (Risk) to be Minimised
 Technology (HW, SW, Mechanics, Optics, Graphene, etc) just enables Options!
 New-Technology may cost more (including risk) than it delivers in Product Value!
 Over-Design costs ... Cannot afford the Precautionary Principle!
... Because successful End-Products fund their entire (RD&I) Value-Chains
... Their Technologies will be an economic necessity in (all) lower volume markets!
Computing Technologies in Business Context
Businesses have to be Competitive, Money Making Machines today ...
13
... Old Compute Markets remain; but are no-longer the Technology Drivers!
Business Opportunities Drive Technology Developments
...And 21c Products are increasingly ‘Intelligent’
[Chart: Millions of Units vs. year, 1970 to 2030]
1st Era: Select work-tasks
2nd Era: Broad-based computing for specific tasks
3rd Era: Computing as part of our lives
14
 How often can ...
 An Anti-lock Braking system be unavailable?
 Your Mobile Phone crash/restart?
 An Autopilot be unavailable?
... As often as it likes: As long as it is available when you need it!
 The Power Grid crash/restart?
 An Engine Management unit get stuck at Full Throttle?
 A spurious Cash Transaction in your Bank Account?
... Never!
 A PC crash before it is unusable?
 Weather forecast be incorrect before it matters?
... Surprisingly often: Humans are inclined to blame themselves.
... Dependability is Subjective; Application, User and Context dependent (Quality)
What Dependable Computing do we Expect?
“To be trusted to do or provide what is needed” (Merriam-Webster)
15
End-Products are about Function, not about Technology
You can’t tell which bits are done in Hardware and which in Software?
Hardware Module? Software Module?
Hardware + Software Module ?
... So where are the Dependability Vulnerabilities located?
16
 Boolean Mathematics is Dependable; but implementation depends on reliably mapping
its equations to the physical world through Logic-Gates
 (For HW and SW!)
 CMOS has been a reliable Boolean mapping for 30 years, but ...
 Today’s 20nm transistors have larger variability, and there are more of
them on a chip (Typically 500M in 2012)
 At 70degC, Vtn=130mV (sigma ~25mV); around 1 in 5 million
transistors have Vt<0 (Can’t be turned off)
 That’s ~100 transistors/chip that don’t switch off
 And another hundred that only turn-on weakly (low drive/slow)
 And they will always be randomly placed!
... So today’s chips shouldn’t work?
Is Hardware (Logic) Dependable? 1/3
[Figure: CMOS NAND gate schematic with inputs A and B, supply +V and output OUT]
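The arithmetic behind these figures can be sketched as follows, assuming Vt is normally distributed (an assumption of this sketch; the slide does not state the distribution). Vt = 0 lies 130/25 = 5.2 sigma below the mean. Under that assumption one tail comes out at roughly 1 in 10 million transistors, and both tails together at roughly 1 in 5 million, which is the same order of magnitude as the slide's figures.

```python
import math

VT_MEAN_MV = 130.0         # mean threshold voltage at 70 degC (from the slide)
VT_SIGMA_MV = 25.0         # standard deviation (from the slide)
TRANSISTORS = 500_000_000  # typical chip in 2012 (from the slide)


def normal_cdf(z: float) -> float:
    """Cumulative distribution of the standard normal, via erfc."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


z = (0.0 - VT_MEAN_MV) / VT_SIGMA_MV  # Vt = 0 sits 5.2 sigma below the mean
p_low = normal_cdf(z)                 # fraction with Vt < 0 (cannot be turned off)
p_both = 2.0 * p_low                  # both tails: adds the weak, high-Vt devices

print(f"z = {z:.1f} sigma")
print(f"one tail : 1 in {1 / p_low:,.0f} -> ~{TRANSISTORS * p_low:.0f} per chip")
print(f"two tails: 1 in {1 / p_both:,.0f} -> ~{TRANSISTORS * p_both:.0f} per chip")
```

Small changes in sigma move these tail counts by large factors, which is one reason such estimates should be treated as rough.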
17
Mitigating this we have ...
 Transistors: Not all ...
 Are at 70 degC even if the die is (local variation)
 Are Minimum Size ... Increasing ‘area’ reduces variability
 Are on Critical Paths ... And ‘chains’ of gates perform closer to average!
 Non-Functionality is (easily) Observable ... The effects can be very subtle.
 CMOS Logic: Is very robust and will continue to work with extreme transistors
 Leaky Gates and Faster Transitions are not usually failure criteria
 The chance of a second extreme transistor on a single Critical Path is the order of <1:1,000,000
 Memory: Circuits are much more sensitive to Vt/gm variation ...
 But spare rows/columns are part of SRAM designs and allow lots of defects to be ‘repaired’
 AND >75% of typical SoC die area is memory, so ...
 Most of the sensitive area has a repair strategy! ..and...
 The rest is inherently more robust!
Is Hardware (Logic) Dependable? 2/3
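The spare row idea can be sketched as a remapping table. This is a toy model under stated assumptions, not a description of any real SRAM design: the class RepairableMemory and its methods are hypothetical, and real devices typically fix the mapping in hardware at test time.

```python
class RepairableMemory:
    """Toy memory array whose spare rows replace rows found defective at test."""

    def __init__(self, rows, spare_rows):
        self.rows = rows
        self.spares_free = list(range(rows, rows + spare_rows))  # physical spare rows
        self.remap = {}                                          # logical row -> spare row
        self.cells = [0] * (rows + spare_rows)

    def repair(self, bad_row):
        """Map a defective row onto a spare; returns False when no spares remain."""
        if bad_row in self.remap:
            return True
        if not self.spares_free:
            return False
        self.remap[bad_row] = self.spares_free.pop(0)
        return True

    def _physical(self, row):
        if not 0 <= row < self.rows:
            raise IndexError(f"row {row} out of range")
        return self.remap.get(row, row)  # repaired rows are redirected

    def write(self, row, value):
        self.cells[self._physical(row)] = value

    def read(self, row):
        return self.cells[self._physical(row)]


mem = RepairableMemory(rows=4, spare_rows=1)
first_repair = mem.repair(2)   # row 2 failed test: redirect it to the spare
mem.write(2, 99)               # the user still addresses row 2
second_repair = mem.repair(3)  # no spares left, so this defect cannot be repaired
```

The user of the memory never sees the remapping; the defect is only fatal once the spares run out.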
18
 But we haven't included ...
 Internally and Externally generated synchronous supply noise? (Greater susceptibility at lower voltages)
 High-energy particles? (Greater susceptibility at smaller geometries)
 Wear-out (Vt/Gain drift)? (Greater susceptibility at smaller geometries)
 Temperatures greater than 70degC (140C is not uncommon)
 Limitations of Verification and Test (Limited exploration of state-space)
 We are repeatedly multiplying tiny-improbables by large-numbers ...
 And many of the values are only guesses!
 We have no real idea about the reliability/dependability of modern Systems or Components
 We only know that as process geometries shrink, Susceptibility will get worse ...
 Chips will get ever more complex (and more chips will be used in more complex Systems)
 Transistors will get smaller and Designers will erode safety margins to get performance
... Despite this Chips and Systems do Yield today more than we would rightly expect ...
... So we must be utilising Unknown Safety Factors!
Is Hardware (Logic) Dependable? 3/3
19
 All Software Crashes!
 Software providers seldom guarantee the functionality of their product
 Quality is tested-in; and improved by bug-fixes/patches in the field (To what level?)
 So software Reuse offers improved Quality and Productivity (But over what?)
 Residual Errors ...
 No code has zero residual errors!!
 Well structured and tested Source-Code has ~5 errors per 1,000 lines of code (E-KLOC)
 Commercial code is typically ~5x worse than this
 No Useful Correlation between residual-errors and their system-impact severity
 Only the Heuristic, that ‘most of them are harmless’.
 Formal-Methods are better; but cost is high if you need a clean-sheet design.
 Even Perfect-Software would have to work with an Imperfect-Platform
 Don’t underestimate the Commercial Importance of TTM and Cost !!!
Is Software Dependable? 1/3
Demonstrating the limitations of achieving Quality through Test ...
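A worked example of the residual-error figures above. The error rates come from the slide; the code-base size of 1,000,000 lines is a hypothetical value chosen only for illustration.

```python
ERRORS_PER_KLOC_GOOD = 5  # well structured and tested code (from the slide)
COMMERCIAL_FACTOR = 5     # "typically ~5x worse" (from the slide)


def residual_errors(lines_of_code, errors_per_kloc):
    """Expected residual errors for a code base of the given size."""
    return lines_of_code / 1000 * errors_per_kloc


LOC = 1_000_000  # hypothetical code-base size, chosen only for illustration
good = residual_errors(LOC, ERRORS_PER_KLOC_GOOD)
commercial = residual_errors(LOC, ERRORS_PER_KLOC_GOOD * COMMERCIAL_FACTOR)
print(f"{LOC:,} lines: ~{good:,.0f} (well tested) vs ~{commercial:,.0f} (commercial)")
```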
20
Is Software Dependable? 2/3
Hardware and Software Design are indistinguishable ...
// A master-slave type D-Flip Flop
module flop (data, clock, clear, q, qb);
input data, clock, clear;
output q, qb;
// primitive #delay instance-name
// (output, input1, input2, .....),
nand #10 nd1 (a, data, clock, clear),
nd2 (b, ndata, clock),
nd4 (d, c, b, clear),
nd5 (e, c, nclock),
nd6 (f, d, nclock),
nd8 (qb, q, f, clear);
nand #9 nd3 (c, a, d),
nd7 (q, e, qb);
not #10 inv1 (ndata, data),
inv2 (nclock, clock);
endmodule
Hardware (Verilog Language)? Software (C Language)?
#include <stdio.h>
#include <time.h>
/* Use the PC's timer to check */
/* processing time */
int main(void)
{
clock_t time, deltime;
long junk, i;
float secs;
LOOP:
printf("input loop count: ");
scanf("%ld", &junk);
time = clock();
for (i = 0; i < junk; i++)
deltime = clock() - time;
secs = (float) deltime / CLOCKS_PER_SEC;
/* the middle of this format string was cut off on the slide; the wording is assumed */
printf("for %ld loops, #tics = %ld, secs = %f\n", junk, (long) deltime, secs);
goto LOOP;
...
[Figure labels, applying to both HW and SW: Configuration Files; Compilers; Target Architecture Info; Target Platform]
21
Is Software Dependable? 3/3
Somebody will see the bugs! (The Open Source Delusion)
1: http://www.wired.com/2014/04/heartbleedslesson/
2: http://veridicalsystems.com/blog/of-money-responsibility-and-pride/
“It is now very clear that
OpenSSL development could
benefit from dedicated full-time,
properly funded developers”
“OSF typically receives only
$2,000 a year in donations”
 OpenSSL HeartBleed bug 1
 Update was received just before a Public Holiday
 Editor was a known and high-quality source
 Code was reviewed informally and released
 Editor was conflicted with day-job, family and holiday pressure 2
 Too few resources to do a proper job.
 This was an E-KLOC error ...
 Not a Formatting error, nor a Functional error
 It was a System error (an omission in a non-functional aspect of the code).
... Was the ‘fault’ with the software Source (OpenSSL Software Foundation (OSF)) ?
... Or a User Community too-ready to believe in the Quality of Open Source software?
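The class of omission can be sketched as follows. This is not OpenSSL code: the functions echo_unchecked and echo_checked are hypothetical, and a Python bytes object stands in for process memory. The Heartbleed request claimed a payload length larger than the payload actually sent, and the reply was built from the claimed length without a bounds check.

```python
def echo_unchecked(memory, start, claimed_len):
    """Trusts the length field sent by the peer: may return neighbouring data."""
    return memory[start:start + claimed_len]


def echo_checked(memory, start, actual_len, claimed_len):
    """Rejects requests whose claimed length exceeds the payload actually received."""
    if claimed_len < 0 or claimed_len > actual_len:
        raise ValueError("claimed length exceeds received payload")
    return memory[start:start + claimed_len]


payload = b"bird"
memory = payload + b"SECRET-KEY"        # unrelated data stored next to the payload
leaked = echo_unchecked(memory, 0, 14)  # peer claims 14 bytes but sent only 4
safe = echo_checked(memory, 0, len(payload), 4)
```

Both functions echo the payload correctly for honest requests, so a purely functional test passes either way; only the checked version refuses the dishonest one.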
22
[Figure: Functions F1-F5 modelled on a ‘Generic’ Platform, then mapped onto an ‘Optimal’ Platform made of HW1-HW4, a Hardware Interface, RTOS/Drivers, Threads, Bus(es) and Processor(s)]
Create Functional-Model (1) on a ‘Generic’ Platform
Designing the Computing System ...
... is about creating a Model of Behaviour to meet Non-Functional Constraints
Translate to Functional-Model on an ‘Optimal’ Platform
1: This includes a Model of Execution such as a Java VM.
23
Typical 2014 Computing Platform ...
... is just 137.2 x 70.5 x 5.9 mm
24
Typical 2014 Computing Platform
Exynos 5422
Eight 32 bit CPUs (big.LITTLE):
• Four big (2.1GHz ARM A15) for
heavy tasks;
• Four small (1.5GHz ARM A7) for
lighter tasks.
+ Nine Mali GPU cores ...
... A ~30 Core Heterogeneous Multi-Processor ... In your Shirt Pocket!
... 21 significant ‘Chips’
25
2010: Apple’s A4 SIP Package (Cross-section)
IC Packaging Technology
 The processor is the centre rectangle. The silver circles beneath it are solder balls.
 Two rectangles above are RAM die, offset to make room for the wirebonds.
 Putting the RAM close to the processor reduces latency, making RAM
access faster and reducing power consumption ... But increases cost.
 Memory: Unknown
 Processor: Samsung/Apple (ARM Processor)
 Packaging: Unknown (SIP Technology)
Source ... http://www.ifixit.com
[Figure labels: Processor SoC Die; 2 Memory Dies; Glue; Memory ‘Package’; 4-Layer Platform ‘Package’. Photo: Steve Jobs, WWDC 2010]
26
2013: Samsung Solid-State Memory
 Smart Memory Interface (eMMC)
 16-128Gb in a single package
 8Gb/die. Stacked 2-16 die/package
 Handles errors in the bulk-data store
 Package just 1.4mm thick! (11.5x13x1.4mm)
... Smaller than a postage stamp
27
2012: Nvidia’s Tegra 3 Processor Unit (Around 1B transistors)
NB: The Tegra 3 is similar to the Apple A4
28
Component and Sub-Systems from Global Enterprise ...
... Global Teams contributing Specialist Knowledge & Knowhow
 Apple ID’d 159 Tier-1 Suppliers ...
 Thousands of Engineers Globally
 Est. 10x Tier-2 Suppliers ...
 Including Virtual Components (1) and
Sub-Systems (ARM and other IP Providers)
 Multiple Technologies ...
 Hardware, Software, Optics,
Mechanics, Acoustics, RF, Plastics, etc
 Manufacturing, Test, Qualification, etc.
 Methods, Tools, Training, etc
 Tens of thousands of Engineers Globally
... More than 90% of Technology and
Methods are Reused (productivity)!
1: Virtual Components do not appear on BOM
29
Moore’s Law: A Technology Opportunity ...
[Chart: Approximate Process Geometry (100um down to 10nm), Transistors/Chip (M) and Transistors/PM (K); includes ITRS’99 data]
http://en.wikipedia.org/wiki/Moore’s_law
30
Moore’s Law: An Increasing Design Problem ...
[Chart: Approximate Process Geometry (100um down to 10nm), Transistors/Chip (M) and Transistors/PM (K); includes ITRS’99 data]
http://en.wikipedia.org/wiki/Moore’s_law
31
Designer Productivity has become the Technology Driver
 The Product Possibilities offered by utilising the Billions of Affordable and Aesthetically
Encapsulate-able Transistors are Commercially Beguiling!
 But the only way to utilise these possibilities in a reasonable time, with a reasonable
team and at a reasonable cost; is huge amounts of Reuse of Design and Technology ...
 Hardware, Software and other Technologies; Methods and Tools
 In-Company: Sourced and Evolved from Predecessor Products
 Ex-Companies: Sourced from businesses with lesser-known(?) Histories, but Specialist Knowledge
 Reuse Improves Quality; as objects are designed more carefully, and bug-fixes are incremental
 But this is a ‘trend towards zero-defects’ approach, not a ‘zero-defects’ one.
... Reuse Methods do seem to be good-enough for Commercial Applications!
... ‘Rigorous clean-sheet approaches’ will be orders of magnitude higher cost, so use of
Commercial Techniques for Dependable Systems is inevitable!
... The Available Components and Sub-Systems are unreliable; “get over it!”
32
ARM: brings the Right Horse to the Right Course ...
... Delivering ~5x speed (Architecture + Process + Clock)
About 50MTr
About 50KTr
33
...Which means: 24 Processors in 6 Families ...
34
... CoreLink for Heterogeneous Multi-Processing ...
[Block diagram labels]
 CoreLink™ CCN-504 Cache Coherent Network with Snoop Filter and integrated 8-16MB L3 cache
 Up to 4 coherent clusters, up to 4 cores per cluster (Quad Cortex-A15, each with L2 cache)
 Dual channel DDR3/4 x72 (two CoreLink™ DMC-520, x72 DDR4-3200, PHY)
 Up to 18 AMBA interfaces (ACE, AHB) for I/O coherent accelerators and IO
 IO shown: PCIe, 10-40 GbE, DPI, Crypto, DSP, SATA, USB
 NIC-400 Network Interconnect (Flash, GPIO)
 Interrupt Control, Virtualized Interrupts
 IO Virtualisation with System MMU
 Heterogeneous processors: CPU, GPU, DSP and accelerators
 Uniform System memory; Peripheral address space
35
… Tools, Libraries and Partners to Realize the Opportunity
 Technology to build Electronic System solutions:
 Software, Drivers, OS-Ports, Tools, Utilities to create
efficient system with optimized software solutions
 Diverse Physical Components, including CPU and GPU
processors designed for specific tasks
 Interconnect System IP delivering coherency and the
quality of service required for lowest memory bandwidth
 Optimised Cell-Libraries for highly optimized SoC
implementations
 Well Connected to Partners in the Life-Cycle:
 For complementary tools and methods required by
System Developers
 Global Technology Global Partners:
 >900 Licences; Millions of Developers
36
 We Can’t Design it Right
 HW is SW; and Coding errors remain. State-space too big for simulation
exploration. Can’t model or explore whole Systems and they are too
complex for Formal methods
 We Can’t Make it Right
 Chips are subject to Process Imperfections and Variability. Chips and
Systems are subject to Verification and Test Escapes. Boolean math is
absolute; logic cells are not
 We Can’t Keep it Right
 Chips are susceptible to Supply Transients, Wear-Out and High-Energy
particles.
... And it all gets worse as processes shrink and complexity grows
... Yet we DO make Complex Electronic Systems that work!
... What is the explanation? (can we quantify it and use it?)
... Or are we just being Harbingers of an Ever-Threatening Doom?
Where Do All The Errors Go?
37
 System-Level Dependability is what matters ...
 Dependable Systems need to Reuse Components and Sub-Systems (Physical and Virtual)
for Productivity; and the only affordable ones are of Commercial quality!
 Clean-Sheet design is off-the-table for almost all complex products!
... the possible exception being the (diminishing) cost-no-object market!
 The Only Place to implement System-Level Dependability is in the System ‘Layer’!
 Dependability of Component and Sub-Systems may be enhanced, which will help with the
System-Level task; but they cannot achieve System-Level Dependability by themselves!
... I believe this is the only viable Strategy for creation of Dependable Systems
Facing the Unavoidable Truth
Dependable on Undependable ...
38
Toolbox to help us “Get over It”...
 The only universal interpretation of Fail-Safe is Fail-Functional!
 Probably impossible for the General Case; but may be possible for Specific Critical Cases.
 So the identification of Failure and the initiation of appropriate Response must be the
highest System-Layer; Above the Functional-Integration-Layer.
 This can include the ‘zero-case’ (In the event that it is all non-critical)
 Recognising the differing requirements for Failure Survival (All cases are not equal)
 Components and Sub-Systems may have protection built in, to increase their Reliability
(How probable are they to fail? How many/What type of defects can be tolerated?)
 We need a Toolbox (equivalent of ‘Spare Rows and Columns’) for the System-Level
 Memory Chip providers build in Repair mechanisms to overcome process limitations
 Memory Systems providers Overcome memory limitations by handling Files not Addresses.
 Redundancy (Double/Triple) is a black-box implementation strategy for logic blocks
 Defensive Programming is a technique for building checking into software
 ...
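One item from this toolbox, triple redundancy, can be sketched as a majority voter wrapped around three replicas of a black-box function. The names vote, triple_redundant, good and faulty are hypothetical, and the sketch assumes the replicas fail independently.

```python
from collections import Counter


def vote(a, b, c):
    """Majority vote over three redundant results; raises if all three disagree."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority: all three replicas disagree")
    return value


def triple_redundant(replicas, x):
    """Run three (possibly faulty) replicas of one function and vote on the output."""
    if len(replicas) != 3:
        raise ValueError("exactly three replicas are required")
    return vote(*(replica(x) for replica in replicas))


def good(x):
    return x * x


def faulty(x):  # hypothetical replica with a stuck output
    return -1


result = triple_redundant([good, faulty, good], 7)
```

A single faulty replica is outvoted. When there is no majority the voter raises rather than guessing, which leaves the choice of response to the layer above.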
39
Conclusions
 Systems are what End-Customers buy; they expect them to be Dependable Enough.
 A subjective level which is Application, State and Context dependent.
 Commercial Components and Sub-Systems (HW/SW) are the building blocks
 Commercial use has given us the Technologies which we are economically bound to use
 They work better than we would rightly expect, but we cannot quantify their quality
 We can improve their Quality/Reliability/Dependability; but 100% is an asymptotic goal!
 Dependable Systems must be based on Less-Dependable Components
 So: System Dependability must be handled by the System-Level Software (Top-Level); only it can
determine the expected action and appropriate corrective action for everything in its domain.
 And: Because Dependability is Application and State Dependent, then it can only be handled by a
Methodology ... Not every System state needs the same Dependability.
... The Commercial Imperative won’t wait for the ‘right way’
... before it produces systems that People Depend on!
40
The END ...

EDCC14 Keynote, Newcastle 15may14

  • 1. 1 Where Did All The Errors Go? European Dependable Computing Conference http://conferences.ncl.ac.uk/edcc2014/ Newcastle, 13-16may14 Prof. Ian Phillips Principal Staff Engineer ARM Ltd ian.phillips@arm.com Visiting Prof. at ... Contribution to Industry Award 2008 Opinions expressed are my own ... Links to Pdf and SlideCast @ http://ianp24.blogspot.com
  • 2. 2 When we think of Computing we think of ...  HPC and Mainframe ... maybe Desktop ... but not really Laptop or (Heaven forbid) Pocketable?
  • 3. 3 The Visible Face of Computing Today Essential but not Vital ... All want Reliable
  • 4. 4 The Invisible Face of Computing Today Unrecognised but Vital ... All need Dependable
  • 5. 5 ... State (s) and Time (t) are usually factors in this.  It can include phenomena ranging from human thinking to calculations with a narrower meaning. ... Wikipedia  Usually used to animate analogies (models) of real-world situations ... frequently fast enough to be used as a stabilising factor in a loop (Real-time). ... Not prescriptive about the choice of Implementation Technology! ... Nor prescriptive about Programmability! So What is Computing ... A mechanism for the algebraic manipulation of Data ... y=F(x,t,s) IN (x) Enumerated Phenomena OUT (y) Processed Data/Information
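One way to read y=F(x,t,s) is as a function that takes an input sample, the current time and the previous state, and returns an output together with the next state. The sketch below assumes a first-order low-pass filter as F; the filter itself, the names `step` and `run`, and the time constant `tau` are illustrative choices and do not come from the talk.

```python
import math

def step(x, t, s, tau=0.5):
    """One evaluation of y = F(x, t, s) for an illustrative low-pass filter.

    x: input sample (the enumerated phenomenon)
    t: timestamp of the sample, in seconds
    s: state, a tuple (previous timestamp, previous output)
    Returns (y, next_state).
    """
    t_prev, y_prev = s
    dt = t - t_prev
    # The weight given to the new sample grows with the elapsed time.
    alpha = 1.0 - math.exp(-dt / tau)
    y = y_prev + alpha * (x - y_prev)
    return y, (t, y)

def run(samples, s0=(0.0, 0.0)):
    """Feed (t, x) samples through F, threading the state along."""
    s, outputs = s0, []
    for t, x in samples:
        y, s = step(x, t, s)
        outputs.append(y)
    return outputs
```

The same signature fits a gear train, an analogue circuit or a program equally well, which is the slide's point about implementation technology.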
  • 6. 6 Hipparchos’s Antikythera - c87BC Early-Mechanical Computation Hipparchos c.190 BC – c.120 BC. Ancient Greek Astronomer, Philosopher and Mathematician  A Machine for Calculating Planetary Positions  Technology: Metal, Hand-Cut Gears, Analogue  Found in the Mediterranean in 1900 (Believe there might have been 10’s)
  • 7. 7 Orrery c1700 ... Planet Motion Computer  Inventor: George Graham (1674-1751). English Clock-Maker.  Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!) Mechanical Technology
  • 8. 8  A Machine for Computing Polynomial Tables  Technology: Metal, Precision Gears, Digital (base 10)  Beyond the gear-cutting technology of the day Babbage's Difference Engine - 1837 Constructed 2000 Late-Mechanical Computation
  • 9. 9 Amsler’s Planimeter - c1856 Mechanical Computation Planimeter 2014 !  A Machine for Calculating Area of an arbitrary 2D shape  Technology: Precision Mechanics, Analogue  Available today ... Electronically enhanced
  • 10. 10  General Purpose (Programmable) Computing Machine  Technology: Electronics (valves), Digital (base 2)  Available today ... Micro-Electronically enhanced (Mainframe <=> Laptop) University of Manchester’s “Baby” - 1948 (Reconstruction) Electronic Computation
  • 11. 11  Digital Electronics  Software  Memory  Optics  Analogue Electronic  Sensors/Transducers  Mechanics  Micro-Motors  Displays  Discharge Tube  Robotic Assembly  Plastic, Metal, Glass Image Input => Compute (Image Processing) => Data-File ... Many Technologies working, seamlessly, to Enhance Human Memory Electronic System (Cyber-physical System) - 2014 1: aka; Cyber-Physical System (Geek-Talk!) Incorporating DIGIC5+ (ARM) System-Level Computation ‘Classic’ Computer
  • 12. 12  They sell things that Customers want to buy  Supporting the End-Customer’s needs ... Who may be several ‘layers’ above their business.  Focus on their Core Competencies in a Globally Competitive Market  Avoid Commoditisation by Differentiation  Cost and Quality (by improving Process) ..and..  Improved Business-Models (which make the Money) ..and..  New/Improved Technology (which is Expensive and/or Risky)  Product Development is a Cost (Risk) to be Minimised  Technology (HW, SW, Mechanics, Optics, Graphene, etc) just enables Options!  New-Technology may cost more (including risk) than it delivers in Product Value!  Over-Design costs ... Cannot afford the Precautionary Principle! ... Because successful End-Products fund their entire (RD&I) Value-Chains ... Their Technologies will be an economic necessity in (all) lower volume markets! Computing Technologies in Business Context Businesses have to be Competitive, Money Making Machines today ...
  • 13. 13 ... Old Compute Markets remain; but are no longer the Technology Drivers! Business Opportunities Drive Technology Developments ...And 21c Products are increasingly ‘Intelligent’ [Chart: Millions of Units against year, 1970 to 2030] 1st Era: Select work-tasks; 2nd Era: Broad-based computing for specific tasks; 3rd Era: Computing as part of our lives
  • 14. 14  How often can ...  An Anti-lock Braking system be unavailable?  Your Mobile Phone crash/restart?  An Autopilot be unavailable? ... As often as it likes: As long as it is available when you need it!  The Power Grid crash/restart?  An Engine Management unit get stuck at Full Throttle?  A spurious Cash Transaction in your Bank Account? ... Never!  A PC crash before it is unusable?  Weather forecast be incorrect before it matters? ... Surprisingly often: Humans are inclined to blame themselves. ... Dependability is Subjective; Application, User and Context dependent (Quality) What Dependable Computing do we Expect? “To be trusted to do or provide what is needed” (Merriam-Webster)
  • 15. 15 End-Products are about Function, not about Technology You can’t tell which bits are done in Hardware and which in Software? Hardware Module? Software Module? Hardware + Software Module? ... So where are the Dependability Vulnerabilities located?
  • 16. 16  Boolean Mathematics is Dependable; but implementation depends on reliably mapping its equations to the physical world through Logic-Gates  (For HW and SW!)  CMOS has been a reliable Boolean mapping for 30 years, but ...  Today’s 20nm transistors have larger variability, and there are more of them on a chip (Typically 500M in 2012)  At 70degC, with Vtn=130mV (sigma ~25mV), around 1 in 5 million transistors have Vt<0 (Can’t be turned off)  That’s ~100 transistors/chip that don’t switch off  And another hundred that only turn on weakly (low drive/slow)  And they will always be randomly placed! ... So today’s chips shouldn’t work? Is Hardware (Logic) Dependable? 1/3 [Figure: NAND gate with inputs A and B, supply +V and output OUT]
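The arithmetic on this slide can be checked to order of magnitude with a short calculation. The sketch below assumes that the threshold voltage is normally distributed with the quoted mean (130 mV) and sigma (25 mV); the slide does not state its tail model, so the Gaussian assumption and the function names are illustrative only. Under that assumption the count of always-on transistors per 500M-transistor chip comes out in the same tens-to-hundreds range as the slide's figure.

```python
import math

def tail_probability(mean_mv, sigma_mv, threshold_mv=0.0):
    """P(Vt < threshold) when Vt is assumed to be normally distributed."""
    z = (mean_mv - threshold_mv) / sigma_mv
    # Lower-tail probability of a standard normal at -z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def expected_outliers(n_transistors, probability):
    """Expected number of transistors falling in the tail on one chip."""
    return n_transistors * probability

# Figures quoted on the slide: Vtn = 130 mV, sigma ~25 mV, ~500M transistors.
p_always_on = tail_probability(130.0, 25.0)
per_chip = expected_outliers(500e6, p_always_on)
print(f"Vt<0 probability: {p_always_on:.2e}, expected per chip: {per_chip:.0f}")
```

A tiny per-device probability multiplied by a very large device count still yields a non-trivial number of misbehaving transistors on every chip, which is the slide's argument.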
  • 17. 17 Mitigating this we have ...  Transistors: Not all ...  Are at 70 degC even if the die is (local variation)  Are Minimum Size ... Increasing ‘area’ reduces variability  Are on Critical Paths ... And ‘chains’ of gates perform closer to average!  Non-Functionality is (easily) Observable ... The effects can be very subtle.  CMOS Logic: Is very robust and will continue to work with extreme transistors  Leaky Gates and Faster Transitions are not usually failure criteria  The chance of a second extreme transistor on a single Critical Path is of the order of <1:1,000,000  Memory: Circuits are much more sensitive to Vt/gm variation ...  But spare rows/columns are part of SRAM designs and allow lots of defects to be ‘repaired’  AND >75% of typical SoC die area is memory, so ...  Most of the sensitive area has a repair strategy! ..and...  The rest is inherently more robust! Is Hardware (Logic) Dependable? 2/3
  • 18. 18  But we haven't included ...  Internally and Externally generated synchronous supply noise? (Greater susceptibility at lower voltages)  High-energy particles? (Greater susceptibility at smaller geometries)  Wear-out (Vt/Gain drift)? (Greater susceptibility at smaller geometries)  Temperatures greater than 70degC (140degC is not uncommon)  Limitations of Verification and Test (Limited exploration of state-space)  We are repeatedly multiplying tiny-improbables by large-numbers ...  And many of the values are only guesses!  We have no real idea about the reliability/dependability of modern Systems or Components  We only know that as process geometries shrink, Susceptibility will get worse ...  Chips will get ever more complex (and more chips will be used in more complex Systems)  Transistors will get smaller and Designers will erode safety margins to get performance ... Despite this Chips and Systems do Yield today more than we would rightly expect ... ... So we must be utilising Unknown Safety Factors! Is Hardware (Logic) Dependable? 3/3
  • 19. 19  All Software Crashes!  Software providers seldom guarantee the functionality of their product  Quality is tested-in; and improved by bug-fixes/patches in the field (To what level?)  So software Reuse offers improved Quality and Productivity (But over what?)  Residual Errors ...  No code has zero residual errors!!  Well structured and tested Source-Code has ~5 errors per 1,000 lines of code (E-KLOC)  Commercial code is typically ~5x worse than this  No Useful Correlation between residual-errors and their system-impact severity  Only the Heuristic, that ‘most of them are harmless’.  Formal-Methods are better; but cost is high if you need a clean-sheet design.  Even Perfect-Software would have to work with an Imperfect-Platform  Don’t underestimate the Commercial Importance of TTM and Cost !!! Is Software Dependable? 1/3 Demonstrating the limitations of achieving Quality through Test ...
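To make the E-KLOC figures concrete, the sketch below scales the quoted densities (~5 errors per 1,000 lines for well structured code, ~5x that for commercial code) to a codebase of a given size. The function name and the one-million-line example are hypothetical and chosen only for illustration.

```python
def residual_errors(lines_of_code, errors_per_kloc=5.0, commercial_factor=1.0):
    """Estimate residual errors from a per-KLOC defect density.

    errors_per_kloc: density quoted on the slide for well structured code.
    commercial_factor: multiplier for commercial code (the slide suggests ~5).
    """
    return lines_of_code / 1000.0 * errors_per_kloc * commercial_factor

# Hypothetical 1,000,000-line codebase.
well_tested = residual_errors(1_000_000)
commercial = residual_errors(1_000_000, commercial_factor=5.0)
print(f"well structured: ~{well_tested:.0f}, commercial: ~{commercial:.0f}")
```

The estimate says nothing about severity; as the slide notes, there is no useful correlation between the number of residual errors and their system impact.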
  • 20. 20 Is Software Dependable? 2/3 Hardware and Software Design are indistinguishable ...
    Hardware (Verilog Language)?
        // A master-slave type D-Flip Flop
        module flop (data, clock, clear, q, qb);
          input data, clock, clear;
          output q, qb;
          // primitive #delay instance-name
          //   (output, input1, input2, .....),
          nand #10 nd1 (a, data, clock, clear),
                   nd2 (b, ndata, clock),
                   nd4 (d, c, b, clear),
                   nd5 (e, c, nclock),
                   nd6 (f, d, nclock),
                   nd8 (qb, q, f, clear);
          nand #9  nd3 (c, a, d),
                   nd7 (q, e, qb);
          not  #10 inv1 (ndata, data),
                   inv2 (nclock, clock);
        endmodule
    Software (C Language)?
        #include<time.h>
        /* Use the PC's timer to check */
        /* processing time */
        main() {
          clock_t time,deltime;
          long junk,i;
          float secs;
        LOOP:
          printf("input loop count: ");
          scanf("%ld",&junk);
          time = clock();
          for(i=0;i<junk;i++)
          deltime = clock() - time;
          secs = (float) deltime/CLOCKS_PER
          printf("for %ld loops, #tics = % %fn",junk,deltime,secs);
          goto LOOP;
          ...
    Target Platform HW ----- & ----- SW Target Architecture Info Compilers HW ----------- SW Configuration Files HW -------------- SW
  • 21. 21 Is Software Dependable? 3/3 Somebody will see the bugs! (The Open Source Delusion) 1: http://www.wired.com/2014/04/heartbleedslesson/ 2: http://veridicalsystems.com/blog/of-money-responsibility-and-pride/ “It is now very clear that OpenSSL development could benefit from dedicated full-time, properly funded developers” “OSF typically receives only $2,000 a year in donations”  OpenSSL HeartBleed bug (1)  Update was received just before a Public Holiday  Editor was a known and high-quality source  Code was reviewed informally and released  Editor was conflicted with day-job, family and holiday pressure (2)  Too few resources to do a proper job.  This was an E-KLOC error ...  Not a Formatting error, nor a Functional error  It was a System error (an omission in a non-functional aspect of the code). ... Was the ‘fault’ with the software Source (OpenSSL Software Foundation (OSF)) ? ... Or a User Community too ready to believe in the Quality of Open Source software?
  • 22. 22 Designing the Computing System ... is about creating a Model of Behaviour to meet Non-Functional Constraints. Create Functional-Model (1) on a ‘Generic’ Platform; Translate to Functional-Model on an ‘Optimal’ Platform. [Diagram labels: functions F1 to F5; ‘Optimal’ Platform with HW1 to HW4, Hardware Interface, RTOS/Drivers, Thread, Bus(es), Processor(s)] 1: This includes a Model of Execution such as a Java VM.
  • 23. 23 Typical 2014 Computing Platform ... ... is just 137.2 x 70.5 x 5.9 mm
  • 24. 24 Typical 2014 Computing Platform Exynos 5422 Eight 32 bit CPUs (big.LITTLE): • Four big (2.1GHz ARM A15) for heavy tasks; • Four small (1.5GHz ARM A7) for lighter tasks. + Nine Mali GPU cores ... ... A ~30 Core Heterogeneous Multi-Processor ... In your Shirt Pocket! ... 21 significant ‘Chips’
  • 25. 25 2010: Apple’s A4 SIP Package (Cross-section) IC Packaging Technology  The processor is the centre rectangle. The silver circles beneath it are solder balls.  Two rectangles above are RAM die, offset to make room for the wirebonds.  Putting the RAM close to the processor reduces latency, making RAM access faster, and reduces power consumption ... But increases cost.  Memory: Unknown  Processor: Samsung/Apple (ARM Processor)  Packaging: Unknown (SIP Technology) Source ... http://www.ifixit.com Processor SOC Die 2 Memory Dies Glue Memory ‘Package’ 4-Layer Platform ‘Package’ Steve Jobs WWDC 2010
  • 26. 26 2013: Samsung Solid-State Memory  Smart Memory Interface (eMMC)  16-128Gb in a single package  8Gb/die. Stacked 2-16 die/package  Handles errors in the bulk-data store  Package just 1.4mm thick! (11.5x13x1.4mm) ... Smaller than a postage stamp
  • 27. 27 2012: Nvidia’s Tegra 3 Processor Unit (Around 1B transistors) NB: The Tegra 3 is similar to the Apple A4
  • 28. 28 Components and Sub-Systems from Global Enterprise ... ... Global Teams contributing Specialist Knowledge & Knowhow  Apple ID’d 159 Tier-1 Suppliers ...  Thousands of Engineers Globally  Est. 10x Tier-2 Suppliers ...  Including Virtual Components (1) and Sub-Systems (ARM and other IP Providers)  Multiple Technologies ...  Hardware, Software, Optics, Mechanics, Acoustics, RF, Plastics, etc  Manufacturing, Test, Qualification, etc.  Methods, Tools, Training, etc  Tens of thousands of Engineers Globally ... More than 90% of Technology and Methods are Reused (productivity)! 1: Virtual Components do not appear on BOM
  • 31. 31 Designer Productivity has become the Technology Driver  The Product Possibilities offered by utilising the Billions of Affordable and Aesthetically Encapsulate-able Transistors are Commercially Beguiling!  But the only way to utilise these possibilities in a reasonable time, with a reasonable team and at a reasonable cost, is huge amounts of Reuse of Design and Technology ...  Hardware, Software and other Technologies; Methods and Tools  In-Company: Sourced and Evolved from Predecessor Products  Ex-Companies: Sourced from businesses with lesser-known(?) Histories, but Specialist Knowledge  Reuse Improves Quality; as objects are designed more carefully, and bug-fixes are incremental  But this is a ‘trend towards zero-defects’, not a ‘zero-defects’ approach. ... Reuse Methods do seem to be good-enough for Commercial Applications! ... ‘Rigorous clean-sheet approaches’ will be orders of magnitude higher cost, so use of Commercial Techniques for Dependable Systems is inevitable! ... The Available Components and Sub-Systems are unreliable; “get over it!”
  • 32. 32 ARM: brings the Right Horse to the Right Course ... ... Delivering ~5x speed (Architecture + Process + Clock) About 50MTr About 50KTr
  • 33. 33 ...Which means: 24 Processors in 6 Families ...
  • 34. 34 ... CoreLink for Heterogeneous Multi-Processing ... Up to 4 cores per cluster; Up to 4 coherent clusters; Integrated L3 cache; Up to 18 AMBA interfaces for I/O coherent accelerators and IO; Peripheral address space; Heterogeneous processors – CPU, GPU, DSP and accelerators; Virtualized Interrupts; Uniform System memory; Dual channel DDR3/4 x72 [Block diagram labels: four Quad Cortex-A15 clusters, each with L2 cache; Snoop Filter; CoreLink™ CCN-504 Cache Coherent Network; 8-16MB L3 cache; two CoreLink™ DMC-520 with x72 DDR4-3200; PHY; two NIC-400 Network Interconnect; ACE; AHB; Interrupt Control; IO Virtualisation with System MMU; Flash; GPIO; USB; SATA; PCIe; 10-40 GbE; DPI; Crypto; three DSP]
  • 35. 35 … Tools, Libraries and Partners to Realize the Opportunity  Technology to build Electronic System solutions:  Software, Drivers, OS-Ports, Tools, Utilities to create efficient systems with optimized software solutions  Diverse Physical Components, including CPU and GPU processors designed for specific tasks  Interconnect System IP delivering coherency and the quality of service required for lowest memory bandwidth  Optimised Cell-Libraries for highly optimized SoC implementations  Well Connected to Partners in the Life-Cycle:  For complementary tools and methods required by System Developers  Global Technology, Global Partners:  >900 Licences; Millions of Developers
  • 36. 36  We Can’t Design it Right  HW is SW; and Coding errors remain. State-space too big for simulation exploration. Can’t model or explore whole Systems and they are too complex for Formal methods  We Can’t Make it Right  Chips are subject to Process Imperfections and Variability. Chips and Systems are subject to Verification and Test Escapes. Boolean math is absolute; logic cells are not  We Can’t Keep it Right  Chips are susceptible to Supply Transients, Wear-Out and High-Energy particles. ... And it all gets worse as processes shrink and complexity grows ... Yet we DO make Complex Electronic Systems that work! ... What is the explanation? (can we quantify it and use it?) ... Or are we just being Harbingers of an Ever-Threatening Doom ? Where Do All The Errors Go?
  • 37. 37  System-Level Dependability is what matters ...  Dependable Systems need to Reuse Components and Sub-Systems (Physical and Virtual) for Productivity; and the only affordable ones are of Commercial quality!  Clean-Sheet design is off-the-table for almost all complex products! ... the possible exception being the (diminishing) cost-no-object market!  The Only Place to implement System-Level Dependability is in the System ‘Layer’!  Dependability of Components and Sub-Systems may be enhanced, which will help with the System-Level task; but they cannot achieve System-Level Dependability by themselves! ... I believe this is the only viable Strategy for creation of Dependable Systems Facing the Unavoidable Truth Dependable on Undependable ...
  • 38. 38 Toolbox to help us “Get over It”...  The only universal interpretation of Fail-Safe is Fail-Functional!  Probably impossible for the General Case; but may be possible for Specific Critical Cases.  So the identification of Failure and the initiation of appropriate Response must be in the highest System-Layer; Above the Functional-Integration-Layer.  This can include the ‘zero-case’ (In the event that it is all non-critical)  Recognising the differing requirements for Failure Survival (All cases are not equal)  Components and Sub-Systems may have protection built in, to increase their Reliability (How probable are they to fail? How many/What type of defects can be tolerated?)  We need a Toolbox (equivalent of ‘Spare Rows and Columns’) for the System-Level  Memory Chip providers build in Repair mechanisms to overcome process limitations  Memory Systems providers Overcome memory limitations by handling Files not Addresses.  Redundancy (Double/Triple) is a black-box implementation strategy for logic blocks  Defensive Programming is a technique for building checking into software  ...
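Two of the toolbox items, Triple Redundancy and Defensive Programming, can be combined in a few lines. The sketch below is a minimal illustration, not a method taken from the talk: the names `vote`, `run_redundant` and `on_failure` are hypothetical, replicas are assumed to be plain callables returning hashable values, and the `on_failure` callback stands in for the higher System-Layer that decides the appropriate response.

```python
from collections import Counter

class NoMajorityError(Exception):
    """Raised when the replicas disagree so much that no value wins."""

def vote(results):
    """Return (majority value, indices of dissenting replicas)."""
    if not results:
        raise ValueError("vote() needs at least one result")
    value, hits = Counter(results).most_common(1)[0]
    if hits * 2 <= len(results):
        raise NoMajorityError(f"no majority among {len(results)} results")
    dissenters = [i for i, r in enumerate(results) if r != value]
    return value, dissenters

def run_redundant(replicas, x, on_failure):
    """Run every replica on x, vote, and escalate if no majority exists."""
    results = []
    for replica in replicas:
        try:
            results.append(replica(x))
        except Exception:
            # Defensive: a crash is treated as one more faulty answer.
            # A fresh object() never equals another, so crashes cannot
            # form a majority among themselves.
            results.append(object())
    try:
        return vote(results)
    except NoMajorityError:
        # Hand the decision to the system layer (hypothetical callback).
        return on_failure(x), list(range(len(replicas)))
```

A single faulty or crashed replica is outvoted and reported, while a loss of majority is passed upward instead of being hidden, which matches the slide's point that failure response belongs above the functional layer.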
  • 39. 39 Conclusions  Systems are what End-Customers buy; they expect them to be Dependable Enough.  A subjective level which is Application, State and Context dependent.  Commercial Components and Sub-Systems (HW/SW) are the building blocks  Commercial use has given us the Technologies which we are economically bound to use  They work better than we would rightly expect, but we cannot quantify their quality  We can improve their Quality/Reliability/Dependability; but 100% is an asymptotic goal!  Dependable Systems must be based on Less-Dependable Components  So: System Dependability must be handled by the System-Level Software (Top-Level); only it can determine the expected action and appropriate corrective action for everything in its domain.  And: Because Dependability is Application and State Dependent, it can only be handled by a Methodology ... Not every System state needs the same Dependability. ... The Commercial Imperative won’t wait for the ‘right way’ ... before it produces systems that People Depend on!