SERENE'14 Autumn School 
ENGINEERING RESILIENT CYBER PHYSICAL SYSTEMS 
System-Level Concurrent Error Detection 
Dr. Luigi Pomante 
Università degli Studi dell'Aquila
Center of Excellence DEWS 
luigi.pomante@univaq.it
Introduction 
Resilience 
Reliability 
Fault Tolerance 
Concurrent 
Error 
Detection 
System Level CED - 2 - © 2014 - Luigi Pomante
Introduction 
Error detection is one of the basic features needed
to support reliability, and hence resilience, in CPS
So, this talk focuses on error detection issues in the cyber 
part of a CPS 
Such a part is normally a customized electronic digital system, 
with an ad-hoc hw/sw architecture, typically embedded in a 
more complex heterogeneous system that heavily interacts 
with some physical processes 
Introduction 
Error Detection Methodologies 
Off-line vs. Concurrent 
System-Level Design Methodologies 
System-Level Specification 
Functional characterization of the system without dealing 
with implementation aspects 
Specification of implementation objectives and constraints 
Timing, Power Consumption, Area 
Estimation of the influence of different alternatives on the 
final implementation 
HW/SW system composition 
Different processors and/or alternative technologies 
Introduction 
Typically, system resilience/reliability aspects are neglected
at the higher levels of the system synthesis process
They are postponed to lower abstraction levels, but the use of
resilience/reliability methodologies can significantly
impact timing, energy and area
It is necessary to transfer these aspects toward the upper 
levels of the synthesis flow by adding the resilience/reliability 
constraint to the classical cost parameters 
This work investigates the problem of adopting design for 
reliability/resilience approaches at system level, when all the 
solutions are still open for the implementation of the device, 
presenting a set of design methodologies to provide concurrent 
error detection (CED) properties to the final implementation 
Goal 
Achieving this broad resilience/reliability
co-design project involves the following aspects
specification of systems in a co-design environment 
supporting resilience/reliability constraints 
design methodologies providing the desired CED properties 
hw/sw system partitioning on the basis of metrics taking 
into account both traditional co-design issues and 
resilience/reliability constraints 
Overview 
Problem Definition 
Target System Architecture 
Fault Model 
System Specification 
Design Methodologies for Reliability 
Design Analysis and Metrics 
Hw/Sw System Partitioning 
A Case Study: a Reliable Pacemaker 
Problem Definition 
A Section is a subset of the system specification 
A Critical Section is a section where the CED 
property is required 
A Reliable Section is a critical section that 
propagates either error free critical results or faulty 
critical results associated with an error indication 
Problem Definition 
The underlying assumption is that the input
data processed by the reliable section are error free
The upstream sections either provide correct data by
definition or are designed to be reliable themselves
The downstream sections either need to be designed to be reliable too, or
no reliability constraint applies to them
In the former case reliability is extended to all downstream 
elements, in the latter the property has a pure local effect 
Problem Definition 
In order to define formally these two different 
characterizations, the following definitions are 
introduced 
Local Reliability 
The Local Reliability property of a critical section specifies 
that the reliability constraints involve only the related critical 
section 
Global Reliability 
The Global Reliability property of a critical section specifies 
that the reliability constraints involve the related section
and recursively all the downstream sections 
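The two definitions can be sketched in Python as a reachability computation. This is only an illustration: the data-flow graph, the section names and the function `constrained_sections` are hypothetical and do not come from the slides.

```python
from collections import deque

def constrained_sections(graph, critical, kind):
    """Return the sections covered by a reliability constraint.

    graph maps each section to the sections that consume its output.
    kind is "LOCAL" (only the critical section) or "GLOBAL"
    (the critical section plus, recursively, all downstream sections).
    """
    if kind == "LOCAL":
        return {critical}
    if kind != "GLOBAL":
        raise ValueError(f"unknown reliability kind: {kind}")
    covered = {critical}
    queue = deque([critical])
    while queue:
        section = queue.popleft()
        for successor in graph.get(section, ()):
            if successor not in covered:
                covered.add(successor)
                queue.append(successor)
    return covered

# Hypothetical specification: sense -> control -> {actuate, log}
graph = {"sense": ["control"], "control": ["actuate", "log"]}
local_scope = constrained_sections(graph, "control", "LOCAL")
global_scope = constrained_sections(graph, "control", "GLOBAL")
print(sorted(local_scope))   # ['control']
print(sorted(global_scope))  # ['actuate', 'control', 'log']
```

In both cases the upstream section (`sense` here) is outside the constraint, in line with the assumption that input data are error free.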
Problem Definition 
Local and Global Reliability Specification 
[Figure: the same section graph (nodes labeled A to E) drawn twice, once for each kind of reliability]
Local reliability on B: the data
provided to A are reliable
Global reliability on B: the data
provided to A and B are reliable
Problem Definition 
Two kinds of reliability are needed because a specification may
also include a description of the environment, which needs no
reliability property, or a set of functionalities of which only
one must be reliable
For example, a digital control system specification for a car could
include tachometer, temperature and ABS control: reliability
is needed only for the ABS
To specify which sections must be reliable and what kind of
reliability is desired, dedicated system-level specification
languages (or proper extensions to the existing ones) are required
System Specification 
Two languages have been considered for system
specification: Occam II and SystemC
The first one has been selected since the TOSCA 
environment (a Co-design environment for embedded 
systems), used in our studies to verify the proposed 
approaches, is based on it 
The second language is becoming increasingly popular for 
system level specification, thus making its adoption almost 
a requirement when pursuing the integration of the 
proposed approaches in a real design flow 
System Specification 
Reliability constraints in Occam II 
The language has been extended with the introduction of 
statements for identifying critical sections to be added to 
the standard constraint definition section 
CS FROM label1 TO label2 IS LOCAL (GLOBAL) 
INT a, b:
CHAN OF INT in, out:            -- declaration of a communication channel
TAG A:                          -- tag definition
SEQ
  a := 0
  WHILE TRUE
    TAG B:
    SEQ
      a := a + 1
      out ! a
      TAG C:
      in ? b
      a := a + b
TAG D:
MAXDELAY FROM B TO C IS 10:     -- timing constraints
MAXRATE OF B IS 100:
CS FROM A TO D IS LOCAL:        -- reliability constraint
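As an illustration of how a tool might pick up the critical-section statement, here is a minimal Python sketch that extracts CS statements from a constraint section. It only assumes the `CS FROM label1 TO label2 IS LOCAL (GLOBAL)` form shown above; the function name and the regular expression are hypothetical and are not part of TOSCA.

```python
import re

# Pattern for the critical-section statement, e.g. "CS FROM A TO D IS LOCAL:".
CS_PATTERN = re.compile(
    r"CS\s+FROM\s+(\w+)\s+TO\s+(\w+)\s+IS\s+(LOCAL|GLOBAL)\s*:"
)

def parse_critical_sections(constraints):
    """Return (start_tag, end_tag, kind) for every CS statement found."""
    found = []
    for line in constraints.splitlines():
        match = CS_PATTERN.fullmatch(line.strip())
        if match:
            found.append(match.groups())
    return found

constraints = """
MAXDELAY FROM B TO C IS 10:
MAXRATE OF B IS 100:
CS FROM A TO D IS LOCAL:
"""
sections = parse_critical_sections(constraints)
print(sections)  # [('A', 'D', 'LOCAL')]
```

Timing statements are ignored here on purpose; only the reliability constraint is extracted.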
System Specification 
Reliability constraints in SystemC 
The language allows an intervention at different 
abstraction levels: module or process 
While working at module level, reliability constraints are 
imposed by extending the basic class using the inheritance 
mechanisms 
SC_MODULE_GCS, SC_MODULE_LCS 
– A reliability constraint imposed on the module applies directly to
all processes included in the module itself
When moving to process level, macro mechanisms can be 
adopted, by introducing additional macros for specifying 
critical sections and the local/global reliability constraint 
SC_GCS, SC_LCS 
Target System Architecture 
The reference architecture consists of the basic processor 
block (either general purpose or DSP), which executes software 
processes, main memory and a set of co-processors (ASIC or 
FPGA) implementing hardware functionalities if required 
Communication between hardware modules uses the available 
bus, memory otherwise 
[Figure: reference architecture with CPU, Memory, I/O Interface and Co-Processors]
Fault Model 
The adopted fault model is represented by the Single 
Functional Failure, where any number of physical faults 
causes a functional module to perform incorrectly 
The considered faults affect the hardware structure of the
system and thus also undermine the behavior of the software, but no software
failures are considered in this work
The modules that may fail are, thus, the main processor, the 
co-processors, the main memory, the system bus and the 
dedicated channels for hardware-hardware module 
communication 
Such a single failure model is based on a commonly adopted 
hypothesis: module failure is detected before another module 
fails 
Design Methodologies 
for Reliability 
The resilience/reliability project has investigated design 
methodologies for guaranteeing error detection capabilities 
based on the adoption of redundancy strategies 
Architectural and information redundancy 
The methodologies that have been analyzed and developed can 
be classified 
On the basis of the functionality to be performed and controlled 
Data Processing or Communication 
On the partitions involved 
HW or SW 
On the CED techniques adopted for guaranteeing the reliability 
properties 
Design Methodologies 
for Reliability 
The design approach considers as the basic element any 
functionality that the system must provide in a reliable way 
Nominal (N) 
Denotes such basic element 
Checking (C) 
Identifies the redundant functional elements designed to provide 
error detection capabilities 
Checker (CK) 
Is the functional element that detects a mismatching behavior 
between N and C due to failures 
Each one of these three elements (N, C and CK) can be 
independently implemented in hardware or in software, 
leading to several classes of methodologies 
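The roles of N, C and CK can be sketched in a few lines of Python. This is a minimal all-software illustration; the function being computed and the injected fault are hypothetical and only serve to show a mismatch being flagged.

```python
def nominal(x):
    """Nominal element N: the functionality the system must provide."""
    return 3 * x + 1

def checking(x):
    """Checking element C: a redundant computation of the same function."""
    return x + x + x + 1

def checker(n_result, c_result):
    """Checker CK: report an error when N and C disagree."""
    return n_result != c_result

def reliable_call(x, n=nominal, c=checking):
    """Return (result, error_flag), as a reliable section would."""
    n_result = n(x)
    return n_result, checker(n_result, c(x))

def faulty_nominal(x):
    """Hypothetical failure: bit 0 of the nominal result is flipped."""
    return nominal(x) ^ 1

ok = reliable_call(5)
bad = reliable_call(5, n=faulty_nominal)
print(ok)   # (16, False)
print(bad)  # (17, True)
```

The faulty result is still propagated, but together with an error indication, which matches the definition of a reliable section given earlier.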
Design Methodologies 
for Reliability 
Reliable Data Processing 
[Figure: nominal architecture, checking architecture and checker, each implemented either in Sw or in Hw]
Solution  Nominal  Checker  Checking
1         SW       SW       SW
2         SW       HW       SW
3         SW       SW       HW
4         SW       HW       HW
5         HW       SW       SW
6         HW       HW       SW
7         HW       SW       HW
8         HW       HW       HW
Design Methodologies 
for Reliability 
Reliable Data Processing 
Class 1: SW Nominal, SW Checker, and SW Checking 
Self-Checking SW 
Assertions 
Dual-Processor Checking 
VLIW Checking 
Class 2: SW Nominal, HW Checker, and SW Checking 
Interface for Functional Redundancy Check 
DMA Checker 
VLIW Checking with HW Checker 
Design Methodologies 
for Reliability 
Reliable Data Processing 
Class 4: SW Nominal, HW Checker, and HW Checking 
Dynamically Re-Configurable Checker 
Class 8: HW Nominal, HW Checker, and HW Checking 
Device Duplication 
TSC Scheduling 
TSC Devices 
Design Methodologies 
for Reliability 
Reliable Communications 
It is necessary to guarantee that any fault on 
communication lines is detected 
Either hardware redundancy (lines duplication) or 
information redundancy (data encoding) can be adopted 
Two possibilities should be considered 
Communications between procedures implemented in HW 
Other kinds of communication
– SW-SW, SW-HW, HW-SW 
Design Methodologies 
for Reliability 
Reliable Communications 
Communications between procedures implemented in HW 
A pair of HW sections communicates by means of dedicated 
lines 
– Line Duplication vs. Data Encoding 
Other kinds of communication 
When the communication involves a SW section then it makes 
use of the system bus 
– The only viable solution is the use of error detection codes 
– The best results are obtained by keeping the data in memory in
coded form and letting the CPU work only on non-coded data
» HW TSC Encoder/Decoder/ChecKer for the processor and
one (or more) for the HW devices
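The idea of keeping coded data in memory and handing non-coded data to the CPU can be sketched as follows. The slides do not name the error detection code; a single even-parity bit on 8-bit words is an assumption made here only to keep the example short, and it detects only an odd number of flipped bits.

```python
def encode(word):
    """Append an even-parity bit to an 8-bit data word."""
    if not 0 <= word < 256:
        raise ValueError("word must fit in 8 bits")
    parity = bin(word).count("1") % 2
    return (word << 1) | parity

def decode(coded):
    """Strip the parity bit and return (data, error_flag)."""
    data, parity = coded >> 1, coded & 1
    error = bin(data).count("1") % 2 != parity
    return data, error

stored = encode(0b10110010)         # value kept in memory in coded form
clean = decode(stored)              # what the CPU receives after decoding
corrupted = decode(stored ^ 0b100)  # one bit flipped during the transfer
print(clean)      # (178, False)
print(corrupted)  # (176, True)
```

In the architecture described above these two functions would sit in hardware encoder/decoder/checker blocks on the bus side, so the software itself never handles coded words.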
Design Methodologies 
for Reliability 
Reliable Communications 
Architecture with reliable communications 
[Figure: architecture with reliable communications. Blocks: CPU, Memory (Coded Data), I/O Interface, Co-Processors, three TSC EDCK units and one TSC CK unit]
Design Analysis 
and Metrics 
All the methodologies have been analyzed in detail
to highlight the main design issues
and to evaluate benefits and costs
The design issues have been analyzed qualitatively 
according to a reference schema in order to quickly show 
the main differences between different approaches 
Benefits and costs have been analyzed defining a set of 
significant parameters, constituting the basic elements 
needed to build metrics useful to compare the quality of 
different solutions, metrics that play an important role in 
the partitioning step 
Design Analysis 
and Metrics 
Design issues reference schema: key concepts 
Selection of number and typology of processing elements 
Detection of the need for a special architecture 
Analysis of synchronization issues between processing elements 
Analysis for possible physical and logical resources sharing 
Detection of modification needs of the original specification 
Selection of the execution policies for each processing element 
Allocation of the checker memory space 
Selection of the checking policies 
Analysis of the checker structure and complexity 
Selection of a mechanism to enable the checker to raise exceptions
to report error detection
Design Analysis 
and Metrics 
Benefits and Costs
Let us define the Efficiency of a given methodology as its
characterization with respect to three factors
Coverage 
– It is the percentage of functional faults that it is possible to 
detect with respect to the complete fault set 
Detection Latency (DL) 
– It is the time between the instant a fault causes an error and the 
instant the error is detected 
Performance Degradation (PD) 
– It is related to the overhead (i.e., additional execution time) 
caused by fault detection tasks with respect to the original 
system 
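The three efficiency factors can be written down as simple computations. This is a sketch under stated assumptions: the slides only say that performance degradation is related to the additional execution time, so expressing it as a relative overhead is a choice made here, and all function names are hypothetical.

```python
def coverage(detected_faults, all_faults):
    """Percentage of the complete fault set that is detected."""
    all_faults = set(all_faults)
    if not all_faults:
        raise ValueError("the fault set must not be empty")
    return 100.0 * len(set(detected_faults) & all_faults) / len(all_faults)

def detection_latency(t_error, t_detected):
    """Time between the error occurring and the error being detected."""
    if t_detected < t_error:
        raise ValueError("detection cannot precede the error")
    return t_detected - t_error

def performance_degradation(t_original, t_with_ced):
    """Relative execution-time overhead (assumed definition)."""
    return (t_with_ced - t_original) / t_original

fc = coverage({"f1", "f2", "f3"}, {"f1", "f2", "f3", "f4"})
dl = detection_latency(t_error=10, t_detected=14)
pd = performance_degradation(t_original=200, t_with_ced=250)
print(fc, dl, pd)  # 75.0 4 0.25
```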
Design Analysis 
and Metrics 
Benefits and Costs 
Let us define the Cost of a given solution as the overhead
with respect to the original system
Physical cost (Cp) 
– It represents the cost of the physical components added to the 
original architecture 
Design Cost (Cd) 
– It represents the effort needed to design and implement a given 
solution 
Hw/Sw System Partitioning 
Once the system, the constraints, and the set of possible
design solutions are specified, the partitioning step selects the
implementation of each task, either hardware or software
The achieved solution is checked against the designer's 
constraints and, if they are met, the solution is accepted, 
otherwise a backtrack is performed and another allocation 
solution is pursued 
This process is extremely complex and time consuming, due to 
the large number of possible alternatives and to the fact that, 
although heuristics and tuned estimation functions have been 
defined, it is the final co-simulation of the suggested system 
implementation that confirms it to be a solution or not 
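The select, check and backtrack loop described above can be illustrated with a very small Python sketch. It is not the TOSCA algorithm: it simply enumerates allocations in a fixed order, uses additive time and area estimates in place of co-simulation, and all task names and numbers are hypothetical.

```python
from itertools import product

def first_feasible_allocation(estimates, max_time, max_area):
    """Try HW/SW allocations in turn and return the first one that
    meets the constraints, or None if every alternative is rejected."""
    names = list(estimates)
    for choice in product(("SW", "HW"), repeat=len(names)):
        # Each estimate is a (time, area) pair; totals are assumed additive.
        time = sum(estimates[n][c][0] for n, c in zip(names, choice))
        area = sum(estimates[n][c][1] for n, c in zip(names, choice))
        if time <= max_time and area <= max_area:
            return dict(zip(names, choice))
    return None

# Hypothetical (time, area) estimates for two tasks.
estimates = {
    "filter": {"SW": (8, 0), "HW": (2, 5)},
    "control": {"SW": (4, 0), "HW": (1, 4)},
}
solution = first_feasible_allocation(estimates, max_time=10, max_area=6)
print(solution)  # {'filter': 'SW', 'control': 'HW'}
```

With n tasks there are 2^n allocations to try in the worst case, which gives a feel for why the process is described as time consuming.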
Hw/Sw System Partitioning 
The reliability aspects add a significant number of parameters 
to the partitioning step for the selection of the final 
implementation, making this task too complex 
In order to cope with the complexity of the partitioning step 
when reliability goals are also included, a two-level approach 
is here proposed 
A first partitioning is performed which takes into account only the 
classical aspects and cost functions, meeting the usually stringent 
time constraints 
Given the first assessed solution, a second-level partitioning 
considers the additional reliability constraints, analyzes the 
possible approaches, within the set of defined methodologies 
which fulfill them, and provides the solution that has the best 
tradeoff (if it exists) 
Hw/Sw System Partitioning 
[Figure: reliability co-design flow. Recoverable labels: Specification (timing tags, reliability tags); constraints (timing, power, area, cost); Partitioning; Architecture (HW, SW, INT, O.S.); Initial solution; decision on whether reliability is required (yes/no); Partitioning of sections for reliability; Reliability model (strength: hard/soft; parameters: fault coverage, detection latency, area overhead, performance degradation); Specific solution architecture; Optimization; HW/SW synthesis; Solution with fault detection]
Hw/Sw System Partitioning 
The second-level partitioning problem consists of both
Reliability Model Identification 
Defining a criterion for the identification of the relation 
between the constrained procedure and the most suitable CED 
method 
Optimization 
Optimizing the result produced by the assignment criteria 
with respect to the global solution 
Hw/Sw System Partitioning 
Reliability Model Identification 
For each approach, an exact evaluation or a
qualitative estimate of each considered parameter is identified
Methodology  Fault Coverage  Detection Latency  Performance Degradation  Area Overhead
SCS          min/med/max     med/max            med/max                  med/max
A            min/med/max     min/med            med/max                  med/max
DP           100%            med/max            min/med                  med/max
VLIWS        100%            0                  med/max                  min
IFRC         100%            0                  0                        max
DMAC         100%            med/max            med/max                  max
VLIWH        100%            0                  0                        max
DCC          100%            med                med                      max
D            100%            0                  0                        max
TSCS         100%            med/max            med/max                  med/max
TSCD         100%            0                  0                        min/med
Hw/Sw System Partitioning 
Reliability Model Identification 
A crisp tag (100% fault coverage, 0 detection latency, etc.) 
represents a hard system constraint that has to be 
enforced at any cost 
A fuzzy tag (i.e. min, med, max) represents a soft system 
requirement that is a design directive of the required 
effort for the identification of anomalies during the device 
operational time 
Note that, for soft requirements, a maximum requirement 
includes methodologies belonging to the medium or minimum 
partitions; and a medium requirement includes minimum 
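The inclusion rule for soft requirements can be sketched as an ordering on the three fuzzy levels. The sketch only covers single-level ratings; how a range such as med/max is matched against a requirement is not stated in the slides and is left out. The function name is hypothetical.

```python
# Fuzzy levels in increasing order.
LEVELS = ("min", "med", "max")

def admitted_levels(requirement):
    """Levels of methodology that a soft requirement admits:
    the required level and every level below it."""
    return set(LEVELS[: LEVELS.index(requirement) + 1])

print(sorted(admitted_levels("med")))  # ['med', 'min']
```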
Hw/Sw System Partitioning 
Reliability Model Identification 
Crisp tags force a partition on the methodologies set 
In particular, 100% fault coverage induces the partitions
hard_fc and soft_fc, 0 detection latency induces the
partitions hard_dl and soft_dl, while 0 performance
degradation induces the partitions hard_pd and soft_pd
Since the applicability of a methodology to a specific 
procedure depends on its hardware/software 
characteristic, a further partition is induced 
Hw/Sw System Partitioning 
Reliability Model Identification 
By analyzing the properties of the methodologies, the 
following partitions are identified: 
swfc = { {IFRC, DP, DMAC, DCC, VLIWH, VLIWS} ; {A, SCS} } 
hwfc = { {TSCS, TSCD, D} ; {} } 
swdl = { {IFRC, VLIWH, VLIWS} ; {DP, DMAC, DCC, A, SCS} } 
hwdl = { {D, TSCD} ; {TSCS} } 
swpd = { {IFRC, VLIWH} ; {DMAC, DP, DCC, VLIWS, A, SCS} } 
hwpd = { {D, TSCD} ; {TSCS} } 
Hw/Sw System Partitioning 
Reliability Model Identification 
The second level partitioning takes into account the hard 
parameters first for selecting suitable CED techniques, and 
uses the soft parameters for selecting among them 
More precisely, for each critical procedure, on the basis of its 
allocation in hardware or in software, the partitions
fulfilling the hard/soft requirements are selected, and the 
intersection between them provides the set of suitable CED 
techniques 
The partitioning thus proceeds with the next critical 
procedure and moves toward the end of this local CED 
allocation analysis. At the end, all procedures are associated 
with a set of admissible CED implementations 
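The selection by intersection can be sketched in Python using the partitions listed earlier (swfc, hwfc, swdl, hwdl, swpd, hwpd). One simplification is assumed here: a parameter without a hard requirement places no restriction, whereas the method described above would further filter the soft partition by the requested fuzzy level. The names `PARTITIONS` and `suitable` are hypothetical.

```python
# (allocation, parameter) -> (hard partition, soft partition), from the slides.
PARTITIONS = {
    ("SW", "fc"): ({"IFRC", "DP", "DMAC", "DCC", "VLIWH", "VLIWS"}, {"A", "SCS"}),
    ("HW", "fc"): ({"TSCS", "TSCD", "D"}, set()),
    ("SW", "dl"): ({"IFRC", "VLIWH", "VLIWS"}, {"DP", "DMAC", "DCC", "A", "SCS"}),
    ("HW", "dl"): ({"D", "TSCD"}, {"TSCS"}),
    ("SW", "pd"): ({"IFRC", "VLIWH"}, {"DMAC", "DP", "DCC", "VLIWS", "A", "SCS"}),
    ("HW", "pd"): ({"D", "TSCD"}, {"TSCS"}),
}

def suitable(allocation, hard_params):
    """CED techniques admissible for one critical procedure.

    allocation is "SW" or "HW"; hard_params is the set of parameters
    ("fc", "dl", "pd") on which a crisp requirement is imposed.
    """
    result = None
    for param in ("fc", "dl", "pd"):
        hard, soft = PARTITIONS[(allocation, param)]
        candidates = hard if param in hard_params else hard | soft
        result = candidates if result is None else result & candidates
    return result

sw_choice = suitable("SW", {"fc", "dl"})
hw_choice = suitable("HW", {"fc", "dl", "pd"})
print(sorted(sw_choice))  # ['IFRC', 'VLIWH', 'VLIWS']
print(sorted(hw_choice))  # ['D', 'TSCD']
```

Repeating the call for every critical procedure yields the per-procedure sets of admissible CED implementations that the optimization step then works on.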
Hw/Sw System Partitioning 
Optimization 
The global solution determining for each procedure the CED 
technique actually adopted is pursued by means of a 
process of solution extraction and simulation, to verify that 
the constraints of the first partitioning are still met 
This process takes into account the fact that there are 
techniques with a global effect (such as IFRC, DP), which 
prevail over those with a local impact (A, SCS) 
As an optimization policy, the final solution does not 
include overlapped methods in order to achieve a 
significant efficiency 
A Case Study: 
a Reliable Pacemaker 
The goal of this case study is to co-design a reliable pacemaker 
able to detect any anomalies in its behavior due to physical 
faults in its components 
To achieve this goal, starting from the system-level
specification and following a reliable co-design flow, the
design space is explored, identifying an optimal partitioning
between hardware and software, validated through system-level
co-simulation
Hence, by taking into account the reliability requirements, the 
proper CED methodologies able to meet all the constraints are 
selected and then the one with the best cost-benefit tradeoff 
is identified and adopted for the final design 
A Case Study: 
a Reliable Pacemaker 
Behavioral analysis 
[Figure: electrocardiographic diagram showing the relevant timing parameters: LRL, PVARP, AEIr, BP, AVIr, CSW, AVI]
Typical values for each interval
Time Interval  Min-Max (ms)
PVARP          300-400
AEIr           0-400
BP             25
CSW            75
AVIr           100
A Case Study: 
a Reliable Pacemaker 
State Diagram 
[Figure: pacemaker state diagram. States: Start, PVARP, AEIr, BP, CSW, AVIr, AVIrp. Transition labels (event / actions):
Natural V / reset_timer, set_PVARP_timer
Natural V / set_AVIrp_timer
Natural A / reset_timer, set_BP_timer
time_out / reset_timer, set_AEIr_timer
time_out / Stimulated A, reset_timer, set_BP_timer
time_out / set_CSW_timer
time_out / reset_timer, set_AVIr_timer
time_out / Stimulated V, reset_timer, set_PVARP_timer]
A Case Study: 
a Reliable Pacemaker 
Timing Constraints 
Timing bounds for the intervals
State  Min-Max (ms)
PVARP  300-400
AEIr   300-800
BP     325-825
CSW    400-900
AVIr   500-1000
Other Constraints
The other constraints to be considered in the first-level
partitioning step are the classical ones: power dissipation,
area and cost
They must be kept as low as possible
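A check of a measured time against the timing bounds listed above can be sketched as follows. The bounds are copied from the slide; what exactly is measured for each state, and the reading of a "Max" outcome as an upper-bound violation, are assumptions made for illustration.

```python
# Timing bounds per state in milliseconds, from the slide.
BOUNDS_MS = {
    "PVARP": (300, 400),
    "AEIr": (300, 800),
    "BP": (325, 825),
    "CSW": (400, 900),
    "AVIr": (500, 1000),
}

def check_timing(state, elapsed_ms):
    """Return "OK", "Min" (lower bound violated) or "Max" (upper bound violated)."""
    low, high = BOUNDS_MS[state]
    if elapsed_ms < low:
        return "Min"
    if elapsed_ms > high:
        return "Max"
    return "OK"

print(check_timing("PVARP", 350))  # OK
print(check_timing("AEIr", 850))   # Max
```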
A Case Study: 
a Reliable Pacemaker 
Reliability Constraints 
Considering the criticality of the system for human
safety, a hard reliability constraint is imposed on the whole system
More in detail
100% fault coverage is required
Performance degradation is allowed as long as the timing constraints
are still met
Detection latency and area overhead must be kept as low as
possible
A Case Study: 
a Reliable Pacemaker 
System Level Specification: the Environment 
[Figure: system-level specification of the environment. Blocks: Main, Test bench, Environment, Heart, System, RTS[0], RTS[1]; legend: Channels, Calls]
The heart ... inside 
A Case Study: 
a Reliable Pacemaker 
System Level Specification: the System 
[Figure: system-level specification of the system. Blocks: System, Pacemaker, PVARP, AEIr, AVIr, Timeout[0], Timeout[1], Timeout[2][3][4]; legend: Channels, Calls]
A Case Study: 
a Reliable Pacemaker 
Timing and Reliability Requirements Specification 
PROC Pacemaker( CHAN OF BIT R; CHAN OF BIT V; CHAN OF BIT P;
                CHAN OF BIT A; CHAN OF BIT inh_R; CHAN OF BIT inh_P )
  BIT val:
  -- Main body
  SEQ
    R ? val
    WHILE (TRUE)
      SEQ
        TAG P1:
        PVARP[0]( R, V, P, A, inh_R, inh_P, val)
        TAG P2:
:
MINDELAY FROM P1 TO P2 IS 500 (MS):
MAXDELAY FROM P1 TO P2 IS 1000 (MS):
CS FROM P1 TO P2 IS GLOBAL:
A Case Study: 
a Reliable Pacemaker 
1st Level Partitioning 
TOSCA 
Embedded Ultra-Low Power Intel 486 GX 
Genetic Algorithm 
Communication Costs 
Procedures allocation and test results
Pacemaker  PVARP  AEIr  AVI  Timeout[0]  [1]  [2]  [3]  [4]  T1  T2       T3        T4        T5  T6
SW         SW     SW    SW   SW          SW   SW   SW   SW   OK  OK       OK        OK        OK  OK
SW         SW     SW    SW   HW          HW   HW   HW   HW   OK  OK       Max AVI   Max AEIr  OK  Max AVI, PVARP
SW         HW     HW    HW   SW          SW   SW   SW   SW   OK  Max AVI  Max AEIr  Max AEIr  OK  Max AVI
HW         HW     HW    HW   HW          HW   HW   HW   HW   OK  OK       OK        OK        OK  OK
Selected Solution
All-in-sw implementation (E486 16 MHz)
A Case Study: 
a Reliable Pacemaker 
2nd Level Partitioning
Reliability Constraints 
FC = 100% 
PD = medium 
DL = maximum 
A = maximum 
Partitions 
FC 100% 
– swfc = {hard_fc} = {IFRC, DP, DMAC, DCC, VLIWH, VLIWS} 
PD medium 
– swpd = {hard_pd; soft_pd} 
= {{IFRC, VLIWH };{DMAC, DP, DCC, VLIWS, A, SCS}} 
– swpd = {{IFRC, VLIWH };{DP}} 
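The candidate set for the pacemaker follows from intersecting the two selections above. The Python sketch below simply replays that set arithmetic with the partitions as given on this slide; the variable names are hypothetical.

```python
# Partitions as stated on the slide for the all-in-software pacemaker.
swfc_hard = {"IFRC", "DP", "DMAC", "DCC", "VLIWH", "VLIWS"}  # FC = 100%
swpd_hard = {"IFRC", "VLIWH"}                                # PD = 0
swpd_soft_medium = {"DP"}                                    # soft part kept for PD = medium

# Techniques that satisfy both the FC and the PD requirement.
potential = swfc_hard & (swpd_hard | swpd_soft_medium)
print(sorted(potential))  # ['DP', 'IFRC', 'VLIWH']
```

Whether each candidate still meets the first-level timing constraints is then decided by co-simulation, not by this set computation.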
A Case Study: 
a Reliable Pacemaker 
2nd Level Partitioning
Potential Solutions 
{IFRC, DP, VLIWH} 
Methodologies Comparison 
IFRC and VLIWH do not affect system behavior
DP requires co-simulation (Nominal, Checking, Checker) 
Test results 
T1  T2  T3        T4              T5  T6
OK  OK  Max AEIr  Max AVI, PVARP  OK  Max AEIr, PVARP
– The timing constraints aren’t met: the solution is discarded 
A Case Study: 
a Reliable Pacemaker 
Selected Solution 
The feasible solutions are IFRC and VLIWH 
These alternatives are characterized by the same area 
overhead and detection latency, so they are equivalent 
The designer, considering the particular aspects related to 
other steps of the co-design flow can make the final choice 
For example, the IFRC is applicable independently of the
number of reliable procedures while VLIWH requires a specific 
software synthesis step for each reliable procedure 
– The first solution has thus a cost that is independent of the 
number of critical sections, which is not true for VLIWH solutions 
– Since in the present case study all the system procedures are 
made reliable, the first architectural solution requires a lower 
effort and design cost and may be preferable 
A Case Study: 
a Reliable Pacemaker 
Selected Solution 
The final architectural solution for the reliable pacemaker 
[Figure: final architecture with CPU, Memory, CPU_chk, BUS Interface and Checker, I/O Interface]
The selected solution doesn't allow any significant back 
annotation to the first level partitioning, since the initial 
hw/sw partitioning achieved an acceptable all-in-software 
solution, loading all tasks efficiently on one processor 
Conclusions 
The resilience/reliability co-design project aims at 
integrating in a standard co-design flow the 
elements for achieving a final system able to 
autonomously detect the occurrence of faults during 
the operational life of the system 
The entire flow has been presented in this work, 
discussing the key elements of the proposed 
framework 
Specification 
Design Methodologies 
System Partitioning 
Conclusions 
Language specification extensions have been 
defined to specify reliability requirements 
A set of possible hw/sw architectural design 
methodologies has been analyzed considering the 
possibilities to implement any part of the complete 
system (nominal, checking and checker) either in 
hardware or in software 
A metric has been introduced taking into account 
the peculiar elements of reliability properties 
Conclusions 
A two-level hw/sw partitioning process has been 
defined, acting initially as a traditional approach to 
determine a valid solution, while the second step 
explores the alternatives taking into account the 
fault detection properties 
A case study shows the results of our work 
Further research efforts are directed toward the 
tuning of metrics with respect to the selected suite 
of design methodologies, to better support the 
partitioning step 

Configuration Management in Software Engineering - SE29Configuration Management in Software Engineering - SE29
Configuration Management in Software Engineering - SE29
 
Ch21 real time software engineering
Ch21 real time software engineeringCh21 real time software engineering
Ch21 real time software engineering
 
Ian Sommerville, Software Engineering, 9th Edition Ch 4
Ian Sommerville,  Software Engineering, 9th Edition Ch 4Ian Sommerville,  Software Engineering, 9th Edition Ch 4
Ian Sommerville, Software Engineering, 9th Edition Ch 4
 
Buys2011a
Buys2011aBuys2011a
Buys2011a
 
10. Software testing overview
10. Software testing overview10. Software testing overview
10. Software testing overview
 
Software Evolution and Maintenance Models
Software Evolution and Maintenance ModelsSoftware Evolution and Maintenance Models
Software Evolution and Maintenance Models
 
Software Quality Assurance
Software Quality AssuranceSoftware Quality Assurance
Software Quality Assurance
 
Computer Programming For Power Systems Analysts.
Computer Programming For Power Systems Analysts.Computer Programming For Power Systems Analysts.
Computer Programming For Power Systems Analysts.
 
Thornton_Test-Tech
Thornton_Test-TechThornton_Test-Tech
Thornton_Test-Tech
 

Similaire à SERENE 2014 School: Luigi pomante serene2014_school

D pduapi user-manual
D pduapi user-manualD pduapi user-manual
D pduapi user-manuallinhdoanbro
 
Slides 6 design of sw arch using add
Slides 6 design of sw arch using addSlides 6 design of sw arch using add
Slides 6 design of sw arch using addJavid iqbal hashmi
 
V&V Lessons Learnt under multiple Standards
V&V Lessons Learnt under multiple StandardsV&V Lessons Learnt under multiple Standards
V&V Lessons Learnt under multiple StandardsOak Systems
 
Nishar_Resume
Nishar_ResumeNishar_Resume
Nishar_ResumeMD NISHAR
 
System Center Configuration Manager 2012 Overview
System Center Configuration Manager 2012 OverviewSystem Center Configuration Manager 2012 Overview
System Center Configuration Manager 2012 OverviewAmit Gatenyo
 
IRJET- Secure Scheme For Cloud-Based Multimedia Content Storage
IRJET-  	  Secure Scheme For Cloud-Based Multimedia Content StorageIRJET-  	  Secure Scheme For Cloud-Based Multimedia Content Storage
IRJET- Secure Scheme For Cloud-Based Multimedia Content StorageIRJET Journal
 
Server Emulator and Virtualizer for Next-Generation Rack Servers
Server Emulator and Virtualizer for Next-Generation Rack ServersServer Emulator and Virtualizer for Next-Generation Rack Servers
Server Emulator and Virtualizer for Next-Generation Rack ServersIRJET Journal
 
DOTNET 2013 IEEE MOBILECOMPUTING PROJECT Model based analysis of wireless sys...
DOTNET 2013 IEEE MOBILECOMPUTING PROJECT Model based analysis of wireless sys...DOTNET 2013 IEEE MOBILECOMPUTING PROJECT Model based analysis of wireless sys...
DOTNET 2013 IEEE MOBILECOMPUTING PROJECT Model based analysis of wireless sys...IEEEGLOBALSOFTTECHNOLOGIES
 
Michael_Joshua_Validation
Michael_Joshua_ValidationMichael_Joshua_Validation
Michael_Joshua_ValidationMichaelJoshua
 
IRJET- Research Study on Testing Mantle in SDLC
IRJET- Research Study on Testing Mantle in SDLCIRJET- Research Study on Testing Mantle in SDLC
IRJET- Research Study on Testing Mantle in SDLCIRJET Journal
 
Surekha_haoop_exp
Surekha_haoop_expSurekha_haoop_exp
Surekha_haoop_expsurekhakadi
 
Software Architecture: introduction to the abstraction
Software Architecture: introduction to the abstractionSoftware Architecture: introduction to the abstraction
Software Architecture: introduction to the abstractionHenry Muccini
 
SIMULATION-BASED APPLICATION SOFTWARE DEVELOPMENT IN TIME-TRIGGERED COMMUNICA...
SIMULATION-BASED APPLICATION SOFTWARE DEVELOPMENT IN TIME-TRIGGERED COMMUNICA...SIMULATION-BASED APPLICATION SOFTWARE DEVELOPMENT IN TIME-TRIGGERED COMMUNICA...
SIMULATION-BASED APPLICATION SOFTWARE DEVELOPMENT IN TIME-TRIGGERED COMMUNICA...IJSEA
 
Brian muirhead v1-27-12
Brian muirhead v1-27-12Brian muirhead v1-27-12
Brian muirhead v1-27-12NASAPMC
 
Software engineering lecture notes
Software engineering   lecture notesSoftware engineering   lecture notes
Software engineering lecture notesAmmar Shafiq
 

Similaire à SERENE 2014 School: Luigi pomante serene2014_school (20)

Bhavani HS
Bhavani HSBhavani HS
Bhavani HS
 
D pduapi user-manual
D pduapi user-manualD pduapi user-manual
D pduapi user-manual
 
Slides 6 design of sw arch using add
Slides 6 design of sw arch using addSlides 6 design of sw arch using add
Slides 6 design of sw arch using add
 
Ia rm001 -en-p
Ia rm001 -en-pIa rm001 -en-p
Ia rm001 -en-p
 
V&V Lessons Learnt under multiple Standards
V&V Lessons Learnt under multiple StandardsV&V Lessons Learnt under multiple Standards
V&V Lessons Learnt under multiple Standards
 
Nishar_Resume
Nishar_ResumeNishar_Resume
Nishar_Resume
 
System Center Configuration Manager 2012 Overview
System Center Configuration Manager 2012 OverviewSystem Center Configuration Manager 2012 Overview
System Center Configuration Manager 2012 Overview
 
IRJET- Secure Scheme For Cloud-Based Multimedia Content Storage
IRJET-  	  Secure Scheme For Cloud-Based Multimedia Content StorageIRJET-  	  Secure Scheme For Cloud-Based Multimedia Content Storage
IRJET- Secure Scheme For Cloud-Based Multimedia Content Storage
 
Server Emulator and Virtualizer for Next-Generation Rack Servers
Server Emulator and Virtualizer for Next-Generation Rack ServersServer Emulator and Virtualizer for Next-Generation Rack Servers
Server Emulator and Virtualizer for Next-Generation Rack Servers
 
DOTNET 2013 IEEE MOBILECOMPUTING PROJECT Model based analysis of wireless sys...
DOTNET 2013 IEEE MOBILECOMPUTING PROJECT Model based analysis of wireless sys...DOTNET 2013 IEEE MOBILECOMPUTING PROJECT Model based analysis of wireless sys...
DOTNET 2013 IEEE MOBILECOMPUTING PROJECT Model based analysis of wireless sys...
 
13431758.ppt
13431758.ppt13431758.ppt
13431758.ppt
 
Michael_Joshua_Validation
Michael_Joshua_ValidationMichael_Joshua_Validation
Michael_Joshua_Validation
 
IRJET- Research Study on Testing Mantle in SDLC
IRJET- Research Study on Testing Mantle in SDLCIRJET- Research Study on Testing Mantle in SDLC
IRJET- Research Study on Testing Mantle in SDLC
 
Surekha_haoop_exp
Surekha_haoop_expSurekha_haoop_exp
Surekha_haoop_exp
 
Software Architecture: introduction to the abstraction
Software Architecture: introduction to the abstractionSoftware Architecture: introduction to the abstraction
Software Architecture: introduction to the abstraction
 
4213ijsea06
4213ijsea064213ijsea06
4213ijsea06
 
SIMULATION-BASED APPLICATION SOFTWARE DEVELOPMENT IN TIME-TRIGGERED COMMUNICA...
SIMULATION-BASED APPLICATION SOFTWARE DEVELOPMENT IN TIME-TRIGGERED COMMUNICA...SIMULATION-BASED APPLICATION SOFTWARE DEVELOPMENT IN TIME-TRIGGERED COMMUNICA...
SIMULATION-BASED APPLICATION SOFTWARE DEVELOPMENT IN TIME-TRIGGERED COMMUNICA...
 
Brian muirhead v1-27-12
Brian muirhead v1-27-12Brian muirhead v1-27-12
Brian muirhead v1-27-12
 
Chapter1
Chapter1Chapter1
Chapter1
 
Software engineering lecture notes
Software engineering   lecture notesSoftware engineering   lecture notes
Software engineering lecture notes
 

Plus de Henry Muccini

Human Behaviour Centred Design
Human Behaviour Centred Design Human Behaviour Centred Design
Human Behaviour Centred Design Henry Muccini
 
How cultural heritage, cyber-physical spaces, and software engineering can wo...
How cultural heritage, cyber-physical spaces, and software engineering can wo...How cultural heritage, cyber-physical spaces, and software engineering can wo...
How cultural heritage, cyber-physical spaces, and software engineering can wo...Henry Muccini
 
La gestione dell’utenza numerosa - dalle Segreterie, ai Musei, alle Segreterie
La gestione dell’utenza numerosa - dalle Segreterie, ai Musei, alle SegreterieLa gestione dell’utenza numerosa - dalle Segreterie, ai Musei, alle Segreterie
La gestione dell’utenza numerosa - dalle Segreterie, ai Musei, alle SegreterieHenry Muccini
 
Turismo 4.0: l'ICT a supporto del turismo sostenibile
Turismo 4.0: l'ICT a supporto del turismo sostenibileTurismo 4.0: l'ICT a supporto del turismo sostenibile
Turismo 4.0: l'ICT a supporto del turismo sostenibileHenry Muccini
 
Sustainable Tourism - IoT and crowd management
Sustainable Tourism - IoT and crowd managementSustainable Tourism - IoT and crowd management
Sustainable Tourism - IoT and crowd managementHenry Muccini
 
Software Engineering at the age of the Internet of Things
Software Engineering at the age of the Internet of ThingsSoftware Engineering at the age of the Internet of Things
Software Engineering at the age of the Internet of ThingsHenry Muccini
 
The influence of Group Decision Making on Architecture Design Decisions
The influence of Group Decision Making on Architecture Design DecisionsThe influence of Group Decision Making on Architecture Design Decisions
The influence of Group Decision Making on Architecture Design DecisionsHenry Muccini
 
An IoT Software Architecture for an Evacuable Building Architecture
An IoT Software Architecture for an Evacuable Building ArchitectureAn IoT Software Architecture for an Evacuable Building Architecture
An IoT Software Architecture for an Evacuable Building ArchitectureHenry Muccini
 
Web Engineering L8: User-centered Design (8/8)
Web Engineering L8: User-centered Design (8/8)Web Engineering L8: User-centered Design (8/8)
Web Engineering L8: User-centered Design (8/8)Henry Muccini
 
Web Engineering L7: Sequence Diagrams and Design Decisions (7/8)
Web Engineering L7: Sequence Diagrams and Design Decisions (7/8)Web Engineering L7: Sequence Diagrams and Design Decisions (7/8)
Web Engineering L7: Sequence Diagrams and Design Decisions (7/8)Henry Muccini
 
Web Engineering L6: Software Architecture for the Web (6/8)
Web Engineering L6: Software Architecture for the Web (6/8)Web Engineering L6: Software Architecture for the Web (6/8)
Web Engineering L6: Software Architecture for the Web (6/8)Henry Muccini
 
Web Engineering L5: Content Model (5/8)
Web Engineering L5: Content Model (5/8)Web Engineering L5: Content Model (5/8)
Web Engineering L5: Content Model (5/8)Henry Muccini
 
Web Engineering L3: Project Planning (3/8)
Web Engineering L3: Project Planning (3/8)Web Engineering L3: Project Planning (3/8)
Web Engineering L3: Project Planning (3/8)Henry Muccini
 
Web Engineering L2: Requirements Elicitation for the Web (2/8)
Web Engineering L2: Requirements Elicitation for the Web (2/8)Web Engineering L2: Requirements Elicitation for the Web (2/8)
Web Engineering L2: Requirements Elicitation for the Web (2/8)Henry Muccini
 
Web Engineering L1: introduction to Web Engineering (1/8)
Web Engineering L1: introduction to Web Engineering (1/8)Web Engineering L1: introduction to Web Engineering (1/8)
Web Engineering L1: introduction to Web Engineering (1/8)Henry Muccini
 
Web Engineering L4: Requirements and Planning in concrete (4/8)
Web Engineering L4: Requirements and Planning in concrete (4/8)Web Engineering L4: Requirements and Planning in concrete (4/8)
Web Engineering L4: Requirements and Planning in concrete (4/8)Henry Muccini
 
Collaborative aspects of Decision Making and its impact on Sustainability
Collaborative aspects of Decision Making and its impact on SustainabilityCollaborative aspects of Decision Making and its impact on Sustainability
Collaborative aspects of Decision Making and its impact on SustainabilityHenry Muccini
 
Engineering Cyber Physical Spaces
Engineering Cyber Physical SpacesEngineering Cyber Physical Spaces
Engineering Cyber Physical SpacesHenry Muccini
 
I progetti UnivAq-UFFIZI, INCIPICT, e  CUSPIS
I progetti UnivAq-UFFIZI, INCIPICT, e  CUSPISI progetti UnivAq-UFFIZI, INCIPICT, e  CUSPIS
I progetti UnivAq-UFFIZI, INCIPICT, e  CUSPISHenry Muccini
 
Exploring the Temporal Aspects of Software Architecture
Exploring the Temporal Aspects of Software ArchitectureExploring the Temporal Aspects of Software Architecture
Exploring the Temporal Aspects of Software ArchitectureHenry Muccini
 

Plus de Henry Muccini (20)

Human Behaviour Centred Design
Human Behaviour Centred Design Human Behaviour Centred Design
Human Behaviour Centred Design
 
How cultural heritage, cyber-physical spaces, and software engineering can wo...
How cultural heritage, cyber-physical spaces, and software engineering can wo...How cultural heritage, cyber-physical spaces, and software engineering can wo...
How cultural heritage, cyber-physical spaces, and software engineering can wo...
 
La gestione dell’utenza numerosa - dalle Segreterie, ai Musei, alle Segreterie
La gestione dell’utenza numerosa - dalle Segreterie, ai Musei, alle SegreterieLa gestione dell’utenza numerosa - dalle Segreterie, ai Musei, alle Segreterie
La gestione dell’utenza numerosa - dalle Segreterie, ai Musei, alle Segreterie
 
Turismo 4.0: l'ICT a supporto del turismo sostenibile
Turismo 4.0: l'ICT a supporto del turismo sostenibileTurismo 4.0: l'ICT a supporto del turismo sostenibile
Turismo 4.0: l'ICT a supporto del turismo sostenibile
 
Sustainable Tourism - IoT and crowd management
Sustainable Tourism - IoT and crowd managementSustainable Tourism - IoT and crowd management
Sustainable Tourism - IoT and crowd management
 
Software Engineering at the age of the Internet of Things
Software Engineering at the age of the Internet of ThingsSoftware Engineering at the age of the Internet of Things
Software Engineering at the age of the Internet of Things
 
The influence of Group Decision Making on Architecture Design Decisions
The influence of Group Decision Making on Architecture Design DecisionsThe influence of Group Decision Making on Architecture Design Decisions
The influence of Group Decision Making on Architecture Design Decisions
 
An IoT Software Architecture for an Evacuable Building Architecture
An IoT Software Architecture for an Evacuable Building ArchitectureAn IoT Software Architecture for an Evacuable Building Architecture
An IoT Software Architecture for an Evacuable Building Architecture
 
Web Engineering L8: User-centered Design (8/8)
Web Engineering L8: User-centered Design (8/8)Web Engineering L8: User-centered Design (8/8)
Web Engineering L8: User-centered Design (8/8)
 
Web Engineering L7: Sequence Diagrams and Design Decisions (7/8)
Web Engineering L7: Sequence Diagrams and Design Decisions (7/8)Web Engineering L7: Sequence Diagrams and Design Decisions (7/8)
Web Engineering L7: Sequence Diagrams and Design Decisions (7/8)
 
Web Engineering L6: Software Architecture for the Web (6/8)
Web Engineering L6: Software Architecture for the Web (6/8)Web Engineering L6: Software Architecture for the Web (6/8)
Web Engineering L6: Software Architecture for the Web (6/8)
 
Web Engineering L5: Content Model (5/8)
Web Engineering L5: Content Model (5/8)Web Engineering L5: Content Model (5/8)
Web Engineering L5: Content Model (5/8)
 
Web Engineering L3: Project Planning (3/8)
Web Engineering L3: Project Planning (3/8)Web Engineering L3: Project Planning (3/8)
Web Engineering L3: Project Planning (3/8)
 
Web Engineering L2: Requirements Elicitation for the Web (2/8)
Web Engineering L2: Requirements Elicitation for the Web (2/8)Web Engineering L2: Requirements Elicitation for the Web (2/8)
Web Engineering L2: Requirements Elicitation for the Web (2/8)
 
Web Engineering L1: introduction to Web Engineering (1/8)
Web Engineering L1: introduction to Web Engineering (1/8)Web Engineering L1: introduction to Web Engineering (1/8)
Web Engineering L1: introduction to Web Engineering (1/8)
 
Web Engineering L4: Requirements and Planning in concrete (4/8)
Web Engineering L4: Requirements and Planning in concrete (4/8)Web Engineering L4: Requirements and Planning in concrete (4/8)
Web Engineering L4: Requirements and Planning in concrete (4/8)
 
Collaborative aspects of Decision Making and its impact on Sustainability
Collaborative aspects of Decision Making and its impact on SustainabilityCollaborative aspects of Decision Making and its impact on Sustainability
Collaborative aspects of Decision Making and its impact on Sustainability
 
Engineering Cyber Physical Spaces
Engineering Cyber Physical SpacesEngineering Cyber Physical Spaces
Engineering Cyber Physical Spaces
 
I progetti UnivAq-UFFIZI, INCIPICT, e  CUSPIS
I progetti UnivAq-UFFIZI, INCIPICT, e  CUSPISI progetti UnivAq-UFFIZI, INCIPICT, e  CUSPIS
I progetti UnivAq-UFFIZI, INCIPICT, e  CUSPIS
 
Exploring the Temporal Aspects of Software Architecture
Exploring the Temporal Aspects of Software ArchitectureExploring the Temporal Aspects of Software Architecture
Exploring the Temporal Aspects of Software Architecture
 

SERENE 2014 School: Luigi pomante serene2014_school

  • 1. SERENE'14 Autumn School: Engineering Resilient Cyber Physical Systems. System-Level Concurrent Error Detection. Dr. Luigi Pomante, Università degli Studi dell'Aquila, Center of Excellence DEWS, luigi.pomante@univaq.it
  • 2. Introduction Resilience Reliability Fault Tolerance Concurrent Error Detection System Level CED - 2 - © 2014 - Luigi Pomante
  • 3. Introduction: Error detection is one of the basic features needed to support reliability, and in turn resilience, in CPS. This talk therefore focuses on error detection issues in the cyber part of a CPS. Such a part is normally a customized digital electronic system with an ad-hoc hw/sw architecture, typically embedded in a more complex heterogeneous system that interacts heavily with physical processes.
  • 4. Introduction: Error Detection Methodologies: off-line vs. concurrent. System-Level Design Methodologies: system-level specification (functional characterization of the system without dealing with implementation aspects; specification of implementation objectives and constraints: timing, power consumption, area); estimation of the influence of different alternatives on the final implementation (HW/SW system composition; different processors and/or alternative technologies).
  • 5. Introduction: Typically, system resilience/reliability aspects are neglected at the higher levels of the system synthesis process. They are postponed to lower abstraction levels, but the use of resilience/reliability methodologies can significantly impact timing, energy and area. It is therefore necessary to move these aspects toward the upper levels of the synthesis flow by adding the resilience/reliability constraint to the classical cost parameters. This work investigates the problem of adopting design for reliability/resilience approaches at system level, when all the solutions are still open for the implementation of the device, and presents a set of design methodologies that provide concurrent error detection (CED) properties to the final implementation.
  • 6. Goal: This wide resilience/reliability co-design project covers the following aspects: specification of systems in a co-design environment supporting resilience/reliability constraints; design methodologies providing the desired CED properties; hw/sw system partitioning on the basis of metrics that take into account both traditional co-design issues and resilience/reliability constraints.
  • 7. Overview: Problem Definition; Target System Architecture; Fault Model; System Specification; Design Methodologies for Reliability; Design Analysis and Metrics; Hw/Sw System Partitioning; A Case Study: a Reliable Pacemaker.
  • 8. Problem Definition: A Section is a subset of the system specification. A Critical Section is a section where the CED property is required. A Reliable Section is a critical section that propagates either error-free critical results or faulty critical results associated with an error indication.
  • 9. Problem Definition: The underlying assumption is that the input data processed by the reliable section is error free. The upstream sections either provide correct data by definition or are designed to be reliable themselves. The downstream sections either also need to be designed to be reliable, or no reliability constraint applies to them. In the former case reliability is extended to all downstream elements; in the latter the property has a purely local effect.
  • 10. Problem Definition: To define these two characterizations formally, the following definitions are introduced. Local Reliability: the Local Reliability property of a critical section specifies that the reliability constraints involve only the related critical section. Global Reliability: the Global Reliability property of a critical section specifies that the reliability constraints involve the related section and, recursively, all the downstream sections.
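The two definitions above can be read as a reachability question on the data-flow graph of the specification. The following is a minimal sketch under that assumption; the function name `constrained_sections`, the dictionary representation of the graph, and the section names S1 to S5 are hypothetical and do not come from the slides.

```python
from collections import deque


def constrained_sections(graph, critical, kind):
    """Return the set of sections covered by a reliability constraint.

    graph    : dict mapping a section name to the sections it feeds (downstream)
    critical : name of the critical section
    kind     : "LOCAL" or "GLOBAL"
    """
    if kind == "LOCAL":
        # Local reliability: only the critical section itself is involved.
        return {critical}
    if kind != "GLOBAL":
        raise ValueError(f"unknown reliability kind: {kind}")

    # Global reliability: the critical section plus, recursively,
    # every section reachable downstream of it (breadth-first visit).
    covered = {critical}
    queue = deque([critical])
    while queue:
        section = queue.popleft()
        for successor in graph.get(section, ()):
            if successor not in covered:
                covered.add(successor)
                queue.append(successor)
    return covered


# Hypothetical data-flow graph: S1 feeds S2, S2 feeds S3 and S4, S4 feeds S5.
example_graph = {"S1": ["S2"], "S2": ["S3", "S4"], "S4": ["S5"]}
local_cover = constrained_sections(example_graph, "S2", "LOCAL")
global_cover = constrained_sections(example_graph, "S2", "GLOBAL")
```

With this graph, a local constraint on S2 covers S2 only, while a global constraint also pulls in S3, S4 and S5; the upstream section S1 is never included, in line with the assumption that inputs are error free.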
  • 11. Problem Definition: Local and Global Reliability Specification. [Figure: two diagrams of connected sections labelled A to E.] Local reliability on B: the data provided to A are reliable. Global reliability on B: the data provided to A and B are reliable.
  • 12. Problem Definition: Two kinds of reliability are needed because a specification may also include the environment description, which does not need any reliability property, or a set of functionalities of which only one must be reliable. For example, a digital control system specification for a car could include tachometer, temperature and ABS control, with reliability needed only for the ABS. To specify which sections must be reliable and which kind of reliability is desired, dedicated system-level specification languages (or proper extensions to existing ones) are required.
  • 13. System Specification: Two languages have been considered for system specification: Occam II and SystemC. The first has been selected because the TOSCA environment (a co-design environment for embedded systems), used in our studies to verify the proposed approaches, is based on it. The second is becoming increasingly popular for system-level specification, which makes its adoption almost a requirement when pursuing the integration of the proposed approaches in a real design flow.
  • 14. System Specification: Reliability constraints in Occam II. The language has been extended with statements for identifying critical sections, to be added to the standard constraint definition section: CS FROM label1 TO label2 IS LOCAL (GLOBAL). The example on the slide annotates the declaration of a communication channel, the tag definitions, the timing constraints, and the reliability constraint:
      INT a,b
      CHAN OF INT in,out:
      TAG A: SEQ
        a:=0
        WHILE TRUE
          TAG B: SEQ
            a:=a+1
            out ! a
            TAG C: in ? b
            a:=a+b
      TAG D:
      MAXDELAY FROM B TO C IS 10:
      MAXRATE OF B IS 100:
      CS FROM A TO D IS LOCAL:
  • 15. System Specification: Reliability constraints in SystemC. The language allows an intervention at different abstraction levels: module or process. At module level, reliability constraints are imposed by extending the basic class through inheritance (SC_MODULE_GCS, SC_MODULE_LCS); a reliability constraint imposed on a module applies directly to all processes included in the module. At process level, additional macros specify critical sections and the local/global reliability constraint (SC_GCS, SC_LCS).
  • 16. Target System Architecture: The reference architecture consists of the basic processor block (either general purpose or DSP), which executes software processes, main memory, and a set of co-processors (ASIC or FPGA) implementing hardware functionalities if required. Communication between hardware modules uses the available bus, memory otherwise. [Figure: block diagram with CPU, Memory, I/O Interface, and Co-Processors.]
  • 17. Fault Model: The adopted fault model is the Single Functional Failure, where any number of physical faults causes a functional module to perform incorrectly. The considered faults affect the hardware structure of the system, undermining the behavior of the software too, but no software failures are considered in this work. The modules that may fail are thus the main processor, the co-processors, the main memory, the system bus, and the dedicated channels for communication between hardware modules. This single failure model is based on a commonly adopted hypothesis: a module failure is detected before another module fails.
  • 18. Design Methodologies for Reliability: The resilience/reliability project has investigated design methodologies for guaranteeing error detection capabilities based on redundancy strategies (architectural and information redundancy). The methodologies that have been analyzed and developed can be classified on the basis of: the functionality to be performed and controlled (data processing or communication); the partitions involved (HW or SW); the CED techniques adopted for guaranteeing the reliability properties.
  • 19. Design Methodologies for Reliability: The design approach takes as its basic element any functionality that the system must provide in a reliable way. Nominal (N) denotes this basic element. Checking (C) identifies the redundant functional elements designed to provide error detection capabilities. Checker (CK) is the functional element that detects a mismatch between the behavior of N and C due to failures. Each of these three elements (N, C and CK) can be independently implemented in hardware or in software, leading to several classes of methodologies.
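As an illustration of how N, C and CK fit together, here is a minimal all-software sketch based on duplication with comparison. It assumes that the checking element recomputes the same function in a different way and that the checker is a plain equality test; the names `reliable_section`, `add_nominal`, `add_checking` and `add_faulty` are hypothetical, and the bit flip only simulates a hardware fault.

```python
def reliable_section(nominal, checking, inputs):
    """Run the nominal (N) and checking (C) elements and compare them (CK).

    Returns (result, error): the nominal result plus an error indication,
    i.e. either an error-free result or a faulty one flagged as such.
    """
    result = nominal(*inputs)        # N: the functionality to be protected
    reference = checking(*inputs)    # C: redundant computation
    error = result != reference      # CK: mismatch detection
    return result, error


def add_nominal(a, b):
    return a + b


def add_checking(a, b):
    # Same function computed differently (an assumed form of redundancy).
    return a - (-b)


def add_faulty(a, b):
    # Simulated fault: bit 2 of the nominal result is flipped.
    return (a + b) ^ 0b100


ok = reliable_section(add_nominal, add_checking, (2, 3))
bad = reliable_section(add_faulty, add_checking, (2, 3))
```

Here `ok` carries the correct sum with no error indication, while `bad` carries a wrong value together with the error flag, which is the behavior required of a reliable section.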
  • 20. Design Methodologies for Reliability: Reliable Data Processing. [Figure: Nominal Architecture, Checking Architecture and Checker, each implemented in Hw or Sw.]
      Solution  Nominal  Checker  Checking
      1         SW       SW       SW
      2         SW       HW       SW
      3         SW       SW       HW
      4         SW       HW       HW
      5         HW       SW       SW
      6         HW       HW       SW
      7         HW       SW       HW
      8         HW       HW       HW
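The eight solutions are simply all HW/SW combinations of the three elements. A short sketch that regenerates the numbering used in the table follows; the function name `enumerate_classes` and the dictionary layout are hypothetical. Matching the table order requires iterating with Nominal as the slowest-changing element, then Checking, then Checker.

```python
from itertools import product


def enumerate_classes():
    """Enumerate the 8 HW/SW implementation classes, numbered as in the table."""
    classes = {}
    combos = product(("SW", "HW"), repeat=3)
    for index, (nominal, checking, checker) in enumerate(combos, start=1):
        classes[index] = {
            "nominal": nominal,
            "checker": checker,
            "checking": checking,
        }
    return classes


classes = enumerate_classes()
for number, impl in classes.items():
    print(number, impl["nominal"], impl["checker"], impl["checking"])
```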
  • 21. Design Methodologies for Reliability: Reliable Data Processing. Class 1 (SW Nominal, SW Checker, SW Checking): Self-Checking SW; Assertions; Dual-Processor Checking; VLIW Checking. Class 2 (SW Nominal, HW Checker, SW Checking): Interface for Functional Redundancy Check; DMA Checker; VLIW Checking with HW Checker.
  • 22. Design Methodologies for Reliability: Reliable Data Processing. Class 4 (SW Nominal, HW Checker, HW Checking): Dynamically Re-Configurable Checker. Class 8 (HW Nominal, HW Checker, HW Checking): Device Duplication; TSC Scheduling; TSC Devices.
  • 23. Design Methodologies for Reliability: Reliable Communications. It is necessary to guarantee that any fault on communication lines is detected. Either hardware redundancy (line duplication) or information redundancy (data encoding) can be adopted. Two cases should be considered: communications between procedures implemented in HW, and other kinds of communication (SW-SW, SW-HW, HW-SW).
  • 24. Design Methodologies for Reliability: Reliable Communications. Communications between procedures implemented in HW: a pair of HW sections communicates by means of dedicated lines, so the choice is between line duplication and data encoding. Other kinds of communication: when the communication involves a SW section, it makes use of the system bus, and the only viable solution is the use of error detection codes. The best results are obtained by keeping the data in memory in coded form and letting the CPU work only with non-coded data, using one HW TSC Encoder/Decoder/ChecKer for the processor and one (or more) for the HW devices.
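The idea of keeping coded data in memory while the processor works on plain data can be sketched with the simplest error detection code, a single even-parity bit. The slides do not say which code is used, so the choice of parity, the 8-bit word width, and the names `encode`, `decode` and `memory` are assumptions made only for illustration; in the proposed architecture this job is done by hardware TSC encoder/decoder/checker blocks, not by software.

```python
def encode(word, width=8):
    """Encoder: append an even-parity bit as the least significant bit."""
    if not 0 <= word < (1 << width):
        raise ValueError("word does not fit in the given width")
    parity = bin(word).count("1") % 2
    return (word << 1) | parity


def decode(codeword):
    """Decoder + checker: return (data, error) for a stored codeword."""
    # A valid codeword always has an even number of 1 bits.
    error = bin(codeword).count("1") % 2 == 1
    return codeword >> 1, error


# Memory holds coded data; the "CPU side" only ever sees decoded words.
memory = {0x10: encode(0b10110010)}
data, error = decode(memory[0x10])

# Simulated fault on the bus or in memory: one bit of the codeword flips.
corrupted = memory[0x10] ^ (1 << 3)
bad_data, bad_error = decode(corrupted)
```

A single parity bit detects any odd number of flipped bits but misses even-sized errors, so a real design would pick the code according to the required coverage.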
  • 25. Design Methodologies for Reliability: Reliable Communications. Architecture with reliable communications. [Figure: block diagram with CPU, Memory (Coded Data), three TSC EDCK blocks, one TSC CK block, I/O Interface, and Co-Processors.]
  • 26. Design Analysis and Metrics: All the methodologies have been analyzed in detail in order to highlight the main design issues and to evaluate benefits and costs. The design issues have been analyzed qualitatively according to a reference schema, in order to quickly show the main differences between approaches. Benefits and costs have been analyzed by defining a set of significant parameters. These are the basic elements needed to build metrics for comparing the quality of different solutions, and such metrics play an important role in the partitioning step.
  • 27. Design Analysis and Metrics: Design issues reference schema, key concepts: selection of number and typology of processing elements; detection of the need for a special architecture; analysis of synchronization issues between processing elements; analysis of possible sharing of physical and logical resources; detection of modification needs of the original specification; selection of the execution policies for each processing element; allocation of the checker memory space; selection of the checking policies; analysis of the checker structure and complexity; selection of a mechanism that enables the checker to raise exceptions to report error detection.
  • 28. Design Analysis and Metrics: Benefits and Costs. Let us define the Efficiency of a given methodology as its characterization with respect to three factors. Coverage: the percentage of functional faults that can be detected with respect to the complete fault set. Detection Latency (DL): the time between the instant a fault causes an error and the instant the error is detected. Performance Degradation (PD): related to the overhead (i.e., additional execution time) caused by fault detection tasks with respect to the original system.
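The three efficiency factors translate directly into small formulas. The sketch below assumes that coverage is measured over an explicit fault set (for example from a fault-injection campaign) and that performance degradation is expressed as the relative increase in execution time; the slides only say PD is "related to" the overhead, so that normalization, like the function names, is an assumption.

```python
def coverage(detected_faults, total_faults):
    """Percentage of faults detected with respect to the complete fault set."""
    if total_faults <= 0:
        raise ValueError("the fault set must not be empty")
    return 100.0 * detected_faults / total_faults


def detection_latency(error_time, detection_time):
    """Time between the instant an error appears and the instant it is detected."""
    if detection_time < error_time:
        raise ValueError("an error cannot be detected before it occurs")
    return detection_time - error_time


def performance_degradation(original_time, ced_time):
    """Additional execution time due to detection, relative to the original system."""
    if original_time <= 0:
        raise ValueError("original execution time must be positive")
    return (ced_time - original_time) / original_time


# Illustrative numbers only, not taken from the slides.
cov = coverage(detected_faults=95, total_faults=100)
dl = detection_latency(error_time=120, detection_time=150)
pd = performance_degradation(original_time=200.0, ced_time=250.0)
```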
  • 29. Design Analysis and Metrics: Benefits and Costs
      – Let us define the Cost of a given solution as its overhead with respect to the original system:
      – Physical Cost (Cp): the cost of the physical components added to the original architecture
      – Design Cost (Cd): the effort needed to design and implement a given solution
  • 30. Hw/Sw System Partitioning
      – Once the system, the constraints, and the set of possible design solutions are specified, the partitioning step selects the implementation of each task, either hardware or software
      – The solution obtained is checked against the designer's constraints. If they are met, the solution is accepted; otherwise the process backtracks and another allocation is pursued
      – This process is extremely complex and time consuming because of the large number of possible alternatives. Moreover, although heuristics and tuned estimation functions have been defined, only the final co-simulation of the suggested system implementation confirms whether it is a solution
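The select, check, and backtrack loop described on this slide can be sketched as a small depth-first search. This is only an illustration under stated assumptions: it enumerates allocations exhaustively instead of using the heuristics the slide mentions, `meets_constraints` is a hypothetical stand-in for the final co-simulation, and the task names and cost figures are invented.

```python
def partition(tasks, meets_constraints, allocation=None):
    """Assign each task to SW or HW, backtracking when constraints fail."""
    allocation = dict(allocation or {})
    if len(allocation) == len(tasks):
        # Complete allocation: the check stands in for the final co-simulation.
        return allocation if meets_constraints(allocation) else None
    task = tasks[len(allocation)]
    for choice in ("SW", "HW"):
        allocation[task] = choice
        result = partition(tasks, meets_constraints, allocation)
        if result is not None:
            return result
        # Constraints not met further down: backtrack and try the next choice.
    return None


# Hypothetical cost model: software is slow, hardware costs area.
TIME = {"SW": 4, "HW": 1}
AREA = {"SW": 0, "HW": 3}


def meets_constraints(allocation):
    total_time = sum(TIME[c] for c in allocation.values())
    total_area = sum(AREA[c] for c in allocation.values())
    return total_time <= 9 and total_area <= 3


solution = partition(["task_a", "task_b", "task_c"], meets_constraints)
```

The search tries software first for every task, so it returns the first allocation in that order that satisfies both hypothetical bounds.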
  • 31. Hw/Sw System Partitioning
      – The reliability aspects add a significant number of parameters to the partitioning step for the selection of the final implementation, making this task too complex to handle in a single step
      – To cope with the complexity of the partitioning step when reliability goals are also included, a two-level approach is proposed here
      – A first partitioning takes into account only the classical aspects and cost functions, meeting the usually stringent time constraints
      – Given this first solution, a second-level partitioning considers the additional reliability constraints, analyzes the approaches within the set of defined methodologies that fulfill them, and provides the solution with the best tradeoff (if one exists)
  • 32. Hw/Sw System Partitioning
      [Figure: reliability co-design flow. Specification (reliability tags, timing tags) feeds a first partitioning under timing, power, area, and cost constraints, producing an architecture (HW, SW, interface, O.S.) and an initial solution. If there are reliability requirements, a reliability partitioning stage (partitioning sections for reliability) applies the reliability model, with hard/soft strength and the parameters fault coverage, detection latency, area overhead, and performance degradation, to produce a specific solution architecture. Optimization and HW/SW synthesis then yield the solution with fault detection.]
  • 33. Hw/Sw System Partitioning
      – The second-level partitioning problem consists of two parts:
      – Reliability Model Identification: defining a criterion for identifying the relation between the constrained procedure and the most suitable CED method
      – Optimization: optimizing the result produced by the assignment criteria with respect to the global solution
  • 34. Hw/Sw System Partitioning: Reliability Model Identification
      – For each approach, an exact evaluation or a qualitative estimation of each considered parameter is identified

      Methodology   Fault Coverage   Detection Latency   Performance Degradation   Area Overhead
      SCS           min/med/max      med/max             med/max                   med/max
      A             min/med/max      min/med             med/max                   med/max
      DP            100%             med/max             min/med                   med/max
      VLIWS         100%             0                   med/max                   min
      IFRC          100%             0                   0                         max
      DMAC          100%             med/max             med/max                   max
      VLIWH         100%             0                   0                         max
      DCC           100%             med                 med                       max
      D             100%             0                   0                         max
      TSCS          100%             med/max             med/max                   med/max
      TSCD          100%             0                   0                         min/med
  • 35. Hw/Sw System Partitioning: Reliability Model Identification
      – A crisp tag (100% fault coverage, 0 detection latency, etc.) represents a hard system constraint that has to be enforced at any cost
      – A fuzzy tag (min, med, or max) represents a soft system requirement: a design directive on the effort to spend in identifying anomalies during the device's operational life
      – Note that, for soft requirements, a maximum requirement also admits methodologies belonging to the medium or minimum partitions, and a medium requirement also admits those in the minimum partition
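The inclusion rule for soft requirements stated on this slide can be written down in a few lines. This is a minimal sketch: the names `LEVELS` and `admitted_levels` are hypothetical, and it deliberately handles only single-level tags, since the deck does not spell out how range entries such as "min/med" are matched.

```python
# Fuzzy tags ordered from the weakest to the strongest requirement.
LEVELS = ("min", "med", "max")


def admitted_levels(requirement):
    """Partitions admitted by a soft requirement: its own level and every lower one."""
    return set(LEVELS[: LEVELS.index(requirement) + 1])


max_ok = admitted_levels("max")
med_ok = admitted_levels("med")
min_ok = admitted_levels("min")
```

A maximum requirement therefore admits all three partitions, while a minimum requirement admits only its own.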
  • 36. Hw/Sw System Partitioning: Reliability Model Identification
      – Crisp tags force a partition on the set of methodologies
      – In particular, 100% fault coverage induces the partitions hard_fc and soft_fc, 0 detection latency induces the partitions hard_dl and soft_dl, and 0 performance degradation induces the partitions hard_pd and soft_pd
      – Since the applicability of a methodology to a specific procedure depends on its hardware/software characteristics, a further partition is induced
  • 37. Hw/Sw System Partitioning: Reliability Model Identification
      – By analyzing the properties of the methodologies, the following partitions are identified, each written as { hard ; soft }:
      – swfc = { {IFRC, DP, DMAC, DCC, VLIWH, VLIWS} ; {A, SCS} }
      – hwfc = { {TSCS, TSCD, D} ; {} }
      – swdl = { {IFRC, VLIWH, VLIWS} ; {DP, DMAC, DCC, A, SCS} }
      – hwdl = { {D, TSCD} ; {TSCS} }
      – swpd = { {IFRC, VLIWH} ; {DMAC, DP, DCC, VLIWS, A, SCS} }
      – hwpd = { {D, TSCD} ; {TSCS} }
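The six partitions listed on this slide can be derived mechanically from the methodology table on slide 34: a methodology falls in the hard set of a parameter when its table entry equals the crisp value (100% coverage, 0 latency, 0 degradation), and in the soft set otherwise. The sketch below assumes that reading. The software/hardware split is taken from the partitions themselves, and the names `TABLE`, `SOFTWARE`, and `split` are hypothetical.

```python
# Slide 34 table: (fault coverage, detection latency, performance degradation).
TABLE = {
    "SCS":   ("min/med/max", "med/max", "med/max"),
    "A":     ("min/med/max", "min/med", "med/max"),
    "DP":    ("100%", "med/max", "min/med"),
    "VLIWS": ("100%", "0", "med/max"),
    "IFRC":  ("100%", "0", "0"),
    "DMAC":  ("100%", "med/max", "med/max"),
    "VLIWH": ("100%", "0", "0"),
    "DCC":   ("100%", "med", "med"),
    "D":     ("100%", "0", "0"),
    "TSCS":  ("100%", "med/max", "med/max"),
    "TSCD":  ("100%", "0", "0"),
}

# Methodologies that appear in the sw* partitions; the rest appear in hw*.
SOFTWARE = {"IFRC", "DP", "DMAC", "DCC", "VLIWH", "VLIWS", "A", "SCS"}


def split(column, crisp, software):
    """Return (hard, soft) sets for one parameter and one allocation."""
    members = {m for m in TABLE if (m in SOFTWARE) == software}
    hard = {m for m in members if TABLE[m][column] == crisp}
    return hard, members - hard


swfc = split(0, "100%", software=True)
hwfc = split(0, "100%", software=False)
swdl = split(1, "0", software=True)
hwdl = split(1, "0", software=False)
swpd = split(2, "0", software=True)
hwpd = split(2, "0", software=False)
```

Under this reading, every derived pair matches the corresponding partition listed on the slide.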
  • 38. Hw/Sw System Partitioning: Reliability Model Identification
      – The second-level partitioning considers the hard parameters first, to select the suitable CED techniques, and then uses the soft parameters to choose among them
      – More precisely, for each critical procedure, on the basis of its allocation in hardware or in software, the partitions fulfilling the hard/soft requirements are selected, and their intersection provides the set of suitable CED techniques
      – The partitioning then proceeds with the next critical procedure until this local CED allocation analysis is complete. At the end, every procedure is associated with a set of admissible CED implementations
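The per-procedure intersection described above can be sketched as follows. The partition sets are copied from slide 37; everything else is an assumption made for illustration: the procedure names, the `admissible` and `local_ced_allocation` helpers, and the choice to treat a soft parameter as placing no restriction at this stage (the soft tags are only used afterwards to choose among the candidates).

```python
# Partitions from slide 37, keyed by (allocation, parameter): (hard set, soft set).
PARTITIONS = {
    ("sw", "fc"): ({"IFRC", "DP", "DMAC", "DCC", "VLIWH", "VLIWS"}, {"A", "SCS"}),
    ("hw", "fc"): ({"TSCS", "TSCD", "D"}, set()),
    ("sw", "dl"): ({"IFRC", "VLIWH", "VLIWS"}, {"DP", "DMAC", "DCC", "A", "SCS"}),
    ("hw", "dl"): ({"D", "TSCD"}, {"TSCS"}),
    ("sw", "pd"): ({"IFRC", "VLIWH"}, {"DMAC", "DP", "DCC", "VLIWS", "A", "SCS"}),
    ("hw", "pd"): ({"D", "TSCD"}, {"TSCS"}),
}


def admissible(allocation, hard_params):
    """Intersect the partitions selected for one procedure."""
    result = None
    for param in ("fc", "dl", "pd"):
        hard, soft = PARTITIONS[(allocation, param)]
        # A hard requirement keeps only the hard set; otherwise both sets qualify.
        allowed = hard if param in hard_params else hard | soft
        result = allowed if result is None else result & allowed
    return result


def local_ced_allocation(procedures):
    """Associate every critical procedure with its admissible CED techniques."""
    return {name: admissible(alloc, hard)
            for name, (alloc, hard) in procedures.items()}


# Hypothetical procedures: (allocation, parameters carrying a crisp tag).
example = local_ced_allocation({
    "proc_a": ("sw", {"fc", "dl"}),
    "proc_b": ("hw", {"fc", "dl", "pd"}),
})
```

Each procedure is analyzed independently here; resolving overlaps between the resulting sets is left to the optimization step.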
  • 39. Hw/Sw System Partitioning: Optimization
      – The global solution, which determines the CED technique actually adopted for each procedure, is pursued through a process of solution extraction and simulation that verifies that the constraints of the first partitioning are still met
      – This process takes into account that some techniques have a global effect (such as IFRC and DP) and prevail over those with a local impact (A, SCS)
      – As an optimization policy, the final solution does not include overlapping methods, which improves its efficiency
  • 40. A Case Study: a Reliable Pacemaker
      – The goal of this case study is to co-design a reliable pacemaker able to detect any anomaly in its behavior caused by physical faults in its components
      – To reach this goal, the design space is explored starting from the system-level specification and following a reliable co-design flow, identifying an optimal partitioning between hardware and software that is validated through system-level co-simulation
      – Then, taking the reliability requirements into account, the CED methodologies able to meet all the constraints are selected, and the one with the best cost-benefit tradeoff is identified and adopted for the final design
  • 41. A Case Study: a Reliable Pacemaker: Behavioral analysis
      [Figure: electrocardiographic diagram showing the relevant timing parameters: LRL, PVARP, AEIr, BP, AVIr, CSW, AVI]
      – Typical values for each interval:

      Time interval   Min-Max (ms)
      PVARP           300-400
      AEIr            0-400
      BP              25
      CSW             75
      AVIr            100
  • 42. A Case Study: a Reliable Pacemaker: State Diagram
      [Figure: state diagram. States: Start, BP, PVARP, AEIr, AVIrp, CSW, AVIr. Transition labels, in transcript order:]
      – Natural V
      – time_out / reset_timer, set_AEIr_timer
      – Natural V / reset_timer, set_PVARP_timer
      – Natural A / reset_timer, set_BP_timer
      – time_out / Stimulated A, reset_timer, set_BP_timer
      – time_out / set_CSW_timer
      – time_out / reset_timer, set_AVIr_timer
      – Natural V / set_AVIrp_timer
      – time_out / Stimulated V, reset_timer, set_PVARP_timer
      – Natural V / reset_timer, set_PVARP_timer
      – time_out / Stimulated V, reset_timer, set_PVARP_timer
  • 43. A Case Study: a Reliable Pacemaker: Timing Constraints
      – Timing bounds for the intervals:

      State   Min-Max (ms)
      PVARP   300-400
      AEIr    300-800
      BP      325-825
      CSW     400-900
      AVIr    500-1000

      Other Constraints
      – The other constraints to be considered in the first-level partitioning step are the classical ones: power dissipation, area, and cost
      – They must be kept as low as possible
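The timing bounds on this slide appear to be the running sums of the typical interval values given on slide 41, taken in the order PVARP, AEIr, BP, CSW, AVIr. The deck does not state this explicitly, so the sketch below only checks that reading. The names `INTERVALS` and `cumulative_bounds` are hypothetical.

```python
from itertools import accumulate

# Typical interval values from slide 41: (name, min ms, max ms).
INTERVALS = [
    ("PVARP", 300, 400),
    ("AEIr", 0, 400),
    ("BP", 25, 25),
    ("CSW", 75, 75),
    ("AVIr", 100, 100),
]


def cumulative_bounds(intervals):
    """Running sum of the minimum and maximum durations, state by state."""
    mins = accumulate(lo for _, lo, _ in intervals)
    maxs = accumulate(hi for _, _, hi in intervals)
    return {name: (lo, hi)
            for (name, _, _), lo, hi in zip(intervals, mins, maxs)}


bounds = cumulative_bounds(INTERVALS)
```

Under this reading, the last entry (500 to 1000 ms) also coincides with the MINDELAY and MAXDELAY values used between tags P1 and P2 in the specification on slide 47.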
  • 44. A Case Study: a Reliable Pacemaker: Reliability Constraints
      – Considering how critical the system is for human safety, a hard reliability requirement is imposed on the whole system
      – In more detail:
      – 100% fault coverage is required
      – Performance degradation is allowed as long as the timing constraints are still met
      – Detection latency and area overhead must be kept as low as possible
  • 45. A Case Study: a Reliable Pacemaker: System Level Specification, the Environment
      [Figure: specification of the environment. Labels: Main, Heart, System, Test bench, Environment, Channels, Calls, RTS [0], RTS [1]. Caption: "The heart ... inside"]
  • 46. A Case Study: a Reliable Pacemaker: System Level Specification, the System
      [Figure: specification of the system. Labels: System, Pacemaker, PVARP, AEIr, AVIr, Time out[0], Time out[1], Time out[2][3][4], Channels, Calls]
  • 47. A Case Study: a Reliable Pacemaker: Timing and Reliability Requirements Specification

      PROC Pacemaker( CHAN OF BIT R; CHAN OF BIT V; CHAN OF BIT P;
                      CHAN OF BIT A; CHAN OF BIT inh_R; CHAN OF BIT inh_P )
        BIT val:
        -- Main body
        SEQ
          R ? val
          WHILE (TRUE)
            SEQ
              TAG P1:
              PVARP[0]( R, V, P, A, inh_R, inh_P, val)
              TAG P2:
      :
      MINDELAY FROM P1 TO P2 IS 500 (MS):
      MAXDELAY FROM P1 TO P2 IS 1000 (MS):
      CS FROM P1 TO P2 IS GLOBAL:
  • 48. A Case Study: a Reliable Pacemaker: 1st-Level Partitioning
      – Setup labels: TOSCA; Embedded Ultra-Low Power Intel 486 GX; Genetic Algorithm; Communication Costs
      – Procedure allocations and test results (T1 to T6):

      Pacemaker  PVARP  AEIr  AVI  Timeout[0]  [1]  [2]  [3]  [4]   Test results
      SW         SW     SW    SW   SW          SW   SW   SW   SW    OK OK OK OK OK OK
      SW         SW     SW    SW   HW          HW   HW   HW   HW    OK OK Max ...
      SW         HW     HW    HW   SW          SW   SW   SW   SW    OK Max ...
      HW         HW     HW    HW   HW          HW   HW   HW   HW    OK OK OK OK OK OK

      – Cell fragments of the two mixed allocations that cannot be placed from the transcript: AVI, Max AEIr, OK, Max AVI, PVARP, AVI, Max AEIr, Max AEIr, OK, Max AVI
      – Selected solution: all-in-software implementation (E486, 16 MHz)
  • 49. A Case Study: a Reliable Pacemaker: 2nd-Level Partitioning
      – Reliability constraints: FC = 100%, PD = medium, DL = maximum, A = maximum
      – Partitions:
      – FC 100%: swfc = {hard_fc} = {IFRC, DP, DMAC, DCC, VLIWH, VLIWS}
      – PD medium: swpd = {hard_pd; soft_pd} = {{IFRC, VLIWH}; {DMAC, DP, DCC, VLIWS, A, SCS}}, which restricts to swpd = {{IFRC, VLIWH}; {DP}}
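The selection for the pacemaker can be replayed as a set intersection using only the sets stated on this slide and the outcome reported on slides 50 and 51. It is a sketch under two assumptions: the DL = maximum and A = maximum requirements place no further restriction (a maximum requirement admits every partition, per slide 35), and the restricted PD set {IFRC, VLIWH, DP} is taken as given rather than derived. Variable names are hypothetical.

```python
# FC = 100%: hard fault-coverage partition for software procedures (slide 49).
FC_HARD = {"IFRC", "DP", "DMAC", "DCC", "VLIWH", "VLIWS"}

# PD = medium: restricted partition as stated on slide 49, hard and soft parts merged.
PD_MEDIUM = {"IFRC", "VLIWH"} | {"DP"}

# DL = maximum and A = maximum are assumed to admit every methodology,
# so they do not appear in the intersection.
potential = FC_HARD & PD_MEDIUM

# Slide 50 reports that DP fails the timing constraints under co-simulation.
feasible = potential - {"DP"}
```

The intersection reproduces the potential solutions listed on slide 50, and removing DP leaves the two feasible solutions named on slide 51.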
  • 50. A Case Study: a Reliable Pacemaker: 2nd-Level Partitioning
      – Potential solutions: {IFRC, DP, VLIWH}
      – Methodologies comparison:
      – IFRC and VLIWH do not affect the system behavior
      – DP requires co-simulation (Nominal, Checking, Checker)
      – Test results for DP (T1 to T6, in transcript order): OK OK Max AEIr Max AVI PVARP OK Max AEIr PVARP
      – The timing constraints are not met: the DP solution is discarded
  • 51. A Case Study: a Reliable Pacemaker: Selected Solution
      – The feasible solutions are IFRC and VLIWH
      – These alternatives have the same area overhead and detection latency, so they are equivalent with respect to the constraints
      – The designer can make the final choice by considering aspects related to other steps of the co-design flow
      – For example, IFRC is applicable independently of the number of reliable procedures, while VLIWH requires a specific software synthesis step for each reliable procedure
      – The cost of the IFRC solution is thus independent of the number of critical sections, which is not true for VLIWH solutions
      – Since in this case study all the system procedures are made reliable, the IFRC architectural solution requires a lower effort and design cost and may be preferable
  • 52. A Case Study: a Reliable Pacemaker: Selected Solution
      – The final architectural solution for the reliable pacemaker:
      [Figure: final architecture. Blocks: CPU, Memory, CPU_chk, BUS Interface and Checker, I/O Interface]
      – The selected solution doesn't allow any significant back annotation to the first-level partitioning, since the initial hw/sw partitioning achieved an acceptable all-in-software solution, loading all tasks efficiently on one processor
  • 53. Conclusions
      – The resilience/reliability co-design project aims at integrating into a standard co-design flow the elements needed to obtain a final system able to autonomously detect the occurrence of faults during its operational life
      – The entire flow has been presented in this work, discussing the key elements of the proposed framework:
      – Specification
      – Design Methodologies
      – System Partitioning
  • 54. Conclusions
      – Specification language extensions have been defined to express reliability requirements
      – A set of possible hw/sw architectural design methodologies has been analyzed, considering the possibility of implementing any part of the complete system (nominal, checking, and checker) either in hardware or in software
      – A metric has been introduced that takes into account the peculiar elements of reliability properties
  • 55. Conclusions
      – A two-level hw/sw partitioning process has been defined: the first level acts as a traditional approach to determine a valid solution, while the second level explores the alternatives taking the fault detection properties into account
      – A case study shows the results of this work
      – Further research efforts are directed toward tuning the metrics with respect to the selected suite of design methodologies, to better support the partitioning step