Selling an executable
data flow graph based IR
John Yates
Order of presentation
• Who am I and why am I here?
• 2010: Netezza needs a new architecture
• A family of statically typed acyclic DFG IRs
• (Time permitting: Some engineering details)
• Q&A
“Who am I and why am I here?”
(with apologies to Adm. Stockdale)
1970: Maybe I’ll be a programmer
• NYC hippie, ponytail, curled handlebar mustache
• Liberal arts high school, lousy student
• Wanted to build things, real things
• Computers seemed interesting and intuitive
• Luckily in 1970 programmers were scarce
40 years…
– 1970: learning the craft, various jobs (all in assembler)
– 1978: Digital Equipment Corp
• Pascal frontend, dynamic programming code selector
– 1983: Apollo Computer
• Designed RISC ISP w/ explicit parallel dispatch (pre-VLIW)
• Lead architect for RISC backend optimizer; built team
• 1st commercial: SSA IR, SW pipeliner, lattice const prop
– 1992: Binary translation: DEC (sw), Chromatic (hw-support)
• More SSA IR, lowering; built teams; lots of patents (many hw)
– 1999: Everfile - NFS-like Win32 internet file system
– 2002: Netezza, badge #26
• Storage: compression, indices, access methods, txns, CBTs
20+ years
2010: Netezza needs
a new architecture
Data parallel analytics engine
• Data partitioned across a cluster of nodes
– Multiple “slices” per node to exploit multi-core
• Execution model:
– Leader accepts query, produces an execution plan
– Leader broadcasts plan’s parallel components
– Cluster performs data parallel work
– Leader performs work requiring a single locus
• Competition: Teradata, Greenplum, DB2, …
Netezza’s architecture
[Pipeline diagram: PG Plan → Split1 / Split2 → Gen FPGA, Gen C++, Gen C++ → Compile → Bcast, Load DLL, Load FPGA → Execute, replicated across N workers; the full path is annotated "Latency"]
Netezza’s problems
[Same pipeline diagram as the "Netezza's architecture" slide (PG Plan through Execute, across N workers), annotated with its problems:]
Very simplistic code generator:
- Lowering across an enormous semantic gulf
- No intermediate representation
- Very complex, very fragile
- Difficult to implement much more than general-case code patterns
Hardware development time scales
Garth’s incomplete Marlin vision
• What is the real input to the interpreter?
• How do we get from query plan to that form?
[Diagram: PG Plan → (unspecified miracle) → Split → Bcast → Interpret (faster?) on the leader and across N workers, with "Multi-core?" as an open question]
A family of statically typed
acyclic data flow graph IRs
Working backwards
• Graph
• Dataflow
• Acyclic
• Statically typed
• A family of … IRs
Graph
• Operators
– Label names a function
– Edge connections in and out
• Edges
– Directed (“dataflow”)
Dataflow
• Dataflow machines
– Apply history, wisdom, insights to the interpreter
• Value semantics
– All edges carry data
– No other kinds of edges (i.e. no anti-dependence)
– No updatable shared state (i.e. no store)
• Expose all opportunities for concurrency
Acyclic
• No backedges ≡ no cycles ☺
• Can exploit topological ordering
– Fact propagation: rDFS (forward) or DFS (reverse)
– No iteration, guaranteed termination
– Linear algorithms, O(graph)
Statically typed
• Edges initially have unknown type
• A well-formed graph can be statically typed
– Linear pass over topologically ordered Operators
– Assign edge types per Operator descriptors
– Inconsistencies can be diagnosed and reported
A family of … IRs
• Well-nested subsets of edge type vocabularies
• Constraining edge types constrains operators
[Diagram: PG Plan → Lower and Opt (tree patterns) → Lower and Opt (graph1 patterns) → Lower and Opt (graph2 patterns) → Split → Topo expand, insert CLONEs → Bcast → Interpret on the leader and across N workers; the two Topo-expand steps are marked ≈. IR levels in lowering order, all sharing a common pattern notation: high-level tree (tuples), high-level graph (tuples), mid-level graph (nullable values), low-level graph (values)]
Nothing convinces like working code
• First delivery
– Table-driven operator semantics
– Utilities: build, edit & expand
– Topological sort
– Type check & report errors
[Diagram: a graph assembly program fed to the graph assembler → Topo expand, insert CLONEs → Split → Bcast → Interpret on the leader and across N workers]
Sold!
• Working code rendered my
successive lowerings idea credible
• Overall Marlin added ~10 engineers; I got 3
• My team got its first end-to-end test case working
[Same lowering-pipeline diagram as the "A family of … IRs" slide]
IBM killed the Marlin program…
• Marlin was a clean-up project promising…
– Performance and shorter development cycles
– But no new features or functionality
• It is always hard to fund significant clean-up
– Especially if it is not legitimately tied to a coveted feature
• Harder if your company is under duress
• Harder still if DB2 is gunning for your headcount
Questions?
Some engineering details
Why clone?
• After expansion all edges are point-to-point
– No output is multiply-consumed
• Chunk handoff along an edge becomes trivial
– Think C++11’s new move semantics
• So only clones implement reference counting
Broadcast
• Serialize / deserialize
• On the network, size matters
• Graph object
– Small number of scalar members
– Handful of C++ vectors (some ephemeral)
– Position independent (no pointers in vectors)
No pointers
• Pointers index the linear address space
– Implicit context (there is only one address space)
• Unsigned as vector index
– User must provide explicit context (vector base)
– 32-bit indices are ½ the size of 64-bit pointers
– Position independence simplifies serialization
The graph object
• Exposed read-only data
– Vector of Operator objects
– Vector of EdgeIn objects
– Vector of EdgeOut objects
– Literal table and pool
• Private data (may be missing or elided)
– Vector of EdgeIn next links
– Vector of Operator BreadCrumbs
Discardable elements
• vecBc: BreadCrumbs vector
• vecNxt: EdgeIn sibling links
• LiteralPool hash table array
Graph vector details
Vector Index Type Element Type Element Size
g.vecOp OperatorIndex Operator 16 bytes
g.vecOut EdgeOutIndex EdgeOut 8 bytes
g.vecIn EdgeInIndex EdgeIn 8 bytes
g.lit LiteralKey Literal multiple of 8 bytes
g.vecNxt EdgeInIndex EdgeInIndex 4 bytes
g.vecBc OperatorIndex BreadCrumb 4 bytes
Connectivity: Operator objects
• Operator private members
– Operator’s edges are sub-vectors of g.vecIn, g.vecOut
– Start of EdgeIn objects: EdgeInIndex baseIn_;
– Start of EdgeOut objects: EdgeOutIndex baseOut_;
• Number of connections
– Inputs: vecOp[x+1].baseIn_ - vecOp[x].baseIn_
– Outputs: vecOp[x+1].baseOut_ - vecOp[x].baseOut_
Connectivity: EdgeIn objects
• EdgeIn private members
– Sink Operator: OperatorIndex dstOp_;
– Source EdgeOut: EdgeOutIndex src_;
• EdgeIn connection position
– Use pointer arithmetic:
this - (vecIn + vecOp[dstOp_].baseIn_);
Connectivity: EdgeOut objects
• EdgeOut private members
– Source Operator: OperatorIndex srcOp_;
– Sink EdgeIn: EdgeInIndex dst_;
• EdgeOut connection position
– Use pointer arithmetic:
this - (vecOut + vecOp[srcOp_].baseOut_);
Working with XG
Thin graph construction
Method / Effect
graph.add(BreadCrumb, Op, Locus, Expansion, unsigned nVarIn =0, unsigned nVarOut =0);
  Add an Operator and its Edge resources
graph.connect(OperatorIndex srcOp, unsigned srcPos, OperatorIndex dstOp, unsigned dstPos);
  Guarantee a srcOp[srcPos] to dstOp[dstPos] edge exists
Whole graph operations
Operation / Effect
Graph();
  Construct an empty Graph
void done();
  Topo sort and type check
Graph(Graph const& thinGraph, bool forSpu);
  Partitioning constructor
BinStream& operator << (BinStream&, Graph const&);
  Put to a BinStream (cheap)
BinStream& operator >> (BinStream&, Graph&);
  Get from a BinStream (cheap)
void expand(bool forSpu, Environment const& env);
  Expand, insert clones, etc.
Graph states and conversions
• Start with a “thin” graph
– Leader plus one representative node and dataslice
– Operators tagged with a locus and expansion rule
– Outputs can have multiple consumers
• Partition into leader-side & node-side subsets
• Expand based on loci and system topology
– Duplicate operators, adjust in and out arities, add sites
– Expand edges: fan-in, fan-out, parallel
– Introduce clones as needed
Graph overlay
• Template object publicly derived from Graph
• Macro hides lots of template boilerplate
• User supplied types for parallel vectors
– MyOperator ovOp[OperatorIndex]
– MyEdgeIn ovIn[EdgeInIndex]
– MyEdgeOut ovOut[EdgeOutIndex]
• Constructor shares vectors and LiteralTable
1973: Began 2-axis controller
I wrote every line of code (in assembler)
1975: First installation
0.5 MegaWatt torch cutting up to ¾”
steel plate at Marion Power Shovel
1975: Torch on… I was hooked!
MathWorks Interview Lecture