This talk discusses advanced computationally assisted reasoning about large interaction-dominated systems and addresses the role of details involving huge numbers and levels of intricate interactions in current fields of research. It was delivered at the SMART Infrastructure Facility by Professor Chris Barrett on September 26, 2012. For more detail, see http://goo.gl/gLp7c.
3. What is interaction? What’s the issue?
• A finite undirected graph Y
• A sequence of local maps
• An ordering of the vertex set of Y
$[F_Y, \pi] = \prod_{i} F_{\pi(i)}$
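The three ingredients above (a graph, local maps, and an ordering of the vertices) can be sketched as a small program. This is a minimal illustration, not the speaker's implementation. The names `make_sds` and `nor`, the 4-vertex circle graph, and the choice of a single Boolean NOR rule for every vertex are all assumptions made for the example. It also assumes the product is read so that the map for the first vertex in the ordering is applied first.

```python
from typing import Callable, Dict, List, Sequence, Tuple

State = Tuple[int, ...]


def make_sds(
    adj: Dict[int, List[int]],
    local_fn: Callable[[List[int]], int],
    order: Sequence[int],
) -> Callable[[State], State]:
    """Build the global map by composing local maps in the given vertex order."""

    def step(x: State) -> State:
        x = list(x)
        for v in order:
            # Each local map sees the CURRENT states of v and its neighbors,
            # including values already updated earlier in this same pass.
            neighborhood = [x[v]] + [x[u] for u in adj[v]]
            x[v] = local_fn(neighborhood)
        return tuple(x)

    return step


def nor(states: List[int]) -> int:
    """Hypothetical local rule: 1 only if every state in the neighborhood is 0."""
    return 0 if any(states) else 1


# Hypothetical graph Y: a circle on four vertices.
circle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
step = make_sds(circle4, nor, order=(0, 1, 2, 3))

x0 = (0, 0, 0, 0)
x1 = step(x0)  # (1, 0, 1, 0)
x2 = step(x1)  # (0, 0, 0, 1)
```

Changing `order` while keeping the graph and the rule fixed can change the resulting global map, which is one reason the ordering is listed as part of the definition.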
5. “Genuine” social entities & interactions
“ .. [usual causal] hierarchy collapses when
causality crosses across units and
levels….human behavior in social setting is
interdependent …. although … not a new
insight, social life is interdependent in …
spatial forms – things “go together” in and
across distinct places …. which might be better
described as neighborhood causal
processes…”
Robert J. Sampson, The Great American City, 2012
6. What interacts in an evolving city?
• People are entities that have
purposes, needs, capacities and interact
• Neighborhoods are entities that have
purposes, needs, capacities and interact, “have their
own logic and causality.”
• Causes, causal interactions, occur across “normal”
causal boundaries
– People interact with people and neighborhoods
– Neighborhoods interact with neighborhoods and people
– E.g., self-selection bias, extra-neighborhood proximity
processes, etc. are within- and among-network processes
that do not supervene on one another.
7. This is not entirely unique to
neighborhood selection
• Traffic and transportation
• Motives/goals, activities, transport resource, transport
infrastructure, resource competition, form and function of
infrastructure, traffic, communicated dynamics, time
delays, goal failure/success etc…. loop and evolve
• Genetic predisposition, homophily, family and
peer mimicry, other social functionalities affect
• Success, variously
• Suicide
• Smoking
• Obesity
• Healthful behaviors ……., etc.
8. In fact, it is seen in biology
• Suzuki, et al, 2003
• The pigmentation control gene Foxn1 is defective in a
mutant mouse and shuts down the normal process by
which pigmentation patterns are stabilized in the skin/hair
of the mouse.
• It creates moving waves of color striping
• This gene normally can produce all solid, spot and striped
patterns by simply activating at particular times in
embryonic development; the morphology of the embryo at
that time determines the pattern created by the gene
• In the mutant case, the continued malfunction, given the
most recent color morphology, generates a new pattern,
and so on.
9. Traveling Stripes- Suzuki
These dynamics are the same kind
as in Belousov-Zhabotinskii reaction
of nonlinear waves in excitable
physical media
10. Are they the same causal class?
• The Chemical Basis of Morphogenesis, Turing
1952
• Usually, diffusion processes (local
communication) stabilize in a mixed system, but
under “excitable” media conditions, structure
appears and evolves
• It is seen in physical chemistry and excitable
media
• It is, essentially, a rewrite computational THEORY
of interaction that is just now being really
discovered
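To make the idea of structure appearing under excitable-medium conditions concrete, here is a minimal sketch using the Greenberg-Hastings cellular automaton, a standard toy model of an excitable medium. It is a swapped-in illustration and is not taken from the talk. The ring of six cells, the three-state rule and the name `gh_step` are assumptions chosen for brevity.

```python
from typing import List

RESTING, EXCITED, REFRACTORY = 0, 1, 2


def gh_step(cells: List[int]) -> List[int]:
    """One synchronous update of a Greenberg-Hastings automaton on a ring."""
    n = len(cells)
    nxt = []
    for i, s in enumerate(cells):
        if s == EXCITED:
            nxt.append(REFRACTORY)  # an excited cell must recover
        elif s == REFRACTORY:
            nxt.append(RESTING)  # recovery finishes
        else:
            # Local communication: a resting cell fires if a neighbor is excited.
            left, right = cells[(i - 1) % n], cells[(i + 1) % n]
            nxt.append(EXCITED if EXCITED in (left, right) else RESTING)
    return nxt


# A uniform resting medium stays uniform: nothing happens.
quiet = gh_step([RESTING] * 6)

# An excited cell with a refractory cell behind it travels as a wave.
wave = [REFRACTORY, EXCITED, RESTING, RESTING, RESTING, RESTING]
history = [wave]
for _ in range(6):
    history.append(gh_step(history[-1]))
```

In this toy, the refractory cell behind the excited one blocks backward propagation, so the pulse moves one cell per step and returns to its starting configuration after six steps on the six-cell ring.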
13. And even in physical systems
• Chemical morphogenesis, B-Z dynamics
14. The “inside-outside” problem is a
related issue of non-supervenience
• What is an organ?
– Biome
• What is in an organism, what is outside of it?
• What/ where is a thought?
– Extended mind
– Distributed algorithmic accounts of causes
• What is an agent?
– Is agency necessarily encapsulated?
– Driver behavior
• What is an urban agent?
16. Where does Big Data come from?
Metric, declarative, procedural sources & integration
17. What is Big?
• The world's technological per-capita capacity
to store information has roughly doubled
every 40 months since the 1980s; as of
2012, every day 2.5 quintillion (2.5×10^18)
bytes of data were created [stored].
Wikipedia, “Big Data”, August 2012
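As a quick check on what "doubling every 40 months" implies, the growth over any interval is 2 raised to the elapsed time divided by the doubling time. The sketch below works that out; the function name `growth_factor` is an assumption for illustration, and the only input taken from the slide is the 40-month doubling period.

```python
def growth_factor(elapsed_months: float, doubling_months: float = 40.0) -> float:
    """Multiplicative growth after elapsed_months, given a fixed doubling time."""
    return 2.0 ** (elapsed_months / doubling_months)


annual = growth_factor(12)   # about 1.23, i.e. roughly 23% growth per year
decade = growth_factor(120)  # 2**3 = 8, an eightfold increase per decade
```

Under this assumption of steady doubling, capacity grows by roughly a quarter each year and eightfold each decade.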
25. Data creation, deletion & storage
• We will know what - of all that data - it is
possible to “forget” only when we know how
to summarize what is possible
• That’s a big analytical problem as we will see
• Use of graph theory and graphical dynamical
systems (networks) is essential
– Computationally very intensive
28. Massively Interacting Systems
• These things produce branching processes
• Sometimes they are periodic, sometimes they are not
• They do not explore the entire possible state space (all
morphologies are not expressed)
• So even with the immense amount of underlying data
necessary, decision analysis must produce infinitely more
• This complicates measurement as well as theory making in
the sense of acceptable explanations of observations
• It makes observed, metric data; declarative data; and
procedural information all essential
• It is the effects of processes of composition of complex
interactions that ultimately generate so much data, both
measured and synthesized.
29. Q: How can we support human
analytical capabilities in this situation?
30. The end of the great man theories
of….. decision making
• Many stakeholder synthetic information
• The analysis environment is not separated from “the
world”
• An entirely new interaction medium will create NEW
REALITIES
• The approach must involve human expertise and
context, it must be a cognitive augmentation system
• It must involve distributed, social cognition
• It must follow context & allow information deletion
• It will change scientific process and assumptions
31. People are interconnected properties
Age         26      26      7
Income      $27k    $16k    $0
Status      worker  worker  student
Automobile
32. Extra-household connectivities also influence/
reflect motives, activities and behavior
[Figure: network of people (John, Joe, Mary, Ron, Jane, Tim, Jill, Shawn) connected by office links, friendship links and family links]
35. Built, functional, locational structure defines where
activities occur and influences movement/comms
• Synthetic activity locations, such as homes,
are placed with probability proportional to
location geo-functional weights:
(type: home location – # people, cost, etc.)
[Figure: example maps for California and Illinois]
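Placement with probability proportional to a location weight can be sketched with weighted random sampling. This is only an illustration of the stated idea. The function name `place_homes`, the location labels and the weight values are hypothetical, and the real geo-functional weights (number of people, cost, etc.) are not given in the slides.

```python
import random
from collections import Counter
from typing import Dict, List


def place_homes(weights: Dict[str, float], n_households: int, seed: int = 0) -> List[str]:
    """Draw a home location for each household, proportional to location weight."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    locations = list(weights)
    w = [weights[loc] for loc in locations]
    return rng.choices(locations, weights=w, k=n_households)


# Hypothetical weights: block_A is three times as attractive as block_B,
# and block_C (weight 0) can never receive a home.
weights = {"block_A": 3.0, "block_B": 1.0, "block_C": 0.0}
homes = place_homes(weights, n_households=10_000)
counts = Counter(homes)
share_A = counts["block_A"] / len(homes)  # close to 3 / (3 + 1) = 0.75
```

Sampling with replacement, as here, ignores capacity limits at a location; a fuller model would presumably cap or reweight locations as they fill.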
36. Bipartite map of people with activities onto
appropriate locations with functional capacities
Motivated People
Vertex attributes: age, household size, gender, income
Activity-appropriate Locations
Vertex attributes: coordinates, type
Edge attributes:
activity type: shop, work, school
(start time 1, end time 1)
(start time 2, end time 2)
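The bipartite structure described on this slide can be written down as two vertex types and one edge type. The sketch below is an assumed encoding and not the project's data model: the class names, field names, identifiers and all sample values are hypothetical. The `co_located_pairs` helper shows one thing such a structure makes possible, namely finding people whose visits overlap in time at the same location.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Person:  # vertex on the "people" side
    pid: int
    age: int
    household_size: int
    gender: str
    income: int


@dataclass(frozen=True)
class Location:  # vertex on the "locations" side
    lid: int
    x: float
    y: float
    kind: str


@dataclass(frozen=True)
class Visit:  # edge: a person performs an activity at a location
    pid: int
    lid: int
    activity: str  # e.g. "shop", "work", "school"
    start: float   # hours since midnight
    end: float


def co_located_pairs(visits: List[Visit]) -> Set[Tuple[int, int]]:
    """Pairs of distinct people whose visits overlap at the same location."""
    pairs = set()
    for a, b in combinations(visits, 2):
        overlap = a.start < b.end and b.start < a.end
        if a.lid == b.lid and a.pid != b.pid and overlap:
            pairs.add((min(a.pid, b.pid), max(a.pid, b.pid)))
    return pairs


# Hypothetical sample data.
visits = [
    Visit(pid=1, lid=10, activity="work", start=9.0, end=17.0),
    Visit(pid=2, lid=10, activity="work", start=12.0, end=18.0),
    Visit(pid=3, lid=11, activity="school", start=8.0, end=15.0),
    Visit(pid=1, lid=12, activity="shop", start=17.5, end=18.5),
]
contacts = co_located_pairs(visits)  # {(1, 2)}
```

The pairwise scan is quadratic in the number of visits, which is fine for a sketch; at the population sizes discussed in the talk one would group visits by location first.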
38. Example: large scale socio-physical interaction
• Attack in Washington DC
– NPS1, a 2006-based unclassified study scenario with
lots of people publishing and even putting lectures on
YouTube
• Basically we wanted to know if there really might
be significant social behavior options in the
immediate aftermath that could be imagined &
that might have long term influence
• Disaggregate, detailed socially-coupled
simulation used in combination with physical
modeling
39. Technical Perspective: Socially-coupled
systems
• Massively interacting systems generating arbitrarily much data
• Want a general, re-usable approach. Many examples:
transport, Facebook, biosystems, economic systems
• Generally, HPC-based data-centric methods and network
science/network dynamics are central
• Socially-coupled systems display a lack of symmetries => problems
for usual dimensional reduction approaches
• Systems are huge, details matter
• Detailed disaggregate modeling, appropriate abstractions, novel
HPC simulation methods & statistical approaches are necessary
• Necessary source information is diverse, including process
knowledge
• Totally different view of decision analysis necessary
41. Contextual Synthetic Information
• The information platform is the interaction
medium
• The only way to really deal with the massively
interactive, branching—thus extreme data—
world.
43. Physical Event in a Social Context
• Event put “on top of” a
normally functioning day’s
population dynamics
• National Planning Scenario 1
• Unannounced detonation
• Time: 11:15 EDT
• Date: May 15, 2006
44. Damage to power network and long-term power outage area (Time 0:00)
• Probability of damage to individual substations
• Aggregated outage area
• Legend: high/medium/low probability of damage
• Long-term outage area devised by geographically relating the location of substations in the city with
the blast damage zones.
• Loss of a substation has a much more widespread impact on provided power to the customers.
45. Infrastructure: initial laydown (Time 0:00)
• Positions and demographic identities of
individual synthetic people in the DC region
were calculated at the time of detonation.
• Street addresses mapped to geo-functional
data
• Persons traveling to destinations were placed
outside on transportation networks –walk,
roadway, metro, bus.
• Power outage, damage, collapse, rubble, blast
temp, radiation dose rate assigned to each
location and transportation network node
[Figure panels: Built Infrastructure; Power Outages; Position of People]
52. CIIMS Avatars automatically create realistic individual behaviors
through large scale interaction, local machine intelligence
New timeline feature: scenario displays details connected to timeline
New use of timeline: detailed analysis of interdependent individual behaviors
54. A drama in machine intelligence: Reuniting a family after the disaster
Clair and Denise
• Mother and infant daughter
• +0:00 - Home
• Both uninjured
Cliff
• Father
• +0:00 - At work
• Uninjured
Theo
• Son
• +0:00 - Daycare
• Uninjured
55. Calls finally go through
Clair and Denise
• +3:05 - Evacuate city
• Doesn’t know where Theo is
Cliff
• +3:00 – Call to Clair successful
• Stops panicking and finds shelter
• +3:10 – Call to Theo (i.e., daycare worker) successful
Theo
• Continues shelter in daycare
56. Initial Panic
Clair and Denise
• +0:00 – Shelter at home
• Repeatedly calls 911
• Both exposed to 10 cGy first 10 minutes
Cliff
• +0:00 – Panics, abandons car, heads to nearest hospital
• Exposed to 0.4 cGy first 50 minutes
Theo
• +0:10 – Workers bring children to nearby building for shelter
• No exposure
58. Evacuation
Cliff
• +45:00 – Arrives at
daycare
• Evacuates city with Theo
59. Aggregate behavioral details & exposure to injury
• Each individual's daily or event-context-driven activities take them inside and
outside periodically; the details affect their injury level at the time of, as well as
after, the blast.
• Injury traversing rubble
• Delay of access to care, etc
[Figure panels: Outdoors; Indoors]
60. Socio-technical influences on individual behavior
• If communication is provided earlier and contact made: less panic and
unstructured behavior, more sheltering, less searching, etc.
• There are hundreds of thousands of these avatars and many
different specific motivations, or perhaps, different complex
contextual embodiments of similar generic motivations
• The composite effect on many things, including exposure to injury,
cannot always be calculated in aggregate in particular scenarios
from data obtained elsewhere.
• Supporting problem evolution and the extreme importance of
sparse sequential analysis is a major conclusion of this study.
• The 1st 72 hrs is not the same problem as what follows. Saturated
performance from initial behavioral models as situation evolves.
• These methods do more than better answer a given question:
– It's general: there is a theoretical form to the question that is semantics-free
– Predecessor existence and reachability; “validation” and “prediction” are both very subtle, the systems branch
– The theory is new, deeply connecting to theory of computation, and maps onto HPC
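Predecessor existence and reachability can be stated precisely for a small finite system, which helps show why they are semantics-free questions about a map on a state space. The sketch below is a brute-force illustration under assumptions. The three-vertex system, its sequential NOR rule and the names `step`, `predecessors` and `reachable` are hypothetical and not taken from the talk, and enumerating all states is only feasible for very small systems.

```python
from itertools import product
from typing import Callable, List, Tuple

State = Tuple[int, ...]


def step(x: State) -> State:
    """Hypothetical system: three mutually connected vertices updated in
    order 0, 1, 2, each set to 1 only if all current states are 0."""
    x = list(x)
    for v in range(3):
        x[v] = 0 if any(x) else 1
    return tuple(x)


def predecessors(f: Callable[[State], State], n: int, y: State) -> List[State]:
    """All Boolean states x of length n with f(x) == y (exponential in n)."""
    return [x for x in product((0, 1), repeat=n) if f(x) == y]


def reachable(f: Callable[[State], State], x: State, y: State) -> bool:
    """Does iterating f from x ever hit y? Stops when the orbit repeats."""
    seen = set()
    while x not in seen:
        if x == y:
            return True
        seen.add(x)
        x = f(x)
    return False


no_pred = predecessors(step, 3, (1, 1, 0))    # [] : this state has no predecessor
one_pred = predecessors(step, 3, (1, 0, 0))   # [(0, 0, 0)]
hit = reachable(step, (1, 1, 1), (0, 1, 0))   # True: 111 -> 000 -> 100 -> 010
miss = reachable(step, (0, 0, 0), (1, 1, 1))  # False: the orbit cycles without it
```

Even in this eight-state toy, some states cannot be produced by the dynamics at all and others can never be reached from a given start, which is the sense in which not all of the state space is explored.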
Neighborhoods have both social, functional (graphical) and spatial (different graphical) structure and these all interact
Neither the gene nor the embryonic state supervenes with respect to the morphology of the pigmentation
Normally the gene is presumed to create the color pattern. Here the gene in contact with the last color pattern makes a new one. One aspect does not supervene the other wrt the morphological state of the mouse. Only in the interaction is it possible to create the phenomenon
Stiglitz housing patterns are a similar excitable medium with diffusive communication
Terrible story of Belousov-Zhabotinsky
T=0
State that expert opinion was used to create this slide/information
Buildings from DTRA with red indicating areas of high casualty probability (upper right), street network (NAVTEQ) and people positions at the time of detonation in the detailed study area (lower right)
Bottom left picture shows power outage area as light purple polygon. Locations in power outage area are plotted as red, locations with power are green.
730,833 persons in the DSA at time of detonation
T=0
146,337 locations (includes transportation nodes)
Small label: People, built infrastructure, position of people of DSA and left power outages
Note: Green: no collapse; Yellow: sideways collapse; Red: 100% collapse. The blue ring is a 2.2 km circle; different colors on the buildings represent the level of collapse. If needed, I can generate another one quickly tomorrow morning, maybe using the data for a 3.2 km circle.
T=0
Buildings DTRA says had collapsed
Roadway network from NAVTEQ with damage (upper left)
Road network zoomed with level of damage included (lower left)
Walk network with damage (lower right)
CloseAlive_Pairs.mov
Point – you can look at the data this way – transportation system is same in both cases – lots of bars on the roads, because that is where people are…
Building points are the front door – thus bar on the street
Distribution of the population -
Blue – Cell 1 greater
Purple – Cell 2 greater
Title: transportation link demand/or density
Green is bad, red is good
Average level of health state (i.e., high number, red, is Full Health) per location based on inside vs. outside. All health levels shown, so uninjured are averaged in.
Move to t=0
Blank spots are the sparse areas where we have very few to 0 people at the time of the blast
Database table sizes
Input data tables: 3.55 GB
Output dynamic data for 1 cell (126 iterations, 80 hours of simulated time): 8.06 GB location tables, 19 GB person tables
Disk usage
Input data: 1.16 GB
Dynamic data for 1 cell (126 iterations, 80 hours of simulated time): 15 GB
Computation time for Run 1413
Behavior Module runs in about 2 minutes but uses 96 cores, so time spent in computation is roughly 2*96 = 192 minutes/iteration
Router execution time varies depending on number of routes. To compute approximately 200,000 routes, the runtime is about 8 minutes and uses 6696 nodes/12 threads node for all 124 iterations.
The router uses approximately 40 nodes with 12 threads/node, so computation time is roughly 40*12*10 = 4800 minutes