
Google socialnetworksmarch08


Theory of Meetings, Affiliation Networks, Social Networks, contact frequency. A 2008 presentation about computer models to better understand the efficiency of meetings.

Published in: Science


  1. Social Networks, Meetings and Contact Frequencies
     February 2008, ROADEF Conference – Google US Office Visit
     Yves Caseau
  2. Outline
     1. Motivations: From Enterprise Efficiency to Social Networks
     2. Social Networks with Contact Frequency
     3. Corporate Meeting System: a coverage problem
     4. Preliminary findings
     5. Perspectives
  3. Motivations (1): Enterprise Efficiency
     - Time management is a crucial factor of efficiency
       - Time is the most critical resource, including where communication is concerned (e.g., the time to read one's email)
       - A key topic is the cooperation between multiple communication channels
     - Optimizing the flow of information is crucial in today's digitalized enterprise
       - Latency (of information propagation) is a key performance indicator
       - When companies aim to be reactive, agile, etc.
     - Web quote: "How to Run a Meeting Like Google", Sept. 27, 2006
       - "Mayer, who has a background in engineering and computer science, jokingly refers to micro-meetings as "reducing latency in the pipeline." That means if she has an employee with an issue that comes up Tuesday, he or she can schedule a 10-minute micro-meeting during Mayer's large time block, instead of waiting for her next 30-minute opening, which might not be available for two weeks."
  4. Motivation (2): Enterprise Simulation
     "SIFOA": a project to evaluate « communication channel competition » through simulation
     - Overall unit-to-unit communication flow, related to business processes
     - Communication channels are segmented:
       - One-on-one: synchronous / asynchronous, non-scheduled
       - Scheduled meetings
       - Scheduled one-on-one meetings (related to the hierarchical organization)
     - Channels are characterized by:
       - Propagation delay (latency)
       - Bandwidth
       - Sensitivity to occupation rate (from queuing theory)
       - Loss (probability of needing to re-emit)
     - Bandwidth (information quantity) is measured as time x persons
       - Both on the sending and the receiving end
  5. Motivation (3): Do we need (so many) meetings?
     Of course:
     - Prioritizing of topics (implicit waiting queue)
     - Flow sharing (1 sender -> n receivers)
     - Ability to schedule (vs. spontaneous meetings) and avoid overflows
     - Topic "mutualization" (avoid "setup / moving" costs)
     But:
     - The set of all scheduled committees is a rigid structure which may not be best suited to reacting to high-priority events
     - Tendency to "accumulate" "useless" participants
     - Tendency to fill the "time line", at the expense of other forms of activity (reinforced by modern computer tools)
     So?
     1. The "efficiency" of a meeting totally depends on the company's context …
     2. But, as a communication tool, could we compare its efficiency against other channels (set of one-to-one meetings, emails, etc.)?
        - -> CMS performance measure
        - -> finding out how much time should be allocated to meetings
  6. Motivations (4): a Theory of Meetings?
     Many possible contributions from multiple fields:
     - CMC, sociology, psychology
       - Ex: information transmission speed (write / read / listen)
       - Group behavior: engagement / responsibility
     - Operations Research
       - Ex: scheduling, queuing theory -> importance of occupation rate (why scheduled meetings are useful)
       - Flows: managing bandwidth in communication networks
     - Social Networks
       - Cf. work on affiliation networks (executive boards, movies, …)
       - The topic of this talk
  7. Part II: Social Networks with Contact Frequency
  8. Social Networks and Contact Frequencies
     - Social networks have received a lot of attention over the past 10 years (« computer models meet sociology », cf. Duncan Watts)
     - Our approach starts with the addition of a tag to each edge which represents contact frequency
       - Not all contacts are equal
       - Time is what actually controls the information flow
       - Ex: one may see a few people often or many people more rarely
     - A Time-Valued Social Network (TVSN) may represent the network of desired interactions
     [Figure: example TVSN with nodes such as Caroline and Armand; edges are labeled with contact frequencies ranging from 1h every 2 days to 1h per month]
  9. Random generation of a source network
     Classical random graph model adapted to TVSN:
     - From a few hundred to a few thousand nodes
     - Random addition of edges, with selection according to characteristic parameters:
       - Degrees (average / distribution)
       - Contact frequencies
       - Clustering rate
     Under constraints:
     - Constant sum of frequencies (node-wise): 200h/month
     - Connected network
     Measurable:
     - Path length
     - Average distance ("diameter")
     - Time distance = inverse of frequency (1/2)
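The two measures listed on this slide (path length and time distance) can be illustrated with a shortest-path search. The sketch below is not from the talk: it assumes that crossing an edge costs exactly 1 / frequency (the slide's "(1/2)" factor is left out because its meaning is not stated), and the function name `time_distances` and the three-person network are hypothetical.

```python
import heapq

def time_distances(tvsn, source):
    """Dijkstra search over a TVSN given as node -> {neighbor: contact frequency}.

    Assumption: the time distance of an edge is 1 / frequency.
    Returns (time distance, number of hops on that shortest path) per node.
    """
    dist = {source: 0.0}
    hops = {source: 0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, freq in tvsn[node].items():
            nd = d + 1.0 / freq
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                hops[nbr] = hops[node] + 1
                heapq.heappush(heap, (nd, nbr))
    return dist, hops

# Hypothetical network: a and c meet rarely, but both meet b often.
tvsn = {
    "a": {"b": 4.0, "c": 1.0},
    "b": {"a": 4.0, "c": 4.0},
    "c": {"a": 1.0, "b": 4.0},
}
dist, hops = time_distances(tvsn, "a")
print(dist["c"], hops["c"])
```

In this example the fastest route from a to c goes through b: two frequent contacts beat one rare direct contact, which is the effect the frequency tags are meant to capture.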
  10. Samples of Network Generation
      These results are similar to the social network results from Duncan Watts (DW) …
      - smaller networks
      - High level of connectivity (rightmost part of DW's figures)
      - Many possible variations, mostly according to the distribution of contact frequencies, degrees, and cluster rate (these correspond to different enterprise cultures)
  11. Building a network from the demand patterns
      What if we need to extract a manageable-sized network from an original one with high degree?

                       0%      10%     30%     50%     100%
      Sorted(8)    D   17.51   19.1    22.35   25.59   33.56
                   L   2.94    3.18    3.69    4.20    5.44
      Random(8)    D   24.3    24.3    24.3    24.3    24.3
                   L   4.01    4.01    4.01    4.01    4.01
      Sorted(16)   D   19.36   21.06   24.5    27.9    36.58
                   L   2.17    2.37    2.77    3.17    4.17
      Random(16)   D   27.06   27.06   27.06   27.06   27.06
                   L   3.08    3.08    3.08    3.08    3.08

      - Selecting a subset is a good idea for a stable situation
      - It is not a trivial matter since it may introduce biases (worse than random with random distance)
      - In the remainder of the talk, we shall use a mix (70% predictive, 30% random)
      - A key parameter is the average number of contacts = 70% * degree + 30% * N. We call it the Information Diameter (Di)
  12. Part III: Corporate Meeting System: a coverage problem
  13. Affiliation Networks
      - Our topic is how to « cover » a source TVSN with a set of scheduled meetings (CMS) …
        - which serves the communication requirement (represented by the TVSN) as well as possible
      - The set of meetings may be seen as a hypergraph, or as an affiliation network (bipartite graph)
        - Each meeting has its frequency
      - A fair amount is known about affiliation networks … graph metrics may be applied:
        - L, C, D, …
        - Dm: diameter = size of the set of people that one meets
        - Cf. M. Latapy et al., "Basic Notions for the Analysis of Large Two-mode Networks"
      [Figure: diagram contrasting Di (information) and Dm]
  14. CMS: Dimensions and Parameters
      - Here we suppose that all meetings last for one hour
      - The set of meetings may be seen as one large schedule
      - Parameters:
        - N: number of people
        - A: shown only as a diagram label on this slide; the laws below hold when A is the number of attendees per meeting
        - R: number of meetings per person
        - M: number of meetings
        - F: frequency of each meeting (the diagram shows 1/100, 3/100, 3/100, 3/100)
        - T = 100 (100h of meetings/committee per month)
      - A few « simple laws »:
        - Fm * R = T
        - M * Fm = N / A * T (clearly holds in the case of a regular tiling)
      - Consequently, two trade-offs must be found:
        - For each person, between few frequent meetings and many infrequent meetings
        - Generally, between few large meetings and many small meetings
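A small worked example may help to read the two « simple laws ». It is only a sketch under assumptions that the slide does not state explicitly: Fm is taken to be the (uniform) frequency of a meeting, A the number of attendees per meeting, and the function name and the numbers are hypothetical.

```python
def meeting_system_size(n_people, attendance, hours_per_person, meetings_per_person):
    """Apply Fm * R = T and M * Fm = N / A * T for a regular tiling.

    Assumptions: every meeting lasts one hour, has the same frequency Fm
    and the same number of attendees A.
    """
    fm = hours_per_person / meetings_per_person            # Fm = T / R
    m = (n_people / attendance) * hours_per_person / fm    # M = (N / A) * T / Fm
    return fm, m

# Hypothetical company: 700 people, 7 attendees per meeting,
# 100h of meetings per person per month, 10 distinct meetings per person.
fm, m = meeting_system_size(n_people=700, attendance=7,
                            hours_per_person=100, meetings_per_person=10)
print(fm, m)
```

With these numbers each meeting occurs 10 times a month and the company needs 1000 distinct meetings; halving R doubles Fm and halves M, which is the first trade-off mentioned on the slide.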
  15. 5 measures for a « System of Meetings » (CMS)
      - Latency: the speed of information propagation. It is measured through the average distance between two nodes (following the previously mentioned pattern: 70% from the source network, 30% random).
      - Throughput: the ability of the meeting system to transport information. It is measured as the sum of the products (duration x frequency) over all meetings.
      - Feedback: the ability to check appropriation/understanding when some information is transmitted. A simple measure is the average speech time each attendee may expect in a meeting, that is, the sum of (duration x frequency x inverse of the number of attendees).
      - Loss: the opposite of the capacity to transport information without change. The simplest measure is the average path length.
      - Quality: a last catch-all category that represents the rich and complex nature of human interaction. Aspects of group dynamics, psychology, responsibility, etc. mean that each type of meeting is more or less suited to each purpose (decision, brainstorming, information, …).
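Throughput and feedback are the two measures given here as explicit sums, so they are easy to compute. The following is a minimal sketch; the `Meeting` record, the function names and the sample values are illustrative and not taken from the talk.

```python
from dataclasses import dataclass

@dataclass
class Meeting:
    attendees: int     # number of people in the meeting
    duration: float    # hours per occurrence
    frequency: float   # occurrences per month

def throughput(meetings):
    """Sum of (duration x frequency) over all meetings."""
    return sum(m.duration * m.frequency for m in meetings)

def feedback(meetings):
    """Sum of (duration x frequency / number of attendees) over all meetings."""
    return sum(m.duration * m.frequency / m.attendees for m in meetings)

# Hypothetical system: a weekly 8-person meeting and a fortnightly 4-person one.
cms = [Meeting(attendees=8, duration=1.0, frequency=4.0),
       Meeting(attendees=4, duration=1.0, frequency=2.0)]
print(throughput(cms), feedback(cms))
```

Both meetings contribute the same feedback (0.5h of expected speech time per attendee) even though the larger one carries twice the throughput term, which shows why large meetings trade feedback away.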
  16. Coverage Problem
      - Our goal: optimize latency under throughput constraints
        - The throughput requirement depends on the nature of the business
      - Our problem: design a CMS from the forecast network
      - The goal is not necessarily to find « the optimal solution »
        - Approximate input data + approximate latency (distance)
        - The goal is to reflect what an enterprise would do
      - We do not focus on "searchability" for two reasons:
        1. We suppose that the organization is small enough and/or the committee structure is simple enough for short paths to be found implicitly.
        2. When the latency of high-priority information is concerned, a form of parallelism (redundancy) is implicit: all paths are explored at the same time (hence the shortest path is found).
      [Figure: flowchart with the steps: Pick e (forecast); Assign e to H; Maintain H.f = f(H.cover); Create new H; Post-optimization]
  17. « Coverage Algorithm »
      Cover(h, m, M, F%)
      - Invariants:
        - s(x) = T - ∑{f(r) | r contains x and r's frequency is f(r)}
        - f(r) is fixed once a frequency has been picked; otherwise f(r) = F% * max({f(e) | e is "covered" by r})
      - Repeat until all edges e have been selected:
        1. Pick an edge e = (x, y) to be « covered » by a meeting: enumerate all edges that have not been selected yet and choose e = (x, y) such that h(e) is highest, s(x) > 0 and s(y) > 0
        2. Look for a meeting which could "cover" this edge: select, according to the distance d(e, r) = |f(e) - f(r)|, the best meeting r which contains x (resp. y), with size < M, such that s(y) > f(r) (resp. s(x) > f(r))
        3. If such a meeting is found, add e to the "cover set" of r; if r's size is more than m, its frequency is set (fixed at its current value f(r)). Otherwise, create a new meeting which contains x and y, whose "cover set" is {e}
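The greedy loop above can be sketched in a few lines. This is a simplified illustration, not the author's implementation: it assumes h(e) = f(e) (the H1 variant of the next slide), F% = 100%, a meeting frequency fixed when the meeting is created (so the parameter m plays no role), and a new meeting's frequency capped by the remaining slack of its two members. It also lets an edge join a meeting that already contains both endpoints at no extra cost. All names are hypothetical.

```python
def cover(edges, total_time, max_size):
    """Greedy cover of a TVSN by meetings (simplified sketch).

    edges: {(x, y): desired contact frequency f(e)}
    total_time: T, the meeting time budget of each person
    max_size: M, the maximum number of attendees in a meeting
    Returns (meetings, uncovered edges).
    """
    slack = {}  # s(x): remaining meeting time of each person
    for x, y in edges:
        slack.setdefault(x, total_time)
        slack.setdefault(y, total_time)
    meetings, uncovered = [], []
    # Assumption: heuristic h(e) = f(e), highest frequency first.
    for (x, y), f in sorted(edges.items(), key=lambda kv: -kv[1]):
        if slack[x] <= 0 or slack[y] <= 0:
            uncovered.append((x, y))
            continue
        best = None
        for r in meetings:
            has_x, has_y = x in r["members"], y in r["members"]
            if not (has_x or has_y):
                continue
            if not (has_x and has_y):
                newcomer = y if has_x else x
                # The meeting must have room and the newcomer enough slack.
                if len(r["members"]) >= max_size or slack[newcomer] <= r["freq"]:
                    continue
            # Keep the meeting whose frequency is closest to f(e).
            if best is None or abs(f - r["freq"]) < abs(f - best["freq"]):
                best = r
        if best is None:
            # Assumption: cap the new frequency by the members' slack.
            freq = min(f, slack[x], slack[y])
            best = {"members": {x, y}, "freq": freq, "cover": []}
            meetings.append(best)
            slack[x] -= freq
            slack[y] -= freq
        else:
            for p in (x, y):
                if p not in best["members"]:
                    best["members"].add(p)
                    slack[p] -= best["freq"]
        best["cover"].append((x, y))
    return meetings, uncovered

# Hypothetical triangle of contacts, each desired 10 times a month.
triangle = {("a", "b"): 10, ("b", "c"): 10, ("a", "c"): 10}
meetings, uncovered = cover(triangle, total_time=100, max_size=3)
print(len(meetings), meetings[0]["members"], uncovered)
```

With meetings of up to three people the triangle is covered by a single meeting; with `max_size=2` the same input needs three one-on-one meetings, which is the few-large versus many-small trade-off in miniature.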
  18. Variants
      - H0 = default algorithm, defined by its choice heuristic h:
        - find the meeting which covers e with maximum frequency. If it exists, return f(e) - f(r), otherwise return 10 x min(f(e), s(x), s(y))
      - H1 = simpler solution: h(e) = f(e)
      - H2 = try to build one complete meeting at a time:
        - preference for edges which are adjacent to the meeting that is being constructed
      - H3 = opposite strategy: avoid picking edges that are adjacent to the current meeting, to favor spread
      - H4 = remove the easy step of post-optimization:
        - fill schedules that still show "free time" with relevant meetings
      - H5 = add another step of post-optimization:
        - 2-opt local optimization, using frequency « swaps » depending on the "utility" of each meeting
      - H6 = remove the early set-up of frequency (based on the m parameter)
      - Also tested, without success: randomization, different aggregation patterns (mean instead of max)
  19. Computer Experiments
      An experiment:
      - Generate a random source network
      - Generate a CMS (using one of the variants)
      - Measure:
        - L: average path length
        - D: average distance
        - B: throughput
      Typical experiment:
      - 100 source network generations (enough to get stable results)
      - 10000 random pairs of nodes -> distribution derived from the source network
      Quite intensive for large networks (100000 edges)
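The sampling of node pairs can be sketched as follows, using the 70% predictive / 30% random mix introduced earlier in the talk. The details are assumptions: predictive pairs are drawn uniformly from the source network's edges (the talk may weight them differently), and the function name, parameters and sample data are hypothetical.

```python
import random

def sample_pairs(edges, nodes, count, predictive_share=0.7, seed=0):
    """Draw `count` node pairs: a share from the source network, the rest at random.

    edges: list of (x, y) contacts of the source network
    nodes: list of all nodes
    Assumption: predictive pairs are chosen uniformly among the edges.
    """
    rng = random.Random(seed)  # fixed seed so that an experiment can be replayed
    pairs = []
    for _ in range(count):
        if rng.random() < predictive_share:
            pairs.append(rng.choice(edges))
        else:
            pairs.append(tuple(rng.sample(nodes, 2)))  # two distinct nodes
    return pairs

# Hypothetical 5-person source network.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
pairs = sample_pairs(edges, nodes, count=1000)
print(len(pairs))
```

The measures L and D would then be averaged over these pairs for each generated network.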
  20. Results (Various Covering Algorithms for the CMS Problem)

                F1                     F2 (20)                F3 (large)
                D      L     B         D      L     B         D      L     B
      H0        10.52  1.48  10.5      7.625  1.41  9.83      10.83  1.59  10.4
      H1        11.07  1.52  10.6      7.93   1.45  9.79      11.43  1.62  10.3
      H2        10.71  1.48  10.5      7.51   1.41  10.1      11.03  1.59  10.4
      H3        10.53  1.48  10.5      7.65   1.41  9.85      10.74  1.57  10.4
      H4        10.54  1.49  10.5      7.68   1.43  9.84      10.86  1.60  10.4
      H5        10.51  1.48  10.5      7.623  1.41  9.83      10.83  1.59  10.4
      H6        10.6   1.45  10.5      7.71   1.38  9.72      10.86  1.54  10.3

      Results (cf. article):
      - H0 > H1
      - Post-optimization is not significant
      - m = M * 70% yields some improvement
      - There is no stable pattern as far as H0/H2/H3 are concerned
  21. Results (meeting size)
      The larger the meeting attendance, the better the latency
      - At the expense of throughput (and feedback)
      - Improvement of loss, larger meeting diameter
  22. Results (meeting frequency)
      Frequent meetings improve latency
      - The loss in Dm is more than compensated by the improvement in individual meeting latency
      - No degradation of bandwidth (small improvement)
      - Small degradation of loss
  23. Part IV: Preliminary findings
  24. Approximate Formula for Latency
      D = [log(Di) / log(Dr)] * R
      - Actually an exact formula for simple cases
      - In the example table that follows: standard deviation less than 10%, average is 100% (of actual value)
      [Figure: chart with legend entries DR, ratio and D*10]
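The formula is simple enough to evaluate directly. The sketch below assumes that Di is the Information Diameter defined earlier in the talk (70% * degree + 30% * N) and that R is the number of meetings per person; Dr is not defined in this transcript, so it is kept as an opaque parameter greater than 1. The function names and the numbers are illustrative only.

```python
import math

def information_diameter(degree, n_people, predictive_share=0.7):
    """Di = 70% * degree + 30% * N, with the mix used in the talk."""
    return predictive_share * degree + (1 - predictive_share) * n_people

def approximate_latency(di, dr, r):
    """D = [log(Di) / log(Dr)] * R.

    Assumption: dr > 1; its exact definition is not given in the transcript.
    """
    return math.log(di) / math.log(dr) * r

# Illustrative values only: Di = 100, Dr = 10, R = 5.
print(approximate_latency(100, 10, 5))
print(information_diameter(degree=10, n_people=100))
```

The ratio of logarithms reads as a number of hops (how many times a reach of Dr must be multiplied to span Di), which R then scales.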
  25. Comments
      - Precision level similar to what is known about the input
      - Works better within a « typical » value domain:
        - meetings with 6-12 people, N big enough, …
      - Precision level similar to the differences between coverage algorithms
        - Hypothesis: the distance between the current algorithm and the optimal solution is much smaller
      - Useful:
        - As a rule of thumb
        - For enterprise simulation (SIFOA), cf. later
      - Warning: "nobody knows" the $ value of latency
        - Which is why it is hard to evaluate the impact of the loss of email services for a few hours, or its real contribution compared to surface mail …
  26. A refresher slide on « small-world networks »
      - Six degrees of separation (Stanley Milgram, 1967) …
      - Means that the diameter grows as log(N)
      - Which should come as no surprise in a random graph …
      - … but social networks are not random! They are clustered
      - Actually, they exhibit this "small-world property" because of the existence of a few "random" edges (the "different" friends)
      - Demonstrated in real life … and reproduced by Duncan Watts
  27. « Small-world » networks and CMS
      - « Small-world » structure, cf. D. Watts:
        - "… networks which displayed the high local clustering of disconnected caves but were connected such that any node could be reached from any other in an average of a few steps."
      - If the CMS is made of small, heavily connected clusters (with a high frequency), adding a few more « transverse » meetings which act as a social binder will increase the overall performance, compared to a homogeneous design strategy.

                   D      L      B
      A = 7.85     11.1   1.6    11.9
      A = 8.8      10.6   1.51   10.5
      A = 9.8      10.4   1.43   9.35
      A = 11.7     10.0   1.37   7.7
      Mix          10.1   1.42   10.3
  28. Another « Small-World » example
      - Here we use a series of high-frequency strategies (cf. previously)
        - raising the frequency reduces latency at the expense of path length
      - A mixed structure gives better results than a homogeneous one
        - Here we see an improvement on all counts
      - Experience: more connected => easier to make meetings work (no surprise here)
      - Conversely: large scope => the performance of the CMS is important!

                   D      L      B
      A = 7.7      8.84   1.59   11.7
      A = 9.7      8.73   1.53   9.55
      A = 12.5     8.37   1.42   7.31
      A = 15.3     8.08   1.35   6.03
      Mix          8.6    1.52   10.3
  29. Part V: Perspectives
  30. Next Steps
      - On this approach …
        - Try better algorithms to get a lower bound
        - More experiments:
          - larger data sets
          - more combinations of cluster rate / distribution
      - Derive practical metrics from graph theory
        - Intermediation centrality (cf. Linton Freeman)
          - « Centrality in valued graphs: A measure of betweenness based on network flow », in Social Networks, vol. 13, 1991
          - Cf. Malcolm Gladwell's "The Tipping Point"
        - Feedback (CMC's « bandwidth »)
      - Better characterization requires better input (qualify the communication needs: processes, priority, etc.)
      - Enterprise simulation
        - Requires a higher level of abstraction (CMS as a channel)
  31. Enterprise Simulation Model
      3 components:
      - BPEM (Business Processes Enterprise Model)
        - Enterprise activity is a set of processes with time-dependent valuation (deadline, TTM, …)
        - Business processes describe the cooperation of multiple units and generate information flows
      - Communication Channels
      - Mixed Monte-Carlo and Evolutionary Simulation (many unknown parameters)
        - Optimization algorithm (e.g. GA) for parameters that are bound to the enterprise strategy (e.g. how to best use the communication channels?)
        - Monte-Carlo simulation for all other parameters (mostly parameters that describe the "environment")
  32. Communication Channels Model
      - Each channel is characterized by the following parameters:
        - Repetition Rate (R) = average number of times a message needs to be sent to be effective (a way to factor in the effects of fidelity/loss)
        - Mutualization (M): average number of receivers of a message
        - Usage (U): number of participants that are busy during the exchange, including those who are not really useful
        - Frequency (F): average frequency of access to this channel
        - Sending/Receiving Speed: processing time as a ratio compared to voice
      - Each information flow is divided into blocks that need to be scheduled
        - macro-scheduling to take occupation rate into account
      - Each block is tailored to the channel that is being used:
        - The length is adjusted according to processing speed
        - Repetition is added as needed (cf. R or hierarchical channel)
        - Two send/receive blocks are scheduled, in a sync/async manner
  33. Conclusion
      - Time-Valued Social Networks
        - Natural and interesting extension
        - Time dimension is a key aspect of efficiency
      - The « Corporate Meeting System » is an interesting object:
        - Mathematical structure (hypergraph) and related tools
        - Key topic of Management Science
      - First contributions:
        - Greedy algorithm to cover a TVSN with a CMS
        - « Small-world structure » -> shows the interest of a hybrid coverage strategy
        - Approximate characterization of latency
  34. Informal Conclusion: CMS
      - Importance of « meeting diameter »
      - Favor latency
        - => small high-frequency clusters
        - Create contact points (ex: cafeteria as a daily routine)
      - The meeting system is a key part of the organization
        - As important as the hierarchical structure (from the information-transfer, time-management and symbolic perspectives)
      - A « meeting system » is efficient but is not flexible
        - One should put boundaries on the total time spent in meetings
        - Latency may and should be evaluated once in a while, using the propagation of new topics as a test measure (a common finding of post-crisis analysis)
      - The topic of performance indicators for a CMS is still open
      [Figure: diagram contrasting "people I meet in a month" with "people I meet often"; the compromise is between often / in depth / the same]
