Reality Mining, (Big Data) and Urban Sensing

Darshan Santani
ETH Zurich

15 April 2010

Trivia




Taxi Observations* by Location and Booking Frequency of Zone in Singapore [1]
* Sampled dataset (~10,000 observations)

Outline
•    Reality Mining
•    Applications
•    Holy Grail!
•    Challenges
•    Discussion and Q&A

Reality Mining Study [2]

• “ … collection and analysis of machine-sensed
  environmental data pertaining to human social
  behavior, with the goal to identify predictable patterns
  of future human behavior …”

• … extracting information from real-world sensor data …

• Reality Mining vs. Data Mining

• Nathan Eagle, Alex (Sandy) Pentland, MIT, 2005
• 100 mobile phones, 9 months, 45,000 hours of
  communication logs, location and proximity data

Key Results




Social Network Analysis in the wild! [3]

Why do we care?
• Social Science
       – Social Network Analysis
       – Behavioral Modeling
       – Human Mobility


• Systems Research
       – Transportation
       – Environmental Modeling
       – Healthcare

Enabled Applications




Human Mobility Patterns using Call Detail Records (CDRs) [4]

Why do we care (again)?
• Social Science
       – Social Network Analysis
       – Behavioral Modeling
       – Human Mobility


• Systems Research
       – Transportation
       – Environmental Modeling
       – Healthcare

Enabled Applications (contd.)




Environmental Monitoring - Noisetube [5]

Real-time Traffic Monitoring [6]

Mobile Millennium, UC Berkeley [7]

• 100 probe vehicles, carrying GPS-enabled Nokia N95 phones
• San Francisco Bay Area, California
• Virtual Trip Lines (VTL)

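One way to picture a Virtual Trip Line is as a line segment drawn across a road, with a phone reporting only when its track crosses that segment. The sketch below is not the Mobile Millennium implementation. It assumes positions are already projected into a local planar (x, y) frame, and the names `crosses_trip_line` and `_orient` are hypothetical.

```python
from typing import Tuple

# Assumption: (x, y) in a local planar frame, not raw latitude/longitude.
Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: > 0 if a->b->c turns counter-clockwise, < 0 if clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def crosses_trip_line(prev_fix: Point, curr_fix: Point,
                      line_start: Point, line_end: Point) -> bool:
    """Return True if the move prev_fix -> curr_fix strictly crosses the trip line."""
    # The two fixes must lie on opposite sides of the trip line ...
    d1 = _orient(line_start, line_end, prev_fix)
    d2 = _orient(line_start, line_end, curr_fix)
    # ... and the trip line's endpoints on opposite sides of the vehicle's path.
    d3 = _orient(prev_fix, curr_fix, line_start)
    d4 = _orient(prev_fix, curr_fix, line_end)
    return d1 * d2 < 0 and d3 * d4 < 0


if __name__ == "__main__":
    trip_line = ((0.0, -1.0), (0.0, 1.0))  # made-up line across a road
    print(crosses_trip_line((-1.0, 0.0), (1.0, 0.0), *trip_line))
    print(crosses_trip_line((-1.0, 0.0), (-0.5, 0.0), *trip_line))
```

Touching an endpoint or moving exactly along the line is deliberately not counted as a crossing in this sketch.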
Holy Grail!
• Urban Planning and Management

       – Real time city
                • Are the sidewalks along the Bellevue lake good for jogging
                  today, given the air and noise pollution levels?

       – Macroscopic view
                • Is there a need for running supplementary tram services (or sending
                  an additional fleet of taxis) towards the end of a soccer match
                  between Switzerland and Germany?

       – Emergency/Crisis Response
                • 2008 Mumbai terrorist attacks

       – Disease Outbreak
Selective Information Broadcasting [1]

Booking Frequency by second

Holy Grail!
• Urban Planning and Management

       – Real time city
                • Are the sidewalks along the Bellevue lake good for jogging today,
                  given the air and noise pollution levels?

       – Macroscopic vs. Microscopic
                • Is there a need for running supplementary tram services (or sending
                  an additional fleet of taxis) towards the end of a soccer match
                  between Switzerland and Germany?

       – Emergency/Crisis Response
                • 2008 Mumbai terrorist attacks

       – Disease Outbreak/Epidemic Modeling
Challenge #1: Big Data
• How big is big enough?
       – Wal-Mart: 100-400 GB/day of RFID data [8]
       – LHC: 40 TB/day [9]


• Storage is cheap!

• Stream data mining
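As one illustration of stream data mining, reservoir sampling keeps a fixed-size uniform random sample of a stream whose total length is not known in advance, using memory proportional to the sample size only. This is a minimal sketch of the standard "Algorithm R", not a method taken from the talk; the function name and the `seed` parameter are assumptions made for the example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     seed: Optional[int] = None) -> List[T]:
    """Return up to k items drawn uniformly at random from a one-pass stream."""
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Hypothetical stream of one million observation ids, sampled down to 10.
    sample = reservoir_sample(range(1_000_000), k=10, seed=42)
    print(len(sample))
```

The stream is read once and never stored, which is the property that matters when data arrives faster than it can be revisited.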


Challenge #2: Abstraction
• Low level details
       – Parallelism!
       – Task distribution
       – Load balancing
       – Fault tolerance

• Programming Productivity

• Google’s MapReduce
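The programming model behind MapReduce can be sketched in a few lines: the programmer writes only a map function and a reduce function, and the framework takes care of grouping values by key. The single-process toy below is an assumption-laden illustration, not Google's implementation: it has no parallelism, task distribution, load balancing, or fault tolerance, and the names `map_booking`, `reduce_counts`, and `run_mapreduce`, as well as the sample records, are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (zone, timestamp); a hypothetical booking record


def map_booking(record: Record) -> Iterator[Tuple[str, int]]:
    """Map step: emit (zone, 1) for every booking."""
    zone, _timestamp = record
    yield zone, 1


def reduce_counts(zone: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce step: sum all the counts emitted for one zone."""
    return zone, sum(counts)


def run_mapreduce(records: Iterable[Record]) -> Dict[str, int]:
    """Toy driver: map every record, group values by key, then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in map_booking(record):
            groups[key].append(value)  # the "shuffle" phase, done in memory
    return dict(reduce_counts(key, values) for key, values in groups.items())


if __name__ == "__main__":
    bookings = [("zone-A", "08:00"), ("zone-B", "08:01"), ("zone-A", "08:02")]
    print(run_mapreduce(bookings))
```

In a real framework the same two user functions would run across many machines, with the low-level details listed above handled by the runtime.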

Challenge #3: Privacy(!)
• A “new deal” on data? [10]
       – right to possess your data
       – control the use of your data
       – right to distribute or dispose of your data

• How thin or thick is the line between publicity
  and privacy?

• Trivia again!
       – Erica is travelling to Helsinki in May 2010?
       – Florian and Stephan visited Brussels in February 2010?

Big Money!
• IBM Smarter Planet
• HP CeNSE

Q&A
Takeaway Message
The last five years have spurred an industrial revolution in sensor
data. I believe that applying empirical (and later, computational)
methodologies to this real-world data would help us better
understand the underlying cognitive, social, policy and
engineering issues present in our socio-technical systems.
Reality Mining, which sits at the intersection of computer
science, statistics and social science, fits this role nicely.

References
1.      Darshan Santani, Rajesh Krishna Balan, and C. Jason Woodard. Understanding and Improving a
        GPS-based Taxi System. In 6th USENIX International Conference on Mobile
        Systems, Applications, and Services (MobiSys), Breckenridge, Colorado, June 2008.
2.      N. Eagle and A. (Sandy) Pentland. Reality mining: sensing complex social systems. Personal
        and Ubiquitous Computing, 10(4):255–268, 2006.
3.      N. Eagle, A. S. Pentland, and D. Lazer. Inferring friendship network structure by using mobile
        phone data. Proceedings of the National Academy of Sciences, 106(36):15274–15278, 2009.
4.      C. Song, Z. Qu, N. Blumm, and A.-L. Barabási. Limits of Predictability in Human Mobility.
        Science, 327(5968):1018–1021, 2010.
5.      N. Maisonneuve, M. Stevens, M. E. Niessen, and L. Steels. Noisetube: Measuring and mapping
        noise pollution with mobile phones. In I. N. Athanasiadis, P. A. Mitkas, A. E. Rizzoli, and J. M.
        Gómez, editors, ITEE, pages 215–228. Springer, 2009.
6.      J. Yoon, B. Noble, and M. Liu. Surface street traffic estimation. In MobiSys ’07: Proceedings of
        the 5th International Conference on Mobile Systems, Applications and Services, pages 220–232,
        New York, NY, USA, 2007.
7.      J. C. Herrera, D. B. Work, R. Herring, X. J. Ban, and A. M. Bayen. Evaluation of traffic data
        obtained via GPS-enabled mobile phones: the Mobile Century field experiment. Working
        Paper, UCB-ITS-VWP-2009-8, August 2009.
8.      I. Alexander, G. Andrea, M. Florian, and E. Fleisch. Estimating data volumes of RFID-enabled
        supply chains. In AMCIS 2009 Proceedings, page 636, 2009.
9.      CERN LHC Computing. http://public.web.cern.ch/public/en/LHC/Computing-en.html, April 2010.
10.     Alex (Sandy) Pentland. Reality Mining for Companies. In O’Reilly Where 2.0 Conference, May 19–21,
        San Jose, CA, 2009.

Thank you!
Please feel free to contact dsantani@student.ethz.ch for more details.
