22nd June, 2018, 7th Euro-African Conference on Finance and Economics
Big Data Projects: Unknowns, Estimates and Returns
Rim Moussa
ENI-Carthage
University of Carthage
LaTICE Lab.
7th Euro-African Conference on Finance and Economics @ Beït-al-Hekma, Carthage
22nd of June, 2018
Outline
●Big Data: the 5 V's
●Overview of some Big Data Vertical Markets
»Web (Google)
»Social Networks (Facebook)
»Maritime Trajectory (Marine Traffic)
●Big Data Projects
»Cost Estimates?
»Causes of Failure?
»“High qualifications” Pattern
●DEBS'2018 Grand Challenge
●Conclusions
Big Data
↬ The 5 V's
●Volume
»Data at rest: historical data
»Volume refers to the amount of data (terabytes to petabytes),
»The challenge is data processing at scale.
●Velocity
»Data in motion
»Velocity refers to the speed at which new data is generated,
»The challenge is to integrate and analyze data while it is being generated.
●Variety
»Data in many forms
»Variety refers to different types of data, e.g. structured (relational data), semi-structured (XML, JSON, BSON), unstructured, multimedia,
»The challenge is processing different types of data.
●Veracity
»Veracity refers to the messiness or trustworthiness of the data,
»The challenge is to integrate data sources of uncertain quality.
●Value
»Value refers to our ability to turn data into value.
»Investments are big (infrastructure, experts, software development, ...), so the returned insights must be valuable.
Big Data
↬ Information Retrieval on Web Data
●Crawling, Indexing, Information Retrieval
●The 5 V's
»Volume
●How big is the web?
●Google: the indexed Web contains at least ~45 billion pages (Saturday, 16 June, 2018).
●http://www.worldwidewebsize.com/
»Velocity
●Real-time data: web page edits, new web content, ...
»Variety
●Web docs (.html), text docs (.doc, .pdf), images, videos, news, ...
»Veracity
●To investigate
»Value
●To investigate
Big Data
↬ Google vs Bing
Big Data
↬ Google Business Model
●Leader in Algorithmic Search Technology
●Google Revenue Equation
» Revenue = Amount of Time on the Web
» websites hold a Google Ad slot
●Hidden revenue business model
»Keeps users out of the equation, so they don't pay for the service or product offered,
●The revenue streams come from advertising money spent by businesses bidding on keywords
●In 2017, over 90 billion dollars, about 86% of Google's revenue, came from advertising
●Google AdWords and Google AdSense
●A win-win-win business model
»CPC (cost per click)
»CPM (cost per mille, i.e. cost per thousand impressions)
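The two pricing schemes named above can be compared with a little arithmetic. The sketch below is illustrative only: the campaign figures (impressions, click-through rate, prices) are hypothetical and do not come from the slides or from Google.

```python
def cpc_cost(clicks: int, cost_per_click: float) -> float:
    """CPC: the advertiser pays for each click on the ad."""
    return clicks * cost_per_click


def cpm_cost(impressions: int, cost_per_mille: float) -> float:
    """CPM: the advertiser pays per thousand impressions (displays) of the ad."""
    return impressions / 1000 * cost_per_mille


# Hypothetical campaign figures, chosen only for illustration.
impressions = 200_000
click_through_rate = 0.01  # assumed: 1% of impressions lead to a click
clicks = round(impressions * click_through_rate)

print("CPC campaign cost:", cpc_cost(clicks, cost_per_click=0.5))
print("CPM campaign cost:", cpm_cost(impressions, cost_per_mille=4.0))
```

Which scheme is cheaper for the advertiser depends on the click-through rate: under CPC the cost follows the clicks, under CPM it follows the displays.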
Big Data
↬ Social Networks
●Facebook companies
»Facebook Payments Inc.: lets Facebook generate revenue through its payments business.
»Atlas: ad-serving and measurement platform, offering services to advertisers and agencies.
»Instagram: Media Sharing Platform.
»Onavo: Mobile utility application.
»Parse: back-end infrastructure provider for mobile applications.
»Moves: Exercise (steps) tracking application.
»Oculus: Virtual reality technology.
»LiveRail: Publisher Monetization Platform.
»WhatsApp: Instant Messaging Client.
»Masquerade: Visual Filters mobile application.
Big Data
↬ Facebook Ads on Users' Timelines
Big Data
↬ Twitter
Big Data
↬ Trajectory Maritime Data
Snapshot of vessels tracked by MarineTraffic on 22nd of June, 2018 (8am GMT+1)
Big Data
↬ Flights Data
Snapshot of flights tracked by flightradar on 22nd of June, 2018 (8am GMT+1)
Big Data
↬ The 5 V's by example: Trajectory Maritime Data
●The Danish Maritime Authority (DMA) makes historical AIS data (2006 to 2018) available to anyone interested,
»1.8 TB
Big Data
↬ Trajectory Maritime Data: Data Pricing
1000 Danish kroner = 134.2 euros (on 22nd of June 2018)
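A price quoted in Danish kroner can be converted with the rate given on this slide. This is a minimal sketch: the function name and the 5000 DKK amount are hypothetical, and only the 134.2 euros per 1000 kroner rate comes from the slide.

```python
# Rate quoted on the slide for 22nd of June 2018: 1000 DKK = 134.2 EUR.
DKK_TO_EUR = 134.2 / 1000


def dkk_to_eur(amount_dkk: float) -> float:
    """Convert Danish kroner to euros at the quoted rate, rounded to cents."""
    return round(amount_dkk * DKK_TO_EUR, 2)


# Hypothetical amount, for illustration only (not taken from any price list).
print(dkk_to_eur(5000), "EUR")
```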
Big Data
↬ MarineTraffic Online Services
prices on 1st June, 2018
Big Data
↬ Maritime Traffic: New Agenda
●Autonomous Vessels
●Smart Vessels
●Connected Vessels
●Increase
»Maritime surveillance
»Safety
»Security
»Economy
●Optimum route planning
Data Generation & Consumption Models
●Old Model
»Few companies (producers) generate data; all others consume data
●New Model
»All of us generate data, and all of us consume data
If you aren’t paying for it, you’re the product!
Big Data Project Cost Estimation
●Hardware cost
●Software cost
●Human Resources cost
»Hardware technicians
»Software developers
»Domain experts
»Decision makers
»Researchers
●...
Hardware Cost
●Hardware Trends
»Large hard drive capacities
»High computing capacities
»High-speed networks
»I/O bottleneck
●I/O bottleneck
»Hard drives are like bottles of different sizes that all have the same throughput
●Solution:
»Aggregate read/write throughputs
»Read from/write to multiple hard drives in parallel
●RAID systems by Patterson, Gibson and Katz @ UC Berkeley
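The effect of aggregating throughput across drives can be estimated with a back-of-the-envelope calculation. The sketch below assumes ideal linear scaling (data striped evenly, no controller or network bottleneck) and a per-drive sequential throughput of 150 MB/s, which is an illustrative assumption and not a figure from the slides; the 1.8 TB size is the DMA AIS archive quoted earlier in the deck.

```python
def scan_time_hours(data_tb: float, drive_mb_per_s: float, n_drives: int) -> float:
    """Idealized time to read data_tb terabytes striped evenly over n_drives."""
    data_mb = data_tb * 1_000_000  # decimal units: 1 TB = 10^6 MB
    aggregate_mb_per_s = drive_mb_per_s * n_drives  # throughputs add up
    return data_mb / aggregate_mb_per_s / 3600


# 150 MB/s per drive is an assumed value, used only for illustration.
for n in (1, 4, 16):
    hours = scan_time_hours(1.8, 150.0, n)
    print(f"{n:2d} drive(s): {hours:.2f} h")
```

Under these assumptions the scan time shrinks in proportion to the number of drives; real arrays fall short of this bound once a shared bus, controller or network saturates.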
New Hardware Architecture for Big Data
●Scale-out: Horizontal Scaling
»New software,
»High power consumption,
»Cooling systems, ...
»But high cost → let's migrate data/software to the cloud
Big Data Technologies
↬ Landscape
Source: https://chiefmartec.com/2017/05/marketing-techniology-landscape-supergraphic-2017/
Retrieved: 1st of June 2018
Why do some Big Data Projects Fail?
●Unknown Unknowns
»"There are known knowns; there are things we know we know. ... There are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know." D. Rumsfeld (US Defense Secretary, 2002)
●"The data will speak for themselves"
»Business questions are undefined or imprecise
Why do some Big Data Projects Fail?
●Data quality and provenance,
●Hardware Cost
●Complex architectures
●Cost of hiring skilled teams,
»Expert software developers (Graduate studies in CS)
»Business experts in each vertical market
●Immature technologies
●Cost management of systems in the cloud
Qualifications
↬ Example: DEBS'2018 Grand Challenge
●Data
»Static information
●Ports' locations around the world.
»Historical data of data streams
●Each ship sends a tuple according to its behaviour, based on the AIS specifications
●Queries
»Q1: Predicting destinations of ships
»Q2: Predicting arrival times of ships
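To make Q2 concrete, here is a naive baseline: divide the great-circle distance to a candidate destination port by the ship's current speed. This is only a sketch under stated assumptions and not the method used in the talk or required by the challenge; the function names and coordinates are hypothetical, and the speed is assumed to be reported in knots. It does not address Q1, since the destination port is taken as given.

```python
import math

EARTH_RADIUS_KM = 6371.0        # mean Earth radius
KM_PER_NAUTICAL_MILE = 1.852    # 1 knot = 1 nautical mile per hour


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def eta_hours(lat: float, lon: float, port_lat: float, port_lon: float,
              speed_knots: float) -> float:
    """Naive arrival-time estimate: straight-line distance / current speed."""
    if speed_knots <= 0:
        raise ValueError("speed must be positive to estimate an arrival time")
    distance_km = haversine_km(lat, lon, port_lat, port_lon)
    return distance_km / (speed_knots * KM_PER_NAUTICAL_MILE)


# Hypothetical ship position and port location, for illustration only.
print(f"{eta_hours(36.0, 10.0, 37.0, 10.0, speed_knots=12.0):.2f} h")
```

A real solution has to cope with coastlines, shipping lanes and speed changes, which is why historical trip patterns matter for this kind of prediction.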
DEBS'2018 Grand Challenge
↬ Computing Vessels' Trips Patterns (by R. Moussa, 2018)
DEBS'2018 Grand Challenge
↬ Real-time prediction of vessels' future locations
(by R. Moussa, 2018)
[Figure: predicted vessel locations over time, starting from the departure port]
DEBS'2018 Grand Challenge
↬ Solution's Engineering (graduate classes)
●Theoretical knowledge
»Advanced System Architectures
»Distributed processing
»Advanced Algorithmics
»Spatial data processing
»Information Retrieval
»Engineering a solution
●Practical Knowledge
»Java programming
»Big Data Frameworks
●Apache Spark
Conclusions
●There is still much room for innovation and improvement in several directions, including architecture, applications and systems
Thank you for your Attention
Q & A
About Me
Rim Moussa is a tenured associate professor at the University of Carthage and a researcher at the LaTICE lab. She is also habilitated as an associate professor in Computer Science Engineering by the French National Council of Universities. She received her M.Sc. and Ph.D. in Computer Science (Scalable and Distributed Data Management Systems) from Université Paris IX Dauphine (France) under the supervision of Pr. Witold LITWIN.
She teaches both undergraduate and graduate courses on operating systems, distributed data management systems, agile methods for software engineering, business intelligence fundamentals and practices (Data Warehousing and OLAP), NoSQL databases, spatial databases, and Cloud Computing & High Performance Computing (Big Data, Apache Hadoop, Apache Spark, ...).
She has participated in multiple R&D projects (SDDS funded by Microsoft CERIA, HA Grid CERN, ICONS, GORDA, WebArchive, DataScale PIA Inria). Her current research interests include Scalable and Distributed Data Management systems, Multidimensional data modeling and querying, Data warehousing and OLAP, Smart Cities, Big Data Architectures at scale and Spatial Computing at scale.

Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 

Big Data Projects

  • 1. 22nd June, 2018 7th Euro-African Conference on Finance and Economics 1 Big Data Projects: Unknowns, Estimates and Returns Rim Moussa ENI-Carthage University of Carthage LaTICE Lab. 7th Euro-African Conference on Finance and Economics @ Beït-al-Hekma, Carthage 22nd of June, 2018
  • 2. Outline
    ●Big Data: the 5 V's
    ●Overview of some Big Data Vertical Markets
      »Web (Google)
      »Social Networks (Facebook)
      »Maritime Trajectory (Marine Traffic)
    ●Big Data Projects
      »Costs Estimates?
      »Failures' causes?
      »“High qualifications” Pattern
    ●DEBS'2018 Grand Challenge
    ●Conclusions
  • 3. Big Data ↬ The 5 V's
    ●Volume
      »Data at rest: historical data
      »Volume refers to the amount of data (terabytes to petabytes),
      »The challenge is data processing at scale.
    ●Velocity
      »Data-in-motion
      »Velocity refers to the speed at which new data is generated,
      »The challenge is to integrate and analyze data while it is being generated.
    ●Variety
      »Data in many forms
      »Variety refers to different types of data; e.g. structured (relational data), semi-structured (XML, JSON, BSON), unstructured, multimedia,
      »The challenge is processing different types of data.
  • 4. Big Data ↬ The 5 V's
    ●Veracity
      »Veracity refers to the messiness or trustworthiness of the data,
      »The challenge is to integrate data sources of uncertain quality.
    ●Value
      »Value refers to our ability to turn data into value.
      »Investments are big (infrastructure, experts, software development...), so the returned insights must be valuable.
  • 5. Big Data ↬ Information Retrieval on Web Data
    ●Crawling, Indexing, Information Retrieval
    ●The 5 V's
      »Volume
        ●How big is the web?
        ●Google: the Indexed Web contains at least ~45 billion pages (Saturday, 16 June, 2018).
        ●http://www.worldwidewebsize.com/
      »Velocity
        ●Real-time data: web page edits, new web content, ...
      »Variety
        ●Web docs (.html), Text docs (.doc, .pdf), Images, Videos, News
      »Veracity
        ●To investigate
      »Value
        ●To investigate
  • 6. Big Data ↬ Google vs Bing
  • 7. Big Data ↬ Google Business Model
    ●Leader in Algorithmic Search Technology
    ●Google Revenue Equation
      »Revenue = Amount of Time on the Web
      »Websites hold a Google Ad slot
    ●Hidden revenue business model
      »Keeps users out of the equation, so they don't pay for the service or product offered,
    ●The revenue streams come from advertising money spent by businesses bidding on keywords
    ●As of 2017, advertising brought in over 90 billion dollars, about 86% of Google's revenue
    ●Google AdWords and Google AdSense
    ●A win-win-win business model
      »CPC (cost per click)
      »CPM (cost per mille, i.e. cost per thousand impressions)
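The CPC and CPM pricing models named on this slide, and the 86% advertising share, can be made concrete with a little arithmetic. The sketch below is illustrative only: the campaign numbers (impressions, clicks, prices) and the function names are hypothetical assumptions, not taken from the talk. Only the 90 billion dollar and 86% figures come from the slide.

```python
def cpc_cost(clicks: int, cost_per_click: float) -> float:
    """CPC: the advertiser pays for each click on the ad."""
    return clicks * cost_per_click


def cpm_cost(impressions: int, cost_per_mille: float) -> float:
    """CPM: the advertiser pays per thousand ad impressions."""
    return impressions / 1000 * cost_per_mille


def total_from_share(part: float, share: float) -> float:
    """Total implied when `part` represents the fraction `share` of it."""
    return part / share


if __name__ == "__main__":
    # Hypothetical campaign: 100,000 impressions that produce 2,000 clicks.
    impressions, clicks = 100_000, 2_000
    print("CPC bill:", cpc_cost(clicks, 0.50))       # hypothetical 0.50 per click
    print("CPM bill:", cpm_cost(impressions, 4.00))  # hypothetical 4.00 per thousand
    # Slide figures: about 90 billion dollars is about 86% of revenue.
    total = total_from_share(90e9, 0.86)
    print("Implied total revenue (billions):", round(total / 1e9, 1))
```

Under these assumed prices the same campaign costs more under CPC than under CPM; which model is cheaper for an advertiser depends on the click-through rate.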
  • 8. Big Data ↬ Social Networks
    ●Facebook companies
      »Facebook Payments Inc.: to let Facebook generate revenue through payment business.
      »Atlas: ad-serving and measurement platform, offering services to advertisers and agencies.
      »Instagram: Media Sharing Platform.
      »Onavo: Mobile utility application.
      »Parse: back end infrastructure provider for mobile applications.
      »Moves: Exercise (steps) tracking application.
      »Oculus: Virtual reality technology.
      »LiveRail: Publisher Monetization Platform.
      »WhatsApp: Instant Messaging Client.
      »Masquerade: Visual Filters mobile application.
  • 9. Big Data ↬ Facebook Ads on Users' Timelines
  • 10. Big Data ↬ Twitter
  • 11. Big Data ↬ Trajectory Maritime Data
    Snapshot of vessels tracked by MarineTraffic on 22nd of June, 2018 (8am GMT+1)
  • 12. Big Data ↬ Flights Data
    Snapshot of flights tracked by flightradar on 22nd of June, 2018 (8am GMT+1)
  • 13. Big Data ↬ The 5 V's by example: Trajectory Maritime Data
    ●The Danish Maritime Authority (DMA) makes historical AIS data (2006 to 2018) available to anyone interested,
      »1.8 TB
  • 14. Big Data ↬ Trajectory Maritime Data: Data Pricing
    1000 Danish Krone = 134.2 euros (on 22nd of June 2018)
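The exchange rate quoted on this slide can be turned into a small conversion helper for reading a price list in Danish kroner. This is a minimal sketch: the function name and the sample amounts are hypothetical, and only the rate (1000 DKK = 134.2 EUR on 22nd of June 2018) comes from the slide.

```python
# Rate quoted on the slide for 22nd of June 2018: 1000 DKK = 134.2 EUR.
DKK_TO_EUR = 134.2 / 1000


def dkk_to_eur(amount_dkk: float, rate: float = DKK_TO_EUR) -> float:
    """Convert an amount in Danish kroner to euros, rounded to cents."""
    return round(amount_dkk * rate, 2)


if __name__ == "__main__":
    # Hypothetical price points, used only to show the conversion.
    for price_dkk in (1000, 2500, 10_000):
        print(f"{price_dkk} DKK = {dkk_to_eur(price_dkk)} EUR")
```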
  • 15. Big Data ↬ MarineTraffic Online Services prices on 1st June, 2018
  • 16. Big Data ↬ MarineTraffic Online Services prices on 1st June, 2018
  • 17. Big Data ↬ MarineTraffic Online Services prices on 1st June, 2018
  • 18. Big Data ↬ Maritime Traffic: New Agenda
    ●Autonomous Vessels
    ●Smart Vessels
    ●Connected Vessels
    ●Increase
      »Maritime surveillance
      »Safety
      »Security
      »Economy
    ●Optimum route planning
  • 19. Big Data ↬ Maritime Traffic: New Agenda
  • 20. Data Generation & Consumption Models
    ●Old Model
      »Few companies (producers) are generating data, all others are consuming data
    ●New Model
      »All of us are generating data, and all of us are consuming data
    If you aren’t paying for it, you’re the product!
  • 21. Big Data Project Cost Estimation
    ●Hardware cost
    ●Software cost
    ●Human resources cost
      »Hardware technicians
      »Software developers
      »Domain experts
      »Decision makers
      »Researchers
    ●...
  • 22. Hardware Cost
    ●Hardware Trends
      »Large hard drive capacities
      »High computing capacities
      »High speed networks
      »I/O bottleneck
    ●I/O bottleneck
      »Hard drives are like bottles of different sizes with the same neck: capacities differ, throughput does not
    ●Solution:
      »Aggregate RW throughputs
      »Read/Write from/into multiple hard drives
    ●RAID systems by Patterson, Gibson and Katz @ UC Berkeley
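The "aggregate RW throughputs" idea can be illustrated with a back-of-the-envelope estimate of how long a full scan takes when data is striped over several drives read in parallel. This is an idealized sketch, not a RAID model: it ignores controller, network and parity overheads, the 150 MB/s per-drive throughput is an assumed figure, and the function name is hypothetical. The 1.8 TB size is the AIS dataset size quoted on slide 13.

```python
def scan_time_seconds(data_bytes: float, disk_bytes_per_s: float, n_disks: int = 1) -> float:
    """Ideal time to read data striped evenly over n_disks read in parallel."""
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    # Aggregate throughput is assumed to scale linearly with the number of drives.
    return data_bytes / (disk_bytes_per_s * n_disks)


if __name__ == "__main__":
    dataset = 1.8e12   # 1.8 TB, the size quoted on slide 13
    per_disk = 150e6   # assumed 150 MB/s sequential throughput per drive
    for n in (1, 4, 10):
        hours = scan_time_seconds(dataset, per_disk, n) / 3600
        print(f"{n:>2} disk(s): {hours:.2f} h")
```

A bigger drive with the same throughput does not shorten the scan; only adding drives that are read in parallel does.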
  • 23. New Hardware Architecture for Big Data
    Scale-out: Horizontal Scaling
      »But, high cost → let's migrate data/software to the cloud, new software, high power consumption, cooling systems ...
  • 24. Big Data Technologies ↬ Landscape
    Source: https://chiefmartec.com/2017/05/marketing-techniology-landscape-supergraphic-2017/
    Retrieved: 1st of June 2018
  • 25. Why do some Big Data Projects Fail?
    ●Unknown Unknowns
      »"There are known knowns; there are things we know we know. ... There are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know." D. Rumsfeld (US Defense Secretary, 2002)
    ●"Data will speak for themselves"
      »Business questions are undefined or imprecise
  • 26. Why do some Big Data Projects Fail?
    ●Data quality and provenance
    ●Hardware cost
    ●Complex architectures
    ●Cost of hiring skilled teams
      »Expert software developers (Graduate studies in CS)
      »Business experts in each vertical market
    ●Immature technologies
    ●Cost management of systems in the cloud
  • 27. Qualifications ↬ Example: DEBS'2018 Grand Challenge
    ●Data
      »Static information
        ●Ports' locations around the world.
      »Historical data of data streams
        ●Each ship sends a tuple according to its behaviour based on the AIS specifications
    ●Queries
      »Q1: Predicting destinations of ships
      »Q2: Predicting arrival times of ships
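To give a feel for query Q2, here is a deliberately naive baseline: if the destination port is already known and the ship sails a great-circle route at constant speed, the arrival time follows from distance divided by speed. This is an assumption-laden sketch and not the solution presented in the talk; the coordinates, the function names and the constant-speed assumption are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
KNOT_KMH = 1.852           # one knot in km/h


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def eta_hours(lat: float, lon: float, speed_knots: float,
              port_lat: float, port_lon: float) -> float:
    """Naive ETA: remaining great-circle distance at the current speed."""
    if speed_knots <= 0:
        return math.inf  # a stopped ship never arrives under this model
    return haversine_km(lat, lon, port_lat, port_lon) / (speed_knots * KNOT_KMH)


if __name__ == "__main__":
    # Hypothetical ship position and destination port, one degree apart.
    print(f"{eta_hours(0.0, 0.0, 10.0, 0.0, 1.0):.2f} h")
```

Real trajectories follow shipping lanes and avoid land, so a baseline like this mainly shows why historical trip patterns are needed for usable predictions.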
  • 28. DEBS'2018 Grand Challenge ↬ Computing Vessels' Trips Patterns (by R. Moussa, 2018)
  • 29. DEBS'2018 Grand Challenge ↬ Real-time prediction of vessels' future locations (by R. Moussa, 2018) (figure label: Departure Port)
  • 30. DEBS'2018 Grand Challenge ↬ Real-time prediction of vessels' future locations (by R. Moussa, 2018) (figure axis: time)
  • 31. DEBS'2018 Grand Challenge ↬ Real-time prediction of vessels' future locations (by R. Moussa, 2018) (figure axis: time)
  • 32. DEBS'2018 Grand Challenge ↬ Solution's Engineering (graduate classes)
    ●Theoretical knowledge
      »Advanced System Architectures
      »Distributed processing
      »Advanced Algorithmics
      »Spatial data processing
      »Information Retrieval
      »Engineering a solution
    ●Practical Knowledge
      »Java programming
      »Big Data Frameworks
        ●Apache Spark
  • 33. Conclusions
    ●There is still considerable room for innovation and improvement in several directions, including architecture, applications and systems
  • 34. Thank you for your Attention
    Q & A
    Big Data Projects: Unknowns, Estimates and Returns
    Rim Moussa
    22nd of June, 2018
    7th Euro-African Conference on Finance and Economics @ Beït-al-Hekma, Carthage
  • 35. About Me
    Rim Moussa is a tenured associate professor at the University of Carthage and a researcher at the LaTICE lab. She is also habilitated as associate professor in Computer Science Engineering by the French National Council of Universities. She received her M.Sc. and Ph.D. in Computer Science (Scalable and Distributed Data Management Systems) from Université Paris IX Dauphine (France) under the supervision of Pr. Witold LITWIN. She teaches both undergraduate and graduate courses on operating systems, distributed data management systems, agile methods for software engineering, business intelligence fundamentals and practices (Data Warehousing and OLAP), NoSQL databases, Spatial databases, and Cloud Computing & High Performance Computing (Big Data, Apache Hadoop, Apache Spark...). She has participated in multiple R&D projects (SDDS fund by Microsoft CERIA, HA Grid CERN, ICONS, GORDA, WebArchive, DataScale PIA Inria). Her current research interests include Scalable and Distributed Data Management systems, Multidimensional data modeling and querying, Data warehousing and OLAP, Smart Cities, Big Data Architectures at scale and Spatial Computing at scale.