Simulate all the things!
Gautier Krings
Svend Vanderveken
BigData.be meetup
January 2018
We should talk about random data
A typical pipeline for a telecom analytics company:
CDRs (caller, callee, time, position…)
Analytics (RFM, network analysis, mobility…)
Recommendations (reporting, campaigns…)
How do I know if my code is robust
to larger sizes?
to errors in data?
to diversity in dimensions?
Also, don’t forget… is coming
Using client data, are you?
The search for “good” fake data
Looks like real data
Parameterizable
Same complexity
Trumania
A free, open-source random data generator
Developed and maintained by
Demo time!
Populations of customers, shops, locations, SIM cards…
Essentially, dimensional data
person = create_population(
    name="person", size=1000000, id_generator=...)
person.create_attribute("NAME", values=...)
person.create_attribute("AGE", generator=...)
+-------------------+-----------------+---------+
|                   | NAME            | AGE     |
|-------------------+-----------------+---------|
| PERSON_0000000000 | Amy Berger      | 28.588  |
| PERSON_0000000001 | Michael Curry   | 28.7499 |
| PERSON_0000000002 | Robert Ramirez  | 35.9242 |
| PERSON_0000000003 | Derek Gonzalez  | 34.7091 |
Generators: any parametric distribution, an empirical distribution taken from real data, or deterministic values
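The three kinds of generators can be illustrated with a minimal standard-library sketch. This is not Trumania's API: the class names Constant, Parametric and Empirical, and their generate(size) method, are hypothetical and only show the idea of swapping one value source for another.

```python
import random


class Constant:
    """Deterministic generator: always yields the same value."""

    def __init__(self, value):
        self.value = value

    def generate(self, size):
        return [self.value] * size


class Parametric:
    """Parametric generator: draws from a distribution method of random.Random."""

    def __init__(self, method, seed, **params):
        self.rng = random.Random(seed)  # seeded so runs are reproducible
        self.method = method
        self.params = params

    def generate(self, size):
        draw = getattr(self.rng, self.method)
        return [draw(**self.params) for _ in range(size)]


class Empirical:
    """Empirical generator: resamples observed values with replacement."""

    def __init__(self, observations, seed):
        self.rng = random.Random(seed)
        self.observations = list(observations)

    def generate(self, size):
        return self.rng.choices(self.observations, k=size)


# Illustrative values only: ages from a normal distribution,
# call durations resampled from a handful of made-up observations.
ages = Parametric("normalvariate", seed=1, mu=35.0, sigma=5.0).generate(4)
greetings = Constant("hello world").generate(2)
durations = Empirical([12, 45, 45, 300], seed=2).generate(5)
```

Because all three expose the same generate(size) method, an attribute or a story field can switch from a fixed value to a fitted distribution without any change to the code that consumes it.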
Story: the dynamic aspect of a scenario, i.e. a sequence of operations that generates logs
example_story = create_story(
    initiating_population=person,
    member_id_field="PERSON_ID", …
)
+-------+-------------------+
|       | PERSON_ID         |
|-------+-------------------|
| 0     | PERSON_0000000000 |
| 1     | PERSON_0000000001 |
| 2     | PERSON_0000000002 |
| 3     | PERSON_0000000003 |
example_story.set_operations(
    clock.timestamp(named_as="TIME"),
    ConstantGenerator(value="hello world").generate(named_as="MESSAGE")
)
+-------------------+---------------------+-------------+
| PERSON_ID         | TIME                | MESSAGE     |
|-------------------+---------------------+-------------|
| PERSON_0000000000 | 2017-01-01 01:14:12 | hello world |
| PERSON_0000000001 | 2017-01-01 01:23:51 | hello world |
| PERSON_0000000002 | 2017-01-01 01:37:11 | hello world |
| PERSON_0000000003 | 2017-01-01 01:33:12 | hello world |
example_story.set_operations(
    clock.timestamp(named_as="TIME"),
    FakerGenerator(method="word").generate(named_as="MESSAGE"),
    person.select_one(named_as="OTHER_PERSON") ...)
+-------------------+---------------------+-------------+-------------------+
| PERSON_ID         | TIME                | MESSAGE     | OTHER_PERSON      |
|-------------------+---------------------+-------------+-------------------|
| PERSON_0000000000 | 2017-01-01 01:14:12 | motorbike   | PERSON_0000000852 |
| PERSON_0000000001 | 2017-01-01 01:23:51 | tree        | PERSON_0000000429 |
| PERSON_0000000002 | 2017-01-01 01:37:11 | Sunday      | PERSON_0000000925 |
| PERSON_0000000003 | 2017-01-01 01:33:12 | table       | PERSON_0000000347 |
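One way to picture how a story turns a list of member ids into log rows is as a chain of operations, each adding one named field to every row. The sketch below uses only the standard library and is an assumption about the general mechanism, not Trumania's implementation: run_story, timestamp, constant and select_one are hypothetical helpers named after the slide snippets.

```python
import random
from datetime import datetime, timedelta


def timestamp(named_as, start, rng):
    """Operation: add a random time within the hour following `start`."""
    def op(rows):
        return [{**row, named_as: start + timedelta(seconds=rng.randrange(3600))}
                for row in rows]
    return op


def constant(named_as, value):
    """Operation: add the same value to every row."""
    def op(rows):
        return [{**row, named_as: value} for row in rows]
    return op


def select_one(named_as, ids, rng):
    """Operation: add one id picked uniformly at random from `ids`."""
    def op(rows):
        return [{**row, named_as: rng.choice(ids)} for row in rows]
    return op


def run_story(member_ids, member_id_field, operations):
    """Start with one row per member, then apply each operation in order."""
    rows = [{member_id_field: member_id} for member_id in member_ids]
    for op in operations:
        rows = op(rows)
    return rows


rng = random.Random(42)
person_ids = [f"PERSON_{i:010d}" for i in range(4)]
logs = run_story(person_ids, "PERSON_ID", [
    timestamp("TIME", datetime(2017, 1, 1, 1), rng),
    constant("MESSAGE", "hello world"),
    select_one("OTHER_PERSON", person_ids, rng),
])
for row in logs:
    print(row)
```

Each operation only sees the rows produced so far, so operations can be added, removed or reordered without touching the others.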
Simulate all the things!
Weighted relationships between elements of your populations: social networks, mobility preferences, hierarchical distribution networks...
example_story.set_operations(
    clock.timestamp(named_as="TIME"),
    FakerGenerator(method="word").generate(named_as="MESSAGE"),
    person.get_relationship("friends").select_one(named_as="OTHER_PERSON"),
    person.lookup(select={"NAME": "EMITTER_NAME"}),
    person.lookup(select={"NAME": "RECEIVER_NAME"}) )
+------------+----------+---------+--------------+---------------+---------------------+
| PERSON_ID  | TIME     | MESSAGE | OTHER_PERSON | EMITTER_NAME  | RECEIVER_NAME       |
|------------+----------+---------+--------------+---------------+---------------------|
| PERSON_000 | 01:14:12 | Become  | PERSON_058   | Ann Cruz      | Victoria Washington |
| PERSON_001 | 01:23:51 | Month   | PERSON_066   | Kimberly S    | Steven Williams     |
| PERSON_002 | 01:37:11 | Blue    | PERSON_013   | Bethany Smith | Frances Davis       |
A relationship linking each person to 20 friends on average shows up like this in the output dataset
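A weighted relationship can be sketched as a mapping from each member to its related members and their weights, with selection proportional to weight. The Relationship class below is a hypothetical standard-library illustration, not Trumania's own class; the ids and weights are made up.

```python
import random
from collections import Counter


class Relationship:
    """Directed, weighted links from one id to other ids."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.edges = {}  # from_id -> (list of to_ids, list of weights)

    def add(self, from_id, to_id, weight):
        to_ids, weights = self.edges.setdefault(from_id, ([], []))
        to_ids.append(to_id)
        weights.append(weight)

    def select_one(self, from_id):
        """Pick one related id, with probability proportional to its weight."""
        to_ids, weights = self.edges[from_id]
        return self.rng.choices(to_ids, weights=weights, k=1)[0]


friends = Relationship(seed=7)
friends.add("PERSON_000", "PERSON_058", weight=9.0)  # close friend
friends.add("PERSON_000", "PERSON_013", weight=1.0)  # acquaintance

# Over many draws, the heavier link is selected far more often.
picks = Counter(friends.select_one("PERSON_000") for _ in range(1000))
print(picks)
```

Selecting through a relationship rather than from the whole population is what makes the generated logs carry a network structure instead of uniformly random pairs.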
We can add temporal patterns for user activity, shop hours...
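A temporal pattern can be approximated by giving each hour of the day an activity weight and sampling event times accordingly. The sketch below is a standard-library illustration under that assumption, not Trumania's clock or timer implementation; sample_timestamps and the shop-hours profile are hypothetical.

```python
import random
from datetime import datetime, timedelta


def sample_timestamps(day, hourly_weights, n, seed):
    """Draw n event times on `day`, with hours chosen proportionally to weight."""
    rng = random.Random(seed)
    hours = rng.choices(range(24), weights=hourly_weights, k=n)
    return sorted(day + timedelta(hours=h, seconds=rng.randrange(3600))
                  for h in hours)


# Hypothetical shop-hours profile: closed before 09:00 and from 18:00 on.
shop_hours = [1.0 if 9 <= hour < 18 else 0.0 for hour in range(24)]
events = sample_timestamps(datetime(2017, 1, 1), shop_hours, n=1000, seed=3)
print(events[0], events[-1])
```

Replacing the flat profile with a peaked one (for example higher weights around lunch time) changes the daily shape of the logs without changing anything else in the scenario.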
github.com/RealImpactAnalytics/trumania
trumania.slack.com
gautier@jetpack.ai
Thank you
We’re hiring!
info@realimpactanalytics.com
Svend Vanderveken
http://svend.kelesia.com
@sv3ndk

Trumania: generate all the things!