Big Data at News International
Welcome
Big Data Event - 29/05/2013
Mike Keating: Product Owner
@mikerkeating

Introduction
•  Where to start
•  Data and decisions
•  Our technology choices
•  Lessons

Where to start…
Lots of data, suppliers, technology and teams

Take control, bring digital data into one place
Link data sets and look for new approaches

Make data/information/knowledge/insight available
Build awareness and decision-making capability

Making the basics available via dashboards

Understanding our content and consumption
[Chart: Behaviours Across Website Sections; axes: Visits per Visitor, Views per Visit]

Understanding our content in a social world

Tracking subscription growth across products
[Chart: Product Growth Following Launch; axes: Days Following First Subscription, Subscriptions]

Analysing attributes that indicate churn

Designing products based on patterns in navigation
The iPhone – Octopus Navigation
The Website – Flower Petal Navigation
The iPad Edition – Linear Navigation

Our technology choices – what?
Infrastructure: AWS EC2, S3, RDS, EMR, CloudFormation, Vagrant
Ops: Jenkins, AnthillPro, Maven, Nexus, Zabbix, CloudWatch
Code & Config: Puppet, GitHub
Data Retrieval: Java & Python
Data Pipeline: Java MapReduce, Apache Crunch, Spark, MRUnit, Celery, RabbitMQ, Python, Flume
Data Storage: HDFS, HBase, AWS S3, MySQL, Redis
Data Schema: Avro
Data Access: Python APIs, Tornado, S3, Hive & AWS EMR
Analysis: R, Pandas, Excel -> analyst’s choice
Visualisation: R, D3, Highcharts, Google Charts
Products: APIs, JS+HTML+CSS

Our technology choices – why?
•  Team: Build on the team’s skillsets and knowledge
•  Recruitment: Be conscious of “hire-ability”
•  Open Source: Big wins from usage; great communities; contribute back
•  Versions: Use what works; work with alpha releases, but not as production code
•  Consistency: Try to use what you already use today, e.g. AWS
•  Flexibility: Use a “better” product where it makes sense

Our technology choices – who?
•  Tech Lead: Architecture; Design; Hands-On
•  Delivery Manager: Project Management with Agile
•  Hadoop: Build from Java MapReduce
•  Python: Tornado, Native Python, real-time processing
•  Data Science: Hive, R, Modeling
•  DevOps: AWS, Vagrant, Puppet
•  Experience: Practical experience of Hadoop in production
•  Capability: Ability to learn new tech, design and build
•  Demo: Contributions to projects, working examples

Lessons
•  Building awareness & common knowledge
•  Building on existing teams, systems and their work
•  Looking for extra capability and output
•  Focus on visuals – it needs to be shareable/visible
•  Working with a range of teams to share outputs
•  Making good tech choices

Thanks
Big Data Team – DevOps, Hadoop, Python, UI, Analysts, Test

Technology Teams – Design, Production Ops, Perf Testing, Security, Products, Platforms, Service Desk

Editorial, Marketing, Commercial, Finance Teams

Contact Us
News International Technology @techatni
Mike Keating; Product Owner @mikerkeating
Jobs via http://joinnitech.co.uk/

Mike Keating - News Int - 18th BDL meetup
