SlideShare une entreprise Scribd logo
1  sur  42
Télécharger pour lire hors ligne
Open-source software
development frameworks
Open-source software dev frameworks




           (and many more...)
Every platform needs
open-source,
code-based dev frameworks
•   Avoid repeat work
•   Remain 100% customizable
•   Escape vendor lock-in
But for data?
  Nothing.
Data work today
•   Proprietary / GUI / No framework
•   Not reusable
•   Little collaboration
•   Too many errors, too slow to fix
an open source dev framework for data
“Mortar takes something complex
and makes it simple and intuitive.”
     —Jon Coveney, Twitter
What big data system should I use? Hadoop, HPCC, Disco, Storm…
What Hadoop distro should I use?
How much will this cost? Should I do it some other way?
How long is this going to take to learn? Should I do it some other way?
How many machines should I buy?
Can I run in the cloud?
How should I interact with it? Pig, Hive, Cascading, Scalding, Cascalog


                               Weeks
Can I use libraries I need?
How do I install everything?
How can my team share code?
How can I reuse code?
Is my workload typical?
How can I safely deploy?
How do I know if what I've written is correct?
Are there any libraries I could start with?
Can I connect to my key-value store?
Can I do machine learning in Hadoop?
Is it secure?
What if I need help?
More Weeks
Even More Weeks
—Alan Gates, Hortonworks co-founder
 OPEN SOURCED PIG


“Our focus in designing Pig has always been to
make Hadoop easy...

Mortar's approach is right on—they extend our
quick start and ease of use focuses with pre-
built Hadoop clusters, clear examples, code
organization templates, and github for social
sharing of the code.”
Can Mortar help you?
Mortar is for analyzing lots of data in AWS.
Who is Mortar for?
Mortar serves companies of all sizes from any
industry.
—Dwight Merriman
 FOUNDER OF 10GEN (MAKER OF MONGODB), DOUBLECLICK (ACQ.
 GOOGLE), SHOPWIKI, BUSINESS INSIDER, GILT GROUPE


“...Mortar fits right in with our vision of the
future...
With this exciting launch, MongoDB users can
now also seamlessly use Mortar.”
By and for engineers
 and data scientists
> gem install mortar
> mortar new my_project
> git clone your_project
> mortar run your_project
Pig is easy to learn
(and we’ve made it easier)
Illustrate is awesome
•   Find your mistakes
•   Understand code before collaborating
•   Automated tests: a way to test every condition
Hadoop & Python are powerful data science tools
...but they haven’t worked together before.
Now you can use Hadoop & real Python on Mortar
What you just saw
•   Installed Mortar
•   Made a new project
•   Cloned a project
•   Ran the project
•   Illustrated project
•   Use Python and other libraries on Hadoop
2 options for using Mortar:

- Git Projects: modularity, testability, code
 sharing, local dev, and revision control.

- Web Projects: zero install, in the browser
One-hour challenge
•   Use your browser
•   Minutes to connect data
•   Productive in one hour
How does Mortar fit with other
As a good citizen, Mortar has a rich API
How about speed?
Full speed, directly on Hadoop
Mortar revolutionizes
your data pipeline.
•   Easy start
•   Keeps you productive
•   Collaborate with data
•   No lock-in
•   Easy to budget
Tiers
•   Free | Service use unlimited | 10 node-hours
•   Pay as you Go | $0.89/node-hour | support
•   Enterprise | $3,000/month | $0.69/node-hour | live
    support
mortardata.com / @mortardata

Contenu connexe

Tendances

Recommender Trends 2014
Recommender Trends 2014Recommender Trends 2014
Recommender Trends 2014Torben Brodt
 
Lightning talk: building a cloud of fares
Lightning talk: building a cloud of faresLightning talk: building a cloud of fares
Lightning talk: building a cloud of faresRalph Ligtenberg
 
M gray ands_ttt2_perth_pawsey_training_dsw2018
M gray ands_ttt2_perth_pawsey_training_dsw2018M gray ands_ttt2_perth_pawsey_training_dsw2018
M gray ands_ttt2_perth_pawsey_training_dsw2018ARDC
 
LogDNA and CloudFoundry Webinar: Open Ecosystems, Interoperability + Multi-Cl...
LogDNA and CloudFoundry Webinar: Open Ecosystems, Interoperability + Multi-Cl...LogDNA and CloudFoundry Webinar: Open Ecosystems, Interoperability + Multi-Cl...
LogDNA and CloudFoundry Webinar: Open Ecosystems, Interoperability + Multi-Cl...LogDNA
 
Google Cloud Platform (GCP)
Google Cloud Platform (GCP)Google Cloud Platform (GCP)
Google Cloud Platform (GCP)Chetan Sharma
 
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016Chris Jang
 

Tendances (11)

Recommender Trends 2014
Recommender Trends 2014Recommender Trends 2014
Recommender Trends 2014
 
Lightning talk: building a cloud of fares
Lightning talk: building a cloud of faresLightning talk: building a cloud of fares
Lightning talk: building a cloud of fares
 
Introduction to Azure HDInsight
Introduction to Azure HDInsightIntroduction to Azure HDInsight
Introduction to Azure HDInsight
 
GigaOM 2013 highlights
GigaOM 2013 highlightsGigaOM 2013 highlights
GigaOM 2013 highlights
 
M gray ands_ttt2_perth_pawsey_training_dsw2018
M gray ands_ttt2_perth_pawsey_training_dsw2018M gray ands_ttt2_perth_pawsey_training_dsw2018
M gray ands_ttt2_perth_pawsey_training_dsw2018
 
JBCN barcelona 2017 kappa architecture 2.0
JBCN barcelona 2017 kappa architecture 2.0JBCN barcelona 2017 kappa architecture 2.0
JBCN barcelona 2017 kappa architecture 2.0
 
LogDNA and CloudFoundry Webinar: Open Ecosystems, Interoperability + Multi-Cl...
LogDNA and CloudFoundry Webinar: Open Ecosystems, Interoperability + Multi-Cl...LogDNA and CloudFoundry Webinar: Open Ecosystems, Interoperability + Multi-Cl...
LogDNA and CloudFoundry Webinar: Open Ecosystems, Interoperability + Multi-Cl...
 
OTTO-multicloud
OTTO-multicloudOTTO-multicloud
OTTO-multicloud
 
Google Cloud Platform (GCP)
Google Cloud Platform (GCP)Google Cloud Platform (GCP)
Google Cloud Platform (GCP)
 
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
 
Google cloud
Google cloudGoogle cloud
Google cloud
 

Similaire à Open-source dev framework makes data work simple

How Open Source / Open Technology Could Help On Your Project
How Open Source / Open Technology Could Help On Your ProjectHow Open Source / Open Technology Could Help On Your Project
How Open Source / Open Technology Could Help On Your ProjectWan Leung Wong
 
IoT is Something to Figure Out
IoT is Something to Figure OutIoT is Something to Figure Out
IoT is Something to Figure OutPeter Hoddie
 
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache ApexMaking sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache ApexApache Apex
 
Azure Notebooks - Jupyter for the Cloud
Azure Notebooks - Jupyter for the CloudAzure Notebooks - Jupyter for the Cloud
Azure Notebooks - Jupyter for the CloudCameron Vetter
 
Cloud computing and Hadoop introduction
Cloud computing and Hadoop introductionCloud computing and Hadoop introduction
Cloud computing and Hadoop introductionchristian.perez
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019Travis Oliphant
 
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)Amazon Web Services
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...Evans Ye
 
Choosing the right Technologies for your next unicorn.
Choosing the right Technologies for your next unicorn.Choosing the right Technologies for your next unicorn.
Choosing the right Technologies for your next unicorn.Gladson DSouza
 
Getting Started with Big Data in the Cloud
Getting Started with Big Data in the CloudGetting Started with Big Data in the Cloud
Getting Started with Big Data in the CloudRightScale
 
Public PaaS Throwdown!
Public PaaS Throwdown!Public PaaS Throwdown!
Public PaaS Throwdown!Ronak Mallik
 
All about that reactive ui
All about that reactive uiAll about that reactive ui
All about that reactive uiPaul van Zyl
 
Habitat Overview
Habitat OverviewHabitat Overview
Habitat OverviewMandi Walls
 
Choosing the right parallel compute architecture
Choosing the right parallel compute architecture Choosing the right parallel compute architecture
Choosing the right parallel compute architecture corehard_by
 
Hadoop-Automation-Tool_RamkishorTak
Hadoop-Automation-Tool_RamkishorTakHadoop-Automation-Tool_RamkishorTak
Hadoop-Automation-Tool_RamkishorTakRam Kishor Tak
 
Software Engineering in Startups
Software Engineering in StartupsSoftware Engineering in Startups
Software Engineering in StartupsDusan Omercevic
 

Similaire à Open-source dev framework makes data work simple (20)

How Open Source / Open Technology Could Help On Your Project
How Open Source / Open Technology Could Help On Your ProjectHow Open Source / Open Technology Could Help On Your Project
How Open Source / Open Technology Could Help On Your Project
 
IoT is Something to Figure Out
IoT is Something to Figure OutIoT is Something to Figure Out
IoT is Something to Figure Out
 
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache ApexMaking sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
 
Azure Notebooks - Jupyter for the Cloud
Azure Notebooks - Jupyter for the CloudAzure Notebooks - Jupyter for the Cloud
Azure Notebooks - Jupyter for the Cloud
 
Cloud computing and Hadoop introduction
Cloud computing and Hadoop introductionCloud computing and Hadoop introduction
Cloud computing and Hadoop introduction
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019
 
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...
 
Architecting Your First Big Data Implementation
Architecting Your First Big Data ImplementationArchitecting Your First Big Data Implementation
Architecting Your First Big Data Implementation
 
Ice dec05-04-wan leung
Ice dec05-04-wan leungIce dec05-04-wan leung
Ice dec05-04-wan leung
 
Choosing the right Technologies for your next unicorn.
Choosing the right Technologies for your next unicorn.Choosing the right Technologies for your next unicorn.
Choosing the right Technologies for your next unicorn.
 
Getting Started with Big Data in the Cloud
Getting Started with Big Data in the CloudGetting Started with Big Data in the Cloud
Getting Started with Big Data in the Cloud
 
Dev Ops without the Ops
Dev Ops without the OpsDev Ops without the Ops
Dev Ops without the Ops
 
Public PaaS Throwdown!
Public PaaS Throwdown!Public PaaS Throwdown!
Public PaaS Throwdown!
 
All about that reactive ui
All about that reactive uiAll about that reactive ui
All about that reactive ui
 
Habitat Overview
Habitat OverviewHabitat Overview
Habitat Overview
 
Choosing the right parallel compute architecture
Choosing the right parallel compute architecture Choosing the right parallel compute architecture
Choosing the right parallel compute architecture
 
Hadoop-Automation-Tool_RamkishorTak
Hadoop-Automation-Tool_RamkishorTakHadoop-Automation-Tool_RamkishorTak
Hadoop-Automation-Tool_RamkishorTak
 
2014 Picking a Platform by Anand Kulkarni
2014 Picking a Platform by Anand Kulkarni2014 Picking a Platform by Anand Kulkarni
2014 Picking a Platform by Anand Kulkarni
 
Software Engineering in Startups
Software Engineering in StartupsSoftware Engineering in Startups
Software Engineering in Startups
 

Plus de mortardata

Daeil Kim: Machine Learning at the New York Times
Daeil Kim: Machine Learning at the New York TimesDaeil Kim: Machine Learning at the New York Times
Daeil Kim: Machine Learning at the New York Timesmortardata
 
Jonathan Coveney: Why Pig?
Jonathan Coveney: Why Pig?Jonathan Coveney: Why Pig?
Jonathan Coveney: Why Pig?mortardata
 
Can Big Data Save the World? By Jake Porway
Can Big Data Save the World? By Jake PorwayCan Big Data Save the World? By Jake Porway
Can Big Data Save the World? By Jake Porwaymortardata
 
Max Shron, Thinking with Data at the NYC Data Science Meetup
Max Shron, Thinking with Data at the NYC Data Science MeetupMax Shron, Thinking with Data at the NYC Data Science Meetup
Max Shron, Thinking with Data at the NYC Data Science Meetupmortardata
 
Drew Conway: A Social Scientist's Perspective on Data Science
Drew Conway: A Social Scientist's Perspective on Data ScienceDrew Conway: A Social Scientist's Perspective on Data Science
Drew Conway: A Social Scientist's Perspective on Data Sciencemortardata
 
Data Science at Tumblr
Data Science at TumblrData Science at Tumblr
Data Science at Tumblrmortardata
 
Hadoop, Pig, and Python (PyData NYC 2012)
Hadoop, Pig, and Python (PyData NYC 2012)Hadoop, Pig, and Python (PyData NYC 2012)
Hadoop, Pig, and Python (PyData NYC 2012)mortardata
 

Plus de mortardata (8)

Daeil Kim: Machine Learning at the New York Times
Daeil Kim: Machine Learning at the New York TimesDaeil Kim: Machine Learning at the New York Times
Daeil Kim: Machine Learning at the New York Times
 
Jonathan Coveney: Why Pig?
Jonathan Coveney: Why Pig?Jonathan Coveney: Why Pig?
Jonathan Coveney: Why Pig?
 
Pig on Spark
Pig on SparkPig on Spark
Pig on Spark
 
Can Big Data Save the World? By Jake Porway
Can Big Data Save the World? By Jake PorwayCan Big Data Save the World? By Jake Porway
Can Big Data Save the World? By Jake Porway
 
Max Shron, Thinking with Data at the NYC Data Science Meetup
Max Shron, Thinking with Data at the NYC Data Science MeetupMax Shron, Thinking with Data at the NYC Data Science Meetup
Max Shron, Thinking with Data at the NYC Data Science Meetup
 
Drew Conway: A Social Scientist's Perspective on Data Science
Drew Conway: A Social Scientist's Perspective on Data ScienceDrew Conway: A Social Scientist's Perspective on Data Science
Drew Conway: A Social Scientist's Perspective on Data Science
 
Data Science at Tumblr
Data Science at TumblrData Science at Tumblr
Data Science at Tumblr
 
Hadoop, Pig, and Python (PyData NYC 2012)
Hadoop, Pig, and Python (PyData NYC 2012)Hadoop, Pig, and Python (PyData NYC 2012)
Hadoop, Pig, and Python (PyData NYC 2012)
 

Dernier

My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 

Dernier (20)

My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 

Open-source dev framework makes data work simple

  • 1.
  • 3. Open-source software dev frameworks (and many more...)
  • 4. Every platform needs open-source, code-based dev frameworks • Avoid repeat work • Remain 100% customizable • Escape vendor lock-in
  • 5. But for data? Nothing.
  • 6. Data work today • Proprietary / GUI / No framework • Not reusable • Little collaboration • Too many errors, too slow to fix
  • 7. an open source dev framework for data
  • 8. “Mortar takes something complex and makes it simple and intuitive.” —Jon Coveney, Twitter
  • 9. What big data system should I use? Hadoop, HPCC, Disco, Storm… What Hadoop distro should I use? How much will this cost? Should I do it some other way? How long is this going to take to learn? Should I do it some other way? How many machines should I buy? Can I run in the cloud? How should I interact with it? Pig, Hive, Cascading, Scalding, Cascalog Weeks Can I use libraries I need? How do I install everything? How can my team share code? How can I reuse code? Is my workload typical? How can I safely deploy? How do I know if what I've written is correct? Are there any libraries I could start with? Can I connect to my key-value store? Can I do machine learning in Hadoop? Is it secure? What if I need help?
  • 12. —Alan Gates, Hortonworks co-founder OPEN SOURCED PIG “Our focus in designing Pig has always been to make Hadoop easy... Mortar's approach is right on—they extend our quick start and ease of use focuses with pre- built Hadoop clusters, clear examples, code organization templates, and github for social sharing of the code.”
  • 13. Can Mortar help you? Mortar is for analyzing lots of data in AWS.
  • 14. Who is Mortar for? Mortar serves companies of all sizes from any industry.
  • 15. —Dwight Merriman FOUNDER OF 10GEN (MAKER OF MONGODB), DOUBLECLICK (ACQ. GOOGLE), SHOPWIKI, BUSINESS INSIDER, GILT GROUPE “...Mortar fits right in with our vision of the future... With this exciting launch, MongoDB users can now also seamlessly use Mortar.”
  • 16. By and for engineers and data scientists
  • 17. > gem install mortar
  • 18. > mortar new my_project
  • 19.
  • 20. > git clone your_project
  • 21. > mortar run your_project
  • 22.
  • 23.
  • 24. Pig is easy to learn (and we’ve made it easier)
  • 25.
  • 26.
  • 27. Illustrate is awesome • Find your mistakes • Understand code before collaborating • Automated tests: a way to test every condition
  • 28. Hadoop & Python are powerful data science tools
  • 29. ...but they haven’t worked together before.
  • 30. Now you can use Hadoop & real Python on Mortar
  • 31.
  • 32.
  • 33. What you just saw • Installed Mortar • Made a new project • Cloned a project • Ran the project • Illustrated project • Use Python and other libraries on Hadoop
  • 34. 2 options for using Mortar: - Git Projects: modularity, testability, code sharing, local dev, and revision control. - Web Projects: zero install, in the browser
  • 35.
  • 36. One-hour challenge • Use your browser • Minutes to connect data • Productive in one hour
  • 37. How does Mortar fit with other As a good citizen, Mortar has a rich API
  • 38. How about speed? Full speed, directly on Hadoop
  • 39. Mortar revolutionizes your data pipeline. • Easy start • Keeps you productive • Collaborate with data • No lock-in • Easy to budget
  • 40. Tiers • Free | Service use unlimited | 10 node-hours • Pay as you Go | $0.89/node-hour | support • Enterprise | $3,000/month | $0.69/node-hour | live support
  • 41.