This presentation shows people with SQL knowledge how to get started in the Big Data world, from creating your own cluster to getting hands-on experience with Spark and Hadoop. Best of all, you can do all of this without spending dollar one.
1. BIG DATA FOR SQL DEVELOPERS:
GET STARTED (FOR FREE)
JULY 6, 2017
2. BIG DATA FOR SQL DEVELOPERS
I KNOW SQL, BUT…
1. How can I run a cluster without setting one up at home or paying for expensive cloud services?
2. Where do I find “big data” to analyze?
3. Do I need to learn a different programming language?
3. BIG DATA FOR SQL DEVELOPERS
YOUR RDBMS - HOW WE COMMONLY VIEW IT
RDBMS
SELECT … FROM …
4. BIG DATA FOR SQL DEVELOPERS
YOUR RDBMS - A COLLECTION OF SYSTEMS
SELECT * FROM …
QUERY LANGUAGE INTERPRETER
QUERY PLANNER & OPTIMIZER
SECURITY
I/O
MEMORY CACHE
DATA STORAGE
DISASTER RECOVERY
LOGGING
CONCURRENCY & TRANSACTIONAL CONSISTENCY
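One way to see that an RDBMS is several cooperating systems is to ask the query planner what it intends to do before the storage layer does any work. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for whichever RDBMS you use; the table orders and the index idx_orders_customer are hypothetical names made up for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for "your RDBMS" (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 9.5), (2, 20.0), (1, 5.25)],
)

query = "SELECT * FROM orders WHERE customer_id = ?"

# Query planner & optimizer: ask for the plan without running the query.
# The human-readable description is the last column of each plan row.
plan_details = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (1,))]
for detail in plan_details:
    print(detail)

# Data storage and I/O: now actually execute the same statement.
rows = conn.execute(query, (1,)).fetchall()
print(rows)

conn.close()
```

The SELECT statement is the same in both calls; only the component answering it changes, which is the separation the slide is pointing at.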
5. BIG DATA FOR SQL DEVELOPERS
YOUR RDBMS - A COLLECTION OF SYSTEMS
SELECT … FROM …
OR
SQLCONTEXT.SQL(“SELECT …”)
YARN
HDFS
CONSISTENCY? SECURITY?
6. BIG DATA FOR SQL DEVELOPERS
QUESTION #1 - HOW DO I GET MY OWN CLUSTER?
‣ Use IaaS
‣ AWS Free Tier
‣ Use Managed Services (AWS EMR, Azure HDInsight)
‣ You can, but you have to wait for them to contact you.
‣ Databricks Community Edition
7. BIG DATA FOR SQL DEVELOPERS
QUESTION #2 - WHERE CAN I FIND DATA?
‣ Census.gov
‣ The CIA World Factbook
‣ HealthData.gov
‣ World Health Organization
‣ AWS Public Datasets
‣ Facebook Graph API
‣ Google Public Data
‣ Databricks
9. BIG DATA FOR SQL DEVELOPERS
QUESTION #3 - DO I NEED TO LEARN ANOTHER LANGUAGE?
‣ Yes, but you don’t have to be a software developer
‣ Python, Java, or Scala (my choice: Python)
‣ Good news: Your SQL can still help you!
10. BIG DATA FOR SQL DEVELOPERS
WHY SPARK?
‣ Popular, vital
‣ A framework for processing distributed datasets
‣ Has a SQL implementation
12. BIG DATA FOR SQL DEVELOPERS
YOUR NEXT STEPS
1. Sign up for Databricks Community Edition
2. Read and complete “A Gentle Introduction to Apache Spark on Databricks”
3. Read and complete “Apache Spark on Databricks for Data Engineers”
4. Read a book on the RDBMS you use most often.
13. BIG DATA FOR SQL DEVELOPERS
DEMO
https://community.cloud.databricks.com/?o=8158027403376652#notebook/3636183528035570/command/3636183528035585
14. DO YOU KNOW DATA, OR DO YOU KNOW A FLAVOR OF SQL?
15. BIG DATA FOR SQL DEVELOPERS
RESOURCES & CONTACT
Brent Lightsey
http://firstlightanalytics.com
brent@firstlightanalytics.com
405-295-5502
https://www.linkedin.com/in/brentlightsey/