Fast, Powerful and Scalable Analytics


  1. Fast, Powerful and Scalable Analytics
  2. Why Analytics?
     • Get the most value from your data assets
     • Faster, better decision making
     • Cost reduction
     • New products and services
  3. Types of Analytics
     • Descriptive: What happened?
     • Diagnostic: Why did it happen?
     • Predictive: What is likely to happen?
     • Prescriptive: What should I do about it?
  4. Descriptive: What happened?
     ● Reports
       ○ Sales report
       ○ Expense summary
     ● Ad hoc requests to analysts
  5. Diagnostic: Why did it happen?
     ● Aggregates: aggregate a measure over one or more dimensions
       ○ Find total sales
       ○ Top five products ranked by sales
     ● Roll-ups: aggregate at different levels of a dimension hierarchy
       ○ Given total sales by city, roll up to get sales by state
     ● Drill-down: the inverse of a roll-up
       ○ Given total sales by state, drill down to get totals by city
     ● Slicing and dicing
       ○ Equality and range selections on one or more dimensions
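The four diagnostic operations on this slide map directly onto SQL. The following is a minimal sketch, assuming Python's built-in `sqlite3` module as a stand-in for any SQL engine; the `sales` table, its columns, and every figure in it are made up for illustration and do not come from the deck.

```python
import sqlite3

# Hypothetical sales facts: (state, city, product, amount). Illustrative data only.
ROWS = [
    ("CA", "San Jose", "widget", 120.0),
    ("CA", "San Jose", "gadget", 80.0),
    ("CA", "Fresno", "widget", 50.0),
    ("TX", "Austin", "widget", 70.0),
    ("TX", "Austin", "gizmo", 30.0),
    ("TX", "Dallas", "gadget", 90.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (state TEXT, city TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)


def query(sql, params=()):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql, params).fetchall()


# Aggregate: one measure over the whole table.
total_sales = query("SELECT SUM(amount) FROM sales")[0][0]

# Aggregate with ranking: top five products by sales.
top_products = query(
    "SELECT product, SUM(amount) AS total FROM sales "
    "GROUP BY product ORDER BY total DESC LIMIT 5"
)

# Drill-down level: totals at the finer (city) level of the hierarchy.
sales_by_city = query(
    "SELECT state, city, SUM(amount) FROM sales "
    "GROUP BY state, city ORDER BY state, city"
)

# Roll-up: the same measure at the coarser (state) level.
sales_by_state = query(
    "SELECT state, SUM(amount) FROM sales GROUP BY state ORDER BY state"
)

# Slice and dice: equality selection on state, range selection on amount.
ca_mid_range = query(
    "SELECT city, product, amount FROM sales "
    "WHERE state = ? AND amount BETWEEN ? AND ? ORDER BY amount",
    ("CA", 60, 130),
)

print(total_sales, top_products, sales_by_city, sales_by_state, ca_mid_range)
```

Roll-up and drill-down differ only in which level of the state/city hierarchy appears in `GROUP BY`.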
  6. Predictive: What is likely to happen?
     ● Sales prediction
       ○ Analyze data to identify trends, spot weaknesses, or determine conditions across broader data sets to inform decisions about the future
     ● Targeted marketing
       ○ What is the likelihood of a customer buying a particular product, based on past buying behavior?
  7. Prescriptive: What is the best course of action?
     • Paradox of choice: with too many choices, which one is the best?
  8. Data Analytics Use Cases by Industry
     • Finance
       – Identify trade patterns
       – Detect fraud and anomalies
       – Predict trading outcomes
     • Manufacturing
       – Simulations to improve design/yield
       – Detect production anomalies
       – Predict machine failures (sensor data)
     • Telecom
       – Behavioral analysis of customer calls
       – Network analysis (performance and reliability)
     • Healthcare
       – Find genetic profiles/matches
       – Analyze health vs. spending
       – Predict viral outbreaks
  9. Data Analytics Solution Considerations
     • Technical considerations
       • Real-time analytics
         – High-speed data ingestion
         – High-speed read queries
       • Analytics
         – Built-in analytics
         – Choice of BI tools
     • Business considerations
       • Cost of deployment and use
         – Hardware and price/performance ratio
         – Large talent pool
  10. Existing Approaches
      • Data warehouses
        – Limited real-time analytics
        – Slow releases of product innovation
        – Expensive hardware and software
      • Hadoop / NoSQL
        – Limited SQL support
        – Difficult to install/manage
        – Limited talent pool
        – Data lake with no data management
      • Hard to use
      • Purpose built rather than predictive analytics
  11. MariaDB Big Data Solution: MariaDB AX and MariaDB ColumnStore
  12. MariaDB AX: Analytics - simple, fast, scalable… and open source
  13. MariaDB AX
      • Components: MariaDB Server, MariaDB MaxScale, MariaDB ColumnStore
      • Parallel queries
      • Distributed storage
      • No indexes
      • Automatic partitioning
      • Read optimized
      • High compression
      • Low disk I/O
      [Architecture diagram: MariaDB MaxScale, MariaDB Server with ColumnStore, and ColumnStore Storage nodes; legend: UM = User Module, PM = Performance Module]
  14. MariaDB ColumnStore
      High-performance columnar storage engine that supports a wide variety of analytical use cases in highly scalable distributed environments
      • Faster, More Efficient Queries: parallel query processing for distributed environments
      • Easier Enterprise Analytics: single interface for OLTP and analytics; easy to manage and scale
      • Better Price Performance: power of SQL and freedom of open source for big data analytics
  15. Better Price Performance
      • Flexible deployment options
        – Cloud and on-premises
        – Runs on commodity hardware
        – Open source, subscription-based pricing
      • No need to maintain a third platform
        – Run analytics from the same SQL front end
        – No need to update application code
        – Leverage MariaDB's extensible architecture
      • High data compression
        – More efficient at storing big data
        – Less hardware
      • Customers have saved by choosing MariaDB AX over Oracle (healthcare), MemSQL (auto parts), and Vertica (finance, SEO marketing). Come see them at M18!
      [Chart: Commercial Data Warehouse vs. MariaDB ColumnStore, 90.3% less per TB per year]
  16. Easier Enterprise Analytics
      • Single SQL front end
        – Use a single SQL interface for analytics and OLTP
        – Leverage MariaDB security features: encryption for data in motion, role-based access, and auditing
      • Full ANSI SQL
        – No more SQL-“like” queries
        – Supports complex joins, aggregation, and window functions
      • Easy to manage and scale
        – Eliminates the need for indexes and views
        – Automated horizontal/vertical partitioning
        – Scales linearly by adding new nodes as data grows
        – Out-of-the-box connection with BI tools
      • MariaDB AX customers across industries: auto parts, finance, ad analytics, asset management, telecommunications, healthcare, digital media, carpooling app
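To make "complex joins, aggregation, and window functions" concrete, here is a minimal sketch of a single query that uses all three. It assumes Python's built-in `sqlite3` module as a stand-in SQL engine (window functions need SQLite 3.25 or later); it is not run against MariaDB ColumnStore, and the `sales` and `products` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical dimension and fact tables; illustrative data only.
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales (state TEXT, product_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, "widget"), (2, "gadget"), (3, "gizmo")],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("CA", 1, 120.0),
        ("CA", 2, 80.0),
        ("CA", 1, 50.0),
        ("TX", 1, 70.0),
        ("TX", 3, 30.0),
        ("TX", 2, 90.0),
    ],
)

# Join facts to the product dimension, aggregate per (state, product),
# then rank products within each state using a window function.
ranked = conn.execute(
    """
    WITH totals AS (
        SELECT s.state AS state, p.name AS product, SUM(s.amount) AS total
        FROM sales AS s
        JOIN products AS p ON p.product_id = s.product_id
        GROUP BY s.state, p.name
    )
    SELECT state, product, total,
           RANK() OVER (PARTITION BY state ORDER BY total DESC) AS rank_in_state
    FROM totals
    ORDER BY state, rank_in_state
    """
).fetchall()

for row in ranked:
    print(row)
```

The aggregation is done in a common table expression so that the window function ranks already-summed totals.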
  17. Faster, More Efficient Queries
      • Optimized for columnar storage
        – Columnar storage reduces disk I/O
        – Blazing-fast read-intensive workloads
        – Ultra-fast data import
      • Parallel query processing: parallel distributed query execution
        – Distributes queries into a series of parallel operations
        – Fully parallel high-speed data ingestion (TPC-H lineitem table: 750K to 1 million rows per minute)
      • Highly available analytic environment
        – Built-in redundancy
        – Automatic failover
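Why columnar storage reduces I/O for analytical queries can be shown with a toy model. The sketch below is an illustration under stated assumptions, not ColumnStore's actual on-disk format: it packs a hypothetical four-column table once row by row and once column by column, then counts how many bytes a query summing a single column has to scan in each layout.

```python
import struct

# Hypothetical table: 1,000 rows, four 8-byte integer columns. Illustrative only.
NUM_ROWS = 1000
COLUMNS = ["order_id", "customer_id", "quantity", "amount_cents"]
rows = [(i, i % 50, i % 7 + 1, (i % 100) * 25) for i in range(NUM_ROWS)]

# Row layout: whole records stored one after another.
row_store = b"".join(struct.pack("<4q", *record) for record in rows)

# Column layout: all values of one column stored contiguously.
column_store = {
    name: struct.pack(f"<{NUM_ROWS}q", *(record[index] for record in rows))
    for index, name in enumerate(COLUMNS)
}


def sum_amount_row_store():
    """Sum amount_cents by scanning every full record; return (sum, bytes scanned)."""
    total = 0
    for record in struct.iter_unpack("<4q", row_store):
        total += record[3]
    return total, len(row_store)


def sum_amount_column_store():
    """Sum amount_cents by scanning only that column; return (sum, bytes scanned)."""
    data = column_store["amount_cents"]
    values = struct.unpack(f"<{NUM_ROWS}q", data)
    return sum(values), len(data)


row_total, row_bytes = sum_amount_row_store()
col_total, col_bytes = sum_amount_column_store()
print(f"row layout:    sum={row_total}, bytes scanned={row_bytes}")
print(f"column layout: sum={col_total}, bytes scanned={col_bytes}")
```

Both layouts return the same sum, but the column layout scans only one of the four columns. The wider the table and the fewer columns a query touches, the larger that gap becomes.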
  18. Simple & Streamlined Data Ingestion
      [Diagram labels]
      • Ingestion and analytics: Data Services, Apache Kafka, and Spark / Python / ML, connected through Bulk Data Adapters and Streaming Data Adapters
      • Operations, Transaction (OLTP): Web/Mobile Services, MariaDB MaxScale, MariaDB Server InnoDB
      • Analytics (OLAP): MariaDB MaxScale, MariaDB Server ColumnStore
  19. Streaming Data Adapters: Apache Kafka
      Stream all messages published to Apache Kafka topics to MariaDB AX automatically and continuously, so that data from many sources can be streamed and collected for analysis without complex code.
      [Diagram labels: Apache Kafka topics, Streaming Data Adapter (Kafka client), Write API, ColumnStore Storage nodes, MariaDB Server ColumnStore]
  20. OLTP to OLAP: Streaming Data Adapters, MaxScale CDC
      Stream all writes from MariaDB TX to MariaDB AX automatically and continuously, so that analytical data stays up to date, with no need for batch jobs, manual processes, or human intervention.
      [Diagram labels: MariaDB Server InnoDB, MariaDB MaxScale (CDC server), Streaming Data Adapter (CDC client), Write API, ColumnStore Storage nodes, MariaDB Server ColumnStore]
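The general shape of a streaming adapter, consuming events as they arrive and forwarding them to the analytics store, can be sketched as below. This is a conceptual illustration only: it does not use MaxScale CDC, Kafka, or the ColumnStore write API, and every name in it (`InMemoryAnalyticsTable`, `bulk_insert`, `stream_to_table`, the event fields) is hypothetical. Grouping events into small batches before writing is an assumption of this sketch, not something the slides specify.

```python
class InMemoryAnalyticsTable:
    """Hypothetical stand-in for an analytics table that accepts bulk writes."""

    def __init__(self):
        self.rows = []
        self.bulk_writes = 0

    def bulk_insert(self, batch):
        # Append a copy of the batch and count how many bulk writes were issued.
        self.rows.extend(batch)
        self.bulk_writes += 1


def stream_to_table(events, table, batch_size=3):
    """Forward events in arrival order, flushing every batch_size events."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) >= batch_size:
            table.bulk_insert(batch)
            batch = []
    if batch:
        # Flush the final partial batch so no event is left behind.
        table.bulk_insert(batch)


# Hypothetical change events, as a CDC or Kafka client might deliver them.
change_events = [{"id": i, "op": "insert", "amount": 10 * i} for i in range(1, 8)]

table = InMemoryAnalyticsTable()
stream_to_table(change_events, table)
print(f"{len(table.rows)} rows written in {table.bulk_writes} bulk writes")
```

In a real deployment the loop would run continuously against a live event source, which is what removes the need for scheduled batch jobs.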
  21. MariaDB AX Use Cases
  22. IHME - Institute for Health Metrics and Evaluation
      • IHME visualizations library: http://www.healthdata.org/results/data-visualizations
      • Started with 4.2 TB, with a goal of growing to 30 TB of data
  23. Customer Use Case 1
      • Industry: healthcare (Medicaid)
      • Data: surveys
      • Use case: decision support system
      • Details:
        – Identify trends and patterns
        – Determine population cohorts
        – Predict health outcomes
        – Anticipate funding / capacity
        – Recommend intervention
      • Could not run complex queries on the current hardware with Oracle and snowflake schemas
      • Limited to optimizing for simple, known queries (2-3 columns)
      • Replaced with ColumnStore
        – A single table
        – 2.5 million rows, 248 columns
        – Complex, ad hoc queries
        – Query 20+ columns in seconds
  24. Customer Use Case 2
      • Industry: biotechnology (genetics)
      • Data: genotypes
      • Use case: genetic profiling
      • Details:
        – Find genetic mates (beef and dairy)
        – Predict meat production (pork)
        – Gene/DNA analysis
      • Had to convert to CSV files and schedule import jobs (cron)
      • Always receiving new genetic data
      • Migrated to data adapter (Python)
        – Streamline the import process
        – Remove steps and possible errors
        – Remove delays
        – Import data on demand
        – Immediate customer access
  25. Customer Use Case 3
      • Industry: mobile text/call app
      • Data: call and text logs
      • Use case: mobile app usage analytics
      • Details:
        – 30 million texts and 3 million phone calls per day
        – 1.5 billion rows of logs per day
        – Text and call volume will continue to grow
      • The InnoDB back end hit a scale limit of 6 TB and required a lot of performance tuning and index management
      • Migrated to MariaDB AX
        – Able to process 24 months (24 TB) of data, versus the 6-month limit with InnoDB
        – The same BI tools and client applications worked seamlessly with MariaDB AX
  26. MariaDB AX: Analytics made easy – simple, fast, scalable…
  27. Thank you
