SlideShare une entreprise Scribd logo
1  sur  30
Télécharger pour lire hors ligne
September 12
Time Series DB for IoT
Choosing the Right IoT Data Platform
On-Premises and Azure
About me
• Project Manager @
o 15 years professional experience
o .NET Web Development MCPD
• External Expert Horizon 2020
• External Expert Eurostars & IFD
• Business Interests
o Web Development, SOA, Integration
o Security & Performance Optimization
o IoT, Computer Intelligence
• Contact
o ivelin.andreev@icb.bg
o www.linkedin.com/in/ivelin
o www.slideshare.net/ivoandreev
Agenda
Time Series
o Why Time Series
o Choosing the Right IoT Data Platform
o InfluxDB vs Competitors
o Key Concepts
o Demo
IoT Data
• IoT (buzzword of the year)
• 30 billion “things” by 2020 (Forbes)
• Top IoT Industries
• Manufacturing
• Healthcare
• What are the benefits from IoT?
• The discussion has shifted
• How to make IoT work?
• How to gain insight on hidden relations?
• How to get actionable results?
ASPires
Distributed IoT Platform
Funded by
EUROPEAN CIVIL PROTECTION
AND HUMANITARIAN AID
OPERATIONS
2016/PREV/03 (ASPIRES)
IoT Data is Time Stamp Data
• Series
o Identified by source name (i.e. sensor ID) and metric name (i.e. temp)
o Consists of ordered {time, value} measurements
o Regular and Irregular
• Time Series DB
o Optimized for TS Data
• Process Historian – more than TS DB
o Interfaces to read data from multiple data sources
o Render graphics for meaningful points
o Statistical process control
o Redundancy and high availability
TS Data Characteristics
• Writes
o 95%-99% of all operations
o Streaming live data from multiple devices
o Typically sequential appends
• Updates to modify values are rare
• Deletes are bulk on large ranges (days, months, years)
• Queries
o Typically sequential
o Concurrent reads are common
• Performance issues are typically I/O bound
o Caching does not work well for BigData
o Systems are typically distributed by design
Credits: Baron Schwartz
https://www.xaprb.com/blog/2014/06/08/time-series-database-requirements/
This is TS Data
And this is NOT
Choosing a Proper Data Platform
Data Platform Services for IoT Data
• Data Aggregation Services
o Collect, normalize, aggregate metrics and events data
• Data Storage Services
o Distributed by nature
o High write load, fast retrieval, efficient compression
• Analytics and Visualization Services
• Technologies not designed for the use case
o Relational DB Engines
o Columnar or key-value databases (HBase, Cassandra)
o Search engine (Elasticsearch)
o Document-oriented (MongoDB, DocumentDB)
o First gen. TS DB (Graphite, OpenTSDB)
When TS overperform RDBMS
• Target scenarios
o High I/O rate
o Number of tags
o Volume of data
o Aggregation of irregular data
o Compression & De-duplication
• Requires a learning
o Do you expect that many data?
o Do you need to plot?
o Do you need to aggregate?
InfluxDB vs SQL Server 2016
Credits: Angelin Nedelchev (ICB)
Validation Setup
InfluxDB v1.3.0 single node SQL Server 2016 Standard
CPU: Intel Core i7 3.40GHz (quad, 8MB cache)
Memory: 16GB DDR3 1333MHz
OS: Windows 10 Enterprise, 64 bit
SSD: Samsung 850 EVO 250GB
HDD: Seagate 7200RPM, 16MB cache, 500GB
Name Type Indexed
ID (PK) int Yes
TagID int Yes
Value int No
Timestamp DateTime Yes
Name Type
tagID Int Tag Key
value Int Field Key
time DateTime TimeStamp
InfluxDB schema: SQL Server schema:
Write Performance (writes/sec)
0
50000
100000
150000
200000
250000
0 20,000,000 40,000,000 60,000,000 80,000,000 100,000,000
Influx
Insertspersecond
• Influx average write speed of 200,000 writes/second
• SQL Server average write speed of 5,000-8,000 writes/second
• Memory optimized tables increases write performance by 1.4x
• Write trend unaffected by the amount of data (in this volumes)
• Influx outperforms SQL Server 40x
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
0 20,000,000 40,000,000 60,000,000 80,000,000 100,000,000
SQL In-Memory OLTP
Disk Storage
• 27% improvement is SQL Server page compression is enabled (1883MB)
• Influx has 19x-26x less disk requirements for the same functionality
2579MB
98MB
0
500
1000
1500
2000
2500
3000
On-Disk Storage
SQL Server
Influx
DiskSpaceUsed(megabytes)DiskSpaceUsed(megabytes)
Read Performance (queries/sec)
0
500
1000
1500
2000
2500
3000
3500
4000
1 2 4 8
InfluxDB
SQL Server
QueriesperSecond
Concurrency
Benchmark query:
Aggregate samples of 10,000 points, average over 1,000 runs.
• SQL Server mean query response time – 78ms (13 queries/sec)
• Influx mean query response time – 1,3ms (769 queries/sec)
• Influx has 59x better query throughput than SQL Server
Read Performance (rows at a time)
• SQL Server relatively unaffected by SSD (due to caching)
• Influx performance improves up to 100% on SSD, chunk size 10’000
• SQL Server is up to 2x better on HDD (Influx is better up to 600K)
• Relatively equal on SSD (Influx is better up to 1.6M)
0
500,000
1,000,000
1,500,000
2,000,000
2,500,000
3,000,000
3,500,000
0 5 10 15 20 25 30
HDD Performance
SQL Influx
Queriedresults
Seconds to execute (lower is better)
0
500,000
1,000,000
1,500,000
2,000,000
2,500,000
3,000,000
3,500,000
0 5 10 15 20 25 30
SSD Performance
SQL Influx
Seconds to execute (lower is better)
InfluxDB vs NonSQL for TS Data
• InfluxDB vs MongoDB
o WRITE: 27x greater
o STORAGE: 84x less
o QUERY: Equal performance
• InfluxDB vs Elasticsearch
o WRITE: 8x greater
o STORAGE: 4x less
o QUERY: 3.5x – 7.5x faster
• InfluxDB vs OpenTSDB
o WRITE: 5x greater
o STORAGE: 16.5x less
o QUERY: 4x faster
• InfluxDB vs Cassandra
o WRITE: 5.3x greater
o STORAGE: 9.3x less
o QUERY: up to 168x faster
• InfluxDB vs DocumentDB
o More popular (Google Trends)
o Cloud and on-premises
o No external dependencies
o Aggregations
Database as a Service in Azure
• Typical Azure IoT Services (Before 2017.04.20)
• With Time Series Insigths (beta)
• Other
o CosmosDB with MongoDB (SLA 10ms reads, 15ms writes)
Azure Stream Insights
Fully managed analytics, storage and visualization service
Benefits
• Reduced number of services used
• Monitor IoT solutions
• Event source from IoT Hub and Event Hub (only)
• Visualize and analyze data at large scale
• Root cause analysis and anomaly detection
• APIs available for management of raw data
Cons
• Pricey (€126.53, 1M events - €1,138.50 month/unit, 10M events)
• Specific use cases
• Limited capabilities
• Open-source distributed TS database
• Key Features
o Easy setup, no external dependencies, implemented in Go
o Runs on Linux, Windows, OS X
o Supports .NET, Java, JS, R, PHP, Python, Ruby, Go, Node.js
o Comprehensive documentation
o Scalable, highly efficient
o REST API (JSON)
o SQL-like syntax
o On-premise and cloud
• Top ranked TS DB
Key Concepts
Term Description
Measurement Container (Table)
Point Single record for timestamp
Field Set Required; Not-indexed
Field key Define what is measured
Field value Actual measured value
(string, bool, int64, float64)
Tag Set Metadata about the point
Optional; Indexed; Key-value;
Tag key Unique per measurement
Tag value Unique per tag key
Series Points with common tag set
• Aggregation functions
• Retention policies
• Downsampling
• Continuous queries
Scalability & Sizing Guidelines
• Single node or cluster
o Single node is open source and free
• Recommendations
• Query complexity (Moderate)
o Multiple functions, few regular expressions
o Complex GROUP BY clause or sampling over weeks
o Runtime 500ms – 5sec
Load Resources Writes/Sec
Moderate
Queries/Sec.
Unique
Series
Low Cores: 2-4; RAM: 2-4 GB 0 - 5K 0 - 5 0 – 100K
Moderate Cores: 4-6; RAM: 8-32 GB 5K - 250K 5 - 25 100K - 1M
High Cores: 8+; RAM: 32+ 250K – 750K 25 - 100 1M - 10M
Functions
Aggregations Selectors Transformations Predictors
COUNT() BOTTOM() CEILING() HOLT_WINTERS()
DISTINCT() FIRST() CUMULATIVE_SUM()
INTEGRAL() LAST() DERIVATIVE()
MEAN() MAX() DIFFERENCE()
MEDIAN() MIN() ELAPSED()
MODE() PERCENTILE() FLOOR()
SPREAD() SAMPLE() HISTOGRAM()
STDDEV() TOP() MOVING_AVERAGE()
SUM() NON_NEGATIVE_DERIVATIVE()
InfluxData Advantages
• End-to-end solution
• Flexible tag support
o Graphite, RRD (no tags at all)
o OpenTSDB, Kairos (tag number)
• Regular & Irregular TS
o Graphite, OpenTSDB
• Multiple Data Types
• High Performance
o Hbase, Elastic, Cassandra
• Functionality
o Hbase, Cassandra
InfluxDB & Grafana
DEMO
Takeaways
• Why Time Series Matters for IoT
o https://www.influxdata.com/ resources/webinar-time-series-monitoring-
metrics-real-time-analytics-iotsensor-data/
• InfluxDB
o https://www.influxdata.com/resources/
o https://docs.influxdata.com/influxdb/v1.3/guides/hardware_sizing/
o https://docs.influxdata.com/influxdb/v1.3/concepts/glossary/
o https://docs.influxdata.com/influxdb/v1.3/query_language/schema_exploration/
• Grafana Plugins
o https://grafana.com/plugins
• Time Series Insights
o https://docs.microsoft.com/en-us/azure/time-series-insights/time-series-insights-overview
Thanks to our Sponsors:
Global Sponsor:
With the support of:
Upcoming events
SQLSaturday #642 (Sofia) October 14
http://www.sqlsaturday.com/642/
JS Talks (Sofia), November 18
http://jstalks.net/

Contenu connexe

Tendances

hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata Storagehive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata StorageDataWorks Summit/Hadoop Summit
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersJean-Paul Azar
 
Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)Eric Sun
 
04 spark-pair rdd-rdd-persistence
04 spark-pair rdd-rdd-persistence04 spark-pair rdd-rdd-persistence
04 spark-pair rdd-rdd-persistenceVenkat Datla
 
Koalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache SparkKoalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache SparkDatabricks
 
Understanding InfluxDB Basics: Tags, Fields and Measurements
Understanding InfluxDB Basics: Tags, Fields and MeasurementsUnderstanding InfluxDB Basics: Tags, Fields and Measurements
Understanding InfluxDB Basics: Tags, Fields and MeasurementsInfluxData
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkRahul Jain
 
Architecture of Big Data Solutions
Architecture of Big Data SolutionsArchitecture of Big Data Solutions
Architecture of Big Data SolutionsGuido Schmutz
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Databricks
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataDataWorks Summit/Hadoop Summit
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergAnant Corporation
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardParis Data Engineers !
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data ArchitectureGuido Schmutz
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 

Tendances (20)

hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata Storagehive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer Consumers
 
How to build a successful Data Lake
How to build a successful Data LakeHow to build a successful Data Lake
How to build a successful Data Lake
 
Bigtable and Dynamo
Bigtable and DynamoBigtable and Dynamo
Bigtable and Dynamo
 
Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)
 
04 spark-pair rdd-rdd-persistence
04 spark-pair rdd-rdd-persistence04 spark-pair rdd-rdd-persistence
04 spark-pair rdd-rdd-persistence
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
 
Apache HBase™
Apache HBase™Apache HBase™
Apache HBase™
 
Koalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache SparkKoalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache Spark
 
Understanding InfluxDB Basics: Tags, Fields and Measurements
Understanding InfluxDB Basics: Tags, Fields and MeasurementsUnderstanding InfluxDB Basics: Tags, Fields and Measurements
Understanding InfluxDB Basics: Tags, Fields and Measurements
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Architecture of Big Data Solutions
Architecture of Big Data SolutionsArchitecture of Big Data Solutions
Architecture of Big Data Solutions
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch data
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Graph database
Graph database Graph database
Graph database
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 

Similaire à Time Series Databases for IoT (On-premises and Azure)

IoT with Azure Machine Learning and InfluxDB
IoT with Azure Machine Learning and InfluxDBIoT with Azure Machine Learning and InfluxDB
IoT with Azure Machine Learning and InfluxDBIvo Andreev
 
Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Tech Triveni
 
Apache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoTApache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoTjixuan1989
 
Benchmark Showdown: Which Relational Database is the Fastest on AWS?
Benchmark Showdown: Which Relational Database is the Fastest on AWS?Benchmark Showdown: Which Relational Database is the Fastest on AWS?
Benchmark Showdown: Which Relational Database is the Fastest on AWS?Clustrix
 
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL ServicesAmazon Web Services
 
Oracle Database 12c - Features for Big Data
Oracle Database 12c - Features for Big DataOracle Database 12c - Features for Big Data
Oracle Database 12c - Features for Big DataAbishek V S
 
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Lucidworks
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Tugdual Grall
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDBMongoDB
 
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...Caserta
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...Amazon Web Services
 
High-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsHigh-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsClusterpoint
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity PlanningMongoDB
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsClaudiu Barbura
 
PostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolPostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolEDB
 
Time Series Analytics Azure ADX
Time Series Analytics Azure ADXTime Series Analytics Azure ADX
Time Series Analytics Azure ADXRiccardo Zamana
 
Suburface 2021 IBM Cloud Data Lake
Suburface 2021 IBM Cloud Data LakeSuburface 2021 IBM Cloud Data Lake
Suburface 2021 IBM Cloud Data LakeTorsten Steinbach
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, whenEugenio Minardi
 
Deep Dive into DynamoDB
Deep Dive into DynamoDBDeep Dive into DynamoDB
Deep Dive into DynamoDBAWS Germany
 
Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBMongoDB
 

Similaire à Time Series Databases for IoT (On-premises and Azure) (20)

IoT with Azure Machine Learning and InfluxDB
IoT with Azure Machine Learning and InfluxDBIoT with Azure Machine Learning and InfluxDB
IoT with Azure Machine Learning and InfluxDB
 
Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...
 
Apache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoTApache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoT
 
Benchmark Showdown: Which Relational Database is the Fastest on AWS?
Benchmark Showdown: Which Relational Database is the Fastest on AWS?Benchmark Showdown: Which Relational Database is the Fastest on AWS?
Benchmark Showdown: Which Relational Database is the Fastest on AWS?
 
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
 
Oracle Database 12c - Features for Big Data
Oracle Database 12c - Features for Big DataOracle Database 12c - Features for Big Data
Oracle Database 12c - Features for Big Data
 
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
 
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
 
High-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsHigh-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutions
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatterns
 
PostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolPostgreSQL as a Strategic Tool
PostgreSQL as a Strategic Tool
 
Time Series Analytics Azure ADX
Time Series Analytics Azure ADXTime Series Analytics Azure ADX
Time Series Analytics Azure ADX
 
Suburface 2021 IBM Cloud Data Lake
Suburface 2021 IBM Cloud Data LakeSuburface 2021 IBM Cloud Data Lake
Suburface 2021 IBM Cloud Data Lake
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, when
 
Deep Dive into DynamoDB
Deep Dive into DynamoDBDeep Dive into DynamoDB
Deep Dive into DynamoDB
 
Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDB
 

Plus de Ivo Andreev

Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2Ivo Andreev
 
Architecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for BusinessArchitecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for BusinessIvo Andreev
 
Cybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and BadCybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and BadIvo Andreev
 
JS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIJS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIIvo Andreev
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersIvo Andreev
 
OpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsOpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsIvo Andreev
 
Cutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for EveryoneCutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for EveryoneIvo Andreev
 
Collecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn DataCollecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn DataIvo Andreev
 
Collecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure OrbitalCollecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure OrbitalIvo Andreev
 
Language Studio and Custom Models
Language Studio and Custom ModelsLanguage Studio and Custom Models
Language Studio and Custom ModelsIvo Andreev
 
CosmosDB for IoT Scenarios
CosmosDB for IoT ScenariosCosmosDB for IoT Scenarios
CosmosDB for IoT ScenariosIvo Andreev
 
Forecasting time series powerful and simple
Forecasting time series powerful and simpleForecasting time series powerful and simple
Forecasting time series powerful and simpleIvo Andreev
 
Constrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project BonsaiConstrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project BonsaiIvo Andreev
 
Azure security guidelines for developers
Azure security guidelines for developers Azure security guidelines for developers
Azure security guidelines for developers Ivo Andreev
 
Autonomous Machines with Project Bonsai
Autonomous Machines with Project BonsaiAutonomous Machines with Project Bonsai
Autonomous Machines with Project BonsaiIvo Andreev
 
Global azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure LighthouseGlobal azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure LighthouseIvo Andreev
 
Flux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JSFlux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JSIvo Andreev
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesIvo Andreev
 
Industrial IoT on Azure
Industrial IoT on AzureIndustrial IoT on Azure
Industrial IoT on AzureIvo Andreev
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkIvo Andreev
 

Plus de Ivo Andreev (20)

Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2
 
Architecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for BusinessArchitecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for Business
 
Cybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and BadCybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and Bad
 
JS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIJS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AI
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
 
OpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsOpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and Misconceptions
 
Cutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for EveryoneCutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for Everyone
 
Collecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn DataCollecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn Data
 
Collecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure OrbitalCollecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure Orbital
 
Language Studio and Custom Models
Language Studio and Custom ModelsLanguage Studio and Custom Models
Language Studio and Custom Models
 
CosmosDB for IoT Scenarios
CosmosDB for IoT ScenariosCosmosDB for IoT Scenarios
CosmosDB for IoT Scenarios
 
Forecasting time series powerful and simple
Forecasting time series powerful and simpleForecasting time series powerful and simple
Forecasting time series powerful and simple
 
Constrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project BonsaiConstrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project Bonsai
 
Azure security guidelines for developers
Azure security guidelines for developers Azure security guidelines for developers
Azure security guidelines for developers
 
Autonomous Machines with Project Bonsai
Autonomous Machines with Project BonsaiAutonomous Machines with Project Bonsai
Autonomous Machines with Project Bonsai
 
Global azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure LighthouseGlobal azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure Lighthouse
 
Flux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JSFlux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JS
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challenges
 
Industrial IoT on Azure
Industrial IoT on AzureIndustrial IoT on Azure
Industrial IoT on Azure
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
 

Dernier

Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 

Dernier (20)

E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 

Time Series Databases for IoT (On-premises and Azure)

  • 1. September 12 Time Series DB for IoT Choosing the Right IoT Data Platform On-Premises and Azure
  • 2. About me • Project Manager @ o 15 years professional experience o .NET Web Development MCPD • External Expert Horizon 2020 • External Expert Eurostars & IFD • Business Interests o Web Development, SOA, Integration o Security & Performance Optimization o IoT, Computer Intelligence • Contact o ivelin.andreev@icb.bg o www.linkedin.com/in/ivelin o www.slideshare.net/ivoandreev
  • 3. Agenda Time Series o Why Time Series o Choosing the Right IoT Data Platform o InfluxDB vs Competitors o Key Concepts o Demo
  • 4. IoT Data • IoT (buzzword of the year) • 30 billion “things” by 2020 (Forbes) • Top IoT Industries • Manufacturing • Healthcare • What are the benefits from IoT? • The discussion has shifted • How to make IoT work? • How to gain insight on hidden relations? • How to get actionable results?
  • 5. ASPires Distributed IoT Platform Funded by EUROPEAN CIVIL PROTECTION AND HUMANITARIAN AID OPERATIONS 2016/PREV/03 (ASPIRES)
  • 6. IoT Data is Time Stamp Data • Series o Identified by source name (i.e. sensor ID) and metric name (i.e. temp) o Consists of ordered {time, value} measurements o Regular and Irregular • Time Series DB o Optimized for TS Data • Process Historian – more than TS DB o Interfaces to read data from multiple data sources o Render graphics for meaningful points o Statistical process control o Redundancy and high availability
  • 7. TS Data Characteristics • Writes o 95%-99% of all operations o Streaming live data from multiple devices o Typically sequential appends • Updates to modify values are rare • Deletes are bulk on large ranges (days, months, years) • Queries o Typically sequential o Concurrent reads are common • Performance issues are typically I/O bound o Caching does not work well for BigData o Systems are typically distributed by design Credits: Baron Schwartz https://www.xaprb.com/blog/2014/06/08/time-series-database-requirements/
  • 8. This is TS Data
  • 10. Choosing a Proper Data Platform
  • 11. Data Platform Services for IoT Data • Data Aggregation Services o Collect, normalize, aggregate metrics and events data • Data Storage Services o Distributed by nature o High write load, fast retrieval, efficient compression • Analytics and Visualization Services • Technologies not designed for the use case o Relational DB Engines o Columnar or key-value databases (HBase, Cassandra) o Search engine (Elasticsearch) o Document-oriented (MongoDB, DocumentDB) o First gen. TS DB (Graphite, OpenTSDB)
  • 12. When TS overperform RDBMS • Target scenarios o High I/O rate o Number of tags o Volume of data o Aggregation of irregular data o Compression & De-duplication • Requires a learning o Do you expect that many data? o Do you need to plot? o Do you need to aggregate?
  • 13. InfluxDB vs SQL Server 2016 Credits: Angelin Nedelchev (ICB)
  • 14. Validation Setup InfluxDB v1.3.0 single node SQL Server 2016 Standard CPU: Intel Core i7 3.40GHz (quad, 8MB cache) Memory: 16GB DDR3 1333MHz OS: Windows 10 Enterprise, 64 bit SSD: Samsung 850 EVO 250GB HDD: Seagate 7200RPM, 16MB cache, 500GB Name Type Indexed ID (PK) int Yes TagID int Yes Value int No Timestamp DateTime Yes Name Type tagID Int Tag Key value Int Field Key time DateTime TimeStamp InfluxDB schema: SQL Server schema:
  • 15. Write Performance (writes/sec) 0 50000 100000 150000 200000 250000 0 20,000,000 40,000,000 60,000,000 80,000,000 100,000,000 Influx Insertspersecond • Influx average write speed of 200,000 writes/second • SQL Server average write speed of 5,000-8,000 writes/second • Memory optimized tables increases write performance by 1.4x • Write trend unaffected by the amount of data (in this volumes) • Influx outperforms SQL Server 40x 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 0 20,000,000 40,000,000 60,000,000 80,000,000 100,000,000 SQL In-Memory OLTP
  • 16. Disk Storage • 27% improvement is SQL Server page compression is enabled (1883MB) • Influx has 19x-26x less disk requirements for the same functionality 2579MB 98MB 0 500 1000 1500 2000 2500 3000 On-Disk Storage SQL Server Influx DiskSpaceUsed(megabytes)DiskSpaceUsed(megabytes)
  • 17. Read Performance (queries/sec) 0 500 1000 1500 2000 2500 3000 3500 4000 1 2 4 8 InfluxDB SQL Server QueriesperSecond Concurrency Benchmark query: Aggregate samples of 10,000 points, average over 1,000 runs. • SQL Server mean query response time – 78ms (13 queries/sec) • Influx mean query response time – 1,3ms (769 queries/sec) • Influx has 59x better query throughput than SQL Server
  • 18. Read Performance (rows at a time) • SQL Server relatively unaffected by SSD (due to caching) • Influx performance improves up to 100% on SSD, chunk size 10’000 • SQL Server is up to 2x better on HDD (Influx is better up to 600K) • Relatively equal on SSD (Influx is better up to 1.6M) 0 500,000 1,000,000 1,500,000 2,000,000 2,500,000 3,000,000 3,500,000 0 5 10 15 20 25 30 HDD Performance SQL Influx Queriedresults Seconds to execute (lower is better) 0 500,000 1,000,000 1,500,000 2,000,000 2,500,000 3,000,000 3,500,000 0 5 10 15 20 25 30 SSD Performance SQL Influx Seconds to execute (lower is better)
  • 19. InfluxDB vs NonSQL for TS Data • InfluxDB vs MongoDB o WRITE: 27x greater o STORAGE: 84x less o QUERY: Equal performance • InfluxDB vs Elasticsearch o WRITE: 8x greater o STORAGE: 4x less o QUERY: 3.5x – 7.5x faster • InfluxDB vs OpenTSDB o WRITE: 5x greater o STORAGE: 16.5x less o QUERY: 4x faster • InfluxDB vs Cassandra o WRITE: 5.3x greater o STORAGE: 9.3x less o QUERY: up to 168x faster • InfluxDB vs DocumentDB o More popular (Google Trends) o Cloud and on-premises o No external dependencies o Aggregations
  • 20. Database as a Service in Azure • Typical Azure IoT Services (Before 2017.04.20) • With Time Series Insigths (beta) • Other o CosmosDB with MongoDB (SLA 10ms reads, 15ms writes)
  • 21. Azure Stream Insights Fully managed analytics, storage and visualization service Benefits • Reduced number of services used • Monitor IoT solutions • Event source from IoT Hub and Event Hub (only) • Visualize and analyze data at large scale • Root cause analysis and anomaly detection • APIs available for management of raw data Cons • Pricey (€126.53, 1M events - €1,138.50 month/unit, 10M events) • Specific use cases • Limited capabilities
  • 22. • Open-source distributed TS database • Key Features o Easy setup, no external dependencies, implemented in Go o Runs on Linux, Windows, OS X o Supports .NET, Java, JS, R, PHP, Python, Ruby, Go, Node.js o Comprehensive documentation o Scalable, highly efficient o REST API (JSON) o SQL-like syntax o On-premise and cloud • Top ranked TS DB
  • 23. Key Concepts Term Description Measurement Container (Table) Point Single record for timestamp Field Set Required; Not-indexed Field key Define what is measured Field value Actual measured value (string, bool, int64, float64) Tag Set Metadata about the point Optional; Indexed; Key-value; Tag key Unique per measurement Tag value Unique per tag key Series Points with common tag set • Aggregation functions • Retention policies • Downsampling • Continuous queries
  • 24. Scalability & Sizing Guidelines • Single node or cluster o Single node is open source and free • Recommendations • Query complexity (Moderate) o Multiple functions, few regular expressions o Complex GROUP BY clause or sampling over weeks o Runtime 500ms – 5sec Load Resources Writes/Sec Moderate Queries/Sec. Unique Series Low Cores: 2-4; RAM: 2-4 GB 0 - 5K 0 - 5 0 – 100K Moderate Cores: 4-6; RAM: 8-32 GB 5K - 250K 5 - 25 100K - 1M High Cores: 8+; RAM: 32+ 250K – 750K 25 - 100 1M - 10M
  • 25. Functions Aggregations Selectors Transformations Predictors COUNT() BOTTOM() CEILING() HOLT_WINTERS() DISTINCT() FIRST() CUMULATIVE_SUM() INTEGRAL() LAST() DERIVATIVE() MEAN() MAX() DIFFERENCE() MEDIAN() MIN() ELAPSED() MODE() PERCENTILE() FLOOR() SPREAD() SAMPLE() HISTOGRAM() STDDEV() TOP() MOVING_AVERAGE() SUM() NON_NEGATIVE_DERIVATIVE()
  • 26. InfluxData Advantages • End-to-end solution • Flexible tag support o Graphite, RRD (no tags at all) o OpenTSDB, Kairos (tag number) • Regular & Irregular TS o Graphite, OpenTSDB • Multiple Data Types • High Performance o Hbase, Elastic, Cassandra • Functionality o Hbase, Cassandra
  • 28. Takeaways • Why Time Series Matters for IoT o https://www.influxdata.com/ resources/webinar-time-series-monitoring- metrics-real-time-analytics-iotsensor-data/ • InfluxDB o https://www.influxdata.com/resources/ o https://docs.influxdata.com/influxdb/v1.3/guides/hardware_sizing/ o https://docs.influxdata.com/influxdb/v1.3/concepts/glossary/ o https://docs.influxdata.com/influxdb/v1.3/query_language/schema_exploration/ • Grafana Plugins o https://grafana.com/plugins • Time Series Insights o https://docs.microsoft.com/en-us/azure/time-series-insights/time-series-insights-overview
  • 29. Thanks to our Sponsors: Global Sponsor: With the support of:
  • 30. Upcoming events SQLSaturday #642 (Sofia) October 14 http://www.sqlsaturday.com/642/ JS Talks (Sofia), November 18 http://jstalks.net/