Introduction to Amazon Kinesis Data Streams


This session provides a brief introduction to Amazon Kinesis Data Streams. We'll learn about Kinesis Data Streams, its architecture, its terminology and key concepts, and some basic operations that we can perform on data streams.


  1. Introduction to Amazon Kinesis Data Streams. Presented by: Prateek Gupta
  2. KnolX Etiquettes: Lack of etiquette and manners is a huge turn-off.
     ● Punctuality: Join the session 5 minutes before the start time. We start on time and conclude on time!
     ● Feedback: Submit constructive feedback for all sessions, as it is very helpful for the presenter.
     ● Silent Mode: Keep your mobile devices in silent mode, and feel free to step out of the session if you need to attend an urgent call.
     ● Avoid Disturbance: Avoid unwanted chit-chat during the session.
  3. Our Agenda
     01 What is Streaming Data?
     02 Amazon Kinesis Data Streams
     03 High-Level Architecture
     04 Key Concepts and Terminology
     05 Basic Operations
     06 Demo
  4. What is Streaming Data?
  5. What is Streaming Data? Streaming data is data that is generated continuously, in real time, by thousands of data sources and delivered to a system for processing.
     Key points:
     ● Real-time
     ● Continuous flow
     ● Variety of sources
     ● Variety of formats
     ● Requires specialized processing
     Examples:
     ● E-commerce purchases
     ● Game data
     ● Information from social networks
     ● Log data
     ● Stock prices
     ● GPS data
     ● IoT sensor data
  6. Amazon Kinesis Data Streams: Amazon Kinesis Data Streams is a real-time streaming data service by AWS. It makes it easy to collect and process real-time streaming data at high scale.
     Some key points to understand:
     ● Real-time data
     ● Highly scalable
     ● Data sources
     ● Processing
     ● Cost-effective
     ● Easy to use
  7. High-Level Architecture
     ● Producers continually push data to Kinesis Data Streams, and consumers process the data in real time.
     ● Once a consumer has processed the data, the results are stored using an AWS service such as Amazon DynamoDB, Amazon Redshift, or Amazon S3.
  8. Key Concepts and Terminology
     ➢ Producer: An application that puts data records into Amazon Kinesis Data Streams.
     ➢ Consumer: An application that retrieves data records from Amazon Kinesis Data Streams and processes them.
     ➢ Kinesis Data Stream:
        ○ A Kinesis data stream is a set of shards.
        ○ Each shard has a sequence of data records.
        ○ Each data record has a sequence number.
        ○ Data is retained for 24 hours by default.
  9. ➢ Shard:
        ○ A shard is a uniquely identified sequence of data records.
        ○ A stream is composed of one or more shards, each of which provides a fixed unit of capacity.
        ○ Each shard supports writes of up to 1,000 records per second (up to a maximum of 1 MB/sec) and reads of up to 5 GetRecords transactions per second (up to a maximum of 2 MB/sec).
        ○ The data capacity of a stream is a function of the number of shards.
        ○ If the data rate increases, increase the number of shards allocated to the stream.
     ➢ Data Record:
        ○ A data record is the unit of data stored in a Kinesis data stream.
        ○ Each data record is composed of a sequence number, a partition key, and a data blob (up to 1 MB).
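
Since the capacity of a stream is a function of its shard count, the per-shard write and read limits listed above can be turned into a rough sizing estimate. The following is a minimal sketch, not an official AWS formula: the function name `shards_needed`, the `consumers` parameter, and the choice to treat 1 MB as 10^6 bytes are assumptions made for illustration. It also assumes every consumer reads every record and that all consumers share the 2 MB/sec read limit of a shard.

```python
import math

# Per-shard limits taken from the slide above (1 MB treated as 10**6 bytes, an assumption).
WRITE_RECORDS_PER_SEC = 1000
WRITE_BYTES_PER_SEC = 1_000_000
READ_BYTES_PER_SEC = 2_000_000


def shards_needed(records_per_sec, avg_record_bytes, consumers=1):
    """Estimate how many shards cover the write rate, write bandwidth and read bandwidth."""
    write_bytes = records_per_sec * avg_record_bytes
    # Assumption: each consumer reads the full stream and shares the shard's read limit.
    read_bytes = write_bytes * consumers
    return max(
        1,
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC),
        math.ceil(write_bytes / WRITE_BYTES_PER_SEC),
        math.ceil(read_bytes / READ_BYTES_PER_SEC),
    )


if __name__ == "__main__":
    # 3,000 records/sec at 2,000 bytes each is limited by write bandwidth.
    print(shards_needed(3000, 2000))
```

The largest of the three ratios wins, so a stream of many small records is bounded by the record rate, while a stream of a few large records is bounded by bandwidth.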
  10. ➢ Sequence Number:
        ○ A sequence number identifies each data record; it is unique per partition key within its shard.
        ○ It lets consumers read records in order and track which records have already been processed.
      ➢ Partition Key:
        ○ A partition key is a meaningful identifier that is associated with each record.
        ○ The service uses it to determine which shard stores the record.
        ○ It is specified by the data producer when putting data into a data stream.
        ○ Records with the same partition key are stored together in the same shard.
      ➢ Retention Period:
        ○ The amount of time that data records are stored in an Amazon Kinesis data stream.
        ○ The default retention period for a stream is 24 hours (configurable up to 365 days).
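
To see why records with the same partition key end up in the same shard, it helps to know that Kinesis hashes the partition key with MD5 into a 128-bit integer and assigns the record to the shard whose hash key range contains that value. The sketch below imitates this locally. The helper names (`hash_key`, `even_shard_ranges`, `shard_for`) are hypothetical, the evenly split ranges are an assumption about how a freshly created stream is laid out, and reading the digest as a big-endian integer is also an assumption.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def hash_key(partition_key):
    """Map a partition key to a 128-bit integer using MD5 (big-endian, an assumption)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def even_shard_ranges(shard_count):
    """Split the hash key space into contiguous, inclusive (start, end) ranges."""
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last range absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key, ranges):
    """Return the index of the range (shard) that contains the key's hash."""
    value = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= value <= end:
            return index
    raise ValueError("hash key falls outside all shard ranges")


if __name__ == "__main__":
    ranges = even_shard_ranges(4)
    for key in ("user-1", "user-2", "user-1"):
        print(key, "->", shard_for(key, ranges))
```

Because the mapping depends only on the key, a key with very high traffic sends all of its records to a single shard, which is worth keeping in mind when choosing partition keys.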
  11. ➢ Capacity Mode:
        ○ The capacity mode determines how capacity is managed and how usage is charged for a data stream.
        ○ Currently, in Kinesis Data Streams, we can choose between an on-demand mode and a provisioned mode for our data streams.
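
In practice, the capacity mode is chosen when the stream is created. As a hedged illustration, the sketch below builds the keyword arguments for the `create_stream` call of the boto3 Kinesis client, which accepts `StreamName`, `ShardCount`, and `StreamModeDetails`. The helper name `create_stream_args` and the stream names are hypothetical; the helper only prepares arguments and does not contact AWS.

```python
def create_stream_args(stream_name, mode="ON_DEMAND", shard_count=None):
    """Build keyword arguments for a Kinesis create_stream call in the given capacity mode."""
    if mode not in ("ON_DEMAND", "PROVISIONED"):
        raise ValueError("mode must be ON_DEMAND or PROVISIONED")
    args = {"StreamName": stream_name, "StreamModeDetails": {"StreamMode": mode}}
    if mode == "PROVISIONED":
        # In provisioned mode we decide the number of shards ourselves.
        if shard_count is None or shard_count < 1:
            raise ValueError("provisioned mode needs a shard count of at least 1")
        args["ShardCount"] = shard_count
    return args


if __name__ == "__main__":
    print(create_stream_args("demo-on-demand"))
    print(create_stream_args("demo-provisioned", mode="PROVISIONED", shard_count=2))
    # With boto3 installed and credentials configured, the result could be passed on as:
    #   boto3.client("kinesis").create_stream(**create_stream_args("demo-on-demand"))
```

In on-demand mode no shard count is passed, because the service manages capacity for us; in provisioned mode the shard count is our responsibility.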
  12. Basic Operations: Amazon Kinesis Data Streams provides a number of operations that can be performed on a data stream. Here are some basic operations:
      ● create-stream
      ● describe-stream
      ● list-streams
      ● put-record
      ● get-shard-iterator
      ● get-records
      ● split-shard
      ● merge-shards
      ● delete-stream
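
Three of these operations (put-record, get-shard-iterator, get-records) are enough for a small produce-and-consume round trip. The sketch below is one possible way to wire them together with the boto3 Kinesis client methods `put_record`, `get_shard_iterator`, and `get_records`; it is not the demo from this session. The helper names `put_event` and `read_shard`, the JSON encoding of events, and the stream name in the usage comment are assumptions. The client is passed in as a parameter so the helpers can also be tried against a stand-in object.

```python
import json


def put_event(client, stream_name, event, partition_key):
    """put-record: serialize one event as JSON and write it to the stream."""
    response = client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,
    )
    # The service reports where the record landed and its sequence number.
    return response["ShardId"], response["SequenceNumber"]


def read_shard(client, stream_name, shard_id, limit=100):
    """get-shard-iterator + get-records: read one batch, starting at the oldest record."""
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start from the oldest record still retained
    )["ShardIterator"]
    batch = client.get_records(ShardIterator=iterator, Limit=limit)
    return [json.loads(record["Data"]) for record in batch["Records"]]


# Usage against a real account (assumes boto3 is installed, credentials are configured,
# and a stream with the hypothetical name "demo-stream" already exists):
#   import boto3
#   client = boto3.client("kinesis")
#   shard_id, _ = put_event(client, "demo-stream", {"ticker": "AMZN", "price": 101.5}, "AMZN")
#   print(read_shard(client, "demo-stream", shard_id))
```

A long-running consumer would keep calling `get_records` with the `NextShardIterator` value from each response rather than reading a single batch.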
  13. Demo
  14. References
      ● Kinesis Data Streams Official Documentation
      ● AWS Kinesis - Javatpoint
  15. Thank You!
