Pravega
March Community Meeting
Welcome
• Last call was November 6th, 2020 via Zoom
• Since then:
• Pravega was accepted as a CNCF Sandbox project (Nov. 10th, 2020)
• 0.7.3 was released (Dec. 9th, 2020)
• 0.8.1 was released (Jan. 14th, 2021)
• 0.9.0 was released (Mar. 3rd, 2021)
• This call via Cloud Native Community Groups (aka Bevy.com)
• We will host these monthly
Community Developments
• Pravega Akka connector updated with Key Value Tables support
• https://github.com/akka/alpakka/pull/2566
• Pravega on ARM64 and RISC-V
• RISC-V support issues have been opened in upstream dependencies
• PR for ARM64 support: https://github.com/pravega/pravega/pull/5747
• Documentation improvements coming soon
• New guides to help Dev & Admin roles get started, updates to existing docs
• New website coming soon
• Maintainers group and Steering Committee are forming
• Maintainers group invites have been sent out
• Steering Committee invites will be coming soon
• Join the users mailing list: https://lists.cncf.io/g/cncf-pravega-users
Next call
April 16th – 7 AM Pacific
Topic: Connectors
• Akka connector
• Presto connector
• Flink connector
• Spark connector
• NiFi connector
Open to suggestions, especially if you want to show off your connector.
(And please suggest topics for future monthly calls!)
Suggestions & feedback – send to: Derek.Moore@dell.com
Today’s call – Agenda
• State of Pravega
• Overview of Experimental Features of Pravega
• Schema Registry
• Consumption Based Retention
• Simplified Long-Term Storage (SLTS)
• SLTS Plugin for BookKeeper
• Key Value Tables
• Overview of Performance Evaluation
Time ran short, so the Key Value Tables and Performance Evaluation presentations were rescheduled.
April 16th Community Call will feature:
Key Value Tables (KVT)
Performance Evaluation
Akka connector w/ KVT
State of Pravega
Flavio Junqueira
Pravega
What's Pravega?
• Pravega is about streaming data...
• Data sources: continuously generated data
• Processing applications: visualize, alert, train, infer
• Pravega ingests and stores, providing: consistency, elasticity, durability
• A durable log plus tiered storage, backed by scale-out storage (e.g., an object store)
• Clients: Java; others under development
Where's Pravega used?
Streaming Data Platform
https://www.delltechnologies.com/en-us/blog/episode-two-the-best-ride-of-your-life/
https://www.youtube.com/watch?v=BTh1gkf0kQQ
https://www.youtube.com/watch?v=89IDFI9jry8
Dell Tech Customer Profile - RWTH
Amusement Parks
Construction sites
Industrial IoT
Looking forward to seeing community use cases
Open-source trajectory
Timeline and status
• Open-sourced early in 2017
• First open-source release: 0.1.0 – Dec. 19, 2017
• 0.9.0 is fresh out of the oven
• In 2020, accepted into the CNCF Sandbox
• Transition
• Overall bootstrapping
• Setting up communication channels
• Web site revamp (coming soon!)
• Organizing repositories
• Documenting governance
https://github.com/cncf/toc/issues/560
Source: https://star-history.t9t.io/#pravega/pravega
Repositories
Pravega Core
Total: 43 Repositories
Connectors:
• Apache Flink
• Apache Spark
• Apache NiFi
• Logstash
• Presto (brand new)
Kubernetes Operators:
• Pravega
• Apache BookKeeper
• Apache Zookeeper
Tools:
• Pravega tools
• Flink tools
• Benchmark
Client bindings:
• Rust
• Python
Contributions
• Unique collaborators across repositories: 135
• Vast majority from Dell
• Expect more non-Dell contributions
• Many open issues and opportunities to contribute
• Look for guidance, maybe a mentor, if you want to get involved
• Going forward
• Expect more Pravega features
• Important focus on ecosystem
Get involved!
https://github.com/pravega/pravega/wiki/Contributing
T-Shirts for the best questions
Thank you!
Consumption Based Retention
(CBR)
http://pravega.io
Prajakta Belgundi
12/03/2021
@PravegaIO https://github.com/pravega/pravega
Data Retention for Streams (without CBR)
• A Stream can be configured for:
• No Retention Policy –
• Data is never auto-truncated.
• Manual truncation using an explicit API invocation is possible.
• SIZE/TIME based Retention Policy
• Data is periodically truncated based on size/time limits.
• The policy supports specifying an "at least" (min) value.
• A retention cycle runs periodically on the Controller and truncates Streams that breach the policy limit.
Stream Truncation
• Every time the retention cycle runs, a new Stream-Cut is generated at the tail of the Stream.
• Truncation can happen only at a specific Stream-Cut.
• When a Stream is found to have more data than the configured retention limit, the Controller identifies a Stream-Cut that satisfies the Retention Policy and truncates the Stream at this Stream-Cut.
Size Based Retention
Limitations
• Stream truncation is agnostic of reads by Reader Group(s).
• Unread data could be lost.
• Streams tend to consume more space, as the approach to deletion is conservative.
• No max limit on Stream size.
Why CBR ?
• Some streaming use-cases do not require data to be stored over the long term.
• Once data is “read” by specific Reader-Group(s) it can be deleted.
• Environments may have constrained Storage capacity – e.g.: Edge Gateways.
• Need to cycle/move out data as soon as it is read.
What is CBR ?
• Stream Truncation can happen based on read positions of “specific” Reader Groups reading from
the Stream.
• These Reader Groups need to be created as “subscriber” Reader Groups.
• Read positions of “non-subscriber” Reader Groups do not impact Stream truncation.
• The Stream Retention policy also has a max limit (in addition to the min limit discussed earlier)
Configuring CBR
• The existing Retention policy (SIZE/TIME based) can stay as is.
• To enable CBR, update the Reader-Group(s) configuration on Client to be “subscriber” Reader-
Group(s).
• If all Reader Groups are non-subscribers, the Stream won’t have Consumption Based Retention.
• Optionally, set a max (at-most) limit on the Stream Retention Policy; defaults to LONG_MAX.
How CBR works
• A subscriber Reader Group periodically publishes the Stream-Cut corresponding to its "read" positions in the Stream to the Controller.
• This Stream-Cut is stored on the Controller.
• When the retention cycle on the Controller runs, the Stream-Cuts from all "subscriber" Reader Groups are used to compute a single subscriber-lowerbound-stream-cut.
• The Stream is truncated at this subscriber-lowerbound-stream-cut if it satisfies the min/max limit criteria.
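The lower-bound computation can be sketched as follows. This is a simplified illustration, not Pravega's implementation: it assumes a Stream-Cut can be modeled as a map from segment ID to read offset, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubscriberLowerBound {
    // A stream-cut is modeled as a map from segment ID to read offset.
    // For each segment, the lower bound keeps the smallest offset reported
    // by any subscriber reader group, so no subscriber loses unread data.
    public static Map<Integer, Long> compute(List<Map<Integer, Long>> subscriberCuts) {
        Map<Integer, Long> lowerBound = new HashMap<>();
        for (Map<Integer, Long> cut : subscriberCuts) {
            for (Map.Entry<Integer, Long> e : cut.entrySet()) {
                lowerBound.merge(e.getKey(), e.getValue(), Math::min);
            }
        }
        return lowerBound;
    }
}
```

Truncating at this per-segment minimum guarantees that data still unread by the slowest subscriber is preserved, subject to the min/max limits described next.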
Reader Group Configuration changes …
Retention Type - a new parameter that can be used to enable CBR:
• AUTOMATIC_RELEASE_AT_LAST_CHECKPOINT –
• At every checkpoint completion, the Reader Group automatically emits the Stream-cut
corresponding to the checkpoint as “read” acknowledgement to Controller.
• MANUAL_RELEASE_AT_USER_STREAMCUT –
• No automatic publishing of Stream-Cuts from Client to Controller.
• Users need to create manual Checkpoints and publish the Stream-Cuts corresponding to these checkpoints to the Controller using the updateRetentionStreamCut() API.
• NONE (default) –
• This Reader Group does not participate in Consumption Based Retention.
• Its read positions do not impact Stream truncation.
Min/Max and Subscriber Lower Bound
• If Min < SLB < Max, truncate at SLB
• If SLB < Min, truncate at Min
• If SLB > Max, truncate at Max
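Reading the slide's Min, Max, and SLB as truncation offsets, the three rules above collapse into a single clamp. This is an illustrative sketch with hypothetical names: minOffset and maxOffset stand for the stream-cuts derived from the policy's min and max limits.

```java
public class TruncationRule {
    // minOffset / maxOffset are the offsets corresponding to the policy's
    // min and max retention limits; slb is the subscriber lower bound.
    // The truncation point is the SLB clamped into [minOffset, maxOffset].
    public static long truncationOffset(long slb, long minOffset, long maxOffset) {
        return Math.max(minOffset, Math.min(slb, maxOffset));
    }
}
```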
Questions?
SLTS
Introducing Simplified Long-Term Storage
Sachin Joshi
Sr. Principal Software Engineer
DELL EMC
@PravegaProject https://github.com/pravega/pravega
Why: Pravega is Streaming Storage
• Durability is fundamental
• Once acknowledged, data is never lost
• Performance is critical
• Low latency
• High throughput
• Storage Efficiency is important
• Single Unified API that works for
• Real time data
• Historical Data
• Excellent choice for Kappa Architecture
• Automatic Tiering
• Between low-latency short-term storage and large-capacity external long-term storage
• Space-efficient and performant
• Completely transparent
• Bring your own external storage.
• Cloud Native
• Multi-cloud
• Meet customers where they are already moving (object stores)
• Enable edge.
LTS is an integral part of Pravega Storage
Traditional way vs. Pravega way
Quick Background
• Concepts
• Stream
• Segments
• Append only semantics. Can be sealed.
• Transactions use concatenation
• Scaling up or down
• Mapping of Routing key to segment is maintained by
Controller
• Implementation
• Segment Store
• Multiple containers per segment store.
• Number of containers is fixed for a deployment.
• Mapping from segment to containers is consistent.
• Storages
• Tier-1: in-cluster short-term storage for the Write-Ahead Log
• Tier-2: external long-term storage
• Cache: ephemeral (internal spillover cache)
• Assumptions
• Tier-2 writes are async, not on critical path
• Throughput matters more
• Segment is an opaque sequence of bytes
• Strong assumptions about tier-1 fencing.
• Tier-2 reads can be optimized by prefetching
Goal : Provide Segment Abstraction to upper layers
Requirements
• Segments are dynamic:
• Grow at tail end
• Shrink at head end
• Otherwise immutable
• Segments are everywhere
• Segments form the very foundation on top of which higher
order Pravega features, and data structures are built.
• Almost all the data in Pravega is ultimately stored in such
segments.
• User streams, attribute segments,
• Key Value tables,
• internal Pravega streams,
• client state management etc.
• All of these assume that this segment abstraction
works as expected
Challenges
• How to build segments out of immutable
objects?
• How to enforce single writer pattern?
• How to implement Atomic appends?
• How to deal with eventual consistency?
• How to truncate at the head ?
Problem 1 : Split Brain & Fencing
Problem
• Requirement:
• Data is appended atomically
• Strong single writer pattern
• Split-brain: in case of a network partition, more than one Segment Store may end up writing to the same underlying file, causing data corruption.
• Writes to storage are async, so an older SS may simply be flushing data even if it is not accepting new traffic
• Long GC pauses
• Safe appends are needed:
• NFSv3 locks are problematic
• HDFS appends are atomic, but not concurrency safe –
cannot append at specific offset.
• AWS S3 has eventual consistency
Current Solution
• Fencing
• Use "fencing" to mark ownership on underlying storage object
to prevent older SS from writing
• Implementation depends on guarantees and capabilities
of underlying storage
• How we do it today
• ECS S3 – use offset conditional appends
• HDFS – use atomic renames
• NFS – overwrite data
• Downside
• Each storage binding provides different sets of guarantees
• New fencing solution needs to be provided for each new
binding
• Possible performance degradation
• Hard to reason about correctness and liveness properties.
• Saving grace: SSs are simply applying the changelog from the WAL, so they should produce the same data at the same offset.
Problem 2 : Object Stores
Problem
• Requirement:
• Data is appended atomically
• Strong single writer pattern
• No Append functionality:
• AWS S3 does not provide appends or partial updates
• Entire object must be overwritten
• Eventually consistent:
• Provides a read-after-write guarantee only on new object creation
• All GETs are eventually consistent – you may get an old version
• Listing objects is eventually consistent
• Versions are also eventually consistent
• Cost
• Per operation charges
Current Solution
• Not supported today
• Possible solution : Use multi-part upload
• Create big object out of small parts
• Use CopyPartRequest
• Issues
• Still must deal with eventual consistency
• Managing objects is hard with eventual consistency
• Versioning doesn't help
• No mechanism today to manage multiple objects
What : Defining scope
Goals
• Simplify API contract for storage bindings
• Eliminate need for complex fencing
during failover. (E.g. During partitions)
• Give freedom to optimize append logic
• Leverage storage native
merge/concatenation capability
• Provide extension points for the future
• Additional background services –
Defragmentation, Integrity checks
• Data compression, encryption, erasure encoding
• Access control
• Multiple tiers
Out of scope
• This design does not target the following:
• Reading data written by Pravega
without using Pravega.
• Import or export of pre-existing
data in other formats (e.g. Avro)
• Multiple tiers
How: Architecture Change
Segment Store - current design:
• ReadIndex, DurableLog, SegmentMapper
• StorageReader, StorageWriter, Cache
• AsyncStorage, RollingStorage, SyncStorage
Segment Store - new design:
• ReadIndex, DurableLog, SegmentMapper
• StorageReader, StorageWriter, Cache
• ChunkedSegmentStorage
• Metadata TableStore
• SyncStorage, ChunkStorage
• Storage Optimizer (Defragment, GarbageCollector)
(The original diagram marks each component as unchanged, deleted, added, or changed.)
@PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
http://pravega.io
Unified Chunk Management Layer
• Chunk
• “Unit” of storage: stored on underlying storage as files or
objects.
• Append-only writes:
• Effectively leverage append or concat from the underlying storage wherever we can.
• Otherwise, each write is a separate chunk.
• Strong single writer pattern: Chunks can be written by only one
Segment Store.
• Immutable: once they become inactive, they are considered
“sealed”.
• Names: arbitrary, but must be globally unique.
• Segment
• Made from chunks – conceptually a linked list of chunks
• New chunk is added when
• New segment is written for the first time
• After failover
• Underlying file/object reaches its limit
• Underlying storage does not provide safe append semantics
• Append only writes - There can be only one active chunk per
segment at any time.
ChunkedSegmentStorage
Component in each Segment Store container that manages metadata for the segments it owns
• Conceptually the segment metadata consists of
• A header describing various properties of a segment
• plus a linked list of chunk metadata records describing each chunk.
• Metadata is stored in a BookKeeper-based Table
• Stored as KV pairs
• Table is pinned to a container.
• Metadata updates are atomic (when using multiple records).
• Metadata updates are fenced by tier-1
• Metadata updates are efficient
• Metadata updates are infrequent and updated lazily
• Metadata records are cached using write-through cache for read performance.
• Important optimization – metadata about a chunk can be updated only when it becomes inactive.
• Metadata records offer points of extensibility
• Access Control lists at segment level
• CRC-check sums per chunk
• Import and export of metadata to a file/segment
• Metadata can be stored on
• Any storage that supports “read after writes” consistency
ChunkStorage
Simple contract that each storage provider must implement
ChunkStorage works only at the chunk level; it is not aware of segment layout.
• Required
• Create (C)
• Read (R)
• Open (O) - open a chunk for read or write; returns SegmentHandle
• Write (W) - the write must be appended in a concurrency-safe way (otherwise each write needs to be a separate chunk)
• Delete (D)
• List (L) - list chunks
• Info (I) - get attributes of the chunk (e.g., size)
• Optional
• Merge (M) - concat existing chunks
• Truncate (T) - truncate at the end
• Make Writable / Read-only
• Not required
• Fencing logic
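As a rough illustration of how small the required contract is, here is a toy in-memory binding covering the CRUDL-style operations. All names here are hypothetical; the real ChunkStorage interface lives in the Pravega codebase and differs in signatures and error handling. Real bindings would map these calls onto files or objects.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// A toy chunk store: chunks are append-only byte sequences keyed by name.
class InMemoryChunkStorage {
    private final Map<String, ByteArrayOutputStream> chunks = new HashMap<>();

    public void create(String name) {                         // Create (C)
        chunks.put(name, new ByteArrayOutputStream());
    }

    public void write(String name, byte[] data) {             // Write (W): append-only
        chunks.get(name).write(data, 0, data.length);
    }

    public byte[] read(String name, int offset, int length) { // Read (R)
        byte[] all = chunks.get(name).toByteArray();
        byte[] out = new byte[length];
        System.arraycopy(all, offset, out, 0, length);
        return out;
    }

    public long info(String name) {                           // Info (I): size
        return chunks.get(name).size();
    }

    public void delete(String name) {                         // Delete (D)
        chunks.remove(name);
    }

    public Set<String> list() {                               // List (L)
        return chunks.keySet();
    }
}
```

Note that nothing in this contract involves fencing or segment layout: that is exactly the simplification SLTS is after.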
Advantage of SLTS
Technical
• Separation of Responsibility
• Storage providers must only support simple
CRUDL operations
• Clean implementation of Single Writer
pattern
• Increased concurrency
• higher degree of thread utilization
• higher degree of parallelism
• higher read and write throughput and
lower latency.
• Robust failure handling
Functional
• Plug-in model
• Enables third-party storage adapters for a wide variety of systems.
• SDK and Test Suite (in near future)
• Batteries included
• Pravega comes with built-in adapters for NFS, HDFS and ECS
• S3 (in near future)
• Its own built-in metadata store
• fine-tuned for SLTS usage pattern.
Timeline
• 0.8 – Alpha – initial implementation
• 0.9 – Beta 1 – experimental; initial bindings for File System (NFS), HDFS, and ECS
• 0.9.1 – Beta 2 – experimental; stability improvements and bug fixes
• 0.10 – stable release; additional bindings (e.g., S3); admin and diagnostic tools
• 0.11 – SLTS SDK and test suite; migration tools
• Future – default Pravega LTS
References
• Design documents - PDP 34
• https://github.com/pravega/pravega/wiki/PDP-34:-Simplified-Tier-2
• Slack Channel : pravega-lts
• https://pravega-io.slack.com/archives/C013PPW5WC9
Backup slides
Key Operations: Metadata-only Operations
• Create – a new metadata record is added
• Open – the metadata record is checked for access
• Exists – the metadata record is checked for existence
• Seal – the status field in the metadata record is updated
• Unseal – the status field in the metadata record is updated
• Delete – first marked for delete
Key Operations: Data Operations
• Read – chunk metadata is retrieved based on offset (a variant of binary search); the offset within the chunk is calculated; the read is issued on the underlying storage
• Parallel reads – as above, except multiple offset ranges are read in parallel
• Write – the active chunk's metadata is retrieved; the write is issued; chunk metadata is updated lazily
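The read path (find the chunk covering a stream offset, then compute the offset within it) can be sketched with a binary search over chunk start offsets. This is a simplified model in which segment metadata is reduced to an array of start offsets; the names are hypothetical, not Pravega's.

```java
import java.util.Arrays;

public class SegmentReadPath {
    // startOffsets[i] is the stream offset at which chunk i begins.
    // Returns {chunkIndex, offsetWithinChunk} for a given segment offset,
    // using binary search over the chunk start offsets.
    public static long[] locate(long[] startOffsets, long offset) {
        int i = Arrays.binarySearch(startOffsets, offset);
        if (i < 0) {
            i = -i - 2; // insertion point minus one: the chunk covering the offset
        }
        return new long[] { i, offset - startOffsets[i] };
    }
}
```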
Key Operations: Layout Changes
• Concatenation – when transactions are committed, the two chunk linked lists are concatenated; data is not moved
• Defrag – a big chunk is created by concatenating smaller chunks; multiple chunk metadata records are replaced with a single record for the larger chunk
• Truncate – truncated at the head by deleting unneeded chunks
• Rollover – when the size limit is reached, a new chunk is added
Key Scenarios
• Rolling Storage
• Add new chunk each time chunk size limit is exceeded
• Segment Store Failover
1. New SS records the size of chunk that it sees.
2. New SS seals the chunk at that offset (from previous step)
3. Old SS can keep on writing even after this, but that does not matter, as we will not read data after the recorded offset.
4. Old SS is fenced for tier-1 from making any metadata updates (all table segment updates
go through tier-1)
5. New SS starts a new chunk
6. New SS adds a metadata record for the new chunk
7. New SS replays the Write Ahead Log
8. New SS saves data to new chunk
9. If new SS fails, the process repeats
SLTS for BlobIt! Object Store on BookKeeper
BlobIt.org
github.com/diennea/blobit
Pravega BlobIt ChunkManager
github.com/diegosalvi/pravega-blobit-chunkmanager
Thank you! See you next time!
• Slack Invite – pravega-slack-invite.herokuapp.com
• Slack Workspace – pravega-io.slack.com
• Blog – blog.pravega.io
• April 16th meeting: KVT, Perf & Akka – community.cncf.io/e/m9mdcn
• Feedback – Derek.Moore@dell.com
Implementing-SaaS-on-Kubernetes-Michael-Knapp-Andrew-Gao-Capital-One.pdfImplementing-SaaS-on-Kubernetes-Michael-Knapp-Andrew-Gao-Capital-One.pdf
Implementing-SaaS-on-Kubernetes-Michael-Knapp-Andrew-Gao-Capital-One.pdfssuserf4844f
 

Similaire à 2021 March Pravega Community Meeting (20)

CodeIgniter For Project : Lesson 103 - Introduction to Codeigniter
CodeIgniter For Project : Lesson 103 - Introduction to CodeigniterCodeIgniter For Project : Lesson 103 - Introduction to Codeigniter
CodeIgniter For Project : Lesson 103 - Introduction to Codeigniter
 
7 Apache Process Cloudstack Developer Day
7 Apache Process Cloudstack Developer Day7 Apache Process Cloudstack Developer Day
7 Apache Process Cloudstack Developer Day
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
Spring Framework 3.2 - What's New
Spring Framework 3.2 - What's NewSpring Framework 3.2 - What's New
Spring Framework 3.2 - What's New
 
Unicon Nov 2014 IAM Briefing
Unicon Nov 2014 IAM BriefingUnicon Nov 2014 IAM Briefing
Unicon Nov 2014 IAM Briefing
 
Create Great CNCF User-Base from Lessons Learned from Other Open Source Commu...
Create Great CNCF User-Base from Lessons Learned from Other Open Source Commu...Create Great CNCF User-Base from Lessons Learned from Other Open Source Commu...
Create Great CNCF User-Base from Lessons Learned from Other Open Source Commu...
 
Enterprise Use Case Webinar - PaaS Metering and Monitoring
Enterprise Use Case Webinar - PaaS Metering and Monitoring Enterprise Use Case Webinar - PaaS Metering and Monitoring
Enterprise Use Case Webinar - PaaS Metering and Monitoring
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
 
Create great cncf user base from lessons learned from other open source com...
Create great cncf user base from   lessons learned from other open source com...Create great cncf user base from   lessons learned from other open source com...
Create great cncf user base from lessons learned from other open source com...
 
Unicon June 2014 IAM Briefing
Unicon June 2014 IAM BriefingUnicon June 2014 IAM Briefing
Unicon June 2014 IAM Briefing
 
Trove Updates - Liberty Edition
Trove Updates - Liberty EditionTrove Updates - Liberty Edition
Trove Updates - Liberty Edition
 
Openstack trove-updates
Openstack trove-updatesOpenstack trove-updates
Openstack trove-updates
 
Developing XWiki
Developing XWikiDeveloping XWiki
Developing XWiki
 
IWSG2014: Developing Science Gateways Using Apache Airavata
IWSG2014: Developing Science Gateways Using Apache AiravataIWSG2014: Developing Science Gateways Using Apache Airavata
IWSG2014: Developing Science Gateways Using Apache Airavata
 
Accumulo Summit 2015: Real-Time Distributed and Reactive Systems with Apache ...
Accumulo Summit 2015: Real-Time Distributed and Reactive Systems with Apache ...Accumulo Summit 2015: Real-Time Distributed and Reactive Systems with Apache ...
Accumulo Summit 2015: Real-Time Distributed and Reactive Systems with Apache ...
 
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloReal-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
 
VA Smalltalk Update
VA Smalltalk UpdateVA Smalltalk Update
VA Smalltalk Update
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 
Implementing-SaaS-on-Kubernetes-Michael-Knapp-Andrew-Gao-Capital-One.pdf
Implementing-SaaS-on-Kubernetes-Michael-Knapp-Andrew-Gao-Capital-One.pdfImplementing-SaaS-on-Kubernetes-Michael-Knapp-Andrew-Gao-Capital-One.pdf
Implementing-SaaS-on-Kubernetes-Michael-Knapp-Andrew-Gao-Capital-One.pdf
 
StarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
StarlingX - A Platform for the Distributed Edge | Ildiko VancsaStarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
StarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
 

Dernier

English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfnikeshsingh56
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etclalithasri22
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are successPratikSingh115843
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfPratikPatil591646
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformationAnnie Melnic
 

Dernier (17)

English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etc
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are success
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdf
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 

2021 March Pravega Community Meeting

  • 8. What's Pravega? • Pravega is about streaming data... [Diagram: continuously generated data from data sources is ingested and stored by Pravega, providing consistency, elasticity, and durability, backed by scale-out storage (e.g., an object store); processing applications visualize, alert, train, and infer] 8 Pravega Community Meeting - March 2021
  • 9. What's Pravega? [Same diagram, adding Pravega's internals: a Durable Log plus Tiered Storage] 9 Pravega Community Meeting - March 2021
  • 10. What's Pravega? [Same diagram, adding clients: Java, with others under development] 10 Pravega Community Meeting - March 2021
  • 11. Where's Pravega used? 11 Pravega Community Meeting - March 2021
  • 12. Streaming Data Platform 12 Pravega Community Meeting - March 2021
  • 13. Pravega Community Meeting - March 2021 13 [Example deployments: RWTH (Dell Tech customer profile), amusement parks, construction sites, industrial IoT] https://www.delltechnologies.com/en-us/blog/episode-two-the-best-ride-of-your-life/ https://www.youtube.com/watch?v=BTh1gkf0kQQ https://www.youtube.com/watch?v=89IDFI9jry8
  • 14. Looking forward to seeing community use cases 14 Pravega Community Meeting - March 2021
  • 16. Timeline and status • Open-sourced early in 2017 • First open-source release: 0.1.0 – Dec. 19, 2017 • 0.9.0 is fresh out of the oven • In 2020, CNCF sandboxing • Transition • Overall bootstrapping • Setting up communication channels • Web site revamp (coming soon!) • Organizing repositories • Documenting governance https://github.com/cncf/toc/issues/560 Source: https://star-history.t9t.io/#pravega/pravega 16 Pravega Community Meeting - March 2021
  • 17. Repositories 17 Pravega Community Meeting - March 2021 Pravega Core Total: 43 Repositories Connectors: • Apache Flink • Apache Spark • Apache NiFi • Logstash • Presto (brand new) Kubernetes Operators: • Pravega • Apache BookKeeper • Apache Zookeeper Tools: • Pravega tools • Flink tools • Benchmark Client bindings: • Rust • Python
  • 18. Contributions • Unique collaborators across repositories: 135 • Vast majority from Dell • Expect more non-Dell contributions • Many open issues and opportunities to contribute • Look for guidance, maybe a mentor, if you want to get involved • Going forward • Expect more Pravega features • Important focus on ecosystem Pravega Community Meeting - March 2021 18
  • 19. Get involved! 19 Pravega Community Meeting - March 2021 https://github.com/pravega/pravega/wiki/Contributing
  • 20. T-Shirts for the best questions 20 Pravega Community Meeting - March 2021
  • 21. Pravega Community Meeting - March 2021 21 Thank you!
  • 33. @PravegaIO https://github.com/pravega/pravega Data Retention for Streams (without CBR) • A Stream can be configured for: • No Retention Policy • Data is never auto-truncated. • Manual truncation using explicit API invocation is possible. • SIZE/TIME-based Retention Policy • Data is periodically truncated based on size/time limits. • The policy supports specifying an "at least" (min) value. • A Retention Cycle runs periodically on the Controller and truncates Streams that breach the policy limit. http://pravega.io https://pravega-io.slack.com
  • 34. @PravegaIO https://github.com/pravega/pravega Stream Truncation • Every time the retention cycle runs, a new Stream-Cut is generated at the tail of the Stream. • Truncation can happen only at a specific Stream-Cut. • When a Stream is found to have more data than the configured retention limit, the Controller identifies a Stream-Cut that satisfies the Retention Policy and truncates the Stream at this Stream-Cut. http://pravega.io https://pravega-io.slack.com
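To make the mechanics concrete, here is a minimal single-segment sketch of how a retention cycle could pick a truncation Stream-Cut under a size-based policy. The function name and the flat-offset model are illustrative only, not the Pravega implementation:

```python
# Conceptual sketch of size-based retention (not the actual Pravega API).
# A Stream-Cut maps segment -> offset; for simplicity we model one segment,
# so each cut is just an offset.

def pick_truncation_cut(tail_offset, historical_cuts, min_retention_bytes):
    """Return the newest historical Stream-Cut offset whose truncation
    still retains at least min_retention_bytes, or None if no cut does."""
    best = None
    for cut in sorted(historical_cuts):              # oldest -> newest
        if tail_offset - cut >= min_retention_bytes:
            best = cut                               # newest cut keeping >= min
    return best

# Retention cycles recorded Stream-Cuts at offsets 100, 250 and 400;
# the stream tail is at 500 and the policy keeps at least 200 bytes.
print(pick_truncation_cut(500, [100, 250, 400], 200))  # -> 250
```

Truncating at offset 400 would retain only 100 bytes (below the min), so the cycle falls back to the cut at 250.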
  • 35. @PravegaIO https://github.com/pravega/pravega Size Based Retention http://pravega.io https://pravega-io.slack.com
  • 36. @PravegaIO https://github.com/pravega/pravega Limitations • Stream truncation is agnostic of reads by Reader Group(s). • Unread data could be lost. • Streams tend to consume more space, as the approach to deletion is conservative. • No max limit on Stream size. http://pravega.io https://pravega-io.slack.com
  • 37. @PravegaIO https://github.com/pravega/pravega Why CBR? • Some streaming use cases do not require data to be stored over the long term. • Once data is "read" by specific Reader Group(s), it can be deleted. • Environments may have constrained storage capacity, e.g., edge gateways. • Need to cycle/move out data as soon as it is read. http://pravega.io https://pravega-io.slack.com
  • 38. @PravegaIO https://github.com/pravega/pravega What is CBR? • Stream truncation can happen based on the read positions of specific Reader Groups reading from the Stream. • These Reader Groups need to be created as "subscriber" Reader Groups. • Read positions of non-subscriber Reader Groups do not impact Stream truncation. • The Stream Retention Policy also has a max limit (in addition to the min limit discussed earlier). http://pravega.io https://pravega-io.slack.com
  • 39. @PravegaIO https://github.com/pravega/pravega Configuring CBR • The existing Retention Policy (SIZE/TIME based) can stay as is. • To enable CBR, update the Reader Group configuration on the Client to make it a "subscriber" Reader Group. • If all Reader Groups are non-subscribers, the Stream won't have Consumption Based Retention. • Optionally, set a max (at-most) limit on the Stream Retention Policy. Defaults to LONG_MAX. http://pravega.io https://pravega-io.slack.com
  • 40. @PravegaIO https://github.com/pravega/pravega How CBR works • A subscriber Reader Group periodically publishes the Stream-Cut corresponding to its "read" positions in the Stream to the Controller. • This Stream-Cut is stored on the Controller. • When the retention cycle on the Controller runs, the Stream-Cuts from all subscriber Reader Groups are used to compute a single subscriber-lowerbound-stream-cut. • The Stream is truncated at this subscriber-lowerbound-stream-cut if it satisfies the min/max limit criteria. http://pravega.io https://pravega-io.slack.com
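The subscriber lower bound can be thought of as the per-segment minimum over all subscribers' published Stream-Cuts: truncating there never deletes data that any subscriber still has to read. A small illustrative sketch (not the Controller's actual code):

```python
# Conceptual sketch: the subscriber-lowerbound-stream-cut is the
# per-segment minimum of the Stream-Cuts published by all subscriber
# Reader Groups (illustrative model, not the Pravega implementation).

def subscriber_lower_bound(subscriber_cuts):
    """subscriber_cuts: list of {segment_id: offset} dicts, one per
    subscriber Reader Group. Returns the per-segment minimum offsets."""
    slb = {}
    for cut in subscriber_cuts:
        for segment, offset in cut.items():
            slb[segment] = min(offset, slb.get(segment, offset))
    return slb

rg1 = {0: 300, 1: 120}   # read positions of subscriber Reader Group 1
rg2 = {0: 250, 1: 180}   # read positions of subscriber Reader Group 2
print(subscriber_lower_bound([rg1, rg2]))  # -> {0: 250, 1: 120}
```

The slowest subscriber on each segment determines how far the stream can safely be truncated.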
  • 41. @PravegaIO https://github.com/pravega/pravega Reader Group Configuration changes … Retention Type – a new parameter that can be used to enable CBR: • AUTOMATIC_RELEASE_AT_LAST_CHECKPOINT • At every checkpoint completion, the Reader Group automatically emits the Stream-Cut corresponding to the checkpoint as a "read" acknowledgement to the Controller. • MANUAL_RELEASE_AT_USER_STREAMCUT • No automatic publishing of Stream-Cuts from Client to Controller. • Users need to create manual checkpoints and publish the Stream-Cuts corresponding to these checkpoints to the Controller using the updateRetentionStreamCut() API. • NONE (default) • This Reader Group does not participate in Consumption Based Retention. • Its read positions do not impact Stream truncation. http://pravega.io https://pravega-io.slack.com
  • 42. @PravegaIO https://github.com/pravega/pravega Min/Max and Subscriber Lower Bound • If Min < SLB < Max, truncate at SLB • If SLB < Min, truncate at Min • If SLB > Max, truncate at Max http://pravega.io https://pravega-io.slack.com
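The three cases above amount to clamping the subscriber lower bound (SLB) into the [Min, Max] range. A tiny sketch, with Min, Max and SLB modeled as abstract truncation positions (illustrative only):

```python
# The min/max decision from the slide, written as a clamp.
# slb, min_cut and max_cut are abstract truncation positions
# (in Pravega these would be Stream-Cuts; this is an illustrative model).

def truncation_point(slb, min_cut, max_cut):
    if slb < min_cut:
        return min_cut    # SLB below Min: truncate at Min
    if slb > max_cut:
        return max_cut    # SLB beyond Max: truncate at Max
    return slb            # Min < SLB < Max: truncate at SLB

print(truncation_point(50, 20, 80))   # -> 50
print(truncation_point(10, 20, 80))   # -> 20
print(truncation_point(95, 20, 80))   # -> 80
```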
  • 44. SLTS http://pravega.io Introducing Simplified Long Term Storage. Sachin Joshi Sr. Principal Software Engineer DELL EMC
  • 45. @PravegaProject https://github.com/pravega/pravega http://pravega.io https://pravega-io.slack.com Why: Pravega is Streaming Storage • Durability is fundamental • Once acknowledged, data is never lost • Performance is critical • Low latency • High throughput • Storage efficiency is important • Single unified API that works for • Real-time data • Historical data • Excellent choice for Kappa Architecture • Automatic tiering • between low-latency short-term storage and large-capacity external long-term storage • space efficient and performant • completely transparent • Bring your own external storage • Cloud native • Multi-cloud • Meet customers where they are already moving (object stores) • Enable edge.
  • 46. http://pravega.io LTS is an integral part of Pravega storage • Traditional way vs. Pravega way @PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
  • 47. http://pravega.io LTS is an integral part of Pravega storage • Traditional way vs. Pravega way (continued) @PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
  • 48. http://pravega.io Quick Background • Concepts • Stream • Segments • Append-only semantics; can be sealed. • Transactions use concatenation • Scaling up or down • Mapping of routing key to segment is maintained by the Controller • Implementation • Segment Store • Multiple containers per Segment Store. • The number of containers is fixed for a deployment. • Mapping from segment to containers is consistent. • Storages • Tier-1: in-cluster short-term storage for the Write-Ahead Log • Tier-2: external long-term storage • Cache – ephemeral (internal spillover cache) • Assumptions • Tier-2 writes are async, not on the critical path • Throughput matters more • A segment is an opaque sequence of bytes • Strong assumptions about Tier-1 fencing. • Tier-2 reads can be optimized by prefetching @PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
  • 49. http://pravega.io Goal: Provide the Segment Abstraction to upper layers Requirements • Segments are dynamic: • Grow at the tail end • Shrink at the head end • Otherwise immutable • Segments are everywhere • Segments form the very foundation on top of which higher-order Pravega features and data structures are built. • Almost all the data in Pravega is ultimately stored in such segments. • User streams, attribute segments, • Key Value Tables, • internal Pravega streams, • client state management, etc. • All of these assume that this segment abstraction works as expected Challenges • How to build segments out of immutable objects? • How to enforce the single-writer pattern? • How to implement atomic appends? • How to deal with eventual consistency? • How to truncate at the head? @PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
  • 50. http://pravega.io Problem 1: Split Brain & Fencing Problem • Requirement: • Data is appended atomically • Strong single-writer pattern • Split-brain: in case of a network partition, more than one Segment Store may end up writing to the same underlying file, causing data corruption. • Writes to storage are async, so an older SS may simply be flushing data even if not accepting new traffic • Long GC pauses • Safe appends are needed: • NFSv3 locks are problematic • HDFS appends are atomic, but not concurrency safe – cannot append at a specific offset. • AWS S3 has eventual consistency Current Solution • Fencing • Use "fencing" to mark ownership on the underlying storage object to prevent an older SS from writing • Implementation depends on the guarantees and capabilities of the underlying storage • How we do it today • ECS S3 – use offset-conditional appends • HDFS – use atomic renames • NFS – overwrite data • Downside • Each storage binding provides a different set of guarantees • A new fencing solution needs to be provided for each new binding • Possible performance degradation • Hard to reason about correctness and liveness properties. • Saving grace: SSs are simply applying the changelog from the WAL, so they should produce the same data at the same offset. @PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
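To illustrate the offset-conditional append technique mentioned for ECS S3, here is a toy in-memory model: a write succeeds only if the caller's expected offset matches the object's current length, so a stale Segment Store flushing with an out-of-date offset is rejected. The class and method names are invented for illustration; real fencing involves more than this:

```python
# Toy model of offset-conditional appends: the storage accepts an append
# only at the current end of the object, so a writer holding a stale
# offset (e.g. an old Segment Store after failover) fails fast.

class ConditionalAppendObject:
    def __init__(self):
        self.data = b""

    def append(self, expected_offset, payload):
        if expected_offset != len(self.data):
            raise IOError("conditional append failed: bad offset")
        self.data += payload
        return len(self.data)

obj = ConditionalAppendObject()
obj.append(0, b"abc")        # new owner writes at offset 0
try:
    obj.append(0, b"stale")  # old owner still believes the offset is 0
except IOError as e:
    print("fenced:", e)
obj.append(3, b"def")        # new owner continues at offset 3
print(obj.data)              # -> b'abcdef'
```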
  • 51. http://pravega.io Problem 2: Object Stores Problem • Requirement: • Data is appended atomically • Strong single-writer pattern • No append functionality: • AWS S3 does not provide appends or partial updates • The entire object must be overwritten • Eventually consistent: • Provides only a read-after-write guarantee, and only on new object creation • All GETs are eventually consistent – you may get an old version • Listing objects is eventually consistent • Versions are also eventually consistent • Cost • Per-operation charges Current Solution • Not supported today • Possible solution: use multi-part upload • Create a big object out of small parts • Use CopyPartRequest • Issues • Still must deal with eventual consistency • Managing objects is hard with eventual consistency • Versioning doesn't help • No mechanism today to manage multiple objects @PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
  • 52. http://pravega.io What: Defining scope Goals • Simplify the API contract for storage bindings • Eliminate the need for complex fencing during failover (e.g., during partitions) • Give freedom to optimize append logic • Leverage storage-native merge/concatenation capability • Provide extension points for the future • Additional background services – defragmentation, integrity checks • Data compression, encryption, erasure encoding • Access control • Multiple tiers Out of scope • This design does not target the following • Reading data written by Pravega without using Pravega. • Import or export of pre-existing data in other formats (e.g., Avro) • Multiple tiers @PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
  • 53. http://pravega.io How: Architecture Change • Segment Store – current: ReadIndex, DurableLog, SegmentMapper, StorageReader, StorageWriter, Cache, AsyncStorage, RollingStorage, SyncStorage • Segment Store – new design: ReadIndex, DurableLog, SegmentMapper, StorageReader, StorageWriter, Cache, ChunkedSegmentStorage, Metadata TableStore, SyncStorage, ChunkStorage, Storage Optimizer (Defragment, GarbageCollector) • Legend: Unchanged / Deleted / Added / Changed @PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
  • 54. http://pravega.io Unified Chunk Management Layer • Chunk • "Unit" of storage: stored on the underlying storage as files or objects. • Append-only writes: • Effectively leverage append or concat from the underlying storage wherever we can. • Otherwise each write is a separate chunk • Strong single-writer pattern: chunks can be written by only one Segment Store. • Immutable: once they become inactive, they are considered "sealed". • Names: arbitrary, but must be globally unique. • Segment • Made from chunks – conceptually a linked list of chunks • A new chunk is added when • A new segment is written for the first time • After failover • The underlying file/object reaches its limit • The underlying storage does not provide safe append semantics • Append-only writes – there can be only one active chunk per segment at any time. @PravegaProject https://github.com/pravega/pravega https://pravega-io.slack.com
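The "segment as a linked list of chunks" idea can be sketched in a few lines: reading at a segment offset means walking the chunk list to find the owning chunk and the local offset within it. This is an illustrative model, not the SLTS code:

```python
# Conceptual sketch: a segment is an ordered list of immutable chunks;
# a read at a segment offset resolves to (chunk, offset-within-chunk).
# (Illustrative model only, not the SLTS implementation.)

def locate(chunks, segment_offset):
    """chunks: list of (chunk_name, length) in segment order.
    Returns (chunk_name, offset_within_chunk)."""
    start = 0
    for name, length in chunks:
        if segment_offset < start + length:
            return name, segment_offset - start
        start += length
    raise ValueError("offset beyond segment length")

segment = [("chunk-0", 100), ("chunk-1", 50), ("chunk-2", 200)]
print(locate(segment, 120))  # -> ('chunk-1', 20)
```

Appending only ever extends the last (active) chunk or adds a new one, which is why the earlier chunks can stay immutable.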
  • 55. http://pravega.io ChunkedSegmentStorage Component in each Segment Store container that manages metadata for the segments it owns • Conceptually, the segment metadata consists of • a header describing various properties of a segment • plus a linked list of chunk metadata records describing each chunk. • Metadata is stored in a BookKeeper-based Table • Stored as KV pairs • The Table is pinned to a container. • Metadata updates are atomic (when using multiple records). • Metadata updates are fenced by Tier-1 • Metadata updates are efficient • Metadata updates are infrequent and applied lazily • Metadata records are cached using a write-through cache for read performance. • Important optimization – metadata about a chunk can be updated only when it becomes inactive. • Metadata records offer points of extensibility • Access control lists at the segment level • CRC checksums per chunk • Import and export of metadata to a file/segment • Metadata can be stored on • any storage that supports "read after write" consistency
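The metadata layout described above (one header record per segment plus a linked list of chunk records, all stored as KV pairs) can be sketched like this; the key format and field names are invented for illustration:

```python
# Sketch of the segment metadata layout: a header record plus one record
# per chunk, forming a linked list, all stored as key/value pairs.
# Keys and field names here are illustrative, not the SLTS schema.

def write_segment_metadata(table, segment, length, chunks):
    # Header record: segment-wide properties, including the first chunk.
    table[f"segment/{segment}"] = {
        "length": length,
        "first_chunk": chunks[0]["name"] if chunks else None,
    }
    # Chunk records form a linked list via the "next" field.
    for i, chunk in enumerate(chunks):
        nxt = chunks[i + 1]["name"] if i + 1 < len(chunks) else None
        table[f"chunk/{chunk['name']}"] = {
            "length": chunk["length"],
            "next": nxt,
        }

table = {}
write_segment_metadata(table, "s1", 150,
                       [{"name": "c0", "length": 100},
                        {"name": "c1", "length": 50}])
print(table["segment/s1"]["first_chunk"])  # -> c0
print(table["chunk/c0"]["next"])           # -> c1
```

Because each record is a separate KV pair, updating the active chunk touches only its own record, which is what makes lazy, infrequent metadata updates practical.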
  • 56. http://pravega.io ChunkStorage
A simple contract that each storage provider must implement. ChunkStorage implementations work only at the chunk level; they are not aware of segment layout.
• Required
• Create (C)
• Read (R)
• Open (O): open a chunk for read or write, returns a handle
• Write (W): writes must be appended in a concurrency-safe way (otherwise each write needs to be a separate chunk)
• Delete (D)
• List (L): list chunks
• Info (I): get attributes of the chunk (e.g. size)
• Optional
• Merge (M): concat existing chunks
• Truncate (T): truncate at the end
• Make writable/read-only
• Not required
• Fencing logic
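A contract of this shape, with an in-memory binding, might look like the sketch below. The method names mirror the operations listed above but are illustrative only, not Pravega's actual interface:

```python
# Minimal sketch of the ChunkStorage contract (CRUDL + Info) with an
# in-memory implementation. Names are illustrative, not Pravega's API.
from abc import ABC, abstractmethod

class ChunkStorage(ABC):
    @abstractmethod
    def create(self, name): ...       # C
    @abstractmethod
    def read(self, name, offset, length): ...   # R
    @abstractmethod
    def write(self, name, data): ...  # W: append-only
    @abstractmethod
    def delete(self, name): ...       # D
    @abstractmethod
    def list(self): ...               # L
    @abstractmethod
    def info(self, name): ...         # I: attributes, e.g. size
    # Optional capabilities (merge/concat, truncate) are advertised,
    # not required; fencing logic is never required of the provider.
    def supports_concat(self):
        return False

class InMemoryChunkStorage(ChunkStorage):
    def __init__(self):
        self._chunks = {}

    def create(self, name):
        self._chunks[name] = bytearray()

    def read(self, name, offset, length):
        return bytes(self._chunks[name][offset:offset + length])

    def write(self, name, data):
        self._chunks[name] += data    # append-only

    def delete(self, name):
        del self._chunks[name]

    def list(self):
        return sorted(self._chunks)

    def info(self, name):
        return {"size": len(self._chunks[name])}
```

Because the required surface is just create/read/write/delete/list/info, a binding for a new storage system stays small; everything segment-aware lives above this layer in ChunkedSegmentStorage.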
  • 57. http://pravega.io Advantages of SLTS
Technical
• Separation of responsibility: storage providers must only support simple CRUDL operations
• Clean implementation of the single-writer pattern
• Increased concurrency:
• Higher degree of thread utilization
• Higher degree of parallelism
• Higher read and write throughput, lower latency
• Robust failure handling
Functional
• Plug-in model: enables third-party storage adapters for a wide variety of systems
• SDK and test suite (in the near future)
• Batteries included:
• Pravega comes with built-in adapters for NFS, HDFS and ECS
• S3 (in the near future)
• Its own built-in metadata store, fine-tuned for the SLTS usage pattern
  • 58. http://pravega.io Timeline
Release | Notes
0.8 | Alpha – initial implementation
0.9 | Beta 1 – experimental; initial bindings for file system (NFS), HDFS and ECS
0.9.1 | Beta 2 – experimental; stability improvements and bug fixes
0.10 | Stable release; additional bindings (e.g. S3); admin and diagnostic tools
0.11 | SLTS SDK and test suite; migration tools
Future | Default Pravega LTS
  • 59. http://pravega.io References
• Design document – PDP 34
• https://github.com/pravega/pravega/wiki/PDP-34:-Simplified-Tier-2
• Slack channel: pravega-lts
• https://pravega-io.slack.com/archives/C013PPW5WC9
  • 61. Key Operations: Metadata-only Operations
Operation | How Implemented
Create | A new metadata record is added
Open | The metadata record is checked for access
Exists | The metadata record is checked for existence
Seal | The status field in the metadata record is updated
Unseal | The status field in the metadata record is updated
Delete | The segment is first marked for delete
  • 62. Key Operations: Data Operations
Operation | How Implemented
Read | Chunk metadata is retrieved based on the offset (a variant of binary search); the offset within the chunk is calculated; the read is issued on the underlying storage
Parallel reads | As above, except multiple offset ranges are read in parallel
Write | The active chunk's metadata is retrieved; the write is issued; chunk metadata is updated lazily
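The read path above reduces to an offset lookup: given the chunks' start offsets, binary-search for the chunk covering a segment offset, then compute the offset within that chunk. A minimal sketch (purely illustrative, not Pravega code):

```python
# Sketch of the read path: binary search over chunk start offsets to
# find the chunk covering a segment offset, then the in-chunk offset.
import bisect

def locate(chunk_starts, chunk_names, segment_offset):
    """Return (chunk_name, offset_within_chunk) for a segment offset.

    chunk_starts must be sorted ascending; chunk i covers
    [chunk_starts[i], chunk_starts[i+1]).
    """
    i = bisect.bisect_right(chunk_starts, segment_offset) - 1
    return chunk_names[i], segment_offset - chunk_starts[i]
```

A parallel read would simply call `locate` for each requested offset range and issue the per-chunk reads concurrently against the underlying storage.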
  • 63. Key Operations: Layout Changes
Operation | How Implemented
Concatenation | When transactions are committed, the two linked lists of chunks are concatenated; data is not moved
Defrag | A big chunk is created by concatenating smaller chunks; multiple chunk metadata records are replaced with a single record for the larger chunk
Truncate | The segment is truncated at the head by deleting unneeded chunks
Rollover | When the size limit is reached, a new chunk is added
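The key point of the table above is that concatenation and head truncation are metadata-only list operations, with no data movement. A sketch under that assumption (chunk lists are modeled as `(name, length)` pairs; this is illustrative, not Pravega code):

```python
# Sketch of layout changes as pure metadata operations on chunk lists.
# Each chunk is modeled as a (name, length) pair; no data is moved.

def concat(target_chunks, txn_chunks):
    """Commit a transaction: splice its chunk list onto the target's."""
    return target_chunks + txn_chunks      # list concatenation only

def truncate_head(chunks, truncate_offset):
    """Drop chunks that lie entirely before the truncation offset."""
    kept, start = [], 0
    for name, length in chunks:
        if start + length > truncate_offset:
            kept.append((name, length))    # still holds live data
        start += length
    return kept
```

Defrag is the inverse of concatenation at the metadata level: after the storage provider merges small chunks into one big chunk, their several records collapse into a single record for the merged chunk.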
  • 64. Key Scenarios
• Rolling Storage
• Add a new chunk each time the chunk size limit is exceeded
• Segment Store Failover
1. The new Segment Store (SS) records the size of the chunk that it sees.
2. The new SS seals the chunk at that offset (from the previous step).
3. The old SS can keep writing even after this, but that doesn't matter: data after the recorded offset will never be read.
4. The old SS is fenced by tier-1 from making any metadata updates (all table segment updates go through tier-1).
5. The new SS starts a new chunk.
6. The new SS adds a metadata record for the new chunk.
7. The new SS replays the write-ahead log.
8. The new SS saves data to the new chunk.
9. If the new SS fails, the process repeats.
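The safety argument in the failover steps above can be sketched in miniature: the new owner seals the chunk at the size it observed, and all reads are clamped to that offset, so any late appends by the fenced-out old owner are invisible. This is a toy model of the idea, not Pravega code:

```python
# Toy model of failover sealing: reads never go past the offset the
# new Segment Store observed, so late writes by the old owner are
# harmless even though the storage itself does no fencing.

def failover(observed_size, chunk_data):
    sealed_at = observed_size              # steps 1-2: seal at observed offset

    def read(offset, length):
        # Step 3: data beyond sealed_at (possibly a late append by the
        # old owner) is never returned.
        end = min(offset + length, sealed_at)
        return chunk_data[offset:end]

    return read
```

Metadata fencing (step 4) is separate and comes for free from tier-1, which is why the ChunkStorage contract itself never needs fencing logic.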
  • 65. SLTS for BlobIt! Object Store on BookKeeper
• BlobIt – BlobIt.org – github.com/diennea/blobit
• Pravega BlobIt ChunkManager – github.com/diegosalvi/pravega-blobit-chunkmanager
  • 66. Thank you! See you next time!
• Slack invite – pravega-slack-invite.herokuapp.com
• Slack workspace – pravega-io.slack.com
• Blog – blog.pravega.io
• April 16th meeting: KVT, Perf & Akka – community.cncf.io/e/m9mdcn
• Feedback – Derek.Moore@dell.com
