
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise

No matter how diligent your organization is at driving toward efficiency, databases are complex and it’s easy to make mistakes on your way to production. The good news is, these mistakes are completely avoidable. In this webinar, Jeff Carpenter shares with you exactly how to get started in the right direction — and stay on the path to a successful database launch.

View recording: https://youtu.be/K9Zj3bhjdQg

Explore all DataStax webinars: https://www.datastax.com/resources/webinars

Published in: Technology

  1. DataStax Top 10 Best Practices. academy.datastax.com | @jscarp. © DataStax, All Rights Reserved.
  2. DataStax Top 10 Best Practices
      • Thanks to DataStax Professional Services
      • Related post: https://academy.datastax.com/top10best practices
  3. Why we ❤️ Apache Cassandra
      • Distributed, decentralized
      • Elastic scalability – add/remove nodes with no downtime
      • High performance
      • High availability / fault tolerant – no single point of failure
      • How do we realize these benefits?
  4. Best practice 1: Know your access patterns
  5. Relational vs. Cassandra Data Modeling
      • Relational Approach
      • Cassandra Approach
      • [Diagram labels shown for each approach: Data, Models, Application]
  6. KillrVideo Reference Application
  7. Application Workflow in KillrVideo
      • User logs into site
      • Show basic information about user
      • Show videos added by a user
      • Show comments posted by a user
      • Search for a video by tag
      • Show latest videos added to the site
      • Show comments for a video
      • Show ratings for a video
      • Show video and its details
  8. Take the KillrVideo Tour
  9. Best practice 2: Get your data model right
  10. [Image-only slide]
  11. Relational Modeling
      • Create entity table
      • Add constraints
      • Index fields
      • Foreign Key relationships
      • SQL != CQL

      CREATE TABLE users (
        id number(12) NOT NULL,
        firstname nvarchar2(25) NOT NULL,
        lastname nvarchar2(25) NOT NULL,
        email nvarchar2(50) NOT NULL,
        password nvarchar2(255) NOT NULL,
        created_date timestamp(6),
        PRIMARY KEY (id),
        CONSTRAINT email_uq UNIQUE (email)
      );

      -- Users by email address index
      CREATE INDEX idx_users_email ON users (email);

      CREATE TABLE videos (
        id number(12),
        userid number(12) NOT NULL,
        name nvarchar2(255),
        description nvarchar2(500),
        location nvarchar2(255),
        location_type int,
        added_date timestamp,
        CONSTRAINT users_userid_fk FOREIGN KEY (userid)
          REFERENCES users (Id) ON DELETE CASCADE,
        PRIMARY KEY (id)
      );
  12. Queries in KillrVideo to Support Workflows
      • Users
        – User logs into site: Find user by email address
        – Show basic information about user: Find user by id
      • Comments
        – Show comments for a video: Find comments by video (latest first)
        – Show comments posted by a user: Find comments by user (latest first)
      • Ratings
        – Show ratings for a video: Find ratings by video
  13. Designing Tables Based on Queries
      • Show video and its details: Find video by id
      • Show videos added by a user: Find videos by user (latest first)

      CREATE TABLE videos (
        videoid uuid,
        userid uuid,
        name text,
        description text,
        location text,
        location_type int,
        preview_image_location text,
        tags set<text>,
        added_date timestamp,
        PRIMARY KEY (videoid)
      );

      CREATE TABLE user_videos (
        userid uuid,
        added_date timestamp,
        videoid uuid,
        name text,
        preview_image_location text,
        PRIMARY KEY (userid, added_date, videoid)
      ) WITH CLUSTERING ORDER BY (added_date DESC, videoid ASC);
  14. Designing for fast access

      CREATE TABLE user_videos (
        userid uuid,
        added_date timestamp,
        videoid uuid,
        name text,
        preview_image_location text,
        PRIMARY KEY (userid, added_date, videoid)
      ) WITH CLUSTERING ORDER BY (added_date DESC, videoid ASC);

      • Partition key – which node(s)
      • Clustering columns – layout on disk
      • …uniqueness
  15. Data Modeling Best Practices
      • Table per query
      • Use denormalization to minimize number of queries required
      • Make sure primary key guarantees uniqueness
      • Use bucketing to break up large partitions
      • Highly recommended: DS220: Practical Application Data Modeling with Apache Cassandra
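Slide 15 recommends bucketing to break up large partitions. One way to picture it is the minimal Python sketch below. It assumes a hypothetical variant of `user_videos` whose partition key is `(userid, bucket)`, where `bucket` is the year and month of `added_date`. The function names and the monthly granularity are illustrative assumptions, not something the deck specifies.

```python
from datetime import datetime, timezone
from typing import List


def month_bucket(ts: datetime) -> str:
    """Return the 'YYYY-MM' bucket label for a timezone-aware timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m")


def buckets_between(start: datetime, end: datetime) -> List[str]:
    """List every monthly bucket from end back to start, newest first.

    A reader that wants the latest rows would query these buckets in
    order, one partition at a time, stopping once it has enough rows.
    """
    start = start.astimezone(timezone.utc)
    end = end.astimezone(timezone.utc)
    year, month = end.year, end.month
    result = []
    while (year, month) >= (start.year, start.month):
        result.append(f"{year:04d}-{month:02d}")
        month -= 1
        if month == 0:  # roll back over a year boundary
            year, month = year - 1, 12
    return result
```

On write, the application would store `month_bucket(added_date)` alongside `userid`. On read, it would walk the result of `buckets_between` so that no single partition grows without bound. The right bucket size depends on the write rate per user.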
  16. Best practice 3: Avoid tombstones
  17. Deletion and Tombstones
      • Append-only storage model – SSTables are immutable
      • Tombstones used to explicitly indicate deleted data – prevent accidental restoration of deleted data
      • Data actually cleaned up during compaction
      • Large numbers of tombstones can affect reads – example log output:

      Read 1 live and 123780 tombstoned cells | 19:48:36,710 | 127.0.0.1 | 128631
  18. Avoid Deletes and Writing Nulls

      INSERT INTO myTable (primary_key, clustering_key)
      VALUES ('pk1', 'ck1');

      VS

      INSERT INTO myTable (primary_key, clustering_key, regular_col)
      VALUES ('pk1', 'ck1', null);

      ⇒ Second version writes a tombstone
  19. Tombstones - Mitigating
      • Use a journal-style data model
      • Set time to live (TTL)
      • Delete the largest possible amount of data at once
        – Range delete > Partition delete > row delete > cell delete > collection item delete
      • http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
      • https://academy.datastax.com/support-blog/cleaning-tombstones-datastax-dse-and-apache-cassandra
      • https://academy.datastax.com/units/compaction-compaction-and-tombstones
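Slide 18 shows that inserting an explicit null writes a tombstone. A small application-side guard can drop null columns before the statement is built, as in the sketch below. `build_insert` is a hypothetical helper, not part of any DataStax driver. It assumes that table and column names come from trusted code and that values are bound through `?` markers, as in a CQL prepared statement.

```python
from typing import Any, Dict, List, Tuple


def build_insert(table: str, values: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Build a parameterized INSERT that skips columns whose value is None.

    Omitting the column entirely (rather than binding null) avoids
    writing a tombstone for it.
    """
    present = {col: val for col, val in values.items() if val is not None}
    if not present:
        raise ValueError("no non-null columns to insert")
    columns = ", ".join(present)
    placeholders = ", ".join("?" for _ in present)
    statement = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return statement, list(present.values())
```

Called with `{"primary_key": "pk1", "clustering_key": "ck1", "regular_col": None}`, the helper produces the first form from slide 18 rather than the second. Drivers may also offer an "unset" bound value that serves the same purpose; check your driver's documentation before relying on it.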
  20. Best practice 4: Know your drivers
  21. DataStax Drivers (@DataStaxAcademy #DataStaxDeveloperDay)
      • OSS Cassandra Drivers
        – CQL Support
        – Sync / Async API
        – Load Balancing
        – Auto Node Discovery
        – Object Mapper
      • DataStax Enterprise Drivers
        – OSS Driver features plus...
        – Unified Authentication
        – Graph Fluent API
        – Geometric Types
      • ODBC
      • JDBC
  22. Driver Documentation
      • Apache Cassandra Drivers (Open Source)
        – DataStax Java Driver
        – DataStax Python Driver
        – DataStax Node.js Driver
        – DataStax Ruby Driver
        – DataStax C# Driver
        – DataStax C/C++ Driver
        – DataStax PHP Driver
      • DataStax Drivers (DataStax Enterprise)
        – DataStax Enterprise Java Driver
        – DataStax Enterprise Python Driver
        – DataStax Enterprise Node.js Driver
        – DataStax Enterprise Ruby Driver
        – DataStax Enterprise C# Driver
        – DataStax Enterprise C/C++ Driver
        – DataStax Enterprise PHP Driver
  23. DataStax Driver Tips and Tricks
      • Common features:
        – Connection management
        – Creating and executing statements, and accessing the results
        – Synchronous and asynchronous execution
        – Object mapping
        – Logging and metrics
        – Policy management
        – Threading, networking and resource management
        – Schema access / management
      • Tips
        – Load balancing policy
        – Retry policy
        – Connection pool settings
        – Asynchronous operations
      • Coming soon: Getting Started with Drivers quick courses on DataStax Academy
  24. Best practice 5: Plan for and practice operations
  25. Do you have a Run Book?
      • Installation
      • Upgrade
      • Scaling up
      • Scaling down
      • Node replacement
      • Repairs
      • Backup
      • Restore
      • Monitoring
      • Tuning
  26. OpsCenter
      • Browser-based DSE cluster tool for:
        – Configuring
        – Monitoring
        – Managing
      • Two major components tied together:
        – OpsCenter Monitoring - monitoring and management
        – Life Cycle Manager (LCM) - mostly configuration and deployment
  27. Start with Security
      • End to end encryption
      • Data auditing
      • LDAP integration
      • Kerberos integration
      • Role based access control
      • Row-level access control
      Bad security results in data breaches. DSE has a number of features that can be used to secure your data at all stages.
  28. Best practice 6: Do performance testing
  29. Performance Testing Tips
      • Use your actual data model and realistic test data
      • Measure against SLAs – 99th percentile
      • Happy path and error / load conditions
      • Automate using Cassandra Stress, Gatling
      • Incorporate performance monitoring into operations
  30. OpsCenter Performance Service
      • Want to know when your nodes are slowing down? We'll send you a message
      • DSE will monitor your queries
      • Slow queries? Will let you know what's going on
  31. DSE 6.7 - Monitoring with Prometheus and Grafana
  32. Sample Grafana Dashboard
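Slide 29 suggests measuring against SLAs at the 99th percentile. The sketch below shows one way to check a batch of measured latencies against such a target, using the nearest-rank definition of a percentile. The function names, the millisecond unit and the choice of nearest-rank are assumptions made for illustration. Tools such as Cassandra Stress or Gatling report percentiles themselves and may compute them differently.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_sla(latencies_ms: Sequence[float], sla_ms: float, pct: float = 99.0) -> bool:
    """True when the pct-th percentile latency is within the SLA."""
    return percentile(latencies_ms, pct) <= sla_ms
```

A percentile target is stricter than an average: a run with a low mean latency can still fail `meets_sla` if a small fraction of requests is very slow, and that tail is what users notice.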
  33. Best practice 7: Automate cluster management
  34. OpsCenter Lifecycle Manager (LCM)
      • Run jobs
        – Installation
        – Configuration
        – Upgrade
  35. Automate Monitoring and Management
      • Use Lifecycle Manager API to automate actions
      • Puppet, Chef, Ansible, Terraform, etc.
      • https://docs.datastax.com/en/opscenter/6.7/opsc/opscApi_g.html
  36. Best practice 8: Understand how DSE features can help
  37. DSE Search: Apache Solr / Lucene + Cassandra
      • Supports ad-hoc queries not supported by Cassandra
      • Full text search, Faceting, Stemming
      • Geospatial Search
      • Live indexing engine
      • No separate search cluster
      • No ETL or sync to build and maintain
      • Search indexes co-located with Cassandra
      • Search-enabled CQL
      • Search index creation via CQL
      • [Diagram label: Search-enabled Data Center]
  38. DSE Analytics: Spark + Cassandra
      • Spark goodness: Spark Streaming, Spark SQL, Spark ML
      • Perform analytics without ETL to separate cluster
      • Write analytic results back to operational DB with Spark-Cassandra connector
      • Distributed file system (DSEFS)
      • Co-located Spark with Cassandra
      • Data validation
      • Migrate data to new schema
      • [Diagram label: Analytics Data Center]
  39. DSE Graph: Cassandra + Apache TinkerPop
      • Scalable, distributed graph DB
      • Optimized for storing, traversing, and querying
      • DSE Analytics and DSE Search integrated
      • Use cases: Customer 360, Recommendations, Fraud Detection
      • Use when the relationships between the entities are the most important part
  40. DataStax Studio
      • Notebook style interface
      • CQL, Gremlin, Spark SQL
      • Data visualization
      • Auto-completion
      • Rapid prototyping & collaboration
      • Query tracing
  41. DSBulk – Bulk Data Loading
      • Moves Cassandra data to/from files in the file system
      • Supports both CSV and JSON formats
      • Command-line interface
  42. Best practice 9: Know where to find helpful resources
  43. DataStax Academy
      • Free self-paced courses
      • DS201: Apache Cassandra™
      • DS210: Operations
      • DS220: Data Modeling
      • DS310: Search
      • DS320: Analytics
      • DS330: Graph
      • DS332: Graph Analytics (NEW)
      • https://academy.datastax.com
  44. Learning Paths on DataStax Academy
      • https://academy.datastax.com/paths
  45. Live Coding on Twitch
      • Live coding sessions with advocates and guests
      • Working through the challenges of building distributed systems
      • Join the conversation and ask questions
      • Some advocates also do streaming on personal channels
      • https://www.twitch.tv/datastaxacademy
  46. Distributed Data Show
      • Interview-style show featuring a mix of DataStax and industry guests
      • We go in-depth on the technology and challenges of data in large-scale distributed systems
      • Released weekly on DataStax Academy YouTube channel and as a podcast
      • Send us your suggestions for topics and guests – we love customer use cases
  47. Where are we online? Engage!
      • YouTube Channel
      • Twitter - @DataStaxAcademy
      • LinkedIn
      • KillrVideo reference application
      • DataStax Academy Slack
  48. DataStax Learning Events
      • Cassandra Days
      • DataStax Developer Days
      • Meetups
      • Accelerate Conference
      • http://academy.datastax.com/events
  49. Join us at Accelerate!
      • http://www.datastax.com/accelerate
      • Discount Code: ADVOCATE20
  50. Best practice 10: Engage DataStax Professional Services
  51. DataStax Customer Success: Helping Customers Get the Most Out of Their Investment
      • Accelerate time to value
      • Mitigate risk and realize value
      • Improve adoption and productivity
      • Services, Training, Customer Success
      Our Customer Success team helps you with your implementation, training, adoption and operational support crucial to your success
  52. Questions?
  53. Thank you
