DataStax: Backup and Restore in Cassandra and OpsCenter

Backup and restore in Cassandra and OpsCenter spans a range of topics. I will start with a basic overview of Cassandra backup/restore, walking through the operational steps to provide the understanding required to perform an on-disk backup and restore. Expanding on this overview, I'll cover the limitations (including schema requirements) and their impact on the restore process. Further, I'll discuss commit log archiving and point-in-time restore operations. After covering the underlying operations, I'll wrap up with a discussion of how OpsCenter automates this process and leverages S3.

Published in: Technology

  1. Backup and Restore in Cassandra and OpsCenter
  2. Overview: Snapshot operations; restore operations; commit log archiving and point-in-time restore; remote backup. All from both the Cassandra and OpsCenter perspectives.
  3. Snapshots: Nodetool snapshot basics. Performs a flush, then hard links SSTables to <data_file_directories>/<ks>/<table>/snapshots/<snapshot-name>/. Under the hood it uses MBeans: org.apache.cassandra.db -> StorageService -> takeSnapshot. More at http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsSnapShot.html
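
The snapshot location described on this slide can be sketched as a small path helper. This is only an illustration of the directory layout; the function name and the example values are hypothetical, and on some Cassandra versions the on-disk table directory name carries an ID suffix, so pass whatever directory name actually exists on disk.

```python
from pathlib import PurePosixPath


def snapshot_dir(data_dir: str, keyspace: str, table: str, snapshot_name: str) -> PurePosixPath:
    """Build <data_file_directories>/<ks>/<table>/snapshots/<snapshot-name>."""
    # "snapshots" is the fixed directory name from the layout on the slide.
    return PurePosixPath(data_dir) / keyspace / table / "snapshots" / snapshot_name


# Hypothetical values, for illustration only.
example = snapshot_dir("/var/lib/cassandra/data", "demo_ks", "users", "nightly")
print(example)
```
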
  4. Snapshots in OpsCenter: Under Services -> Backup. Displays backup history and allows backup and restore. Advanced settings are covered later. The Backup Service is an Enterprise feature. More at http://docs.datastax.com/en/opscenter/5.2/opsc/online_help/services/opscBackupService.html
  5. Snapshots in OpsCenter: Schedule repeated backups or create an ad hoc backup. Select keyspaces. Set the location (on server vs. S3). Uses the MBean to perform the snapshot rather than shelling out. Coordinates the snapshot on all nodes. Backs up the schema to schema.json. Keeps a log for audit.
  6. Auditable Records
  7. Remote Snapshots: OpsCenter can also back up to S3. Specify the S3 bucket name and AWS credentials. Optional transfer throttle and compression. Not all SSTables need to be backed up every time: because they are immutable, only part of the data may require it.
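
Because SSTables are immutable, a file that has already been copied does not need to be copied again. A minimal sketch of that idea, assuming the backup tool can list the file names already present at the remote location (the function and file names below are hypothetical, and this is not a description of OpsCenter's actual logic):

```python
def sstables_to_upload(local_files, already_uploaded):
    """Return the local SSTable file names that are not yet in the backup."""
    uploaded = set(already_uploaded)
    # Immutable files never change, so a name match means "already backed up".
    return sorted(name for name in local_files if name not in uploaded)


pending = sstables_to_upload(
    ["ks-users-ka-1-Data.db", "ks-users-ka-2-Data.db"],
    ["ks-users-ka-1-Data.db"],
)
print(pending)
```
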
  8. SSTables need to be stored per node to avoid name collisions. However, dropping and recreating a table can also lead to a naming collision; OpsCenter can attach a timestamp. If your data is encrypted, make sure the encryption key is also stored somewhere safe. OpsCenter backs up schemas. Topologies change over time (more on this in restore).
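
One way to picture the collision problem is as a key-naming scheme for the remote store. The sketch below is an assumption for illustration only: the key layout, the `remote_key` helper, and the idea of stamping the table directory with its creation time are hypothetical, not OpsCenter's actual format.

```python
from datetime import datetime, timezone


def remote_key(node_id, keyspace, table, table_created, filename):
    """Build a storage key that is unique per node and per table incarnation."""
    # A per-node prefix keeps identical file names from different nodes apart.
    # A timestamp on the table keeps a dropped-and-recreated table apart.
    stamp = table_created.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{node_id}/{keyspace}/{table}-{stamp}/{filename}"


created = datetime(2015, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
print(remote_key("node-a", "ks", "users", created, "ks-users-ka-1-Data.db"))
```
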
  9. Restore Operations: SSTableloader basics. Expects the schema to already exist for the SSTables. Expects a directory structure different from the one created by the snapshot, specifically <Keyspace>/<Table>/<files>. Can stream data to other nodes; it doesn't just move files into place. Leaves files in place as they are restored, with a possible disk penalty. More at http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsBulkloader_t.html
  10. Restore Operations: Select a backup from a list of available snapshots. Point-in-time restores (more on this later). Restore from another location.
  11. Restore Operations: Attempts to recreate the schema or do a schema comparison. The latter is extremely difficult with Thrift. Creates symbolic links in a temporary directory to match what SSTableloader expects. Logs/audit trail to follow. Uses SSTableloader.
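
The symlink step can be sketched as follows. This is a minimal illustration assuming a POSIX filesystem; the `stage_for_sstableloader` helper and its arguments are hypothetical and do not reproduce OpsCenter's implementation. It mirrors a snapshot directory into a temporary <Keyspace>/<Table>/ directory using symbolic links, so no SSTable data is copied.

```python
import os
import tempfile
from pathlib import Path


def stage_for_sstableloader(snapshot_dir, keyspace, table):
    """Symlink snapshot files into a temporary <Keyspace>/<Table>/ directory."""
    staging_root = Path(tempfile.mkdtemp(prefix="restore-"))
    target = staging_root / keyspace / table
    target.mkdir(parents=True)
    for f in sorted(Path(snapshot_dir).iterdir()):
        if f.is_file():
            # Link rather than copy: the snapshot files stay where they are.
            os.symlink(f.resolve(), target / f.name)
    return target
```

The returned directory is the kind of path one would then hand to sstableloader (for example `sstableloader -d <host> <path>`), and the temporary tree can be deleted once the load finishes.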
  12. Remote Restore: Topologies change over time. When topologies shrink, multiple nodes' worth of data will have to be sent to a single node (SSTable naming collisions).
  13. Remote Restore: When topologies grow, some nodes may be idle during a restore. Replacement nodes will have a different host ID and will need to be matched to the host ID of the snapshot. OpsCenter handles all of these cases.
  14. Commit Log Archiving: Cassandra can execute a script when writing commit log segments, set in commitlog_archiving.properties. http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configLogArchive_t.html
  15. Commit Log Archiving: OpsCenter can enable that as well, under Services -> Backups Service -> Settings. OpsCenter can also send these to S3. http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configLogArchive_t.html
  16. Point in Time Restore: A two-step operation: restore a snapshot, then replay commit logs. Find the nearest snapshot taken prior to the desired point in time and perform a restore. Update commitlog_archiving.properties with the location of the commit logs as well as the point in time to restore. Restart Cassandra. More at http://docs.datastax.com/en//cassandra/2.0/cassandra/configuration/configLogArchive_t.html
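
The first two steps of the manual procedure can be sketched like this. The helper names are hypothetical. The property names `restore_directories` and `restore_point_in_time`, and the `yyyy:MM:dd HH:mm:ss` timestamp format, are taken from the stock commitlog_archiving.properties as an assumption; check them against the documentation for your Cassandra version. A real file also needs a `restore_command`, which this sketch does not emit.

```python
from datetime import datetime, timezone


def nearest_snapshot_before(snapshot_times, target):
    """Return the latest snapshot time at or before the target, or None."""
    candidates = [t for t in snapshot_times if t <= target]
    return max(candidates) if candidates else None


def restore_properties(commitlog_dir, target):
    """Render the restore-related lines for commitlog_archiving.properties."""
    # Assumed timestamp format: yyyy:MM:dd HH:mm:ss, expressed in UTC.
    stamp = target.astimezone(timezone.utc).strftime("%Y:%m:%d %H:%M:%S")
    return "\n".join([
        f"restore_directories={commitlog_dir}",
        f"restore_point_in_time={stamp}",
    ])


snapshots = [datetime(2015, 1, d, tzinfo=timezone.utc) for d in (1, 2, 3)]
target = datetime(2015, 1, 2, 12, 30, tzinfo=timezone.utc)
print(nearest_snapshot_before(snapshots, target))
print(restore_properties("/backup/commitlog", target))
```

If no snapshot predates the target, the helper returns None, which corresponds to a point in time that cannot be restored from the available backups.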
  17. PiT in OpsCenter: OpsCenter can automate the PiT restore process. Set the time (in UTC). OpsCenter will verify that it is capable of restoring to that point in time. Commit logs or snapshots can be local or on S3.
  18. PiT Restore Challenges: Commit log replays don't stream data around the ring, which makes topology changes difficult to handle. Comparing schemas can be tricky if the replay contains schema changes.
  19. Questions? Feel free to reach out: https://www.linkedin.com/in/philipsdoctor
