This document discusses smart backup architectures for big data in open source environments. It describes a real-world implementation at a medical research facility that has experienced unpredictable data growth. Their implementation uses a SEP sesam server for file, database and bare metal backups to a clustered storage solution with replication. It discusses ongoing projects like expanding capacity with deduplication and new storage hardware.
Smart Backup Architectures for Big Data in Open Source Environments
1. Smart Backup Architectures for Big Data in complex, Open Source based IT Environments
Hubert Schweinesbein
hs@sep.de
2. Hot IT Topics
Forrester Research Inc.
The Top 10 Strategic Technology Trends
• Mobile device battles
• Mobile applications and HTML5
• The personal cloud
• The Internet of Things
• Hybrid IT and cloud computing
• Strategic big data
IT Projects 2013
• Virtualization
• Master Data Management
• Enterprise Security
• Integration of Standard and Individual Software
• Data Quality Management
Cap Gemini
11. SEP & Costs
Unlimited Usage
Reduce your investment
COO instead COG
One subscription – use all SEP backup modules
12. The Real World
• Founded 1998 - Spin-off from the Institute of Clinical Chemistry at the University Hospital Grosshadern of the Ludwig-Maximilians University
• Specialized in genetic and immunological diagnostics
• Health Care - very cost sensitive
• Open Source is a key strategy of their IT
• Use of standard hardware instead of expensive dedicated storage solutions
• Special Linux distribution - Scientific Linux (www.scientificlinux.org)
• Different databases – huge amount of data
13. Requirements
• Sophisticated backup strategy with different media, retention times, removal of backup media to external locations, and single servers with 5 TB of data
• Integrated Bare System Recovery for all servers
– Linux and Windows
• Variety of Linux distributions, including "old" ones (RH 7.1)
• Each backup has to be available at least twice
• Backup to disk and tape
• Wide range of databases – MySQL, MS SQL, Sybase, PostgreSQL
• Unpredictable data growth – 100 GB to 500 GB per month
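The growth range above can be turned into a rough capacity runway. The following is a minimal sketch, not the facility's actual planning method: the function name, the `copies` parameter and the assumption that every new gigabyte is simply stored twice (ignoring retention, compression and deduplication) are illustrative only. The 50 TB and 100 TB figures are borrowed from the implementation slide.

```python
import math

GB_PER_TB = 1000  # decimal units, as storage capacity is usually quoted


def months_until_full(used_tb, capacity_tb, growth_gb_per_month, copies=2):
    """Months until the backup store is full, rounded up.

    growth_gb_per_month is new source data per month. Every gigabyte is
    assumed to be stored `copies` times (a hypothetical simplification
    that ignores retention, compression and deduplication).
    """
    if growth_gb_per_month <= 0 or copies <= 0:
        raise ValueError("growth and copies must be positive")
    free_gb = (capacity_tb - used_tb) * GB_PER_TB
    if free_gb <= 0:
        return 0
    return math.ceil(free_gb / (growth_gb_per_month * copies))


# Low and high end of the stated monthly growth range.
for growth in (100, 500):
    print(growth, "GB/month ->", months_until_full(50, 100, growth), "months")
```

Because the monthly growth is unpredictable, running both ends of the range gives a planning band rather than a single date.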
15. Actual Implementation
• Cluster FS Storage Cluster with Replication
– 50 TB of backup data ==> 100 TB soon
• SEP sesam Server on Linux
• File, Database and Bare System Recovery Backup of Linux and Windows Servers to Cluster FS (Storage Cluster)
• Automated replication of the backup data via Cluster FS
• Migration of dedicated backups to tape
• Automated restore possible from all Nodes and from tape
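The requirement that each backup exists at least twice can be checked against the replicated storage nodes and tape. The sketch below is purely illustrative: the inventory dictionary, the backup IDs and the location names are hypothetical, and it does not use any SEP sesam or cluster file system API.

```python
def under_replicated(copies_by_backup, minimum=2):
    """Return the IDs of backups stored in fewer than `minimum` distinct places."""
    return sorted(
        backup_id
        for backup_id, locations in copies_by_backup.items()
        if len(set(locations)) < minimum
    )


# Hypothetical inventory: backup ID -> locations that hold a copy.
inventory = {
    "db-full-01": ["node-a", "node-b"],
    "fs-incr-07": ["node-a"],
    "bsr-win-03": ["node-b", "tape"],
}

print(under_replicated(inventory))
```

Locations are deduplicated with `set`, so two entries on the same node do not count as two copies.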
16. Ongoing Projects
• Enlarge the storage capacity
• Deduplication of Backup Data by SEP Si3T Technology
• Conflict of interest - performance vs. investment for storage
• New Storage HW – Standard Server with 130 TB net capacity
• Slow disks slow down the backup - "smart" backup tasks reach speeds of up to 2.5 TB/hour
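To put the 2.5 TB/hour figure in perspective, a backup window can be estimated from data size and throughput. This is a back-of-the-envelope sketch that assumes the peak rate is sustained for the whole job, which real workloads rarely achieve; the function name is illustrative.

```python
def backup_hours(data_tb, rate_tb_per_hour):
    """Hours needed to move data_tb at a constant rate_tb_per_hour."""
    if rate_tb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return data_tb / rate_tb_per_hour


# A single 5 TB server (see Requirements) at the peak rate.
print(backup_hours(5, 2.5))

# Filling the whole 130 TB of the new storage server at the peak rate.
print(backup_hours(130, 2.5))
```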
18. Free technical Workshop
• SEP sesam & Red Hat Storage
• 1 Day @ Dupaco Leusden
• 18th of March 2014
• www.dupaco.de
19. Backup issues
88% Capability related challenges
84% Complexity
87% Cost
Veeam, Annual Data Protection Report 2013
20. Backup is not important
All backup is done to ensure a fast and reliable restore!
The restore and disaster recovery requirements define the backup strategy – not the other way around!
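One way to read this in practice is to start from a recovery time objective (RTO) and derive the restore throughput the backup design must deliver. The sketch below is an illustration under assumptions: the RTO values are hypothetical, and the 2.5 TB/hour used as the achievable rate is the backup speed quoted earlier, not a measured restore speed.

```python
def required_restore_rate(data_tb, rto_hours):
    """Minimum sustained restore rate (TB/hour) to bring data_tb back within rto_hours."""
    if rto_hours <= 0:
        raise ValueError("RTO must be positive")
    return data_tb / rto_hours


def meets_rto(data_tb, rto_hours, achievable_tb_per_hour):
    """True if the achievable rate is enough to restore within the RTO."""
    return achievable_tb_per_hour >= required_restore_rate(data_tb, rto_hours)


# Hypothetical targets for a 5 TB server: a 4-hour RTO and a 1-hour RTO.
print(required_restore_rate(5, 4), meets_rto(5, 4, 2.5))
print(required_restore_rate(5, 1), meets_rto(5, 1, 2.5))
```

If the check fails, the strategy has to change (faster media, smaller restore units, more parallel streams) rather than the requirement.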