Technology Outlook - The new Era of computing
- 1. © 2013 IBM Corporation, axel.koester@de.ibm.com
IBM Big Data Usergroup, May 2013
Dr. Axel Koester – Storage Chief Technologist, European Storage Competence Center
Future perspectives: the new Era of Computing
BIG DATA from a storage point of view
- 2.
BIG DATA ≠ MUCH DATA
- 3.
MUCH DATA
Example
- 4.
www.extremetech.com
press release August 2011
- 5.
Details of the 120 PB Cluster
No "Replication Cluster" like Amazon EC2 or Google Cloud
– no 3 copies of each data block
Why?
– Would require 550,000 instead of 200,000 disk drives
– and produce 30 instead of 12 daily failures (at identical net capacity)
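The trade-off behind these numbers can be approximated with a back-of-envelope calculation. The sketch below compares the raw-capacity overhead of 3-way replication with that of an 8+3 erasure code (the layout named on the following slides). The function names and the assumption that the drive count scales only with raw capacity are illustrative and not from the deck; the slide's 550,000 and 200,000 figures presumably include factors not stated here, so this sketch shows the direction of the effect rather than reproducing those exact numbers.

```python
import math


def raw_overhead_replication(copies: int) -> float:
    """Raw bytes stored per net byte when every block is kept `copies` times."""
    return float(copies)


def raw_overhead_erasure(data_strips: int, parity_strips: int) -> float:
    """Raw bytes stored per net byte for a data+parity erasure code."""
    return (data_strips + parity_strips) / data_strips


def drives_needed(net_pb: float, overhead: float, drive_tb: float = 1.0) -> int:
    """Drives required for a net capacity, counting redundancy overhead only.

    Assumption: no spares, metadata, or file system reserve are included.
    """
    raw_tb = net_pb * 1000 * overhead
    return math.ceil(raw_tb / drive_tb)


replication = raw_overhead_replication(3)   # 3 copies of each block
erasure = raw_overhead_erasure(8, 3)        # 8 data strips + 3 parity strips

print(f"3-way replication overhead: {replication:.3f}x")
print(f"8+3 erasure code overhead:  {erasure:.3f}x")
print(f"replication needs {replication / erasure:.2f}x as many drives")
print(f"drives for 120 PB net, replicated: {drives_needed(120, replication)}")
print(f"drives for 120 PB net, 8+3 coded:  {drives_needed(120, erasure)}")
```

Because the expected number of daily drive failures grows roughly in proportion to the number of drives, a lower redundancy overhead also means fewer failures to handle at identical net capacity.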
- 6.
RAID technology won't work in a 200,000 disk cluster
1 TB disk drives × 200,000
8+3 software RAID cluster
[Diagram: cluster of JBOD enclosures]
http://www.almaden.ibm.com/storagesystems/projects/perseus/
No spare
- 7.
There is always something broken, somewhere
8+3 Reed-Solomon encoding = tolerates up to 3 faults
[Diagram: cluster of JBOD enclosures]
http://www.almaden.ibm.com/storagesystems/projects/perseus/
One fault: low rebuild thread priority
Two faults: prioritize rebuild
Three faults: rebuild asap
< 4 min 20 sec in this state
No spare
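The escalation rule on this slide can be written down as a small policy function. This is a minimal sketch, assuming the rebuild priority depends only on how many strips of a stripe are currently faulty; the function name, the priority labels, and the error handling are hypothetical and not taken from the Perseus project or GPFS.

```python
PARITY_STRIPS = 3  # from the 8+3 layout: up to 3 faults per stripe are tolerated


def rebuild_priority(faults: int) -> str:
    """Map the number of faulty strips in a stripe to a rebuild priority.

    Labels are illustrative: the slide only says "low rebuild thread
    priority", "prioritize rebuild", and "rebuild asap".
    """
    if faults < 0:
        raise ValueError("fault count cannot be negative")
    if faults == 0:
        return "none"       # stripe is healthy, nothing to rebuild
    if faults == 1:
        return "low"        # still two faults away from the limit
    if faults == 2:
        return "high"       # one more fault reaches the limit
    if faults == PARITY_STRIPS:
        return "critical"   # no redundancy left, rebuild as soon as possible
    raise RuntimeError("more faults than parity strips: stripe is not recoverable")


for n in range(4):
    print(n, rebuild_priority(n))
```

Scaling the rebuild effort with the remaining redundancy keeps background rebuilds from competing with normal I/O while the stripe is still well protected.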
- 8.
120 PB storage grid: one of many p775 storage enclosures
Dense Disk Enclosure – 384 disks per unit (192 front & back)
2% Flash SSD for metadata
- 9.
Power 775 rack with 3 "Dense Disk Enclosures" (1152 drives)
- 10.
IBM GSS*: "120 PB predictive technology" for everyone
2× IBM x3650 M4 (2U each), GPFS
4 modules (4U each) – 240 SAS disks @ 4 TB
1 PB, 16 GB/s
(currently 720 TB @ 3 TB disks)
(*) GSS = GPFS Storage Server, RAID-less (GPFS = General Parallel File System)
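The capacity figures on this slide follow from simple multiplication. The sketch below checks them, assuming the quoted values are raw capacities (before any redundancy overhead) and using decimal units; the function name is illustrative.

```python
def raw_capacity_tb(disks: int, disk_tb: float) -> float:
    """Raw capacity in TB for a number of identical disks (no redundancy deducted)."""
    return disks * disk_tb


DISKS = 240     # SAS disks per building block, from the slide
MODULES = 4     # disk modules per building block, from the slide

with_4tb = raw_capacity_tb(DISKS, 4)   # the slide rounds this to 1 PB
with_3tb = raw_capacity_tb(DISKS, 3)   # the "currently" figure

print(f"{DISKS // MODULES} disks per module")
print(f"4 TB disks: {with_4tb:.0f} TB raw")
print(f"3 TB disks: {with_3tb:.0f} TB raw")
```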
- 11.
For whom?
» Statistical Medicine
[Chart: cost per 1 MB of sequenced genomic data, predicted vs. actual. Source: www.crops.org]
- 12.
Sometimes…
MUCH DATA » BIG DATA
- 13.
BIG DATA without MUCH DATA
- 14.
Watson answers complex "trivia" questions from any subject area.
Real money is at stake.
- 15.
Jeopardy! all-time champions Ken Jennings & Brad Rutter
You don't have this hereditary
lack of pigment. You just need
a little more sun!
- 16.
Watson's "knowledge" is collected, not coded
Knowledge base = English WWW
Multiple millions of analyses per second
200 million memorized book pages,
with Wikipedia alone totaling 2.25 million.
10 km
~2000 years to read
- 17.
Watson holds all its knowledge in RAM
90 × 32-core IBM Power® 750 / 16 TB RAM, 1 TB data, 500 GB/sec
out of 4 TB GPFS disk space, with 16 TB ≈ 15 TiB RAM
"information aggregator"
*Parallel access to 100% of the data,
no Internet access during games
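The hardware figures on this slide can be checked with a few lines of arithmetic. The sketch below converts the 16 TB RAM figure from decimal terabytes to binary tebibytes and totals the core count; the function name is illustrative, and the per-server RAM value assumes memory is spread evenly across the 90 servers, which the slide does not state.

```python
def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


SERVERS = 90            # IBM Power 750 servers, from the slide
CORES_PER_SERVER = 32   # from the slide
RAM_TB = 16             # total RAM, from the slide

total_cores = SERVERS * CORES_PER_SERVER
ram_tib = tb_to_tib(RAM_TB)
ram_per_server_gb = RAM_TB * 1000 / SERVERS  # assumes an even split across servers

print(f"total cores: {total_cores}")
print(f"{RAM_TB} TB = {ram_tib:.2f} TiB")
print(f"RAM per server: about {ram_per_server_gb:.0f} GB")
```

The conversion gives roughly 14.55 TiB, which the slide rounds to 15 TiB. Either way, the 1 TB of data fits in memory many times over.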
- 18.
Watson's inventors surprised by amazing correct answers
Dr. Jennifer Chu-Carroll,
Watson algorithms
Why? Because Watson's knowledge is collected, not coded.
Only rules are coded, but subject to machine learning.
- 19.
BIG DATA and ENERGY
- 20.
20 Watt
IBM Watson: 200,000 Watt
- 21.
Human brains are incredibly good at "low power"
http://www.ibm.com/smarterplanet/us/en/business_analytics/article/cognitive_computing.html
20 Watt
Recognize a face in a crowd – efficiently
Distinguish own from outside sound
Combine unrelated facts
Filter & distill information
How?
Brains are not 100% accurate. Bit errors don't bother them.
- 22.
"Lower power" from abandoning the 100% bit-accurate IT
Unreliable input data → Approximation
Patient symptoms → Root cause: Coronary Syndrome 60%, Pneumonia 25%, Pulmonary Embolism 9%
Road traffic sensors → Congestion prediction: Ring A99 in 2 hrs 95%, Feeder A8 in 1 hr 90%
Wind & sun forecast → Energy production: line load estimation, production mix %
:
Approximations based on unreliable data
should not require bit-accurate processing!
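The claim above can be illustrated with a small numerical experiment. The sketch below is not the approach described in the deck: it swaps in reduced numeric precision (IEEE 754 half precision) as a stand-in for "not bit-accurate" processing, and the sensor model (`TRUE_VALUE`, `NOISE_SD`, the sample count, and the seed) is entirely hypothetical. It compares the error introduced by the lower precision with the error already present in the noisy input.

```python
import random
import struct
from statistics import mean


def to_half_precision(x: float) -> float:
    """Round-trip a value through IEEE 754 half precision (16 bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical sensor: a true value of 0.6 observed with Gaussian noise.
TRUE_VALUE = 0.6
NOISE_SD = 0.1
rng = random.Random(42)  # fixed seed so the run is repeatable
readings = [rng.gauss(TRUE_VALUE, NOISE_SD) for _ in range(1000)]

# Estimate from full-precision inputs vs. inputs quantized to 16 bits.
estimate_full = mean(readings)
estimate_half = mean([to_half_precision(r) for r in readings])

precision_error = abs(estimate_full - estimate_half)  # caused by quantization
sensor_error = abs(estimate_full - TRUE_VALUE)        # caused by noisy input

print(f"error from reduced precision: {precision_error:.6f}")
print(f"error from unreliable input:  {sensor_error:.6f}")
```

For values of this magnitude, half precision rounds each reading by less than 0.0005, so the estimate can shift by at most that much, while individual readings are off by about 0.1 on average. Extra bits of accuracy in the processing do not make the input data any more reliable.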
- 23.
Projects SyNAPSE and BlueBrain: Simulating the brain
http://www.ibm.com/smarterplanet/us/en/business_analytics/article/cognitive_computing.html
Understand learning down to synapses
Explore large-scale brain simulations
Design a chip that "learns" at molecular level
- 24.
The brain transistor
no longer science fiction, since March
- 25.
March 2013: IBM publishes a liquid-based transistor
that processes data like the human brain
- 26.
by IBM Fellow Stuart Parkin, inventor of "Racetrack Memory"
Droplet that can be turned into
"liquid metal" and switch currents
Stuart Parkin
“We turn this material into a
metal and maintain it without
any need to supply power.”
metalized ions liquid
The programmable liquid can be the
information conveyor, not just a bit cell
- 27.
IT energy problems? Start learning from nature!
copied nature « » optimized engineering
- 28.
axel.koester@de.ibm.com