This presentation describes NISCI's big data platform built on high performance computing. It covers NISCI's HPC hardware infrastructure, including servers, 50 TB of storage, and a 40 Gbps QDR InfiniBand network, and outlines the software tools in use, such as Intel Parallel Studio XE, Open MPI, and Hadoop, along with the parallel programming models they support. It lists several potential big data sources, such as sensor networks, document images, satellite images, and industry and government data. Finally, it proposes applications of this infrastructure in areas such as disaster prevention, finance, environment, and transportation.
Tran Minh: big data platform in high performance computing at NISCI
1. Big Data Platform in High Performance Computing at NISCI
Dr. Tran Minh
Deputy Director of NISCI
trminh@nisci.gov.vn
MINISTRY OF INFORMATION AND COMMUNICATIONS (MIC) VIETNAM NATIONAL INSTITUTE OF SOFTWARE AND DIGITAL CONTENT INDUSTRY (NISCI)
Workshop on “Big Data Analysis on Cloud and High Performance Computing Platform”
Hanoi – 24th July, 2014
2. Contents
HPC system in NISCI
Software and data processing library
Data source for Big Data
Development environment in HPC
Some proposals
3. HPC & Cloud infrastructure at NISCI
4. HPC System Hardware
24 SL250s Gen8 servers (Intel Xeon E5-2670 processors).
24 SL250s Gen8 servers (Intel Xeon E5-2670 processors, with Intel Xeon Phi 5110P coprocessors).
2 DL380p Gen8 servers (Intel Xeon E5-2670 processors).
2 QLogic IB QDR 36-port switches.
Switch 3800 for storage connection.
Switches 2920 and 2620.
Big Data Hardware Infrastructure
+ Servers
+ Storage
+ Networks
+ HPC and Super Computers
5. HPC hardware
Xeon Phi (Xeon Phi 5110P)
Multi-core: # of cores: 60
Clock speed: 1.053 GHz
L2 cache: 30 MB
Instruction set: 64-bit
Max memory size: 8 GB
# of memory channels: 16
Max memory bandwidth: 320 GB/s
InfiniBand
All ports on the 4X QDR HCA cards and switches support a 40 Gbps signaling rate.
Storage
50 TB
Processor (1)     N of processors (2)   N of cores (3)   Speed, GHz (4)   Operations per cycle (5)   GFlops (6) = (2)*(3)*(4)*(5)
Xeon E5-2670      96                    8                2.6              8                          15,974
Xeon Phi 5110P    48                    60               1.053            16                         48,522
Total                                                                                                64,496
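The peak-performance arithmetic in the table can be reproduced with a short Python sketch. It assumes the slide's formula (processors x cores x clock in GHz x operations per cycle) and uses 60 cores per Xeon Phi 5110P, which is the core count the 48,522 GFlops figure corresponds to. The class and function names are illustrative only and are not part of any NISCI tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessorGroup:
    """One row of the peak-performance table (field names are illustrative)."""
    name: str
    processors: int
    cores_per_processor: int
    clock_ghz: float
    ops_per_cycle: int

    def peak_gflops(self) -> float:
        # Theoretical peak: processors * cores * GHz * operations per cycle.
        return (self.processors * self.cores_per_processor
                * self.clock_ghz * self.ops_per_cycle)


# Values taken from the slide; 60 cores is what the Xeon Phi total implies.
GROUPS = [
    ProcessorGroup("Xeon E5-2670", 96, 8, 2.6, 8),
    ProcessorGroup("Xeon Phi 5110P", 48, 60, 1.053, 16),
]


def total_peak_gflops(groups) -> float:
    """Sum the theoretical peak of every processor group."""
    return sum(group.peak_gflops() for group in groups)


if __name__ == "__main__":
    for group in GROUPS:
        print(f"{group.name}: {group.peak_gflops():,.2f} GFlops")
    print(f"Total: {total_peak_gflops(GROUPS):,.2f} GFlops")
```

These are theoretical peaks; sustained performance on real workloads is normally lower.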
6. HPC hardware architecture
[Architecture diagram; labels listed below]
Networks: Cluster LAN 1GbE, iLO LAN, IB QDR Network, Storage LAN 1GbE, Public Network
Nodes: 2 x Master Node (2 x DL380p); 24 x Compute Node(s) (6 x SL6500 chassis); 24 x Compute Node(s) – GPU (6 x SL6500 chassis)
Storage: StoreAll 40 TB, HP X9320 43TB
Switches: HP 2920 48 port switch, HP 2620 48 port switch, HP 3800-24G-8XG Switch, Qlogic QDR IB switch, Customer existing switch
Rack: HP 642 1200mm Rack
Users: External Users, Cluster System Admin
8. Software development tools
Intel Parallel Studio XE
Intel® C, C++ and Fortran Compilers – Industry-Leading Compilers
Intel® MKL and Intel® IPP – Performance Libraries
Intel® Threading Building Blocks and Intel® Cilk™ Plus – Parallel Programming Models
Intel® Advisor XE – Threading Assistant
Intel® VTune™ Amplifier XE – Performance & Thread Profiler
Intel® Inspector XE – Memory and Thread Checker
Static Analysis – Locate Difficult-to-Find Defects
HP Open MPI
10. HPC software platform
Big Data Software Platforms
+ SQL / Data Warehouse
+ NoSQL / Big Data Storage
+ Digital Content Repositories
+ Public and Private Clouds
+ Hadoop Distributed File System (HDFS)
+ Map / Reduce
+ Pig / Hive Query ...
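To illustrate the Map / Reduce model named in the list above, here is a minimal single-process sketch in plain Python. It is not the Hadoop API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real Hadoop job would distribute these phases across the cluster and read its input from HDFS.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data platform", "Big compute platform"]))
```

Because each map call and each reduce call is independent of the others, both phases can run in parallel on separate nodes, which is what makes the model a fit for cluster hardware.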
12. HPC & BigData Deployment
[Deployment diagram; labels: High Performance Computing; iDC (x3); iDragon Cloud (x10); sites Ha Noi, Da Nang, Ho Chi Minh]
Disaster Prevention
- Flooding / Hurricane
- Earthquake / Landslide
- Volcanic eruption
Finance and Economics
- Portfolio optimization
- Risk calculation
- Black-Scholes and options Greeks calculation
- Monte Carlo simulation
Environment Warning
- Climate change
- Air pollution
- Forest fire
- Waste water
Census
- People and households
- Business and industry
- Geography
Natural Resource Management
- River water / Ground water ...
- Sea water / Salt ...
- Gas / Oil / Coal ...
- Gold / Silver / Bronze / Iron ...
Transportation & Logistics
- Bus info / train info
- Traffic jam / accident info
- Logistics info
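Two of the finance workloads listed on this slide, Black-Scholes pricing with Greeks and Monte Carlo simulation, can be sketched in a few lines of Python. This is a minimal illustration under textbook assumptions (European call, constant rate and volatility, no dividends), not a description of any application deployed at NISCI. The function names and sample parameters are hypothetical.

```python
import math
import random


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x: float) -> float:
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def black_scholes_call(spot, strike, rate, vol, maturity):
    """Closed-form European call price with delta, gamma, and vega."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return {
        "price": spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2),
        "delta": norm_cdf(d1),
        "gamma": norm_pdf(d1) / (spot * vol * sqrt_t),
        "vega": spot * norm_pdf(d1) * sqrt_t,
    }


def monte_carlo_call(spot, strike, rate, vol, maturity, paths=50_000, seed=0):
    """Estimate the same call price by simulating terminal prices."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol ** 2) * maturity
    diffusion = vol * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(paths):
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        payoff_sum += max(terminal - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * payoff_sum / paths


if __name__ == "__main__":
    params = dict(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
    print(black_scholes_call(**params))
    print(monte_carlo_call(**params))
```

The Monte Carlo loop is the part that benefits from HPC hardware: every simulated path is independent, so the paths can be split across cores or coprocessors and the partial sums combined at the end.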
13. Sensor Network and Big Data
1. Sensor network
- Danang
- Cantho
- Ho Chi Minh City
2. Document image processing (800 million pages/year)
3. Network administration data (Danang)
4. Satellite images (VN RED-SAT, Ministry of Public Security)
14. Sensor Network Big Data
15.
- Monitor waves and ships.
- Monitor river levels.
- Monitor swirling water and dangerous areas.
- Automatically send SMS/email water-level warnings to leaders.
- Share data among government offices.
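The automatic water-level warning described above can be sketched as a threshold check over sensor readings. This is only an illustration: the `Reading` type, the station names, and the warning levels are hypothetical, and actual delivery by SMS or email is left out because the slides do not say which gateway is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One river-level measurement from a sensor station (illustrative)."""
    station: str
    level_m: float


def build_alerts(readings, warning_levels_m):
    """Return one warning message per reading above its station's threshold.

    Stations without a configured threshold are skipped.
    """
    alerts = []
    for reading in readings:
        threshold = warning_levels_m.get(reading.station)
        if threshold is not None and reading.level_m > threshold:
            alerts.append(
                f"WARNING: {reading.station} river level {reading.level_m:.2f} m "
                f"exceeds warning level {threshold:.2f} m"
            )
    return alerts


if __name__ == "__main__":
    # Hypothetical thresholds and readings, for demonstration only.
    thresholds = {"station-a": 3.0, "station-b": 2.5}
    sample = [Reading("station-a", 3.4), Reading("station-b", 2.1)]
    for message in build_alerts(sample, thresholds):
        print(message)  # a real system would pass this to an SMS/email sender
```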
17. Đà Nẵng, Cần Thơ
18. Digital Contents and Big Data
Hospital
Station
Office
Show Room
Store House
Public
Home
Hotel
19. When Vietnam?
- Banks
- Insurance
- Securities
- VNPT
- Viettel
20. Thank you for your attention!
Vietnam National Institute of Software and Digital Content Industry (NISCI)
8th floor, The Authority of Radio and Frequency Management Building, 115 Tran Duy Hung, Cau Giay District, Ha Noi.
E-mail: info@nisci.gov.vn