Hadoop vs Spark | Which One to Choose? | Hadoop Training | Spark Training | Edureka
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Hadoop vs Spark
Comparison criteria: Performance, Ease of Use, Costs, Data Processing, Fault Tolerance, Security
Performance
Hadoop MapReduce moves data through disk and network between processing steps
Spark caches data in memory between processing steps
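The difference between moving intermediate data through disk and keeping it in memory can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `run_steps_via_disk` and `run_steps_in_memory` are hypothetical, no Hadoop or Spark API is used, and the network cost is not modelled at all.

```python
import json
import os
import tempfile


def run_steps_via_disk(data, steps, workdir):
    """Apply each step, writing the intermediate result to disk and reading it back."""
    disk_round_trips = 0
    for i, step in enumerate(steps):
        data = [step(x) for x in data]
        path = os.path.join(workdir, f"step_{i}.json")
        with open(path, "w") as f:
            json.dump(data, f)   # persist the intermediate output
        with open(path) as f:
            data = json.load(f)  # the next step reads it back from disk
        disk_round_trips += 1
    return data, disk_round_trips


def run_steps_in_memory(data, steps):
    """Apply each step, keeping the intermediate result in memory."""
    for step in steps:
        data = [step(x) for x in data]
    return data, 0


# Hypothetical three-step pipeline used only for illustration.
steps = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
numbers = list(range(5))

with tempfile.TemporaryDirectory() as workdir:
    disk_result, disk_trips = run_steps_via_disk(numbers, steps, workdir)
memory_result, memory_trips = run_steps_in_memory(numbers, steps)
```

Both variants produce the same result; the disk-based variant simply pays one write and one read per step, which is the overhead the in-memory approach avoids.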
Ease of Use
Hadoop can be integrated with multiple tools such as Sqoop, Flume, Pig, and Hive
Spark comes with user-friendly APIs for Scala, Java, and Python, as well as Spark SQL
Costs
Hadoop requires a lot of disk space as well as faster disks
Spark requires large amounts of RAM for executing everything in memory
Data Processing
Batch Processing: data from clients is stored as it arrives over time, then processed later in bulk (ETL)
Stream Processing: data from clients is processed as it arrives, within milliseconds
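The contrast between batch and stream processing can be illustrated with a small plain-Python sketch. The names `process_batch` and `process_stream` are hypothetical and stand in for no Hadoop or Spark API; the point is only that a batch job runs once over stored data, while a stream job emits an updated result per incoming event.

```python
def process_batch(stored_events):
    """Batch: run once over everything that has already been stored."""
    return sum(stored_events)


def process_stream(events):
    """Stream: update and emit a running result as each event arrives."""
    running_total = 0
    for event in events:
        running_total += event
        yield running_total


# Hypothetical event values used only for illustration.
events = [3, 1, 4, 1, 5]
batch_result = process_batch(events)
stream_results = list(process_stream(events))
```

The final streamed value matches the batch result; the difference is that the stream variant has an answer available after every event rather than only at the end.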
Fault Tolerance
1. Replication
2. Re-execution of job
In Spark, a lost RDD is automatically recomputed by using the original transformations
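Recomputing lost data from its original transformations can be sketched as follows. This is a toy model, not Spark's RDD implementation: the class `LineageDataset` and its methods are hypothetical names, and a "failure" is simulated by discarding the cached result.

```python
class LineageDataset:
    """Toy dataset that records its source and transformations (its lineage)."""

    def __init__(self, source, transformations=()):
        self.source = source
        self.transformations = tuple(transformations)
        self._cached = None

    def map(self, fn):
        # Record the transformation instead of applying it immediately.
        return LineageDataset(self.source, self.transformations + (fn,))

    def collect(self):
        # Recompute from the source whenever no cached result is available.
        if self._cached is None:
            data = list(self.source)
            for fn in self.transformations:
                data = [fn(x) for x in data]
            self._cached = data
        return self._cached

    def lose_cached_data(self):
        # Simulate a failure that wipes out the computed result.
        self._cached = None


dataset = LineageDataset(range(4)).map(lambda x: x * 10).map(lambda x: x + 1)
before_failure = list(dataset.collect())
dataset.lose_cached_data()
after_recovery = dataset.collect()
```

Because the lineage is kept, the result after the simulated failure is identical to the one before it, without any replica of the computed data having been stored.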
Security
[Figure: login flow between Server and LDAP]
1. Login Attempt
2. Password
3. Check OK
4. Logged In
Hadoop Use-cases
Archival data accessed through reporting tools [Figure: a server feeding Reporting Tool 1, Reporting Tool 2, and Reporting Tool 3]
Applications requiring batch processing [Figure: client data stored over time, then processed by ETL]
Spark Use-cases
Applications requiring stream processing [Figure: client data processed within milliseconds of arrival]
Iterative processing
Graph processing
Which one is the best?