Big Data Analysis for Page Ranking using Map/Reduce
1. Big Data Analysis for Page Ranking using Map/Reduce
R.Renuka,
R.Vidhya Priya,
III B.Sc., IT,
The S.F.R. College for Women,
Sivakasi.
3. Introduction
Data have outgrown the storage and processing capabilities of
a single host.
Two fundamental challenges:
– how to store and work with voluminous data sizes, and
– how to understand data and turn it into a competitive
advantage.
4. What is Big Data?
‘Big-data’ is similar to ‘Small-data’, but bigger
But having bigger data requires different approaches:
techniques, tools & architectures
To solve:
New problems, and old problems in a better way.
9. Hadoop
Hadoop is a distributed filesystem and data
processing engine
Hadoop has two components:
– The Hadoop Distributed File System (HDFS)
– The MapReduce programming model.
10. Map / Reduce
A high-level abstraction framework for distributed processing of large
datasets
Fault tolerance and parallelization
Computation consists of two phases:
– Map
– Reduce
A master-slave architecture
Computation occurs on multiple slave nodes
It tries to provide data locality as much as possible.
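The two-phase, master-slave flow described above can be imitated on a single machine. The following is only a minimal sketch, not Hadoop code: threads stand in for slave nodes, each partition stands in for the data held locally by one node, and the names `map_task`, `reduce_task` and `run_job` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def map_task(partition):
    """Map phase on one worker: count the words in its own partition only."""
    return sum(len(line.split()) for line in partition)


def reduce_task(partials):
    """Reduce phase: combine the partial counts produced by the workers."""
    return sum(partials)


def run_job(partitions, workers=3):
    """Master: hand one partition to each worker, then reduce the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker touches only the partition it was given, which
        # loosely mimics data locality (assumption: one partition per node).
        partials = list(pool.map(map_task, partitions))
    return reduce_task(partials)


partitions = [
    ["big data is big", "map then reduce"],
    ["hadoop stores data"],
    ["workers run in parallel"],
]
total_words = run_job(partitions)
print(total_words)
```

A real cluster would also reschedule failed tasks on another node; that part of fault tolerance is left out of this sketch.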
11. MR model
Map
– Process a key/value pair to generate intermediate key/value
pairs
Reduce
– Merge all intermediate values associated with the same key
Users implement an interface of two primary methods:
1. Map: (key1, val1) → (key2, val2)
2. Reduce: (key2, [val2]) → [val3]
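As an illustration of these two signatures, here is a minimal in-memory sketch using word count as the example job. Word count is chosen only for illustration and is not taken from the slides; `map_fn`, `reduce_fn` and `map_reduce` are hypothetical names, and the grouping step stands in for the shuffle that a framework such as Hadoop performs between the two phases.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """Map: (key1, val1) -> list of (key2, val2). Emits (word, 1) pairs."""
    return [(word.lower(), 1) for word in text.split()]


def reduce_fn(word, counts):
    """Reduce: (key2, [val2]) -> [val3]. Sums the counts for one word."""
    return [sum(counts)]


def map_reduce(records, mapper, reducer):
    """Run mapper on every record, group values by key, then run reducer."""
    groups = defaultdict(list)
    for key1, val1 in records:
        for key2, val2 in mapper(key1, val1):
            groups[key2].append(val2)  # group intermediate values by key
    return {key2: reducer(key2, vals) for key2, vals in sorted(groups.items())}


records = [
    ("doc1", "big data needs big tools"),
    ("doc2", "data tools"),
]
result = map_reduce(records, map_fn, reduce_fn)
print(result)
```

The user supplies only `map_fn` and `reduce_fn`; everything inside `map_reduce` is what the framework would normally take care of.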
16. Conclusion
Real-time big data isn’t just a process for storing
petabytes or exabytes of data in a data warehouse; it’s
about the ability to make better decisions and take
meaningful actions at the right time.