Big data analysis using map/reduce


  1. Big Data Analysis for Page Ranking using Map/Reduce. R. Renuka, R. Vidhya Priya, III B.Sc., IT, The S.F.R. College for Women, Sivakasi.
  2. Overview: Introduction; What is Big Data?; Why Big Data?; 4 V's of Big Data; Big Data Analytics Technologies; Map/Reduce; Applications; Case Study; Conclusion.
  3. Introduction: Data have outgrown the storage and processing capabilities of a single host. Two fundamental challenges: how to store and work with voluminous data sizes, and how to understand the data and turn it into a competitive advantage.
  4. What is Big Data? "Big data" is similar to "small data", but bigger. Handling bigger data requires different approaches: techniques, tools and architectures. The aim is to solve new problems, and old problems in a better way.
  5. The Blind Men and the Elephant
  6. Why Big Data? Key enablers for the growth of "Big Data" are: increase of processing power, increase of storage capacities, and availability of data.
  7. 4 V's of Big Data
  8. Big Data Analytics Technologies: Hadoop, PLATFORA, WibiData, PIG, Hive, MapReduce, NoSQL databases, column-oriented databases.
  9. Hadoop: Hadoop is a distributed file system and data processing engine. Hadoop has two components: the Hadoop Distributed File System (HDFS) and the MapReduce programming model.
  10. Map/Reduce: A high-level abstracted framework for distributed processing of large datasets. Fault tolerant and parallelized. Computation consists of two phases: Map and Reduce. A master-slave architecture: computations occur on multiple slave nodes, and the framework tries to provide data locality as much as possible.
  11. MR model: Map processes a key/value pair to generate intermediate key/value pairs. Reduce merges all intermediate values associated with the same key. Users implement an interface of two primary methods: (1) Map: (key1, val1) → (key2, val2); (2) Reduce: (key2, [val2]) → [val3].
  12. Applications
  13. Homeland Security, Finance, Smarter Healthcare, Multi-channel Sales, Telecom, Manufacturing, Traffic Control, Trading Analytics, Fraud and Risk, Log Analysis, Search Quality, Retail.
  14. Case Study
  15. Conclusion: Real-time big data isn't just a process for storing petabytes or exabytes of data in a data warehouse. It's about the ability to make better decisions and take meaningful actions at the right time.
  16. Queries?
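
The MR model from the slides, Map: (key1, val1) → (key2, val2) and Reduce: (key2, [val2]) → [val3], can be illustrated with a minimal single-process sketch in Python. This is not the slides' case study and not Hadoop's API: the names `run_map_reduce`, `link_mapper` and `count_reducer`, and the tiny three-page link graph, are assumptions made up for illustration. The example counts inbound links per page, a simple signal related to page ranking, and ignores distribution, fault tolerance and data locality entirely.

```python
from collections import defaultdict


def run_map_reduce(records, mapper, reducer):
    """Run a mapper and a reducer over (key1, val1) records in one process."""
    # Map phase: each input pair may emit any number of (key2, val2) pairs.
    # Grouping by key2 here stands in for the shuffle step of a real cluster.
    intermediate = defaultdict(list)
    for key1, val1 in records:
        for key2, val2 in mapper(key1, val1):
            intermediate[key2].append(val2)

    # Reduce phase: merge all intermediate values that share the same key2.
    return {key2: reducer(key2, values) for key2, values in intermediate.items()}


def link_mapper(page, outlinks):
    """Map: (page, [outlinks]) -> (target_page, 1) for every outgoing link."""
    for target in outlinks:
        yield target, 1


def count_reducer(page, counts):
    """Reduce: (page, [1, 1, ...]) -> [number of inbound links]."""
    return [sum(counts)]


# Hypothetical link graph: page name -> pages it links to.
pages = [("a", ["b", "c"]), ("b", ["c"]), ("c", ["a"])]

inbound = run_map_reduce(pages, link_mapper, count_reducer)
print(inbound)
```

Because the mapper and reducer are plain functions over key/value pairs, the same pair of functions could in principle be handed to a distributed runtime that runs map tasks on many nodes and routes each key2 to one reducer. That portability is the point of the two-method interface.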
