Hadoop is a cluster computing framework.
Hadoop tools enable more developers and organizations to use Hadoop for big data management, and demand is growing for tools that make Hadoop's processing power more accessible. I will give a brief explanation of the applications and tools associated with Hadoop. I will also present a project showing how some of these tools were used to analyze the percentage of brain injured persons in New England in a December 2010 survey, which was conducted to determine whether brain transplant was an option for addressing brain injury in the nation.
9. AN ANALYSIS OF BRAIN INJURY
Goal:
• Determine the percentage of brain injured
persons from New England compared to the
Nation for the month of December 2010
10. Why choose this project?
• This project allowed me to apply HBase,
Hive, and Pig, which are tools associated with Hadoop
Common.
• These tools were available for free and easy to
navigate.
11. Dataset source
• The dataset came from the U.S. Department
of Health and Human Services
Survey to implement brain transplant
12. Dataset information
• Filename: US-DEC-2010-BRAIN
• File size: 90.2 KB (92,320 bytes).
• The data span a period of 5 years, including ~998,877
reviews up to September 2015
• Reviews include brain cancer research and department of
mental health.
14. • Dataset was then converted to .CSV format
• Dataset has 100 instances and 11 attributes
15. Dataset was imported into Hadoop Common
using the file browser and then stored inside
HBase.
• Using HCatalog, a table was created that
automatically fits the number of rows and
columns to suit the dataset.
17. Data is then imported into the table with a single
click (create). This would take a few minutes
depending on the size of the dataset.
• Actual Table US-DEC-2010-BRAIN PREVIEW
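The table creation and import described above were done through the browser interface. As a rough sketch only, a comparable table could be declared and loaded directly in HiveQL. Apart from table_brain and the state column, which appear in the queries on the following slides, every name here is an assumption: the slides do not list the 11 attributes, so attr_02 through attr_11 are placeholders (as is the position of state), and the HDFS path is hypothetical. The sketch also uses a plain text-file table rather than HBase storage.

```sql
-- Sketch: declare a comma-delimited text table for the CSV file.
-- Only `state` is known from the slides; the other ten column names
-- are placeholders standing in for the real attributes.
CREATE TABLE IF NOT EXISTS table_brain (
  state   STRING,
  attr_02 STRING,
  attr_03 STRING,
  attr_04 STRING,
  attr_05 STRING,
  attr_06 STRING,
  attr_07 STRING,
  attr_08 STRING,
  attr_09 STRING,
  attr_10 STRING,
  attr_11 STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Hypothetical HDFS location of the uploaded file; adjust to the real path.
LOAD DATA INPATH '/user/hue/US-DEC-2010-BRAIN.csv'
INTO TABLE table_brain;
```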
18. • Hive Query Language (HQL) was used to
write statements that return
the number of brain injured persons by state
• Target states were ME, RI, VT, NH, MA and CT
19. SELECT * FROM table_brain
WHERE table_brain.state = 'ME'
ME = 3
20. SELECT * FROM table_brain
WHERE table_brain.state = 'RI'
RI = 5
21. SELECT * FROM table_brain
WHERE table_brain.state = 'VT'
VT = 0
22. SELECT * FROM table_brain
WHERE table_brain.state = 'NH'
NH = 1
23. SELECT * FROM table_brain
WHERE table_brain.state = 'MA'
MA = 12
24. SELECT * FROM table_brain
WHERE table_brain.state = 'CT'
CT = 5
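The six per-state queries above could also be collapsed into a single statement. The sketch below assumes that each row of table_brain represents one brain injured person, so counting rows per state gives the per-state totals; the alias injured_persons is illustrative. A state with no matching rows (VT in this dataset) would be absent from the output rather than listed as 0.

```sql
-- Count rows per New England state in one pass over the table.
SELECT state, COUNT(*) AS injured_persons
FROM table_brain
WHERE state IN ('ME', 'RI', 'VT', 'NH', 'MA', 'CT')
GROUP BY state;
```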
26. RESULT
• MAINE = 3
• RHODE ISLAND = 5
• VERMONT = 0
• NEW HAMPSHIRE = 1
• MASSACHUSETTS = 12
• CONNECTICUT = 5
• TOTAL = 26
• NATION = 100
• 26/100 = 0.26
• 0.26 * 100 = 26%
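The same percentage could be computed inside Hive instead of by hand. This is a sketch under the assumption that table_brain holds exactly the national records, one row per person; the column aliases are illustrative and not taken from the original project.

```sql
-- New England share of all rows, expressed as a percentage.
SELECT
  SUM(CASE WHEN state IN ('ME', 'RI', 'VT', 'NH', 'MA', 'CT')
           THEN 1 ELSE 0 END)                 AS new_england_total,
  COUNT(*)                                    AS nation_total,
  100.0 * SUM(CASE WHEN state IN ('ME', 'RI', 'VT', 'NH', 'MA', 'CT')
                   THEN 1 ELSE 0 END)
        / COUNT(*)                            AS new_england_pct
FROM table_brain;
```

With the counts reported on the slides (26 New England rows out of 100), new_england_pct would evaluate to 26.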
27. CONCLUSION
In a survey conducted in
December 2010 to determine the number of brain
injured persons in the nation, in support of a brain
transplant procedure, 26% of the nation's total came
from New England.