Top Big Data Terms
Term | Definition
Hadoop | Open-source software framework, written in Java, that supports running applications on large clusters of commodity hardware.
HDFS | Hadoop Distributed File System. A distributed file system that stores large files across multiple machines. It replicates data across those machines and tracks what data is being processed, when, and by whom.
MapReduce | A programming model for processing large data sets with a parallel, distributed algorithm on a cluster. Its Map() procedure filters and sorts, and its Reduce() procedure performs summary operations.
Hive | A data warehouse infrastructure built on top of Hadoop that provides data summarization, query, and analysis.
HBase | An open-source, non-relational, distributed database that runs on top of HDFS.
Cassandra | Apache Cassandra is an open-source distributed database management system designed to handle very large amounts of data spread across many commodity servers.
Source: Wikipedia (mainly)
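
The MapReduce entry can be illustrated with a word count, the usual introductory example. This is a minimal single-process sketch in plain Python, not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real cluster would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in the document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, standing in for the framework's
    sort/shuffle stage between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: summarize all values seen for one key (here, a sum)."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

Because each map call looks at one document and each reduce call looks at one key, the work can be split across machines without the steps needing to share state.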
Sizes that Matter
Name | Value | Example
1 bit | base unit | The smallest unit of data that a computer uses. It can represent two states of information, such as yes or no.
1 byte | 8 bits | A byte can represent 256 states of information. 1 byte could equal one character, 10 bytes a word, and 100 bytes an average sentence.
1 kilobyte (kB) | 1024 bytes | 1 kilobyte would equal a paragraph.
1 megabyte (MB) | 1024 kB | A 3-1/2 inch floppy disk can hold 1.44 megabytes, or the equivalent of a small book. 600 megabytes is about the amount of data that fits on a CD-ROM.
1 gigabyte (GB) | 1024 MB | 1 GB could hold the contents of about 10 yards of books on a shelf.
1 terabyte (TB) | 1024 GB | 1 TB could hold 1,000 copies of the Encyclopedia Britannica.
1 petabyte (PB) | 1024 TB | 500 million floppy disks.
1 exabyte (EB) | 1024 PB | 5 exabytes could equal all of the words ever spoken by mankind.
1 zettabyte (ZB) | 1024 EB | ?
Source: http://www.whatsabyte.com/
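
Each unit in the table is 1024 times the one before it, so conversions reduce to powers of 1024. The following is a small sketch of that arithmetic; the names UNITS, to_bytes, and humanize are hypothetical helpers chosen for illustration.

```python
# Unit names in ascending order; each step up multiplies by 1024.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(value, unit):
    """Convert a value in the given unit to a number of bytes."""
    return value * 1024 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1024


if __name__ == "__main__":
    print(to_bytes(1, "MB"))    # 1048576
    print(humanize(1536))       # 1.5 kB
    print(humanize(1024 ** 3))  # 1 GB
```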
TRY IT @ WWW.SISENSE.COM
Glossary of Big Data Terms
