The concept of Big Data in the context of real-world data scenarios.
Presented at DotNetters Tech Summit - 2015, RUET
Presenter: Md. Delwar Hiossain
Event URL: https://www.facebook.com/events/512834685530439/
5. Limits of traditional systems
Traditional software tools struggle with large files:
100 MB document: unable to send
100 GB image: unable to view
100 TB video: unable to edit
6. Data comes from many sources:
Social media sites
Sensors
Business transactions
Location-based data
7. Volume: large volumes of data
Velocity: rapidly moving data
Variety: structured, unstructured, images, audio, video, etc.
8. Business value: outcomes tied to the business strategy, resulting from better business decisions.
Higher productivity, faster time to complete tasks
Lower total cost of ownership and greater efficiency in IT
10. Scalability
Semi-structured and unstructured data
High velocity
11. Schemaless: the data structure is not predefined
Focus on retrieval of data and on appending new data
Focus on key-value stores that can be used to locate data objects
Focus on supporting storage of large quantities of unstructured data
SQL is not used for storage or retrieval of data
No ACID guarantees (atomicity, consistency, isolation, durability)
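The schemaless key-value idea on this slide can be sketched in a few lines. This is a minimal illustration, not a real NoSQL engine: values are free-form dicts, so records in the same store need not share any fields, and there is no schema, SQL layer, or transaction support. The `post:1`-style key names are invented for the example.

```python
class KeyValueStore:
    """Toy schemaless key-value store: values are arbitrary objects."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Append or overwrite; no schema is enforced on the value.
        self._data[key] = value

    def get(self, key, default=None):
        # Retrieval is by key only, the primary access path in a KV store.
        return self._data.get(key, default)


store = KeyValueStore()
store.put("post:1", {"author": "delwar", "tag": ["tools", "database"]})
store.put("post:2", {"title": "untitled"})  # different fields: allowed
print(store.get("post:1")["author"])  # delwar
```

Note that both records coexist despite having entirely different shapes, which is exactly what "data structure is not predefined" means in practice.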
12. Built for the cloud
Scale-out architecture
Agility afforded by cloud computing
Sharding automatically distributes data evenly across multi-node clusters
Automatically manages redundant servers (replica sets)
Horizontal scalability
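One common way sharding spreads data evenly, hinted at on this slide, is hashing each key to pick a node. The sketch below shows that idea only; it is not MongoDB's actual shard-key machinery (which involves config servers and ranged or hashed shard keys), and the `doc-N` key names and four-shard cluster are assumptions for illustration.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard by hashing it (a common sharding scheme)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Simulate distributing 1000 document keys across a 4-node cluster.
shards = {i: [] for i in range(4)}
for doc_id in (f"doc-{n}" for n in range(1000)):
    shards[shard_for(doc_id, 4)].append(doc_id)

# Hashing spreads the keys roughly evenly across the nodes.
print([len(docs) for docs in shards.values()])
```

Because the hash output is effectively uniform, each node ends up with close to a quarter of the keys, which is the "distributes data evenly" property the slide claims.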
Document-oriented data model, e.g. a MongoDB-style document:
{
  author: "delwar",
  date: new Date(),
  topics: "Big Data Concept",
  tag: ["tools", "database"]
}
Volume. A typical PC might have had 10 gigabytes of storage in 2000. Today, Facebook ingests 500 terabytes of new data every day; a Boeing 737 generates 240 terabytes of flight data during a single flight across the US; smartphones proliferate along with the data they create and consume; and sensors embedded in everyday objects will soon produce billions of new, constantly updated data feeds containing environmental, location, and other information.
Velocity. Clickstreams and ad impressions capture user behavior at millions of events per second; high-frequency stock-trading algorithms reflect market changes within microseconds; machine-to-machine processes exchange data between billions of devices; infrastructure and sensors generate massive log data in real time; online gaming systems support millions of concurrent users, each producing multiple inputs per second.
Variety. Big Data isn't just numbers, dates, and strings. It is also geospatial data, 3D data, audio and video, and unstructured text, including log files and social media.
Selecting a Big Data Technology: Operational vs. Analytical
The Big Data landscape is dominated by two classes of technology: systems that provide operational capabilities for real-time, interactive workloads where data is primarily captured and stored; and systems that provide analytical capabilities for retrospective, complex analysis that may touch most or all of the data. These classes of technology are complementary and frequently deployed together.
Operational and analytical workloads for Big Data present opposing requirements, and systems have evolved to address their particular demands separately and in very different ways; each has driven the creation of new technology architectures. Operational systems, such as NoSQL databases, focus on servicing many concurrent requests with low latency, typically over highly selective access criteria. Analytical systems, on the other hand, tend to focus on high throughput; queries can be very complex and may touch most if not all of the data in the system at any time. Both kinds of system tend to run over many servers operating in a cluster, managing tens or hundreds of terabytes of data across billions of records.
Operational Big Data
For operational Big Data workloads, NoSQL Big Data systems such as document databases have emerged to address a broad set of applications, and other architectures, such as key-value stores, column family stores, and graph databases are optimized for more specific applications. NoSQL technologies, which were developed to address the shortcomings of relational databases in the modern computing environment, are faster and scale much more quickly and inexpensively than relational databases.
Critically, NoSQL Big Data systems are designed to take advantage of new cloud computing architectures that have emerged over the past decade to allow massive computations to be run inexpensively and efficiently. This makes operational Big Data workloads much easier to manage, and cheaper and faster to implement.
In addition to user interactions with data, most operational systems need to provide some degree of real-time intelligence about the active data in the system. For example in a multi-user game or financial application, aggregates for user activities or instrument performance are displayed to users to inform their next actions. Some NoSQL systems can provide insights into patterns and trends based on real-time data with minimal coding and without the need for data scientists and additional infrastructure.
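The kind of real-time intelligence described above can be approximated by maintaining per-user aggregates as events stream in, so current totals are available without a separate batch pass. This is a minimal single-process sketch; the event shape (`user`, `score` fields) is an assumption for illustration, not a fixed API of any NoSQL system.

```python
from collections import defaultdict

# Running per-user aggregates, updated as each event arrives.
totals = defaultdict(lambda: {"events": 0, "score": 0})

def ingest(event):
    """Fold one incoming event into that user's running totals."""
    agg = totals[event["user"]]
    agg["events"] += 1
    agg["score"] += event["score"]

stream = [
    {"user": "a", "score": 10},
    {"user": "b", "score": 5},
    {"user": "a", "score": 7},
]
for event in stream:
    ingest(event)

print(totals["a"])  # {'events': 2, 'score': 17}
```

At any point mid-stream, `totals` reflects everything seen so far, which is what lets a game or financial dashboard show users up-to-date aggregates to inform their next action.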
Analytical Big Data
Analytical Big Data workloads, on the other hand, tend to be addressed by MPP database systems and MapReduce. These technologies are also a reaction to the limitations of traditional relational databases, in particular their inability to scale beyond the resources of a single server. Furthermore, MapReduce provides a new method of analyzing data that is complementary to the capabilities provided by SQL.
As applications gain traction and their users generate increasing volumes of data, there are a number of retrospective analytical workloads that provide real value to the business. Where these workloads involve algorithms that are more sophisticated than simple aggregation, MapReduce has emerged as the first choice for Big Data analytics. Some NoSQL systems provide native MapReduce functionality that allows for analytics to be performed on operational data in place. Alternately, data can be copied from NoSQL systems into analytical systems such as Hadoop for MapReduce.
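The MapReduce model referred to above can be sketched in miniature: a map phase emits (key, value) pairs, a shuffle groups pairs by key, and a reduce phase folds each group. The classic word-count example below runs in a single process purely to show the programming model; real MapReduce frameworks such as Hadoop distribute each phase across a cluster.

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc):
    # Map: emit one (word, 1) pair per word in the document.
    for word in doc.lower().split():
        yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all values under their key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: fold each group of values into a single count.
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data concept", "big data tools"]
pairs = chain.from_iterable(map_phase(d) for d in docs)
counts = reduce_phase(shuffle(pairs))
print(counts["big"])  # 2
```

Because map and reduce are independent per document and per key, each phase parallelizes naturally, which is why the model scales to the retrospective, whole-dataset analyses described in this section.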