BIG DATA ANALYTICS
REFERENCE ARCHITECTURES AND
CASE STUDIES
BY SERHIY HAZIYEV AND OLHA HRYTSAY
Agenda
▪ Big Data Challenges
▪ Big Data Reference Architectures
▪ Case Studies
▪ 10 Tips for Designing Big Data Solutions
Big Data Challenges
[Diagram: data sources positioned between STRUCTURED and UNSTRUCTURED on a HIGH / MEDIUM / LOW scale: Archives, Docs, Business Apps, Media, Social Networks, Public Web, Data Storages, Machine Log Data, Sensor Data]
Data Storages: RDBMS, NoSQL, Hadoop, file systems, etc.
Machine Log Data: Application logs, event logs, server data, CDRs, clickstream data, etc.
Sensor Data: Smart electric meters, medical devices, car sensors, road cameras, etc.
Archives: Scanned documents, statements, medical records, e-mails, etc.
Docs: XLS, PDF, CSV, HTML, JSON, etc.
Business Apps: CRM, ERP systems, HR, project management, etc.
Social Networks: Twitter, Facebook, Google+, LinkedIn, etc.
Public Web: Wikipedia, news, weather, public finance, etc.
Media: Images, video, audio, etc.
Velocity, Variety, Volume, Complexity
Big Data Analytics
Traditional Analytics (BI) vs. Big Data Analytics
Focus on:
▪ Traditional Analytics (BI): descriptive analytics, diagnostic analytics
▪ Big Data Analytics: predictive analytics, data science
Data Sets:
▪ Traditional Analytics (BI): limited data sets, cleansed data, simple models
▪ Big Data Analytics: large-scale data sets, more types of data, raw data, complex data models
Supports:
▪ Traditional Analytics (BI): causation (what happened, and why?)
▪ Big Data Analytics: correlation (new insight, more accurate answers)
Big Data Analytics Use Cases
Use cases: Data Discovery, Business Reporting, Real Time Intelligence
Users: Data Scientists/Analysts, Business Users, Intelligent Agents, Consumers
Qualities: Volume, Performance, Data Quality, Self Service, Low Latency, Reliability
Big Data Analytics Reference Architectures
Architecture Drivers:
▪ Volume
▪ Sources
▪ Throughput
▪ Latency
▪ Extensibility
▪ Data Quality
▪ Reliability
▪ Security
▪ Self-Service
▪ Cost
Reference Architectures:
▪ Extended Relational
▪ Non-Relational
▪ Hybrid
Relational Reference Architecture
Data Sources: Structured, Semi-Structured, Unstructured
Integration: ETL, Messaging, API/ODBC, Replication
Data Storages: Operational Data Stores, Data Marts, Data Warehouses
Analytics: Query & Reporting, OLAP Cubes, Advanced Analytics
Presentation: Web Browsers, Native Desktop, Mobile Devices, Web Services
Extended Relational Reference Architecture
Data Sources: Structured, Semi-Structured, Unstructured
Integration: ETL, Messaging, API/ODBC, Replication
Data Storages: Operational Data Stores, Data Marts, Data Warehouses
Analytics: Query & Reporting, OLAP Cubes, Advanced Analytics
Presentation: Web Browsers, Native Desktop, Mobile Devices, Web Services
Key components affected by Big Data challenges
Non-Relational Reference Architecture
Data Sources: Structured, Semi-Structured, Unstructured
Integration: ETL, Messaging, API
Data Storages and Analytics: Distributed File Systems, NoSQL Databases, Search Engines, Map Reduce, Query & Reporting, Advanced Analytics
Presentation: Web Browsers, Native Desktop, Mobile Devices, Web Services
Key components introduced with the non-relational movement
Extended Relational vs. Non-Relational Architecture
Architecture drivers compared across Extended Relational and Non-Relational:
▪ Large data volume
▪ Self-service (ad-hoc reporting)
▪ Unstructured data processing
▪ High data model extensibility
▪ High data quality and consistency
▪ Extensive security
▪ Reliability and fault-tolerance
▪ Low latency (near-real time)
▪ Low cost
▪ Skills availability
Relational vs. Non-Relational Architecture
Relational:
• Rational
• Predictable
• Traditional
Non-Relational:
• Agile
• Flexible
• Modern
Big Data Analytics Use Cases
Use cases: Data Discovery, Business Reporting, Real Time Intelligence
Users: Data Scientists, Business Users, Intelligent Agents, Consumers
Qualities: Performance, Volume
Data Discovery: Non-Relational Architecture
Data Sources: Structured, Semi-Structured, Unstructured
Integration: ETL, Messaging, API
Data Storages and Analytics: Distributed File Systems, NoSQL Databases, Search Engines, Map Reduce, Query & Reporting, Advanced Analytics
Presentation: Web Browsers, Native Desktop, Mobile Devices, Web Services
Big Data Analytics Use Cases
Use cases: Data Discovery, Business Reporting, Real Time Intelligence
Users: Data Scientists, Business Users, Intelligent Agents, Consumers
Qualities: Data Quality, Self Service
Business Reporting: Hybrid Architecture
Data Sources: Structured, Semi-Structured, Unstructured
Integration: ETL, Messaging, API
Data Storages and Analytics: Distributed File Systems, Relational DWH/DM, Map Reduce, SQL Query & Reporting, Search Engines, Advanced Analytics
Presentation: Web Browsers, Native Desktop, Mobile Devices, Web Services
Legend: Extended Relational components, Non-relational components
Big Data Analytics Use Cases
Use cases: Data Discovery, Business Reporting, Real Time Intelligence
Users: Data Scientists, Business Users, Intelligent Agents, Consumers
Qualities: Low Latency, Reliability
Lambda Architecture
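The slide names the Lambda Architecture without spelling it out. The sketch below is a minimal illustration, assuming the usual description of the pattern: a batch layer recomputes a view over the full master dataset, a speed layer folds in events that arrived since the last batch run, and a serving step merges the two at query time. The event data and all function names are hypothetical and are not taken from the deck.

```python
from collections import Counter

# Hypothetical (timestamp, page) events; all names here are illustrative.
master_dataset = [(1, "home"), (2, "cart"), (3, "home")]  # covered by the last batch run
recent_events = [(4, "home"), (5, "checkout")]            # arrived after the last batch run


def batch_view(events):
    """Batch layer: recompute the view from scratch over the master dataset."""
    return Counter(page for _, page in events)


def update_realtime_view(view, event):
    """Speed layer: fold one new event into the incremental view."""
    _, page = event
    view[page] += 1
    return view


def query(batch, realtime, page):
    """Serving step: merge the batch view and the real-time view."""
    return batch.get(page, 0) + realtime.get(page, 0)


batch = batch_view(master_dataset)
realtime = Counter()
for event in recent_events:
    update_realtime_view(realtime, event)

print(query(batch, realtime, "home"))      # 3
print(query(batch, realtime, "checkout"))  # 1
```

In this sketch the real-time view would be discarded once the next batch run has absorbed the recent events, which keeps the incremental path small and easy to rebuild.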
Case Study #1: Usage & Billing Analysis
Business Area:
Cloud-based platform for building, deploying, hosting and managing mobile applications
Business Goals:
▪ Provide a development environment for building custom mobile applications
▪ Charge customers for the platform they use with a pay-as-you-go model
Architectural Decisions
Architecture Drivers:
▪ Volume (> 10 TB)
▪ Sources (Semi-structured - JSON)
▪ Throughput (> 10K/sec)
▪ Latency (2 min)
▪ Extensibility (Custom metrics)
▪ Data Quality (Consistency)
▪ Reliability (24/7)
▪ Security (Multitenancy)
▪ Self-Service (Ad-Hoc reports)
▪ Cost (The less the better)
▪ Constraints (Public Cloud)
Trade-off:
                  Extended Relational   Non-Relational
Extensibility              -                  +
Data Quality               +                  -
Self-Service               +                  -
▪ Extended Relational Architecture
▪ Extensibility via Pre-allocated Fields pattern
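The Pre-allocated Fields pattern is only named on the slide. The sketch below shows one common reading of it, under stated assumptions: the fact table carries a fixed set of generic spare columns, and a per-tenant mapping table records which custom metric each spare column holds. The case study used Amazon Redshift; `sqlite3` is used here purely as a runnable stand-in, and every table, column, and function name is hypothetical.

```python
import sqlite3

SPARE_COLUMNS = ["custom_1", "custom_2", "custom_3"]  # assumed number of spare fields

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usage_events ("
    "tenant_id TEXT, event_time TEXT, "
    "custom_1 REAL, custom_2 REAL, custom_3 REAL)"
)
conn.execute(
    "CREATE TABLE metric_mapping ("
    "tenant_id TEXT, metric_name TEXT, column_name TEXT, "
    "PRIMARY KEY (tenant_id, metric_name))"
)


def register_metric(tenant_id, metric_name):
    """Assign the tenant's next free spare column to a new custom metric."""
    used = {
        row[0]
        for row in conn.execute(
            "SELECT column_name FROM metric_mapping WHERE tenant_id = ?", (tenant_id,)
        )
    }
    free = [c for c in SPARE_COLUMNS if c not in used]
    if not free:
        raise RuntimeError("no pre-allocated fields left for " + tenant_id)
    conn.execute(
        "INSERT INTO metric_mapping VALUES (?, ?, ?)", (tenant_id, metric_name, free[0])
    )
    return free[0]


def record_event(tenant_id, event_time, metrics):
    """Store a dict of custom metrics in the columns mapped for this tenant."""
    mapping = dict(
        conn.execute(
            "SELECT metric_name, column_name FROM metric_mapping WHERE tenant_id = ?",
            (tenant_id,),
        )
    )
    # Column names come only from SPARE_COLUMNS, never from user input.
    columns = ["tenant_id", "event_time"] + [mapping[name] for name in metrics]
    sql = "INSERT INTO usage_events ({}) VALUES ({})".format(
        ", ".join(columns), ", ".join("?" * len(columns))
    )
    conn.execute(sql, (tenant_id, event_time, *metrics.values()))


register_metric("tenant_a", "api_calls")
register_metric("tenant_a", "storage_mb")
record_event("tenant_a", "2014-01-01T00:00:00", {"api_calls": 120, "storage_mb": 35.5})
```

The schema never changes when a tenant adds a metric, which keeps the relational model and its self-service reporting intact. The price is a hard cap on custom metrics per tenant, set by the number of spare columns.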
Solution Architecture
Technologies:
• Amazon Redshift
• Amazon SQS
• Amazon S3
• Elastic Beanstalk
• Jaspersoft BI Professional
• Python
Case Study #2: Clickstream for retail website
Business Area:
Retail. A platform for e-commerce and collecting feedback from customers
Business Goals:
▪ Build an in-house analytics platform for ROI measurement and performance analysis of every product and feature delivered by the e-commerce platform
▪ Provide the ability to understand how end users are interacting with service content, products, and features on sites
▪ Do clickstream analysis
▪ Perform A/B testing
Architectural Decisions
Architecture Drivers:
▪ Volume (45 TB)
▪ Sources (Semi-structured - JSON)
▪ Throughput (> 20K/sec)
▪ Latency (1 hour)
▪ Extensibility (Custom tags)
▪ Data Quality (Not critical)
▪ Reliability (24/7)
▪ Security (Multitenancy)
▪ Self-Service (Canned reports, Data science)
▪ Cost (The less the better)
▪ Constraints (Public Cloud)
Trade-off:
                     Extended Relational   Non-Relational
Volume/Scalability          +/-                  +
Throughput                   +                   +
Self-Service                 +                  +/-
Extensibility                -                   +
▪ Non-Relational Architecture
▪ Reporting via Materialized View pattern
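The Materialized View pattern is named but not shown. A minimal sketch follows, assuming the usual meaning: a scheduled batch job precomputes an aggregate from the raw clickstream, and canned reports read only that precomputed result. The case study lists Hive, HBase and Oozie; plain Python is used here as a stand-in, and the event fields and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw clickstream events (semi-structured, as they might arrive in JSON).
raw_events = [
    {"day": "2014-05-01", "page": "/home", "variant": "A"},
    {"day": "2014-05-01", "page": "/home", "variant": "B"},
    {"day": "2014-05-01", "page": "/cart", "variant": "A"},
    {"day": "2014-05-02", "page": "/home", "variant": "A"},
]


def build_page_views_by_day(events):
    """Batch job: aggregate raw events into a view keyed by (day, page)."""
    view = defaultdict(int)
    for event in events:
        view[(event["day"], event["page"])] += 1
    return dict(view)


def report_page_views(view, day):
    """Canned report: read only the precomputed view, never the raw events."""
    return {page: count for (d, page), count in view.items() if d == day}


# In a real pipeline this would be refreshed on a schedule (assumption).
page_views_by_day = build_page_views_by_day(raw_events)
print(report_page_views(page_views_by_day, "2014-05-01"))
```

Because reports tolerate a one-hour latency in this case study, the view can be rebuilt periodically rather than updated on every event.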
Solution Architecture
Technologies:
• Amazon S3
• Flume
• Hadoop/HDFS, MapReduce
• HBase
• Oozie
• Hive
Cluster: Node 1, Node 2, ..., Node N
10 Tips for Designing Big Data Solutions
▪ Understand data users and sources
▪ Discover architecture drivers
▪ Select proper reference architecture
▪ Do trade-off analysis, address cons
▪ Map reference architecture to technology stack
▪ Prototype, re-evaluate architecture
▪ Estimate implementation efforts
▪ Set up DevOps practices from the very beginning
▪ Advance in solution development through “small wins”
▪ Be ready for changes: big data technologies are evolving rapidly
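The trade-off analysis tip can be made concrete with the ratings from the two case studies. The sketch below copies the "+", "+/-" and "-" marks from their trade-off tables and totals them per architecture. The numeric scale and the equal weighting of drivers are assumptions made for illustration; the deck does not describe a scoring method.

```python
SCORES = {"+": 1, "+/-": 0, "-": -1}  # assumed numeric scale, all drivers weighted equally

# Ratings copied from the trade-off tables of the two case studies.
case_study_1 = {
    "Extensibility": {"Extended Relational": "-", "Non-Relational": "+"},
    "Data Quality": {"Extended Relational": "+", "Non-Relational": "-"},
    "Self-Service": {"Extended Relational": "+", "Non-Relational": "-"},
}
case_study_2 = {
    "Volume/Scalability": {"Extended Relational": "+/-", "Non-Relational": "+"},
    "Throughput": {"Extended Relational": "+", "Non-Relational": "+"},
    "Self-Service": {"Extended Relational": "+", "Non-Relational": "+/-"},
    "Extensibility": {"Extended Relational": "-", "Non-Relational": "+"},
}


def total_scores(ratings):
    """Sum the numeric score of every driver for each architecture."""
    totals = {}
    for marks in ratings.values():
        for architecture, mark in marks.items():
            totals[architecture] = totals.get(architecture, 0) + SCORES[mark]
    return totals


def preferred(ratings):
    """Return the architecture with the highest total score."""
    totals = total_scores(ratings)
    return max(totals, key=totals.get)


print(total_scores(case_study_1), preferred(case_study_1))
print(total_scores(case_study_2), preferred(case_study_2))
```

Under these assumptions the totals point to Extended Relational for the first case study and Non-Relational for the second, matching the decisions reported on the slides. The "address cons" half of the tip corresponds to the patterns chosen there to offset each architecture's weak driver.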
▪ Leading global Product and Application Development partner founded in 1993
▪ 3,300+ employees across North America, Ukraine and Western Europe
▪ Thousands of successful outsourcing projects!
SaaS/Cloud Solutions . Mobility Solutions . UX/UI
BI/Analytics/Big Data . Software Architecture . Security
Clients include:
Thank You!
Contacts
Serhiy Haziyev: shaziyev@softserveinc.com
Olha Hrytsay: ohrytsay@softserveinc.com
SoftServe US Office
One Congress Plaza, 111 Congress Avenue, Suite 2700, Austin, TX 78701
Tel: 512.516.8880

More Related Content

What's hot

Introduction to Python for Data Science
Introduction to Python for Data ScienceIntroduction to Python for Data Science
Introduction to Python for Data ScienceArc & Codementor
 
Data Visualization in Data Science
Data Visualization in Data ScienceData Visualization in Data Science
Data Visualization in Data ScienceMaloy Manna, PMP®
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's includedJames Serra
 
Lake Database Database Template Map Data in Azure Synapse Analytics
Lake Database  Database Template  Map Data in Azure Synapse AnalyticsLake Database  Database Template  Map Data in Azure Synapse Analytics
Lake Database Database Template Map Data in Azure Synapse AnalyticsErwin de Kreuk
 
Data analytics presentation- Management career institute
Data analytics presentation- Management career institute Data analytics presentation- Management career institute
Data analytics presentation- Management career institute PoojaPatidar11
 
Data science applications and usecases
Data science applications and usecasesData science applications and usecases
Data science applications and usecasesSreenatha Reddy K R
 
Applied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelApplied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelDataiku
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Data Science London
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupBlake Irvine
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph DatabasesMax De Marzi
 
State of Data Governance in 2021
State of Data Governance in 2021State of Data Governance in 2021
State of Data Governance in 2021DATAVERSITY
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Dr. Arif Wider
 
Building an integrated data strategy
Building an integrated data strategyBuilding an integrated data strategy
Building an integrated data strategyLucas Modesto
 

What's hot (20)

Introduction to Python for Data Science
Introduction to Python for Data ScienceIntroduction to Python for Data Science
Introduction to Python for Data Science
 
Data Visualization in Data Science
Data Visualization in Data ScienceData Visualization in Data Science
Data Visualization in Data Science
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Data analytics
Data analyticsData analytics
Data analytics
 
Big Data
Big DataBig Data
Big Data
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
 
Lake Database Database Template Map Data in Azure Synapse Analytics
Lake Database  Database Template  Map Data in Azure Synapse AnalyticsLake Database  Database Template  Map Data in Azure Synapse Analytics
Lake Database Database Template Map Data in Azure Synapse Analytics
 
Data analytics presentation- Management career institute
Data analytics presentation- Management career institute Data analytics presentation- Management career institute
Data analytics presentation- Management career institute
 
Data analytics
Data analyticsData analytics
Data analytics
 
Big Data Analytics (1).ppt
Big Data Analytics (1).pptBig Data Analytics (1).ppt
Big Data Analytics (1).ppt
 
Data science applications and usecases
Data science applications and usecasesData science applications and usecases
Data science applications and usecases
 
Applied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelApplied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML model
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering Meetup
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
State of Data Governance in 2021
State of Data Governance in 2021State of Data Governance in 2021
State of Data Governance in 2021
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
 
The Modern ERP Landscape
The Modern ERP LandscapeThe Modern ERP Landscape
The Modern ERP Landscape
 
Building an integrated data strategy
Building an integrated data strategyBuilding an integrated data strategy
Building an integrated data strategy
 

Similar to Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziyev and Olha Hrytsay

Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Denodo
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dataconomy Media
 
Getting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solvesGetting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solvesDenodo
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationDenodo
 
Data APIs as a Foundation for Systems of Engagement
Data APIs as a Foundation for Systems of EngagementData APIs as a Foundation for Systems of Engagement
Data APIs as a Foundation for Systems of EngagementVictor Olex
 
Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Denodo
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptxElsonPaul2
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarioskcmallu
 
Dell Digital Transformation Through AI and Data Analytics Webinar
Dell Digital Transformation Through AI and  Data Analytics WebinarDell Digital Transformation Through AI and  Data Analytics Webinar
Dell Digital Transformation Through AI and Data Analytics WebinarBill Wong
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview Rajesh Menon
 
Qo Introduction V2
Qo Introduction V2Qo Introduction V2
Qo Introduction V2Joe_F
 
Big Data with Not Only SQL
Big Data with Not Only SQLBig Data with Not Only SQL
Big Data with Not Only SQLPhilippe Julio
 
Data Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWSData Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWSDenodo
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxAIMLSEMINARS
 
Big and fast data strategy 2017 jr
Big and fast data strategy 2017 jrBig and fast data strategy 2017 jr
Big and fast data strategy 2017 jrJonathan Raspaud
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudJames Serra
 
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIAugmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIDenodo
 
Enterprise Data Marketplace: A Centralized Portal for All Your Data Assets
Enterprise Data Marketplace: A Centralized Portal for All Your Data AssetsEnterprise Data Marketplace: A Centralized Portal for All Your Data Assets
Enterprise Data Marketplace: A Centralized Portal for All Your Data AssetsDenodo
 
AWS Webcast - Informatica - Big Data Solutions Showcase
AWS Webcast - Informatica - Big Data Solutions ShowcaseAWS Webcast - Informatica - Big Data Solutions Showcase
AWS Webcast - Informatica - Big Data Solutions ShowcaseAmazon Web Services
 

Similar to Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziyev and Olha Hrytsay (20)

big_data_case_studies.pdf
big_data_case_studies.pdfbig_data_case_studies.pdf
big_data_case_studies.pdf
 
Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
Getting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solvesGetting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solves
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow Presentation
 
Data APIs as a Foundation for Systems of Engagement
Data APIs as a Foundation for Systems of EngagementData APIs as a Foundation for Systems of Engagement
Data APIs as a Foundation for Systems of Engagement
 
Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptx
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
 
Dell Digital Transformation Through AI and Data Analytics Webinar
Dell Digital Transformation Through AI and  Data Analytics WebinarDell Digital Transformation Through AI and  Data Analytics Webinar
Dell Digital Transformation Through AI and Data Analytics Webinar
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview
 
Qo Introduction V2
Qo Introduction V2Qo Introduction V2
Qo Introduction V2
 
Big Data with Not Only SQL
Big Data with Not Only SQLBig Data with Not Only SQL
Big Data with Not Only SQL
 
Data Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWSData Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWS
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptx
 
Big and fast data strategy 2017 jr
Big and fast data strategy 2017 jrBig and fast data strategy 2017 jr
Big and fast data strategy 2017 jr
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloud
 
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIAugmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
 
Enterprise Data Marketplace: A Centralized Portal for All Your Data Assets
Enterprise Data Marketplace: A Centralized Portal for All Your Data AssetsEnterprise Data Marketplace: A Centralized Portal for All Your Data Assets
Enterprise Data Marketplace: A Centralized Portal for All Your Data Assets
 
AWS Webcast - Informatica - Big Data Solutions Showcase
AWS Webcast - Informatica - Big Data Solutions ShowcaseAWS Webcast - Informatica - Big Data Solutions Showcase
AWS Webcast - Informatica - Big Data Solutions Showcase
 

More from SoftServe

Approaching Quality in Digital Era
Approaching Quality in Digital EraApproaching Quality in Digital Era
Approaching Quality in Digital EraSoftServe
 
Digital Product Security
Digital Product SecurityDigital Product Security
Digital Product SecuritySoftServe
 
Testing Tools and Tips
Testing Tools and TipsTesting Tools and Tips
Testing Tools and TipsSoftServe
 
Android Mobile Application Testing: Human Interface Guideline, Tools
Android Mobile Application Testing: Human Interface Guideline, ToolsAndroid Mobile Application Testing: Human Interface Guideline, Tools
Android Mobile Application Testing: Human Interface Guideline, ToolsSoftServe
 
Android Mobile Application Testing: Specific Functional, Performance, Device ...
Android Mobile Application Testing: Specific Functional, Performance, Device ...Android Mobile Application Testing: Specific Functional, Performance, Device ...
Android Mobile Application Testing: Specific Functional, Performance, Device ...SoftServe
 
How to Reduce Time to Market Using Microsoft DevOps Solutions
How to Reduce Time to Market Using Microsoft DevOps SolutionsHow to Reduce Time to Market Using Microsoft DevOps Solutions
How to Reduce Time to Market Using Microsoft DevOps SolutionsSoftServe
 
Containerization: The DevOps Revolution
Containerization: The DevOps Revolution Containerization: The DevOps Revolution
Containerization: The DevOps Revolution SoftServe
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist SoftServe
 
Rapid Prototyping for Big Data with AWS
Rapid Prototyping for Big Data with AWS Rapid Prototyping for Big Data with AWS
Rapid Prototyping for Big Data with AWS SoftServe
 
Implementing Test Automation: What a Manager Should Know
Implementing Test Automation: What a Manager Should KnowImplementing Test Automation: What a Manager Should Know
Implementing Test Automation: What a Manager Should KnowSoftServe
 
Using AWS Lambda for Infrastructure Automation and Beyond
Using AWS Lambda for Infrastructure Automation and BeyondUsing AWS Lambda for Infrastructure Automation and Beyond
Using AWS Lambda for Infrastructure Automation and BeyondSoftServe
 
Advanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseAdvanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseSoftServe
 
Agile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachAgile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachSoftServe
 
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for InnovationBig Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for InnovationSoftServe
 
Personalized Medicine in a Contemporary World by Eugene Borukhovich, SVP Heal...
Personalized Medicine in a Contemporary World by Eugene Borukhovich, SVP Heal...Personalized Medicine in a Contemporary World by Eugene Borukhovich, SVP Heal...
Personalized Medicine in a Contemporary World by Eugene Borukhovich, SVP Heal...SoftServe
 
Health 2.0 WinterTech: Will Artificial Intelligence change healthcare? by Eug...
Health 2.0 WinterTech: Will Artificial Intelligence change healthcare? by Eug...Health 2.0 WinterTech: Will Artificial Intelligence change healthcare? by Eug...
Health 2.0 WinterTech: Will Artificial Intelligence change healthcare? by Eug...SoftServe
 
Managing Requirements with Word and TFS by Max Markov
Managing Requirements with Word and TFS by Max MarkovManaging Requirements with Word and TFS by Max Markov
Managing Requirements with Word and TFS by Max MarkovSoftServe
 
How to Implement Hybrid Cloud Solutions Successfully
How to Implement Hybrid Cloud Solutions SuccessfullyHow to Implement Hybrid Cloud Solutions Successfully
How to Implement Hybrid Cloud Solutions SuccessfullySoftServe
 
Designing Big Data Systems Like a Pro
Designing Big Data Systems Like a ProDesigning Big Data Systems Like a Pro
Designing Big Data Systems Like a ProSoftServe
 
Product Management in Outsourcing by Roman Kolodchak and Roman Pavlyuk
Product Management in Outsourcing by Roman Kolodchak and Roman PavlyukProduct Management in Outsourcing by Roman Kolodchak and Roman Pavlyuk
Product Management in Outsourcing by Roman Kolodchak and Roman PavlyukSoftServe
 

More from SoftServe (20)

Approaching Quality in Digital Era
Approaching Quality in Digital EraApproaching Quality in Digital Era
Approaching Quality in Digital Era
 
Digital Product Security
Digital Product SecurityDigital Product Security
Digital Product Security
 
Testing Tools and Tips
Testing Tools and TipsTesting Tools and Tips
Testing Tools and Tips
 
Android Mobile Application Testing: Human Interface Guideline, Tools
Android Mobile Application Testing: Human Interface Guideline, ToolsAndroid Mobile Application Testing: Human Interface Guideline, Tools
Android Mobile Application Testing: Human Interface Guideline, Tools
 
Android Mobile Application Testing: Specific Functional, Performance, Device ...
Android Mobile Application Testing: Specific Functional, Performance, Device ...Android Mobile Application Testing: Specific Functional, Performance, Device ...
Android Mobile Application Testing: Specific Functional, Performance, Device ...
 
How to Reduce Time to Market Using Microsoft DevOps Solutions
How to Reduce Time to Market Using Microsoft DevOps SolutionsHow to Reduce Time to Market Using Microsoft DevOps Solutions
How to Reduce Time to Market Using Microsoft DevOps Solutions
 
Containerization: The DevOps Revolution
Containerization: The DevOps Revolution Containerization: The DevOps Revolution
Containerization: The DevOps Revolution
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist
 
Rapid Prototyping for Big Data with AWS
Rapid Prototyping for Big Data with AWS Rapid Prototyping for Big Data with AWS
Rapid Prototyping for Big Data with AWS
 
Implementing Test Automation: What a Manager Should Know
Implementing Test Automation: What a Manager Should KnowImplementing Test Automation: What a Manager Should Know
Implementing Test Automation: What a Manager Should Know
 
Using AWS Lambda for Infrastructure Automation and Beyond
Using AWS Lambda for Infrastructure Automation and BeyondUsing AWS Lambda for Infrastructure Automation and Beyond
Using AWS Lambda for Infrastructure Automation and Beyond
 
Advanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseAdvanced Analytics and Data Science Expertise
Advanced Analytics and Data Science Expertise
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziyev and Olha Hrytsay

  • 1. BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES BY SERHIY HAZIYEV AND OLHA HRYTSAY
  • 3. Big Data Challenges. Chart labels: Unstructured, Structured; High, Medium, Low. Data sources: Archives, Docs, Business Apps, Media, Social Networks, Public Web, Data Storages, Machine Log Data, Sensor Data. Data Storages: RDBMS, NoSQL, Hadoop, file systems, etc. Machine Log Data: application logs, event logs, server data, CDRs, clickstream data, etc. Sensor Data: smart electric meters, medical devices, car sensors, road cameras, etc. Archives: scanned documents, statements, medical records, e-mails, etc. Docs: XLS, PDF, CSV, HTML, JSON, etc. Business Apps: CRM, ERP systems, HR, project management, etc. Social Networks: Twitter, Facebook, Google+, LinkedIn, etc. Public Web: Wikipedia, news, weather, public finance, etc. Media: images, video, audio, etc. Dimensions: Velocity, Variety, Volume, Complexity.
  • 4. Big Data Analytics. Traditional Analytics (BI) vs. Big Data Analytics. Focus on: descriptive and diagnostic analytics vs. predictive analytics and data science. Data sets: limited data sets, cleansed data, simple models vs. large-scale data sets, more types of data, raw data, complex data models. Supports: causation (what happened, and why?) vs. correlation (new insight, more accurate answers).
  • 5. Big Data Analytics Use Cases. Data Discovery (Data Scientists/Analysts): Volume, Performance. Business Reporting (Business Users): Data Quality, Self Service. Real Time Intelligence (Intelligent Agents, Consumers): Low Latency, Reliability.
  • 6. Big Data Analytics Reference Architectures. Architecture Drivers: Volume, Sources, Throughput, Latency, Extensibility, Data Quality, Reliability, Security, Self-Service, Cost. Reference Architectures: Extended Relational, Non-Relational, Hybrid.
  • 7. Relational Reference Architecture. Components: Web Services, Mobile Devices, Native Desktop, Web Browsers, Advanced Analytics, OLAP Cubes, Query & Reporting, Operational Data Stores, Data Marts, Data Warehouses, Replication, API/ODBC, Messaging, ETL. Data sources: Structured, Semi-Structured, Unstructured. Layers: Data Sources, Integration, Data Storages, Analytics, Presentation.
  • 8. Extended Relational Reference Architecture. Components: Web Services, Mobile Devices, Native Desktop, Web Browsers, Advanced Analytics, OLAP Cubes, Query & Reporting, Operational Data Stores, Data Marts, Data Warehouses, Replication, API/ODBC, Messaging, ETL. Data sources: Structured, Semi-Structured, Unstructured. Layers: Data Sources, Integration, Data Storages, Analytics, Presentation. Legend: key components affected by Big Data challenges.
  • 9. Non-Relational Reference Architecture. Components: Web Services, Mobile Devices, Native Desktop, Web Browsers, Advanced Analytics, Map Reduce, Query & Reporting, Search Engines, Distributed File Systems, NoSQL Databases, API, Messaging, ETL. Data sources: Structured, Semi-Structured, Unstructured. Layers: Data Sources, Integration, Data Storages, Analytics, Presentation. Legend: key components introduced with the non-relational movement.
  • 10. Extended Relational vs. Non-Relational Architecture. Architecture drivers compared for Extended Relational and Non-Relational: large data volume; self-service (ad-hoc reporting); unstructured data processing; high data model extensibility; high data quality and consistency; extensive security; reliability and fault-tolerance; low latency (near-real time); low cost; skills availability.
  • 12. Relational vs. Non-Relational Architecture. Relational: rational, predictable, traditional. Non-Relational: agile, flexible, modern.
  • 13. Big Data Analytics Use Cases: Data Discovery, Business Reporting, Real Time Intelligence. Users: Data Scientists, Business Users, Intelligent Agents, Consumers. Drivers shown: Performance, Volume.
  • 14. Data Discovery: Non-Relational Architecture. Components: Web Services, Mobile Devices, Native Desktop, Web Browsers, Advanced Analytics, Map Reduce, Query & Reporting, Search Engines, Distributed File Systems, NoSQL Databases, API, Messaging, ETL. Data sources: Structured, Semi-Structured, Unstructured. Layers: Data Sources, Integration, Data Storages, Analytics, Presentation.
  • 15. Big Data Analytics Use Cases: Data Discovery, Business Reporting, Real Time Intelligence. Users: Data Scientists, Business Users, Intelligent Agents, Consumers. Drivers shown: Data Quality, Self Service.
  • 16. Business Reporting: Hybrid Architecture. Components: Web Services, Mobile Devices, Native Desktop, Web Browsers, Map Reduce, SQL, Query & Reporting, Distributed File Systems, API, Messaging, ETL, Relational DWH/DM, Advanced Analytics, Search Engines. Data sources: Structured, Semi-Structured, Unstructured. Layers: Data Sources, Integration, Data Storages, Analytics, Presentation. Legend: Extended Relational components; Non-Relational components.
  • 17. Big Data Analytics Use Cases: Data Discovery, Business Reporting, Real Time Intelligence. Users: Data Scientists, Business Users, Intelligent Agents, Consumers. Drivers shown: Low Latency, Reliability.
  • 19. Case Study #1: Usage & Billing Analysis. Business Area: cloud-based platform for building, deploying, hosting and managing mobile applications. Business Goals: provide a development environment for building custom mobile applications; charge customers for the platform they use with a pay-as-you-go model.
  • 20. Architectural Decisions. Architecture Drivers: Volume (> 10 TB); Sources (semi-structured, JSON); Throughput (> 10K/sec); Latency (2 min); Extensibility (custom metrics); Data Quality (consistency); Reliability (24/7); Security (multitenancy); Self-Service (ad-hoc reports); Cost (the less the better); Constraints (public cloud). Trade-off: Extensibility: Extended Relational -, Non-Relational +; Data Quality: Extended Relational +, Non-Relational -; Self-Service: Extended Relational +, Non-Relational -. Decisions: Extended Relational Architecture; extensibility via Pre-allocated Fields pattern.
  • 21. Solution Architecture. Technologies: Amazon Redshift, Amazon SQS, Amazon S3, Elastic Beanstalk, Jaspersoft BI Professional, Python.
  • 22. Case Study #2: Clickstream for retail website. Business Area: retail; a platform for e-commerce and for collecting feedback from customers. Business Goals: build an in-house analytics platform for ROI measurement and performance analysis of every product and feature delivered by the e-commerce platform; provide the ability to understand how end users interact with service content, products, and features on sites; do clickstream analysis; perform A/B testing.
  • 23. Architectural Decisions. Architecture Drivers: Volume (45 TB); Sources (semi-structured, JSON); Throughput (> 20K/sec); Latency (1 hour); Extensibility (custom tags); Data Quality (not critical); Reliability (24/7); Security (multitenancy); Self-Service (canned reports, data science); Cost (the less the better); Constraints (public cloud). Trade-off: Volume/Scalability: Extended Relational +/-, Non-Relational +; Throughput: Extended Relational +, Non-Relational +; Self-Service: Extended Relational +, Non-Relational +/-; Extensibility: Extended Relational -, Non-Relational +. Decisions: Non-Relational Architecture; reporting via Materialized View pattern.
  • 24. Solution Architecture. Technologies: Amazon S3, Flume, Hadoop/HDFS, MapReduce, HBase, Oozie, Hive. Diagram labels: Node 1, Node 2, Node N.
  • 25. 10 Tips for Designing Big Data Solutions. (1) Understand data users and sources. (2) Discover architecture drivers. (3) Select a proper reference architecture. (4) Do trade-off analysis, address cons. (5) Map the reference architecture to a technology stack. (6) Prototype, re-evaluate the architecture. (7) Estimate implementation efforts. (8) Set up DevOps practices from the very beginning. (9) Advance in solution development through "small wins". (10) Be ready for changes; big data technologies are evolving rapidly.
  • 26. SoftServe: leading global Product and Application Development partner founded in 1993; 3,300+ employees across North America, Ukraine and Western Europe; thousands of successful outsourcing projects. Services: SaaS/Cloud Solutions, Mobility Solutions, UX/UI, BI/Analytics/Big Data, Software Architecture, Security.
  • 27. Thank You! SoftServe US Office: One Congress Plaza, 111 Congress Avenue, Suite 2700, Austin, TX 78701. Tel: 512.516.8880. Contacts: Serhiy Haziyev, shaziyev@softserveinc.com; Olha Hrytsay, ohrytsay@softserveinc.com.
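
The trade-off tables on the two Architectural Decisions slides (20 and 23) can be read as a simple scoring exercise. The sketch below is only an illustration of that reading, not the authors' method: the mapping of "+", "+/-" and "-" to 1, 0 and -1, the equal default weights, and the names `score` and `recommend` are all assumptions made for this example. Only the ratings themselves come from the slides.

```python
# Assumed numeric mapping for the slide ratings (not given in the deck).
SCORES = {"+": 1, "+/-": 0, "-": -1}

# Ratings copied from slide 20: driver -> (Extended Relational, Non-Relational).
CASE_1 = {
    "Extensibility": ("-", "+"),
    "Data Quality": ("+", "-"),
    "Self-Service": ("+", "-"),
}

# Ratings copied from slide 23.
CASE_2 = {
    "Volume/Scalability": ("+/-", "+"),
    "Throughput": ("+", "+"),
    "Self-Service": ("+", "+/-"),
    "Extensibility": ("-", "+"),
}


def score(tradeoffs, weights=None):
    """Return (extended_relational, non_relational) weighted totals."""
    weights = weights or {}
    relational = 0
    non_relational = 0
    for driver, (rel, non_rel) in tradeoffs.items():
        weight = weights.get(driver, 1)  # equal weights unless overridden
        relational += weight * SCORES[rel]
        non_relational += weight * SCORES[non_rel]
    return relational, non_relational


def recommend(tradeoffs, weights=None):
    """Name the architecture with the higher total, or report a tie."""
    relational, non_relational = score(tradeoffs, weights)
    if relational == non_relational:
        return "Tie"
    if relational > non_relational:
        return "Extended Relational"
    return "Non-Relational"


if __name__ == "__main__":
    print(recommend(CASE_1))  # Extended Relational
    print(recommend(CASE_2))  # Non-Relational
```

With equal weights the totals agree with the decisions stated on the slides. In practice the weights would come from the architecture drivers, for example a heavier weight on Data Quality for billing, and the losing column shows which cons still need to be addressed.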

Editor's Notes

  1. Big Data – data that is too large, complex, and dynamic for any conventional data tools to capture, store, manage, and analyze.
  2. More details can be found here: link to our case study
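
Case Study #1 names "Extensibility via Pre-allocated Fields pattern" as the way to support custom metrics in a relational store. The deck does not show the schema, so the following is a minimal sketch of one common reading of that pattern, using SQLite in place of Amazon Redshift so it runs anywhere. The table names, the three `metric_N` columns, and every function name are hypothetical.

```python
import sqlite3

# Hypothetical pre-allocated columns; a real schema would reserve more.
SLOTS = ("metric_1", "metric_2", "metric_3")


def create_schema(conn):
    # Fact table with generic spare columns reserved up front.
    conn.execute(
        "CREATE TABLE usage_events ("
        "tenant_id TEXT, metric_1 REAL, metric_2 REAL, metric_3 REAL)"
    )
    # Per-tenant mapping from a custom metric name to one spare column.
    conn.execute(
        "CREATE TABLE metric_map ("
        "tenant_id TEXT, metric_name TEXT, slot TEXT, "
        "PRIMARY KEY (tenant_id, metric_name))"
    )


def register_metric(conn, tenant_id, metric_name):
    """Assign the tenant's next free pre-allocated column to a metric."""
    rows = conn.execute(
        "SELECT slot FROM metric_map WHERE tenant_id = ?", (tenant_id,)
    )
    used = {row[0] for row in rows}
    free = [slot for slot in SLOTS if slot not in used]
    if not free:
        raise ValueError("no free pre-allocated field for this tenant")
    conn.execute(
        "INSERT INTO metric_map VALUES (?, ?, ?)",
        (tenant_id, metric_name, free[0]),
    )
    return free[0]


def slot_for(conn, tenant_id, metric_name):
    """Look up the column for a metric; only whitelisted names are returned."""
    row = conn.execute(
        "SELECT slot FROM metric_map WHERE tenant_id = ? AND metric_name = ?",
        (tenant_id, metric_name),
    ).fetchone()
    if row is None or row[0] not in SLOTS:
        raise KeyError(metric_name)
    return row[0]


def record(conn, tenant_id, metric_name, value):
    slot = slot_for(conn, tenant_id, metric_name)
    # The column name comes from the SLOTS whitelist, never from user input.
    conn.execute(
        f"INSERT INTO usage_events (tenant_id, {slot}) VALUES (?, ?)",
        (tenant_id, value),
    )


def total(conn, tenant_id, metric_name):
    slot = slot_for(conn, tenant_id, metric_name)
    row = conn.execute(
        f"SELECT COALESCE(SUM({slot}), 0) FROM usage_events WHERE tenant_id = ?",
        (tenant_id,),
    ).fetchone()
    return row[0]
```

The design choice this illustrates is that adding a metric touches only the mapping table, so no schema change is needed at run time; the cost is a fixed upper bound on custom metrics per tenant and sparse columns for tenants that use few of them.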