© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Dave Vennergrund, CSRA Director Data and Analytics
June 13, 2017
AWS Big Data and Analytics Services
Speed Innovation
How CSRA helps agencies rapidly find mission insights
Challenges
• Data-driven agencies face data
integration challenges caused
by stove-piped data
• Data of all types is growing
at exponential rates
• Staff available to
analyze data is
not growing
Overview
Our AWS-Based Solution
• A secure cloud platform to
accelerate time to analysis and
innovation
• Data integration services to
rapidly ingest, store and
integrate data for deeper
analytics
• Big data analytics services to
optimize operations and create
valuable mission impacts
AWS-based Data Analytic Platform
Secure Platform Data Integration Data Analytics
Enables a smaller number of staff to meet mission demands
by smartly leveraging rapidly growing data
Elastic, Services-Based
Architecture
Secure platform to jumpstart
data analytics
Start with a secure, extensible platform
AWS Security
Services integrated
with Security and
Management
Framework
FAA Cloud Services (FCS)
Platform Features / Approach
Mission – provide a
Government Cloud for FAA
applications, data, and
analytics work streams
• FedRAMP-certified
Enterprise integration with
FAA services
• Extensible, agile
infrastructure and services
• Security-as-a-Service
model
BENEFITS: Certified, agile, fully managed, secure cloud platform
FCS Platform accelerates ATO for new workloads
• Baseline reference
architecture
• CONOPS for IaaS
• Inheritance of security
boundaries and controls
• SOC integration
• Procurement acquisition
guidelines
• Reduced review cycles for
compliance
FAA Cloud Services Impacts
Key Lessons Learned
• Adapt Contracts to enable cloud
acquisition
• Be Agile - use an iterative
approach
• Align Security Engineering with
Compliance
Key Benefits
• Established a Secure Platform for
community reuse
• New work streams inherit secure
foundation – can be accredited
much faster
• Security-as-a-Service lowers costs
and increases repeatability
Data Integration
at Scale
Tear Down the Walls
AWS IaaS to host
Open-Source
Solutions for Data
Integration and
Analysis
Mission – provide a
data integration
platform for FAA data
analytics
• FCS Secure Platform
• AWS IaaS
• Open Source Tools
• Big Data Tools
• Streaming & Batch
• Land, Ingest, Enrich,
Store, and Serve
Data
BENEFITS: Secure, highly available with little O&M cost, scalable to
petabytes and exabytes, deeper analytics in weeks
FAA Enterprise Information Management
Analytic Features / Approach
Challenge – Stove-Piped Data Slows Analysis
[Figure panels: Loss of Separation, Flight Tracks, Weather Data]
Determining the root cause of a "Loss of Separation" (LOS) event requires information from
numerous source systems. Collecting and integrating data for analysis is time-consuming.
Many Data Sources impact LOS Events
• Cockpit Recording
• Cloud Tops
• Flight Plans
• Air Traffic Controllers
• Aircraft Sensors
• Maintenance Logs
Data Mall and App Mart
EIM Architecture – Open Source on AWS IaaS
[Architecture diagram; labels recovered from the figure]
• Layers: Ingest, Unified Data Layer, Consumer Access, Visualization
• Logical groupings: Pipeline Management, Data Processing, Data Transformation, Large Scale Data Storage, Deep Analytics and Data Exploration, Small Data, Medium Data, Big Data, In-memory Data, Analytics
• Example features/functions: routing, mediation; normalizations, enrichments, analytics; RDBMS; high-writes; free-text searches; high-speed caching; data science
• Example technologies: Apache NiFi, Apache Storm, Apache Spark, MongoDB, PostgreSQL, Hortonworks, Elasticsearch, Pandas, Redis
• Consumer applications: Reporting, Dashboards, Web Apps
Result - Enhanced Flight Analysis
FAA EIM Impacts
Key Lessons Learned
• Stove-piped data can be
integrated and accessed faster
• Enriched data frees up time for
deeper analytics
• Data treatments lower costs
(HDFS vs RDBMS)
• Infrastructure as a Service
reduces Capital Expenses and
O&M costs
Key Benefits
• Overcame data deluge in unified
platform
• Re-host, Re-point Apps to one
data platform in weeks
• Derive valuable, new Insights with
Analytics in weeks
Data Analytics to optimize and
improve missions
Reduce Time to Insights
AWS Data and
Analytics Services
(PaaS)
Missions - improve water
safety; verify fuel
economy
• Secure Platform
• AWS Cloud
Infrastructure (IaaS)
• Storage
• Databases
• Serverless Query
• Analytics
• Visualization
BENEFITS: Wide array of data analytics services to create mission impact in
weeks; lower costs, better enforcement, improved health and safety
Environmental Analytics
Analytic Features / Approach
AWS Data Integration and Analytics Services
[Service diagram; panels: Collect, Store, Analyze, Move]
• Services shown: Amazon Glacier, Amazon S3, Amazon DynamoDB, Amazon RDS, Amazon Aurora, Amazon EMR, Amazon EC2, Amazon Redshift, Amazon Elasticsearch Service, Amazon Kinesis Firehose, Amazon Kinesis Streams, Amazon Kinesis Analytics, AWS Direct Connect, AWS Snowball, AWS Data Pipeline, AWS DMS, Amazon Machine Learning, Amazon QuickSight, Amazon CloudSearch, Amazon Athena
Challenge #1: Prevent Coliform Contamination
Public Water Systems (PWS) are monitored for a wide array of health-impacting contaminants.
Coliform bacteria treatments are postulated to be overwhelmed by precipitation events.
Approach – Integrate and Prepare Data
Hypothesis
Can we accurately
predict the risk of a
health-impact coliform
violation for public
water systems based
on known violations
combined with
weather data?
Violation Data
• EPA SDWIS
• Health impacts
• Coliform violation
Weather Data
• NOAA Quality
Controlled Local
Climatological Data
Transformations
• Remove PWS with
no violations
• Standardize
location
• Join by time and
nearest weather
station location
• Store in
Amazon S3
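The transformation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the deck: the record layouts (`pws_id`, `station_id`, `lat`, `lon`, `precip_mm`), the sample coordinates, and the helper names are all hypothetical, locations are assumed to be already standardized to decimal degrees, and the final write to Amazon S3 is omitted.

```python
import math
from datetime import date


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_station(system, stations):
    """Return the weather station closest to a public water system."""
    return min(
        stations,
        key=lambda s: haversine_km(system["lat"], system["lon"], s["lat"], s["lon"]),
    )


def join_violations_with_weather(systems, violations, stations, weather):
    """Join each violation to same-day weather at the nearest station.

    Iterating over violations drops systems that have none, which
    mirrors the "remove PWS with no violations" step.
    """
    systems_by_id = {s["pws_id"]: s for s in systems}
    rows = []
    for v in violations:
        system = systems_by_id.get(v["pws_id"])
        if system is None:
            continue  # violation refers to an unknown system
        station = nearest_station(system, stations)
        rows.append({
            "pws_id": v["pws_id"],
            "date": v["date"],
            "station_id": station["station_id"],
            # None when no observation exists for that station and day
            "precip_mm": weather.get((station["station_id"], v["date"])),
        })
    return rows


# Hypothetical sample data, for illustration only.
systems = [
    {"pws_id": "A", "lat": 38.90, "lon": -77.00},
    {"pws_id": "B", "lat": 40.70, "lon": -74.00},  # no violations, so dropped
]
stations = [
    {"station_id": "S1", "lat": 38.85, "lon": -77.04},
    {"station_id": "S2", "lat": 40.78, "lon": -73.97},
]
violations = [{"pws_id": "A", "date": date(2017, 5, 1)}]
weather = {("S1", date(2017, 5, 1)): 12.5}

rows = join_violations_with_weather(systems, violations, stations, weather)
```

A linear scan over stations is enough for a sketch; with many stations a spatial index would be the more usual choice.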
Athena Serverless Data Query
Approach – Explore Data for Discovery
QuickSight Data Visualization
Approach – Model with Amazon Machine Learning
Best Model
• 80% Precision; 61% Recall
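For readers less familiar with these metrics: precision is the share of predicted violations that were real, and recall is the share of real violations the model caught. The sketch below shows the arithmetic; the confusion-matrix counts are hypothetical values chosen only to land near the reported figures and do not come from the deck.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Compute precision and recall from confusion-matrix counts."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Hypothetical counts: 61 violations caught, 15 false alarms, 39 missed.
precision, recall = precision_recall(true_pos=61, false_pos=15, false_neg=39)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints "precision=0.80 recall=0.61"
```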
Findings – Utility of Model
• Weather impacts certain but
not all PWS
• Proactive water treatment in
face of precipitation
• Prioritize improvements
Allows business analysts, citizen scientists, and data scientists alike
to build and deploy predictive models with a simple process
Challenge #2: Identify fuel economy label errors
Fuel economy estimates are useful tools for consumers and regulators alike. Consumers
use MPG as a selection criterion. Manufacturer fleet averages must meet targets.
Approach – Prepare Data and Model
Hypothesis
Given examples of
"re-labeled" fuel
economy metrics, can
we develop a model
to locate other
potential revisions?
Attributes Used
• Horsepower
• Weight
• Adjusted City MPG
• Transmission Type
• Transmission Gears
• Cylinders
• Valves
• Labeling Approach
• Re-labeled Flag
Models
• 16 models
• Algorithms:
• Logistic Regression
• Support Vector
Machine
• Neural Net
• Conditional Tree
• Recursive Partition
Tree
AWS Data Science Linux AMI - RStudio
Best Model
• 97% Precision; 91% Recall
Findings – Utility of Model
• "Re-labeled" fuel economy
ratings can be detected
• Model may be applied to
other car types to detect
label errors
• Prioritize review
• Lower costs
Environmental Proofs – Impacts to Date
Key Lessons Learned
• We showed that agencies can
improve monitoring, compliance,
and safety with data currently
collected
• AWS advanced analytics services
create useful data science
solutions in hours
• Still a need for some IaaS
• Predictive power of data increases
with data from related agencies
Key Benefits
• Platform expedites new analytics
• Enables agency scientists,
business analysts, and citizen
scientists alike to discover new
relationships in public data
CSRA AWS Data Analytics Platform
• Secure: FISMA, FedRAMP, ATO
• Integration: Treatments, Persistent, Ephemeral
• Analytics: Predictive, Streaming, ML
• PaaS: Elastic, Managed, Serverless
The future is already here,
it’s just not evenly distributed
- William Gibson ca. 1999
CSRA - Think Next. Now.
• We deliver a broad range of innovative,
next-generation IT solutions and
professional services - Bringing tomorrow’s
solutions, today.
• We meet our clients on their journey to the
cloud, to manage, analyze, optimize and
innovate
• We help customers modernize, protect their
networks, and improve effectiveness of
mission-critical functions for our warfighters
and citizens
Questions?
Thank you!
Please visit us at the CSRA Booth 528

AWS Big Data and Analytics Services Speed Innovation | AWS Public Sector Summit 2017

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Dave Vennergrund, CSRA Director Data and Analytics June 13, 2017 AWS Big Data and Analytics Services Speed Innovation How CSRA helps agencies rapidly find mission insights
  • 2. Challenges • Data-driven agencies face data integration challenges caused by stove-piped data • Data is growing at exponential rates – of all types • Staff available to analyze data is not growing Overview Our AWS-Based Solution • A secure cloud platform to accelerate time to analysis and innovation • Data integration services to rapidly ingest, store and integrate data for deeper analytics • Big data analytics services to optimize operations and create valuable mission impacts
  • 3. AWS-based Data Analytic Platform Secure Platform Data Integration Data Analytics Enables a smaller number of staff to meet mission demands by smartly leveraging rapidly growing data Elastic, Services-Based Architecture
  • 4. Secure platform to jumpstart data analytics
  • 5. Start with a secure, extensible platform AWS Security Services integrated with Security and Management Framework FAA Cloud Services (FCS) Platform FeaturesApproach Mission – provide a Government Cloud for FAA applications, data, and analytics work streams • FedRAMP-certified Enterprise integration with FAA services • Extensible, agile infrastructure and services • Security-as-a-Service model BENEFITS Certified, agile, fully managed secure cloud platform
  • 6. FSC Platform accelerates ATO for new workloads • Baseline reference architecture • CONOPS for IaaS • Inheritance of security boundaries and controls • SOC integration • Procurement acquisition guidelines • Reduced review cycles for compliance
  • 7. FAA Cloud Services Impacts Key Lessons Learned • Adapt Contracts to enable cloud acquisition • Be Agile - use an iterative approach • Aligns Security Engineering with Compliance Key Benefits • Established a Secure Platform for community reuse • New work streams inherit secure foundation – can be accredited much faster • Security-as-a-Service lowers costs and increases repeatability
  • 9. Tear Down the Walls AWS IaaS to host Open-Source Solutions for Data Integration and Analysis Mission – provide a data integration platform for FAA data analytics • FCS Secure Platform • AWS IaaS • Open Source Tools • Big Data Tools • Streaming & Batch • Land, Ingest, Enrich, Store, and Serve Data BENEFITS Secure, high availability with little O&M cost, scalable to Peta- and Exa-bytes, deeper analytics in weeks FAA Enterprise Information Management Analytic Features Approach
  • 10. Challenge – Stove-Piped Data Slows Analysis Loss of Separation Flight Tracks Weather Data Determination of root cause for a “Loss of Separation” event requires information from numerous source systems. Collecting and integrating data for analysis is time-consuming.
  • 11. Many Data Sources impact LOS Events Cockpit Recording Cloud Tops Flight Plans Air Traffic Controllers Aircraft Sensors Maintenance Logs
  • 12. Data Mall and App Mart
  • 13. Big Data Medium Data High-writes Freetext searches In-memory Data High-speed caching Analytics Data science Data EIM Architecture – Open Source on AWS IaaS VISUALIZATION Applications Pipeline Management Routing, mediation INGEST Apache NiFi Data Processing Normalizations, enrichments analytics Apache Storm Apache Spark Small Data RDBMS MongoDB PostgreSQL Reporting Dashboards Web Apps HortonWorks Elasticsearch Pandas Redis CONSUMER ACCESS UNIFIED DATA LAYER Data Transformation Deep Analytics and Data Exploration Large Scale Data Storage LEGEND Logical Grouping Example feature/function Example Technology
  • 14. Result - Enhanced Flight Analysis
  • 15. FAA EIM Impacts Key Lessons Learned • Stove-piped data can be integrated and accessed faster • Enriched data frees up time for deeper analytics • Data treatments lower costs (HDFS vs RDBMS) • Infrastructure as a Service reduces Capital Expenses and O&M costs Key Benefits • Overcame data deluge in unified platform • Re-host, Re-point Apps to one data platform in weeks • Derive valuable, new Insights with Analytics in weeks
  • 16. Data Analytics to optimize and improve missions
  • 17. Reduce Time to Insights AWS Data and Analytics Services (PaaS) Missions - improve water safety; verify fuel economy • Secure Platform • AWS Cloud Infrastructure (IaaS) • Storage • Databases • Serverless Query • Analytics • Visualization Wide array of data analytics services to create mission impact in weeks; lower costs, better enforcement, improved health safety BENEFITS Environmental Analytics Analytic Features Approach
  • 18. Analyze Store Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora AWS Data Integration and Analytics Services Amazon EMR Amazon EC2 Amazon Redshift Amazon Elasticsearch Service Amazon Kinesis Firehose AWS Direct Connect Collect Amazon Kinesis Streams Amazon Machine Learning Amazon Kinesis Analytics AWS Snowball Amazon QuickSight AWS Data Pipeline AWS DMS Amazon CloudSearch Move Amazon Athena
  • 19. Challenge #1: Prevent Coliform Contamination Public Water Systems are monitored for a wide array of health-impacting contaminants. Coliform bacteria treatments are postulated to be overwhelmed by precipitation events.
  • 20. Approach – Integrate and Prepare Data Hypothesis Can we accurately predict the risk of a health-impact coliform violation for public water systems based on known violations combined with weather data? Violation Data • EPA SDWIS • Health impacts • Coliform violation Weather Data • NOAA Quality Controlled Local Climatological Data Transformations • Remove PWS with no violations • Standardize location • Join by time and nearest weather station location • Store in Amazon S3
  • 21. Athena Serverless Data Query Approach – Explore Data for Discovery QuickSight Data Visualization
  • 22. Approach – Model with Amazon Machine Learning Best Model • 80% Precision; 61% Recall Findings – Utility of Model • Weather impacts certain but not all PWS • Proactive water treatment in face of precipitation • Prioritize improvements Allows business analysts, citizen scientists, and data scientists alike to build and deploy predictive models with a simple process
  • 23. Challenge #2: Identify fuel economy label errors Fuel Economy estimates are useful tools for consumers and regulators alike. Consumers use MPG as a selection criterion. Manufacturer Fleet averages must meet targets.
  • 24. Approach – Prepare Data and Model Hypothesis Given examples of “re-labeled” fuel economy metrics, can we develop a model to locate other potential revisions? Attributes Used • Horsepower • Weight • Adjusted City MPG • Transmission Type • Transmission Gears • Cylinders • Valves • Labeling Approach • Re-labeled Flag Models • 16 Models Algorithms • Logistic Regression • Support Vector Machine • Neural Net • Conditional Tree • Recursive Partition Tree
  • 25. AWS Data Science Linux AMI - RStudio Best Model • 97% Precision; 91% Recall Findings – Utility of Model • “re-labeled” fuel economy ratings can be detected • Model may be applied to other car types to detect label errors • Prioritize review • Lower costs
  • 26. Environmental Proofs – Impacts to Date Key Lessons Learned • We showed that agencies can improve monitoring, compliance, and safety with data currently collected • AWS advanced analytics services create useful data science solutions in hours • Still a need for some IaaS • Predictive power of data increases with data from related agencies Key Benefits • Platform expedites new analytics • Enables agency scientists, business analysts, and citizen scientists alike to discover new relationships in public data
  • 27. CSRA AWS Data Analytics Platform Secure FISMA FedRAMP ATO Integration Treatments Persistent Ephemeral Analytics Predictive Streaming ML PaaS Elastic Managed Serverless
  • 28. The future is already here, it’s just not evenly distributed - William Gibson ca. 1999
  • 29. CSRA - Think Next. Now. • We deliver a broad range of innovative, next-generation IT solutions and professional services - Bringing tomorrow’s solutions, today. • We meet our clients on their journey to the cloud, to manage, analyze, optimize and innovate • We help customers modernize, protect their networks, and improve effectiveness of mission-critical functions for our warfighters and citizens
  • 31. Thank you! Please visit us at the CSRA Booth 528
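
The transformation step on slide 20 ("Join by time and nearest weather station location") could be sketched roughly as below. This is only an illustration, not the presenters' implementation: the field names (`pws_id`, `station_id`, `precip_in`), the use of haversine distance, and the same-day match are all assumptions made for the example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def nearest_station(pws, stations):
    """Return the station record closest to a public water system (PWS)."""
    return min(
        stations,
        key=lambda s: haversine_km(pws["lat"], pws["lon"], s["lat"], s["lon"]),
    )


def join_violations_with_weather(violations, systems, stations, weather):
    """Attach same-day weather from the nearest station to each violation.

    violations: list of {"pws_id", "date"} records (hypothetical layout)
    systems:    dict pws_id -> {"lat", "lon"}
    stations:   list of {"station_id", "lat", "lon"}
    weather:    dict (station_id, date) -> {"precip_in"}
    """
    joined = []
    for v in violations:
        station = nearest_station(systems[v["pws_id"]], stations)
        obs = weather.get((station["station_id"], v["date"]))
        joined.append({
            **v,
            "station_id": station["station_id"],
            # None marks a violation with no matching observation that day
            "precip_in": obs["precip_in"] if obs else None,
        })
    return joined
```

A linear scan over stations is adequate for a sketch; a real pipeline over all NOAA stations would more likely use a spatial index, and the joined rows would then be written to Amazon S3 as the slide describes.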

Editor's notes

  1. When is the presentation? 13th?
  2. AWS Security Services AWS CloudTrail security groups IAM Users and Groups Shared Responsibility Model Managed by AWS - AWS IAM - foundation services and AWS global infrastructure Managed by CSRA – Customer IAM – Data; Platform & Application Management; OS, Network & Firewall Configuration; Encryption (Client-side, Server-side), Network Traffic Protection Architecture Elements Direct Connect (DX) connection between our primary and secondary colocation facilities. Three VPCs (dev, test, production) hosting workloads. Each leverages the same DX, and traffic is separated using VLANs, as is typical of every DX connection. The FAA FTI network, which connects all FAA sites, is also connected to CSRA routers in each colocation facility. The Management and Control tools are in a network in the colocation facilities. COTS Bit9, Carbon Black, Splunk, RSA Archer, ArcSight
  3. Use an Iterative Approach. “Security is a process” – Bruce Schneier. Start security work early and iterate. Security engineering and compliance are different. Security engineering should be baked in. Ensure hardened images are part of dev/test/ops. Work out the problems with hardening first and adapt. Engineering Comes First. Make sure design and engineering artifacts come first to help define boundaries and controls. Develop Ops CONOPS and engineering design early. Identify security controls and iteratively design and build. Security engineering becomes a collaborative part of this work and helps simplify the path to compliance. Compliance work comes later. Iterate on security engineering deliverables. Assessment and compliance require a stable, deliverable, production-ready environment. The SSP is dependent on the engineering and O&M design. Maturity in engineering and O&M makes SSP deliverables and assessment easier. Separating security engineering from compliance is a key tenet. There should be clear separation in the process, reviews, and oversight of security engineering and compliance. Efficiency is achieved by separating these two concepts and allowing engineering and design to work at their own pace as a key input to the compliance process, which comes later. Develop the engineering design and operations CONOPS early, and everything else will fall into place. The blocking and tackling with DevSecOps (see next slide) should be an integral part of this process. The risk is that the compliance process will lengthen, and that assessments and reviews, including third-party independent assessments, will take much longer than planned. Preparing the engineering and operations model early and producing a well-tested integrated product makes the job of the assessors much easier. Another example is that security deliverables for compliance should be separated from engineering reviews to avoid inefficient review cycles.
The security documents like the SSP depend on the engineering design, so make sure the design process allows for these reviews early. Don’t use the compliance deliverables and process to review engineering design and operations processes. Adapt Contracts to Cloud. Carefully set boundaries with a clear SOW. Make sure the contract is aligned with authoritative FedRAMP compliance. Separate security deliverables and reviews for assessment from other deliverables for engineering lifecycle, etc. First, adapting to the cloud is not just a technology approach; it should also be suffused in the contract and in the guidance for how oversight and programs work. Make sure boundaries are set clearly with contractual documents and work statements. The SOW should be clear on scope and follow the principle that FedRAMP is the authoritative compliance process. Going fast means that the program office and the contracts office have to be in sync on how and what will be built out and secured, with a focus on the essentials for how security engineering and compliance are achieved.
  4. Today, major events like a loss of separation require analysts to access data from numerous stove-piped systems. Retrospective analysis is overwhelmed by the tasks of data identification, collection, and integration, leaving less time for meaningful data analysis.
  5. ThinkStock items Cockpit Recording - 666297418  Air Traffic Controllers - A505695890
  6. 15 Apps integrated with Data Mall Apache Tools Hadoop Tools Search Tools FAA Analytics 20 data sets ingested in 8 weeks Data Mall Original Data Enriched Data Joined Data
  7. Given the power of the cloud (storage and processing), enhanced analytics are enabled, allowing programs to correlate data outside of their immediate interest to determine possible cause-and-effect relationships which  DID WE BUILD THIS ???
  8. This is the AWS Big Data portfolio. We have tools like Direct Connect and Import/Export that can bring in a lot of data. We can persist that data into a number of storage services from S3 to DynamoDB to EMR and Redshift for further analysis. Amazon Redshift provides a fast, fully managed, petabyte-scale data warehouse for less than $1000 per terabyte per year. Amazon Elastic MapReduce provides a managed, easy-to-use analytics platform built around the powerful Hadoop framework. Amazon Kinesis is a managed service for real-time processing of streaming big data. Amazon Glacier allows you to back up and archive an unlimited amount of data at just 1 cent per GB per month. Automate and schedule big data processing workloads with Data Pipeline. The tools to support big data collection and computation, along with collaboration and sharing, are all available in a couple of clicks with AWS.
  9. Water treatment plant (Left) : ThinkStock – 530987638 Flooded stream (Right): ThinkStock - 682424174
  10. Demonstrate the viability of a model to accurately predict the risk of a health-impact coliform violation for public water systems based on known violations combined with weather data The Safe Drinking Water Information System (SDWIS) contains information about public water systems and their violations of EPA's drinking water regulations, as reported to EPA by the states. These regulations establish maximum contaminant levels, treatment techniques, and monitoring and reporting requirements to ensure that water systems provide safe water to their customers. This search will help you to find your drinking water supplier and view its violations and enforcement history since 1993.  Local Climatological Data (LCD) is only available for stations and locations within the United States and its territories. Select the state or territory, location, and time to view specific data. Click the station name to view details or click "ADD TO CART" to order that station's data.
  11. Car Fleet ThinkStock item: 78459143
  12. Reduced to nine variables. Attributes standardized with Z-scores, Box-Cox, log transforms and PCA.
  13. Model 1, Logistic Regression with z-scores: Relabeled ~ HP * INERTIA_WT + AdjCity Model 2, Logistic Regression with z-scores: Relabeled ~ HP + INERTIA_WT + AdjCity Model 3, Logistic Regression with z-scores: Relabeled ~ HP * INERTIA_WT * AdjCity Model 4, Logistic Regression with z-scores: Relabeled ~ ECID * HP * INERTIA_WT * AdjCity Model 5, Same Logistic Regression as above, but with Box-Cox transformed values: Relabeled ~ ECID * HP * INERTIA_WT * AdjCity Model 6, Same Logistic Regression as above, but with log-transformed values and PCA: Relabeled ~ ECID * HP * INERTIA_WT * AdjCity Model 7, SVM: HP_z, INERTIA_WT_z, ECID_z, AdjCity_z, Transmission, NCYL Model 9, Neural Net: HP_z, INERTIA_WT_z, ECID_z, AdjCity_z, TRANS_TYPE, TRANS_GEARS, NCYL Model 10, Neural Net: Adding FE_LABEL_CALC_APPROACH and NVAL to predictor variables Model 11, Logistic regression with more categorical predictors: ECID + HP + INERTIA_WT + AdjCity + TRANS_TYPE + TRANS_GEARS + NCYL + FE_LABEL_CALC_APPROACH + NVAL Model 12, Logistic as above but log transform the continuous predictor variables Model 13, Conditional Tree Model 14, Recursive Partitioning Tree Model 15, Recursive Partitioning Tree with two more predictors Model 16, Neural Net with the same predictors 256 Car types – 196 TN; 53 TP; 5 FN; 2 FP ...
  14. Amazon Web Services accelerate Data Analytics. AWS services enable organizations with only limited data analytics capabilities to tackle expert challenges. When integrated with a robust cloud and security strategy, the platform can scale to support an agency's data needs. Agencies can realize mission value quickly - in days and weeks - with significant opportunity for continued innovation.
  15. Lead slide or last slide or even needed?
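
The confusion counts quoted at the end of note 13 (256 car types: 196 TN, 53 TP, 5 FN, 2 FP) can be turned into precision and recall with the standard definitions, which is one way to compare them against the figures reported on slide 25. The sketch below is illustrative only; the function name and the added F1 score are not from the deck.

```python
def precision_recall(tp, fp, fn):
    """Standard definitions: precision = TP/(TP+FP), recall = TP/(TP+FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# Counts as quoted in note 13 of the editor's notes
tn, tp, fn, fp = 196, 53, 5, 2

total = tn + tp + fn + fp  # should equal the 256 car types mentioned
precision, recall = precision_recall(tp, fp, fn)

# F1 (harmonic mean of precision and recall) is an extra summary, not in the deck
f1 = 2 * precision * recall / (precision + recall)

print(f"cases={total} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

True negatives do not enter either metric, which is why a model can score well on accuracy for a rare "re-labeled" class and still need precision and recall reported separately.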