Open Data on AWS
- 1. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Confidential and Trademark
Ahim Kho, Head of Enterprise Business Hong Kong & Taiwan
Open Data on AWS
https://opendata.aws
- 2.
Agenda
Overview of Open Data on AWS
How shared data on the cloud can accelerate research
Finding data shared on AWS
Sharing data on AWS
- 3.
AWS Cloud:
No Up Front Expense
Pay for what you Use
Improve Time to Market & Agility
Scale Up and Down
Self-Service Infrastructure
Traditional Infrastructure:
Equipment
Resources and Administration
Contracts Cost
- 4.
Why does AWS care about open data?
Many AWS customers supply data
to the public to accelerate research
and product development.
Many AWS customers use data
shared on AWS to create new
products and services.
- 5.
Companies want more value from their data
Complications:
Siloed approaches don’t work anymore
It’s too expensive and limiting to store data on-premises
Data is:
Growing exponentially
From new sources
Increasingly diverse
Used by many people
Analyzed by many applications
Implication:
A new approach is needed to extract insights and value
- 6.
“…data must be organized, well-documented,
consistently formatted, and error free. Cleaning the data
is often the most taxing part of data science, and is
frequently 80% of the work.”
— Data Driven by DJ Patil and Hilary Mason
Undifferentiated heavy lifting
- 7.
Flipped data flow in the cloud
Traditional approach:
Move data to computing resources.
Cloud approach:
Move computing resources to data.
Amazon S3
Amazon EC2
Amazon EMR
Amazon Athena
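One way to picture the cloud approach is code running close to Amazon S3 that reads only the bytes it needs instead of copying whole files first. The sketch below is illustrative and not taken from the talk. It relies on Amazon S3 honoring standard HTTP Range requests. The helper names and any URL passed to them are hypothetical.

```python
import urllib.request


def range_header(offset: int, length: int) -> str:
    """Return an HTTP Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return f"bytes={offset}-{offset + length - 1}"


def build_range_request(url: str, offset: int, length: int) -> urllib.request.Request:
    """Build a GET request that asks for only part of an object."""
    return urllib.request.Request(url, headers={"Range": range_header(offset, length)})


def read_range(url: str, offset: int, length: int, timeout: float = 30.0) -> bytes:
    """Fetch the requested byte range (needs network access and a publicly readable object)."""
    with urllib.request.urlopen(build_range_request(url, offset, length), timeout=timeout) as resp:
        return resp.read()
```

For example, `read_range(url, 0, 1024)` would fetch only the first kilobyte of a large file, where `url` points at a publicly readable object of your choosing.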
- 8.
Cloud data lakes are the future
Customers want:
To eliminate data silos
To move to a single store, i.e. a data lake in the cloud
To store data securely in standard formats
To grow to any scale, with low costs
To analyze their data in a variety of ways
To have real-time analytics
To predict future outcomes
Data Lake
- 9.
Sharing data in the cloud lets data users
spend more time on data analysis rather
than data acquisition.
https://opendata.aws
- 10.
Advantages of sharing data in the cloud
Global community of users
Faster pace of research
Lower cost of research
New services and tools
- 11.
AWS Public Datasets
https://registry.opendata.aws
- 12.
Data at work
- 13.
“…data must be organized, well-documented,
consistently formatted, and error free. Cleaning the data
is often the most taxing part of data science, and is
frequently 80% of the work.”
— Data Driven by DJ Patil and Hilary Mason
Undifferentiated heavy lifting
- 14.
Graph by Drew Bollinger (@drewbo19) at Development Seed
Landsat on AWS
- 15.
First Asian Government Organization Accepted into the
AWS Public Data Set Program - Taipei City Government
Taipei City Government joins the AWS
Public Data Set Program, which hosts
selected data sets for anyone to use for
free.
AWS provides a broad range of services
to analyze and discover insights from
data at any scale, including Amazon
Elastic Compute Cloud (Amazon EC2),
Amazon Athena, AWS Lambda and
Amazon EMR.
https://taipeicity.github.io/data_taipei
- 16.
Monitoring at-risk bodies of water from space
The Blue Dot Observatory uses
Sentinel-2 satellite data on AWS to
monitor water bodies around the world.
“The cost to process one month of data
for about 7,000 bodies of water
currently in the system is 6 EUR. It is
possible to set up world-scale systems
with a shoestring budget.”
opendata.aws/bluedot
- 17.
Facilitating over 31 million journeys made in London
every day
When Transport for London opened up
access to its data, application developers
and researchers used it to create more
than 600 applications that provide
services to 42 percent of Londoners,
saving up to an estimated £130 million
per year.
opendata.aws/tfl
- 18.
Finding data on AWS
Using the Registry of Open Data on AWS (RODA)
- 19.
Registry of Open Data on AWS
https://registry.opendata.aws/
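Registry entries typically point to a dataset's storage location, often as an Amazon S3 bucket ARN plus a region. The sketch below is illustrative and not taken from the talk. It shows one way to turn such an ARN into an anonymous object-listing request using the S3 REST API (`list-type=2`) and to parse the XML response. The bucket name, region, and helper names are hypothetical, and the bucket is assumed to allow public listing.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# XML namespace used by S3 REST API list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def bucket_from_arn(arn: str) -> str:
    """Extract the bucket name from an S3 ARN such as arn:aws:s3:::my-bucket/prefix."""
    prefix = "arn:aws:s3:::"
    if not arn.startswith(prefix):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    return arn[len(prefix):].split("/", 1)[0]


def list_url(bucket: str, region: str, prefix: str = "", max_keys: int = 10) -> str:
    """Build a virtual-hosted-style ListObjectsV2 URL for a publicly listable bucket."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{bucket}.s3.{region}.amazonaws.com/?{query}"


def parse_listing(xml_text: str) -> list[tuple[str, int]]:
    """Return (key, size in bytes) pairs from a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_text)
    return [
        (
            item.findtext("s3:Key", namespaces=S3_NS),
            int(item.findtext("s3:Size", namespaces=S3_NS)),
        )
        for item in root.findall("s3:Contents", S3_NS)
    ]
```

Fetching the URL from `list_url(bucket_from_arn(arn), region)` with any HTTP client and passing the response body to `parse_listing` gives a quick look at what a dataset contains before any compute is set up.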
- 20.
Sharing data (on AWS)
What we’ve learned
- 21.
What makes a dataset successful?
It is treated like a product.
It is optimized for analysis.
There is a community around it.
- 22.
Thank you!