Maximising Data Governance in the Cloud
1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Matt Pitchford, Principal Solutions Architect
13.11.18
Maximising Data Governance
2.
What we’re going to cover
Governance Overview
What is Data Governance
Best Practices
Practical Implementation Tips
3.
Where does Governance live?
4.
What is Governance?
Security
Risk
Compliance
Governance
5.
Data Security
Access control
Encryption
Data Integrity
Data Leakage Prevention
6.
Data Compliance
Data Access Policy
Data Retention
Forensics
7.
Data Risk
Preparing for potential issues
Data loss
Data inaccessibility (service outage)
Data exposure
8.
Data Governance
9.
Why Data Governance
Data is always changing
Data history
Business is always changing
Regulatory Requirements
New Security Policies
New Risks
Are you still doing the right things?
10.
Example
11.
Data Lake on AWS
On premises data
Amazon RDS
Other databases
Your data
AWS Glue ETL
12.
Adding New Data Sources
On premises data
Web app data
Amazon RDS
Other databases
Streaming data
Your data
AWS Glue ETL
13.
Data Governance Process – New Producer
Receive Request
• Manual – E-mail, trouble ticket, web portal
• Automated – API, Pipeline
Decision
• Manual – Governance Board, Change Board
• Automated – Workflow, Pipeline
Build
• Manual – Infrastructure provision and notification
• Automated – AWS CloudFormation & Amazon SNS
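The three steps above (receive a request, decide, build) can be sketched as a small automated pipeline. This is a minimal illustration only: the `ProducerRequest` fields, the policy checks, and the parameter names are hypothetical and do not come from the slides. The build step only returns what would be provisioned and announced; it does not call any AWS service.

```python
from dataclasses import dataclass

# Hypothetical policy: classifications a new producer is allowed to declare.
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential"}


@dataclass
class ProducerRequest:
    """A request to onboard a new data producer (all fields are assumptions)."""
    name: str
    owner: str
    classification: str
    retention_days: int


def decide(request):
    """Automated decision step: return a list of policy violations.

    An empty list means the request can be approved without a manual board.
    """
    problems = []
    if not request.owner:
        problems.append("missing data owner")
    if request.classification not in ALLOWED_CLASSIFICATIONS:
        problems.append("unknown classification: " + request.classification)
    if request.retention_days <= 0:
        problems.append("retention period must be positive")
    return problems


def build(request):
    """Build step: describe what would be provisioned and who is notified.

    A real pipeline might pass 'stack_parameters' to an infrastructure
    template and publish 'notification' to a topic; here they are returned.
    """
    return {
        "stack_parameters": {
            "ProducerName": request.name,
            "Classification": request.classification,
            "RetentionDays": str(request.retention_days),
        },
        "notification": "New producer '%s' approved for %s"
        % (request.name, request.owner),
    }


def process(request):
    """Run receive -> decision -> build for one request."""
    problems = decide(request)
    if problems:
        return {"status": "rejected", "reasons": problems}
    result = {"status": "approved"}
    result.update(build(request))
    return result
```

Keeping the decision step as a pure function makes the governance rules easy to review and test separately from provisioning.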
14.
Data History
15.
Access History
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be mybucket [06/Feb/2014:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 3E57427F3EXAMPLE REST.GET.VERSIONING - "GET /mybucket?versioning HTTP/1.1" 200 - 113 - 7 - "-" "S3Console/0.4" -
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be mybucket [06/Feb/2014:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 891CE47D2EXAMPLE REST.GET.LOGGING_STATUS - "GET /mybucket?logging HTTP/1.1" 200 - 242 - 11 - "-" "S3Console/0.4" -
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be mybucket [06/Feb/2014:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be A1206F460EXAMPLE REST.GET.BUCKETPOLICY - "GET /mybucket?policy HTTP/1.1" 404 NoSuchBucketPolicy 297 - 38 - "-" "S3Console/0.4" -
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be mybucket [06/Feb/2014:00:01:00 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 7B4A0FABBEXAMPLE REST.GET.VERSIONING - "GET /mybucket?versioning HTTP/1.1" 200 - 113 - 33 - "-" "S3Console/0.4" -
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be mybucket [06/Feb/2014:00:01:57 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be DD6CC733AEXAMPLE REST.PUT.OBJECT s3-dg.pdf "PUT /mybucket/s3-dg.pdf HTTP/1.1" 200 - - 4406583 41754 28 "-" "S3Console/0.4" -
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be mybucket [06/Feb/2014:00:03:21 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be BC3C074D0EXAMPLE REST.GET.VERSIONING - "GET /mybucket?versioning HTTP/1.1" 200 - 113 - 28 - "-" "S3Console/0.4" -
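Access log records like the ones above are easier to audit once they are split into named fields. The sketch below is one possible parser, assuming the space-delimited layout seen in these records, where the timestamp is wrapped in brackets and some values in double quotes. The field names are labels chosen for this example, and any extra trailing fields in newer log formats are simply ignored.

```python
import re

# Labels for the fields visible in the sample records, in order.
# These names are chosen for this sketch.
FIELDS = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester",
    "request_id", "operation", "key", "request_uri", "status",
    "error_code", "bytes_sent", "object_size", "total_time",
    "turnaround_time", "referrer", "user_agent", "version_id",
]

# One token is a [bracketed] value, a "quoted" value, or a run of non-spaces.
TOKEN = re.compile(r'\[([^\]]*)\]|"([^"]*)"|(\S+)')


def parse_access_log_line(line):
    """Split one access log record into a dict keyed by FIELDS."""
    tokens = [a or b or c for a, b, c in TOKEN.findall(line)]
    return dict(zip(FIELDS, tokens))


# The PUT record from the sample above, joined onto one logical line.
SAMPLE = (
    "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be "
    "mybucket [06/Feb/2014:00:01:57 +0000] 192.0.2.3 "
    "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be "
    "DD6CC733AEXAMPLE REST.PUT.OBJECT s3-dg.pdf "
    '"PUT /mybucket/s3-dg.pdf HTTP/1.1" 200 - - 4406583 41754 28 '
    '"-" "S3Console/0.4" -'
)

entry = parse_access_log_line(SAMPLE)
print(entry["operation"], entry["key"], entry["object_size"])
```

Filtering parsed entries by `operation` or `requester` gives a simple answer to "who touched this object, and when".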
16.
S3 Versioning
17.
S3 Versioning Demo
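The demo itself is not captured in this transcript. As a stand-in, the toy model below illustrates the behaviour versioning is meant to provide: every write keeps the previous copy, and a delete adds a marker instead of destroying history. It is an in-memory sketch, not the S3 API; the class name and the `v1`, `v2` style version IDs are invented for illustration (real version IDs are opaque strings).

```python
import itertools


class VersionedBucket:
    """In-memory toy model of a versioning-enabled bucket (not the S3 API)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body or None)
        self._ids = itertools.count(1)

    def _append(self, key, body):
        version_id = "v%d" % next(self._ids)
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def put(self, key, body):
        """Store a new version; older versions stay retrievable."""
        return self._append(key, body)

    def delete(self, key):
        """Add a delete marker (body None) rather than removing history."""
        return self._append(key, None)

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one if version_id is set."""
        history = self._versions.get(key, [])
        if version_id is None:
            if not history or history[-1][1] is None:
                raise KeyError(key)  # missing, or latest is a delete marker
            return history[-1][1]
        for vid, body in history:
            if vid == version_id and body is not None:
                return body
        raise KeyError("%s@%s" % (key, version_id))

    def history(self, key):
        """List version IDs for a key, oldest first, including delete markers."""
        return [vid for vid, _ in self._versions.get(key, [])]
```

For data history, the useful property is that an accidental overwrite or delete can be undone by reading an earlier version ID.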
18.
DynamoDB as a Metadata store
https://aws.amazon.com/blogs/big-data/building-and-maintaining-an-amazon-s3-metadata-index-without-servers/
19.
DynamoDB as a Metadata store
• Find all objects for a given customer/function/business unit during a time range
• Calculate the total storage used for a given customer unit
• List all objects for a given customer that contain a transaction record
• Find all objects uploaded by a given server during a time range
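The four questions above map onto simple lookups once each object has a metadata record. The sketch below models that index as an in-memory list so the queries can be read and run without a database. The record fields and sample values are hypothetical. In a DynamoDB table, one possible design (an assumption, not taken from the slides) would use the customer as partition key and the upload timestamp as sort key, with a secondary index keyed on server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectMetadata:
    """One index record per stored object (all fields are assumptions)."""
    key: str
    customer: str
    server: str
    uploaded_at: str  # ISO 8601 UTC; same-format strings sort chronologically
    size_bytes: int
    has_transaction: bool


# Hypothetical sample index.
INDEX = [
    ObjectMetadata("acme/2018/11/01/a.json", "acme", "web-01",
                   "2018-11-01T09:00:00Z", 1200, True),
    ObjectMetadata("acme/2018/11/05/b.json", "acme", "web-02",
                   "2018-11-05T12:30:00Z", 800, False),
    ObjectMetadata("globex/2018/11/03/c.json", "globex", "web-01",
                   "2018-11-03T08:15:00Z", 500, True),
]


def objects_for_customer(index, customer, start, end):
    """Keys for one customer uploaded in the inclusive range [start, end]."""
    return [m.key for m in index
            if m.customer == customer and start <= m.uploaded_at <= end]


def total_storage(index, customer):
    """Total bytes stored for one customer."""
    return sum(m.size_bytes for m in index if m.customer == customer)


def objects_with_transactions(index, customer):
    """Keys for one customer that contain a transaction record."""
    return [m.key for m in index
            if m.customer == customer and m.has_transaction]


def objects_from_server(index, server, start, end):
    """Keys uploaded by one server in the inclusive range [start, end]."""
    return [m.key for m in index
            if m.server == server and start <= m.uploaded_at <= end]
```

The linear scans here are only for readability; the point of a dedicated metadata store is that each of these becomes a keyed query instead of a listing of the whole bucket.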
20.
Dark Data
21.
Adding New Data Sources
On premises data
Web app data
Amazon RDS
Other databases
Streaming data
Your data
AWS Glue ETL
22.
Discovering value from unstructured data
• Do you have important data in unstructured formats?
• Are you responsible from a regulatory point of view for this data, even if you haven’t used it?
• Can your business get value from this data if you combine it with other sources?
23.
Optimizing costs on unstructured data
• Lifecycle policies including data destruction
• Structured and unstructured storage options
• Tiers of object storage to optimize costs
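A lifecycle policy that combines storage tiers with data destruction can be expressed as a single rule. The sketch below builds such a rule as a plain dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The helper name, the prefix, the day counts, and the bucket name in the comment are assumptions for illustration, and the actual API call is left commented out so the block runs without AWS credentials.

```python
def build_lifecycle_rule(prefix, ia_after_days, glacier_after_days,
                         expire_after_days):
    """Build one S3 lifecycle rule: tier down twice, then delete.

    The ordering check is a sanity check for this sketch; S3 applies its
    own validation (such as minimum days before a transition) as well.
    """
    if not (0 < ia_after_days < glacier_after_days < expire_after_days):
        raise ValueError("expected 0 < ia < glacier < expire (in days)")
    return {
        "ID": "tier-and-expire-" + (prefix.strip("/") or "all"),
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        # Data destruction: objects are deleted after this many days.
        "Expiration": {"Days": expire_after_days},
    }


rule = build_lifecycle_rule("raw/", 30, 90, 365)
print(rule["ID"])

# Applying the rule (requires boto3 and credentials; bucket name is made up):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket",
#     LifecycleConfiguration={"Rules": [rule]},
# )
```

Building the rule separately from applying it lets the retention periods be reviewed and versioned like any other governance decision.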
24.
Wrap up
25.
Defining success
• Can you quantify the value you’re getting from data?
• Have you established a baseline?
• How do you tackle the outliers?
• Are you using the data to drive improvements?
• How do you deliver or analyse your metrics?