At Babylon Health, we are on a mission to put accessible and affordable healthcare in the hands of every person on Earth. You might imagine that such an endeavour would generate incredible amounts of data! And since AI is at the core of our product, leveraging data from our microservices and clients is crucial to our success. So we set out to build a data platform of the future, spanning both AWS and GCP, reusing our existing infrastructure and CI/CD and building the missing parts.
16. Data Tech (the old ways…)
[Diagram: Data Analyst 1, Data Analyst 2]
Natalie Godec
17. Data Tech (the old ways…)
● Infrastructure management overhead
● Requires special skills
● Difficult to scale
● Lack of granular access control
● Speed is a real issue
● Expensive
18. Accessing data (the old ways…)
[Diagram: Researcher, My dataset, Data Source]
● Legal and DP approval?
● DBS Check?
● Secure Access
● Right tools?
● Where is the data?
20. How can we give researchers access to the data they need, when they need it, safely and securely?
23. BigQuery: a data warehouse
● Create datasets
● Define table schemas
● Create tables and views
● Schedule jobs
[Diagram: GCP project; Datasets (UK, US, CA); Tables & Views]
SELECT date, COUNT(VisitorId)
FROM `project1.datasetA.frontend.ga_sessions_*`
GROUP BY date
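The query on this slide reads from a wildcard table: `ga_sessions_*` matches every table whose name starts with that prefix, and the rows are then counted per date. A minimal local sketch of that behaviour follows. It uses SQLite from the Python standard library as a stand-in for BigQuery, and the shard names, rows, and helper functions are all hypothetical, so treat it as an illustration of the idea rather than of the BigQuery API.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two made-up, date-sharded tables."""
    conn = sqlite3.connect(":memory:")
    shards = {
        "ga_sessions_20200101": [("20200101", 1), ("20200101", 2)],
        "ga_sessions_20200102": [("20200102", 1), ("20200102", None)],
    }
    for name, rows in shards.items():
        conn.execute(f'CREATE TABLE "{name}" (date TEXT, VisitorId INTEGER)')
        conn.executemany(f'INSERT INTO "{name}" VALUES (?, ?)', rows)
    return conn


def query_sharded_tables(conn, prefix):
    """Emulate a wildcard table: union all tables with the prefix, count per date."""
    all_tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    names = [row[0] for row in all_tables if row[0].startswith(prefix)]
    if not names:
        return []
    union = " UNION ALL ".join(
        f'SELECT date, VisitorId FROM "{name}"' for name in names
    )
    # COUNT(column) skips NULLs, so a session without a VisitorId is not counted.
    sql = (
        f"SELECT date, COUNT(VisitorId) FROM ({union}) "
        "GROUP BY date ORDER BY date"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for date, visitors in query_sharded_tables(build_demo_db(), "ga_sessions_"):
        print(date, visitors)
```

In BigQuery itself the union is implicit: the wildcard in the table name does the work that `query_sharded_tables` does by hand here.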
39. BigQuery is a public API.
How do you protect your data?
40. Securing your APIs: VPC Service Controls
[Diagram: Office IP range, Private cloud IP range, Unauthorized client, Unauthorized VPC project, VPC Service Perimeter (project1, project2, project3), Protected APIs]
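One way to read the perimeter diagram is as a simple rule: a call to a protected API is let through if it comes from a project inside the perimeter or from an approved IP range, and refused otherwise. The sketch below models only that rule in plain Python. It is not the VPC Service Controls API; the `Perimeter` class, the `is_allowed` function, and the CIDR ranges are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Perimeter:
    """Toy model of a service perimeter (hypothetical, not a Google API)."""
    projects: frozenset      # projects inside the perimeter
    allowed_ranges: tuple    # IP networks allowed to call in from outside


# Placeholder values: the project names come from the slide, the CIDR ranges are made up.
DEMO_PERIMETER = Perimeter(
    projects=frozenset({"project1", "project2", "project3"}),
    allowed_ranges=(
        ip_network("203.0.113.0/24"),  # stand-in for the office IP range
        ip_network("10.10.0.0/16"),    # stand-in for the private cloud IP range
    ),
)


def is_allowed(perimeter, source_ip=None, source_project=None):
    """Return True if the caller is inside the perimeter or in an allowed range."""
    if source_project is not None and source_project in perimeter.projects:
        return True
    if source_ip is not None:
        addr = ip_address(source_ip)
        return any(addr in net for net in perimeter.allowed_ranges)
    return False


if __name__ == "__main__":
    print(is_allowed(DEMO_PERIMETER, source_ip="203.0.113.7"))
    print(is_allowed(DEMO_PERIMETER, source_ip="198.51.100.9"))
    print(is_allowed(DEMO_PERIMETER, source_project="other-project"))
```

The real service evaluates far more than this (identities, services, ingress and egress rules), so the sketch is only meant to make the picture on the slide concrete.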