© Copyright 2013, Cardinal Health. All rights reserved. CARDINAL HEALTH, the Cardinal Health LOGO and
ESSENTIAL TO CARE are trademarks or registered trademarks of Cardinal Health.
Securing Big Data
Jeff Graham, Sr. Advisor, Data Analytics, Enterprise Architecture
Mark Tomallo, Director, Information Security & Risk, Enterprise Services Department
June 4th, 2014
Cardinal Health
Leading provider of products and services across the healthcare
supply chain with an extensive footprint across multiple channels
• 33,000-plus employees with direct operations in 10 countries
• 100,000 locations delivered to daily
• $101B FY13 revenue
• #19 on Fortune 500 list
• 85% of hospitals in the U.S. use our products and services
What types of data do we use?
• Market
• Public Data (Medicare.gov)
• Clinical
• Product & Supplier
• Employee
• Logistics
The Challenge
• We knew the benefits of going to a Big Data platform, but
we had huge concerns over securing those assets.
• The technology was immature from a security standpoint.
• The goals of an analytics group were sometimes at odds
with the responsibility of Governance & Security.
The Opportunity
We needed to strike a balance between protecting our data and
liberating our analytics community.
This grew into two guiding principles that are still evolving in
our organization:
• Lock down the platform
• Liberate the data to authorized users
The Journey Begins...
We needed involvement from many disciplines to come together:
• Platform Security
• Identity Management
• Network Security
• Data Segmentation
• Data Tokenization
• Governance
Lockdown: Platform Security
• Host-based firewalls on control & data nodes
– Locked down using iptables
– Block connections from unauthorized hosts
• Gold-image boot for data nodes
– No persistent OS or config data: every boot starts from a fresh, secure image
– Ease of security patching
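A minimal sketch of the host-based firewall idea, assuming a hypothetical allowlist of control-node addresses and a hypothetical service port. It only builds iptables command strings and does not run them; the addresses, port, and function name are illustrative and not taken from the deployment.

```python
import ipaddress

def build_allowlist_rules(allowed_hosts, port):
    """Return iptables commands that accept TCP traffic on `port` only
    from `allowed_hosts` and drop every other connection to that port."""
    rules = []
    for host in allowed_hosts:
        ipaddress.ip_address(host)  # raises ValueError on a malformed address
        rules.append(
            f"iptables -A INPUT -p tcp -s {host} --dport {port} -j ACCEPT"
        )
    # Final rule: anything not accepted above is dropped.
    rules.append(f"iptables -A INPUT -p tcp --dport {port} -j DROP")
    return rules

# Hypothetical authorized hosts and port, for illustration only.
rules = build_allowlist_rules(["10.0.1.10", "10.0.1.11"], 8020)
for rule in rules:
    print(rule)
```

Rule order matters here: the ACCEPT rules are appended before the DROP rule so that authorized hosts match first.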
Lockdown: Hadoop Architecture
Lockdown: Identity Management
• Segmented access control to access, control, and data nodes
• Secure Active Directory groups for data segmentation where data is sensitive
• Vintela authentication using Kerberos
• Access nodes can talk to control nodes, control nodes can talk to data nodes, and users are
restricted to the access layer
[Diagram labels: Access Nodes, Control Nodes, Data Nodes; Users, Power Users, Developers, Data Owners, Admin; Datameer, AD, MySQL, Sqoop, Hive, Flume]
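The tiering rule above (users reach only the access layer, access nodes reach control nodes, control nodes reach data nodes) can be sketched as a simple allowlist of flows. The tier names and the function are hypothetical, chosen only to illustrate the rule.

```python
# Allowed one-way flows between tiers, as stated in the bullets above.
ALLOWED_FLOWS = {
    ("user", "access"),
    ("access", "control"),
    ("control", "data"),
}

def can_connect(source_tier, target_tier):
    """Return True only if the tiering rule permits source -> target."""
    return (source_tier, target_tier) in ALLOWED_FLOWS

print(can_connect("user", "access"))  # True
print(can_connect("user", "data"))    # False
```

Anything not listed is denied by default, which mirrors the lockdown principle.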
Lockdown: Network Security
• Host-based firewalls on control & data nodes
• Segregated VLAN on dedicated network switches
• Segregated Prod, Integration, Backup environments
• Transaction, security and event logging
• Host-based file integrity monitoring
Liberate: Data Segmentation
• Data is ingested under source-specific accounts.
• Data ingestion is loosely coupled with transformations.
• Atomic data patterns avoid partial data products.
• Finer-grained control over data access.
[Diagram: Ingestion, Transform]
Liberate: Data Segmentation
Ingestion
• We had to ensure that our landed data was “all or nothing”
• Each load is atomic in nature.
• If a load fails, we don’t want to see partially streamed results.
Load flow into HDFS:
• Step #1: Sqoop imports from the RDBMS into staging part files
• Step #2: the copyMerge API merges the part files
• Step #3: hadoop fs -mv renames the merged file into the source (target area)
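The "all or nothing" pattern can be sketched on a local filesystem as stage, merge, then rename. This is an analogy under stated assumptions, not the Hadoop workflow itself: plain files stand in for the Sqoop output, a loop stands in for copyMerge, and os.replace stands in for hadoop fs -mv. The function and file names are hypothetical, and the rename is only atomic when staging and target sit on the same filesystem.

```python
import os
import tempfile
from pathlib import Path

def atomic_load(parts, staging_dir, target_path):
    """Write parts to staging, merge them, then rename into place.

    Readers of target_path see either nothing (or the old file) or the
    complete new file, never a partially written load.
    """
    staging_dir = Path(staging_dir)
    part_files = []
    # Step 1: land each part in the staging area.
    for i, part in enumerate(parts):
        part_file = staging_dir / f"part-{i:05d}"
        part_file.write_text(part)
        part_files.append(part_file)
    # Step 2: merge the part files into one staged file.
    merged = staging_dir / "merged.tmp"
    with merged.open("w") as out:
        for part_file in part_files:
            out.write(part_file.read_text())
    # Step 3: rename the merged file into the target area.
    os.replace(merged, target_path)

with tempfile.TemporaryDirectory() as root:
    staging = Path(root, "staging")
    staging.mkdir()
    target = Path(root, "sales.csv")
    atomic_load(["a,1\n", "b,2\n"], staging, target)
    result = target.read_text()
    print(result)
```

If any step before the rename raises, the target path is never touched, so a failed load leaves no partial result behind.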
Liberate: Data Segmentation
This gave us the flexibility to segment ingestion
privileges independently of any transformation.
[Diagram: segmented ingestion areas: Sales, Market, Employee, Logistics, Clinical, Public Data]
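One way to picture ingestion privileges that are independent of transformation is two separate grant tables: one for write access by ingestion account, one for read access by transformation role. The account names, role name, and grants below are hypothetical and only illustrate the separation.

```python
# Hypothetical ingestion accounts: each may write only its own source area.
INGEST_ACCOUNTS = {
    "svc_sales": "sales",
    "svc_market": "market",
    "svc_clinical": "clinical",
}

# Hypothetical transformation roles: read-only grants on selected areas.
TRANSFORM_READS = {
    "customer_insights": {"sales", "market"},
}

def can_write(account, area):
    """An ingestion account may write only to its own source area."""
    return INGEST_ACCOUNTS.get(account) == area

def can_read(role, area):
    """A transformation role may read only the areas granted to it."""
    return area in TRANSFORM_READS.get(role, set())

print(can_write("svc_sales", "sales"))           # True
print(can_write("svc_sales", "clinical"))        # False
print(can_read("customer_insights", "clinical")) # False
```

Changing what a transformation may read never touches the ingestion grants, and the reverse holds as well.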
[Diagram: ingestion areas (Sales, Market, Employee, Logistics, Clinical, Public Data) and the data products derived from them (Customer Insights, Warehouse Optimization, Outcome Based Medicine)]
Liberate: Data Tokenization
Private Data without Identity is no longer Private*
Segregation Model:
1. Private Identity Data – Identity data which is itself private, e.g. Social Security Number
2. Identity Data – Data to identify the subject of the associated data, e.g. Name, Passport ID
3. Private Attributes – Data only sensitive when associated with an identity, e.g. blood type
*Except in rare cases where the Law decides it’s private without Identity.
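The segregation model can be sketched as a column classification that decides what to tokenize. This is one reading of the model, assuming that removing identity is enough to release private attributes; the column names and the mapping are hypothetical and reuse the examples above.

```python
from enum import Enum

class Sensitivity(Enum):
    PRIVATE_IDENTITY = 1   # identity data that is itself private
    IDENTITY = 2           # identifies the subject of associated data
    PRIVATE_ATTRIBUTE = 3  # sensitive only when tied to an identity

# Hypothetical column classification using the examples above.
COLUMN_CLASSES = {
    "ssn": Sensitivity.PRIVATE_IDENTITY,
    "name": Sensitivity.IDENTITY,
    "passport_id": Sensitivity.IDENTITY,
    "blood_type": Sensitivity.PRIVATE_ATTRIBUTE,
}

def columns_to_tokenize(columns):
    """Return the columns that carry identity and so need tokenizing.

    Private attributes are left as-is; unclassified columns are also
    left as-is in this sketch, which a real policy might not allow.
    """
    identity = {Sensitivity.PRIVATE_IDENTITY, Sensitivity.IDENTITY}
    return [c for c in columns if COLUMN_CLASSES.get(c) in identity]

print(columns_to_tokenize(["name", "blood_type", "ssn"]))  # ['name', 'ssn']
```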
Liberate: Data Tokenization
A tokenization gateway gives us a centralized, reusable
framework for transforming private data into non-sensitive data.
Address                  Tokenized Address
1313 Mockingbird Ln      A76a39daf6e83363372d326
1700 Pennsylvania Ave    9eeb8dc55d37388b18c12b4
1411 N. Park Ave         0f2ef91d336d38b4db3be54
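A toy sketch of the tokenization idea: a private value goes in, an opaque token comes out, and the same value always maps to the same token. The slides do not say how tokens are generated, so the random hex token here is an assumption, and the class is a hypothetical in-memory stand-in that skips the encryption a real vault would apply.

```python
import secrets

class TokenVault:
    """Toy in-memory vault mapping private values to opaque tokens."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value):
        # Reuse an existing token so repeated values stay joinable.
        if value not in self._token_by_value:
            token = secrets.token_hex(12)  # 24 random hex characters
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token):
        # In practice this lookup would be restricted to administrators.
        return self._value_by_token[token]

vault = TokenVault()
token = vault.tokenize("1313 Mockingbird Ln")
print(token)
```

Because the token is random rather than derived from the address, it reveals nothing about the original value without access to the vault.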
Liberate: Data Tokenization
The gateway is a highly protected service outside of the cluster.
Liberate: Data Tokenization
The gateway is composed of three regions:
PRIVATE
• Data that needs to be tokenized.
• At a minimum, must consist of a primary key and token values.
• Multi-tenant store with role-based security
VAULT
• Stores the private data as a SHA-2/128-bit AES encrypted binary string.
• Generates a token by
• Tokens are sharded and referenced by name (and can be shared).
• Access is extremely limited (administrator only).
PUBLIC
• Once tokens are generated in the vault, private data is joined to those
tokens and landed in the Public region.
• Multi-tenant store with role-based security.
• Private may read public, but public may only read public.
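The hand-off between regions can be sketched as a join plus a read rule. The row layout, the sample token, and the function names are hypothetical; the sketch also assumes that a private-region role can read its own region, which the slide implies but does not state.

```python
# Read rule from the slide: private may read public, public only public.
READABLE = {
    "private": {"private", "public"},
    "public": {"public"},
}

def can_read(reader_region, target_region):
    """Return True if a role in reader_region may read target_region."""
    return target_region in READABLE.get(reader_region, set())

def publish(private_rows, token_by_value, field):
    """Join private rows to vault tokens, producing rows for the public
    region with the private field replaced by its token."""
    return [{**row, field: token_by_value[row[field]]} for row in private_rows]

# Hypothetical private row and vault output.
private_rows = [{"id": 1, "address": "1313 Mockingbird Ln"}]
token_by_value = {"1313 Mockingbird Ln": "tok-001"}

public_rows = publish(private_rows, token_by_value, "address")
print(public_rows)                    # [{'id': 1, 'address': 'tok-001'}]
print(can_read("public", "private"))  # False
```

The primary key survives the join, so public rows can still be linked to each other without exposing the private value.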
In Summary
We needed involvement from many disciplines to come together:
• Platform Security (Lockdown)
• Hadoop Architecture (Lockdown)
• Identity Management (Lockdown)
• Network Security (Lockdown)
• Data Segmentation (Liberate)
• Data Tokenization (Liberate)
Lessons Learned
• Original focus was technology. Data privacy, governance, and
declassification were our largest hurdles.
• Accountability across the Enterprise is important.
• For Big Data, we haven’t achieved pure statistical anonymization as
this isn’t our core competency.
• Legacy source metadata security classification is a challenge.
• Initial tokenization was a success. However:
o A mature tokenization solution is orders of magnitude more complex
than anticipated – the margin of error and penalty of error
are both very high.
o Metadata needed for full token lifecycle management are unknown &
complex
o Implementing without the right metadata would likely result in duplication
of tokens
Q&A

 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 

Securing Big Data: Lock it Down or Liberate

  • 1. Securing Big Data
      Jeff Graham, Sr. Advisor, Data Analytics, Enterprise Architecture
      Mark Tomallo, Director, Information Security & Risk, Enterprise Services Department
      June 4th, 2014
      © Copyright 2013, Cardinal Health. All rights reserved. CARDINAL HEALTH, the Cardinal Health LOGO and ESSENTIAL TO CARE are trademarks or registered trademarks of Cardinal Health.
  • 2. Cardinal Health
      Leading provider of products and services across the healthcare supply chain with an extensive footprint across multiple channels
      • 33,000-plus employees with direct operations in 10 countries
      • 100,000 locations delivered to daily
      • $101B FY13 revenue
      • #19 on Fortune 500 list
      • 85% of hospitals in the U.S. use our products and services
  • 3. What types of data do we use?
      • Market
      • Public Data (Medicare.gov)
      • Clinical
      • Product & Supplier
      • Employee
      • Logistics
  • 4. The Challenge
      • We knew the benefits of going to a Big Data platform, but we had huge concerns over securing those assets.
      • The technology was immature from a security standpoint.
      • The goals of an analytics group were sometimes at odds with the responsibility of Governance & Security.
  • 5. The Opportunity
      We needed to strike a balance between protecting our data and liberating our analytics community. This grew into two guiding principles that are still evolving in our organization:
      • Lock down the Platform
      • Liberate the Data to authorized users
  • 6. The Journey Begins
      We needed involvement from many disciplines to come together:
      • Platform Security
      • Identity Management
      • Network Security
      • Data Segmentation
      • Data Tokenization
      • Governance
  • 7. Lockdown: Platform Security
      • Host-based firewalls on control & data nodes
          – Locked down using iptables
          – Block connections from unauthorized hosts
      • Gold-image boot for data nodes
          – No persistent OS / config data: a continuously fresh, secure image
          – Ease of security patching
  • 8. Lockdown: Hadoop Architecture (diagram)
  • 9. Lockdown: Identity Management
      • Segmented access control to access / control / data nodes
      • Secure Active Directory groups for data segmentation where sensitive
      • Vintela Authentication using Kerberos
      • Access Nodes can talk to Control Nodes, Control Nodes can talk to Data Nodes, and users are restricted to the Access Layer
      Diagram labels: Access Nodes, Control Nodes, Data Nodes, Users, Power Users, Developers, Data Owners, Admin, AD, Datameer, MySQL, Sqoop, Hive, Flume
  • 10. Lockdown: Network Security
      • Host-based firewalls on control & data nodes
      • Segregated VLAN on dedicated network switches
      • Segregated Prod, Integration, Backup environments
      • Transaction, security and event logging
      • Host-based file integrity monitoring
  • 11. Liberate: Data Segmentation
      • Data is ingested under source-specific accounts.
      • Data ingestion is loosely coupled with transformations.
      • Atomic data patterns avoid partial data products.
      • Finer-grained control over data access.
      Diagram labels: Ingestion, Transform
  • 12. Liberate: Data Segmentation (Ingestion)
      • We had to ensure that our landed data was “all or nothing”.
      • Each load is atomic in nature.
      • If a load fails, we don’t want to see partially streamed results.
      Diagram labels: Source RDBMS, Part Files, Staging, HDFS (target area), Merge & Rename
      Step #1: Sqoop
      Step #2: copyMerge API
      Step #3: hadoop fs -mv
  • 13. Liberate: Data Segmentation
      This gave us the flexibility to segment ingestion privileges independently of any transformation.
      Diagram labels: Sales, Market, Market, Employee, Logistics, Clinical, Public Data
  • 14. Liberate: Data Segmentation
      This gave us the flexibility to segment ingestion privileges independently of any transformation.
      Diagram labels: Sales, Market, Market, Employee, Logistics, Clinical, Public Data, Customer Insights, Warehouse Optimization, Outcome Based Medicine
  • 15. Liberate: Data Tokenization
      Private Data without Identity is no longer Private*
      Segregation Model:
      1. Private Identity Data
          – Identity data which is itself private
          – e.g. Social Security Number
      2. Identity Data
          – Data to identify the subject of the associated data
          – e.g. Name, Passport ID
      3. Private Attributes
          – Data only sensitive when associated with an identity
          – e.g. blood type
      *Except in rare cases where the Law decides it’s private without Identity.
  • 16. Liberate: Data Tokenization
      A tokenization gateway gives us a centralized, reusable framework for transforming private data into non-sensitive data.
      Address                  Tokenized Address
      1313 Mockingbird Ln      A76a39daf6e83363372d326
      1700 Pennsylvania Ave    9eeb8dc55d37388b18c12b4
      1411 N. Park Ave         0f2ef91d336d38b4db3be54
  • 17. Liberate: Data Tokenization
      The gateway is a highly protected service outside of the cluster.
  • 18. Liberate: Data Tokenization
      The gateway is composed of three regions:
      PRIVATE
      • Data that needs to be tokenized.
      • At a minimum, must comprise a primary key and token values.
      • Multi-tenant store with role-based security.
      VAULT
      • Stores the private data in a SHA2/128-bit AES encrypted binary string.
      • Generates a token by
      • Tokens are sharded and referenced by name (and can be shared).
      • Access extremely limited (administrator only).
      PUBLIC
      • Once tokens are generated in the vault, private data is joined to those tokens and landed in the Public region.
      • Multi-tenant store with role-based security.
      • Private may read public, but public may only read public.
  • 19. In Summary
      We needed involvement from many disciplines to come together:
      • Platform Security (Lockdown)
      • Hadoop Architecture (Lockdown)
      • Identity Management (Lockdown)
      • Network Security (Lockdown)
      • Data Segmentation (Liberate)
      • Data Tokenization (Liberate)
  • 20. Lessons Learned
      • Original focus was technology. Data privacy, governance, and declassification were our largest hurdles.
      • Accountability across the Enterprise is important.
      • For Big Data, we haven’t achieved pure statistical anonymization, as this isn’t our core competency.
      • Legacy source metadata security classification is a challenge.
      • Initial tokenization was a success. However:
          o A mature tokenization solution is orders of magnitude more complex than anticipated. The margin of error and penalty of error are both very high.
          o Metadata needed for full token lifecycle management is unknown & complex.
          o Implementing without the right metadata would likely result in duplication of tokens.
  • 21. Q&A
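
The all-or-nothing load on slide 12 (land part files, merge them in staging, then rename into the target area) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the presenters' implementation: a local directory stands in for HDFS, plain file concatenation stands in for the copyMerge API, `os.replace` stands in for `hadoop fs -mv`, and the function name `atomic_land` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def atomic_land(part_files, staging_dir, target_path):
    """Merge part files in staging, then publish with a single rename.

    Readers of target_path see either the previous complete file or the
    new complete file, never a partially written one.
    """
    staging_dir = Path(staging_dir)
    target_path = Path(target_path)
    staging_dir.mkdir(parents=True, exist_ok=True)
    target_path.parent.mkdir(parents=True, exist_ok=True)

    # Merge step (stand-in for copyMerge): build one file inside staging.
    fd, staged_name = tempfile.mkstemp(dir=staging_dir, suffix=".merging")
    try:
        with os.fdopen(fd, "wb") as staged:
            for part in sorted(Path(p) for p in part_files):
                staged.write(part.read_bytes())
        # Rename step (stand-in for hadoop fs -mv): atomic when staging
        # and target live on the same filesystem.
        os.replace(staged_name, target_path)
    except BaseException:
        # A failed load leaves the target untouched; drop the partial merge.
        if os.path.exists(staged_name):
            os.remove(staged_name)
        raise
    return target_path
```

The design point is that consumers only ever read from the target area, so a load that fails midway is invisible to them; the partial merge exists only in staging and is cleaned up.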

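The three-region gateway on slide 18 (Private, Vault, Public) can be illustrated with a small sketch. The slide's description of how tokens are generated is cut off in the transcript, so the keyed HMAC-SHA-256 used here is purely an assumption, as are the names `TokenVault` and `publish`. Encrypted storage of the private values, which the slide mentions, is omitted to keep the example within the standard library.

```python
import hashlib
import hmac


class TokenVault:
    """Hypothetical vault region: maps private values to tokens."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key
        self._token_by_value = {}

    def tokenize(self, value: str) -> str:
        # Assumption: a keyed HMAC gives deterministic tokens, so the same
        # value always maps to the same token and joins on tokens still work.
        if value not in self._token_by_value:
            digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256)
            self._token_by_value[value] = digest.hexdigest()
        return self._token_by_value[value]


def publish(private_rows, vault, sensitive_columns):
    """Build Public-region rows: sensitive columns become tokens, others are copied."""
    public_rows = []
    for row in private_rows:
        public_rows.append({
            col: vault.tokenize(val) if col in sensitive_columns else val
            for col, val in row.items()
        })
    return public_rows


# Private region: at a minimum a primary key plus the values to tokenize.
private_region = [
    {"id": 1, "address": "1313 Mockingbird Ln"},
    {"id": 2, "address": "1700 Pennsylvania Ave"},
    {"id": 3, "address": "1313 Mockingbird Ln"},
]
vault = TokenVault(secret_key=b"demo-key-not-for-production")
public_region = publish(private_region, vault, sensitive_columns={"address"})
```

In this sketch the Public rows keep the primary key, so analysts can still join and count by address token without ever seeing the address itself.
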
Editor's Notes

  1. Mark: Cardinal Health is a multi-billion dollar healthcare services company. Actually, we like to say we’re the business behind healthcare, because we focus on making it more cost-effective so our customers can focus on their patients. We work with pharmacies, hospitals, doctors’ offices, surgery centers and clinical labs, basically anywhere healthcare services are offered. As a leading provider of products and services in the healthcare supply chain, we have the broadest view of healthcare in the industry:
      • We have more than 33,000 employees with direct operations around the world.
      • We deliver products and services to 100,000 locations daily.
      • 85 percent of hospitals in the U.S. use Cardinal Health products and services.
      • We supply pharmaceuticals to fill 25 percent of branded prescriptions in the U.S. In fact, a third of all distributed pharmaceutical, laboratory and medical products in the U.S. and Puerto Rico flow through the Cardinal Health supply chain.
      • We are proud to be #19 on the Fortune 500 list.
  2. Mark: How we use the data specific to Big Data
  3. Mark
  4. Jeff: Some view locking down areas of functionality as a bad thing. We should embrace lockdown much as we do the brakes on a car. The brakes actually allow us to take more risks and improve agility.
  5. Jeff
  6. Jeff
  7. Jeff
  8. Mark
  9. Mark
  10. Jeff: Data is ingested under a source-specific account. The data ingestion process is loosely coupled with the transformation processes. This afforded us finer-grained control over which users and processes have permission to access raw data. It also required us to develop atomic data patterns to avoid partial data products.
  11. Jeff
  12. Jeff: This gave us the flexibility to segment ingestion privileges independently of any transformation.
  13. Jeff: This gave us the flexibility to segment ingestion privileges independently of any transformation.
  14. Mark
  15. Mark
  16. Jeff
  17. Jeff
  18. Mark
  19. Mark
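
The segmentation described in these notes (each source ingested under its own account, with transformation access granted separately) can be sketched as a small permission model. Everything named here is hypothetical: the account names, the `can_write` and `can_read` helpers, and in particular which data products read which sources, since the arrows in the slide diagram did not survive transcription.

```python
# Hypothetical ingestion accounts: each may write only to its own source area.
INGEST_ACCOUNTS = {
    "svc_sales": "sales",
    "svc_clinical": "clinical",
    "svc_logistics": "logistics",
}

# Hypothetical transformation roles: read grants are assigned per source,
# independently of who is allowed to ingest.
TRANSFORM_READS = {
    "warehouse_optimization": {"logistics"},
    "customer_insights": {"sales"},
}


def can_write(account: str, source: str) -> bool:
    """An ingestion account may land data only in its own source area."""
    return INGEST_ACCOUNTS.get(account) == source


def can_read(role: str, source: str) -> bool:
    """A transformation role may read only the sources it was granted."""
    return source in TRANSFORM_READS.get(role, set())
```

Keeping the two tables separate mirrors the loose coupling in the notes: granting a new transformation access to a source never requires touching the ingestion account for that source.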