1. 1 © Hortonworks Inc. 2011–2018. All rights reserved
Fortifying Multi-Cluster Hybrid Cloud Data
Lakes using Apache Knox
Sandeep Moré – Sr. Software Engineer
Kiran Matty – Sr. Product Manager
06/19/18
2. 2 © Hortonworks Inc. 2011–2018. All rights reserved
Agenda
• Multi-Cluster Hybrid Cloud Data lakes
• Apache Knox
• Demo
• Q&A
3. 3 © Hortonworks Inc. 2011–2018. All rights reserved
Who Are We?
Sandeep Moré
• Apache Knox PMC member
• Sr. Software Engineer @ Hortonworks
• Software Engineer / Security Gateway, Intel
Kiran Matty
• PM @ Hortonworks: Apache Knox, HDP Search/Solr, and Platform Security
• Big Data Analytics and Security @ startup, HPE, and Cisco
4. 4 © Hortonworks Inc. 2011–2018. All rights reserved
Multi-Cluster Hybrid
Cloud Data Lakes
5. 5 © Hortonworks Inc. 2011–2018. All rights reserved
Why Hybrid Cloud?
Unified Security & Governance Model
Data Lake 1, San Jose
• Cluster 1 (Structured)
• Cluster 2 (Unstructured)
• Cluster 3 (Structured)
• Cluster 4 (Unstructured)
Data Lake 2, UK
• Cluster 1 (Unstructured)
• Cluster 2 (Structured)
Workloads (typical), On-prem vs. Cloud
• Compliance: Sensitive (on-prem), Non-sensitive (cloud)
• Flexibility: Production (on-prem), Test/Demo (cloud)
• Cost Optimization: Fixed (on-prem), Variable (cloud)
Best Practice: Run your analytics workloads where the data is stored
6. 6 © Hortonworks Inc. 2011–2018. All rights reserved
Need to augment existing security controls offered by Cloud
Providers for Hadoop Workloads
Security controls by provider (AWS / Azure / GCP):
• Network Isolation
  AWS: Virtual Private Cloud (VPC)
  Azure: Microsoft Azure Virtual Network (VNet)
  GCP: Virtual Private Cloud (VPC) network
• Network Security
  AWS: Security Groups
  Azure: Network Access Control List (NACL) and Network Security Groups (NSGs)
  GCP: Firewall rules
• Identity Management
  AWS: Identity and Access Management (IAM)
  Azure: Azure Active Directory (AAD)
  GCP: Google Cloud Identity and Access Management (Cloud IAM)
7. 7 © Hortonworks Inc. 2011–2018. All rights reserved
A Few Issues across Hybrid Cloud Data Lakes
How to:
• authenticate cloud users without moving your on-prem LDAP to the cloud?
• keep unauthorized users from accessing your customer data (insider attack)?
• protect your clusters from stolen credentials (account hijacking)?
8. 8 © Hortonworks Inc. 2011–2018. All rights reserved
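AuthN Challenges: Options for Connecting to On-prem Active Directory
[Diagram: a corporate DC (AD, App) and the cloud (AD, App) connected over a VPN, with three numbered options. Recoverable labels: "Replication" and "Domain join to on-prem AD over VPN".]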
10. 10 © Hortonworks Inc. 2011–2018. All rights reserved
Apache Knox Gateway is…
• an extensible reverse proxy framework
• that can be deployed in the cloud or on-prem
• for securely exposing REST APIs and HTTP- and WebSocket-based services
• and out of the box it provides:
  • Proxying of HTTP services: REST, UIs, WebSockets
  • Authentication services: pluggable authentication and federation providers, plus token and SSO services
  • Client services: KnoxShell for consuming cluster services through Knox
  • And many other features…
11. 11 © Hortonworks Inc. 2011–2018. All rights reserved
Apache Knox Gateway is NOT…
• a Firewall
• a Load balancer
• a Kerberos replacement
12. 12 © Hortonworks Inc. 2011–2018. All rights reserved
Why Knox?
Simplified Access
• Kerberos encapsulation
• Extends API reach
• Single access point
• Multi-cluster support
• Single SSL certificate
Centralized Control
• Auditing
• Service-level authorization
• Knox Admin UI
• Service Discovery and Topology Generation Framework
Enterprise Integration
• LDAP/AD integration
• Support for SAMLv2
• SSO integration
Enhanced Security
• Proxy to abstract network details
• TLS Termination for non-SSL services
13. 13 © Hortonworks Inc. 2011–2018. All rights reserved
Apache Knox Community Snapshot
Release timeline:
• Mar 2013: Entered Incubator
• Oct 2013: 0.1.0 - 0.3.0 Incubator Releases
• Feb 2014: Graduates to Apache TLP
• Apr 2014: 0.4.0 TLP Release
• Nov 2014: 0.5.0
• May 2015: 0.6.0
• Dec 2015: 0.7.0
• Feb 2016: 0.8.0
• Apr/Aug 2016: 0.9.0/0.9.1
• Nov 2016: 0.10.0
• Dec 2016: 0.11.0
• Mar 2017: 0.12.0
• Aug 2017: 0.13.0
• Dec 2017: 0.14.0
• Feb 2018: 1.0
Community:
• Committers: 20
• Contributors from: Hortonworks, IBM, CGI, Uber, Oracle, Blue Talon, Microsoft, Talend
• @apache_knox
Apache Knox 0.14.0
• Service Discovery and Topology Generation Framework
• Add support for proxying NiFi and Livy (Spark REST Service)
• High Availability Support for Apache Solr, HBase & Kafka
Apache Knox 1.0.0
• Ambari Service Discovery Support for HA-Enabled Services
• Update Hadoop dependencies to Hadoop 3
15. 15 © Hortonworks Inc. 2011–2018. All rights reserved
Demo Coverage
How to:
authenticate users without moving your on-prem LDAP to the cloud?
• Knox Federation
keep unauthorized users from accessing your customer data (insider attack)?
• Knox AuthZ
protect your clusters from stolen credentials (account hijacking)?
• MFA* on Knox
* No out-of-the-box support
16. 16 © Hortonworks Inc. 2011–2018. All rights reserved
Knox Providers - Primer
• Providers add new features to the gateway
• These features can be used by all services
• Example providers used for federation:
• Auth Provider - Knox Federation: Header Based Pre Auth
    <provider>
        <role>federation</role>
        <name>HeaderPreAuth</name>
        <enabled>true</enabled>
        <param>
            <name>preauth.custom.header</name>
            <value>aws_user</value>
        </param>
    </provider>
• Authorization Provider - Knox AuthZ: AclsAuthz
    <provider>
        <role>authorization</role>
        <name>AclsAuthz</name>
        <enabled>true</enabled>
        <param>
            <name>hive.acl</name>
            <value>*;sales;*</value>
        </param>
    </provider>
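The `hive.acl` value above packs three semicolon-separated fields: users, groups, and IP addresses, with `*` meaning "any". The following is a minimal sketch of how such a value could be evaluated, assuming all three fields must match (AND semantics) and exact, case-sensitive comparison. It is an illustration only, not Knox's implementation; `parse_acl`, `is_allowed`, and the sample user names and IP are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    users: frozenset
    groups: frozenset
    ips: frozenset


def parse_acl(value: str) -> Acl:
    """Split a 'users;groups;ips' string into three sets of allowed values."""
    parts = value.split(";")
    if len(parts) != 3:
        raise ValueError("expected three ';'-separated fields: users;groups;ips")
    users, groups, ips = (
        frozenset(item.strip() for item in part.split(",") if item.strip())
        for part in parts
    )
    return Acl(users, groups, ips)


def _matches(allowed: frozenset, candidates) -> bool:
    # '*' is a wildcard; otherwise at least one candidate must be listed.
    return "*" in allowed or bool(allowed & set(candidates))


def is_allowed(acl: Acl, user: str, groups, ip: str) -> bool:
    """Assumed AND semantics: user, group, and IP checks must all pass."""
    return (
        _matches(acl.users, [user])
        and _matches(acl.groups, groups)
        and _matches(acl.ips, [ip])
    )


acl = parse_acl("*;sales;*")
print(is_allowed(acl, "michelle", ["sales"], "10.0.0.5"))
print(is_allowed(acl, "kate", ["devops"], "10.0.0.5"))
```

With the value `*;sales;*`, any user from any address passes as long as they belong to the `sales` group.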
17. 17 © Hortonworks Inc. 2011–2018. All rights reserved
Knox Cloud Federation
• Part of KIP-11: Cloud use cases
• KNOX-1339: Support for cloud federation
• Leverages the Knox Header Based Pre Auth provider
• Works with JDBC / Beeline / REST
• JDBC + KnoxShell used for the demo
• Federation Dispatch:
<dispatch classname="org.apache.knox.gateway.dispatch.HeaderPreAuthFederationDispatch" use-two-way-ssl="true" />
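The idea behind header-based federation is that the first gateway authenticates the user, then forwards the request to the second gateway with the user's identity in a trusted header over a two-way SSL link. Below is a minimal sketch of the header handling on the forwarding side, assuming the header name `aws_user` from the provider example. This is not the code of `HeaderPreAuthFederationDispatch`; `build_forward_headers` is a hypothetical helper, and dropping the `Authorization` header is an assumption of the sketch.

```python
IDENTITY_HEADER = "aws_user"  # matches preauth.custom.header in the provider example


def build_forward_headers(incoming: dict, authenticated_user: str,
                          identity_header: str = IDENTITY_HEADER) -> dict:
    """Return the headers to send to the downstream gateway.

    Any identity header supplied by the client is discarded so a caller
    cannot impersonate another user; the original credentials are also
    dropped (an assumption of this sketch).
    """
    blocked = {identity_header.lower(), "authorization"}
    forwarded = {k: v for k, v in incoming.items() if k.lower() not in blocked}
    # Assert the identity established by the first gateway's own authentication.
    forwarded[identity_header] = authenticated_user
    return forwarded


headers = build_forward_headers(
    {"Authorization": "Basic placeholder", "AWS_USER": "admin", "Accept": "application/json"},
    authenticated_user="michelle",
)
print(headers)
```

The downstream gateway can only trust this header if nothing else can reach it directly, which is why the dispatch element enables two-way SSL.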
18. 18 © Hortonworks Inc. 2011–2018. All rights reserved
Demo Personas
Kate
• LDAP Group: DevOps
• Cluster Access: Prod and Demo
• AWS IAM user
Michelle (Malicious Insider)
• LDAP Group: Sales
• Cluster Access: Demo
• Not an AWS IAM user
Maximus
• Hacker
19. 19 © Hortonworks Inc. 2011–2018. All rights reserved
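Demo Architecture
[Diagram: two clusters, each with a Knox gateway accepting inbound traffic on port 8443, joined by a link labeled "2-way". Clients shown: JDBC Client, Knoxline.]
• Prod (on-prem): Ambari, HDFS, Hive, Knox, LDAP
• Demo (cloud): Ambari, HDFS, Hive, Knox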
20. 20 © Hortonworks Inc. 2011–2018. All rights reserved
Scenario 1: Access on-prem cluster
[Diagram: on-prem cluster (Ambari, HDFS, Hive, Knox, LDAP) and cloud cluster (Ambari, HDFS, Hive, Knox)]
1. Hive (JDBC)
2. Authenticate Kate
3. Access HDFS
4. HDFS response
5. Response
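Step 1 is a Hive JDBC connection that goes through the Knox gateway instead of straight to HiveServer2. The sketch below builds such a connection URL, assuming Hive's HTTP transport mode and the `gateway/<topology>/hive` path. The host name, topology name, and truststore path are hypothetical placeholders, and port 8443 is the inbound port shown in the demo architecture.

```python
def knox_hive_jdbc_url(host: str, topology: str, port: int = 8443,
                       truststore: str = "") -> str:
    """Build a Hive JDBC URL that routes through a Knox topology over TLS."""
    params = ["ssl=true"]
    if truststore:
        # Optional client truststore holding the gateway's certificate.
        params.append(f"sslTrustStore={truststore}")
    params += ["transportMode=http", f"httpPath=gateway/{topology}/hive"]
    return f"jdbc:hive2://{host}:{port}/;" + ";".join(params)


print(knox_hive_jdbc_url("knox.example.com", "prod"))
```

A client such as Beeline would pass this URL with Kate's LDAP credentials, and Knox would authenticate her before dispatching to Hive.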
21. 21 © Hortonworks Inc. 2011–2018. All rights reserved
Scenario 2: Access cloud cluster by AuthN w/ on-prem LDAP
[Diagram: on-prem cluster (Ambari, HDFS, Hive, Knox, LDAP) and cloud cluster (Ambari, HDFS, Hive, Knox)]
1. GET WebHDFS
2. Authenticate Michelle
3. Dispatch request to cloud Knox
4. Header-based pre-auth
5. Access HDFS
6. HDFS response
7. Response
8. Response
Knox Federation
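The request in step 1 is a WebHDFS REST call addressed to the gateway. A minimal sketch of how that URL could be assembled follows, assuming the usual `/gateway/<topology>/webhdfs/v1` layout. The host, topology name, and HDFS path are hypothetical, and `LISTSTATUS` is just one example WebHDFS operation.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(host: str, topology: str, path: str,
                     op: str = "LISTSTATUS", port: int = 8443) -> str:
    """Build a WebHDFS URL that goes through a Knox topology."""
    clean_path = "/" + path.strip("/")  # normalize to a single leading slash
    query = urlencode({"op": op})
    return (f"https://{host}:{port}/gateway/{topology}"
            f"/webhdfs/v1{quote(clean_path)}?{query}")


print(knox_webhdfs_url("knox.example.com", "demo", "/tmp/sales data"))
```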
22. 22 © Hortonworks Inc. 2011–2018. All rights reserved
Scenario 3: Blocking Michelle’s Unauthorized Access
Attack steps:
1. AuthN
2. Run a Hive query against the customer DB to get names and phone numbers
3. Load into CSV file
4. Exfiltrate via USB drive
23. 23 © Hortonworks Inc. 2011–2018. All rights reserved
Scenario 3: Blocking Michelle’s Unauthorized Access
[Diagram: on-prem cluster (Ambari, HDFS, Hive, Knox, LDAP) and cloud cluster (Ambari, HDFS, Hive, Knox)]
1. Hive (JDBC)
2. Authorization failure
3. 403 Forbidden
Knox AuthZ
24. 24 © Hortonworks Inc. 2011–2018. All rights reserved
Scenario 4: Thwarting Maximus’s Kill Chain
1. Harvest Kate’s credentials from GitHub via social engineering
2. Create an exploit to scan and identify sensitive tables, and exfiltrate to EC2 server
3. AuthN using Kate’s stolen credentials
4. Install the exploit to scan sensitive tables
5. Chunk data and send to C2 server
6. Request for ransom
25. 25 © Hortonworks Inc. 2011–2018. All rights reserved
Scenario 4: Thwarting Maximus’s Kill Chain
MFA* on Knox
katec@newcor.com
* No out-of-the-box support for MFA
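The slides do not say which second factor the demo used, and Knox offers no MFA out of the box. As one illustration of why a second factor defeats stolen passwords, here is a minimal sketch of time-based one-time password (TOTP, RFC 6238) verification using only the Python standard library. Choosing TOTP is an assumption, and `totp` and `verify_totp` are hypothetical helpers, not part of Knox.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, at: float, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given Unix time."""
    counter = int(at // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at=None,
                window: int = 1, step: int = 30) -> bool:
    """Accept codes from the current time step and `window` steps either side."""
    now = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, now + i * step, step), submitted)
        for i in range(-window, window + 1)
    )


# RFC 6238 test secret; a real deployment would use a per-user random secret.
secret = b"12345678901234567890"
print(verify_totp(secret, totp(secret, time.time())))
```

Even with Kate's password, Maximus would also need the per-user secret (or her enrolled device) to produce a valid code.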