Artem Aliev
Bring Your Own Spark
with Enterprise Security
1 DSE BYOS Overview
2 BYOS Configuration Tools
3 Use Cases
4 BYOS vs OSS Spark Connector
5 Kerberos Demo
© DataStax, All Rights Reserved.
Connect Your Spark to DSE
[Diagram: an external Spark cluster (HDFS, Hive Metastore, cluster manager, Spark SQL) connected to a DSE cluster (DSE C*, Hive Metastore, CFS, DSE Spark SQL)]
Bring Your Own Spark!
• A simple way to
– Read Cassandra and CFS data from external Spark
– Export necessary configuration info to connect to DSE
• Includes security options
– Export necessary Jars to connect
– Attach these exported resources to a spark-submit
• Also
– A simple way to get the Spark SQL syntax that creates catalog entries for tables in Cassandra
– Read external HDFS data from DSE Spark jobs
BYOS Components
• BYOS assembly jar (add it to the Spark jars)
– Contains spark-cassandra-connector, secure transport, CFS and dependencies
$DSE_HOME/clients/dse-byos_2.10-5.0.2.jar
• Spark configuration generator (merge the result with spark-defaults.conf)
– The output contains the Cassandra host, auth type and factories
dse client-tool configuration byos-export byos.conf
• Spark SQL schema mapping generator (run the result with spark-sql)
– The SQL script creates database and table mappings for all C* tables
dse client-tool spark sql-schema --all > mapping.sql
byos.conf
#Exported node configuration properties
#Fri Jul 29 22:55:48 UTC 2016
spark.hadoop.cassandra.host=127.0.0.1
spark.hadoop.cassandra.auth.kerberos.enabled=false
spark.cassandra.auth.conf.factory=com.datastax.bdp.spark.DseByosAuthConfFactory
spark.cassandra.connection.port=9042
spark.hadoop.cassandra.ssl.enabled=false
spark.hadoop.cassandra.auth.kerberos.defaultScheme=false
spark.hadoop.cassandra.client.transport.factory=com.datastax.bdp.transport.client.TDseClientTransportFactory
spark.cassandra.connection.host=127.0.0.1
spark.hadoop.fs.cfs.impl=com.datastax.bdp.hadoop.cfs.CassandraFileSystem
spark.hadoop.cassandra.connection.native.port=9042
spark.hadoop.dse.client.configuration.impl=com.datastax.bdp.transport.client.HadoopBasedClientConfiguration
spark.cassandra.connection.factory=com.datastax.bdp.spark.DseCassandraConnectionFactory
spark.hadoop.cassandra.config.loader=com.datastax.bdp.config.DseConfigurationLoader
spark.hadoop.cassandra.connection.rpc.port=9160
spark.hadoop.dse.system_memory_in_mb=7985
spark.hadoop.cassandra.thrift.framedTransportSize=15728640
spark.hadoop.cassandra.partitioner=org.apache.cassandra.dht.Murmur3Partitioner
spark.hadoop.cassandra.dsefs.port=5598
mapping.sql
CREATE DATABASE IF NOT EXISTS test_keyspace;
USE test_keyspace;
CREATE TABLE test_table
USING org.apache.spark.sql.cassandra
OPTIONS (
keyspace "test_keyspace",
table "test_table",
pushdown "true");
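The generated script repeats the same statement shape for every table. The shell sketch below prints that shape for one keyspace/table pair, assuming the layout of the excerpt above. The helper name `mapping_stmt` is hypothetical; the real script comes from the DSE schema generator.

```shell
#!/bin/sh
# Print the Spark SQL mapping for one Cassandra table, in the shape of mapping.sql.
# Illustrative only: the real script is produced by the DSE schema generator.
mapping_stmt() {
    # $1: keyspace name, $2: table name
    cat <<EOF
CREATE DATABASE IF NOT EXISTS $1;
USE $1;
CREATE TABLE $2
USING org.apache.spark.sql.cassandra
OPTIONS (
keyspace "$1",
table "$2",
pushdown "true");
EOF
}

# Reproduces the statement shown above for test_keyspace.test_table.
mapping_stmt test_keyspace test_table
```

Writing such a statement by hand can be useful when only a few tables should be visible to Spark SQL instead of the whole schema.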
Add BYOS to Spark
• Copy dse-byos.jar, byos.conf and mapping.sql to a Spark client node
• Merge the byos.conf properties with the Spark defaults
cat byos.conf /etc/spark/conf/spark-defaults.conf > merged.conf
• Add the DSE table mappings (optional)
spark-sql --jars dse-byos*.jar --properties-file merged.conf -f mapping.sql
Run any Spark application the same way:
spark-shell --jars dse-byos*.jar --properties-file merged.conf
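Because the merge is a plain concatenation, a property defined in both files appears twice in merged.conf. The shell sketch below lists such keys so they can be resolved before merging. It assumes both files use the `key=value` or `key value` layout. The helper name `conf_duplicate_keys` and the sample values are made up for illustration and are not part of BYOS.

```shell
#!/bin/sh
# List property keys defined in both files (illustrative helper, not a DSE tool).
conf_duplicate_keys() {
    # $1, $2: properties files; a key ends at the first '=' or whitespace.
    awk '
        /^[ \t]*(#|$)/ { next }                      # skip comments and blank lines
        {
            line = $0
            sub(/^[ \t]+/, "", line)                 # drop leading whitespace
            split(line, parts, /[= \t]/)             # parts[1] is the key
            if (FILENAME == ARGV[1]) { seen[parts[1]] = 1 }
            else if (parts[1] in seen) { print parts[1] }
        }
    ' "$1" "$2"
}

# Demo with scratch files; the property values are made up for the example.
demo_dir=$(mktemp -d)
printf '%s\n' 'spark.cassandra.connection.host=127.0.0.1' \
              'spark.cassandra.connection.port=9042' > "$demo_dir/byos.conf"
printf '%s\n' '# site defaults' \
              'spark.cassandra.connection.host 10.0.0.5' \
              'spark.eventLog.enabled true' > "$demo_dir/spark-defaults.conf"

conf_duplicate_keys "$demo_dir/byos.conf" "$demo_dir/spark-defaults.conf"
```

In the demo only `spark.cassandra.connection.host` is reported, since it is the one key present in both scratch files.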
SSL Support
• Copy the DSE client SSL truststore and keystore files to the Spark nodes
• Pass the file locations to the configuration generator
dse client-tool configuration byos-export \
--set-truststore-path .truststore --set-truststore-password password \
--set-keystore-path .keystore --set-keystore-password password \
byos.conf
• Tip: use the Spark --files parameter to distribute the files for a YARN job
spark-shell --jars dse-byos*.jar --properties-file merged.conf \
--files .truststore,.keystore
Kerberos
• Kerberos setup on the Spark cluster:
Just specify the preferred JAAS configuration in .java.login.config
DseClient {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true
renewTGT=true;
};
• No Kerberos on the Spark cluster? (less secure)
Request a DSE token manually while generating the config
dse client-tool configuration byos-export --generate-token byos.conf
[Diagram: Driver, Executors, Kerberos auth, DSE token]
Usage: Migrate/Save/Load Data
• DSE tables to Hadoop and back
• Streaming
• DSE Max CFS and HDFS
• spark-shell
• dse spark
scala> sc.textFile("hdfs://hadoop1/data").saveAsTextFile("cfs:/data")
scala> val df = sqlContext.read.format("org.apache.spark.sql.cassandra").
     |   options(Map("keyspace" -> "t", "table" -> "t")).load()
scala> df.write.format("json").save("/tmp/t.json")
scala> sc.textFile("cfs:/data").saveAsTextFile("hdfs://hadoop1/data")
session_stream.saveToCassandra("web", "sessions")
Usage: JOIN/Enrich with C* Tables
• All C* tables are available after mapping
spark-sql> select * from hive_table h join cassandra_table c on h.key = c.key
• Join your RDD with C*
scala> hrdd.joinWithCassandraTable("t", "t")
KILLER FEATURE: Enrich your stream with C* on the fly
click_stream.joinWithCassandraTable("web", "sessions")
Building Full Lambda Architecture?
Add Speed Layer!
[Diagram: DSE added as the speed layer]
HBase?
Still HBase?
• Double master/slave architecture: one for server, one for storage
• Master-less architecture
OSS Spark Connector or DSE BYOS?
Feature                                           OSS   DSE BYOS
DataStax Official Support                         NO    YES
Spark SQL Source Tables / Cassandra DataFrames    YES   YES
CassandraRDD batch and streaming                  YES   YES
C* to Spark-SQL table mapping generator           NO    YES
Spark Configuration Generator                     NO    YES
Cassandra File System Access                      NO    YES
SSL Encryption                                    YES   YES
User/password authentication                      YES   YES
Kerberos authentication                           NO    YES
Kerberos Demo
• No time for a live demo. Find me at Meet Expert for it.
• MIT Kerberos usage is well documented.
• MS Domain Controller will be used
• Cloudera and MapR use MIT Kerberos
• Hortonworks supports Active Directory
• DataStax Enterprise full support:
• Kerberos Auth
• LDAP Auth
• LDAP Roles
Demo Servers
• Domain Controller: Kerberos, Secure LDAP, DNS
– Windows 2012 R2 server
– Realm: DC.DATASTAX.COM
– DNS Domain: dc.datastax.com
• 2 Hadoop nodes (h1, h2)
– Spark 1.6.1, Hadoop 2.7, BYOS 5.0.2
– Ubuntu LTS 14.04
• 2 DataStax Enterprise 5.0 nodes (c1, c2)
– DSE 5.0.2
– Ubuntu LTS 14.04
Domain Controller Setup
• DNS forward and reverse zones
• Secure LDAP
• Ambari setup wizard
• LDAP DseRoleManager (Optional)
• Organizational Units for Hadoop and DSE users/principals
Linux Join the Domain (Optional)
• REALMD and SSSD
#> apt-get install realmd sssd samba-common samba-common-bin samba-libs \
sssd-tools krb5-user adcli packagekit vim ntp -y
#> realm --verbose join -U Administrator DC.DATASTAX.COM
# optional: create home directories for domain users
#> echo 'session required pam_mkhomedir.so skel=/etc/skel/ umask=0022' >> /etc/pam.d/common-session
• Various workarounds and additional steps will be required for your Linux distribution
#> ln -s /usr/lib/x86_64-linux-gnu/ldb /usr/lib/x86_64-linux-gnu/samba
• Security will need to be tuned
Ambari Kerberos Wizard
• Admin -> Kerberos -> Active Directory
• Fill in the DC data
• Next, next, next. That will create a bunch of Windows users and keytabs for them
• Configure Hadoop component security and permissions
DataStax Enterprise
On Windows:
• Create a 'dse' user in the GUI.
• Create DSE keytabs for each node:
c:>ktpass -princ HTTP/c1.dc.datastax.com@DC.DATASTAX.COM -mapUser dse -pass ****** -crypto all -out tmp.keytab
c:>ktpass -princ dse/c1.dc.datastax.com@DC.DATASTAX.COM -mapUser dse -pass ****** -crypto all -in tmp.keytab -out c1.keytab
• Copy the keytabs to the appropriate nodes
Enable Kerberos on DSE nodes:
https://docs.datastax.com/en/datastax_enterprise/5.0/datastax_enterprise/unifiedAuth/configAuthenticate.html
DataStax Enterprise
• cassandra.yaml
authenticator: com.datastax.bdp.cassandra.auth.DseAuthenticator
authorizer: com.datastax.bdp.cassandra.auth.DseAuthorizer
• dse.yaml
authentication_options:
    enabled: true
kerberos_options:
• Replace the default cassandra user:
cqlsh> create role 'cassandra@DC.DATASTAX.COM' with SUPERUSER = true AND LOGIN = true;
• User for the Hadoop Spark Thrift Server:
cqlsh> create role 'hive/hdp0.dc.datastax.com@DC.DATASTAX.COM' with LOGIN = true;
BYOS
• Generate byos.conf the usual way
dse client-tool configuration byos-export byos.conf
• Create .java.login.config in the Hadoop user's home directory:
DseClient {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true
renewTGT=true;
};
• Keytab usage can also be configured in this file
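A keytab-based variant of the DseClient entry could look like the sketch below. It assumes the standard Krb5LoginModule options `useKeyTab`, `keyTab`, `principal` and `doNotPrompt`; the slides do not say whether BYOS needs further options. The helper name `write_keytab_jaas` is hypothetical. The keytab path and principal reuse the Thrift Server values from this deck as placeholders.

```shell
#!/bin/sh
# Write a keytab-based JAAS entry for the DseClient login context.
# The keytab path and principal passed in are placeholders for your own values.
write_keytab_jaas() {
    # $1: target file, $2: keytab path, $3: Kerberos principal
    cat > "$1" <<EOF
DseClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="$2"
  principal="$3"
  doNotPrompt=true;
};
EOF
}

# Example run; written to a scratch file, not to ~/.java.login.config.
jaas_file=$(mktemp)
write_keytab_jaas "$jaas_file" /etc/security/keytabs/hive.service.keytab \
    'hive/hdp0.dc.datastax.com@DC.DATASTAX.COM'
cat "$jaas_file"
```

With a keytab the login does not depend on a ticket cache filled by kinit, which suits long-running services.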
Spark
#> kinit
Password for cassandra@DC.DATASTAX.COM:
• Add CFS to the spark.yarn.access.namenodes property to request a C* token.
#> spark-shell --master yarn-client --jars dse-byos*.jar --properties-file merged.conf \
--conf spark.yarn.access.namenodes=cfs://node1/
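A job started without a valid ticket only fails once it tries to authenticate, so a check before launch can save time. The shell sketch below assumes `klist -s` from MIT Kerberos, which exits non-zero when the credential cache is missing or expired. The function names and the `KLIST_CMD` override are invented for this example.

```shell
#!/bin/sh
# Succeed only when the credential cache holds a valid (non-expired) ticket.
# KLIST_CMD is an override hook invented for this sketch; the default is klist.
have_ticket() {
    "${KLIST_CMD:-klist}" -s
}

# Run the given command only if a ticket is present.
run_with_ticket() {
    if have_ticket; then
        "$@"
    else
        echo "no valid Kerberos ticket found; run kinit first" >&2
        return 1
    fi
}
```

It would be used by prefixing the launch command, for example `run_with_ticket spark-shell --master yarn-client --jars dse-byos*.jar --properties-file merged.conf`.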
Spark Thrift Server
Start:
#> kinit -kt /etc/security/keytabs/hive.service.keytab hive/hdp0.dc.datastax.com@DC.DATASTAX.COM
#> cat /etc/spark/conf/spark-thrift-sparkconf.conf byos.conf > byos-thrift.conf
#> start-thriftserver.sh --properties-file byos-thrift.conf --jars dse-byos*.jar
Connect:
#> kinit
#> beeline -u 'jdbc:hive2://hdp0:10015/default;principal=hive/_HOST@DC.DATASTAX.COM'
Bring Your Own Spark!
[Diagram: an external Spark cluster (HDFS, Hive Metastore, cluster manager (YARN), Spark SQL) connected to DSE (Cassandra, Hive Metastore, CFS, DSE Spark SQL)]

Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...DataStax
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...DataStax
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)DataStax
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsDataStax
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingDataStax
 
Innovation Around Data and AI for Fraud Detection
Innovation Around Data and AI for Fraud DetectionInnovation Around Data and AI for Fraud Detection
Innovation Around Data and AI for Fraud DetectionDataStax
 




DataStax | DSE: Bring Your Own Spark (with Enterprise Security) (Artem Aliev) | Cassandra Summit 2016

  • 1. Artem Aliev Bring Your Own Spark with Enterprise Security
  • 2. 1. DSE BYOS Overview; 2. BYOS Configuration Tools; 3. Use Cases; 4. BYOS vs OSS Spark Connector; 5. Kerberos Demo
  • 3. Connect Your Spark to DSE. Diagram labels: HDFS, Hive Metastore, Cluster Manager, Spark SQL; DSE C*, Hive Metastore, CFS, DSE Spark SQL
  • 4. Connect Your Spark to DSE. Diagram labels: HDFS, Hive Metastore, Cluster Manager, Spark SQL; Hive Metastore, CFS, DSE Spark SQL, DSE C*
  • 5. Bring Your Own Spark!
      • A simple way to
          – Read Cassandra and CFS data from external Spark
          – Export necessary configuration info to connect to DSE
      • Includes security options
          – Export necessary jars to connect
          – Attach these exported resources to a spark-submit
      • Also
          – Simple way to get the Spark SQL syntax to create catalog entries for tables in Cassandra
          – Read external HDFS data from DSE Spark jobs
  • 6. BYOS Components
      • BYOS assembly jar (add it to the Spark jars)
          – Contains spark-cassandra-connector, secure transport, CFS and dependencies
          – $DSE_HOME/clients/dse-byos_2.10-5.0.2.jar
      • Spark configuration generator (merge the result with spark-defaults.conf)
          – The output contains the Cassandra host, auth type and factories
          – dse client-tool configuration byos-export byos.conf
      • Spark SQL schema mapping generator (run the result with spark-sql)
          – The SQL script creates database and table mappings for all C* tables
          – dse client-tool spark sql-schema -all > mapping.sql
  • 7. byos.conf
      #Exported node configuration properties
      #Fri Jul 29 22:55:48 UTC 2016
      spark.hadoop.cassandra.host=127.0.0.1
      spark.hadoop.cassandra.auth.kerberos.enabled=false
      spark.cassandra.auth.conf.factory=com.datastax.bdp.spark.DseByosAuthConfFactory
      spark.cassandra.connection.port=9042
      spark.hadoop.cassandra.ssl.enabled=false
      spark.hadoop.cassandra.auth.kerberos.defaultScheme=false
      spark.hadoop.cassandra.client.transport.factory=com.datastax.bdp.transport.client.TDseClientTransportFactory
      spark.cassandra.connection.host=127.0.0.1
      spark.hadoop.fs.cfs.impl=com.datastax.bdp.hadoop.cfs.CassandraFileSystem
      spark.hadoop.cassandra.connection.native.port=9042
      spark.hadoop.dse.client.configuration.impl=com.datastax.bdp.transport.client.HadoopBasedClientConfiguration
      spark.cassandra.connection.factory=com.datastax.bdp.spark.DseCassandraConnectionFactory
      spark.hadoop.cassandra.config.loader=com.datastax.bdp.config.DseConfigurationLoader
      spark.hadoop.cassandra.connection.rpc.port=9160
      spark.hadoop.dse.system_memory_in_mb=7985
      spark.hadoop.cassandra.thrift.framedTransportSize=15728640
      spark.hadoop.cassandra.partitioner=org.apache.cassandra.dht.Murmur3Partitioner
      spark.hadoop.cassandra.dsefs.port=5598
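Before merging an exported byos.conf into a Spark configuration, it can help to confirm that the connection-related properties from slide 7 are actually present. The following is a minimal shell sketch under that assumption; the helper name `check_byos_conf` and the choice of which keys count as required are illustrative and not part of DSE.

```shell
# Hypothetical helper (not a DSE tool): check that an exported byos.conf
# defines the connection-related properties listed on slide 7.
check_byos_conf() {
  conf="$1"
  missing=0
  for key in \
    spark.cassandra.connection.host \
    spark.cassandra.connection.port \
    spark.cassandra.connection.factory \
    spark.cassandra.auth.conf.factory
  do
    # A property line starts with the key followed by "=".
    # The dots act as regex wildcards here, which is loose but
    # sufficient for a sanity check.
    if ! grep -q "^${key}=" "$conf"; then
      echo "missing: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_byos_conf byos.conf` returns 0 when all four keys are found and 1 otherwise, printing each missing key to standard error.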
  • 8. mapping.sql
      CREATE DATABASE IF NOT EXISTS test_keyspace;
      USE test_keyspace;
      CREATE TABLE test_table
          USING org.apache.spark.sql.cassandra
          OPTIONS (
              keyspace "test_keyspace",
              table "test_table",
              pushdown "true");
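The mapping script on slide 8 is produced by `dse client-tool spark sql-schema`. To add a mapping for a single table by hand, the same statement shape can be printed with a small shell function. This is only a sketch that mirrors the slide's SQL; the function name `mapping_sql` and the `web`/`sessions` names in the usage note are illustrative.

```shell
# Hypothetical helper: print a Spark SQL mapping for one Cassandra table,
# in the same shape as the generated mapping.sql on slide 8.
mapping_sql() {
  keyspace="$1"
  table="$2"
  cat <<EOF
CREATE DATABASE IF NOT EXISTS ${keyspace};
USE ${keyspace};
CREATE TABLE ${table}
  USING org.apache.spark.sql.cassandra
  OPTIONS (keyspace "${keyspace}", table "${table}", pushdown "true");
EOF
}
```

For example, `mapping_sql web sessions > sessions.sql` writes a script that could then be passed to `spark-sql -f`.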
  • 9. Add BYOS to the Spark
      • Copy dse-byos.jar, byos.conf and mapping.sql to a Spark client node
      • Merge the byos.conf properties with the Spark defaults
          cat byos.conf /etc/spark/conf/spark-defaults.conf > merged.conf
      • Add the DSE table mappings (optional)
          spark-sql --jars dse-byos*.jar --properties-file merged.conf -f mapping.sql
      • Run any Spark application the same way:
          spark-shell --jars dse-byos*.jar --properties-file merged.conf
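Plain concatenation, as on slide 9, leaves duplicate keys in merged.conf whenever both files define the same property. If the file is read the way `java.util.Properties` reads it, the later entry wins, so the order of the `cat` arguments matters. The sketch below is one way to make that precedence explicit and keep a single line per key; `merge_props` is a hypothetical helper, not part of Spark or DSE, and it assumes simple one-line `key=value` or `key value` entries.

```shell
# Hypothetical helper: merge Spark properties files so that, for duplicate
# keys, the entry from the LAST file wins and each key appears once.
merge_props() {
  awk '
    /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
    {
      line = $0
      sub(/^[ \t]+/, "", line)           # drop leading whitespace
      key = line
      sub(/[= \t].*$/, "", key)          # key ends at the first "=" or blank
      if (!(key in seen)) { order[++n] = key; seen[key] = 1 }
      value[key] = line                  # later files overwrite earlier ones
    }
    END { for (i = 1; i <= n; i++) print value[order[i]] }
  ' "$@"
}
```

With this helper, `merge_props /etc/spark/conf/spark-defaults.conf byos.conf > merged.conf` lets the exported DSE settings override any conflicting defaults; swapping the arguments gives the defaults priority instead.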
  • 10. SSL Support
      • Copy the DSE client SSL certificate truststore and keystore files to the Spark nodes
      • Pass the file locations to the configuration generator
          dse client-tool configuration byos-export --set-truststore-path .truststore --set-truststore-password password --set-keystore-path .keystore --set-keystore-password password byos.conf
      • Tip: you can use the --files Spark parameter to distribute the files for a YARN job
          spark-shell --jars dse-byos*.jar --properties-file merged.conf --files .truststore,.keystore
  • 11. Kerberos
      • Kerberos set up on the Spark cluster: just specify the preferred JAAS configuration in .java.login.config
          DseClient {
              com.sun.security.auth.module.Krb5LoginModule required
              useTicketCache=true
              renewTGT=true;
          };
      • No Kerberos on the Spark cluster? (less secure) Request a DSE token manually while generating the config
          dse client-tool configuration byos-export --generate-token byos.conf
      • Diagram labels: Driver, Executors, Kerberos Auth, DSE Token
  • 12. Usage: Migrate/Save/Load Data
      • DSE tables to Hadoop and back
      • Streaming
      • DSE Max CFS and HDFS
      • spark-shell
      • dse spark
      scala> sc.textFile("hdfs://hadoop1/data").saveAsTextFile("cfs:/data")
      scala> val df = sqlContext.read.format("org.apache.spark.sql.cassandra")
                 .options(Map("keyspace" -> "t", "table" -> "t")).load()
             df.write.format("json").save("/tmp/t.json")
      scala> sc.textFile("cfs:/data").saveAsTextFile("hdfs://hadoop1/data")
      session_stream.saveToCassandra("web", "sessions")
  • 13. Usage: JOIN/Enrich with C* Tables
      • All C* tables are available after mapping
          spark-sql> select * from hive_table h join cassandra_table c on h.key = c.key
      • Join your RDD with C*
          scala> hrdd.joinWithCassandraTable("t", "t")
      • KILLER FEATURE: enrich your stream with C* on the fly
          click_stream.joinWithCassandraTable("web", "sessions")
  • 14. Building Full Lambda Architecture?
  • 15. Add Speed Layer! Diagram labels: DSE, DSE
  • 16. HBase?
  • 17. Still HBase? Double master/slave architecture (one for server, one for storage) vs. master-less architecture
  • 18. OSS Spark Connector or DSE BYOS?
      Feature                                              | OSS | DSE BYOS
      DataStax Official Support                            | NO  | YES
      Spark SQL Source Tables / Cassandra DataFrames       | YES | YES
      CassandraRDD batch and streaming                     | YES | YES
      C* to Spark-SQL table mapping generator              | NO  | YES
      Spark Configuration Generator                        | NO  | YES
      Cassandra File System Access                         | NO  | YES
      SSL Encryption                                       | YES | YES
      User/password authentication                         | YES | YES
      Kerberos authentication                              | NO  | YES
  • 20. Kerberos Demo • No time for a live demo. Find me at Meet Expert for it
  • 27. Kerberos Demo
      • MIT Kerberos usage is well documented.
      • MS Domain Controller will be used
      • Cloudera and MapR use MIT Kerberos
      • Hortonworks supports Active Directory
      • DataStax Enterprise full support:
          – Kerberos Auth
          – LDAP Auth
          – LDAP Roles
  • 28. Demo Servers
      • Realm: DC.DATASTAX.COM
      • DNS Domain: dc.datastax.com
      • Domain Controller (Windows 2012 R2 server): Kerberos, Secure LDAP, DNS
      • 2 Hadoop nodes (h1, h2): Spark 1.6.1, Hadoop 2.7, BYOS 5.0.2, Ubuntu LTS 14.04
      • 2 DataStax Enterprise 5.0 nodes (c1, c2): DSE 5.0.2, Ubuntu LTS 14.04
  • 29. Domain Controller Setup
      • DNS forward and reverse zones
      • Secure LDAP
      • Ambari setup wizard
      • LDAP DseRoleManager (optional)
      • Organization Units for Hadoop and DSE users/principals
  • 30. Linux Join the Domain (Optional)
      • REALMD and SSSD
          #> apt-get install realmd sssd samba-common samba-common-bin samba-libs sssd-tools krb5-user adcli packagekit vim ntp -y
          #> realm --verbose join -U Administrator DC.DATASTAX.COM
          # optional: create home directories for domain users
          #> echo 'session required pam_mkhomedir.so skel=/etc/skel/ umask=0022' >> /etc/pam.d/common-session
      • Various workarounds/additional steps for your Linux distribution will be required
          #> ln -s /usr/lib/x86_64-linux-gnu/ldb /usr/lib/x86_64-linux-gnu/samba
      • Security will need to be tuned
  • 31. Ambari Kerberos Wizard
      • Admin -> Kerberos -> Active Directory
      • DC data:
      • next, next, next
        That will create a bunch of Windows users and keytabs for them
      • Configure Hadoop component security and permissions
  • 32. DataStax Enterprise
      On Windows:
      • Create a 'dse' user in the GUI.
      • Create DSE keytabs for each node:
          c:>ktpass -princ HTTP/c1.dc.datastax.com@DC.DATASTAX.COM -mapUser dse -pass ****** -crypto all -out tmp.keytab
          c:>ktpass -princ dse/c1.dc.datastax.com@DC.DATASTAX.COM -mapUser dse -pass ****** -crypto all -in tmp.keytab -out c1.keytab
      • Copy the keytabs to the appropriate node
      Enable Kerberos on DSE nodes:
          https://docs.datastax.com/en/datastax_enterprise/5.0/datastax_enterprise/unifiedAuth/configAuthenticate.html
  • 33. DataStax Enterprise
      • dse.yaml
          authenticator: com.datastax.bdp.cassandra.auth.DseAuthenticator
          authorizer: com.datastax.bdp.cassandra.auth.DseAuthorizer
          authentication_options:
              enabled: true
          kerberos_options:
      • Replace the default cassandra user:
          cqlsh> create role 'cassandra@DC.DATASTAX.COM' with SUPERUSER = true AND LOGIN = true;
      • User for the Hadoop Spark Thrift Server:
          cqlsh> create role 'hive/hdp0.dc.datastax.com@DC.DATASTAX.COM' with LOGIN = true;
  • 34. BYOS
      • Generate byos.conf the usual way
          dse client-tool configuration byos-export byos.conf
      • Create .java.login.config in the Hadoop user's home directory:
          DseClient {
              com.sun.security.auth.module.Krb5LoginModule required
              useTicketCache=true
              renewTGT=true;
          };
      • Keytab usage can also be configured in this file
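Slide 34 notes that keytab usage can be configured in .java.login.config but does not show it. The sketch below writes either the ticket-cache entry from the slide or a keytab-based variant. `useKeyTab`, `keyTab` and `principal` are standard `Krb5LoginModule` options, but whether the deck's demo used them is an assumption; the function name `write_jaas_config` and the paths in the usage note are illustrative.

```shell
# Hypothetical helper: write a JAAS DseClient entry into
# <dir>/.java.login.config. With only a directory it writes the
# ticket-cache form from the slide; with a keytab path and principal
# it writes a keytab-based entry instead (an assumed variant).
write_jaas_config() {
  target="$1/.java.login.config"
  keytab="${2:-}"
  principal="${3:-}"
  if [ -n "$keytab" ] && [ -n "$principal" ]; then
    cat > "$target" <<EOF
DseClient {
  com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="${keytab}"
    principal="${principal}";
};
EOF
  else
    cat > "$target" <<'EOF'
DseClient {
  com.sun.security.auth.module.Krb5LoginModule required
    useTicketCache=true
    renewTGT=true;
};
EOF
  fi
}
```

For example, `write_jaas_config "$HOME"` reproduces the slide's file, while `write_jaas_config "$HOME" /etc/security/keytabs/hive.service.keytab hive/hdp0.dc.datastax.com@DC.DATASTAX.COM` writes a keytab-based entry for the service principal used elsewhere in the deck.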
  • 35. Spark
      #> kinit
      Password for cassandra@DC.DATASTAX.COM:
      • Add CFS to the spark.yarn.access.namenodes property to request a C* token.
      #> spark-shell --master yarn-client --jars dse-byos*.jar --properties-file merged.conf --conf spark.yarn.access.namenodes=cfs://node1/
  • 36. Spark Thrift Server
      Start:
      #> kinit -kt /etc/security/keytabs/hive.service.keytab hive/hdp0.dc.datastax.com@DC.DATASTAX.COM
      #> cat /etc/spark/conf/spark-thrift-sparkconf.conf byos.conf > byos-thrift.conf
      #> start-thriftserver.sh --properties-file byos-thrift.conf --jars dse-byos*.jar
      Connect:
      #> kinit
      #> beeline -u 'jdbc:hive2://hdp0:10015/default;principal=hive/_HOST@DC.DATASTAX.COM'
  • 37. Bring Your Own Spark! Diagram labels: HDFS, Hive Metastore, Cluster Manager (YARN), Spark SQL; Cassandra, Hive Metastore, CFS, DSE Spark SQL

Editor's notes

  1. It is not a Way of the Samurai
  2. It is not a Way of the Samurai
  3. It is not a Way of the Samurai
  4. That’s the way!