Spark is an execution framework designed to operate on distributed systems like Cassandra. It's a handy tool for many things, including ETL (extract, transform, and load) jobs. In this session, let me share with you some tips and tricks that I have learned through experience. I'm no oracle, but I can guarantee these tips will get you well down the path of pulling your relational data into Cassandra.
About the Speaker
Jim Hatcher Principal Architect, IHS Markit
Jim Hatcher is a software architect with a passion for data. He has spent most of his 20 year career working with relational databases, but he has been working with Big Data technologies such as Cassandra, Solr, and Spark for the last several years. He has supported systems with very large databases at companies like First Data, CyberSource, and Western Union. He is currently working at IHS, supporting an Electronic Parts Database which tracks half a billion electronic parts using Cassandra.