** Apache Cassandra Certification Training: https://www.edureka.co/cassandra **
This presentation gives a detailed introduction to NoSQL and to the Apache Cassandra questions and answers you need to prepare for an interview. Brush up your knowledge of Cassandra, its various database elements, and how to configure the database.
2. APACHE CASSANDRA CERTIFICATION TRAINING https://www.edureka.co/cassandra
Agenda
❖ Cassandra Salary Trend
❖ Cassandra History and Its Users in the Market
❖ Performance Comparison
❖ General NoSQL Questions
❖ Cassandra Beginners Questions
❖ Cassandra Advanced Questions
Cassandra Salary Trend
Salary by years of experience:
1-4 years: ₹ 413,000
5-9 years: ₹ 909,694
>10 years: ₹ 1,922,300
Source: Payscale
Cassandra History
• Cassandra was developed at Facebook for inbox search
• Open-sourced by Facebook in July 2008
• Accepted into the Apache Incubator in March 2009
• Top-level Apache project since February 2010
• Influenced by “Google BigTable” and “Amazon Dynamo”
NoSQL General Questions
What is a NoSQL Database?
1
• NoSQL is also read as "Not only SQL" to emphasize that such databases may also support SQL-like query languages of the kind used in relational databases.
• A NoSQL database provides a mechanism to store and retrieve data that is modeled by means other than the tabular relations used in relational databases.
What are the key features of any NoSQL Database?
2
What are the different types of NoSQL Databases?
3
01 Key-Value Store
02 Document Store
03 Column Store
04 Graph Database
What is Key Value Store DB? Explain with an example.
4
• All data in the database consists of an indexed key and a value.
• A key may correspond to one or multiple values (as in a hash table).
• It gives great performance and can be scaled very easily as business needs grow.
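As a minimal sketch of the key-value idea above, the following Python class keeps everything in an in-memory hash table. The class name, method names and sample keys are hypothetical and chosen only for illustration; a real key-value database adds persistence and distribution on top of this access pattern.

```python
class KeyValueStore:
    """Toy in-memory key-value store: every value is reached through its key."""

    def __init__(self):
        self._data = {}  # a hash table mapping key -> value

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:1:name", "Alice")        # a key with a single value
store.put("user:1:langs", ["en", "hi"])  # a key whose value holds several items
```

Lookups go straight from key to value, which is why this model is fast and simple to scale by spreading keys across machines.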
What is Document Store DB? Explain with an example.
5
• Each data record is a JSON or XML representation of key-value pairs.
• Every record can have a different set of fields.
• Document DBs are similar to key-value stores
• The difference is that the key is associated with a document
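The document model above can be sketched in Python as keys that map to JSON-style documents. The keys, field names and helper functions below are hypothetical examples; note that the two records do not share the same set of fields.

```python
import json

# Each key is associated with a whole document, not a single value.
documents = {
    "emp-1": {"name": "Asha", "role": "DBA"},
    "emp-2": {"name": "Ravi", "skills": ["CQL", "Java"], "city": "Pune"},
}


def fields_of(doc_key):
    """Return the field names present in one document."""
    return sorted(documents[doc_key].keys())


def to_json(doc_key):
    """Serialize one document to its JSON representation."""
    return json.dumps(documents[doc_key], sort_keys=True)
```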
What is Column Store DB? Explain with an example.
6
• Data is stored in cells that are grouped into columns of data rather than rows of data.
• Columns are logically grouped into column families.
• One row may hold one or multiple data records and is indexed by a partition key.
What is Graph DB? Explain with an example.
7
• A flexible graphical representation is used.
• The key purpose is to store relationships between nodes.
• In the example figure, the nodes are Id 1, 2 and 3, and the properties of node 1 are Name and Age
• The edges are Id 100, 101, 102, 103, 104 and 105
Cassandra Beginners Questions
What is Apache Cassandra?
1
• Apache Cassandra is a free and open-source distributed NoSQL database management
system designed to handle large amounts of data across many commodity servers,
providing high availability with no single point of failure.
What are the features of Apache Cassandra?
2
What are the Different types of Data Model?
3
• Conceptual Data Model
• Logical Data Model
• Physical Data Model
What are the Key Differences between Cassandra and Traditional RDBMS?
4
What are the different Database Elements of Cassandra?
5
What is CQLSH? And why is it used?
6
• cqlsh is a command-line shell that lets users communicate with a Cassandra database using the Cassandra Query Language (CQL).
By using cqlsh, you can do the following things:
• Define a schema
• Insert data, and
• Execute a query
What is a YAML file in Cassandra?
7
• The cassandra.yaml file is the main configuration file for Cassandra.
• After changing properties in the cassandra.yaml file, you must restart the node for the
changes to take effect.
What are Clusters in Cassandra?
8
• The outermost structure in Cassandra is the cluster
• A cluster is a container for keyspaces
• It is sometimes called the ring, because Cassandra assigns data to nodes in the cluster by arranging them in a ring
• Each node holds replicas for different ranges of data
What is a Keyspace in Cassandra?
9
• A keyspace is the outermost container for data in Cassandra
• Like a relational database, a keyspace has a name and a set of attributes that define
keyspace-wide behaviour
• A keyspace is used to group column families together
How is a Keyspace created in Cassandra, and what parameters are used?
10
What are durable writes?
11
• The durable writes option tells Cassandra whether or not to use the commit log for updates on the current keyspace.
• This option is not mandatory.
• The default value for durable writes is true.
What are the steps to Describe, USE and Delete the Keyspaces?
12
What do you mean by replication factor?
13
• Cassandra stores copies (called replicas) of each row based on the row key
• The replication factor refers to the number of nodes that will act as copies (replicas) of each
row of data
What do you mean by replication Strategy?
14
• The replica placement strategy refers to how the replicas will be placed in the ring
• There are different strategies that ship with Cassandra for determining which nodes will get
copies of which keys
• There are mainly two strategies:
• SimpleStrategy
• NetworkTopologyStrategy
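The replication strategy and replication factor, together with the durable writes option covered earlier, are supplied when a keyspace is created. As a hedged sketch, the Python helper below only builds the text of a CQL CREATE KEYSPACE statement; it does not connect to a cluster. The helper names, the keyspace name demo and the datacenter names dc1 and dc2 are assumptions made for illustration.

```python
def replication_map(strategy, replication):
    """Render the CQL replication map for one of the two common strategies."""
    if strategy == "SimpleStrategy":
        # A single replication factor for the whole cluster.
        options = {"replication_factor": int(replication)}
    elif strategy == "NetworkTopologyStrategy":
        # One replica count per datacenter name.
        options = {dc: int(rf) for dc, rf in replication.items()}
    else:
        raise ValueError(f"unsupported strategy: {strategy}")
    pairs = [f"'class': '{strategy}'"]
    pairs += [f"'{key}': {value}" for key, value in options.items()]
    return "{" + ", ".join(pairs) + "}"


def create_keyspace_cql(name, strategy, replication, durable_writes=True):
    """Build a CREATE KEYSPACE statement as a string."""
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH replication = {replication_map(strategy, replication)} "
        f"AND durable_writes = {str(durable_writes).lower()};"
    )
```

For example, create_keyspace_cql("demo", "SimpleStrategy", 3) yields a statement that could be pasted into cqlsh, and passing {"dc1": 3, "dc2": 2} with NetworkTopologyStrategy sets a replica count per datacenter.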
What is Simple Strategy?
15
• It is used for simple, single-datacenter clusters
• It places the first replica on a node determined by the partitioner
• Additional replicas are placed on the next nodes clockwise around the ring, without considering rack or datacenter location
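The clockwise placement described above can be sketched as follows. This is a simplified illustration, not Cassandra's code: the ring is assumed to be a plain list of node names already sorted in clockwise token order, and the function and node names are hypothetical.

```python
def simple_strategy_replicas(ring, first_index, replication_factor):
    """Pick replicas by walking clockwise from the node chosen by the partitioner.

    ring: node names in clockwise token order.
    first_index: position of the node that owns the first replica.
    """
    count = min(replication_factor, len(ring))  # cannot exceed the node count
    return [ring[(first_index + step) % len(ring)] for step in range(count)]
```

With a ring of n1, n2, n3, n4 and the first replica on n4, a replication factor of 3 wraps around the ring and selects n4, n1 and n2. Rack and datacenter are never consulted.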
What is Network Topology Strategy?
16
• This is used when we deploy a cluster across multiple datacenters
• The primary considerations when placing replicas are:
• Satisfying reads locally, without incurring cross-datacenter latency
• Handling failure scenarios
What is a Column Family?
17
• A column family is a container for an ordered collection of rows, each of which is itself an
ordered collection of columns
• You can freely add any column to any column family at any time, depending on your needs
• The comparator value indicates how columns will be sorted when they are returned to you
in a query
What is a Row in Cassandra, and what are its different elements?
18
• Row is a collection of sorted columns
• It is the smallest unit that stores related data in Cassandra
• Any component of a Row can store data or metadata
What is a Primary Key? And what are its different types?
19
• The primary key is a column, or a set of columns, used to uniquely identify a row
Differentiate between the various types of Primary Keys in Cassandra.
20
Differentiate between Static and Dynamic CQL Tables
21
• A static table uses a relatively static set of column names and is similar to a relational database table
• A dynamic table allows you to pre-compute result sets and store them in a single row for efficient data retrieval
Differentiate between Drop and Truncate in CQLSH
22
• The DROP TABLE command removes the specified table, including all of its data, from the keyspace.
• The TRUNCATE command permanently deletes all the rows of a table but keeps the table definition.
Cassandra Advanced Questions
What is Gossip Protocol ?
1
• Gossip Protocol in Cassandra is a peer-to-peer communication protocol in which each node chooses the peers with which it exchanges state information.
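A rough sketch of one idea behind gossip: two nodes exchange what they know and each keeps the newer entry for every node. The version numbers stand in for heartbeat state, the node names are hypothetical, and the exchange order is fixed here to keep the example deterministic, whereas real gossip picks peers at random on a timer.

```python
def merge_states(local, remote):
    """Keep, for every known node, the entry with the higher version."""
    merged = dict(local)
    for node, version in remote.items():
        if version > merged.get(node, -1):
            merged[node] = version
    return merged


def gossip_exchange(views, a, b):
    """Nodes a and b swap state; both end up holding the merged view."""
    merged = merge_states(views[a], views[b])
    views[a] = dict(merged)
    views[b] = dict(merged)


# Initially every node only knows its own heartbeat version.
views = {"n1": {"n1": 5}, "n2": {"n2": 3}, "n3": {"n3": 7}}
gossip_exchange(views, "n1", "n2")
gossip_exchange(views, "n2", "n3")
gossip_exchange(views, "n3", "n1")
```

After these three exchanges every node holds the same view of the cluster, even though no node ever contacted all the others at once.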
How does gossip Protocol Work?
2
How does gossip Protocol help in Failure Detection?
3
What are partitions and Tokens in Cassandra?
4
• Partitioner: a hash function located on each node that derives tokens from designated values (the partition key) in rows being added
• It converts a variable-length input to a fixed-length value.
• Token: an integer value generated by a hashing algorithm, identifying a partition's location within a cluster
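The hashing and token ideas above can be illustrated with a small sketch. MD5 is used here only as a stand-in hash that ships with Python; Cassandra's default Murmur3Partitioner uses a different hash function. The ring layout, token values and function names are assumptions for illustration.

```python
import bisect
import hashlib


def token_for(partition_key):
    """Hash a variable-length key to a fixed-size integer token."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** 127)


def owner_of(token, ring):
    """Find the node responsible for a token.

    ring: list of (node_token, node_name) sorted by node_token.
    The owner is the first node whose token is >= the key's token,
    wrapping around to the first node at the end of the ring.
    """
    node_tokens = [node_token for node_token, _ in ring]
    index = bisect.bisect_left(node_tokens, token) % len(ring)
    return ring[index][1]
```

The same key always hashes to the same token, so any node can work out where a row lives without asking a central directory.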
What are the different types of Partitioners in Cassandra? Explain.
5
What do you mean by Snitch? Name a few.
6
• A snitch determines which datacenters and racks nodes belong to.
• Snitches inform Cassandra about the network topology, which allows Cassandra to distribute replicas
• Specifically, the replication strategy places the replicas based on the information provided by the snitch
• Examples include SimpleSnitch, PropertyFileSnitch and GossipingPropertyFileSnitch
How does Cassandra perform write operations?
6
Explain the terms Memtable, CommitLog and SSTables.
7
What is the use of Coordinator Node in Read?
8
• Read Operation is easy, because clients can connect to any node in the cluster to perform
reads
• If a client connects to a node that doesn’t have the data it’s trying to read, the node it’s
connected to will act as coordinator node
How does Cassandra perform Read operation? Explain
9
What do you mean by Compaction?
10
• Compaction is the process of freeing up space by merging accumulated data files (SSTables)
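A simplified sketch of what such a merge does: several data files are combined into one, keeping only the newest version of each key. The data layout (key mapped to a timestamp and value) and the use of None as a delete marker are assumptions for this illustration; real compaction also keeps delete markers for a grace period before purging them.

```python
DELETED = None  # hypothetical marker meaning "this key was deleted"


def compact(*data_files):
    """Merge data files mapping key -> (timestamp, value) into one file."""
    newest = {}
    for data_file in data_files:
        for key, (timestamp, value) in data_file.items():
            # Keep only the most recent version of each key.
            if key not in newest or timestamp > newest[key][0]:
                newest[key] = (timestamp, value)
    # Drop keys whose newest version is a delete marker; keep keys sorted.
    return {
        key: entry
        for key, entry in sorted(newest.items())
        if entry[1] is not DELETED
    }
```

Space is reclaimed because superseded versions and deleted keys are not carried over into the merged file.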
What is Anti-Entropy and how is it associated with the Merkle Tree?
11
• Anti-entropy is the replica synchronization mechanism, ensuring that data on different nodes is updated to the newest version
• Cassandra uses Merkle trees for anti-entropy repair
• A Merkle tree is a hash tree whose leaves are hashes of the values of individual keys
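To illustrate why a hash tree helps replicas compare data cheaply, the sketch below builds a small Merkle root over a set of rows. SHA-256 and the row format are assumptions for this example and are not taken from Cassandra's implementation; two replicas with identical data produce identical roots, while any differing value changes the root.

```python
import hashlib


def _hash(data):
    return hashlib.sha256(data).hexdigest()


def merkle_root(rows):
    """Compute the root hash of a tree whose leaves hash each key's value.

    rows: dict mapping key -> value.
    """
    level = [_hash(f"{key}={rows[key]}".encode("utf-8")) for key in sorted(rows)]
    if not level:
        return _hash(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        # Each parent is the hash of its two children.
        level = [
            _hash((level[i] + level[i + 1]).encode("utf-8"))
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Replicas only need to exchange hashes, and can descend into subtrees whose hashes differ, instead of shipping all of their data to find the rows that are out of date.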
Explain the different types of Repairs.
12
What is Hinted Handoff?
13
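• Hinted Handoff is a mechanism to ensure availability, fault tolerance and graceful degradation in Cassandra
• If a replica node is unavailable during a write, another node stores a hint for it so the write can be replayed later
• The node that holds the hint will know when the unavailable node comes back online again, because of Gossip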
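A toy model of the handoff: when a replica is down, the write is recorded as a hint and replayed once the replica is back. The class and method names are hypothetical, and the explicit mark_up call stands in for the notification that gossip would provide in a real cluster.

```python
class Cluster:
    """Toy cluster that stores hints for replicas that are down."""

    def __init__(self, nodes):
        self.data = {node: {} for node in nodes}
        self.up = {node: True for node in nodes}
        self.hints = []  # (target_node, key, value) awaiting delivery

    def write(self, key, value, replicas):
        for node in replicas:
            if self.up[node]:
                self.data[node][key] = value
            else:
                # Replica unavailable: remember the write as a hint.
                self.hints.append((node, key, value))

    def mark_up(self, node):
        """Called when the node is seen as alive again; replay its hints."""
        self.up[node] = True
        remaining = []
        for target, key, value in self.hints:
            if target == node:
                self.data[node][key] = value
            else:
                remaining.append((target, key, value))
        self.hints = remaining
```

The write succeeds on the available replicas right away, and the recovering node catches up without the client having to retry.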
What are Roles in CQLSH?
15
• Roles enable authorization management on a larger scale than per-user security can provide
What do you mean by Logging in Cassandra? Mention a few levels of Logging
16
• Logs are written to the system.log and debug.log files in the Cassandra logging directory
• We can configure logging programmatically or manually
• The simplest way to get a picture of what’s happening in your database is to change the logging level to make the output more verbose. By default it is set at INFO.
How can one change the Logging Level in Cassandra, and why?
17
• By default, the Cassandra server log level is set at INFO
• INFO doesn’t give much detail about the work Cassandra is doing; it just outputs basic status updates
• By changing the logging level to DEBUG, we can see much more clearly what activity the
server is working on
What is JMX? And How is it useful in Cassandra?
18
• JMX (Java Management Extensions) is a Java technology that supplies tools for managing and monitoring Java applications and services
• Cassandra makes use of JMX to enable remote management of the servers
Explain Nodetool Utility.
19
• The Nodetool Utility is a command-line utility that comes out of the box with Cassandra and
is a great tool for administration and monitoring.
• It communicates with JMX to perform operational and monitoring tasks exposed by MBeans
Explain Nodetool Utility tools and how to use them?
20
What are snapshots and how do you create one in Cassandra?
21
• A snapshot represents the state of the data files at a particular point in time
• The snapshot command is used when taking a backup; it creates hard links to the SSTables in the snapshots folder, which can later be used to restore the node
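To show why hard links make snapshots cheap, the sketch below links every file of a directory into a snapshots subfolder. It is an illustration of the hard-link idea only and not what the snapshot command runs internally; the function name, directory layout and tag are assumptions, and it needs a filesystem that supports hard links.

```python
import os
import tempfile


def take_snapshot(data_dir, tag):
    """Hard-link every file in data_dir into data_dir/snapshots/<tag>."""
    snapshot_dir = os.path.join(data_dir, "snapshots", tag)
    os.makedirs(snapshot_dir, exist_ok=True)
    for name in os.listdir(data_dir):
        source = os.path.join(data_dir, name)
        if os.path.isfile(source):  # skips the snapshots folder itself
            # A hard link is a second name for the same data on disk,
            # so no bytes are copied.
            os.link(source, os.path.join(snapshot_dir, name))
    return snapshot_dir
```

Because the link points at the same data, the snapshot still holds the contents even if the original file name is later removed, which is what makes it usable for a restore.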
Why is JConsole used? What are its different elements?
22
• JConsole is used to monitor and analyze server activity.
• Once you’ve connected to a server, the default view shows four major categories of your server’s state, which are updated constantly: heap memory usage, threads, classes and CPU usage
What is Python Stress test in Cassandra?
23
• Cassandra comes with a popular utility called py_stress that can be used to run a stress test on a Cassandra cluster.
• The cassandra-stress tool is a Java-based stress testing utility for basic benchmarking and
load testing a Cassandra cluster.
• This is an effective tool for populating a cluster and stress testing CQL tables and queries