Question 1: What is Apache Cassandra?
Answer:- Cassandra is a NoSQL database that offers high scalability and high availability without compromising performance. It supports replication across multiple datacenters and can manage very large sets of data.
Question 2: What is the Cassandra data model?
Answer:- The Cassandra data model has four main components:
- Cluster: It is made up of multiple nodes and holds one or more keyspaces.
- Keyspace: It is a namespace that is used to group several column families, typically one keyspace per application.
- Column family: It is a set of rows. Each row is referenced by a row key and holds multiple columns.
- Column: Consists of a column name, value and timestamp.
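The hierarchy above can be pictured as nested maps. The following is only a minimal Python sketch of that nesting, not Cassandra's actual storage format; the names demo_ks, users and user:1 are made up for illustration.

```python
import time

def make_column(name, value):
    """A column is modelled as a (name, value, timestamp) record."""
    return {"name": name, "value": value, "timestamp": time.time()}

# cluster -> keyspace -> column family -> row key -> columns
cluster = {
    "demo_ks": {                      # keyspace (hypothetical name)
        "users": {                    # column family (hypothetical name)
            "user:1": {               # row key
                "email": make_column("email", "a@example.com"),
                "city": make_column("city", "Pune"),
            }
        }
    }
}

row = cluster["demo_ks"]["users"]["user:1"]
print(sorted(row))             # column names stored under this row key
print(row["email"]["value"])   # value of one column
```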
Question 3: What is a cluster in Cassandra?
Answer:- A cluster is a collection of one or more datacenters, and each datacenter is a collection of nodes. Data is replicated across nodes, so if one node fails, another node that holds a copy takes charge.
Question 4: What is Thrift?
Answer:- Thrift is the RPC framework behind Cassandra's legacy client API, which was used to interact with the Cassandra server. It has since been superseded by CQL.
Question 5: What is a keyspace in Cassandra?
Answer:- A keyspace in Cassandra is a namespace that defines how data is replicated on nodes. A cluster typically contains one keyspace per application.
Question 6: What is Cassandra Query Language [CQL]?
Answer:- The Cassandra Query Language (CQL) is the language used to communicate with the Cassandra database. A common way to interact with Cassandra is through the CQL shell, cqlsh. Using cqlsh, you can create keyspaces and tables, insert data, query tables, etc.
Question 7: What are the differences between a datacenter, node and a cluster in Cassandra?
Answer:- Node: - A node is the place where data is stored. It is the basic component of Cassandra.
Data Center: - A datacenter is a collection of nodes. Data can be written to multiple datacenters depending on the replication factor. However, datacenters should never span physical locations.
Cluster: - A cluster is a collection of one or more datacenters.
Question 8: What do you understand by Commit log in Cassandra?
Answer:- Every write operation is first written to the commit log, which makes the write durable. After all the data in a commit log segment has been flushed to SSTables, that segment can be archived, deleted, or recycled.
Question 9: What is SSTable?
Answer:- An SSTable (sorted string table) is an immutable data file that is written to disk sequentially. SSTables are maintained for each Cassandra table.
Question 10: Explain what a keyspace is in Cassandra.
Answer:- A keyspace is a container for data. When you define a keyspace, you need to specify a replication strategy and a replication factor, i.e. the number of nodes that the data must be replicated to.
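As an illustration of the two settings named above, here is a small Python sketch that builds a CREATE KEYSPACE statement as a string. The helper simple_keyspace_cql and the keyspace name demo_ks are hypothetical; the sketch only covers SimpleStrategy, since NetworkTopologyStrategy takes a replica count per datacenter instead of a single replication_factor.

```python
def simple_keyspace_cql(name, replication_factor):
    """Build a CREATE KEYSPACE statement that uses SimpleStrategy.

    Hypothetical helper for illustration; it does not talk to a cluster.
    """
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH replication = {{'class': 'SimpleStrategy', "
        f"'replication_factor': {replication_factor}}};"
    )

statement = simple_keyspace_cql("demo_ks", 3)
print(statement)
```

The printed statement can be pasted into cqlsh to create the keyspace.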
Question 11: What are the three components of Cassandra write?
Answer:- The three components are:
- Commitlog write
- Memtable write
- SSTable write
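The three steps above can be sketched as a toy in-memory model. This is a simplified illustration, not Cassandra's implementation: the class TinyWritePath and its memtable_limit threshold are assumptions made for the example, and real flushes are triggered by memory and commit log size rather than by a key count.

```python
class TinyWritePath:
    """Toy model: commit log append, memtable update, flush to an SSTable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []      # append-only record of every write
        self.memtable = {}        # in-memory, latest value per key
        self.sstables = []        # immutable, sorted snapshots "on disk"
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. commit log write
        self.memtable[key] = value             # 2. memtable write
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # 3. SSTable write

    def flush(self):
        # Persist the memtable as a sorted, immutable table.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable.clear()
        self.commit_log.clear()   # flushed data no longer needs the log

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable first
            for k, v in table:
                if k == key:
                    return v
        return None

store = TinyWritePath()
store.write("b", 1)
store.write("a", 2)   # second key reaches the limit and triggers a flush
store.write("c", 3)
print(store.sstables)   # [(('a', 2), ('b', 1))]
print(store.memtable)   # {'c': 3}
```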
Question 12: When do you have to avoid secondary indexes?
Answer:-
- Do not use an index on high-cardinality values (timestamps, birthdates, keywords etc.).
- Do not use an index on tables that use a counter column.
- Do not use an index on a frequently updated or deleted column.
- Do not use an index on unsorted result values.
- Do not use an index to look for a row in a large partition unless the query is narrow.
Question 13: Define composite type in Cassandra.
Answer:- A composite type combines several components into one value. You can use composite types in two places:
- Row Key
- Column Name
Question 14: In which language is Cassandra written?
Answer:- Java
Question 15: How many types of NoSQL databases are there?
Answer:-
- Document Stores (MongoDB, Couchbase)
- Key-Value Stores (Redis, Voldemort)
- Column Stores (Cassandra)
- Graph Stores (Neo4j, Giraph)
Question 16: What is the difference between Cassandra and RDBMS?
Answer:- http://www.code-view.com/2016/12/cassandra-database-vs-relational.html
Question 17: Why use the "void close()" method?
Answer:- This method is used to close the current session instance.
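Question 18: What is a Bloom filter in Cassandra?
Answer:- Bloom filters are quick, nondeterministic algorithms for testing whether an element is a member of a set. A Bloom filter can report a false positive but never a false negative. It acts as a special kind of cache that lets Cassandra skip SSTables that cannot contain the requested data.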
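The membership test described above can be sketched in a few lines of Python. This is a generic Bloom filter for illustration, not the one Cassandra ships; the class name, the bit array size and the use of SHA-256 as the hash function are all assumptions made for the example.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash positions per item in a bit array."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, item):
        # Derive num_hashes positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
for key in ("user:1", "user:2"):
    bf.add(key)
print(bf.might_contain("user:1"))   # True: added keys are never missed
```

A False answer lets a reader skip a data file entirely, while a True answer still has to be confirmed by actually reading the file.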
Question 19: What do you understand by mem-table in Cassandra?
Answer:- A mem-table is a memory-resident data structure. After the commit log, the data is written to the mem-table. Sometimes, for a single column family, there will be multiple mem-tables.
Question 20: What do you understand by Consistency in Cassandra?
Answer:- Consistency refers to how up-to-date and synchronized a row of Cassandra data is on all of its replicas.
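One common way to reason about this is to count how many replicas a read and a write each wait for. The sketch below uses the usual quorum rule (more than half of the replicas) and the overlap condition that reads plus writes must exceed the replication factor. The function names are hypothetical, and the example assumes a single datacenter.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1

def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set overlaps every write set."""
    return read_replicas + write_replicas > replication_factor

rf = 3
q = quorum(rf)
print(q)                                  # 2
print(reads_see_latest_write(q, q, rf))   # True
print(reads_see_latest_write(1, 1, rf))   # False
```

With a replication factor of 3, reading and writing at quorum touches 2 replicas each, so at least one replica in every read holds the latest write.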
Question 21: Explain zero consistency.
Answer:- With zero consistency, write operations are handled in the background, asynchronously. It is the fastest way to write data, and the one that offers the least confidence that operations will succeed.
Question 22: What are secondary indexes?
Answer:- Secondary indexes are indexes on table columns that allow queries on the table to filter by those columns. A secondary index is identified by a name, although the index name is optional. Using CQL, you can create an index on a column after defining a table. Secondary indexes are tricky to use and can impact performance greatly.
Question 23: In what situation is a secondary index used in Cassandra?
Answer:- http://www.code-view.com/2016/12/when-use-secondary-index-in-cassendra.html
Question 24: What are Data replication strategies in Cassandra?
Answer:- The replication strategy of a keyspace determines which nodes hold the replicas for a given token range. The two main replication strategies are:
- SimpleStrategy
- NetworkTopologyStrategy
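To make the idea of "which nodes hold replicas for a token" concrete, here is a rough Python sketch of SimpleStrategy-style placement: the first replica goes to the node that owns the key's token, and the remaining replicas go to the next nodes clockwise on the ring. The ring layout, node names and the SHA-256 based token_for function are assumptions for illustration; Cassandra uses its own partitioner to compute tokens. NetworkTopologyStrategy additionally takes datacenters and racks into account, which this sketch leaves out.

```python
import bisect
import hashlib

RING_SIZE = 2**32   # hypothetical token space for this sketch

def token_for(key):
    """Map a key to a token (stand-in for Cassandra's partitioner)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE

def simple_strategy_replicas(key, ring, replication_factor):
    """ring is a list of (token, node) pairs; walk clockwise from the owner."""
    ring = sorted(ring)
    tokens = [t for t, _ in ring]
    # The owner is the first node whose token is >= the key's token,
    # wrapping around to the start of the ring if there is none.
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

ring = [(0, "n1"), (2**30, "n2"), (2**31, "n3"), (3 * 2**30, "n4")]
replicas = simple_strategy_replicas("user:1", ring, 3)
print(replicas)   # three distinct, consecutive nodes on the ring
```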
Question 25: What is Row Key in Cassandra?
Answer:- A row key is also known as the partition key and has a number of columns associated with it.