SSTables

An SSTable (Sorted Strings Table) is the set of 8 files in which Cassandra persists a table's data on disk for durability.

1. Create a new table in the demo keyspace: employee(id int primary key, name text)

$ bin/cqlsh -e "create table demo.employee(id int primary key, name text)"
$ ls -l data/data/demo/employee-*/
total 4
drwxrwxr-x 2 training training 4096 Jul 10 18:26 backups

Do a manual flush. No SSTable files appear yet, because no data has been written to the table.

[training@localhost apache-cassandra-3.10]$ bin/nodetool flush
[training@localhost apache-cassandra-3.10]$ ls -l data/data/demo/employee-*/
total 4
drwxrwxr-x 2 training training 4096 Jul 10 18:26 backups

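2. Reproduce automatic compaction

Insert a row and run flush, 4 times in total. Each flush writes a new SSTable, and the 4th flush triggers an automatic compaction (4 is the default min_threshold of the size-tiered compaction strategy).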

$ bin/cqlsh -e "insert into demo.employee (id, name) values (1, 'user1')"
$ bin/nodetool flush
$ bin/cqlsh -e "insert into demo.employee (id, name) values (2, 'user2')"
$ bin/nodetool flush
$ bin/cqlsh -e "insert into demo.employee (id, name) values (3, 'user3')"
$ bin/nodetool flush
$ bin/cqlsh -e "insert into demo.employee (id, name) values (4, 'user4')"
$ bin/nodetool flush

After every flush, observe the data directory: data/data/demo/employee-*/
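To make the observation after each flush easier, the sketch below prints the SSTable generations present in a table directory. It relies on every SSTable having exactly one Data.db component; list_generations is a hypothetical helper name, not part of Cassandra.

```shell
# List the SSTable generations found in a table directory, one per line.
# Every SSTable has exactly one *-Data.db file, so each line is one SSTable.
# list_generations is a hypothetical helper, not a Cassandra tool.
list_generations() {
    for f in "$1"/*-Data.db; do
        [ -e "$f" ] || continue        # the glob matched nothing: no SSTables yet
        basename "$f" | cut -d- -f2    # second dash-separated field is the generation
    done | sort -n
}

# Example use after each flush (directory as used in this exercise):
#   list_generations data/data/demo/employee-*/
```

Before the fourth flush this should print one line per flush; once the automatic compaction has finished, only the generation of the compacted SSTable should remain.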

Shortly after the fourth flush, you should see a single compacted SSTable (generation 5 in the listing below, merged from the four flushed SSTables).

$ ls -l data/data/demo/employee-*/
total 40
drwxrwxr-x 2 training training 4096 Jul 10 18:26 backups
-rw-rw-r-- 1 training training   51 Jul 10 18:36 mc-5-big-CompressionInfo.db
-rw-rw-r-- 1 training training  101 Jul 10 18:36 mc-5-big-Data.db
-rw-rw-r-- 1 training training    9 Jul 10 18:36 mc-5-big-Digest.crc32
-rw-rw-r-- 1 training training   16 Jul 10 18:36 mc-5-big-Filter.db
-rw-rw-r-- 1 training training   32 Jul 10 18:36 mc-5-big-Index.db
-rw-rw-r-- 1 training training 4618 Jul 10 18:36 mc-5-big-Statistics.db
-rw-rw-r-- 1 training training   56 Jul 10 18:36 mc-5-big-Summary.db
-rw-rw-r-- 1 training training   92 Jul 10 18:36 mc-5-big-TOC.txt
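The file names in this listing follow the pattern version-generation-format-component (here: version mc, generation 5, format big). As a small sketch, the function below splits such a name into its parts; parse_sstable_name is a hypothetical helper name, and the pattern is assumed to be the one used by the Cassandra 3.x files shown above.

```shell
# Split an SSTable file name such as mc-5-big-Data.db into its parts.
# parse_sstable_name is a hypothetical helper, not a Cassandra tool.
parse_sstable_name() {
    basename "$1" | {
        # Split on "-"; the last variable receives the rest of the name.
        IFS=- read -r version generation format component
        printf 'version=%s generation=%s format=%s component=%s\n' \
            "$version" "$generation" "$format" "$component"
    }
}

# Example:
#   parse_sstable_name mc-5-big-Data.db
#   version=mc generation=5 format=big component=Data.db
```

The generation is the part that changes after every flush or compaction, so it is the number to look for in the later steps.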


3. Force compaction

$ bin/cqlsh -e "insert into demo.employee (id, name) values (5, 'user5')"
$ bin/nodetool flush
$ bin/nodetool compact

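4. View the SSTable using sstabledump

Use the generation number of the Data.db file that ls currently shows in the table directory; mc-2 below is only an example.

$ tools/bin/sstabledump data/data/demo/employee-*/mc-2-big-Data.db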

5. Delete a record, flush, and view the SSTable again

$ bin/cqlsh -e "delete from demo.employee where id = 1"
$ bin/nodetool flush
$ bin/nodetool compact

Note the SSTable file name

$ ls -l data/data/demo/employee-*/

View the SSTable. Note: use the appropriate SSTable file name as found above.

$ tools/bin/sstabledump data/data/demo/employee-*/mc-3-big-Data.db
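A delete is written as a deletion marker (tombstone) rather than removing the data immediately, so the dump for id = 1 is expected to show a marker instead of the row's columns. The sketch below counts such markers in a dump read from standard input. It assumes the JSON printed by sstabledump labels them "deletion_info", as Cassandra 3.x does; count_deletion_markers is a hypothetical helper name.

```shell
# Count the lines of an sstabledump JSON dump that contain a deletion marker.
# Assumption: the marker appears under the key "deletion_info".
# count_deletion_markers is a hypothetical helper, not a Cassandra tool.
count_deletion_markers() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # exit status at 0 when the count is 0.
    grep -c '"deletion_info"' || true
}

# Example (use the file name found with ls above):
#   tools/bin/sstabledump data/data/demo/employee-*/mc-3-big-Data.db | count_deletion_markers
```

A count of 0 after the delete would suggest that the dump was taken from an older SSTable file.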

Reference:

  • http://distributeddatastore.blogspot.in/2013/08/cassandra-sstable-storage-format.html