SSTables

An SSTable (Sorted Strings Table) is the immutable on-disk format Cassandra uses to persist data durably. Each SSTable is a set of component files; the listing in step 2 shows eight of them.

1. Create a new table called employee (id int primary key, name text) in the demo keyspace, which is assumed to exist already, then list its data directory.

$ bin/cqlsh -e "create table demo.employee(id int primary key, name text)"

$ ls -l data/data/demo/employee-*/

total 4

drwxrwxr-x 2 training training 4096 Jul 10 18:26 backups

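Do a manual flush. The table holds no data yet, so no SSTable files are written and the directory still contains only backups.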

[training@localhost apache-cassandra-3.10]$ bin/nodetool flush

[training@localhost apache-cassandra-3.10]$ ls -l data/data/demo/employee-*/

total 4

drwxrwxr-x 2 training training 4096 Jul 10 18:26 backups

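2. Reproduce automatic compaction

Insert one row and flush, four times. Each flush writes a new SSTable, and with the default SizeTieredCompactionStrategy (min_threshold = 4) the fourth SSTable triggers an automatic compaction.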

$ bin/cqlsh -e "insert into demo.employee (id, name) values (1, 'user1')"

$ bin/nodetool flush

$ bin/cqlsh -e "insert into demo.employee (id, name) values (2, 'user2')"

$ bin/nodetool flush

$ bin/cqlsh -e "insert into demo.employee (id, name) values (3, 'user3')"

$ bin/nodetool flush

$ bin/cqlsh -e "insert into demo.employee (id, name) values (4, 'user4')"

$ bin/nodetool flush
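The four insert-and-flush rounds above can also be scripted. The following is a minimal sketch, not part of the original exercise: the function name insert_and_flush and the CQLSH and NODETOOL override variables are hypothetical, and the defaults assume you run it from the Cassandra install directory as in the commands above.

```shell
#!/usr/bin/env bash
# CQLSH and NODETOOL are hypothetical override variables; by default they
# point at the relative paths used throughout this walkthrough.
CQLSH="${CQLSH:-bin/cqlsh}"
NODETOOL="${NODETOOL:-bin/nodetool}"

# Insert rows 1..N into demo.employee, flushing after every insert so that
# each row lands in its own SSTable.
insert_and_flush() {
    local rounds="$1"
    local i=1
    while [ "$i" -le "$rounds" ]; do
        "$CQLSH" -e "insert into demo.employee (id, name) values ($i, 'user$i')"
        "$NODETOOL" flush
        i=$((i + 1))
    done
}
```

Calling insert_and_flush 4 issues the same eight commands as above. To watch the SSTables accumulate, run the rounds by hand instead and list the data directory after each flush.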

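After every flush, list the data directory data/data/demo/employee-*/ and watch a new SSTable appear.

After the fourth flush, the four SSTables should have been merged into a single compacted SSTable (generation 5 in the listing below).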

$ ls -l data/data/demo/employee-*/

total 40

drwxrwxr-x 2 training training 4096 Jul 10 18:26 backups

-rw-rw-r-- 1 training training 51 Jul 10 18:36 mc-5-big-CompressionInfo.db

-rw-rw-r-- 1 training training 101 Jul 10 18:36 mc-5-big-Data.db

-rw-rw-r-- 1 training training 9 Jul 10 18:36 mc-5-big-Digest.crc32

-rw-rw-r-- 1 training training 16 Jul 10 18:36 mc-5-big-Filter.db

-rw-rw-r-- 1 training training 32 Jul 10 18:36 mc-5-big-Index.db

-rw-rw-r-- 1 training training 4618 Jul 10 18:36 mc-5-big-Statistics.db

-rw-rw-r-- 1 training training 56 Jul 10 18:36 mc-5-big-Summary.db

-rw-rw-r-- 1 training training 92 Jul 10 18:36 mc-5-big-TOC.txt
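In the listing above every file name has the form version-generation-format-component (mc, 5, big, Data.db), and each SSTable has exactly one Data.db component. As a rough sketch built on that observation, the helper below prints the generation numbers present in a table directory, which makes it easy to see how many SSTables exist after each flush. The function name list_generations is hypothetical.

```shell
#!/usr/bin/env bash
# Print the generation number of every SSTable in a table directory,
# one per line in ascending order. The Data.db component is used as the
# marker because each SSTable has exactly one.
list_generations() {
    local dir="$1" f base
    for f in "$dir"/*-Data.db; do
        [ -e "$f" ] || continue        # the glob matched nothing
        base="${f##*/}"                # e.g. mc-5-big-Data.db
        base="${base#*-}"              # drop the version prefix: 5-big-Data.db
        printf '%s\n' "${base%%-*}"    # keep only the generation: 5
    done | sort -n
}
```

For example, list_generations data/data/demo/employee-*/ can be run after each flush. Only the first argument is used, so this assumes the glob matches a single directory.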


3. Force compaction

$ bin/cqlsh -e "insert into demo.employee (id, name) values (5, 'user5')"

$ bin/nodetool flush

$ bin/nodetool compact
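Without arguments, nodetool flush and nodetool compact act on every keyspace. Both also accept an optional keyspace and table name, which keeps the exercise confined to demo.employee. A small sketch follows; the function name flush_and_compact and the NODETOOL override variable are hypothetical.

```shell
#!/usr/bin/env bash
# NODETOOL is a hypothetical override variable; it defaults to the relative
# path used in this walkthrough.
NODETOOL="${NODETOOL:-bin/nodetool}"

# Flush one table's memtable, then run a major compaction on that table
# only. The compaction is skipped if the flush fails.
flush_and_compact() {
    local keyspace="$1" table="$2"
    "$NODETOOL" flush "$keyspace" "$table" &&
        "$NODETOOL" compact "$keyspace" "$table"
}
```

Usage: flush_and_compact demo employee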

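4. View the SSTable using sstabledump. The mc-2 in the command below is only an example; substitute the generation number that ls currently shows for the table.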

$ tools/bin/sstabledump data/data/demo/employee-*/mc-2-big-Data.db

5. Delete a record, flush, compact, and view the SSTable again

$ bin/cqlsh -e "delete from demo.employee where id = 1"

$ bin/nodetool flush

$ bin/nodetool compact

Note the SSTable file name:

$ ls -l data/data/demo/employee-*/

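View the SSTable, substituting the file name found above for mc-3. The deleted partition should still appear in the dump as a tombstone (a deletion_info entry) rather than vanish, because gc_grace_seconds has not yet elapsed.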

$ tools/bin/sstabledump data/data/demo/employee-*/mc-3-big-Data.db
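Because the generation number changes with every flush and compaction, hard-coding it in the sstabledump command is error-prone. One possible way around this, sketched below, is to pick the Data.db file with the highest generation number. The function name latest_data_file is hypothetical, and the parsing assumes the version-generation-format-component naming seen in the listings above.

```shell
#!/usr/bin/env bash
# Print the path of the Data.db file with the highest generation number in
# a table directory. Returns non-zero if the directory has no SSTables.
latest_data_file() {
    local dir="${1%/}" f base gen best_gen=-1 best=""
    for f in "$dir"/*-Data.db; do
        [ -e "$f" ] || continue        # the glob matched nothing
        base="${f##*/}"                # e.g. mc-7-big-Data.db
        gen="${base#*-}"               # 7-big-Data.db
        gen="${gen%%-*}"               # 7
        if [ "$gen" -gt "$best_gen" ]; then
            best_gen="$gen"
            best="$f"
        fi
    done
    [ -n "$best" ] && printf '%s\n' "$best"
}
```

Usage: tools/bin/sstabledump "$(latest_data_file data/data/demo/employee-*/)"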

Reference:

  • http://distributeddatastore.blogspot.in/2013/08/cassandra-sstable-storage-format.html