Related Technologies

JanusGraph

JanusGraph is an OLTP graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. It is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time, and it can use Cassandra as its storage backend. JanusGraph is a fork of TitanDB; one commentary put it as "JanusGraph picks up where TitanDB left off". JanusGraph could become the de facto reference provider implementation for TinkerPop.

Apache TinkerPop

Apache TinkerPop is a graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP). JanusGraph supports queries written with Apache TinkerPop. Below are some examples of TinkerPop queries.

// What are the names of the managers in the management chain going from Gremlin to the CEO?

gremlin> g.V().has("name","gremlin").repeat(in("manages")).until(has("title","ceo")).path().by("name")
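To make the shape of this traversal concrete, here is a minimal pure-Python sketch of the same walk over a toy in-memory graph. The names, titles, and the `management_chain` helper are made-up for illustration and are not part of TinkerPop; the sketch assumes each person has at most one manager.

```python
# Toy data (hypothetical): "manages" edges point from a manager to a report.
manages = {
    "alice": ["bob"],    # alice manages bob
    "bob": ["gremlin"],  # bob manages gremlin
}
titles = {"alice": "ceo", "bob": "director", "gremlin": "engineer"}


def management_chain(start):
    """Step along incoming "manages" edges until a vertex titled "ceo" is hit.

    Mirrors repeat(in("manages")).until(has("title","ceo")).path().by("name"):
    the step runs first and the stop condition is checked afterwards.
    Returns an empty list when the chain ends before reaching a CEO.
    """
    # Invert the edge map so we can move from a report to their manager.
    manager_of = {
        report: manager
        for manager, reports in manages.items()
        for report in reports
    }
    path = [start]
    while True:
        current = path[-1]
        if current not in manager_of:
            return []  # no further manager: the traversal yields nothing
        path.append(manager_of[current])
        if titles[path[-1]] == "ceo":
            return path


chain = management_chain("gremlin")
print(chain)
```

The returned list plays the role of `path().by("name")`: every vertex visited, in order, projected to its name.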

// What is the distribution of job titles amongst Gremlin's collaborators?

gremlin> g.V().has("name","gremlin").as("a").out("created").in("created").where(neq("a")).groupCount().by("title")
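As a rough illustration of what this query counts, the following pure-Python sketch replays the same steps on a small hypothetical dataset. The people, projects, and the `collaborator_titles` helper are assumptions made for the example. Note that, like the traversal, it does not deduplicate: someone who co-created two projects is counted twice.

```python
from collections import Counter

# Toy data (hypothetical): person -> projects they created.
created = {
    "gremlin": ["project-a", "project-b"],
    "ana": ["project-a", "project-b"],
    "bo": ["project-a"],
}
titles = {"gremlin": "engineer", "ana": "architect", "bo": "engineer"}


def collaborator_titles(person):
    """Count job titles of everyone who co-created something with `person`."""
    counts = Counter()
    for project in created[person]:                 # out("created")
        for creator, projects in created.items():   # in("created")
            if project in projects and creator != person:  # where(neq("a"))
                counts[titles[creator]] += 1        # groupCount().by("title")
    return dict(counts)


title_counts = collaborator_titles("gremlin")
print(title_counts)
```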

// Get a ranking of the most relevant products for Gremlin given his purchase history.

gremlin> g.V().has("name","gremlin").out("bought").aggregate("stash").in("bought").out("bought").
           where(not(within("stash"))).groupCount().order(local).by(values,decr)
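The ranking logic of this traversal can be sketched in plain Python as follows. The purchase data and the `recommend` helper are hypothetical; the sketch only mirrors the steps of the query (collect what was already bought, find other buyers of those products, count what else they bought, sort by count descending) and is not how a TinkerPop provider executes it.

```python
from collections import Counter

# Toy data (hypothetical): person -> products they bought.
bought = {
    "gremlin": ["laptop", "mouse"],
    "ana": ["laptop", "mouse", "monitor"],
    "bo": ["laptop", "monitor", "keyboard"],
}


def recommend(person):
    """Rank products bought by people who share purchases with `person`."""
    stash = set(bought[person])                  # aggregate("stash")
    counts = Counter()
    for product in bought[person]:               # out("bought")
        for buyer, items in bought.items():      # in("bought")
            if product not in items:
                continue
            for other in items:                  # out("bought")
                if other not in stash:           # where(not(within("stash")))
                    counts[other] += 1           # groupCount()
    # order(local).by(values, decr): highest count first
    return counts.most_common()


ranking = recommend("gremlin")
print(ranking)
```

Because the person's own purchases are all in the stash, it does not matter that the walk back along "bought" also reaches the person themselves; those products are filtered out.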

Gremlin

Gremlin is a graph traversal language and virtual machine developed by Apache TinkerPop. Gremlin works for both OLTP-based graph databases and OLAP-based graph processors. Gremlin's automata and functional language foundation enable it to naturally support imperative and declarative querying, host language agnosticism, user-defined domain specific languages, an extensible compiler/optimizer, single- and multi-machine execution models, and hybrid depth- and breadth-first evaluation.

Other graph database systems

Amazon Neptune - Fully-managed graph database service.
Bitsy - A small, fast, embeddable, durable in-memory graph database.
Blazegraph - RDF graph database with OLTP support.
CosmosDB - Microsoft's distributed OLTP graph database.
ChronoGraph - A versioned graph database.
DSEGraph - DataStax graph database with OLTP and OLAP support.
GRAKN.AI - Distributed OLTP/OLAP knowledge graph system.
Hadoop (Spark) - OLAP graph processor using Spark.
HGraphDB - OLTP graph database running on Apache HBase.
IBM Graph - OLTP graph database as a service.
JanusGraph - Distributed OLTP and OLAP graph database with BerkeleyDB, Apache Cassandra and Apache HBase support.
JanusGraph (Amazon) - The Amazon DynamoDB Storage Backend for JanusGraph.
Neo4j - OLTP graph database (embedded and high availability).
neo4j-gremlin-bolt - OLTP graph database (using Bolt Protocol).
OrientDB - OLTP graph database.
Apache S2Graph - OLTP graph database running on Apache HBase.
Sqlg - OLTP implementation on SQL databases.
Stardog - RDF graph database with OLTP and OLAP support.
TinkerGraph - In-memory OLTP and OLAP reference implementation.
Titan - Distributed OLTP and OLAP graph database with BerkeleyDB, Apache Cassandra and Apache HBase support.
Titan (Amazon) - The Amazon DynamoDB storage backend for Titan.
Titan (Tupl) - The Tupl storage backend for Titan.
Unipop - OLTP Elasticsearch and JDBC backed graph.

Scylla - Cassandra Killer?

Scylla is a next-generation NoSQL database that claims to deliver 10x the performance of Cassandra. It is written from the ground up in C++ and is said to give Redis-like performance. Scylla also publishes a roadmap. It is a drop-in replacement for Cassandra 2.2, with support for:

All Apache Cassandra Drivers
Protocols: CQL, Thrift, JMX
Tooling: cqlsh, nodetool, cassandra-stress, and all of Cassandra 2.2 tools
SSTable format

C++ applications can extract the maximum from the available hardware resources. The benchmark report bears this out: matching the performance of a 3-node Scylla cluster might require as many as 30 Cassandra nodes. There is an industry push toward C++-based products that lower hardware requirements and energy bills at the data center level. One drawback of C++ is its significantly steeper learning curve compared to Java, along with the lack of the standard libraries that the Java ecosystem is blessed with.

Benchmark reports: https://www.scylladb.com/product/benchmarks/

Although Scylla has higher throughput than Cassandra, the latter is more mature and battle-tested in numerous internet-scale applications, with commercial support from DataStax. Sticking with Cassandra is perhaps a good idea at this moment, giving Scylla time to gain more product maturity.


Pithos - build an S3-like object store using Cassandra