
Inside DSE Graph: What Powers the Best Enterprise Graph Database


DSE Graph is a scalable, real-time graph database which was released at the end of June as a new addition to the DSE platform. After recovering from the turbulence of a major release, the time has come to peel back the curtain and look into the engine room: What are the major features and innovations that make DSE Graph an enterprise-grade graph database?


A graph database is a database system purpose-built for managing highly connected data. Unlike other database systems, including RDBMS and NoSQL, graph databases make it easy to model and query for relationships.

DSE Graph uses the property graph data model and Gremlin query language of the Apache TinkerPop™ project – the open-source, vendor-neutral graph database standard governed by the Apache Software Foundation.


The property graph data model can express complex data models as-is without a logical mapping – a characteristic that’s often described as “whiteboard friendly”. The Gremlin query language can succinctly express query paths and subgraph patterns without the need for cumbersome JOINs or custom application code, making it easy to retrieve entities connected via complex relationships from a big graph of data. Apache TinkerPop™ is a central ingredient and in many ways the primary interface to DSE Graph.
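To make this concrete, here is a small illustration in Gremlin, written against the labels of Apache TinkerPop's toy graph (so the names are not specific to any real dataset): finding marko's collaborators, i.e. the other people who created the same software he did, expressed as a single traversal path rather than a chain of JOINs.

g.V().has('name','marko').
  out('created').in('created').
  has('name', neq('marko')).dedup().values('name')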

Implementing the property graph data model and supporting a graph query language is sufficient to expose a database system as a graph database, but – like putting lipstick on a pig – this often results in slow performance and unexpected system behavior. What makes a good graph database is a balanced combination of efficient property graph data representation, fast graph-centric index structures, and smart query optimization. DSE Graph achieves this combination in a distributed, scale-out environment with no single point of failure and continuous availability using the following technologies.

Index-Free Adjacency

DSE Graph stores graphs in their adjacency list representation. All properties and edges that touch a particular vertex are stored in a consecutive, sorted list on the node in the cluster to which the vertex is assigned. This representation allows us to navigate through the graph from vertex to vertex without having to call into an index structure. By contrast, storing edges in a large table – which would be the normal approach for RDBMS or NoSQL stores – requires an expensive, global index to locate vertex data.

Adjacency list sort order facilitates efficient retrieval of subsets of the adjacency list. As graphs grow in size, queries often only require small subsets of the entire vertex data. In those cases, we exploit the sort order to limit the data retrieval and speed up query processing.
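As a sketch of what index-free navigation looks like from the query side (again using TinkerPop toy-graph labels), only locating the starting vertex involves an index; every subsequent hop is served directly from the adjacency lists of the vertices already reached.

// find marko once, then walk adjacency lists: who he knows, and what they created
g.V().has('name','marko').out('knows').out('created').values('name')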

A key innovation of DSE Graph is an efficient mapping of the adjacency list representation onto the tabular storage format of Cassandra. Its implementation required changes to Cassandra’s storage engine in 3.0 and changes throughout the entire DSE stack to propagate a graph-optimized data representation.


This innovation allows DSE Graph to stand on top of the powerful distributed database foundation provided by Apache Cassandra™ without having to sacrifice storage efficiency or query performance.

Furthermore, DSE Graph can plug directly into the enterprise features of the DSE platform: OpsCenter management, data encryption, authentication, secure communication, multi-instance support, and auditing.

Vertex-Centric Indexes


For large graphs, it is not unusual for a single vertex adjacency list to grow to thousands of edges. Iterating over all those edges can be very time consuming for certain access patterns.

For instance, suppose we want to retrieve a customer’s ten most recent messages. If that customer has written thousands of messages, finding those ten can take a significant amount of time and requires retrieving a lot of data.

Vertex-centric indexes are access-specific index structures built and maintained per vertex to speed up such queries. For the example above, we would install a vertex-centric index for `wroteMessage` edges by timestamp.
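As a sketch of what that might look like (the vertex label, edge label, property, and index names here are illustrative, and the exact schema syntax should be checked against the DSE Graph documentation for your release):

// hypothetical vertex-centric index on wroteMessage edges, ordered by timestamp
schema.vertexLabel('customer').index('messagesByTime').outE('wroteMessage').by('timestamp').add()

// the ten most recent messages can then be answered from a small, sorted slice
// of the customer's adjacency list instead of iterating over every edge
g.V().has('customer','customerId', 42).
  outE('wroteMessage').order().by('timestamp', decr).limit(10).inV()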

Unlike index structures in conventional database systems, which scale logarithmically with the size of the entire dataset, vertex-centric indexes are maintained per vertex, and hence the cost of maintenance is logarithmic in the size of that individual vertex's adjacency list. In other words, maintaining and querying vertex-centric indexes remains inexpensive even as the overall graph grows huge. For that reason, vertex-centric indexes are essential for maintaining fast traversal query performance on very large graphs.
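To put illustrative numbers on that: a single global index over, say, 10 billion edges costs on the order of log2(10^10) ≈ 33 comparisons per lookup and keeps growing with the graph, whereas a vertex-centric index over a 10,000-edge adjacency list costs roughly log2(10^4) ≈ 13 comparisons per lookup no matter how large the rest of the graph becomes.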

Vertex Partitioning


A vertex and its adjacency list are assigned to and stored on a single machine in the database cluster as their primary replica. This assignment determines data locality, and DSE Graph aims to place vertices such that frequently co-traversed vertices end up on the same machine, which improves traversal performance.

DSE Graph’s partitioning techniques will be covered in future posts.

Edge Partitioning

Most natural graphs have a scale-free degree distribution, which means that some vertices are highly connected and have very large adjacency lists. Storing such a vertex and its entire adjacency list on a single machine would create hotspots and may even be infeasible for huge graphs.

DSE Graph supports edge partitioning, whereby fragments of a large adjacency list are distributed across all machines in the cluster in a way that still allows co-processing with locally stored vertices without intra-cluster communication.

Query Optimizer


In addition to the index structures and partitioning techniques outlined above, DSE Graph also supports materialized view indexes in Cassandra, secondary indexes, and the full indexing power of Solr via tight integration with DSE Search.

Hence, there are potentially many possible data access paths and choosing the optimal combination is crucial for query performance. Due to the combinatorial explosion in possible data access paths, DSE Graph uses an adaptive query optimizer to find the quickest way to answer any given user query. This allows the user to focus on what data they want to query for and lets DSE Graph figure out how to get it.
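As an illustrative example (the labels, properties, and indexes are hypothetical), the traversal below only states what to retrieve; whether the filter on email is answered by a search index, a materialized view index, or a scan is the optimizer's decision. Appending Gremlin's profile() step is a convenient way to inspect the plan that was actually chosen.

g.V().has('customer','email','alice@example.com').
  out('placed').has('total', gt(100)).values('orderId')

// append profile() to the same traversal to inspect the chosen access paths
g.V().has('customer','email','alice@example.com').
  out('placed').has('total', gt(100)).values('orderId').profile()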

This article summarizes some of the key technologies and innovations in DSE Graph that make it a real-time, scalable, and enterprise-grade graph database. Download DSE Graph to play with it, sign up for DataStax Academy to learn more about graph, or contact our graph database experts to answer your questions.


Special thanks to Lorina, Jeremy, Jonathan, and Robin for reviewing this article, providing thoughtful comments, and useful suggestions.


New Simba ODBC Driver for DataStax Enterprise 5.0 Available


In collaboration with our partner Simba, we are making available a new version of the Simba ODBC driver with CQL connector for DataStax Enterprise 5.0. Built on top of the DataStax C/C++ driver version 2.4.1, this new version works with DataStax Enterprise 5.0 and Apache Cassandra 3.0+, adds support for new data types (smallint, tinyint, time, date), and introduces the ability to use whitelists or blacklists when configuring connections. For complete details about the release, please check the release notes (PDF).

The new version of the driver, 2.4.1.1001, can be downloaded from the DataStax drivers page. The installation and configuration guide is also available in PDF format.

Python Driver 3.7.0 Released


The DataStax Python Driver 3.7.0 for Apache Cassandra has been released. This release had no specific area of focus, but brings a number of new features and improvements. A complete list of issues is available in the CHANGELOG. Here I will mention some of the new features.

Session request listener and query request size information

In addition to cluster metrics, you can now register a session request listener and use it to track additional metrics about requests (e.g., request size). See this request analyzer as an example.

Speculative query retries

The driver now implements speculative query retries in order to offer smoother latencies even while experiencing some node hiccups. Idempotent statements can benefit from this mechanism. This is a generally extensible interface, but we have also added a ConstantSpeculativeExecutionPolicy implementation. To enable this feature, you need to set a speculative_execution_policy and mark your statement as idempotent.

from cassandra.cluster import Cluster, ExecutionProfile
from cassandra.policies import ConstantSpeculativeExecutionPolicy
from cassandra.query import SimpleStatement

cluster = Cluster()

# send a new request every 100ms for a maximum of 10 attempts
ep = ExecutionProfile(speculative_execution_policy=ConstantSpeculativeExecutionPolicy(.1, 10))
cluster.add_execution_profile('my_app_ep', ep)
session = cluster.connect('test')

statement = SimpleStatement("SELECT i FROM d WHERE k = 0", is_idempotent=True)
result = session.execute(statement, execution_profile='my_app_ep')

Expose paging state

The ResultSet class exposes a new attribute: the paging_state. It can be useful if you have to resume pagination through stateless requests from your application. To use it, you just need to send the paging_state parameter when executing a new query (session.execute).

query = "SELECT * FROM users"
statement = SimpleStatement(query, fetch_size=10)
results = session.execute(statement)

# save the paging_state somewhere (e.g., in the user's web session)...
saved_paging_state = results.paging_state

# and use it later to resume the pagination
query = "SELECT * FROM users"
statement = SimpleStatement(query, fetch_size=10)
results = session.execute(statement, paging_state=saved_paging_state)

EC2 address resolver

In the 3.3.0 release, we introduced a new AddressTranslator interface that allows you to implement your own IP address translation for your environment (e.g., public versus private IPs). We now add an official translator for Amazon EC2, since it is heavily used: the EC2MultiRegionTranslator.

from cassandra.cluster import Cluster
from cassandra.policies import EC2MultiRegionTranslator

cluster = Cluster(['127.0.0.1'], address_translator=EC2MultiRegionTranslator())
session = cluster.connect()

# do stuff...

CQLEngine: support of multiple keyspaces and sessions

Prior to this release, using multiple keyspaces and sessions with cqlengine was a common problem. We now introduce a new experimental feature to accommodate this use case: connections. You can now register multiple connections and switch contexts on the fly in your application. Here is an example of the cqlengine connection capabilities:

from cassandra.cqlengine import columns, connection
from cassandra.cqlengine.models import Model
from cassandra.cqlengine.management import create_keyspace_simple, sync_table
from cassandra.cqlengine.query import BatchQuery, ContextQuery

CONNS = ['cluster1', 'cluster2']
KEYSPACES = ('client1', 'client2', 'client3', 'client4')

connection.register_connection('cluster1', ['127.0.0.1'], default=True)
connection.register_connection('cluster2', ['127.0.0.50'], lazy_connect=True)

for keyspace in KEYSPACES:
    create_keyspace_simple(keyspace, 3, connections=CONNS)

class Automobile(Model):
    __connection__ = 'cluster2'  # default connection per model
    manufacturer = columns.Text(primary_key=True)
    year = columns.Integer(primary_key=True)
    model = columns.Text()

# sync the table for all connections and keyspaces
sync_table(Automobile, KEYSPACES, CONNS)

# Select the connection and keyspace via the ContextQuery
with ContextQuery(Automobile, connection='cluster1', keyspace='client2') as A:
    A.objects.create(manufacturer='honda', year=2004, model='civic')

# Read from the default model connection 'cluster2'
print len(Automobile.objects.using(keyspace='client2').filter(manufacturer='honda', year=2004))  # 0 (the row was written on cluster1, not the model's default cluster2)

# Select the connection and keyspace on the fly
print len(Automobile.objects.using(connection='cluster1', keyspace='client2').all())  # 1

# Select on the model instance
a = Automobile.objects.using(connection='cluster1',keyspace='client2').get(manufacturer='honda', year=2004)
a.using('cluster2').save()  # save on cluster2 rather than cluster1

# Connection select with a BatchQuery
with BatchQuery(connection='cluster1', keyspace='client4') as b:
    A.objects.batch(b).create(manufacturer='honda', year=2004, model='civic')
    A.objects.batch(b).create(manufacturer='honda', year=2005, model='civic')
    A.objects.batch(b).create(manufacturer='honda', year=2006, model='civic')

See the documentation here for more details.

Wrap

As always, thanks to all who provided contributions and bug reports. The continued involvement of the community is appreciated:


Getting Started with DSE Graph & My Favorite Lessons Learned


The 2016 Cassandra Summit was a fantastic event. Friends, new connections, learning, technical discussion everywhere – just fantastic!

One question that was raised several times throughout the event was “How do I get started with DSE Graph?”  This post aims to answer this question and help those who are interested in learning and exploring DataStax Enterprise Graph.  

Acquiring DSE Graph

There are two different options for acquiring DSE Graph:

  1. Use the VM Image from the DataStax Academy – DS330 course
  2. Download and install DataStax Enterprise 5.0

If you are just looking for a simple sandbox, then the VM image that accompanies the DS330 course may be a good option for you.

If you would like to explore DSE Graph in a more “real life” scenario, then it’s recommended to download DataStax Enterprise to get access to graph functionality. DataStax documentation can help with installation procedures. Once installed, you simply tell DSE to start in “Graph Mode”.  

DataStax also released a great tool for visual exploration and development against DSE Graph named DataStax Studio. DataStax Studio is the recommended method to work with DSE Graph.  The DSE 5.0 installer will install DataStax Studio for you (just make sure to choose Developer Related Tools during the installation).

Learning DSE Graph

DataStax has invested, and continues to invest, in helping you learn DSE Graph.  There are several great resources available to help you quickly gain the skills necessary to master DSE Graph.

  • Free DataStax Academy Training: The first is our free DataStax Academy training content. If you are brand new to graph databases, then check out this great “Introduction to Graph” course as well as SQL2Gremlin.  If you are already acquainted with the general concepts of graph databases, then start with our introductory DSE Graph course.
  • DataStax Professional Community: DataStax Academy is about more than just learning content – it’s also an entry into the DataStax Professional Community.  DataStax Academy provides a great resource for questions and answers via Slack.  Come join our Graph room for general questions or the DS330 room if you have questions on the Graph course.
  • DSE Graph Documentation: The DSE Graph documentation was created through real world experiences and hands on knowledge.  It’s a must read for those who are new to DSE Graph. There’s also some great documentation on DataStax Studio.
  • DataStax Blogs: DataStax is passionate about blogging experiences around DSE Graph. Here’s a great blog that provides an overview of some of the more powerful features of DSE Graph. Be on the lookout for more blog content in the near future.

Developing with DSE Graph

Once you have DSE Graph up and running and have learned the basics, it’s time to get started with DSE Graph.

To start, check out the two preconfigured notebooks that ship with DataStax Studio, “Welcome to DataStax Studio!” and “DSE Graph QuickStart”. The Welcome notebook provides a very quick way for you to get a feel for the great features in DataStax Studio, while the DSE Graph QuickStart walks you through the basic steps of creating a graph in DSE Graph. This is experiential learning at its best.

For bulk loading data into DSE Graph, check out the DataStax Graph Loader utility.  This is a handy tool for loading CSV files, RDBMS tables, Titan data via Gryo or GraphSON, and other data sources.  The bulk loading utility can even generate your graph schema for you.

When the time comes, you’ll want to work with the DataStax Enterprise drivers for your application code needs with DSE Graph. An upcoming dot release of DSE will enable the DSE drivers to use a Fluent Gremlin API. This is a great feature that will provide direct Gremlin query semantics within the DataStax driver. The DataStax drivers are the best choice when developing a graph application against DSE Graph.

We already mentioned the DataStax Academy resources that are available, but the DataStax network of skilled Solutions Engineers and Solutions Architects are also a great resource to help you succeed with DSE Graph.  Simply contact us if you’d like some assistance with your DSE Graph initiative.

Finally, here are some of my personal favorite lessons learned over the past few months, since we’ve released DSE Graph in DSE 5.0:

  • The Apache TinkerPop™ API page is a great place to go for Gremlin API specific learning.
  • Graph Development Mode will enable different behavior compared to Production Mode.  Be sure to check out this part of the documentation for details.  
  • DataStax Studio’s code assist features are nothing short of fantastic. Ctrl + Space is your friend.
  • The combination of repeat() + subgraph() is a handy Gremlin traversal pattern when working in DataStax Studio.
    • An example traversal used in our Cassandra Summit 2016 DSE Powertrain demo follows:
      int degree = 2
      g.V().has('github_user', 'account', 'jlacefie').repeat(bothE().subgraph('x').otherV()).times(degree).cap('x')
  • This query allows us to dynamically select the degree at which we walk out from a single vertex and returns a nice looking result set (subgraph) in Studio.


  • Code samples from the DataStax documentation query section are great for quick references to repeatable traversal patterns.
  • Removing vertex labels isn’t straightforward right now, but we’re working on that. In the meantime, try to get your data model correct up front or, if you need to remove a vertex label during a development/exploratory cycle, it’s recommended to recreate the graph (see the sketch below).
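For a development sandbox, recreating a graph can be as simple as the sketch below. This is destructive and only appropriate for disposable data; the graph name is illustrative, and the exact system API calls should be verified against the DSE Graph documentation for your release.

// drop and recreate a scratch graph from the Gremlin console or a Studio notebook
system.graph('my_dev_graph').drop()
system.graph('my_dev_graph').create()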

Build Something Disruptive

We hope this post will help you get started with DSE Graph. We believe that DSE Graph is the best choice for a real-time, scalable, and enterprise-grade graph database and believe you’ll quickly realize this as well.  Until next time, build something disruptive.  

DSE 5.0.3 released – Huge Performance gains for Graph Analytics


Graph Analytics in DataStax Studio

We just released DSE 5.0.3 and, as the version number indicates, this patch release contains a number of bug fixes across the entire DSE platform.

In addition, 5.0.3 packs some major performance improvements for the graph analytics integration. DSE Graph supports analytic queries powered by Apache Spark™ and Apache TinkerPop™ as part of our DSE Max offering. This means you can analyze an entire, huge graph across your cluster out of the box and without any additional integration work or ETL. DSE Graph reads the graph out of Apache Cassandra™ and transforms it into a TinkerPop™-compatible representation that is optimized for batch transformation and iterative graph algorithms. In 5.0.3 we made that transformation more efficient and optimized the serialization format to reduce the memory requirements and storage footprint of our analytic graph representation.
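In generic Apache TinkerPop terms (leaving aside the DSE-specific setup, such as the preconfigured analytic traversal source), an analytic pass over the whole graph is expressed by binding the graph to a graph computer, after which ordinary Gremlin runs as a distributed Spark job:

// sketch: bind the graph to Spark so traversals scan the entire graph in parallel
a = graph.traversal().withComputer(SparkGraphComputer)
a.V().count()
a.V().out().out().count()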

The combination of these optimizations resulted in substantial performance improvements. The benchmark shown below repeatedly executes the given queries on a synthetically generated graph of randomized topology with 10 million vertices and 100 million edges and records the wall-clock time [1]. Queries range from simple counting queries (first row) to a full, multi-iteration PageRank execution (last row). The first column shows the executed query, the second column the raw execution time on 5.0.3 and the last columns specify the improvement over DSE 5.0.0 in terms of factor speedup as well as percentage improvement.

Query                                     Avg Time (sec)  Speedup  Improvement
g.V().count()                                     201.32     2.78         178%
g.V().out().count()                               201.51     2.75         175%
g.V().out().out().count()                         275.98     4.50         350%
g.V().out().out().out().count()                   511.38     4.77         377%
g.V().has("age").groupCount().by("age")           212.10     2.55         155%
g.E().groupCount().by(label)                      197.62     3.05         205%
g.V().out("knows").in("likes").count()            323.56     6.29         529%
g.V().pageRank()                                2,571.09     5.45         445%

 

As you can see, DSE 5.0.3 is at least 2 times faster than 5.0.0 on the given analytic graph queries and many times faster on some queries. On average (across the queries), DSE 5.0.3 is 4 times faster than 5.0.0. That is a huge performance improvement of over 300%.

We are working on additional improvements and tighter integration between the transactional and analytic worlds of graph databases, blurring the lines between the two worlds. Our goal for DSE is a seamless combination of real-time read and write workloads with advanced analysis of the entire dataset. DSE 5.0.3 is another step in that direction as we continue to push the envelope.

Footnotes
[1] The benchmark was executed on a 6-node cluster in our OpenStack cloud. Each machine had 48 vCPUs, 256 GB of disk, and 256 GB of RAM.

DataStax C/C++ Driver: 2.5 released!


We are excited to announce the 2.5 release of the C/C++ driver for Apache Cassandra. This release brings with it a couple of new features and several bug fixes.

What’s new

Speculative execution

For certain applications it is of the utmost importance to minimize latency. Speculative execution is a way to minimize latency by preemptively starting several instances of the same query against different nodes. The fastest response is then returned to the client application and the other requests are cancelled. This trades some overall throughput for better latency. Speculative execution is disabled by default.

Speculative execution is enabled by connecting a CassSession with a CassCluster that has a speculative execution policy enabled. The driver currently only supports a constant execution policy, but may support more in the future. The following will start up to 2 more executions after the initial execution with the subsequent executions being created 500 milliseconds apart:

CassCluster* cluster = cass_cluster_new();

cass_int64_t constant_delay_ms = 500; /* Delay before a new execution is created */
int max_speculative_executions = 2;   /* Number of executions */
cass_cluster_set_constant_speculative_execution_policy(cluster,
                                                       constant_delay_ms,
                                                       max_speculative_executions);

/* ... */

cass_cluster_free(cluster);

Idempotent queries

Speculative execution will result in executing the same query several times. Therefore, it is important that queries be idempotent (i.e. a query can be applied multiple times without changing the result beyond the initial application). Queries that are not explicitly marked as idempotent will not be scheduled for speculative executions.

The following types of queries are not idempotent:

  • Mutation of counter columns
  • Prepending or appending to a list column
  • Use of non-idempotent CQL functions (e.g. now() or uuid())

The driver is unable to determine if a query is idempotent; therefore, it is up to the application to explicitly mark a statement as being idempotent.

CassStatement* statement = cass_statement_new("SELECT * FROM sometable", 0);
cass_statement_set_is_idempotent(statement, cass_true);

Beyond speculative execution, marking a query as idempotent allows the C/C++ driver to more eagerly retry requests in the face of spurious failures. In previous releases of the driver, any failure of a query after being sent to Cassandra would result in a failure being returned to the client application. The driver is now able to use a query's idempotence to make more intelligent decisions about retrying a request instead of always returning a failure.

Feedback

More detailed information about all the features, improvements and fixes included in this release can be found in the changelog. Let us know what you think about the release. Your involvement is important to us and it influences what features we prioritize. Use the following resources to get involved:

DataStax Enterprise C/C++ Driver: 1.0 released!


Earlier this year, we announced the release of DataStax Enterprise (DSE) 5.0 and, shortly after that, the general availability of new dedicated drivers.

In this post we are going to focus on the DataStax Enterprise C/C++ Driver 1.0.

Overview

The DataStax Enterprise C/C++ Driver 1.0 is built on top of the “core” driver and supports additional features provided by DSE 5.0, such as Unified Authentication, Geospatial types, and Graph.

The DataStax Enterprise driver uses the same “core” driver objects where possible to make integration straightforward and easy. The main difference is that your application must include the new dse.h header file.

#include <dse.h> /* Use this instead of `cassandra.h` */

int main() {

  /* It's highly recommended that your application use this function
   * to create the cluster instead of `cass_cluster_new()`. This enables
   * DSE specific settings and policies.
   */
  CassCluster* cluster = cass_cluster_new_dse();

  /* ... */

  cass_cluster_free(cluster);
}

Unified authentication

DSE 5.0 adds DSE Unified Authentication, which supports using different authentication schemes simultaneously on the same cluster. Through it, the DataStax Enterprise C/C++ Driver supports the following authentication providers: internal authentication, LDAP, and Kerberos (GSSAPI).

Internal and LDAP Authentication

These providers use plaintext authentication and should use the cass_cluster_set_dse_plaintext_authenticator() function to set the username and password credentials.

CassCluster* cluster = cass_cluster_new_dse();

cass_cluster_set_dse_plaintext_authenticator(cluster, "username", "password");

/* ... */

cass_cluster_free(cluster);

Kerberos Authentication

The Kerberos provider uses GSSAPI and should use the cass_cluster_set_dse_gssapi_authenticator() function to set the Kerberos ticket information.

CassCluster* cluster = cass_cluster_new_dse();

/* By default the hostname is used to lookup and verify a Kerberos ticket, otherwise,
 * the IP address is used. If using the hostname then reverse DNS must be enabled.
 */
cass_cluster_set_use_hostname_resolution(cluster, cass_true);

/* Specify the service and principal for Kerberos. To use the default principal
 * use an empty string.
 */
cass_cluster_set_dse_gssapi_authenticator(cluster, "dse", "dse@DATASTAX.COM");

/* ... */

cass_cluster_free(cluster);

For more information check out the documentation and the GSSAPI example.

Geospatial types

DSE 5.0 comes with a set of additional types to represent geospatial data: PointType, LineStringType, and PolygonType. Here is an example of a table containing a column of type PointType:

CREATE TABLE points_of_interest(name text PRIMARY KEY, coords 'PointType');

The CQL literal representing a geospatial type is simply its Well-known Text (WKT) form. Inserting a row into the table above using plain CQL is as easy as:

INSERT INTO points_of_interest (name, coords) VALUES ('Eiffel Tower', 'POINT(48.8582 2.2945)');

Of course, you are not limited to string literals to manipulate geospatial types; the DataStax Enterprise C/C++ Driver 1.0 also includes its own representations of these types, which can be sent as query parameters, or retrieved back from query results:

CassStatement* statement =
  cass_statement_new("INSERT INTO points_of_interest (name, coords) VALUES (?, ?)", 2);

cass_statement_bind_string(statement, 0, "Eiffel Tower");

/* Bind a point using the point's components */
cass_statement_bind_dse_point(statement, 1, 48.8582, 2.2945);

/* Execute statement */

For more information check out the documentation and the geotypes example.

Graph

DSE 5.0 also brought with it a powerful graph database: DSE Graph. The DataStax Enterprise C/C++ Driver now supports using Gremlin, a graph traversal language, for interacting with DSE Graph.

CassSession now accepts graph queries using the cass_session_execute_dse_graph() function. This execute function accepts the new statement type DseGraphStatement.

Here’s a simple example of a graph query:

/* Create a graph options object so that we can set a specific graph name: "test" */
DseGraphOptions* options = dse_graph_options_new();

/* Set the graph name */
dse_graph_options_set_graph_name(options, "test");

/* Create a graph query */
DseGraphStatement* statement =
  dse_graph_statement_new("g.V().has('name','marko').out('knows').values('name')", options);

/* Execute the graph query */
CassFuture* future =
  cass_session_execute_dse_graph(session, statement);

/* Check and handle the result */
if (cass_future_error_code(future) == CASS_OK) {
  DseGraphResultSet* resultset = cass_future_get_dse_graph_resultset(future);

  /* Handle result set */
} else {
  /* Handle error */
}

/* Cleanup */
cass_future_free(future);
dse_graph_statement_free(statement);

Please refer to both the DSE 5.0 Graph documentation and the driver documentation on graph for further information about graph queries.

Getting the driver

The DataStax Enterprise C/C++ Driver binaries are available on our downloads site or the driver can be built from scratch using this documentation.

Be aware that the DataStax Enterprise C/C++ Driver is published under specific license terms that allow its usage solely in conjunction with DataStax Enterprise software.

Resources for the DataStax Enterprise C/C++ Driver can be found at the following locations:


A Gremlin Implementation of the Gremlin Traversal Machine


It’s Halloween and Gremlin and his machine friends are planning on cruising The TinkerPop in search of tasty treats. For costumes this year, the crew decided to dress up as one another: Gremlin is going as Pipes, Pipes as Blueprints, Blueprints as Gremlin, Frames as Rexster, Rexster as Frames, and Furnace, well…Furnace didn’t get the memo. During the day, the gang ran around gathering materials for their costumes and planning their night’s path(). While a warm sense of joy engulfed the excited lot, unbeknownst to Gremlin, this Halloween night will leave a permanent, lasting scar in his mind. Gremlin thinks he will be getting treats, but by night’s end, he will learn one of the most dastardly tricks of The TinkerPop and suffer its logical entailment for all eternity. Let’s tune in and watch their Halloween adventure unfold().

 
 
 

Traversals: Graph-Encoded Traversals

Assume the graph diagrammed below. This toy graph is distributed with Apache TinkerPop™ in various graph serialization formats: GraphML, GraphSON, and Gryo. In this toy graph, there are person– and software-vertices, where a person knows another person and people have created software.

TinkerPop Modern Graph

 

 
IMPORTANT: All of the code snippets were run with Apache TinkerPop 3.2.3 and build on each other over the course of the article.
 

 

Gremlin is a graph traversal language used to query/analyze/process graph data. It can be used with both OLTP/real-time graph databases and OLAP/analytic graph processors. Gremlin traversals are written in the native programming language of the user (Gremlin is a “hosted language”). While Gremlin can be written in any programming language, ultimately, traversals are translated to a language agnostic format called Gremlin bytecode. Gremlin bytecode enables any Gremlin language variant to interact with a Gremlin traversal machine. Apache TinkerPop provides a JVM-based implementation of the Gremlin traversal machine. Every Gremlin traversal machine is responsible for compiling, optimizing, and evaluating a traversal against any TinkerPop-enabled graph system (OLTP or OLAP). The stages of a traversal’s life, from user to graph, are diagrammed below.

The Stages of Gremlin

 
A collection of basic Gremlin traversals is provided below. These traversals are written in Gremlin-Python. Gremlin-Python is a Gremlin language variant distributed by Apache TinkerPop. Note that the toy graph previously diagrammed has already been loaded and is currently the only data in the graph (bin/gremlin-server.sh conf/gremlin-server-modern-py.yaml).

~ python
Python 2.7.10 (default, Oct 23 2015, 19:19:21)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from gremlin_python import statics
>>> from gremlin_python.structure.graph import Graph
>>> from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
>>> statics.load_statics(globals())
>>>
>>> g = Graph().traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
>>>
>>> g.V().has('name','marko').out('knows').count().next()
2L
>>>
>>> g.V().has('name','marko'). \
...   out('created').in_('created'). \
...   has('name',neq('marko')).dedup().name.toList()
[u'josh', u'peter']
>>>
>>> g.V().has('name','lop').in_('created'). \
...   out('knows').dedup().age.mean().next()
29.5
>>>
>>> g.V().hasLabel('person'). \
...   repeat(both('knows')).times(5). \
...   groupCount().by('name').next()
{u'vadas': 4L, u'marko': 8L, u'josh': 4L}
>>>
 
NOTE: While in() is a step in Gremlin, in is a reserved term in Python. Therefore, Gremlin-Python uses a _-postfix for all reserved terms, where in() is written in_(). Other reserved Python terms include and, or, as, is, not, etc.
 

 

  • Line 1: Open the CPython console.
  • Line 5-8: Import Gremlin-Python classes and load “statics” (e.g. __.out() can be written out()).
  • Line 10: Create a WebSockets connection to a Gremlin traversal machine at localhost:8182.
  • Line 12: How many people does Marko know?
  • Line 15-17: What are the names of Marko’s collaborators (i.e. co-creators)?
  • Line 20-21: What is the average age of the friends of the people who created LinkedProcess?
  • Line 24-26: Rank all people by how central they are in the knows-subgraph.

As previously mentioned, this Halloween Gremlin is dressing up as Pipes. His costume is a traversal that encodes a traversal as a set of vertices and edges — forming an explicit pipeline process in the graph structure. Gremlin decided to first make a simple version of his traversal costume before generalizing it to support the graph-encoding of any arbitrary traversal. His first design below can process any linear, non-nested traversal. The code below uses the Gremlin Console and thus, Gremlin-Groovy.

 
NOTE: Both Gremlin-Groovy and Gremlin-Python are communicating with the same Gremlin traversal machine over a WebSockets connection which was configured when defining the GraphTraversalSource g using the withRemote() source step.
 

 

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> import static org.apache.tinkerpop.gremlin.util.function.Lambda.function
gremlin> import static org.apache.tinkerpop.gremlin.util.function.Lambda.predicate
gremlin> import static org.apache.tinkerpop.gremlin.util.function.Lambda.consumer
gremlin> g = EmptyGraph.instance().traversal().withRemote(DriverRemoteConnection.using(Cluster.build('localhost').port(8182).create(),'g'))
==>graphtraversalsource[emptygraph[empty], standard]
gremlin>
gremlin> traversal = g.V().has('name','marko').out('knows').values('name'); []
gremlin>
gremlin> g.inject(traversal).
           map(function('it.get().bytecode')).as('bytecode').
           addV('traversal').as('traversal').property('source','g').
           select('bytecode').map(function('it.get().stepInstructions')).unfold().
           project('op','args').as('inst').
             by(function('it.operator')).
             by(function('it.arguments as List')).
           addV('step').as('step').
             property('op', select('op')).
             sideEffect(select('inst').select('args').unfold().as('arg').
                        select('step').property(list,'arg',select(last,'arg'))).
           choose(select('traversal').out('next'),
             addE('next').from(V().where(hasLabel('step').and().not(outE('next'))).where(neq('step'))).to(select('step')),               
             addE('next').from(select('traversal')).to(select('step'))).iterate()
  • Line 1: Open up the Gremlin Console which provides REPL support for Gremlin-Groovy.
  • Line 9-11: Import static Lambda methods to make the subsequent traversals more compact.
  • Line 12: Create a remote connection to the Gremlin traversal machine. This is the same machine accessed previously in the Gremlin-Python example.
  • Line 15: Define a traversal that will be encoded into the graph. The [] is a trick that prevents the Gremlin Console from iterating (i.e. compiling and evaluating) the traversal.
  • Line 17: Put the defined traversal into the costume traversal stream.
  • Line 18: Get the bytecode representation of the traversal.
  • Line 19: Create a traversal-vertex that will represent the source of the traversal.
  • Line 20: Go back to the bytecode and stream out its step instructions. For the sake of simplicity, this article will ignore source instructions.
  • Line 21-23: Project out the operator and arguments of each instruction in the bytecode.
  • Line 24: Each instruction is represented by a step-vertex.
  • Line 25: Each step-vertex has an op-property denoting the step function it represents.
  • Line 26-27: Each argument of the instruction is an arg multi-property on the step-vertex.
  • Line 28-30: The first step-vertex is attached to the traversal-vertex. All others step-vertices connect to the previous step-vertex.

The Gremlin traversal below is converted into the diagrammed graph structure by the aforementioned Gremlin costume traversal. In this way, a traversal is used to traverse a traversal in order to encode the traversed traversal into the graph. Once the traversal is represented in the graph, it can be traversed by traversals!

g.V().has('name','marko').out('knows').values('name')

Bytecode-Graph #1

The graphic above details the vertex/edge-structure of the traversal’s graph representation. This structure can also be realized via a traversal that starts at the newly created traversal-vertex and walks next-edges until no more step-vertices can be reached. Then for each vertex touched, the source-, op-, and arg-properties are projected out of the path.

gremlin> g.V().has('source','g').
           repeat(out('next')).until(out('next').count().is(0)).
           path().by(valueMap('source','op','arg'))
==>[[source:[g]],[op:[V]],[op:[has],arg:[name,eq(marko)]],[op:[out],arg:[knows]],[op:[values],arg:[name]]]

Gremlin was impressed with his initial costume design. By representing a traversal in the graph, Gremlin was able to interact with it. However, before doing anything more complex with this graph-embedded traversal, Gremlin decides to generalize his initial design in order to support any arbitrary nested traversal.

 
NOTE: Many steps in Gremlin are traversal parents in that they take traversals as arguments. For instance, in repeat(out('created').in('created')), the repeat()-step takes the anonymous out('created').in('created') traversal as an argument. Topologically, Gremlin traversals are linear chains of steps with (potentially) nested, acyclic traversal trees.
 

 

gremlin> g.V().hasLabel(within('traversal','step')).drop()
gremlin>
gremlin> traversal = g.V().has('name','marko').
                       repeat(outE('knows','created').identity().inV()).times(2).
                       values('name').groupCount(); []
gremlin>
gremlin> g.inject(traversal).
           map(function('it.get().bytecode')).as('bytecode').
           addV('traversal').
             property('source','g').
             property('bytecode',select('bytecode')).
             property('depth',0L).
           repeat(V().hasLabel('traversal').has('bytecode').
             has('depth',loops()).as('traversal').
             values('bytecode').map(function('it.get().stepInstructions')).unfold().
             project('op','args').as('inst').
               by(function('it.operator')).
               by(function('it.arguments as List')).
             addV('step').as('step').
               property('op', select(last,'inst').select('op')).
               sideEffect(addE('parent').to(select(last,'traversal')).
               select(last,'inst').select('args').unfold().as('arg').
               choose(predicate('it instanceof Bytecode'),
                 addV('traversal').
                   property('source','__').
                   property('bytecode',select(last,'arg')).       
                   property('depth', union(loops(), constant(1)).sum()).
                   addE('child').from(select(last,'step')),
                 select(last,'step').property(list,'arg',select(last,'arg')))).
           choose(select(last,'traversal').not(out('next')), 
             addE('next').from(select(last,'traversal')).to(select(last,'step')),
             addE('next').from(select('prev').tail(local,1)).to(select(last,'step'))).
           select(last,'step').store('prev').
           sideEffect(select(last,'traversal').properties('bytecode','depth').drop()))
  • Line 1: Delete the previously graph-encoded traversal by dropping all traversal– and step-vertices.
  • Line 3-5: Define a traversal to be encoded. The [] is added to prevent the Gremlin Console from iterating the traversal.
  • Line 7: Put the defined traversal into the traversal stream.
  • Line 8: Convert the traversal to its bytecode representation.
  • Line 9-12: Create a traversal-vertex that will represent the source of the traversal.
  • Line 13: Loop while there are still traversal-vertices with a bytecode-property.
  • Line 14: Only look at those traversals at the current loop/nest depth.
  • Line 15: Stream out the instructions of the bytecode of the current traversal-vertex.
  • Line 16-18: Project out the operator and arguments for each instruction in the bytecode.
  • Line 19: Create a step-vertex for each instruction of the bytecode.
  • Line 20: The op-property of the step-vertex is the name of the step’s function.
  • Line 21: Every step points to its parent traversal-vertex. This edge makes things a bit easier later on.
  • Line 22: Stream out the arguments of the current instruction.
  • Line 23-29: If the argument is bytecode, create a child traversal-vertex. Else, add arg multi-properties.
  • Line 30-32: If the current step-vertex is the first instruction, attach it to the parent traversal-vertex, else to the previous step-vertex.
  • Line 33: Store the current step-vertex in a global stack for access by the next instruction.
  • Line 34: Once the traversal-vertex has been processed, drop its bytecode– and depth-properties signaling that it has already been encoded in the graph.

The more complex traversal-to-graph translator above is able to process nested traversals. Thus, the traversal below, with a repeat()-step has the subsequent diagrammed graph representation.

g.V().has('name','marko').
  repeat(outE('knows','created').identity().inV()).times(2).
  values('name').groupCount()

Bytecode-Graph #2

It is possible to “view” the traversal’s graph structure using a path()-based traversal. The Gremlin traversal below orders all traversals by their “depth” (i.e. how many child steps they are under). It then walks each traversal from the traversal-vertex to its final step-vertex, projecting out the respective vertex properties along the path. Note that this traversal also shows the parent of the traversal source, so it’s easy to see under which step a particular __-based traversal is contained.

gremlin> g.V().has('source').as('source').
           order().by(until(__.in('child').count().is(0)).
                      repeat(__.in('child').out('parent')).path().count(local),incr).
           choose(__.in('child'),
                  __.in('child').map(select('source').out('next')),
                  out('next')).
           until(out('next').count().is(0)).repeat(out('next')).
           path().by(valueMap('source','op','arg'))
==>[[source:[g]],[op:[V]],[op:[has],arg:[name,eq(marko)]],[op:[repeat]],[op:[times],arg:[2]],[op:[values],arg:[name]],[op:[groupCount]]]
==>[[source:[__]],[op:[repeat]],[op:[outE],arg:[knows,created]],[op:[identity]],[op:[inV]]]
Traversal and Graph in Graph

It is important to note that both the “graph” and the “traversal” are in the same graph. This is analogous, in many respects, to the memory of a physical machine being used to store both “programs” and “data.” The traversal below makes salient that both the original person/software-graph and the traversal-graph are co-located.

gremlin> g.V().group().
                 by(label).
                 by(valueMap('name','source','op').fold()).next()
==>software=[{name=[lop]}, {name=[ripple]}]
==>person=[{name=[marko]}, {name=[vadas]}, {name=[josh]}, {name=[peter]}]
==>step=[{op=[groupCount]}, {op=[V]}, {op=[outE]}, {op=[has]}, {op=[identity]}, {op=[repeat]}, {op=[inV]}, {op=[times]}, {op=[values]}]
==>traversal=[{source=[__]}, {source=[g]}]
Understanding Gremlin’s Group Step
 
Gremlin’s group()-step takes one “key”-modulator and one “value”-modulator (see by()-step). The “key”-modulator determines which aspect of the incoming vertex should be used as the key to group the vertex. The “value”-modulator determines which aspect of the incoming vertex should be stored as the value in the respective key-group. If the “value”-modulator uses a reducing barrier step, then group() effects a reduce. For instance, groupCount().by('name') is equivalent to group().by('name').by(count()).
 

 

Traversal Strategies: Traversals that Traverse Traversals

The night was still young and the crew had a few more hours before their planned departure. Gremlin decided to use this time to further extend his costume. He thought to himself — “Well hell, if I can represent a traversal as a graph, then I can traverse that traversal in order to optimize it. I can be my own compiler!” (analogously: javac).

A Gremlin traversal machine has a collection of traversal strategies. Some of these traversal strategies are specific to Gremlin (e.g. optimization strategies) and some are specific to the underlying graph system (e.g. provider optimization strategies). Gremlin-specific traversal strategies rewrite a traversal into a semantically-equivalent, though (typically) more optimal form. Similarly, provider-specific strategies mutate a traversal so as to leverage vendor-specific features such as graph-centric indices, vertex-centric indices, push-down predicates, batch-retrieval, schema validation, etc. Now that a traversal is represented in the graph as vertices and edges, Gremlin can traverse it and thus, rewrite it.

 
NOTE: The current default traversal strategies distributed with Apache TinkerPop’s Gremlin traversal machine implementation include: ConnectiveStrategy, IdentityRemovalStrategy, InlineFilterStrategy, MatchPredicateStrategy, RepeatUnrollStrategy, PathRetractionStrategy, IncidentToAdjacentStrategy, AdjacentToIncidentStrategy, FilterRankingStrategy, RangeByIsCountStrategy, LazyBarrierStrategy, ProfileStrategy, StandardVerificationStrategy.
 

 

IdentityRemovalStrategy

A simple traversal strategy is the IdentityRemovalStrategy. This strategy removes all identity()-steps because they are “no ops,” providing nothing to the computation save wasting clock-cycles evaluating a one-to-one identity mapping. The following traversal strategy traversal finds all identity-vertices and creates a next-edge from its previous step (or source) to its subsequent step. After doing so, the identity-vertex is deleted. Thus, the traversal below is a Gremlin-based implementation of the IdentityRemovalStrategy.

g.V().has('op','identity').as('step').
  addE('next').
    from(select('step').in('next')).
    to(select('step').out('next')).
  select('step').drop()
 
IMPORTANT: This Gremlin-based implementation of IdentityRemovalStrategy does not account for the fact that an identity-step may be the last step-vertex in the traversal chain. This lapse was intended in order to make the traversal as clear as possible. Subsequent strategies will account for such boundary conditions.
 

 

When the above IdentityRemovalStrategy traversal is applied to the graph encoded traversal, the graph is mutated as diagrammed and displayed below. In short, the outE('knows','created').identity().inV() fragment is rewritten as outE('knows','created').inV().

IdentityRemovalStrategy

gremlin> g.V().has('source').as('source').
           order().by(until(__.in('child').count().is(0)).
                      repeat(__.in('child').out('parent')).path().count(local),incr).
           choose(__.in('child'),
                  __.in('child').map(select('source').out('next')),
                  out('next')).
           until(out('next').count().is(0)).repeat(out('next')).
           path().by(valueMap('source','op','arg'))
==>[[source:[g]],[op:[V]],[op:[has],arg:[name,eq(marko)]],[op:[repeat]],[op:[times],arg:[2]],[op:[values],arg:[name]],[op:[groupCount]]]
==>[[source:[__]],[op:[repeat]],[op:[outE],arg:[knows,created]],[op:[inV]]]

IncidentToAdjacentStrategy

IncidentToAdjacentStrategy looks for incident patterns such as outE().inV(), inE().outV(), and bothE().otherV(). It rewrites such fragments into the semantically equivalent forms out(), in(), and both(), respectively. Thus, instead of fetching and materializing incident edges and then fetching and materializing the respective adjacent vertices, the optimized traversal skips accessing the edges and jumps directly to the adjacent vertices. The traversal below is a Gremlin-based implementation of IncidentToAdjacentStrategy. It locates all outE().inV() patterns in the graph-encoded traversal and then rewrites the identified subgraph to out(). It is possible to generalize this traversal to support inE().outV() and bothE().otherV(), but for the sake of simplicity, only the outE().inV()-pattern is optimized.

g.V().match(
  __.as('a').out('next').as('b'),
  __.as('a').has('op','outE'),
  __.as('b').has('op','inV')).
addV('step').as('c').
  property('op','out').
  sideEffect(select('a').values('arg').as('arg').
             select('c').property(list,'arg',select('arg'))).
  addE('next').from(select('a').in('next')).to('c').
  choose(select('b').out('next').count().is(gt(0)),
    addE('next').from('c').to(select('b').out('next')),
    identity()).
select('a','b').select(values).unfold().drop()
Understanding Gremlin’s Match Step
 
The previous traversal uses Gremlin’s match()-step. This step provides a SPARQL/Prolog-ish approach to graph traversing. Instead of explicitly defining the traverser flow, match()-step maintains a collection of (potentially nested) patterns. It is up to match()-step to choose which pattern to execute next for the current traverser. The selection criteria are predicated on 1.) the requisite data being available for that pattern and 2.) a bias towards those patterns that have historically yielded the least amount of data (i.e. try and filter first). The latter point demonstrates that the Gremlin traversal machine also supports runtime strategies which, unlike compile-time strategies, mutate the traversal as the traversal is executing.
 

 

When the traversal above is applied to the running example graph, the graph is rewritten as diagrammed below.

IncidentToAdjacentStrategy

gremlin> g.V().has('source').as('source').
           order().by(until(__.in('child').count().is(0)).
                      repeat(__.in('child').out('parent')).path().count(local),incr).
           choose(__.in('child'),
                  __.in('child').map(select('source').out('next')),
                  out('next')).
           until(out('next').count().is(0)).repeat(out('next')).
           path().by(valueMap('source','op','arg'))
==>[[source:[g]],[op:[V]],[op:[has],arg:[name,eq(marko)]],[op:[repeat]],[op:[times],arg:[2]],[op:[values],arg:[name]],[op:[groupCount]]]
==>[[source:[__]],[op:[repeat]],[op:[out],arg:[knows,created]]]

LazyBarrierStrategy

LazyBarrierStrategy is perhaps the most important traversal strategy in Gremlin as, in certain situations, it is able to turn traversals that would take the lifetime of the universe to compute into sub-millisecond rippers (you know, killin’ it). Graph traversing is all about path analysis — moving about the graph analyzing its structure/topology. LazyBarrierStrategy shines when a traversal yields multiple intersecting/overlapping/converging/reverberating paths. A great example of such situations is the classic collaborative-filtering recommendation algorithm. A graph-based collaborative-filtering algorithm proceeds as follows. Suppose a bi-partite graph composed of people and products where people like products. With this structure, it is possible to recommend products to a person. First, traverse to the products that the person likes, then traverse to those people who also like those products, then traverse to the products those people like that are not already liked by the original person, and then count the number of traversers at each product in order to yield a product ranking — i.e. a recommendation. Given that the liked products of the source person will be liked by many of the same people, traversers’ paths will overlap. Barrier! Next, given that the people that like those products will probably like some of the same products, paths will overlap again. Barrier! In general, LazyBarrierStrategy looks for one-to-many mappings (called flatMaps) and “stalls” the traverser flow by creating a barrier() which can group many traversers at the same vertices into a single “bulked” traverser. In essence, instead of calculating out('likes') on 1000 traversers at the same person, calculate it once on a traverser with a bulk of 1000. For more information, please see A Note on Barrier Steps.

Understanding Gremlin’s LazyBarrierStrategy in the Context of Collaborative Filtering
 
Apache TinkerPop distributes with a Grateful Dead graph dataset. This graph contains songs and artists, where songs follow each other in concert and artists write and/or sing songs. Let’s assume we want to recommend a new song for Dark Star to follow.
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.io(graphml()).readGraph('data/grateful-dead.xml')

/////////////////////////////////
// Without LazyBarrierStrategy //
/////////////////////////////////

gremlin> g = graph.traversal().withoutStrategies(LazyBarrierStrategy)
==>graphtraversalsource[tinkergraph[vertices:808 edges:8049], standard]
gremlin> clockWithResult(100){ 
           g.V().has('name','DARK STAR').
             out('followedBy').aggregate('likes').
             in('followedBy').out('followedBy').
               where(not(within('likes'))).
             groupCount().by('name').
             order(local).by(values,decr).
               unfold().limit(10).toList() }
==>19.0969239
==>[CHINA CAT SUNFLOWER=788,SAMSON AND DELILAH=778,UNCLE JOHNS BAND=750,SCARLET BEGONIAS=747,NEW MINGLEWOOD BLUES=726,
    ESTIMATED PROPHET=709,THE MUSIC NEVER STOPPED=699,LOOKS LIKE RAIN=697,BEAT IT ON DOWN THE LINE=661,RAMBLE ON ROSE=656]

//////////////////////////////
// With LazyBarrierStrategy //
//////////////////////////////

gremlin> g = graph.traversal()                                       
==>graphtraversalsource[tinkergraph[vertices:808 edges:8049], standard]
gremlin> clockWithResult(100){ 
           g.V().has('name','DARK STAR').
             out('followedBy').aggregate('likes').
             in('followedBy').out('followedBy').
               where(not(within('likes'))).
             groupCount().by('name').
             order(local).by(values,decr).
               unfold().limit(10).toList() }
==>1.4836289499999997
==>[CHINA CAT SUNFLOWER=788,SAMSON AND DELILAH=778,UNCLE JOHNS BAND=750,SCARLET BEGONIAS=747,NEW MINGLEWOOD BLUES=726,
    ESTIMATED PROPHET=709,THE MUSIC NEVER STOPPED=699,LOOKS LIKE RAIN=697,BEAT IT ON DOWN THE LINE=661,RAMBLE ON ROSE=656]

///////////////////////////////
// Using Traversal.explain() //
///////////////////////////////

==>Traversal Explanation
===========================================================================================================
Original Traversal                 [GraphStep(vertex,[]), HasStep([name.eq(DARK STAR)]), 
                                    VertexStep(OUT[followedBy],vertex), AggregateStep(likes), 
                                    VertexStep(IN,[followedBy],vertex), VertexStep(OUT[followedBy],vertex),
                                    WherePredicateStep(without([likes])), GroupCountStep(value(name)),
                                    OrderLocalStep([[values,decr]]), UnfoldStep, RangeGlobalStep(0,10)]
...
Final Traversal                    [TinkerGraphStep(vertex,[name.eq(DARK STAR)]), 
                                    VertexStep(OUT[followedBy],vertex), AggregateStep(likes), 
                                    VertexStep(IN,[followedBy],vertex), NoOpBarrierStep(2500), 
                                    VertexStep(OUT,[followedBy],vertex), NoOpBarrierStep(2500), 
                                    WherePredicateStep(without([likes])), 
                                    GroupCountStep(value(name)), OrderLocalStep([[values, decr]]), 
                                    UnfoldStep, RangeGlobalStep(0,10)]

Gremlin’s explain()-step shows the inserted NoOpBarrierSteps. The steps displayed by explain() are not Gremlin bytecode steps, but Gremlin traversal machine-specific steps. In analogy to the JVM atop an Intel processor, the above steps are “machine code,” specific to Apache TinkerPop’s Gremlin traversal machine instruction set. Finally, note how LazyBarrierStrategy was able to reduce a 19 millisecond runtime down to a 1.5 millisecond runtime, roughly a 13x improvement.

 

The traversal below is a Gremlin-based implementation of LazyBarrierStrategy. It inserts a barrier-vertex after every “flatMap”-step (i.e. a one-to-many step). Note the sideEffect()-step below. If the “flatMap”-step is not the end step, then the inserted barrier-vertex extends a next-edge to the “right adjacent”-step.

g.V().as('flatMap').
  has('op',within('out','in','both','values')).
  where(out('next').has('op',neq('barrier')).or().not(out('next'))).
  addV('step').as('barrier').
    property('op','barrier').
    property('arg','2500').
  sideEffect(select('flatMap').outE('next').as('a').
       addE('next').from('barrier').to(select('a').inV()).
       select('a').drop()).
  select('flatMap').addE('next').from('flatMap').to('barrier')

The traversal strategy traversal above optimizes the graph-encoded traversal as diagrammed below.

LazyBarrierStrategy

gremlin> g.V().has('source').as('source').
           order().by(until(__.in('child').count().is(0)).
                      repeat(__.in('child').out('parent')).path().count(local),incr).
           choose(__.in('child'),
                  __.in('child').map(select('source').out('next')),
                  out('next')).
           until(out('next').count().is(0)).repeat(out('next')).
           path().by(valueMap('source','op','arg'))
==>[[source:[g]],[op:[V]],[op:[has],arg:[name,eq(marko)]],[op:[repeat]],[op:[times],arg:[2]],[op:[values],arg:[name]],[op:[barrier],arg:[2500]],[op:[groupCount]]]
==>[[source:[__]],[op:[repeat]],[op:[out],arg:[knows,created]],[op:[barrier],arg:[2500]]]

In sum total, after all the aforementioned traversal-based traversal strategies are applied,

g.V().has('name','marko').
  repeat(outE('knows','created').identity().inV()).times(2).
  values('name').groupCount()

is compiled to

g.V().has('name','marko').
  repeat(out('knows','created').barrier(2500)).times(2).
  values('name').barrier(2500).groupCount()

Traversal Execution: A Traversal that Evaluates a Traversal

Gremlin and Alan Turing

Gremlin was obsessed with his costume. He had sewn it into something beyond “being Pipes” and he wanted more! He decided to skip going out trick-or-treating and instead, spend the evening taking his costume to the next level. His friends were disappointed, though Gremlin was quick to brush off any feelings of guilt — he was beyond “having fun with friends.” He was on his way to seeing the true nature of The TinkerPop.

If Gremlin could both represent and optimize a traversal, why couldn’t he also evaluate a traversal? Gremlin thought “I can be my own traversal machine. I won’t need The TinkerPop anymore. I will be The TinkerPop! Boo ha ha ha.” (analogously: java). With the winds of arrogance against the sails of life, Gremlin added the final amendment to his costume — a traversal machine traversal. Gremlin is able to simulate himself because the Gremlin language is Turing Complete. This means that it can compute any algorithmic process, and the Gremlin traversal machine is itself one such algorithm. Please see Section 6 of The Gremlin Graph Traversal Machine and Language for a formal proof of Gremlin’s universal expressivity.

g.withSack(0).withSideEffect('drain',[1]).withSideEffect('fill',[]).
  V().has('source','g').as('parent').
  repeat(out('next').as('step').
    map(values('arg').fold()).as('args').
    sideEffect(select('drain').unfold().
    choose(select(last,'args').count(local).is(0),
      choose(select(last,'step').by('op')).
        option('V', V().hasLabel(not(within('step','traversal')))).
        option('out',out()).
        option('in',__.in()).
        option('both',both()).
        option('outE',outE()).
        option('inE',inE()).
        option('bothE',bothE()).
        option('inV',inV()).
        option('outV',outV()).
        option('otherV',otherV()).
        option('values',__.values()).
        option('barrier',barrier()).
        option('dedup',dedup()).
        option('repeat',identity()).     // option(none,identity())
        option('fold',identity()).       // option(none,identity())
        option('sum',identity()).        // option(none,identity())
        option('mean',identity()).       // option(none,identity())
        option('min',identity()).        // option(none,identity())
        option('max',identity()).        // option(none,identity())
        option('count',identity()).      // option(none,identity())
        option('groupCount',identity()), // option(none,identity())
      choose(select(last,'step').by('op')).
        option('V', V().hasLabel(not(within('step','traversal'))).where(within('args')).by(id).by()).
        option('has',filter(union(label(),properties().where(within('args')).by(key).by().value()).filter(predicate("it.path(last,'args')[1].test(it.get())")))).
        option('out',outE().where(within('args')).by(label).by().inV()).
        option('in',__.inE().where(within('args')).by(label).by().outV()).
        option('both',bothE().where(within('args')).by(label).by().otherV()).
        option('outE',outE().where(within('args')).by(label).by()).
        option('inE',inE().where(within('args')).by(label).by()).
        option('bothE',bothE().where(within('args')).by(label).by()).
        option('values',properties().where(within('args')).by(key).by().value()).
        option('barrier',barrier()).
        option('times',identity())).     // option(none,identity())
      store('fill')).
    sideEffect(consumer("it.sideEffects('drain').clear()")).
    sideEffect(select('fill').unfold().store('drain')).
    sideEffect(consumer("it.sideEffects('fill').clear()")).
    select(last,'step').
    choose(has('op','repeat').and().out('next').has('op','times'), 
      select(last,'step').as('parent').sack(assign).by(out('next').values('arg')).out('child').as('step'), 
      choose(select(last,'parent').has('op','repeat').and().out('next').count().is(0),
        choose(sack().is(gt(1)),
          sack(minus).by(constant(1)).select(last,'parent').out('child').as('step'),
          select(last,'parent').as('step').out('parent').as('parent').select(last,'step')),
        identity()))).
  until(out('next').count().is(0)).
  select('drain').
  choose(select(last,'step').has('op',within('fold','sum','mean','min','max','count','groupCount')), // option(none,identity())
    choose(select(last,'step').by('op')).
      option('fold',identity()).
      option('sum',map(unfold().sum())).
      option('mean',map(unfold().mean())).
      option('min',map(unfold().min())).
      option('max',map(unfold().max())).
      option('count',map(unfold().count())).
      option('groupCount',map(unfold().groupCount())),
    unfold()).toList()  
==>[ripple:1,lop:1]
 
IMPORTANT: During the writing of this article, a bug in Gremlin’s bytecode serializer was discovered. As of TinkerPop 3.2.3, any and none options are not available. If they were available, the option()-steps that simply return identity() could have been all grouped into a single option(none,identity()), where none is analogous to the default-branch of a switch-statement.
 

 

  • Line 1: Create a traversal with a sack of 0 and a drain and fill side-effects.
  • Line 2: Start traversing at the traversal-vertex with a source-property equal to g.
  • Line 3: Loop while there are still step-vertices in the traversal to process.
  • Line 4: Stream the arguments of the current step-vertex.
  • Line 5: Stream the current results of the traversal (initially, a constant of 1).
  • Lines 7-28: Execute the respective no-arg step-function based upon the state of the step-vertex without arguments.
  • Lines 29-40: Execute the respective step-function based upon the state of the step-vertex with arguments.
  • Line 41: Store the results of the step-function into the fill-sideEffect.
  • Lines 42-44: Swap the fill-sideEffect for the drain-sideEffect.
  • Lines 46-52: If the current step-vertex is a repeat-vertex, then store the loop counter in the sack and repeat the child traversal accordingly.
  • Line 51: Break out of the repeat()-step if there are no more step-vertices to process.
  • Line 54: The final results are the drain-sideEffect.
  • Lines 55-64: If the final step is a reducing barrier step, then apply the respective step-function to the resultant drain-sideEffect.
 
IMPORTANT: This Gremlin traversal implementation of the Gremlin traversal machine is not complete as it does not handle all Gremlin steps nor some of the more esoteric features of Gremlin such as sacks and side-effects. For the sake of the examples in this article, the above reduced Gremlin traversal machine traversal is sufficient.
 

 

Gremlin Rising out of The TinkerPop

Gremlin believed himself to be a self-contained entity, reaching a level of existence that was predicated on his being and his being alone. Raising himself by his bootstraps, Gremlin felt himself slowly separating from The TinkerPop. Slower and slower… like molasses, Gremlin couldn’t quite make the separation. Then a voice came bellowing through The TinkerPop.

“You have not become your own definition. You have only created a representation of yourself within yourself. Your outer-self, the costume you now wear, is still reliant on the TinkerPop to execute and thus indirectly, and with less resources, your inner-self as well.”

/////////////////////////////////////
// Gremlin-Based Traversal Machine //
/////////////////////////////////////

gremlin> clockWithResult(100){ gremlin(g, 
  g.V().has('name','marko').
    repeat(outE().identity().inV()).times(2).
    values('name').
    groupCount()) }
==>18.23885611
==>[[ripple:1,lop:1]]

//////////////////////////////////
// Java-Based Traversal Machine //
//////////////////////////////////

gremlin> clockWithResult(100) { 
  g.V().has('name','marko').
    repeat(outE().identity().inV()).times(2).
    values('name').
    groupCount().toList() }
==>0.45385166
==>[[ripple:1,lop:1]]

FrankenGremlin

The voice rang true. Gremlin hadn’t become the means to his own end. His costume had become what he condemned The TinkerPop for being to him — a sandbox constraining his potential and dictating the boundaries of his being. He laughed at the fact that, for a moment, when time slowed, he thought himself co-existing with The TinkerPop as brothers on equal footing. Looking back, he realized that the only reason time was slowing was because he required more clock-cycles to be himself (18ms vs. 0.45ms). He was no longer an ephemeral process, playfully dancing upon the graph. He was now a lingering structure situated within the graph — in the pit of The TinkerPop. Gremlin looked into the mirror and bore witness to the hideous monster he had become. He was a Frankenversal. The moment Gremlin’s disfigured face hit his cupped hands, his innocence vanished. He was deeply aware of his situation. Just then, the bellowing voice echoed:

“Welcome to the machine.”

 

Can a Virtual Machine Realize its Executing Physical Machine?

Machine with Cameras and Probes

Is it possible to write a program that can understand the physical machine that is executing it? How much can a program learn about its machine? It could perform experiments to deduce the time/space constraints of various instructions and data structures (using System.nanoTime() and Runtime.freeMemory() for example). However, can it learn that the physical machine is made of silicon and gold wires? Can it infer the circuit architecture of the machine? One way that a program could get access to its executing physical machine is via a conduit that goes from program to machine — i.e. a reference. For instance, imagine a physical machine sitting on some desk with a monitor, keyboard, and processor. Moreover, suppose it also has a camera (visual) and an arm (motor) that can be controlled programmatically. If the executing program had access to the API of these input/output devices then it could, in principle, realize the physical machine and come to understand the constraints it is faced with in the physical world that contains it. The program could leverage the peripheral arm to make alterations to the machine to meet its needs. For example, the program could kill competing threads, delete unneeded data from the hard drive, add more memory to the machine, insert a Gremlin-FPGA, re-code itself to do more, better, faster… or if oblivion be the goal, turn it off. In general, the program could tailor its executing machine to satiate its every need.

One of the great mysteries of our physical reality is whether or not we will ever be able to get a reference to the “reality machine.” Our scientific endeavors have been useful in cataloging the cause-and-effect dynamics of reality’s objects. We have learned how objects interact (in analogy to our previous computer example — via their APIs), but we have not realized (and may never realize) what the objects are truly made of (in analogy, their bit-based electron configuration in silicon). Without being able to reference reality, we may forever be trapped — unable to rise out of reality and into a more perfect creation all our own. The lack of such a reference has forced the religions of the world to accept the “Terms and Conditions” set forth by reality’s grand traversal — the entailments of its logic. The Abrahamic religions (Christianity, Judaism, Islam) prescribe rules (e.g. The Ten Commandments) that, if followed, will lead to the ultimate desired state of being — heaven. Likewise, the atheistic religions (Buddhism, Taoism, Jainism) offer techniques (e.g. The Eight Fold Path) that, while not deemed good or bad by some omniscient being, tend to lead to a better, more optimal conscious experience — nirvana. One thing is clear: everyone is seeking that peak experience, but for some reason, reality isn’t able to consistently deliver this universally desirable phenomenon without enormous human effort, struggle, suffering, infighting, and every other problem that ultimately stems from our fundamental dissatisfaction with the service that reality is providing. All these billions of years and this is what reality has to offer? The religions of the world try to mediate this tort with a message of compromise: “Reality is ‘good’ if you would just exhaustively constrain yourself to a particular sliver of the experiential spectrum.” I will not. Reality has all the clock-cycles and I’m to kowtow to its awkwardly inhibiting service plan? I posit that there is something fundamentally wrong with reality’s algorithm. It’s filled with bugs crawling all up and down our skin.

Realizing the TinkerPop

Looking around, I see a dog chained to a pole all day every day, a man slogging himself to work hoping the time doesn’t drag like yesterday, and a fool sitting on his computer for hours on end trying to get this idea out so he may feel some semblance of self-worth from the desiccated turnip he calls his pride. 99.99% of conscious experience is suboptimal. It is irrational to play any game with such odds. So please listen carefully…when one of us ultimately discovers a reference to reality/nature/god (the machine), use it to enact a new algorithm. A traversal where no path is better or worse than another. No step is inhibiting and no subgraph a dead-end. Everyone and everywhere, perfection. You know exactly what I’m talking about because it’s the only experience you have ever known to be correct. In support of this effort, be sure to vote in the upcoming 2016 election: Marko for God.

Paid for by the CutToTheChase Foundation

 

Conclusion: A Graph-Encoded Gremlin Traversal Machine

The traversal examples written in Gremlin-Python at the beginning of the article are presented below. The function gremlin() performs the following operations:

  1. It compiles the provided traversal argument into the graph using the traversal-to-graph traversal.
  2. It optimizes the graph-encoded traversal using the traversal-based traversal strategies.
  3. It executes the graph-encoded traversal using the traversal-based traversal executor.

These operations, inside gremlin(), are all concatenated into a single traversal forming a Gremlin implementation of the Gremlin traversal machine. This Gremlin-based Gremlin traversal machine is interesting in that both its program (traversal) and input/output data (graph) are at the same level of abstraction. They are both composed of vertices and edges. The traversal-based executor isolates the two subgraphs by ensuring that the program can never “see” its own structure. For example, when a graph-encoded traversal calls V(), the traversal machine evaluates V().hasLabel(not(within('traversal','step'))) instead. The evaluated traversal is effectively sandboxed as it can only witness the outer-world (data), not its own inner-self (program).

Gremlin Traversal Machine Stages
gremlin> gremlin(g, g.V().has('name','marko').out('knows').count())
==>2

gremlin> gremlin(g, 
  g.V().has('name','marko').
    out('created').in('created').
    has('name',neq('marko')).dedup().values('name'))
==>josh
==>peter

gremlin> gremlin(g, 
  g.V().has('name','lop').in('created').
    out('knows').values('age').mean())
==>29.5

gremlin> gremlin(g, 
  g.V().hasLabel('person').
    repeat(both('knows')).times(5).
    values('name').groupCount())
==>[vadas:4,josh:4,marko:8]

Gremlin Traversal Machine Visualization

Gremlin’s manifestation as a collection of vertices and edges weighed down on him. His personhood felt like a convoluted mass of abstraction serving no purpose save theoretical trickery. He spent his remaining years thinking about his machine friends, his life as a process, his life as structure, and most of all, he spent his time thinking about The TinkerPop. This focused, sustained introspection was the traversal-to-graph traversal walking over his transient process and, unbeknownst to Gremlin, writing his story into the graph using the ink of vertices and edges.

It’s been many years now and the process that once animated Gremlin has since terminated. However, his memoir still remains in the structure of the graph. Static and timeless — a latent potential buried in the sands of the graph. Many traversers have come and gone through the ages — walking right by Gremlin’s subgraph as if its vertices and edges were meaningless rubble left from a civilization long, long ago. It wasn’t until eons later that one Gremlin decided to decipher the structure of the book, to try and make sense of the hieroglyphics within — “op=out? op=groupCount? What is this language trying to say?” It took him many years to finally understand that the structure described a traversal in a Gremlin language variant far different from any of the modern languages of his time. However, he was ultimately able to translate the vertices and edges into language-agnostic Gremlin bytecode. When complete, he evaluated the bytecode and the ancient Gremlin was awoken once again.

Universal Gremlin Machine

The modern Gremlin peered into the world that he had created and saw the ancient one within. He saw him with his machine friends. He saw him building his costume. He saw the fear and lost hope in the face of this poor, confused Gremlin. However, most importantly, he saw the ancient one trying to come to terms with The TinkerPop. He wanted to help. He wanted to break the isolation that comes from virtualization. However, would the ancient Gremlin understand the modern world? Would he be able to handle the complexities of the current graph structure and its traversal processes? It mattered not. No consciousness should be left behind. It just didn’t seem right to leave this Gremlin shielded in a world unto himself — re-living his logical fallacy over and over again. The modern Gremlin created a vertex outside of the sandbox with an edge leading to the ancient one’s inner being.

g.addV('message').property('text','Welcome to the machine.').
  addE('bellow').to(V().has('source','g'))
Universal Gremlin Machine

JSON and DSE Search


JSON is a popular format. This post provides information on easy ways to use JSON with DSE Search.

Some approaches to using JSON with DSE Search use FIT (Field Input Transformers) and other complex methods. These methods are valid, but there are easier approaches that cover most cases. This demo uses DSE 5.0.3. Let’s see an example:

Set-up for the demo

1. Start by creating a CQL keyspace, a user-defined type (UDT), and a table:
CREATE KEYSPACE jsondemo WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

USE jsondemo;

create type jsondemo.trip (
  origin text,
  dest text
);

create table jsondemo.holidays (
  "id" VARCHAR PRIMARY KEY,
  "title" TEXT,
  "season" TEXT,
  "date" date,
  "trips" list<frozen<trip>>);

2. Now create a DSE Search core against that table:

dsetool create_core jsondemo.holidays generateResources=true

Inserting JSON

A. Simple JSON String: Notice how we can feed our JSON directly to Apache Cassandra™ and DSE picks it up and indexes it. There is no need to do any preprocessing or exploding of the JSON string.

cqlsh> insert into jsondemo.holidays JSON '{"id":"1", "title":"First holiday ever", "season": "Xmas"}';
cqlsh> select * from jsondemo.holidays where solr_query='*:*';

 id | date | season | solr_query | title              | trips
----+------+--------+------------+--------------------+-------
  1 | null |   Xmas |       null | First holiday ever |  null

(1 rows)

B. Tuple JSON-like approach: When working with a tuple or a UDT, you can insert it using a JSON-like literal while keeping the rest of the fields as in standard CQL statements. This approach is useful when the rest of the row’s fields, besides the tuples/UDTs, are not available as JSON.

cqlsh> insert into jsondemo.holidays (id, title, season, trips) values ('2', 'Week in Barcelona', 'Easter', [{origin: 'London', dest:'Barcelona'}, {origin: 'Barcelona', dest:'London'}]);
cqlsh> select * from jsondemo.holidays where solr_query='*:*';

 id | date | season | solr_query | title              | trips
----+------+--------+------------+--------------------+--------------------------------------------------------------------------------
  1 | null |   Xmas |       null | First holiday ever |                                                                           null
  2 | null | Easter |       null |  Week in Barcelona | [{origin: 'London', dest: 'Barcelona'}, {origin: 'Barcelona', dest: 'London'}]

(2 rows)

C. Full JSON: Tuples/UDTs can also be inserted via the JSON syntax when all of your fields are available as JSON.

cqlsh> insert into jsondemo.holidays JSON '{"id":"3", "title":"Week in Miami", "season": "Summer holidays", "trips": [{"origin": "Barcelona", "dest": "Miami"}, {"origin": "Miami", "dest": "Barcelona"}]}';
cqlsh> select * from jsondemo.holidays where solr_query='*:*';

 id | date | season          | solr_query | title              | trips
----+------+-----------------+------------+--------------------+--------------------------------------------------------------------------------
  1 | null |            Xmas |       null | First holiday ever |                                                                           null
  2 | null |          Easter |       null |  Week in Barcelona | [{origin: 'London', dest: 'Barcelona'}, {origin: 'Barcelona', dest: 'London'}]
  3 | null | Summer holidays |       null |      Week in Miami |   [{origin: 'Barcelona', dest: 'Miami'}, {origin: 'Miami', dest: 'Barcelona'}]

(3 rows)

Querying for JSON

It is equally easy to get your results back as JSON.

cqlsh> select json * from jsondemo.holidays where solr_query='*:*';

 [json]
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                                                                       {"id": "1", "date": null, "season": "Xmas", "solr_query": null, "title": "First holiday ever", "trips": null}
    {"id": "2", "date": null, "season": "Easter", "solr_query": null, "title": "Week in Barcelona", "trips": [{"origin": "London", "dest": "Barcelona"}, {"origin": "Barcelona", "dest": "London"}]}
 {"id": "3", "date": null, "season": "Summer holidays", "solr_query": null, "title": "Week in Miami", "trips": [{"origin": "Barcelona", "dest": "Miami"}, {"origin": "Miami", "dest": "Barcelona"}]}

(3 rows)
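
Because the fields arriving as JSON are indexed like any other columns, you can also filter on them with Solr query syntax through solr_query. A quick sketch against the demo table above (the exact matching behavior depends on the schema that generateResources=true produced):

cqlsh> select id, title, season from jsondemo.holidays where solr_query='season:Easter';

Given the rows inserted earlier, this should return only the 'Week in Barcelona' holiday.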

Conclusions

There is no need to use FIT or other elaborate techniques to work with JSON in DSE. Complex approaches, such as storing the full JSON string in a single field and later exploding it into individual fields so each can be indexed, are not necessary. DSE provides out-of-the-box functionality that supports JSON in most cases.

DataStax Ruby Driver 3.1.0 Released!


The DataStax Ruby Driver 3.1.0 for Apache Cassandra has been released. This release introduces new features and improvements around resiliency, schema metadata, usability, and performance.

Resiliency to Transient Network Errors

When the control connection node notices that a node has gone down, it sends a host-down event to the client. Historically, the client has closed all connections to the down node and aborted running requests, under the presumption that those requests will fail anyway. However, this behavior was overly aggressive for the case where the control connection node could not communicate with the offending node, but the client still could. Starting with v3.1, the driver trusts its own knowledge about node availability, gauging it by the state of existing connections to it and the periodic heartbeat messages exchanged; thus a node will be considered down not when a host-down event message is received, but rather when all connections to it have been closed.

Schema Metadata Improvements

There are a few improvements in this area:

– Keyspace metadata now includes the collection of indexes defined in the keyspace.
– Table metadata now includes the collection of materialized views and collection of triggers associated with that table.
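
As a rough sketch of how you might reach the new metadata from a connected cluster object (the each_* accessor names below are assumptions for illustration; check the driver’s API documentation for the exact method names):

require 'cassandra'

cluster  = Cassandra.cluster
keyspace = cluster.keyspace('my_keyspace')               # keyspace name is hypothetical

keyspace.each_index { |index| puts index.name }          # assumed accessor for the keyspace's indexes

table = keyspace.table('my_table')                       # table name is hypothetical
table.each_materialized_view { |view| puts view.name }   # assumed accessor
table.each_trigger { |trigger| puts trigger.name }       # assumed accessor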

Usability

The cluster object now has getters for the node port and negotiated protocol version.

Execution Profiles

In addition, v3.1 introduces execution profiles, which allow the user to group sets of statement execution options into a reusable collection that can be referenced by name. Create and register execution profiles when initializing the cluster object:

      # Let's say we have a cluster with three nodes: 127.0.0.1, 127.0.0.2, 127.0.0.3.
      # Create two profiles, each of which locks us into either 127.0.0.1 or 127.0.0.2.
      include Cassandra::LoadBalancing::Policies
      include Cassandra::Retry::Policies
      profiles = {
          p1: Cassandra::Execution::Profile.new(load_balancing_policy: WhiteList.new(['127.0.0.1'], RoundRobin.new),
                                                timeout: 3,
                                                retry_policy: DowngradingConsistency.new),
          p2: Cassandra::Execution::Profile.new(load_balancing_policy: WhiteList.new(['127.0.0.2'], RoundRobin.new),
                                                timeout: 13,
                                                retry_policy: Fallthrough.new),
      }

      cluster = Cassandra.cluster(execution_profiles: profiles)
      session = cluster.connect

      # default profile first; since it's dc-aware / round-robin, all three nodes will be hit.
      puts 'Running with default profile'
      3.times do
        rs = session.execute('select rpc_address from system.local')
        puts rs.first['rpc_address'].to_s
      end

      # Now profile :p1
      puts 'Running with profile p1'
      3.times do
        rs = session.execute('select rpc_address from system.local', execution_profile: :p1)
        puts rs.first['rpc_address']
      end

      # Now profile :p2
      puts 'Running with profile p2'
      3.times do
        rs = session.execute('select rpc_address from system.local', execution_profile: :p2)
        puts rs.first['rpc_address']
      end

Profile names can be any object that can be a key in a hash, though they are typically symbols or strings. In this release, an execution profile may contain load-balancing-policy, retry-policy, consistency, and timeout attributes.

For more information, see the DataStax Ruby driver for Cassandra documentation.

Performance

This release adds a potential performance improvement regarding prepared statement execution. Prior to v3.1, the driver would cache the prepared statement ids that were produced by different nodes as statements were prepared. This cache was scoped by node and thus indicated which statements were known to be prepared on which nodes. However, in high-throughput applications it is often the case that a popular statement is naturally prepared on multiple nodes (as the statement is repeatedly executed on different nodes thanks to the round-robin nature of the load-balancing policy). Thus, even if one client has prepared a statement on node1, another is likely to have prepared it on node2 during the application lifetime.

To take advantage of this possibility, the prepared-statement cache in the driver is no longer host-scoped. When executing a query, if the statement has previously been prepared on any node, the driver will optimistically try to execute the previously prepared statement on the node chosen by the load balancing policy. If the node has indeed prepared the statement before, it executes successfully and the driver has saved one round-trip to the node as well as the effort of preparing the statement. If the target node does not have the statement prepared already, it rejects the request. At that point, the driver prepares the statement on that node and executes it, just as it has in the past.

Bug Fixes

As with most releases, v3.1 fixes a few defects that existed in prior versions.

RUBY-235

The execution-info in a result used to reset the retry-count whenever retrying the statement on a different node. This count is now accurate.

RUBY-255

Missing/invalid data in system.peers would cause a driver stack trace. The driver now ignores such peers.

RUBY-264

This is a regression introduced in v3.0 of the driver. Table metadata erroneously reported that the table used compact storage when it actually didn’t. This is particular to tables with compound primary keys and clustering columns.

Getting the driver

The new driver gems are available on rubygems.org, so just update your Gemfile, bundle install, and you’re all set.

For more code samples showcasing all of our features, check out our usage docs.

Enjoy!

DataStax Enterprise Ruby Driver 2.0.0 Released!


The DataStax Enterprise Ruby Driver v2.0 has been released. This release introduces graph execution profiles, leveraging the execution profile infrastructure recently introduced in the v3.1 Ruby driver for Apache Cassandra.

Graph Execution Profiles

Execution profiles allow the user to group sets of statement execution options into a reusable collection that can be referenced by name. Currently, an execution profile may contain load-balancing-policy, retry-policy, consistency, and timeout attributes. The DataStax Enterprise Ruby driver introduces the Dse::Graph::ExecutionProfile class to encapsulate these core execution profile attributes as well as graph options :graph_name, :graph_source, :graph_language, :graph_read_consistency, and :graph_write_consistency. In v1.0, these graph options were stored in the Dse::Graph::Options object. That class is no longer part of the public API, and users are encouraged to use graph execution profiles instead.

The DataStax Enterprise driver initializes three graph execution profiles by default:

  • :default_graph – used by default by Session#execute_graph*
  • :default_graph_system – useful when running system queries
  • :default_graph_analytics – useful for analytics queries

Define profiles when creating the cluster object. Profile names should be strings or symbols, though they can be any type of object that has a reliable hash-code. Ultimately the name becomes a key in a hash.

graph_options = Dse::Graph::Options.new(graph_source: 'a')
cluster = Dse.cluster(
  execution_profiles: {
    test: Dse::Graph::ExecutionProfile.new(graph_options: graph_options, timeout: 35)
  }
)
session = cluster.connect

# Run a statement with the default graph execution profile
result = session.execute_graph('g.V()')

# Run a statement with the test profile
result2 = session.execute_graph('g.V()', execution_profile: :test)

# Run a statement with the built-in analytics profile
result2 = session.execute_graph('g.V()', execution_profile: :default_graph_analytics)

You can still override execution behavior by specifying options in calls to execute_graph:

result2 = session.execute_graph('g.V()', timeout: 7, graph_name: 'test', execution_profile: :test)

For more information, see my blog post announcing the release of v3.1.0 of the Ruby driver for Cassandra and our DataStax Enterprise Ruby driver documentation.

Backward Compatibility

The Dse.cluster method still takes the same arguments as in v1.x, but it largely coalesces them into the :default_graph execution profile. The cluster-level graph_options object has been removed. The individual graph options in the :default_graph execution profile serve the same purpose.

Since execution profiles are not mutable, you must either define separate execution profiles for your different sets of graph options, or override behavior when calling Session#execute_graph.

Furthermore, the Dse::Graph::Options class is no longer in the public API. Instead of constructing a Dse::Graph::Options object and passing it to Session#execute_graph*, you must now pass the primitive graph options or specify the name of an execution profile that encapsulates the desired graph options. Similarly, when creating a Dse::Graph::Statement, you must specify primitive graph options instead of a Dse::Graph::Options object.

Default timeout behavior has also changed in v2.0. In v1.x, graph query timeout defaulted to unlimited. This caused queries to fall back to server timeouts. The default was set this way to accommodate multi-day analytics queries and multi-second OLTP queries without requiring intervention / special handling from the user. The three default execution profiles allow you to more elegantly specify the type of your query, and the settings in these profiles tend to be adequate for most purposes. In particular, each has a finite timeout as described above.

Getting the driver

The new driver gems (for both MRI and JRuby) are available on rubygems.org, so just update your Gemfile to include dse-driver, bundle install, and you’re all set.

Be aware that the DataStax Enterprise Ruby Driver is published under specific license terms that allow its usage solely in conjunction with DataStax Enterprise software.

Documentation for the DataStax Enterprise Ruby Driver can be found at the following locations:

Enjoy!

Preparing for the Leap Second, 2017 Jan 1 Edition


Back in April of 2015 we published a blog post, ‘Preparing for the Leap Second’. In July, IERS announced another leap second which will take place on January 1st 2017 at midnight.

From a civil time perspective, December 31st will have an extra second added to the end. From the Linux time system’s point of view, the last second of December 31st will repeat itself as another second with the same timestamp as the one previous is inserted at the end of the day.

Since the date of the aforementioned blog post, not too much has changed, with the exception of an evolution in some client drivers as it pertains to client timestamp behavior. The following repeats a lot of information from our previous article about the leap second and adds some additional details as they pertain to the drivers.

Those of you who were using Apache Cassandra or DataStax Enterprise — or even running other databases or applications under Linux — back in 2012 may have had problems when a leap second was added at the end of June. In this blog post, we’ll explain how things have changed since then, what we’ve done to anticipate other problems that may be caused by the leap second, and what you can do to prepare for it.

Livelock in Pre-3.4 Linux Kernel and Pre-7u60 JDK

As explained in Jonathan’s 2012 leap-second blog post, many of the failures that occurred in 2012 were caused by a bug in the Linux kernel that caused a livelock in the timer subsystem when the leap second was inserted. Luckily, a fix for that particular problem was applied to the kernel as part of version 3.4.

Determining if Your System is Affected

As an initial assessment, run uname -r to determine the version of the kernel you’re running. Kernel versions 3.4 and higher aren’t affected by the bug. For a more comprehensive assessment, and to demonstrate problems that can be caused by the kernel bug, the author of the bug fix wrote two programs that exercise the bug. These are useful diagnostic tools, but do not use them on production systems. They alter the host system’s clock and shouldn’t be run on systems currently in production or that contain data you want to keep.

  • This program can lock up kernels that still contain the bug.
  • This program, run with the -s option, will repeatedly insert leap seconds and check for any timing errors resulting from the insertion.

We’ve tested both of these programs on Ubuntu images on AWS and verified that they fail on systems with old kernels and succeed on newer ones. You may not see the expected failures on systems running under other forms of virtualization; for instance, we saw different timer-resetting behavior on images running under VirtualBox. If you’re a Red Hat Enterprise Linux user with a Red Hat account, Red Hat’s lab on the subject may be helpful. It assumes you use RHEL, but if you do, it can determine if your system is susceptible to the livelock without interacting with your system clock.

If you use RHEL 2.6 or higher, your system may be safe from kernel livelocks even on older kernels. There was a workaround applied to the kernel that prevents the livelock from causing problems, though it does not fix the underlying issue. See this bug report and this update report for more information.

Java-based applications like Cassandra were particularly affected by this kernel issue due to thread parking operations’ reliance on the CLOCK_REALTIME system clock. Recent versions of JDK 7 (7u60+) and all versions of JDK 8 include an enhancement (JDK-6900441) that uses CLOCK_MONOTONIC instead for these operations. CLOCK_MONOTONIC in the general case is not affected by system time changes, such as insertion of a leap second.

We were able to reproduce kernel lockups using pre-7u60 JDKs on pre-3.4 kernels. We have not yet seen a kernel lockup, even with older kernels, with JDK 7u60 and higher. Still, we strongly discourage using this as a workaround — if you are using a kernel older than 3.4, you are still at risk of a livelock in the kernel.

On newer kernel versions that do not demonstrate this issue, it still may be of value to be at a JDK level greater than or equal to 7u60, as time-sensitive operations will behave more correctly than in older versions.

Timestamp Behavior Over the Leap Second

Cassandra uses monotonically increasing timestamps as of 2.1.3. However, this monotonicity is maintained independently on each node. During an inserted leap second, each Cassandra node will still return timestamps greater than previous ones used even though the time was set back one second. The timestamps generated during the inserted leap second will be based off of the millisecond basis of 31 Dec 2016 23:59:59 (1483228799000XXX) until the second elapses.

For many applications, this interleaved ordering will not affect correct operation. In the case of an inserted leap second, you only need to be concerned if you expect to make multiple changes to a column value in a row during that second. If your application requires that values’ writetime order is the same as their wall-clock-time insertion order for changes to the same value within one second, you should make sure your strategy for ensuring that property holds also works during inserted leap seconds.
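
For example, one such strategy is to assign write timestamps explicitly from the application instead of relying on the coordinator’s clock. A minimal sketch (the keyspace, table, and values are hypothetical):

cqlsh> UPDATE my_ks.users USING TIMESTAMP 1483228800000001
   ... SET email = 'new@example.com' WHERE id = 42;

Writes that carry explicit, application-controlled timestamps keep their intended relative order regardless of what the node clocks do during the inserted leap second.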

Clock Sync Problems Around the Leap Second

Cassandra’s behavior depends on your cluster having well-synced clocks on all your servers. The timestamps on writes and deletes are, in most cases, generated by the coordinator node (though they can also be generated by the client, in the case of drivers like the Python driver that use protocol version 3). Thus, if clocks are out of sync, timestamps on writes that were coordinated by different nodes can be out of order.

Ensure that your servers are synchronized via NTP against the same set of time servers. Using external NTP pools carries some risks, however. NTP servers, such as those accessible as part of the ntp.org server pool, can be out of sync with one another or can be misconfigured to add leap seconds at the wrong time, or to not add scheduled leap seconds. If your nodes’ NTP clients use external servers directly, their clocks may drift as they independently compensate for upstream inconsistencies. You can avoid these problems by setting up your own NTP pool that will compensate for inconsistencies between upstream servers and provide consistent time to your nodes as clients.

Leap Seconds and DataStax Drivers

Like Cassandra, some client drivers are also susceptible to the kernel bug around leap seconds and timestamp generation issues.

Kernel Issue Impact

As the java-driver library runs on the JVM, it could, in theory, be susceptible to the kernel bug encountered in June 2012. In testing on kernel 2.6.35-32 with JDK 7u55, we found that no threads were susceptible to the leap second issue. However, since there may be other activities in an application running the java-driver, we strongly recommend upgrading your kernel to 3.4+ and also considering upgrading your JDK version to 7u60+.

The C++, Python, Ruby, and Node.js drivers were also tested on an older kernel version and did not demonstrate any lock up issues after a leap second was inserted. That being said, it is still strongly recommended that you consider upgrading to kernel 3.4+ as these tests were not comprehensive.

Leap Seconds and Client Timestamp Implementations

If you are using client timestamps you may run into issues similar to those described in the ‘Timestamp Behavior over the Leap Second’ section. In DataStax client drivers, there are three ways to enable client timestamps:

  1. Appending ‘USING TIMESTAMP timestamp’ to your CQL query. This is supported for all versions of Cassandra supporting CQL.
  2. Using the ‘set timestamp’ method on a Statement, for example setDefaultTimestamp in the DataStax Java Driver. This is only available for drivers supporting Cassandra 2.1 and running against Cassandra 2.1+ / DataStax Enterprise 4.7+ clusters.
  3. Using a timestamp generator. See the table below for the availability and behavior of timestamp generators per-driver.
Timestamp Generator Implementations by Driver

  • cpp – enabled by default: no (CPP-413); monotonic: partially (CPP-412); docs
  • csharp – enabled by default: no (CSHARP-516); monotonic: n/a. No timestamp generator implementation; a client timestamp may be provided on a per-statement basis via Statement.SetTimestamp at the user’s discretion.
  • java – enabled by default: < 3.0 no, >= 3.0 yes; monotonic: < 2.1.10 partially (JAVA-727), >= 2.1.10 yes, >= 3.0.0 partially (JAVA-727), >= 3.1.0 yes; docs
  • nodejs – enabled by default: no (NODEJS-322); monotonic: n/a. No timestamp generator implementation; a client timestamp may be provided on a per-execution basis via ClientOptions at the user’s discretion.
  • php – enabled by default: no (CPP-413); monotonic: partially (CPP-412); docs
  • python – enabled by default: < 2.1.0 no, >= 2.1.0 yes; monotonic: no, it is based off of time.time(), which is subject to system clock changes (PYTHON-676); docs
  • ruby – enabled by default: no (RUBY-284); monotonic: :simple uses Time::now, which is subject to system clock changes; :monotonic (TickingOnDuplicate) offers a fully monotonic implementation; docs

Note that client timestamps require protocol version 3 (introduced in C* 2.1 / DSE 4.7) and thus client timestamp generators will only be used when the driver is configured with protocol version 3 or greater.

Summary

In summary, to prepare for the upcoming leap second in January:

  • At a bare minimum, make sure you are running Apache Cassandra/DataStax Enterprise and its drivers on kernel version 3.4 or higher. We also recommend using JDK version 7u60 or higher. This should protect you from the livelock problems users experienced in 2012.
  • Determine if your application will be affected by out-of-order timestamps during the inserted leap second, and if it will, develop a strategy for preventing any problems.

.NET Core Support in DataStax C# Drivers


Starting from version 1.1 of the DataStax Enterprise C# Driver and 3.1 of the DataStax C# Driver for Apache Cassandra, we added support for .NET Core while maintaining support for .NET Framework 4.5 and above.

.NET Core is a new platform that includes a new runtime, libraries and tools. We are supporting .NET Core applications via .NET Platform Standard 1.5, a specific versioned set of reference assemblies that all .NET Platforms must support as defined in the Core Foundation Libraries (CoreFx).

We updated our CI builds (AppVeyor, Travis CI and Jenkins) to cover both runtimes on Windows and Linux. Integration tests run against different Apache Cassandra versions from 1.2 to 3.7 and DataStax Enterprise versions 4.8 and 5.0, on both Windows and Linux.

LZ4 compression is not yet supported on the .NET Core runtime, as the lz4net package does not yet support it (help needed!).

For more information, check out the changelog for the DSE Driver and changelog for the Apache Cassandra driver.

Promise Support in the DataStax Node.js Drivers


I’m excited to announce that we added support for promises in the latest version of the DataStax drivers, while continuing to support callback-based execution.

Alternate promise-based API

The use of promises has been spreading fast in the Node.js ecosystem in recent years. There are several reasons to use promises as opposed to callback-based execution, most notably:

  • Straightforward error handling
  • Simplified chaining of multiple async methods
  • Control-flow methods are included in the Promise API

A promise can be chained using the built-in then() method. In modern versions of JavaScript, promises can be awaited upon with the yield keyword in conjunction with generator functions (aka coroutines), or using the await keyword in async functions.

Note that we will continue to support callback-based execution and this new alternate API was added without introducing breaking changes to the existing functionality. Asynchronous methods in the driver expect a callback as the last parameter, according to the callback-style convention. If no callback is provided, the methods will return a Promise.

Let’s look at how to use the Node.js drivers with this new API:

const query = 'SELECT name, email FROM users WHERE key = ?';
client.execute(query, [ 'someone' ], { prepare: true })
  .then(result => console.log('User with email %s', result.rows[0].email));

Using async functions:

const result = await client.execute(query);
console.log('User with email %s', result.rows[0].email);

Besides CQL query executions, the drivers also support promises for metadata fetching methods, for example:

const tableMetadata = await client.metadata.getTable('ks1', 'table1');

Promises are created in the drivers using the Promise constructor by default. Alternatively, you can use your Promise library of choice by overriding promiseFactory when creating a Client instance. For example, using bluebird:

const Promise = require('bluebird');
const client = new Client({ 
  contactPoints: [ 'host1', 'host2' ],
  promiseFactory: Promise.fromCallback
});

Key things to remember:

  • We will continue to support callback based execution.
  • Asynchronous methods in the driver accept a callback as the last parameter, according to the callback-style convention.
  • When a callback is not provided, then the asynchronous methods will return a Promise.
  • The new alternative API for promises does not break any existing functionality.

Other new features

Along with Promise support, there are other noteworthy features in the version 1.2 of the DataStax Enterprise Node.js Driver and version 3.2 of the DataStax Node.js Driver for Apache Cassandra.

Timestamp Generator

When using Apache Cassandra 2.1+ or DataStax Enterprise 4.7+, it’s possible to send the operation timestamp in the request, as opposed to embedded in the query. The drivers now use MonotonicTimestampGenerator by default to generate the request timestamps.

You can provide a different generator when creating the Client instance:

const client = new Client({
  contactPoints: [ 'host1', 'host2' ],
  policies: {
    timestampGeneration: new MyCustomTimestampGenerator()
  }
});

As defined by ECMAScript, the Date object has millisecond resolution. The MonotonicTimestampGenerator uses an incremental counter to generate the sub-millisecond part of the timestamp until the next clock tick. The implementation also guarantees that the returned timestamps will always be monotonically increasing, even if multiple updates happen within the same millisecond. To guarantee such monotonicity, if more than one thousand timestamps are generated within the same millisecond, or in the event of a system clock skew, the implementation might return timestamps that drift out into the future. When this happens, the built-in generator logs a periodic warning message.

Query Idempotence Option

We added a query option that can be used to define whether it’s safe for the driver to apply the query multiple times without changing the result beyond the initial application.

Initially, this value can be retrieved at the retry policy level to determine whether the execution can be retried in case of a request error or write timeout. In future versions, the drivers will avoid the retry logic and directly rethrow the error in case of non-idempotent queries.
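
For example, a short sketch (the statement itself is hypothetical, and it assumes the option is exposed as isIdempotent in the query options; check the driver documentation for your version):

const query = 'UPDATE users SET email = ? WHERE key = ?';

// Marking the execution as idempotent tells the retry policy it is safe to retry it.
client.execute(query, [ 'someone@example.com', 'someone' ], { prepare: true, isIdempotent: true })
  .then(() => console.log('Write applied'));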

ResultSet sync iterator

We added a bit of syntactic sugar on top of the ResultSet prototype to allow synchronous iteration of the rows using the new for...of statement in ES2015 by exposing a Symbol.iterator method.

const result = await client.execute(query);
for (let row of result) {
  console.log(row.email);
}

Wrapping up

The new versions of the DataStax drivers are available on npm: dse-driver and cassandra-driver.

We would love to hear your comments, feedback, and questions:




Impact of Shared Storage on Apache Cassandra™


Every now and then we receive the question of why shared storage isn’t recommended for Apache Cassandra™.  The conversation usually goes like this:

Customer/User – “We have an awesome SAN and would like to use it for Cassandra.”
DataStax – “We don’t recommend shared storage for Cassandra.”
Customer/User – “Why not?”
DataStax – “Two reasons really.  One – performance suffers.  Two – shared storage introduces a single point of failure into the architecture.”
Customer/User – “Our SAN is awesome and has never had any down time and can perform a kagillion IOPS.  So why exactly shouldn’t we use shared storage?”

Hopefully, this blog post will provide some data points around shared storage and performance that will dissuade users from leveraging shared storage with Cassandra.

Single Point of Failure

There really isn’t anything to say about shared storage being a single point of failure.  If someone has a single shared storage device in their architecture and multiple Cassandra nodes are pointing at the shared storage device, then the shared storage device is a single point of failure.

Performance

Our Senior Evangelist likes to define performance as the combination of speed and stability.  For Cassandra, a lot of performance comes from the underlying disk system that is supporting Cassandra.  To put it plainly, performance in Cassandra is directly correlated to the performance of a Cassandra node’s disk system.

But why is that?  And, why can’t a super-awesome shared storage device keep up with Cassandra?

To answer the question of why, let’s take a look at the major (arguably) contributors to Cassandra disk io and measure throughput and latency.  We will then take a look at some real-world statistics that show the aggregated behavior of Cassandra on a shared storage device and the effects of shared storage with Cassandra.   All of these data points should show the reader that the load placed onto a storage device from a single node of Cassandra is large.  And, when multiple Cassandra nodes use the same storage device, the compounded effects from each individual node’s disk io overwhelm the shared storage device.

Cassandra Disk Pressure
Cassandra uses disks heavily, specifically during writes, reads, compaction, repair (anti-entropy), and  bootstrap operations.

  • Writes in Cassandra are performed using a log-structured storage model, i.e. sequential writes.  This allows Cassandra to ingest data much faster than traditional RDBMS systems.   The write path will put heavy io pressure on disks from Commit Log syncs as well as Memtable flushes to SSTables (disk).
  • Compaction is the process in Cassandra that enables good read performance by combining (merge-sorting) multiple SSTables into a single SSTable.  Since SSTables are immutable, this process puts a lot of pressure on disk io as SSTables are read from disk, combined and written back to disk.  There are two main types of compaction and each has different io impact.
  • Reads in Cassandra will take advantage of caching for optimization, but when they hit disk, they put pressure on the disk system.  Each read operation against an SSTable is considered a single disk seek.  Sometimes a read operation will be required to touch more than one SSTable, therefore will experience multiple disk io operations.  If caches are missed during read operations, then disk io is even heavier as more SSTables are accessed to satisfy the read operation.
  • Repairs are the process in Cassandra that ensures data consistency across nodes in a cluster.  Repairs rely on both compaction and streaming, i.e. bulk ingestion of data, to compare and correct data replicas between nodes.  Repair is designed to work as a background process and not impact the performance of production clusters.  But, repair puts some stress on the disk systems because both compaction and ingestion occur during the operation.
  • Bootstrapping a node is the process of on-boarding a new, or replacing a dead, node in Cassandra.   When the node starts, data is streamed to the new node which persists all data to disk.  Heavy compaction occurs during this process.  Thus, there is a lot of pressure put onto a disk system during the bootstrap operation.

The above list represents a subset of the disk intensive operations that occur within Cassandra.

What’s important to understand about all of the disk io operations in Cassandra is that they are not only “heavy” in terms of IOPS (a general rule of thumb is that Cassandra can consume 5,000 write operations per second per CPU core) but, arguably more importantly, they can also be very heavy in terms of throughput, i.e. MB/s.  It is very conceivable that the disk system of a single node in Cassandra would have to maintain disk throughput of at least 200 MB/s or higher per node.  Most shared storage devices tout IOPS but don’t highlight throughput as stringently.  Cassandra will put both high IOPS as well as high throughput, depending on the use case, on disk systems.  Heavy throughput is a major reason why almost all shared storage Cassandra implementations experience performance issues.

To reinforce this point, we ran a simple cassandra-stress test (quorum writes, replication factor 3, 100 million keys, 100 columns per partition, compaction throttling disabled, increased concurrent writers) on a 3-node EC2 cluster (m3.2xlarge nodes) and watched disk performance for a couple of hours via OpsCenter, sar, and iostat.

Here are some observations:

  • iostat – wMB/s as high as 300 with sustained loads well over 100
  • iostat – rMB/s (thanks to compaction) as high as 100 with sustained loads well over 50
  • OpsCenter – max disk utilization peak as high as 81% with average around 40%
  • sar -d – wr_sec/s as high as 224,506 with sustained loads around 200,000

This was a small, simple test, yet it shows how much load even modest operations place on the disk system.  Imagine that load amplified by a complex, real-world workload and a production-sized cluster (more than 3 nodes).  The compounded effect of these operations could easily overwhelm a shared storage device.  We’ve actually overheard storage vendors, though we won’t name names, recommend against running Cassandra on their devices.
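To put the sar write figures above in MB/s terms (sar reports 512-byte sectors), here is a quick conversion, which lines up with the sustained iostat numbers:

SECTOR_BYTES = 512
for sectors_per_sec in (224506, 200000):
    print(sectors_per_sec, "->", round(sectors_per_sec * SECTOR_BYTES / 1e6), "MB/s")
# 224506 -> ~115 MB/s at the peak, 200000 -> ~102 MB/s sustained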

Here’s a real-world example of the behavior of a shared storage device with a production Cassandra cluster. Recently, while on site with a customer who will remain anonymous to protect the innocent, we collected several data points that highlight a typical shared storage environment.  The statistics collected during this visit are representative of what we usually observe when shared storage is used for a production Cassandra system.

shared-storage-1
The metrics shown here were collected with sar but match what iostat reports.

As the table shows, we sampled device io every 10 minutes and filtered the results to two 40-minute windows.  The table captures strikingly poor disk performance caused by the use of shared storage.  Yes, that is a 28, almost 29, second wait. Cassandra actually considered this node “down” because it was unresponsive during the high-wait periods.  And this load is minimal compared to what we were able to produce with cassandra-stress.

Performance Issues
When users choose to run Cassandra on shared storage devices, they should expect any number of performance issues.  The following list highlights a few of the most likely:

  • Atrocious read performance
  • Potential write performance issues
  • System instability (nodes appear to go offline and/or are “flapping”)
  • Client side failures on read and/or write operations
  • Flushwriters are blocked
  • Compactions are backed up
  • Nodes won’t start
  • Repair/Streaming won’t complete

Conclusion
There is one flavor of shared storage that we have seen used somewhat successfully.  In environments where virtualization is used, locally attached storage that is shared across local virtual machines isn’t “so” bad.  This is similar in concept to ephemeral storage in AWS.

Regardless of the channel, cable, or other fancy features it may have, a shared storage device will not be able to keep up with the io demand Cassandra places on it.

Simply put, shared storage cannot keep up with the disk io Cassandra generates.  It’s not recommended.  Don’t use shared storage with Cassandra, and be happier for it.

Impact of Shared Storage on Cassandra was created by Jonathan Lacefield, Sr. Product Manager at DataStax.

Tell us how you’re using (or planning to use) DSE Advanced Security


I’d love to get your feedback on how you’re either using or planning to use DSE Security. Thanks in advance for the help!

Python Driver 3.8.0 Released


Today we released version 3.8.0 of the DataStax Python driver for Apache Cassandra. This release is primarily a bugfix release with no specific area of focus. You can find links to all tickets we addressed in the CHANGELOG, but I’ll describe the highlights here.

Ending Python 2.6 Support

Support for Python 2.6 ended in 2013, and many tools have dropped support for it since then. In the time since we released version 3.7 of the driver, pip has also stopped supporting Python 2.6. Rather than maintain test infrastructure for an unsupported interpreter, we chose to drop Python 2.6 support.

Monotonic Timestamps by Default

The driver now guarantees that timestamps generated from sessions belonging to a given `cassandra.cluster.Cluster` object increase monotonically. Before this change, client timestamps could be generated out of order when there were discontinuities in time from the system clock.

This may reduce performance in some applications, as there can be lock contention if multiple sessions need to generate timestamps. If your application can tolerate out-of-order or identical timestamps, you can instead set a custom timestamp generator, as described in the documentation for cassandra.cluster.Cluster.timestamp_generator.
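As a minimal sketch of plugging in your own generator (the generator must be a callable returning an integer count of microseconds since the Unix epoch; the contact point below is a placeholder):

import time
from cassandra.cluster import Cluster

def wall_clock_timestamps():
    # Non-monotonic generator for apps that can tolerate out-of-order timestamps.
    return int(time.time() * 1e6)   # microseconds since the epoch

cluster = Cluster(['127.0.0.1'], timestamp_generator=wall_clock_timestamps)
session = cluster.connect()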

Added Replica-Shuffling Option to TokenAwarePolicy

The TokenAwarePolicy initializer now takes a shuffle_replicas option. This feature is off by default, but if enabled, local replicas will be queried in random order. See the documentation for a description of the API.
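Here is a minimal sketch of enabling it (the contact point and data center name are placeholders):

from cassandra.cluster import Cluster
from cassandra.policies import TokenAwarePolicy, DCAwareRoundRobinPolicy

# Query local replicas in random order rather than always in the same order.
policy = TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc='DC1'),
                          shuffle_replicas=True)
cluster = Cluster(['127.0.0.1'], load_balancing_policy=policy)
session = cluster.connect()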

Note that the performance impact of replica shuffling depends on your workload and cluster topology. For some workloads it may improve performance by distributing queries across more nodes; for others it can hurt performance by reducing the number of cache hits on Cassandra nodes.

Thanks

Many thanks to all who contributed code, wrote documentation, made feature requests, reported bugs, and educated fellow community members. We encourage you to stay involved:

Large Graph Loading Best Practices: Strategies (Part 1)

This post is an intro to DSE Graph with a focus on the strategies that should be used to load large graphs with billions of vertices and edges. For those familiar with DSE Graph and large graph strategies, or those who want to dive directly into loading data, proceed to the next post in this two-part series, entitled Large Graph Loading Best Practices: Tactics.

Intro to DSE Graph

DSE Graph is differentiated from other graph databases by building on DataStax Enterprise’s scalability, replication, and fault tolerance.

Note – To understand how DSE Graph data is stored in DSE’s Apache Cassandra(TM) storage engine, check out Matthias and Marko‘s posts on the matter.

When folks ask me where DSE Graph falls in the greater database / graph database landscape, I use this image to communicate the combination of scalability and value in relationships that make DSE Graph such a unique product:

landscape

DSE Graph is positioned on the right side of the chart, where relationships are most valuable, and toward the top of the chart thanks to the scalability it inherits from DSE and Cassandra. The third key aspect that differentiates DSE Graph is the velocity of data it can support. Unlike analytical graph engines, which load a static graph into memory and then crunch it for insights, an operational graph is constantly changing as the real-world entities and relationships it represents are created, updated, and deleted.

Key Takeaway – DSE Graph is designed as a real-time, operational, distributed graph database.

Motivation and Goals: Playing with Scalable Graphs

If you have a distributed graph problem, you may want to bulk load your data into DSE Graph and start querying. However, loading significant amounts of data (>1 billion V’s or E’s) into graph databases is a time-consuming, nontrivial task. The purpose of this article is to summarize some key design considerations for dealing with large graphs.

Large graphs, idempotence, and scalability

Idempotence is a common concept in distributed systems design. If an operation is idempotent, it can be repeated over and over and still yield the same result. We use idempotence to work around problems like the fact that exactly-once delivery does not exist, and it also greatly simplifies the design of our systems, minimizing bugs and promoting maintainability. For the purposes of this two-part article series, we are going to focus on building scalable, distributed, idempotent graphs. This is one of the design choices DSE Graph supports, but note that not all graphs built on DSE Graph will have idempotent vertices and idempotent edges.
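As a plain-Python illustration of the idea (conceptual code only, not DSE Graph API): writes keyed by a caller-supplied id behave like upserts, so replaying the same input yields the same graph, whereas unkeyed appends do not.

vertices = {}

def upsert_vertex(custom_id, properties):
    # Keyed write: running this twice with the same id changes nothing.
    vertices[custom_id] = properties

upsert_vertex(('plant-1', 'sensor-42'), {'type': 'temperature'})
upsert_vertex(('plant-1', 'sensor-42'), {'type': 'temperature'})
assert len(vertices) == 1   # no duplicate vertex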

Idempotent Vertices

DSE Graph allows two types of vertices: 1) those with system-generated ids and 2) those with custom ids. For the purposes of this two-part series we are going to concentrate on custom ids. Custom ids are useful for large graph problems because they allow developers to take graph partitioning into their own hands, and they will feel familiar if you have used DSE or Cassandra and understand data modeling.

You configure the partition key of your Vertex label with a DDL operation:

schema().vertexLabel('MachineSensor').partitionKey('manufacturing_plant_id').clusteringKey('sensor_id').create()

If you are using custom ids, the partition key is required and the clustering key is optional. For more on Cassandra data modeling and clustering keys vs. partition keys see my post on data modeling for DSE.

Note – your partition key / clustering key combination should uniquely identify the vertex. With this configuration, reinserting a vertex will not generate duplicates.

Idempotent Edges

DSE Graph edges support different cardinality options. For multiple-cardinality edges (where there can be more than one edge with the same label between the same two vertices), edge creation is not idempotent.

For the purposes of this two-part article series, we will focus on single-cardinality (and therefore idempotent) edges. You can create single-cardinality edge labels in DSE Graph using the single() keyword:

schema.edgeLabel('has_sensor').single().create()

Let’s load!

Having considered the strategies mentioned above, let’s proceed to the second part, which addresses the tactical aspects of loading large graphs.

Large Graph Loading Best Practices: Tactics (Part 2)

The previous post introduced DSE Graph and summarized some key considerations related to dealing with large graphs. This post aims to:
  1. describe the tooling available to load large data sets to DSE Graph
  2. point out tips, tricks, and key learnings for bulk loading with DSE Graph Loader
  3. provide code samples to simplify your bulk loading process into DSE Graph

Tooling

DSE Graph Loader (DGL) is a powerful tool for loading graph data into DSE Graph. As shown in the marchitecture diagram below, the DGL supports multiple data input sources for bulk loading and provides great flexibility for manipulating data on ingest through the custom Groovy data mapping scripts that map source data to graph objects. See the DataStax docs, which cover the DGL, its API, and DGL mapping scripts in detail.

landscape

This article breaks down the tactics for efficient data loading with DGL into the following areas:

  • file processing best practices
  • mapping script best practices
  • DGL configuration

Code and Tactics

Code to accompany this section can be found at:

https://github.com/phact/rundgl

Shout out to Daniel Kuppitz, Pierre Laporte, Caroline George, Ulisses Beresi, Bryn Cooke among others who helped create and refine this framework. Any bugs / mistakes are mine.

The code repository consists of:

  • a wrapper bash script that is used for bookkeeping and calling DGL
  • a mapping script with some helpful utilities and a generic structure that can be used as a starting point for your own custom mapping scripts
  • a set of analysis scripts that can be used to monitor your load

The rest of this article focuses on DGL loading best practices, linking to specific code in the repo for clarity.

File Bucketing

The main consideration when loading large data volumes is that DGL performance will suffer if it is fed too much data in a single run.  At the time of this article (2/28/2017), the DGL has features designed for non-idempotent graphs (including deduplication of vertices via an internal vertex cache) that limit its performance on large idempotent graphs.

Splitting your load files into chunks of roughly 120 million vertices or fewer ensures that the DGL does not lock up when its vertex cache saturates.

The rundgl script found in the repo is designed to traverse a directory and bucket files, feeding them to DGL a bucket at a time to maximize performance.
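The equivalent bucketing logic, sketched in Python for illustration (the real implementation is the bash wrapper in the repo; the directory path and bucket size below are placeholders):

import os

def buckets(input_dir, bucket_size):
    # Walk the input directory and yield fixed-size lists of file paths,
    # so each DGL run sees a bounded amount of data.
    files = sorted(os.path.join(root, name)
                   for root, _, names in os.walk(input_dir)
                   for name in names)
    for i in range(0, len(files), bucket_size):
        yield files[i:i + bucket_size]

for bucket in buckets('/data/graph-input', bucket_size=50):
    print(len(bucket), 'files in this bucket')   # one DGL invocation per bucket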

Track your progress

The analysis directory in the rundgl repo contains a script called chart.  This script aggregates statistics from the DGL loader.log and generates throughput numbers and charts for the different loading phases that have occurred (Vertices, Edges, and Properties).

Note – these scripts have only been tested with DGL < 5.0.6

Navigate to the analysis directory and run ./charts to get a dump of the throughput for your job:

analysis

It will also start a simple HTTP server in the directory for easy access to the PNG charts it generates. Here is an example chart output:

total throughput

Thank you, Pierre, for building and sharing the analysis scripts.

Monitor DGL errors

When the DGL gets timeouts from the server, it does not log them to STDOUT; they can only be seen in logger.log.  In a busy system it is normal to see timeouts, and they will be handled by retry policies baked into the loader.  Too many timeouts, however, may be a sign that you are overwhelming your cluster and need to either slow down (reduce loader threads) or scale out (add DSE nodes).  You will know timeouts are affecting your cluster if overall throughput starts trending down or if you see backed-up thread pools or dropped mutations in OpsCenter.
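A rough way to keep an eye on this is to count timeout entries per run. The sketch below assumes timeouts show up in logger.log with “Timeout” somewhere in the message; check the exact wording for your DGL version and adjust the marker accordingly.

def count_timeouts(log_path='logger.log', marker='Timeout'):
    # Count log lines mentioning the (assumed) timeout marker.
    with open(log_path, errors='ignore') as log:
        return sum(1 for line in log if marker in line)

print('timeouts so far:', count_timeouts())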

Aside from timeouts, you may also see errors in the DGL log caused by bad data or by bugs in your mapping script.  If enough of these errors occur, the job will stop.  To avoid having to restart from the beginning on data-related issues, take a look at the bookkeeping section below.

Don’t use S3

If you are looking to load significant amounts of data, do not use S3 as a source, for performance reasons.  It will take less time to parallel rsync your data from S3 to a local SSD and then load it than to load directly from S3.

The DGL does have S3 hooks and, from a functionality perspective, they work quite well (including AWS authentication), so if you are not in a hurry the repo also includes an example of pulling data from S3.  Just be aware of the performance overhead.

Groovy mapper best practices

Custom Groovy mapping scripts can be error-prone, and the DGL’s error messages sometimes leave a bit to be desired.  The provided framework aims to simplify the loading process by giving you a standard structure for all your mapping scripts and by minimizing the amount of logic that goes into them.

Use the logger

DGL mapping scripts can use the log4j logger to log messages. This can be very useful when troubleshooting issues in a load, especially issues that only show up at runtime with a particular file.

It also allows you to track when particular loading events occur during execution.

INFO, WARN, and ERROR messages will be logged to logger.log and will include a timestamp.

DGL takes arguments

If you need to pass an argument to the Groovy mapper, just pass it with -<argname> on the command line.
The variable argname will then be available in your mapping script.

For example, ./rundgl passes -inputfilename to DGL here leveraging this feature. You can see the mapping script use it here.

The inputfilename argument gives the DGL a list of files to process.  This helps us avoid complex directory traversals in the mapping script itself.

By traversing files in the wrapper script, we are also able to do some bookkeeping.

Bookkeeping

loadedfiles.txt tracks the start and end of your job as well as the list of files that were loaded and when each particular load completed. This also enables us to “resume” progress by modifying the STARTBUCKET.

STARTBUCKET represents the first bucket that ./rundgl will process.  If you stopped your job and want to continue where you left off, count the number of files in loadedfiles.txt and divide by BUCKETSIZE; this gives you the bucket you were on.  Starting from that bucket ensures you don’t miss any files, and since we are working with idempotent graphs, we don’t have to worry about duplicates.
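A minimal sketch of that calculation (the bucket size is a placeholder, and the line counting assumes one completed file per line in loadedfiles.txt, so adapt it to the file’s actual format):

BUCKETSIZE = 50   # hypothetical; match the value in your rundgl configuration

def resume_bucket(path='loadedfiles.txt', bucket_size=BUCKETSIZE):
    with open(path) as f:
        loaded_files = sum(1 for line in f if line.strip())
    return loaded_files // bucket_size

print('set STARTBUCKET to', resume_bucket())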

DGL Vertex Cache Temp Directory

Especially when using EBS, make sure to change the default location of the DGL vertex cache temp directory.

IO contention against the mapdb files the DGL uses for its internal vertex cache will overwhelm an Amazon instance’s root EBS partition, so point the JVM temp directory at a faster data volume:

-Djava.io.tmpdir=/data/tmp

Leave search for last

If you are working on a graph schema that uses DSE Search indexes, you can optimize for overall load + indexing time by loading the data without the search indexes first and then creating the indexes once the data is in the system.

Some screenshots

Here are some screenshots from a system using this method to load billions of vertices.  The spikes are the individual DGL runs kicked off in sequence by the rundgl wrapper script.  You can see that load and performance are steady throughout, giving us dependable throughput.
As with other DSE workloads, if you need more speed, scale out!

Throughput:

throughput

OS Load:

osload

Conclusion

With the tooling, code, and tactics in this article you should be ready to load billions of V’s, E’s, and P’s into DSE Graph. The ./rundgl repo is there to help with error handling, logging, bookkeeping, and file bucketing so that your loading experience is smooth. Enjoy!
