SPARQL Integration tests with SolRDF

In 2014 I had the chance to contribute to a wonderful project, CumulusRDF, an RDF store on a cloud-based architecture. Building the project's integration test suite was definitely a challenging task.

I used JUnit to run some of the examples from Learning SPARQL by Bob DuCharme (O’Reilly, 2013). Both O’Reilly and the author (BTW, thanks a lot) gave me permission to use them in the project.

So, when I set up the first prototype of SolRDF, I wondered how I could create a complete integration test suite that did more or less the same thing… and I came to the obvious conclusion that part of that work could be reused.

Something had to change, mainly because CumulusRDF uses Sesame as its underlying RDF framework, while SolRDF uses Jena. In the end it was a minor change: both frameworks are valid, easy to use and powerful.

So, in LearningSparql_ITCase there are:

  • A setup method for loading the example data;
  • A teardown method for cleaning up the store.

The example data is provided on the Learning SPARQL website in several files. Each file contains a small dataset, a query, or an expected result (in tabular format). Coming back to the test suite, each test should load a small dataset X, run a query Y and verify the results Z.

Although this post illustrates how to load a sample dataset in SolRDF, that is something you do from the command line, not in a JUnit test. Instead, using Jena, we can automate the data loading into SolRDF with these few lines:

// DatasetAccessor provides access to
// remote datasets using SPARQL 1.1 Graph Store HTTP Protocol.
// createHTTP takes the URI of the Graph Store endpoint.
DatasetAccessor dataset = DatasetAccessorFactory.createHTTP(...);

// Load a local memory model
Dataset memoryDataset = DatasetFactory.createMem();
Model memoryModel = memoryDataset.getDefaultModel(); 
memoryModel.read(dataURL, ...);

// Load the memory model in the remote dataset
dataset.add(memoryModel);

Great, the data has been loaded! In another post I will explain what I did in SolRDF to support the SPARQL 1.1 Graph Store HTTP Protocol.
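As a rough illustration of what that add call amounts to on the wire, the sketch below builds (without sending) the equivalent Graph Store Protocol request using the JDK's java.net.http API: a POST to the endpoint with the default parameter, carrying the RDF as the request body. The class and method names, the Turtle content type and whatever endpoint URL is passed in are assumptions made for the example; this is not code from SolRDF or Jena.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, for illustration only.
class GraphStoreRequests {

    // Builds a request that merges the given Turtle document
    // into the default graph of a Graph Store endpoint.
    static HttpRequest addToDefaultGraph(String graphStoreUrl, String turtle) {
        return HttpRequest.newBuilder()
                // "?default" addresses the default graph
                .uri(URI.create(graphStoreUrl + "?default"))
                // Assumption: the payload is serialized as Turtle
                .header("Content-Type", "text/turtle")
                // POST means "add to the graph"; PUT would replace it
                .POST(HttpRequest.BodyPublishers.ofString(turtle))
                .build();
    }
}
```

The request could then be sent with java.net.http.HttpClient; in the test suite Jena's DatasetAccessor takes care of all of this.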

It’s time to run some queries and check the corresponding results. As you can see, each test executes the same query twice: first against a memory model, then against SolRDF. In this way, taking the behaviour of the Jena memory model as the ground truth, each test can compare the results coming from SolRDF with it:

final Query query = QueryFactory.create(readQueryFromFile(...));
QueryExecution execution = null;
QueryExecution memExecution = null;  
try {
  execution = QueryExecutionFactory.sparqlService(SOLRDF_URL, query);
  memExecution = QueryExecutionFactory.create(query, memoryDataset);
  
  ResultSet rs = execution.execSelect();
  ResultSet mrs = memExecution.execSelect();
  assertTrue(ResultSetCompare.isomorphic(rs, mrs));
} catch (...) {
  ...
} finally {
  // Close executions
  if (execution != null) { execution.close(); }
  if (memExecution != null) { memExecution.close(); }
}
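To make the "ground truth" comparison a bit more concrete, here is a minimal sketch in plain Java, with no Jena involved. It treats each result as a bag of rows and considers two results equal when they contain the same rows the same number of times, regardless of order. The class and method names are hypothetical, and this is only a simplified stand-in: Jena's ResultSetCompare.isomorphic is more thorough (for instance, it accounts for blank node labels).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class RowBagCompare {

    // Counts how many times each row occurs in the result.
    static Map<List<String>, Integer> toBag(List<List<String>> rows) {
        Map<List<String>, Integer> bag = new HashMap<>();
        for (List<String> row : rows) {
            bag.merge(new ArrayList<>(row), 1, Integer::sum);
        }
        return bag;
    }

    // True when both results hold the same rows with the same
    // multiplicity; the order of the rows does not matter.
    static boolean sameRows(List<List<String>> expected,
                            List<List<String>> actual) {
        return toBag(expected).equals(toBag(actual));
    }
}
```

In a test, the rows coming from the memory model would play the role of expected and those coming from SolRDF the role of actual.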

Last but not least, the RDF store needs to be cleared after each test. Although the Graph Store protocol would be very useful for this purpose, that part of it cannot be implemented in Solr because some HTTP methods (namely PUT and DELETE) cannot be used in RequestHandlers: Solr allows those methods only for /schema and /config requests. So while a clean-up could easily be done using something like this:

dataset.deleteDefault();

Or, in HTTP:

DELETE /rdf-graph-store?default HTTP/1.1 
Host: example.com

it is not possible to implement it, so the only remaining approach is to do it the plain Solr way:

SolrServer solr = new HttpSolrServer(SOLRDF_URI);
solr.deleteByQuery("*:*");
solr.commit();

That has nothing to do with RDF or with the Graph Store protocol, but for this specifically test-scoped purpose it sounds like a good compromise.

That’s all! I just merged all of this into master, so feel free to have a look. If you want to run the integration test suite, you can do that from the command line:

> cd $SOLRDF_HOME
> mvn clean install

or in Eclipse, using the predefined Maven launch configuration solrdf/src/dev/eclipse/run-integration-test-suite.launch. Just right-click on that file and choose “Run as…”.

After starting the suite, you should see messages like these:


...
(build messages)
...
[INFO] ---------------------------------------------------
[INFO] Building Solr RDF plugin 1.0
[INFO] ---------------------------------------------------
...
(unit tests)
...
[INFO] ---------------------------------------------------
[INFO] TESTS
[INFO] ---------------------------------------------------
...
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.691 sec
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0
...
(cargo section. It starts the embedded Jetty)
...
[INFO] [beddedLocalContainer] Jetty 7.6.15.v20 Embedded starting...
...
[INFO] [beddedLocalContainer] Jetty 7.6.15.v20 Embedded started
...
(integration tests section)
...
[INFO] ---------------------------------------------------
[INFO] TESTS
[INFO] ---------------------------------------------------
...
Running org.gazzax.labs.solrdf.integration.LearningSparql_ITCase
[INFO]  Running Query with prefixes test...
[INFO]  [store] webapp=/solr path=/rdf-graph-store params={default=} status=0 QTime=712
...
[DEBUG] : Query type 222, incoming Accept header...
...
[INFO]  [store] Closing main searcher on request.
...
[INFO] [beddedLocalContainer] Jetty 7.6.15.v20140411 Embedded is stopped
[INFO] --------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] --------------------------------------------------
[INFO] Total time: 42.302s
[INFO] Finished at: Tue Feb 10 18:19:21 CET 2015
[INFO] Final Memory: 39M/313M
[INFO] --------------------------------------------------
Andrea Gazzarini

Andrea Gazzarini is a curious software engineer, mainly focused on the Java technology. He strongly loves coding and definitely likes to be considered a developer. Andrea has more than 15 years of experience in various software engineering areas, from telecommunications to banking. He has worked for several medium- and large-scale companies, such as IBM and Orga Systems. Andrea has several certifications in the Java programming language (programmer, developer, web component developer, business component developer, and JEE architect), BEA products (build and portal solutions), and Apache Solr (Lucid Apache Solr/Lucene Certified Developer).
