RDF meets Solr: S-P-O-C Faceting

SolRDF is a Solr extension for indexing and searching RDF data.
In this post you can find a brief guide on how to set it up in a couple of minutes.

Once everything is set up and running, SolRDF can index and query RDF data. Since it runs on top of a full-text search engine, is it possible to combine some cool Solr features with SPARQL results?

The underlying idea is this: the serialisation of SPARQL results is described in several W3C documents and, being a standard, it cannot be changed.

The SPARQL results should therefore be wrapped in a higher-level response, so that additional information such as metadata and facets can be embedded alongside them.

The Solr query response sounds perfect for this goal: it’s just a matter of replacing the content of the <result> section with a <sparql> XML document. Note that I’m specifically talking about the XML response writer, because it is the only writer supported at the moment.

A query like this

/sparql?facet=true&facet.field=p&q=SELECT * WHERE {?s ?p ?o}
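The query above is shown unencoded for readability. When sent from a client, the spaces, braces and question marks in the q parameter need percent-encoding. The following is a minimal sketch of building such a URL in Python; the base URL (host, port and core name) is a hypothetical placeholder and not something taken from SolRDF itself.

```python
from urllib.parse import urlencode

# Hypothetical base URL: host, port and core name are assumptions for illustration.
BASE_URL = "http://localhost:8080/solr/store"


def hybrid_query_url(sparql, facet_field, base_url=BASE_URL):
    """Build a /sparql request URL with faceting enabled.

    urlencode percent-encodes every parameter value, so the SPARQL
    query can be passed as a plain string.
    """
    params = [
        ("facet", "true"),
        ("facet.field", facet_field),
        ("q", sparql),
    ]
    return f"{base_url}/sparql?{urlencode(params)}"


url = hybrid_query_url("SELECT * WHERE {?s ?p ?o}", "p")
print(url)
```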

produces the following response (note the mix between SPARQL and Solr results):

<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">31</int>
        <str name="query">SELECT  * WHERE{ ?s ?p ?o}</str>
    </lst>
    <result name="response" numFound="3875" start="100">
        <sparql>
            <head>
                <variable name="s"/>
                <variable name="p"/>
                <variable name="o"/>
            </head>
            <results>
                <result>
                    <binding name="s">
                        <uri>http://example/book2</uri>
                    </binding>
                    ...
                </result>
                ...
            </results>
        </sparql>
    </result>
    <lst name="facet_counts">
        <lst name="facet_queries"/>
        <lst name="facet_fields">
            <lst name="p">
              <int name="&lt;http://example.org/ns#price&gt;">231</int>
              <int name="&lt;http://purl.org/dc/elements/1.1/creator&gt;">1432</int>
              <int name="&lt;http://purl.org/dc/elements/1.1/title&gt;">2212</int>
            </lst>
        </lst>
        <lst name="facet_dates"/>
        <lst name="facet_ranges"/>
    </lst>
</response>
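A client has to read two things from this response: the SPARQL bindings and the Solr facet counts. Here is a minimal sketch of doing that with Python's standard library. It runs on a trimmed-down sample with illustrative values, and it assumes the <sparql> element carries no XML namespace, as in the listing above; a real response may declare the SPARQL results namespace, in which case the paths would need adjusting. The function name parse_hybrid is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample shaped like the response above; values are illustrative.
SAMPLE = """<response>
  <result name="response" numFound="3875" start="100">
    <sparql>
      <head><variable name="s"/><variable name="p"/><variable name="o"/></head>
      <results>
        <result>
          <binding name="s"><uri>http://example/book2</uri></binding>
        </result>
      </results>
    </sparql>
  </result>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="p">
        <int name="&lt;http://example.org/ns#price&gt;">231</int>
        <int name="&lt;http://purl.org/dc/elements/1.1/title&gt;">2212</int>
      </lst>
    </lst>
  </lst>
</response>"""


def parse_hybrid(xml_text):
    """Return (rows, facets) extracted from a hybrid response."""
    root = ET.fromstring(xml_text)

    # SPARQL part: one dict per <result>, mapping variable name to its value.
    rows = []
    for result in root.findall("result/sparql/results/result"):
        row = {}
        for binding in result.findall("binding"):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = (value.text or "").strip()
        rows.append(row)

    # Solr part: facet field name -> {facet value: count}.
    facets = {}
    path = "lst[@name='facet_counts']/lst[@name='facet_fields']/lst"
    for field in root.findall(path):
        facets[field.get("name")] = {
            item.get("name"): int(item.text) for item in field.findall("int")
        }
    return rows, facets


rows, facets = parse_hybrid(SAMPLE)
print(rows)
print(facets)
```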

What triggers that hybrid search? The underlying rule is actually very simple:

  • if the query string contains only a q parameter, the plain SPARQL endpoint executes the query and returns a standard SPARQL results response;
  • if the query string also contains other parameters (at the moment only facet, facet.field, rows and start are considered), a hybrid search is executed and the results are returned in the mixed format shown above.
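The rule can be sketched as a small predicate over the request parameters. This is only an illustration of the decision described in the two points above, not SolRDF's actual code; the names HYBRID_PARAMS and is_hybrid_search are hypothetical.

```python
# Parameters the post names as triggering the hybrid mode.
HYBRID_PARAMS = {"facet", "facet.field", "rows", "start"}


def is_hybrid_search(params):
    """Return True when the request carries any hybrid-mode parameter besides q."""
    return any(name in HYBRID_PARAMS for name in params if name != "q")


# Only q: handled as a plain SPARQL query.
print(is_hybrid_search({"q": "SELECT * WHERE {?s ?p ?o}"}))

# q plus faceting parameters: handled as a hybrid search.
print(is_hybrid_search({"q": "SELECT * WHERE {?s ?p ?o}",
                        "facet": "true",
                        "facet.field": "p"}))
```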
Andrea Gazzarini

Andrea Gazzarini is a curious software engineer, mainly focused on the Java technology. He strongly loves coding and definitely likes to be considered a developer. Andrea has more than 15 years of experience in various software engineering areas, from telecommunications to banking. He has worked for several medium- and large-scale companies, such as IBM and Orga Systems. Andrea has several certifications in the Java programming language (programmer, developer, web component developer, business component developer, and JEE architect), BEA products (build and portal solutions), and Apache Solr (Lucid Apache Solr/Lucene Certified Developer).
