Apache Solr: Loading Data at Startup

SolrEventListener is an interface that defines a set of callbacks on several lifecycle events:

void postCommit()
void postSoftCommit()
void newSearcher(SolrIndexSearcher newSearcher, SolrIndexSearcher currentSearcher)

For this example, I’m not interested in the first two callbacks because, as their names suggest, they are invoked after hard and soft commit events.
The interesting method is instead newSearcher(…), which is the callback invoked for listeners registered on either of these two events:

firstSearcher
newSearcher

In Solr, the Index Searcher which serves requests at a given time is called the current searcher. At startup time there’s no current searcher yet, because the first one is still being created; hence we are in the “firstSearcher” event, which is exactly what I was looking for 😉

When another (i.e., new) searcher is opened, it is prepared (i.e., auto-warmed) while the current one still serves the incoming requests. When the new searcher is ready, it becomes the current searcher and handles any new search requests; the old searcher is closed as soon as all the requests it was servicing have finished. This scenario is where the “newSearcher” callback is invoked.

As you can see, the callback method for those two events is the same; there are no separate “firstSearcher” and “newSearcher” methods. The difference resides in the input arguments: for “firstSearcher” events there’s no current searcher, so the second argument is null; for “newSearcher” callbacks both the first and the second argument contain a valid searcher reference.
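The argument pattern described above can be sketched in a few lines of plain Java. This is only an illustration, not how Solr is implemented internally: `Searcher` and `Listener` are stand-in types (not the real `SolrIndexSearcher` and `SolrEventListener`) so that the snippet compiles without Solr on the classpath, and the names `SearcherLifecycleSketch`, `open` and `eventFor` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class SearcherLifecycleSketch {

    /** Stand-in for SolrIndexSearcher; NOT the real Solr class. */
    public static final class Searcher {
        public final String name;
        public Searcher(String name) { this.name = name; }
    }

    /** Same shape as the newSearcher(...) callback, with stand-in types. */
    public interface Listener {
        void newSearcher(Searcher newSearcher, Searcher currentSearcher);
    }

    private final List<Listener> listeners = new ArrayList<>();
    private Searcher current; // null until the first searcher is registered

    public void addListener(Listener listener) {
        listeners.add(listener);
    }

    /** Notifies the listeners about the new searcher, then makes it current. */
    public void open(String name) {
        Searcher fresh = new Searcher(name);
        for (Listener listener : listeners) {
            listener.newSearcher(fresh, current); // current is null on the first call
        }
        current = fresh;
    }

    /** Classifies a callback by looking only at the second argument. */
    public static String eventFor(Searcher currentSearcher) {
        return currentSearcher == null ? "firstSearcher" : "newSearcher";
    }

    public static void main(String[] args) {
        SearcherLifecycleSketch core = new SearcherLifecycleSketch();
        List<String> seen = new ArrayList<>();
        core.addListener((n, c) -> seen.add(eventFor(c) + ":" + n.name));
        core.open("s1");
        core.open("s2");
        System.out.println(seen); // prints [firstSearcher:s1, newSearcher:s2]
    }
}
```

The only point of the sketch is the null check in `eventFor`: the same callback signature serves both events, and the second argument tells them apart.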

Returning to my scenario, all that I need is:

a declaration of that listener in solrconfig.xml
a concrete implementation of SolrEventListener

In solrconfig.xml, within the <updateHandler> section, I can declare my listener:

				
<listener event="firstSearcher" class="a.b.c.SolrStartupListener">
    <str name="datafile">${solr.solr.home}/sample/data.xml</str>
</listener>

The listener will be initialized with just one parameter: the file that contains the sample data. Using the “event” attribute, I can tell Solr which kind of event I’m interested in (i.e., firstSearcher).
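Since the listener depends entirely on that single parameter, it may be worth failing fast when the value is missing or does not point to a readable file. The following is a minimal sketch using only the Java standard library; the class `DatafileCheck` and the method `requireReadable` are hypothetical names introduced here for illustration and are not part of Solr.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DatafileCheck {

    /**
     * Returns the path if the value is non-blank and points to a readable
     * regular file; otherwise throws IllegalArgumentException.
     */
    public static Path requireReadable(String datafile) {
        if (datafile == null || datafile.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing 'datafile' parameter");
        }
        Path path = Paths.get(datafile);
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            throw new IllegalArgumentException("Cannot read datafile: " + path);
        }
        return path;
    }
}
```

A check like this could be called on the value read from the listener's init arguments, so that a wrong path surfaces as a clear error rather than as a failure later, when the file is opened.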

The implementation class is quite simple: it implements SolrEventListener:

				
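import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.common.params.UpdateParams;
import org.apache.solr.common.util.ContentStream;
import org.apache.solr.common.util.ContentStreamBase;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.common.util.SimpleOrderedMap;
import org.apache.solr.core.SolrEventListener;
import org.apache.solr.request.LocalSolrQueryRequest;
import org.apache.solr.request.SolrRequestInfo;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.search.SolrIndexSearcher;

public class SolrStartupListener implements SolrEventListener {
...

    @Override
    public void init(final NamedList args) {
        // The "datafile" parameter comes from the <listener> declaration
        this.datafile = (String) args.get("datafile");
    }
    ...

    @Override
    public void newSearcher(
            final SolrIndexSearcher newSearcher,
            final SolrIndexSearcher currentSearcher) {
        LocalSolrQueryRequest request = null;
        try {
            // 1. Create the arguments map for the update request
            final NamedList<Object> args = new SimpleOrderedMap<>();
            args.add(
                    UpdateParams.ASSUME_CONTENT_TYPE,
                    "text/xml");
            // addEventParms(...) is not declared by SolrEventListener; it is
            // assumed to be defined in the elided part of this class
            addEventParms(currentSearcher, args);

            // 2. Create a new Solr (update) request
            request = new LocalSolrQueryRequest(
                    newSearcher.getCore(),
                    args);

            // 3. Fill the request with the (datafile) input stream
            final List<ContentStream> streams = new ArrayList<>();
            streams.add(new ContentStreamBase() {
                @Override
                public InputStream getStream() throws IOException {
                    return new FileInputStream(datafile);
                }
            });

            request.setContentStreams(streams);

            // 4. Create a new Solr response
            final SolrQueryResponse response =
                    new SolrQueryResponse();

            // 5. And finally invoke the update handler
            SolrRequestInfo.setRequestInfo(
                    new SolrRequestInfo(request, response));

            newSearcher
                    .getCore()
                    .getRequestHandler("/update")
                    .handleRequest(request, response);

        } finally {
            // Release the thread-local request info and the request itself
            SolrRequestInfo.clearRequestInfo();
            if (request != null) {
                request.close();
            }
        }
    }
}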

Voilà: if you start Solr, you will see the sample data loaded. Besides sparing you a lot of repetitive tasks, this can be useful when you’re using a SolrCore as NoSQL storage, for example if you are storing SKOS vocabularies for synonyms, translations, and broader/narrower searches.
