
Adding a custom field to the Solr search index (in Drupal)

I recently had a project that required the search results returned by the Solr search engine on our Drupal site (via the Apache Solr Search module) to be ordered by a custom field.  The custom field was not stored directly on the node, but could be calculated through relationships to the node.  For example, imagine sorting records by whether they belong to a subscribed user, when there is no field on the record to indicate this (though you can get to it through the author field).  The default sort options of title, relevancy, type, author or date just wouldn't do here.

In this blog post I will show you how I added a custom field to the Solr index and ordered our search results based on that custom field. 

Add the custom field to the search index

In our custom module, we need to first implement the hook_apachesolr_index_document_build_node(ApacheSolrDocument $document, $node, $env_id) function, which will be called each time a node is added to the index.  Here, we can use the $document parameter to add any additional fields we want to the index.

function MY_MODULE_apachesolr_index_document_build_node(ApacheSolrDocument $document, $node, $env_id) {
  // Only add the custom field to documents for 'MY_CUSTOM_NODE_TYPE' nodes.
  if ($node->type == 'MY_CUSTOM_NODE_TYPE') {

    // Get to the actual field I want to order by through a SQL query
    $result = db_query("SELECT
    node.nid

    ... blah, blah ...
    ... no need to show you the entire query ...

    ORDER BY node.created DESC
    LIMIT 1 OFFSET 0", array(':nid' => $node->nid));
    

    // Add a new boolean field to this node's document.  A value of 1 (TRUE)
    // marks the node as belonging to a subscribed user; nodes without a
    // matching row do not get the field at all.  The value could be a string
    // or another type instead, given a matching field prefix.
    foreach ($result as $record) {
      $document->addField('bs_MY_NEW_CUSTOM_INDEX_FIELD', 1);
    }
  }
}

Naming your custom field

If you look at the schema.xml file that comes with the apachesolr module, you can find a section that covers the typical prefixes it uses for the fields in the index.  Here is just an excerpt (pay attention to that first comment):

 <!-- For 2 and 3 letter prefix dynamic fields, the 1st letter indicates the data type and
         the last letter is 's' for single valued, 'm' for multi-valued -->

    <!-- We use long for integer since 64 bit ints are now common in PHP. -->
    <dynamicField name="is_*"  type="long"    indexed="true"  stored="true" multiValued="false"/>
    <dynamicField name="im_*"  type="long"    indexed="true"  stored="true" multiValued="true"/>
    <!-- List of floats can be saved in a regular float field -->
    <dynamicField name="fs_*"  type="float"   indexed="true"  stored="true" multiValued="false"/>
    <dynamicField name="fm_*"  type="float"   indexed="true"  stored="true" multiValued="true"/>
    <!-- List of doubles can be saved in a regular double field -->
    <dynamicField name="ps_*"  type="double"   indexed="true"  stored="true" multiValued="false"/>
    <dynamicField name="pm_*"  type="double"   indexed="true"  stored="true" multiValued="true"/>
    <!-- List of booleans can be saved in a regular boolean field -->
    <dynamicField name="bm_*"  type="boolean" indexed="true"  stored="true" multiValued="true"/>
    <dynamicField name="bs_*"  type="boolean" indexed="true"  stored="true" multiValued="false"/>
    <!-- Regular text (without processing) can be stored in a string field-->
    <dynamicField name="ss_*"  type="string"  indexed="true"  stored="true" multiValued="false"/>
    <dynamicField name="sm_*"  type="string"  indexed="true"  stored="true" multiValued="true"/>
    <!-- Normal text fields are for full text - the relevance of a match depends on the length of the text -->
    <dynamicField name="ts_*"  type="text"    indexed="true"  stored="true" multiValued="false" termVectors="true"/>
    <dynamicField name="tm_*"  type="text"    indexed="true"  stored="true" multiValued="true" termVectors="true"/>

 

So now you can see why my custom field above has a prefix of bs_.  As the first comment above indicates, the first letter, 'b', stands for boolean, while the second letter, 's', stands for single-valued, as opposed to multi-valued.

With this information, choose a prefix that matches the data your field will contain (and start understanding your site's Solr index better!).
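
As a rough illustration of the naming convention, here is a minimal sketch of a helper that assembles a dynamic field name from a data type and a cardinality.  The function name my_module_solr_field_name() and its parameters are hypothetical and not part of the apachesolr module; the prefix letters are taken from the schema.xml excerpt above.

```php
<?php

/**
 * Builds a dynamic field name such as 'bs_is_subscribed'.
 *
 * Hypothetical helper for illustration only; it is not part of the
 * apachesolr module.  Prefix letters follow the schema.xml excerpt.
 */
function my_module_solr_field_name($type, $multiple, $name) {
  // First letter: the data type.
  $type_prefixes = array(
    'long' => 'i',
    'float' => 'f',
    'double' => 'p',
    'boolean' => 'b',
    'string' => 's',
    'text' => 't',
  );
  if (!isset($type_prefixes[$type])) {
    throw new InvalidArgumentException("Unknown Solr field type: $type");
  }
  // Second letter: 's' for single-valued, 'm' for multi-valued.
  $cardinality = $multiple ? 'm' : 's';
  return $type_prefixes[$type] . $cardinality . '_' . $name;
}

// Example names (illustrative only).
echo my_module_solr_field_name('boolean', FALSE, 'is_subscribed'), "\n";
echo my_module_solr_field_name('string', TRUE, 'tags'), "\n";
```

The first call prints bs_is_subscribed and the second prints sm_tags, matching the bs_* and sm_* patterns in the excerpt.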

Testing

After rebuilding the search index, check that your new field shows up in Solr's index.  You can find detailed information about what has been indexed on your site here: /admin/reports/apachesolr
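
If you would rather query Solr directly, the following sketch builds a select URL that lists a few documents where the new field is true.  The helper name my_module_solr_check_url() and the base URL http://localhost:8983/solr are assumptions for illustration; adjust them to your own Solr server and core, and note that the sketch assumes the schema's unique key field is named id.

```php
<?php

/**
 * Builds a Solr select URL listing documents where a boolean field is true.
 *
 * Hypothetical helper for illustration; the base URL passed in is an
 * assumption about where your Solr server lives.
 */
function my_module_solr_check_url($base_url, $field) {
  $params = array(
    // Match every document, then narrow down with the filter query.
    'q' => '*:*',
    // Only documents where the custom field is set to true.
    'fq' => $field . ':true',
    // Return the document ID and the custom field only.
    'fl' => 'id,' . $field,
    'rows' => 5,
    'wt' => 'json',
  );
  return rtrim($base_url, '/') . '/select?' . http_build_query($params, '', '&');
}

echo my_module_solr_check_url('http://localhost:8983/solr', 'bs_MY_NEW_CUSTOM_INDEX_FIELD'), "\n";
```

Opening the printed URL in a browser should return a handful of matching documents if the field was indexed as expected.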

Add a custom sort to search queries

Now let's intercept new search queries and give them a custom sort:

function MY_MODULE_apachesolr_query_prepare($query) {

  // Replace the field name with your own.  It is in all caps here only to
  // draw your attention to it.
  $query->addParam('sort', 'bs_MY_NEW_CUSTOM_INDEX_FIELD desc');
  // Secondary sort, used after the subscription-based nodes are put on top.
  $query->addParam('sort', 'sort_label asc');

}
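
Solr itself accepts several sort criteria in a single sort parameter, separated by commas and applied in order.  The sketch below only builds such a string; the helper name my_module_build_sort() is hypothetical, and whether two addParam('sort', ...) calls end up merged into this form may depend on your version of the module, so it is worth checking the query that is actually sent to Solr.

```php
<?php

/**
 * Joins sort criteria into a Solr sort string.
 *
 * Hypothetical helper for illustration; $criteria maps field names to
 * 'asc' or 'desc', in order of priority.
 */
function my_module_build_sort(array $criteria) {
  $parts = array();
  foreach ($criteria as $field => $direction) {
    $direction = strtolower($direction);
    if ($direction !== 'asc' && $direction !== 'desc') {
      throw new InvalidArgumentException("Invalid sort direction for $field");
    }
    $parts[] = $field . ' ' . $direction;
  }
  // Solr applies the criteria left to right.
  return implode(', ', $parts);
}

echo my_module_build_sort(array(
  'bs_MY_NEW_CUSTOM_INDEX_FIELD' => 'desc',
  'sort_label' => 'asc',
)), "\n";
```

This prints "bs_MY_NEW_CUSTOM_INDEX_FIELD desc, sort_label asc", which is the combined ordering the two calls above are meant to express: subscribed users' nodes first, then by title.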

 

Voila!  That should be all you need.  It works for me beautifully.

 

Let us know if it works for you too.


Comments

Thanks for this great tutorial. I would be grateful if you can share with us the custom module created. I have been struggling with this issue of adding a custom field to Solr and I hope your module will be a starting point for me.

Thanks for sharing!!!

I want to dynamically add field using $document->addField('ss_MY_NEW_CUSTOM_INDEX_FIELD', 1);
But this custom field has to be indexed only and not stored. How to do it?
