
TechBlog - Client Side Search with LUNR

Posted by Luca Nimmrichter

Our blog currently uses Gatsby to statically render all pages before they are deployed to a web server. Because of that, we wanted to implement the search on the client side, so that no additional backend is required.

In the following we will have a quick look at the lunr.js library and its associated Gatsby plugin gatsby-plugin-lunr.

lunr.js

Because the GitHub page of LUNR sums it up pretty well, here is its description:

Lunr.js is a small, full-text search library for use in the browser. It indexes JSON documents and provides a simple search interface for retrieving documents that best match text queries.

With that being said, let's have a quick look at an example:

const blogPosts = [{
  id: "A_VERY_LONG_AND_RANDOM_ID",
  title: "TechBlog - Client Side Search with LUNR",
  author: "Luca Nimmrichter",
  tags: ["Search", "JavaScript", "Gatsby", "LUNR"],
  content: "This article explains how the client side search for this blog ..."
}];

const index = lunr(function () {
  this.ref("id");
  this.field("title", { boost: 2 });
  this.field("author", { boost: 3 });
  this.field("tags", { boost: 4 });
  this.field("content");

  for (const blogPost of blogPosts) {
    this.add(blogPost);
  }
});

const results = index.search("search");
console.log(results);

// Output:
// [
//   {
//     "ref": "A_VERY_LONG_AND_RANDOM_ID",
//     "score": 2.497,
//     "matchData": {
//       "metadata": {
//         "search": { "title": {}, "tags": {}, "content": {} }
//       }
//     }
//   }
// ]

First off, we have the documents we want to index. In this case these documents are blog posts. Each blog post has a unique id that identifies it, as well as a title, an author, tags and the content. Next, the search index is built. This is done by calling lunr and passing it a function that configures how the index is built. Afterwards the search method can be used on the index to retrieve all matching documents.

Building the Index

Building an index mainly consists of three simple steps:

  • Set a reference field
  • Configure an arbitrary number of fields
  • Add documents to the index

With ref(String) the reference field is set. It uniquely identifies each document and is therefore mandatory.

Other fields are added via the field(String, Object) method. It is important to note that the names of both the reference field and the other fields have to match the name of a top-level attribute within each document. For regular fields where that is not the case, an extractor function can be passed in the optional second parameter object. That object also accepts a boost modifier. The default boost value is 1; increasing it raises the relevance of all matches within the field, which in turn ranks blog posts matching in that field higher in the search results.

Last but not least, each document is added by calling the add(Object, Object) method. Similar to the field method, an optional object with a boost value can be passed as the second parameter to increase the overall relevance of this document.
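The three steps can be sketched as a small helper that works on the builder object LUNR hands to the configuration function. This is only an illustration: the nested meta.author attribute and the pinned flag are hypothetical and not part of the example above. They are there solely to show an extractor and a document boost.

```javascript
// Hypothetical helper: applies the three steps to a lunr builder.
// `builder` is the object lunr exposes as `this` inside the config function.
function configureIndex(builder, posts) {
  // Step 1: the reference field that uniquely identifies each document.
  builder.ref("id");

  // Step 2: the fields. "title" is a top-level attribute, no extractor needed.
  builder.field("title", { boost: 2 });
  // Assumption: the author is nested under `meta`, so an extractor reads it.
  builder.field("author", {
    boost: 3,
    extractor: (post) => post.meta.author,
  });
  builder.field("content");

  // Step 3: add the documents. Pinned posts (hypothetical flag) are boosted.
  for (const post of posts) {
    builder.add(post, { boost: post.pinned ? 2 : 1 });
  }
}
```

With lunr loaded, the helper would be used as lunr(function () { configureIndex(this, posts); }).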

There are a few more options to configure the index such as adding plugins, modifying the way LUNR filters words that are indexed and tuning the values of the matching algorithm used by LUNR. More information about these topics can be found here.

Searching

As seen in the example above, searching in LUNR is as simple as calling the search method on an index. Each result always consists of the same three main parts:

  • ref - Holds the value of the reference field initially set by calling ref(String).
  • score - The matching score, which reflects how relevant a document is for the query. More information about the score can be found here.
  • matchData - Contains information about where the match occurred as well as additional metadata if enabled.
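As a small illustration of how these parts can be used, the following sketch collects the names of all fields in which a match occurred. The helper name matchedFields is made up for this example; the result object is the one printed in the example above.

```javascript
// Hypothetical helper: returns the names of the fields in which any
// query term matched, sorted alphabetically.
function matchedFields(result) {
  const fields = new Set();
  // matchData.metadata maps each matched term to the fields it was found in.
  for (const fieldsByTerm of Object.values(result.matchData.metadata)) {
    for (const fieldName of Object.keys(fieldsByTerm)) {
      fields.add(fieldName);
    }
  }
  return [...fields].sort();
}

// The result object from the example above.
const exampleResult = {
  ref: "A_VERY_LONG_AND_RANDOM_ID",
  score: 2.497,
  matchData: { metadata: { search: { title: {}, tags: {}, content: {} } } },
};

console.log(matchedFields(exampleResult)); // [ 'content', 'tags', 'title' ]
```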

LUNR provides many options to tune a search query. Individual terms can be manipulated to:

  • Include wildcards
  • Match only a specific field
  • Have their relevance boosted
  • Match fuzzily, e.g. to compensate for typos
  • Be required, optional or excluded

The syntax of these features as well as detailed information on how they work can be found here.
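To give a rough idea of what such queries look like, here is a minimal sketch that renders term descriptions into a query string. The helpers formatTerm and formatQuery and the shape of the term objects are invented for this example. Only the resulting string follows LUNR's query syntax, with * as wildcard, field: as field prefix, ^ for boosts, ~ for the edit distance and +/- for required or excluded terms.

```javascript
// Hypothetical helper: renders one term description as lunr query syntax.
function formatTerm({ term, field, presence, wildcard, fuzzy, boost }) {
  let out = "";
  if (presence === "required") out += "+"; // the term must be present
  if (presence === "excluded") out += "-"; // the term must not be present
  if (field) out += `${field}:`;           // restrict the term to one field
  out += term;
  if (wildcard) out += "*";                // trailing wildcard
  if (fuzzy) out += `~${fuzzy}`;           // allowed edit distance
  if (boost) out += `^${boost}`;           // relevance boost for this term
  return out;
}

// Terms are separated by spaces in a lunr query.
function formatQuery(terms) {
  return terms.map(formatTerm).join(" ");
}

const exampleQuery = formatQuery([
  { term: "gatsby", field: "tags", presence: "required" },
  { term: "serch", fuzzy: 1 },
  { term: "lun", wildcard: true },
  { term: "search", boost: 10 },
  { term: "backend", presence: "excluded" },
]);

console.log(exampleQuery); // +tags:gatsby serch~1 lun* search^10 -backend
```

The resulting string can be passed to index.search as-is.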

Gatsby (gatsby-plugin-lunr)

This plugin makes it easy to use LUNR in combination with Gatsby. To keep the tradition alive here is the GitHub description:

Gatsby plugin for full text search implementation based on Lunr.js client-side index. It supports multi-language search. Search index is placed into the /public folder during build time and has to be downloaded on client side on run time.

Now that the basics of LUNR are covered, I will describe how the plugin is configured for this blog and how the rest of the search was implemented.

Configuration

Using this plugin mainly consists of adding a configuration entry to gatsby-config.js.

{
  resolve: "gatsby-plugin-lunr",
  options: {
    languages: [{ name: "en" }],
    fields: [
      { name: "content", attributes: { boost: 1 } },
      { name: "author", attributes: { boost: 2 } },
      { name: "subtitle", attributes: { boost: 3 } },
      { name: "title", attributes: { boost: 4 } },
      { name: "tags", attributes: { boost: 5 } }
    ],
    resolvers: {
      MarkdownRemark: {
        title: node => node.frontmatter.title,
        subtitle: node => node.frontmatter.subtitle,
        author: node => node.frontmatter.author,
        tags: node => node.frontmatter.tags,
        content: node => node.frontmatter.content
      }
    }
  }
}

The languages to index are set under options.languages. Besides the name, other attributes such as filterNodes and customEntries can be specified here. Full documentation of all available options within this plugin configuration can be found here.

Under options.fields all fields can be configured. Everything should be self-explanatory after reading the lunr.js basics. It is important to note that this plugin does not allow you to set the reference field. It will be automatically populated with the Gatsby node id.

Under options.resolvers each node type has a set of functions that describe how to resolve each previously registered field.
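Conceptually, the resolvers turn a Gatsby node into the flat document that ends up in the index. The following sketch only illustrates that idea and is not the plugin's actual implementation; the function resolveDocument and the sample node are made up for this example.

```javascript
// A reduced version of the resolver configuration shown above.
const resolvers = {
  MarkdownRemark: {
    title: (node) => node.frontmatter.title,
    tags: (node) => node.frontmatter.tags,
  },
};

// Hypothetical helper: builds the document to index for a node,
// or returns null if no resolvers are registered for its node type.
function resolveDocument(node, resolversByType) {
  const fieldResolvers = resolversByType[node.internal.type];
  if (!fieldResolvers) return null;

  // The Gatsby node id serves as the reference field.
  const doc = { id: node.id };
  for (const [fieldName, resolve] of Object.entries(fieldResolvers)) {
    doc[fieldName] = resolve(node);
  }
  return doc;
}

// Made-up node, shaped like a MarkdownRemark node.
const sampleNode = {
  id: "A_VERY_LONG_AND_RANDOM_ID",
  internal: { type: "MarkdownRemark" },
  frontmatter: { title: "Client Side Search", tags: ["Search", "LUNR"] },
};

console.log(resolveDocument(sampleNode, resolvers));
```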

Each time the Gatsby site is built, the plugin automatically builds the LUNR search index and puts it into the /public directory.

Retrieve Blog Posts

To display the search results, the data of all blog posts is required. The following GraphQL query is responsible for that:

query {
  allMarkdownRemark {
    nodes {
      id
      excerpt(pruneLength: 100)
      fields {
        slug
      }
      frontmatter {
        date(formatString: "MMMM DD, YYYY")
        title
        description
        tags
        author
      }
    }
  }
}

To match the search results to the blog posts more efficiently we map the blog post id to its object:

const blogPostIdMap = {};
for (const blogPost of graphqlData.allMarkdownRemark.nodes) {
  blogPostIdMap[blogPost.id] = blogPost;
}

Search Implementation

On the frontend the index is available as window.__LUNR__.en.index and the search query as a GET parameter. Now bringing all pieces together:

const query = new URLSearchParams(window.location.search).get("query") || "";

let searchResults = [];
if (query.trim().length !== 0) {
  searchResults = window.__LUNR__.en.index.search(query);
}

const blogPosts = searchResults.map(r => blogPostIdMap[r.ref]);
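One thing to keep in mind is that the query comes straight from the URL. LUNR reports a query it cannot parse by throwing an error, and a result could in principle reference an id that is missing from the map. A slightly more defensive variant might look like the sketch below. The function findPosts is made up for this example and takes the index and the id map as parameters, so it does not depend on window.

```javascript
// Hypothetical wrapper around index.search that never throws and
// only returns blog posts that are actually known.
function findPosts(index, postsById, rawQuery) {
  const query = (rawQuery || "").trim();
  if (query.length === 0) return [];

  let results;
  try {
    results = index.search(query);
  } catch (error) {
    // The query could not be processed; treat that as "no results".
    return [];
  }

  return results
    .map((result) => postsById[result.ref])
    .filter((post) => post !== undefined); // skip refs without a known post
}
```

In the browser it would be called as findPosts(window.__LUNR__.en.index, blogPostIdMap, query).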

Display Search Results

Finally, the last and also quite simple step: displaying the search results. Because we already have a React component on the front page that displays a list of blog posts, it is as easy as reusing it for this purpose:

<BlogList posts={blogPosts} />

Conclusion

For our use case LUNR is great and it does its job. In combination with the plugin for Gatsby it is really easy to get a basic search up and running in very little time. Despite LUNR's simplicity there are still quite a lot of customization options available to adjust it to your needs.

Image sources

The cover image used in this post was created by Pixabay under the following license. All other images on this page were created by eXXcellent solutions under the terms of the Creative Commons Attribution 4.0 International License.