Keeping your Algolia search index up to date

source link: https://www.algolia.com/blog/product/keeping-your-algolia-search-index-up-to-date/

Feb 12th 2024 · Product

When creating your initial Algolia index, you may seed it with an initial set of data. This is convenient and gets you up and running quickly, but it is rarely the end of the story. Soon you’ll need to add new items, edit older items, and sometimes remove content completely. Keeping your search index up to date as your content changes is critical to ensuring your search results return valid information. Here’s a high-level look at how that can be accomplished.

Determine your updating schedule

In theory, it is best if your index is always completely in sync with your data, but every application is unique and you may decide on a scheduled update instead. For example, a site with millions of content edits throughout the day may decide on a daily update at midnight to reduce the number of operations done during the day. This adds complexity, though, as you will need to determine all the changes since the last update and apply them accordingly. While there are cases where this makes sense, in this article we’ll assume an immediate update is performed every time content changes. Since search can be the primary way users find your content, an index that always reflects the latest content benefits them the most.
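For the scheduled approach, determining the changes since the last update can be sketched as a comparison of two snapshots. This is a minimal illustration, not part of the Algolia SDK: the diffRecords helper is hypothetical, and it assumes you keep a copy of the records sent during the previous run and that every record carries an objectID. The two lists it returns could then be handed to the save and delete methods covered in the sections that follow.

```javascript
// Compare two snapshots of records and work out what has to change in the index.
// Both arguments are arrays of plain objects that each carry an objectID.
// Note: JSON.stringify is sensitive to key order, so this comparison assumes
// records are always built with their fields in the same order.
function diffRecords(previous, current) {
  const previousById = new Map(previous.map(r => [r.objectID, r]));
  const currentIds = new Set(current.map(r => r.objectID));

  // New or changed records need to be saved again.
  const toSave = current.filter(r => {
    const old = previousById.get(r.objectID);
    return !old || JSON.stringify(old) !== JSON.stringify(r);
  });

  // Records that no longer exist need to be deleted by ID.
  const toDelete = previous
    .map(r => r.objectID)
    .filter(id => !currentIds.has(id));

  return { toSave, toDelete };
}

const lastSync = [
  { objectID: '1', name: 'Luna', age: 10 },
  { objectID: '2', name: 'Milo', age: 3 }
];
const now = [
  { objectID: '1', name: 'Luna', age: 11 }, // edited since the last sync
  { objectID: '3', name: 'Nala', age: 1 }   // added since the last sync
];

const changes = diffRecords(lastSync, now);
console.log(changes);
```

Comparing serialized records is the simplest possible change check; a modification timestamp or content hash stored with each record would scale better for large data sets.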

Adding and updating content

A convenient aspect of the Algolia APIs and SDKs is that additions and updates can be done the same way. Algolia uses the object identifier (objectID) to tell the addition of a new object apart from an update to an existing record, so both can make use of the same method.

As an example, a new record for a cat may be defined as:

jsx
const newCat = {
  name: 'Luna',
  age: 10,
  gender: 'female',
  breed: 'calico'
};

And using the JavaScript SDK, stored in an index like so:

jsx
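// objectIDs holds the identifiers Algolia generated for the saved records
const { objectIDs } = await index.saveObjects([newCat], {
  autoGenerateObjectIDIfNotExist: true
});

Because the record has no object identifier, version 4 of the JavaScript SDK needs the autoGenerateObjectIDIfNotExist option shown above. With it set, Algolia will notice the lack of an object identifier and add one accordingly.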

Conversely, when your data is edited, the object passed to Algolia can simply include the existing object identifier:

jsx
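const oldCat = {
  // placeholder: use the objectID returned when the record was first saved
  objectID: 'existing-object-id',
  name: 'Luna',
  age: 11,
  gender: 'female',
  breed: 'calico'
};
// same method as before; the objectID makes this an update of the existing record
const { objectIDs } = await index.saveObjects([oldCat]);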

As the array argument suggests, the SDK’s saveObjects method accepts multiple objects in a single call.
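Rather than relying on generated identifiers, one option is to derive the objectID from a key your own database already has, so the same code path handles both the first save and every later edit. The sketch below assumes a hypothetical row shape with a numeric id field and an arbitrary cat- prefix; neither comes from Algolia.

```javascript
// Hypothetical database row shape: { id, name, age, gender, breed }.
// Map a row to an Algolia record with a stable, predictable objectID.
function toRecord(row) {
  const { id, ...fields } = row;
  // objectID goes last so it cannot be overwritten by a field of the same name.
  return { ...fields, objectID: `cat-${id}` };
}

const created = toRecord({ id: 7, name: 'Luna', age: 10, gender: 'female', breed: 'calico' });
const edited = toRecord({ id: 7, name: 'Luna', age: 11, gender: 'female', breed: 'calico' });

// Both versions share one identifier, so saving the second replaces the first.
console.log(created.objectID, edited.objectID);

// With an index from initIndex, both cases would use the same call:
//   await index.saveObjects([created]);
//   await index.saveObjects([edited]);
```

Because the identifier is known up front, deleting the record later does not require looking anything up in the index first.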

Removing content

When content is removed from your database, a corresponding deletion should be done against your Algolia index. Given the ID of the object just removed, in JavaScript this can be done like so:

jsx
// the id of what was just removed...
let objectId = 'something';
// await so that any error from the deletion is not silently dropped
await index.deleteObjects([objectId]);

As before, this method accepts an array, so multiple objects can be deleted in one call.
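Putting the two operations together, an immediate update on every content change can be sketched as a small dispatcher. The event shape here (type, id, data) is a made-up example of what a CMS or backend hook might emit, and planIndexOperation and applyToIndex are hypothetical helpers; only saveObjects and deleteObjects are the SDK methods described above.

```javascript
// Hypothetical change event: { type: 'created' | 'updated' | 'deleted', id, data? }
// Decide which index method to call and with what payload.
function planIndexOperation(event) {
  switch (event.type) {
    case 'created':
    case 'updated':
      // Additions and edits both go through saveObjects.
      return {
        method: 'saveObjects',
        payload: [{ ...event.data, objectID: event.id }]
      };
    case 'deleted':
      return { method: 'deleteObjects', payload: [event.id] };
    default:
      throw new Error(`Unknown event type: ${event.type}`);
  }
}

// Apply the plan to an index object obtained from initIndex.
async function applyToIndex(index, event) {
  const { method, payload } = planIndexOperation(event);
  return index[method](payload);
}

console.log(planIndexOperation({ type: 'updated', id: 'cat-7', data: { name: 'Luna', age: 11 } }));
console.log(planIndexOperation({ type: 'deleted', id: 'cat-7' }));
```

Keeping the decision separate from the network call makes the mapping easy to test without touching a real index.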

An Example

Let’s take a look at an example of this in action. For our demonstration, we’ll use Eleventy. Obviously, Algolia works with nearly everything, but Eleventy is quick and simple. Our Eleventy site is a basic blog with a few pieces of content:

[Image: eleventy example]

For the content, we used Google’s Gemini generative AI model to write a few paragraphs of text about why cats, fish, dogs, and dragons are excellent pets.

[Image: dogs example]

Let’s see how easy it is to integrate with Algolia. Before starting, we obtained an application ID and an admin API key for our credentials. While not required, we also created a new index in the dashboard named BlogSearch.

While there are multiple ways of handling this, a simple example can make use of Eleventy’s eleventy.after event. This runs every time a build is generated and is defined in the Eleventy configuration file, .eleventy.js. Here’s the initial version of the code, which just logs when the event is fired:

jsx
const english = new Intl.DateTimeFormat('en');
eleventyConfig.addFilter("dtFormat", function(date) {
  return english.format(date);
});
eleventyConfig.on(
  'eleventy.after',
  async ({ dir, results, runMode, outputMode }) => {
    console.log('build done');
  }
);

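To add Algolia support, we’ll install the JavaScript SDK. The docs cover this well. Begin by installing the SDK with npm install algoliasearch, then integrate it into our .eleventy.js. The code in this article uses the version 4 API (initIndex), so pin the package with npm install algoliasearch@4 if a newer major version is installed by default. First, we’ll load up the SDK and configure it with credentials: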

jsx
const algoliasearch = require('algoliasearch');
const algoliaClient = algoliasearch('my app id', 'my key that can write data');
const algoliaIndex = algoliaClient.initIndex('BlogSearch');
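Since the admin key can write data, it is worth keeping it out of the configuration file itself. A minimal sketch, assuming the credentials are supplied through environment variables: the readAlgoliaConfig helper and the variable names ALGOLIA_APP_ID, ALGOLIA_ADMIN_KEY, and ALGOLIA_INDEX_NAME are our own choices for illustration, not names required by Algolia or Eleventy.

```javascript
// Read Algolia credentials from environment variables instead of hardcoding them.
// The variable names are assumptions; use whatever your hosting platform provides.
function readAlgoliaConfig(env = process.env) {
  const appId = env.ALGOLIA_APP_ID;
  const adminKey = env.ALGOLIA_ADMIN_KEY;
  // Fall back to the index name used in this article.
  const indexName = env.ALGOLIA_INDEX_NAME || 'BlogSearch';

  if (!appId || !adminKey) {
    throw new Error('Missing ALGOLIA_APP_ID or ALGOLIA_ADMIN_KEY');
  }
  return { appId, adminKey, indexName };
}

// Usage inside .eleventy.js (requires the algoliasearch package):
//   const algoliasearch = require('algoliasearch');
//   const { appId, adminKey, indexName } = readAlgoliaConfig();
//   const algoliaIndex = algoliasearch(appId, adminKey).initIndex(indexName);

console.log(readAlgoliaConfig({ ALGOLIA_APP_ID: 'demo-app', ALGOLIA_ADMIN_KEY: 'demo-key' }).indexName);
```

Failing fast when a variable is missing makes a misconfigured build stop early instead of partway through an index update.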

Next, we need to generate the content we’ll use to populate the Algolia index. Let’s look at the code, and then we’ll explain it in detail:

jsx
eleventyConfig.on('eleventy.after', async ({ dir, results, runMode, outputMode }) => {
  // Gather posts by filtering our results.
  let posts = results.filter(p => p.url.indexOf('/posts/') === 0);
  // Now reduce the post content to <main> ... </main>
  posts = posts.map(p => {
    let post = {
      content: p.content.replace(/[\s\S]*?<main.*?>([\s\S]*?)<\/main>[\s\S]*/gm, '$1').trim(),
      // the URL doubles as a unique, stable identifier
      objectID: p.url,
      title: p.content.match('<title>(.*?)</title>')[1]
    };
    return post;
  });
  console.log('Sending to Algolia....');
  await algoliaIndex.saveObjects(posts);
  console.log('Done...');
});

We begin by getting a list of generated files from our Eleventy site and filtering to the blog posts. This is an arbitrary decision. Your site may contain other pages that you want indexed as well.

After that, we convert the raw HTML of each post into a record. We grab the content between the <main> tags (which avoids site layout and unrelated content) and the title, and use the URL of the content as a unique identifier.

The last step is to pass this to our index.
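The extraction step can also be pulled out into a standalone function, which makes it easy to try against sample HTML before wiring it into a build. The sketch below is a variation on the handler above, not the article’s exact code: toAlgoliaRecord is a hypothetical helper, it falls back to the URL when no <title> is found (the original would throw in that case), and it returns an empty string when there is no <main> element.

```javascript
// Convert one Eleventy build result ({ url, content }) into an Algolia record.
function toAlgoliaRecord(result) {
  const mainMatch = result.content.match(/<main\b[^>]*>([\s\S]*?)<\/main>/);
  const titleMatch = result.content.match(/<title>([\s\S]*?)<\/title>/);

  return {
    // The URL is unique per page, so it works as the object identifier.
    objectID: result.url,
    // Fall back to the URL rather than failing when a page has no <title>.
    title: titleMatch ? titleMatch[1].trim() : result.url,
    // Only the text inside <main> is indexed; layout markup is left out.
    content: mainMatch ? mainMatch[1].trim() : ''
  };
}

const sample = {
  url: '/posts/dragons/',
  content:
    '<html><head><title>Dragons</title></head><body><nav>Home</nav>' +
    '<main class="post">\n<p>Dragons are great pets.</p>\n</main>' +
    '<footer>Bye</footer></body></html>'
};

console.log(toAlgoliaRecord(sample));
```

Regular expressions are enough for a predictable template like this one; for arbitrary HTML, an HTML parser would be the more robust choice.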

And does it work? One of the best features of Algolia’s dashboard is a built-in search so you can test right away. Let’s try a search for “dragon”:

[Screenshot: dashboard search results for “dragon”]

Now let’s test how well our code updates. First, a search for “camden”, which returns nothing:

[Screenshot: dashboard search for “camden” with no results]

Next, we’ll update one blog post (the dog one) to contain that text, and voilà, it now shows up:

[Screenshot: dashboard search for “camden” returning the dog post]

Simple enough. Keep in mind that updating your Algolia index this way, on every build, could be overkill in production. An alternative would be to use your platform’s event handling system. For example, Netlify lets you run code when your site is built.
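One small way to limit the work while staying inside the eleventy.after handler is to skip the sync unless the build is a one-off build that has explicitly opted in. This is a sketch under assumptions: shouldSyncIndex is a hypothetical helper, SYNC_SEARCH_INDEX is a made-up environment variable, and it relies on Eleventy reporting runMode as build, watch, or serve, as its documentation describes.

```javascript
// Decide whether a finished build should push records to the search index.
// SYNC_SEARCH_INDEX is a hypothetical opt-in flag set by the deploy environment.
function shouldSyncIndex(runMode, env = process.env) {
  // Skip local development, where --serve or --watch rebuilds on every save.
  if (runMode !== 'build') return false;
  return env.SYNC_SEARCH_INDEX === 'true';
}

// Inside the handler this would look like:
//   eleventyConfig.on('eleventy.after', async ({ results, runMode }) => {
//     if (!shouldSyncIndex(runMode)) return;
//     ...build the records and call saveObjects as before...
//   });

console.log(shouldSyncIndex('build', { SYNC_SEARCH_INDEX: 'true' }));
console.log(shouldSyncIndex('serve', { SYNC_SEARCH_INDEX: 'true' }));
```

With a guard like this, local previews never touch the production index, and the deploy environment decides when a sync actually runs.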

Next Steps

As demonstrated above, Algolia’s SDKs and APIs help support atomic updates with methods for adding (as well as editing) and removing content. By integrating these into your existing CMS or other backend code, you can easily keep your search index in sync with your actual content. If you have any questions, reach out to us on Discord.

