Examine Indexing in Umbraco: Custom Search Optimization Explained


Introduction

Search is an integral part of any user experience. Whether you use your Umbraco website to conduct business, provide technical support, or publish content, your users expect instant and highly relevant results. Examine, Umbraco’s built-in search and indexing engine, provides the basis for this.

In this article, we will discuss how Examine works, when customizing is necessary, and some real-world patterns to build search solutions people enjoy using.

Understanding Examine Architecture

Examine, Umbraco's default full-text search engine, is built on top of Lucene.NET. The Examine library builds indexes that let you run fast searches without iterating through every node in the content tree.

By default, Umbraco creates the following content indexes (a separate MembersIndex covers members):

ExternalIndex - Indexes published content that is visible to your website’s visitors.

InternalIndex - Indexes all content, including unpublished drafts; used for editor searches in the backoffice.
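
As a starting point before any customization, the sketch below queries the default ExternalIndex. It is a minimal example, assuming Umbraco 9 or later with Examine 3.x; the class name SiteSearchService and the method FindContentIds are illustrative and not part of Umbraco.

```csharp
using System.Collections.Generic;
using System.Linq;
using Examine;
using Umbraco.Cms.Core;

// Hypothetical service name; register it in DI as you would any other service.
public class SiteSearchService
{
    private readonly IExamineManager _examineManager;

    public SiteSearchService(IExamineManager examineManager)
    {
        _examineManager = examineManager;
    }

    // Returns the IDs of published content nodes that match the term.
    public IReadOnlyList<string> FindContentIds(string term)
    {
        // Bail out on empty input or if the index cannot be resolved
        if (string.IsNullOrWhiteSpace(term) ||
            !_examineManager.TryGetIndex(Constants.UmbracoIndexes.ExternalIndexName, out var index))
        {
            return new List<string>();
        }

        var results = index.Searcher
            .CreateQuery("content")   // restrict the query to content items
            .ManagedQuery(term)       // let Examine search across the indexed fields
            .Execute();

        return results.Select(r => r.Id).ToList();
    }
}
```

Because the lookup goes through the index rather than the content tree, the cost of the query does not grow with the depth of the tree being searched.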

Why Customize Indexing?

Limitations of Default Indexes:

Inadequate field coverage: The fields you need may not be indexed in a form you can use. Values from custom document types and properties often need their own indexing logic.

Limited ranking criteria: Results are ranked by relevance score only. You may want to give preference to products on sale, to recent posts, or to a particular category.

No facet support: You cannot filter results by category, price range, or custom metadata unless you use a custom index.

Slow performance: Queries that run over a large, general-purpose dataset take a performance hit. A custom index holds only the information you need.

Language and analysis: The default analysis is geared towards English. If you need to analyze documents in another language, you need a custom index with a suitable analyzer.

Building a Custom Index

Here's how to populate a custom index for an e-commerce site. The handler assumes that an index named "ProductIndex" has already been registered with Examine:

using System.Collections.Generic;
using Examine;
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.DependencyInjection;
using Umbraco.Cms.Core.Events;
using Umbraco.Cms.Core.Notifications;

public class ProductSearchIndexComposer : IComposer
{
    public void Compose(IUmbracoBuilder builder)
    {
        // Run the handler every time content is published
        builder.AddNotificationHandler<ContentPublishedNotification, ProductSearchIndexHandler>();
    }
}

public class ProductSearchIndexHandler : INotificationHandler<ContentPublishedNotification>
{
    private readonly IExamineManager _examineManager;

    public ProductSearchIndexHandler(IExamineManager examineManager)
    {
        _examineManager = examineManager;
    }

    public void Handle(ContentPublishedNotification notification)
    {
        // Nothing to do if the custom index has not been registered
        if (!_examineManager.TryGetIndex("ProductIndex", out var index))
        {
            return;
        }

        var valueSets = new List<ValueSet>();

        foreach (var item in notification.PublishedEntities)
        {
            // Only product nodes belong in this index
            if (item.ContentType.Alias != "product")
            {
                continue;
            }

            valueSets.Add(new ValueSet(
                item.Id.ToString(),
                "content",
                "product",
                new Dictionary<string, IEnumerable<object>>
                {
                    { "nodeName", new object[] { item.Name ?? string.Empty } },
                    { "productName", new object[] { item.GetValue<string>("name") ?? string.Empty } },
                    { "description", new object[] { item.GetValue<string>("description") ?? string.Empty } },
                    { "price", new object[] { item.GetValue<double>("price") } },
                    { "category", new object[] { item.GetValue<string>("category") ?? string.Empty } },
                    // Stored as "true"/"false" so queries can filter on a predictable value
                    { "isOnSale", new object[] { item.GetValue<bool>("onSale") ? "true" : "false" } }
                }));
        }

        // Send all product value sets to the index in one call
        if (valueSets.Count > 0)
        {
            index.IndexItems(valueSets);
        }
    }
}

This approach:

  • Listens for publish events
  • Extracts only the product nodes
  • Explicitly indexes the custom fields (price, category, sale status)
  • Enables sorting and searching of the custom fields
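
The handler looks up an index called "ProductIndex", so that index has to exist first. One possible way to register it is sketched below, assuming Examine 3.x (the version shipped with recent Umbraco releases) and its AddExamineLuceneIndex extension method. The composer name ProductIndexRegistrationComposer is illustrative, and declaring "price" as a Double is an assumption made here so that the field can later be sorted and range-filtered numerically.

```csharp
using Examine;
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.DependencyInjection;

// Hypothetical composer name; Umbraco discovers IComposer implementations at startup.
public class ProductIndexRegistrationComposer : IComposer
{
    public void Compose(IUmbracoBuilder builder)
    {
        // Register a Lucene-backed index under the name the handler looks up.
        builder.Services.AddExamineLuceneIndex(
            "ProductIndex",
            // Assumption: treat "price" as a number; all other fields
            // fall back to Examine's default full-text handling.
            fieldDefinitions: new FieldDefinitionCollection(
                new FieldDefinition("price", FieldDefinitionTypes.Double)));
    }
}
```

Fields without an explicit definition are analyzed as full text, which suits the name and description but is a poor fit for numeric comparisons.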

Enhancing Search Queries

So now you have a good index. The next thing you want to do is improve your queries.

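using System.Collections.Generic;
using System.Linq;
using Examine;
using Examine.Search;

// Simple result types returned to the caller
public class SearchItem
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Price { get; set; }
    public string Category { get; set; }
}

public class SearchResults
{
    public List<SearchItem> Items { get; set; } = new List<SearchItem>();
    public long TotalResults { get; set; }
}

public class ProductSearchService
{
    private readonly IExamineManager _examineManager;

    public ProductSearchService(IExamineManager examineManager)
    {
        _examineManager = examineManager;
    }

    public SearchResults Search(string term, string category, bool onlyOnSale)
    {
        // The searcher comes from the index; return an empty result if either input is missing
        if (string.IsNullOrWhiteSpace(term) ||
            !_examineManager.TryGetIndex("ProductIndex", out var index))
        {
            return new SearchResults();
        }

        // Search across multiple fields with boosting. Grouping keeps the OR
        // clause together so the filters below apply to the whole match.
        var query = index.Searcher.CreateQuery("content")
            .Group(q => q
                .Field("productName", term.Boost(10f))
                .Or()
                .Field("description", term.Boost(2f)));

        // Filter by category
        if (!string.IsNullOrEmpty(category))
        {
            query = query.And().Field("category", category);
        }

        // Filter for sale items only (assumes the flag is indexed as the string "true")
        if (onlyOnSale)
        {
            query = query.And().Field("isOnSale", "true");
        }

        var results = query.Execute();

        return new SearchResults
        {
            Items = results.Select(x => new SearchItem
            {
                Id = x.Id,
                Name = x.GetValues("productName").FirstOrDefault(),
                Price = x.GetValues("price").FirstOrDefault(),
                Category = x.GetValues("category").FirstOrDefault()
            }).ToList(),
            TotalResults = results.TotalItemCount
        };
    }
}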

Some of the key optimizations include:

Field boosting - Matches in the product’s name get a boost of 10, compared with 2 for matches in the description, so a name match carries five times the weight. A product named "Nike Shoes" will rank above a product that only mentions "Nike Shoes" in its description.

Filtered queries - Limiting the query to a specific product category and/or sale status.

Multifield Search - Searching for matches in name and description fields at the same time.
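
The same query style extends to the price range filtering and sorting mentioned earlier. The sketch below is a hedged example, assuming Examine 3.x and assuming the "price" field was declared as a Double in the index's field definitions; with a plain full-text field the comparison would not be numeric. The class and method names are illustrative.

```csharp
using Examine;
using Examine.Search;

// Hypothetical helper; pass in the searcher of the product index.
public static class ProductQueryExamples
{
    // Finds products matching the term within a price range, cheapest first.
    public static ISearchResults SearchByPriceRange(
        ISearcher searcher, string term, double minPrice, double maxPrice)
    {
        return searcher.CreateQuery("content")
            // Match the term in either the name or the description
            .GroupedOr(new[] { "productName", "description" }, term)
            // Keep only products priced between minPrice and maxPrice (inclusive)
            .And().RangeQuery<double>(new[] { "price" }, minPrice, maxPrice)
            // Sort numerically by price instead of by relevance score
            .OrderBy(new SortableField("price", SortType.Double))
            .Execute();
    }
}
```

Sorting by a field replaces relevance ordering, so this pattern fits "browse by price" views better than free-text search pages, where boosting is usually the more useful tool.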

Search Query Optimisation Best Practices

1. Index selectively - Not everything needs to be indexed. Index only things that people can and will be searching for. Smaller indexes are quicker to execute.

2. Data normalization - Perform data pre-processing operations such as stripping out HTML tags, converting prices into numbers, and ensuring uniform text casing.

3. Analyze intelligently - Select a text analyzer suited to your use case. English analyzers strip out stop words; technical documents may need a custom analyzer.

4. Control index size - Large indexes slow down both indexing and queries. Rebuild indexes when the index structure changes, and archive old content instead of continuing to index it.

5. Ranking experimentation - Run experiments on ranking algorithms. Does giving higher weight to product names generate more clicks? Does time-based ranking work?

6. Caching - Search queries tend to be repetitive. Cache frequent searches in Redis or memory to save computing costs.

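// Requires: using Microsoft.Extensions.Caching.Distributed; using Newtonsoft.Json;
// Assumes _cache is an injected IDistributedCache and _productSearch a ProductSearchService.
public async Task<SearchResults> SearchWithCaching(string term, string category)
{
    // Normalize the key so "Shoes" and " shoes " share one cache entry
    var cacheKey = $"search_{term?.Trim().ToLowerInvariant()}_{category?.Trim().ToLowerInvariant()}";

    var cached = await _cache.GetStringAsync(cacheKey);
    if (cached != null)
        return JsonConvert.DeserializeObject<SearchResults>(cached);

    // Cache miss: run the real search and store the serialized result for an hour
    var results = _productSearch.Search(term, category, false);

    await _cache.SetStringAsync(cacheKey, JsonConvert.SerializeObject(results),
        new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(1) });

    return results;
}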
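
The data normalization practice (point 2 above) can be sketched with a small helper that runs before values are added to the index. This is a minimal illustration using only the .NET base class library; the class IndexValueNormalizer and its method names are hypothetical, and the regex-based tag stripping is a simplification that is not a full HTML parser.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for cleaning values before they are indexed.
public static class IndexValueNormalizer
{
    private static readonly Regex HtmlTag = new Regex("<[^>]+>", RegexOptions.Compiled);
    private static readonly Regex Whitespace = new Regex(@"\s+", RegexOptions.Compiled);

    // Strips tags, decodes HTML entities, and collapses runs of whitespace.
    public static string ToPlainText(string html)
    {
        if (string.IsNullOrWhiteSpace(html))
        {
            return string.Empty;
        }

        var withoutTags = HtmlTag.Replace(html, " ");
        var decoded = WebUtility.HtmlDecode(withoutTags);
        return Whitespace.Replace(decoded, " ").Trim();
    }

    // Parses a price using the invariant culture; returns null if it is not a number.
    public static double? ToPrice(object raw)
    {
        var text = Convert.ToString(raw, CultureInfo.InvariantCulture);
        return double.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out var value)
            ? (double?)value
            : null;
    }

    // Trims and lower-cases a value so that casing differences do not split categories.
    public static string ToKeyword(string value)
    {
        return (value ?? string.Empty).Trim().ToLowerInvariant();
    }
}
```

For example, a rich text description could be passed through ToPlainText and a category through ToKeyword when the value set is built, so that queries and cache keys see the same normalized form.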

Conclusion

Examine powers Umbraco's search capabilities, but there is much more that can be done beyond the standard settings. With custom indexes based on your own content, optimized queries using field boosts and filters, and adherence to indexing guidelines, you can deliver more relevant results and improve user interaction and conversion rates.

First, consider what users search for. Create an index according to their needs. Evaluate your ranking. Modify it as necessary based on actual user searches. This is worth the effort; effective search means more time spent on your website.

Written by

Nishant Vaghasiya

Technical Architect

I'm Nishant Vaghasiya, a Technical Architect and Umbraco & Sitecore Certified Developer at Arroact Technologies. I specialise in building digital solutions with Umbraco and Sitecore that are practical, scalable, and built to last.

Over the years, I've learned that the best solutions aren't always the most complex ones. They're the ones that make a team's day-to-day work simpler and give them confidence that the system won't let them down.

That's what drives me: writing code that performs well, stays reliable, and continues to create real impact long after it goes live.
