Examine Indexing in Umbraco: Custom Search Optimization Explained
Introduction
Search is an integral part of any user experience. Whether you use your Umbraco website to conduct business, provide technical support, or publish content, your users expect instant and highly relevant results. Examine, Umbraco’s built-in search and indexing engine, provides the basis for this.
In this article, we will discuss how Examine works, when customizing is necessary, and some real-world patterns to build search solutions people enjoy using.
Understanding Examine Architecture
Examine, Umbraco's default full-text search library, is built on top of Lucene.NET. It builds indexes that let you perform fast searches without having to iterate through every node in the content tree.
By default, Umbraco creates the following content indexes (alongside a separate index for members):
ExternalIndex - Indexes published content that is visible to your website’s visitors.
InternalIndex - Indexes all content, including drafts; used by the backoffice search that editors rely on.
Why Customize Indexing?
Limitations of Default Indexes:
Inadequate field coverage: The default indexes may not hold the fields you need in a searchable form. Values from custom document types and properties often have to be added or reshaped explicitly.
Limited ranking criteria: Results are ranked by relevance score only. You may want to give preference to products on sale, recent posts, or particular categories.
No facet support: You cannot filter results by category, price range, or custom metadata unless you use a custom index.
Slow performance: Queries against a large, general-purpose index can take a performance hit. A custom index holds only the information you need, which keeps it smaller and faster.
English-centric analysis: If you need to analyze documents in a language other than English, you will need a custom index with a suitable analyzer.
Building a Custom Index
Here's how to keep a custom product index up to date for an e-commerce site. The handler assumes Umbraco 9 or later and an Examine index named "ProductIndex" that has already been registered:
using System.Collections.Generic;
using Examine;
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.DependencyInjection;
using Umbraco.Cms.Core.Events;
using Umbraco.Cms.Core.Notifications;
using Umbraco.Extensions;

public class ProductSearchIndexComposer : IComposer
{
    public void Compose(IUmbracoBuilder builder)
    {
        // Run the handler every time content is published.
        builder.AddNotificationHandler<ContentPublishedNotification, ProductSearchIndexHandler>();
    }
}

public class ProductSearchIndexHandler : INotificationHandler<ContentPublishedNotification>
{
    private readonly IExamineManager _examineManager;

    public ProductSearchIndexHandler(IExamineManager examineManager)
    {
        _examineManager = examineManager;
    }

    public void Handle(ContentPublishedNotification notification)
    {
        // Nothing to do if the custom index has not been registered.
        if (!_examineManager.TryGetIndex("ProductIndex", out var index))
        {
            return;
        }

        var valueSets = new List<ValueSet>();
        foreach (var item in notification.PublishedEntities)
        {
            // Only product nodes belong in this index.
            if (item.ContentType.Alias != "product")
            {
                continue;
            }

            valueSets.Add(new ValueSet(
                item.Id.ToString(),
                "content",
                "product",
                new Dictionary<string, IEnumerable<object>>
                {
                    { "nodeName", Field(item.Name) },
                    { "productName", Field(item.GetValue("name")) },
                    { "description", Field(item.GetValue("description")) },
                    { "price", Field(item.GetValue("price")) },
                    { "category", Field(item.GetValue("category")) },
                    // Store the flag as "true"/"false" so queries can match it as text.
                    { "isOnSale", Field(item.GetValue<bool>("onSale") ? "true" : "false") }
                }));
        }

        if (valueSets.Count > 0)
        {
            index.IndexItems(valueSets);
        }
    }

    // Wraps a single value for the ValueSet, replacing null with an empty string.
    private static object[] Field(object value) => new[] { value ?? string.Empty };
}
This approach:
- Listens for publish events
- Extracts only the product nodes
- Explicitly indexes the custom fields (price, category, sale status)
- Enables searching and filtering on the custom fields (sorting additionally requires a sortable field definition)
Enhancing Search Queries
So now you have a good index. The next thing you want to do is improve your queries.
using System.Collections.Generic;
using System.Linq;
using Examine;
using Examine.Search;

public class ProductSearchService
{
    private readonly IExamineManager _examineManager;

    public ProductSearchService(IExamineManager examineManager)
    {
        _examineManager = examineManager;
    }

    // SearchResults and SearchItem are the site's own result models (not shown here).
    public SearchResults Search(string term, string category, bool onlyOnSale)
    {
        if (string.IsNullOrWhiteSpace(term) ||
            !_examineManager.TryGetIndex("ProductIndex", out var index))
        {
            return new SearchResults { Items = new List<SearchItem>(), TotalResults = 0 };
        }

        // Search across multiple fields with boosting. Grouping the two clauses
        // keeps the filters below from turning the term match into an optional one.
        IBooleanOperation query = index.Searcher
            .CreateQuery("content")
            .Group(q => q
                .Field("productName", term.Boost(10f))
                .Or()
                .Field("description", term.Boost(2f)));

        // Filter by category
        if (!string.IsNullOrEmpty(category))
        {
            query = query.And().Field("category", category);
        }

        // Filter for sale items only
        if (onlyOnSale)
        {
            query = query.And().Field("isOnSale", "true");
        }

        var results = query.Execute();

        return new SearchResults
        {
            Items = results.Select(x => new SearchItem
            {
                Id = x.Id,
                Name = x.GetValues("productName").FirstOrDefault(),
                Price = x.GetValues("price").FirstOrDefault(),
                Category = x.GetValues("category").FirstOrDefault()
            }).ToList(),
            TotalResults = results.TotalItemCount
        };
    }
}
Some of the key optimizations include:
Field boosting - Matches in the product’s name carry a boost of 10, against 2 for the product’s description, so a name match weighs five times as much. A product named "Nike Shoes" will rank above products that only mention "Nike Shoes" in their description.
Filtered queries - Limiting the query to a specific product category and/or sale status.
Multi-field search - Searching for matches in the name and description fields at the same time.
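To make the effect of boosting concrete, here is a deliberately simplified sketch in plain C#. It is a toy model, not Lucene's actual scoring, which also weighs term frequency, field length, and other factors. The `Product` record and `BoostDemo` class are hypothetical names used only for illustration; the boosts of 10 and 2 mirror the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model used only for this illustration.
public record Product(string Name, string Description);

public static class BoostDemo
{
    // Toy score: every field that contains the term contributes its boost.
    public static double Score(Product p, string term,
        double nameBoost = 10, double descriptionBoost = 2)
    {
        double score = 0;
        if (p.Name.Contains(term, StringComparison.OrdinalIgnoreCase))
        {
            score += nameBoost;
        }
        if (p.Description.Contains(term, StringComparison.OrdinalIgnoreCase))
        {
            score += descriptionBoost;
        }
        return score;
    }

    // Drops non-matching products and orders the rest by descending score.
    public static List<Product> Rank(IEnumerable<Product> products, string term) =>
        products
            .Select(p => (Product: p, Score: Score(p, term)))
            .Where(x => x.Score > 0)
            .OrderByDescending(x => x.Score)
            .Select(x => x.Product)
            .ToList();
}
```

In this model, a product whose name matches always outranks one that matches only in its description, which is the behavior the boost values are meant to encourage.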
Search Query Optimisation Best Practices
1. Index selectively - Not everything needs to be indexed. Index only things that people can and will be searching for. Smaller indexes are quicker to execute.
2. Data normalization - Perform data pre-processing operations such as stripping out HTML tags, converting prices into numbers, and ensuring uniform text casing.
3. Analyze intelligently - Choose an analyzer suited to your content. English analyzers strip out stop words; technical documents may require a custom analyzer.
4. Control index size - Large indexes slow down both indexing and queries. Rebuild indexes periodically, and archive old content instead of keeping it in the index indefinitely.
5. Ranking experimentation - Run experiments on ranking algorithms. Does giving higher weight to product names generate more clicks? Does time-based ranking work?
6. Caching - Search queries tend to be repetitive. Cache frequent searches in Redis or memory to save computing costs.
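// Assumes _cache is an IDistributedCache (Microsoft.Extensions.Caching.Distributed),
// _productSearch is the ProductSearchService above, and Newtonsoft.Json is referenced.
public async Task<SearchResults> SearchWithCaching(string term, string category)
{
    // Normalize the key so "Shoes" and " shoes " share one cache entry.
    var cacheKey = $"search_{term?.Trim().ToLowerInvariant()}_{category?.Trim().ToLowerInvariant()}";

    var cached = await _cache.GetStringAsync(cacheKey);
    if (cached != null)
    {
        return JsonConvert.DeserializeObject<SearchResults>(cached);
    }

    var results = _productSearch.Search(term, category, false);

    // Keep the entry for one hour, then let it expire.
    await _cache.SetStringAsync(
        cacheKey,
        JsonConvert.SerializeObject(results),
        new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(1) });

    return results;
}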
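The data normalization step from the list above can be sketched in plain C# as well. This is a minimal example under stated assumptions: `IndexValueNormalizer` is a hypothetical helper, the regex-based tag stripping is only adequate for simple rich-text markup, and the price parser assumes US-style input such as "$1,299.50". Values would be passed through helpers like these before being added to the index.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for cleaning values before they are indexed.
public static class IndexValueNormalizer
{
    private static readonly Regex Tags = new Regex("<[^>]+>", RegexOptions.Compiled);
    private static readonly Regex Whitespace = new Regex(@"\s+", RegexOptions.Compiled);

    // Strips HTML tags, decodes entities, collapses whitespace, and lower-cases the text.
    public static string CleanText(string html)
    {
        if (string.IsNullOrWhiteSpace(html))
        {
            return string.Empty;
        }

        var withoutTags = Tags.Replace(html, " ");
        var decoded = WebUtility.HtmlDecode(withoutTags);
        return Whitespace.Replace(decoded, " ").Trim().ToLowerInvariant();
    }

    // Converts a display price to a decimal, or returns null when it cannot be parsed.
    // Assumes "," is a thousands separator and "." is the decimal separator.
    public static decimal? ParsePrice(string raw)
    {
        if (string.IsNullOrWhiteSpace(raw))
        {
            return null;
        }

        var cleaned = raw.Replace("$", "").Replace(",", "").Trim();
        return decimal.TryParse(cleaned, NumberStyles.Number,
            CultureInfo.InvariantCulture, out var value)
            ? value
            : (decimal?)null;
    }
}
```

Returning null for an unparseable price lets the caller decide whether to skip the field or index a default, rather than silently storing zero.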
Conclusion
Examine powers Umbraco's search capabilities, but much more can be done beyond the standard settings. With custom indexes based on your own content, optimized queries using field boosts and filters, and adherence to indexing guidelines, you can deliver better results and improve user interaction and conversion rates.
First, consider what users search for. Create an index according to their needs. Evaluate your ranking. Modify it as necessary based on actual user searches. This is worth the effort; effective search means more time spent on your website.