Thursday, October 30, 2014

Vulnerability with using the Sitecore context search index

Mornings when production issues happen mean two things: no tea yet, and definitely no snacks until order is restored. So I was in a bit of a hurry today to find out what was going on when our default Sitecore content indexes seemed to have gone down. This was the exception thrown:

System.NullReferenceException: Object reference not set to an instance of an object.
   at Sitecore.ContentSearch.SitecoreItemCrawler.IsAncestorOf(Item item)
   at Sitecore.ContentSearch.SitecoreItemCrawler.IsExcludedFromIndex(SitecoreIndexableItem indexable, Boolean checkLocation)
   at Sitecore.ContentSearch.Pipelines.GetContextIndex.FetchIndex...

Here's a snippet that would cause the above error to be thrown:
Item contextItem = Sitecore.Context.Database.GetItem(SOME_ID);

if (contextItem != null)
{
    ISearchIndex index = ContentSearchManager.GetIndex(new SitecoreIndexableItem(contextItem));
}

That's just an example; I also saw the same error thrown for the __semantics field in my 'web' database.

So we're failing to retrieve the ISearchIndex... but why?

If you take a look at IsAncestorOf(Item item) (Sitecore 7.2+) using your favorite reflection tool, this is what you'll find:
// Sitecore.ContentSearch.SitecoreItemCrawler
protected virtual bool IsAncestorOf(Item item)
{
    bool result;
    using (new SecurityDisabler())
    {
        using (new CachesDisabler())
        {
            result = this.RootItem.Axes.IsAncestorOf(item);
        }
    }
    return result;
}
No null checks here. That can't be good.
The RootItem property:
// Sitecore.ContentSearch.SitecoreItemCrawler
public Item RootItem
{
    get
    {
        if (this.rootItem == null)
        {
            Database database = ContentSearchManager.Locator.GetInstance<IFactory>().GetDatabase(this.database);

            Assert.IsNotNull(database, "Database " + this.database + " does not exist");

            using (new SecurityDisabler())
            {
                this.rootItem = database.GetItem(this.Root);
            }
        }
        return this.rootItem;
    }
}
No null check here either. Fishy.

I had to dig a bit deeper to figure out where it all originates and what exactly is null.

I overrode the pipeline processor Sitecore.ContentSearch.Pipelines.GetContextIndex.FetchIndex, Sitecore.ContentSearch to debug the code. It blew up trying to evaluate this:
System.Collections.Generic.IEnumerable<ISearchIndex> enumerable =
    from searchIndex in ContentSearchManager.Indexes
    from providerCrawler in searchIndex.Crawlers
    where !providerCrawler.IsExcludedFromIndex(indexable)
    select searchIndex;
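So the GetContextIndex pipeline walks every crawler of every available index and keeps the indexes whose crawler root is an ancestor of our indexable item (that is what IsExcludedFromIndex checks). The ancestor check, however, throws as soon as any one of those crawlers has a null root item, even if that crawler has nothing to do with the item we asked about.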

So if you remember the good old "Root Item Not defined" error message, it turns out this issue had the exact same cause: the RootItem for one of the search indexes did not exist in the 'web' database (i.e. the item was not yet published). Publishing is of course the quick fix.
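To find out which index is the culprit before publishing anything, something like the following could be run from a debug page. This is only a rough sketch: CrawlerRootDiagnostics and FindMissingRoots are made-up names, and it assumes that Database and Root are the public crawler properties populated from the index configuration, and that RootItem returns null (rather than throwing) when the item is missing, as the decompiled getter above suggests for this version.

```csharp
using System.Collections.Generic;
using Sitecore.ContentSearch;

// Hypothetical helper, not part of Sitecore.
public static class CrawlerRootDiagnostics
{
    // Returns one description per crawler whose configured root item
    // cannot be resolved in its database.
    public static List<string> FindMissingRoots()
    {
        var problems = new List<string>();

        foreach (ISearchIndex index in ContentSearchManager.Indexes)
        {
            foreach (var crawler in index.Crawlers)
            {
                // Only item crawlers expose a root item; skip anything else.
                var itemCrawler = crawler as SitecoreItemCrawler;
                if (itemCrawler == null)
                {
                    continue;
                }

                // Assumption: the getter yields null for an unpublished root.
                if (itemCrawler.RootItem == null)
                {
                    problems.Add("Index '" + index.Name + "': root '" + itemCrawler.Root
                        + "' not found in database '" + itemCrawler.Database + "'");
                }
            }
        }

        return problems;
    }
}
```

An empty list means every crawler root resolved; any entry names the index and root path that still needs publishing.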

While this is in the end a configuration issue, I still feel like one unpublished site should not be affecting ALL OF SEARCH (search based on GetContextIndex, that is). That is why I'm going to leave my FetchIndex override in place and add some null checks, so that we at least get a more meaningful error message.
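A different place to put the guard, instead of the FetchIndex override, would be the crawler itself, since IsAncestorOf is protected virtual. The sketch below is untested; NullSafeItemCrawler is a made-up class name, it would have to be registered as the crawler type in the index configuration, and it assumes that returning false simply makes IsExcludedFromIndex treat the item as outside this crawler.

```csharp
using Sitecore.ContentSearch;
using Sitecore.Data.Items;
using Sitecore.Diagnostics;

// Hypothetical crawler subclass, not part of Sitecore.
public class NullSafeItemCrawler : SitecoreItemCrawler
{
    protected override bool IsAncestorOf(Item item)
    {
        // A crawler without a resolvable root cannot be the ancestor of anything.
        if (this.RootItem == null)
        {
            // Assumption: Root and Database hold the values from the index configuration.
            Log.Warn("Crawler root '" + this.Root + "' not found in database '"
                + this.Database + "'. Is the root item published?", this);
            return false;
        }

        // Root exists, so fall through to the stock ancestor check.
        return base.IsAncestorOf(item);
    }
}
```

With this in place a missing root would only take its own crawler out of play rather than the whole context index lookup, at the cost of a warning in the log on every lookup until the root item is published.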