Adding search to your website with Azure Search

Posted on Wednesday, 10th February 2016

As traffic to my blog has started to grow, I’ve become increasingly keen to add a search facility so that visitors can find content. As most of you will probably know, search isn’t an easy problem. In fact it’s an extremely hard one, which is why I was keen to look at some of the existing search providers out there.

Azure’s search platform has been on my radar for a while now, and having recently listened to an MS Dev Show interview with one of the developers behind Azure Search I was keen to give it a go.

This article serves as a high-level overview of how to get up and running with the Azure Search service on your website or blog, using a relatively simple implementation that could use some fleshing out. Before we begin, it’s also worth mentioning that Azure Search has some really powerful capabilities that are beyond the scope of this article, so I’d highly recommend checking out the documentation.

Azure Search Pricing Structure

Before we begin, let us take a moment to look at the Azure Search pricing:

One of the great things about Azure Search is that it has a free tier that’s more than suitable for low to medium traffic websites. What’s odd, though, is that after the free tier there’s a sudden climb in price and specifications for the next tier up. I assume this is because, whilst it’s been available for a year, it’s still relatively new, so hopefully we’ll see more options become available moving forward, but only time will tell. As it stands the free tier is more than I need, so let us continue!

Setting up Azure Search - Step 1: Creating our search service

Before we begin we’ve got to create our Azure Search service within Azure’s administration portal.

Note: If you’ve got the time then the Azure Search service documentation goes through this step in greater detail, so if you get stuck then feel free to refer to it here.

Enter your service name (in this instance I used the name “jwblog”), select your subscription, resource group and location, and select the free pricing tier.

Setting up Azure Search - Step 2: Configuring our service

Now that we’ve created our free search service, next we have to configure it. To do this click the Add Index button within the top Azure Search menu.

Add an index (see image above)

Once we’ve created our account, we need to decide which content we’re going to make searchable and reflect that within our search index. Azure Search will then index that content, allowing it to be searched. We can define the index either programmatically via the Azure Search API or via the Portal. Personally I’d rather do it up front in the Portal; there’s nothing to stop you doing it in code, but for now let’s do it via the Azure Portal. For this post we want to index our blog post title and blog post content, as this is what we want the user to be able to search.

Setting retrievable content  (see image above)

When configuring our Azure Search profile we can specify which content we want to mark as retrievable. As I only plan on showing the post title on the search results page, I will mark the post title as retrievable. As the search results page will also need to link to the actual blog post, I will also mark the blog post ID as retrievable.
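The index described above can also be written down as data, which is handy if you ever need to recreate it. Below is a minimal sketch that builds the kind of JSON index definition the Azure Search REST API accepts. The camel-cased field names (indexId, postId, postTitle, postBody) and the attribute choices are assumptions drawn from the description above, and BlogPostIndexDefinition is a hypothetical helper rather than part of any SDK.

```csharp
using System.Text.Json;

// Hypothetical helper: describes the "blogpost" index as JSON.
// Field names and attribute values are assumptions for illustration.
public static class BlogPostIndexDefinition
{
    public static string ToJson()
    {
        var definition = new
        {
            name = "blogpost",
            fields = new[]
            {
                // The document key has to be a unique string
                new { name = "indexId", type = "Edm.String", key = true, searchable = false, retrievable = true },
                // Retrievable so the results page can link to the post
                new { name = "postId", type = "Edm.Int32", key = false, searchable = false, retrievable = true },
                // Searchable and shown on the results page
                new { name = "postTitle", type = "Edm.String", key = false, searchable = true, retrievable = true },
                // Searchable, but not returned with the results
                new { name = "postBody", type = "Edm.String", key = false, searchable = true, retrievable = false }
            }
        };

        return JsonSerializer.Serialize(definition, new JsonSerializerOptions { WriteIndented = true });
    }
}
```

Calling BlogPostIndexDefinition.ToJson() gives you a document you can compare against what the Portal shows for your own index.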

Now that we’ve set up our profile and configured it, let’s get on with some coding!

Setting up Azure Search - Step 3: Implementing our search - the interface

Because it’s always a good idea to program to an interface rather than an implementation, and because we may want to index other content moving forward, we’ll create a simple search provider interface, avoiding the use of any Azure Search specific references.

public interface ISearchProvider<T> where T : class
{
    IEnumerable<TResult> SearchDocuments<TResult>(string searchText, string filter, Func<T, TResult> mapper);

    void AddToIndex(T document);
}

If you take a moment to look at the SearchDocuments method signature:

IEnumerable<TResult> SearchDocuments<TResult>(string searchText, string filter, Func<T, TResult> mapper);

You’ll see that we take a Func that accepts a T and returns a TResult. This will allow us to map our search index class (which we’ll create next) to a data transfer object, but more on this later. You’re also able to pass search filters to give your website richer searching capabilities.
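To make the role of the mapper a little more concrete, here is a minimal sketch of an in-memory stand-in for the interface, the sort of thing you might use in a unit test. InMemorySearchProvider is a hypothetical class of my own naming; it does a naive case-insensitive substring match, ignores the filter argument and is in no way how Azure Search matches text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the interface above, repeated so the sketch compiles on its own
public interface ISearchProvider<T> where T : class
{
    IEnumerable<TResult> SearchDocuments<TResult>(string searchText, string filter, Func<T, TResult> mapper);

    void AddToIndex(T document);
}

// Hypothetical in-memory stand-in for tests; not a real search engine
public class InMemorySearchProvider<T> : ISearchProvider<T> where T : class
{
    private readonly List<T> _documents = new List<T>();
    private readonly Func<T, string> _textSelector;

    // textSelector picks the text of a document that should be searched
    public InMemorySearchProvider(Func<T, string> textSelector)
    {
        if (textSelector == null)
            throw new ArgumentNullException(nameof(textSelector));

        _textSelector = textSelector;
    }

    public IEnumerable<TResult> SearchDocuments<TResult>(string searchText, string filter, Func<T, TResult> mapper)
    {
        // The filter argument is ignored in this sketch
        string text = searchText ?? string.Empty;

        return _documents
            .Where(d => _textSelector(d).IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0)
            .Select(mapper) // the mapper turns each stored document into the caller's DTO
            .ToList();
    }

    public void AddToIndex(T document)
    {
        if (document == null)
            throw new ArgumentNullException(nameof(document));

        _documents.Add(document);
    }
}
```

Because the calling code only ever sees ISearchProvider, it can be exercised against this stand-in without touching a real search service.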

Next we want to create our blog post search index class that will contain all of the properties required to index our content. As our search provider interface is generic, each kind of content gets its own index class. Here that’s a BlogPostSearchIndex class, but it also gives us the scope to easily index other content such as pages (in which case we’d call it a BlogPageSearchIndex).

What’s important to note about the BlogPostSearchIndex class below is that its properties have the same names and types as the fields we configured earlier within Step 2 (the SerializePropertyNamesAsCamelCase attribute maps the Pascal-cased property names to camel-cased field names). The index name itself is held by the search provider we’ll implement shortly.

// BlogPostSearchIndex.cs

[SerializePropertyNamesAsCamelCase]
public class BlogPostSearchIndex
{
    public BlogPostSearchIndex(int postId, string postTitle, string postBody)
    {
        // Document index needs to be a unique string
        IndexId = "blogpost" + postId.ToString();
        PostId = postId;
        PostTitle = postTitle;
        PostBody = postBody;
    }

    // Properties must remain public as they'll be used for automatic data binding
    public string IndexId { get; set; }

    public int PostId { get; set; }

    public string PostTitle { get; set; }

    public string PostBody { get; set; }

    public override string ToString()
    {
        return $"IndexId: {IndexId}\tPostId: {PostId}\tPostTitle: {PostTitle}\tPostBody: {PostBody}";
    }
}

Now that we’ve created the interface to our search provider we’ll go ahead and work on the implementation.

Setting up Azure Search - Step 5: Implementing our search - the implementation

At this point we’re ready to start working on the implementation of our search functionality, so we’ll create an AzureSearchProvider class that implements our ISearchProvider interface and start fleshing out our search.

Before we begin, it’s worth being aware that Azure’s search service does provide a RESTful API that you can consume to manage your indexes; however, as you’ll see below, I’ve opted to do it via their SDK.

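// AzureSearchProvider.cs

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;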

public class AzureSearchProvider<T> : ISearchProvider<T> where T : class
{
    private readonly SearchServiceClient _searchServiceClient;
    private const string Index = "blogpost";

    public AzureSearchProvider(SearchServiceClient searchServiceClient)
    {
        _searchServiceClient = searchServiceClient;
    }

    public IEnumerable<TResult> SearchDocuments<TResult>(string searchText, string filter, Func<T, TResult> mapper)
    {
        SearchIndexClient indexClient = _searchServiceClient.Indexes.GetClient(Index);

        var sp = new SearchParameters();
        if (!string.IsNullOrEmpty(filter)) sp.Filter = filter;

        DocumentSearchResponse<T> response = indexClient.Documents.Search<T>(searchText, sp);
        return response.Select(result => mapper(result.Document)).ToList();
    }

    public void AddToIndex(T document)
    {
        if (document == null)
            throw new ArgumentNullException(nameof(document));

        SearchIndexClient indexClient = _searchServiceClient.Indexes.GetClient(Index);

        try
        {
            // No need to create an UpdateIndex method as we use MergeOrUpload action type here.
            IndexBatch<T> batch = IndexBatch.Create(IndexAction.Create(IndexActionType.MergeOrUpload, document));
            indexClient.Documents.Index(batch);
        }
        catch (IndexBatchException e)
        {
            Console.WriteLine("Failed to Index some of the documents: {0}",
                string.Join(", ", e.IndexResponse.Results.Where(r => !r.Succeeded).Select(r => r.Key)));
        }
    }
}
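For comparison with the SDK route, here is a rough sketch of what the equivalent REST call for a single document might look like. It only builds the request and doesn’t send it. RestIndexingSketch is a hypothetical helper, and the api-version value and camel-cased field names are assumptions, so check them against the REST documentation and your own index before relying on them.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Hypothetical helper showing the shape of a REST indexing request
public static class RestIndexingSketch
{
    // Assumed API version; use whichever version your service supports
    private const string ApiVersion = "2015-02-28";

    // Builds (but does not send) a request that merges or uploads one blog post document
    public static HttpRequestMessage BuildIndexRequest(
        string serviceName, string indexName, string adminKey,
        int postId, string postTitle, string postBody)
    {
        var document = new Dictionary<string, object>
        {
            ["@search.action"] = "mergeOrUpload",
            ["indexId"] = "blogpost" + postId, // the document key must be a unique string
            ["postId"] = postId,
            ["postTitle"] = postTitle,
            ["postBody"] = postBody
        };

        // Documents are sent as a batch under a "value" array
        var body = new Dictionary<string, object> { ["value"] = new[] { document } };

        var uri = new Uri($"https://{serviceName}.search.windows.net/indexes/{indexName}/docs/index?api-version={ApiVersion}");

        var request = new HttpRequestMessage(HttpMethod.Post, uri)
        {
            Content = new StringContent(JsonSerializer.Serialize(body), Encoding.UTF8, "application/json")
        };

        // The administration key travels in the api-key header
        request.Headers.Add("api-key", adminKey);

        return request;
    }
}
```

The resulting request could then be handed to an HttpClient via SendAsync. Personally I find the SDK less fiddly, but it’s useful to know what is going over the wire.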

The last part of our implementation is to configure our search provider with our IoC container of choice to ensure that our AzureSearchProvider class and its dependencies (Azure’s SearchServiceClient class) can be resolved. The SearchServiceClient constructor requires our search service name and credentials as arguments, so we’ll configure them there too.

In this instance I’m using StructureMap, so if you’re using a different container you’ll need to adjust your IoC configuration accordingly.

public class DomainRegistry : Registry
{
    public DomainRegistry()
    {
        ...

        this.For<SearchServiceClient>().Use(() => new SearchServiceClient("jwblog", new SearchCredentials("your search administration key")));
        this.For(typeof(ISearchProvider<>)).Use(typeof(AzureSearchProvider<>));

        ...
    }
}

At this point all we need to do is add our administration key, which we can get from the Azure Portal under the Keys setting within the search service blade we used to configure our search service.

Setting up Azure Search - Step 6: Indexing our content

Now that all of the hard work is out of the way and our search service is configured, we need to index our content. Currently our Azure Search index is an empty container with no content in it, so in the context of a blog we need to ensure that when we add or edit a blog post the search document stored within Azure Search is either added or updated. To do this we need to go to our blog’s controller action that’s responsible for creating a blog post and index our content.

Below is a rough example of how it would look. I’m a huge fan of a library called MediatR for delegating my requests, but the below should be enough to give you an idea of how we’d implement indexing our content. We’d also need to do the same thing when updating our blog posts, as we want our search indexes to stay up to date with any modified content.

public class BlogPostController : Controller
{
    private readonly ISearchProvider<BlogPostSearchIndex> _searchProvider;

    public BlogPostController(ISearchProvider<BlogPostSearchIndex> searchProvider)
    {
        this._searchProvider = searchProvider;
    }

    [HttpPost]
    public ActionResult Create(BlogPostAddModel model)
    {
        ...

        // Add your blog post to the database first; its Id is used to build the unique index Id

        this._searchProvider.AddToIndex(new BlogPostSearchIndex(model.Id, model.Title, model.BlogPost));

        ...
    }

    [HttpPost]
    public ActionResult Update(BlogPostEditModel model)
    {
        ...
        // As we're using Azure Search's MergeOrUpload index action we can simply call AddToIndex() when updating
        this._searchProvider.AddToIndex(new BlogPostSearchIndex(model.Id, model.Title, model.BlogPost));

        ...
    }

}
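One thing the controller doesn’t cover is posts that were published before search was wired in, as they will never pass through Create or Update. Below is a minimal sketch of a one-off backfill that pushes every existing post through an indexing callback. StoredPost and SearchBackfill are hypothetical names for this sketch, and how you load your existing posts is left to you.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a blog post loaded from the database
public class StoredPost
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Body { get; set; }
}

public static class SearchBackfill
{
    // Runs addToIndex for every post and returns the IDs of posts whose callback threw,
    // so they can be retried later
    public static IReadOnlyList<int> IndexAll(IEnumerable<StoredPost> posts, Action<StoredPost> addToIndex)
    {
        if (posts == null) throw new ArgumentNullException(nameof(posts));
        if (addToIndex == null) throw new ArgumentNullException(nameof(addToIndex));

        var failedIds = new List<int>();

        foreach (StoredPost post in posts)
        {
            try
            {
                addToIndex(post);
            }
            catch (Exception)
            {
                // Keep going; one bad post shouldn't stop the rest from being indexed
                failedIds.Add(post.Id);
            }
        }

        return failedIds;
    }
}
```

With the provider from earlier, the callback would be something along the lines of post => _searchProvider.AddToIndex(new BlogPostSearchIndex(post.Id, post.Title, post.Body)).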

Now that we’re indexing our content we’ll move onto querying it.

Setting up Azure Search - Step 7: Querying our indexes

As Azure’s Search service is built on top of Lucene’s query parser (Lucene is a well-known open-source search library) we have a variety of ways we can query our content, including:

  • Field-scoped queries
  • Fuzzy search
  • Proximity search
  • Term boosting
  • Regular expression search
  • Wildcard search
  • Syntax fundamentals
  • Boolean operators
  • Query size limitations
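To give a feel for a few of these, below is a small sketch listing example query strings in Lucene syntax. The postTitle field name is an assumption based on the index we configured, LuceneQueryExamples is a hypothetical class, and depending on your API and SDK version the full Lucene syntax may need to be switched on explicitly, so treat these as illustrations rather than queries guaranteed to run against your service as-is.

```csharp
using System.Collections.Generic;

// Hypothetical catalogue of example Lucene-style query strings
public static class LuceneQueryExamples
{
    public static readonly IReadOnlyDictionary<string, string> Samples = new Dictionary<string, string>
    {
        // Only match the term within one field (field name is an assumption)
        ["Field-scoped"] = "postTitle:azure",
        // Match terms that are similar to a misspelt one
        ["Fuzzy"] = "serch~",
        // Match both words when they appear within three words of each other
        ["Proximity"] = "\"azure search\"~3",
        // Rank documents containing the boosted term higher
        ["Term boosting"] = "azure^2 search",
        // Match terms against a regular expression
        ["Regular expression"] = "/az[uo]re/",
        // Match any term starting with the prefix
        ["Wildcard"] = "sear*",
        // Combine terms with boolean operators
        ["Boolean"] = "azure AND NOT lucene"
    };
}
```

Any of these strings would be passed as the searchText argument of SearchDocuments.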

To query our search index all we need to do is call our generic SearchDocuments method and map our search index object to a view model/DTO like so:

private IEnumerable<BlogPostSearchResult> QuerySearchDocuments(string keyword)
{
    IEnumerable<BlogPostSearchResult> result = _searchProvider.SearchDocuments(keyword, string.Empty, document => new BlogPostSearchResult
    {
        Id = document.PostId,
        Title = document.PostTitle
    });

    return result;
}
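The filter argument is left empty above. Azure Search filters are written in OData syntax, so a small helper can keep the quoting in one place. This is a minimal sketch, assuming the fields involved were marked as filterable when the index was configured; SearchFilters and its methods are hypothetical names of my own.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for building OData filter strings
public static class SearchFilters
{
    // Builds e.g. postTitle eq 'Jo''s post'
    // OData string literals use single quotes, and a quote inside the value is escaped by doubling it
    public static string StringEquals(string field, string value)
    {
        if (value == null)
            throw new ArgumentNullException(nameof(value));

        string escaped = value.Replace("'", "''");
        return field + " eq '" + escaped + "'";
    }

    // Builds e.g. postId gt 10
    public static string GreaterThan(string field, int value)
    {
        return field + " gt " + value.ToString(CultureInfo.InvariantCulture);
    }
}
```

The result is then passed in place of string.Empty, for example SearchFilters.GreaterThan("postId", 10).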

At this point you have one of two options: you can either retrieve the indexed text (providing you marked it as retrievable earlier in Step 2) and display that in your search results, or you can return your ID and query your database for the relevant post based on that blog post ID. Naturally the latter introduces an extra, arguably unnecessary, database call, so consider your options. Personally, as my links include a filename, I prefer to treat the posts in my database as the source of truth and to check that the posts exist and are published, so I’m happy to incur that extra database call.

public IEnumerable<BlogPostItem> Handle(BlogPostSearchResultRequest message)
{
    if (string.IsNullOrWhiteSpace(message.Keyword))
        throw new ArgumentNullException(nameof(message.Keyword));

    List<int> postIds = QuerySearchDocuments(message.Keyword).Select(x => x.Id).ToList();

    // Get blog posts from database based on IDs
    return GetBlogPosts(postIds);
}
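One thing to watch with the database round trip is ordering. The search service returns results ranked by relevance, but a database lookup by a list of IDs won’t necessarily hand the rows back in that same order. Below is a minimal sketch of putting them back in search order, assuming your lookup doesn’t already do so; SearchResultOrdering is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that restores the search ranking after a database lookup
public static class SearchResultOrdering
{
    // Returns the posts in the order their IDs appear in rankedIds.
    // IDs with no matching post are skipped, and posts not in rankedIds are dropped.
    public static List<TPost> InSearchOrder<TPost>(
        IEnumerable<int> rankedIds, IEnumerable<TPost> posts, Func<TPost, int> idSelector)
    {
        Dictionary<int, TPost> postsById = posts.ToDictionary(idSelector);
        var seen = new HashSet<int>();
        var ordered = new List<TPost>();

        foreach (int id in rankedIds)
        {
            TPost post;

            // seen.Add returns false for an ID we've already handled
            if (seen.Add(id) && postsById.TryGetValue(id, out post))
                ordered.Add(post);
        }

        return ordered;
    }
}
```

In the handler above this would wrap the final call, with postIds as the ranking and the rows returned by GetBlogPosts as the posts.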

Setting up Azure Search - Wrapping it up

Now that we’re querying our indexes and retrieving the associated blog posts from the database, all we have left to do is output our list of blog posts to the user.

I’m hoping you’ve found this general overview useful. As mentioned at the beginning of the post, this is a high-level overview and implementation of what is a powerful search service. At this point I would highly recommend you take the next step and look at how you can start to tweak your search results via means such as scoring profiles and some of the features provided by Lucene.

Happy coding!