Sitecore Azure Search overview

Last updated Monday, January 23, 2017 in Sitecore Experience Platform for Developer
Keywords: Azure, Cloud, Search

The Sitecore Azure Search provider integrates the Sitecore Search engine with the Microsoft Azure Search service, which is part of the Microsoft Azure computing platform. You can read more about the Microsoft Azure Search service on the Microsoft Azure website.

This topic applies to Sitecore Experience Platform 8.2 Update-1 and later. It describes the features and the limitations of Azure Search.

Features of Azure Search

The Microsoft Azure Search service provides the following features:

  • Extreme scalability, simplicity, and stability.
  • A highly available infrastructure with 99.95% uptime as a part of the Microsoft Azure service level agreement (SLA).
  • An easy way to scale up and scale down as needed.

The Sitecore Azure Search provider includes the following features:

  • Support for all Sitecore search-driven UIs, including user-typed queries and faceted searches.
  • Support for the majority of LINQ expressions, to enable rapid development of search-powered applications.
  • Native support for fundamental data types, such as numbers and dates, in faceting and range queries.
  • Flexible configuration and precise control over the schema of the indexes.
  • Support for running Sitecore in geo-replicated scenarios.

Note

Sitecore Azure Search behaves slightly differently from the Lucene and Solr search providers; this is important to consider if you are going to switch between search providers. Read more about Sitecore Azure Search limitations and behavioral differences in the Limitations of Azure Search section.

Sitecore Azure Search is the default provider for Sitecore instances that are deployed using the Sitecore Azure SDK. It supports on-premises and IaaS deployments. Follow the instructions in Configure Azure Search to configure Sitecore Azure Search.

Limitations of Azure Search

Compared with Sitecore Search on Lucene and Solr, Sitecore Search on Azure Search has several limitations:

  • The Azure Search service automatically tokenizes document field values and queries when searching and faceting. This means that:
    • Substring searches that are limited to a single term (for instance, the predicates .StartsWith(), .EndsWith(), and .Contains()) match parts of terms, and match terms that are located in any part of the field value. When multiple terms are passed, each term is searched separately, which can return more results than expected.
    • Regular expressions that span multiple terms (that is, contain spaces) return no results.
    • Multiple terms that are passed to .Wildcard() are interpreted as individual wildcards in a field-scoped query.
    • Unlike Lucene and Solr, facet values are calculated from the individual terms in faceted fields, not from whole field values, when a value contains multiple words.
  • The Azure Search service has a strong schema. This means that fields cannot have the same name but different types in different documents.
  • Operators that join queries, for example, .GroupJoin() and .SelfJoin(), are not supported and result in an error.
  • Range queries are always expressed as filters. As a result:
    • Combining range queries with Search using the logical operator OR (||) produces an error.
    • Range queries on string fields always operate on the whole field value without tokenization and are case-sensitive.
  • The Azure Search service stores date-time and numeric values as native types and only allows filtering on these fields. Search and filter parts can only be combined with the logical operator AND (&&). As a result:
    • Complex queries involving fields with different types that are combined with the logical operator OR (||) can return an error.
    • .Union() and .Except() operators may generate queries that return an error, depending on the types of the fields used.
    • Certain user queries in the Content Editor that span multiple fields with different types (such as creation date or version) return an error.
  • Fuzzy query semantics are different in Azure Search. For example:
    • .Like(pattern, similarity) interprets the similarity parameter as the Damerau-Levenshtein Distance (value between 0 and 2). This is different from the way Lucene implements the similarity parameter in Sitecore.
    • The similarity and slop parameters cannot be combined in the Azure Search Lucene syntax. This means that multiple-word fuzzy queries, such as .Like(), are always interpreted as a phrase query with a slop.
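
The effect of tokenization on faceting and on multi-term substring searches, described in the first limitation above, can be illustrated with a small model. The following Python sketch is not the Azure Search or Sitecore implementation. The tokenizer (lowercasing and splitting on non-word characters), the function names, and the sample data are all assumptions chosen for illustration, and a real analyzer may split values differently.

```python
import re
from collections import Counter

def tokenize(value):
    """Split a value into lowercase terms (assumed, simplified analyzer)."""
    return re.findall(r"\w+", value.lower())

def facet_whole_values(values):
    """Facet counts keyed on the whole field value."""
    return Counter(values)

def facet_terms(values):
    """Facet counts keyed on individual terms.

    Each document contributes a given term at most once (an assumption
    made for this sketch).
    """
    counts = Counter()
    for value in values:
        counts.update(set(tokenize(value)))
    return counts

def contains_any_term(value, query):
    """Model of a multi-term substring search on a tokenized field.

    Each query term is matched separately against each term of the field,
    so one matching term is enough for the document to be returned.
    """
    field_terms = tokenize(value)
    return any(q in t for q in tokenize(query) for t in field_terms)

# Hypothetical values of a faceted field in three documents.
categories = ["Red Wine", "White Wine", "Red Wine"]

print(facet_whole_values(categories))  # two facet values: whole strings
print(facet_terms(categories))         # three facet values: single terms

# The whole-value check fails, but the term-by-term check succeeds
# because the term "wine" matches on its own.
print("white wine" in "Red Wine".lower())
print(contains_any_term("Red Wine", "white wine"))
```

In this model, a field that would produce two facet values under whole-value faceting produces three under term-based faceting, and a two-word query matches a document that shares only one of its words. When you switch search providers, both effects are worth checking against any code that relies on facets or on multi-word .Contains() calls.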