Azure Cognitive Search (also known as Azure Search) can attach cloud-based indexers to various data sources for easy indexing of your data. You can also define and manage the indexing logic in code. If you need full control over how your search indexes are built and incrementally updated, the .NET SDKs are there for you.

Getting started

  • Create an instance of Azure Cognitive Search in the Azure Portal. For development purposes you can use the Free tier.
  • Create a .NET project and add the NuGet package Azure.Search.Documents.

To connect to the service, create a SearchIndexClient with a uri and an admin key, which you can find in the Azure Portal:

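open System
open Azure
open Azure.Search.Documents.Indexes

// config is assumed to be an IConfiguration holding the service settings
let uri = config.GetValue<string>("Uri")
let adminKey = config.GetValue<string>("AdminKey")

let indexClient =
    SearchIndexClient(Uri(uri), AzureKeyCredential(adminKey))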

The index schema is defined by a record (or class) decorated with attributes that specify which fields are searchable, filterable, facetable, etc.

When defining your schema you can choose between the field attributes SimpleField and SearchableField:

  • SimpleField is non-searchable, but retrievable. You can use it for either the document Id, or filterable, sortable, facetable fields, or to assist scoring profiles.
  • SearchableField is searchable and retrievable, and must be a string (or a collection of strings). With searchable fields you can define custom analyzers and use synonym maps.

In this example we'll make do with specifying a key, a searchable field and a filterable field. The searchable field allows us to search the title of the movie, while the filterable field allows us to filter out movies with unwanted ratings, e.g. those below 50.

type Movie =
    {
        [<SimpleField(IsKey=true)>]
        Id: string
        [<SearchableField>]
        Name: string
        Poster: string
        [<SimpleField(IsFilterable=true)>]
        Rating: int
    }
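
The other flags mentioned above are set the same way. As a rough sketch, the record below combines sortable, facetable and analyzer settings. MovieDetails and its fields are made up for illustration and are not part of the example index; en.lucene is one of the built-in language analyzers.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: Azure.Search.Documents"
open Azure.Search.Documents.Indexes

// Hypothetical schema, only meant to show how the attribute flags combine.
type MovieDetails =
    {
        [<SimpleField(IsKey=true)>]
        Id: string
        // Searchable with the English Lucene analyzer, and usable in $orderby.
        [<SearchableField(AnalyzerName="en.lucene", IsSortable=true)>]
        Title: string
        // Not searchable, but usable in filters and facet counts.
        [<SimpleField(IsFilterable=true, IsFacetable=true)>]
        Genre: string
    }
```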

Next we'll create the index definition with a FieldBuilder, which gives us a list of the type SearchField.

let builder = FieldBuilder()
let indexDefinition = builder.Build(typeof<Movie>)

Then, to create or update the index definition in the service itself:

let index = SearchIndex("movie-index", indexDefinition)
        
indexClient.CreateOrUpdateIndex(index) |> ignore

Now we can insert some data into our index with a SearchClient:

let movies =
    [
        {
         Id="braveheart"
         Name="Braveheart"
         Poster="bravheart-poster.jpg"
         Rating=99
        }
        {
         Id="honey-i-shrunk-the-kids"
         Name="Honey, I shrunk the kids"
         Poster="honey-i-shrunk-the-kids-poster.jpg"
         Rating=65
        }
        {
         Id="home-alone"
         Name="Home alone"
         Poster="home-alone-poster.jpg"
         Rating=48
        }
    ]

let searchClient = indexClient.GetSearchClient("movie-index")
let batch = IndexDocumentsBatch.Upload movies
searchClient.IndexDocuments(batch) |> ignore
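
The upload above discards the response, but that response carries one result per document. Unless IndexDocumentsOptions.ThrowOnAnyError is set, a failure of an individual document should not raise an exception, so it can be worth inspecting the results. A minimal sketch, where failedDocuments is a helper name introduced here for illustration:

```fsharp
// In an .fsx script, reference the package first: #r "nuget: Azure.Search.Documents"
open Azure.Search.Documents.Models

// Returns the key and error message of every document that failed to index.
let failedDocuments (result: IndexDocumentsResult) =
    result.Results
    |> Seq.filter (fun r -> not r.Succeeded)
    |> Seq.map (fun r -> r.Key, r.ErrorMessage)
    |> Seq.toList

// Usage, with the searchClient and batch from the example above:
// let response = searchClient.IndexDocuments(batch)
// failedDocuments response.Value
// |> List.iter (fun (key, error) -> printfn "%s failed: %s" key error)
```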

Searching the index

Let's test our search instance by querying the data with an HttpClient. The Uri is the same as for the SearchIndexClient, and the query key can be found in the Azure Portal.

let uri = config.GetValue<string>("Uri")
let queryKey = config.GetValue<string>("QueryKey")

let queryClient = new HttpClient()
queryClient.BaseAddress <- Uri($"{uri}/indexes/movie-index/docs/")
queryClient.DefaultRequestHeaders.Add("api-key", queryKey)

To query, we send a POST request to the search instance with some parameters defined. Say we only want movies with a rating above 50 whose name starts with the letter B:

let parameters =
    {|
      filter = "Rating gt 50"
      search = "B*"
      searchFields = "Name"
    |}
    
let serialized = JsonSerializer.Serialize parameters
let body = new StringContent(serialized, Encoding.UTF8, "application/json")

let result = queryClient.PostAsync("search?api-version=2020-06-30", body) |> Async.AwaitTask |> Async.RunSynchronously
let content = result.Content.ReadAsStringAsync() |> Async.AwaitTask |> Async.RunSynchronously

which gives us the following documents back in the value array of the response:

[
	{
		"@search.score":1.0,
		"Id":"braveheart",
		"Name":"Braveheart",
		"Poster":"bravheart-poster.jpg",
		"Rating":99
	}
]
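
The same query can also be run through the SDK's SearchClient instead of a raw HttpClient. The sketch below is an alternative to the REST call above, not what the example used. buildOptions and searchMovieNames are names made up for illustration, and the results are read as SearchDocument values to avoid assuming anything about how the Movie record deserializes.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: Azure.Search.Documents"
open System
open Azure
open Azure.Search.Documents
open Azure.Search.Documents.Models

// Same parameters as the REST body: filter on Rating, search only in Name.
let buildOptions () =
    let options = SearchOptions(Filter = "Rating gt 50")
    options.SearchFields.Add("Name")
    options

// Queries the index with the query key and returns the matching movie names.
let searchMovieNames (uri: string) (queryKey: string) =
    let client = SearchClient(Uri(uri), "movie-index", AzureKeyCredential(queryKey))
    let response = client.Search<SearchDocument>("B*", buildOptions ())
    response.Value.GetResults()
    |> Seq.map (fun result -> result.Document.GetString("Name"))
    |> Seq.toList

// Usage: searchMovieNames uri queryKey
```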

Other functionality

The Azure Search SDKs offer plenty of other functionality, all of which can be managed from code. Some of the features I have used include:

  • Creating a custom search analyzer and an index analyzer
  • Defining custom tokenizers
  • Creating scoring profiles based on freshness, tags etc.
  • Synonym maps
  • Getting index statistics, such as the document count
  • Indexing more complex objects than the one used in this example
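
As a small sketch of two of these, the snippet below reads the index statistics and builds a synonym map. indexStats and movieSynonyms are illustrative names, and the synonym rules are invented. A searchable field would also have to list the map in its SynonymMapNames before the synonyms take effect.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: Azure.Search.Documents"
open Azure.Search.Documents.Indexes
open Azure.Search.Documents.Indexes.Models

// Reads the document count and the storage size (in bytes) of an index.
let indexStats (indexClient: SearchIndexClient) (indexName: string) =
    let stats = indexClient.GetIndexStatistics(indexName).Value
    stats.DocumentCount, stats.StorageSize

// Builds a synonym map in the Solr format: one rule per line,
// equivalent terms separated by commas. Name and rules are made up.
let movieSynonyms () =
    SynonymMap("movie-synonyms", "film, movie\nkids, children")

// Usage, with the indexClient from earlier:
// indexClient.CreateOrUpdateSynonymMap(movieSynonyms ()) |> ignore
// let count, bytes = indexStats indexClient "movie-index"
```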

Gotchas

Azure Cognitive Search definitions are strongly typed, and existing fields can't be altered. To make a field filterable, or to remove the filterable setting from a field, the entire index needs to be recreated and reindexed. You can, however, add new fields without having to recreate the index.
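
A minimal sketch of such a rebuild, assuming it is acceptable to drop the index and upload every document again afterwards. buildIndex and recreateIndex are helper names introduced here for illustration.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: Azure.Search.Documents"
open Azure.Search.Documents.Indexes
open Azure.Search.Documents.Indexes.Models

// Builds an index definition from the attributes on a record or class.
let buildIndex<'T> (indexName: string) =
    SearchIndex(indexName, FieldBuilder().Build(typeof<'T>))

// Drops the index and creates it again from the current schema.
// All documents in the index are lost and must be reindexed.
let recreateIndex<'T> (indexClient: SearchIndexClient) (indexName: string) =
    indexClient.DeleteIndex(indexName) |> ignore
    indexClient.CreateIndex(buildIndex<'T> indexName) |> ignore

// Usage, with the Movie record and indexClient from earlier:
// recreateIndex<Movie> indexClient "movie-index"
```

When only new fields are added, the CreateOrUpdateIndex call shown earlier should be enough, and the existing documents stay in place.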

Scaling your instance is not a quick operation, and you can expect some downtime while the scaling is being applied.

Azure Cognitive Search can get quite pricey if you scale it up. Assess the needs of your system before you decide on scale settings.

References: