A customer asked what would be recommended to allow searching across their microservices.

The decision about using a search index spans two axes: how the index is updated (push or pull), and how many indexes are needed (one central index, or many local ones).

tl;dr

Querying service-specific catalogs seems to be the winning strategy (see illustration below), unless the system is built using event sourcing, in which case creating the catalog from the event bus might work.

Updating the index

The two choices are pull (the search index pulls the data it needs on a regular basis) or push (the search index receives updates when they happen).

Pull

The search index pulls the data from data-sources itself.

The main advantage of that approach is that it can leverage pre-built indexers¹ that might be available in the search technology being used, which reduces the cost of initial setup.

The main issues are: a) it creates coupling between the index and the data source because of the indexer (this in itself might not be bad, as long as the search index is part of the service); b) there will be a delay between data creation and searchability; c) it might affect the performance of the data source itself, which needs to be tested.
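The pull approach can be sketched as a scheduled job that reads whatever changed since its last run. This is a minimal illustration, not how any particular indexer is implemented: `Record`, `DataSource`, `PullIndexer` and the `updated_at` change stamp are all hypothetical names, and the "index" is just a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    id: str
    text: str
    updated_at: int  # hypothetical change stamp that grows with every write


@dataclass
class DataSource:
    """Stand-in for a service's database (hypothetical)."""
    rows: dict[str, Record] = field(default_factory=dict)

    def changed_since(self, stamp: int) -> list[Record]:
        # Return rows modified after `stamp`, oldest change first.
        return sorted(
            (r for r in self.rows.values() if r.updated_at > stamp),
            key=lambda r: r.updated_at,
        )


@dataclass
class PullIndexer:
    source: DataSource
    index: dict[str, str] = field(default_factory=dict)
    high_water_mark: int = 0  # stamp of the last change already indexed

    def run_once(self) -> int:
        """Pull every row changed since the last run; return how many were indexed."""
        changed = self.source.changed_since(self.high_water_mark)
        for record in changed:
            self.index[record.id] = record.text
            self.high_water_mark = record.updated_at
        return len(changed)
```

A scheduler would call `run_once()` every few minutes. A record written between two runs is not searchable until the next run (issue b), and every run issues a query against the data source (issue c). The indexer also has to know the shape of the source's rows, which is the coupling described in issue a.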

Push

Services update the search index on modification. This can be done either by having the apps update the index directly, or through subscribing to CRUD events in an event bus.

The main advantage is that data in the catalog is always up to date. It also reduces the coupling between the search index and the data source, since it shifts the responsibility to update the index to the service itself.

The main issue is that it is more work to set up². Each service that needs to be indexed will have to be changed to either update the index itself or send events to an event bus that the search index listens to. Any modification to the “CRUD” logic of the service will also require maintaining the index update logic that goes with it.
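As a rough sketch of the first push variant (the app updates the index directly), the service's write path calls the index itself. `SearchIndex`, `InMemoryIndex` and `OrderService` are hypothetical names used only for illustration; a real index client would replace the in-memory one.

```python
from dataclasses import dataclass, field
from typing import Protocol


class SearchIndex(Protocol):
    """Hypothetical minimal contract the service expects from its index."""
    def upsert(self, doc_id: str, text: str) -> None: ...
    def delete(self, doc_id: str) -> None: ...


@dataclass
class InMemoryIndex:
    """Toy index used in place of a real search technology."""
    docs: dict[str, str] = field(default_factory=dict)

    def upsert(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = text

    def delete(self, doc_id: str) -> None:
        self.docs.pop(doc_id, None)


@dataclass
class OrderService:
    index: SearchIndex
    orders: dict[str, str] = field(default_factory=dict)  # stand-in for the database

    def save_order(self, order_id: str, description: str) -> None:
        self.orders[order_id] = description
        # Every write path has to remember this call: the maintenance cost of push.
        self.index.upsert(order_id, description)

    def delete_order(self, order_id: str) -> None:
        self.orders.pop(order_id, None)
        self.index.delete(order_id)
```

The index is current as soon as `save_order` returns, and the service decides what it exposes for search. In a real system the two writes are not atomic, so a failed index call would need some form of retry.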

Central index, or local indexes

The choice is between one central catalog indexing all services’ data, or one index per service.

One central index

Create one search index that indexes all the services somehow (with either of the solutions expressed above).

The main positive effect of this is the simplicity of querying the index: send a query to the search catalog, everything is in it, so returning the response is simple.

It has several problems though:

It creates coupling between services: every change in a service’s data source may require a change to the indexer query. This breaks the single responsibility principle (one reason to change): here the search service inherits the reasons to change of all the services it indexes.
That also creates a problem with assigning responsibility for the search service itself: who is responsible for updating the indexer? Is it the team updating the service, or is it the team responsible for the search service?
It is complex to index, in the same catalog, various data sources that might be structured differently and have different granularity.

Using a single central repository with the data-source indexing capabilities of the tool creates the following architecture:

This violates the microservice principle that a data source should only be accessed by its own service, thus creating coupling and making change more difficult. Since the way the data is stored is specific to the service, the way it is indexed should also be specific to it.

Which records should be returned? Which are archived, active? Which are for customer A, B or C, which should be deleted in case of a GDPR request? This is defined in the close relationship between the service and the data-source. Having the data source accessed by an external service is a risk in that sense.

Finally, the subtle logic of the model (how row-level security (RLS) is set, how data is filtered, how records are marked as deleted, archived or invisible) is also defined in the close relationship between the application and its data source. When other systems access the data source, developers may become reluctant to change either side for fear of breaking the other system, which makes it an impediment to change.

→ ✗ That solution does not seem to be good.

There is a case where having a centralized search index seems reasonable though:

Apps publish an event to an event bus on every CRUD operation, and the search service listens to all these events and updates its catalog accordingly.

→ ✓ This seems reasonable for the following reasons:

The coupling it creates is based on a common abstraction³. By construction, event logs require some standardization of the events they receive, and search services tend to be flexible with their index model. The services therefore send events that contain standardized information, creating little coupling on their side, and the search service listens to a relatively standardized event model, creating little coupling on its side.
The single responsibility principle is largely respected: updates to the service remain behind its APIs (events and REST API), and the events it publishes are what the service wants the world to see. The team responsible for the search service is responsible for parsing events from the event bus; the team responsible for the indexed service is responsible for sending them.

However, the prerequisite is a functioning event bus, which is an entire topic in itself. The services and the search service might also get out of sync, in which case a full re-indexing might be required; this may introduce complexity.
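The event-driven central index can be sketched as follows. Everything here is an assumption made for illustration: the `EntityEvent` envelope and its fields, the in-process `EventBus` standing in for a real broker, and the per-entity `version` counter used to discard duplicate or out-of-order deliveries.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class EntityEvent:
    """Hypothetical standardized envelope shared by all publishing services."""
    service: str    # e.g. "orders"
    entity_id: str
    action: str     # "created", "updated" or "deleted"
    version: int    # per-entity counter starting at 1
    text: str = ""  # searchable content chosen by the publishing service


class EventBus:
    """In-process stand-in for a real message broker."""

    def __init__(self) -> None:
        self._subscribers: list[Callable[[EntityEvent], None]] = []

    def subscribe(self, handler: Callable[[EntityEvent], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: EntityEvent) -> None:
        for handler in self._subscribers:
            handler(event)


@dataclass
class CentralSearchService:
    docs: dict[tuple[str, str], str] = field(default_factory=dict)
    versions: dict[tuple[str, str], int] = field(default_factory=dict)

    def handle(self, event: EntityEvent) -> None:
        key = (event.service, event.entity_id)
        if event.version <= self.versions.get(key, 0):
            return  # duplicate or stale event: ignore it
        self.versions[key] = event.version
        if event.action == "deleted":
            self.docs.pop(key, None)
        else:
            self.docs[key] = event.text

    def search(self, term: str) -> list[tuple[str, str]]:
        """Return (service, entity_id) keys whose text contains the term."""
        needle = term.lower()
        return sorted(k for k, text in self.docs.items() if needle in text.lower())
```

The search service only depends on the event envelope, never on a service's database. If events are lost, the catalog silently drifts from the services, which is where the full re-indexing mentioned above comes in.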

De-centralized indexes

Create one search index per service⁴, each exposing a common interface, and have a central search service query all these indexes.

On a purely philosophical level, it seems to fit with microservices and DDD: searching globally is a function of the union of searching for orders, customers and inventory. Searching for orders is a function of the order domain, so having it in the order service is natural; and having the global search as a service which composes the other services also seems natural.

To reduce complexity, a common search API can be defined that searchable services need to implement. Standards such as OData or GraphQL can be used.

This natural logic manifests itself in several positive aspects:

Customized search per service: the search logic (ordering, filtering, etc.) is specific to each service, and advanced features such as PDF parsing can be turned on for only the relevant services.
If they implement the common search interface, some services might not require a specific search catalog. An order service, for example, could be queried using simple SQL full-text search. This makes things simpler in some cases and gives the fastest time-to-search.
It places the search logic within the service itself, which is consistent with the “single responsibility principle”. The team responsible for changing the service is also responsible for making it searchable. This also means that the built-in indexers can be used, since the index is then part of the service.
The user might not want to search globally. This solution makes services locally searchable, which is adding to the features of the service itself. If a user is on the patient information service’s UI, searching for patients, having drug information, maintenance orders, and standard operating procedure manuals returned might not be helpful. In this case, hitting the patient information service’s search API might be more useful.

The main negative impacts:

The search service logic might not be simple – search offers some standard features such as ordering logic, pagination, and faceting that might be hard to recreate using this composite pattern. This will require a fair bit of caching, and caching is hard.
Defining a common search API or language can be against the interface segregation principle. It might also not be easy to define and build.
Search data is scattered, so any form of advanced query (joining search indexes) might be more complex, if this becomes a requirement.
Each service will need to implement search², so the initial implementation time might be longer than with a single search service.
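The composite pattern and its pagination cost can be sketched as below. The `Searchable` interface, the `Hit` shape and the toy `KeywordService` are hypothetical; the sketch also assumes scores are comparable across services, which is rarely free in practice.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Hit:
    service: str
    doc_id: str
    score: float  # assumed comparable across services


class Searchable(Protocol):
    """Hypothetical common search interface each service would expose."""
    def search(self, query: str, limit: int) -> list[Hit]: ...


@dataclass
class KeywordService:
    """Toy service: scores a document by how often the query term occurs."""
    name: str
    docs: dict[str, str]

    def search(self, query: str, limit: int) -> list[Hit]:
        term = query.lower()
        hits = [
            Hit(self.name, doc_id, float(text.lower().split().count(term)))
            for doc_id, text in self.docs.items()
        ]
        hits = [h for h in hits if h.score > 0]
        hits.sort(key=lambda h: (-h.score, h.doc_id))
        return hits[:limit]


def global_search(
    services: list[Searchable], query: str, page: int, page_size: int
) -> list[Hit]:
    """Return one page of merged results (pages are numbered from 0)."""
    # Any single service might own the whole page, so each one is asked for
    # every hit up to the end of the requested page before merging.
    needed = (page + 1) * page_size
    merged = [hit for svc in services for hit in svc.search(query, needed)]
    merged.sort(key=lambda h: (-h.score, h.service, h.doc_id))
    return merged[page * page_size : needed]
```

Each service can also be queried on its own through the same interface, which covers the local search case. The deeper the requested page, the more hits every service has to return before the merge, which is one reason caching tends to become necessary; facets would need a similar merge step.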

The decentralized approach also allows each service to choose its own indexing strategy, for example:

Service 1 handles non-transactional data. The search pattern is varied, such as “one of a kind” or “one that fits”, and it can be faceted to refine the search (e.g. product catalog, warehouse information, supplier catalog, etc.). It is ok if it is 5 minutes behind. We decide to use the pre-built search indexer.

Service 2 handles transactional data. The search pattern is the same as for service 1, but it absolutely requires information that is up to date (e.g. device information, booking system, inventory management). We decide to have the app update the index in real time.

Service 3 handles transactional data that is mostly textual in its nature. It is typically queried to find a specific record (e.g. customer information, address book, order management). Information should be updated in near-real time. We decide not to create a separate search index and use SQL full-text search instead.

→ ✓ This seems to be a commendable approach.

Notes

¹ The indexers available in Azure Search are described there.
² “There ain’t no such thing as a free lunch.”
³ This is an instance of the dependency inversion principle, much in the same way that we reduce coupling in code using interfaces and dependency injection.
⁴ Note that Azure Search services cost money per instance. It might be decided to have independent indexes and still deploy all of them within the same search service: this is a logical boundary, and how it is deployed is an implementation detail.