Diagnosing enterprise search
When I do employee focus groups for clients on their digital workplaces, it takes around three minutes before somebody complains how awful the search is. Even if that isn’t what I asked about. Everyone grumbles and empathises, and the conclusion is invariably ‘it should just work like Google’.
Trust me, it will never work like Google. For an intranet or digital workplace manager, it is tempting to blame the search engine or feel it is something for IT to solve. Trust me, that will never work either.
What I want to share here is a diagnostic tool that breaks down the underlying causes of search failure, and to point out the many elements that intranet managers, content owners, knowledge managers, and even IT professionals can improve without changing the search engine. New research by Cleverly and Burnett attributes 62% of enterprise search dissatisfaction to non-technical factors: information quality and search literacy.
I don’t want to give the impression that you shouldn’t pay attention to the search engine too, but I know that for many organisations search expertise can be hard to find, so people end up doing nothing.
Searching step by step
Being more precise, we should talk about findability rather than search. This is because search is often a combination of searching and browsing. For example, a user might navigate to the HR section and then do a search just within that sub-site, or search for ‘policies’ and then navigate to ‘HR policies’ in a policies centre.
A simple decomposition of enterprise search.
We can break down the search process into 4 basic steps:
- Content is published
- The search engine indexes it
- A query retrieves a selection from the content
- The user uses the results to complete their search
This greatly simplifies what really happens, but from a diagnostic point of view it gives us four useful starting points for things that might go wrong.
Using the tool
For each step in the process, there are things that need to go right, such as metadata, security settings and results presentation (see column 3 in Figure 2) and then underlying symptoms (the last 2 columns). Note all the ones that aren’t coloured green (i.e. not primarily a technical issue)!
It’s not practical to go through the diagnostic for all the content in your digital workplace. Instead what I suggest is that when you get feedback that “search isn’t working”, use the tool to check for systemic issues that might broadly apply to sets of content.
Often, I see employee satisfaction surveys that rate search poorly, and I use focus groups to dig deeper into what’s happening: “Can you remember a recent time you tried to search for something? How did you search? Did it exist at all?”
An enterprise search diagnostic.
1. Failures of content
It sounds obvious, but often the big issue in digital workplace search is that the thing somebody is searching for just doesn’t exist [1.1]. On the web, somebody, somewhere probably has put the answer there, but in the enterprise, this isn’t necessarily true. So if something is asked a lot, the solution might just be to get someone to write the answer (there’s a diagnostic tool for that too: Clear Knowledge Management Roadblocks).
Metadata [1.2] can often be poor or lacking. Just using good writing principles for headlines and subheads can help, as can clear filenames (if you ever shared a document called “Proposal draft” or “Announcement” then I’m looking at you).
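As a rough illustration of the filename point, here is a minimal Python sketch that flags filenames made up entirely of generic words. The `GENERIC_WORDS` list and the `is_vague_filename` function are my own assumptions for the example, not part of any search product, and you would want to tune the word list to your own organisation.

```python
import os
import re

# Hypothetical list of words that say nothing about a document's subject.
GENERIC_WORDS = {"proposal", "draft", "announcement", "document", "final", "copy", "new"}


def is_vague_filename(filename: str) -> bool:
    """True if every word in the filename (ignoring the extension) is generic."""
    stem = os.path.splitext(filename)[0]
    words = re.findall(r"[a-z]+", stem.lower())  # digits and punctuation are ignored
    return bool(words) and all(word in GENERIC_WORDS for word in words)


for name in ["Proposal draft.docx", "Announcement.docx", "2016 Spain sales proposal.docx"]:
    print(name, "->", "vague" if is_vague_filename(name) else "ok")
```

A check like this could be run over a file listing to find candidates for renaming, although a human still needs to decide what the better name should be.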
Language [1.3] can also present a barrier. A technical document may be written in jargon (‘variable performance related pay’) when a user searches in plain English (‘bonus’). Even harder, we may expect everything to be in our language and overlook other languages (‘2016 sales results for Spain’ wouldn’t necessarily find a document called ‘Resultados de ventas de Espana 2016’).
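One common way to bridge the jargon gap is a synonym list that expands a plain-English query into the terms your documents actually use. The Python sketch below shows the idea only: the `SYNONYMS` table, the 'holiday' entry and both functions are hypothetical, and real search engines have their own ways of configuring synonyms.

```python
# Hypothetical synonym table: plain-English word -> organisational jargon.
SYNONYMS = {
    "bonus": ["variable performance related pay"],
    "holiday": ["annual leave"],
}


def expand_query(query: str) -> list[str]:
    """Return the original query plus any jargon equivalents of its words."""
    terms = [query.lower()]
    for word in query.lower().split():
        terms.extend(SYNONYMS.get(word, []))
    return terms


def matches(document_text: str, query: str) -> bool:
    """True if the query, or any synonym of its words, appears in the text."""
    text = document_text.lower()
    return any(term in text for term in expand_query(query))


doc = "Guidance on variable performance related pay for 2016"
print(matches(doc, "bonus"))  # True: found via the synonym, not the word itself
```

The same approach does not solve the multilingual case by itself; that usually needs translated metadata or language-aware indexing.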
2. Indexing failures
Search retrieval works so quickly because a crawler creates an index first, and your query is actually run against the index. So the first failure point here [2.1] is that the content needed isn’t indexed. Unlike the web, a great deal of enterprise content might have security controls in place, blocking the indexer from seeing it.
More fundamentally, it may exist in a system that the crawler can’t access, such as a network drive or an application. I sometimes see HR departments move all their guidelines into an employee self-service system, but if there is no connector with the enterprise search engine then routine content like ‘Parental leave policy’ won’t get indexed. Nor will all those documents in Dropbox if it’s only shadow IT.
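To make the indexing failure concrete, here is a toy Python sketch of an index built by a crawler, where one document is skipped because the crawler cannot read it. The document ids, the `crawler_can_read` flag and both functions are assumptions for illustration; a real crawler and index are far more sophisticated.

```python
from collections import defaultdict

# Hypothetical content. `crawler_can_read` stands in for security settings
# or a missing connector to the system that holds the document.
DOCUMENTS = [
    {"id": "intranet/expenses", "text": "How to claim travel expenses", "crawler_can_read": True},
    {"id": "hr-system/parental-leave", "text": "Parental leave policy", "crawler_can_read": False},
]


def build_index(documents):
    """Map each word to the set of ids of documents that contain it."""
    index = defaultdict(set)
    for doc in documents:
        if not doc["crawler_can_read"]:
            continue  # the crawler never sees this content, so it is never indexed
        for word in doc["text"].lower().split():
            index[word].add(doc["id"])
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    word_sets = [index.get(word, set()) for word in query.lower().split()]
    return set.intersection(*word_sets) if word_sets else set()


index = build_index(DOCUMENTS)
print(search(index, "travel expenses"))  # {'intranet/expenses'}
print(search(index, "parental leave"))   # set(): the policy exists but was never indexed
```

The point is that the query only ever runs against the index, so content the crawler cannot reach is invisible however good the query is.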
Next we need to consider the index itself [2.2]. This is definitely in the technical realm, but check that document content is indexed and not just the title. You may also need to define words that are specifically meaningful to your organisation. For example, if you have a product called ‘Teams’, then the indexer needs to know it is more significant than casual usages of ‘teams’.
3. Retrieval failures
Largely we rely on the search engine technology to get this right [3.1], and do all the good stuff like sensible ranking and knowing that ‘bicycle’ and ‘bike’ are the same. Martin White has a useful summary of 10 options for enhancing search engines.
However, too many results can be a symptom of duplicate content or ROT (Redundant, Outdated, Trivial), meaning a clean-up is in order. It may also mean we don’t have good refiners, to whittle down results to the last six months, or only show sales collateral (see Metadata [1.2]).
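If you suspect duplicates, a simple first pass is to group documents whose text is identical once case and spacing are ignored. The Python sketch below assumes you can export document text into a dictionary; the file paths and function names are hypothetical, and it will only catch exact copies, not near-duplicates or outdated versions.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash of the text with case and whitespace normalised."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def find_duplicates(documents: dict[str, str]) -> list[list[str]]:
    """Group document ids whose normalised text is identical."""
    groups = defaultdict(list)
    for doc_id, text in documents.items():
        groups[fingerprint(text)].append(doc_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Hypothetical exported content: document id -> extracted text.
docs = {
    "sales/pitch-v1.docx": "Our product saves time.",
    "archive/pitch-copy.docx": "Our  product saves TIME.",
    "hr/leave-policy.docx": "Parental leave policy",
}
print(find_duplicates(docs))  # [['archive/pitch-copy.docx', 'sales/pitch-v1.docx']]
```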
Retrieval also relies on user search skills though. Google is so good we’ve got lazy. But enterprise search sometimes needs very good search skills, such as the use of logical operators (AND, OR, NOT). If that’s unrealistic, consider ready-made search interfaces.
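For anyone unsure what the logical operators actually do, they map neatly onto set operations. The Python sketch below uses a made-up mini-index (the words and document ids are assumptions for illustration); the exact operator syntax your search engine accepts may differ.

```python
# Hypothetical mini-index: each word maps to the ids of documents containing it.
INDEX = {
    "sales": {"doc1", "doc2", "doc3"},
    "spain": {"doc2", "doc3"},
    "draft": {"doc3"},
}


def docs_with(word: str) -> set[str]:
    """Ids of documents containing the word (empty set if there are none)."""
    return INDEX.get(word.lower(), set())


# sales AND spain: documents containing both words
sales_and_spain = docs_with("sales") & docs_with("spain")

# sales OR draft: documents containing either word
sales_or_draft = docs_with("sales") | docs_with("draft")

# sales AND spain NOT draft: drop anything that mentions 'draft'
final_only = sales_and_spain - docs_with("draft")

print(sorted(sales_and_spain))  # ['doc2', 'doc3']
print(sorted(sales_or_draft))   # ['doc1', 'doc2', 'doc3']
print(sorted(final_only))       # ['doc2']
```

AND narrows the results, OR widens them, and NOT removes unwanted matches, which is often the quickest way to cut through a long results list.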
4. Search results
Finally we get to the results page (I know, I know, Google gets there in about 0.47 seconds).
You’d think if the answer was on the page we’d be successful, but if you’ve ever done observational user testing you’ll know that sometimes people seem to fly straight past the answer and onto the phone.
So the layout of the results page matters [4.1], and the good news is you can often change it. Usually, the more like Google, the better, as this is what people have already learned.
Make it so that the format matches the results [4.2]: show images and videos as thumbnails, people as a contact card and, heck, even just show the answer itself rather than a link.
Hits on documents can make scanning of the results harder [4.4]. If the answer is on page 52 of a document, consider breaking it into HTML pages. If the document exists but isn’t shown, ask if the security settings on it are right [4.3].
Finally, users may find the right result, but carry on searching because they don’t trust it [4.5]. Governance and training can help here: make sure each page or document shows details such as its owner and expiry date. Ratings and feedback can help too.
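As a sketch of what a governance check might look like, the Python below flags content with no owner, no expiry date, or an expiry date that has passed. The field names (`owner`, `expires`) and the sample pages are assumptions; your content management system will have its own metadata names.

```python
from datetime import date


def trust_warnings(item: dict, today: date) -> list[str]:
    """List the reasons a user might not trust this piece of content."""
    warnings = []
    if not item.get("owner"):
        warnings.append("no owner")
    expires = item.get("expires")
    if expires is None:
        warnings.append("no expiry date")
    elif expires < today:
        warnings.append("past expiry date")
    return warnings


# Hypothetical page metadata.
pages = [
    {"title": "Parental leave policy", "owner": "HR", "expires": date(2019, 1, 1)},
    {"title": "Announcement", "owner": "", "expires": None},
]
today = date(2018, 6, 1)  # fixed date so the example is repeatable
for page in pages:
    print(page["title"], trust_warnings(page, today))
```

A report like this gives content owners a concrete list to work through, rather than a general plea to keep things up to date.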
This post was partly inspired by an old LinkedIn thread, which Paul Culmsee analysed in forensic detail on CleverWorkarounds.
My thanks to Martin White for commenting on an earlier version of the model. For a much more detailed analysis of how search works, I definitely recommend the Search Insights 2018 whitepaper.
I plan to keep refining this tool, so any comments or questions would be most welcome.