Diagnosing enterprise search


When I do employee focus groups for clients on their digital workplaces, it takes around three minutes before somebody complains about how awful the search is. Even if that isn’t what I asked about. Everyone grumbles and empathises, and the conclusion is invariably ‘it should just work like Google’.

Trust me, it will never work like Google. For an intranet or digital workplace manager, it is tempting to blame the search engine or feel it is something for IT to solve. Trust me, that will never work either.

What I want to share here is a diagnostic tool that breaks down the underlying causes of search failure and points out the many elements that intranet managers, content owners, knowledge managers, and even IT professionals can improve without changing the search engine. New research by Cleverly and Burnett attributes 62% of enterprise search dissatisfaction to non-technical factors: information quality and search literacy.

I don’t want to give the impression that you shouldn’t pay attention to the search engine too, but I know for many organisations search expertise can be hard to find, so people end up doing nothing.

Searching step by step

To be more precise, we should talk about findability rather than search. This is because search is often a combination of searching and browsing. For example, a user might navigate to the HR section and then do a search just within that sub-site, or search for ‘policies’ and then navigate to ‘HR policies’ in a policies centre.

Figure 1: A simple decomposition of enterprise search.

We can break down the search process into four basic steps:

  1. Content is published
  2. The search engine indexes it
  3. A query retrieves a selection from the content
  4. The user uses the query results to complete their search

This greatly simplifies what really happens, but from a diagnostic point of view it gives us four useful starting points for things that might go wrong.
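
To make these steps concrete, below is a minimal Python sketch of the whole pipeline using a toy inverted index. Every document and name in it is invented, and a real engine does vastly more at each step; treat it as a diagram in code rather than an implementation.

    from collections import defaultdict

    # Step 1: content is published (toy corpus, invented for illustration).
    documents = {
        "doc1": "Parental leave policy for all UK employees",
        "doc2": "2016 sales results for Spain",
    }

    # Step 2: the search engine builds an inverted index (term -> document ids).
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)

    # Step 3: a query retrieves a selection from the index, not the raw content.
    def search(query):
        terms = query.lower().split()
        results = set(index.get(terms[0], set()))
        for term in terms[1:]:
            results &= index.get(term, set())  # simple AND semantics
        return results

    # Step 4: the user reviews the results to complete their search.
    print(search("parental leave"))  # {'doc1'}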

Using the tool

For each step in the process, there are things that need to go right, such as metadata, security settings and results presentation (see column 3 in Figure 2), and then underlying symptoms (the last two columns). Note all the ones that aren’t coloured green (i.e. not primarily a technical issue)!

It’s not practical to go through the diagnostic for all the content in your digital workplace. Instead, what I suggest is that when you get feedback that “search isn’t working”, you use the tool to check for systemic issues that might broadly apply to sets of content.

Often, I see employee satisfaction surveys that rate search poorly, and I use focus groups to dig deeper into what’s happening: “Can you remember a recent time you tried to search for something? How did you search? Did it exist at all?”

Figure 2: An enterprise search diagnostic.

1. Failures of content

It sounds obvious, but often the big issue in digital workplace search is that the thing somebody is searching for just doesn’t exist [1.1]. On the web, somebody, somewhere has probably put the answer out there, but in the enterprise, this isn’t necessarily true. So if something is asked a lot, the solution might just be to get someone to write the answer (there’s a diagnostic tool for that too: Clear Knowledge Management Roadblocks).

Metadata [1.2] can often be poor or lacking. Just using good writing principles for headlines and subheads can help, as can clear filenames (if you’ve ever shared a document called “Proposal draft” or “Announcement” then I’m looking at you).

Language [1.3] can also present a barrier. A technical document may be written in jargon (‘variable performance related pay’) when a user searches in plain English (‘bonus’). Even harder, we may expect everything to be in our own language and overlook other languages (‘2016 sales results for Spain’ wouldn’t necessarily find a document called ‘Resultados de ventas de España 2016’).
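
One common mitigation is query-side synonym expansion, so that a plain-English search also retrieves the jargon equivalent. Here’s a small sketch; the synonym list is invented, and in practice this would live in a managed thesaurus or the search engine’s own synonym feature.

    # Hypothetical thesaurus mapping plain English to organisational jargon.
    SYNONYMS = {
        "bonus": ["variable performance related pay", "incentive payment"],
    }

    def expand_query(query):
        """Return the original query plus any jargon equivalents."""
        variants = [query]
        for plain_term, jargon_terms in SYNONYMS.items():
            if plain_term in query.lower():
                variants.extend(jargon_terms)
        return variants

    print(expand_query("bonus"))
    # ['bonus', 'variable performance related pay', 'incentive payment']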

2. Indexing failures

Search retrieval works so quickly because a crawler creates an index first, and your query is actually run against the index. So the first failure point here [2.1] is that the content needed isn’t indexed. Unlike the web, a great deal of enterprise content might have security controls in place, blocking the indexer from seeing it.

More fundamentally, it may exist in a system that the crawler can’t access, such as a network drive or an application. I sometimes see HR departments move all their guidelines into an employee self-service system, but if there is no connector with the enterprise search engine then routine content like ‘Parental leave policy’ won’t get indexed. Nor will all those documents in Dropbox if it’s only shadow IT.
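
As a rough illustration of both failure points, here’s a sketch of how content silently drops out at index time when a source has no connector, or when the crawler’s account can’t read a document. All of the sources and permissions below are made up.

    # Which systems the (hypothetical) enterprise crawler can reach.
    sources = {
        "intranet": {"connector": True},
        "hr_selfservice": {"connector": False},  # no connector: invisible
    }

    documents = [
        {"source": "intranet", "title": "Travel policy", "crawler_can_read": True},
        {"source": "intranet", "title": "Board minutes", "crawler_can_read": False},
        {"source": "hr_selfservice", "title": "Parental leave policy", "crawler_can_read": True},
    ]

    # Only content from connected sources that the crawler may read gets indexed.
    indexed = [
        d["title"] for d in documents
        if sources[d["source"]]["connector"] and d["crawler_can_read"]
    ]
    print(indexed)  # ['Travel policy'] - the other two never reach the index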

Next we need to consider the index itself [2.2]. This is definitely in the technical realm, but check that document content is indexed and not just the title. You may also need to define words that are specifically meaningful to your organisation. For example, if you have a product called ‘Teams’, then the indexer needs to know it is more significant than casual usages of ‘teams’.
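
The title-only failure mode is easy to demonstrate. In the invented example below, a term that appears only in the body is findable only when the body is indexed too.

    # An invented document: the plain-English term is in the body, not the title.
    doc = {
        "title": "Reward framework 2018",
        "body": "Details of variable performance related pay and bonus eligibility.",
    }

    def index_terms(document, include_body):
        text = document["title"] + (" " + document["body"] if include_body else "")
        return set(text.lower().replace(".", "").split())

    print("bonus" in index_terms(doc, include_body=False))  # False: query misses it
    print("bonus" in index_terms(doc, include_body=True))   # True: body is indexed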

3. Retrieval failures

Largely we rely on the search engine technology to get this right [3.1], and do all the good stuff like sensible ranking and knowing that ‘bicycle’ and ‘bike’ are the same. Martin White has a useful summary of 10 options for enhancing search engines.

However, too many results can be a symptom of duplicate content or ROT (Redundant, Outdated, Trivial), meaning a clean-up is in order. It may also mean we don’t have good refiners to whittle down results to the last six months, or to show only sales collateral (see Metadata [1.2]).
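
Refiners are essentially metadata filters applied to an existing result set, which is why they depend on metadata quality [1.2]. A sketch with invented documents and field names:

    from datetime import date, timedelta

    # Invented search results with the metadata a refiner would rely on.
    results = [
        {"title": "Q1 sales deck", "type": "sales collateral", "modified": date(2018, 4, 2)},
        {"title": "Old price list", "type": "sales collateral", "modified": date(2016, 1, 15)},
        {"title": "Travel policy", "type": "policy", "modified": date(2018, 3, 1)},
    ]

    def refine(items, content_type=None, newer_than=None):
        """Keep only results matching the selected refiners."""
        for item in items:
            if content_type and item["type"] != content_type:
                continue
            if newer_than and item["modified"] < newer_than:
                continue
            yield item

    six_months_ago = date(2018, 6, 21) - timedelta(days=182)
    for hit in refine(results, content_type="sales collateral", newer_than=six_months_ago):
        print(hit["title"])  # Q1 sales deck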

Retrieval also relies on user search skills though. Google is so good we’ve got lazy. But enterprise search sometimes needs very good search skills, such as the use of logical operators (AND, OR, NOT). If that’s unrealistic, consider ready-made search interfaces.
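
For those who do learn them, logical operators map directly onto set operations over the index, which the toy postings below illustrate (document ids are invented):

    # Toy postings lists: term -> set of matching document ids.
    postings = {
        "leave": {"doc1", "doc2", "doc3"},
        "parental": {"doc1"},
        "annual": {"doc2", "doc3"},
    }

    # 'parental AND leave': both terms must match (set intersection).
    print(postings["parental"] & postings["leave"])  # {'doc1'}

    # 'parental OR annual': either term matches (set union).
    print(postings["parental"] | postings["annual"])  # doc1, doc2, doc3 (any order)

    # 'leave NOT annual': exclude a term (set difference).
    print(postings["leave"] - postings["annual"])  # {'doc1'}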

4. Search results

Finally, we get to the results page (I know, I know, Google gets there in about 0.47 seconds).

You’d think if the answer was on the page we’d be successful, but if you’ve ever done observational user testing you’ll know that sometimes people seem to fly straight past the answer and onto the phone.

So the layout of the results page matters [4.1], and the good news is you can often change it. Usually, the more like Google, the better, as this is what people have already learned.

Make the format match the results [4.2]: show images and videos as thumbnails, people as a contact card and, heck, even just show the answer itself rather than a link.
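
One way to think about this is a presentation template per content type. The sketch below is purely illustrative: the types, fields and templates are placeholders, not any particular product’s API.

    # Hypothetical result templates, one per content type.
    TEMPLATES = {
        "image": "[thumbnail] {title}",
        "person": "[contact card] {title} - {phone}",
        "answer": "{answer_text}",  # show the answer itself, not a link
        "page": "{title} | {url}",  # default: classic title-plus-link result
    }

    def render(result):
        template = TEMPLATES.get(result["type"], TEMPLATES["page"])
        return template.format(**result)

    print(render({"type": "person", "title": "Sam Marshall", "phone": "x1234"}))
    print(render({"type": "answer", "answer_text": "Annual leave: 25 days."}))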

Hits on documents can make scanning the results harder [4.4]. If the answer is on page 52 of a document, consider breaking it into HTML pages. If the document exists but isn’t shown, ask if the security settings on it are right [4.3].

Finally, users may find the right result, but carry on searching because they don’t trust it [4.5]. Governance and training can help here – make sure it has things like owner and expiry details. Ratings and feedback can help too.
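
As a small illustration, a results page could flag items whose review date has passed, so users can judge how current a document is before trusting it. The field names here are hypothetical.

    from datetime import date

    def trust_label(result, today):
        """Turn governance metadata into a visible trust signal."""
        if result["review_by"] < today:
            return "Needs review - check with owner: " + result["owner"]
        return "Reviewed - owner: " + result["owner"]

    doc = {"title": "Expenses policy", "owner": "Finance team",
           "review_by": date(2017, 12, 31)}
    print(trust_label(doc, today=date(2018, 6, 21)))
    # Needs review - check with owner: Finance team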

Credits

This post was partly inspired by an old LinkedIn thread, which Paul Culmsee analysed in forensic detail on CleverWorkarounds.

My thanks to Martin White for commenting on an earlier version of the model. For a much more detailed analysis of how search works, I definitely recommend the Search Insights 2018 whitepaper.

I plan to keep refining this tool, so any comments or questions would be most welcome.

Icons designed by smashicons from flaticon.

A version of this article was first published at CMSWire.

Sam Marshall

I'm the director of ClearBox Consulting, advising on intranet and digital workplace strategy, SharePoint and online collaboration. I've specialised in intranets and knowledge management for over 19 years, working with organisations such as Unilever, AstraZeneca, Akzo Nobel, Sony, Rio Tinto and Diageo. I was responsible for Unilever’s Global Portal Implementation, overseeing the roll-out of over 700 online communities to 90,000 people and consolidating several thousand intranets into a single system.

4 Comments
  • Ksenia Cheinman
    Posted at 7:49 pm, 21 June, 2018

    Thank you for the simple and excellent diagram that so effectively illustrates how poor search results often hinge on issues other than technical ones. I would like to add to your point, however: 2.2 index quality is not fully a technical issue. There is another component missing – and that is content strategy around search. Identifying what should be indexed and to what extent, and de-indexed for that matter, are very important elements in informing the index quality. They are part of understanding clients’ content and user needs, as well as having done a thorough content audit of what information is already present on the subject.

  • Posted at 4:31 pm, 10 December, 2018

    Great article and tool Sam, good to see you backing up Martin and others by bringing this to people’s attention. As I am currently leading the project to replace the GSA on the intranet in my organization, I can report that terrible metadata is a real problem under the “content quality” bucket, and wonder if, as information management professionals, we moved too far away from being metadata specialists, thinking the tools would solve the problems for us?
