Friday, August 3, 2007

Enterprise Search Done Right

While working on a knowledge management assessment for a leading energy research company, I learnt how important search can be for a business and how badly a company can go wrong in implementing a search solution. The key to success in research and consulting organizations is an effective knowledge management practice, and search plays an important role in it. Enterprise search has a direct impact on the business and on employee productivity in these types of organizations.

While I was there, I looked at their infrastructure and business processes and tried to assess the maturity of their knowledge management program. The results were appalling. They did not have a knowledge management program. They did not have an enterprise content and collaboration management system. They did not have internal search. Their external search, despite offering two search options, was redundant. They did not understand the significance of search for their employees or their business in terms of productivity and knowledge access. The researchers obtained information from various knowledge databases and repositories through a library service. There were no processes in place to measure the effectiveness and efficiency of this library service. The researchers and scientists were heavily dependent on Google and Yahoo for information related to their work.

This is not the only story of how organizations can go wrong in understanding user needs and end up implementing ineffective search solutions. I have browsed many organizations' websites that offer great products and services but fail to provide effective search for their users. I do and do not blame them. So much work has gone into external search engines like Google and Yahoo that end users' expectations have gone up for all kinds of search. When they use an organization's site search, they expect the same user experience and relevance they get from external web search engines. I do not mean to say that there aren't enterprise search engines that provide relevance and experience similar to external search engines. Search products from Fast, Verity and Autonomy are leaders in enterprise search.

If I had to suggest a tool to solve the research organization's search problem, I would first try to understand their business and user needs. Then I would list the features that I need in the product. Let's drill down on the requirements:
1. Ability to search all my repositories, including websites, file systems, databases and enterprise applications like SharePoint, Documentum etc. The researchers would be able to access information including past work done as well as knowledge acquired from various sources.
2. Ability to search external corporate and academic libraries, journals, feeds etc. This is also called federated search. Researchers would not need to go to various search engines to get information.
3. Ability to classify my documents and content into easy-to-browse categories. This would help me drill down on the information by category rather than through pages of 10 results. Hardly anyone browses beyond a couple of pages of search results.
4. Web administrative interface with advanced linguistic capabilities including metadata, synonyms, antonyms, relative weighting for text fields and stemming.
5. Ability to perform secure searches.
6. Ability to scale and perform.

The rest of the requirements are generic, like relevance, supported formats, summary extraction, metadata indexing and crawling configuration. If a tool provides a solution to all of these requirements, it is certainly a candidate for my approval.
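To make the federated search requirement a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how any particular product works: the two source functions, their URLs and their scores are hypothetical stand-ins for real connectors. The idea is to send one query to every source in parallel, rescale each source's scores so they are comparable, drop duplicate URLs and return a single ranked list.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    title: str
    url: str
    score: float      # raw relevance; every source uses its own scale
    source: str = ""  # filled in when results are merged


def _matches(query: str, hits: List[Hit]) -> List[Hit]:
    """Toy matcher: keep hits whose title contains any query word."""
    words = query.lower().split()
    return [h for h in hits if any(w in h.title.lower() for w in words)]


# Hypothetical connectors; real ones would call a crawler index or a library API.
def search_intranet(query: str) -> List[Hit]:
    return _matches(query, [
        Hit("Turbine efficiency report", "http://intranet.example/turbine", 8.0),
        Hit("Wind farm siting notes", "http://intranet.example/wind", 2.0),
    ])


def search_library(query: str) -> List[Hit]:
    return _matches(query, [
        Hit("Journal: turbine blade fatigue", "http://library.example/fatigue", 0.9),
        Hit("Turbine efficiency report", "http://intranet.example/turbine", 0.3),
    ])


SOURCES: Dict[str, Callable[[str], List[Hit]]] = {
    "intranet": search_intranet,
    "library": search_library,
}


def federated_search(query: str,
                     sources: Dict[str, Callable[[str], List[Hit]]]) -> List[Hit]:
    # Fan the query out to every source in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in sources.items()}

    merged: Dict[str, Hit] = {}
    for name, future in futures.items():
        hits = future.result()
        top = max((h.score for h in hits), default=0.0)
        for h in hits:
            # Rescale to 0..1 per source so different scales can be compared.
            norm = h.score / top if top > 0 else 0.0
            candidate = Hit(h.title, h.url, norm, name)
            best = merged.get(h.url)
            # Deduplicate by URL, keeping the best-scoring copy.
            if best is None or candidate.score > best.score:
                merged[h.url] = candidate

    # Highest score first; URL breaks ties so the order is stable.
    return sorted(merged.values(), key=lambda h: (-h.score, h.url))


if __name__ == "__main__":
    for hit in federated_search("turbine", SOURCES):
        print(f"{hit.score:.2f}  [{hit.source}]  {hit.title}")
```

Dividing by each source's top score is the simplest possible normalization and treats every source as equally trustworthy. A real deployment would probably weight sources differently and would also need to apply the secure search requirement before showing any result.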

When I did the research, I found the Vivísimo Velocity Search Engine to be one of the most powerful enterprise search products: it satisfies my set of requirements and has a great relevance algorithm, comparable to Google/Yahoo. There were others that came close, but not close enough. They either did not satisfy all the requirements or were not packaged as a single solution.

Vivísimo, a difficult word to pronounce, provides an innovative search solution that offers not only search relevance comparable to the external web search engines but also a similar user experience. You can try their public web search at http://vivisimo.com/ to get a glimpse. Their clustering solution is also popularly known as Clusty.

Integral components of the Vivísimo Velocity Search Platform:
  • Velocity Search Engine
  • Velocity Content Integrator
  • Velocity Clustering Engine
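
For readers who have not seen result clustering before, the toy Python sketch below shows the general idea of grouping search results under a shared label. It is my own simplified illustration and makes no claim about how the Velocity Clustering Engine works internally: the stopword list, the function names and the "other" bucket are all assumptions. Each result title goes into the cluster named after the most common non-query word it shares with at least one other title.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, List, Set

# Small illustrative stopword list; a real engine would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "for", "in", "and", "to", "on", "with"}


def tokens(text: str) -> Set[str]:
    """Lowercase words in the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def cluster_results(titles: List[str], query: str,
                    min_size: int = 2) -> Dict[str, List[str]]:
    query_terms = tokens(query)

    # Count in how many titles each candidate label word appears.
    doc_freq: Counter = Counter()
    for title in titles:
        doc_freq.update(tokens(title) - query_terms)

    clusters: Dict[str, List[str]] = defaultdict(list)
    for title in titles:
        shared = [w for w in tokens(title) - query_terms
                  if doc_freq[w] >= min_size]
        # Most frequent shared word wins; alphabetical order breaks ties.
        label = min(shared, key=lambda w: (-doc_freq[w], w)) if shared else "other"
        clusters[label].append(title)
    return dict(clusters)


if __name__ == "__main__":
    sample = [
        "Solar panel efficiency",
        "Solar cell materials",
        "Wind turbine noise",
        "Wind farm siting",
        "Energy policy overview",
    ]
    for label, members in cluster_results(sample, "energy").items():
        print(label, "->", members)
```

With the sample titles and the query "energy", the titles fall into a "solar" group, a "wind" group and a leftover "other" group, which is the kind of drill-down by category described in requirement 3.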

The external web search of Vivísimo looks impressive, and I just hope their enterprise search is as promising as their web search. I think anyone looking for an enterprise search engine should take a look at their offering.

I have done a lot of research on federated search and clustering technologies, both commercial and open source. I will write about these technologies in coming posts.

5 comments:

Anonymous said...

Numerous usability studies have shown that the Vivisimo clustering interface and underlying technology are confusing to users and actually hurt task completion by acting as a distraction. Better to go with a powerful enterprise search solution like the Google Search Appliance or Google Mini.

Dave said...

Interesting post, Ravi. Although I like the Google Search Appliance, and I've confessed to not seeing the point of clustering, I think it's bad form for Anonymous to post an accusation like that without providing references to the alleged usability studies.

I'm looking forward to hearing more about your research.

Dan Keldsen said...

This was going to be a quick comment, and turned into a lengthier commentary - perhaps rant. If I'd had more time, this letter would be shorter, as the saying goes!

Summary:
Clustering is neither good nor bad independent of context of use.

Google's enterprise search solutions are tough to impossible to classify as a "powerful enterprise search solution." What does that mean to the anonymous commenter?

Google does many things well, and has tremendous resources, but they have not addressed the enterprise seriously... yet.

Why hasn't Google's infamous innovation more seriously targeted the enterprise? Don't know - and that frustrates me.

The full commentary:
Ditto on Dave's comment... "numerous usability studies" is a hard statement to just let lie. What studies? Done by whom? In what context? On what content? What type of users? I've spent a number of years orbiting the usability world, and belong to the Usability Professionals Association. Context is hugely important, and task analysis is a useful tool, but does not predict the overall experience of a tool like Vivisimo's - which would be embedded in a portal, website, or other "enterprise system." If your content is quite static, and you know the information is in there, just not what folder/structure it's in, then perhaps clustering is the wrong way to get at that information, but again, depends on use and need.

An easy and early ding against Vivisimo was when all they provided (or primarily provided) was their clustering capability. This has long since changed, and clustering is certainly not the magic bullet in findability concerns, but again, that is only ONE method they can use to display information results.

And "Better to go with a powerful enterprise search solution like the Google..." - sorry, but what exactly makes Google a "powerful enterprise search solution?"

Clearly Google as a whole understands scalability, or they would not have the ability to return results across billions of documents on the web. However, the enterprise, believe it or not, is a messier universe than the web. Scalability *within* organizations, which are all using different platforms, connecting to different systems, and NOT in a pure web environment, is a completely different universe.

Based on the research, analysis and consulting I've done in search, taxonomy, discovery, navigation, information architecture, and beyond over the last 7+ years - I have found that nearly all of the organizations that I have dealt with, who have bought the Google Search Appliance, have experienced buyers' remorse. Many to the point that, just as they booted their previous solution (likely an ancient install of Verity), Google too, sees the exit door, very quickly.

Common gripes?
(Feel free to correct any of this if my information is not 100% up to date here) Unchangeable relevancy ranking, lack of taxonomy (static classification) ability or integration, no clustering, no semantic/linguistic/concept understanding, no entity extraction... Google does not do much of what would make it a realistic solution for the specialized findability needs of the enterprise.

Buyers' remorse may take 30 days, it may take a year or more, but while no other solutions are perfect either, Google is far from free (another misnomer), and clearly not THE solution for enterprise search.

I am happy to see Google provide some goosing of the more established players to wake them up and make sure they aren't disrupted out of business (aka Clayton Christensen's research), but buyers and suppliers alike need to keep their eyes open beyond Google for the near-term timeline. If the first and ONLY solution considered is the Google Search Appliance offerings, I would seriously question that investment. If anyone's needs are met by these solutions, fantastic, and it's certainly possible, but to short-circuit and assume that all search/findability problems point to Google, that's very dangerous thinking.

The Google solutions are (maybe) at v1.0 in the enterprise. There is a long road ahead, and they are not innovating nearly as quickly in this arena as they are on their consumer/individually focused offerings.

In theory, Google could completely own this space, given their resources. But they don't now, and don't appear to have any interest in doing so, which has certainly frustrated me as an analyst/consultant in this space - they're only partly serious about this slice of the business. Don't ignore them, but don't bet the future solely on them either.

And like just about everyone, I use Google services all day long, they're not perfect elsewhere either, but in far better shape than the enterprise search world.

Anonymous said...

An anonymous posting about uncited usability studies?

Hmmm ...

Try this U. Maryland study by Prof. Ben Bederson:
http://hcil.cs.umd.edu/trs/2003-36/2003-36.pdf

Second sentence of the abstract: "These studies have resulted in confirmed efficiency and preference of clustering over sequential lists."

So there.

Ravi Govil said...

I would appreciate it if the anonymous person who posted the comment dated 08/05/2007 ("Numerous usability studies have shown..") would come forward and justify the claims. I would like this blog to be a constructive forum for thoughts on search technologies, based on facts, analysis and experience, not on official lobbying or bias. I would love to open debates comparing search technologies if they are based on facts and experience.

I have been getting a lot of mail asking me to clarify anonymous comments. If you can, please claim your comments and justify them.