Showing posts with label Auto-classification. Show all posts

Sunday, August 19, 2007

Enterprise Search - Redefining Scope

Continuing from my previous post, 'Enterprise Search Done Right', where I wrote about what users need from enterprise search, in this post I want to share my thoughts on how the technology has changed in the last couple of years and why that is forcing the community to rethink its scope and functionality.

In the last few years there has been a shift in how users perceive search. Users are not just searching for information; they also care about how they search, how effective and relevant the results are, and how the information is rendered to them. They are also judging the quality and efficiency of search services. For organizations, the emphasis is no longer on making information searchable but on making it findable. Their web strategies are more closely aligned with serving customers better and converting more business opportunities. For search vendors, the emphasis has been on advancing search algorithms, embedding more technologies under the search domain to provide a complete enterprise information management solution, and making the search process more engaging so that users have an effective experience.

How do the research companies see the enterprise search industry growing in the next few years? Gartner suggests the search industry will see double-digit growth in 2007, surpassing $728 million, a 15 percent increase from 2006's $633 million (according to the Gartner report "Dataquest Insight: Forecast for Information Access With Search Technology in the Enterprise, 2006-2011"). This is a positive sign: search vendors can add more technologies to the search stack to benefit organizations looking for enterprise information management solutions.

Before scoping the requirements, let's take a look at how users find or access information:

1. Pattern Search - This is the typical form of searching, the way users search Google or Yahoo. You search for a word or a phrase, and the search engine returns a set of pages that match. A more advanced version of pattern matching is clustering, in which the search engine automatically groups the initial set of search results into buckets. You can read more about clustering in Clustering with Search Engines. A public clustering engine can be found at Clusty.

2. Browsable Taxonomy or Topical Navigation - This is browsing through information based on pre-defined topics or categories. The organization's information is classified into organization-wide taxonomies. Content can be classified or categorized at content creation time or at content indexing time (by the search engine). Content authoring tools should provide the ability to define taxonomies and to associate categories with the information. For example, Documentum provides taxonomy support for content classification. Alternatively, search engines can classify information at indexing time based on pre-defined rules and a taxonomy. For example, Verity provides auto-classification of information.



3. Navigating through the Semantic Web - The Semantic Web is not just about following links between web pages. It describes relationships through meta attributes or properties. RDF (Resource Description Framework) is a W3C framework for describing information and resources on the web, and the Semantic Web uses RDF to describe web resources. How might this be useful? Suppose you want to compare the price and selection of iPods in your zip code, or you want to search online catalogs from different manufacturers and service providers for mobile phones. Tools like Siderean provide RDF-based alternative navigation.
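To make the second access mode above more concrete, here is a minimal sketch of rule-based classification at indexing time. It is only an illustration under my own assumptions: the taxonomy, the keywords and the `classify` function are hypothetical, and this is not how Verity or Documentum implement classification.

```python
import re

# Hypothetical taxonomy rules: category path -> keywords that trigger it.
TAXONOMY_RULES = {
    "Products/Audio": {"ipod", "headphones", "speaker"},
    "Products/Phones": {"phone", "handset", "mobile"},
    "Support/Billing": {"invoice", "refund", "payment"},
}


def classify(text, rules=TAXONOMY_RULES, min_hits=1):
    """Return the categories whose keywords appear in the text.

    Categories are ordered by number of distinct keyword hits
    (highest first), with ties broken alphabetically.
    """
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {cat: len(words & keywords) for cat, keywords in rules.items()}
    matched = [cat for cat, score in scores.items() if score >= min_hits]
    return sorted(matched, key=lambda cat: (-scores[cat], cat))


if __name__ == "__main__":
    print(classify("Refund request for a mobile phone invoice"))
```

An indexer could call something like `classify` on each document and store the returned categories as metadata, which is what makes the content browsable by topic later. Real engines use much richer rules (phrases, weights, thresholds), so treat this as the simplest possible version of the idea.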

If you look at it now, only pattern matching represents search functionality in the true sense. The other types of information access are gaining popularity because the boundary between search and navigation gets hazier every year. Search can no longer work as an isolated technology to solve enterprise information management problems. It needs to provide, and integrate with, a collection of technologies to meet demanding enterprise needs. Search is no longer just a pattern matching algorithm; it has expanded into a complete enterprise information access and management platform that includes extraction, classification, taxonomy support and pattern matching. The requirements and scope of search are not limited to 'searching information based on words and key phrases'. Search is now an integral part of the enterprise information management platform. The functions of enterprise search now include:

1. Enterprise Search
2. Taxonomy Management - The ability to create or extend the organization's taxonomy.
3. Information Classification, i.e. categorization.
4. Entity Extraction - Identifies and extracts key entities, i.e. the who, what, when and how, such as people, dates, places, companies, email addresses, geo-coordinates, facilities, etc.
5. Application Integration - First, the ability to integrate with the various data sources within the organization for search. For example, RSS feeds are now a good source of data for indexing. Second, the ability to capture user search and navigation activity to collect data for search optimization.
6. Information Rendering - How the information that is found is rendered to end users, including caching, translation, transformation of information into various formats, and user experience.
7. Administrative Interface - The ability to give every site's webmaster an administrative interface to view their site's performance and control its data.
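As a rough illustration of entity extraction (item 4 above), the sketch below pulls two of the simpler entity types, email addresses and dates, out of plain text using regular expressions. The patterns and the `extract_entities` function are my own simplified assumptions; commercial products rely on far more sophisticated linguistic analysis to find people, places and companies.

```python
import re

# Simplified pattern for email addresses (not a full RFC 5322 matcher).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Dates written like "August 19, 2007"; other date formats are ignored.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_RE = re.compile(r"\b(?:" + MONTHS + r") \d{1,2}, \d{4}\b")


def extract_entities(text):
    """Return the email addresses and long-form dates found in the text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "dates": DATE_RE.findall(text),
    }


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com before August 19, 2007."
    print(extract_entities(sample))
```

The extracted values would typically be stored alongside the document in the index, so that users can later filter or navigate by them.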

If you look at the enterprise search vendors, you will see that most of them have made strides in developing next-generation search and advanced find tools. These include Autonomy, IBM, Convera, FAST, Inxight, Vivisimo, Siderean, etc.

Tuesday, July 31, 2007

ISYS:web - Enterprise Search for Small and Medium Business

Yesterday ISYS Search Software announced enterprise search support for Interwoven WorkSite. Last month, they announced the integration of ISYS:web 8 into Microsoft Office SharePoint Server 2007 (MOSS). When I read the news release, I got curious to know more about ISYS and its place in the enterprise search market.

ISYS Search Software is based in Sydney, Australia and has been in business since 1998. Its major focus has been the Small & Medium Business (SMB) and Government sectors. ISYS currently has over 10,000 customers on seven continents, including Antarctica, and was selected as 'one of the 100 most significant Australian innovations of the twentieth century' by the Powerhouse Museum in 1999.

When I looked into its product portfolio, I found the following products:
1. ISYS:Desktop - an indexing and retrieval tool for desktops and laptops, primarily Windows based.
2. ISYS:web - designed for indexing and searching public and intranet websites.
3. ISYS:sdk - an SDK to support development.

What interested me most is their ISYS:web product, an enterprise search engine suited to the mid-market segment. It is an out-of-the-box search tool with a small footprint and an easy installer, and it can be customized and configured to meet complex enterprise needs. It compares with the Google Search Appliance on pricing and matches the features and functionality of full-blown enterprise search engines like Verity and FAST.

From a platform perspective, it is based on Windows and supports Windows XP, Windows 2000, Windows Server 2003 and Vista. That is one of the reasons it can easily provide integration points to Interwoven WorkSite and Microsoft SharePoint Server, both Windows-based products.

What impressed me most is its rich set of features. It provides a range of search, navigation and discovery tools, all bundled into a single platform. You can execute natural language queries or easily construct advanced Boolean and proximity queries via point-and-click operators. You can navigate, drill down and instantly locate the right information via On-The-Fly Categorization, ‘search within’ queries, hit highlighting and hit-to-hit navigation. With ISYS Entities, you can discover associations and connections between your search terms and the entities ISYS has automatically identified within your search results. Administrators also benefit from ISYS:web’s easy-to-deploy, web-based administration, which includes features like the ISYS Site Designer and Search Designer, federation of remote ISYS indexes, Best Bets, and ISYS SearchTrends for measuring and analyzing search patterns.

It can search across multiple disparate data sources and formats, including Microsoft Office, WordPerfect, Open Office, HTML, ZIP files, all major email products, all SQL data sources, SharePoint, Lotus Notes and Lotus Domino.

I wanted to try it myself. The ISYS Search Software website runs on its own search engine, and I also found a few other organizations using ISYS, including the City of San Jose and the Town of Frisco. ISYS also has an impressive list of customers, including Boeing and Cisco Systems.

When I look at ISYS:web, it has all the required features of an enterprise search engine for SMBs, with a low cost of ownership. Even the pricing model is quite different from other tools in the same category. Most search engines in this category use per-document licensing or an appliance model. ISYS:web has a base license cost plus an additional per-user ($100) license. In addition, it claims to offer volume licensing. If I am a company with 100 employees, then assuming a base license of $1,000, my total cost of ownership will be $1,000 + $100*100 = $11,000, which is significantly lower than other search engines.
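The licensing arithmetic above can be written as a small helper, which makes it easy to try other company sizes. This is only a sketch: the $1,000 base cost is the figure implied by my calculation rather than a published price, the function name is my own, and volume discounts are not modeled.

```python
def total_license_cost(users, base_cost=1000, per_user_cost=100):
    """Base license plus a flat per-user fee, in US dollars.

    base_cost is an assumed figure; per_user_cost is the $100
    per-user fee mentioned in the post. No volume discount applied.
    """
    if users < 0:
        raise ValueError("users must be non-negative")
    return base_cost + per_user_cost * users


if __name__ == "__main__":
    for headcount in (25, 100, 500):
        print(headcount, total_license_cost(headcount))
```

For 100 users this gives the same $11,000 as the calculation above. Because the cost grows linearly with headcount, the volume licensing option would matter more for larger companies.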

Though I am impressed and want to install and run the search engine myself, I am also skeptical about its performance and scalability. However that turns out, it still meets its objectives of rapid return and low cost of ownership.

I am going to explore its navigational (auto-classification, faceted search) and discovery features further and share what I find with you.