Monday, August 27, 2007

Enterprise Search: Google Expands Capacity of Custom Search Business Edition

Good news for Google CSBE fans! Google announced today that it is expanding the capacity of its Custom Search Business Edition. Previously it offered only two options: 5,000 and 50,000 pages. According to Google, a large number of clients have asked what to do if they have more than 50,000 pages. Today Google added two more plans that businesses can purchase online:
  1. Search up to 100,000 web pages: $850 per year
  2. Search up to 300,000 web pages: $2,250 per year
If you want to search more than 300,000 pages, you need to contact Google.

When I first heard about Google's new offering, I wrote an article about the pros and cons of Google CSBE. After a quick read, I suggested that my own company buy the service and drop the open source Nutch search engine, which was serving about 1,000 pages. Since it was just $100 for fewer than 5,000 pages, marketing did not waste much time in buying the service. The next morning I got a 'thank you' message saying good things about Google CSBE. It took the marketing manager less than an hour to set up the account and get search running on the website, without UI customization. Besides, it was less of a headache than managing an open source search engine for such a small website. We have also suggested to one of our very large clients that they use CSBE for their public search. This client has been struggling for a very long time to provide effective search to the public. Since its inception, it has already switched search engines a couple of times trying to meet users' expectations. If everything goes right, CSBE will start serving searches for large enterprises too.

I think this is one of Google's best offerings lately. The Google Search Appliance is great, but Google CSBE is awesome for all types of enterprises.

Sunday, August 19, 2007

Enterprise Search - Redefining Scope

Continuing from my previous post, 'Enterprise Search Done Right', where I wrote about the user needs of enterprise search, in this post I want to share my thoughts on enterprise search and on how the technology has changed in the last couple of years, forcing the community to rethink its scope and functionality.

In the last few years there has been a shift in how users perceive search. Users are not just searching for information; they are also concerned about how they search, how effective and relevant the results are, and how the information is rendered to them. They are also judging the quality and efficiency of the search services. For the organization, the emphasis is not on making information searchable but on making it findable. Organizations' web strategies are more aligned with serving the customer better and converting more business opportunities. For the search vendors, the emphasis has been on advancing the search algorithm, embedding more technologies under the search domain to provide a complete enterprise information management solution, and making the search process more engaging for an effective user experience.

How do the research companies perceive the growth of the enterprise search industry in the next few years? Gartner suggests the search industry will realize double-digit percent growth in 2007, surpassing $728 million, a 15 percent increase from 2006's $633 million (according to the Gartner report "Dataquest Insight: Forecast for Information Access With Search Technology in the Enterprise, 2006-2011"). This is a positive sign: search vendors can add more technologies to the search stack to bring benefits to organizations looking for enterprise information management solutions.
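The 15 percent figure can be checked directly from the two revenue numbers quoted above. This is just the arithmetic, a small sketch; the function name is my own.

```python
# Figures quoted from the Gartner forecast above, in millions of US dollars.
revenue_2006 = 633
revenue_2007 = 728


def percent_growth(previous: float, current: float) -> float:
    """Return year-over-year growth as a percentage of the previous value."""
    return (current - previous) / previous * 100


growth = percent_growth(revenue_2006, revenue_2007)
print(f"Growth from 2006 to 2007: {growth:.1f}%")
```

The result rounds to the 15 percent that Gartner reports.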

Before scoping the requirements, let's take a look at how users find or access information:

1. Pattern Search - This is the typical form of searching, as when users search Google or Yahoo. You search for a word or a phrase, and the search engine brings back a set of pages that match. The advanced version of pattern matching is clustering, in which search engines automatically classify the initial set of search results into buckets. You can read about clustering in my post 'Clustering with Search Engines'. A public clustering engine can be found at Clusty.

2. Browsable Taxonomy or Topical Navigation - This is browsing through the information based on pre-defined topics or categories. The organization's information is classified into organization-wide taxonomies. The content can be classified or categorized at content creation time or at content indexing time (by search engines). Content authoring tools should provide capabilities to define taxonomies and to associate categories with the information. For example, Documentum provides taxonomy support for content classification. Alternatively, search engines can classify information at indexing time based on pre-defined rules and taxonomy. For example, Verity provides auto-classification of information.

3. Navigating through the Semantic Web - The Semantic Web is not just navigating through links on web pages. It describes relationships based on meta attributes or properties. RDF (Resource Description Framework) is a framework for describing information and resources on the web, and the Semantic Web uses RDF to describe web resources. How might this be useful? Suppose you want to compare the price and choice of iPods in your zip code, or you want to search online catalogs from different manufacturers and service providers for mobile phones. Tools like Siderean provide RDF-based alternate navigation.
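To make the iPod example a little more concrete, here is a minimal sketch of the idea behind RDF: every fact is a (subject, predicate, object) triple, and navigation means following relationships rather than page links. It uses plain Python tuples instead of a real RDF library, and every store, offer, predicate name and price below is made up for illustration.

```python
# Each fact is a (subject, predicate, object) triple, the basic RDF building block.
# All identifiers and values below are hypothetical illustration data.
triples = [
    ("store:A", "locatedInZip", "20001"),
    ("store:B", "locatedInZip", "20001"),
    ("store:C", "locatedInZip", "94105"),
    ("offer:1", "soldBy", "store:A"),
    ("offer:1", "product", "iPod nano"),
    ("offer:1", "priceUSD", 149),
    ("offer:2", "soldBy", "store:B"),
    ("offer:2", "product", "iPod nano"),
    ("offer:2", "priceUSD", 139),
    ("offer:3", "soldBy", "store:C"),
    ("offer:3", "product", "iPod nano"),
    ("offer:3", "priceUSD", 129),
]


def objects(subject, predicate):
    """All objects attached to a subject through a given predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(predicate, obj):
    """All subjects that point at an object through a given predicate."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def offers_in_zip(product, zip_code):
    """Follow the relationships zip -> stores -> offers -> price."""
    results = []
    for store in subjects("locatedInZip", zip_code):
        for offer in subjects("soldBy", store):
            if product in objects(offer, "product"):
                price = objects(offer, "priceUSD")[0]
                results.append((price, store))
    return sorted(results)  # cheapest first


print(offers_in_zip("iPod nano", "20001"))
```

A real deployment would store the triples in an RDF store and query them with a proper query language; the point here is only that the answer comes from relationships between resources, not from matching words on a page.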

If you look at it now, only pattern matching symbolizes search functionality in the true sense; the other types of information access are getting popular because the boundary between search and navigation is getting hazier every year. Search can no longer work as an isolated technology to solve enterprise information management problems. It needs to provide and integrate with a collection of technologies to meet demanding enterprise needs. Search is no longer just a pattern matching algorithm; it has expanded into a complete enterprise information access and management platform that includes extraction, classification, taxonomy support and pattern matching. The requirements and scope of search are not limited to 'searching information based on words and key phrases'. It is now an integral part of the enterprise information management platform. The functions of enterprise search now include:

1. Enterprise Search
2. Taxonomy Management - Ability to create or extend the organization's taxonomy
3. Information Classification i.e. categorization
4. Entity Extraction - Identifies and extracts key entities i.e. the who, what, when and how, such as people, dates, places, companies, email addresses, geo-coordinates, facilities, etc.
5. Application Integration - First, the ability to integrate with various data sources within the organization for search; for example, RSS feeds are now a good source of data for indexing. Second, the ability to capture user search and navigation information to collect data for search optimization.
6. Information Rendering - How the information that is searched is rendered to end users, including caching, translation, transformation of information into various formats, and user experience.
7. Administrative Interface - Ability to give all site webmasters an administrative interface to view their site performance and control their data.
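To illustrate the entity extraction item in the list above, here is a minimal rule-based sketch that pulls email addresses and long-form dates out of text with regular expressions. Commercial engines use far richer linguistic models; the patterns, function name and sample sentence here are my own simplified assumptions.

```python
import re

# Simplified patterns for illustration only; real extractors handle many more forms.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DATE = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}\b"
)


def extract_entities(text):
    """Return the email addresses and long-form dates found in the text."""
    return {"emails": EMAIL.findall(text), "dates": DATE.findall(text)}


sample = "Contact jane.doe@example.com before August 27, 2007 about the search rollout."
print(extract_entities(sample))
```

The extracted entities can then be indexed as metadata, so that users can filter or navigate by person, date or place instead of relying on keywords alone.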

If you look at the enterprise search vendors, you see that most of them have made strides in developing next-generation search and advanced find tools. These include Autonomy, IBM, Convera, FAST, Inxight, Vivisimo, Siderean, etc.

Thursday, August 16, 2007

American Customer Satisfaction Index - Google vs Yahoo

There has been so much talk about the results of the American Customer Satisfaction Index and how Google slipped badly against Yahoo and Ask in the search and portal market. I tried to understand the survey criteria and get the facts with respect to the search engine market before adding my 2 cents.

First of all, the survey asked a statistically representative sample of consumers to rate their experiences with portals and search engines according to a number of criteria, which produces an overall satisfaction score on a 100-point scale. Wait a minute: search and portal. Does anyone agree that these are two different technologies that solve different business problems? So the data does not correlate with search engine market share or portal market share. Why should I or anyone else be interested in the ACSI results? Is it because there are major players in the fray for the top spot? Would it have made headlines if Google and Yahoo were not affected by the ratings? I am not sure why people are comparing apples with oranges and saying Florida oranges are juicier than Washington State apples. Wow! That was strange.

Portal and search are two different technologies. The truth is that Google had already taken a giant leap ahead of its competitors in search technology by early 2001, and since then the rest of the herd has just been trying to catch up. Portals have never been Google's forte, though it has tried several variations with the personalized iGoogle portal. Yahoo has always been good at portals; in fact, its Yahoo! Finance has been the more respected and widely used financial portal. Recently they gave a new look to their portal and mail platforms. Yahoo is doing great work in search, but I still feel they have a lot to do to catch up with Google. Recently they added search suggestions, which Google has had for a long time. It is a good start, but nowhere near the level of Google's suggestions.

I personally use Yahoo for portal, Google for search, and both Google and Yahoo for mail. Not because I am a Google fan or a Yahoo loyalist, but because of what each does best. Besides, brand loyalty and inertia hold consumers back from trying new products. That is the reason why MSN has the same points even though there hasn't been any product from Microsoft that has attracted anyone's attention in the search area.

What is in it for Google, Yahoo and consumers after this survey? Google already knows that to stay ahead in the race it needs to keep inventing and bringing out new technology innovations. Google is a second name for search. People associate Google with search innovation and have high expectations. Will there be a further decline in Google's numbers next year, and what if there is? Since I do not rate these numbers very highly, I do not think Google should care. That was more or less echoed in their response to these reports. The numbers can trigger some fluctuation in the stock price, but only momentarily. Still, it must have been a champagne day at Yahoo, beating Google after a long time in its own territory. The credit must go to Yahoo; they have been working hard to bridge the gap between their search engine and Google's. I wish them good luck; it is competition that will keep Google on its toes, always guessing. Now the most important question: what is in it for me, the consumer? Being a consumer is the easiest thing: you just need to use the technology and give the ratings. But I see consumers as having a lot of responsibility in guiding these technologies with their needs and expectations. Consumers are the ones who decide who will succeed, based on who fulfills their needs and requirements. I do not think these ratings will impact Google's and Yahoo's road maps for the future, but they have certainly intrigued the bloggers enough to write.

Wednesday, August 15, 2007

Enterprise Search - Find Cost of not Finding Information

Before starting any new initiative on enterprise search, the first thing you need is commitment from the top executives. An executive will want to know the benefits of the initiative from a returns and value perspective before signing checks for the project. From a consulting perspective, project sponsors want to know the Return on Investment (ROI) before committing new assignments to the consultant.

The question I have been asked so many times is "What is the ROI of an enterprise search implementation?". Everyone understands the importance of Internet search from a brand awareness, web marketing and e-commerce standpoint. Executives are ready to sponsor Internet search based on the traffic they are getting on the website. But what about the intranet: is there any reason why anyone would spend hundreds of thousands of dollars on employee search? Does it even make sense to spend that much money on searching the organization's assets for information that an employee could get by asking various functional teams? Still, the core issues remain: is there a way to calculate your return on investment in finite dollar figures? Are there any standard benchmarks against which I can measure the success of implementing enterprise search? Besides, I think the need for enterprise search varies from organization to organization and from individual to individual. No two employees, within the same or different organizations, will have the same search needs. Then how will I quantitatively measure the return on investment for an organization? What do I do as an executive or sponsor: assess the returns, or start the initiative?

If you ask me, as step one I would rename 'ROI' to 'finding the cost of not finding information'. It makes me more comfortable, as I am trying to calculate returns from the problem perspective rather than the solution side, so that I can justify myself from both standpoints, quantitative and qualitative. The second step would be to list research facts about search usage, average time spent by employees, time spent recreating an asset, etc., from the research companies. This gives you key numbers to justify your claims, and no one questions them because they are well-accepted figures. The third step would be to list all the major organization roles/titles that use search in their daily routine to get the job done, along with what they search for and why. The fourth step is to detail the impact on their work routine, from an efficiency and quality perspective, when they are not able to find the information they asked for because of ineffective search technology. Once you get to that level, you will realize either the significance or the redundancy of enterprise search within the organization. It is very important to identify the core problems first and then challenge them.

Let's list the relevant research facts:
- Knowledge workers spend from 15% to 35% (average 25%) of their time searching for information.
- Knowledge workers spend 10-15% of their time duplicating existing information.
- Searchers are successful in finding what they seek 50% of the time or less.
- 40% of corporate users reported that they cannot find the information they need to do their jobs on their intranets.
- Every employee in a company produces more than 800 megabytes of digital information every year
- Not locating and retrieving information has an opportunity cost of more than $$ annually based on industry size.
- Call center costs and volumes have decreased by 30% or more when better search and browsing tools were implemented.

There are more research results based on statistical data. But one thing is clear: what we cannot measure is the increase in creativity and original thinking that might be unleashed if knowledge workers had more time to think and were not frustrated by not finding information. Wrong and delayed decisions are caused by the lack of the right information reaching the right people at the right time. For their daily routine work, people need access not only to the right information, but to have it exactly when they need it.
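The research facts above can be turned into a rough dollar figure. The sketch below is a back-of-the-envelope estimate, not a standard benchmark: the 25% search share and the 50% failure rate come from the list above, while the headcount, the average salary and the idea of treating failed searches as wasted salary are my own assumptions for illustration.

```python
def annual_search_cost(employees, avg_salary, search_share=0.25, failure_rate=0.5):
    """Estimate the yearly salary cost of searching and of searches that fail.

    search_share: fraction of working time spent searching (25% average above)
    failure_rate: fraction of searches that do not succeed (50% or more above)
    """
    search_cost = employees * avg_salary * search_share
    wasted_cost = search_cost * failure_rate
    return search_cost, wasted_cost


# Hypothetical organization: 200 knowledge workers at an average salary of $60,000.
search_cost, wasted_cost = annual_search_cost(employees=200, avg_salary=60_000)
print(f"Salary spent on searching:         ${search_cost:,.0f}")
print(f"Salary spent on failed searching:  ${wasted_cost:,.0f}")
```

Even with crude inputs like these, the second number gives a sponsor something concrete to compare against the price of a search implementation.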

Now list the roles that need access to enterprise search to find the right information to get their work done.
- Marketing needs access to all the product and service collateral that has been prepared for specific customers
- Sales needs access to all the product and service collateral, the new/old/lost sales opportunities, contact information of prospective clients and reports of existing accounts
- Program managers need to access all project-related information, including deliverables, status and baseline documents
- Project managers need to access all the project methodology documents, including processes, standards, guidelines, and templates
- Project team members need to access all the code and design documents that have been created as part of past projects
- All employees need to access organizational news, assets, HR policy documents, employee search and new/existing employee benefits
- Help desk needs to access the database of past issues and resolutions to be able to answer calls effectively

We can keep adding roles and titles that use search to find information for their daily routine work. Now, what if they do not find the right information at the right time? How will it impact their quality and productivity? How much extra time will they spend getting to the right information, or rewriting information they could have searched for?

For example:
- the marketing manager has to recreate product collateral that has already been created for another client, and spends a week doing the same work all over again because he could not find the information or was not aware of the work done in the past
- or a project team member spends months rewriting code that was already written for another client because the code was not searchable
- or a project manager compromises on the quality of the project and spends thousands of extra dollars because he was not aware of the existing process documentation, including guidelines and templates

The cost of not providing effective, efficient and reliable search often shows up as delayed decisions, poorer quality of work and missed opportunities, none of which get classified in the right bucket. The reasons for failure are often pinned on the individuals responsible for executing the work rather than attributed to ineffective, inefficient and unreliable enterprise search. That is the main reason it is difficult to find the ROI of enterprise search.

One thing is for sure: organizations cannot afford to ignore enterprise search today. The cost of not finding information is simply too high.

Wednesday, August 8, 2007

People Search beyond Web

The new search engine Spock ('Search for People and Discover Where Your Friends Are On The Web') launched its public beta today. It is a next-generation people search engine, coming after sites such as LinkedIn. It is a marriage between Wikipedia, LinkedIn and social networking sites. Spock lets you search for friends, old acquaintances, favorite celebrities, or people you would like to know. With Spock you can discover people by searching for their name or by using descriptive tags. For example, if you type "President" into the search box, Spock will pull back presidents of all countries and companies. Then, using tags, you can narrow down the results and find only the "President of United States of America".

This next generation of people search engine indexes and organizes all the publicly available data about individuals so that they can be discovered easily. It also enables enriching the public data using web 2.0 and social networking philosophy. Claiming your Spock search result allows you to add biographical information, tags (tags are just short keywords that describe someone), images, relationships, and other web content to make your search result more accurate. It will also allow you to receive alerts when Spock or another Spock user updates your search result.

The usability and relevance of the people search will increase as more intelligence is added by people collaborating and networking. It is a self-learning system after the initial indexing.

I see a lot of value in people search on the public web for connecting people and finding old acquaintances, but I also recognize the value of people search in some types of organizations. For example, the research and consulting businesses are based on finding the right people for the job.

Referring again to my previous assignment, a knowledge management assessment for a research organization: I had recommended people search along the lines of Spock as one of the key initiatives in implementing the knowledge management practice. This organization's core business is research into new technologies that carry a lot of economic and social obligations. The organization designs its own research portfolio but outsources the work to contractors who are experts in the respective fields. To get research started, they normally rely on the same set of contractors who have done similar work for them in the past, or on people in their professional network. They have no means of finding people based on subject matter expertise, skills or past work performance. Without people search, they do not know whether they are finding the right expert to get the job done. There is no process data to substantiate their claim of delivering quality products and services using the same set of experts. It was a concern for them, but they did not have a solution.

A similar tool would benefit not only this organization but also others that rely on outside expertise and are always searching for people. Coming from a consulting background, I know the pain of finding the right people at the right time to get work done. The best part about this search engine is that it is self-service: individuals can claim their own entry and own the node. Any information updated by others gets moderated by the owner. The search tool is a marriage between Wikipedia and LinkedIn, providing the best of both worlds.

I just hope Spock lives up to expectations. Unfortunately, the site is not fully working for me at the moment: it does not allow me to create a user.

Clustering with Search Engines

When I first read about clustering, I was confused about its utility and its ability to work on a limited set of search results. But over time, after reading a lot of research material on search usability and talking to people, I realized that searchers do not go beyond a few pages. In fact, studies show that two-thirds of searchers do not go beyond 2 pages of 10 results each. The searcher either finds the information they are looking for or changes the search terms. I personally do not go beyond 2-3 pages unless I am unable to refine the search phrase. Even with 2 pages of results, users have to filter the results and infer their context to find the relevant information. So users are actually spending more time on the search results window than working with the information they found. When I am searching, what matters is not only how relevant the results are, but also how much time it took me to get to them. This is where I see the value of clustering with search engines.

Search technology has been evolving and maturing over the last few years. Search companies are competing with each other to create a niche for themselves and attract more traffic to benefit from advertising, but the ever-increasing demands of end users keep outpacing them. Users are getting interested in federated search, clustering and faceted search in addition to regular sequential search. Google web search provides a form of clustering as indented search results when it finds more than one result from the same site or site section. Other search engines, like Vivisimo Velocity, provide topic clustering, grouping results into topics/subjects, which helps in refining searches.

Can we use clustering for anything else? We can use clustering to build applications like job sites, event sites, social networking, etc. We can also generate tag clouds to help users navigate through popular topics. The list can grow as we see more need for applications.
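To show what topic clustering of a small result set means in practice, here is a deliberately naive sketch: each result title is filed under the term it shares with the most other titles. This is not the algorithm used by Vivisimo or any other product mentioned here; the stopword list, function names and sample titles are my own assumptions.

```python
import re
from collections import Counter, defaultdict

STOPWORDS = {"a", "and", "for", "of", "the"}


def terms(title, query):
    """Lower-cased words of a title, minus stopwords and the query words themselves."""
    words = set(re.findall(r"[a-z0-9]+", title.lower()))
    return words - STOPWORDS - set(query.lower().split())


def cluster_results(titles, query, min_docs=2):
    """Group titles under the term they share with the most other titles."""
    doc_freq = Counter()
    for title in titles:
        doc_freq.update(terms(title, query))

    clusters = defaultdict(list)
    for title in titles:
        candidates = [t for t in terms(title, query) if doc_freq[t] >= min_docs]
        # Pick the most widely shared term; alphabetical order breaks ties.
        label = min(candidates, key=lambda t: (-doc_freq[t], t)) if candidates else "other"
        clusters[label].append(title)
    return dict(clusters)


titles = [
    "Jaguar cars official site",
    "Used Jaguar cars for sale",
    "Jaguar animal facts",
    "Jaguar habitat and animal diet",
    "Jaguar operating system review",
]
for label, members in cluster_results(titles, "jaguar").items():
    print(label, members)
```

The same term counts could also feed a tag cloud: the more documents share a term, the larger it would be drawn.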

I was working with Nutch and the Google Search Appliance when I got interested in clustering and faceted search technologies. I wanted to integrate clustering with both these search engines to show my clients its value. I gave them a demo with Clusty web search and they saw a lot of value. But the question was: can I integrate it with the Google Search Appliance, and is there any free or low-cost tool that can provide the same functionality? I knew about Vivisimo, but the client wanted a cheaper solution, so I started digging deeper in the open source arena.

Then I came across Carrot2, an open source search results clustering engine. Carrot2 provides an architecture for acquiring search results from various sources (YahooAPI, GoogleAPI, MSNAPI, eTools Meta Search, Alexa Web Search, PubMed, OpenSearch, Lucene index, SOLR), clustering the results and visualising the clusters. Currently, 5 clustering algorithms are available, suitable for different kinds of document clustering tasks. Carrot2 has been successfully used in a number of commercial and research applications and has resulted in a number of interesting publications.

I found this tool interesting and wanted to use it with my existing search engines. The tool provides seamless integration with the Nutch and Lucene search engines. The developer only needs to point it to existing search indexes and customize the page layout. The Carrot2 application is deployed as a webapp which you can drop into any web application server. The deployment was a cakewalk and the results were fascinating. Then I also got it working with the Google Search Appliance. Here I had to use its Java APIs to build a clustered search interface from the results of the Google appliance.

If anyone is looking for a low-cost clustering solution, this open source clustering engine can be integrated with any existing web search engine and also with enterprise search engines like the Google Search Appliance or a Lucene index (one can debate its enterprise capability). It is easy to deploy and configure and does not bring any extra baggage.

I am sure Google must be working on this technology and will come out with a solution that is a notch better than Vivisimo or Carrot2. I am eagerly waiting to hear from them.

Friday, August 3, 2007

Enterprise Search Done Right

When I was doing a knowledge management assessment for a leading energy research company, I learnt how important search can be for the business and how badly a company can go wrong in implementing a search solution. The key to success in research and consulting organizations is effective implementation of a knowledge management practice, and search plays an important role in it. Enterprise search has a direct impact on the business and on the productivity of employees in these types of organizations.

When I was there, I looked at their infrastructure and their business processes, and tried to assess the maturity level of their knowledge management program. The results were appalling. They did not have a knowledge management program. They did not have an enterprise content and collaboration management system. They did not have internal search. Their external search, despite having two search engines, was of little use. They did not understand the significance of search for their employees or for the business with respect to productivity and knowledge access. Researchers procured information from various knowledge databases and repositories through a library service. There were no processes in place to measure the effectiveness and efficiency of this library service. The researchers and scientists were heavily dependent on Google and Yahoo for information related to their work.

This is not the only story of how organizations can go wrong in understanding user needs and end up implementing ineffective search solutions. I have browsed through so many organizations' websites that have great products and services but fail to provide effective search for their users. I do and do not blame them. So much work has been done on external search engines like Google and Yahoo that end users' expectations have gone up for all kinds of search. When they use an organization's site search, they expect the same user experience and relevance that they get from external web search engines. I do not mean to say that there aren't enterprise search engines that provide similar relevance and experience. There are search products from Fast, Verity and Autonomy that are leaders in enterprise search.

If I need to suggest a tool that would solve the research organization's search problem, I would first try to understand their business and user needs. Then I would list the features that I need in my product. Let's drill down on the requirements:
1. Ability to search all my repositories, including websites, file systems, databases, and enterprise applications like SharePoint, Documentum, etc. The researchers will be able to access all information, including past work done as well as knowledge acquired from various sources.
2. Ability to search external corporate and academic libraries, journals, feeds, etc. This is also called federated search. Researchers do not need to go to various search engines to get information.
3. Ability to classify my documents and content into easy-to-browse categories. This would help me drill down into the information by category rather than through pages of 10 results. No one actually browses beyond a couple of pages of search results.
4. Web administrative interface with advanced linguistic capabilities, including metadata, synonyms, antonyms, relative weighting for text fields, and stemming.
5. Ability to perform secure searches.
6. Ability to scale and perform.
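Two of the linguistic capabilities in requirement 4, synonyms and relative weighting for text fields, can be sketched in a few lines. This is only an illustration of the idea, not how any of the products named in this post work; the field weights, the synonym table and the sample documents are hypothetical, and stemming is left out to keep the sketch short.

```python
# Hypothetical configuration an administrator might maintain.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "body": 1.0}
SYNONYMS = {"maintenance": {"upkeep", "service"}}


def expand(query):
    """Return the query words plus any configured synonyms."""
    expanded = set()
    for word in query.lower().split():
        expanded.add(word)
        expanded |= SYNONYMS.get(word, set())
    return expanded


def score(document, query):
    """Sum matching words per field, multiplied by that field's weight."""
    query_terms = expand(query)
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = document.get(field, "").lower().split()
        total += weight * sum(1 for w in words if w in query_terms)
    return total


doc_a = {"title": "Turbine maintenance guide", "body": "How to service a wind turbine"}
doc_b = {"title": "Annual report", "body": "Turbine upkeep costs rose this year"}

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name, score(doc, "turbine maintenance"))
```

A match in the title counts three times as much as a match in the body, so doc_a ranks above doc_b, while the synonym table still lets doc_b be found through "upkeep".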

The rest of the requirements are generic, like relevance, supported formats, summary extraction, metadata indexing and crawling configuration. If a tool provides a solution to all these requirements, it is certainly a candidate for my approval.

When I did the research, I found the Vivísimo Velocity search engine to be one of the most powerful enterprise search products: it satisfies my set of requirements and has a great relevance algorithm, comparable to Google/Yahoo. There were others that came close, but not close enough. They were either not able to satisfy all the requirements or not packaged as a single solution.

Vivísimo, a difficult word to pronounce, provides an innovative search solution that offers not only the same search relevance as the external web search engines but also a similar user experience. You can get a glimpse of it through their public web search, popularly known as Clusty.

Integral components of the Vivísimo Velocity Search Platform:
  • Velocity Search Engine
  • Velocity Content Integrator
  • Velocity Clustering Engine

The external web search of Vivísimo looks impressive and I just hope their enterprise search is as promising. I think anyone looking for an enterprise search engine should take a look at their offering.

I have done a lot of research on federated search and clustering technologies, both commercial and open source. I will write about these technologies in coming posts.

Tuesday, July 31, 2007

ISYS:web - Enterprise Search for Small and Medium Business

Yesterday ISYS Search Software announced enterprise search support for Interwoven WorkSite. Last month, they announced the integration of ISYS:web 8 with Microsoft Office SharePoint Server 2007 (MOSS). When I read the news release, I got curious to know more about ISYS and its market in enterprise search.

ISYS Search Software is based in Sydney, Australia and has been in business since 1998. Its major focus has been the Small & Medium Business (SMB) and government sectors. Currently ISYS has over 10,000 customers on seven continents, including Antarctica, and it was selected as 'one of the 100 most significant Australian innovations of the twentieth century' by the Powerhouse Museum in 1999.

When I looked into its product portfolio, I found the following products:
1. ISYS:Desktop - indexing and retrieval tool for desktops and laptops, primarily Windows-based.
2. ISYS:web - designed for indexing and searching public and intranet websites.
3. ISYS:sdk - SDK toolkit to support development.

What interested me most is their ISYS:web product, an enterprise search engine suited to the mid-market segment. It is an out-of-box search tool with a small footprint and an easy installer, and it can be customized and configured to meet complex enterprise needs. It compares with the Google Search Appliance on pricing and matches the features and functionality of full-blown enterprise search engines like Verity and Fast.

From a platform perspective, it is based on Windows and supports the Windows XP, Windows 2000, Windows 2003 Server, and Vista operating systems. That is one of the reasons it can easily provide integration points to Interwoven WorkSite and Microsoft SharePoint Server, both Windows-based products.

What impressed me most is its rich set of features. It provides a range of search, navigation and discovery tools all bundled into a single platform. One can execute natural language queries or easily construct advanced Boolean and proximity queries via point-and-click operators. Navigate, drill down and instantly locate the right information via On-The-Fly Categorization, ‘search within’ queries, hit highlighting and hit-to-hit navigation. Discover associations and connections between your search terms and the entities ISYS has automatically identified within your search results with ISYS Entities. Administrators will also benefit from ISYS:web’s easy-to-deploy, web-based administration, which includes features like the ISYS Site Designer and Search Designer, federation of remote ISYS indexes, Best Bets, and ISYS SearchTrends for measuring and analyzing search patterns.

It can seamlessly search multiple disparate data sources, including Microsoft Office, WordPerfect, Open Office, HTML, ZIP files, all major email products, all SQL data sources, SharePoint, Lotus Notes and Lotus Domino.

I wanted to try it myself. The ISYS Search Software website provides search through its own search engine, and I also found some more organizations using ISYS, including the City of San Jose and the Town of Frisco. ISYS also has an impressive list of customers, including Boeing and Cisco Systems.

When I look at ISYS:web, it has all the required features of an enterprise search engine for SMBs, with a low cost of ownership. Even the pricing model is quite different from tools in the same category. Most search engines in this category use per-document licensing or an appliance model. ISYS:web has a base license cost plus an additional per-user ($100) license. In addition, it claims to have provision for volume licensing. If I am a company with 100 employees, my total cost of ownership will be $1,000 + $100 * 100 = $11,000, which is significantly lower than other search engines.
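To make that arithmetic easy to rerun for other company sizes, here is a minimal Python sketch. The function and parameter names are my own, the $1,000 base license is only the figure implied by the calculation above (not a confirmed list price), and volume discounts are not modeled.

```python
def total_license_cost(base_license: int, per_user_fee: int, users: int) -> int:
    """Return a one-time base license cost plus a flat per-user fee.

    Hypothetical helper for illustration; it ignores volume licensing,
    support contracts, and hardware.
    """
    if users < 0:
        raise ValueError("users must be non-negative")
    return base_license + per_user_fee * users


# Figures taken from the example in the post (the base cost is an assumption).
cost = total_license_cost(base_license=1000, per_user_fee=100, users=100)
print(f"Total license cost: ${cost:,}")  # Total license cost: $11,000
```

With a per-user model like this, the cost grows linearly with headcount, so it is worth rerunning the numbers before assuming it stays cheaper than per-document or appliance pricing for a larger company.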

Though I am impressed and want to install and run the search engine myself, I am also skeptical about its performance and scalability. Whatever it turns out to be, it still meets its objectives of rapid return and low cost of ownership.

I am going to explore its navigational (auto-classification, faceted search) and discovery features further and share what I find with you.

Monday, July 30, 2007

Google gets straight with Autonomy

There have been numerous debates on the capabilities of Google as a truly effective enterprise search engine. I have also read that some of the largest enterprises are thinking of replacing, or are actually replacing, Google as their enterprise search. I came to know that Cisco is one of them. On the other side, I have also seen organizations replacing other enterprise search engines with the Google Appliance, citing relevance as one of their concerns. NASA is one of those replacing Verity with the Google Appliance on their portal.

The reasons, as far as I know, why Google has not been taken seriously in enterprise search were:
1. Google Appliance search is not as effective as their web search. People are used to using Google web search.
2. Google enterprise search is only able to index web pages, not databases, file systems etc.
3. Google enterprise search is not secure.

There may be other reasons, but these are the ones that I have heard and read most frequently.

Autonomy blasted Google on some of the limitations listed above in a white paper published some time ago. Also read "Google has 'dumbed-down search', says Autonomy boss". The reason is that Autonomy feels threatened by Google's enterprise search capability, which is growing at a phenomenal rate. I do not have exact figures, but it seems Google already has over 9000 customers and leads the enterprise search market segment.

In response to the allegations, Google published "Don't believe everything you read" on their enterprise blog. I think it will clear doubts in the minds of readers who read the white paper and, more so, believed what was written. I do not blame the readers, as not all of them get a chance to try new technology as it evolves. They believe what is printed. Google did a great job of responding to all the issues in detail.

I need to ask Autonomy why they are targeting Google alone. Why not compare against other world-class search engines? Secondly, not long ago Autonomy refused to even consider Google in the same league, so why now?

I think this war of words will do a lot of good for Google and its credibility as an enterprise search engine.

Monday, July 23, 2007

EU authorized Germany to give $165 million for research

The European Union on Thursday authorized Germany to give $165 million for research on Internet search-engine technologies that could someday challenge U.S. search giant Google Inc.

The project is named Theseus and it has been designed to create an advanced multimedia search engine for the next-generation Internet.

The funds would go to companies including Siemens AG, SAP AG, Deutsche Thomson oHG and EMPOLIS GmbH for this research. European companies in general spend far less on research than those based in other parts of the world, and the EU said the project should help change that.

Google spokeswoman Katie Watson had this to say on the new development: "We welcome all efforts to help democratize access to information."

Google has been aggressively working on enhancing its multimedia search by including books and video content. Google's Book Search beta edition has been live since May last year and provides searching and rendering of books online. Google Video and YouTube let users search rich media content including TV shows, movies, music videos, documentaries, personal productions and more.

The funding will help bring new strategy and technology for searching multimedia content. Thanks, EU. Other countries should also fund local researchers and universities so that they can research the technologies best suited to the local market. It will also help stop monopolization of the search market.

Sunday, July 22, 2007

Microsoft: SingleView augments SharePoint

Microsoft announced a new search product, developed in partnership with consulting giant BearingPoint, at Microsoft's Worldwide Partner Conference on July 10.

The new product, SingleView, augments Microsoft SharePoint Server 2007. According to the press release, it is already being used by dozens of enterprise customers.

What is new in this product that the existing search in MOSS 2007 does not support? MOSS 2007 search provides full-text indexing and IFilter interfaces to support metadata search in Microsoft Office documents. It is still limited to searching the data stored in its SQL Server and Microsoft-related servers.

SingleView will act as a master search index for all kinds of enterprise data, including data that isn't stored on any Microsoft-related servers such as Oracle databases, document management archives, SAP and other CRM (customer relationship management) repositories, e-mails and other sources of information.

"Enterprise search is a very attractive business for us and our partners," said Jared Spataro, the group product manager for enterprise search at Redmond, Wash.-based Microsoft. The software company says SharePoint has more than 85 million seats sold to date, and continues to experience a strong demand for future growth.

Enterprise search is also more complex because, unlike general Web search, it isn't just looking for keywords in a massive pile of data. The difference is that most enterprise searches are looking at the relationships among different (and disparate) pieces of data. Enterprise search engines have evolved over time while trying to solve real-world problems, and they have matured through client engagements. No search engine can claim to be a 100% out-of-box solution for any mid-size or large organization. It is usually 60-80%, depending on the complexity of the client's problem.

There are other companies, like Coveo, that have built search services around MOSS 2007. This is a clear indication that the MOSS search engine is not mature and feature-rich enough to solve enterprise problems.

I am yet to see any collateral on SingleView on the Microsoft site. I think it's a great initiative from Microsoft to compete with other enterprise search offerings like the Google Search Appliance, Autonomy etc.

The question here is whether SingleView will always be packaged with MOSS or can become a standalone enterprise search tool.

Friday, July 20, 2007

Google adds Custom Search Business Edition

On Tuesday, Google added a new Custom Search Business Edition (CSBE) to its search offering. It is a lower-end complement to Google's enterprise-targeted appliances, aimed at small businesses who are not willing to pay even $2000 for the Google Mini. The new edition costs just $100 for searching up to 5,000 pages and $500 for up to 50,000 pages. Higher volume is supported but the pricing model is not yet announced.

The key features of Google CSBE:

1. Hosted search edition for small business at very nominal cost
2. Ad-free search lets small businesses do their own branding
3. Quick setup through online wizard
4. Customization feature on website for branding
5. Online reports and web analytics about visitor behavior
6. $100 for up to 5,000 pages, $500 for up to 50,000 pages
7. XML feed for search results
8. Refinements to help categorize search results
9. Subscription
10. Internationalization
11. Payment integrated with Google Checkout

Some of the limitations of this edition:

1. No control over indexed content - crawling as well as indexing
2. No scheduling or control over indexing of content - especially for events

If I have a small company with no or limited IT resources and budget, this is a great option. The advantages of Google CSBE outweigh its limitations. I still feel it is a better option than IBM OmniFind Yahoo Edition for small businesses with no or limited IT resources.

But IBM OmniFind Yahoo Edition is not a hosted solution and can be an alternative that overcomes the limitations of Google CSBE. We are not trying to compare oranges to apples. Both products target small and medium businesses, but they are not in the same category.

Sunday, May 20, 2007

Welcome to World of Search

Welcome to World of Search.

Search has never been as important as it has become since the advent of Google. People have started using "google" instead of "search". Can you imagine the world without Google or search?

But have you given a thought to what goes on behind the search engines? Why is one search engine better than another? When should I use Google or Yahoo or any other?

Why can someone else find better results than I can? Is searching an art or a science? Which is more important, the searcher or the search engine?

Is there any scope for another "Google" on the web? What is missing from search in the popular search engines?

Why is web search better than enterprise search? Why do I need an enterprise search engine when I have web search? What are the key areas I need to learn about search to make it effective within my enterprise? Why should I spend millions of dollars on enterprise search when I can get web search for free?

What is the ROI on search investments? Does it make sense to spend millions on search? Why shouldn't I use the open source search engines?

We are going to answer some of the key questions on search. We are here to share and learn from our experience with search and "Search Engines".