Search Engines Are Still in the Model T Era

What does the future hold for search engine technology companies? What about larger companies like Yahoo, which are still heavily identified with search? Can they survive on their own? How will they prosper? What technologies loom on the horizon? These are the kinds of open-ended questions I’m asked fairly often. If I had to answer them at a holiday cocktail party, I’d probably keep it short and snappy (taking a cue from our long-forgotten Cocktail Party Crib Sheet).

But for the slightly more attentive reading public, this is what I would offer as a tentative answer in short essay format.

The stock market boom made search engines famous before there was very much to brag about. That’s unfortunate because it obscures the deep changes Internet search and retrieval technology have introduced into our daily routines.

Our love of simple definitions and recognizable brand names sometimes leads analysts to describe the technological status quo with a sense of finality. A few years back, everyone “knew” that about ten search engines like Infoseek, Lycos, Excite, Webcrawler, AltaVista, and Open Text were “important.” Most engines (and directories like Yahoo) had cute names, so it was a natural tendency to talk about them almost as if they were people. But there were not many important differentiators amongst these technologies. Many of us knew less about them than we thought we did. What’s more, we continued to waste our time talking about some of them as if they were living breathing entities, failing to notice that some of them were all but dead, and that new technologies, as yet without cute names, were emerging. Inktomi rose, fell, stabilized. Google displaced Inktomi as Yahoo’s spider engine of choice. The Open Directory and LookSmart came along to challenge, but not defeat, the Yahoo directory.

The end user’s persistent uncertainty about the reliability of results from any particular search engine fed into the popularity of the Internet’s first major “metasearch” engine Metacrawler, developed by Oren Etzioni, a University of Washington scientist. Metasearch is still popular. What’s more important than the presence of specific brand name metasearch engines (other popular ones include Dogpile, Ixquick, and Mamma), however, is the concept behind metasearch: polling a variety of search technologies so that one need not be committed to a single method of ranking results. There really can be no single best way of classifying or ranking information culled from a huge database such as the World Wide Web. In a similar vein, Steve Thomas, CEO of Wherewithal, a taxonomy software company, has argued that there is a “fixed taxonomy problem” associated with Yahoo-style categorized directories. Metasearch is one way of addressing the problem; another is a rethinking of the technology and assumptions behind categorized directories.
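To make the metasearch idea concrete, here is a minimal sketch in Python of how ranked lists from several engines might be merged. It uses reciprocal rank fusion, a simple rank-combining technique I have swapped in for illustration; it is not a description of how Metacrawler or any other named metasearch engine works. The engine names, the result URLs, and the constant k=60 are all assumptions.

```python
from collections import defaultdict


def merge_rankings(rankings, k=60):
    """Merge ranked result lists using reciprocal rank fusion.

    rankings maps a (hypothetical) engine name to its ordered list of URLs.
    Each URL earns 1 / (k + position) from every engine that returns it,
    so pages ranked well by several engines rise to the top.
    """
    scores = defaultdict(float)
    for results in rankings.values():
        for position, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + position)
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical results from three made-up engines for one query.
engine_results = {
    "engine_a": ["a.example", "b.example", "c.example"],
    "engine_b": ["b.example", "a.example", "d.example"],
    "engine_c": ["b.example", "c.example", "a.example"],
}

merged = merge_rankings(engine_results)
print(merged)
```

In this toy data, no single engine's ordering wins outright: the page that two engines rank first and the third ranks second ends up on top, which is the point of not committing to one ranking method.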

Lurking somewhere underneath the surface appeal of brand names and user interfaces, then, is the core functionality of search and retrieval technology in key applications in the scientific, business, and consumer worlds. Most of us are now sophisticated enough to know the difference between a directory and a “crawler-based” search engine, and more debate now takes place about what constitutes a relevant search result.

Google has captured the world’s imagination with a results ranking technique that might be referred to as reputation analysis. The presence or absence of the user’s key phrases still determines which pages will contend for a high ranking in the search results. But the ranking within those results depends on an analysis of the linking structure of the web, on the assumption that a link to a page from another important page is a measure of importance or external reputation.
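The two-step process described above (keyword matching decides which pages contend, link structure decides their order) can be sketched roughly as follows. This is a simplified PageRank-style iteration written for illustration, not Google's actual algorithm; the damping value of 0.85, the iteration count, and the toy pages and links are all assumptions.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Estimate page importance from link structure by repeated redistribution.

    links maps each page to the list of pages it links to. A page passes a
    share of its own score to every page it links to, so a link from an
    important page counts for more than a link from an obscure one.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


def search(query, page_text, links):
    """Keep pages whose text contains the query, then order them by link score."""
    scores = link_scores(links)
    matches = [p for p, text in page_text.items() if query.lower() in text.lower()]
    return sorted(matches, key=lambda p: (-scores.get(p, 0.0), p))


# Hypothetical toy web: three pages link to the guide, one links to the shop.
toy_links = {
    "hub": ["widgets-guide", "widgets-shop"],
    "news": ["widgets-guide"],
    "blog": ["widgets-guide"],
    "widgets-guide": ["hub"],
    "widgets-shop": [],
}
toy_text = {
    "widgets-guide": "A guide to widgets",
    "widgets-shop": "Buy widgets here",
    "blog": "My thoughts on gardening",
}

results = search("widgets", toy_text, toy_links)
print(results)
```

Both matching pages mention the query, so keywords alone cannot separate them; the guide comes first only because more pages link to it.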

No one will disagree that Google is awesome. Consumers loved it for a host of reasons; one of them was a drastic reduction in search engine “spam.” But Google is far from the last word on the subject. They’re pioneering the idea of reputability measurement. It’s such an important trend that most of their major competitors – Inktomi, FAST Search, AltaVista, Teoma – are also measuring “off-page factors.” It’s a particularly impressive feat given that Google is doing it in the wild west atmosphere of the whole Internet – a very public, very large database with plenty of incentives for spoofing and spamming on the part of particular web site owners. Google is cool, much in the same way a Model T Ford was cool. As long as you want it in black.

Search engines aren’t yet very customizable. The Google I’d like to see would offer all manner of dials and switches so that I could test out different versions of the algorithm. If today’s “power searcher” is a bit like yesterday’s librarian (someone with an intimate familiarity with Boolean searches), tomorrow’s will be more like a cross between a mathematician and a forensic scientist – someone who wants to pick and choose amongst five different ways of measuring page reputability. Custom metasearch will make its presence felt in the coming years. We’ll also see more options in terms of our ability to retrieve multimedia and other varied file types, and to access the so-called “invisible web.”

There is unwarranted concern today about the economic viability of search engine companies, perhaps because conglomerates like ExciteAtHome (to shut down forever on Feb. 28) and Yahoo (struggling to replace advertising revenues with fee-based services) have had financial struggles. But even in a time of severe recession and limited access to capital, there is a growing number of success stories at various stages of development. Google is purported to be profitable already, and much of its revenue derives from a stream that is supposed to be dead: CPM-based advertising. Bob Thomas, Director of Marketing for the I-Business Unit of FAST Search, notes that IBM’s public web sites saw significant increases in page views following the installation of improved site search functionality. If their recently rejuvenated stock market valuations are any indication, even struggling companies like Ask Jeeves, Inktomi, and LookSmart may be turning the corner.

Corporations are beginning to realize that whether it’s on a public e-commerce web site or behind the corporate firewall, the Internet is primarily a mechanism for classifying, searching for, and retrieving information. The search technology sector is neither a basket case nor a charity case, given the ROI that is associated with allowing users to access the information they need.

Search companies will do better if they and their venture backers can remain fiercely independent against the forces that want to turn them into something they aren’t. In the last wave, AltaVista fell victim to the pressure to compete with AOL and Yahoo as a shopping mall and/or media company. There will be more of these pressures, and they should be resisted. Yahoo itself might even want to resist some of them.

Open Text, founded in 1991, exited the consumer search engine business in 1996 to focus on developing integrated corporate intranet software, in which search played a key but not exclusive role. Today, they’re a profitable example of how a low-key approach and a sustained process of discovering the needs of corporate customers can pay off in the long run.

I recently spoke with Jason Liebman, President of Naming Solutions, a division of Applied Semantics. Since its inception, Applied Semantics has been focused on one aspect of search engine technology – the ability to discern the meanings of search queries based on a “concept map” or proprietary lexicon rather than merely recognizing keywords. Last March, Applied Semantics co-founder Gil Elbaz expressed a refreshing degree of surprise that so many of the people interested in search engines are coming at things from a marketer’s perspective, seeing search technology as interesting only insofar as it can help sell their products. “I’ve had my head so much into meaning-based search technology,” offered Elbaz, “I hadn’t realized how many people were so focused on the marketing-to-search-engines side of things.”

Good for him for being so oblivious. The fact is, search engines were born to help users find stuff. In the context of the whole web, search engines work best when they cater to marketers’ interests only indirectly by playing the role of neutral referee amongst the many sources of information clamoring for the search engine user’s attention. If companies like Inktomi or Google start spending too much time figuring out how to make money for their advertisers, they’ll forget what excited consumers about search engines in the first place. As in any technological field, the right balance needs to be struck between basic and applied research. Too little focus on basic research, and you might not wind up with anything innovative or interesting to sell.

As it turns out, there have been plenty of applications for Applied Semantics’ technology. Their Naming Solutions division, which offers a variety of products to help domain name registrars provide name suggestions to their clients and thereby increase their sales, is exceeding revenue targets. According to Liebman, 98% of domain name searches on registrars’ sites “don’t result in a sale.” Applied Semantics wants to help registrars improve on that figure, and has already partnered with most of the major players, including Register.com, Verisign, and Yahoo.

Maybe the lesson here echoes the title of a popular career-planning guide: do what you love, and the money will follow.

Large diversified media companies such as AOL, MSN, and Terra Lycos, admittedly, do not necessarily need cutting edge search technology as long as they can keep their databases relatively free of spam. There is even a theory that erratic, unreliable search engine and directory technology helps companies like these make more money by forcing more advertisers to pay for placement if they want to be seen. Thankfully, companies like Google and Applied Semantics are showing that you can not only make the world a better place by building a better mousetrap, but you can also thrive financially without having to dumb down your pursuit of new horizons in search.

An earlier version of this article appeared in a recent edition of Internet Markets, a European telecommunications trade journal.
