Start your spiders

Web search software will uncover treasures, but not always what you're looking for

By Simson Garfinkel

Special to the Mercury News

LOOKING for the lyrics for a new song you heard on the radio? Want to locate your high-school sweetheart? Trying to learn the fat content of a banana?

The answers may well be on the Internet -- if you only knew where to look. The looking is getting easier, thanks to the increasing number of companies offering Net-search software.

Their programs, sometimes called ``spiders'' or ``Web crawlers,'' scan all of the pages they can find on the World Wide Web. They build an index of words and then make that index available for searching through some kind of Web page of their own.
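As a rough illustration of the indexing step described above, here is a short Python sketch that builds a word index over a few made-up pages held in memory. The page addresses, the page text and the function name build_index are all hypothetical. A real spider would also fetch pages over the network and follow their links, which this sketch leaves out.

```python
import re
from collections import defaultdict

# Hypothetical stand-ins for pages a spider has already fetched.
PAGES = {
    "http://example.com/fruit": "Bananas are a fruit. Bananas grow in bunches.",
    "http://example.com/music": "Thirty thousand pounds of bananas, a song.",
    "http://example.com/money": "Currency codes for every nation.",
}

def build_index(pages):
    """Map each lowercased word to the set of page addresses containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        # Split the text into simple alphanumeric words.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

index = build_index(PAGES)
# Look up one word, the way a one-word search request would.
print(sorted(index["bananas"]))
```

Searching then amounts to looking up each query word in the index rather than rereading every page.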

Some search engines are consistently among the fastest; others are faster at some times of the day, but not at others. And some of the search engines simply do a better job finding the kinds of documents you happen to be looking for.

With all of this in mind, I tried some of today's most prominent Internet searching tools. I looked only at the first page of results returned by each search query -- typically 10 to 20 suggested answers or ``hits.''

I used three sample queries:

-- ``What is Bob Dole's position on abortion?'' -- The Net is full of information about politics and politicians. This seemed like a question that would produce results.

-- ``SWIFT currency codes'' -- SWIFT is an international funds transfer network. Recently Bernard Lyons, a senior information systems specialist working for Claris in Ireland, spent two months looking for this information on-line. (He finally found it.) I wondered if the search engines could do better.

-- ``Bananas'' -- A 9th grader writing a research report on bananas would probably just type in this one-word search request and then look at the results, so I did the same.

There's only one Internet, but different search tools processed the same question in different ways. Some sites only returned the pages that contained all of the words typed into the search-field. Others tried to figure out the sense of the query.

We might have obtained different results if we'd searched for different phrases.

Some returned pages that didn't match the search at all; Digital's Alta Vista, for example, claimed to find more than 40,000 pages matching the search for ``SWIFT Currency Codes,'' even though it only found 6,896 pages with the word ``SWIFT.'' Those remaining 33,000 pages had something to do with ``currency'' or ``codes.''
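The gap between those two counts is what one would expect if an engine counted pages containing any of the query words rather than all of them. A minimal Python sketch of the two matching rules follows. The tiny index and the function names match_all and match_any are made up for illustration and do not describe how any of the services reviewed here actually work.

```python
# Hypothetical word index: each word maps to the set of pages containing it.
INDEX = {
    "swift": {"page1", "page2"},
    "currency": {"page2", "page3", "page4"},
    "codes": {"page2", "page4", "page5"},
}

def match_all(index, query):
    """Pages that contain every word in the query (the strict rule)."""
    sets = [index.get(w, set()) for w in query.lower().split()]
    return set.intersection(*sets) if sets else set()

def match_any(index, query):
    """Pages that contain at least one word in the query (the loose rule)."""
    sets = [index.get(w, set()) for w in query.lower().split()]
    return set.union(*sets) if sets else set()

query = "SWIFT currency codes"
# Compare how many pages each rule returns for the same query.
print(len(match_all(INDEX, query)), len(match_any(INDEX, query)))
```

Under the loose rule, the number of matching pages can exceed the number of pages that contain any single query word.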

All of the search services give you a way to refine your search and try again. But given the differences I found among the services, a more intelligent approach may be to try a different search service instead.

Before I started this survey, I was a big fan of Digital's Alta Vista, which is among the fastest search engines and has incredibly broad coverage. After looking carefully at what each service has to offer, I've decided to switch my primary search bookmark to Excite.

But don't take my word for it: Try all of the services yourself. Then go back and try them again a few months from now. Things on the Net are changing rapidly.

Excite

http://www.excite.com/

Excite started off as an index of Usenet news. Today it also indexes the Web and has reviews.

Coverage: Web documents, reviews; Usenet, including classified ads. Excite provides a summary of each document found.

Usability: Excite is very fast, has an easy-to-understand interface and provides the user with a brief summary of each document.

Dole: It found an NBC News article on Microsoft Network, as well as Dole Watch, The New Republic, NetVote 96's article on Abortion, Political Woman Hotline and the NPAT (National Political Awareness Test) survey results.

Swift: Excite returned many links about international currency, including a link from the University of British Columbia listing the names, symbols and ISO currency codes for every nation of the world.

Bananas: Excite found ``Advice to Sellers of Bananas,'' with ``five steps to healthy bananas and stronger profits,'' courtesy of Turbana Corporation, a banana distributor. It also located a copy of Harry Chapin's song, ``Thirty Thousand Pounds of Bananas.''

Alta Vista

http://altavista.digital.com/

Alta Vista is Digital Equipment Corp.'s Web searching site.

Coverage: Web & Usenet. Claims 11 billion indexed words from 22 million Web pages.

Usability: Searches typically took less than two seconds. Another search form tops each return of search info, making it easy to start a second search.

Dole: I found ``Dole's Abortion Statement Sets Off GOP Fireworks,'' a December 1995 New York Times article residing on the Pat Buchanan Web site. I also found a University of Pennsylvania student's page, ``Where He Stands,'' with a link to Project Vote Smart, as well as a link to an anti-abortion organization.

Swift: The first hit had information on wiring money to a Latvian Savings Bank. Another was a mail-archive message regarding ISO Currency Codes, in which a bank vice president said that SWIFT currency codes were the same as the ISO currency codes. No link on the first page revealed the codes themselves.

Bananas: First up was ``Mangos, Bananas and Coconuts: A Cuban Love Story,'' by Himilce Novas. The second was W5GB, ``Green Bananas,'' an amateur radio club at New Mexico State University. Also found: a link to Bananas in Space. The 10th link was to http://www.safari.net/~lychee/, where we learned that bananas are in the family Musaceae. I found lots of scientific facts about bananas here, including the fact that bananas are used to treat bronchitis and fungus disease.

Lycos

http://www.lycos.com/

Lycos was one of the first Web crawlers. Today it claims 3.5 million unique URLs or Web addresses in its data base.

Coverage: Lycos Catalog, A2Z directory, and Point Reviews.

Usability: Provides a brief summary of each hit. At the bottom of each search page is a new search field that allows you to refine your search.

Dole: The first match was Politics USA's ``Sen. Bob Dole on the Issues,'' a comprehensive list of Dole's positions. Second was a page from Foster's Daily Democrat, a New Hampshire newspaper, with more comprehensive information and a link to the official Bob Dole WWW site.

Swift: First up was a series of tables from a book that contained a list of ISO currency and country codes, on a Web site belonging to IDRC Books. Second was an FTP site with ISO currency and country codes.

Bananas: ``There are virtually no bananas on the World Wide Web, but this is as much as I could find,'' writes Abdon Pijpelink, a Dutch student who created the ``Everything you ever wanted to know about Bananas'' Web page. Lycos also listed such things as recipes for Bananas Foster, Bananas at Large (a music company in Marin County) and Banana Software.

InfoSeek

http://www.infoseek.com/

InfoSeek is one of the most widely used search services.

Coverage: Web, Usenet newsgroups, FAQs (Frequently Asked Questions), ``Infoseek Selected Sites'' and ``Categories of Sites.''

Usability: Infoseek was frequently slow, taking 20 seconds or more to answer a query. It wasn't clear how to start a second search, other than by hitting the browser's Back button. The service gives a brief summary of each link.

Dole: InfoSeek found links to ProLife News and SqURL's Abortion Info Page, information from Planned Parenthood of Southeastern Pennsylvania, and even info about the Telecommunications Act of 1996 (which criminalized distribution of information about abortion on the Internet), but the first 10 hits returned no pages from what one could assume to be a trusted source.

Swift: The first hit was ``Marshall & Swift,'' a firm that works in the insurance industry. It also found Roderick Swift's Home Page, and shareware for 3-D water and sediment flow visualization -- but no financial codes.

Bananas: Infoseek found links to banana breeders, recipes for bananas and the Costa Rica Natural Paper Company, which makes paper from bananas.

Webcrawler

http://www.webcrawler.com/

Webcrawler was developed by Brian Pinkerton, a student at the University of Washington, and sold last year to America Online. Lately it feels a little moribund. Its research findings about the Internet haven't been updated since August 1994, for example.

Coverage: Web.

Usability: Whereas the other search engines return some content information about the site that is found, Webcrawler just returns the title and a score for relevancy. There was no obvious way to start a new search other than returning to the home page.

Dole: Webcrawler found 16 documents. The first was Project Vote Smart's reporting of the NPAT, which states which policies a politician would support if elected -- a direct hit. Webcrawler also found Libertarian Review of the candidates, Alan Keyes' Web page, and a ``Pete Wilson Exposed'' page on the Democratic Party's Web site.

Swift: Webcrawler found six documents including the Loompanics Unlimited catalog, but none hit the correct topic.

Bananas: Webcrawler found information about Banana Programming, Joe Banana's photographs from all over the world, Banana Bungalow Hostels and even the nutritional information for Healthy Choice's Bananas Foster premium low fat banana ice cream, but no information about bananas themselves.

Magellan

http://www.mckinley.com/

Magellan is an Internet search service offered by the McKinley Group.

Coverage: Web, Magellan's ratings & abstracts.

Usability: Magellan allows the search to be restricted to sites that have a minimum ``Green Light'' rating. The service can return short, medium or long descriptions.

Dole: Magellan's first hit was ``The Unofficial Bob Dole Home Page'' at the University of Pennsylvania. Second was The Osiris Weekly Trivia Quiz, which mentioned Bob Dylan, not Bob Dole. Magellan also found the Michael Jordan Home Page, Greenpeace, the comedy troupe The Capitol Steps and even Dole Banana's Dole 5 a Day site, but nothing from the Bob Dole campaign in the first page of hits.

Swift: Magellan found an International Financial Encyclopedia, which has the codes displayed under each country's entry. It also found the Global Network Navigator/Koblas Currency Converter, probably useful for people who are looking for the SWIFT codes.

Bananas: It found only seven sites with the word, including a Usenet newsgroup, SlugWeb (information about the banana slug), two sites featuring insect recipes and Maui's Island Doll Shoppe. Insect Recipes got two stars.

Yahoo & Open Text

http://www.yahoo.com/

Yahoo was one of the original Internet guides. If there is no matching category, as was the case with Dole and Swift, Yahoo hands the search over to Open Text, another Internet search service.

Coverage: Yahoo & Web.

Usability: We liked Yahoo's categories, but they are only useful if the thing that you are searching for has already been found and categorized. Open Text's coverage was spotty at best.

Dole: Yahoo's search service couldn't find anything from the query, which is surprising considering that Yahoo has an excellent set of Dole links. Instead, Yahoo sent the query to Open Text, which returned two broken links to Usenet archives, a National Press Club speech by Phil Gramm, an ``Open Letter to Bob Dole'' on ``The Right Side of the Web'' and a copy of the July 17, 1995 issue of Political Woman Hotline attacking Bob Dole's ``pre-election conversion'' to family values.

Swift: Again, Yahoo failed. Open Text found a copy of the World Factbook 1992, a 2.2 megabyte file from the Central Intelligence Agency that contains information about every country. Buried in here we found the SWIFT currency codes, but finding them was a lot of work. Many users would be unable to download the file because of its size.

Bananas: Yahoo has an entire category for bananas. But the only thing in this category was a digitized collection of Banana fruit labels from all over the world, and nothing about the fruit itself. Open Text found 1,587 pages containing the word ``bananas.'' The first 10 returned were all from the same unhelpful (at least in this circumstance) Web site, Bananas at Large, the Marin County music company.


©1996 Mercury Center. The information you receive on-line from Mercury Center is protected by the copyright laws of the United States. The copyright laws prohibit any copying, redistributing, retransmitting, or repurposing of any copyright-protected material.