Understanding WWW Search Tools


By Jian Liu (jiliu@indiana.edu)
Reference Department, IUB Libraries

URL: http://www.indiana.edu/~librcsd/search/


First Draft: September 1995
First Update: February 1996

Introduction

Generally speaking, there are two major types of internet databases/search tools that assist people in locating internet resources.

The first type arranges internet resources according to some sort of classificatory scheme: alphabetical, chronological, geographical, subject-oriented, or a combination thereof. Here are some typical examples.

The main function of these general listings is to support easy browsing, although most of them provide searching as well. They require a great deal of human effort to collect, arrange, code in HTML and annotate resources.

The second type, which we are concentrating on today, attempts to collect and index resources in a more automatic fashion and does not require extensive human intervention. Searching, rather than browsing, is the main feature of this type of tool.

These search engines/tools have two components: collection and search. The collection component (an automated robot, also known as a Wanderer, Spider, Harvest or Pursuit) roams internet sites, mostly WWW, Gopher and FTP sites, brings back resources, and sorts and indexes them to create a database. The metaphorical definition of Lycos, one of the most popular search engines, sheds some interesting light on the collecting activity:

Lycos comes from Lycosidae, a cosmopolitan family of relatively large active ground spiders that catch their prey by pursuit, rather than in a web. They are noted for their running speed, and are especially active at night.

The search component concerns the end user. It is the interface between the human searcher and the indexed database of resources.
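The division of labor between the two components can be illustrated with a minimal Python sketch. It is not how any particular engine is implemented: the in-memory `TOY_WEB` dictionary, its example URLs, and the `collect` and `search` function names are all assumptions made for illustration, and a real robot would fetch pages over the network rather than read them from a dictionary.

```python
from collections import defaultdict, deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://a.example/": ("Wolf spiders hunt by pursuit",
                          ["http://b.example/", "http://c.example/"]),
    "http://b.example/": ("Gopher and FTP archives of spider research",
                          ["http://a.example/"]),
    "http://c.example/": ("Library catalogs on the web", []),
}

def collect(start_url, web):
    """Collection component: follow links breadth-first and build an inverted index."""
    index = defaultdict(set)   # word -> set of URLs whose text contains it
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
        for link in links:
            # Only follow links we can reach and have not visited yet.
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

def search(index, query):
    """Search component: return URLs containing every query word (implicit AND)."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

index = collect("http://a.example/", TOY_WEB)
print(search(index, "spiders pursuit"))
```

The sketch matches whole words only, so a query for "spider" does not find a page that mentions only "spiders". Differences of this kind are among the search features that vary from one engine to another.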

There are several factors that determine the success of a search engine, chief among which are: the size, content and currency of the database, the speed of searching, the availability of search features, the interface design and ease of use.

We are going to introduce some of the major search engines, comparing their features in terms of the success factors mentioned above and pointing out their database characteristics. In doing so, we hope you will become more familiar with these tools, more aware of their tricks and features, and able to find what you are looking for with more comfort and ease.


Major WWW Search Tools

WWWW - World Wide Web Worm
URL: http://wwww.cs.colorado.edu/wwww/
Developer: Oliver A. McBryan, University of Colorado at Boulder
Other Info:
Features:
Comments and Search Tips:

WebCrawler
URL: http://webcrawler.com/
Developer: Brian Pinkerton, University of Washington
Other Info:
Features:
Comments and Search Tips:

Lycos: The Catalog of the Internet
URL: http://www.lycos.com/
Developer: Dr. Michael L. Mauldin, Carnegie Mellon University
Other Info:
Features:
Comments and Search Tips:

InfoSeek Guide
URL: http://guide.infoseek.com/
Developer: Steve Kirsch for InfoSeek Corporation
Other Info:
Features:
Comments and Search Tips:

Open Text Web Index
URL: http://www.opentext.com/omw/f-omw.html
Developer: Tim Bray for Open Text Corporation
Other Info:
Features:
Comments and Search Tips:

Three Newcomers

excite Netsearch
URL: http://www.excite.com/
Developer: Architext Software
Other Info:
Features:
Comments and Search Tips:

Inktomi
URL: http://inktomi.berkeley.edu/
Developers: Paul Gauthier and Professor Eric Brewer, UC Berkeley
Other Info:
Features:
Comments and Search Tips:

Alta Vista
URL: http://altavista.digital.com/
Developer: Digital Equipment Corporation
Other Info:
Features:
Comments and Search Tips:

Search the Search Engines (Meta-Search Engines)

These tools send a single query to multiple search engines simultaneously and present the combined results.
SavvySearch: Parallel Internet Query Engine
IBM infoMarket Search Service
MetaCrawler Multi-Threaded Web Search Service
ProFusion
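The general idea behind a meta-search engine can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method used by any of the services listed above: `engine_a` and `engine_b` are hypothetical stand-ins that return fixed result lists, whereas a real tool would send the query to each engine over the network and parse the replies.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real search engines: each takes a query
# string and returns a ranked list of URLs.
def engine_a(query):
    return ["http://one.example/", "http://two.example/"]

def engine_b(query):
    return ["http://two.example/", "http://three.example/"]

def meta_search(query, engines):
    """Send the query to every engine in parallel, then merge without duplicates."""
    if not engines:
        return []
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        # pool.map returns results in the same order as the engines list.
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    merged, seen = [], set()
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

print(meta_search("wolf spiders", [engine_a, engine_b]))
```

Here duplicates are dropped and results are kept in engine order. How to rank results that come from engines with different scoring schemes is a separate design question that this sketch does not address.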

Appendices

Collections of Search Engines

Internet Sleuth
URL: http://www.intbc.com/sleuth/meta-index.html

CUSI - Configurable Unified Search Index
URL: http://www.eecs.nwu.edu/susi/cusi.html

W3 Search Engines
URL: http://cuiwww.unige.ch/meta-index.html

All-in-One Search
URL: http://www.albany.net/allinone/

SearchPlex
URL: http://www.west.net/~jbc/tools/search.html

Special Purpose Databases

DejaNews
For searching Usenet
URL: http://www.dejanews.com/

U.S. GovBot Database
More than 100,000 web pages from government sites
URL: http://www.business.gov/Search_Online.html

Z39.50 Gateway

Library of Congress
URL: http://lcweb.loc.gov/z3950/

ZWeb: Library Search Gateway
URL: http://zweb.cl.msu.edu/

InterCat: OCLC Internet Cataloging Project's Catalog of Internet Resources
URL: http://www.oclc.org:6990/