W3 Catalog History

This page briefly describes the CUI W3 Catalog, how it got started, and why it has come to an end.

What is W3 Catalog?

W3 Catalog was one of the first search engines that attempted to provide a general searchable catalog for WWW resources. It ran from September 2, 1993 to November 8, 1996, at the Centre Universitaire d'Informatique (CUI) of the University of Geneva.

How did W3 Catalog get started?

The World Wide Web was developed by Tim Berners-Lee at CERN in the early 1990s. Initially, the only widely available browsers were purely textual, and the only graphical browser was Tim's NeXT implementation. This changed in the spring of 1993, when NCSA introduced its Mosaic browser for X platforms. At the same time, multiple servers for different platforms became available.

Since CERN was just up the road from the University of Geneva, we invited Tim to give a seminar on the WWW with a live demo. For the demo we set up our own server, and on June 25, 1993, the CUI server was publicly announced.

Although the navigational possibilities of the Web were self-evident, it was not clear how one could (or should) provide query facilities for Web resources. We had a number of applications waiting for an easy way to be implemented on the Web (such as a searchable interface to the CUI library database).

At the time, the CERN HTTP server seemed to be very difficult to configure to run "active pages" (i.e., pages whose output would be dynamically generated). The search for a better platform led us to switch on August 10, 1993 to the Plexus server, implemented in Perl. This made it very easy for us to install ad hoc search engines for The Language List and the OO Bibliography Database as Perl packages directly integrated into our server.

Since the ability to set up a search engine seemed generally useful, I decided to implement a simple, configurable Perl package for the Plexus server that could easily be adapted to different applications. (Basically, all the package, called parscan.pl, would do is return all the HTML paragraphs matching an ISINDEX HTML query string.) Parscan was made available on September 2, 1993.
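To make the idea concrete, here is a minimal sketch of paragraph matching in the spirit of parscan. It is written in Python rather than Perl and is not parscan's actual code. The function name matching_paragraphs is hypothetical, and so are the matching rules: it assumes that the query arrives as ISINDEX keywords separated by "+", that paragraphs are delimited by <P> tags, and that a paragraph matches only if it contains every keyword, ignoring case.

```python
import re
from urllib.parse import unquote


def matching_paragraphs(html: str, query: str) -> list[str]:
    """Return the paragraphs of `html` that contain every keyword in `query`.

    Hypothetical helper for illustration only; the real parscan.pl may have
    split and matched paragraphs differently.
    """
    # Split the document at each opening <P> tag (case-insensitive).
    # Closing </P> tags, if present, are left in the text (a simplification).
    paragraphs = [p.strip() for p in re.split(r"<p\b[^>]*>", html, flags=re.IGNORECASE)]

    # ISINDEX keywords arrive separated by "+", with special characters
    # percent-encoded, so decode each keyword after splitting.
    words = [unquote(w).lower() for w in query.split("+") if w]
    if not words:
        return []

    # Assumption: a paragraph matches only if all keywords occur in it.
    return [p for p in paragraphs if p and all(w in p.lower() for w in words)]
```

Called on a page of <P>-separated list entries with the query "perl+mosaic", this would return only the entries that mention both words.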

At the same time, I noticed that many industrious souls had gone about creating high-quality lists of WWW resources, and made these lists available as part of other services, such as CERN's WWW Virtual Library. The only problem with these lists was that they were not searchable. With parscan available, a simple solution suggested itself: make the lists searchable.

The search engine (briefly called "jughead", but soon renamed "w3catalog") was announced on September 2, 1993.

How did W3 Catalog evolve?

Essentially very little has changed since W3 Catalog was initially installed. The main changes are:

Parscan itself evolved independently, since it served as a generic implementation for multiple search engines.

In October 1993, several HTTP servers adopted a convention that would allow active pages to be implemented as separate programs and scripts. Such a program would be put in a standard directory (called "htbin"), and files found there would be executed by the server instead of displayed as text or as HTML.
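The htbin convention can be sketched roughly as follows. This is only an illustration in Python of the dispatch rule described above, not code from Plexus or any other server of the time. The function name serve is hypothetical, and the sketch assumes that scripts are Python files run with the current interpreter, which the original servers certainly did not require.

```python
import subprocess
import sys
from pathlib import Path


def serve(doc_root: Path, url_path: str) -> bytes:
    """Return the response body for `url_path` under `doc_root`.

    Hypothetical sketch: files under the "htbin" directory are executed and
    their output is returned; all other files are returned as stored.
    """
    relative = Path(url_path.lstrip("/"))
    root = doc_root.resolve()
    target = (root / relative).resolve()

    # Refuse any path that escapes the document root (e.g. via "..").
    if root not in target.parents:
        raise PermissionError(url_path)

    if relative.parts and relative.parts[0] == "htbin":
        # Assumption: the script is a Python file, so run it with the
        # current interpreter and capture whatever it prints.
        result = subprocess.run(
            [sys.executable, str(target)], capture_output=True, check=True
        )
        return result.stdout

    # Ordinary document: return its contents unchanged.
    return target.read_bytes()
```

The only point of the sketch is that the decision is made on the directory name alone: the same file would be displayed as text if it lived anywhere else under the document root.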

This seemed like an ideal way to open up parscan and make it available as a utility for multiple servers. On October 27, 1993, I released an htbin package for Plexus, allowing the Plexus server to run htbin scripts, and at about the same time I converted parscan so it could be run as an htbin script.

My timing was a little off, though, since shortly afterwards the CGI standard was proposed and adopted by most of the servers. Nowadays most simple active pages are implemented as "CGI scripts".

Finally, on May 12, 1994, parscan was reborn as a generic CGI script, and renamed htgrep. Htgrep is now used as a back-end for numerous search engines around the WWW.

Why has W3 Catalog been stopped?

Although W3 Catalog was very popular, it has been made obsolete for a number of technical and practical reasons.

Is there a life after death?

Well, in this case, I don't think so. W3 Catalog has had its run, and has been widely referenced in many of the early books on the WWW. There is still a need for a search engine that is based on high-quality, human-managed lists, but a coordinated effort is needed to provide a common interface to such lists. Experience shows that individual list maintainers are generally reluctant to switch to a standard format if they feel this will make more work for them. (If you have ever maintained a list, you will sympathize with this position!) Still, if anyone is interested in reviving such an effort, I will be happy to offer my 2 cents worth.

Oscar Nierstrasz
November 8, 1996