The SCG web site includes a citation search engine called scgbib implemented on the server side. In the current implementation, users enter a simple text query, and the search engine displays all matching bibliography entries formatted as HTML.
The current database is a 5MB text file in bibtex format. The database resides in a git repo available at git://scg.unibe.ch/scgbib. The server caches the file and reloads it whenever it detects that the file has been updated.
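The reload-on-update behaviour described above can be sketched as follows. This is only an illustration, not the actual scgbib server code: the names `FileSource`, `modifiedAt`, `load` and `ReloadingCache` are hypothetical, and the sketch assumes the server detects updates by comparing a modification timestamp.

```typescript
// Hypothetical abstraction over the bibtex file on disk (an assumption,
// not the real scgbib server interface).
interface FileSource {
  modifiedAt(): number; // last-modified timestamp of the file
  load(): string;       // full contents of the file
}

// Keeps the file contents in memory and re-reads them only when the
// modification timestamp differs from the one seen at the last load.
class ReloadingCache {
  private source: FileSource;
  private contents: string | null = null;
  private loadedStamp = -1;

  constructor(source: FileSource) {
    this.source = source;
  }

  get(): string {
    const stamp = this.source.modifiedAt();
    if (this.contents === null || stamp !== this.loadedStamp) {
      this.contents = this.source.load();
      this.loadedStamp = stamp;
    }
    return this.contents;
  }
}
```

With such a cache, repeated queries reuse the in-memory copy, and the file is read again only after the repository has been updated.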
Whenever the user enters a new query, or requests the results to be formatted in a different way (e.g., by author, by year, including abstracts, etc.), a new query is generated and evaluated on the server. This creates a considerable delay for each request.
As part of the project, the following questions should be addressed:
- Is it reasonable to expect a significant speedup from moving the search to the client? Implement a simple prototype that loads a 5MB file in a client browser, and run some benchmarks to see how quickly the data can be queried and formatted.
- Should the database be transferred in bibtex format, or pre-processed to JSON? In the first case, we could use a tool like bibtex-js to parse and translate bibtex on the fly in the browser. In the second case, we could use citation.js or Pandoc to translate bibtex to JSON on the server, avoiding any bibtex parsing on the client.
- What is the best way to query the database? Is something like jQuery.grep enough, or should we use some kind of client-side JSON database?
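As a possible starting point for the prototype, the sketch below filters and re-sorts entries entirely in memory, and times each step. It assumes the database has already been converted to a JSON array; the `BibEntry` shape and the functions `search`, `sortBy` and `timeMs` are hypothetical names chosen for illustration. The filter uses the built-in `Array.prototype.filter`, which plays the same role as a grep-style helper.

```typescript
// Assumed shape of one pre-processed entry (hypothetical, for illustration).
interface BibEntry {
  key: string;
  author: string;
  title: string;
  year: number;
  abstract?: string;
}

// Returns the entries in which every whitespace-separated query term occurs,
// case-insensitively, in the key, author, title or year.
// An empty query matches all entries.
function search(entries: BibEntry[], query: string): BibEntry[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return entries.filter((e) => {
    const haystack = `${e.key} ${e.author} ${e.title} ${e.year}`.toLowerCase();
    return terms.every((t) => haystack.indexOf(t) !== -1);
  });
}

// Returns a sorted copy: newest first by year, or alphabetically by author.
// The input array is left unchanged, so results can be re-sorted freely.
function sortBy(entries: BibEntry[], field: "author" | "year"): BibEntry[] {
  const copy = entries.slice();
  if (field === "year") {
    copy.sort((a, b) => b.year - a.year);
  } else {
    copy.sort((a, b) => a.author.localeCompare(b.author));
  }
  return copy;
}

// Runs fn once and reports the elapsed wall-clock time in milliseconds.
// Date.now() is coarse; in a browser, performance.now() gives finer resolution.
function timeMs<T>(fn: () => T): { result: T; ms: number } {
  const start = Date.now();
  const result = fn();
  return { result, ms: Date.now() - start };
}
```

For a benchmark, one could load the real database, call `timeMs(() => search(entries, "some query"))` and `timeMs(() => sortBy(hits, "year"))` repeatedly, and compare the timings with the round-trip time of the current server-side search.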
See also: How to link to the SCG Bibliography
Contact: Oscar Nierstrasz