• Because of some download problems, the download size is limited to 1 MB. This is meant to ensure that the download procedure terminates.

> Well it does not! When trying to download the contents of, the read method of InputStream never terminated.
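
A size limit alone cannot help when `read` blocks forever, because the byte counter never advances. One possible remedy is to combine the limit with connect and read timeouts. The following is only a sketch under assumptions, not the KollektionsErfasser code: the class name `BoundedDownload`, both method names and the 10-second timeouts are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class BoundedDownload {

    /** Reads at most maxBytes from the stream and returns them. */
    public static byte[] readLimited(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        while (remaining > 0) {
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n < 0) {
                break; // end of stream reached before the limit
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    /** Downloads up to maxBytes from the address; the timeout values are illustrative. */
    public static byte[] download(String address, int maxBytes) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setConnectTimeout(10_000); // give up if no connection is made within 10 s
        conn.setReadTimeout(10_000);    // a blocked read() throws SocketTimeoutException after 10 s
        try (InputStream in = conn.getInputStream()) {
            return readLimited(in, maxBytes);
        } finally {
            conn.disconnect();
        }
    }
}
```

With a 1 MB limit the call would be `BoundedDownload.download(address, 1024 * 1024)`. The read timeout applies to each single `read` call, so a server that keeps sending data slowly is only stopped by the byte limit.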


  • Several website downloads fail because "subcategories" are no longer available. An example: is no longer available. One option could be to download the "root" website instead.

> This approach is not necessary. "Problematic websites" can be dismissed.


> For simplicity, only the source code of a website is necessary. An index search doesn’t require any images or videos to work. The result of an index search or a tag search shows the corresponding URL. The user can view the result in its original form on the internet.



  • The newest version of KollektionsErfasser. With this version several queries in a row are possible.


  • Because delicious was restructured, I changed the feasibility study. View it (in German) as doc or as pdf.
  • The newest version of KollektionsErfasser


  • The prototype of the master report as doc or pdf



  • Update of the "Pflichtenheft"


  • About downloading metadata

There seems to be a time limit and a download limit (125 MB). Fetching only the first 50 users for every website reduces the download size. That way I hoped to get more usersAndTags sites in one phase. But the delicious service still shuts down after approximately 300 files. At that point the size of the downloaded metadata was less than 10 MB.

  • About IP address blocking

Sometimes, when delicious blocks the IP address, a query is still possible, but the allUsersAndTags sites for the resulting hits are not available. This means there is no access to the users who bookmarked these results or to their tags.


  • New documents and updates of recent documents are available (see below).


  • Three new documents in German describing the master project more precisely are available.


  • A new document in German about Soekia 2.


  • Three documents describing the Use Cases and the GUI are available under "Documents" (see below).
  • Documentation of the current system architecture is also available.


Relevant Java classes:

  • This is the main class to interact with
  • Downloads a website from a URL. This class contains the method to set the filename (more precisely, the absolute file path).
  • This class implements a method to add Strings to an ArrayList in alphabetical order in a fast way (the insertion position is found in O(log N)).
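
The sorted insertion mentioned in the last item could look roughly like the sketch below. It is not the original class; the names `SortedStringList` and `addSorted` are hypothetical. The position is found by binary search in O(log N), while the insertion itself still has to shift the following elements of the `ArrayList`.

```java
import java.util.Collections;
import java.util.List;

public class SortedStringList {

    /** Inserts value so that the list stays sorted; the list must already be sorted. */
    public static void addSorted(List<String> list, String value) {
        int pos = Collections.binarySearch(list, value);
        if (pos < 0) {
            // value is absent: binarySearch returned -(insertion point) - 1
            pos = -(pos + 1);
        }
        list.add(pos, value); // duplicates are kept
    }
}
```

Adding "java", "delicious", "tagging" and "index" one after the other to an empty list yields the order delicious, index, java, tagging.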

For ranking purposes, the number of people who saved a certain website could be used. doesn’t rank this way. The problem with this approach is that it can be manipulated: one real person could open various accounts and save one website with each account.
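
A minimal sketch of such a ranking, assuming the downloaded metadata is available as a map from URL to the set of user names that saved it. The class name `SaveCountRanking`, the method name and the map layout are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SaveCountRanking {

    /** Returns the URLs ordered by the number of distinct accounts that saved them, highest first. */
    public static List<String> rankBySaves(Map<String, Set<String>> usersByUrl) {
        List<String> urls = new ArrayList<>(usersByUrl.keySet());
        urls.sort(Comparator
                .comparingInt((String url) -> usersByUrl.get(url).size())
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder())); // ties: alphabetical by URL
        return urls;
    }
}
```

The sketch counts accounts, not persons, which is exactly where the manipulation described above comes in.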


The URL-pattern to get websites for a query is


  • No matter how many results the query yields, a maximum of 1000 websites (URLs) can be downloaded.
  • Getting all tags can be problematic when too many users have saved a website (sometimes a hit is saved by over 10000 people).
  • To solve this problem, delicious offers two possibilities: view the first 50 users or view all users.

Example with 50 | Example with all


Delicious consists of several views, two of which are suitable for downloading websites and all their tags. "" lists the latest websites of this user. Example Some of these websites have also been saved by other people; this is the case if there is a "saved by" link. This link plus "?all" at the end shows all people and their tags for this website. Example

Websites (the URLs) can be read with Java Stream Classes.

Every file format can be downloaded from the internet, although some formats such as images, PDFs and movies take a long time.

Documents ⇒

  • Machbarkeitsstudie (feasibility study) as doc (5. November 08, author: Adrian Urfer)
  • Machbarkeitsstudie (feasibility study) as pdf (5. November 08, author: Adrian Urfer)
  • Social Tagging Projektbeschrieb (project description) as doc (15. July 08, author: Adrian Urfer)
  • Social Tagging Projektbeschrieb (project description) as pdf (15. July 08, author: Adrian Urfer)
  • Social Tagging Pflichtenheft (requirements specification) as doc (24. July 08, author: Adrian Urfer)
  • Social Tagging Pflichtenheft (requirements specification) as pdf (24. July 08, author: Adrian Urfer)
  • GUI for the KollektionsErfasser as doc (15. July 08, author: Adrian Urfer)
  • GUI for the KollektionsErfasser as pdf (15. July 08, author: Adrian Urfer)
  • The current master report as doc (15. July 08, author: Adrian Urfer)
  • The current master report as pdf (15. July 08, author: Adrian Urfer)

Most important Classes:

ch.swisseduc.kollektionserfasser ⇒ (Author: Mathias Dreier, changed by Adrian Urfer)

ch.swisseduc.kollektionserfasser.tag ⇒ (Author: Adrian Urfer)

ch.swisseduc.kollektionserfasser.util ⇒ (Author: Adrian Urfer) (Author: Adrian Urfer) (Author: Adrian Urfer)

Meeting with Werner / Diana

General: A search consists of 1 to 10 single words. A phrase such as "hello world" treated as one word isn’t allowed.
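
This rule could be enforced when the query is read. The sketch below is one possible interpretation, not an agreed design: the class name `QueryParser` is hypothetical, and it assumes that quotation marks are simply ignored, so a quoted phrase is split into its single words.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryParser {

    public static final int MAX_WORDS = 10;

    /** Splits a query into single words and rejects fewer than 1 or more than 10 words. */
    public static List<String> parse(String query) {
        // Assumption: quotes carry no meaning, so "hello world" becomes two words.
        String cleaned = query.replace('"', ' ').trim();
        List<String> words = new ArrayList<>();
        for (String w : cleaned.split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        if (words.isEmpty() || words.size() > MAX_WORDS) {
            throw new IllegalArgumentException("A query needs 1 to " + MAX_WORDS + " words");
        }
        return words;
    }
}
```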


  • Two groups of people search for the same websites. One group uses the index approach, the other one the tag approach.
  • Who is faster and how do they search?
  • A tag search should show the ranking of all documents containing any word or words. The ranking must be the same as in The rank of a site is the position of this site in a query. After N queries, there will be N sites with rank 1, N sites with rank 2, N sites with rank 3, ...
  • How do people tag websites and how many tags are necessary?
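
The rank definition used in the list above (the position of a site in the result list of a query) can be written down in a few lines. This is only an illustration; the class name `RankLookup` and the representation of a result list as a `List<String>` of URLs are assumptions.

```java
import java.util.List;

public class RankLookup {

    /** Returns the 1-based position of url in the result list, or -1 if it is not a hit. */
    public static int rankOf(List<String> results, String url) {
        int index = results.indexOf(url);
        return index < 0 ? -1 : index + 1;
    }
}
```

Comparing `rankOf` for the same site in the index search and in the tag search would show whether both approaches order their hits the same way.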

Problem with the official delicious Java API

This package only provides methods to get data from single delicious users if we know their passwords.

Last changed by admin on 21 April 2009