
Information gathering on the WWW

IICA collects WWW pages in one of two ways: (1) by retrieving them over HTTP, or (2) by searching its archive of previously gathered WWW pages. In the former case, IICA passes a URL to its socket modules, which access the specified host and retrieve the specified page. The gathered page is then added to the archive. IICA manages all pages in the archive through its file table. In the latter case, IICA searches the archive using the file table.
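The two access paths can be illustrated with a minimal Python sketch. It is not IICA's actual code: the class name PageArchive, the methods fetch and search, the on-disk layout, and the keyword-matching search are all hypothetical, and the file table is assumed to be a simple mapping from URL to the path of the archived copy. The standard urllib module stands in for IICA's socket modules.

```python
import hashlib
import os
import urllib.request


def fetch_over_http(url):
    """Download a page over HTTP and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


class PageArchive:
    """Hypothetical archive of gathered pages, indexed by a file table."""

    def __init__(self, directory, downloader=fetch_over_http):
        self.directory = directory
        self.downloader = downloader
        # Assumed file table: URL -> path of the archived copy.
        self.file_table = {}
        os.makedirs(directory, exist_ok=True)

    def fetch(self, url):
        """Case (1): retrieve the page over HTTP and add it to the archive."""
        body = self.downloader(url)
        name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".html"
        path = os.path.join(self.directory, name)
        with open(path, "wb") as f:
            f.write(body)
        self.file_table[url] = path
        return body

    def search(self, keyword):
        """Case (2): search archived pages through the file table.

        Returns the URLs whose archived copy contains the keyword.
        Plain keyword matching is an assumption made for illustration.
        """
        needle = keyword.encode("utf-8")
        hits = []
        for url, path in self.file_table.items():
            with open(path, "rb") as f:
                if needle in f.read():
                    hits.append(url)
        return hits
```

The downloader is passed in as a parameter so that the archive logic can be exercised without network access, for example by supplying a function that returns canned page bodies.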

The algorithm is essentially a breadth-first search. The difference is that IICA evaluates each gathered page and uses that evaluation to decide which anchor (hyperlink) to access next.
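One way to read this is sketched below in Python, under stated assumptions. The functions gather and extract_anchors, the numeric score, and the threshold rule are hypothetical and are not taken from IICA. The sketch is a plain breadth-first traversal in which anchors are followed only from pages whose evaluation passes the threshold; the text above does not specify how IICA's evaluation actually chooses the next anchor.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorParser(HTMLParser):
    """Collects the href values of <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_anchors(base_url, html):
    """Return the absolute URLs of all anchors found in html."""
    parser = AnchorParser()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


def gather(start_url, get_page, evaluate, threshold=0.5, max_pages=50):
    """Breadth-first gathering guided by page evaluation (hypothetical).

    get_page(url) returns the page as text; evaluate(url, html) returns
    a numeric score. Both are supplied by the caller.
    """
    queue = deque([start_url])
    seen = {start_url}
    gathered = []
    while queue and len(gathered) < max_pages:
        url = queue.popleft()
        html = get_page(url)
        score = evaluate(url, html)
        gathered.append((url, score))
        if score < threshold:
            # Assumed rule: do not follow anchors from low-scoring pages.
            continue
        for link in extract_anchors(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return gathered
```

With a constant evaluation function that always passes the threshold, this reduces to an ordinary breadth-first crawl, which makes the effect of the evaluation step easy to see in isolation.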


mitiak-i@aist-mandara-net
Tue Jul 30 14:26:54 JST 1996