The workshop will be held in the Senate House of the University of London.
On arrival on the morning of 3rd January, please come to the WebDyn registration desk, which will be located on the ground floor of the Senate House.
| 8:30--9:00 | Registration |
| 9:00--9:15 | Welcome |
| 9:15--10:15 | Invited Talk: Soumen Chakrabarti, Indian Institute of Technology, Bombay (abstract) |
| 10:15--10:45 | Coffee |
| 10:45--12:15 | A Web Site Navigation Engine (demo). Mark Levene, Richard Wheeldon, Jon Bitmead |
| | Toward a Structured Information Retrieval System on the Web: Automatic Structure Extraction of Web Pages. Mathias Gery, Jean-Pierre Chevallet |
| | Learning of Ontologies from the Web: the Analysis of Existent Approaches. Borys Omelayenko |
| 12:15--1:10 | Lunch |
| 1:00--2:00 | Invited Talk: Knut Magne Risvik, R&D Director Search Technology, Fast Search & Transfer ASA (abstract) |
| 2:00--2:30 | An Active Web-based Distributed Database System for E-Commerce. Hiroshi Ishikawa, Manabu Ohta |
| 2:30--3:00 | Coffee |
| 3:00--4:00 | Invited Talk: Luca Cardelli, Microsoft Research, Cambridge. Logics for Mobility (abstract) |
| 4:15--5:15 | A Probabilistic Approach to Model Adaptive Hypermedia Systems. Mario Cannataro, A. Cuzzocrea, Andrea Pugliese |
| | Run-time Management Policies for Data Intensive Web sites. Christos Bouras, Agisilaos Konidaris |
| 5:15 | Discussion/Close |
| 7:00 | Meet for dinner at Chutney's Vegetarian/Vegan Indian Restaurant (optional!), 124 Drummond Street, London NW1 2PA |
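
Toward a Structured Information Retrieval System on the Web: Automatic Structure Extraction of Web Pages
Mathias Gery, Jean-Pierre Chevallet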
The World Wide Web is a distributed, heterogeneous and semi-structured
information space. With the growth of available data, retrieving interesting
information is becoming quite difficult, and classical search engines often
give very poor results. One of the main problems is the lack of explicit
structure in HTML pages and, more generally, in Web sites. We show in this
paper that it is possible to extract such a structure, which can be expressed
in different ways: semantic cut-off in a linear document, hypertext links
(links, frames, etc.) between several pages, dynamic pages, etc. We present
some preliminary results of an analysis of a Web sample, extracting several
levels of structure (a hierarchical tree structure, a graph-like structure),
independently of the way in which this structure is implicitly described.
Finally, we show how we will use this approach in an Information Retrieval
System (IRS) that is based on a structured IR model and thus manipulates a
structured index of the Web. This IRS will use some IR methods used in the
context of structured documents and hypertexts.
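
As a rough illustration of what extracting "a hierarchical tree structure" from a page can mean, the sketch below nests the headings of an HTML document by level. It is not the authors' method: it assumes that heading tags (h1 to h6) are the only structural cue, and the names `HeadingOutline`, `build_tree` and `extract_outline` are hypothetical.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1..h6 elements (assumed structural cue)."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (level, text) in document order
        self._level = None   # level of the heading currently being read
        self._buffer = []    # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            text = " ".join("".join(self._buffer).split())
            self.headings.append((self._level, text))
            self._level = None


def build_tree(headings):
    """Nest headings by level into {"title", "children"} dictionaries."""
    root = {"title": None, "children": []}
    stack = [(0, root)]  # (level, node) pairs along the current branch
    for level, title in headings:
        node = {"title": title, "children": []}
        # Climb back up until the top of the stack is a strictly higher-level heading.
        while stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root


def extract_outline(html):
    """Parse an HTML string and return its heading hierarchy as a tree."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return build_tree(parser.headings)
```

A page with an h1 followed by two h2 sections would yield a root with one child holding two children. Real pages often encode structure implicitly (fonts, tables, frames), which this sketch does not attempt to recover.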

Learning of Ontologies from the Web: the Analysis of Existent Approaches
Borys Omelayenko
Everybody knows how difficult it is to search for and extract necessary
information from the Web. The next generation of the Web, called the Semantic
Web, has to improve the Web with semantic (ontological) page annotations to
enable knowledge-level querying and searches. Manual construction of these
annotations will require tremendous effort, which forces the future
integration of machine learning with knowledge acquisition to enable highly
automated ontology learning. The objective of the paper is to present the
state of the art and future directions of ontology learning from the Web. We
point out the requirements for machine learning algorithms to be applied to
ontology extraction from Web documents, and survey the existing ontology
learning and other closely related approaches. We investigate three
components of the approaches: the ontology domains treated, the learning
tasks that were automated and the machine learning technologies that were
applied. We identify several frequently used combinations of these components
and point to several ontology learning tasks that are quite important for the
future Web but are not currently a focus of research.

A Web Site Navigation Engine (demo)
Mark Levene, Richard Wheeldon, Jon Bitmead
Users navigating (or "surfing") through a web site often "get lost in
hyperspace": they lose the context in which they are browsing and are unsure
how to proceed in terms of satisfying their original goal. The unresolved
problem in web site usability, of assisting users in finding their way, is
termed the navigation problem. This problem is becoming even more acute with
the continuing growth of web sites in terms of their structure, which is
becoming more complex, and the vast amount of information they intend to
deliver. In contrast, users are not willing to invest time to learn this
structure and expect the delivery of the relevant content without delay. To
tackle this problem we are developing a navigation system for semi-automating
user navigation which builds trails of information, i.e. sequences of linked
pages, which are relevant to the user query. The preferred trails are
presented to the user in a tree-like structure which they can interact with.
This is in sharp contrast to a search engine, which merely outputs a list of
pages which are relevant to the user query without addressing the problem of
which trail the user should follow. We discuss the architecture of the
navigation system and give a brief description of the navigation engine and
user interface.
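
To make the notion of a trail concrete, the sketch below enumerates sequences of linked pages and ranks them by the relevance of the pages they visit. It is only an illustration, not the navigation engine described in the abstract: the exhaustive enumeration, the summed-relevance score, and the names `best_trails`, `links` and `relevance` are all assumptions made here.

```python
import heapq


def best_trails(links, relevance, starts, max_len=3, top_k=3):
    """Return the top_k (score, trail) pairs among acyclic trails of at most max_len pages.

    links     -- dict mapping a page to the pages it links to (assumed input)
    relevance -- dict mapping a page to its relevance to the query (assumed input)
    starts    -- pages from which trails may begin
    """
    scored = []

    def extend(trail):
        # Score a trail as the sum of the relevance of its pages (an assumption).
        score = sum(relevance.get(page, 0.0) for page in trail)
        scored.append((score, trail))
        if len(trail) == max_len:
            return
        for nxt in links.get(trail[-1], []):
            if nxt not in trail:  # keep trails acyclic
                extend(trail + [nxt])

    for start in starts:
        extend([start])
    return heapq.nlargest(top_k, scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    site = {"home": ["products", "about"], "products": ["widgets"]}
    scores = {"home": 0.1, "products": 0.5, "widgets": 0.9}
    for score, trail in best_trails(site, scores, ["home"]):
        print(round(score, 2), " -> ".join(trail))
```

Unlike a flat result list, each returned item is a path the user could follow link by link. Exhaustive enumeration is only workable on small sites, so a practical system would need a more selective search.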

An Active Web-based Distributed Database System for E-Commerce
Hiroshi Ishikawa, Manabu Ohta
Electronic Commerce (EC) business models like e-brokers on the Web use
WWW-based distributed XML databases such as product and customer databases.
To flexibly model such applications, we need a modeling language for EC
businesses, specifically for their dynamic aspects, or business processes. To
this end, we have adopted a query language approach to modeling, extended it
by integrating active database functionality, and designed an active query
language for WWW-based XML databases, called XBML, suitable for specifying
and controlling EC business processes. In this paper, we explain and validate
the functionality of XBML by specifying e-broker and auction business models,
and describe the implementation of the XBML server, focusing on distributed
query processing in the WWW context.