Research Highlights at DCSIS
This is a selection of the research projects currently running at DCSIS.
Search Query Semantics
Discovering Semantics of Search Queries through the Recognition and Classification of Named Entities
Search engine queries are more than just a sequence of words. Discovering the structure of a query can help search engines to identify users' search intent. As a result, search engines apply techniques such as query segmentation and word sense disambiguation in order to reveal and meet users' search requirements. A technique that has recently been applied to uncover the semantics of a query is named entity recognition, that is, the task of extracting from text instances of categories such as person, location, or company.
In this project, we propose a framework for the detection and classification of named entities in search queries. Typically, a web search query consists of only a few words and provides neither enough context nor surface clues, such as capitalisation, to detect named entities accurately. Our framework overcomes these challenges by applying a two-stage approach.
The first stage recognises candidate named entities by grammatically annotating query tokens, and sets the boundaries of named entities using query segmentation. The second stage classifies the extracted candidate named entities using the vector space model.
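As an illustration of the second stage, the following is a minimal sketch of vector space classification, not the project's actual implementation. It assumes that a candidate entity is represented by the words that surround it in queries, and that each class has a hand-built vector of typical context words. The function names and the sample data in the usage note are hypothetical.

```python
import math
from collections import Counter


def context_vector(queries, entity):
    """Count the words that appear around `entity` in the given queries.

    Assumption: plain substring matching is enough to locate the entity;
    a real system would use the segment boundaries found in stage one.
    """
    counts = Counter()
    for query in queries:
        if entity in query:
            remainder = query.replace(entity, " ")
            counts.update(remainder.split())
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(entity_vec, class_vecs):
    """Return the class whose vector is closest to the entity's context vector."""
    return max(class_vecs, key=lambda label: cosine(entity_vec, class_vecs[label]))
```

For example, with hypothetical class vectors {"location": Counter(hotels=3, weather=2, map=2), "person": Counter(biography=3, age=2, wife=1)} and the queries "hotels in paris", "paris weather" and "paris map", the candidate "paris" shares context words only with the location vector, so classify assigns it to "location".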
Mobile Location Recommendation
Recommending landmarks to mobile users by building a collective model of users' movements
The widespread use of mobile communication devices has generated a large amount of interest in location-based services, particularly services based on information about locations near the users of those devices. Examples include giving directions or advice about different routes and recommending nearby landmarks to users. The location of a user provides context to a recommendation, and a user model, built over time from the user's navigation traces, provides behavioural patterns which inform the type of recommendation to make.
In this research work, we are looking into mobile location recommendation from a collective model of users' movements. The collective model is built from an aggregate of many trails from multiple users, and thus does not compromise the privacy of any individual. Our research shows that an aggregate model can give accurate predictions, despite the loss of information about individual users. Moreover, the aggregate model has potential in providing social recommendations based on other users' preferences.
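One simple way to picture such a collective model is as a table of transition counts pooled over all trails. The sketch below is an assumption made for illustration (a first-order transition model), not the model used in the project. Trails carry no user identifiers, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_collective_model(trails):
    """Pool transition counts from anonymous trails.

    Each trail is a list of place names in visiting order. No user
    identifier is stored, so only aggregate behaviour is kept.
    """
    transitions = defaultdict(Counter)
    for trail in trails:
        for here, following in zip(trail, trail[1:]):
            transitions[here][following] += 1
    return transitions


def recommend(model, current, k=1):
    """Return up to k places most often visited straight after `current`."""
    counts = model.get(current, Counter())
    return [place for place, _ in counts.most_common(k)]
```

Given the hypothetical trails museum, cafe, park; museum, cafe, gallery; station, museum, cafe; and cafe, park, the pooled counts record that "cafe" was followed by "park" twice and by "gallery" once, so recommend(model, "cafe") returns ["park"]. A place that never appears in any trail yields an empty list.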
Integrating Description Logics and Database Technologies for Expressive Ontology-Based Data Access
EPSRC project (2010 - 2013)
We believe that the next generation of ontology-based information systems (ISs) should be based on a synthesis and an extension of ontology and database systems and techniques, providing data handling capabilities similar to current relational database management systems, but with schemas that are rich, flexible, and tightly integrated with the data. In order to achieve this ambitious goal, however, a number of challenging fundamental problems must be solved.
First, ontology and dependency languages need to be unified in a coherent theoretical framework.
Second, it will be necessary to identify fragments of the framework that are likely to exhibit robust scalability but can still support realistic use cases.
Third, it will be necessary to devise effective algorithmic techniques that can form the basis of practical ISs.
Prof Michael Zakharyaschev is PI at Birkbeck, Dr Roman Kontchakov is researcher-co-investigator, and Dr Stanislav Kikot is research fellow on the project.
Prof Ian Horrocks is PI at Oxford, with Prof Georg Gottlob, Prof Michael Benedikt, Dr Boris Motik and Dr Thomas Lukasiewicz co-investigators.
Computational Logic of Euclidean Spaces
The aim of the project is to investigate the computational properties of practically applicable spatial and spatio-temporal logics interpreted over well-behaved regions in 2- and 3-dimensional Euclidean spaces, and to develop and implement algorithms for reasoning with them.
1. Analyse the computational complexity of decidable topological representation formalisms over well-behaved regions in the Euclidean plane. Identify tractable fragments. Develop and implement reasoning procedures.
2. Analyse the computational complexity of decidable metric representation formalisms over well-behaved regions in the Euclidean plane. Develop and implement reasoning procedures.
3. Investigate topological and metrical representation formalisms over well-behaved regions in 3-dimensional Euclidean space and 3-dimensional spatio-temporal structures. Identify decidable fragments.
Prof Michael Zakharyaschev is PI at Birkbeck, Dr Roman Kontchakov is research fellow on the project;
Dr Ian Pratt-Hartmann is PI at Manchester.
Birkbeck participates in European iTalk2Learn project
iTalk2Learn is funded under the EU Framework Programme 7 and involves four universities and three companies from four European countries. The project seeks a better way of using technology to improve how young students (aged 5-11) learn mathematics. At this age, students are still developing their reading and writing skills, and this can be an obstacle when they try to learn using computers. Therefore, the three main tenets of the project are:
1. The use of innovative human-computer interaction modalities, in particular speech recognition designed specifically for use with children.
2. The combination of structured practice and exploratory activities to achieve more robust and long-term learning.
3. The use of machine learning to analyse the history of students and recommend future activities that are optimal for them.
Prof. Alex Poulovassilis and Dr. Sergio Gutierrez-Santos are leading the effort at Birkbeck, and Mr. Jose-Luis Fernandez-Gomez has recently joined the team. Another postdoctoral research assistant will join the project soon.
Knowledge Representation & Reasoning about Distances
Models of closeness in space
Distance spaces - as models of closeness in space or similarity of objects - play a fundamental role in such diverse fields as geographic information systems, computational molecular biology, text processing, and data mining.
Complementing existing database technology with knowledge representation methodologies is a promising approach to many of the challenging open problems in these and other fields, where sets of objects whose properties depend heavily on some notion of distance play an important role. In this project we will design representation and reasoning formalisms covering both quantitative and qualitative knowledge about distances. The project will combine work on logical and computational properties of the designed logics with work on tableau, resolution, and term-rewriting types of reasoning algorithms. The algorithms will be implemented and the resulting systems will be used for initial experiments with representative case studies. An integration of the resulting languages with terminological languages (description logics) will be developed, implemented, and tested as well.
Life Long Learning London for All
L4All has targeted the independent lifelong learner by creating a system that records and shares learning pathways and trails.
L4All has created a portal that allows learners to access selected information and resources, plan their own learning pathways, and maintain and reflect upon their individual record of learning throughout their lives. The system allows learners to share their learning plans and pathways with other learners, in order to support collaborative learning and to formulate future learning goals and aspirations. Tutors are able to publish recommended pathways through courses and modules, thereby facilitating progression into Higher Education (HE) and supporting career choices.
The Ubicomp Google
Exploring ambient dynamics and findability
The construction of spaces composed of physical artefacts augmented with computational, sensing, auto-identification and wireless communication capabilities is becoming increasingly practical at a larger scale and drives research interest in the technical challenges related to the everyday use of such intelligent environments. Nevertheless, several barriers remain before intelligent environments can be effectively used, notably the fact that an abundance of such computational and communication capability does not necessarily imply the availability of useful or usable services and applications. In fact, the contrary is often the case, since such spaces are the source and possibly also the repository of massive amounts of data created by the continuous archival of personal experiences, which users cannot access in a meaningful way. A major challenge in making intelligent environments useful is indeed the development of efficient and effective navigation mechanisms, that is, the ability to search, locate and retrieve information as and when needed to fit the task at hand. To be sure, in addition to capture, intelligent environments must provide mechanisms for the effective navigation of recorded personal experiences as a core ingredient of their architecture.
What background environmental factors affect our neighborhoods?
Robotic Feral Public Authoring links together two branches of research for community fun and action. Hobbyist robotics and public authoring (knowledge mapping and sharing) both enable people to use emerging technologies in dynamic and exciting new ways. Brought together they open up whole vistas of possibilities for exploring our local environments with electronic sensors to detect all kinds of phenomena and map them using online tools.
MiGen: Intelligent Support for Mathematical Generalisation
The MiGen project is tackling a thorny problem that confronts all teachers of mathematics:
* What is algebra for?
* How is it useful for expressing generalisations?
* What does it mean to generalise in mathematics?
We are building a pedagogical and technical environment to support 11-14 year-old students' learning of mathematical generalisations. The system comprises a microworld, the eXpresser, and two intelligent tools, the eGeneraliser and the eCollaborator. When students are tackling generalisation tasks, the eGeneraliser will provide personalised feedback adapted to the learning trajectory of each student. Through the eCollaborator, students will be able to view each other's constructions and compare, critique and discuss them. Both intelligent tools will send information to teachers to help them provide appropriate guidance.
Our research team of social, educational and computer scientists, together with teachers and teacher educators, is co-designing the system and iteratively testing it with students.
Samtla: Search And Mining Tools for Linguistic Analysis
The SAMTLA system has been designed to assist researchers in the Humanities with the task of quantifying historic corpora through phrase searches and comparative methods. SAMTLA adopts methods developed in Information Retrieval, including character-based Suffix Trees, probabilistic Language Models, Named Entity Recognition, and Data Mining techniques such as clustering and classification. SAMTLA presents search results according to the underlying principles and structure of the language present in domain-specific corpora.
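To give a feel for character-based phrase search, here is a minimal sketch that uses a suffix array, a simpler relative of the suffix tree named above. It is an illustrative stand-in and not SAMTLA's implementation. The function names are hypothetical, and the naive construction is suitable only for short texts.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive construction: fine for a short demonstration, too slow for a corpus.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, phrase):
    """Return every character offset at which `phrase` occurs in `text`."""
    width = len(phrase)

    def prefix(rank):
        start = suffix_array[rank]
        return text[start:start + width]

    # First rank whose suffix prefix is >= phrase.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < phrase:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # First rank whose suffix prefix is > phrase.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= phrase:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])
```

Because matching works on characters rather than whole words, the same index serves texts where word boundaries are uncertain. For the text "the king and the queen", searching for "the" returns the offsets 0 and 13.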