Project Leaders

Mark Levene

Boris Mirkin

Project Participants
Rajesh Pampapathi
Mike Hu
Tricia Ford
Nick Tischler
Janina Craske
Kelly Gill

Project Details
Duration: 3 years


Expertise Profiling

Project website

EPALS: Expertise Profiling and Location System
Building and Classifying Profiles of Experts and Expertise from Heterogeneous Online Information Sources


Like many organisations, the Police Information Technology Organisation (PITO) sees the expertise of its employees and workforce as one of its greatest assets. This project aims to provide PITO with tools and techniques that allow employees throughout the organisation to share their knowledge, know-how and skills with each other.

Project Description

Expertise profiling and location involves profiling the knowledge and skills of individuals in an organisation and using them to match appropriate expertise with specific needs and queries from others within the organisation. Such profiling of an individual involves the identification of types and areas of skills and knowledge, and an evaluation of levels of proficiency in each. Once profiles of each individual within an organisation have been generated and organised, an expertise seeker can search by means of queries or by browsing under particular topics. The visualisation of topic structures, profiles and responses to queries is therefore seen as important in the final solution.
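
To make the matching step concrete, the following is a minimal sketch, not the project's implementation. It assumes each profile is a simple mapping from topic to a proficiency level on a hypothetical 1 to 5 scale; the names `Profiles` and `find_experts` and the sample data are illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical profile store: person -> {topic: proficiency level, 1 (basic) to 5 (expert)}
Profiles = Dict[str, Dict[str, int]]


def find_experts(profiles: Profiles, topic: str, min_level: int = 1) -> List[Tuple[str, int]]:
    """Return (person, level) pairs for a topic, most proficient first."""
    matches = [
        (person, skills[topic])
        for person, skills in profiles.items()
        if topic in skills and skills[topic] >= min_level
    ]
    # Sort by level (descending), then by name so ties come out in a stable order.
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


# Illustrative data only.
profiles = {
    "alice": {"databases": 4, "networking": 2},
    "bob": {"databases": 5},
    "carol": {"networking": 3},
}
print(find_experts(profiles, "databases", min_level=3))
# [('bob', 5), ('alice', 4)]
```

A real system would derive the levels from the source documents rather than store them by hand; the sketch only shows how a query can be answered once profiles exist.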

Algorithms and Methods

We are exploring the principles and developing the algorithms for analysing and building profiles of people, where the information is dispersed among heterogeneous information sources. The techniques developed will combine statistical analysis and deterministic (logical) reasoning. We view profiling of a person as analogous to the profiling of a class, and the identification of a particular expert as analogous to processes in other domains such as web searching and classification. Consequently, much of the work done under this project has parallels in these other domains.
The techniques we have developed so far borrow from, and greatly extend, approaches which have proven track records in gene classification, text compression and clustering. By merging know-how from these areas, we are able to statistically analyse phrases of variable length, which makes the approach more powerful than existing approaches that consider words only independently of each other.
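
The difference between word-level and phrase-level statistics can be illustrated with a small sketch. This is an assumption-laden example rather than the project's algorithm: it simply counts every contiguous word phrase up to a chosen length, and the function name `phrase_counts` and the sample sentence are hypothetical.

```python
from collections import Counter
from typing import List


def phrase_counts(text: str, max_len: int = 3) -> Counter:
    """Count every contiguous word phrase of length 1..max_len."""
    words: List[str] = text.lower().split()
    counts: Counter = Counter()
    for n in range(1, max_len + 1):
        # Slide a window of n words across the text.
        for start in range(len(words) - n + 1):
            counts[" ".join(words[start:start + n])] += 1
    return counts


counts = phrase_counts("network security audit and network security policy")
print(counts["network"], counts["security"], counts["network security"])
# 2 2 2
```

A model that treats words independently sees only that "network" and "security" each occur twice; the phrase counts additionally record that they occur together, in that order, both times.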

Visualisation and Front-End

We consider it important that the profiles can be processed into human-readable summaries, so that users of the system can examine the rationale for the decision to classify information into a particular category. The statistical analyser will provide summaries of text based on key phrases from the corpus, while the logical component will provide listings of observations which support, and observations which oppose, the deductions made. The update of profiles and summary creation will be achieved through a maintenance subsystem, and the meta-information in the profiles will be represented in XML format to allow inter-operability with other applications.
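
As an illustration of how profile meta-information might be exchanged in XML, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`profile`, `skill`, `level`) are assumptions made for this example; the project's actual schema is not described here.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def profile_to_xml(name: str, skills: Dict[str, int]) -> str:
    """Serialise one profile to an XML string (element names are illustrative)."""
    root = ET.Element("profile", attrib={"name": name})
    for topic, level in sorted(skills.items()):
        skill = ET.SubElement(root, "skill", attrib={"level": str(level)})
        skill.text = topic
    return ET.tostring(root, encoding="unicode")


xml_text = profile_to_xml("alice", {"databases": 4, "networking": 2})
print(xml_text)

# Another application can read the same data back with any XML parser.
parsed = ET.fromstring(xml_text)
print([(s.text, int(s.get("level"))) for s in parsed.findall("skill")])
```

Because the output is plain XML, a consumer written in another language needs only a standard parser and knowledge of the element names to read a profile.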

Finally, the algorithms, methods and visualisations will be supplied with a windows-based graphical user interface (GUI). A prototype of such an interface has been developed (see the screenshots available on the project webpage). It is currently geared more towards general text profiling and classification analysis, and provides tools to select and prepare training sets and to perform testing and validation of training algorithms. Later versions will be directed more closely towards skills profiling and expertise location, and will be made available for downloading.


Last updated: Wednesday, 09 May 2007