The project topics listed here are not the only kinds of project I am prepared to supervise; they are simply a list of topics I currently find interesting.
- You are encouraged to discuss your ideas and maybe we can shape them into an honours or masters project.
- Some of the projects, although at Masters level, could be adapted to Undergraduate level by focussing on only part of the problem.
- I've been asked repeatedly to distinguish between Artificial Intelligence, Machine Learning, and Deep Learning, and this article helps explain the differences.
-
Automated Web Navigation
Navigation information is a critical component of a usable website. Many sites provide incomplete site maps with poor categorisation, so it can be very difficult to track down the information you need. Once you are deep in the site hierarchy, it can be particularly difficult to work out where you are in relation to the rest of the site, and what else the site offers. Good navigation information answers three questions: Where am I? Where can I go from here? Where do I want to be? Much of this information can be extracted quite simply from collections of pages, to construct a consistent navigation mechanism for each page.
This project aims to provide tools for automatic extraction of navigation information, as well as construction of customisable navigation mechanisms for a website. A natural extension of this might be an automated indexing system which uses the text on each page to construct a site map to guide the user to the desired information quickly and easily.
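As a sketch of the extraction half, assuming the pages have already been fetched as HTML strings, a site link graph can be built with Python's standard-library parser (the page names and markup below are invented for illustration):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def site_links(pages):
    """Map each page name to the site pages it links to."""
    graph = {}
    for name, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        graph[name] = [h for h in parser.links if h in pages]
    return graph

# A toy two-page "site": the resulting link graph answers
# "Where am I?" and "Where can I go from here?" for any page.
pages = {
    "index.html": '<a href="about.html">About</a>',
    "about.html": '<a href="index.html">Home</a>',
}
print(site_links(pages))
```

From a graph like this, a per-page navigation bar or a full site map is a straightforward traversal.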
-
A mini interpreter for Swift - NEW for 2019-20
The goal of this project is to implement an interpreter for the Swift programming language based on LLVM. Most Swift programs depend on libraries written for macOS, so one challenge is how to deal with those dependencies. There are open-source, cross-platform versions of Swift, and this project would develop a mini-interpreter for a subset of the language. The project will be evaluated on whether the interpreter can successfully run all of the tests in the Swift compiler's test suite.
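The core pattern of any interpreter, walking a parsed program and evaluating it, can be sketched independently of Swift. The toy grammar below is invented for illustration and is not Swift; the real project would parse Swift source via the open-source toolchain:

```python
# A toy tree-walking interpreter for arithmetic expressions.
# Nodes are tuples: ("num", n), ("var", name),
# ("add", left, right), ("mul", left, right).

def evaluate(node, env):
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return env[node[1]]      # look the variable up in the environment
    if kind == "add":
        return evaluate(node[1], env) + evaluate(node[2], env)
    if kind == "mul":
        return evaluate(node[1], env) * evaluate(node[2], env)
    raise ValueError(f"unknown node kind: {kind}")

# (x + 2) * 3 with x = 4
ast = ("mul", ("add", ("var", "x"), ("num", 2)), ("num", 3))
print(evaluate(ast, {"x": 4}))  # 18
```

The project would extend this pattern to statements, functions, and a useful subset of the Swift standard library.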
-
Cleaning “dirty” data
The booming field of “Knowledge Discovery in Databases” (KDD) or “Data Mining” combines techniques developed in database theory with machine learning. A number of techniques unique to this area have been developed for coping with the inevitable noise in data, which results in inaccurate (or simply wrong) measurements and in missing values. KDD people typically clean their data before providing it to their inferential or statistical programs. There are excellent theoretical reasons for believing that “cleaning” data throws out good information with the bad.
This project will look at the effect of cleaning data by throwing it out, and at the prospects for the alternative: recycling the dirty bath water.
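The two treatments being compared can be sketched in a few lines; comparing their effect on a downstream statistic is exactly the kind of experiment the project proposes (the data here is invented):

```python
# Two ways to handle missing values (None) in a numeric column:
# throw the dirty rows out, or impute with the mean of the clean ones.

def drop_missing(values):
    return [v for v in values if v is not None]

def impute_mean(values):
    clean = drop_missing(values)
    fill = sum(clean) / len(clean)   # mean of the observed values
    return [fill if v is None else v for v in values]

data = [1.0, 2.0, None, 4.0, None, 3.0]
print(drop_missing(data))   # [1.0, 2.0, 4.0, 3.0]
print(impute_mean(data))    # [1.0, 2.0, 2.5, 4.0, 2.5, 3.0]
```

Dropping shrinks the sample; imputing keeps it but biases the variance downwards, which is one concrete sense in which "cleaning" discards information.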
-
Autonomous Agent Toolkit
An autonomous agent (AA) is a simple entity that interacts with its environment and other AAs, typically based on a simple set of rules. For example, the AAs may be birds that randomly fly around a grid, but obey simple rules like avoiding bumping into each other and not flying directly behind another bird (otherwise it cannot see). It is interesting to then observe the “emergent behaviour,” such as the patterns of movement. Researchers have developed very simple rules that seem to mimic the patterns seen in nature of how birds fly in formations. In this project, you will explore the various types of AAs and their behaviours.
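A single AA rule can be stated in a few lines. The sketch below, with made-up grid size and positions, implements just one rule (avoid bumping into other agents) on a wrap-around grid; the project would layer on richer rules and observe what emerges:

```python
import random

# One simulation step: each agent moves randomly on a toroidal grid,
# but obeys one simple rule: never move onto a cell another agent
# occupies. Emergent behaviour is observed by iterating many steps.

def step(agents, size, rng):
    occupied = set(agents)
    moved = []
    for (x, y) in agents:
        dx, dy = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        nx, ny = (x + dx) % size, (y + dy) % size
        if (nx, ny) in occupied:   # rule: avoid bumping into others
            nx, ny = x, y          # stay put instead
        occupied.discard((x, y))
        occupied.add((nx, ny))
        moved.append((nx, ny))
    return moved

rng = random.Random(0)             # seeded for a repeatable run
agents = [(0, 0), (1, 0), (5, 5)]
for _ in range(10):
    agents = step(agents, size=10, rng=rng)
print(agents)  # three distinct positions after ten steps
```

The no-collision rule is an invariant: however long the simulation runs, no two agents ever share a cell.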
-
Content-based spam filtering
If you’ve sent me an email and it has “disappeared” then this is where it may have gone!
This project involves the design and implementation of a system that categorises email messages. The aim is to identify as many unsolicited email messages (spam) as possible without incurring a large number of false positives (i.e., valid email messages classified as spam).
The suggested approach is to use content-based and usage-based techniques rather than explicit rules to filter the messages.
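One standard content-based technique is a naive Bayes filter: learn word counts from labelled messages, then score a new message by which class makes its words more likely. The sketch below, on invented training messages, shows only this core; a real system would add header features, usage-based signals, and threshold tuning:

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (label, text) pairs, label in {"spam", "ham"}."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for label, text in messages:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def classify(text, counts, totals):
    vocab = set(counts["spam"]) | set(counts["ham"])
    scores = {}
    for label in ("spam", "ham"):
        n = sum(counts[label].values())
        score = math.log(totals[label] / sum(totals.values()))  # prior
        for w in text.lower().split():
            # Laplace smoothing so unseen words don't zero the score
            score += math.log((counts[label][w] + 1) / (n + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

training = [
    ("spam", "win money now"),
    ("spam", "free money offer"),
    ("ham", "meeting notes attached"),
    ("ham", "project deadline tomorrow"),
]
counts, totals = train(training)
print(classify("free money", counts, totals))      # spam
print(classify("project meeting", counts, totals)) # ham
```

The false-positive trade-off mentioned above would be controlled by requiring the spam score to beat the ham score by a margin, rather than taking a bare maximum.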
-
Web Search with paragraph spreading…
When searching for specific documents in a document collection (such as the Web), results of high precision can be obtained using information retrieval systems that focus on paragraphs rather than whole documents. Unfortunately, paragraph-level retrieval systems perform poorly when the query terms are not focused within a single paragraph. To overcome this, we could simply include the surrounding paragraphs in our query term search, but this would increase the query time. We believe that we can obtain the benefits of both paragraph-level and document-level retrieval by smoothing the weight of a paragraph across all paragraphs within each document.
This project will investigate the impact of various smoothing functions across document paragraphs, compared with standard document and paragraph retrieval.
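A minimal sketch of the idea, using raw term counts as paragraph weights and a linear blend as the smoothing function (both arbitrary choices for illustration):

```python
# Score each paragraph for the query, then blend every paragraph's
# weight with the document-level average, so a document whose query
# terms are split across paragraphs is not unduly penalised.

def paragraph_scores(paragraphs, query):
    terms = query.lower().split()
    return [sum(p.lower().split().count(t) for t in terms)
            for p in paragraphs]

def smooth(scores, alpha=0.5):
    doc_avg = sum(scores) / len(scores)
    return [alpha * s + (1 - alpha) * doc_avg for s in scores]

doc = [
    "paragraph retrieval gives high precision",    # both terms here
    "whole documents are another unit of search",  # neither term here
]
raw = paragraph_scores(doc, "paragraph retrieval")
print(raw)          # [2, 0]
print(smooth(raw))  # [1.5, 0.5]
```

The project would replace the linear blend with other smoothing functions (e.g. distance-weighted kernels over neighbouring paragraphs) and measure the effect on retrieval quality.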
-
Document Image Analysis
The objective of document image analysis is to recognise text, graphics, and pictures in printed documents and to extract the intended information from them. There are two broad categories of document image analysis, namely, textual processing, and graphical processing. Textual processing includes skew determination (any tilt at which the document may have been scanned), finding columns, paragraphs, text lines and words, and performing optical character recognition. Graphical processing deals with lines and symbols.
The scope of the proposed project will be the aspect of text processing and it will concentrate on the development of a system for page layout analysis.
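A classic first step in page layout analysis is the horizontal projection profile: summing the ink in each row of a binarised page makes text lines appear as peaks separated by blank valleys. A sketch on a made-up toy "page":

```python
def projection_profile(image):
    """image: list of rows of 0/1 pixels (1 = ink)."""
    return [sum(row) for row in image]

def text_lines(image):
    """Return (start, end) row ranges whose profile is non-zero."""
    profile = projection_profile(image)
    lines, start = [], None
    for i, ink in enumerate(profile):
        if ink and start is None:
            start = i                  # a text line begins
        elif not ink and start is not None:
            lines.append((start, i))   # a text line ends
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines

page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],  # first text line
    [1, 0, 1, 1],
    [0, 0, 0, 0],  # blank gap
    [0, 1, 1, 0],  # second text line
]
print(text_lines(page))  # [(1, 3), (4, 5)]
```

The same profile, computed at a range of candidate rotation angles, is also the standard basis for skew determination: the true skew angle is the one that maximises the profile's peakiness.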
-
Reputation systems
For spam prevention in email and VoIP, reputation of the sender is an important means of distinguishing likely desirable communications from messages that need more scrutiny. This project investigates the use of trust paths, i.e., chains of trust from the sender to the receiver, to establish the credentials of the sender.
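One simple formalisation, with invented names and scores, treats trust as a value in (0, 1] on each edge of a "who trusts whom" graph, scores a chain by the product of its edges, and credentials the sender by the best chain from the receiver:

```python
def best_trust(graph, receiver, sender):
    """Best product-of-edges trust over all chains receiver -> sender."""
    best = {receiver: 1.0}
    frontier = [receiver]
    while frontier:
        node = frontier.pop()
        for neighbour, weight in graph.get(node, []):
            score = best[node] * weight
            if score > best.get(neighbour, 0.0):
                best[neighbour] = score     # found a better chain
                frontier.append(neighbour)
    return best.get(sender, 0.0)

graph = {
    "me":    [("alice", 0.9), ("bob", 0.5)],
    "alice": [("carol", 0.8)],
    "bob":   [("carol", 0.9)],
}
# best chain is me -> alice -> carol, scoring 0.9 * 0.8
print(round(best_trust(graph, "me", "carol"), 2))  # 0.72
```

A sender with no chain at all scores 0.0 and would be the one whose messages need more scrutiny.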
-
Mobile Game Design
The aim is to design and develop a 3-dimensional platform/puzzle game that is easily understood, fun to play, and extensible; the target system should be a mobile device platform such as the iPhone or Android.
-
Online Job Portal
Online job sites are now used to source new staff and to perform online matchmaking between employers and employees. The efficiency of searching online for opportunities and candidates is replacing traditional sources such as newspapers and job fairs. The Computer Science Dept. would like to create a job site that connects students with employers: students can publish resumes and other information, and employers can publish job opportunities.
-
Search Engine Improved Page Ranking
Web users access the Web using search engines as their portal, so the results those engines supply are a critical factor in the growth of the Web. The most popular Web search engine, Google, uses the PageRank algorithm, which is a measure of global popularity. As a result, the more popular sites (such as amazon.com) will be found more easily, and hence their popularity will increase further. If this cycle continues, the popular sites will eventually monopolise the Web.
We believe that a popularity measure is a good idea, but popularity should be measured from peers, not from the entire community. This project would investigate the effect a peer based popularity rank will have on search engine results.
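As a baseline for comparison, global PageRank can be sketched as a power iteration over a toy link graph (the site names are invented); a peer-based variant would run the same computation restricted to each user's neighbourhood:

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for p in pages:
            share = damping * rank[p] / len(links[p])
            for q in links[p]:
                new[q] += share          # p passes rank to its targets
        rank = new
    return rank

links = {
    "popular.com": ["blog.net"],
    "niche.org":   ["popular.com"],
    "blog.net":    ["popular.com"],
}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))  # popular.com: the most-linked page wins
```

This sketch assumes every page has at least one outgoing link; handling dangling pages is a standard extension.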
-
Image segmentation by clustering methods
The aim of the project will be to make a critical study of a number of existing non-hierarchical cluster analysis methods, such as, c-means, fuzzy c-means, and isodata, with a view to their application in image segmentation. Study of cluster validity will form an essential part of the project. Applications will include both monochrome and colour images.
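The hard c-means algorithm at the heart of the study can be sketched on grey-level pixel values (the pixel data below is invented; a real study would add fuzzy c-means, isodata, colour features, and validity indices):

```python
import random

def c_means(values, c, iterations=20, seed=0):
    """Hard c-means: assign each value to the nearest of c centres,
    re-estimate the centres as cluster means, and repeat."""
    rng = random.Random(seed)
    centres = rng.sample(values, c)
    for _ in range(iterations):
        clusters = [[] for _ in range(c)]
        for v in values:
            nearest = min(range(c), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        centres = [sum(cl) / len(cl) if cl else centres[i]
                   for i, cl in enumerate(clusters)]
    return sorted(centres)

# Grey levels from a dark region and a bright region of an image
pixels = [10, 12, 11, 13, 200, 205, 198, 202]
print(c_means(pixels, c=2))  # [11.5, 201.25]
```

Segmentation then amounts to labelling each pixel with the index of its nearest centre; cluster validity measures would be used to choose c when the number of regions is unknown.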
-
FlashMob
FlashMob is the brainchild of a group of graduate students at USF studying supercomputers.
...Our hope at the beginning of the semester was to build a supercomputer that would make the Top 500 list of supercomputers. After some back-of-the-envelope calculations, we concluded that we were about 100 computers short of having a good shot. Someone raised their hand and said: “We could post a message on Craig’s List and get a hundred people to just show up.”
Thus the idea of FlashMob Computing was born.
The objective of this project would be to build a framework that allowed for the dynamic configuration of a FlashMob network.
-
Pattern Classification Algorithms
Pattern recognition is a basic attribute of human beings. We are good at recognising the objects around us. We can easily recognise a friend from his/her voice; we can recognise handwritten and machine printed characters; we can recognise fruits from their shape, colour and texture features. The development of a computerised pattern recognition system to do such tasks is, however, non-trivial. Two main components, or modules, of a pattern recognition system are feature extraction and pattern classification. Feature extraction is the process of obtaining a set of measurements, or features, that are suitable for computer processing. Pattern classification refers to the process of deciding which category, or class, a given input pattern belongs to, in the light of the extracted features.
The scope of the present project is to develop algorithms for pattern classification in two different modes, supervised and unsupervised. “Supervised” mode refers to situations in which we have samples for which the class identifications are known and we learn the class features from these samples. In “unsupervised” mode the class identifications of the given samples are not known.
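The supervised mode can be sketched with the simplest possible classifier: learn one centroid per class from labelled samples, then assign a new pattern to the class with the nearest centroid. The (length, width) feature values below are made up:

```python
def train_centroids(samples):
    """samples: list of (label, feature-tuple) pairs."""
    sums, counts = {}, {}
    for label, features in samples:
        counts[label] = counts.get(label, 0) + 1
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
    return {label: [x / counts[label] for x in acc]
            for label, acc in sums.items()}

def classify(features, centroids):
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(features, centroids[label]))

training = [
    ("apple",  (7.0, 7.2)), ("apple",  (7.4, 7.0)),
    ("banana", (19.0, 3.5)), ("banana", (20.0, 4.0)),
]
centroids = train_centroids(training)
print(classify((7.3, 6.8), centroids))   # apple
print(classify((18.5, 3.8), centroids))  # banana
```

In the unsupervised mode no labels are available, so the centroids themselves must be discovered by clustering, as in the image segmentation project above.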
-
Visualisation of web query results
With the rapidly increasing size of the world-wide web, the visualisation of web sites, their contents and structure is quickly becoming a necessary tool for managing the complexity of the web. Visual representations are extremely helpful for analysing and interpreting the results of world-wide web searches, and they can also be used to understand and optimise the structure of individual sites or intranets.
This project builds upon previous projects for the visualisation of query results and web site structures. It will aim to create systems that provide effective visualisations based on the combination of different types of information:
- structure-based representations, and
- contents-based visualisations.
Structure-based visualisations analyse the link structure and/or traffic information and use them to produce typically network-like site maps. Contents-based visualisations analyse the text contents (and/or keywords, meta-tags etc.) in web pages and produce displays that make it easier to judge the similarity of two pages.
-
Self-Organizing Imagery
The making of an original image is usually a task achieved by an artist or possibly a computer program with some form of representation of the work as a ‘whole’. The purpose of this project is to develop animated and still imagery which is the result of the interactions between many individual agents acting under the instructions of their own internal rules.
The project will require the development of a system for specifying the elements of a picture and the agents responsible for maintaining/manipulating it. The end result will be a series of still images and a short animation demonstrating some results achieved using the new system.