Student Projects
-
Overview
If you have any ideas for a project that fits within my interests, then please feel free to come and discuss it with me. If you are trying to find a topic then there are some areas to consider on the next tab.
Please note: At the Undergraduate level I only supervise Type 4 projects as I’m a coder, not an information systems guy!
I look to take on students who are intelligent (which you are by definition of being a Birkbeck student) and who have an interest in the project that they are about to undertake. Students should be self-motivated and be keen to explore their topic. You should not be afraid to express your ideas and follow them. However, you need to be balanced in your approach and apply scientific practices when evaluating your own work. If you have the basics, we can work together to put together a good solid project.
I like to see students take control of their project and run with their ideas and thoughts. This means that students exercise a degree of independence - however, I don’t expect this to translate to going off into a dark corner and never consulting me. My role is to provide some guidance and advice. There is clearly a conflict between these two expectations and so what we need to do is work together to strike a suitable balance between self-direction/independence and guidance.
Bottom line - it is your project and the supervisor only makes suggestions...
Useful links
- “A Guide to writing MSc dissertations” (yes it is for Mathematics students but most of the recommendations carry over to Computer Science AND apply to BSc projects)
- A writing style guide I have found useful is “The Elements of Style” by William Strunk, Jr., and E. B. White
-
Possible topics
The project topics listed here are not the only types of project I am prepared to supervise - they are merely an interesting list of current topics.
- You are encouraged to discuss your ideas and maybe we can shape them into an honours or masters project.
- Some of the projects, although at Masters level, might be able to be adapted for Undergraduate level by only focussing on part of the problem.
- I've been asked repeatedly to distinguish between Artificial Intelligence, Machine Learning, and Deep Learning, and this article helps with the differences.
-
Automated Web Navigation
Navigation information is a critical component of a usable website. Many sites provide incomplete site maps, with poor categorisation, so it can be very difficult to track down the information you need. Once you are deep in the site hierarchy, it can be particularly difficult to work out where you are in relation to the rest of the site, and what else the site offers. Good navigation information answers three questions: Where am I? Where can I go from here? Where do I want to be? Much of this information can be extracted quite simply from collections of pages, to construct a consistent navigation mechanism for each page.
This project aims to provide tools for automatic extraction of navigation information, as well as construction of customisable navigation mechanisms for a website. A natural extension of this might be an automated indexing system which uses the text on each page to construct a site map to guide the user to the desired information quickly and easily.
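As a rough illustration of how the first two questions might be answered from nothing more than a collection of page paths, here is a minimal Python sketch. The function names (`breadcrumbs`, `children`) and the idea of working from URL paths alone are assumptions made for the example; a real project would crawl the site and use page titles and link structure as well.

```python
def breadcrumbs(path):
    """'Where am I?': list every ancestor of a page path, root first."""
    parts = [p for p in path.strip("/").split("/") if p]
    trail = ["/"]
    for i in range(1, len(parts) + 1):
        trail.append("/" + "/".join(parts[:i]))
    return trail


def children(path, all_paths):
    """'Where can I go from here?': pages exactly one level below path."""
    prefix = path.rstrip("/") + "/"
    depth = prefix.count("/")
    return sorted(
        p for p in all_paths
        if p.startswith(prefix)
        and p != prefix
        and p.rstrip("/").count("/") == depth
    )
```

Given a hypothetical list of paths such as `/courses`, `/courses/msc` and `/courses/msc/projects`, `breadcrumbs` yields the trail from the root to the current page, and `children` yields the pages one level down. The third question (Where do I want to be?) needs the text of the pages and is where the indexing extension comes in.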
-
A mini interpreter for Swift - NEW for 2019-20
The goal of the project is to implement an interpreter for the Swift programming language based on LLVM. Most Swift programs depend on libraries written for OSX, so one challenge is how to deal with those dependencies. There are open-source, cross-platform versions of Swift, and this project would examine a mini-interpreter for a subset of the language. The project will be evaluated on whether the interpreter can successfully interpret all the run tests in the Swift compiler.
-
Cleaning “dirty” data
The booming field of “Knowledge Discovery in Databases” (KDD) or “Data Mining” combines techniques developed in database theory with machine learning. A number of techniques unique to this area have been developed for coping with the inevitable noise in data, which results in inaccurate (or simply wrong) measurement and missing values. KDD people typically clean their data before providing it to their inferential or statistical programs. There are excellent theoretical reasons for believing that “cleaning” data throws out good information with the bad.
This project will look at the effect of cleaning data by throwing it out and the prospects for the alternative of recycling dirty bath water.
-
Autonomous Agent Toolkit
An autonomous agent (AA) is a simple entity that interacts with its environment and other AAs, typically based on a simple set of rules. For example, the AAs may be birds that randomly fly around a grid, but obey simple rules like avoiding bumping into each other and not flying directly behind another bird (otherwise it cannot see). It is interesting to then observe the “emergent behaviour,” such as the patterns of movement. Researchers have developed very simple rules that seem to mimic the patterns seen in nature of how birds fly in formations. In this project, you will explore the various types of AAs and their behaviours.
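To give a flavour of how little code a basic AA simulation needs, here is a minimal Python sketch of birds moving at random on a grid while avoiding collisions. Everything specific in it is an assumption for illustration: the wrap-around grid, the "stay put if the target cell is occupied" rule, and the names `step` and `simulate`. It does not attempt the line-of-sight rule or formation flying.

```python
import random


def step(birds, size, rng):
    """Move each bird one cell in a random direction on a size x size grid.

    Rule (an illustrative assumption): a bird stays where it is rather
    than move into a cell that is already occupied.
    """
    occupied = set(birds)
    moved = []
    for (x, y) in birds:
        dx, dy = rng.choice([(0, 1), (1, 0), (0, -1), (-1, 0)])
        target = ((x + dx) % size, (y + dy) % size)  # wrap around the edges
        if target not in occupied:
            occupied.remove((x, y))
            occupied.add(target)
            moved.append(target)
        else:
            moved.append((x, y))
    return moved


def simulate(n_birds=5, size=10, steps=20, seed=0):
    """Run the simulation and return the final bird positions."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    birds = rng.sample(cells, n_birds)  # distinct starting cells
    for _ in range(steps):
        birds = step(birds, size, rng)
    return birds
```

Recording the positions after every step, rather than only at the end, is what lets you look for emergent patterns of movement.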
-
Content-based spam filtering
If you’ve sent me an email and it has “disappeared” then this is where it may have gone!
This project involves the design and implementation of a system that categorises email messages. The aim is to identify as many unsolicited email messages (spam) as possible without incurring a large number of false positives (i.e., valid email messages classified as spam).
The suggested approach is to use content-based and usage-based techniques rather than explicit rules to filter the messages.
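One well-known content-based technique is a naive Bayes classifier over the words of a message. The Python sketch below is only an example of that family, not a prescribed design: the whitespace tokenisation, the add-one smoothing, and the names `train` and `classify` are all assumptions, and it ignores usage-based signals entirely.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label) pairs with label 'spam' or 'ham'.

    Assumes at least one training message of each label.
    """
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, model):
    """Return the more probable label for text under the trained model."""
    word_counts, label_counts = model
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_msgs = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        # log prior plus the sum of log likelihoods, with add-one smoothing
        score = math.log(label_counts[label] / total_msgs)
        for word in text.lower().split():
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)
```

To keep false positives down, a project might only label a message as spam when the spam score exceeds the ham score by some margin, and then measure how that margin trades missed spam against misclassified valid mail.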
-
Web Search with paragraph spreading…
When searching for specific documents in a document collection (such as the Web), results of high precision can be obtained using information retrieval systems that focus on paragraphs rather than whole documents. Unfortunately, paragraph level retrieval systems perform poorly when the query terms are not focused within a single paragraph. To overcome this, we could simply include the surrounding paragraphs in our query term search, but this would increase the query time. We believe that we are able to obtain the benefits of paragraph level retrieval and document level retrieval by smoothing the weight of a paragraph across all paragraphs within each document.
This project will be to investigate the impact of using various smoothing functions across document paragraphs when compared to document and paragraph retrieval.
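To make "smoothing the weight of a paragraph" concrete, here is a minimal Python sketch using one possible smoothing function, an exponential decay with distance. The decay form, the `spread` parameter and the function name `smooth` are assumptions for illustration; the project itself would compare several such functions.

```python
def smooth(scores, spread=0.5):
    """Spread each paragraph's score across the paragraphs of one document.

    Paragraph i receives scores[j] * spread ** |i - j| from every
    paragraph j, so nearby paragraphs contribute more than distant ones.
    """
    n = len(scores)
    return [
        sum(scores[j] * spread ** abs(i - j) for j in range(n))
        for i in range(n)
    ]
```

With this particular function the two extremes fall out naturally: `spread=0.0` leaves the scores unchanged (pure paragraph level retrieval), while `spread=1.0` gives every paragraph the sum for the whole document (document level retrieval). Values in between are where the interesting comparisons lie.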
-
Document Image Analysis
The objective of document image analysis is to recognise text, graphics, and pictures in printed documents and to extract the intended information from them. There are two broad categories of document image analysis, namely, textual processing, and graphical processing. Textual processing includes skew determination (any tilt at which the document may have been scanned), finding columns, paragraphs, text lines and words, and performing optical character recognition. Graphical processing deals with lines and symbols.
The scope of the proposed project will be the aspect of text processing and it will concentrate on the development of a system for page layout analysis.
-
Reputation systems
For spam prevention in email and VoIP, reputation of the sender is an important means of distinguishing likely desirable communications from messages that need more scrutiny. This project investigates the use of trust paths, i.e., chains of trust from the sender to the receiver, to establish the credentials of the sender.
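A trust path can be found with an ordinary breadth-first search over a "who trusts whom" graph. The Python sketch below assumes a particular representation (a dictionary mapping each party to the set of parties it directly trusts), searches outwards from the receiver, and caps the chain length with a hypothetical `max_hops` parameter; none of these choices are fixed by the project description.

```python
from collections import deque


def trust_path(trusts, receiver, sender, max_hops=3):
    """Return the shortest chain receiver -> ... -> sender, or None.

    trusts maps each party to the set of parties it directly trusts.
    Chains longer than max_hops links are not considered.
    """
    queue = deque([[receiver]])
    seen = {receiver}
    while queue:
        path = queue.popleft()
        if path[-1] == sender:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not extend chains that are already at the limit
        for nxt in trusts.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

A sender with no path, or only a very long one, would be a candidate for the extra scrutiny mentioned above. How trust should weaken along the chain is one of the questions the project would need to address.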
-
Mobile Game Design
To design and develop a 3-dimensional platform/puzzle game that is easily understood and yet fun to play and extensible; the target system should be a mobile device platform such as the iPhone, Android, etc.
-
Online Job Portal
Online job sites are now being used to source new staff and perform the online matchmaking process between an employer and employee. Because searching for opportunities and candidates online is so efficient, these sites are replacing traditional sources like newspapers, job fairs, etc. The Computer Science Dept. would like to create a job site that connects students with employers. Students can publish resumes and other information and employers can publish job opportunities.
-
Search Engine Improved Page Ranking
Web users access the Web using Web search engines as their portal; the results these engines supply are therefore a critical factor in the growth of the Web. The most popular Web search engine, Google, uses the PageRank algorithm, which is a measure of global popularity. By using this, the more popular sites (such as amazon.com) will be found more easily and hence their popularity will increase. If this cycle continues, the popular sites will eventually monopolise the Web.
We believe that a popularity measure is a good idea, but popularity should be measured from peers, not from the entire community. This project would investigate the effect a peer based popularity rank will have on search engine results.
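For reference, here is a minimal Python sketch of PageRank computed by power iteration on a small link graph. The `teleport` parameter is an assumption added for illustration: leaving it uniform gives the usual global rank, while concentrating it on a group of peer pages gives a rank biased towards that group. That is only one hypothetical reading of "peer based", and the project would need to define and justify its own.

```python
def pagerank(links, teleport=None, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to.
    teleport is the distribution a random surfer jumps to; it should
    sum to 1. Uniform (the default) gives the usual global rank.
    """
    pages = sorted(set(links) | {q for out in links.values() for q in out})
    if teleport is None:
        teleport = {p: 1.0 / len(pages) for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        new = {p: (1 - damping) * teleport.get(p, 0.0) for p in pages}
        for p in pages:
            out = links.get(p, [])
            if out:
                share = damping * rank.get(p, 0.0) / len(out)
                for q in out:
                    new[q] += share
            else:
                # dangling page: hand its rank back via the teleport vector
                for q in pages:
                    new[q] += damping * rank.get(p, 0.0) * teleport.get(q, 0.0)
        rank = new
    return rank
```

Comparing the ordering of results under a uniform `teleport` with the ordering under a peer-specific one is a simple way to start measuring the effect described above.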
-
Image segmentation by clustering methods
The aim of the project will be to make a critical study of a number of existing non-hierarchical cluster analysis methods, such as c-means, fuzzy c-means, and isodata, with a view to their application in image segmentation. Study of cluster validity will form an essential part of the project. Applications will include both monochrome and colour images.
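As a starting point, here is a minimal Python sketch of (hard) c-means applied to the grey levels of a monochrome image, treating each pixel as a single number. The evenly spaced initial centres, the fixed iteration count and the name `c_means` are assumptions for illustration; fuzzy c-means and isodata differ in how membership and the number of clusters are handled.

```python
def c_means(pixels, c=2, iterations=20):
    """Cluster grey-level values into c groups; returns (labels, centres)."""
    lo, hi = min(pixels), max(pixels)
    # initial centres spread evenly over the intensity range (an assumption)
    centres = [lo + (hi - lo) * (k + 0.5) / c for k in range(c)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # assignment step: each pixel joins its nearest centre
        labels = [
            min(range(c), key=lambda k: abs(p - centres[k])) for p in pixels
        ]
        # update step: each centre moves to the mean of its members
        # (an empty cluster keeps its previous centre)
        for k in range(c):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centres[k] = sum(members) / len(members)
    return labels, centres
```

The returned labels form the segmentation: pixels sharing a label belong to the same region. For colour images each pixel becomes a vector, and the absolute difference would be replaced by a distance in colour space.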
-
FlashMob
FlashMob is the brainchild of a group of graduate students at USF studying supercomputers.
...Our hope at the beginning of the semester was to build a supercomputer that would make the Top 500 list of supercomputers. After some back-of-the-envelope calculations, we concluded that we were about 100 computers short of having a good shot. Someone raised their hand and said: “We could post a message on Craig’s List and get a hundred people to just show up.”
Thus the idea of FlashMob Computing was born.
The objective of this project would be to build a framework that allowed for the dynamic configuration of a FlashMob network.
-
Pattern Classification Algorithms
Pattern recognition is a basic attribute of human beings. We are good at recognising the objects around us. We can easily recognise a friend from his/her voice; we can recognise handwritten and machine printed characters; we can recognise fruits from their shape, colour and texture features. The development of a computerised pattern recognition system to do such tasks is, however, non-trivial. Two main components, or modules, of a pattern recognition system are feature extraction and pattern classification. Feature extraction is the process of obtaining a set of measurements, or features, that are suitable for computer processing. Pattern classification refers to the process of deciding as to which category, or class, a given input pattern belongs to in the light of the extracted features.
The scope of the present project is to develop algorithms for pattern classification in two different modes, supervised and unsupervised. “Supervised” mode refers to situations in which we have samples for which the class identifications are known and we learn the class features from these samples. In “unsupervised” mode the class identifications of the given samples are not known.
-
Visualisation of web query results
With the rapidly increasing size of the world-wide web, the visualisation of web sites, their contents and structure is quickly becoming a necessary tool for managing the complexity of the web. Visual representations are extremely helpful for analysing and interpreting the results of world-wide web searches, and they can also be used to understand and optimise the structure of individual sites or intra-nets.
This project builds upon previous projects for the visualisation of query results and web site structures. It will aim to create systems that provide effective visualisations based on the combination of different types of information:
- structure-based representations, and
- contents-based visualisations.
Structure-based visualisations analyse the link structure and/or traffic information and use them to produce typically network-like site maps. Contents-based visualisations analyse the text contents (and/or keywords, meta-tags etc.) in web pages and produce displays that make it easier to judge the similarity of two pages.
-
Self-Organizing Imagery
The making of an original image is usually a task achieved by an artist or possibly a computer program with some form of representation of the work as a ‘whole’. The purpose of this project is to develop animated and still imagery which is the result of the interactions between many individual agents acting under the instructions of their own internal rules.
The project will require the development of a system for specifying the elements of a picture and the agents responsible for maintaining/manipulating it. The end result will be a series of still images and a short animation demonstrating some results achieved using the new system.
-
Proposal Structure (for PG students)
The following is a general template for a masters project proposal under my supervision, and therefore, software based.
- Title
- Disclaimer (stating your own work etc. - usual plagiarism requirements)
- Abstract
- Acknowledgements (optional)
- Contents
- Chapter 1 - Introduction
- State the problem you are trying to solve
- Why is it worth tackling?
- What approaches are available (briefly)?
- What approach have you chosen?
- Any special knowledge you presume of the reader to understand the proposal.
- Any special typography or terminology (if too many use a glossary).
- A "road map" of the proposal document ... "In Chapter Two we describe xxxx. In Chapter Three we describe yyyy..."
- Chapter 2 - Background
- Any information the reader requires in terms of techniques/technology that isn't part of the programme you have studied.
- Chapter 3 - Analysis, Requirements, and Design
- What it says on the title
- Should include appropriate formal design diagrams/notation as necessary
- Any language selection, libraries, frameworks, etc., and why you have selected them.
- If you've written some code then discuss this briefly and how this will impact the final work.
- Chapter 4 - Experimentation & Evaluation
- Again, what it says on the tin!
- Briefly describe how you are going to show that your work meets the original aims and objectives; this doesn't mean just testing the code!
- Chapter 5 - Timescale
- Provide a GANTT chart which shows the timeline for your work. Probably a topic, rather than week-by-week, view will be most useful to the examiners.
Now, "one size does not fit all" - this structure is meant to be a starting point; you may have more, or fewer, chapters. Be flexible and check with me about what you are writing.
I prefer to see the proposal a section at a time (or even just a few pages at the start). A whole proposal sent to me to read at the end, with no chapters submitted prior to that, is a recipe for problems. It really doesn't give me any time to say, "No, change the style", or "No, I suggest you do it this way".
-
Report Structure
The following is a general template for a project report under my supervision (and software based). This is not necessarily the structure other members of staff would recommend but it works for me!
First of all, don't bother with the abstract, introduction, and conclusions - they are the last things you write. Next, if you have a lot of code (being supervised by me probably means "yes") then put it on GitHub (or equivalent) as a private repository and include a link to that. Don't fill up appendices with lots of code listings - think of the trees.
So, here is the outline:
- Title
- Disclaimer (stating your own work etc. - usual plagiarism requirements)
- Abstract
- Acknowledgements (optional)
- Contents
- Chapter 1 - Introduction
- State the problem you are trying to solve
- Why is it worth tackling?
- What approaches are available (briefly)?
- What approach have you chosen?
- Any special knowledge you presume of the reader
- Any special typography or terms
- A "road map" of the report ... "In Chapter Two we describe xxxx. In Chapter Three we describe yyyy..."
- Chapter 2 - Background
- Any information the reader requires in terms of techniques/technology that isn't part of the programme you have studied.
Please note: if you are submitting an MSc project report then this should not contain the material from the proposal - one cannot get credit twice for the same piece of work.
- Chapter 3 - Analysis, Requirements, and Design
- What it says on the title
- Should include appropriate formal design diagrams/notation as necessary
- Chapter 4 - Implementation
- Describe how the implementation maps onto the design you have already discussed.
- You should use "code snippets" to illustrate special features of your work or difficult (awkward) bits of coding. Don't make any of these snippets longer than half a page (and include line numbers if possible). If the code fragment is longer than half a page then break it up into smaller bits.
- Describe the code, both in terms of the overall architecture and in terms of the snippets. Make sure the reader understands what you have done and why!
- Chapter 5 - Experimentation & Evaluation
- Again, what it says on the tin!
- If you haven't done too much testing (for instance it is GUI based) then include a "walk-through" of the application with screenshots showing the scenarios in which the application can be used. After all, the examiners may not be near a computer to actually "run" your code.
- You also need to clearly show how you have evaluated your work and show how it meets the original aims and objectives.
- Chapter 6 - Conclusions
- Restate the problem.
- Say how successful (or not) you have been in solving/tackling the problem.
- What would you have done differently?
- What have you learnt?
- Basically, "reflect" on the work you have done.
- What additional features/extensions can be done to the work and/or what would you have done if you had more time.
- Appendices
- Include design diagrams, data formats, etc.
- Basically, don't overfill the report.
- Link to the GitHub repository (if appropriate).
Now, "one size does not fit all" - this structure is meant to be a starting point; you may have more, or fewer, chapters. You may need a background chapter, you might not, etc. Be flexible and check with me about what you are writing.
I prefer to see the report a chapter at a time (or even just a few pages at the start). A whole report sent to me to read at the end, with no chapters submitted prior to that, is a recipe for problems. It really doesn't give me any time to say, "No, change the style", or "No, I suggest you do it this way".
-
College Policy on Supervision of Dissertations for Taught Students
Sample student projects
The following are not to be distributed without prior agreement
- An Aspect Oriented Framework in F# by Nitesh Chacowry
- The Role of Depth in Neural Network’s Multimodal Word-Learning Assumption by Akira Charoensit
- First steps in creative computational thinking with natural language programming and Lego Mindstorms by Geoff Falk
- A Recommendation Engine by Amy Peters
- AmazingStoke: A Facebook game for the Not-For-Profit sector by Joanna Pinto
- MotionJS - A JavaScript Framework for large applications by Michael Sauter
Information for Personal Tutees
-
College Personal Tutors Policy