Information Retrieval and Organisation

Lecturers: Sven Helmer and Dell Zhang
Programme: MSc CS and MSc AIS/IWT/IT
Time: Friday evenings 6pm - 9pm
Room: MAL G16 [BBK Maps]
Code: COIY064H7
Document: Module Specification


Textbook

Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze,
Introduction to Information Retrieval,
Cambridge University Press, 2008.

Companion Website

Syllabus

Week 1-6 (13/01/2012, 20/01/2012, 27/01/2012, 03/02/2012, 10/02/2012, 17/02/2012)
  Please go to Sven's IR page
Week 7 (24/02/2012)
  Session I: * A Brief Introduction to Probability and Statistics [slides] [example]
  Session II: Chapter 11, Probabilistic Information Retrieval [slides]
Week 8 (02/03/2012)
  Session I: Chapter 12, Language Models for Information Retrieval [slides] [example]
  Session II: Chapter 13, Text Classification & Naive Bayes [slides] [example]
Week 9 (09/03/2012)
  Session I: Chapter 14, Vector Space Classification [slides] [demo] [example]
  Session II: Chapter 15, Support Vector Machines & Machine Learning on Documents [slides]
Week 10 (16/03/2012)
  Session I: Chapter 16, Flat Clustering [slides] [demo] [example]
  Session II: Chapter 17, Hierarchical Clustering [slides] [example]
Week 11 (23/03/2012)
  Session I: Chapter 18, Matrix Decompositions & Latent Semantic Indexing [slides]
  Session II: * Advanced Topics in Information Retrieval [slides]
30/03/2012: Coursework Part 2 - Submission Deadline
Friday 11/05/2012, 6pm - 9pm: Revision Lecture at Clore MC, CLO G01

Assessment

Coursework: 20%
Part 1: An assignment for Sven's lectures.
Normal deadline: Fri 24/02/2012
Cut-off deadline: Fri 09/03/2012
Part 2: An assignment for Dell's lectures.
Normal deadline: Fri 30/03/2012
Cut-off deadline: Fri 13/04/2012
Penalty for late submission: a maximum mark of 50.
Method of submission: through Blackboard.

Examination: 80%
[NB] Mock Exam Paper
Past exam papers can be found at Birkbeck eLibrary.

Projects

Students committed to excellence are welcome to contact me for final project ideas.

Python Programming

The Python Tutorial
Swaroop C H: A Byte of Python
Allen Downey: Think Python --- How to Think Like a Computer Scientist (Free Online Book)
Mark Pilgrim: Dive Into Python (Free Online Book)
Mark Pilgrim: Dive Into Python 3 (Free Online Book)
The Definitive Guide to Jython --- Python for the Java Platform (Free Online Book)

Information Retrieval Software

Apache Lucene
Terrier IR Platform
The Lemur Project
Python Package - Whoosh

Supplements

David Forsyth and Jean Ponce: An Introduction to Probability.

Peter Norvig: How to Write a Spelling Corrector.
Peter Norvig: Natural Language Corpus Data, in Beautiful Data: The Stories Behind Elegant Data Solutions.
Paul Graham: A Plan for Spam.
Paul Graham: Better Bayesian Filtering.
Robert M. Bell et al.: The Million Dollar Programming Prize, IEEE Spectrum, May 2009.

Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd edition, Prentice Hall, 2010. (Chapter 22 Natural Language Processing)
Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology Behind Search, 2nd edition, Addison Wesley, 2010.
Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, international edition, Pearson Education, 2009.
Stefan Büttcher, Charles Clarke, and Gordon Cormack, Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.
David Grossman and Ophir Frieder, Information Retrieval: Algorithms and Heuristics, 2nd edition, Springer, 2004.

Jeffrey Dean: Challenges in Building Large-Scale Information Retrieval Systems (WSDM-2009 Keynote Speech). [VideoLecture]
UC Berkeley Course SIMS141: Search Engines: Technology, Society, and Business [Guest Lecture Videos].

Michael McCandless, Erik Hatcher, and Otis Gospodnetic, Lucene in Action, 2nd edition, Manning, 2010.

Toby Segaran, Programming Collective Intelligence: Building Smart Web 2.0 Applications, O'Reilly, 2007.
Matthew Russell, Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites, O'Reilly, 2011.
Satnam Alag, Collective Intelligence in Action, Manning, 2008.
Haralambos Marmanis and Dmitry Babenko, Algorithms of the Intelligent Web, Manning, 2009.

Hans Rosling: The Joy of Stats [Video].

Related Courses

Stanford Course CS276/LING286: Information Retrieval and Web Mining
Stuttgart Course: Introduction to Information Retrieval

MSU Course CSE484: Information Retrieval
Cornell Course CS430/INFO430: Information Retrieval
UNT Course CSCE5200: Information Retrieval and Web Search
UIUC Course CS410: Introduction to Text Information Systems (Spring 2008)
UIUC Course CS598: Integrative Intelligent Information Systems (Spring 2008)
UMass Course CS646: Information Retrieval
UCSC Course ISM260: Information Retrieval
UTexas Course CS 371R: Information Retrieval and Web Search
UPenn Course CIS 430: Introduction to Human Language Technology
PSU Course IST 441: Information Retrieval and Search Engines
UNC Course INLS 490-154: Introduction to Information Retrieval System Design and Implementation (Fall 2008)
IIT Course CS429: Introduction to Information Retrieval
Columbia Course COMS 6998: Search Engine Technology

Colorado Course CSCI 7000-001: Introduction to Information Retrieval
JHU Course 605.744: Information Retrieval (Spring 2009)
UCL Course M052: Information Retrieval

Links

My Blog - Research on Search

