Tutors: Dell Zhang and Mark Levene
Time: Tuesday evenings 6pm - 9pm (Spring Term)
Room: Westminster Kingsway College (the King's Cross Centre) K214 [Room Description] [BBK-DCS Teaching Map]
Code: COIY064H7
Document: Module Specification
NOTICE: The exam date has been tentatively set to Wed 12-Jun-13 (10:00am - 12:00 noon).
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008. Companion Website
| Week | Date | Session I | Session II |
|---|---|---|---|
| 1 | 08/01/2013 | Chapter 00 Motivation [slides] | Chapter 01 Boolean Retrieval [slides] [classwork-p] [classwork-s] |
| 2 | 15/01/2013 | Chapter 02 The Term Vocabulary and Postings Lists [slides] | Chapter 03 Dictionaries and Tolerant Retrieval [slides] [classwork-p] [classwork-s] |
| 3 | 22/01/2013 | Chapter 04 Index Construction [slides] | Chapter 05 Index Compression [slides] [classwork-p] [classwork-s] |
| 4 | 29/01/2013 | Chapter 06 Scoring, Term Weighting, and the Vector Space Model [slides] [classwork-p] [classwork-s] [example] | Chapter 07 Computing Scores in a Complete Search System [slides] |
| 5 | 05/02/2013 | Suffix Tree and Suffix Array [slides] [classwork-p] | Chapter 08 Evaluation in Information Retrieval [slides] [example] |
| 6 | 12/02/2013 | Chapter 09 Relevance Feedback and Query Expansion [slides] | Chapter 09 Relevance Feedback and Query Expansion [slides] |
| 7 | 19/02/2013 | * A Brief Introduction to Probability and Statistics [slides] [example] | Chapter 11 Probabilistic Information Retrieval [slides] |
| 8 | 26/02/2013 | Chapter 12 Language Models for Information Retrieval [slides] [example] | Chapter 13 Text Classification & Naive Bayes [slides] [example] |
| -- | 26/02/2013 | Coursework Part 1 - Submission Deadline | |
| 9 | 05/03/2013 | Chapter 14 Vector Space Classification [slides] [demo] [example] | Support Vector Machines & Machine Learning on Documents [slides] |
| 10 | 12/03/2013 | Chapter 16 Flat Clustering [slides] [demo] [example] | Chapter 17 Hierarchical Clustering [slides] [example] |
| 11 | 19/03/2013 | Matrix Decompositions & Latent Semantic Indexing [slides] | Advanced Topics in Information Retrieval [slides] |
| -- | 02/04/2013 | Coursework Part 2 - Submission Deadline | |
| -- | Tuesday 07/05/2013 6pm - 9pm | Revision Lecture at MAL B33 | |
Coursework: 20%
Part 1
Normal deadline: Tue 26/02/2013 23:55
Cut-off deadline: Tue 12/03/2013 23:55
Part 2
Normal deadline: Tue 02/04/2013 23:55
Cut-off deadline: Tue 16/04/2013 23:55
Penalty for late submission (i.e., after the normal deadline): the maximum mark is capped at half of the normal full mark.
Please submit your solutions in electronic form through the Moodle system.
Examination: 80%
[NB] Mock Exam Paper
Past exam papers can be found at Birkbeck eLibrary.
Students committed to excellence are welcome to contact me for final project ideas.
Apache Lucene
Terrier IR Platform
The Lemur Project
Python Package - Whoosh
David Forsyth and Jean Ponce: An Introduction to Probability.
Peter Norvig: How to Write a Spelling Corrector.
Peter Norvig: Natural Language Corpus Data, in Beautiful Data: The Stories Behind Elegant Data Solutions.
Paul Graham: A Plan for Spam.
Paul Graham: Better Bayesian Filtering.
Robert M. Bell et al.: The Million Dollar Programming Prize, IEEE Spectrum, May 2009.
Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd edition, Prentice Hall, 2010. (Chapter 22 Natural Language Processing)
Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology Behind Search, 2nd edition, Addison Wesley, 2010.
Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, international edition, Pearson Education, 2009.
Stefan Büttcher, Charles Clarke, and Gordon Cormack, Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.
David Grossman and Ophir Frieder, Information Retrieval: Algorithms and Heuristics, 2nd edition, Springer, 2004.
Jeffrey Dean: Challenges in Building Large-Scale Information Retrieval Systems (WSDM-2009 Keynote Speech). [VideoLecture]
UC Berkeley Course SIMS141: Search Engines: Technology, Society, and Business [Guest Lecture Videos].
Michael McCandless, Erik Hatcher, and Otis Gospodnetic, Lucene in Action, 2nd edition, Manning, 2010.
Toby Segaran, Programming Collective Intelligence: Building Smart Web 2.0 Applications, O'Reilly, 2007.
Matthew Russell, Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites, O'Reilly, 2011.
Satnam Alag, Collective Intelligence in Action, Manning, 2008.
Haralambos Marmanis and Dmitry Babenko, Algorithms of the Intelligent Web, Manning, 2009.
Ron Zacharski, A Programmer's Guide to Data Mining, Free Online eBook.
Hans Rosling: The Joy of Stats [Video].
Stanford Course CS276/LING286: Information Retrieval and Web Mining
Stuttgart Course: Introduction to Information Retrieval
MSU Course CSE484: Information Retrieval
Cornell Course CS430/INFO430: Information Retrieval
UNT Course CSCE5200: Information Retrieval and Web Search
UIUC Course CS410: Introduction to Text Information Systems (Spring 2008)
UIUC Course CS598: Integrative Intelligent Information Systems (Spring 2008)
UMass Course CS646: Information Retrieval
UCSC Course ISM260: Information Retrieval
UTexas Course CS 371R: Information Retrieval and Web Search
UPenn Course CIS 430: Introduction to Human Language Technology
PSU Course IST 441: Information Retrieval and Search Engines
UNC Course INLS 490-154: Introduction to Information Retrieval System Design and Implementation (Fall 2008)
IIT Course CS429: Introduction to Information Retrieval
Columbia Course COMS 6998: Search Engine Technology
Colorado Course CSCI 7000-001: Introduction to Information Retrieval
JHU Course 605.744: Information Retrieval (Spring 2009)
UCL Course M052: Information Retrieval