In this module, students will learn about the emerging area of cloud computing and how it relates to traditional models of computing, and will gain competence in MapReduce as a programming model for distributed processing of big data.
This module aims to introduce back-end cloud computing techniques for processing "big data" (terabytes to petabytes) and for developing scalable systems (with up to millions of users). We focus mostly on MapReduce, which is presently the most accessible and practical means of computing for "Web-scale" problems, but also discuss other techniques.
- Introduction to Cloud Computing
- Cloud Computing Technologies and Types
- Big Data
- MapReduce and Hadoop
- Running Hadoop in the Cloud (Practical Lab Class)
- Developing MapReduce Programs
- Data Management in the Cloud
- Information Retrieval in the Cloud
- Link Analysis in the Cloud
- Beyond MapReduce
- Selected Case Studies
- Advanced Topics in Cloud Computing
A good knowledge of Java programming is required. Students who did not have much Java programming experience before joining their MSc programme should already have taken the ISD (BUCI021S7) module.
All dates and timetables are listed in the handbooks of the individual programmes.
A couple of programming assignments.
Assessment: Coursework (20%). Examination (80%).
- Jothy Rosenberg and Arthur Mateos, The Cloud at Your Service, Manning, 2010.
- Jimmy Lin and Chris Dyer, Data-Intensive Text Processing with MapReduce, Morgan and Claypool, 2010.
- Extensive use is made of other relevant book chapters and research papers that are distributed or provided online.