The module introduces the concepts of distributed computing systems, scalability, and the development and configuration of complex, large-scale systems. Students will learn how to develop and deploy modern applications on the cloud.
Students will be introduced to a variety of modern tools and technologies, including the use of virtual machines and containers, configuration of distributed systems, deployment of NoSQL systems, and development of RESTful services with Python, among others. The module concludes with an introduction to big data systems, focusing on the Hadoop MapReduce ecosystem. This is a highly technical module with a heavy programming lab schedule.
"Learn distributed systems methods and techniques and how to develop, deploy and configure cloud systems and applications."
- Understanding of algorithms for complex distributed computing systems
- Use of Linux systems and the command line interface
- Development of RESTful services using Python frameworks (such as Flask and Django)
- Use of authorization protocols (e.g. OAuth 2.0)
- Configuration of NoSQL systems (such as MongoDB and Apache Cassandra)
- Development of parallel and concurrent processing software with Python
- Deployment and configuration of containerized systems (e.g. using Docker)
- Ability to conceptualize Service Oriented Architectures
- Understanding of distributed services such as distributed configuration, synchronization and naming (e.g. using Apache ZooKeeper)
- Understanding of how to develop scalable applications (e.g. by using load balancers and messaging systems)
- Understanding of the Hadoop MapReduce framework
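As an illustration of the MapReduce style named in the last outcome, the following is a minimal single-process sketch in plain Python. It does not use Hadoop itself; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions chosen for this example, and a real Hadoop job would distribute these phases across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, mimicking the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all counts emitted for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    print(word_count(["the cloud scales", "the cloud stores data"]))
```

Keeping the three phases as separate functions mirrors how the framework separates user-written map and reduce logic from the grouping it performs in between.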
The module includes:
- Lecture classes: Students will develop theoretical understanding of the concepts.
- Demonstrations: Showcases of practical aspects (e.g. application deployment on public cloud providers).
- Lab sessions: A significant amount of class time will be spent in lab sessions and practical work.
The module syllabus includes:
- Cloud computing concepts
- Use of Linux environments and command line interface
- Cloud services and Virtualization
- Web services, REST and authorization protocols
- Using Python frameworks to develop APIs
- Distributed and parallel systems with Python
- Cloud data storage systems and NoSQL systems
- Container systems
- Service Oriented Architectures
- Distributed systems configuration
- Introduction to big data and the Hadoop MapReduce framework
- Hadoop MapReduce application development
- Scaling Cloud applications
- Introduction to the Internet of Things using Cloud systems
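To give a flavour of the web services and REST topic in the syllabus, here is a small framework-free sketch using only Python's standard-library WSGI support. The module itself names Flask and Django; this example deliberately avoids both so that it runs without extra packages. The `ITEMS` store, the `/items` routes and the `call` helper are hypothetical names introduced for illustration.

```python
import json
from wsgiref.util import setup_testing_defaults

# Hypothetical in-memory data store; a real service would use a database.
ITEMS = {"1": {"name": "example"}}


def app(environ, start_response):
    """Minimal WSGI app: GET /items lists items, GET /items/<id> returns one."""
    method = environ["REQUEST_METHOD"]
    parts = [p for p in environ.get("PATH_INFO", "").split("/") if p]

    if method != "GET":
        status, body = "405 Method Not Allowed", {"error": "method not allowed"}
    elif parts == ["items"]:
        status, body = "200 OK", ITEMS
    elif len(parts) == 2 and parts[0] == "items" and parts[1] in ITEMS:
        status, body = "200 OK", ITEMS[parts[1]]
    else:
        status, body = "404 Not Found", {"error": "not found"}

    payload = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(payload)))])
    return [payload]


def call(path, method="GET"):
    """Invoke the app directly, without a network socket, for quick checks."""
    environ = {}
    setup_testing_defaults(environ)
    environ["REQUEST_METHOD"] = method
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)
```

To try it over HTTP, the same `app` could be passed to `wsgiref.simple_server.make_server("", 8000, app)` and served with `serve_forever()`; the port number is an arbitrary choice.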
- Excellent knowledge of Python programming is necessary, including data structures, object-oriented programming, functional programming, recursion and exception handling.
- MSc Data Science students should have already taken the Principles of Programming 1 (POP1) module.
* Please note that this module will not teach you the basics of Python; you are expected to be able to apply the concepts taught in previous modules.
All dates and timetables are listed in programme handbooks, found in the downloads section of individual programme pages.
- Timetable of all departmental teaching events
- Term dates
- Timetable for the week ahead (including venue information)
Enrolled students can find their personal teaching timetable and location of classes on their My Birkbeck profile.
One programming coursework with a report on developing cloud-based services.
- One programming coursework using a Python framework (20%)
- A written exam (80%) based on lecture and lab material. The exam will include writing code in Python.