Cloud Computing Concepts
The module introduces the concepts of distributed computing systems and scalable infrastructures, as well as the development and configuration of complex, large-scale applications and systems. Students learn how to develop and deploy modern applications on cloud platforms such as Google Cloud Platform and Amazon EC2.
Cloud Computing introduces a variety of modern tools and technologies, including virtual machines and containers, the configuration of distributed systems, the deployment and operation of NoSQL systems, the development of RESTful services with Python, DevOps practices and infrastructure as code, and the use of big data processing systems such as Hadoop MapReduce and Apache Spark.
Cloud Computing is a highly technical module with a heavy programming lab schedule using Python, command-line interfaces and Linux systems.
"Learn distributed systems methods and techniques and how to develop, deploy and configure cloud systems and applications."
- Understanding of distributed computing systems and algorithms such as consensus algorithms and blockchain
- Use of Linux systems and the command line interface
- Development of RESTful services using the Python Django framework
- Use of authorization protocols (OAuth v2)
- Installation and configuration of NoSQL systems (Apache Cassandra)
- Development of parallel, concurrent and message-oriented software with Python (ZeroMQ and Kafka)
- Deployment of virtual machines in cloud suites and platforms (Google Cloud Platform and Amazon EC2)
- DevOps practices, infrastructure as code and deployment tools (Terraform)
- Containerisation with Docker and Kubernetes
- Ability to conceptualise Service-Oriented Architectures
- Understanding of distributed services, distributed configurations, synchronization and naming services
- Understanding of how to develop scalable applications by using load balancers
- Understanding of Hadoop MapReduce framework and developing MapReduce applications
- Understanding of Apache Spark framework and its applications
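As a flavour of the RESTful services theme above (an illustrative sketch only, not module material): the labs use Django, but the idea of a REST endpoint can be shown framework-free with a minimal WSGI application, the interface Django itself builds on. All names here (`app`, the `/status` path, the JSON fields) are invented for illustration:

```python
import json
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    """A minimal REST-style endpoint: GET /status returns a JSON document."""
    if environ["PATH_INFO"] == "/status" and environ["REQUEST_METHOD"] == "GET":
        body = json.dumps({"service": "demo", "healthy": True}).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]
    start_response("404 Not Found", [("Content-Type", "application/json")])
    return [json.dumps({"error": "not found"}).encode()]

# Exercise the app directly, without starting a server.
environ = {}
setup_testing_defaults(environ)  # fills in REQUEST_METHOD="GET", PATH_INFO="/", etc.
environ["PATH_INFO"] = "/status"
status_holder = {}

def start_response(status, headers):
    status_holder["status"] = status

result = app(environ, start_response)
print(status_holder["status"])  # 200 OK
```

Calling the application object directly, as above, is also how web frameworks are unit-tested without a running server.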
The module is organised into:
- Lecture classes: Students develop a theoretical understanding of the concepts.
- Demonstrations: Showcases of practical aspects (e.g. application deployment on public cloud providers).
- Lab sessions: A significant amount of class time is spent in lab sessions and practical work. Depending on their programming experience, students are expected to spend 3–6 hours per week completing lab tutorials and homework tasks.
The module syllabus includes:
- Cloud computing technology
- Use of Linux environments and command-line interface
- Cloud services and Virtualization
- Web services, REST and authorization protocols
- Developing APIs with the Django Python framework
- Distributed and parallel systems with Python
- Cloud data storage systems and NoSQL systems
- DevOps and Container systems
- Service-Oriented Architectures
- Distributed systems configuration
- Introduction to big data and the Hadoop MapReduce framework
- Hadoop MapReduce and Apache Spark application development
- Scaling Cloud applications
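To give a taste of the MapReduce programming model covered in the Hadoop sessions, the classic word count can be sketched in plain Python. This is a simplified, single-machine illustration of the map → shuffle → reduce stages; Hadoop itself distributes these steps across a cluster, and the function names here are invented for the sketch:

```python
from collections import defaultdict

def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in the input."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["the cloud", "the cloud is elastic"]
pairs = (pair for doc in docs for pair in map_phase(doc))
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'the': 2, 'cloud': 2, 'is': 1, 'elastic': 1}
```

In real Hadoop jobs only the map and reduce functions are written by the developer; the framework performs the shuffle and handles partitioning, fault tolerance and scheduling.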
- Excellent knowledge of Python programming is essential, including:
- Data structures with Python
- Object-oriented programming
- Functional programming
- Recursion, dynamic programming and exception handling.
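As a rough gauge of the expected level (an illustrative sketch, not module material), students should be comfortable reading and writing code like the following, which combines recursion, memoisation (top-down dynamic programming) and exception handling:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Fibonacci via recursion; lru_cache memoises subproblems (dynamic programming)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n if n < 2 else fib(n - 1) + fib(n - 2)

try:
    print(fib(30))  # 832040
    fib(-1)         # triggers the ValueError below
except ValueError as exc:
    print(f"caught: {exc}")
```

Without the memoisation decorator this recursion takes exponential time; with it, each subproblem is computed once.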
- MSc students should already know how to program with Python and should already have completed related modules (SP2 etc.).
* Please note that this module will not teach you the basics of Python and it is expected that you can apply the concepts taught in previous modules.
The module assessment includes:
- One programming coursework, with a report, on developing Cloud APIs and services using a Python framework (30%).
- A written exam (70%) based on lecture and lab material. The exam includes writing Python code.
The module is organised around a collection of books including:
- Ajay D. Kshemkalyani and Mukesh Singhal. 2008. Distributed Computing: Principles, Algorithms, and Systems (1st. ed.). Cambridge University Press, USA.
- Martin Kleppmann. 2017. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems (1st. ed.). O'Reilly Media, Inc.
- Tom White. 2015. Hadoop: The Definitive Guide (4th. ed.). O’Reilly Media, Inc.
- Bill Chambers and Matei Zaharia. 2018. Spark: The Definitive Guide Big Data Processing Made Simple (1st. ed.). O’Reilly Media, Inc.
- Nigel Poulton. 2017. Docker Deep Dive. Independently published.
Useful books to study:
- Django 3 By Example: Build powerful and reliable Python web applications from scratch by Antonio Melé. Packt Publishing; 3rd revised edition (31 Mar. 2020)
- The Kubernetes Book: Updated Feb 2020 by Nigel Poulton
- Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale by Neha Narkhede, Gwen Shapira, Todd Palino
[Last updated: 04/08/2020]