Foundations of Data Science II
Module outline
This module covers further fundamental aspects of data science and analytics and is a direct continuation of Foundations of Data Science I. As well as consolidating the knowledge acquired in Foundations of Data Science I, you will develop further mathematical knowledge and skills needed for the BSc Data Science programme and by data scientists and analysts in general. These include basic elements of calculus, further topics in linear algebra, continuous probability theory and further statistics. The module will show you how to use the popular programming language Python to solve computational tasks from these mathematical subjects. In particular, it will introduce you to widely used Python libraries and packages for solving problems arising in calculus, probability theory and statistics.
Aims
The module provides the basic knowledge of data science necessary for further modules such as Machine Learning and Data Analytics. Upon successful completion of the module, students will:
- Be competent with the basic elements of calculus and further topics of linear algebra
- Be competent with the basic elements of continuous probability theory
- Be competent with the basic elements of further topics in statistics
- Be familiar with Python libraries relevant to calculus and probability/statistics
- Be able to apply Python to program and solve computational tasks from calculus and probability/statistics
Learning Outcomes
On successful completion of this module, you will be expected to:
- Demonstrate satisfactory knowledge of basic calculus.
- Demonstrate satisfactory knowledge of further linear algebra and matrix theory.
- Demonstrate satisfactory knowledge of continuous probability theory and statistics.
- Demonstrate satisfactory knowledge of relevant Python libraries and packages.
- Demonstrate satisfactory skills in Python programming to solve computational tasks from calculus, linear algebra, continuous probability theory and statistics.
- Understand the link between the basic knowledge acquired from the module and data science/analytics applications.
Syllabus
- Differentiation
- Indefinite and definite integration
- Solving systems of polynomial equations (e.g., Newton's method) and basic optimisation algorithms (e.g., gradient descent)
- Continuous probability (random variables, pdf, cdf, expectation, variance, and correlation)
- Common distribution families (Poisson, Normal, etc.)
- Probabilistic inequalities and concentration (LLN, CLT, etc.)
- Statistical testing (hypothesis testing, chi-squared tests)
- Sampling and confidence intervals
- Eigenvalues and eigenvectors
- Singular value decomposition (SVD)
- Tools: Python
Various topics will be demonstrated in practical lab sessions.
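As an illustration only, the kind of computational task named in the syllabus (Newton's method and gradient descent) might look like the sketch below. It is not taken from the module's lab materials: the function names, the single-variable test problems, and the step size and iteration counts are all assumptions chosen for the example, and it treats a single equation rather than a system.

```python
def newton(f, df, x0, tol=1e-10, max_iter=50):
    """Find a root of f by Newton's method, starting from x0.

    f is the function, df its derivative (both hypothetical inputs
    supplied by the caller).
    """
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)   # Newton update: x_new = x - f(x) / f'(x)
        x -= step
        if abs(step) < tol:   # stop once the update is negligible
            return x
    raise RuntimeError("Newton's method did not converge")


def gradient_descent(grad, x0, lr=0.1, n_steps=200):
    """Minimise a one-variable function given its derivative grad.

    lr (step size) and n_steps are illustrative defaults, not tuned values.
    """
    x = x0
    for _ in range(n_steps):
        x -= lr * grad(x)     # move against the gradient
    return x


# Root of x^2 - 2 (the square root of 2), starting from x0 = 1.
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)

# Minimiser of (x - 3)^2, whose derivative is 2 * (x - 3).
minimiser = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)

print(root, minimiser)
```

Both routines use only the Python standard library; in practice, numerical libraries such as NumPy and SciPy provide more robust versions of these methods.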
Prerequisites
Foundations of Data Science I
Timetables
Indicative timetables can be found in the handbooks available on programme pages. Personalised teaching timetables for students are available via My Birkbeck.
Assessment
Coursework (20%) and one two-hour written examination (80%).