E-Commerce Technology 2 – Database Systems, Data Warehousing, OLAP and Data Mining

  Data mining: Study guide for 2003/04


News and announcements



Learning outcomes

Method of teaching

Lecture/seminar programme


How the assessment relates to the learning outcomes

Core reading list

Supplementary reading and study material

Reassessment details


  News and announcements

 This module is part of the Postgraduate Diploma in E-Commerce, Spring Term 2004. It covers the fundamental principles of modern database systems, data warehouses, on-line analytical processing and data mining, and their impact on e-commerce. For details on the databases part of the module, please visit Prof. Poulovassilis's web pages. This study guide covers the data mining part of the module. Students should read this study guide carefully and follow all the links to the accompanying material. The information in this study guide is maintained by Dr George D. Magoulas, whom you can contact by email if you have any queries or want to set up an appointment.

 This document will be updated while the course is in progress. Please check it regularly for up-to-date information about the module.



 Web-based business is generating a vast amount of data on consumer transactions, browsing behaviours, usage times, and preferences. Data mining is, in general, the task of extracting implicit, previously unidentified and potentially useful information from data. In certain cases, the data sets are characterised by incompleteness (missing parameter values), incorrectness (systematic or random noise in the data), sparseness (few and/or non-representative records available), and inexactness (inappropriate selection of parameters for the given task). The aim of this module is to present and discuss issues associated with data mining of web-based applications. It will cover the basic concepts and techniques for data mining and intelligent data analysis, including methods for knowledge engineering, artificial neural networks and clustering. No particular programming language knowledge is assumed and mathematical prerequisites are kept to a minimum.



 ·         To introduce the basic concepts in data mining and intelligent data analysis.

·         To demonstrate the process of knowledge discovery using practical examples.

·         To provide hands-on experience with the Clementine data mining tool.


 Learning outcomes

 By the end of the module students must demonstrate the ability to:

 ·         Discuss basic concepts of data mining and intelligent data analysis.

·         Explain the process of knowledge discovery.

·         Demonstrate the process of knowledge discovery in practical examples.


Method of teaching

 The module will be delivered through a series of lectures and lab sessions. Lectures are on Wednesdays 18:00-21:00 in room 153 in the Main Building. The lab sessions are all in room 128, which is booked for this course on all Wednesdays, 18:00-21:00, until March 26th. It is strongly recommended that you attend these labs.


 Lecture programme

 Lectures are on Wednesdays 18:00-21:00. The lecture programme is as follows:


Week 23 February – 27 February 2004

Lecture 10: Data mining services


Week 1 March – 5 March 2004


Week 8 March – 12 March 2004

Lecture 14: Clustering algorithms


Week 15 March – 19 March 2004

Lab session 5: Clementine

Lab session 6: Clementine


Week 22 March – 26 March 2004


REVISION LECTURE:  Wednesday 28th of April – Room 121



 The exam paper for the module will have five questions in total; students will choose three. Two of the five questions are on data mining.


How the examination relates to the learning outcomes

 The examination relates to the basic learning outcomes stated earlier in this document. Exam questions will cover all aspects of the data mining part of the module, assessing the accomplishment of ALL learning outcomes.


 Reading list

 The following reading list is a recommended source of course material.

 “Learning from data”, Vladimir Cherkassky, Wiley, 1998, ISBN: 0-471-15493-8.

“Artificial Intelligence: a Guide to Intelligent Systems”, Michael Negnevitsky, Addison Wesley, 2002, ISBN: 0-201-71159.

“Data mining techniques: for marketing, sales, and customer support”, Michael J.A. Berry, Gordon Linoff, Wiley, 1997, ISBN: 0471179809.

“Data mining: concepts, models, methods, and algorithms”, Mehmed Kantardzic, Wiley-Interscience: IEEE Press, 2003, ISBN: 0471228524.

“A guide to neural computing applications”, Lionel Tarassenko, Arnold Publishers, 1998, ISBN: 0-340-705892.

“Neural Networks”, Picton P., 2nd edition, Palgrave, 2000, ISBN: 0-333-80287-X.

“Fundamentals of Neural Networks: Architectures, Algorithms and Applications”, Fausett L., Prentice Hall, 1994, ISBN: 0-13-042250-9.

“Pattern Recognition Using Neural Networks”, Looney C., Oxford University Press, 1997, ISBN: 0-19-507920-5.

“Mathematical classification and clustering”, Boris Mirkin, Kluwer Academic, 1996, ISBN: 0792341597.

“Data mining: practical machine learning tools and techniques with Java implementations”, Ian H. Witten, Eibe Frank, Morgan Kaufmann, 2000, ISBN: 1558605525.


Electronic Resources

 On the following sites you can find relevant material:


The URL of the ACM digital library.


Links to several journals. You might need your ATHENS password to access them.


Elsevier Publisher. Access to several neural network journals. You might need your ATHENS password to access them.


EEVL is an award-winning free service, which provides quick and reliable access to the best engineering, mathematics, and computing information available on the Internet.


Links to several journals. You might need your ATHENS password to access them.

 If you don't have an ATHENS password, please visit the Library's web page and follow the instructions to obtain one.



 ·         ACM Transactions on Internet Technology (electronic access)

·         Data & Knowledge Engineering (electronic access)

·         Data Mining and Knowledge Discovery (electronic access)

·         IEEE Transactions on Neural Networks

·         Neurocomputing (electronic access)

·         IEEE Intelligent Systems (electronic access)

·         Expert Systems with Applications (electronic access)

·         The Knowledge Engineering Review (electronic access)

·         IEEE Transactions on Knowledge and Data Engineering (electronic access)

·         Applied Intelligence (electronic access)

·         IEEE Internet Computing (electronic access)

·         Intelligent Data Analysis (electronic access)



Supplementary reading and study material

 The reading list above is the recommended source of course material. You are advised to acquire at least one of the books, but you should first satisfy yourself as to the suitability of each textbook. Use this study guide to assess the coverage in each book. Some of the books will not cover the course entirely, and may contain material not covered in the course.

 It is advisable to look in the Library or on the Web for further reading around the topic of the module; you will find a lot of literature dealing with data mining, neural computing, and clustering. The books in the reading list above, as well as related books, can be found in the Library. Feel free to buy a book of your own choice even if it is not included in the reading list, and use the Library frequently; it contains plenty of other material that will interest you.



Reassessment details

 Please refer to the PDEC course booklet; the reassessment procedure is standard across the whole college.