Bachelor in Business Intelligence and Data Analytics

Become an expert in data analysis and business decision-making in a technological ecosystem with great networking opportunities

Data mining

Description: 

This course introduces data mining, the extraction of useful information and knowledge from large volumes of data, to improve business decision-making. It gives a comprehensive introduction to the main techniques and methods used in data mining, including data preprocessing, data exploration and visualization, and data modelling and prediction. It also covers real-world applications and case studies from industry. The goal is to give students a solid understanding of data mining techniques and their application in Python, so that they can analyse data and extract insights from it in various fields.

Type Subject
Third - Compulsory
Semester
Second
Course
2
Credits
6.00

Titular Professors

Previous Knowledge: 

--

Objectives: 

The "Data Mining" (IN015) course focuses on extracting valuable knowledge from large volumes of data to enhance business decision-making. Throughout the semester, you will explore the complete data mining process, including data preprocessing, exploration, and predictive modeling using regression, classification, and tree-based methods. Ultimately, the course equips you with the practical skills to apply these techniques using Python and its core libraries (such as pandas and scikit-learn), enabling you to analyze real-world data, critically evaluate the reliability of your results, and effectively communicate your findings.

Contents: 

First part of the semester:

  • Introduction to Data Mining
  • Data Preprocessing
  • Regression Models
  • Classification Models

Second part of the semester:

  • Cross-Validation
  • Feature Selection
  • Tree Based Models
  • Text Mining

Project

  • Predicting Startup Success using Twitter

Methodology: 

The following table relates the learning outcomes to the content taught to achieve them:


Learning outcome (RA) | Syllabus contents
R1. Understanding of data mining concepts and techniques | Introduction to Data Mining
R2. Ability to analyze and interpret large datasets to extract meaningful insights and patterns | Data Preprocessing, Feature Selection, Cross-Validation
R3. Knowledge of the various tools and technologies used in data mining with Python, including numpy, pandas, matplotlib, seaborn and scikit-learn | Regression Models, Classification Models, Tree Based Models
R4. Ability to critically evaluate data mining results and judge their reliability and validity | Cross-Validation, Feature Selection
R5. Ability to communicate and present findings from data mining analysis effectively | Project: Predicting Startup Success using Twitter

Evaluation: 

Evaluation is continuous and combines several activities to help students assimilate the material.

The following table shows the weight of each activity in the final grade:


Learning outcomes | Activity | Weight
R1, R2 | Homework | 20%
R2, R3 | Mid-Term Exam | 30%
R4, R5 | Project | 20%
R2, R3 | Final Exam | 30%

The aims of the continuous evaluation are the following:

  • Progressive learning of the subject, with evaluation of each activity
  • Evaluation, through exams, of the knowledge acquired
  • Practice of the subject through a real-world project

Evaluation Criteria: 

--

Basic Bibliography: 

  • Provost, F., Fawcett, T. (2013). Data Science for Business: What you need to know about data mining and data-analytic thinking, O’Reilly
  • Mueller, A., Guido, S. (2016). Introduction to Machine Learning with Python, O’Reilly

Additional Material: 

--