The goals will focus on:
Know the key concepts and foundations of the information models needed to understand the roles of the professional profiles that work around an information system and its exploitation.
Gain the broadest possible view of the different information architectures, both those already established in the business market and the new disruptive approaches based on newly available technologies.
Learn to collect, transform and process information based on its origin, volume, format and periodicity of analysis.
Understand the different forms of Data Governance and how they contribute to the day-to-day work of a Data Scientist.
1. Key concepts and bases of information models
1.1. History of Databases
1.2. Key figures around information systems (Data Scientist, Data Analyst, Product Owner, etc.)
1.3. Roles of a Data Scientist and interaction with the other profiles
1.4. DBMS vs RDBMS (information exploitation concept)
1.5. Theory and practical examples of Relational Models
1.6. Definition and concept of ETL
2. What is information and how do we extract value from it?
2.1. Concept of data and information (from the origin of the data to exploiting information and achieving value)
2.2. Business Intelligence and how it relates to Data Science
2.3. Big data
2.3.1. ELT vs ETL
2.3.2. Architecture concept
2.4. Types of data architectures
2.4.1. Logical / Technological / Physical
2.4.2. Information environments (DEV, PRO, SandBox, ...)
2.4.3. Concepts: Data Lake, DWH, etc.
2.5. AI Data Model Architecture
2.5.1. Development
2.5.2. Validation and promotion to PRO (batch and online)
2.5.3. Monitoring
3. Exploit the different types of information
3.1. Structured DB (review and expansion of what has already been seen)
3.2. Semi-structured DB
3.3. Unstructured DB
4. Web Scraping as a data source
4.1. What is Web Scraping?
4.2. Legal aspects
4.3. Tools
5. Cloud Computing for Data Scientists
6. The importance of data traceability and reliability (Data Governance)
Note: Topics can be adjusted and/or modified at the discretion of the master's coordination.
The methodology combines lectures, student participation, exercises, and practical work. For the student, this involves both individual and group work, as well as conceptual exercises, written exercises, and oral presentations.
Continuous assessment
This subject will be assessed continuously through exercises, assignments, practical work, and in-class presentations. The final grade will be a weighted combination of:
- Practical assignment on a relational information system: 30%
- Practical assignment on unstructured data systems: 30%
- Final project and presentation: 40%
Extraordinary call
The exam and/or assignments for the extraordinary call will be determined by the subject coordination.
Copies regulations
The subject is governed by the general copies regulation of La Salle Campus BCN:
https://www.salleurl.edu/en/copies-regulation
The training activities are classified into the following categories:
Exercises: moderately significant
Project: highly significant
Final Evaluation: highly significant
The bibliography will be detailed throughout the course:
Class/Lecture notes
Documentation and papers uploaded to Intranet (eStudy)
All class material (presentations, exercises, articles, documents, etc.) will be shared in the subject folder of the La Salle Intranet: eStudy.