Master of Science in Data Science La Salle Campus Barcelona URL

Become an expert in analysing, structuring, filtering, visualising and extracting value from generated data

Natural language processing

Description
The goal of the subject is to be able to analyse text data coming from natural language. Because the domain is extensive, we will start by explaining how to process text data, how to encode it and how to extract features from it so that it can be used in classical artificial intelligence models. We will then explain sequential models and morphological models that can be used to analyse natural language as individual text or as a group of words forming a sequence, with the objective of being able to make classifications and predictions based on language. Finally, we will review in detail the transformer methodology, how it works and how it can be applied to language analysis.
Type of subject: Elective
Semester: Second
Credits: 5.00
Previous Knowledge

MD005 and MD008 subjects

Objectives

The goals of the subject are to:
• Learn to process language data in text format.
• Know how to apply classical artificial intelligence models to processed text data.
• Understand how transformers work in the language context and know how to apply them to text data.

Contents

SYLLABUS

1. Word Processing and Count Vectorizer
2. Word Embeddings
3. Generative models: Hidden Markov Model
4. Discriminative models: Structured perceptron
5. Recurrent neural networks applied to NLP
6. Transformers and ELMo
7. BERT
8. Final practice

Note: Topics may be adjusted and/or modified at the discretion of the master's coordinators.

Methodology

The methodology combines lectures, student participation, practical exercises in class, and a final project that consists of either solving a challenge or carrying out a research exercise. For the student, this involves group work with an oral presentation in class and a written assessment.

Evaluation

This subject will be assessed continuously, through either the development of a proposed challenge or a research project on existing solutions in a scientific context, together with a final presentation in class.

Evaluation Criteria

Continuous assessment
The final grade will be a weighted combination of:
- Challenge solution (implementation) and/or presentation or research work: 80%
- Class participation: 20%

Extraordinary call
The exam and/or assignments for the extraordinary call will be determined by the subject coordinators.

Copies regulations
The subject is governed by the general copying regulations of La Salle Campus BCN:
https://www.salleurl.edu/en/copies-regulation
Under these regulations, the training activities are classified as follows:
• Final exercise or challenge: highly significant

Basic Bibliography

The bibliography will be detailed throughout the course.
All class material (presentations, exercises, articles, documents, etc.) will be shared in the subject folder of the La Salle Intranet: eStudy.

Additional Material

The complementary bibliography will be detailed throughout the course.