Applied data science in biomedicine is a discipline that combines data management with biomedical analysis to improve the quality and efficiency of medical processes. This course covers the data lifecycle, quality assessment, data governance, medical language normalization, data sharing, and cross-cohort analysis.
No prior knowledge is required.
Students acquire the knowledge and develop the skills listed below:
1. Understand the stages of the data lifecycle, from creation to deletion or storage.
2. Use ontologies and knowledge graphs for medical language normalization.
3. Implement data anonymization techniques to share data securely.
The course covers the following topics:
1. Data lifecycle and quality assessment: stages, quality dimensions, FAIR principles.
2. Data governance and policies: ethical, legal, and institutional conditions, data ownership, permissions, data access committees.
3. Medical language normalization: ontologies, SNOMED CT, LOINC, HPO, interoperability.
4. Data sharing: anonymization techniques, aggregated and federated models, cloud-based environments, synthetic data generation, federated discovery, federated learning.
The classes in the Applied Data Science in Biomedicine course are eminently practical and designed to promote active learning. Students take an active part in each session and learn by working through the assigned tasks on their own laptops. Sessions combine theoretical material with hands-on practice, with a focus on writing scripts. The course also includes an introduction to the R programming language, which provides the tools needed to complete the course successfully.
Student evaluation is based on the following components:
- Attendance and participation in class.
- Individual development exercises completed outside of class.
- Group development exercises completed outside of class.
- Exams covering the course content.
The following will be assessed:
1. The appropriate selection and application of computational methods, demonstrating sound reasoning that shows technical coherence and biomedical relevance.
2. The ability to critically analyze the results obtained, identifying limitations, potential biases, and well-founded improvements.
3. The correct integration and processing of complex data, demonstrating a rigorously justified use of machine learning algorithms.
4. The clinical and ethical interpretation of the results, demonstrating sensitivity to privacy, equity, and the validity of the model.
5. The clarity and adaptability of the discourse to different audiences, reflecting precise, understandable communication appropriate to the required technical level.
6. The quality of the visualizations and the structure of the presentation, demonstrating narrative coherence and the ability to convey key conclusions.
7. The effective and independent use of advanced programming tools, demonstrating rigorous, efficient, and technically sound proficiency.
8. The quality of the developed code, including its reproducibility, adequate documentation, and alignment with good practices in biomedical research.
Benson T, Grahame G, Principles of Health Interoperability: SNOMED CT, HL7 and FHIR, 4th ed., Springer, 2021
Venkataramanan N, Shriram A, Data Privacy: Principles and Practice, 1st ed., CRC Press, 2016
Alberts B, Heald R, Johnson A, Morgan D, Raff M, Roberts K, Walter P, Molecular Biology of the Cell, 7th ed., W. W. Norton & Co, 2022