Nonparametric inference for classification and association with high dimensional genetic data
- García-Magariños, Manuel
- Antonio Salas Ellacuriaga Supervisor
- Wenceslao González Manteiga Supervisor
- Ricardo Cao Abad Supervisor
Defending university: Universidade de Santiago de Compostela
Defense date: 29 January 2010
- Ángel Carracedo Álvarez Chair
- Carmen María Cadarso Suárez Secretary
- Ignacio López de Ullibarri Galparsoro Committee member
- Vincent Macaulay Committee member
- Thore Egeland Committee member
Type: Thesis
Abstract
In recent years, advances in genetics have driven a revolution that has expanded beyond the borders of genetics itself, influencing the future of many other scientific areas. As the boom in genetics has given rise to countless high-dimensional datasets containing DNA/RNA profiles, statistics is the science required to deal with them. Not only do new tools need to be developed, but existing methods can also be adapted, and their performance evaluated, for application to genetic data. The term "genetic data" covers a wide variety of datasets that have in common only the fact that they derive from DNA information: from SNPs (categorical data) to gene expression measures (continuous data). This DNA information could hold the answer to many common diseases with a complex basis (psychiatric disorders, cancer, diabetes, etc.), so the main aim of statistics is to provide appropriate, powerful techniques able to unravel the underlying nature of complex diseases. This thesis presents several statistical approaches to both gene expression data and SNP/STR data, including penalized regression, machine learning, and tree-based methods. Although the emphasis lies on clinical genetics, statistical tools for population and forensic genetics are also explained.