Nonparametric inference for classification and association with high dimensional genetic data

García-Magariños, Manuel

Supervised by:

Antonio Salas Ellacuriaga (Supervisor)
Wenceslao González Manteiga (Supervisor)
Ricardo Cao Abad (Supervisor)

University of defense: Universidade de Santiago de Compostela

Date of defense: 29 January 2010

Examination committee:

Ángel Carracedo Álvarez (President)
Carmen María Cadarso Suárez (Secretary)
Ignacio López de Ullibarri Galparsoro (Member)
Vincent Macaulay (Member)
Thore Egeland (Member)

Type: Thesis

Teseo: 286596 DIALNET

Abstract

In recent years, advances in genetics have brought about a revolution that has spread beyond the field itself and is shaping the future of many other scientific areas. Because the boom in genetics has given rise to countless high-dimensional datasets containing DNA/RNA profiles, statistics is the science needed to deal with them. New tools need to be developed, and existing methods can also be adapted to genetic data and their performance evaluated. The term "genetic data" covers a wide variety of datasets whose only common feature is that they derive from DNA information, ranging from SNPs (categorical data) to gene expression measurements (continuous data). This DNA information could hold the answer to many common diseases with a complex basis (psychiatric disorders, cancer, diabetes, etc.), so the main aim of statistics is to provide suitable, powerful techniques able to unravel the underlying nature of complex diseases. This thesis presents several statistical approaches to both gene expression data and SNP/STR data, including penalized regression, machine learning, and tree-based methods. Although the emphasis lies on clinical genetics, statistical tools for population and forensic genetics are also covered.