Info Theory and Ancient DNA Plant Microbiome

  1. Gonzalez-Diaz, Humbert
  2. Munteanu, Cristian Robert
  3. Sierra, Alejandro Pazos

Publisher: figshare

Year of publication: 2018

Type: Dataset

CC BY 4.0

Abstract

Factors such as the low number of reliable samples and the possibility of contamination complicate the experimental and alignment analysis of DNA from ancient samples (aDNA). Consequently, it is crucial to be able to differentiate modern from ancient DNA sequences. The PTML method, which combines Perturbation Theory (PT) and Machine Learning (ML) algorithms, could perform this task using as input Shannon’s entropy measures (θ<sub>k</sub>) of Sequence-Recurrence Networks (SRNs) to quantify information about DNA sequences. In this work, we carried out the experimental extraction and characterization of a new set of putative 16S aDNA sequences belonging mostly to phyllosphere, rhizosphere, endosphere, and related organisms from Miocene fossil amber. The BLAST alignment algorithm was used to build and analyze the phylogenetic tree. Next, we explored 100,000 pairs of query and template sequences to seek an alignment-free PTML linear model with more than 80% specificity and sensitivity in the training and validation series. We then compared the alignment-free PTML method with alignment-based and non-linear Artificial Neural Network classifiers. Finally, we implemented a public web server named MEIONet (<b>M</b>iocene sequence <b>E</b>stimation with <b>I</b>nformation-perturbation <b>O</b>perators of Networks, http://bio-aims.udc.es/MEIONet.php) to make the tool available.
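
The 80% acceptance criterion mentioned in the abstract can be illustrated with a minimal Python sketch of how sensitivity and specificity are computed from binary predictions. This is not the PTML model or the authors' code: the function names, the label encoding (1 = aDNA, 0 = modern, an assumption made here) and the toy labels are all hypothetical and serve only to show the arithmetic.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (hypothetical): 1 = aDNA (positive), 0 = modern (negative).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def meets_threshold(y_true, y_pred, threshold=0.80):
    """True if both sensitivity and specificity exceed the threshold."""
    sn, sp = sensitivity_specificity(y_true, y_pred)
    return sn > threshold and sp > threshold


# Toy labels (illustrative only, not taken from the dataset):
# 10 positive and 10 negative cases, one error in each class.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 9 + [0] + [0] * 9 + [1]

sn, sp = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
print("passes 80% criterion:", meets_threshold(y_true, y_pred))
```

In practice the same check would be applied separately to the training and validation series, since the abstract requires the criterion to hold in both.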