Annotation of the Protein Coding Regions of the Equine Genome

Matthew S Hestand; Theodore S Kalbfleisch; Stephen J Coleman; Zheng Zeng; Jinze Liu; Ludovic Orlando; James N MacLeod

doi:10.1371/journal.pone.0124375

PLoS One. 2015 Jun 24;10(6):e0124375. doi: 10.1371/journal.pone.0124375. eCollection 2015.

Authors

Matthew S Hestand¹, Theodore S Kalbfleisch², Stephen J Coleman¹, Zheng Zeng³, Jinze Liu³, Ludovic Orlando⁴, James N MacLeod¹

Affiliations

¹ Maxwell H. Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Lexington, Kentucky, United States of America.
² Biochemistry and Molecular Biology Department, School of Medicine, University of Louisville, Louisville, Kentucky, United States of America.
³ Department of Computer Science, University of Kentucky, Lexington, Kentucky, United States of America.
⁴ Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark.

Abstract

Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced mRNA from a pool of forty-three different tissues. From these, we derived the structures of 68,594 transcripts. In addition, we identified 301,829 positions with SNPs or small indels within these transcripts relative to EquCab2. Interestingly, 780 variants extend the open reading frame of the transcript and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross-species transcriptional and genomic comparisons.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Animals
Base Sequence
Genome
Horses / genetics*
Molecular Sequence Annotation*
Open Reading Frames / genetics*
RNA, Messenger / genetics*

Substances

RNA, Messenger
