Molecular mechanism and history of non-sense to sense evolution of antifreeze glycoprotein gene in northern gadids

Xuan Zhuang, Chun Yang, Katherine R. Murphy, C. H. Christina Cheng

Research output: Contribution to journal (Article)

Abstract

A fundamental question in evolutionary biology is how genetic novelty arises. De novo gene birth is a recently recognized mechanism, but the evolutionary process and function of putative de novo genes remain largely obscure. With a clear life-saving function, the diverse antifreeze proteins of polar fishes are exemplary adaptive innovations and models for investigating new gene evolution. Here, we report clear evidence and a detailed molecular mechanism for the de novo formation of the northern gadid (codfish) antifreeze glycoprotein (AFGP) gene from a minimal noncoding sequence. We constructed genomic DNA libraries for AFGP-bearing and AFGP-lacking species across the gadid phylogeny and performed fine-scale comparative analyses of the AFGP genomic loci and homologs. We identified the noncoding founder region and a nine-nucleotide (9-nt) element therein that supplied the codons for one Thr-Ala-Ala unit from which the extant repetitive AFGP-coding sequence (cds) arose through tandem duplications. The latent signal peptide (SP)-coding exons were fortuitous noncoding DNA sequence immediately upstream of the 9-nt element, which, when spliced, supplied a typical secretory signal. Through a 1-nt frameshift mutation, these two parts formed a single read-through open reading frame (ORF). It became functionalized when a putative translocation event conferred the essential cis promoter for transcriptional initiation. We experimentally proved that all genic components of the extant gadid AFGP originated from entirely nongenic DNA. The gadid AFGP evolutionary process also represents a rare example of the proto-ORF model of de novo gene birth where a fully formed ORF existed before the regulatory element to activate transcription was acquired.

Original language: English (US)
Pages (from-to): 4400-4405
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 116
Issue number: 10
DOI: 10.1073/pnas.1817138116
State: Published - Jan 1 2019

Fingerprint

Antifreeze Proteins
Genes
Open Reading Frames
Parturition
Frameshift Mutation
Genomic Library
Phylogeny
Protein Sorting Signals
Gene Library
Codon
Exons
Fishes
Nucleotides

Keywords

  • Adaptive evolution
  • Codfish AFGP
  • De novo gene
  • Noncoding origin
  • Proto-ORF

ASJC Scopus subject areas

  • General

Cite this

Molecular mechanism and history of non-sense to sense evolution of antifreeze glycoprotein gene in northern gadids. / Zhuang, Xuan; Yang, Chun; Murphy, Katherine R.; Cheng, C. H. Christina.

In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 116, No. 10, 01.01.2019, p. 4400-4405.

@article{ea104d03050e4344b20cbb810762cda0,
title = "Molecular mechanism and history of non-sense to sense evolution of antifreeze glycoprotein gene in northern gadids",
abstract = "A fundamental question in evolutionary biology is how genetic novelty arises. De novo gene birth is a recently recognized mechanism, but the evolutionary process and function of putative de novo genes remain largely obscure. With a clear life-saving function, the diverse antifreeze proteins of polar fishes are exemplary adaptive innovations and models for investigating new gene evolution. Here, we report clear evidence and a detailed molecular mechanism for the de novo formation of the northern gadid (codfish) antifreeze glycoprotein (AFGP) gene from a minimal noncoding sequence. We constructed genomic DNA libraries for AFGP-bearing and AFGP-lacking species across the gadid phylogeny and performed fine-scale comparative analyses of the AFGP genomic loci and homologs. We identified the noncoding founder region and a nine-nucleotide (9-nt) element therein that supplied the codons for one Thr-Ala-Ala unit from which the extant repetitive AFGP-coding sequence (cds) arose through tandem duplications. The latent signal peptide (SP)-coding exons were fortuitous noncoding DNA sequence immediately upstream of the 9-nt element, which, when spliced, supplied a typical secretory signal. Through a 1-nt frameshift mutation, these two parts formed a single read-through open reading frame (ORF). It became functionalized when a putative translocation event conferred the essential cis promoter for transcriptional initiation. We experimentally proved that all genic components of the extant gadid AFGP originated from entirely nongenic DNA. The gadid AFGP evolutionary process also represents a rare example of the proto-ORF model of de novo gene birth where a fully formed ORF existed before the regulatory element to activate transcription was acquired.",
keywords = "Adaptive evolution, Codfish AFGP, De novo gene, Noncoding origin, Proto-ORF",
author = "Zhuang, Xuan and Yang, Chun and Murphy, {Katherine R.} and Cheng, {C. H. Christina}",
year = "2019",
month = "1",
day = "1",
doi = "10.1073/pnas.1817138116",
language = "English (US)",
volume = "116",
pages = "4400--4405",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
publisher = "National Academy of Sciences",
number = "10",

}

TY - JOUR

T1 - Molecular mechanism and history of non-sense to sense evolution of antifreeze glycoprotein gene in northern gadids

AU - Zhuang, Xuan

AU - Yang, Chun

AU - Murphy, Katherine R.

AU - Cheng, C. H. Christina

PY - 2019/1/1

Y1 - 2019/1/1

AB - A fundamental question in evolutionary biology is how genetic novelty arises. De novo gene birth is a recently recognized mechanism, but the evolutionary process and function of putative de novo genes remain largely obscure. With a clear life-saving function, the diverse antifreeze proteins of polar fishes are exemplary adaptive innovations and models for investigating new gene evolution. Here, we report clear evidence and a detailed molecular mechanism for the de novo formation of the northern gadid (codfish) antifreeze glycoprotein (AFGP) gene from a minimal noncoding sequence. We constructed genomic DNA libraries for AFGP-bearing and AFGP-lacking species across the gadid phylogeny and performed fine-scale comparative analyses of the AFGP genomic loci and homologs. We identified the noncoding founder region and a nine-nucleotide (9-nt) element therein that supplied the codons for one Thr-Ala-Ala unit from which the extant repetitive AFGP-coding sequence (cds) arose through tandem duplications. The latent signal peptide (SP)-coding exons were fortuitous noncoding DNA sequence immediately upstream of the 9-nt element, which, when spliced, supplied a typical secretory signal. Through a 1-nt frameshift mutation, these two parts formed a single read-through open reading frame (ORF). It became functionalized when a putative translocation event conferred the essential cis promoter for transcriptional initiation. We experimentally proved that all genic components of the extant gadid AFGP originated from entirely nongenic DNA. The gadid AFGP evolutionary process also represents a rare example of the proto-ORF model of de novo gene birth where a fully formed ORF existed before the regulatory element to activate transcription was acquired.

KW - Adaptive evolution

KW - Codfish AFGP

KW - De novo gene

KW - Noncoding origin

KW - Proto-ORF

UR - http://www.scopus.com/inward/record.url?scp=85062676587&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062676587&partnerID=8YFLogxK

U2 - 10.1073/pnas.1817138116

DO - 10.1073/pnas.1817138116

M3 - Article

C2 - 30765531

AN - SCOPUS:85062676587

VL - 116

SP - 4400

EP - 4405

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 10

ER -