Unraveling complex local genomic rearrangements from long-read data

Zachary D. Stephens, Ravishankar K Iyer, Chen Wang, Jean Pierre A. Kocher

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we present a graph search approach for identifying arbitrarily complex structural genomic variation. Our method leverages the ability of long reads (e.g., from Pacific Biosciences platforms) to span multiple breakpoints of complicated local rearrangements, allowing us to resolve small-scale complexities that may be overlooked by other tools. We applied our method to a subset of NA12878 germline events using two long-read datasets and demonstrate, with a concordance rate of 88.4% between the two sets, an increased ability to denote complex events over baseline calls from short-read data. In a majority of the regions analyzed, we detected small complexities that flank the breakpoints of larger events, including small insertions, inversions, and duplicated sequences. These patterns of complexity match known mechanisms associated with DNA replication and structural variant formation, and showcase the ability of our approach to efficiently unravel such events. Our method automatically classifies complex structural variant calls as a combination of nested or adjacent reference transformations, allowing users to identify specific structure types of interest. Additionally, an output report is generated for each event with interactive visual representations of the rearrangement.
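To make the idea of a single long read spanning several breakpoints concrete, the following is a minimal illustrative sketch and not the authors' graph search. It assumes a hypothetical `Segment` record for each aligned piece of one read, a hypothetical 50 bp size threshold, and a simple rule that labels each junction between consecutive segments by comparing the gap on the read with the gap on the reference.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a long read (hypothetical record, not the paper's format)."""
    read_start: int
    read_end: int
    ref_start: int
    ref_end: int
    strand: str  # "+" or "-"


def classify_junction(a: Segment, b: Segment, min_size: int = 50) -> str:
    """Label the junction between two segments that are consecutive on the read.

    Assumption: a strand switch is reported as an inversion breakpoint; for
    same-strand pairs the reference gap is compared with the read gap.
    """
    if a.strand != b.strand:
        return "inversion"
    read_gap = b.read_start - a.read_end
    if a.strand == "+":
        ref_gap = b.ref_start - a.ref_end
    else:
        # On the minus strand, read order runs opposite to reference order.
        ref_gap = a.ref_start - b.ref_end
    if ref_gap <= -min_size:
        return "duplication"  # reference intervals overlap: sequence used twice
    if ref_gap - read_gap >= min_size:
        return "deletion"     # reference bases skipped by the read
    if read_gap - ref_gap >= min_size:
        return "insertion"    # read bases with no reference counterpart
    return "collinear"


def describe_read(segments: List[Segment], min_size: int = 50) -> List[str]:
    """Return the non-collinear junction labels of one read, in read order."""
    ordered = sorted(segments, key=lambda s: s.read_start)
    labels = [classify_junction(a, b, min_size) for a, b in zip(ordered, ordered[1:])]
    return [label for label in labels if label != "collinear"]


# Toy read (made-up coordinates): a large deletion followed by a small
# insertion and a small duplication near its right breakpoint.
toy_read = [
    Segment(0, 1000, 10000, 11000, "+"),
    Segment(1000, 2000, 16000, 17000, "+"),
    Segment(2200, 3000, 17000, 17800, "+"),
    Segment(3000, 3500, 17400, 17900, "+"),
]
print(describe_read(toy_read))  # ['deletion', 'insertion', 'duplication']
```

Because all three junctions sit on one read, the small events are reported together with the larger deletion they flank. A short read would typically cover at most one of these junctions.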

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
Editors: Illhoi Yoo, Jane Huiru Zheng, Yang Gong, Xiaohua Tony Hu, Chi-Ren Shyu, Yana Bromberg, Jean Gao, Dmitry Korkin
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 181-187
Number of pages: 7
ISBN (Electronic): 9781509030491
DOI: https://doi.org/10.1109/BIBM.2017.8217647
State: Published - Dec 15 2017
Event: 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017 - Kansas City, United States
Duration: Nov 13 2017 to Nov 16 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
Volume: 2017-January

Other

Other: 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
Country: United States
City: Kansas City
Period: 11/13/17 to 11/16/17

Fingerprint

DNA
Genomic Structural Variation
Sequence Inversion
DNA Replication
Datasets

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics

Cite this

Stephens, Z. D., Iyer, R. K., Wang, C., & Kocher, J. P. A. (2017). Unraveling complex local genomic rearrangements from long-read data. In I. Yoo, J. H. Zheng, Y. Gong, X. T. Hu, C-R. Shyu, Y. Bromberg, J. Gao, ... D. Korkin (Eds.), Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017 (pp. 181-187). (Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017; Vol. 2017-January). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BIBM.2017.8217647

Unraveling complex local genomic rearrangements from long-read data. / Stephens, Zachary D.; Iyer, Ravishankar K; Wang, Chen; Kocher, Jean Pierre A.

Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017. ed. / Illhoi Yoo; Jane Huiru Zheng; Yang Gong; Xiaohua Tony Hu; Chi-Ren Shyu; Yana Bromberg; Jean Gao; Dmitry Korkin. Institute of Electrical and Electronics Engineers Inc., 2017. p. 181-187 (Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017; Vol. 2017-January).

Stephens, ZD, Iyer, RK, Wang, C & Kocher, JPA 2017, Unraveling complex local genomic rearrangements from long-read data. in I Yoo, JH Zheng, Y Gong, XT Hu, C-R Shyu, Y Bromberg, J Gao & D Korkin (eds), Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017. Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017, vol. 2017-January, Institute of Electrical and Electronics Engineers Inc., pp. 181-187, 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017, Kansas City, United States, 11/13/17. https://doi.org/10.1109/BIBM.2017.8217647
Stephens ZD, Iyer RK, Wang C, Kocher JPA. Unraveling complex local genomic rearrangements from long-read data. In Yoo I, Zheng JH, Gong Y, Hu XT, Shyu C-R, Bromberg Y, Gao J, Korkin D, editors, Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017. Institute of Electrical and Electronics Engineers Inc. 2017. p. 181-187. (Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017). https://doi.org/10.1109/BIBM.2017.8217647
Stephens, Zachary D. ; Iyer, Ravishankar K ; Wang, Chen ; Kocher, Jean Pierre A. / Unraveling complex local genomic rearrangements from long-read data. Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017. editor / Illhoi Yoo ; Jane Huiru Zheng ; Yang Gong ; Xiaohua Tony Hu ; Chi-Ren Shyu ; Yana Bromberg ; Jean Gao ; Dmitry Korkin. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 181-187 (Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017).
@inproceedings{5cbc2008cd0547029f4ec4ea9f0a8b91,
title = "Unraveling complex local genomic rearrangements from long-read data",
abstract = "In this paper, we present a graph search approach for identifying arbitrarily complex structural genomic variation. Our method leverages the ability of long reads (e.g. from Pacific Biosciences platforms) to span multiple breakpoints of complicated local rearrangements, allowing us to resolve small-scale complexities that may be overlooked by other tools. We applied our method to a subset of NA12878 germline events using two long read datasets and demonstrate, with a concordance rate of 88.4{\%} between the two sets, an increased ability to denote complex events over baseline calls from short read data. In a majority of the regions analyzed we detected small complexities that flank the breakpoints of larger events, including small insertions, inversions, and duplicated sequences. These patterns of complexity match known mechanisms associated with DNA replication and structural variant formation, and showcase the ability of our approach to efficiently unravel such events. Our method automatically classifies complex structural variant calls as a combination of nested or adjacent reference transformations, allowing users to identify specific structure types of interest. Additionally, an output report is generated for each event with interactive visual representations of the rearrangement.",
author = "Stephens, {Zachary D.} and Iyer, {Ravishankar K} and Chen Wang and Kocher, {Jean Pierre A.}",
year = "2017",
month = "12",
day = "15",
doi = "10.1109/BIBM.2017.8217647",
language = "English (US)",
series = "Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "181--187",
editor = "Illhoi Yoo and Zheng, {Jane Huiru} and Yang Gong and Hu, {Xiaohua Tony} and Chi-Ren Shyu and Yana Bromberg and Jean Gao and Dmitry Korkin",
booktitle = "Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017",
address = "United States",

}

TY - GEN

T1 - Unraveling complex local genomic rearrangements from long-read data

AU - Stephens, Zachary D.

AU - Iyer, Ravishankar K

AU - Wang, Chen

AU - Kocher, Jean Pierre A.

PY - 2017/12/15

Y1 - 2017/12/15

AB - In this paper, we present a graph search approach for identifying arbitrarily complex structural genomic variation. Our method leverages the ability of long reads (e.g. from Pacific Biosciences platforms) to span multiple breakpoints of complicated local rearrangements, allowing us to resolve small-scale complexities that may be overlooked by other tools. We applied our method to a subset of NA12878 germline events using two long read datasets and demonstrate, with a concordance rate of 88.4% between the two sets, an increased ability to denote complex events over baseline calls from short read data. In a majority of the regions analyzed we detected small complexities that flank the breakpoints of larger events, including small insertions, inversions, and duplicated sequences. These patterns of complexity match known mechanisms associated with DNA replication and structural variant formation, and showcase the ability of our approach to efficiently unravel such events. Our method automatically classifies complex structural variant calls as a combination of nested or adjacent reference transformations, allowing users to identify specific structure types of interest. Additionally, an output report is generated for each event with interactive visual representations of the rearrangement.

UR - http://www.scopus.com/inward/record.url?scp=85046277710&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85046277710&partnerID=8YFLogxK

U2 - 10.1109/BIBM.2017.8217647

DO - 10.1109/BIBM.2017.8217647

M3 - Conference contribution

AN - SCOPUS:85046277710

T3 - Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017

SP - 181

EP - 187

BT - Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017

A2 - Yoo, Illhoi

A2 - Zheng, Jane Huiru

A2 - Gong, Yang

A2 - Hu, Xiaohua Tony

A2 - Shyu, Chi-Ren

A2 - Bromberg, Yana

A2 - Gao, Jean

A2 - Korkin, Dmitry

PB - Institute of Electrical and Electronics Engineers Inc.

ER -