1. ClinGen Allele Registry links information about genetic variants
- Author
Sharon E. Plon, Lillian Ashmore, Aleksandar Milosavljevic, Peter B. McGarvey, Jimmy Zhen, Selina S. Dwight, Chris Bizon, Bradford C. Powell, Neethu Shah, Robert R. Freimuth, Clinical Genome (ClinGen) Resource, Piotr Pawliczek, Tristan Nelson, Matthew Wright, Ronak Y. Patel, Melissa J. Landrum, Sameer Paithankar, Andrew R. Jackson, Larry Babb, and Natasha T. Strande
- Subjects
Special Issue Articles; 0301 basic medicine; Representational state transfer; Informatics; HGVS representation; dbSNP; Hypertext Transfer Protocol; computer.internet_protocol; variant centric resources; Biology; 03 medical and health sciences; Databases, Genetic; Genetics; Humans; Registries; variant identifiers; Alleles; Genetics (clinical); Information retrieval; Application programming interface; Genetic Variation; linked data; Linked data; Object (computer science); Identifier; 030104 developmental biology; pathogenicity of genetic variants; Canonicalization; computer; Software
- Abstract
Effective exchange of information about genetic variants is currently hampered by the lack of readily available globally unique variant identifiers that would enable aggregation of information from different sources. The ClinGen Allele Registry addresses this problem by providing (1) globally unique “canonical” variant identifiers (CAids) on demand, either individually or in large batches; (2) access to variant‐identifying information in a searchable Registry; (3) links to allele‐related records in many commonly used databases; and (4) services for adding links to information about registered variants in external sources. A core element of the Registry is a canonicalization service, implemented using an in‐memory sequence alignment‐based index, which groups variant identifiers denoting the same nucleotide variant and assigns unique and dereferenceable CAids. More than 650 million distinct variants are currently registered, including those from gnomAD, ExAC, dbSNP, and ClinVar, as well as a small number of variants registered by Registry users. The Registry is accessible both via a web interface and programmatically via well‐documented Hypertext Transfer Protocol (HTTP) Representational State Transfer Application Programming Interfaces (REST‐APIs). For programmatic interoperability, the Registry content is accessible in the JavaScript Object Notation for Linked Data (JSON‐LD) format. We present several use cases and demonstrate how the linked information may provide raw material for reasoning about a variant's pathogenicity.
- Published
- 2018
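The canonicalization idea in the abstract (grouping different descriptions of the same nucleotide variant under one identifier) can be illustrated with a minimal toy sketch. This is not the Registry's implementation: the abstract describes a sequence alignment‐based index, whereas the sketch below only trims shared flanking bases within a single reference sequence. The names `trim`, `ToyRegistry`, and the `TOY` identifier prefix are hypothetical and chosen for illustration only.

```python
from typing import Dict, Tuple

# (sequence name, 1-based position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def trim(variant: Variant) -> Variant:
    """Strip bases shared by both alleles so padded descriptions compare equal.

    Simplifying assumption: no left-shifting of indels in repeats and no
    mapping between different reference sequences.
    """
    seq, pos, ref, alt = variant
    # Drop shared trailing bases.
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Drop shared leading bases, moving the position forward.
    while ref and alt and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return (seq, pos, ref, alt)


class ToyRegistry:
    """In-memory index that maps each trimmed variant to one identifier."""

    def __init__(self) -> None:
        self._index: Dict[Variant, str] = {}

    def register(self, variant: Variant) -> str:
        key = trim(variant)
        if key not in self._index:
            # Hypothetical identifier scheme, not the real CAid format.
            self._index[key] = f"TOY{len(self._index) + 1:06d}"
        return self._index[key]


registry = ToyRegistry()
a = registry.register(("chr1", 100, "A", "G"))    # plain substitution
b = registry.register(("chr1", 99, "CA", "CG"))   # same change, padded with one base
c = registry.register(("chr1", 100, "A", "T"))    # a different substitution
print(a, b, c)
```

In this sketch the first two descriptions resolve to the same identifier, and the third receives a new one, which is the behaviour the abstract attributes to the canonicalization service at a much larger scale.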