What is ClinVar?
ClinVar is a freely accessible, public archive of reports of human variations classified for diseases and drug responses, with supporting evidence. ClinVar thus facilitates access to and communication about the relationships asserted between human variation and observed conditions, and the history of those assertions. ClinVar processes submissions reporting variants found in patient samples, classifications for diseases and drug responses, information about the submitter, and other supporting data. The variants described in submissions are mapped to reference sequences, and reported according to the HGVS standard. ClinVar presents the data on the website for interactive users, and on the FTP site and by API for those wishing to use ClinVar programmatically in daily workflows and other local applications. ClinVar works in collaboration with interested organizations to meet the needs of the medical genetics community as efficiently and effectively as possible.
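As a minimal sketch of programmatic access, the snippet below only builds a search URL and does not send a request. It assumes that the API in question is NCBI E-utilities ESearch with `db=clinvar`, which the text above does not name; the function name `build_clinvar_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities ESearch is the API used; the text above only says "by API".
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_clinvar_search_url(term: str, retmax: int = 20) -> str:
    """Return an ESearch URL that queries the ClinVar database for `term`."""
    params = {"db": "clinvar", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL; fetching it is left to the caller's HTTP client of choice.
    print(build_clinvar_search_url("BRCA1"))
```

Keeping URL construction separate from the network call makes the query easy to inspect and test before it is wired into a workflow.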
ClinVar supports submissions of differing levels of complexity. A submission may be as simple as a representation of a variant, its classification for a disease or drug response, and minimal evidence (sometimes termed a variant-level submission). Or it may be detailed, providing structured data about individuals in whom the variant was observed (in aggregate or case-level) or functional data about the effect of the variant. A major goal is to support re-evaluation of variant classifications, and to enable the ongoing evolution of knowledge regarding variations and associated conditions. ClinVar is an active partner of the ClinGen project, providing data for evaluation and archiving variant classifications by recognized expert panels and providers of practice guidelines. ClinVar archives and versions submissions, which means that when submitters update their records, the previous version is retained for review. Read more about submitting data to ClinVar.
The level of confidence in the accuracy of variation calls and classifications depends in large part on the supporting evidence, so this information, when available, is collected and visible to users. Because the availability of supporting evidence may vary, particularly in regard to retrospective data aggregated from published literature, the archive accepts submissions from multiple groups, and aggregates related information, to reflect transparently both consensus and conflicting assertions of clinical significance. A review status is also assigned to any record, to support communication about the trustworthiness of any classification. Domain experts are encouraged to apply for recognition as an expert panel.
An accession number, with the format SCV000000000.0, is assigned to each submitted record. If there are multiple submitted records about the same variation/condition pair, they are aggregated within ClinVar's data flow and reported as a reference accession with the format RCV000000000.0. Because of this model, one variant will be included in multiple RCV accessions whenever different conditions are reported for that variant. Submitted records for the same variation are also aggregated and reported as an accession with the format VCV000000000.0. This aggregation lets a user review all submitted data for a variant, regardless of the condition for which it was classified.
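The aggregation model described above can be illustrated with a small sketch. It is not ClinVar's actual data flow: the `Submission` type, the `aggregate` function, and the variant and condition keys are hypothetical, and the accession pattern is inferred only from the formats quoted in the text.

```python
import re
from collections import defaultdict
from typing import NamedTuple

# Pattern inferred from the formats in the text: prefix, nine digits, ".", version.
ACCESSION_RE = re.compile(r"^(SCV|RCV|VCV)(\d{9})\.(\d+)$")


class Submission(NamedTuple):
    scv: str        # submitted-record accession, e.g. "SCV000000001.1"
    variation: str  # hypothetical key identifying the variant
    condition: str  # hypothetical key identifying the condition


def aggregate(submissions):
    """Group SCV accessions per variation/condition pair and per variation."""
    by_pair = defaultdict(list)       # one group per would-be RCV
    by_variation = defaultdict(list)  # one group per would-be VCV
    for sub in submissions:
        match = ACCESSION_RE.match(sub.scv)
        if match is None or match.group(1) != "SCV":
            raise ValueError(f"not an SCV accession: {sub.scv!r}")
        by_pair[(sub.variation, sub.condition)].append(sub.scv)
        by_variation[sub.variation].append(sub.scv)
    return dict(by_pair), dict(by_variation)
```

With two submissions for one variant and condition and a third for the same variant under a different condition, `aggregate` yields two pair-level groups and a single variation-level group, mirroring how one variant can appear in several RCV accessions but only one VCV accession.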
ClinVar archives submitted information and adds identifiers and other data that may be available about a variant or condition from other public resources. However, ClinVar neither curates submitted content nor modifies classifications independent of an explicit submission. If you have data that differs from what is currently represented in ClinVar, we encourage you to submit your data and the evidence supporting your classification.
If you are submitting variants that were classified as part of work funded by the NIH, please consult your program officer about expectations for submissions to ClinVar.
References
More information about ClinVar is available in these sources:
- Landrum, M. J., Chitipiralla, S., Brown, G. R., Chen, C., Gu, B., Hart, J., Hoffman, D., Jang, W., Kaur, K., Liu, C., Lyoshin, V., Maddipatla, Z., Maiti, R., Mitchell, J., O'Leary, N., Riley, G. R., Shi, W., Zhou, G., Schneider, V., Maglott, D., Holmes, J.B., Kattman, B. L. ClinVar: improvements to accessing data. Nucleic Acids Res. 2020;48(D1):D835-D844. doi: 10.1093/nar/gkz972. [PubMed PMID:31777943]
- Landrum MJ, Kattman BL. ClinVar at five years: Delivering on the promise. Hum Mutat. 2018 Nov;39(11):1623-1630. doi: 10.1002/humu.23641. [PubMed PMID:30311387]
- Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, Gu B, Hart J, Hoffman D, Jang W, Karapetyan K, Katz K, Liu C, Maddipatla Z, Malheiro A, McDaniel K, Ovetsky M, Riley G, Zhou G, Holmes JB, Kattman BL, Maglott DR. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018 Jan 4;46(D1):D1062-D1067. doi: 10.1093/nar/gkx1153. [PubMed PMID:29165669]
- Landrum MJ, Lee JM, Benson M, Brown G, Chao C, Chitipiralla S, Gu B, Hart J, Hoffman D, Hoover J, Jang W, Katz K, Ovetsky M, Riley G, Sethi A, Tully R, Villamarin-Salomon R, Rubinstein W, Maglott DR. ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res. 2016 Jan 4;44(D1):D862-8. doi: 10.1093/nar/gkv122. [PubMed PMID:26582918]
- Landrum MJ, Lee JM, Riley GR, Jang W, Rubinstein WS, Church DM, Maglott DR. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014 Jan 1;42(1):D980-5. doi: 10.1093/nar/gkt1113. [PubMed PMID:24234437]
- NCBI Handbook. Melissa Landrum, PhD, Jennifer Lee, PhD, George Riley, PhD, Wonhee Jang, PhD, Wendy Rubinstein, MD, PhD, Deanna Church, PhD, and Donna Maglott, PhD. ClinVar. [Bookshelf ID: NBK174587]
Scope
ClinVar accepts variants in any part of the genome, from single nucleotide variants and small insertions/deletions through large copy number variants. Variants must be able to be mapped to the genome. Variants may be classified for Mendelian diseases, cancer, or drug responses.
ClinVar currently includes classifications for variants identified through several methods of data collection, including clinical testing, research, and reports from the literature (literature only). See our documentation on submitting collection method for more details.
ClinVar currently does not include uncurated sets of data from GWAS, although variants that were identified through GWAS and have been individually curated to provide a classification for disease are in scope.
Represents medical phenotypes
ClinVar maps submitted conditions to MedGen, which aggregates the names of medical conditions with a genetic basis from such sources as UMLS, GeneReviews®, MeSH, Mondo, and OMIM®. MedGen also includes phenotypes, or clinical features, from Human Phenotype Ontology (HPO), OMIM®, and other sources.
Represents variations
Human variations are reported as sequence changes relative to an mRNA, genomic, and (if appropriate) protein reference sequence, according to the HGVS standard. The default representation is the ‘c.’ expression together with any protein sequence change. Genomic sequences are represented in RefSeqGene/LRG coordinates, as well as locations on chromosomes (as versioned accessions and per assembly name, such as NCBI36/hg18 and GRCh37/hg19). Novel variations are accessioned in NCBI’s variation databases (dbSNP and dbVar).
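To make the shape of an HGVS expression concrete, the sketch below splits one into its reference sequence, coordinate type, and change description. It is a deliberately loose pattern for illustration, not a validating HGVS parser; the function name `split_hgvs` and the example expression (a made-up accession) are assumptions.

```python
import re

# Loose pattern: "<reference>:<type>.<change>", where the type is one HGVS prefix letter.
HGVS_RE = re.compile(r"^(?P<reference>[^:\s]+):(?P<kind>[cgmnopr])\.(?P<change>.+)$")

# Coordinate-type prefixes defined by the HGVS nomenclature.
KIND_NAMES = {
    "c": "coding DNA",
    "g": "linear genomic",
    "m": "mitochondrial",
    "n": "non-coding DNA",
    "o": "circular genomic",
    "p": "protein",
    "r": "RNA",
}


def split_hgvs(expression: str) -> dict:
    """Split an HGVS-style expression into reference, coordinate type, and change."""
    match = HGVS_RE.match(expression.strip())
    if match is None:
        raise ValueError(f"not an HGVS-style expression: {expression!r}")
    parts = match.groupdict()
    parts["kind_name"] = KIND_NAMES[parts["kind"]]
    return parts


if __name__ == "__main__":
    # Made-up accession and change, used only to show the output shape.
    print(split_hgvs("NM_000000.0:c.100A>G"))
```

A production workflow would normally rely on a dedicated HGVS library for validation and normalization; this sketch only shows how the ‘c.’ and protein forms differ in their prefix.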
Represents the relationships between variants and conditions
ClinVar is designed to support our evolving understanding of the relationship between variants and diseases or drug responses. By aggregating information about variant classifications, ClinVar supports establishment of the clinical validity of human variation.
A ClinVar record contains the following elements:
ClinVar Accession and version
- Submission accession number/version number separated by a decimal (SCV000000000.0) assigned to each submitted record.
- Reference accession number/version separated by a decimal (RCV000000000.0) assigned to sets of submitted records about the same variation/condition pair.
- Variation accession number/version separated by a decimal (VCV000000000.0) assigned to sets of submitted records about the same variation.
Identifiers for each variant or set of variants
- HGVS expressions
- Published allele names
- Database identifiers
Attributes of each condition
- Name
- Database identifiers
Classification of the variant for a disease or drug response
- Review status of the classification
- Submitter of the classification
- Classification - see full documentation on classification
Summary of the evidence for the classification
- Free text summary describing the rationale for the classification
- Number of individuals with the variant
- Clinical features and other data about individuals with the variant
- In vitro, in silico, or in vivo studies
- Description of the test or study
- Mode of inheritance
- Citations, including URLs
Submission information
- Submitter name
- Dates first submitted and updated
- Data added by NCBI computation, e.g. SCV accession
Detailed descriptions of the data elements are available in the ClinVar Data Dictionary.
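The elements listed above could be modeled locally along the following lines. This is a hypothetical in-memory layout for illustration, not ClinVar's schema or export format: all class and field names are assumptions chosen to mirror the list, and the ClinVar Data Dictionary remains the authoritative description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Condition:
    name: str
    database_ids: List[str] = field(default_factory=list)


@dataclass
class Evidence:
    summary: Optional[str] = None                  # free-text rationale
    individuals_with_variant: Optional[int] = None
    clinical_features: List[str] = field(default_factory=list)
    studies: List[str] = field(default_factory=list)  # in vitro, in silico, in vivo
    test_description: Optional[str] = None
    mode_of_inheritance: Optional[str] = None
    citations: List[str] = field(default_factory=list)  # may include URLs


@dataclass
class SubmittedRecord:
    scv_accession: str            # e.g. "SCV000000000.0"
    hgvs_expressions: List[str]
    condition: Condition
    classification: str
    review_status: str
    submitter: str
    evidence: Evidence = field(default_factory=Evidence)
    first_submitted: Optional[str] = None  # dates kept as strings in this sketch
    last_updated: Optional[str] = None
```

Optional fields default to empty values because, as noted above, supporting evidence is collected only when it is available.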
Represents evidence for variations and assertions
Where submitted, evidence supporting the classification of a variant is archived, to allow in-depth review of evidence by users and expert panels.
Representative use cases
Location search
Clinicians, researchers, and other users search a DNA or protein location to see how variants at that location were classified.
Review evidence about a variation
Clinicians and researchers review the evidence for/against a disease asserted to be associated with a variant, allowing determination or reassessment of a variant’s pathogenicity. Any conflict or uncertainty is reported explicitly. ClinVar does not compute conclusions, but only reports conclusions from external data submitters.
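A user reviewing a variant might tally the submitted classifications to see whether they agree. The sketch below is a simplification under a stated assumption: any two distinct labels count as conflicting, which is not necessarily how ClinVar itself determines conflicts. The function name and the example labels are hypothetical, and the code reports counts only, without deriving a conclusion.

```python
from collections import Counter
from typing import Iterable


def summarize_classifications(classifications: Iterable[str]) -> dict:
    """Count submitted classification labels and flag whether they all agree."""
    counts = Counter(label.strip().lower() for label in classifications)
    if not counts:
        return {"status": "no submissions", "counts": {}}
    # Simplifying assumption: more than one distinct label means "conflicting".
    status = "consensus" if len(counts) == 1 else "conflicting"
    return {"status": status, "counts": dict(counts)}
```

For instance, passing three labels of which two are identical and one differs returns a `"conflicting"` status along with the per-label counts, leaving interpretation to the reviewer.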
Curation of assertions for a variation
Experts review the evidence to assign appropriate levels of confidence to the assertions made in regard to a variant or sets of variants and submit expert-reviewed records.
Integration into testers’ workflow
Clinical laboratories integrate the information available from ClinVar into their workflow, both submitting their variant classifications and using the available information to classify variants identified in testing.
Data sharing
The information archived in ClinVar is freely available to users and organizations to ensure the broadest utility to the medical genetics community. To that end, we work with submitters and other archives to ensure that data structures are designed to facilitate data exchange so that data can be shared in both directions with willing organizations.
Attribution is important to identify the source of variant classifications, to facilitate communication and to give due credit to submitters. Each submitter is explicitly acknowledged, with pointers to more detailed submitter contact information to facilitate communication and collaboration within the genetics community. Data sources can be used for queries.
It is a goal of ClinVar to be a cooperative effort so that the archive can represent the broadest range of high quality variation/condition information. It is in the community’s best interest not to duplicate efforts unnecessarily, but rather to integrate publicly where possible.
History
A preliminary view of ClinVar was launched in 2012, with the first full public release in April 2013. The initial dataset included variations from OMIM®, GeneReviews®, some locus-specific databases (LSDB), contributing testing laboratories, and others. ClinVar is an active participant in the ClinGen project, leading to improved content and representation of that content. ClinVar continues to evolve in response to the needs of the clinical genetics community.
We invite your feedback.
ClinVar adapts to meet the needs of the genetics community; we invite comments and responses to help make this resource as effective a tool as possible for all our user communities. Email us at clinvar@ncbi.nlm.nih.gov.