ACMG Variant Pathogenicity Statement Example (with Evidence)

Description:

The data below builds on the simple ClinVar-GKS example described here, embellishing its base ClinVar record with additional evidence to demonstrate richer structures the Variant Pathogenicity Statement (ACMG 2015) profile can support.

Specifically, it stitches together several simpler Statement, Study Result, and Evidence Line data examples from the test fixtures directory, to reveal how these objects can be combined to build the rich evidence and provenance structure below.

../_images/variant-pathogenicity-statement-with-evidence.png

High Level Structure of the Data Example

Legend: A root Pathogenicity Statement is supported by Evidence Lines based on a Cohort Allele Frequency Study Result from gnomAD, and a Functional Impact Statement from MaveDB, which itself is supported by a Functional Impact Study Result. Boxes represent objects comprising the central axis of the data, with italicized text indicating what each object reports to be true.

Such structures can represent in full detail how evidence is interpreted to build support for higher-order assertions of variant knowledge. Here, for example, functional data from a study result supports a study-specific conclusion about the functional impact of a variant. That conclusion is interpreted as ‘strong’ evidence that ‘supports’ the variant’s possible pathogenicity, and is assessed as one argument supporting an ACMG-based pathogenicity classification of the variant.
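
The nesting described above can be sketched in a few lines of Python. This is only an illustration: the dictionary is a heavily trimmed, hand-written copy of the functional-impact branch of the data example, and the helper name outline is hypothetical rather than part of any GKS tooling.

```python
# Trimmed, hand-written copy of one branch of the example (most fields omitted).
statement = {
    "id": "ex:Statement001",
    "type": "Statement",
    "hasEvidenceLines": [{
        "id": "ex:EvidenceLine002",
        "type": "EvidenceLine",
        "hasEvidenceItems": [{
            "id": "ex:Statement002",
            "type": "Statement",
            "hasEvidenceLines": [{
                "id": "EvidenceLine003",
                "type": "EvidenceLine",
                "hasEvidenceItems": [{
                    "id": "ex:StudyResult002",
                    "type": "ExperimentalVariantFunctionalImpactStudyResult",
                }],
            }],
        }],
    }],
}


def outline(node, depth=0):
    """Return one indented line per object, root Statement first."""
    lines = ["  " * depth + f"{node['type']} {node['id']}"]
    # A Statement holds Evidence Lines; an Evidence Line holds evidence items.
    children = node.get("hasEvidenceLines", []) + node.get("hasEvidenceItems", [])
    for child in children:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(statement)))
```

Reading the printed outline from the bottom up follows the direction of support: Study Result, Evidence Line, Functional Impact Statement, Evidence Line, Pathogenicity Statement.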

A few additional notes about this example:

  • Comments in the YAML are provided to help readers better understand the structure, semantics, and utility of the data in the example.

  • Some identifiers not present in the source test fixture data were created for purposes of identifying and cross-referencing objects in this aggregate example (these are all prefixed with the string ‘ex:’).

  • Note that the variant subject of each Statement and Study Result object is reported as the same generic variation for simplicity (ex:Variant001). In reality these objects may describe subtly different variants that all map to each other in some way (e.g. a protein-level variant in the Functional Impact objects, a genomic-level variant in the Allele Frequency objects, and a Categorical Variant that covers both of these contextual variants in the Pathogenicity Statement and its direct Evidence Lines). Nuances around how the variant subjects of Statements relate to those described by supporting evidence are a separate and complex topic addressed here.

  • The example omits full representations of these VRS and CatVRS Variation objects, as these are large structures that are the remit of other GKS Specifications.
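
To illustrate the cross-referencing role of the ‘ex:’ identifiers mentioned above, the following sketch gathers every id in a trimmed copy of the data and checks that each Evidence Line's targetProposition points at an id that actually exists. The helper names collect_ids and dangling_targets are hypothetical, and the dictionary is a hand-written excerpt, not the full example.

```python
def collect_ids(node, found=None):
    """Recursively gather every `id` value in a nested dict/list structure."""
    found = [] if found is None else found
    if isinstance(node, dict):
        if "id" in node:
            found.append(node["id"])
        for value in node.values():
            collect_ids(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, found)
    return found


def dangling_targets(node, known_ids):
    """Return targetProposition references that match no known id."""
    missing = []
    if isinstance(node, dict):
        target = node.get("targetProposition")
        if isinstance(target, str) and target not in known_ids:
            missing.append(target)
        for value in node.values():
            missing.extend(dangling_targets(value, known_ids))
    elif isinstance(node, list):
        for item in node:
            missing.extend(dangling_targets(item, known_ids))
    return missing


# Hand-written excerpt of the example; most fields are omitted.
statement = {
    "id": "ex:Statement001",
    "proposition": {"id": "ex:Proposition001", "subjectVariant": "ex:Variant001"},
    "hasEvidenceLines": [
        {"id": "ex:EvidenceLine001", "targetProposition": "ex:Proposition001"},
        {"id": "ex:EvidenceLine002", "targetProposition": "ex:Proposition001"},
    ],
}

ids = collect_ids(statement)
example_ids = [i for i in ids if i.startswith("ex:")]
print(example_ids)
print(dangling_targets(statement, set(ids)))  # empty list: every target resolves
```

The check deliberately ignores subjectVariant references such as ex:Variant001, since the example does not include the Variation objects they point to.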

Data:

Note

We recommend opening this example side-by-side with the figure above, and tracking how the data reflects the diagrammed structure and semantics.

ex.Statement001:
  id: ex:Statement001
  type: Statement
  proposition:
    id: ex:Proposition001
    type: VariantPathogenicityProposition
    subjectVariant: ex:Variant001
    predicate: isCausalFor
    objectCondition:
      id: clinvar.trait/939
      conceptType: Disease
      name: Autosomal dominant nonsyndromic hearing loss 2A
      primaryCoding:
        code: C2677637
        system: https://www.ncbi.nlm.nih.gov/medgen/
        iris:
          - http://identifiers.org/medgen/C2677637
    penetranceQualifier:
      primaryCoding:
        code: high
        system: ga4gh-gks-term:pathogenicity-penetrance-qualifier
      name: high
  direction: supports
  strength:
    primaryCoding:
      code: definitive
      system: ACMG Guidelines, 2015
  classification:
    primaryCoding:
      code: pathogenic
      system: ACMG Guidelines, 2015
  contributions:
    - type: Contribution
      contributor:
        id: clinvar.submitter/500139
        type: Agent
        name: ClinVar Staff, National Center for Biotechnology Information (NCBI)
      activityType:
        name: evaluated
        mappings:
          - coding:
              code: cg000011
              system: https://dataexchange.clinicalgenome.org/codes/
            relation: exactMatch
      date: '2015-08-20'
    - type: Contribution
      contributor:
        id: clinvar.submitter/500139
        type: Agent
        name: ClinVar Staff, National Center for Biotechnology Information (NCBI)
      activityType:
        name: submitted
        mappings:
          - coding:
              code: cg000010
              system: https://dataexchange.clinicalgenome.org/codes/
            relation: exactMatch
      date: '2018-06-12'
  specifiedBy:
    type: Method
    name: ClinGen Hearing Loss Expert Panel Specifications to the ACMG/AMP Variant Interpretation Guidelines
    reportedIn:
      type: Document
      urls:
        - https://clinicalgenome.org/docs/clingen-hearing-loss-expert-panel-specifications-to-the-acmg-amp-variant-interpretation-guidelines/
  hasEvidenceLines:
  - id: ex:EvidenceLine001
    type: EvidenceLine
    targetProposition: ex:Proposition001
    hasEvidenceItems:
    - id: ex:StudyResult001
      type: CohortAlleleFrequencyStudyResult
      name: Overall Cohort Allele Frequency for 1-40819444_40819446-del
      focusAllele: ex:Variant001
      focusAlleleFrequency: 0
      focusAlleleCount: 0
      locusAlleleCount: 34086
      sourceDataSet:
        id: gnomad4.1.0
        type: DataSet
        name: gnomAD v4.1.0
        version: 4.1.0
      cohort:
        id: ALL
        name: Overall
        type: StudyGroup
      specifiedBy:
        type: Method
        name: gnomAD methods
        reportedIn:
          type: Document
          name: gnomAD help documentation
          urls:
            - "https://gnomad.broadinstitute.org/help"
    directionOfEvidenceProvided: supports
    strengthOfEvidenceProvided:
      primaryCoding:
        code: moderate
        system: ACMG Guidelines, 2015
    evidenceOutcome:
      primaryCoding:
        code: PM2_moderate
        system: ACMG Guidelines, 2015
      name: ACMG 2015 PM2 Moderate Criterion Met
    specifiedBy:
      type: Method
      methodType: PM2
      name: ClinGen Hearing Loss Expert Panel Specifications to the ACMG/AMP Variant Interpretation Guidelines
      reportedIn:
        type: Document
        urls:
          - https://clinicalgenome.org/docs/clingen-hearing-loss-expert-panel-specifications-to-the-acmg-amp-variant-interpretation-guidelines/
    contributions:
      - type: Contribution
        contributor:
          id: curator001
          type: Agent
        activityType:
          name: evidence evaluation
        date: '2018-03-11'
  - id: ex:EvidenceLine002
    type: EvidenceLine
    targetProposition: ex:Proposition001
    hasEvidenceItems:
      - id: ex:Statement002
        type: Statement
        proposition:
          type: ExperimentalVariantFunctionalImpactProposition
          subjectVariant: ex:Variant001
          predicate: impactsFunctionOf
          objectSequenceFeature:
            id: clinvar-gene:9132
            conceptType: Gene
            primaryCoding:
              code: ncbigene:9132
              system: https://identifiers.org/ncbigene
              iris:
                - https://identifiers.org/ncbigene:9132
            name: KCNQ4
          experimentalContextQualifier:
            title: KCNQ4 VAMP Seq Expt 001
            description: Multiplex assessment of KCNQ4 protein variant abundance by massively parallel sequencing
            phenotypicAssay: flow cytometry
            modelSystem: immortalized human cells
            variantLibrarySystem: oligo-directed mutagenic PCR
            profilingStrategy: barcode sequencing
            sequencingReadType: single-segment (short read)
        direction: supports
        classification:
          primaryCoding:
            code: abnormal
            system: ga4gh-gks-term:experimental-var-func-impact-classification
        specifiedBy:
          type: Method
          methodType:
            name: variant interpretation guideline
          reportedIn:
            type: Document
            pmid: 29785012
        hasEvidenceLines:
        - id: EvidenceLine003
          type: EvidenceLine
          directionOfEvidenceProvided: supports
          specifiedBy:
            type: Method
            name: MAVE bayesian threshold probability method 001
            reportedIn:
              type: Document
              urls:
                - "https://mavedb.org/score-sets/urn:mavedb:00000013-a-1"
          hasEvidenceItems:
            - id: ex:StudyResult002
              type: ExperimentalVariantFunctionalImpactStudyResult
              focusVariant: ex:Variant001
              functionalImpactScore: 1.29395467005388
              specifiedBy:
                type: Method
                methodType:
                  name: Experimental protocol
                reportedIn:
                  type: Document
                  pmid: 29785012
              sourceDataSet:
                type: DataSet
                name: variant effect data set
                license:
                  primaryCoding:
                    code: CC0
                    system: https://spdx.org/licenses/
                    iris:
                      - https://spdx.org/licenses/CC0-1.0.html
                reportedIn:
                  type: Document
                  urls:
                    - "https://mavedb.org/score-sets/urn:mavedb:00000013-a-1"
    directionOfEvidenceProvided: supports
    strengthOfEvidenceProvided:
      primaryCoding:
        code: strong
        system: ACMG Guidelines, 2015
    evidenceOutcome:
      primaryCoding:
        code: PS3_strong
        system: ACMG Guidelines, 2015
      name: ACMG 2015 PS3 Strong Criterion Met
    specifiedBy:
      type: Method
      methodType: PS3
      name: ClinGen Hearing Loss Expert Panel Specifications to the ACMG/AMP Variant Interpretation Guidelines
      reportedIn:
        type: Document
        urls:
          - https://clinicalgenome.org/docs/clingen-hearing-loss-expert-panel-specifications-to-the-acmg-amp-variant-interpretation-guidelines/
    contributions:
      - type: Contribution
        contributor:
          id: curator002
          type: Agent
        activityType:
          name: evidence evaluation
        date: '2018-04-03'
  extensions:
  - name: clinvarMethodCategory
    value: literature only
  - name: clinvarReviewStatus
    value: no assertion criteria provided
  - name: clinvarSubmittedClassification
    value: Pathogenic
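
As a reading aid for the data above, the sketch below pulls out the criterion code, direction, and strength carried by each Evidence Line directly attached to the root Statement. It assumes the data has already been loaded into Python dictionaries; the dictionary here is a hand-written excerpt of the relevant fields, and the function name criteria_summary is hypothetical.

```python
def criteria_summary(statement):
    """List (criterion code, direction, strength) for each direct Evidence Line."""
    rows = []
    for line in statement.get("hasEvidenceLines", []):
        rows.append((
            line["evidenceOutcome"]["primaryCoding"]["code"],
            line["directionOfEvidenceProvided"],
            line["strengthOfEvidenceProvided"]["primaryCoding"]["code"],
        ))
    return rows


# Hand-written excerpt of ex:Statement001; only the fields used above are kept.
statement = {
    "id": "ex:Statement001",
    "hasEvidenceLines": [
        {
            "id": "ex:EvidenceLine001",
            "directionOfEvidenceProvided": "supports",
            "strengthOfEvidenceProvided": {"primaryCoding": {"code": "moderate"}},
            "evidenceOutcome": {"primaryCoding": {"code": "PM2_moderate"}},
        },
        {
            "id": "ex:EvidenceLine002",
            "directionOfEvidenceProvided": "supports",
            "strengthOfEvidenceProvided": {"primaryCoding": {"code": "strong"}},
            "evidenceOutcome": {"primaryCoding": {"code": "PS3_strong"}},
        },
    ],
}

for code, direction, strength in criteria_summary(statement):
    print(f"{code}: {direction} ({strength})")
```

Only the root Statement's own Evidence Lines are listed. The nested EvidenceLine003 supports the Functional Impact Statement rather than the pathogenicity proposition, so it is intentionally left out.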

Detailed Diagram:

The diagram shows a subset of data from the full YAML example. It provides a more detailed data structure overview that highlights encapsulation of Propositions in Statements and Evidence Lines and the use of the same set of Core Model classes (Method, Document, Contribution, Agent) to capture provenance information about all primary knowledge artifacts.

It also highlights the kind of schema that specifies each object in the data, illustrating how Core Model Classes, Base Profiles, and Community Profiles that rely on different authoring mechanisms are used together in a structured data representation.

../_images/variant-pathogenicity-statement-with-evidence-2.png

Detailed Data Example

Legend: Diagrammatic representation of a subset of data in the YAML example above. Styling conventions indicate the type of model that specifies each object in the example (Core Class, Base Profile, Community Profile). To fit the data into this form and make it human readable, syntactic shortcuts were taken to simplify values normally wrapped in complex data structures like MappableConcepts and Codings.

A key thing to note in the example is that, because Base Profiles are defined as formal subclasses, these objects have a specific type that reflects this (e.g. CohortAlleleFrequencyStudyResult). But because Community Profiles are defined using schema composition, the formal type of these objects is that of the Core Model class on which they are built (e.g. Statement, EvidenceLine).
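
The practical consequence of this typing difference can be sketched as follows. The code collects every declared type in a trimmed, hand-written excerpt of the example and separates the Core Model class names mentioned on this page from the more specific types. The CORE_TYPES set and the helper declared_types are assumptions made for this illustration, not an official list or API.

```python
# Core Model class names mentioned on this page (an illustrative, partial set).
CORE_TYPES = {"Statement", "EvidenceLine", "Method", "Document", "Contribution", "Agent"}


def declared_types(node, found=None):
    """Recursively collect every string `type` value in nested dicts/lists."""
    found = set() if found is None else found
    if isinstance(node, dict):
        if isinstance(node.get("type"), str):
            found.add(node["type"])
        for value in node.values():
            declared_types(value, found)
    elif isinstance(node, list):
        for item in node:
            declared_types(item, found)
    return found


# Hand-written excerpt of the example; most fields are omitted.
statement = {
    "type": "Statement",
    "proposition": {"type": "VariantPathogenicityProposition"},
    "specifiedBy": {"type": "Method", "reportedIn": {"type": "Document"}},
    "hasEvidenceLines": [{
        "type": "EvidenceLine",
        "hasEvidenceItems": [{"type": "CohortAlleleFrequencyStudyResult"}],
    }],
}

types = declared_types(statement)
print(sorted(types & CORE_TYPES))   # objects typed as Core Model classes
print(sorted(types - CORE_TYPES))   # objects carrying a more specific type
```

Because the pathogenicity Statement and its Evidence Lines report only the core types Statement and EvidenceLine, a consumer cannot identify the Community Profile from the type field alone and would need other cues, such as the type of the enclosed proposition.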