Statement
Note
This data class is at a trial use maturity level and may change in future releases. Maturity levels are described in the maturity-model.
Computational Definition
A claim of purported truth as made by a particular agent, on a particular occasion. Statements may be used to put forth a possible fact (i.e. a ‘Proposition’) as true or false, or to provide a more nuanced assessment of the level of confidence or evidence supporting a particular Proposition.
Information Model
Some Statement attributes are inherited from Information Entity.
| Field | Flags | Type | Limits | Description |
|---|---|---|---|---|
| id | | string | 0..1 | The ‘logical’ identifier of the Entity in the system of record, e.g. a UUID. This ‘id’ is unique within a given system, but may or may not be globally unique outside the system. It is used within a system to reference an object from another. |
| name | | string | 0..1 | A primary name for the entity. |
| description | | string | 0..1 | A free-text description of the Entity. |
| aliases | ⋮ | string | 0..m | Alternative name(s) for the Entity. |
| extensions | ⋮ | | 0..m | A list of extensions to the Entity that allow for capture of information not directly supported by elements defined in the model. |
| specifiedBy | | | 0..1 | A specification that describes all or part of the process that led to creation of the Information Entity. |
| contributions | ↓ | | 0..m | Specific actions taken by an Agent toward the creation, modification, validation, or deprecation of an Information Entity. |
| reportedIn | ⋮ | | 0..m | A document in which the Information Entity is reported. |
| type | | string | 1..1 | MUST be “Statement”. |
| proposition | | | 1..1 | A possible fact, the validity of which is assessed and reported by the Statement. A Statement can put forth the proposition as being true, false, or uncertain, and may provide an assessment of the level of confidence/evidence supporting this claim. |
| direction | | string | 1..1 | A term indicating whether the Statement supports, disputes, or remains neutral w.r.t. the validity of the Proposition it evaluates. |
| strength | | | 0..1 | A term used to report the strength of a Proposition’s assessment in the direction indicated (i.e. how strongly supported or disputed the Proposition is believed to be). Implementers may choose to frame a strength assessment in terms of how confident an agent is that the Proposition is true or false, or in terms of the strength of all evidence they believe supports or disputes it. |
| score | D | number | 0..1 | A quantitative score that indicates the strength of a Proposition’s assessment in the direction indicated (i.e. how strongly supported or disputed the Proposition is believed to be). Depending on its implementation, a score may reflect how confident an agent is that the Proposition is true or false, or the strength of evidence they believe supports or disputes it. Instructions for how to interpret the meaning of a given score may be gleaned from the method or document referenced in the ‘specifiedBy’ attribute. |
| classification | | | 0..1 | A single term or phrase summarizing the outcome of direction and strength assessments of a Statement’s Proposition, in terms of a classification of its subject. |
| hasEvidenceLines | ⋮ | | 0..m | An evidence-based argument that supports or disputes the validity of the proposition that a Statement assesses or puts forth as true. The strength and direction of this argument (whether it supports or disputes the proposition, and how strongly) is based on an interpretation of one or more pieces of information as evidence (i.e. ‘Evidence Items’). |
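As a rough illustration of how the limits listed above might translate into code, the sketch below models a Statement as a Python dataclass. It is not part of the specification: attributes whose type is not shown in the listing are typed loosely as `Any`, and `ALLOWED_DIRECTIONS` is a hypothetical helper holding the three direction values named in the implementation guidance.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical helper: the three direction values named in the guidance.
ALLOWED_DIRECTIONS = {"supports", "disputes", "neutral"}


@dataclass
class Statement:
    # Required attributes (limits 1..1).
    proposition: dict[str, Any]
    direction: str
    type: str = "Statement"
    # Optional single-valued attributes (limits 0..1).
    id: Optional[str] = None
    name: Optional[str] = None
    description: Optional[str] = None
    specifiedBy: Optional[Any] = None
    strength: Optional[Any] = None
    score: Optional[float] = None
    classification: Optional[Any] = None
    # Repeatable attributes (limits 0..m) default to empty lists.
    aliases: list[str] = field(default_factory=list)
    extensions: list[Any] = field(default_factory=list)
    contributions: list[Any] = field(default_factory=list)
    reportedIn: list[Any] = field(default_factory=list)
    hasEvidenceLines: list[Any] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce the two constraints that are stated as fixed values.
        if self.type != "Statement":
            raise ValueError('type MUST be "Statement"')
        if self.direction not in ALLOWED_DIRECTIONS:
            raise ValueError(f"unknown direction: {self.direction!r}")
```

A schema-validation library would likely be a better fit for production use; the dataclass is only meant to make the required, optional, and repeatable attributes easy to see at a glance.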
DATA STRUCTURE
Statements represent assertions or assessments of general knowledge about a variant - e.g. an assertion that ‘HRAS:c.173C>T is pathogenic for Costello Syndrome’, or an assessment that there is presently only moderate evidence supporting this possible fact.
In VA-Spec, the Statement class and its profiles can support the general data structure below.
Statement Data Structure
Legend: A class-level view of the Statement-based structures supported in VA-Spec data. Italicized text in each class exemplifies the kind of information each may capture, here in the case of a Variant Pathogenicity Statement supported by Population Allele Frequency evidence.
In this structure:
A Statement roots a central axis where it is linked to zero or more Evidence Lines representing discrete arguments for or against it.
Each Evidence Line may be linked to zero or more pieces of information (e.g. Study Results) that were used to build its evidence-based argument.
The Proposition contained in the Statement object encapsulates a structured representation of the possible fact that the Statement may assert or assess (e.g. that ‘HRAS:c.173C>T is causal for Costello Syndrome’). Unless otherwise stated, this is the same proposition against which evidence is assessed in supporting Evidence Lines.
Surrounding this central axis are classes that describe the provenance of the central artifacts, including Contributions made to them by Agents, Activities performed in doing so, Methods that specify their creation, and Documents that describe them.
A data example illustrating this structure for a Variant Pathogenicity Statement can be found here.
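The central axis described above can be sketched as a nested record. This is a heavily abbreviated, hypothetical example rather than a conformant VA-Spec message: the `hasEvidenceItems` key, the `type` strings on the nested objects, the `id` value, and the `evidence_items` helper are all assumptions made for illustration.

```python
# Hypothetical, abbreviated record. Only "type", "proposition", "direction"
# and "hasEvidenceLines" come from the Statement field listing; the rest is
# illustrative.
statement = {
    "type": "Statement",
    "proposition": {
        "subject": "HRAS:c.173C>T",
        "predicate": "is causal for",
        "object": "Costello Syndrome",
    },
    "direction": "supports",
    "hasEvidenceLines": [
        {
            "type": "EvidenceLine",
            "direction": "supports",
            "hasEvidenceItems": [{"type": "StudyResult", "id": "sr-1"}],
        },
        # An Evidence Line may be linked to zero pieces of information.
        {"type": "EvidenceLine", "direction": "disputes", "hasEvidenceItems": []},
    ],
}


def evidence_items(stmt: dict) -> list[dict]:
    """Collect every piece of information linked through the Evidence Lines."""
    return [
        item
        for line in stmt.get("hasEvidenceLines", [])
        for item in line.get("hasEvidenceItems", [])
    ]
```

Walking the structure this way mirrors the Statement, Evidence Line, and evidence item layering, and it tolerates Statements that carry no Evidence Lines at all.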
IMPLEMENTATION GUIDANCE
1. Statement and Proposition Semantics
Statements put forth a Proposition that expresses some possible fact about the world, and may provide an assessment of this proposition’s validity (e.g. a level of confidence that it is true, or indicator of the overall strength of evidence supporting it). The semantics of the Proposition are captured in subject, predicate, object, and optional qualifier attributes (SPOQ). An assessment of the Proposition’s validity can be captured using direction, strength, and/or score attributes (DS).
The `direction` attribute is used to indicate whether the Statement’s Proposition is supported by the agent’s assessment (when evidence favors its validity), is disputed by the agent’s assessment (when evidence argues against its validity), or remains neutral (when conflicting or insufficient evidence exists to assert one direction or the other). Values come from an enumerated set of strings defined in the model {‘supports’, ‘disputes’, ‘neutral’}.
The `strength` attribute is used to report the strength of this assessment in the direction indicated. Strength can be framed as a level of confidence that the Proposition is true or false, or as a strength of evidence that supports or disputes it - depending on what values of this attribute are used (e.g. ‘high confidence’, ‘low confidence’, etc. if confidence level is being assessed, or ‘strong evidence’, ‘weak evidence’, etc. if evidence strength is being assessed). Alternatively, data providers can choose values that don’t commit to one or the other if they don’t want to make the distinction (e.g. ‘high’ vs ‘medium’ vs ‘low’).
The `score` attribute serves the same purpose as `strength`, but allows for a quantitative assessment based on a numerical score, and can be used in addition to or as an alternative to the `strength` attribute.
This ‘SPOQ-DS’ Proposition pattern is used to explicitly represent the semantics of the central piece of knowledge reported in any Statement, which is supported by evidence and provenance information captured in other Statement attributes.
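To make the ‘DS’ half of this pattern concrete, here is a minimal sketch that checks a direction value against the enumerated set and renders the assessment as a short phrase. The function name `summarize_assessment` and its output format are hypothetical, and `strength` is simplified to a plain string for the purpose of the example.

```python
from typing import Optional

# The enumerated direction values given in the guidance above.
DIRECTIONS = ("supports", "disputes", "neutral")


def summarize_assessment(
    direction: str,
    strength: Optional[str] = None,
    score: Optional[float] = None,
) -> str:
    """Render the direction/strength/score of a Statement as a short phrase."""
    if direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {DIRECTIONS}")
    parts = [direction]
    # strength and score are both optional and may be used together.
    if strength is not None:
        parts.append(f"strength={strength}")
    if score is not None:
        parts.append(f"score={score}")
    return ", ".join(parts)
```

For instance, `summarize_assessment("supports", strength="strong evidence")` yields `"supports, strength=strong evidence"`, while a bare `summarize_assessment("neutral")` yields `"neutral"`.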
2. Statement ‘Modes of Use’
The model supports two “modes of use” for Statements, which differ in what they say about their Proposition, and can be distinguished by how direction and strength or score attributes are populated.
In “Assertion Mode”, a Statement simply reports its SPOQ Proposition to be true or false (e.g. that “BRCA2 c.8023A>G is pathogenic for Breast Cancer”). The `strength` and `score` attributes are not populated, and `direction` is assumed true/supports if not otherwise indicated. This mode is used by projects reporting conclusive assertions about a domain of discourse, but not providing overall confidence or evidence level assessments.
In “Proposition Assessment Mode”, a Statement describes the overall state of evidence and/or confidence surrounding the SPOQ Proposition, which is not necessarily being asserted as true or false. The `strength` or `score` attributes are populated, which allows Statements to report things like “there is weak evidence supporting the proposition that ‘BRCA2 c.8023A>G is causal for Breast Cancer’”, or “we have high confidence that the proposition ‘PAH:c.1285C>A is causal for Phenylketonuria’ is false”. This mode is used in projects to track the evolving state of support for propositions of interest, as curators actively collect evidence and work toward a conclusive assertion.
For a diagrammed example of each mode of use, see here.
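Since the two modes are distinguished only by which attributes are populated, a consumer could infer the mode from the data itself. The sketch below shows one possible approach; the function names and the mode labels `"assertion"` and `"proposition-assessment"` are hypothetical and not defined by the model.

```python
def mode_of_use(statement: dict) -> str:
    """Guess the mode of use from which assessment attributes are populated."""
    has_assessment = (
        statement.get("strength") is not None
        or statement.get("score") is not None
    )
    return "proposition-assessment" if has_assessment else "assertion"


def effective_direction(statement: dict) -> str:
    """Return the direction, reading an unpopulated value as 'supports'."""
    # Follows the Assertion Mode rule that direction is assumed to be
    # true/supports if not otherwise indicated.
    return statement.get("direction") or "supports"
```

A Statement carrying only a Proposition would be classified as an assertion with an effective direction of `supports`, whereas one carrying `strength` or `score` would be treated as a proposition assessment.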
Note
Many VA Standard Profiles, including the Variant Pathogenicity Statement Profile, contain the direction, strength, and score attributes, and thus could be used to support either Mode of Use. Implementations should choose the mode that best fits their data and use case when generating VA-compliant datasets - leveraging Proposition Assessment Mode if they wish to provide nuanced representations of the state of evidence or confidence surrounding a possible fact.
3. Use of the Proposition.qualifier Attribute
This attribute allows representation of more complex, n-ary statements that may not be accommodated by a simple subject-predicate-object (SPO) triple. For example, if an SPO triple asserts that ‘Variant X’ - predicts sensitivity to - ‘Treatment Y’, a qualifier can be used to indicate that this applies in the context of a particular ‘Disease Z’.
Qualifiers can also add information that quantifies aspects of a Statement’s Proposition - e.g. an SPO triple reporting that ‘Variant X’ - causes - ‘Phenotype Y’ can be quantified with frequency/penetrance information that indicates the percentage of carriers in which the phenotype manifests. Proposition profiles may define more than one qualifier, as needed to capture different types of qualifying information.
The Core model specifies use of a key-value ‘Qualifier’ object to capture the meaning and value of each type of qualifying information relevant for a given type of Proposition. But in practice, profiles for specific Proposition types may choose to define one or more specializations of the generic ‘qualifier’ property as named attributes. This makes the data more succinct and parsable, and allows specific constraints to be applied and validated for different qualifiers.
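The generic key-value approach might look something like the following sketch. The class layout, the `name` and `value` field names, the `qualifiers` list, and the `diseaseContext` key are assumptions chosen for illustration; the Core model's actual Qualifier definition may differ.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Qualifier:
    name: str   # what kind of qualifying information this is (assumed key)
    value: Any  # the qualifying value itself (assumed key)


@dataclass
class Proposition:
    subject: str
    predicate: str
    object: str
    qualifiers: list[Qualifier] = field(default_factory=list)

    def qualifier(self, name: str) -> Any:
        """Return the value of the first qualifier with this name, or None."""
        for q in self.qualifiers:
            if q.name == name:
                return q.value
        return None


# The n-ary example from the text: the SPO triple applies in a disease context.
prop = Proposition(
    subject="Variant X",
    predicate="predicts sensitivity to",
    object="Treatment Y",
    qualifiers=[Qualifier(name="diseaseContext", value="Disease Z")],
)
```

Because every qualifier shares one shape, consumers must look values up by name at runtime, which is part of why named attributes can be easier to parse and validate.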
For example, a VariantPathogenicityProposition profile may define a named `alleleOriginQualifier` attribute that is required, and a named `geneContextQualifier` attribute that is optional - both of which conceptually specialize the Core `qualifier` property. Under this approach, the core `qualifier` acts as a placeholder to seed such specializations, but is not used directly in Proposition profiles.
In practice, the core `qualifier` attribute SHOULD be conceptually extended in Proposition profiles to indicate the specific types of qualifying information being provided (e.g. `diseaseContextQualifier`, or `penetranceQualifier`). The `qualifier` attribute in the core model acts as a placeholder to seed such specializations, but it, or the `Qualifier` class, SHOULD NOT be used directly in a Proposition profile.
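A profile that follows this guidance could be sketched as below, with the qualifiers exposed as named attributes instead of a generic `qualifier` property. The attribute types, the validation rule, and the example value `"germline"` are assumptions; only the attribute names come from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariantPathogenicityProposition:
    subject: str
    predicate: str
    object: str
    # Named specializations of the core qualifier property.
    alleleOriginQualifier: str                  # required by this profile
    geneContextQualifier: Optional[str] = None  # optional in this profile

    def __post_init__(self) -> None:
        # A per-attribute constraint, which named qualifiers make possible.
        if not self.alleleOriginQualifier:
            raise ValueError("alleleOriginQualifier is required")


prop = VariantPathogenicityProposition(
    subject="HRAS:c.173C>T",
    predicate="is causal for",
    object="Costello Syndrome",
    alleleOriginQualifier="germline",  # hypothetical value
)
```

Note that the class deliberately has no generic `qualifier` attribute, in keeping with the SHOULD NOT above.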