Study Result

Note

This data class is at a trial use maturity level and may change in future releases. Maturity levels are described in the maturity model.

Computational Definition

A collection of data items from a single study that pertain to a particular subject or experimental unit in the study, along with optional provenance information describing how these data items were generated.

Information Model

Some StudyResult attributes are inherited from Information Entity.

| Field | Flags | Type | Limits | Description |
| --- | --- | --- | --- | --- |
| id | | string | 0..1 | The ‘logical’ identifier of the Entity in the system of record, e.g. a UUID. This ‘id’ is unique within a given system, but may or may not be globally unique outside the system. It is used within a system to reference an object from another. |
| type | | string | 1..1 | The name of the class that is instantiated by a data object representing the Entity. |
| name | | string | 0..1 | A primary name for the entity. |
| description | | string | 0..1 | A free-text description of the Entity. |
| aliases | | string | 0..m | Alternative name(s) for the Entity. |
| extensions | | Extension | 0..m | A list of extensions to the Entity that allow for capture of information not directly supported by elements defined in the model. |
| specifiedBy | | Method \| iriReference | 0..1 | A specification that describes all or part of the process that led to creation of the Information Entity. |
| contributions | | Contribution | 0..m | Specific actions taken by an Agent toward the creation, modification, validation, or deprecation of an Information Entity. |
| reportedIn | | Document \| iriReference | 0..m | A document in which the Information Entity is reported. |
| focus | | Entity \| Mappable Concept \| iriReference | 1..1 | The specific participant, subject or experimental unit in a Study that data included in the StudyResult object is about - e.g. a particular variant in a population allele frequency dataset like ExAC or gnomAD. |
| sourceDataSet | | Data Set | 0..1 | A larger DataSet from which the data included in the StudyResult was taken or derived. |
| ancillaryResults | D | object | 0..1 | An object in which implementers can define custom fields to capture additional results derived from analysis of primary data items captured in standard attributes in the main body of the Study Result. For example, in a Cohort Allele Frequency Study Result, this may be a grpMaxFAF95 calculation or homozygote/heterozygote calls derived from analyzing raw allele count data. |
| qualityMeasures | D | object | 0..1 | An object in which implementers can define custom fields to capture metadata about the quality/provenance of the primary data items captured in standard attributes in the main body of the Study Result, e.g. a sequencing coverage metric in a Cohort Allele Frequency Study Result. |
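
To make the limits listed above concrete, the following is a minimal Python sketch of a container holding these fields. It is not an official VA-Spec implementation: the dataclass layout, the use of `Any` for model classes such as Method or Data Set, and the placeholder IRI are all assumptions made for illustration. Only the field names and their limits are taken from the information model.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class StudyResult:
    """Illustrative container mirroring the fields listed above (not normative)."""

    # Required (1..1) attributes
    focus: Any                 # Entity, Mappable Concept, or an IRI reference
    type: str = "StudyResult"  # name of the instantiated class

    # Optional (0..1) attributes
    id: Optional[str] = None
    name: Optional[str] = None
    description: Optional[str] = None
    specifiedBy: Optional[Any] = None    # Method or IRI reference
    sourceDataSet: Optional[Any] = None  # Data Set
    ancillaryResults: Optional[dict] = None
    qualityMeasures: Optional[dict] = None

    # Repeatable (0..m) attributes
    aliases: list = field(default_factory=list)
    extensions: list = field(default_factory=list)
    contributions: list = field(default_factory=list)
    reportedIn: list = field(default_factory=list)

    def __post_init__(self):
        # Enforce the two 1..1 limits; every other field may be absent.
        if self.focus is None:
            raise ValueError("focus is required (1..1)")
        if not self.type:
            raise ValueError("type is required (1..1)")


# Hypothetical placeholder IRI standing in for a real focus variant.
result = StudyResult(focus="https://example.org/variants/placeholder")
```

Only `focus` has to be supplied here, because `type` defaults to the class name and all remaining fields have lower limits of zero.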


DATA STRUCTURE

In VA-Spec, the Study Result class and its profiles can support the general data structure below.

[Figure: Study Result Data Structure]

Legend: A class-level view of the Study Result-based structures supported in VA-Spec data. Italicized text in each class exemplifies the kind of information it may capture - here in the case of a Cohort Allele Frequency Study Result reporting data from the gnomAD dataset about a particular variant.

In this structure:

  • A Study Result and the data items it holds can be linked to the larger Data Set from which they came, and to a description of the Study Group from which the data was collected.

  • Note that no Proposition object is used here, because Study Results represent more foundational data, and do not assert or assess evidence for possible facts about the domain.

  • As with Statements and Evidence Lines, surrounding classes can be used to describe the provenance of the Study Result and its data items.

A data example illustrating this structure for a Study Result interpreted as evidence for a Variant Pathogenicity Statement can be found here.
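
As a rough illustration of the structure described in the bullets above, the sketch below nests a source Data Set, a Study Group description, and a provenance entry inside one Study Result and prints it as JSON. All identifiers and values are made up. The `cohort` key is a hypothetical placeholder, since the field list above defines no attribute for the Study Group, and the contents of the Contribution object are likewise assumed.

```python
import json

# A Study Result-like record; every value is an illustrative placeholder.
study_result = {
    "type": "StudyResult",
    "id": "example:study-result-1",
    "focus": "example:variant-1",
    # Link to the larger Data Set the data items came from.
    "sourceDataSet": {"type": "DataSet", "name": "gnomAD"},
    # Hypothetical key describing the Study Group the data was collected from.
    "cohort": {"type": "StudyGroup", "name": "example cohort"},
    # Provenance is carried by surrounding classes such as Contribution.
    "contributions": [{"type": "Contribution"}],
}

print(json.dumps(study_result, indent=2))
```

Note that the record carries no Proposition: it reports data about the focus without asserting anything about it.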


IMPLEMENTATION GUIDANCE

1. Study Result Utility

  • StudyResults provide a useful way to capture a subset of items from a study dataset that are used as evidence in generating higher-order knowledge assertions about the entity that is the focus of the study result.

  • For example, consider the comprehensive allele frequency dataset provided by gnomAD, which covers millions of variants. A curator looking to assess the pathogenicity of a particular variant might create a StudyResult object to capture a subset of data related to this focus allele, including its count, frequency, and homozygous frequency, along with metadata concerning the shared provenance or quality of this data. The StudyResult could then be referenced as a piece of evidence used to inform the focus allele's final pathogenicity classification.

  • Study Results are typically used to define subsets of data from larger high-throughput analyses or clinical study data sets. But a StudyResult might also be used to organize the data from a simple, small-scale bench experiment - e.g. a western blot analysis of protein expression, or an in vitro binding assay focused on a single protein. Even such small ‘studies’ can generate multiple data items and metadata, and a StudyResult object can be used to collect all or some of these data points pertinent to a particular focus into an organized structure.
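
The subsetting step described above might look something like the following sketch. The helper `build_study_result`, the record keys, and the numbers are all hypothetical, and representing the selected items as a plain dictionary under `dataItems` is an assumption made for brevity rather than the shape the model prescribes.

```python
def build_study_result(focus_id, dataset_name, record, keep):
    """Collect selected data items about one focus into a StudyResult-like dict."""
    missing = [key for key in keep if key not in record]
    if missing:
        raise KeyError(f"record lacks requested items: {missing}")
    return {
        "type": "StudyResult",
        "focus": focus_id,
        "sourceDataSet": {"type": "DataSet", "name": dataset_name},
        # Keep only the items the curator selected as evidence.
        "dataItems": {key: record[key] for key in keep},
    }


# Made-up values standing in for one variant's row in a large dataset.
record = {
    "alleleCount": 3,
    "alleleNumber": 1000,
    "alleleFrequency": 0.003,
    "homozygotes": 0,
    "filter": "PASS",
}

sr = build_study_result(
    "example:variant-1",
    "gnomAD",
    record,
    keep=["alleleCount", "alleleFrequency", "homozygotes"],
)
```

The resulting object holds only the three selected items together with a pointer back to the source data set, so it can be referenced as a single piece of evidence.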

2. Use of the StudyResult.dataItems attribute

  • The model specifies use of a key-value based DataItem object to capture the meaning and value of each type of data item captured in a given StudyResult. But in practice, profiles for specific StudyResult types may choose to define one or more specializations of the generic dataItems attribute as named attributes. This makes the data more succinct and parsable, and allows specific constraints to be applied and validated for different data items.

  • For example, a CohortAlleleFrequencyStudyResult profile may define a named focusAlleleFrequency attribute that is required, and a named focusAlleleCount attribute that is optional - both of which conceptually specialize the gks-core dataItems property. Under this approach, the core dataItems attribute acts as a placeholder to seed such specializations, but is not used directly in StudyResult profiles.
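
The effect of promoting data items to named attributes can be sketched as below. The class is a simplified stand-in, not the actual CohortAlleleFrequencyStudyResult profile: it keeps only the two attributes named above plus `focus` and `type`, and the specific constraints (a frequency between 0 and 1, a non-negative count) are assumptions chosen to show how per-attribute validation becomes possible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CohortAlleleFrequencyStudyResult:
    """Simplified stand-in for a profile with named data-item attributes."""

    focus: str
    focusAlleleFrequency: float             # required named attribute
    focusAlleleCount: Optional[int] = None  # optional named attribute
    type: str = "CohortAlleleFrequencyStudyResult"

    def __post_init__(self):
        # Assumed constraints, shown only to illustrate attribute-level checks.
        if not 0.0 <= self.focusAlleleFrequency <= 1.0:
            raise ValueError("focusAlleleFrequency must be between 0 and 1")
        if self.focusAlleleCount is not None and self.focusAlleleCount < 0:
            raise ValueError("focusAlleleCount must not be negative")


# Made-up values for a hypothetical focus allele.
caf = CohortAlleleFrequencyStudyResult(
    focus="example:variant-1",
    focusAlleleFrequency=0.003,
    focusAlleleCount=3,
)
```

Compared with a generic key-value list, consumers can read `caf.focusAlleleFrequency` directly, and an out-of-range value is rejected when the object is created.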