Genetic Descriptor
Support for genetic descriptors was developed as a BIDS Extension Proposal. Please see Citing BIDS on how to appropriately credit this extension when referring to it in the context of the academic literature.
Genetic data are typically stored in dedicated repositories, separate from imaging data. A genetic descriptor links a BIDS dataset to associated genetic data, potentially in a separate repository, with details of where to find the genetic data and the type of data available.
The following example dataset with genetics data has been formatted using this specification and can be used for practical guidance when curating a new dataset.
Dataset Description#
Genetic descriptors are encoded as an additional, OPTIONAL entry in the
dataset_description.json file.
Datasets linked to a genetic database entry include the following REQUIRED or OPTIONAL
dataset_description.json keys (a dot in the key name denotes a key in a sub-object,
see the example further below):
Example:
{
  "Name": "Human Connectome Project",
  "BIDSVersion": "1.3.0",
  "License": "CC0",
  "Authors": ["1st author", "2nd author"],
  "Funding": ["P41 EB015894/EB/NIBIB NIH HHS/United States"],
  "Genetics": {
    "Dataset": "https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001364.v1.p1",
    "Database": "https://www.ncbi.nlm.nih.gov/gap/",
    "Descriptors": ["doi:10.1016/j.neuroimage.2013.05.041"]
  }
}
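The dotted key notation used above (for example Genetics.Dataset) can be resolved programmatically. The following is a minimal Python sketch and not part of the specification: the helper name get_dotted is hypothetical, and a shortened copy of the example is embedded as a string purely for illustration.

```python
import json

# Shortened copy of the example above, embedded as a string for illustration.
DESCRIPTION_TEXT = """
{
  "Name": "Human Connectome Project",
  "BIDSVersion": "1.3.0",
  "Genetics": {
    "Dataset": "https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001364.v1.p1",
    "Database": "https://www.ncbi.nlm.nih.gov/gap/",
    "Descriptors": ["doi:10.1016/j.neuroimage.2013.05.041"]
  }
}
"""


def get_dotted(mapping, dotted_key, default=None):
    """Resolve a dotted key such as "Genetics.Dataset" in nested dicts.

    Hypothetical helper: each dot descends one level into a sub-object.
    Returns `default` when any level is missing or is not an object.
    """
    current = mapping
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


description = json.loads(DESCRIPTION_TEXT)
genetics_dataset = get_dotted(description, "Genetics.Dataset")
genetics_database = get_dotted(description, "Genetics.Database")
missing_key = get_dotted(description, "Genetics.Missing")  # falls back to None
```

Returning a default rather than raising keeps the lookup usable for entries that a dataset may legitimately omit.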
Subject naming and Participants file#
If the same participants have different identifiers in the genetic and imaging datasets,
the column genetic_id SHOULD be added to the participants.tsv file to associate
the BIDS participant with a subject in the Genetics.Dataset referred to in the
dataset_description.json file.
Information about the presence/absence of specific genetic markers MAY be duplicated
in the participants.tsv file by adding optional columns (like idh_mutation in the
example below).
Note that optional columns MUST be further described in an accompanying
participants.json file as described in Tabular files.
participants.tsv example:
participant_id	age	sex	group	genetic_id	idh_mutation
sub-control01	34	M	control	124587	yes
sub-control02	12	F	control	548936	yes
sub-patient01	33	F	patient	489634	no
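As an illustration of how the genetic_id column associates BIDS participants with subjects in the genetic dataset, the sketch below builds a lookup table from the example rows. It is a minimal example under stated assumptions: the function name map_genetic_ids is hypothetical, and the table is embedded as a string here, whereas a real dataset would read participants.tsv from disk.

```python
import csv
import io

# Copy of the participants.tsv example above (tab-separated).
PARTICIPANTS_TSV = (
    "participant_id\tage\tsex\tgroup\tgenetic_id\tidh_mutation\n"
    "sub-control01\t34\tM\tcontrol\t124587\tyes\n"
    "sub-control02\t12\tF\tcontrol\t548936\tyes\n"
    "sub-patient01\t33\tF\tpatient\t489634\tno\n"
)


def map_genetic_ids(tsv_text):
    """Return {participant_id: genetic_id} for rows with a non-empty genetic_id.

    Hypothetical helper; identifiers are kept as strings so that any
    leading zeros in genetic_id values are preserved.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["participant_id"]: row["genetic_id"]
        for row in reader
        if row.get("genetic_id")
    }


genetic_ids = map_genetic_ids(PARTICIPANTS_TSV)
```

With the example rows, genetic_ids["sub-patient01"] gives the identifier to look up in the dataset referenced by Genetics.Dataset.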
Genetic Information#
Template:
genetic_info.json
The genetic_info.json file describes the genetic information available in the
participants.tsv file and/or the genetic database described in
dataset_description.json.
Datasets containing the Genetics field in dataset_description.json or the
genetic_id column in participants.tsv MUST include this file with the following
fields:
To ensure dataset description consistency, we recommend following Multi-omics approaches to disease by Hasin et al. 2017 to determine the GeneticLevel:
Genetic: data report on a single genetic location (typically directly in the participants.tsv file)
Genomic: data link to participants’ genome (multiple genetic locations)
Epigenomic: data link to participants’ characterization of reversible modifications of DNA
Transcriptomic: data link to participants’ RNA levels
Metabolomic: data link to participants’ products of cellular metabolic functions
Proteomic: data link to participants’ peptides and proteins quantification
genetic_info.json example:
{
  "GeneticLevel": "Genomic",
  "AnalyticalApproach": ["Whole Genome Sequencing", "SNP/CNV Genotypes"],
  "SampleOrigin": "brain",
  "TissueOrigin": "gray matter",
  "CellType": "neuron",
  "BrainLocation": "[-30 -15 10]"
}
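The two rules in this section (when genetic_info.json is needed, and which GeneticLevel values are recommended) can be sketched as simple checks. This is an illustrative Python sketch, not an official validator: the function names are hypothetical, and accepting GeneticLevel as either a single string or a list of strings is an assumption, since the example above only shows a string.

```python
import json

# Levels recommended above, following Hasin et al. 2017.
GENETIC_LEVELS = {
    "Genetic",
    "Genomic",
    "Epigenomic",
    "Transcriptomic",
    "Metabolomic",
    "Proteomic",
}

# Copy of the genetic_info.json example above.
GENETIC_INFO_TEXT = """
{
  "GeneticLevel": "Genomic",
  "AnalyticalApproach": ["Whole Genome Sequencing", "SNP/CNV Genotypes"],
  "SampleOrigin": "brain",
  "TissueOrigin": "gray matter",
  "CellType": "neuron",
  "BrainLocation": "[-30 -15 10]"
}
"""


def requires_genetic_info(dataset_description, participants_columns):
    """True if the dataset has a Genetics field or a genetic_id column."""
    return (
        "Genetics" in dataset_description
        or "genetic_id" in participants_columns
    )


def unrecognized_levels(genetic_info):
    """Return GeneticLevel values that are not in the recommended list.

    Assumption: GeneticLevel may be a single string or a list of strings.
    """
    levels = genetic_info.get("GeneticLevel", [])
    if isinstance(levels, str):
        levels = [levels]
    return [level for level in levels if level not in GENETIC_LEVELS]


genetic_info = json.loads(GENETIC_INFO_TEXT)
unknown = unrecognized_levels(genetic_info)  # empty when all levels are recognized
```

Reporting the unrecognized values, instead of returning a single boolean, makes it easier to spot near-misses such as "Genomics".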