Definitions and Glossary

What is a Study?

A Study in GWAS Central is similar in scope to a journal article, comprising information relevant to a given research question or set of related questions. Data and analysis results from a study are grouped into one or more Experiments. The main fields in a Study entry are: Title, Abstract, Background, Objectives, KeyResults, Conclusions, StudyDesign, StudySizeReason, StudyPower, SourcesOfBias, Limitations, Acknowledgements, and SubmissionDate.

What is an Analysis Experiment?

Analysis Experiments in GWAS Central are packages of information that address one discrete research question, providing summaries of genetic association findings in Assayed Panels. An Analysis Experiment may include data for any number of Markers and any number of Assayed Panels, but will address no more than one Phenotype question. The main fields in an Experiment entry are: Objective, Outcome, and Comments.
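
The relationship between a Study and its Analysis Experiments can be pictured with a small data-structure sketch. This is illustrative only and is not the GWAS Central database schema: the class names, the choice of Python dataclasses, the `phenotype`, `marker_ids` and `assayed_panel_names` attributes, and the `phenotypes()` helper are all assumptions made for the example. Only the field names Title, Abstract, Objective, Outcome and Comments come from the definitions above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AnalysisExperiment:
    # Main fields named in the Experiment definition.
    objective: str
    outcome: str = ""
    comments: str = ""
    # No more than one Phenotype question per experiment (assumed attribute).
    phenotype: Optional[str] = None
    # Any number of Markers and Assayed Panels (hypothetical identifiers).
    marker_ids: List[str] = field(default_factory=list)
    assayed_panel_names: List[str] = field(default_factory=list)


@dataclass
class Study:
    # Two of the main Study fields; the others are omitted for brevity.
    title: str
    abstract: str = ""
    # Data and results are grouped into one or more Experiments.
    experiments: List[AnalysisExperiment] = field(default_factory=list)

    def phenotypes(self) -> List[str]:
        """Distinct phenotypes addressed by the experiments, in first-seen order."""
        seen: List[str] = []
        for experiment in self.experiments:
            if experiment.phenotype is not None and experiment.phenotype not in seen:
                seen.append(experiment.phenotype)
        return seen


# Made-up example values, for illustration only.
example_study = Study(
    title="Example study",
    experiments=[
        AnalysisExperiment(objective="Test association", phenotype="blood pressure"),
        AnalysisExperiment(objective="Replication", phenotype="blood pressure"),
    ],
)
```

Holding a single optional phenotype on each experiment, rather than a list, is one way to reflect the rule that an Analysis Experiment addresses no more than one Phenotype question.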

What is a Sample Panel?

A Sample Panel in GWAS Central is a set of test subjects that are collected together and grouped into a named compilation to address some phenotype of interest. Typically, all the individuals in a Sample Panel are annotated in terms of one or more related Phenotypes, or share some commonality of another key metric (e.g., age, gender, ethnicity). Sample Panels may or may not be equivalent to the eventual groupings that are used as the basis for examining and reporting Experiment data, i.e., the Assayed Panels.

What is an Assayed Panel?

An Assayed Panel in GWAS Central is a set of test subjects that are grouped into a named compilation, and used as the basis for examining and reporting Experiment data. Each Assayed Panel is derived from one or more Sample Panels (by splitting them into subsets and/or merging across Sample Panels) on the basis of some explicit phenotype criterion (such as presence/absence of a Phenotype, or a Phenotype value beyond some inclusion threshold).
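
As a rough sketch of how Assayed Panels might be derived, the example below merges two Sample Panels and then splits the result using an inclusion threshold on a phenotype value. The function name `derive_assayed_panel`, the dictionary representation of a subject, the `systolic_bp` key and the threshold of 140 are assumptions chosen for illustration. They do not describe how GWAS Central or any submitted study actually builds its panels.

```python
from typing import Callable, Dict, Iterable, List

# One test subject, represented as phenotype name -> measured value (assumed shape).
Subject = Dict[str, float]


def derive_assayed_panel(
    sample_panels: Iterable[List[Subject]],
    criterion: Callable[[Subject], bool],
) -> List[Subject]:
    """Merge the given Sample Panels, then keep subjects meeting the criterion."""
    merged = [subject for panel in sample_panels for subject in panel]
    return [subject for subject in merged if criterion(subject)]


# Two made-up Sample Panels.
panel_a = [{"systolic_bp": 150.0}, {"systolic_bp": 118.0}]
panel_b = [{"systolic_bp": 142.0}, {"systolic_bp": 125.0}]

THRESHOLD = 140.0  # hypothetical inclusion threshold

# Two Assayed Panels derived from the same pair of Sample Panels.
above_threshold = derive_assayed_panel(
    [panel_a, panel_b], lambda s: s["systolic_bp"] >= THRESHOLD
)
below_threshold = derive_assayed_panel(
    [panel_a, panel_b], lambda s: s["systolic_bp"] < THRESHOLD
)
```

Passing the criterion as a function keeps the merging step separate from the phenotype rule, so the same Sample Panels can yield different Assayed Panels for different Experiments.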

What is a Phenotype?

A Phenotype in GWAS Central is a reported characteristic or trait of interest, such as blood pressure. Phenotype information is organized into three sub-components: the ‘Phenotype Property’, which represents the concept of the trait under study; the ‘Phenotype Method’, which describes how the Phenotype Property was measured; and the ‘Phenotype Value’, which is a particular observation/result produced by measuring the Phenotype Property. Schemalet examples of this are available at the PaGE-OM website.
This system is very straightforward to use for the representation of ordinal or nominal Phenotype Values. To solve the problem of presenting quantitative Phenotype Values in a group of individuals (i.e., a Sample Panel or an Assayed Panel), GWAS Central stores various statistics that define the group’s distribution (e.g., mean, max, min, standard deviation). GWAS Central does not store Phenotype information for single individuals.
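
The idea of describing a quantitative Phenotype for a whole panel, rather than for individuals, can be sketched as follows. This is a minimal illustration under stated assumptions: the `PanelPhenotypeSummary` class, its attribute names, and the use of the sample standard deviation are choices made for the example and are not taken from GWAS Central.

```python
import statistics
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PanelPhenotypeSummary:
    phenotype_property: str  # the trait concept, e.g. "blood pressure"
    phenotype_method: str    # how the property was measured
    n: int                   # number of individuals summarised
    mean: float
    minimum: float
    maximum: float
    std_dev: float


def summarise_panel(
    phenotype_property: str,
    phenotype_method: str,
    values: Sequence[float],
) -> PanelPhenotypeSummary:
    """Reduce per-individual values to group-level statistics only."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute a standard deviation")
    return PanelPhenotypeSummary(
        phenotype_property=phenotype_property,
        phenotype_method=phenotype_method,
        n=len(values),
        mean=statistics.fmean(values),
        minimum=min(values),
        maximum=max(values),
        std_dev=statistics.stdev(values),  # sample standard deviation (assumption)
    )


# Made-up measurements; only the summary object would be kept.
summary = summarise_panel("blood pressure", "example method", [110.0, 120.0, 130.0])
```

The returned summary holds no per-individual values, which mirrors the statement that Phenotype information is not stored for single individuals.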

What is a Marker?

In GWAS Central we define a Marker as: “A DNA sequence for which identical or highly similar instances exist at one or more locations in a genome. Markers are typically used as the basis for designing an experimental assay for detection of those instances of that sequence”. The range of Markers available in GWAS Central is extensive, including the complete Marker content from other public depositories such as dbSNP, UniSTS, and DBGV.

What is a Genotype?

In GWAS Central we define a Genotype as: “A qualitative or quantitative combination of alleles of one or more Markers or DNA regions, implied (by the result of running a genotyping assay) to be resident at one or more positions in the genome of a tested DNA sample”. This definition thus focuses on the genotyping result and not absolute reality, i.e., detected genotypes may not always reflect the true status of the genome, since some assays are flawed in their design or application, and some DNA samples may be inaccurately genotyped. This definition also allows for haplotype genotypes, MarkerSet genotypes (composite Marker signals), and genotype classes that are something other than simple presence/absence detections. Specifically, we must also cater for copy-number variation and somatic variation, which implies quantitative and ratio genotypes will need to be supported. A new GWAS Central Nomenclature System for genotypes has been devised, to help manage these various complexities.

What is an Allele?

In GWAS Central we define an Allele as: “A specific version of a set of different sequence alternatives of a Marker or DNA region resident at one or more locations in a genome”. To minimise confusion when referring to Alleles, GWAS Central always presents Alleles in the context of their immediate flanking DNA sequences, and a new GWAS Central Nomenclature System for Alleles has been devised.

What are MeSH terms?

Medical Subject Headings (MeSH) is the National Library of Medicine’s controlled vocabulary thesaurus. It consists of sets of descriptors structured in a hierarchy that permits searching at various levels of specificity. In GWAS Central two levels of MeSH are implemented. MeSH ‘headings’ are displayed in the MeSH tree and represent concepts found in the biomedical literature, for example “Neoplasms”. MeSH ‘terms’ are used by the phenotype autocomplete search box and are the various synonyms used to represent those concepts, for example “Benign Neoplasms” and “Cancer”.

What are HPO terms?

The Human Phenotype Ontology (HPO) provides a standardized vocabulary of phenotypic abnormalities encountered in human disease. The vocabulary is structured as a hierarchy that permits searching at various levels of specificity. The HPO is being developed using information from Online Mendelian Inheritance in Man (OMIM) and the medical literature. The HPO complies with the OBO Foundry (Open Biological and Biomedical Ontologies) development principles. The HPO maintains synonyms for its terms, which are categorised as either “exact” or “related” synonyms. In GWAS Central the terms are used to generate the browsable tree, while the terms along with their exact synonyms may be used in text/autocomplete searches. For example, searches for the strings “Asthma” (term) and “Bronchial asthma” (exact synonym for Asthma) would yield identical results.
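
The search behaviour described above, where a term and its exact synonyms give the same result, can be sketched with a simple lookup index. This is an illustration only, not the GWAS Central search implementation. The `VOCABULARY` structure, the function names, the placeholder related synonym and the case-insensitive matching are all assumptions. Only “Asthma” and “Bronchial asthma” come from the text.

```python
from typing import Dict, List, Optional

# Toy vocabulary: term -> synonyms grouped by category.
# The "related" entry is a placeholder, not a real HPO synonym.
VOCABULARY: Dict[str, Dict[str, List[str]]] = {
    "Asthma": {
        "exact": ["Bronchial asthma"],
        "related": ["placeholder related synonym"],
    },
}


def build_search_index(vocabulary: Dict[str, Dict[str, List[str]]]) -> Dict[str, str]:
    """Map each term and each exact synonym to its term; skip related synonyms."""
    index: Dict[str, str] = {}
    for term, synonyms in vocabulary.items():
        index[term.casefold()] = term
        for synonym in synonyms.get("exact", []):
            index[synonym.casefold()] = term
    return index


def search(index: Dict[str, str], query: str) -> Optional[str]:
    """Return the matching term, or None (case-insensitive match is an assumption)."""
    return index.get(query.strip().casefold())


index = build_search_index(VOCABULARY)
```

Because both strings resolve to the same term, any lookup keyed on that term returns the same set of results for either query.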

What are inferred phenotypes?

Where a disease or syndrome name is used to describe a GWAS phenotype, an equivalent HPO term will not be found. In these cases HPO can be used to define the individual phenotypic abnormalities (signs and symptoms) associated with the disease or syndrome. GWAS Central implements the HPO deconstructions of disease phenotypes described by Online Mendelian Inheritance in Man (OMIM), within a framework of MeSH-to-OMIM term mappings, to provide automatically inferred HPO annotations for the originally assigned MeSH annotations. The inferred phenotypes represent phenotypic abnormalities that are commonly observed in occurrences of the disease or syndrome reported in the GWAS.
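
The chain from an assigned MeSH annotation, through OMIM, to inferred HPO annotations can be sketched as two lookups. All identifiers below are placeholders invented for the example and are not real MeSH, OMIM or HPO entries. The mapping tables and the `infer_hpo_annotations` function are likewise assumptions, intended only to show the shape of the inference.

```python
from typing import Dict, List, Set

# Placeholder mapping: MeSH disease annotation -> OMIM entries.
MESH_TO_OMIM: Dict[str, List[str]] = {
    "MeSH:disease_X": ["OMIM:entry_1", "OMIM:entry_2"],
}

# Placeholder mapping: OMIM entry -> HPO phenotypic abnormalities.
OMIM_TO_HPO: Dict[str, List[str]] = {
    "OMIM:entry_1": ["HPO:abnormality_A", "HPO:abnormality_B"],
    "OMIM:entry_2": ["HPO:abnormality_B", "HPO:abnormality_C"],
}


def infer_hpo_annotations(
    mesh_annotation: str,
    mesh_to_omim: Dict[str, List[str]],
    omim_to_hpo: Dict[str, List[str]],
) -> List[str]:
    """Collect the HPO terms reachable from a MeSH annotation via OMIM."""
    inferred: Set[str] = set()
    for omim_entry in mesh_to_omim.get(mesh_annotation, []):
        inferred.update(omim_to_hpo.get(omim_entry, []))
    # Sorted so the result is deterministic; duplicates are already removed.
    return sorted(inferred)


inferred_terms = infer_hpo_annotations("MeSH:disease_X", MESH_TO_OMIM, OMIM_TO_HPO)
```

A MeSH annotation with no OMIM mapping simply yields an empty list, so the originally assigned annotation is left without inferred phenotypes rather than causing an error.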

What is Author Communication?

Where possible we contact the “corresponding authors” of published studies with an invitation to review the data contained within GWAS Central for their studies and submit additional p-values to further scientific discovery.  A standard e-mail is sent to contacts, an example of which is given below.  In addition, we record if and when a response is received to our e-mail and the nature of that response.

Dear (name),

We are approaching you from the European Community funded GEN2PHEN project, working in collaboration with the NHGRI GWAS Catalog, Ensembl, and the Japanese GWAS Database.  We would like to make you aware of ‘GWAS Central’ (formerly HGVbaseG2P): a comprehensive GWAS database, with powerful browser support for multi-study viewing and comparison.

Our goal is to achieve the full, open, free, and integrated sharing of GWAS summary-level data, to further scientific progress in this domain. For legal and ethical reasons our efforts are limited to summary-level p-values, to ensure individuals cannot be identified. We are contacting all corresponding authors of published GWAS research to ensure their findings are properly and optimally displayed in GWAS Central. We are also requesting that authors provide additional p-values for their studies, since typically only a very small number of ‘top’ p-values are made public via the scientific literature.

Currently you are represented as the corresponding author who is to be contacted in relation to the following (study/studies):

* (study_name) as described in “(paper_title)” (doi) can be viewed at http://www.gwascentral.org/study/(study_id)

We invite you to review the data GWAS Central is displaying for your (study/studies). Once you have reviewed the data we would be grateful if you would work with us to ensure your findings are maximally represented.  We are very willing to help and to do all the necessary ‘heavy-lifting’ to process large numbers of additional p-values for your (study/studies), in whichever file format you have available.

Please get in touch if you’d like to discuss any of the above in more detail.

Regards,
GWAS Central Team