SPARQL end-point

Posted by rcf8

GWAS Central provides access to a triple-store containing all nanopub resources (including all associated marker, phenotype and result resources) through a SPARQL end-point.

The end-point is available at http://fuseki.gwascentral.org. The triple-store does not include all association data in GWAS Central, only the subset provided within nanopubs (all associations with a p-value ≤ 10⁻⁵) for Release 11 of the Study Database.

Example SPARQL queries of varying complexity are described at the bottom of the page.

The nanopub approach (see here for details) uses named graphs to provide context, so one or more graphs must be provided when querying the data using SPARQL.

Each nanopub represents its data in the context of four types of named graph specific to each data set. These types are described below.

The nanopub named graph

This named graph is a container for triples which point to the assertion, conditions and provenance named graphs described below.

SPARQL to get all triples in a single ‘nanopub’ named graph representing data set HGVRS17:

select *
from <http://purl.org/gwas/np/HGVRS17>
where {?s ?p ?o}

View results of query
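A query like the one above could also be submitted from a script using the SPARQL 1.1 Protocol, which accepts the query text as a `query` parameter on a GET request. The following is a minimal sketch rather than a documented client: the page gives only the host name, so the service path in SERVICE_URL is a hypothetical placeholder that has to be replaced with the real Fuseki service path, and build_request and run_select are illustrative names.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: hypothetical service URL. Only the host is given on this page;
# "DATASET/query" is a placeholder for the real Fuseki service path.
SERVICE_URL = "http://fuseki.gwascentral.org/DATASET/query"

# The 'nanopub' named graph query shown above.
QUERY = """
select *
from <http://purl.org/gwas/np/HGVRS17>
where {?s ?p ?o}
"""


def build_request(service_url: str, query: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = service_url + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_select(service_url: str, query: str) -> list:
    """Send a SELECT query and return the list of result bindings."""
    request = build_request(service_url, query)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]


# Example use (needs network access and the real service path):
#     for row in run_select(SERVICE_URL, QUERY)[:5]:
#         print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```

Building the request separately from sending it keeps the URL encoding easy to inspect without contacting the server.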

The assertion named graph

This named graph is a container for triples which relate to the association results. An ‘as’ suffix is added to the nanopub URI to distinguish this from the nanopub named graph.

SPARQL to get all triples in an ‘assertion’ named graph representing data set HGVRS17:

select *
from named <http://purl.org/gwas/np/HGVRS17#as>
where { graph ?g {?s ?p ?o} }

View results from query

The conditions named graph

This named graph is a container for triples which relate to the conditions of the assertion, including the panels used and the organism. A ‘co’ suffix is added to the nanopub URI to distinguish this from the nanopub named graph.

SPARQL to get all triples in a ‘conditions’ named graph representing data set HGVRS17:

select *
from named <http://purl.org/gwas/np/HGVRS17#co>
where { graph ?g {?s ?p ?o} }

View results from query

The provenance named graph

This named graph is a container for triples which relate to the provenance of the assertion, including the contributors to the nanopub and linkouts to related publications and databases. A ‘pr’ suffix is added to the nanopub URI to distinguish this from the nanopub named graph.

SPARQL to get all triples in a ‘provenance’ named graph representing data set HGVRS17:

select *
from named <http://purl.org/gwas/np/HGVRS17#pr>
where { graph ?g {?s ?p ?o} }

View results from query
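The URIs of the four named graphs follow a regular pattern, so they can be derived from a data set identifier. The sketch below assumes that every data set uses the same base URI and the same ‘as’, ‘co’ and ‘pr’ fragments as HGVRS17 in the queries above. The names nanopub_graphs and all_graphs_query are illustrative, not part of any GWAS Central API.

```python
# Base URI for nanopubs, as used in the example queries on this page.
NANOPUB_BASE = "http://purl.org/gwas/np/"

# Fragment suffixes that distinguish the other three graphs from the nanopub graph.
GRAPH_SUFFIXES = {"assertion": "as", "conditions": "co", "provenance": "pr"}


def nanopub_graphs(dataset_id: str) -> dict:
    """Return the four named-graph URIs for a data set such as 'HGVRS17'."""
    nanopub_uri = NANOPUB_BASE + dataset_id
    graphs = {"nanopub": nanopub_uri}
    for name, suffix in GRAPH_SUFFIXES.items():
        graphs[name] = f"{nanopub_uri}#{suffix}"
    return graphs


def all_graphs_query(dataset_id: str) -> str:
    """Build a SELECT query over all four named graphs of one nanopub."""
    from_clauses = "\n".join(
        f"from named <{uri}>" for uri in nanopub_graphs(dataset_id).values()
    )
    return f"select *\n{from_clauses}\nwhere {{ graph ?g {{?s ?p ?o}} }}"


print(all_graphs_query("HGVRS17"))
```

Because every graph is listed with `from named`, the `?g` variable in the results shows which of the four graphs each triple came from.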

Example SPARQL Queries

Construct an RDF graph of genes, their associated markers and GWAS Central phenotype resources when p-values ≤ 10⁻⁷, from nanopublications related to coronary artery disease:

PREFIX mesh: <http://bio2rdf.org/mesh:>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX gc:<http://purl.org/gwas/schema#>
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>

CONSTRUCT {
    ?phenotype rdf:type <http://example.com/CAD> ;
        <http://example.com/meshTerm> ?meshterm ;
        <http://example.com/hpoTerm> ?hpoterm ;
        <http://example.com/gene> ?gene .
    ?gene <http://example.com/marker> ?marker .
    ?marker <http://example.com/pvalue> ?pvalue .
}
WHERE
{
    GRAPH ?g
    {
        ?phenotype gc:meshAnnotation ?meshterm .
        OPTIONAL { ?phenotype gc:hpoAnnotation ?hpoterm }
        ?marker gc:associated ?phenotype ;
            gc:pvalue ?pvalue ;
            gc:locatedInGene ?gene .
    }
    # Keep associations with p-value ≤ 10⁻⁷ annotated with MeSH term D003324
    FILTER (xsd:float(?pvalue) <= 1e-7 && ?meshterm = mesh:D003324)
}

View results of query

Get all phenotype annotation terms and associated information from nanopublications where p-values ≤ 10⁻¹⁰:

PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX gc:<http://purl.org/gwas/schema#>
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
PREFIX obo:<http://www.obofoundry.org/ro/ro.owl#>
SELECT *
WHERE
{
    GRAPH ?g
    {
        ?phenotype gc:meshAnnotation ?meshterm .
        ?marker gc:associated ?phenotype ;
            gc:locatedInGene ?gene ;
            gc:pvalue ?pvalue ;
            obo:hasSynonym ?ext_marker_id .
        OPTIONAL { ?phenotype gc:hpoAnnotation ?hpoterm }
    }
    FILTER (xsd:float(?pvalue) <= 1e-10)
}

View results from query

Get all triples in a specific nanopub:

prefix np150:<http://purl.org/gwas/np/HGVRS150#>
prefix npbase:<http://purl.org/gwas/np/>
select *
from named npbase:HGVRS150
from named np150:as
from named np150:co
from named np150:pr
where { graph ?g { ?s ?p ?o } }

View results from query

Construct graph of all triples in a specific nanopub:

prefix np150:<http://purl.org/gwas/np/HGVRS150#>
prefix npbase:<http://purl.org/gwas/np/>
prefix marker:<http://purl.org/gwas/mkr/>
construct { ?s ?p ?o }
from named npbase:HGVRS150
from named np150:as
from named np150:co
from named np150:pr
where { graph ?g { ?s ?p ?o }  }

View results and graph from query (W3C validator tool)

Describe a particular marker and its associations:

prefix owl:<http://www.w3.org/2002/07/owl#>
describe ?s
where {
    graph ?g { ?s owl:sameAs ?o }
    filter (?o = <http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs3803662>)
}

View results from query
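The describe query above can be generalised to other markers by substituting the dbSNP rs ID. The sketch below is one possible way to do that, assuming every marker is linked by owl:sameAs to a dbSNP URI of the same form as the rs3803662 example. The helper name describe_marker_query is illustrative. It places the URI directly in the graph pattern instead of using a filter, which matches the same triples.

```python
import re

# dbSNP URI prefix taken from the rs3803662 example above.
DBSNP_PREFIX = "http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs="

# A plain dbSNP identifier: "rs" followed by digits only.
_RS_ID = re.compile(r"rs\d+")


def describe_marker_query(rs_id: str) -> str:
    """Return a DESCRIBE query for the marker with the given dbSNP rs ID."""
    # Reject anything that is not a plain rs ID so the value can be
    # interpolated into the query text safely.
    if not _RS_ID.fullmatch(rs_id):
        raise ValueError(f"not a dbSNP rs ID: {rs_id!r}")
    return (
        "prefix owl:<http://www.w3.org/2002/07/owl#>\n"
        "describe ?s\n"
        "where {\n"
        "    graph ?g { ?s owl:sameAs <" + DBSNP_PREFIX + rs_id + "> }\n"
        "}"
    )


print(describe_marker_query("rs3803662"))
```

Validating the identifier before building the query avoids malformed or injected query text when the rs ID comes from user input.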

Semantically enabling a genome-wide association study database

Posted by rkh7 in News | Uncategorized

Semantically enabling a genome-wide association study database, a publication that uses data from GWAS Central, was published in December 2012. It describes a methodology for applying phenotype annotations to a comprehensive genome-wide association dataset and for ensuring compatibility with the Semantic Web.

The provision of GWAS nanopublications enables a new dimension for exploring GWAS data, by way of intrinsic links to related data resources within the Linked Data web. The value of such annotation and integration will grow as more biomedical resources adopt the standards of the Semantic Web.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3579732/

The GWAS Central phenotype annotations can be queried and viewed from the web interface at:

http://www.gwascentral.org/phenotypes

The GWAS Central SPARQL end-point can be accessed at:

http://fuseki.gwascentral.org

The human-mouse comparative phenotype pipeline described in this paper, named “get human and mouse phenotypes for a gene”, is available from myExperiment at:

http://www.myexperiment.org/workflows/2131.html

 

Semantic Web/Linked Data support added

Posted by rcf8 in News | Uncategorized

GWAS Central now provides data in Semantic Web and Linked Data compatible formats through:

  • Individual markers, phenotypes and results, which now output RDF and Turtle formats that connect to Linked Data resources.
  • Nanopublications: a Semantic Web resource that uses named graphs and contains key results for each dataset, output in N-Quads format. These follow the nanopublication proposals suggested here.
  • The construction of a triple-store with a SPARQL server available at http://fuseki.gwascentral.org.

We are in the process of updating the Web Services section of our help with more information.

Release 7

Posted by rcf8 in Uncategorized | WebRelease

Semantic Web and Linked Data support added:

  • Individual markers, phenotypes and results now output RDF and Turtle formats.
  • Nanopublications: a Semantic Web resource that uses named graphs and contains key results for each dataset, output in N-Quads format. These follow the nanopublication proposals suggested here.
  • Construction of a triple-store with a SPARQL server at http://fuseki.gwascentral.org.