⬅️  Back Next ➡️

The first program disease area (PDA) that we were tasked to investigation was TNBC. Unfortunately, it turned out that the current public linked data was not particularly rich.

Contents:

  1. Genes associated with TNBC through gene variant annotations (negative result)
  2. Therapies associate with positive and negative predicators (partial result)
  3. Genes list linked through pathways on triple-negative breast cancer
  4. Concatenated list of TNBC genes
  5. Chemical compounds part of a pathway on triple-negative breast cancer
  6. Genes and the cellular components where encoded gene products are found in Pathways on TNBC
  7. Cell types related to breast cancer via markers



Genes associated with TNBC through gene variant annotations (negative result)

The first attempt was to load the list of genes associated with TNBC. Unsurprisingly, BRCA1 and BRCA2 are returned, but no other results are found.

#title: Genes associated with TNBC through gene variant annotations 
SELECT DISTINCT ?variant ?variantLabel  WHERE {
  VALUES ?predicator {p:P3354 p:P3355 p:P3356 p:P3357 p:P3358 p:P3359}
  VALUES ?predicatorValue {ps:P3354 ps:P3355 ps:P3356 ps:P3357 ps:P3358 ps:P3359}
    ?variant ?predictor ?node .
  ?node ?predicatorValue ?therapy .
  ?node pq:P2175 wd:Q7843332 .
  ?prop wikibase:claim ?predicator .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}


Contents ↑



Therapies associate with positive and negative predicators (partial result)

In the Wikidata model, there are also several properties to for predictors, both positive and negative as well as diagnostic, prognostic, or therapeutic. The list of therapies associated with TNBC (“Q7843332”) is richer.

#title: 
SELECT DISTINCT ?variant ?variantLabel ?propLabel ?therapyLabel WHERE {
  VALUES ?predicator {p:P3354 p:P3355 p:P3356 p:P3357 p:P3358 p:P3359}
  VALUES ?predicatorValue {ps:P3354 ps:P3355 ps:P3356 ps:P3357 ps:P3358 ps:P3359}
    ?variant ?predictor ?node .
  ?node ?predicatorValue ?therapy .
  ?node pq:P2175 wd:Q7843332 .
  ?prop wikibase:claim ?predicator .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}


Contents ↑



Genes list linked through pathways on triple-negative breast cancer

Rather than further try to reach genes from TNBC within the Wikidata linked data network, a permissively licensed pathway was encoded into WikiPathways, a collection of semantically rich pathway representations. Below you can see a graphical representation of https://www.wikipathways.org/index.php/Pathway:WP5211, “Glucose metabolism in triple-negative breast cancer cells (Homo sapiens)”:

Queries like this one can now be written that use the gene list provided by this pathway:

#title: Genes list linked through pathways on triple-negative breast cancer
PREFIX wd:<http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase:<http://wikiba.se/ontology#>
PREFIX bd:<http://www.bigdata.com/rdf#>

SELECT DISTINCT ?gene ?geneLabel ?scholia WHERE {
    VALUES ?disease {wd:Q7843332}   # with a main subject of triple negative breast cancer
    ?pathway wdt:P2410 ?wpid ;      # pathways with a Wikipathways ID
            wdt:P921 ?disease ;
            wdt:P527 ?gene .        # which has various parts
    ?gene wdt:P31 wd:Q7187 .        # part is of part gene
    BIND(uri(CONCAT("https://scholia.toolforge.org/gene/",REPLACE(str(?gene), "http://www.wikidata.org/entity/", ""))) as ?scholia)

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

(Note: you can find more potentially TNBC-related genes under [“Mutations and Slides for TNBC”](./data.html#{{ “Mutations and Slides for TNBC” | sluggify }}).)


Contents ↑



Concatenated list of TNBC genes

This simplified list can then be passed to other resources like https://string-db.org to explore the network interactions:

#title: Genes list linked through pathways on triple-negative breast cancer
SELECT (GROUP_CONCAT(?geneLabel; SEPARATOR=" ") as ?geneList) WHERE {
  VALUES ?disease {
    wd:Q7843332
  }
  ?pathway wdt:P2410 ?wpid;
    wdt:P921 ?disease;
    wdt:P527 ?gene.
  ?gene wdt:P31 wd:Q7187.
  ?gene rdfs:label ?geneLabel.
  FILTER(LANG(?geneLabel) = "en")
}

STRING Network Interactions

The list of genes from above can be entered here to visualize the network interactions from https://string-db.org:



Contents ↑



Chemical compounds part of a pathway on triple-negative breast cancer

Similarly the chemical compounds involved in a pathway can be queried:

#title: Chemical compounds part of a pathway on triple-negative breast cancer

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT DISTINCT ?compound ?compoundLabel ?scholia WHERE {
    ?pathway wdt:P2410 ?wpid ;                    # pathways with a Wikipathways ID
            wdt:P921 wd:Q7843332 ;                # with a main subject of triple negative breast cancer
            wdt:P527 ?compound .                  # which has various parts
    ?compound wdt:P279*/wdt:P31 wd:Q11173 .       # part is subclass of of chemical compound
    BIND(uri(CONCAT("https://scholia.toolforge.org/chemical/",REPLACE(str(?gene), "http://www.wikidata.org/entity/", ""))) as ?scholia)

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}


Contents ↑



Genes and the cellular components where encoded gene products are found in Pathways on TNBC

or even cellular components via the related gene products. Here a different representation of the same knowledge graph is being displayed:

#title: Genes and the cellular components where encoded gene products are found in Pathways on TNBC
#defaultView:Graph
PREFIX wd:<http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase:<http://wikiba.se/ontology#>
PREFIX bd:<http://www.bigdata.com/rdf#>

SELECT DISTINCT ?molecular_function ?molecular_functionLabel ?gene ?geneLabel  WHERE {
    VALUES ?disease {wd:Q7843332} # with a main subject of triple negative breast cancer
    ?pathway wdt:P2410 ?wpid ;      # pathways with a Wikipathways ID
            wdt:P921 ?disease ;
            wdt:P527 ?gene .      # which has various parts
    ?gene wdt:P31 wd:Q7187 ;       # part is of part gene
          wdt:P688 ?protein .
    ?protein wdt:P680 ?molecular_function .
    BIND(uri(CONCAT("https://scholia.toolforge.org/gene/",REPLACE(str(?gene), "http://www.wikidata.org/entity/", ""))) as ?scholia)

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}


Contents ↑



Finally, even without the use of pathways, some cell type annotations can be discovered in WikiData via the “has marker” and “genetic association” properties.

#title: Cell types related to breast cancer via markers

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?cellTypeLabel ?scholia_cellType ?geneLabel ?scholia_gene ?diseaseLabel ?scholia_disease
WHERE {
    VALUES ?disease { wd:Q128581} # breast cancer
    ?disease wdt:P2293 ?diseaseGene ;  # Breast cancer --> genetic association --> gene
        rdfs:label ?diseaseLabel .
    ?cellType wdt:P8872 ?diseaseGene ; # Cell type --> has marker --> gene
        rdfs:label ?cellTypeLabel .

    ?diseaseGene rdfs:label ?geneLabel .
    BIND(uri(CONCAT("https://scholia.toolforge.org/disease/",REPLACE(str(?disease), "http://www.wikidata.org/entity/", ""))) as ?scholia_disease)
    BIND(uri(CONCAT("https://scholia.toolforge.org/topic/",REPLACE(str(?cellType), "http://www.wikidata.org/entity/", ""))) as ?scholia_cellType)
    BIND(uri(CONCAT("https://scholia.toolforge.org/gene/",REPLACE(str(?diseaseGene), "http://www.wikidata.org/entity/", ""))) as ?scholia_gene)

    FILTER (LANG(?cellTypeLabel) = "en")
    FILTER (LANG(?diseaseLabel) = "en")
    FILTER (LANG(?geneLabel) = "en")
}


Contents ↑


⬅️  Back Next ➡️