
SPARQL Semantic Queries Tutorial

oguzhekim edited this page Mar 10, 2024 · 5 revisions

What is SPARQL?


SPARQL is a query language for RDF (Resource Description Framework), a directed, labeled graph data format for representing information on the Web. Whether the data is stored natively as RDF or exposed as RDF via middleware, SPARQL can express queries across diverse data sources. SPARQL also supports aggregation, subqueries, negation, creating values by expressions, extensible value testing, and constraining queries by source RDF graph. The results of SPARQL queries can be result sets or RDF graphs.
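
As a rough illustration of two of these features, the sketch below combines aggregation (COUNT) with negation (FILTER NOT EXISTS). It assumes a dataset that describes people with the FOAF vocabulary (foaf:name, foaf:mbox); the variable names are arbitrary.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Count the people who have a name but no mailbox.
# Assumption: the data uses foaf:name and foaf:mbox.
SELECT (COUNT(DISTINCT ?person) AS ?withoutMbox)
WHERE {
  ?person foaf:name ?name .
  FILTER NOT EXISTS { ?person foaf:mbox ?mbox }   # negation
}
```

The result is a single row with one number, so the output here is a result set rather than an RDF graph.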


Why Use SPARQL?

SPARQL stands out among query languages because it builds directly on the features of RDF:

  • Semantic querying: SPARQL queries are centered on what the user wants to know from the data, rather than how the data is structured.

  • Flexibility: SPARQL operates on RDF graphs, allowing users to query interconnected data represented as subject-predicate-object triples. This graph-based approach allows for flexible querying of diverse datasets. It's also easier to make live changes to an RDF store without any downtime or redesign.

  • Ease of use: SPARQL allows users to start working with different data sources right away, without having to map data sources and write new schemas.

  • Extensibility: SPARQL lets implementations provide extension functions, identified by IRIs, that can be called in expressions such as those used with FILTER and the BIND operator. Depending on the store, these functions may be implemented in SPARQL itself or in other programming languages such as JavaScript, Python, or Java.

  • Standardization: The World Wide Web Consortium (W3C) is responsible for maintaining the RDF and SPARQL specifications. The level of standardization of implementations using RDF and SPARQL is quite high. This ensures compatibility between different data stores because they all speak the same language.

  • Integration: RDF stores are typically queried over HTTP, making it very easy to integrate them into service architectures.
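
To make the extensibility point more concrete, here is a minimal sketch of BIND calling a function. It uses the built-in UCASE function, because extension functions are specific to each store; an extension function would be called the same way, by its IRI or prefixed name. The FOAF data is an assumption.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?upperName
WHERE {
  ?x foaf:name ?name .
  # BIND assigns the value of an expression to a new variable.
  BIND (UCASE(?name) AS ?upperName)
}
```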

How to Make Simple Queries in SPARQL?


  • Generally, a SPARQL query contains a set of triple patterns called a basic graph pattern.

  • These triple patterns are like RDF triples, except that each of the subject, predicate, and object may be a variable.

  • An example of a very simple SPARQL query would be the query below:

    SELECT ?title
    WHERE
    {
     <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title .
    }

    with the data:

    <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" .

    would give the result:

    title
    "SPARQL Tutorial"
  • If there is more than one solution, the result of a SPARQL query is a solution sequence. This sequence is not an arbitrary selection of solutions: it contains one solution for each way the query's graph pattern matches the data. For instance, the query:

    PREFIX foaf:   <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?mbox
    WHERE
    { 
      ?x foaf:name ?name .
      ?x foaf:mbox ?mbox 
    }

    with the data:

    @prefix foaf:  <http://xmlns.com/foaf/0.1/> .
    _:a  foaf:name   "Johnny Lee Outlaw" .
    _:a  foaf:mbox   <mailto:jlow@example.com> .
    _:b  foaf:name   "Peter Goodguy" .
    _:b  foaf:mbox   <mailto:peter@example.org> .
    _:c  foaf:mbox   <mailto:carol@example.org> .

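    would give the result (_:c has a mailbox but no foaf:name, so it matches only one of the two triple patterns and yields no solution):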

    name                  mbox
    "Johnny Lee Outlaw"   <mailto:jlow@example.com>
    "Peter Goodguy"       <mailto:peter@example.org>

A General View of a SPARQL Query


A SPARQL query resembles an SQL query in its overall shape. It uses one of four query forms (SELECT, DESCRIBE, CONSTRUCT, or ASK) together with a set of triple patterns, each made up of a subject, a predicate, and an object, any of which can be replaced by a variable that acts as a wildcard. Consider the SPARQL query below:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
FROM <http://example.com/dataset.rdf>
WHERE {
?x foaf:name ?name .
}
ORDER BY ?name

  • PREFIX: This keyword declares a prefix for abbreviating URIs. Without a prefix, one has to write out the whole URI in a SPARQL query.
  • SELECT: This keyword plays a role similar to SELECT in SQL: it returns the values of the listed variables for the data that matches the query conditions.
  • FROM: This keyword defines the RDF dataset that is being queried. The optional FROM NAMED clause is used when one wants to query a named graph.
  • WHERE: This keyword introduces the graph pattern to be matched against the data. The pattern is the core of a SPARQL query.
  • ORDER BY: This keyword sorts the query results. Other solution modifiers include LIMIT and OFFSET.

Instead of SELECT, three other query forms are available: ASK, DESCRIBE, and CONSTRUCT, where:

  • ASK: It returns true if the query pattern has at least one match in the data, and false otherwise.
  • DESCRIBE: It returns an RDF graph that describes a resource.
  • CONSTRUCT: It returns an RDF graph that is built from a template specified as part of the query itself.
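
The three non-SELECT forms can be sketched as follows. The ASK and CONSTRUCT queries assume FOAF data like that in the earlier name and mailbox example. The vcard prefix is borrowed from the CONSTRUCT example in the W3C specification, and the book IRI is the illustrative one used earlier on this page.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# ASK: is there any resource named "Peter Goodguy"?
ASK { ?x foaf:name "Peter Goodguy" }
```

Against the earlier FOAF data, this query would return true.

```sparql
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>

# CONSTRUCT: for every foaf:name match, emit a vcard:FN triple.
CONSTRUCT { ?x vcard:FN ?name }
WHERE     { ?x foaf:name ?name }
```

```sparql
# DESCRIBE: ask the store for an RDF description of one resource.
DESCRIBE <http://example.org/book/book1>
```

What a DESCRIBE query returns is decided by the query service, so the same query may produce different graphs on different stores.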

Wikidata with SPARQL

Wikidata offers structured data on a very wide range of topics. It acts as central storage for the structured data of its Wikimedia sister projects, including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others. The data is exposed as RDF, which makes it accessible via a SPARQL endpoint. Users can write SPARQL queries to retrieve specific information from Wikidata, such as all cities in a particular country, notable people born in a specific year, or relationships between different entities.

The Wikidata Query Service (WDQS) is Wikidata's own SPARQL endpoint. It returns the results of queries written in SPARQL. The service can be used either through an interactive web interface or programmatically, by submitting GET or POST requests to https://query.wikidata.org/sparql. The interactive web interface includes many sample queries, making it suitable for beginners.
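
As a small example of the kind of query WDQS accepts, the sketch below lists a few items that are an instance of "house cat". It relies on the Wikidata identifiers P31 (instance of) and Q146 (house cat), and on the wd:, wdt:, wikibase:, and bd: prefixes that WDQS predeclares; the label service is a WDQS-specific extension rather than standard SPARQL.

```sparql
# Items that are an instance of (wdt:P31) house cat (wd:Q146).
SELECT ?item ?itemLabel
WHERE {
  ?item wdt:P31 wd:Q146 .
  # WDQS extension: fills ?itemLabel with the English label of ?item.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
```

When the query is submitted programmatically, its text is passed URL-encoded in the query parameter of the request, as defined by the SPARQL Protocol.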

Additional Resources and References

