Accessing Data from the SPARQL Endpoint

Anzo offers a standard HTTP(S) SPARQL endpoint for sending SPARQL requests between client applications and Anzo. The endpoint is enabled by default. This topic provides the base endpoint URL and describes the supported HTTP methods and parameters.

Authentication

The Anzo SPARQL endpoint supports Basic Authentication. The endpoint can be configured to enable other Anzo-supported authentication methods. However, implementing alternate authentication mechanisms can have unexpected results. For more information, contact Cambridge Semantics Support.

Ultimately, the data that is available to users from the SPARQL endpoint depends on the access control configuration of the graphmart or linked data set in Anzo.
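As a rough illustration of what Basic Authentication means on the wire, the sketch below builds the Authorization header by hand using only the Python standard library. The helper name basic_auth_header and the credentials are placeholders invented for this example; most HTTP clients build this header for you.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Return an HTTP Basic Authentication header for the given credentials."""
    # Basic auth is the base64 encoding of "username:password".
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, for illustration only.
headers = basic_auth_header("sysadmin", "secret")
print(headers["Authorization"])
```

Because base64 is an encoding and not encryption, Basic Authentication is normally paired with the https protocol.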

HTTP Methods and Options

The Anzo SPARQL endpoint accepts the HTTP GET and POST methods. Read queries can be submitted with either GET or POST. Update queries must use POST.

Endpoint Base URL

Use the following base URL to access data in Anzo via the SPARQL endpoint. The table below describes each base URL component:

<protocol>://<hostname>:<port>/sparql/<store_type>/<url-encoded_dataset_uri>

Option Description
protocol The protocol to use for the connection: http for plain HTTP, or https for HTTP over SSL/TLS.
hostname The DNS name or IP address of the Anzo server.
port The port for the endpoint. The port that you specify depends on the protocol that you choose. By default, the HTTP port is 80 and the HTTPS port is 443. To view the ports that are configured for your Anzo instance, see Server Settings in the Administration menu.
sparql Required keyword for the SPARQL endpoint.
store_type The type of RDF store for the data. Typically users specify graphmart to query data that is in a graphmart. It is also possible to query the metadata for a linked data set (LDS) in the Dataset catalog. To query an LDS that is stored in a local volume, specify lds as the store type.
url-encoded_dataset_uri The URI for the graphmart or the catalog entry for the LDS. The URI must be URL-encoded using upper case hexadecimal digits. Lower case hexadecimal digits are not supported at this time.

For example, the following base endpoint URL targets the data in a graphmart:

https://10.100.10.20:8443/sparql/graphmart/http%3A%2F%2Fcambridgesemantics.com%2FGraphmart%2F1ad0ee911b834097ad7f71ee0ae1c0ff

The example below shows a base endpoint URL that targets a Dataset catalog entry:

https://10.100.10.20:8443/sparql/lds/http%3A%2F%2Fopenanzo.org%2FcatEntry(%255Bhttp%253A%252F%252Fcsi.com%252FFileBasedLinkedDataSet%252F001e517db4f0eaea9f279427e4e2a828%255D%2540%255Bhttp%253A%252F%252Fopenanzo.org%252Fdatasource%252FsystemDatasource%255D)
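The graphmart form of the base URL can be assembled programmatically. The following is a minimal sketch, assuming Python 3 and the standard urllib.parse module; the helper name build_endpoint_url is hypothetical. Python's quote() emits upper case hexadecimal digits, which matches the encoding requirement described in the table above.

```python
from urllib.parse import quote


def build_endpoint_url(protocol, hostname, port, store_type, dataset_uri):
    """Assemble <protocol>://<hostname>:<port>/sparql/<store_type>/<encoded URI>."""
    # safe="" makes quote() percent-encode ":" and "/" as well.
    # quote() always writes upper-case hex digits (%3A, %2F).
    encoded_uri = quote(dataset_uri, safe="")
    return f"{protocol}://{hostname}:{port}/sparql/{store_type}/{encoded_uri}"


url = build_endpoint_url(
    "https",
    "10.100.10.20",
    8443,
    "graphmart",
    "http://cambridgesemantics.com/Graphmart/1ad0ee911b834097ad7f71ee0ae1c0ff",
)
print(url)
```

This reproduces the graphmart example URL shown above. The Dataset catalog example leaves its parentheses unescaped and encodes the inner URIs twice, so this helper is only checked against the graphmart form.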

HTTP Header Options

The HTTP header provides information related to the transfer of data between the requesting client and the SPARQL endpoint. The table below describes the supported HTTP header options. Both of the fields are optional.

Option Description
Content-Type The Content-Type specifies the type of request that is being sent by the client. Anzo supports the following Content-Type values:
  • application/x-www-form-urlencoded: Including this value specifies that the query string will be passed as the value of a "query" or "update" HTTP parameter. This is the default value. When Content-Type is not specified, the endpoint behaves as if Content-Type: application/x-www-form-urlencoded is specified.
  • application/sparql-query: Including this value specifies that the HTTP request body includes a SPARQL read (non-update) query.
  • application/sparql-update: Including this value specifies that the HTTP request body includes a SPARQL update query.
Accept The Accept field specifies the response formats that are acceptable for the server to send back to the client. You can use this field to specify the output serialization format for query results in place of the format HTTP parameter. For details about the supported formats, see Format Options below.

HTTP Body Parameters

The HTTP parameters in the body of the request provide the rest of the information about the request. Certain parameters apply to read-only queries (SELECT and CONSTRUCT), and others apply to updates (INSERT and DELETE). The tables below describe the supported parameters for query and update requests.

Query Parameters

Parameter Description
query Specifies the full read-only query string to run. If you do not specify a url-encoded_dataset_uri, default-graph-uri or named-graph-uri in the request, the query string should contain the appropriate FROM clauses.

To run an update query (INSERT or DELETE), use the update parameter.

default-graph-uri Specifies a default graph URI to query. You can include this parameter multiple times in a request. When the base URL specifies a graphmart URI, you can specify a data layer URI to narrow the scope of the query to a specific data layer in the graphmart.
named-graph-uri Specifies a named graph URI to query. You can include this parameter multiple times in a request. When the base URL specifies a graphmart URI, you can specify a data layer URI to narrow the scope of the query to a specific data layer in the graphmart.
format Specifies the serialization format to use for the results of the query. For details about the supported formats, see Format Options below.
includeMetadataGraphs A boolean value that specifies whether to query the metadata graphs. Only valid for queries that target a linked data set (LDS) that is stored in a local volume. The default value is includeMetadataGraphs=false.
delim Specifies a custom delimiter character to use in CSV output results. Valid only for SELECT queries where the output format is text/csv. This field accepts any character. When delim is not specified, the default delimiter is a comma (,).
dedup A boolean value that specifies whether to deduplicate CONSTRUCT results on the client side. When dedup is not specified, the default value is dedup=true.
serverDedup A boolean value that specifies whether to deduplicate CONSTRUCT results on the server side. When serverDedup is not specified, the default value is serverDedup=true.
skipCache A boolean value that specifies whether to skip the reuse of any query cache that exists from a previous run of the query. When skipCache is not specified, the default value is skipCache=false.
hasHeader A boolean value that specifies whether to include headers in CSV results. Valid only for SELECT queries where the output format is text/csv. When hasHeader is not specified, the default value is hasHeader=false.
attachResult A boolean value that specifies whether to provide the query response as a file "attachment," i.e., the HTTP response will include a Content-Disposition of attachment. When attachResult is not specified, the default value is attachResult=false. When returning results as an attachment, you can specify a file name in the filename parameter.
filename If attachResult is true, this parameter specifies the file name to use for the attachment, excluding the file extension. If attachResult is true and filename is not specified, the default file name is QueryResult.
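The query parameters above are sent as ordinary form fields when the Content-Type is application/x-www-form-urlencoded. The sketch below shows one way to encode such a body in Python, assuming the standard urllib.parse module. The data layer URIs are made-up placeholders. A list of pairs is used instead of a dictionary so that default-graph-uri can appear more than once.

```python
from urllib.parse import parse_qsl, urlencode

# Each pair becomes one name=value field in the request body.
# The example.org layer URIs are hypothetical placeholders.
fields = [
    ("query", "SELECT * WHERE { ?s ?p ?o . } LIMIT 100"),
    ("default-graph-uri", "http://example.org/layer/1"),
    ("default-graph-uri", "http://example.org/layer/2"),
    ("format", "text/csv"),
    ("hasHeader", "true"),
    ("delim", "|"),
]

# urlencode() percent-encodes reserved characters and joins fields with "&".
body = urlencode(fields)
print(body)

# Decoding the body again recovers the original fields in order.
decoded = parse_qsl(body)
```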

Format Options

The table below describes the options for specifying the serialization format of the results that the server sends back to the client. These format options, i.e., MIME types or file extensions, can be specified in the format parameter in the body of the request or in the Accept header.

When the request does not include the format parameter or Accept header, the default result format for SELECT queries is SPARQL XML (application/sparql-results+xml). For CONSTRUCT queries, the default format depends on whether the query includes GRAPH clauses. If no GRAPH clause is present, the default format for CONSTRUCT results is RDF Turtle. If GRAPH clauses are present, the default format is RDF TriG.
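The default-format rules in the preceding paragraph can be summarized as a small lookup. The function below is only an illustration of those rules, not part of any Anzo client library. The name default_result_format is hypothetical, and the MIME types are the ones listed in the table in this section.

```python
def default_result_format(query_type: str, has_graph_clause: bool = False) -> str:
    """Return the MIME type the endpoint uses when no format is requested."""
    query_type = query_type.upper()
    if query_type == "SELECT":
        # SELECT results default to SPARQL XML.
        return "application/sparql-results+xml"
    if query_type == "CONSTRUCT":
        # CONSTRUCT defaults to TriG with GRAPH clauses, Turtle without.
        return "application/x-trig" if has_graph_clause else "application/x-turtle"
    # Only SELECT and CONSTRUCT defaults are described in this topic.
    raise ValueError(f"no documented default format for {query_type!r}")


print(default_result_format("select"))
print(default_result_format("construct", has_graph_clause=True))
```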

Format | Accepted Values | Query Type | Description
XML | application/sparql-results+xml, application/xml, xml, xml2, srx | SELECT only | Returns results in SPARQL Query Results XML Format.
XML | application/rdf+xml, rdf, owl, rdfs | CONSTRUCT only | Returns results in RDF 1.1 XML format.
JSON | application/json, json | SELECT and CONSTRUCT | For SELECT queries, results are returned in SPARQL Query Results JSON Format. For CONSTRUCT queries, results are returned in Anzo's native JSON RDF serialization format. See Anzo JSON RDF Serialization for details.
JSON | application/sparql-results+json | SELECT only | Returns results in SPARQL Query Results JSON Format.
CSV | text/csv, csv | SELECT only | Returns results in SPARQL Query Results CSV Format.
TriG and Gzipped TriG | application/x-trig, trig, application/x-trigz, trigz, gz, trig.gz | CONSTRUCT only | CONSTRUCT queries with a GRAPH clause return RDF 1.1 TriG by default if no format is specified.
Turtle and Gzipped Turtle | application/x-turtle, ttl, application/x-turtlez, ttlz, ttl.gz | CONSTRUCT only | Returns RDF 1.1 Turtle. CONSTRUCT queries without a GRAPH clause return Turtle by default if no format is specified.
N-Triples | text/plain, nt | CONSTRUCT only | Returns results in RDF 1.1 N-Triples format.
Notation3 and Gzipped Notation3 | text/rdf+n3, n3, text/rdf+n3z, n3z, n3z.gz | CONSTRUCT only | Returns results in RDF Notation3 format.
N-Quads | text/x-nquads, nq, nquad, nquads | CONSTRUCT only | Returns results in RDF 1.1 N-Quads format.
TriX | application/trix, trix | CONSTRUCT only | Returns results in RDF Triples in XML format.

Update Parameters

Parameter Description
update Specifies the full update string to run. If you do not specify a url-encoded_dataset_uri, using-graph-uri or using-named-graph-uri in the request, the update query should contain the appropriate USING clauses.

To run a non-update query (SELECT or CONSTRUCT), use the query parameter.

using-graph-uri Specifies a default graph URI to update. You can include this parameter multiple times in a request. When the base URL specifies a graphmart URI, you can specify a data layer URI to narrow the scope of the update to a specific data layer in the graphmart.
using-named-graph-uri Specifies a named graph URI to update. You can include this parameter multiple times in a request. When the base URL specifies a graphmart URI, you can specify a data layer URI to narrow the scope of the update to a specific data layer in the graphmart.
includeMetadataGraphs A boolean value that specifies whether to query the metadata graphs. Only valid for queries that target a linked data set (LDS) that is stored in a local volume. The default value is false.
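An update request differs from a read request mainly in the parameter names and the required POST method. The sketch below builds, but does not send, such a request with Python's standard urllib.request module. The helper name build_update_request, the example.org URIs, and the update string are all assumptions made for illustration. Authentication is omitted for brevity.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(endpoint_url, update, using_graph_uris=()):
    """Build a form-encoded POST request carrying a SPARQL update."""
    fields = [("update", update)]
    # using-graph-uri may be repeated, once per graph or data layer.
    fields.extend(("using-graph-uri", uri) for uri in using_graph_uris)
    body = urlencode(fields).encode("ascii")
    return Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",  # update queries must use POST
    )


update_request = build_update_request(
    "https://10.100.10.20:8443/sparql/graphmart/"
    "http%3A%2F%2Fcambridgesemantics.com%2FGraphmart%2F1ad0ee911b834097ad7f71ee0ae1c0ff",
    "INSERT { ?s <http://example.org/seen> true } "
    "WHERE { ?s a <http://example.org/Person> }",
    using_graph_uris=["http://example.org/layer/1"],  # hypothetical data layer URI
)
print(update_request.get_method(), update_request.data.decode("ascii"))
```

Passing the resulting object to urllib.request.urlopen would submit it. That step is left out here because it requires a reachable server and credentials.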

Examples

The following example uses cURL to send a request that runs a SELECT query against a graphmart. Since the request does not include an Accept header or format parameter, results will be returned in SPARQL XML format.

curl --user sysadmin:@nz0 -c cookiejar.txt -L -v -k \
  http://10.100.10.20/sparql/graphmart/http%3A%2F%2Fcambridgesemantics.com%2FGraphmart%2F2dc579b101654ae29eb91b0c7d046ca1 \
  --data-urlencode "query=SELECT * WHERE { ?s ?p ?o . } LIMIT 100"
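A response in SPARQL XML format can be read with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module on a small hand-written sample document. The sample data and the helper name parse_select_xml are invented for illustration, and the namespace is the one defined by the W3C SPARQL Query Results XML Format.

```python
import xml.etree.ElementTree as ET

# Namespace used by the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hand-written sample response, not actual Anzo output.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="s"/><variable name="p"/><variable name="o"/></head>
  <results>
    <result>
      <binding name="s"><uri>http://example.org/a</uri></binding>
      <binding name="p"><uri>http://example.org/name</uri></binding>
      <binding name="o"><literal>Alice</literal></binding>
    </result>
  </results>
</sparql>"""


def parse_select_xml(text):
    """Return one dict per result row, mapping variable name to value text."""
    root = ET.fromstring(text)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            value = binding[0]  # the <uri>, <literal>, or <bnode> child
            row[binding.get("name")] = value.text
        rows.append(row)
    return rows


rows = parse_select_xml(SAMPLE)
print(rows)
```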

The following example sends a GET request that runs a SELECT query against a graphmart. The format parameter is included to format the results in text/csv serialization.

For reference, below is the URL-encoded version of the request string. When sending a request from a client that does not automatically encode requests, you must URL-encode the string yourself. Line breaks are added for readability:

http://10.100.10.20/sparql/graphmart/http%3A%2F%2Fcambridgesemantics.com%2F
Graphmart%2F646861d1bab54d67bc79dea94e02f3e6
?query=select%20*%20where%20%7B%3Fs%20%3Fp%20%3Fo%7D%20limit%20100

The example below sends a POST request that runs a SELECT query. In this example, the query is included in the body of the request and the response format is XML.
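A request of this kind might be built as in the following sketch, which assumes Python's standard urllib.request module and sets Content-Type to application/sparql-query so that the body carries the raw query text. The helper name build_direct_query_request is hypothetical, the URL reuses the graphmart from the cURL example, and the request is constructed but not sent.

```python
from urllib.request import Request


def build_direct_query_request(endpoint_url, query,
                               accept="application/sparql-results+xml"):
    """Build a POST whose body is the unencoded SPARQL read query."""
    return Request(
        endpoint_url,
        data=query.encode("utf-8"),
        headers={
            # The body is the query itself, not a form field.
            "Content-Type": "application/sparql-query",
            # Ask for SPARQL XML results.
            "Accept": accept,
        },
        method="POST",
    )


request = build_direct_query_request(
    "http://10.100.10.20/sparql/graphmart/"
    "http%3A%2F%2Fcambridgesemantics.com%2FGraphmart%2F2dc579b101654ae29eb91b0c7d046ca1",
    "SELECT * WHERE { ?s ?p ?o . } LIMIT 100",
)
print(request.get_method(), request.full_url)
```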

The example below sends a GET request that runs a CONSTRUCT query. The response format is set to JSON, and the results are formatted in Anzo JSON RDF Serialization.
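For a GET request, the query and format parameters travel in the URL query string. The sketch below, assuming Python's standard urllib.parse module, shows one way to produce such a URL. The helper name build_get_url and the CONSTRUCT query are illustrative placeholders. Passing quote_via=quote encodes spaces as %20, in the same style as the encoded SELECT request shown earlier.

```python
from urllib.parse import quote, urlencode


def build_get_url(endpoint_url, query, result_format):
    """Append URL-encoded query and format parameters to the endpoint URL."""
    # quote_via=quote writes spaces as %20 instead of "+".
    params = urlencode(
        {"query": query, "format": result_format},
        quote_via=quote,
    )
    return f"{endpoint_url}?{params}"


get_url = build_get_url(
    "http://10.100.10.20/sparql/graphmart/"
    "http%3A%2F%2Fcambridgesemantics.com%2FGraphmart%2F646861d1bab54d67bc79dea94e02f3e6",
    "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10",
    "json",
)
print(get_url)
```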

The example below uses a Python script to send a request that runs a SPARQL query.

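import requests
from urllib.parse import quote_plus  # Python 3 location of quote_plus

# Base URL components for the Anzo server
server = 'https://company.anzo.com:'
port = 443
graphmart = 'http://cambridgesemantics.com/Graphmart/be4bd080c5654628b6fff90ca1b647d6'
# quote_plus URL-encodes the graphmart URI using upper case hexadecimal digits
url = server + str(port) + '/sparql/graphmart/' + quote_plus(graphmart)

queryText = 'SELECT * WHERE {?instance a ?type .} LIMIT 10'
payload = {'query': queryText, 'format': 'text/csv'}

# Submit the query as a form-encoded POST with Basic Authentication
r = requests.post(url, data=payload, auth=('sysadmin', '<pw>'))
r.raise_for_status()  # raise an exception on an HTTP error status
print(r.text)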