Accessing an Endpoint Programmatically
This topic provides guidance on accessing Data on Demand endpoints programmatically by showing some example implementations using R and Python.
- Authentication and Data Access
- Accessing an Endpoint with R (Through RStudio)
- Accessing an Endpoint with Python (Through a Linux Terminal)
Authentication and Data Access
Connections to Data on Demand endpoints must be authenticated. Users submit their Anzo username and password with each request; both examples in this topic do so using HTTP Basic authentication. The data that is available to users from OData endpoints is ultimately subject to the security and composition of the graphmart as configured in Anzo.
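To illustrate what Basic authentication means at the HTTP level, the following minimal Python sketch builds the Authorization header that an HTTP client sends on the user's behalf. The helper name basic_auth_header and the credentials are hypothetical and used only for illustration. In practice, an HTTP library normally generates this header for you.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Return the HTTP Basic Authorization header for the given credentials.

    Hypothetical helper for illustration; HTTP libraries usually do this internally.
    """
    # Basic authentication is "username:password", base64-encoded (not encrypted)
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, not real Anzo accounts
print(basic_auth_header("sysadmin", "example-password"))
```

Because the credentials are only base64-encoded, they should be sent over HTTPS, as in the endpoint URLs used in this topic.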
Accessing an Endpoint with R (Through RStudio)
The following example shows how to connect to an OData endpoint from RStudio. The example uses the R programming language to access a Data on Demand endpoint and pull in data via a standard dataframe. New or existing R scripts can then be used with the data.
The first step in accessing data from RStudio is to prepare the R script that will construct the target URL and retrieve the resulting information via HTTP. The example script below accesses a pre-configured "Sample Data" endpoint. The script has sections for filtering the results as well as expanding the selection to include information from multiple classes:
    library(httr)
    library(jsonlite)
    library(rstudioapi)

    ## Prompt for Anzo credentials
    user <- rstudioapi::showPrompt("Username", "Enter Anzo username", "sysadmin")
    pw <- rstudioapi::askForPassword(paste("Enter password for", user, sep = " "))

    ## Data on Demand endpoint (the trailing slash separates it from the class name)
    odata <- "https://cambridgesemantics.com/dataondemand/Sample-Graphmart/Sample-Data/"

    ## Start from Probe class
    startClass <- "Probe?"

    ## Filter results for Homo sapiens species
    filterKw <- "$filter="
    filterVal <- "Species eq 'Hs'"
    urlify <- URLencode(filterVal)
    filterStr <- paste(filterKw, urlify, sep = "")

    ## Select properties of interest (FeatureID) from base class
    selectKw <- "&$select="
    selectVal <- "FeatureID"
    selectStr <- paste(selectKw, selectVal, sep = "")

    ## Select properties of interest (symbol) from Gene class
    ## via corresponds_to property on base Probe class
    expandKw <- "&$expand="
    expandClass <- "corresponds_to"
    expandProps <- "symbol"
    expSelStr <- "$select="
    expandStr <- paste(expandKw, expandClass, "(", expSelStr, expandProps, ")", sep = "")

    ## Specify format
    format <- "&$format=json"

    ## Generate OData URL using fragments above
    url <- paste(odata, startClass, filterStr, selectStr, expandStr, format, sep = "")
    print(url)

    ## Access OData endpoint using Basic authentication
    resultRaw <- GET(url, authenticate(user, pw, type = "basic"))
    resultTxt <- content(resultRaw, "text")
    resultJson <- fromJSON(resultTxt, flatten = TRUE)

    ## Read results into dataframe
    resultDataFrame <- as.data.frame(resultJson)
    View(resultDataFrame)
Executing the above R script from RStudio results in a dataframe that contains the FeatureID property from the Probe class and the symbol property from the Gene class.
Accessing an Endpoint with Python (Through a Linux Terminal)
Many users have existing Python scripts that they want to run against data in Anzo, or are familiar enough with Python that it is the easiest way for them to explore and retrieve that data. The following example shows how to connect to an OData endpoint by executing a Python script from a Linux terminal.
The first step in accessing data using Python is to prepare the Python script that will construct the target URL and retrieve the resulting information via HTTP. The example script below accesses a pre-configured "Sample Data" endpoint. The script has sections for filtering the results as well as expanding the selection to include information from multiple classes (the same filter and class properties that were used in the R example above).
    import getpass
    from urllib.parse import quote

    import requests

    ## Prompt for Anzo credentials (only the password is hidden while typing)
    un = input('Username: ')
    pw = getpass.getpass(prompt='Password: ')

    ## Data on Demand (OData) endpoint
    odata = 'https://cambridgesemantics.com/dataondemand/Sample-Graphmart/Sample-Data/'

    ## Start from Probe class
    startClass = "Probe?"

    ## Filter results for Homo sapiens species
    filterKw = "$filter="
    filterVal = "Species eq 'Hs'"
    filterStr = filterKw + quote(filterVal)  # percent-encode spaces and quotes

    ## Select properties of interest (FeatureID) from base class
    selectKw = "&$select="
    selectVal = "FeatureID"
    selectStr = selectKw + selectVal

    ## Select properties of interest (symbol) from Gene class
    ## via corresponds_to property on base Probe class
    expandKw = "&$expand="
    expandClass = "corresponds_to"
    expandProps = "symbol"
    expSelStr = "$select="
    expandStr = expandKw + expandClass + "(" + expSelStr + expandProps + ")"

    ## Specify format (named formatStr to avoid shadowing the built-in format())
    formatStr = "&$format=text/csv"

    ## Generate OData URL using fragments above
    url = odata + startClass + filterStr + selectStr + expandStr + formatStr

    ## Access OData endpoint using Basic authentication.
    ## verify=False disables TLS certificate verification; remove it when the
    ## server presents a trusted certificate.
    r = requests.get(url, auth=(un, pw), verify=False)

    print("URL")
    print(url)
    print("CONTENT")
    print(r.text)
    print(type(r))
    print(type(r.content))
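As an alternative to concatenating string fragments by hand, the URL construction in the script above could be wrapped in a small helper. The following is a sketch only: the function build_odata_url and its parameter names are hypothetical, and it covers just the four query options used in this topic ($filter, $select, $expand, and $format).

```python
from urllib.parse import quote


def build_odata_url(endpoint, start_class, filter_expr=None, select=None,
                    expand=None, fmt=None):
    """Assemble an OData query URL from optional query options (hypothetical helper)."""
    options = []
    if filter_expr:
        # Percent-encode spaces and quotes in the filter expression
        options.append("$filter=" + quote(filter_expr))
    if select:
        options.append("$select=" + ",".join(select))
    if expand:
        options.append("$expand=" + expand)
    if fmt:
        options.append("$format=" + fmt)
    # Ensure exactly one slash between the endpoint and the class name
    url = endpoint.rstrip("/") + "/" + start_class
    return url + ("?" + "&".join(options) if options else "")


url = build_odata_url(
    "https://cambridgesemantics.com/dataondemand/Sample-Graphmart/Sample-Data/",
    "Probe",
    filter_expr="Species eq 'Hs'",
    select=["FeatureID"],
    expand="corresponds_to($select=symbol)",
    fmt="text/csv",
)
print(url)
```

Normalizing the slash between the endpoint and the class name means the helper produces the same URL whether or not the endpoint string ends with a slash.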
The Python script requests its output in CSV format ($format=text/csv), whereas the R script requests JSON.
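Once the CSV text has been retrieved, it can be parsed into rows with Python's standard csv module. The sketch below assumes the response begins with a header row. The sample text, including its column names and values, is a hypothetical placeholder and not actual endpoint output, and the function name csv_text_to_rows is likewise illustrative.

```python
import csv
import io


def csv_text_to_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Hypothetical placeholder text standing in for the body of the HTTP response;
# real column names and values depend on the endpoint.
sample = "FeatureID,symbol\nF001,GENE1\nF002,GENE2\n"

for row in csv_text_to_rows(sample):
    print(row["FeatureID"], row["symbol"])
```

With the script above, the text passed to the function would be the decoded response body (r.text) rather than the placeholder string.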