
Datamart Dataset APIs

  • API Version: 1.0.0
  • Release date: Stable release
  • Uses Dataset Metadata version schema: 1.0.0
  • Uses Dataset version schema: 0.0.3
  • Authors: Pedro Szekely, Ke-Thia Yao and Daniel Garijo

Datamart exposes two main APIs: a Dataset metadata API, where developers may retrieve metadata about datasets and variables; and a Dataset content API, where developers may download datasets and their variable time series.

Info

The metadata API follows the Dataset schema in https://datamart-upload.readthedocs.io/en/latest/. The content API follows the schema in https://datamart-upload.readthedocs.io/en/latest/download/

An implementation of the API is available at: https://datamart:datamart-api-789@dsbox02.isi.edu/datamart-api. We illustrate how to use it in a Jupyter notebook.
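The implementation URL above embeds a user name and password. Many HTTP clients prefer to receive credentials separately from the URL, so the sketch below splits the two using only the Python standard library. It assumes the server expects HTTP basic authentication, which the URL form suggests but this page does not state; `split_credentials` is a hypothetical helper name, not part of Datamart.

```python
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_userinfo, (username, password)).

    Hypothetical helper: strips the "user:password@" part from a URL so the
    credentials can be passed to an HTTP client separately.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    clean = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean, (parts.username, parts.password)


api_url, auth = split_credentials(
    "https://datamart:datamart-api-789@dsbox02.isi.edu/datamart-api"
)
print(api_url)  # base URL to prepend to the paths in the tables below
print(auth)     # (username, password) tuple for the HTTP client
```

The resulting `api_url` plays the role of `[API_URL]` in the examples on this page.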

Metadata API

The metadata API supports the operations listed below:

Path Method Description Parameters
/metadata/datasets GET Returns all datasets (list of Dataset) We support filtering datasets according to the following parameters:
name: name of the dataset. Example: &name=fbiData2009
geo: Spatial location. Example: &geo=33.946799,-118.4307395,15z
intersects: Intersection of the dataset location with a bounding box in format [lonmin,lonmax,latmin,latmax]. Example: &intersects=84.7142,-76.7142,14.9457,22.945
keyword: A relevant keyword (or a list of keywords separated by ",") that points to relevant variables, subjects or locations of the dataset. Example: &keyword=maize,ethiopia
/metadata/datasets POST Creates a new Dataset record.
Returns: Status code 201 (created) if successful, along with the dataset id.
NOTE: If the POST request has already been executed against the Datamart server, the server will respond with an error message.
/metadata/datasets/dataset_id PUT REPLACES the entry of the dataset identified by dataset_id with the JSON received in the request. Returns: Status code 200 if successful. None
/metadata/datasets/dataset_id GET Returns the metadata of the Dataset identified by dataset_id None
/metadata/datasets/dataset_id/variables GET Returns all Variables in a dataset identified by dataset_id (list of variable) None
/metadata/datasets/dataset_id/variables POST Creates a new Variable in the dataset identified by dataset_id. Returns 201 if successful None
/metadata/datasets/dataset_id/variables/variable_id GET Returns the Variable variable_id in the dataset identified by dataset_id None
/metadata/variables GET Returns all existing variable metadata We support filtering variables according to the following parameters:
ids: Variable ids to be returned (could be more than one). Example: &ids=H123,H124
name: name of the variable. Example: &name=population
geo: Spatial location. Example: &geo=33.946799,-118.4307395,15z
intersects: Intersection of the variable location with a bounding box in format [lonmin,lonmax,latmin,latmax]. Example: &intersects=84.7142,-76.7142,14.9457,22.945
keyword: A relevant keyword (or a list of keywords separated by ",") that points to relevant aspects of the variable. Example: &keyword=production,ethiopia

When a request includes a filter (e.g., by keyword), the response table also has a rank column with a score for each result. Higher scores indicate better matches for the request.
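As a rough illustration of the filter parameters listed above, the following sketch builds a metadata query URL and orders results by their rank score. The helper names, the `example.org` base URL, and the assumption that each result row is available as a dictionary with a `rank` key are all hypothetical; this page does not specify the response serialization.

```python
from urllib.parse import urlencode


def build_metadata_url(api_url, resource, **filters):
    """Build a /metadata/<resource> URL; list values become comma-separated."""
    query = [
        (name, ",".join(value) if isinstance(value, (list, tuple)) else str(value))
        for name, value in filters.items()
    ]
    url = f"{api_url.rstrip('/')}/metadata/{resource}"
    # safe="," keeps commas literal, matching the examples in the table.
    return f"{url}?{urlencode(query, safe=',')}" if query else url


def best_matches(rows, top=3):
    """Return up to `top` rows, highest rank score first (assumed 'rank' key)."""
    return sorted(rows, key=lambda row: float(row["rank"]), reverse=True)[:top]


url = build_metadata_url(
    "https://example.org/datamart-api",  # placeholder base URL
    "datasets",
    keyword=["maize", "ethiopia"],
)
print(url)

# Hypothetical result rows, only to show the ordering by rank.
rows = [{"name": "a", "rank": "0.2"}, {"name": "b", "rank": "0.9"}]
print(best_matches(rows))
```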

Data Content API

Path Method Description Parameters
/datasets/dataset_id GET Returns the raw dataset identified by dataset_id in its original format. Raw data could be in any format, such as CSV, TSV, PDF, images, zip, etc. None
/datasets/dataset_id/variables GET Returns a CSV with the variables included in the dataset identified by dataset_id. The results follow the canonical data format, and do not include qualifiers. limit: By default, the API returns data for only 20 variables. The limit can be raised by setting limit in the URL. Example: ?limit=50
/datasets/dataset_id/variables?variable=variable_id GET Returns a CSV in canonical data format for the specified dataset (dataset_id) and variable (variable_id). include: Additional columns to download. Example: &include=country_id,admin1_id
exclude: Exclude columns from download. Example: &exclude=coordinate
country: Download rows where the main subject is one of the specified countries. Example: &country=Ethiopia,Sudan
country_id: Download rows where the main subject is one of the specified country identifiers. Example: &country_id=Q115,Q1049
admin1: Download rows where the main subject is one of the specified first-level administrative regions. Example: &admin1=Oromia+Region
admin1_id: Download rows where the main subject is one of the specified first-level administrative region identifiers. Example: &admin1_id=Q202107
admin2: Download rows where the main subject is one of the specified second-level administrative regions. Example: &admin2=Arsi+Zone
admin2_id: Download rows where the main subject is one of the specified second-level administrative region identifiers. Example: &admin2_id=Q646859
admin3: Download rows where the main subject is one of the specified third-level administrative regions. Example: &admin3=Amigna,Digeluna+Tijo
admin3_id: Download rows where the main subject is one of the specified third-level administrative region identifiers. Example: &admin3_id=Q2843318,Q5275598
in_country: Download rows where the main subject is a first-level administrative region of one of the specified countries. Example: &in_country=Ethiopia
in_country_id: Download rows where the main subject is a first-level administrative region of one of the specified country identifiers. Example: &in_country_id=Q115
in_admin1: Download rows where the main subject is a second-level administrative region of one of the specified first-level administrative regions. Example: &in_admin1=Oromia+Region
in_admin1_id: Download rows where the main subject is a second-level administrative region of one of the specified first-level administrative region identifiers. Example: &in_admin1_id=Q202107
in_admin2: Download rows where the main subject is a third-level administrative region of one of the specified second-level administrative regions. Example: &in_admin2=Arsi+Zone
in_admin2_id: Download rows where the main subject is a third-level administrative region of one of the specified second-level administrative region identifiers. Example: &in_admin2_id=Q646859
/datasets/dataset_id/variables/variable_id PUT Uploads data to a variable of a dataset. The variable must already exist in the dataset (i.e., it has to be created by POST to /metadata/datasets/{dataset_id}/variables). DEPRECATED
/datasets/dataset_id/variables/variable_id DELETE Deletes the variable from a target dataset. None
/datasets/dataset_id/annotated PUT, POST Uploads data to one or more variables in the target dataset
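The download parameters in the table above can be combined into a single query string. The sketch below assembles such a URL with the standard library, following the `/datasets/dataset_id/variables?variable=variable_id` form from the table. The function name, the `example.org` base URL, and the `D1`/`V1` identifiers are placeholders for illustration only.

```python
from urllib.parse import urlencode


def build_download_url(api_url, dataset_id, variable_id, **params):
    """Build a variable download URL; list values become comma-separated."""
    query = [("variable", variable_id)]
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        query.append((name, value))
    # Spaces are encoded as "+" (e.g. Oromia+Region); commas are kept literal.
    return (
        f"{api_url.rstrip('/')}/datasets/{dataset_id}/variables"
        f"?{urlencode(query, safe=',')}"
    )


download_url = build_download_url(
    "https://example.org/datamart-api",  # placeholder base URL
    "D1",                                # placeholder dataset id
    "V1",                                # placeholder variable id
    include=["country_id", "admin1_id"],
    admin1=["Oromia Region"],
)
print(download_url)
```

Passing place names as plain strings and letting `urlencode` handle the escaping avoids hand-writing the `+` separators shown in the parameter examples.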

Additional considerations:

All the region parameters (country, country_id, admin1, etc.) can be used at the same time. Datamart interprets multiple region parameters as OR constraints: a row is returned if it satisfies at least one of them.

Datamart uses place names based on the English Wikidata labels. A place can also be identified by its Wikidata qnode id. The mapping between a place name and its identifier, as well as its administrative hierarchy, can be found in this file.
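To make the OR interpretation of region parameters concrete, here is a small local sketch of the selection logic. It is not Datamart's implementation: the row layout (a `main_subject` name plus a `level` field) is a simplified, hypothetical model chosen only to show that a row is kept when any one of the supplied constraints matches.

```python
def matches_any(row, constraints):
    """True if the row satisfies at least one region constraint (OR semantics)."""
    return any(
        row["level"] == level and row["main_subject"] in names
        for level, names in constraints.items()
    )


# Hypothetical rows: each has a main subject and its administrative level.
rows = [
    {"main_subject": "Ethiopia", "level": "country"},
    {"main_subject": "Oromia Region", "level": "admin1"},
    {"main_subject": "Sudan", "level": "country"},
    {"main_subject": "Arsi Zone", "level": "admin2"},
]

# Models a request with &country=Sudan&admin1=Oromia+Region
constraints = {"country": {"Sudan"}, "admin1": {"Oromia Region"}}

selected = [row["main_subject"] for row in rows if matches_any(row, constraints)]
print(selected)
```

Under AND semantics no row could match both constraints here, since a main subject is either a country or a first-level region; with OR semantics both the Sudan row and the Oromia Region row are selected.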

Example:

  • GET [API_URL]/[dataset_id]/variable/[variable_id]: Get a CSV table of crop production.
  • GET [API_URL]/[dataset_id]/variable/[variable_id]/area&include=admin1_id: Get a CSV table of the land area used for crop production, and include the admin1_id column in the table.

Aggregation of Data Content API

Path Method Description Parameters
/datasets/dataset_id/variable/variable_id GET Returns an aggregated dataset from dataset dataset_id and variable variable_id in canonical data format. group-by: specifies the column to use for aggregation
operator: specifies the function to use for aggregation

Example:

  • GET [API_URL]/datasets/[dataset_id]/variables/[variable_id]?group-by=admin1_id&operator=sum: Get food production aggregated at the admin1 region level.
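To show what a request with group-by=admin1_id and operator=sum is expected to compute, the sketch below performs the same kind of aggregation locally on a tiny CSV. The column names (`admin1_id`, `admin2_id`, `value`), the sample numbers, and the `REGION_B`/`ZONE_*` identifiers are assumptions made for illustration; the actual canonical data format may use different columns.

```python
import csv
import io
from collections import defaultdict


def aggregate_sum(csv_text, group_by, value_column="value"):
    """Sum `value_column` per distinct value of the `group_by` column."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_by]] += float(row[value_column])
    return dict(totals)


# Hypothetical rows at the admin2 level, each tagged with its admin1 region.
sample = """admin1_id,admin2_id,value
Q202107,Q646859,10
Q202107,ZONE_X,20
REGION_B,ZONE_Y,5
"""

totals = aggregate_sum(sample, group_by="admin1_id")
print(totals)
```

Each admin1 identifier ends up with one total, which mirrors the one-row-per-group shape an aggregated download would be expected to have.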