Datamart Dataset APIs

API Version: 1.0.0
Release date: Stable release
Uses Dataset Metadata version schema: 1.0.0
Uses Dataset version schema: 0.0.3
Authors: Pedro Szekely, Ke-Thia Yao and Daniel Garijo

Datamart exposes two main APIs: a Dataset metadata API, where developers may retrieve metadata about datasets and variables; and a Dataset content API, where developers may download datasets and their variable time series.

Info

The metadata API follows the Dataset schema in https://datamart-upload.readthedocs.io/en/latest/. The content API follows the schema in https://datamart-upload.readthedocs.io/en/latest/download/

An implementation of the API is available at: https://datamart:datamart-api-789@dsbox02.isi.edu/datamart-api. We illustrate how to use it in a Jupyter notebook.

Metadata API

The metadata API supports the operations listed below:

Path	Method	Description	Parameters
/metadata/datasets	GET	Returns all datasets (list of Dataset)	We support filtering datasets according to the following parameters: `name`: name of the dataset. Example: `&name=fbiData2009`. `geo`: spatial location. Example: `&geo=33.946799,-118.4307395,15z`. `intersects`: intersection of the dataset location with a bounding box in the format [lonmin,lonmax,latmin,latmax]. Example: `&intersects=84.7142,-76.7142,14.9457,22.945`. `keyword`: a relevant keyword (or a list of keywords separated by ",") that points to relevant variables, subjects or location of the dataset. Example: `&keyword=maize,ethiopia`
/metadata/datasets	POST	Creates a new Dataset record. Returns: Status code 201 (created) if successful, along with the dataset id.	NOTE: If the POST methods have already been executed against the Datamart server, then the server will respond with an error message.
/metadata/datasets/dataset_id	PUT	REPLACES the entry of the dataset identified by `dataset_id` with the JSON received in the request. Returns: Status code 200 if successful.	None
/metadata/datasets/dataset_id	GET	Returns the metadata of the Dataset identified by `dataset_id`	None
/metadata/datasets/dataset_id/variables	GET	Returns all Variables in a dataset identified by `dataset_id` (list of variable)	None
/metadata/datasets/dataset_id/variables	POST	Creates a new Variable in the dataset identified by `dataset_id`. Returns 201 if successful	None
/metadata/datasets/dataset_id/variables/variable_id	GET	Returns the `Variable` `variable_id` in the dataset identified by `dataset_id`	None
/metadata/variables	GET	Returns all existing variable metadata	We support filtering variables according to the following parameters: `ids`: variable ids to be returned (could be more than one). Example: `&ids=H123,H124`. `name`: name of the variable. Example: `&name=population`. `geo`: spatial location. Example: `&geo=33.946799,-118.4307395,15z`. `intersects`: intersection of the variable location with a bounding box in the format [lonmin,lonmax,latmin,latmax]. Example: `&intersects=84.7142,-76.7142,14.9457,22.945`. `keyword`: a relevant keyword (or a list of keywords separated by ",") that points to relevant aspects of the variable. Example: `&keyword=production,ethiopia`

When a request includes a filter (e.g., by keyword), the response table will also have a rank column with a score indicating how well each result matches the request (higher scores indicate better matches).
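The filter parameters above are ordinary query-string parameters. The following is a minimal sketch of how a filtered metadata request URL might be assembled; the base URL and the helper name `metadata_search_url` are assumptions made for illustration and are not part of the Datamart API.

```python
from urllib.parse import urlencode

# Placeholder base URL (an assumption); replace it with your Datamart deployment.
API_URL = "https://example.org/datamart-api"


def metadata_search_url(resource, **filters):
    """Build a GET URL for /metadata/datasets or /metadata/variables.

    Hypothetical helper: list values are joined with "," to mirror the
    documented examples (e.g. keyword=maize,ethiopia).
    """
    if resource not in ("datasets", "variables"):
        raise ValueError("resource must be 'datasets' or 'variables'")
    params = {}
    for key, value in filters.items():
        if value is None:
            continue  # skip filters that were not set
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        params[key] = str(value)
    url = f"{API_URL}/metadata/{resource}"
    if params:
        # safe="," keeps commas readable instead of percent-encoding them.
        url += "?" + urlencode(params, safe=",")
    return url


print(metadata_search_url("datasets", keyword=["maize", "ethiopia"]))
print(metadata_search_url("variables", ids=["H123", "H124"], name="population"))
```

The resulting URL can be passed to any HTTP client. Note that the first parameter is introduced with `?`, and the `&` prefix shown in the table examples applies to subsequent parameters.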

Data Content API

Path	Method	Description	Parameters
/datasets/dataset_id	GET	Returns the raw dataset identified by `dataset_id` in its original format. Raw data could be in any format, such as CSV, TSV, PDF, images, zip, etc.	None
/datasets/dataset_id/variables	GET	Returns a CSV with the variables included in the dataset identified by `dataset_id`. The results follow the canonical data format, and do not include qualifiers.	`limit`: The API will return data for 20 variables only, by default. However that limit can be increased by setting the limit in the url. Example: `?limit=50`
/datasets/dataset_id/variables?variable=variable_id	GET	Returns a CSV in canonical data format for the specified dataset (`dataset_id`) and variable (`variable_id`).	`include`: Additional columns to download. Example: `&include=country_id,admin1_id` `exclude`: Exclude columns from download. Example: `&exclude=coordinate` `country`: Download rows where the main subject is one of the specified countries. Example: `&country=Ethiopia,Sudan` `country_id`: Download rows where the main subject is one of the specified country identifiers. Example: `&country_id=Q115,Q1049` `admin1`: Download rows where the main subject is one of the specified first-level administrative regions. Example: `&admin1=Oromia+Region` `admin1_id`: Download rows where the main subject is one of the specified first-level administrative region identifiers. Example: `&admin1_id=Q202107` `admin2`: Download rows where the main subject is one of the specified second-level administrative regions. Example: `&admin2=Arsi+Zone` `admin2_id`: Download rows where the main subject is one of the specified second-level administrative region identifiers. Example: `&admin2_id=Q646859` `admin3`: Download rows where the main subject is one of the specified third-level administrative regions. Example: `&admin3=Amigna,Digeluna+Tijo` `admin3_id`: Download rows where the main subject is one of the specified third-level administrative region identifiers. Example: `&admin3_id=Q2843318,Q5275598` `in_country`: Download rows where the main subject is a first-level administrative region of one of the specified countries. Example: `&in_country=Ethiopia` `in_country_id`: Download rows where the main subject is a first-level administrative region of one of the specified country identifiers. Example: `&in_country_id=Q115` `in_admin1`: Download rows where the main subject is a second-level administrative region of one of the specified first-level administrative regions. Example: `&in_admin1=Oromia+Region` `in_admin1_id`: Download rows where the main subject is a second-level administrative region of one of the specified first-level administrative region identifiers. Example: `&in_admin1_id=Q202107` `in_admin2`: Download rows where the main subject is a third-level administrative region of one of the specified second-level administrative regions. Example: `&in_admin2=Arsi+Zone` `in_admin2_id`: Download rows where the main subject is a third-level administrative region of one of the specified second-level administrative region identifiers. Example: `&in_admin2_id=Q646859`
/datasets/dataset_id/variables/variable_id	PUT	Uploads data to a variable of a dataset.	The variable must already exist in the dataset (i.e., it has to be created by POST to `/metadata/datasets/{dataset_id}/variables`). DEPRECATED
/datasets/dataset_id/variables/variable_id	DELETE	Deletes the variable from a target dataset.	None
/datasets/dataset_id/annotated	PUT, POST	Uploads data to one or more variables in the target dataset
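
To show how the download parameters in the table above combine in a single request, here is a minimal sketch that builds the URL for one variable. The base URL, the helper name `variable_csv_url`, and the dataset and variable ids used in the call are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Placeholder base URL (an assumption); replace it with your Datamart deployment.
API_URL = "https://example.org/datamart-api"


def variable_csv_url(dataset_id, variable_id, **options):
    """Build the GET URL that downloads one variable as a canonical CSV.

    Hypothetical helper: `options` may hold any documented parameter such as
    include, exclude, country, admin1_id or in_country. List values are
    joined with "," as in the documented examples.
    """
    params = {"variable": variable_id}
    for key, value in options.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        params[key] = str(value)
    # urlencode turns spaces into "+" (Oromia+Region); safe="," keeps commas.
    query = urlencode(params, safe=",")
    return f"{API_URL}/datasets/{dataset_id}/variables?{query}"


# "my_dataset" and "my_variable" are made-up ids used only for illustration.
print(variable_csv_url(
    "my_dataset",
    "my_variable",
    include=["country_id", "admin1_id"],
    admin1="Oromia Region",
))
```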

Additional considerations:

All the region parameters (e.g., `country`, `country_id`, `admin1`) can be used at the same time. Datamart interprets multiple region parameters as OR constraints: a row is returned if it satisfies at least one of them.

The Datamart uses place names based on Wikidata place name labels in English. Also, a place can be identified using its Wikidata qnode id. The mapping between place name and its identifier, as well as its administrative hierarchy, can be found in this file.
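
The OR interpretation of region parameters can be illustrated locally. The sketch below is not the server implementation; it assumes each row is reduced to the name of its main subject, and the helper name `keep_row` is hypothetical.

```python
def keep_row(main_subject, region_filters):
    """Return True if the main subject matches at least one region filter.

    `region_filters` maps a parameter name (e.g. "country", "admin1") to the
    list of place names given for it. Filters are combined with OR.
    """
    return any(main_subject in names for names in region_filters.values())


# Equivalent in spirit to: ?country=Sudan&admin1=Oromia+Region
filters = {"country": ["Sudan"], "admin1": ["Oromia Region"]}

# Main subjects of four illustrative rows.
subjects = ["Ethiopia", "Oromia Region", "Sudan", "Arsi Zone"]

selected = [s for s in subjects if keep_row(s, filters)]
print(selected)
```

Rows whose main subject is Sudan are kept because of the `country` filter, and rows whose main subject is Oromia Region are kept because of the `admin1` filter; a row does not need to satisfy both.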

Example:

GET [API_URL]/[dataset_id]/variable/[variable_id]: Get a CSV table of crop production.
GET [API_URL]/[dataset_id]/variable/[variable_id]/area?include=admin1_id: Get a CSV table of land area used for crop production, and include the admin1_id column in the table.

Aggregation of Data Content API

Path	Method	Description	Parameters
/datasets/dataset_id/variable/variable_id	GET	Returns an aggregated dataset from dataset `dataset_id` and variable `variable_id` in canonical data format.	`group-by`: specifies the column to use for aggregation `operator`: specifies the function to use for aggregation

Example:

GET [API_URL]/datasets/[dataset_id]/variables/[variable_id]?group-by=admin1_id&operator=sum: Get food production aggregated at the admin1 region level.
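
To make the effect of `group-by=admin1_id&operator=sum` concrete, the following sketch performs the same kind of aggregation locally on a few made-up rows. It is an illustration only, not the server logic: the `value` column name, the sample numbers, and the identifier `Q0000` are assumptions, and `aggregate_sum` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up rows. "admin1_id" is a documented column; "value" is an assumed name
# for the measurement column, and "Q0000" is a placeholder identifier.
rows = [
    {"admin1_id": "Q202107", "value": 10.0},
    {"admin1_id": "Q202107", "value": 5.5},
    {"admin1_id": "Q0000", "value": 2.0},
]


def aggregate_sum(rows, group_by, value_column="value"):
    """Sum `value_column` over all rows that share the same `group_by` value."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[value_column]
    return dict(totals)


print(aggregate_sum(rows, "admin1_id"))
```

Each distinct `admin1_id` yields one output entry whose value is the sum of the matching rows, which is the shape of result the aggregated download is expected to have for the `sum` operator.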