Datamart Dataset APIs
- API Version: 1.0.0
- Release date: Stable release
- Uses Dataset Metadata version schema: 1.0.0
- Uses Dataset version schema: 0.0.3
- Authors: Pedro Szekely, Ke-Thia Yao and Daniel Garijo
Datamart exposes two main APIs: a Dataset metadata API, where developers may retrieve metadata about datasets and variables; and a Dataset content API, where developers may download datasets and their variable time series.
Info
The metadata API follows the Dataset schema in https://datamart-upload.readthedocs.io/en/latest/. The content API follows the schema in https://datamart-upload.readthedocs.io/en/latest/download/
An implementation of the API is available at: https://datamart:datamart-api-789@dsbox02.isi.edu/datamart-api. We illustrate how to use it in a Jupyter notebook.
Metadata API
The metadata API supports the operations listed below:
Path | Method | Description | Parameters |
---|---|---|---|
/metadata/datasets | GET | Returns all datasets (list of Dataset). | Datasets can be filtered with the following parameters: name: name of the dataset. Example: &name=fbiData2009; geo: spatial location. Example: &geo=33.946799,-118.4307395,15z; intersects: intersection of the dataset location with a bounding box in the format [lonmin,lonmax,latmin,latmax]. Example: &intersects=84.7142,-76.7142,14.9457,22.945; keyword: a relevant keyword (or a comma-separated list of keywords) that points to relevant variables, subjects or location of the dataset. Example: &keyword=maize,ethiopia |
/metadata/datasets | POST | Creates a new Dataset record. Returns: Status code 201 (created) if successful, along with the dataset id. | NOTE: If the POST method has already been executed against the Datamart server, the server will respond with an error message. |
/metadata/datasets/dataset_id | PUT | Replaces the entry of the dataset identified by dataset_id with the JSON received in the request. Returns: Status code 200 if successful. | None |
/metadata/datasets/dataset_id | GET | Returns the metadata of the Dataset identified by dataset_id. | None |
/metadata/datasets/dataset_id/variables | GET | Returns all Variables in the dataset identified by dataset_id (list of Variable). | None |
/metadata/datasets/dataset_id/variables | POST | Creates a new Variable in the dataset identified by dataset_id. Returns 201 if successful. | None |
/metadata/datasets/dataset_id/variables/variable_id | GET | Returns the Variable variable_id in the dataset identified by dataset_id. | None |
/metadata/variables | GET | Returns all existing variable metadata. | Variables can be filtered with the following parameters: ids: variable ids to be returned (could be more than one). Example: &ids=H123,H124; name: name of the variable. Example: &name=population; geo: spatial location. Example: &geo=33.946799,-118.4307395,15z; intersects: intersection of the variable location with a bounding box in the format [lonmin,lonmax,latmin,latmax]. Example: &intersects=84.7142,-76.7142,14.9457,22.945; keyword: a relevant keyword (or a comma-separated list of keywords) that points to relevant aspects of the variable. Example: &keyword=production,ethiopia |
When a request includes a filter (e.g., by keyword), the response table also has a rank column with a score for each result; higher scores indicate better matches for the given request.
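The filters above are plain query parameters, so a metadata query can be assembled with a few lines of code. The following is a minimal sketch using only the Python standard library. It assumes a placeholder base URL (`API_URL`), and the helper name `metadata_datasets_url` is hypothetical, not part of Datamart. It only builds the URL; sending the request and handling authentication are left to whichever HTTP client you use.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption): substitute the address of your Datamart deployment.
API_URL = "https://example.org/datamart-api"


def metadata_datasets_url(base_url, name=None, geo=None, intersects=None, keyword=None):
    """Build a GET URL for /metadata/datasets with optional filters.

    Hypothetical helper for illustration. Parameter names follow the table above;
    `intersects` is a sequence of four numbers and `keyword` a sequence of strings.
    """
    params = {}
    if name is not None:
        params["name"] = name
    if geo is not None:
        params["geo"] = geo
    if intersects is not None:
        params["intersects"] = ",".join(str(v) for v in intersects)
    if keyword is not None:
        params["keyword"] = ",".join(keyword)
    url = f"{base_url}/metadata/datasets"
    # safe="," keeps comma-separated lists literal instead of encoding them as %2C.
    return f"{url}?{urlencode(params, safe=',')}" if params else url


print(metadata_datasets_url(API_URL, keyword=["maize", "ethiopia"]))
```

The same pattern should carry over to /metadata/variables by changing the path and adding the ids parameter.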
Data Content API
Path | Method | Description | Parameters |
---|---|---|---|
/datasets/dataset_id | GET | Returns the raw dataset identified by dataset_id in its original format. Raw data could be in any format, such as CSV, TSV, PDF, images, zip, etc. | None |
/datasets/dataset_id/variables | GET | Returns a CSV with the variables included in the dataset identified by dataset_id. The results follow the canonical data format and do not include qualifiers. | limit: by default, the API returns data for 20 variables only. The limit can be raised by setting limit in the URL. Example: ?limit=50 |
/datasets/dataset_id/variables?variable=variable_id | GET | Returns a CSV in canonical data format for the specified dataset (dataset_id) and variable (variable_id). | include: additional columns to download. Example: &include=country_id,admin1_id; exclude: columns to exclude from the download. Example: &exclude=coordinate; country: download rows where the main subject is one of the specified countries. Example: &country=Ethiopia,Sudan; country_id: download rows where the main subject is one of the specified country identifiers. Example: &country_id=Q115,Q1049; admin1: download rows where the main subject is one of the specified first-level administrative regions. Example: &admin1=Oromia+Region; admin1_id: download rows where the main subject is one of the specified first-level administrative region identifiers. Example: &admin1_id=Q202107; admin2: download rows where the main subject is one of the specified second-level administrative regions. Example: &admin2=Arsi+Zone; admin2_id: download rows where the main subject is one of the specified second-level administrative region identifiers. Example: &admin2_id=Q646859; admin3: download rows where the main subject is one of the specified third-level administrative regions. Example: &admin3=Amigna,Digeluna+Tijo; admin3_id: download rows where the main subject is one of the specified third-level administrative region identifiers. Example: &admin3_id=Q2843318,Q5275598; in_country: download rows where the main subject is a first-level administrative region of the specified countries. Example: &in_country=Ethiopia; in_country_id: download rows where the main subject is a first-level administrative region of the specified country identifiers. Example: &in_country_id=Q115; in_admin1: download rows where the main subject is a second-level administrative region of the specified first-level administrative regions. Example: &in_admin1=Oromia+Region; in_admin1_id: download rows where the main subject is a second-level administrative region of the specified first-level administrative region identifiers. Example: &in_admin1_id=Q202107; in_admin2: download rows where the main subject is a third-level administrative region of the specified second-level administrative regions. Example: &in_admin2=Arsi+Zone; in_admin2_id: download rows where the main subject is a third-level administrative region of the specified second-level administrative region identifiers. Example: &in_admin2_id=Q646859 |
/datasets/dataset_id/variables/variable_id | PUT | Uploads data to a variable of a dataset. | The variable must already exist in the dataset (i.e., it has to be created by POST to /metadata/datasets/{dataset_id}/variables). DEPRECATED |
/datasets/dataset_id/variables/variable_id | DELETE | Deletes the variable from a target dataset. | None |
/datasets/dataset_id/annotated | PUT, POST | Uploads data to one or more variables in the target dataset |
Additional considerations:
All the region parameters (e.g., country, country_id, admin1) can be used at the same time. Datamart interprets multiple region parameters as OR constraints.
The Datamart uses place names based on Wikidata place name labels in English. Also, a place can be identified using its Wikidata qnode id. The mapping between place name and its identifier, as well as its administrative hierarchy, can be found in this file.
Examples:
- GET [API_URL]/[dataset_id]/variable/[variable_id] : Get a CSV table of crop productions.
- GET [API_URL]/[dataset_id]/variable/[variable_id]/area&include=admin1_id : Get a CSV table of land area used for crop productions, and include the admin1_id column in the table.
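To make the download parameters and the OR semantics concrete, here is a minimal Python sketch. It assumes the path form from the table row /datasets/dataset_id/variables?variable=variable_id, a placeholder base URL, and placeholder dataset and variable ids. Both helper names are hypothetical. The second helper is only a local illustration of how OR constraints combine; it assumes rows are dictionaries whose keys match the region parameter names, and it is not Datamart's server-side implementation.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption): substitute the address of your Datamart deployment.
API_URL = "https://example.org/datamart-api"


def variable_download_url(base_url, dataset_id, variable_id, **filters):
    """Build a GET URL for downloading one variable in canonical CSV format.

    Hypothetical helper. Each keyword argument is a parameter from the table
    above (include, exclude, country, admin1_id, ...) given as a list of strings.
    """
    params = {"variable": variable_id}
    for key, values in filters.items():
        params[key] = ",".join(values)
    path = f"{base_url}/datasets/{dataset_id}/variables"
    # urlencode turns spaces into "+", matching examples such as admin1=Oromia+Region;
    # safe="," keeps the list separator literal.
    return f"{path}?{urlencode(params, safe=',')}"


def matches_any_region_filter(row, region_filters):
    """Illustrate OR semantics: a row is kept if ANY region filter matches it.

    Assumes `row` maps column names (e.g. "country", "admin1") to values.
    """
    return any(row.get(column) in allowed for column, allowed in region_filters.items())


url = variable_download_url(
    API_URL, "my_dataset", "my_variable",  # placeholder ids
    include=["country_id", "admin1_id"],
    admin1=["Oromia Region"],
)
print(url)

# A row located in Ethiopia / Oromia Region passes because the admin1 filter
# matches, even though the country filter does not.
row = {"country": "Ethiopia", "admin1": "Oromia Region"}
print(matches_any_region_filter(row, {"country": ["Sudan"], "admin1": ["Oromia Region"]}))
```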
Aggregation of Data Content API
Path | Method | Description | Parameters |
---|---|---|---|
/datasets/dataset_id/variable/variable_id | GET | Returns an aggregated dataset from dataset dataset_id and variable variable_id in canonical data format. | group-by: specifies the column to use for aggregation; operator: specifies the function to use for aggregation |
Example:
- GET [API_URL]/datasets/[dataset_id]/variables/[variable_id]?group-by=admin1_id&operator=sum : Get food production aggregated at the admin1 region level.
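As a rough illustration of what such a request asks for, the Python sketch below builds the aggregation URL in the form used by the example above and then computes the same kind of sum locally on a tiny CSV. The helper names, the base URL, the dataset and variable ids, the `value` column name and the sample rows are all assumptions made up for this sketch; the actual columns of the canonical data format may differ.

```python
import csv
import io
from collections import defaultdict
from urllib.parse import urlencode


def aggregation_url(base_url, dataset_id, variable_id, group_by, operator):
    """Build a GET URL with the group-by and operator parameters (hypothetical helper)."""
    params = {"group-by": group_by, "operator": operator}
    return f"{base_url}/datasets/{dataset_id}/variables/{variable_id}?{urlencode(params)}"


def sum_by_column(csv_text, group_column, value_column="value"):
    """Local illustration of a sum aggregation: total `value_column` per group.

    The `value` column name is an assumption, not taken from the Datamart schema.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_column]] += float(row[value_column])
    return dict(totals)


# Made-up sample rows, for illustration only.
sample = "admin1_id,value\nQ202107,10.5\nQ202107,4.5\nQ_OTHER,3.0\n"

print(aggregation_url("https://example.org/datamart-api", "my_dataset", "my_variable", "admin1_id", "sum"))
print(sum_by_column(sample, "admin1_id"))
```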