Introduction

Welcome to the Tree Schema API!

The Tree Schema API gives you programmatic access to just about every resource within Tree Schema. It is designed to help you keep your data catalog up to date by integrating Tree Schema directly into your ETL jobs, model pipelines and analytical workflows.

We have language bindings depicted in Shell and Python (with more on the way!), but all of the interfaces are built around REST so you can interact with Tree Schema from any language. You can view code examples in the dark area to the right, and you can switch the programming language of the examples with the tabs in the top right.

All API requests are made to the following host:

https://api.treeschema.com

Make sure to properly authenticate!

API Overview

Authentication

Use base64 encoding to create the authentication string for your account. You will need to concatenate your email, a colon and your secret key before base64 encoding the full string. Add the Authorization key to your headers with the Basic prefix.

SECRET_KEY=your_secret_key
TREE_SCHEMA_EMAIL=your_email
# -A keeps the base64 output on a single line, even for long credentials
ENCODED_SECRET=$(echo -n "$TREE_SCHEMA_EMAIL:$SECRET_KEY" | openssl base64 -A)

curl -H "Authorization: Basic $ENCODED_SECRET" \
"https://api.treeschema.com/catalog/search?term=dev"

import base64
import requests as r

creds = (your_email + ':' + your_secret_key).encode('utf-8')
encoded_creds = base64.b64encode(creds).decode('utf-8')

headers = {
    'Authorization': 'Basic ' + encoded_creds
}

resp = r.get(..., headers=headers)

Authorization is done using a combination of the email used for your Tree Schema account and your user secret key. Your organization owner must first enable programmatic access for your organization; once that is done, you can access your personal secret key from your user profile.

Tree Schema expects your encoded credentials to be included in all API requests to the server, in an Authorization header that looks like the following:

Authorization: Basic your_encoded_secret

You can view detailed instructions on how to generate your API keys in our help and documentation.
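
To keep your secret key out of source code, the credentials can be read from environment variables. The following is a minimal sketch, assuming the variable names TREE_SCHEMA_EMAIL and SECRET_KEY from the shell example; the helper name auth_header_from_env is hypothetical and is not part of any Tree Schema library.

```python
import base64
import os


def auth_header_from_env(environ=os.environ):
    """Build the Basic Authorization header from environment variables.

    Assumes TREE_SCHEMA_EMAIL and SECRET_KEY are set, mirroring the
    variable names used in the shell example.
    """
    email = environ['TREE_SCHEMA_EMAIL']
    secret_key = environ['SECRET_KEY']
    # Concatenate email, a colon and the secret key, then base64 encode
    creds = (email + ':' + secret_key).encode('utf-8')
    encoded_creds = base64.b64encode(creds).decode('utf-8')
    return {'Authorization': 'Basic ' + encoded_creds}
```

The returned dictionary can be passed as the headers argument in the Python examples.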

Pagination

An example of a meta response object with a next page

{
  "meta": {
    "current_page": 2,
    "next_page": 3,
    "total_cnt": 123
  },
  ...
}

An example of a meta response object without a next page

{
  "meta": {
    "current_page": 1,
    "next_page": null,
    "total_cnt": 5
  },
  ...
}

When retrieving a list of objects with a GET request, the results are paginated by Tree Schema.

All paginated responses return up to 1000 results per page.

Meta Response Object

Field Data Type Description
current_page integer The number for the current page
next_page integer The number for the next page, or null if there is no next page
total_cnt integer The total count of objects returned for the given API

Meta information is returned for all queries that contain pagination. The meta object will respond with the page number for the next page and the total count of objects.
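
To read every object from a paginated endpoint, you can keep requesting pages until next_page is null. The following is a minimal sketch of that loop; iter_all_items and the fetch_page callable are hypothetical names, and fetch_page is assumed to take a page number and return the parsed JSON response.

```python
def iter_all_items(fetch_page, items_key):
    """Yield every object from a paginated endpoint.

    fetch_page is a caller-supplied function (hypothetical) that takes a
    page number and returns the parsed JSON response as a dict.
    """
    page = 1
    while page is not None:
        body = fetch_page(page)
        for item in body[items_key]:
            yield item
        # next_page is null (None in Python) on the last page
        page = body['meta']['next_page']
```

With the requests library, fetch_page could wrap a GET call that passes the page number through the page query parameter and returns resp.json().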

Additional Headers

HTTP headers:

{ "Content-Type": "application/json" }

Every POST, PUT and DELETE HTTP request sent to the Tree Schema Public API must set the Content-Type header to application/json.
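
One way to apply this rule consistently is to build the headers from the HTTP method. This is only a sketch; build_headers and WRITE_METHODS are hypothetical names chosen for illustration.

```python
WRITE_METHODS = {'POST', 'PUT', 'DELETE'}


def build_headers(encoded_secret, method):
    """Return the headers for a request with the given HTTP method."""
    headers = {'Authorization': 'Basic ' + encoded_secret}
    # POST, PUT and DELETE requests must declare a JSON content type
    if method.upper() in WRITE_METHODS:
        headers['Content-Type'] = 'application/json'
    return headers
```

When you use the requests library and pass the body with the json argument, as the Python examples do, requests sets this header for you.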

Data Stores

Data stores are containers for your data. They can be databases, file stores, dashboard tools and more, and they are where your data physically (or virtually) resides. You can create and retrieve data stores.

Data Store Object

The data store object

{
  "data_store_id": 18,
  "name": "Kafka Prod Cluster",
  "type": "kafka",
  "other_type": null,
  "created_ts": "2020-09-23 18:16:16",
  "updated_ts": "2020-09-23 18:16:16",
  "description_markup": "<p>This is the Kafka cluster.</p>",
  "description_raw": "This is the Kafka cluster.",
  "steward": {
    "user_id": 1,
    "name": "Grant",
    "email": "grant@treeschema.com"
  },
  "tech_poc": {
    "user_id": 2,
    "name": "Asher",
    "email": "asher@treeschema.com"
  },
  "details": {
    "bootstrap_servers": "1.3.5.7:22"
  }
}

The Data Store object is returned when you GET a single or multiple data store(s). It is also returned when you create a data store. An example of the data store object can be seen to the right.

Data Store Object Fields

Field Data Type Description
data_store_id integer The ID that uniquely identifies the data store. The same ID can be found in the Tree Schema GUI, where the URL for the data store contains the data store ID
name string The name of the data store
type string The type of the data store
other_type string The more detailed type, if provided
created_ts timestamp The timestamp that the data store was created
updated_ts timestamp The timestamp that the data store was updated
description_markup string An HTML string that represents the full markup description
description_raw string The data store description that has had all markup removed
steward User Object The data steward assigned to the data store
tech_poc User Object The technical point of contact assigned to the data store
details object An object that can contain arbitrary key/value pairs for the data store. Details include information such as host and port if the data store is connected to a database, and any arbitrary key/value pairs added by users are returned as well.
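
Once a data store object has been parsed from JSON, its fields can be read like any dictionary. The sketch below assumes the timestamp format seen in the example values (YYYY-MM-DD HH:MM:SS); the time zone is not stated here, so the result is a naive datetime. The helper names are hypothetical.

```python
from datetime import datetime

# Format inferred from the example values, e.g. "2020-09-23 18:16:16"
TS_FORMAT = '%Y-%m-%d %H:%M:%S'


def parse_ts(value):
    """Parse a created_ts or updated_ts string into a naive datetime."""
    return datetime.strptime(value, TS_FORMAT)


def contact_emails(data_store):
    """Return the steward and technical point of contact emails."""
    return {
        'steward': data_store['steward']['email'],
        'tech_poc': data_store['tech_poc']['email'],
    }
```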

Get All Data Stores

import requests as r
BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores'

headers = {'Authorization': 'Basic your_encoded_secret'}
resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'
curl -H "Authorization: Basic $ENCODED_SECRET" \
"$BASE_URL/data-stores"

Retrieve all data stores in your organization.

Returns the object:

{
  "meta": {
    "current_page": 1,
    "next_page": null,
    "total_cnt": 5
  },
  "data_stores": [
    {
      "data_store_id": 18,
      "name": "Kafka Prod Cluster",
      "type": "kafka",
      "other_type": null,
      "created_ts": "2020-09-23 18:16:16",
      "updated_ts": "2020-09-23 18:16:16",
      "description_markup": "<p>This is the Kafka cluster.</p>",
      "description_raw": "This is the Kafka cluster.",
      "steward": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      },
      "tech_poc": {
        "user_id": 2,
        "name": "Asher",
        "email": "asher@treeschema.com"
      },
      "details": {
        "bootstrap_servers": "1.3.5.7:22"
      }
    }
  ]
}

HTTPs Request

GET /data-stores

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through data stores
name null The name of the data store

Path Parameters

There are no path parameters for this endpoint.

Body

There is no body for this endpoint.

Response

Field Data Type Description
meta Meta object A meta object for pagination
data_stores list[Data Store Object] A list of data store objects

Response Codes

Value Description
200 Retrieved all data stores
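
The query parameters can be appended to the URL with the standard library. This is a minimal sketch; data_stores_url is a hypothetical helper, and whether the name parameter matches exactly or partially is not specified here.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.treeschema.com/catalog'


def data_stores_url(page=1, name=None):
    """Build the list URL with the page and optional name parameters."""
    params = {'page': page}
    if name is not None:
        params['name'] = name
    # urlencode escapes spaces and other reserved characters
    return BASE_URL + '/data-stores?' + urlencode(params)
```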

Get A Data Store

import requests as r
BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1'

headers = {'Authorization': 'Basic your_encoded_secret'}
resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'
curl -H "Authorization: Basic $ENCODED_SECRET" \
"$BASE_URL/data-stores/1"

Retrieve a specific data store from your organization.

Returns the object:

{
  "data_store": {
    "data_store_id": 1,
    "name": "Oracle DB",
    "type": "oracle",
    "other_type": "",
    "created_ts": "2020-08-15 17:15:24",
    "updated_ts": "2020-08-15 17:15:24",
    "description_markup": null,
    "description_raw": null,
    "steward": {
      "user_id": 2,
      "name": "Asher",
      "email": "asher@treeschema.com"
    },
    "tech_poc": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    },
    "details": {
      "host": "oracle.host",
      "port": 1521,
      "servicename": "dbschema"
    }
  }
}

HTTPs Request

GET /data-stores/{data_store_id}

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store to retrieve.

Body

There is no body for this endpoint.

Response

Field Data Type Description
data_store Data Store Object A data store object

Response Codes

Value Description
200 Successfully retrieved data store
404 The data store ID requested could not be found

Create A Data Store

To create the data store

import requests as r
BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores'

new_ds_data = {
    'name': "My API Data Store",
    'type': 'postgres',
    'tech_poc': 2,
    'description': 'This data store was created via an API'
}
resp = r.post(url, json=new_ds_data, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -X POST -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"name": "My API Data Store - From Shell", "type": "other", "other_type": "some other value", "tech_poc": 2, "description": "This data store was created via an API"}' \
$BASE_URL/data-stores

Create a new data store. If a data store with the same name already exists, the existing data store is returned.

Returns the object:

{
  "data_store": {
    "data_store_id": 20,
    "name": "My API Data Store",
    "type": "postgres",
    "other_type": null,
    "created_ts": "2020-09-29 14:51:14",
    "updated_ts": "2020-09-29 14:51:14",
    "description_markup": "<p>This data store was created via an API</p>",
    "description_raw": "This data store was created via an API",
    "steward": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    },
    "tech_poc": {
      "user_id": 2,
      "name": "Asher",
      "email": "asher@treeschema.com"
    },
    "details": {}
  }
}

HTTPs Request

POST /data-stores

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

There are no path parameters for this endpoint.

Body

Field Required Description
name Yes The name of the data store
type Yes The type of data store, must be one of: dynamodb, kafka, mongodb, mysql, oracle, other, postgres, redis, redshift or s3
other_type No A more descriptive type of data store that can augment the field type if the value other is chosen
description No The description to give the data store
tech_poc No The ID for the user to assign as the technical point of contact for this data store. If no value is provided, the user executing the API will be used
steward No The ID for the user to assign as the steward for this data store. If no value is provided, the user executing the API will be used

Response

Field Data Type Description
data_store Data Store Object A data store object

Response Codes

Value Description
200 A data store with the same name already exists
201 Data Store Created
400 A malformed request was made, descriptions of the error will be provided in the body
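
To catch a 400 response before it happens, the request body can be checked against the rules in the Body table. The sketch below covers only the required fields and the list of allowed types; validate_data_store_payload is a hypothetical helper and the server may apply further checks.

```python
DATA_STORE_TYPES = {
    'dynamodb', 'kafka', 'mongodb', 'mysql', 'oracle',
    'other', 'postgres', 'redis', 'redshift', 's3',
}


def validate_data_store_payload(payload):
    """Return a list of problems found in a create-data-store body."""
    errors = []
    # name and type are the only required fields
    for field in ('name', 'type'):
        if not payload.get(field):
            errors.append('missing required field: ' + field)
    ds_type = payload.get('type')
    if ds_type and ds_type not in DATA_STORE_TYPES:
        errors.append('unsupported type: ' + ds_type)
    return errors
```

An empty list means the body passed these local checks and can be sent.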

Tag A Data Store

To tag a data store

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/20/tags'

tags = {'tags': ['api tag', 'schema tag', 'pii', 'mktg']}

resp = r.post(url, json=tags, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -X POST -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"tags": ["api tag", "schema tag", "pii", "mktg"]}' \
"$BASE_URL/data-stores/19/tags"

Add one or more tags to a data store.

Returns the object:

{
  "tags": [
    "api tag",
    "schema tag",
    "pii",
    "mktg"
  ],
  "tag_statuses": [
    "added",
    "added",
    "added",
    "added"
  ]
}

HTTPs Request

POST /data-stores/{data_store_id}/tags

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store to add the tag(s) to

Body

Field Data Type Description
tags List[string] A list of string values to add as tags, each tag can be up to 32 characters

Response

Field Data Type Description
tags List[string] The list of tags that were processed
tag_statuses List[string] The status for each tag processed, statuses match the same index position as their corresponding tag. Values include added and exists.

Response Codes

Value Description
200 All of the tags requested already existed for the data store
201 At least one of the tags requested was added
400 A malformed request was made, descriptions of the error will be provided in the body
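
Because tag_statuses lines up with tags by index position, the two lists can be zipped together to see what happened to each tag. The following sketch also checks the 32 character limit before sending; both helper names are hypothetical.

```python
MAX_TAG_LENGTH = 32


def too_long_tags(tags):
    """Return the tags that exceed the 32 character limit."""
    return [t for t in tags if len(t) > MAX_TAG_LENGTH]


def status_by_tag(response_body):
    """Pair each tag with its status using the shared index position."""
    return dict(zip(response_body['tags'], response_body['tag_statuses']))
```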

Data Schemas

Schemas are the heart and soul of a Data Catalog. They describe the shape, structure and format of the data. You may typically have data schemas represented as a table, a JSON or Parquet file, or a CSV but a Data Schema is really just a reference to a structured set of fields.

All schemas reside within a data store. Therefore, in order to interact with a data schema you must know the data store that it belongs to.

Data Schema Object

The data schema object

{
  "data_schema_id": 16,
  "name": "My API Schema",
  "type": "table",
  "schema_loc": null,
  "created_ts": "2020-09-23 14:56:02",
  "updated_ts": "2020-09-23 14:56:02",
  "description_markup": null,
  "description_raw": null,
  "steward": {
    "user_id": 1,
    "name": "Grant",
    "email": "grant@treeschema.com"
  },
  "tech_poc": {
    "user_id": 1,
    "name": "Grant",
    "email": "grant@treeschema.com"
  }
}

The Data Schema object is returned when you GET a single or multiple data schema(s) from a data store. It is also returned when you create a new data schema. An example of the data schema object can be seen to the right.

Data Schema Object Fields

Field Data Type Description
data_schema_id integer The ID that uniquely identifies the data schema. The same ID can be found in the Tree Schema GUI, where the URL for the data schema contains the data schema ID
name string The name of the data schema
type string The type of the data schema
schema_loc string The location where the schema resides. This is used primarily for object data stores, such as s3, where the schema location represents the path to the directory where the schema exists. For most schemas, the schema_loc will be the same as the name.
created_ts timestamp The timestamp that the data schema was created
updated_ts timestamp The timestamp that the data schema was updated
description_markup string An HTML string that represents the full markup description
description_raw string The data schema description that has had all markup removed
steward User Object The data steward assigned to the data schema
tech_poc User Object The technical point of contact assigned to the data schema

Get All Schemas from Data Store

To get all schemas for a data store

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas'

resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
"$BASE_URL/data-stores/1/schemas"

List all schemas for a data store.

Returns the object:

{
  "meta": {
    "current_page": 1,
    "next_page": null,
    "total_cnt": 4
  },
  "data_schemas": [
    {
      "data_schema_id": 16,
      "name": "public.session_info",
      "type": "table",
      "schema_loc": "public.session_info",
      "created_ts": "2020-09-23 14:56:02",
      "updated_ts": "2020-09-23 14:56:02",
      "description_markup": null,
      "description_raw": null,
      "steward": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      },
      "tech_poc": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      }
    },
    {
      "data_schema_id": 7,
      "name": "public.device_info",
      "type": "table",
      "schema_loc": "public.device_info",
      "created_ts": "2020-08-15 22:10:17",
      "updated_ts": "2020-08-15 22:10:17",
      "description_markup": null,
      "description_raw": null,
      "steward": {
        "user_id": 2,
        "name": "Asher",
        "email": "asher@treeschema.com"
      },
      "tech_poc": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      }
    }
  ]
}

HTTPs Request

GET /data-stores/{data_store_id}/schemas

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through data schemas
name null The name of the data schema

Path Parameters

Parameter Description
data_store_id The ID for the data store that you are listing schemas for

Body

There is no body for this endpoint.

Response

Field Data Type Description
meta Meta object A meta object for pagination
data_schemas list[Data Schema Object] A list of data schema objects

Response Codes

Value Description
200 Retrieved all data schemas for the data store
404 The data store ID requested could not be found
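
After listing the schemas, you may want to pick one out by name. The sketch below compares names without regard to case, mirroring the case insensitive name matching that applies when schemas are created; find_schema_by_name is a hypothetical helper that works on the data_schemas list from the response.

```python
def find_schema_by_name(data_schemas, name):
    """Return the first schema whose name matches, ignoring case."""
    wanted = name.lower()
    for schema in data_schemas:
        if schema['name'].lower() == wanted:
            return schema
    # No schema in this list has the requested name
    return None
```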

Get a Schema

To get a single schema from a data store

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1'

resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
"$BASE_URL/data-stores/1/schemas/1"

Get a single schema from a data store.

Returns the object:

{
  "data_schema": {
    "data_schema_id": 1,
    "name": "public.session_info",
    "type": "table",
    "schema_loc": "public.session_info",
    "created_ts": "2020-08-15 17:16:10",
    "updated_ts": "2020-08-15 17:16:10",
    "description_markup": null,
    "description_raw": null,
    "steward": {
      "user_id": 1,
      "name": "Asher",
      "email": "asher@treeschema.com"
    },
    "tech_poc": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    }
  }
}

HTTPs Request

GET /data-stores/{data_store_id}/schemas/{data_schema_id}

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema
data_schema_id The ID for the data schema that exists within the data store

Body

There is no body for this endpoint.

Response

Field Data Type Description
data_schema Data Schema Object A data schema object

Response Codes

Value Description
200 Retrieved the data schema from the data store
404 The data store ID requested could not be found or the schema requested does not exist within the data store

Create a Schema

To create a schema in a data store

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas'
new_schema = {
    'name': "My API Schema",
    'type': 'table',
    'description': 'This schema is created via API'
}

resp = r.post(url, json=new_schema, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -X POST -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"name": "My API Schema - Shell", "type": "table", "description": "This schema is created via API"}' \
$BASE_URL/data-stores/1/schemas

Create a data schema. Since a schema must reside within a data store, the data store that will contain the schema must be specified in the URL path. If a schema with the same name (case insensitive) already exists within the data store, the existing schema is returned and no updates are made.

Returns the object:

{
  "data_schema": {
    "data_schema_id": 501,
    "name": "My New API Schema",
    "type": "table",
    "schema_loc": "My New API Schema",
    "created_ts": "2020-09-29 16:07:16",
    "updated_ts": "2020-09-29 16:07:16",
    "description_markup": "<p>This schema is created via API</p>",
    "description_raw": "This schema is created via API",
    "steward": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    },
    "tech_poc": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    }
  }
}

HTTPs Request

POST /data-stores/{data_store_id}/schemas

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that will contain the schema being created

Body

Field Required Description
name Yes The name of the data schema
type Yes The type of data schema, must be one of: avro, csv, csv_other, json, parquet, other, table, or tsv
description No The description to give the schema
schema_loc No The location where the schema resides. This is used primarily for object data stores, such as s3, where the schema location represents the path to the directory where the schema exists. For most schemas, the schema_loc will be the same as the name. If a schema_loc is not provided, it is set to the value provided for the name
tech_poc No The ID for the user to assign as the technical point of contact for this data schema. If no value is provided, the user executing the API will be used
steward No The ID for the user to assign as the steward for this data schema. If no value is provided, the user executing the API will be used

Response

Field Data Type Description
data_schema Data Schema Object A data schema object

Response Codes

Value Description
409 A data schema with the same name already exists and was returned instead of creating a new object
201 Data Schema Created
400 A malformed request was made, descriptions of the error will be provided in the body
404 The data store ID requested could not be found
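
Since a 409 response still carries the existing schema in its body, it helps to branch on the status code rather than treat every 4xx as a failure. A minimal sketch, with hypothetical labels chosen for illustration:

```python
def schema_create_outcome(status_code):
    """Translate a create-schema status code into a short label."""
    outcomes = {
        201: 'created',
        409: 'already_exists',  # the existing schema is in the body
        400: 'malformed_request',
        404: 'data_store_not_found',
    }
    # Any code not documented for this endpoint is reported as unexpected
    return outcomes.get(status_code, 'unexpected')
```

Note that calling resp.raise_for_status() in the requests library raises an error for a 409, so check resp.status_code first if you want to use the returned schema.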

Delete Schemas

To delete multiple schemas from a data store

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas'
delete_schemas = {'schema_ids': [501, 502]}

resp = r.delete(url, json=delete_schemas, headers=headers)
resp.status_code  # no response body is returned, so there is no JSON to parse

BASE_URL='https://api.treeschema.com/catalog'

curl -X DELETE -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"schema_ids": [8, 9]}' \
$BASE_URL/data-stores/1/schemas

There is no response body for this request.

Deprecates data schemas that exist within a data store. To be deleted, each schema ID must exist within the data store specified in the path parameters. If only some of the schema IDs provided exist within that data store, only those schemas are deleted.

HTTPs Request

DELETE /data-stores/{data_store_id}/schemas

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schemas being deleted

Body

Field Data Type Description
schema_ids list[integer] A list of IDs that corresponds to the schemas to be deleted.

Response

There is no response body for this endpoint.

Response Codes

Value Description
200 The schemas provided were deleted from the data store
400 A malformed request was made, descriptions of the error will be provided in the body
404 The data store ID requested could not be found
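
Because IDs that do not belong to the data store are skipped and the response has no body, you may want to check the IDs against the schema list before deleting. The sketch below assumes you have already fetched the data_schemas list for the data store; split_by_presence is a hypothetical helper.

```python
def split_by_presence(schema_ids, data_schemas):
    """Split requested IDs into those present in the data store and those not.

    data_schemas is the data_schemas list from the list-schemas response.
    """
    existing = {s['data_schema_id'] for s in data_schemas}
    present = [i for i in schema_ids if i in existing]
    missing = [i for i in schema_ids if i not in existing]
    return present, missing
```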

Tag A Schema

To tag a data schema

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1/tags'
tags = {'tags': ['api tag', 'schema tag', 'pii', 'mktg']}

resp = r.post(url, json=tags, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -X POST -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"tags": ["api tag", "schema tag", "pii", "mktg"]}' \
$BASE_URL/data-stores/1/schemas/1/tags

Add one or more tags to a data schema.

Returns the object:

{
  "tags": [
    "api tag",
    "schema tag",
    "pii",
    "mktg"
  ],
  "tag_statuses": [
    "added",
    "added",
    "added",
    "added"
  ]
}

HTTPs Request

POST /data-stores/{data_store_id}/schemas/{data_schema_id}/tags

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema to add tags to
data_schema_id The ID for the data schema to have the tags added to

Body

Field Data Type Description
tags List[string] A list of string values to add as tags, each tag can be up to 32 characters

Response

Field Data Type Description
tags List[string] The list of tags that were processed
tag_statuses List[string] The status for each tag processed, statuses match the same index position as their corresponding tag

Response Codes

Value Description
200 All of the tags requested already existed for the data schema
201 At least one of the tags requested was added
400 A malformed request was made, descriptions of the error will be provided in the body

Data Fields

Data Fields are the most granular part of your catalog; they describe the format and data type of your underlying data. Whether your Fields are represented as columns in a table, keys in a JSON file, or Structs in a distributed Parquet data set, you can capture their meaning and definition with Tree Schema Fields.

All fields reside within a data schema. Therefore, in order to interact with a data field you must know the data schema and data store that it belongs to.

Data Fields Object

The data fields object

{
  "field_id": 1,
  "name": "FIRST_NAME",
  "parent_path": null,
  "full_path_name": "FIRST_NAME",
  "type": "scalar",
  "data_type": "string",
  "data_format": "VARCHAR2",
  "nullable": false,
  "created_ts": "2020-08-15 17:16:11",
  "updated_ts": "2020-08-15 17:16:11",
  "description_markup": null,
  "description_raw": null,
  "steward": {
    "user_id": 1,
    "name": "Grant",
    "email": "grant@treeschema.com"
  },
  "tech_poc": {
    "user_id": 2,
    "name": "Asher",
    "email": "asher@treeschema.com"
  }
}

The Data Fields object is returned when you GET a single or multiple data field(s) from a data schema. It is also returned when you create a new data field. An example of the data field object can be seen to the right.

Data Field Object Fields

Field Data Type Description
field_id integer The ID that uniquely identifies the field. The same ID can be found in the Tree Schema GUI
name string The name of the field, for example, this would be the column name if the field is from a table or CSV or it could be a struct name if the field is from a Parquet file
parent_path string The dot-notation path for the parent of this field. This is only provided for fields that are contained within other fields, e.g. for child_field in {"parent_field": {"child_field": 1}} the parent_path would be parent_field
full_path_name string This is a concatenation of the parent path and the name. If the parent path is null then this value is the same as the name
type string Valid values include scalar, object and list
data_type string A JSON compatible data type, values include array, boolean, bytes, null, number, object and string
data_format string A free-form field that describes the format of the data, this could be varchar(32), YYYY-MM-DD, float(16), etc.
nullable boolean Whether or not the field can be null
created_ts timestamp The timestamp that the field was created in Tree Schema
updated_ts timestamp The timestamp that the field was updated in Tree Schema
description_markup string An HTML string that represents the full markup description
description_raw string The field description that has had all markup removed
steward User Object The data steward assigned to the field
tech_poc User Object The technical point of contact assigned to the field
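
The relationship between name, parent_path and full_path_name can be sketched in a few lines. This assumes that field names themselves contain no dots, which is not stated here; the helper names are hypothetical.

```python
def split_full_path(full_path_name):
    """Split a dot-notation path into (parent_path, name)."""
    parent_path, sep, name = full_path_name.rpartition('.')
    # Top-level fields have no parent, which the API reports as null
    return (parent_path if sep else None), name


def join_full_path(parent_path, name):
    """Rebuild full_path_name from parent_path and name."""
    return name if parent_path is None else parent_path + '.' + name
```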

Get All Fields from Schema

To get all fields for a data schema

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1/fields'

resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/data-stores/1/schemas/1/fields

List all fields for a data schema.

Returns the object:

{
  "meta": {
    "current_page": 1,
    "next_page": null,
    "total_cnt": 3
  },
  "data_fields": [
    {
      "field_id": 1,
      "name": "FIRST_NAME",
      "parent_path": null,
      "full_path_name": "FIRST_NAME",
      "type": "scalar",
      "data_type": "string",
      "data_format": "VARCHAR2",
      "nullable": false,
      "created_ts": "2020-08-15 17:16:11",
      "updated_ts": "2020-08-15 17:16:11",
      "description_markup": null,
      "description_raw": null,
      "steward": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      },
      "tech_poc": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      }
    },
    {
      "field_id": 2,
      "name": "LAST_NAME",
      "parent_path": null,
      "full_path_name": "LAST_NAME",
      "type": "scalar",
      "data_type": "string",
      "data_format": "VARCHAR2",
      "nullable": false,
      "created_ts": "2020-08-15 17:16:11",
      "updated_ts": "2020-08-15 17:16:11",
      "description_markup": null,
      "description_raw": null,
      "steward": {
        "user_id": 2,
        "name": "Asher",
        "email": "asher@treeschema.com"
      },
      "tech_poc": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      }
    }
  ]
}

HTTPs Request

GET /data-stores/{data_store_id}/schemas/{data_schema_id}/fields

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through data fields
name null The name of the field

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema in the path
data_schema_id The ID for the data schema that contains the fields being requested

Body

There is no body for this endpoint.

Response

Field Data Type Description
meta Meta object A meta object for pagination
data_fields list[Data Field Object] A list of data field objects

Response Codes

Value Description
200 Retrieved all data fields for the data schema
404 The data store ID requested could not be found or the data schema does not exist within the data store

Get A Field

To get a single field from a data schema

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1/fields/1'

resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/data-stores/1/schemas/1/fields/1

Get a single field from a schema.

Returns the object:

{ 
  "data_field": {
    "field_id": 1,
    "name": "FIRST_NAME",
    "parent_path": null,
    "full_path_name": "FIRST_NAME",
    "type": "scalar",
    "data_type": "string",
    "data_format": "VARCHAR2",
    "nullable": false,
    "created_ts": "2020-08-15 17:16:11",
    "updated_ts": "2020-08-15 17:16:11",
    "description_markup": null,
    "description_raw": null,
    "steward": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    },
    "tech_poc": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    }
  }
}

HTTPs Request

GET /data-stores/{data_store_id}/schemas/{data_schema_id}/fields/{data_field_id}

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema in the path
data_schema_id The ID for the data schema that contains the fields being requested
data_field_id The ID for the data field being requested

Body

There is no body for this endpoint.

Response

Field Data Type Description
data_field Data Field Object A data field object

Response Codes

Value Description
200 Retrieved the data field for the schema
404 The data store ID requested could not be found or the data schema does not exist within the data store or the data field does not exist within the schema
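
Data stores, schemas and fields are addressed through nested paths, so a single helper can build the URL for any of them. This is a sketch only; catalog_url is a hypothetical name.

```python
BASE_URL = 'https://api.treeschema.com/catalog'


def catalog_url(data_store_id, data_schema_id=None, data_field_id=None):
    """Build the URL for a data store, schema or field."""
    url = BASE_URL + '/data-stores/' + str(data_store_id)
    if data_schema_id is not None:
        url += '/schemas/' + str(data_schema_id)
        # A field can only be addressed through its parent schema
        if data_field_id is not None:
            url += '/fields/' + str(data_field_id)
    return url
```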

Create A Field

To create a single field in a data schema

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1/fields'
new_field = {
    'name': "my_field.sub_field",
    'type': 'scalar',
    'data_type': 'number',
    'data_format': 'integer(16)'
}

resp = r.put(url, json=new_field, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -X PUT -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"name": "my_field.sub_field.from_shell", "type": "scalar", "data_type": "number", "data_format": "integer(16)"}' \
$BASE_URL/data-stores/1/schemas/1/fields

Create a single field in a schema. If a field with the same name (case insensitive) already exists within the schema then the existing field is returned and no updates are made.

Returns the object:

{
  "data_field": {
    "field_id": 5453,
    "name": "sub_field",
    "parent_path": "my_field",
    "full_path_name": "my_field.sub_field",
    "type": "scalar",
    "data_type": "number",
    "data_format": "integer(16)",
    "nullable": true,
    "created_ts": "2020-09-29 18:07:09",
    "updated_ts": "2020-09-29 18:07:09",
    "description_markup": null,
    "description_raw": null,
    "steward": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    },
    "tech_poc": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    }
  }
}

HTTPs Request

PUT /data-stores/{data_store_id}/schemas/{data_schema_id}/fields

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema in the path
data_schema_id The ID for the data schema that the field will be created in

Body

Field Required Description
name Yes The name of the field
type Yes The type of the field, valid values are scalar, list and object
data_type Yes The data type for the field, this is a representation of the field as a JSON compatible data type, must be one of array, boolean, bytes, null, number, object or string
data_format Yes A free-form field that describes the format of the data, this could be varchar(32), YYYY-MM-DD, float(16), etc.
nullable No Whether or not the field can be null, defaults to True
tech_poc No The ID for the user to assign as the technical point of contact for this data field, if no value is provided the user executing the API will be used
steward No The ID for the user to assign as the steward for this data field, if no value is provided the user executing the API will be used

Response

Field Data Type Description
data_field Data Field Object A data field object

Response Codes

Value Description
200 A data field with the same name already exists in the schema and was returned
201 The data field was created
400 A malformed request was made, descriptions of the error will be provided in the body
404 The data store ID requested could not be found or the data schema does not exist within the data store
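
Since `type` and `data_type` only accept the values listed in the Body table, you may want to check a payload before sending it, and then use the status code to tell a newly created field (201) from one that already existed (200). The sketch below assumes nothing beyond what the tables above state; `build_field_payload` and `was_created` are hypothetical helper names, and the API still performs its own validation.

```python
# Valid values as listed in the Body table for this endpoint.
VALID_TYPES = {'scalar', 'list', 'object'}
VALID_DATA_TYPES = {'array', 'boolean', 'bytes', 'null', 'number', 'object', 'string'}


def build_field_payload(name, type_, data_type, data_format, nullable=None):
    """Build the JSON body for creating a field, checking it client-side first.

    Hypothetical helper: it only catches obvious mistakes before the request.
    """
    if type_ not in VALID_TYPES:
        raise ValueError('type must be one of: ' + ', '.join(sorted(VALID_TYPES)))
    if data_type not in VALID_DATA_TYPES:
        raise ValueError('data_type must be one of: ' + ', '.join(sorted(VALID_DATA_TYPES)))
    payload = {
        'name': name,
        'type': type_,
        'data_type': data_type,
        'data_format': data_format,
    }
    # nullable is optional; leave it out so the server-side default applies.
    if nullable is not None:
        payload['nullable'] = nullable
    return payload


def was_created(status_code):
    """Return True for 201 (created) and False for 200 (field already existed)."""
    if status_code == 201:
        return True
    if status_code == 200:
        return False
    raise RuntimeError('Unexpected status code: {}'.format(status_code))


new_field = build_field_payload('my_field.sub_field', 'scalar', 'number', 'integer(16)')
# resp = r.put(url, json=new_field, headers=headers)
# created = was_created(resp.status_code)
```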

Update A Field

To update a single field in a data schema

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1/fields/1'
updates = {
    'description': "A new description",
    'type': 'list',
    'data_type': 'array',
    'data_format': 'YYYY-MM-DD',
    'nullable': False,
    'tech_poc': '1',
    'steward': 2
}

resp = r.post(url, json=updates, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -X POST -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"description": "A new description", "type": "list", "data_type": "array", "data_format": "YYYY-MM-DD", "nullable": false, "tech_poc": "1", "steward": 2}' \
$BASE_URL/data-stores/1/schemas/1/fields/1

Update a single field in a schema. You can update any value for a field except for the name.

Returns the updated object:

{
  "data_field": {
    "field_id": 5453,
    "name": "sub_field",
    "parent_path": "my_field",
    "full_path_name": "my_field.sub_field",
    "type": "scalar",
    "data_type": "number",
    "data_format": "integer(16)",
    "nullable": true,
    "created_ts": "2020-09-29 18:07:09",
    "updated_ts": "2020-09-29 18:07:09",
    "description_markup": null,
    "description_raw": null,
    "steward": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    },
    "tech_poc": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    }
  }
}

HTTPs Request

POST /data-stores/{data_store_id}/schemas/{data_schema_id}/fields/{data_field_id}

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema in the path
data_schema_id The ID for the data schema that contains the field being updated
data_field_id The ID for the data field being updated

Body

Field Required Description
type No The type of the field, valid values are scalar, list and object
data_type No The data type for the field, this is a representation of the field as a JSON compatible data type, must be one of array, boolean, bytes, null, number, object or string
data_format No A free-form field that describes the format of the data, this could be varchar(32), YYYY-MM-DD, float(16), etc.
description No The new description for the field, this will override any existing description
nullable No Whether or not the field can be null, defaults to True
tech_poc No The ID for the user to assign as the technical point of contact for this data field, if no value is provided the user executing the API will be used
steward No The ID for the user to assign as the steward for this data field, if no value is provided the user executing the API will be used

Response

Field Data Type Description
data_field Data Field Object A data field object

Response Codes

Value Description
200 The data field was updated successfully
400 A malformed request was made, descriptions of the error will be provided in the body
404 The data store ID requested could not be found or the data schema does not exist within the data store or the data field does not exist within the schema

Delete Fields

To delete fields from a schema

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1/fields'
delete_fields = {'field_ids': [5452, 5454]}

resp = r.delete(url, json=delete_fields, headers=headers)
resp.status_code  # there is no response body, so resp.json() would fail
BASE_URL='https://api.treeschema.com/catalog'

curl -X DELETE -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"field_ids": [5453]}' \
$BASE_URL/data-stores/1/schemas/504/fields

There is no response in the body for this request

Deprecates data fields that exist within a data schema. To be deprecated, each field ID must exist within the data schema specified in the path parameters. If multiple field IDs are provided and only some of them exist within the data schema, only those that exist within the data schema will be deleted.

HTTPs Request

DELETE /data-stores/{data_store_id}/schemas/{data_schema_id}/fields

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the field(s) being deleted
data_schema_id The ID for the data schema that contains the field(s) being deleted

Body

Field Data Type Description
field_ids list[integer] A list of IDs that corresponds to the fields to be deleted.

Response

There is no response body for this endpoint.

Response Codes

Value Description
200 The fields provided were deleted from the data store
400 A malformed request was made, descriptions of the error will be provided in the body
404 The data store ID requested could not be found or the data schema ID does not exist within the data store ID provided

Delete A Field

To delete a field from a schema

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1/fields/1'

resp = r.delete(url, headers=headers)
resp.status_code  # there is no response body, so resp.json() would fail
BASE_URL='https://api.treeschema.com/catalog'

curl -X DELETE -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/data-stores/1/schemas/1/fields/1

There is no response in the body for this request

Deprecates the data field specified in the path.

HTTPs Request

DELETE /data-stores/{data_store_id}/schemas/{data_schema_id}/fields/{data_field_id}

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the field being deleted
data_schema_id The ID for the data schema that contains the field being deleted
data_field_id The ID for the data field to be deleted

Body

There is no body for this endpoint.

Response

There is no response body for this endpoint.

Response Codes

Value Description
200 The field was deleted from the data store
404 The data store ID requested could not be found or the data schema ID does not exist within the data store ID provided or the data field does not exist within the schema

Tag A Field

To tag a field

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/2/fields/4/tags'
tags = {'tags': ['api tag', 'schema tag', 'pii', 'mktg2']}

resp = r.post(url, json=tags, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -X POST -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"tags": ["api tag", "schema tag", "pii", "mktg"]}' \
$BASE_URL/data-stores/1/schemas/2/fields/5/tags

Add one or more tags to a data field.

Returns the object:

{
  "tags": [
    "api tag",
    "schema tag",
    "pii",
    "mktg"
  ],
  "tag_statuses": [
    "added",
    "added",
    "added",
    "added"
  ]
}

HTTPs Request

POST /data-stores/{data_store_id}/schemas/{data_schema_id}/fields/{data_field_id}/tags

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema in the path
data_schema_id The ID for the data schema that contains the field to add the tags to
data_field_id The ID for the field to add the tags to

Body

Field Data Type Description
tags List[string] A list of string values to add as tags, each tag can be up to 32 characters

Response

Field Data Type Description
tags List[string] The list of tags that were processed
tag_statuses List[string] The status for each tag processed, statuses match the same index position as their corresponding tag

Response Codes

Value Description
200 All of the tags requested already existed for the field
201 At least one of the tags requested was added
400 A malformed request was made, descriptions of the error will be provided in the body
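
Because `tags` and `tag_statuses` are matched by index position, pairing them up makes the response easier to read. The following is a small sketch of that pairing together with a client-side check of the 32 character limit; `too_long` and `tags_by_status` are hypothetical helper names, and the response body is copied from the example above.

```python
MAX_TAG_LENGTH = 32  # limit stated in the Body table above


def too_long(tags):
    """Return the tags that exceed the documented length limit."""
    return [tag for tag in tags if len(tag) > MAX_TAG_LENGTH]


def tags_by_status(body):
    """Group tags by status, pairing each tag with the status at the same index."""
    grouped = {}
    for tag, status in zip(body['tags'], body['tag_statuses']):
        grouped.setdefault(status, []).append(tag)
    return grouped


# Response body copied from the example above.
body = {
    'tags': ['api tag', 'schema tag', 'pii', 'mktg'],
    'tag_statuses': ['added', 'added', 'added', 'added'],
}

rejected = too_long(body['tags'])
grouped = tags_by_status(body)
print(rejected)  # []
print(grouped)   # {'added': ['api tag', 'schema tag', 'pii', 'mktg']}
```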

Field Values

Field values are just that - values for a field. For example, if your field is status_code you may have the values 01, 02, 03, etc. and each of these values has a specific meaning. Field values allow you to capture both the value and the meaning of the value.

All field values reside within a data field, therefore, in order to interact with a field value you must know the data field, data schema, and data store that it belongs to.

Field Value Object

The field value object

{
  "field_value_id": 396,
  "field_value": "01",
  "description_markup": "<p>New customer</p>",
  "description_raw": "New customer",
  "created_ts": "2020-08-15 22:10:18",
  "updated_ts": "2020-08-15 22:10:18"
}

The Field Value object is returned when you GET a single or multiple field value(s) from a data field. It is also returned when you create a new field value. An example of the field value object can be seen to the right.

Field Value Object Fields

Field Data Type Description
field_value_id integer The ID used to uniquely represent the field value
field_value string The value
description_markup string An HTML string that represents the full markup description, this can be null if no description has been provided
description_raw string The field description that has had all markup removed, this can be null if no description has been provided
created_ts timestamp The timestamp that the field was created in Tree Schema
updated_ts timestamp The timestamp that the field was updated in Tree Schema

Get All Values for Field

To get all values for a data field

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1/fields/1/values'

resp = r.get(url, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/data-stores/1/schemas/1/fields/1/values

List all values for a data field.

Returns the object:

{
  "meta": {
    "current_page": 1,
    "next_page": null,
    "total_cnt": 4
  },
  "field_values": [
    {
      "field_value_id": 1,
      "field_value": "01",
      "description_markup": "<p>New customer</p>",
      "description_raw": "New customer",
      "created_ts": "2020-08-15 22:10:18",
      "updated_ts": "2020-08-15 22:10:18"
    },
    {
      "field_value_id": 2,
      "field_value": "02",
      "created_ts": "2020-08-15 22:10:18",
      "updated_ts": "2020-08-15 22:10:18",
      "description_markup": null,
      "description_raw": null
    }
  ]
}

HTTPs Request

GET /data-stores/{data_store_id}/schemas/{data_schema_id}/fields/{data_field_id}/values

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through field values
value null The value of a sample value to retrieve

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema in the path
data_schema_id The ID for the data schema that contains the fields being requested
data_field_id The ID for the data field that the values belong to

Body

There is no body for this endpoint.

Response

Field Data Type Description
meta Meta object A meta object for pagination
field_values list[Field Value Object] A list of field value objects

Response Codes

Value Description
200 Retrieved all field values for the field
404 The data store ID requested could not be found or the data schema does not exist within the data store or the field does not exist within the schema
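
When a field has more values than fit on one page, the `meta` object can drive a simple loop. The sketch below assumes that `next_page` holds the number of the next page and is null on the last page; the example response only shows the null case, so treat that as an assumption. `iter_field_values` is a hypothetical helper, and `fetch_page` is any callable that returns the parsed JSON body for a page number.

```python
def iter_field_values(fetch_page):
    """Yield every field value, following meta.next_page until it is null.

    With requests, fetch_page could be:
        lambda page: r.get(url, params={'page': page}, headers=headers).json()
    """
    page = 1
    while page is not None:
        body = fetch_page(page)
        yield from body['field_values']
        # Assumption: next_page is the next page number, or None when done.
        page = body['meta']['next_page']


# Stand-in for the API so the sketch runs offline; these pages are made up.
fake_pages = {
    1: {'meta': {'current_page': 1, 'next_page': 2, 'total_cnt': 3},
        'field_values': [{'field_value': '01'}, {'field_value': '02'}]},
    2: {'meta': {'current_page': 2, 'next_page': None, 'total_cnt': 3},
        'field_values': [{'field_value': '03'}]},
}

values = [v['field_value'] for v in iter_field_values(fake_pages.__getitem__)]
print(values)  # ['01', '02', '03']
```

Passing the fetch function in keeps the loop independent of the HTTP client, so the same generator can be reused for other paginated endpoints by changing the key it reads.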

Get A Sample Value

To get a single value for a data field

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/1/fields/1/values/1'

resp = r.get(url, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/data-stores/1/schemas/1/fields/1/values/1

Get a single value for a data field.

Returns the object:

{
  "field_value": {
    "field_value_id": 1,
    "field_value": "01",
    "description_markup": "<p>New customer</p>",
    "description_raw": "New customer",
    "created_ts": "2020-08-15 22:10:18",
    "updated_ts": "2020-08-15 22:10:18"
  }
}

HTTPs Request

GET /data-stores/{data_store_id}/schemas/{data_schema_id}/fields/{data_field_id}/values/{field_value_id}

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through data stores

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema in the path
data_schema_id The ID for the data schema that contains the fields being requested
data_field_id The ID for the data field that the values belong to
field_value_id The ID for the specific field value to retrieve

Body

There is no body for this endpoint.

Response

Field Data Type Description
field_value Field Value Object A field value object

Response Codes

Value Description
200 Retrieved the field value
404 The data store ID requested could not be found or the data schema does not exist within the data store or the field does not exist within the schema

Create A Field Value

To create a value for a data field

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/7/fields/78/values'
new_field_value = {
  'field_value': 'a new value here', 
  'description': 'and a new description'
}

resp = r.put(url, json=new_field_value, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -X PUT -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"field_value": "a second value here", "description": "and a new description"}' \
$BASE_URL/data-stores/1/schemas/7/fields/78/values

Create a new value for a field.

Returns the object:

{
  "field_value": {
    "field_value_id": 16323,
    "field_value": "a new value here",
    "created_ts": "2020-09-29 20:51:18",
    "updated_ts": "2020-09-29 20:51:18",
    "description_markup": "<p>and a new description</p>",
    "description_raw": "and a new description"
  }
}

HTTPs Request

PUT /data-stores/{data_store_id}/schemas/{data_schema_id}/fields/{data_field_id}/values

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema in the path
data_schema_id The ID for the data schema that contains the fields being requested
data_field_id The ID for the data field that the values belong to

Body

Field Required Description
field_value Yes The sample value for the field
description No The description for the sample value, an omitted description will be created as null

Response

Field Data Type Description
field_value Field Value Object A field value object

Response Codes

Value Description
201 Created the field value
400 A malformed request was made, descriptions of the error will be provided in the body
404 The data store ID requested could not be found or the data schema does not exist within the data store or the field does not exist within the schema
409 The field value already exists for the field provided
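
In a job that runs repeatedly, a 409 usually just means the value was loaded on an earlier run. The sketch below treats 409 as a non-error and raises on anything else unexpected. `create_value_if_missing` is a hypothetical helper; `put` is assumed to be a callable with the same signature as `requests.put`, passed in so the function can be exercised without a network connection.

```python
def create_value_if_missing(put, url, headers, field_value, description=None):
    """Create a field value, treating 409 (already exists) as a non-error.

    Returns the field value object on 201 and None on 409.
    """
    payload = {'field_value': field_value}
    # description is optional; an omitted description is created as null.
    if description is not None:
        payload['description'] = description

    resp = put(url, json=payload, headers=headers)
    if resp.status_code == 201:
        return resp.json()['field_value']
    if resp.status_code == 409:
        return None
    raise RuntimeError(
        'Unexpected status {}: {}'.format(resp.status_code, resp.text))


# Usage with requests (not executed here):
# created = create_value_if_missing(r.put, url, headers, '01', 'New customer')
```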

Update a Field Value

To update a value for a data field

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/data-stores/1/schemas/7/fields/78/values/16324'
new_desc = {'description': 'new description goes here'}

resp = r.post(url, json=new_desc, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -X POST -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"field_value": "a second value here", "description": "and a new description"}' \
$BASE_URL/data-stores/1/schemas/7/fields/78/values/16324

Update a value for a field.

Returns the object:

{
  "field_value": {
    "field_value_id": 16323,
    "field_value": "a new value here",
    "created_ts": "2020-09-29 20:51:18",
    "updated_ts": "2020-09-29 20:51:18",
    "description_markup": "<p>and a new description</p>",
    "description_raw": "and a new description"
  }
}

HTTPs Request

POST /data-stores/{data_store_id}/schemas/{data_schema_id}/fields/{data_field_id}/values/{field_value_id}

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
data_store_id The ID for the data store that contains the schema in the path
data_schema_id The ID for the data schema that contains the fields being requested
data_field_id The ID for the data field that the values belong to
field_value_id The ID of the field value to update

Body

Field Required Description
field_value No The sample value for the field, if omitted the existing field value will remain in place
description No The description for the sample value, if omitted the existing description will remain in place

Response

Field Data Type Description
field_value Field Value Object A field value object

Response Codes

Value Description
200 The field value was updated successfully
400 A malformed request was made, descriptions of the error will be provided in the body
404 The data store ID requested could not be found or the data schema does not exist within the data store or the field does not exist within the schema

Transformations

Creating transformations in Tree Schema is a critical part of unlocking the value in your data: it allows you to see how data moves from system to system, identify dependencies in your data flow and create your data lineage. Transformations describe data movement from field to field between schemas.

Transformation Object

The transformation object

{
  "transformation_id": 25,
  "name": "my api transform #2",
  "type": "other",
  "created_ts": "2020-09-22 17:20:38",
  "updated_ts": "2020-09-22 17:25:34",
  "description_markup": "<p>desc</p>",
  "description_raw": "desc",
  "steward": {
    "user_id": 1,
    "name": "Grant",
    "email": "grant@treeschema.com"
  },
  "tech_poc": {
    "user_id": 1,
    "name": "Grant",
    "email": "grant@treeschema.com"
  }
}

The transformation object by itself is a shell; it is only used to hold transformation links. Once a transformation object is created, add transformation links to it in order to build your data lineage!

Transformation Object Fields

Field Data Type Description
transformation_id integer The ID used to uniquely represent the transformation, the same ID can be found in the Tree Schema GUI, the URL for the transformation will contain the transformation ID
name string The name of the transformation
type string The type of the transformation, valid values are batch_process_scheduled, batch_process_triggered, other, pub_sub_event and sql_trigger
created_ts timestamp The timestamp that the transformation was created
updated_ts timestamp The timestamp that the transformation was updated
description_markup string An HTML string that represents the full markup description
description_raw string The transformation description that has had all markup removed
steward User Object The data steward assigned to the transformation
tech_poc User Object The technical point of contact assigned to the transformation

Get All Transformations

To get all transformations

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/transformations'

resp = r.get(url, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/transformations

List all transformations.

Returns the object:

{
  "meta": {
    "current_page": 1,
    "next_page": null,
    "total_cnt": 2
  },
  "transformations": [
    {
      "transformation_id": 25,
      "name": "My Tansform",
      "type": "batch_process_triggered",
      "created_ts": "2020-09-22 17:20:38",
      "updated_ts": "2020-09-22 17:25:34",
      "description_markup": "<p>desc</p>",
      "description_raw": "desc",
      "steward": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      },
      "tech_poc": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      }
    },
    {
      "transformation_id": 28,
      "name": "My Second Transformation",
      "type": "other",
      "created_ts": "2020-09-22 18:06:17",
      "updated_ts": "2020-09-22 18:06:56",
      "description_markup": null,
      "description_raw": null,
      "steward": {
        "user_id": 2,
        "name": "Asher",
        "email": "asher@treeschema.com"
      },
      "tech_poc": {
        "user_id": 1,
        "name": "Grant",
        "email": "grant@treeschema.com"
      }
    }
  ]
}

HTTPs Request

GET /transformations

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through transformations
name null The name of the transformation to return

Path Parameters

There are no path parameters for this endpoint.

Body

There is no body for this endpoint.

Response

Field Data Type Description
meta Meta object A meta object for pagination
transformations list[Transformation Object] A list of transformation objects

Response Codes

Value Description
200 Retrieved all transformations
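
The `name` and `page` query parameters need to be URL-encoded, which matters for transformation names that contain spaces. The sketch below builds the query string with the standard library; `transformations_url` is a hypothetical helper, and whether `name` matches exactly or partially is not stated here, so check the results you get back.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.treeschema.com/catalog'


def transformations_url(name=None, page=1):
    """Build the URL for listing transformations, optionally filtered by name."""
    params = {'page': page}
    if name is not None:
        params['name'] = name
    # urlencode escapes spaces and other reserved characters in the values.
    return BASE_URL + '/transformations?' + urlencode(params)


url = transformations_url(name='My Second Transformation')
print(url)
```

With `requests` you can get the same effect by passing the dictionary as `params=` to `r.get` instead of building the string yourself.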

Get A Transformation

Get a single transformation

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/transformations/25'

resp = r.get(url, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/transformations/25

Get a single transformation

Returns the object:

{
  "transformation": {
    "transformation_id": 25,
    "name": "My Tansform",
    "type": "batch_process_triggered",
    "created_ts": "2020-09-22 17:20:38",
    "updated_ts": "2020-09-22 17:25:34",
    "description_markup": "<p>desc</p>",
    "description_raw": "desc",
    "steward": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    },
    "tech_poc": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    }
  }
}

HTTPs Request

GET /transformations/{transformation_id}

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through data stores

Path Parameters

Parameter Description
transformation_id The ID for the transformation being retrieved

Body

There is no body for this endpoint.

Response

Field Data Type Description
transformation Transformation Object A transformation object

Response Codes

Value Description
200 Retrieved the transformation
404 The transformation requested does not exist

Create A Transformation

Create a new transformation

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/transformations'
new_transform = {
    'name': 'My API Transformation!',
    'type': 'other'
}

resp = r.put(url, json=new_transform, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -X PUT -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"name": "My API Transformation2", "type": "other"}' \
$BASE_URL/transformations

Create a transformation

Returns the object:

{
  "transformation": {
    "transformation_id": 25,
    "name": "My Tansform",
    "type": "batch_process_triggered",
    "created_ts": "2020-09-22 17:20:38",
    "updated_ts": "2020-09-22 17:25:34",
    "description_markup": "<p>desc</p>",
    "description_raw": "desc",
    "steward": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    },
    "tech_poc": {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    }
  }
}

HTTPs Request

PUT /transformations

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

There are no path parameters for this endpoint.

Body

Field Required Description
name Yes The name of the transformation
type Yes The type of transformation, valid values are batch_process_scheduled, batch_process_triggered, other, pub_sub_event and sql_trigger
description No The description to give the transformation
tech_poc No The ID for the user to assign as the technical point of contact for this transformation, if no value is provided the user executing the API will be used
steward No The ID for the user to assign as the steward for this transformation, if no value is provided the user executing the API will be used

Response

Field Data Type Description
transformation Transformation Object A transformation object

Response Codes

Value Description
200 Existing transformation retrieved
201 Transformation Created
404 The transformation requested does not exist
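
Because this endpoint returns an existing transformation with a 200 instead of failing, it can be used as a "get or create" step at the start of a pipeline. The sketch below is one way to wrap that; `ensure_transformation` is a hypothetical helper, `put` is assumed to be a callable with the same signature as `requests.put`, and how an existing transformation is matched (presumably by name) is an assumption.

```python
# Valid values as listed in the Body table for this endpoint.
VALID_TRANSFORMATION_TYPES = {
    'batch_process_scheduled',
    'batch_process_triggered',
    'other',
    'pub_sub_event',
    'sql_trigger',
}


def ensure_transformation(put, url, headers, name, type_):
    """Create the transformation if needed and return (transformation_id, created).

    created is True for a 201 response and False for a 200 response.
    """
    if type_ not in VALID_TRANSFORMATION_TYPES:
        raise ValueError(
            'type must be one of: ' + ', '.join(sorted(VALID_TRANSFORMATION_TYPES)))

    resp = put(url, json={'name': name, 'type': type_}, headers=headers)
    if resp.status_code not in (200, 201):
        raise RuntimeError('Unexpected status code: {}'.format(resp.status_code))

    transformation = resp.json()['transformation']
    return transformation['transformation_id'], resp.status_code == 201


# Usage with requests (not executed here):
# transformation_id, created = ensure_transformation(
#     r.put, BASE_URL + '/transformations', headers, 'My API Transformation!', 'other')
```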

Delete A Transformation

Delete a transformation

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/transformations/36'

resp = r.delete(url, headers=headers)
resp.status_code  # there is no response body, so resp.json() would fail
BASE_URL='https://api.treeschema.com/catalog'

curl -X DELETE -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/transformations/31

Delete a transformation

HTTPs Request

DELETE /transformations/{transformation_id}

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
transformation_id The ID for the transformation to be deleted

Body

There is no body for this endpoint.

Response

There is no response body for this endpoint.

Response Codes

Value Description
200 Transformation deleted
404 The transformation requested does not exist

Tag A Transformation

To tag a transformation

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/transformations/30/tags'

tags = {'tags': ['api tag', 'transform tag', 'pii', 'mktg']}

resp = r.post(url, json=tags, headers=headers)
resp.json()
BASE_URL='https://api.treeschema.com/catalog'

curl -X POST -H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"tags": ["api tag", "transform tag", "pii", "mktg"]}' \
"$BASE_URL/transformations/30/tags"

Add one or more tags to a transformation.

Returns the object:

{
  "tags": [
    "api tag",
    "transform tag",
    "pii",
    "mktg"
  ],
  "tag_statuses": [
    "added",
    "added",
    "added",
    "added"
  ]
}

HTTPs Request

POST /transformations/{transformation_id}/tags

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
transformation_id The ID for the transformation to add the tag(s) to

Body

Field Data Type Description
tags List[string] A list of string values to add as tags, each tag can be up to 32 characters

Response

Field Data Type Description
tags List[string] The list of tags that were processed
tag_statuses List[string] The status for each tag processed, statuses match the same index position as their corresponding tag. Values include added and exists.

Response Codes

Value Description
200 All of the tags requested already existed for the transformation
201 At least one of the tags requested was added
400 A malformed request was made, descriptions of the error will be provided in the body

Transformation Links

Transformation links capture how data moves from field to field between your schemas. A single transformation link represents a single field to field movement. A single transformation (which may represent a data pipeline, or ETL / ELT job) will likely contain many transformation links.

The transformation link object

{
  "transformation_link_id": 1,
  "created_ts": "2020-09-22 23:54:26",
  "updated_ts": "2020-09-22 23:54:26",
  "source_data_store_id": 3,
  "source_data_store_name": "Kafka Prod",
  "source_schema_id": 17,
  "source_schema_name": "users-topic.v1",
  "source_field_id": 200,
  "source_field_name": "user_id",
  "target_data_store_id": 4,
  "target_data_store_name": "Redshift",
  "target_schema_id": 469,
  "target_schema_name": "usr.user_info",
  "target_field_id": 5399,
  "target_field_name": "user_id"
}

The transformation link object contains references to all of the data stores, schemas and fields that are associated when data moves from one schema to another. These associations are referred to as the source and target.

Field Data Type Description
transformation_link_id integer The ID used to uniquely represent the transformation link
source_data_store_id integer The unique ID for the data store for the source of the transformation.
source_schema_id integer The unique ID for the schema for the source of the transformation.
source_field_id integer The unique ID for the field for the source of the transformation.
target_data_store_id integer The unique ID for the data store for the target of the transformation.
target_schema_id integer The unique ID for the schema for the target of the transformation.
target_field_id integer The unique ID for the field for the target of the transformation.
created_ts timestamp The timestamp that the transformation link was created
updated_ts timestamp The timestamp that the transformation link was updated
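
Since each link records one field to field movement, a list of links can be rolled up to see which schemas feed which. The sketch below groups links by their source and target schema using the name fields shown in the example object; `schema_lineage` is a hypothetical helper, and the two sample links reuse the values from the examples in this section.

```python
from collections import defaultdict


def schema_lineage(links):
    """Map each (data store, schema) source to the set of targets it feeds."""
    edges = defaultdict(set)
    for link in links:
        source = (link['source_data_store_name'], link['source_schema_name'])
        target = (link['target_data_store_name'], link['target_schema_name'])
        edges[source].add(target)
    return dict(edges)


# Two field-level links between the same pair of schemas.
links = [
    {'source_data_store_name': 'Kafka Prod', 'source_schema_name': 'users-topic.v1',
     'source_field_name': 'user_id',
     'target_data_store_name': 'Redshift', 'target_schema_name': 'usr.user_info',
     'target_field_name': 'user_id'},
    {'source_data_store_name': 'Kafka Prod', 'source_schema_name': 'users-topic.v1',
     'source_field_name': 'email',
     'target_data_store_name': 'Redshift', 'target_schema_name': 'usr.user_info',
     'target_field_name': 'email'},
]

lineage = schema_lineage(links)
print(lineage)
# {('Kafka Prod', 'users-topic.v1'): {('Redshift', 'usr.user_info')}}
```

Using a set for the targets collapses the many field-level links between two schemas into a single schema-level edge.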

To get all links for a transformation

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/transformations/1/links'

resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/transformations/1/links

List all transformation links for a given transformation.

Returns the object:

{
  "meta": {
    "current_page": 1,
    "next_page": null,
    "total_cnt": 4
  },
  "transformation_links": [
    {
      "transformation_link_id": 1,
      "created_ts": "2020-09-22 23:54:26",
      "updated_ts": "2020-09-22 23:54:26",
      "source_data_store_id": 3,
      "source_data_store_name": "Kafka Prod",
      "source_schema_id": 17,
      "source_schema_name": "users-topic.v1",
      "source_field_id": 200,
      "source_field_name": "user_id",
      "target_data_store_id": 4,
      "target_data_store_name": "Redshift",
      "target_schema_id": 469,
      "target_schema_name": "usr.user_info",
      "target_field_id": 5399,
      "target_field_name": "user_id"
    },
    {
      "transformation_link_id": 2,
      "created_ts": "2020-09-22 23:54:26",
      "updated_ts": "2020-09-22 23:54:26",
      "source_data_store_id": 3,
      "source_data_store_name": "Kafka Prod",
      "source_schema_id": 17,
      "source_schema_name": "users-topic.v1",
      "source_field_id": 201,
      "source_field_name": "email",
      "target_data_store_id": 4,
      "target_data_store_name": "Redshift",
      "target_schema_id": 469,
      "target_schema_name": "usr.user_info",
      "target_field_id": 5400,
      "target_field_name": "email"
    }
  ]
}

HTTPs Request

GET /transformations/{transformation_id}/links

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through transformation links

Path Parameters

Parameter Description
transformation_id The ID of the transformation whose links are retrieved

Body

There is no body for this endpoint.

Response

Field Data Type Description
meta Meta object A meta object for pagination
transformation_links list[Transformation Link Object] A list of transformation link objects

Response Codes

Value Description
200 Retrieved all transformation links
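
When a transformation has more links than fit on one page, the `meta` object can drive the paging loop. The sketch below assumes that `next_page` holds the page number to request next and is null on the last page; the example response only shows the null case, so treat this as an assumption. `fetch_page` is a hypothetical callable that returns the parsed response body for a page. Here it is backed by made-up in-memory data so the loop can be run without network access.

```python
def iter_transformation_links(fetch_page):
    """Yield every link, following meta.next_page until it is null (None)."""
    page = 1
    while page is not None:
        body = fetch_page(page)
        yield from body["transformation_links"]
        # Assumption: next_page is the next page number, or None when done
        page = body["meta"]["next_page"]


# Made-up two-page response used only to exercise the loop
FAKE_PAGES = {
    1: {
        "meta": {"current_page": 1, "next_page": 2, "total_cnt": 3},
        "transformation_links": [
            {"transformation_link_id": 1},
            {"transformation_link_id": 2},
        ],
    },
    2: {
        "meta": {"current_page": 2, "next_page": None, "total_cnt": 3},
        "transformation_links": [{"transformation_link_id": 3}],
    },
}


def fake_fetch(page):
    return FAKE_PAGES[page]


all_ids = [link["transformation_link_id"]
           for link in iter_transformation_links(fake_fetch)]
print(all_ids)
```

Against the live API, `fetch_page` could be something like `lambda page: r.get(url, params={'page': page}, headers=headers).json()`, using the `page` query parameter documented above.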

To get a transformation link

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/transformations/1/links/1'

resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/transformations/1/links/1

Get a single transformation link for a given transformation.

Returns the object:

{
  "transformation_link": {
      "transformation_link_id": 1,
      "created_ts": "2020-09-22 23:54:26",
      "updated_ts": "2020-09-22 23:54:26",
      "source_data_store_id": 3,
      "source_data_store_name": "Kafka Prod",
      "source_schema_id": 17,
      "source_schema_name": "users-topic.v1",
      "source_field_id": 200,
      "source_field_name": "user_id",
      "target_data_store_id": 4,
      "target_data_store_name": "Redshift",
      "target_schema_id": 469,
      "target_schema_name": "usr.user_info",
      "target_field_id": 5399,
      "target_field_name": "user_id"
    }
}

HTTPs Request

GET /transformations/{transformation_id}/links/{transformation_link_id}

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
transformation_id The ID of the transformation that the link belongs to
transformation_link_id The ID for the transformation link

Body

There is no body for this endpoint.

Response

Field Data Type Description
transformation_link Transformation Link Object A transformation link object

Response Codes

Value Description
200 Retrieved the transformation link

Create links for a transformation

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/transformations/1/links'

new_links = {
    'links': [
        {
            'source_field_id': 89,
            'target_field_id': 5399
        },
        {
            'source_field_id': 200,
            'target_field_id': 5399
        }
    ]
}

resp = r.post(url, json=new_links, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -X POST \
-H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"links": [{"source_field_id": 89, "target_field_id": 5399}, {"source_field_id": 200, "target_field_id": 5399}]}' \
$BASE_URL/transformations/1/links

Create links for a transformation.

When creating links you only need to link two fields together: a source field and a target field. Tree Schema will infer the schema and data store directly from the field IDs.

Returns the object:

{
  "links": [
    {
      "source_field_id": 89,
      "target_field_id": 5399
    },
    {
      "source_field_id": 200,
      "target_field_id": 5399
    }
  ],
  "link_statuses": [
    "exists",
    "exists"
  ],
  "updated_links": [
    {
      "transformation_link_id": 205,
      "source_field_id": 89,
      "source_field_name": "account_type",
      "source_schema_id": 8,
      "source_schema_name": "public.accounts",
      "source_data_store_id": 3,
      "source_data_store_name": "Postgres Prod",
      "target_field_id": 5399,
      "target_field_name": "acct_type",
      "target_schema_id": 469,
      "target_schema_name": "acct.dvc.raw.01",
      "target_data_store_id": 4,
      "target_data_store_name": "Kafka"
    },
    {
      "transformation_link_id": 206,
      "source_field_id": 200,
      "source_field_name": "user_id",
      "source_schema_id": 17,
      "source_schema_name": "public.users",
      "source_data_store_id": 3,
      "source_data_store_name": "Postgres Prod",
      "target_field_id": 5399,
      "target_field_name": "acct_type",
      "target_schema_id": 469,
      "target_schema_name": "acct.dvc.raw.01",
      "target_data_store_id": 4,
      "target_data_store_name": "Kafka"
    }
  ]
}

HTTPs Request

POST /transformations/{transformation_id}/links

Query Parameters

Parameter Default Description
set_state False If True, the state of the transformation will be set to the links provided: any existing links in the transformation that are not part of the input will be deprecated, and any links that are provided but do not exist in the transformation will be created.
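
The effect of `set_state` can be reasoned about as a set difference between the links a transformation already has and the links in the request. The sketch below is a local illustration of that description only, not Tree Schema's implementation; `plan_set_state` is a hypothetical helper and the field ID pairs are example values.

```python
def plan_set_state(existing, requested):
    """Return (to_create, to_deprecate) as sorted lists of
    (source_field_id, target_field_id) pairs."""
    existing, requested = set(existing), set(requested)
    to_create = sorted(requested - existing)     # provided but not yet present
    to_deprecate = sorted(existing - requested)  # present but not provided
    return to_create, to_deprecate


# Links the transformation currently has (example values)
existing = [(89, 5399), (200, 5399)]
# Links sent in the request with set_state enabled (example values)
requested = [(200, 5399), (201, 5400)]

to_create, to_deprecate = plan_set_state(existing, requested)
print("create:", to_create)
print("deprecate:", to_deprecate)
```

With `set_state` left at its default of False, only the first list would apply: links missing from the request are left untouched.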

Path Parameters

Parameter Description
transformation_id The ID of the transformation to add the links to

Body

Transformation links are created by providing a list of source to target fields.

Field Required Description
links Yes List[Transformation source to target mapping] that represents the source and target for each transformation link

Transformation source to target mapping

Field Required Description
source_field_id Yes The field_id for the source field where data moves from
target_field_id Yes The field_id for the target field where data moves to

Response

Field Data Type Description
links list[Transformation source to target mapping] The same source to target mappings that were provided as the input
link_statuses list[string] The status for each link processed. Each status is at the same index position as its corresponding link. Values include created, exists and could_not_create.
updated_links list[Transformation Link Object] A list of transformation link objects for each transformation link requested that was created or already exists

Response Codes

Value Description
200 All transformation links processed
201 At least one transformation link was created
400 A malformed request was made, descriptions of the error will be provided in the body
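
Because `link_statuses` lines up with `links` by index, the two lists can be zipped together to see what happened to each requested link. This is a minimal sketch; `links_by_status` is a hypothetical helper and the response body below is a made-up example that uses the documented status values.

```python
def links_by_status(response_body):
    """Group (source_field_id, target_field_id) pairs by their reported status."""
    grouped = {}
    for link, status in zip(response_body["links"],
                            response_body["link_statuses"]):
        pair = (link["source_field_id"], link["target_field_id"])
        grouped.setdefault(status, []).append(pair)
    return grouped


# Made-up response body for illustration
example_response = {
    "links": [
        {"source_field_id": 89, "target_field_id": 5399},
        {"source_field_id": 200, "target_field_id": 5399},
        {"source_field_id": 999, "target_field_id": 5399},
    ],
    "link_statuses": ["created", "exists", "could_not_create"],
}

grouped = links_by_status(example_response)
failed = grouped.get("could_not_create", [])
print(grouped)
print("failed:", failed)
```

Checking for `could_not_create` entries is worthwhile even on a successful response, since the 200 and 201 codes only say that the links were processed or that at least one was created.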

Delete links for a transformation

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/transformations/1/links'

delete_links = {
    'transform_link_ids': [205, 206]
}

resp = r.delete(url, json=delete_links, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -X DELETE \
-H "Authorization: Basic $ENCODED_SECRET" \
-H "Content-Type: application/json" \
-d '{"transform_link_ids": [205, 206]}' \
$BASE_URL/transformations/1/links

Delete links for a transformation.

Returns the object:

{
  "links": [
    205,
    206
  ],
  "link_statuses": [
    "deleted",
    "deleted"
  ]
}

HTTPs Request

DELETE /transformations/{transformation_id}/links

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
transformation_id The ID of the transformation to delete the links from

Body

Transformation links are deleted by providing a list of transformation link IDs.

Field Required Description
transformation_link_ids Yes List[integer] The list of transformation link IDs to delete

Response

Field Data Type Description
transformation_link_ids List[integer] The list of transformation link IDs submitted to delete
link_statuses list[string] The status for each link processed. Each status is at the same index position as its corresponding link. Values include deleted and could_not_delete.

Response Codes

Value Description
200 All transformation links processed
400 A malformed request was made, descriptions of the error will be provided in the body


Users

Access your teammates and assign them as tech POCs and data stewards.

User object

The user object

{
  "user_id": 2,
  "name": "Asher",
  "email": "asher@treeschema.com"
}

User Object Fields

Field Data Type Description
user_id integer The ID used to uniquely represent the user
name string The name of the user
email string The user's email

Get All Users

Get all users in your organization

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/users'

resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/users

Get all users in your organization

Returns the object:

{
  "meta": {
    "current_page": 1,
    "next_page": null,
    "total_cnt": 2
  },
  "users": [
    {
      "user_id": 2,
      "name": "Asher",
      "email": "asher@treeschema.com"
    },
    {
      "user_id": 1,
      "name": "Grant",
      "email": "grant@treeschema.com"
    }
  ]
}

HTTPs Request

GET /users

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through users
email null A user's email address

Path Parameters

There are no path parameters for this endpoint.

Body

There is no body for this endpoint.

Response

Field Data Type Description
meta Meta object A meta object for pagination
users list[User Object] A list of users

Response Codes

Value Description
200 Retrieved the users
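
The `page` and `email` query parameters can be added to the request URL with the standard library, which also takes care of percent-encoding the `@` in an email address. This is a sketch under one assumption: that `email` narrows the returned list to the matching user, which the parameter table implies but does not state outright. `users_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.treeschema.com/catalog'


def users_url(email=None, page=1):
    """Build the /users URL with its documented query parameters."""
    params = {'page': page}
    if email is not None:
        # Assumption: the email parameter filters the list of users
        params['email'] = email
    return BASE_URL + '/users?' + urlencode(params)


url = users_url(email='asher@treeschema.com')
print(url)
```

The same result can be had by passing `params={'page': 1, 'email': ...}` to `r.get`, which encodes the query string in the same way.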

Get a User

Get a user in your organization

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/users/1'

resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
$BASE_URL/users/1

Get a single user in your organization

Returns the object:

{
  "user": {
    "user_id": 1,
    "name": "Grant",
    "email": "grant@treeschema.com"
  }
}

HTTPs Request

GET /users/{user_id}

Query Parameters

There are no query parameters for this endpoint.

Path Parameters

Parameter Description
user_id The ID of the user to retrieve

Body

There is no body for this endpoint.

Response

Field Data Type Description
user User Object A user

Response Codes

Value Description
200 Retrieved the user
404 The user requested was not found

Full Catalog Search

Search your entire data catalog from a single place. Can't remember what data store your user_analytics schema sits in? Need a refresher on where that pesky usr_start_dt field is? Search the catalog!

Catalog Search Object

The search result object

{
  "entity_id": 5405,
  "schema_id": 469,
  "data_store_id": 4,
  "name": "device_id",
  "entity_type": "field"
}

The search API is intended to search the following entities: data stores, data schemas, fields and transformations.

The search results are intended to enable a simple and easy way to quickly find the key IDs needed to use your Tree Schema catalog.

Search Result Object Fields

Field Data Type Description
entity_id integer The ID used to uniquely represent the entity, this goes with the entity_type to find a specific item in the catalog
entity_type string The type of entity that the entity_id relates to, possible values include data_store, data_schema, field, and transformation
name string The name of the entity
data_store_id integer If the entity resides within a data store, for example a data_schema or field then this field will be populated, otherwise it will be null
schema_id integer If the entity resides within a data schema, for example a field then this field will be populated, otherwise it will be null

Search the Catalog

Search the catalog

import requests as r

BASE_URL = 'https://api.treeschema.com/catalog'
url = BASE_URL + '/search?term=usr'

resp = r.get(url, headers=headers)
resp.json()

BASE_URL='https://api.treeschema.com/catalog'

curl -H "Authorization: Basic $ENCODED_SECRET" \
"$BASE_URL/search?term=usr"

Search the entire catalog

Returns the object:

{
  "meta": {
    "current_page": 1,
    "next_page": null,
    "total_cnt": 7
  },
  "results": [
    {
      "entity_id": 5405,
      "schema_id": 469,
      "data_store_id": 4,
      "name": "device_id",
      "entity_type": "field"
    },
    {
      "entity_id": 79,
      "schema_id": 7,
      "data_store_id": 1,
      "name": "DEVICE_ID",
      "entity_type": "field"
    }
  ]
}

HTTPs Request

GET /search

Query Parameters

Parameter Default Description
page 1 The page to retrieve when paginating through search results
term None The search term to look for in the data catalog

Path Parameters

There are no path parameters for this endpoint.

Body

There is no body for this endpoint.

Response

Field Data Type Description
meta Meta object A meta object for pagination
results list[Search Result Object] A list of search results

Response Codes

Value Description
200 Retrieved the search results
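
Since each search result carries both an `entity_id` and an `entity_type`, a common next step is to split the results by type before looking anything up. The sketch below is one way to do that; `group_by_type` is a hypothetical helper, and the `data_schema` entry in the sample data is made up to show a second entity type alongside the two field results from the example response.

```python
def group_by_type(results):
    """Map each entity_type to the list of entity_ids found for it."""
    grouped = {}
    for item in results:
        grouped.setdefault(item["entity_type"], []).append(item["entity_id"])
    return grouped


results = [
    {"entity_id": 5405, "schema_id": 469, "data_store_id": 4,
     "name": "device_id", "entity_type": "field"},
    {"entity_id": 79, "schema_id": 7, "data_store_id": 1,
     "name": "DEVICE_ID", "entity_type": "field"},
    # Made-up schema result: schema_id is null for entities outside a schema
    {"entity_id": 469, "schema_id": None, "data_store_id": 4,
     "name": "usr.user_info", "entity_type": "data_schema"},
]

ids_by_type = group_by_type(results)
print(ids_by_type)
```

An `entity_id` is only meaningful together with its `entity_type`, as the field table above notes, so keeping the two paired avoids mixing up a field ID with a schema ID that happens to share the same number.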

Errors

These are common errors that span the Tree Schema API requests.

Error Code Meaning
400 Bad Request -- Your request is invalid.
401 Unauthorized -- Your email or secret key is wrong.
403 Forbidden -- The resource requested is restricted to administrators.
404 Not Found -- The specified resource could not be found.
406 Not Acceptable -- You requested a format that isn't JSON.
429 Too Many Requests -- You've made too many requests!
500 Internal Server Error -- We had a problem with our server. Try again later.
503 Service Unavailable -- We're temporarily offline for maintenance. Please try again later.
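
A client can use these codes to decide whether a failed request is worth repeating. The sketch below treats 429, 500 and 503 as transient, following the "try again later" wording in the table. Which codes to retry and the exponential backoff schedule are client-side conventions assumed here, not behaviour specified by Tree Schema; `should_retry` and `backoff_delays` are hypothetical helpers.

```python
# Assumption: these codes indicate a temporary condition
RETRYABLE_STATUS_CODES = {429, 500, 503}


def should_retry(status_code):
    """Return True when the error code suggests trying again later."""
    return status_code in RETRYABLE_STATUS_CODES


def backoff_delays(attempts, base_seconds=1.0):
    """Return the wait before each retry, doubling every time."""
    return [base_seconds * 2 ** i for i in range(attempts)]


for code in (400, 401, 404, 429, 503):
    print(code, "retry" if should_retry(code) else "do not retry")

print(backoff_delays(3))
```

Codes such as 400, 401, 403 and 404 point at the request itself, so repeating the same request unchanged is unlikely to help.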