Resources

name path methods description
Audit Events
  • /v2/auditevent/list
  • /v2/auditevent/search
  • /v2/auditevent/{key}
  • GET
  • POST
  • GET
Endpoints under /auditevent provide access to audit events, which record Waterline Data system and user activities. You can use audit event characteristics such as user, date, or type of activity to filter the events returned.
Audit events are available only to users with an Administrator role.
Authentication
  • /v2/login
  • /v2/logout
  • POST
  • GET
The Authentication API accepts the same user credentials as the Waterline Data browser. These API calls control the creation and destruction of a session cookie to validate other API calls.
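The session-cookie flow described above can be sketched with the Python standard library. This is only an illustration: the base URL and the JSON body field names ("username", "password") are assumptions that do not appear in this reference, so check them against your installation before use. The functions only build requests; nothing is sent until the opener is used.

```python
import json
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical base URL; replace with the address of your server.
BASE_URL = "http://localhost:8082/api"


def make_opener():
    """Return (opener, jar). The opener stores any session cookie in the jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login_request(username, password, base_url=BASE_URL):
    """Build a POST request for /v2/login.

    The body field names are an assumption, not taken from this reference.
    """
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/v2/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def logout_request(base_url=BASE_URL):
    """Build a GET request for /v2/logout, which ends the session."""
    return urllib.request.Request(base_url + "/v2/logout", method="GET")
```

Calling `opener.open(login_request(...))` would send the login; later calls made through the same opener would then carry whatever cookie the server set, until `opener.open(logout_request())` is called.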
Browse: Field-level
  • /v2/browse/field/facet
  • POST
Endpoints under /browse/field manage the facets available when users view a list of fields for a given resource.
Field-level calls return results based on the user credentials for the current authentication session and the parent data resource.
Browse: Resource-level
  • /v2/browse/facet
  • /v2/browse/new
  • /v2/browse/page
  • /v2/browse/pattern
  • POST
  • POST
  • POST
  • POST
Endpoints under /browse allow you to list and manage metadata for data resources based on commonalities of location. Compare with the /search functionality, which lists data resources that share some metadata characteristics. Browse functionality depends on how data is stored and its location on a file system or database.
Browse facets are data resource attributes, such as file format, associated tags, size, and so on. Facets provide a list of items or ranges that include the values represented in the browse context. For example, the Content Type facet includes all file format types found in the catalog, but only the values that apply to the current browse context are considered.
The browse call maintains state. The sequence of calls is important, as each call stores information that successive calls use. A BrowseResult object contains path information, an array of facet names and categories, information about the DataSource, and DataResource objects in JSON format.
The results of a browse operation can be huge, so results are paginated. The default page size is 25. A browse request can specify a different page size, and after the first request, can specify the 0-based index of a specific page of results to retrieve, or a number of pages to retrieve.
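The page arithmetic implied above can be sketched as follows. Only the default page size of 25 and the 0-based page index come from the text; the function names are hypothetical helpers for client-side bookkeeping, not part of the API.

```python
import math

DEFAULT_PAGE_SIZE = 25  # default page size stated in the text


def page_count(total_results, page_size=DEFAULT_PAGE_SIZE):
    """Number of pages needed to hold total_results items."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_results / page_size)


def page_for_item(item_index, page_size=DEFAULT_PAGE_SIZE):
    """0-based page index that contains the 0-based item_index."""
    return item_index // page_size


def page_bounds(page_index, total_results, page_size=DEFAULT_PAGE_SIZE):
    """Half-open range (start, end) of item indexes on the given page."""
    start = page_index * page_size
    end = min(start + page_size, total_results)
    return start, end
```

For example, with the default page size, item 25 is the first item on page 1, and a result set of 30 items needs two pages.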
Browse calls return results based on the user credentials for the current authentication session.
Configuration
  • /v2/config
  • /v2/config/{name}
  • GET
  • POST
Endpoints under /config provide access to the configuration options that control job and web server behavior.
Individual configuration options are represented by the ConfigProperty object.
Custom Header
  • /v2/customheader
  • GET
Data Objects
  • /v2/dataobject
  • /v2/dataobject/name
  • /v2/dataobject/recommendations
  • /v2/dataobject/searchresource
  • /v2/dataobject/{key}
  • /v2/dataobject/{key}/profile
  • /v2/dataobject/{key}/tag
  • /v2/dataobject/{key}/join/add
  • /v2/dataobject/{key}/join/refresh
  • /v2/dataobject/{key}/join/remove
  • /v2/dataobject/{key}/resource/add
  • /v2/dataobject/{key}/resource/remove
  • GET POST
  • GET
  • GET
  • GET
  • DELETE GET PUT
  • POST
  • POST
  • POST
  • POST
  • POST
  • POST
  • POST
Data Preview
  • /v2/content/{resourceKey}
  • GET
Endpoints under /datapreview provide access to previews of data resource content based on the authorization of the current user.
Data Resources
  • /v2/dataresource
  • /v2/dataresource/hivecolumns
  • /v2/dataresource/metadata
  • /v2/dataresource/savepatterns
  • /v2/dataresource/{key}
  • /v2/dataresource/browse/{resourceKey}
  • /v2/dataresource/schema/{resourceKey}
  • /v2/dataresource/source/{sourceName}
  • /v2/dataresource/table/{resourceKey}
  • /v2/dataresource/view/{resourceKey}
  • /v2/dataresource/virtualfolder/{virtualFolderName}
  • /v2/dataresource/{resourceKey}/partitions
  • PUT
  • GET
  • GET
  • POST
  • GET PUT
  • GET
  • PUT
  • GET
  • POST
  • POST
  • GET
  • GET
Endpoints under /dataresource provide access to DataResource objects in the Waterline Data repository. Waterline Data defines a data resource as an HDFS file or folder, a set of files organized in folders (known as a Waterline Data collection), a Hive database or table, or a relational database or table. DataResource objects are uniquely identified by a "key" value, assigned on creation.
Requests to a {key} endpoint under /dataresource address a specific DataResource object. If a directory, file, database, or table has not yet been encountered in a format or schema discovery operation, it does not have a corresponding data resource entry in the Waterline Data catalog.
Data resource calls return results based on the user credentials for the current authentication session.
Data Sets
  • /v2/dataset
  • /v2/dataset/name
  • /v2/dataset/{key}
  • /v2/dataset/members/add
  • /v2/dataset/members/remove
  • /v2/dataset/schema/{key}
  • /v2/dataset/schema/version/max
  • GET POST
  • GET
  • DELETE GET PUT
  • POST
  • POST
  • PUT
  • GET
Endpoints under /dataset provide access to Virtual Data Set objects. A Waterline Data Virtual Data Set lets users group resources that have the same schema but span different folders in the data lake into a single virtual unit for easier management. Virtual Data Sets are user-defined virtual collections whose members have matching schemas but may have different path specifications or hierarchies with respect to a data source.
Data Sources
  • /v2/datasource
  • /v2/datasource/list-configs
  • /v2/datasource/validate
  • /v2/datasource/{key}
  • /v2/datasource/dependencies/{key}
  • /v2/datasource/gethint/{datasourcetype}
  • /v2/datasource/name/{name}
  • /v2/datasource/sync/{key}
  • /v2/datasource/credentials/users/{user}
  • /v2/datasource/{key}/credentials/users/{user}
  • GET POST
  • GET
  • POST
  • DELETE GET PUT
  • GET
  • GET
  • GET
  • DELETE
  • GET
  • DELETE POST PUT
Endpoints under /datasource provide access to DataSource objects. A Waterline Data data source is the location of an HDFS directory, a Hive instance or specific database, a cloud storage location, or a relational database instance or specific database. A data source hosts data resources, in the form of folders, files, databases, and tables.
DataSource objects are uniquely identified by a "key" value, assigned on creation. In addition, data sources can be referenced by the "name" property, specified by the creator.
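Because a data source can be addressed either by key or by name, a client needs to pick between two of the paths listed above. A minimal sketch, assuming keys and names should be percent-encoded when placed in the URL path; the helper name is hypothetical.

```python
from urllib.parse import quote


def datasource_path(key=None, name=None):
    """Return the path for one data source, addressed by key or by name.

    Exactly one of key or name must be given.
    """
    if (key is None) == (name is None):
        raise ValueError("pass exactly one of key or name")
    if key is not None:
        return "/v2/datasource/" + quote(str(key), safe="")
    return "/v2/datasource/name/" + quote(name, safe="")
```

The key form is the stable identifier assigned on creation; the name form is convenient when only the creator-supplied name is known.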
Data source calls return results based on the user credentials for the current authentication session.
Entity Specifications
  • /v2/entityspec/all
  • /v2/entityspec/{name}
  • /v2/entityspec/{name}/propertyspec
  • GET
  • GET
  • POST PUT
An entity specification defines properties for Waterline Data objects such as tags, users, roles, data resources, and so on. Properties of entities are represented by PropertySpec objects. An EntitySpec object associates a set of properties with a specific entity type. Endpoints under /entityspec allow you to create, query, update, and delete properties associated with entities, including setting the access level for properties.
Entities are identified by their "key", which in this context is a name. The built-in entities are audit_event, data_resource, data_source, operation_execution, operation, resource_field, role, tag_association, tag_domain, tag, user_review, and user. The operation and operation_execution entities are not exposed at this time. To access entities, the authenticated user must have an Administrator role.
External Sources
  • /v2/externalsource
  • /v2/externalsource/{key}
  • /v2/externalsource/name/{name}
  • GET POST
  • DELETE GET
  • DELETE GET
External Sources are sources of metadata to be imported into the Waterline Data catalog. Endpoints under /externalsource provide a way to identify and connect to these outside applications. For example, the extensions created to import metadata from applications such as Apache Atlas or Cloudera Navigator use external sources to persist connection information for the application instance.
External source calls are available only to users with administrator roles.
Favorites
  • /v2/favorites
  • /v2/favorites/{entityKey}
  • /v2/favorites/{user}
  • /v2/favorites/{entityKey}/{userName}
  • GET
  • DELETE POST
  • GET
  • POST
Endpoints under /favorites provide a way to manage user bookmarks on tags and resources.
JobManagement
  • /v2/job/execution
  • /v2/job/execution/terminate/{jobInstanceId}
  • /v2/job/execution/{key}/resources
  • POST
  • PUT
  • GET
JobManagement
  • /v2/job/instances
  • /v2/job/instances/currentuser
  • /v2/job/instances/{key}
  • /v2/job/instances/{key}/executions
  • GET
  • GET
  • GET
  • GET
JobManagement
  • /v2/job/sequences
  • /v2/job/sequences/{key}
  • GET
  • GET
JobManagement
  • /v2/job/templates
  • /v2/job/templates/bulkdelete
  • /v2/job/templates/{key}
  • /v2/job/templates/{key}/instances
  • GET POST
  • POST
  • DELETE GET PUT
  • GET
Lineage
  • /v2/lineage/searchresource
  • /v2/lineage/{key}
  • /v2/lineage/addparent/{key}
  • GET
  • GET
  • POST
Waterline Data defines a LineageRelations object that records and presents the lineage relationships between data resources. Endpoints under /lineage provide access to these objects.
Overlap tables
  • /v2/overlap/table
  • /v2/overlap/table/csv
  • GET POST
  • POST
Endpoints under /overlap/table allow you to pass table information and receive a preliminary overlap report computed from the table schema.
The results of the comparison are collected in a TableComparisionTO model and returned.
Reviews
  • /v2/reviews
  • /v2/reviews/{key}
  • /v2/reviews/entity/{entityKey}
  • /v2/reviews/user/{user}
  • /v2/reviews/users/{user}
  • POST PUT
  • DELETE
  • GET
  • GET
  • POST
Endpoints under /reviews allow access to user reviews, including deleting specific reviews.
Roles
  • /v2/role
  • /v2/role/byname
  • /v2/role/{key}
  • GET POST
  • GET
  • DELETE GET PUT
Waterline Data uses roles to indicate which data resources from the catalog are available to each user. User roles determine what actions each user can perform for metadata management inside Waterline Data. Endpoints under /role allow creation of new roles and modification of existing roles. To associate roles with users, use a PUT request to /user/{key}.
A user with the admin role can create or modify roles with requests to the role endpoint. User roles are defined by the Role object, which has these components:
  • A list of data sources that the role can access.
  • A list of tag domains that the role can access.
  • An access level.
  • A unique identifying key value, assigned on creation.
By default, Waterline Data provides these access level definitions:
  • administrator
  • steward
  • analyst
  • guest
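The components of a role listed above can be pictured as a small data structure. This is only a sketch: the class name and field names are hypothetical and are not the JSON property names of the actual Role object, which this reference does not list.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Default access levels named in the text.
DEFAULT_ACCESS_LEVELS = ("administrator", "steward", "analyst", "guest")


@dataclass
class RoleSketch:
    """Illustrative stand-in for a role; all field names are assumptions."""

    access_level: str
    data_source_keys: List[str] = field(default_factory=list)
    tag_domain_keys: List[str] = field(default_factory=list)
    key: Optional[str] = None  # assigned by the server on creation

    def can_access_source(self, source_key):
        """True if the role lists the given data source."""
        return source_key in self.data_source_keys

    def uses_default_access_level(self):
        """True if the access level is one of the four defaults."""
        return self.access_level in DEFAULT_ACCESS_LEVELS
```

The key stays None until the role has been created, which mirrors the statement that the identifying key is assigned on creation.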
Rules
  • /v2/rule
  • /v2/rule/name
  • /v2/rule/{key}
  • /v2/rule/name/{name}
  • GET POST PUT
  • PUT
  • DELETE GET
  • DELETE GET
Endpoints under /rule allow you to list and manage the rules.
Search
  • /v2/search/facets
  • /v2/search/new
  • /v2/search/page
  • /v2/search/runFacet
  • GET
  • POST
  • POST
  • POST
Endpoints under /search allow you to collect data resources that share some metadata characteristics. Compare with the browse functionality, which returns a set of data resources based on each resource's location. Searchable characteristics include facets and keywords.
The results of a search are collected in a SearchResult object, which is assigned a unique key value on creation. Access control restrictions apply to the search API: search results are restricted to data resources in the data sources included in the active user's role.
SessionController
  • /session
  • GET
Tag Associations
  • /v2/tagassociation
  • /v2/tagassociation/byfield
  • /v2/tagassociation/byresource
  • /v2/tagassociation/list
  • /v2/tagassociation/{key}
  • /v2/tagassociation/list/byresource
  • /v2/tagassociation/resource/{key}
  • /v2/tagassociation/tag/{key}
  • /v2/tagassociation/counts/domainlevel/all
  • /v2/tagassociation/tag/container/{key}
  • /v2/tagassociation/counts/taglevel/domain/{key}
  • /v2/tagassociation/counts/taglevel/tag/{key}
  • POST
  • GET
  • GET
  • POST PUT
  • DELETE GET PUT
  • POST
  • GET
  • GET
  • GET
  • GET
  • GET
  • GET
Endpoints under /tagassociation provide access to TagAssociation objects in the Waterline Data repository. A tag association object defines the relationship of a tag to a specific data resource (folders, files, collections, and tables) or specific field in a data resource.
Tag associations can be made manually through API calls or the UI, or automatically through a Waterline Data tag discovery operation. Tag discovery can be made using the patterns from specific field values ("valueEnabled":"true" for the tag) or through matching field values against a regular expression ("regexEnabled":"true" for the tag). Tag associations have a "tagState" whose value is "ACCEPTED", "REJECTED", or "SUGGESTED".
  • When a tag association is made manually, the state is "accepted".
  • The first manual tag association created in the UI is automatically marked with "semantic" = "sample". This setting must be set explicitly when adding a tag association through the API.
  • Tag discovery marks tag associations as "suggested". Users can convert a "suggested" tag association to "accepted" in the UI or the API.
  • A "suggested" tag association can also be "rejected" by users in the UI or through the API. A rejected tag association contributes to the tag discovery learning algorithm; the tag association object remains, but is marked "rejected".
  • An "accepted" or "rejected" tag association can be removed by users in the UI or through the API.
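The state rules in the list above can be summarized as a small transition table. The sketch below encodes only the transitions the text mentions; "REMOVED" is a hypothetical marker for deletion of the association and is not a documented "tagState" value, and the API may permit transitions that are not listed here.

```python
# Transitions named in the text. "REMOVED" is a hypothetical pseudo-state
# that stands for deleting the tag association.
ALLOWED_TRANSITIONS = {
    "SUGGESTED": {"ACCEPTED", "REJECTED"},
    "ACCEPTED": {"REMOVED"},
    "REJECTED": {"REMOVED"},
}


def initial_state(origin):
    """State of a new association: manual ones start accepted,
    ones created by tag discovery start suggested."""
    if origin == "manual":
        return "ACCEPTED"
    if origin == "discovery":
        return "SUGGESTED"
    raise ValueError("origin must be 'manual' or 'discovery'")


def can_transition(current, target):
    """True if the text describes a move from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

A client could use a check like this before issuing an update, so that an unsupported state change is caught locally rather than by the server.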
Tag Domains
  • /v2/tagdomain
  • /v2/tagdomain/all
  • /v2/tagdomain/alldomains
  • /v2/tagdomain/list
  • /v2/tagdomain/{key}
  • /v2/tagdomain/name/{name}
  • /v2/tagdomain/{domainKey}/alltags
  • POST
  • GET
  • GET
  • POST PUT
  • DELETE GET PUT
  • DELETE GET
  • GET
Waterline Data tags are maintained in a tag glossary that is organized into tag domains. Endpoints under /tagdomain provide access to TagDomain objects.
All tags must belong to a domain. One tag domain, "Built-in Tags", is dedicated to system tags. Users with admin privileges can create additional tag domains.
Tags
  • /v2/tag
  • /v2/tag/byname
  • /v2/tag/hierarchy
  • /v2/tag/list
  • /v2/tag/{key}
  • /v2/tag/children/{key}
  • /v2/tag/regex/validate
  • POST
  • GET
  • POST
  • POST PUT
  • DELETE GET PUT
  • GET
  • POST
Endpoints under /tag provide access to Tag objects in the Waterline Data repository. Tags are labels associated with data resources (folders, files, collections, or tables), or with specific fields in data resources.
  • A tag has a unique identifier ("key"), a label ("name"), and a description.
  • Tag names can use dot notation to identify a hierarchy of related tags.
  • Tags are organized into domains, represented by TagDomain objects.
  • The relationship between a tag and a resource or field is represented by a TagAssociation object.
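Dot notation in tag names can be unpacked on the client side to recover the hierarchy. A minimal sketch, assuming every dot in a name separates a parent from a child; the function names and the sample tag name are hypothetical.

```python
def tag_ancestors(name):
    """Return the names of all ancestor tags, outermost first."""
    parts = name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def tag_leaf(name):
    """Return the last segment of a dotted tag name."""
    return name.split(".")[-1]
```

For a tag named "Customer.Address.Zip", the ancestors are "Customer" and "Customer.Address", and the leaf is "Zip". A name without dots has no ancestors.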
User Profiles
  • /v2/user
  • /v2/user/_me
  • /v2/user/{key}
  • /v2/user/assign/role
  • /v2/user/name/{username}
  • /v2/user/permit/resource
  • /v2/user/permit/source
  • /v2/user/remove/role
  • GET POST
  • GET PUT
  • DELETE GET PUT
  • POST
  • GET
  • GET
  • GET
  • POST
Endpoints under /user allow administrators to manage user profiles, represented by the User object. The caller must have a role that includes an Administrator access level to create, update, or delete user profiles.
User objects are uniquely identified by a key, assigned on creation. Requests to a {key} endpoint under /user address a specific User object.
Virtual Folders
  • /v2/virtualfolder
  • /v2/virtualfolder/databases
  • /v2/virtualfolder/{key}
  • /v2/virtualfolder/datasource/{datasourceKey}
  • /v2/virtualfolder/dependencies/{key}
  • /v2/virtualfolder/list/hive
  • /v2/virtualfolder/name/{name}
  • GET POST
  • GET
  • DELETE GET PUT
  • GET
  • GET
  • GET
  • GET
Endpoints under /virtualfolder provide access to VirtualFolder objects. Virtual folders allow users to create groups of resources belonging to a data source into smaller units for easier management. In addition, Virtual folders allow data resources to be part of multiple folders thus letting customers create folders with overlapping sets of data resources.
Waterline Data Logs
  • /v2/logs
  • /v2/logs/download
  • /v2/logs/view
  • GET
  • GET
  • GET