Waterline Data provides a REST API to access metadata held in the catalog. The same API allows applications to insert metadata such as property values, tags, tag associations, and lineage relationships. The API provides access to the same operations available from the Waterline Data browser application.
The API uses JSON objects as request and response payloads. Each HTTP call returns a general pass/fail status; calls that fail at the Waterline Data server return a failure response with an error code and a more detailed message.
The API accepts the same user credentials as the Waterline Data browser application. Before sending API calls, an application sends an authentication request. The server responds to a successful request with a session cookie, which the application then includes in the header of subsequent API calls. The cookie is valid for the length of the session.
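The authentication flow described above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not the documented client: the base URL, the `/login` path, and the `username` and `password` field names are hypothetical placeholders, so check the Authentication endpoint reference for the real values. The helpers only build the request and a cookie-aware opener; nothing is sent until `opener.open(request)` is called.

```python
import json
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical values: replace with the real server address and login path.
BASE_URL = "https://waterline.example.com"
LOGIN_PATH = "/login"


def build_login_request(base_url, username, password):
    """Build (but do not send) a JSON login request.

    The field names "username" and "password" are assumptions.
    """
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url + LOGIN_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_session_opener():
    """Return an opener that stores cookies from responses and resends them."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


if __name__ == "__main__":
    opener, jar = build_session_opener()
    request = build_login_request(BASE_URL, "analyst1", "secret")
    print(request.get_method(), request.full_url)
    # Sending the request would store the session cookie in `jar`:
    #   opener.open(request)
    # Later calls made through the same opener would then carry the cookie.
```

Reusing one opener for the login call and every later call is what keeps the session cookie attached to each request.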
name | path | methods | description |
---|---|---|---|
Audit Events |
|
|
Endpoints under /auditevent provide access to audit events—Waterline Data system and user activities.
You can use audit event characteristics such as user, date, or type of activity to filter the events returned.
|
Authentication |
|
|
The Authentication API accepts the same user credentials as the Waterline Data browser. These API calls control the creation and destruction of a session cookie to validate other API calls. |
Browse: Field-level |
|
|
Endpoints under /browse/field manage the facets available when users view a list of fields for a given resource.
|
Browse: Resource-level |
|
|
Endpoints under /browse allow you to list and manage metadata for data resources based on commonalities of location.
Compare with the /search functionality, which lists data resources that share some metadata characteristics.
Browse functionality depends on how data is stored and its location on a file system or database.
|
Configuration |
|
|
Endpoints under /config provide access to the configuration options that control job
and web server behavior.
|
Custom Header |
|
|
|
Data Objects |
|
|
|
Data Preview |
|
|
Endpoints under /datapreview provide access to previews of data resource content based on the authorization of the
current user.
|
Data Resources |
|
|
Endpoints under /dataresource provide access to DataResource objects in the Waterline Data repository.
Waterline Data defines a data resource as an HDFS file or folder, a set of files organized in
folders (known as a Waterline Data collection), a Hive database or table, or a relational database
or table. DataResource objects are uniquely identified by a "key" value, assigned on creation.
|
Data Sets |
|
|
Endpoints under /dataset provide access to Virtual Dataset objects. A Waterline Virtual Data Set
lets users group resources that have the same schema and span different folders in the data lake
into a single virtual unit for easier management.
Virtual Data Sets are user-defined virtual collections whose members share a schema but may have
different path specifications or hierarchies with respect to a data source.
|
Data Sources |
|
|
Endpoints under /datasource provide access to DataSource objects. A Waterline Data data source is the
location of an HDFS directory, a Hive instance or specific database, a cloud storage location, or a relational
database instance or specific database. A data source hosts data resources, in the form of folders, files,
databases, and tables.
|
Entity Specifications |
|
|
An entity specification defines properties for Waterline Data objects such as tags, users, roles, data resources,
and so on. Properties of entities are represented by PropertySpec objects. An EntitySpec object associates
a set of properties with a specific entity type. Endpoints under /entityspec allow you to create,
query, update, and delete properties associated with entities, including setting the access level for properties.
|
External Sources |
|
|
External Sources are sources of metadata to be imported into the Waterline Data catalog. Endpoints under
/externalsource provide a way to identify and connect to these outside applications. For example,
the extensions created to import metadata from applications such as Apache Atlas or Cloudera Navigator use
external sources to persist connection information for the application instance.
|
Favorites |
|
|
Endpoints under /favorites provide a way to manage user bookmarks on tags and resources.
|
JobManagement |
|
|
|
Lineage |
|
|
Waterline Data defines a LineageRelations object that records and presents the lineage relationships between data
resources. Endpoints under /lineage provide access to these objects.
|
Overlap tables |
|
|
Endpoints under /overlap/table allow you to pass table information and receive a preliminary overlap
report computed from the table schema.
|
Reviews |
|
|
Endpoints under /reviews provide access to user reviews, including the ability to delete specific reviews.
|
Roles |
|
|
Waterline Data uses roles to indicate which data resources from the catalog are available to each user.
User roles also determine which metadata management actions each user can
perform inside Waterline Data. Endpoints under /role allow creation
of new roles and modification of existing roles. To associate roles with users, use a PUT request to
/user/{key}.
|
Rules |
|
|
Endpoints under /rule allow you to list and manage the rules.
|
Search |
|
|
Endpoints under /search allow you to collect data resources that share some metadata characteristics.
Compare with the browse functionality, which returns a set of data resources based on the resources' location.
Searchable characteristics include facets and keywords.
|
SessionController |
|
|
|
Tag Associations |
|
|
Endpoints under /tagassociation provide access to TagAssociation objects in the Waterline Data repository.
A tag association object defines the relationship of a tag to a specific data resource (folders, files, collections,
and tables) or specific field in a data resource.
|
Tag Domains |
|
|
Waterline Data tags are maintained in a tag glossary that is organized into tag domains. Endpoints under
/tagdomain provide access to TagDomain objects.
|
Tags |
|
|
Endpoints under /tag provide access to Tag objects in the Waterline Data repository. Tags are
labels associated with data resources (folders, files, collections, or tables), or with specific fields in data resources.
|
User Profiles |
|
|
Endpoints under /user allow administrators to manage user profiles, represented by the User object.
The caller must have a role that includes an Administrator access level to create, update, or delete user profiles.
|
Virtual Folders |
|
|
Endpoints under /virtualfolder provide access to VirtualFolder objects. Virtual folders allow users to
divide the resources belonging to a data source into smaller groups for easier management. In addition,
virtual folders allow a data resource to be part of multiple folders, so customers can create folders
with overlapping sets of data resources.
|
Waterline Data Logs |
|
|
|

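
The contrast drawn in the table above between the Browse endpoints (resources grouped by location) and the Search endpoints (resources grouped by shared metadata) can be illustrated with a small in-memory model. This is only a conceptual sketch: the `Resource` record, its fields, and both helper functions are hypothetical and do not reflect the actual request or response payloads.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for a catalog entry."""
    path: str
    tags: set = field(default_factory=set)


def browse(resources, parent_path):
    """Browse: select the direct children of a location."""
    prefix = parent_path.rstrip("/") + "/"
    return [
        r for r in resources
        if r.path.startswith(prefix) and "/" not in r.path[len(prefix):]
    ]


def search(resources, tag):
    """Search: select resources sharing a metadata characteristic (here, a tag)."""
    return [r for r in resources if tag in r.tags]


catalog = [
    Resource("/data/sales/2019.csv", {"revenue"}),
    Resource("/data/sales/2020.csv", {"revenue", "pii"}),
    Resource("/data/hr/staff.csv", {"pii"}),
]

print([r.path for r in browse(catalog, "/data/sales")])
print([r.path for r in search(catalog, "pii")])
```

Browsing `/data/sales` returns only the two files stored there, while searching for the `pii` tag returns files from two different folders.
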
type | description |
---|---|
AbstractField | The complete metadata for each field (column) in a data file or table. |
AccessLevel | Access levels are a component of user roles that describe sets of Waterline Data behaviors available on the data sources and tag domains included in a role. The access levels are hierarchical: all behaviors available to the Guest level are available to the Analyst; all Analyst behaviors are available to the Steward level; all Steward behaviors are available to the Admin level. |
AssociationCount | Type that describes tag association counts by state for a given tag. |
AtomicField | The complete metadata for each field (column) in a data file or table. |
AuditEvent | Type that describes the Waterline Data activities that users (including the system user) perform. They include the action, the user who performed the action, and the Waterline Data objects that are affected by the action. |
AuditEventListResponse | Type that wraps a list of AuditEvent objects.
It includes the pagination start index and page size for returning a fixed number of audit event descriptions.
|
AuditFilterCriteria | |
BrowseRequest | The type that describes the data resource payload returned when viewing a list of resources based on their context in a file system or database. It includes the parent data resource (typically a folder or database) and pagination start and size. It can also include selections to filter the child resources based on facets. |
BrowseResult | Type that defines the fields to display a Waterline Data catalog entry ("data resource"), whether it is a container such as a folder or an individual file or table. Note that Waterline Data collections are described both as a container and as a single data set. |
Category | Enumeration that describes groupings for configuration properties. These categories are for organization and are not functional. |
ConfigProperty | Type that describes the configuration settings used to control Waterline Data behaviors. Properties include controls such as whether to profile files that failed previously. The information provided for each property includes the property name, the label used in the UI, a description of what the property controls, and attributes for who and how the property can be viewed or changed. |
Credential | |
Credentials | Type that describes the username and password for the current user's authenticated session. Credentials details are passed to the system, but cannot be returned. |
CustomHeaderProperties | |
CustomProperty | Type describing a user-defined property defined for a Waterline Data entity. |
DataObjectRequest | Base type for Waterline Data metadata objects. |
DataPartition | Type representing DataPartition |
DataPreview | |
DataResource | Type that includes all the properties that describe a database, folder, table, schema, file, or Waterline Data collection. You can add properties to a data resource using a PUT call to /entityspec/{key}/propertygroup. |
DataResourceFlexTO | Transfer object used to return data for the resource/metadata API. |
DataResourceLite | Type that includes the properties that describe a lightweight ("lite") resource. Used in the children array of browse and search API responses. |
DataSource | Type including the fields that describe a data source, including source type and connection information. |
DataSourceDependencies | |
DataType | Enumeration of data types for Waterline Data property values. These types are Java primitive types plus string, decimal, date, timestamp, array, object, and map. |
DatasetMember | |
DatasetRequest | Base type for Waterline Data metadata objects. |
Entity | Base type for Waterline Data metadata objects. |
EntitySpecification | Type that describes a Waterline Data object type. |
ExternalSource | Type that describes applications with which Waterline Data can exchange metadata. For example, connections to Cloudera Navigator or Apache Atlas are defined as ExternalSource objects. |
FacetResult | Type that describes a search facet, including facet category counts for the number of data resources or fields included in search results. |
FacetSelection | Type to indicate a filter on a search result. |
Favorite | Type that describes favorites, including the marked item and the time it was marked. |
GroupIdentity | |
IterableOfExternalSource | |
IterableOfTableComparisionTO | |
IteratorOfString | |
JobDetailsTO | |
JobExecutionResultTO | |
JobExecutionTO | |
JobSequenceTO | |
JoinConditionTO | |
LineageRelationPath | |
LineageRelations | Type that describes the connection between two data resources in a lineage graph, including the operation that identifies the connection. |
LineageSearchPathTO | |
LogFileContent | |
LogFileElement | |
LogFolderElement | |
LoginResponse | Type that provides a response to a login attempt. Successful attempts return a SUCCESS flag and message and echo the username. Failed attempts return a FAILURE flag and a message that describes the reason for failing. |
LoginResult | Enumeration of the possible outcomes of a login attempt, including "SUCCESS" and "FAILURE". |
LogoutResponse | |
Member | Type that describes the payload for the dataset add-member and remove-member APIs. |
Node | Type that describes the operation and most recent operation execution that connects two resources in a lineage graph. |
ObjectField | Type that describes field-level metadata for data resources. |
OpDetails | Type that describes the operation executions that correspond to an operation. |
OperationExecution | |
PaginatedResponse2OfDataResourceFlexTO | Response payload used in the dataresource/metadata API. |
PaginatedResponseOfDataResource | |
PaginatedResponseOfJobDetailsTO | |
PaginatedResponseOfVirtualFolder | |
PagingCriteria | Type describing the start and size of data resource or field information in search or browse results. |
PathElement | Type that maps a resource identifier (key) to the unique name of the resource, including its filesystem path or parent database. |
PathInfo | Type that describes the location of a resource based on the data source (indicated by its name and unique identifying key) and the fully qualified path of the resource. |
PathSpecification | Type describing PathSpecification of a virtual-folder |
PatternTO | |
PatternValue | |
Principal | |
Properties | |
Property | Type that describes metadata associated with an entity. You can create additional properties to enhance the metadata stored for data resources. |
PropertySpecification | Type describing the attributes of properties on Waterline Data entities. |
PropertyType | Enumeration of property value data types. |
RegexTester | Type that includes the regular expression used to identify field data to associate with a tag. This type includes a sample of field data to validate the regular expression against. |
ReportedField | Type representing ReportedField of data resource |
ResourceField | Type that describes a field in a data resource, including the field name, its data type, and the unique identifying key of the data resource the field is associated with. |
Role | Type describing the Waterline Data access model for users. Roles include an access level, one or more data sources, and one or more tag domains. |
Rule | Base type for Waterline Data metadata objects. |
RuleAction | |
RuleScope | |
SearchCriteria | Type that describes the criteria for searching on data resources or fields. Criteria can include a keyword and/or facets. |
SearchResult | Type that describes the data resources or fields returned in a search. The search criteria are also included. |
SearchType | |
Semantic | Enumeration of values that describe how the tag association contributes to tag association discovery. |
Semantic | Enumeration of values that describe how the domain was created. Should not be set from the REST API. |
SessionData | |
Source | Type that describes a parent in a lineage relationship, including the fully-qualified data resource name (path) and whether or not the data resource is a Waterline Data collection. |
State | Enumeration of values that describe the state of the tag association. |
State | |
StatusDetail | Type that describes the execution status of a job. |
TableComparisionTO | |
Tag | Type that describes a tag, including the tag name and properties that describe how the tag is used in tag association discovery. |
TagAssociation | Type that describes a tag association, including a data resource or resource and field, a tag, the state of the association, and whether or not the field data from this association is used as a reference for tag discovery (Semantic). |
TagAssociationContainer | Type that describes a list of tag associations. |
TagAssociationRequest | |
TagAssociationView | List of tag associations for this data resource. The content includes values designed to be human readable: it includes the names of the relevant items in addition to their unique keys. |
TagDetail | |
TagDetails | |
TagDomain | Type that describes a tag domain, including the name and description for the domain. |
TagDomainContainer | Type that describes a tag domain and all of the included tags. |
TagRequest | Type that describes the minimum properties needed to create a tag, including the tag domain (by key), the tag name, and the tag description. |
TagState | Enumeration of the types of tag propagation available in Waterline Data. |
UpdatedUserReview | Type that describes the new statistics associated with the rating on a data resource after a user review for that resource has been updated. It also includes the components of the updated review. |
User | Type that describes a user profile, including the user name and the roles associated with this user. |
UserIdentity | |
UserInfo | Type that provides information about the current user. |
UserReview | Type that includes the components of a review, including a title, comment, and rating.
The UserReview also includes the username of the person who created the review.
|
VirtualFolder | Base type for Waterline Data metadata objects. |
VirtualFolderDependencies | |
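
The AccessLevel entry in the type table describes a hierarchy in which each level includes every behavior of the level below it (Guest, Analyst, Steward, Admin). One way to model that ordering on the client side is an ordered enumeration. This is a sketch under assumptions: the numeric values and the `is_allowed` helper are illustrative and are not taken from the API.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Levels ordered so that a higher value includes all lower-level behaviors.

    The numeric values are illustrative, not the values used by the API.
    """
    GUEST = 1
    ANALYST = 2
    STEWARD = 3
    ADMIN = 4


def is_allowed(user_level, required_level):
    """Return True when the user's level is at least the required level."""
    return user_level >= required_level


print(is_allowed(AccessLevel.STEWARD, AccessLevel.ANALYST))
print(is_allowed(AccessLevel.GUEST, AccessLevel.ANALYST))
```

Because the levels are strictly ordered, a single comparison is enough to decide whether a user at one level may perform a behavior that requires another.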