name | path | methods | description |
---|---|---|---|
Agent |
|
|
Endpoints under /agent provide access to Agent objects. Agents are the processing engine components
that can run on multiple remote clusters.
|
Audit Events |
|
|
Endpoints under /auditevent provide access to audit events, which record Waterline Data system and user activities.
You can use audit event characteristics such as user, date, or type of activity to filter the events returned.
Audit events are available only to users with an Administrator role. |
Authentication |
|
|
The Authentication API accepts the same user credentials as the Waterline Data browser. These API calls control the creation and destruction of a session cookie to validate other API calls. |
Browse: Field-level |
|
|
Endpoints under /browse/field manage the facets available when users view a list of fields for a given resource.
Field-level calls return results based on the user credentials for the current authentication session and the parent data resource. |
Browse: Resource-level |
|
|
Endpoints under /browse allow you to list and manage metadata for data resources based on commonalities of location.
Compare with the /search functionality, which lists data resources that share some metadata characteristics.
Browse functionality depends on how data is stored and its location on a file system or database.
Browse facets are data resource attributes, such as file format, associated tags, size, and so on. Facets provide a list of items or ranges that include the values represented in the browse context. For example, the Content Type facet includes all file format types found in the catalog, but only the values that apply to the current browse context are considered. The browse call maintains state. The sequence of calls is important, as each call stores information that successive calls use. A BrowseResult object contains path information, an array of facet names and categories, information about the DataSource, and DataResource objects in JSON format. The results of a browse operation can be huge, so results are paginated. The default page size is 25. A browse request can specify a different page size, and after the first request, can specify the 0-based index of a specific page of results to retrieve, or a number of pages to retrieve. Browse calls return results based on the user credentials for the current authentication session. |
ComponentNetworkStatus |
|
|
|
Configuration |
|
|
Endpoints under /config provide access to the configuration options that control job
and web server behavior.
Individual configuration options are represented by the ConfigProperty object. |
Custom Header |
|
|
The Custom Header API returns the customised header and footer details. |
Data Objects |
|
|
|
Data Preview |
|
|
Endpoints under /datapreview provide access to previews of data resource content based on the authorization of the
current user.
|
Data Resources |
|
|
Endpoints under /dataresource provide access to DataResource objects in the Waterline Data repository.
Waterline Data defines a data resource as an HDFS file or folder, a set of files that are organized in
folders (known as a Waterline Data collection), a Hive database or table, or a relational database
or table. DataResource objects are uniquely identified by a "key" value, assigned on creation.
Requests to a {key} endpoint under /dataresource address a specific DataResource object. If a directory,
file, database, or table has not yet been encountered in a format or schema discovery operation,
it does not have a corresponding data resource entry in the Waterline Data catalog.
Data resource calls return results based on the user credentials for the current authentication session. |
Data Sets |
|
|
Endpoints under /dataset provide access to Virtual Dataset objects. A Waterline Virtual Data Set
lets users group resources that have the same schema but span different folders in the data lake
into a single virtual unit for easier management.
Virtual Data Sets are user-defined virtual collections whose members have a matching schema but may have
different path specifications or hierarchies with respect to a data source.
|
Data Sources |
|
|
Endpoints under /datasource provide access to DataSource objects. A Waterline Data data source is the
location of an HDFS directory, a Hive instance or specific database, a cloud storage location, or a relational
database instance or specific database. A data source hosts data resources, in the form of folders, files,
databases, and tables.
DataSource objects are uniquely identified by a "key" value, assigned on creation. In addition, data sources can be referenced by the "name" property, specified by the creator. Data source calls return results based on the user credentials for the current authentication session. |
DataOps |
|
|
Endpoints under /dataops provide the following details:
the number of top-level resources, including collections and data sets;
the total number of fields in the data lake;
all tag associations, at all levels;
all significant objects hosted by WDC;
the count of all items curated or authored by the community;
all users seen on WDC;
and the total number of searches performed on WDC.
These details include catalogmetrics, metadataobjectmetrics, usagemetrics, and communitymetrics.
|
Discussions |
|
|
|
Entity Specifications |
|
|
An entity specification defines properties for Waterline Data objects such as tags, users, roles, data resources,
and so on. Properties of entities are represented by PropertySpec objects. An EntitySpec object associates
a set of properties with a specific entity type. Endpoints under /entityspec allow you to create,
query, update, and delete properties associated with entities, including setting the access level for properties.
Entities are identified by their "key", which in this context is a name. The built-in entities are audit_event, data_resource, data_source, operation_execution, operation, resource_field, role, tag_association, tag_domain, tag, user_review, and user. Operations and operation_executions are not exposed at this time. To access entities, the authenticated user must have an Administrator role. |
External Sources |
|
|
External Sources are sources of metadata to be imported into the Waterline Data catalog. Endpoints under
/externalsource provide a way to identify and connect to these outside applications. For example,
the extensions created to import metadata from applications such as Apache Atlas or Cloudera Navigator use
external sources to persist connection information for the application instance.
External source calls are available only to users with administrator roles. |
Favorites |
|
|
Endpoints under /favorites provide a way to manage user bookmarks on tags and resources.
|
Homepage |
|
|
|
JobManagement |
|
|
|
Lineage |
|
|
Waterline Data defines a LineageRelations object that records and presents the lineage relationships between data
resources. Endpoints under /lineage provide access to these objects.
|
Metadata Rest Server Config |
|
|
Endpoints under /metadata-server allow you to synchronise the Metadata Rest Server
configuration with the app-server.
|
Metadata Server |
|
|
Endpoints under /metadata-server provide access to Metadata Rest Server details objects. The Metadata Rest Server provides the endpoints
that agents use to perform CRUD operations on the repository.
|
Notification |
|
|
|
Overlap tables |
|
|
Endpoints under /overlap/table allow you to pass table information and receive a preliminary overlap
report computed from the table schema.
The results of the comparison are collected in a TableComparisionTO model and returned. |
QueryController |
|
|
|
Reviews |
|
|
|
Roles |
|
|
Waterline Data uses roles to indicate which data resources from the catalog are available to each user.
User roles determine which metadata management actions each user can
perform inside Waterline Data. Endpoints under /role allow creation
of new roles and modification of existing roles. To associate roles with users, use a PUT request to
/user/{key}.
A user with the admin role can create or modify roles with requests to the role endpoint. User roles are defined by the Role object, which has these components:
|
Rules |
|
|
Endpoints under /rule allow you to list and manage the rules.
|
SSLController |
|
|
|
Saved Search |
|
|
Endpoints under /savedsearch allow you to list and manage the Saved Searches.
|
Search |
|
|
Endpoints under /search allow you to collect data resources that share some metadata characteristics.
Compare with the /browse functionality, which returns a set of data resources based on their location.
Searchable characteristics include facets and keywords.
The results of a search are collected in a SearchResult object, which is assigned a unique key value on creation. Access control restrictions apply to the search API. Search results are restricted to data resources in the data sources that the active user's roles include. |
SessionController |
|
|
|
Tag Associations |
|
|
Endpoints under /tagassociation provide access to TagAssociation objects in the Waterline Data repository.
A tag association object defines the relationship of a tag to a specific data resource (folder, file, collection,
or table) or to a specific field in a data resource.
Tag associations can be made manually through API calls or the UI, or automatically through a Waterline Data tag discovery operation. Tag discovery can use patterns from specific field values ("valueEnabled":"true" for the tag) or match field values against a regular expression ("regexEnabled":"true" for the tag). Tag associations have a "tagState" whose value is "ACCEPTED", "REJECTED", or "SUGGESTED".
|
Tag Associations Features |
|
|
|
Tag Domains |
|
|
Waterline Data tags are maintained in a tag glossary that is organized into tag domains. Endpoints under
/tagdomain provide access to TagDomain objects.
All tags must belong to a domain. One tag domain, "Built-in Tags", is dedicated to system tags. Users with admin privileges can create additional tag domains. |
Tags |
|
|
Endpoints under /tag provide access to Tag objects in the Waterline Data repository. Tags are
labels associated with data resources (folders, files, collections, or tables), or with specific fields in data resources.
|
Tokens |
|
|
Endpoints under /token provide access to Token objects. Tokens are used to authenticate inter-component communication
in the distributed architecture.
|
User Profiles |
|
|
Endpoints under /user allow administrators to manage user profiles, represented by the User object.
The caller must have a role that includes an Administrator access level to create, update, or delete user profiles.
User objects are uniquely identified by a key, assigned on creation. Requests to a {key} endpoint under /user address a specific User object.
|
Virtual Folders |
|
|
Endpoints under /virtualfolder provide access to VirtualFolder objects. Virtual folders allow users to
group resources that belong to a data source into smaller units for easier management. In addition,
virtual folders allow data resources to be part of multiple folders, which lets customers create folders
with overlapping sets of data resources.
|
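
The Browse: Resource-level description states that results are paginated, that the default page size is 25, and that pages are addressed by a 0-based index. The arithmetic behind that can be sketched as follows. This is only an illustration: the function names `page_count` and `page_bounds` are hypothetical and not part of the Waterline Data API, and no request is sent to a server.

```python
import math

DEFAULT_PAGE_SIZE = 25  # default page size stated in the Browse description


def page_count(total_results, page_size=DEFAULT_PAGE_SIZE):
    """Return how many pages are needed to hold total_results items."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_results / page_size)


def page_bounds(total_results, page_index, page_size=DEFAULT_PAGE_SIZE):
    """Return (start, end) offsets, end exclusive, for a 0-based page index."""
    if page_index < 0 or page_index >= page_count(total_results, page_size):
        raise IndexError("page_index out of range")
    start = page_index * page_size
    end = min(start + page_size, total_results)
    return start, end
```

For a browse context with 60 resources and the default page size, this gives three pages, and the last page (index 2) holds only the remaining 10 resources. A client could use the same calculation to decide which page index to request after the first browse call.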