VAST Catalog is a database that indexes the metadata attributes of all data on the cluster. It enables high-performance querying of files, objects and directories by classifying data according to attributes that are used across multiple access protocols. Storage administrators can query VAST Catalog through the VAST Web UI or through the VMS API. All users, including end users on the cluster's client network, can query it through the VAST Catalog CLI or a connected third-party query engine.
VAST Catalog builds its index of metadata attributes from periodic snapshots of the cluster's data. The database is stored in a dedicated S3 bucket on the cluster.
VAST Catalog can be used to:
- Find files, such as:
  - Finding all files older than 90 days and larger than 10GB that reside in the /projects directory.
  - Finding all files created since last week by a specific user.
  - Finding all S3 objects with the tag processed where the value of the tag is false.
- Report capacity and usage, such as:
  - Ranking users consuming the most capacity in specific folders or projects.
  - Ranking capacity usage by file extension.
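The example queries above boil down to a few predicates and one aggregation over the indexed attributes. The following is a minimal in-memory sketch of that logic in Python, not the VAST Catalog API. The `Element` class, its field names, the helper functions and the sample rows are all hypothetical, "older than" is interpreted as creation time, and 10GB is read as 10 GiB.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

GIB = 1024 ** 3  # assumption: "10GB" is read as 10 GiB


@dataclass
class Element:
    """Hypothetical stand-in for one indexed element; field names are illustrative."""
    parent_path: str
    name: str
    size: int                # bytes
    creation_time: datetime
    owner_name: str
    tags: dict = field(default_factory=dict)  # S3 tags modeled as a map


def under(path, root):
    """True if `path` is `root` itself or lies below it."""
    root = root.rstrip("/")
    return path == root or path.startswith(root + "/")


def old_large_files(rows, root, min_size, older_than, now):
    """Elements under `root`, larger than `min_size`, created before the cutoff."""
    cutoff = now - older_than
    return [
        r for r in rows
        if under(r.parent_path, root)
        and r.size > min_size
        and r.creation_time < cutoff
    ]


def with_tag(rows, key, value):
    """Elements whose S3 tag `key` has exactly the value `value`."""
    return [r for r in rows if r.tags.get(key) == value]


def capacity_by_owner(rows):
    """Total size per owner, largest consumer first."""
    totals = {}
    for r in rows:
        totals[r.owner_name] = totals.get(r.owner_name, 0) + r.size
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample data, for illustration only.
NOW = datetime(2024, 6, 1)
ROWS = [
    Element("/projects/sim", "run1.dat", 20 * GIB, datetime(2024, 1, 10), "alice"),
    Element("/projects/sim", "run2.dat", 20 * GIB, datetime(2024, 5, 20), "bob"),
    Element("/projects", "small.txt", 1024, datetime(2023, 1, 1), "alice"),
    Element("/projects2", "big.bin", 50 * GIB, datetime(2023, 1, 1), "carol"),
    Element("/bucket", "obj1", 5 * GIB, datetime(2024, 5, 1), "bob",
            {"processed": "false"}),
]

for hit in old_large_files(ROWS, "/projects", 10 * GIB, timedelta(days=90), NOW):
    print(hit.parent_path, hit.name)
```

Each helper maps to one query condition or report, so the same conditions can be restated in whichever query tool is actually used.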
The following tools are available for querying VAST Catalog:
- The VAST Catalog page in the VAST Web UI. This page provides a graphical user interface for building and running queries against VAST Catalog. The tool returns results in seconds and displays them in a table with customizable columns. You can display any selection of VAST Catalog's indexed columns in the query results, and you can export the results to CSV format for further processing.
- The VAST Catalog Command Line Interface (CLI). This interface brings the same fast and powerful query capabilities to clients on the enterprise network. It runs in a standard Unix shell, so you can pipe its output to other Unix tools.
- The Trino open source query engine. You can expose VAST Catalog to Trino using a Trino-deployed storage connector.
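As an illustration of further processing on CSV results, the sketch below ranks on-disk usage by file extension using only the Python standard library. The header names `extension` and `used`, the function name and the sample data are assumptions, not the actual export format, so adjust them to match the headers in a real result file.

```python
import csv
import io
from collections import defaultdict


def usage_by_extension(csv_file, ext_col="extension", used_col="used"):
    """Sum on-disk bytes per file extension and rank largest first.

    `ext_col` and `used_col` are hypothetical header names; match them
    to the headers in the actual CSV.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Group elements without an extension under a placeholder label.
        totals[row[ext_col] or "(none)"] += int(row[used_col])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample standing in for an exported result set.
SAMPLE = """name,extension,used
a.log,log,100
b.log,log,250
c.csv,csv,300
README,,50
"""

ranking = usage_by_extension(io.StringIO(SAMPLE))
for ext, used in ranking:
    print(f"{ext}\t{used}")
```

The function accepts any text file object, so `sys.stdin` can be passed in place of the `io.StringIO` sample when CSV data arrives on a pipe.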
VAST Catalog indexes the following attributes:
| Attribute | Type | Description |
|---|---|---|
| | timestamp | The date and time of element creation |
| | integer | Owner's POSIX UID |
| | varchar | Owner's SMB SID |
| | varchar | Owner user name |
| | integer | Owner group's POSIX GID |
| | integer | Owner group's SMB SID |
| | varchar | Name of owner group |
| | timestamp | Time of last file access |
| | timestamp | Time of last modification |
| | timestamp | Time of last file system metadata change |
| | integer | Number of associated hard links |
| | integer | Type of element, such as FILE (file), DIR (directory) |
| | integer | The size of the element |
| | integer | Number of bytes used on disk |
| | varchar | Element name |
| | varchar | File extension |
| | varchar | Parent path of file |
| | varchar | The path of the link, if the element is a symbolic link |
| | integer | Major device number |
| | integer | Minor device number |
| | integer | POSIX permissions mode bits |
| | boolean | Indicates whether or not extended ACEs (NTFS/NFS4/Posix) exist on this element |
| | boolean | Indicates whether there is an S3 object locking legal hold on this object |
| | integer | The number of S3 tags an element has |
| | map() | S3 tags associated with the element |
| | map() | S3 metadata items on object (x-amz-meta) |
| | varchar | The login name of the element owner |
| | varchar | Path where the element can be searched |
It is also possible to add user-defined S3 tags and S3 metadata as additional columns.
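Conceptually, exposing a tag as its own column is like flattening one key of a map-valued attribute into a dedicated field, so that it can be filtered like any other column. The sketch below illustrates that idea only. It does not show how VAST Catalog implements or configures the feature, and the function, the field name `tags` and the `tag_` prefix are hypothetical.

```python
def promote_tags(rows, tag_keys, tags_field="tags", prefix="tag_"):
    """Copy selected tag keys out of a map-valued field into flat columns.

    Rows that lack a key get None in the new column. All names here are
    illustrative, not VAST Catalog identifiers.
    """
    promoted = []
    for row in rows:
        flat = dict(row)  # shallow copy, so the input rows stay untouched
        tags = row.get(tags_field) or {}
        for key in tag_keys:
            flat[prefix + key] = tags.get(key)
        promoted.append(flat)
    return promoted


# Made-up sample rows.
rows = [
    {"name": "obj1", "tags": {"processed": "false", "team": "ml"}},
    {"name": "obj2", "tags": {}},
]
flat_rows = promote_tags(rows, ["processed"])
unprocessed = [r["name"] for r in flat_rows if r["tag_processed"] == "false"]
print(unprocessed)
```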