With VAST Database, also known as Natural Database or NDB, you can store, access and manage tabular data on a VAST cluster.
VAST Database tables are accessed through a REST API over HTTP(S). Using the REST API, you can create tables, filter data, modify data and run ACID transactions.
VAST provides database connectors (plugins) that enable third-party query engines to run queries against data stored in a VAST database. A query engine can use its pushdown capabilities and offload work to the VAST cluster.
Because pushed-down work runs on the VAST cluster, data can be accessed with less effort and with less data transfer between the query engine and the storage. In addition, you can leverage VAST analytics, scalability, performance, data reduction and snapshots on tabular data.
Check the list of supported query engines and follow the steps to set up an environment for running queries against a VAST database.
For instructions on managing VAST databases via VAST Web UI or VAST CLI, see Managing VAST Databases.
For an overview of VAST analytics on tabular data, see Monitoring VAST Databases.
VAST provides database connectors for the following query engines:
- Trino 3.7.5
- Apache Spark 3.3.1
VAST Database supports predicate and dereference pushdowns.
If a database query includes requests that are not supported by VAST Database, the query engine generates a query execution plan to perform those operations after receiving filtered data back from VAST.
In the following example, the predicates in the WHERE clause are pushed down to VAST Database, and the query engine then averages the returned data:
SELECT AVG(col1) FROM vast.schema.mytable WHERE col1 > 100 AND col3 < 50;
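The division of work in this query can be sketched in plain Python. This is a conceptual illustration only, not VAST or query-engine code: the function names `storage_scan` and `engine_avg` and the sample rows are hypothetical, chosen to show which part of the query runs where.

```python
from statistics import mean

# Hypothetical sample rows standing in for vast.schema.mytable.
ROWS = [
    {"col1": 150, "col3": 10},
    {"col1": 90, "col3": 20},
    {"col1": 250, "col3": 70},
    {"col1": 200, "col3": 40},
]

def storage_scan(rows):
    """Storage side: apply the pushed-down predicates and return only matching rows."""
    return [r for r in rows if r["col1"] > 100 and r["col3"] < 50]

def engine_avg(rows, column):
    """Engine side: aggregate the rows that came back already filtered."""
    values = [r[column] for r in rows]
    return mean(values) if values else None

filtered = storage_scan(ROWS)          # only 2 of the 4 rows leave the storage side
result = engine_avg(filtered, "col1")  # the average is computed by the engine
```

Only the rows that satisfy both predicates cross from storage to the engine, which is where the reduction in data transfer comes from.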
VAST Database supports the following Apache Arrow, Trino, and Apache Spark data types:
| VAST Database data type | Arrow data type | Trino data type | Spark data type |
|---|---|---|---|
| UINT8, UINT16, UINT32, UINT64 | UINT8, UINT16, UINT32, UINT64 | n/a | n/a |
| INT8, INT16, INT32, INT64 | INT8, INT16, INT32, INT64 | TINYINT, SMALLINT, INTEGER, BIGINT | TINYINT, SMALLINT, INTEGER, BIGINT |
| BOOL | BOOL | BOOLEAN | BOOLEAN |
| FLOAT32, FLOAT64 | FLOAT, DOUBLE | REAL, DOUBLE | REAL, DOUBLE |
| STRING | STRING | VARCHAR | STRING |
| BINARY | BINARY | VARBINARY | BINARY |
| DECIMAL128 | DECIMAL128 | DECIMAL | DECIMAL |
| DATE32 | DATE32 | DATE | DATE |
| TIMESTAMP | TIMESTAMP | TIMESTAMP | TIMESTAMP |
| TIME32 | TIME32 | TIME | n/a |
| TIME64 | TIME64 | TIME | n/a |
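When planning a schema, it can help to check up front which columns have no Trino equivalent. The following is a minimal sketch that transcribes the Trino column of the table into a dictionary. The names `VAST_TO_TRINO`, `trino_type` and `columns_without_trino_type` are hypothetical, and the pairing of grouped entries (for example, INT8 with TINYINT) assumes the types correspond in the order listed.

```python
# Trino type for each VAST Database type, transcribed from the table above.
# None marks "n/a". Grouped table entries are assumed to pair up in listed order.
VAST_TO_TRINO = {
    "UINT8": None, "UINT16": None, "UINT32": None, "UINT64": None,
    "INT8": "TINYINT", "INT16": "SMALLINT", "INT32": "INTEGER", "INT64": "BIGINT",
    "BOOL": "BOOLEAN",
    "FLOAT32": "REAL", "FLOAT64": "DOUBLE",
    "STRING": "VARCHAR",
    "BINARY": "VARBINARY",
    "DECIMAL128": "DECIMAL",
    "DATE32": "DATE",
    "TIMESTAMP": "TIMESTAMP",
    "TIME32": "TIME", "TIME64": "TIME",
}

def trino_type(vast_type):
    """Return the Trino type name, or None if the table lists n/a.

    Raises KeyError for a type that does not appear in the table.
    """
    return VAST_TO_TRINO[vast_type.upper()]

def columns_without_trino_type(schema):
    """Given {column_name: vast_type}, list columns that Trino cannot represent."""
    return [name for name, vtype in schema.items() if trino_type(vtype) is None]

# Hypothetical schema used only to exercise the helpers.
sample_schema = {"id": "UINT64", "name": "STRING", "created": "TIMESTAMP"}
unmapped = columns_without_trino_type(sample_schema)
```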
Tip
Pay attention to precision settings when dealing with data types that have a configurable precision. For example, a data type of `timestamp` specified in a Parquet file can be seen as `timestamp(3)` in Trino (as a result of a Trino default setting) while processed as `timestamp(6)` in VAST Database.
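To see why the precision mismatch matters, the sketch below reduces a microsecond-precision value (comparable to `timestamp(6)`) to millisecond precision (comparable to `timestamp(3)`) using only the Python standard library. The helper name `truncate_to_millis` is hypothetical, and truncation is used for simplicity; an engine may round instead.

```python
from datetime import datetime

def truncate_to_millis(ts: datetime) -> datetime:
    """Drop the sub-millisecond digits, as a precision-3 timestamp would not keep them."""
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)

# Microsecond precision, comparable to timestamp(6).
stored = datetime(2023, 1, 15, 12, 30, 45, 123456)

# Millisecond precision, comparable to timestamp(3).
seen = truncate_to_millis(stored)

# The two values no longer compare equal, so an equality filter written
# against one precision can miss rows stored at the other.
values_match = (seen == stored)
```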
VAST Database offers query performance gains by offloading a subset of operations to the VAST cluster where the tabular data resides.
VAST Database is not currently intended to provide capabilities such as advanced query parsing or planning, JOINs, or multifaceted SQL queries directly on VAST-stored data.