Backup to S3 is a low-cost method of backing up the data on your VAST Cluster. It enables you to replicate a VAST Cluster's data to an AWS S3 bucket or to any custom destination that supports S3 access. You can restore the replicated data either to the VAST Cluster it came from or to a different VAST Cluster.
Backup to S3 is a new feature in VAST Cluster 3.2.0.
Please note some specific limitations in this version (described in more detail below):
Data can be replicated from a VAST Cluster to only one S3 bucket (target) at any given time. Replication is performed by a replication stream, and no more than one stream can be configured at a time.
The VAST Cluster's entire file system is replicated. There is no way to limit the replication to a specific data set.
If you delete a stream and later create another stream to resume replicating to the same target, the new stream performs a full initial copy of all data to the target instead of copying only the data that was not already there. (You can, however, pause and resume a stream.)
To set up S3 backup, first create a replication target. A replication target defines a destination to which data can be replicated. Then create a replication policy. A replication policy specifies a target, a starting time and a frequency for performing replication. Once you have a target and a policy, create a replication stream. The replication stream takes snapshots of the data at the configured points in time and performs replication to the target following each snapshot.
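The setup order described above can be pictured with a small in-memory model. This is a conceptual Python sketch only, not the VAST Cluster API or CLI: the class names (ReplicationTarget, ReplicationPolicy, ReplicationStream, Cluster), their fields, and the bucket name are all hypothetical. It assumes only what this page states: a policy refers to a target, a stream needs a policy, several targets may coexist, and one stream at most can be configured.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class ReplicationTarget:
    name: str
    bucket: str  # hypothetical field: the destination S3 bucket


@dataclass(frozen=True)
class ReplicationPolicy:
    target: ReplicationTarget
    start: datetime
    frequency: timedelta


@dataclass
class ReplicationStream:
    name: str
    policy: ReplicationPolicy


@dataclass
class Cluster:
    targets: List[ReplicationTarget] = field(default_factory=list)
    stream: Optional[ReplicationStream] = None

    def add_target(self, target):
        # Several targets may be configured side by side.
        self.targets.append(target)

    def create_stream(self, name, policy):
        if policy.target not in self.targets:
            raise ValueError("policy refers to a target that is not configured")
        if self.stream is not None:
            raise RuntimeError("only one replication stream can be configured at a time")
        self.stream = ReplicationStream(name, policy)
        return self.stream


# Step 1: target, step 2: policy, step 3: stream.
cluster = Cluster()
target = ReplicationTarget("backup", "example-bucket")
cluster.add_target(target)
policy = ReplicationPolicy(target, datetime(2020, 7, 1), timedelta(days=1))
stream = cluster.create_stream("nightly", policy)
```

Attempting a second create_stream call on the same Cluster object raises RuntimeError, which mirrors the one-stream limitation.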
Creating a replication target on a VAST Cluster enables you to:
Configure replication of the VAST Cluster's data to the target.
Access, from a client mount of the VAST Cluster's file system, data that was already replicated to the target.
You can have multiple replication targets configured in parallel on a VAST Cluster. This enables you to access data that was replicated to all of those targets. However, you cannot replicate data to more than one target at any given time. See also Replication Streams.
A replication policy is a reusable configuration that defines the date and time to start replicating and the frequency at which replication is performed. For example, you could set replication to start on July 1st, 2020 at midnight and repeat once a day. Replication would then run every day at midnight, beginning July 1st.
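The arithmetic behind a policy is simply a start time plus multiples of the frequency. The following Python sketch illustrates it using the July 1st example; the function name replication_times is hypothetical and is not part of any VAST interface.

```python
from datetime import datetime, timedelta
from itertools import islice


def replication_times(start, frequency):
    """Yield start, start + frequency, start + 2 * frequency, and so on."""
    if frequency <= timedelta(0):
        raise ValueError("frequency must be positive")
    current = start
    while True:
        yield current
        current += frequency


# Start on July 1st, 2020 at midnight and repeat once a day.
schedule = replication_times(datetime(2020, 7, 1, 0, 0), timedelta(days=1))
first_three = list(islice(schedule, 3))
# first_three holds midnight on July 1st, July 2nd and July 3rd, 2020.
```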
Replication streams copy data from a VAST Cluster to a replication target. Only one replication stream can be configured on a VAST Cluster at any given time.
Immediately after you create a replication stream, the stream starts copying all data under the root directory to the replication target and continues copying until an initial state of sync between the VAST Cluster and the target is reached. Following the initial sync, the stream performs replication at the times configured in the replication policy. Because of the immediate initial sync, the start time in the policy is in fact the time of the second replication, not the first.
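To make the ordering concrete, here is a minimal Python sketch of the resulting timeline. The function name replication_events and the labels are hypothetical. The sketch assumes that the policy start time falls after the stream is created and that the initial sync finishes before then; this page does not describe what happens otherwise, so the sketch rejects that case rather than guessing.

```python
from datetime import datetime, timedelta


def replication_events(created_at, policy_start, frequency, scheduled_count):
    """Return (label, time) pairs: the initial sync, then the scheduled runs."""
    if policy_start <= created_at:
        # Behaviour for this case is not described in the text, so refuse it.
        raise ValueError("this sketch assumes the policy start is after stream creation")
    events = [("initial sync", created_at)]
    for i in range(scheduled_count):
        events.append(("scheduled", policy_start + i * frequency))
    return events


events = replication_events(
    created_at=datetime(2020, 6, 28, 15, 30),
    policy_start=datetime(2020, 7, 1, 0, 0),
    frequency=timedelta(days=1),
    scheduled_count=2,
)
# events[0] is the initial sync at creation time;
# events[1] is the policy start time, i.e. the second replication overall.
```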
The stream performs replication by taking a snapshot of the VAST Cluster's data at a point in time and then copying data to the target until replication is complete. Only the changes between the data at the time of the snapshot and the data previously copied to the target are copied. For each point in time, the stream creates a restore point from which the data can be accessed.
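The idea of copying only changes can be illustrated at file level. This Python sketch is an illustration of incremental comparison in general, not a description of how VAST Cluster tracks changes internally: it models the previously copied state and the new snapshot as plain path-to-content dictionaries, and the function name changes_to_copy is hypothetical. Treating removed paths as part of the change set is also an assumption.

```python
def changes_to_copy(previous, snapshot):
    """Compare two {path: content} mappings.

    Returns (to_upload, to_delete): entries that are new or modified in
    `snapshot`, and paths present in `previous` but missing from `snapshot`.
    """
    to_upload = {
        path: data
        for path, data in snapshot.items()
        if path not in previous or previous[path] != data
    }
    to_delete = sorted(path for path in previous if path not in snapshot)
    return to_upload, to_delete


previous = {"/a.txt": b"v1", "/b.txt": b"v1", "/c.txt": b"v1"}
snapshot = {"/a.txt": b"v1", "/b.txt": b"v2", "/d.txt": b"v1"}
to_upload, to_delete = changes_to_copy(previous, snapshot)
# /a.txt is unchanged and is skipped; /b.txt changed and /d.txt is new,
# so both are uploaded; /c.txt no longer exists in the snapshot.
```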
If you create a stream to replicate data to a target that already holds data replicated by an earlier stream that has since been deleted, the initial sync is performed again. In other words, re-creating a stream triggers a new upload of the entire data set to the target.
Restore points are time-stamped directories on the target that let you access data as it was at the point in time when it was replicated. Each active replication stream creates a restore point every time it performs replication. Each restore point is named after the time at which the snapshot for that replication was created. Restore points remain on a target regardless of whether the stream that created them still exists.
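Because restore points are named by snapshot time, choosing one for a restore amounts to picking the newest name at or before the moment you want. The Python sketch below shows that lookup. The name format in NAME_FORMAT is an assumption made for illustration, since this page does not specify the actual naming format, and the function name latest_restore_point is hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

NAME_FORMAT = "%Y-%m-%dT%H-%M-%S"  # assumed format, not taken from the product


def latest_restore_point(names, wanted):
    """Return the newest restore point name at or before `wanted`, or None."""
    parsed = sorted((datetime.strptime(name, NAME_FORMAT), name) for name in names)
    times = [time for time, _ in parsed]
    index = bisect_right(times, wanted)
    return parsed[index - 1][1] if index else None


names = ["2020-07-01T00-00-00", "2020-07-02T00-00-00", "2020-07-03T00-00-00"]
choice = latest_restore_point(names, datetime(2020, 7, 2, 13, 0))
# choice is "2020-07-02T00-00-00", the last restore point before 13:00 on July 2nd.
```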