The VAST Cluster S3 service supports client connections from any S3 toolkit that can call the REST API over HTTP or HTTPS.
VAST Data-specific guidelines are available for using the S3cmd command line tool or the Boto 3 SDK for Python.
In general, to get started with any client interface, you will need to configure the following:

- The endpoint for the VAST Cluster S3 service. Specify any of the cluster's virtual IPs.
- The connection type. VAST Cluster can listen for S3 service requests on HTTP or on HTTPS. When configuring the client to connect over HTTPS, read about installing an SSL certificate and configuring certificate verification on the client.

  Note: It is possible to disable the S3 HTTP and/or HTTPS connection. See S3 Connection Settings.

- Credentials. Specify an access key pair generated for you via VMS.
- Signature type for authenticated requests. VAST Cluster supports AWS Signature Version 2 and AWS Signature Version 4.
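As an illustration, the settings above might be wired into a Boto3 client roughly as follows. This is a sketch, not a definitive recipe: the helper names, the endpoint IP, and the key-pair values are placeholders, not part of any VAST API.

```python
def make_client_kwargs(endpoint_url, access_key, secret_key, use_ssl=False):
    """Collect the basic connection settings for a Boto3 S3 client
    pointed at one of the cluster's virtual IPs. The endpoint and
    credentials here are placeholders."""
    return {
        'service_name': 's3',
        'use_ssl': use_ssl,
        'endpoint_url': endpoint_url,
        'aws_access_key_id': access_key,
        'aws_secret_access_key': secret_key,
    }

def connect(endpoint_url, access_key, secret_key):
    # boto3 is imported lazily so the helper above has no dependencies.
    import boto3
    config = boto3.session.Config(
        signature_version='s3v4',          # AWS Signature Version 4
        s3={'addressing_style': 'path'},   # path-style bucket addressing
    )
    kwargs = make_client_kwargs(endpoint_url, access_key, secret_key)
    return boto3.session.Session().client(config=config, **kwargs)
```

A caller would then connect with something like `s3 = connect('http://198.51.100.255', 'ACCESSKEY', 'SECRETKEY')` and issue requests such as `s3.list_buckets()`.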
VAST Cluster supports a subset of S3 API actions. They are listed below with references to guidelines for how to call them using some specific client interfaces.
| Task | Amazon S3 Action Name | s3cmd Command | Boto 3 S3 Resource Usage | Boto 3 S3 Client Method |
|---|---|---|---|---|
| **Bucket Operations** | | | | |
| List the buckets on the S3 server | ListBuckets | | | |
| Create a bucket | CreateBucket | | | |
| List the objects in a bucket | ListObjectsV2 (revised version); ListObjects (prior version, supported by Amazon for backward compatibility) | | | |
| Verify whether a bucket exists and whether you have permission to access it | HeadBucket | N/A | | |
| Delete a bucket | DeleteBucket | | | |
| Return the region in which the bucket resides | GetBucketLocation | | | |
| **Bucket Access Control List (ACL) Operations** (Note: for information about ACLs, see Managing S3 Access Control Rules (ACLs)) | | | | |
| Set the ACL of a bucket | PutBucketAcl | | | |
| Return the ACL of a bucket | GetBucketAcl | | | |
| **Object Operations** | | | | |
| Create an object | PutObject | | | |
| Create a copy of an object | CopyObject (Note: object rename is also supported with a VAST rename header; see Renaming Objects) | | | |
| Retrieve an object | GetObject | | | |
| Retrieve the metadata of an object without returning the object itself | HeadObject | | | |
| Delete an object | DeleteObject | | | |
| Delete multiple objects in a bucket | DeleteObjects | N/A | | |
| **Object ACL Operations** (Note: for information about ACLs, see Managing S3 Access Control Rules (ACLs)) | | | | |
| Set ACL permissions on an existing object | PutObjectAcl | | | |
| Return the ACL of an object | GetObjectAcl | | | |
| **Upload Operations** | | | | |
| Initiate upload of a file in multiple parts (multipart upload) | CreateMultipartUpload | | | |
| Abort a multipart upload | AbortMultipartUpload | | | |
| Complete a multipart upload | CompleteMultipartUpload | Automatically executed with s3cmd put for multipart uploads | | |
| Upload a part in a multipart upload | UploadPart | Automatically executed with s3cmd put for multipart uploads | | |
| Upload a part by copying data from an existing object as the data source | UploadPartCopy | N/A | | |
| List the parts that have been uploaded for a specific multipart upload | ListParts | | | |
| List multipart uploads | ListMultipartUploads | | | |
| **Object Versioning Operations** (Note: for information about object versioning, see Object Versioning) | | | | |
| Set the versioning state of a bucket (Note: once enabled, versioning cannot be disabled or suspended) | PutBucketVersioning | N/A | | |
| Get the versioning state of a bucket | GetBucketVersioning | N/A | | |
| Return metadata about all versions of the objects in a bucket | ListObjectVersions | N/A | | |
| **Object Locking Operations** (Note: for information about object locking, see S3 Object Locking Overview) | | | | |
| Place an object lock configuration on a bucket | PutObjectLockConfiguration | N/A | | |
| Get the object lock configuration of a bucket | GetObjectLockConfiguration | N/A | | |
| Place an object retention configuration on an object | PutObjectRetention | N/A | | |
| Get an object's retention settings | GetObjectRetention | N/A | | |
| Apply a legal hold configuration to an object | PutObjectLegalHold | N/A | | |
| Get an object's current legal hold status | GetObjectLegalHold | N/A | | |
The following VAST custom headers are supported with the following request types:
| Custom Header | Request | Feature | Header Description |
|---|---|---|---|
| x-amz-delete-contents: true | Delete | Deletes objects asynchronously by moving them into the VAST trash folder. | |
| x-amz-copy-disposition: replace | CopyObject | Renames an object. | |
Object versioning is a feature enabled on the bucket level that automatically preserves all previous versions of every object.
Any change to an S3 object is an overwrite that replaces the object with a new one. With versioning enabled, the replaced object is not deleted; it is stored with a unique version number. You can always access any previous version of any object.
If the bucket's view is accessible via other protocols, those protocols have read-only access to the versioned objects.
S3 versioned objects are stored as numbered files under a version directory per object.
- When versioning is enabled on a bucket, it cannot be disabled.
- Multiprotocol access to versioned objects is not supported. NFS and SMB clients may attempt to access the S3 versioned objects in read-only mode.
- MFA Delete is not supported. MFA Delete is an optional configuration that requires multi-factor authentication of the bucket owner for any request to delete a version or change the versioning state of the bucket.
- Copy objects from before VAST 4.0 cannot be used as copy sources and cannot be overwritten. The only way to modify a pre-v4.0 copy object is to copy the object onto itself.
- Versioning cannot be enabled if the trash folder is enabled on the cluster.
Caution
VAST Cluster does not support disabling (suspending) versioning on a bucket. The versioning of a bucket can only be enabled and remain enabled.
You can enable versioning via VMS, via the VAST CLI, or via the S3 API.
It is possible to enable object versioning via VMS when creating a bucket, using the S3 Bucket protocol option in the view creation flow. Otherwise, versioning can be enabled via the CLI or the S3 API.
For an existing view with an S3 bucket:

- On the Views page, open the Actions menu for the view and select Edit. In the Update View dialog, on the S3 tab, enable the S3 Versioning setting and then click Update to save your change.

When creating a new view:

- In the Create View dialog, enable the S3 Versioning setting on the S3 tab. The S3 Bucket protocol must already be selected in the Protocols dropdown before you can select the S3 tab.

For an existing view with an S3 bucket, you can also run the `view modify` command with the `--s3-versioning` parameter. For example, this command enables versioning on a bucket whose view has ID 32:

```
$ view modify --id 32 --s3-versioning
```
Invoke the PutBucketVersioning S3 command to set the versioning state to Enabled.
Note
The Suspended state is not supported.
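With Boto3, this corresponds to the `put_bucket_versioning` client method. A minimal sketch follows; the helper names are illustrative, and `s3_client` is assumed to be a Boto3 S3 client already configured for the cluster as described earlier.

```python
def versioning_configuration():
    """Build the VersioningConfiguration payload for PutBucketVersioning.
    Only the Enabled state is meaningful here, since VAST Cluster does
    not support the Suspended state."""
    return {'Status': 'Enabled'}

def enable_versioning(s3_client, bucket):
    # Sends a PutBucketVersioning request setting the state to Enabled.
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration=versioning_configuration(),
    )
```

Once applied, the state can be confirmed with `s3_client.get_bucket_versioning(Bucket=bucket)`.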
The following commands are supported:

- GetBucketVersioning. Returns information on the state of the bucket's versioning.
- ListObjectVersions. Retrieves metadata about the versions of all the objects in a bucket.
When you upload an object to a version-enabled bucket, a unique version ID is automatically added to the object.
Each version of the object is stored as '<object-name>/<version-number>' where <object-name> is the object key and <version-number> is the version number. When listing objects, the S3 service returns objects in ascending order. Newer versions of objects have lower numbers than older versions, so the latest version is listed first.
To list all versions of a given object, send a GET Bucket request to list the object's version directory. Specify the directory by setting the prefix parameter to the object key.
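Because versions are stored as '<object-name>/<version-number>' and newer versions carry lower numbers, the latest version in such a listing can be picked out with a small helper like the one below (the function name is illustrative; the listing itself would come from a ListObjectsV2 request with the prefix set as described above).

```python
def latest_version_key(version_keys):
    """Given keys of the form '<object-name>/<version-number>' returned
    by listing an object's version directory, return the key of the
    latest version. Per the numbering scheme described above, newer
    versions have lower numbers than older versions."""
    return min(version_keys, key=lambda k: int(k.rsplit('/', 1)[1]))
```

For example, for a listing containing 'myfile/3', 'myfile/1', and 'myfile/2', the helper returns 'myfile/1', the most recent version.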
A simple GET Object request retrieves the current version of an object.
To retrieve a specific version, you need to specify its version ID. Set the versionId to the ID of the version of the object you want to retrieve and send a GET Object versionId request.
A simple HEAD Object request retrieves the metadata of the current version of an object.
To retrieve the metadata of a specific version of an object, send a HEAD Object versionId request, specifying the version ID of the version you want.
A DeleteObject request for a versioned object does not delete it. It leaves the previous versions of the object in the object version directory and creates a delete marker. A delete marker is an object that indicates that the object has been deleted. It has a key name (or key) and version ID like any other object, but it is otherwise not a regular object: it has no data associated with it and no access control list (ACL). It is not retrieved by a GET Object request or returned in a GET Bucket request.

You can delete the delete marker by sending a DeleteObject versionId request with the delete marker's version ID. This has the effect of undoing the object deletion.

To delete multiple versions of a versioned object, send a DeleteObjects request with an array of object identifiers (key, versionId).
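As a sketch, the Delete payload for such a multi-version DeleteObjects request could be assembled like this (the helper name is illustrative):

```python
def delete_versions_payload(key, version_ids):
    """Build the Delete payload for a DeleteObjects request that removes
    several versions of one object, identified by (key, versionId) pairs."""
    return {
        'Objects': [{'Key': key, 'VersionId': v} for v in version_ids],
    }
```

With a configured Boto3 client, the payload would be passed as `s3.delete_objects(Bucket='mybucket', Delete=delete_versions_payload('myfile', ['1', '2']))`.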
The following operations restore a previous version of an object:

- Copy the version that you want to restore into the same bucket. The copied object becomes the current version of that object, and all object versions are preserved.
- Permanently delete the current version of the object. The previous version automatically becomes the current version.
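The first approach, copying a version onto itself, can be sketched with Boto3's `copy_object` call. The helper name is illustrative; it only assembles the request arguments.

```python
def restore_copy_args(bucket, key, version_id):
    """Arguments for a CopyObject request that copies a previous version
    of an object back into the same bucket under the same key, making it
    the current version while preserving all existing versions."""
    return {
        'Bucket': bucket,
        'Key': key,
        'CopySource': {'Bucket': bucket, 'Key': key, 'VersionId': version_id},
    }
```

With a configured client, the restore would then be `s3.copy_object(**restore_copy_args('mybucket', 'myfile', '3'))`.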
S3 object locking is a feature that helps prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.
S3 object locking can be enabled on an S3 bucket provided that the bucket's view is not simultaneously enabled for access via other protocols (SMB, NFS, NFSv4.1). Once object locking is enabled on a view, it cannot be disabled. Object versioning is automatically enabled with object locking.
There are two levels of protection with S3 object locking, called retention modes:

- Governance mode, in which users can't overwrite or delete an object version or alter its lock settings unless they have special permissions. With governance mode, you protect objects against being deleted by most users, but you can still grant some users permission to alter the retention settings or delete the object if necessary. You can also use governance mode to test retention-period settings before creating a compliance-mode retention period.

  To override or remove governance-mode retention settings, a user must have the s3:BypassGovernanceRetention permission and must explicitly include x-amz-bypass-governance-retention: true as a request header with any request that requires overriding governance mode.

- Compliance mode, in which a protected object version can't be overwritten or deleted by any user. When an object is locked in compliance mode, its retention mode can't be changed, and its retention period can't be shortened. Compliance mode helps ensure that an object version can't be overwritten or deleted for the duration of the retention period.
There are two ways to manage object retention with object locking:

- Retention period, which specifies a fixed period of time during which an object remains locked. During this period, your object is WORM-protected. This means that when an object is deleted or replaced, the version that was deleted or replaced is protected from being removed from the bucket, although it does cease to be the latest version and can only be accessed by its version ID.
- Legal hold, which provides the same protection as a retention period, but with no expiration date. Instead, a legal hold remains in place until you explicitly remove it. Legal holds are independent of retention periods.
When object locking is enabled on a bucket, each object in the bucket can have no lock, a retention lock or a legal hold. If you configure a default retention period, object versions that are placed in the bucket are automatically protected with a retention lock.
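As a sketch, a governance-mode retention payload for a PutObjectRetention request might be built as follows. The helper name and the choice of a day count are illustrative, not part of any VAST or AWS API.

```python
import datetime

def governance_retention(days):
    """Build the Retention payload for a PutObjectRetention request.
    Governance mode protects the version for the given number of days
    while still allowing users with the s3:BypassGovernanceRetention
    permission to override or remove the lock."""
    until = (datetime.datetime.now(datetime.timezone.utc)
             + datetime.timedelta(days=days))
    return {'Mode': 'GOVERNANCE', 'RetainUntilDate': until}
```

With a configured Boto3 client, the lock would be applied with `s3.put_object_retention(Bucket='mybucket', Key='myfile', Retention=governance_retention(30))`.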
For detailed information about how object locking works, see the AWS S3 documentation page, How S3 Object Lock works.
For information about supported ways to manage object locking on buckets and objects, see the following sections:
The following object locking features for buckets are available in VMS.
Caution
Once you enable object locking on a bucket, you cannot disable it or suspend versioning for that bucket.
You can send requests by S3 API to do the tasks described below for configuring and viewing object locking configurations on buckets and objects.
You can manage object lock configuration on buckets using the following API requests and headers. The operations require user permissions which must be granted through identity policies.
| Task | S3 API Operation | S3 Permission Required |
|---|---|---|
| Enable object locking on a new bucket | Include the x-amz-bucket-object-lock-enabled header in a CreateBucket request. | s3:PutBucketObjectLockConfiguration |
| Enable object locking on an existing bucket and set a default retention period | PutObjectLockConfiguration | s3:PutBucketObjectLockConfiguration |
| Get the object lock configuration of a bucket | GetObjectLockConfiguration | s3:GetObjectLockConfiguration |
| Task | S3 API Operations | Notes | S3 Permission Required |
|---|---|---|---|
| **Retention Period Tasks** | | | |
| Set a retention configuration on an object | PutObjectRetention | This includes setting the retention mode and setting an explicit retention period on the object. The explicit retention period overrides a default retention period set on the bucket. | s3:PutObjectRetention |
| Extend a retention period after setting a retention configuration on an object version | PutObjectRetention | To do this, submit a new lock request for the object version with a later retain-until date. | s3:PutObjectRetention |
| Get the retention settings of an object | GetObjectRetention | This includes the date and time and the retention mode. | s3:GetObjectRetention |
| Get the date and time when an object's lock is due to expire, along with other object information | GetObject, HeadObject | The response includes the x-amz-object-lock-retain-until-date header. | s3:GetObjectRetention |
| Get an object's retention mode, along with other object information | GetObject, HeadObject | The response includes the x-amz-object-lock-mode header. Compliance mode is not supported; therefore, the object lock mode is always GOVERNANCE. | s3:GetObjectRetention |
| **Legal Hold Tasks** | | | |
| Apply a legal hold configuration to an object | PutObjectLegalHold | Placing a legal hold on an object version doesn't affect the retention mode or retention period for that object version. | s3:PutObjectLegalHold |
| Get an object's current legal hold status | GetObjectLegalHold, GetObject | With GetObject, the response includes the x-amz-object-lock-legal-hold header. | s3:GetObjectLegalHold |
| **Operations that Require Bypassing Governance Mode** | | | |
| Overwrite or delete an object version or alter its lock settings, including shortening the retention period and removing an object lock by placing a new lock with empty parameters | | You must explicitly include x-amz-bypass-governance-retention: true as a request header. | s3:BypassGovernanceRetention |
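In Boto3, the bypass header is expressed as the `BypassGovernanceRetention` parameter, which the client translates into the x-amz-bypass-governance-retention: true request header. A sketch (the helper name is illustrative):

```python
def bypass_delete_args(bucket, key, version_id):
    """Arguments for a DeleteObject request that removes a
    governance-locked object version. The caller must hold the
    s3:BypassGovernanceRetention permission for the request to succeed."""
    return {
        'Bucket': bucket,
        'Key': key,
        'VersionId': version_id,
        'BypassGovernanceRetention': True,
    }
```

With a configured client, the call would be `s3.delete_object(**bypass_delete_args('mybucket', 'myfile', '3'))`.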
S3 Object Ownership lets you set ownership of objects uploaded to a given bucket and to determine whether ACLs are used to control access to objects within this bucket.
A bucket can be configured with one of the following object ownership rules:

- Bucket Owner Enforced. The bucket owner has full control over any object in the bucket. Access to objects is controlled based on policies configured for the bucket. ACLs are not used.
- Bucket Owner Preferred. The bucket owner has full control over new objects uploaded to the bucket by other users. ACLs can be used to control access to the objects.
- Object Writer. The user that uploads an object has full control over this object. ACLs can be used to let other users access the object.
S3 Object Ownership is only supported with S3 Native security flavor set for the view policy.
To set an object ownership rule for a bucket, specify the `--s3-object-ownership-rule` option on the `view create` or `view modify` command.
In S3, renaming an object requires copying the object and then deleting the source object. VAST Cluster supports renaming an object using a single copy request which specifies to replace the object with its copy.
The rename operation moves the object with its ACL and metadata.
Source names and destination names may include double slashes and leading slashes, with the following implications in terms of directories and regular objects:
| Source Name | Destination Name | Rename Operation |
|---|---|---|
| mybucket/a | mybucket/b | |
| mybucket/a/ | mybucket/b | |
| mybucket/a/ | mybucket/b/ | |
| mybucket/a// | mybucket/b// | |
To rename an object, include the following header in a CopyObject request:
x-amz-copy-disposition: replace
If this header is included in the request, then:

- The optional x-amz-metadata-directive header may not be used to specify that you want to replace object metadata with metadata provided in the request. Only copying object metadata from the source object is allowed.
- The following headers are not allowed:
  - x-amz-acl
  - x-amz-grant-full-control
  - x-amz-grant-read
  - x-amz-grant-read-acp
  - x-amz-grant-write-acp
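Following the same event-hook pattern this article uses for the trash folder header, a rename could be sketched in Boto3 as below. This is an illustration under the stated assumptions: `s3_client` is a Boto3 client already configured for the cluster, and the function names are hypothetical.

```python
def set_rename_header(request, **kwargs):
    # Add the VAST rename header so the copy replaces the source object.
    request.headers.add_header('x-amz-copy-disposition', 'replace')

def rename_object(s3_client, bucket, src_key, dst_key):
    """Rename src_key to dst_key with a single CopyObject request.
    The object's ACL and metadata move with it."""
    s3_client.meta.events.register('before-sign.s3.CopyObject',
                                   set_rename_header)
    s3_client.copy_object(
        Bucket=bucket,
        Key=dst_key,
        CopySource={'Bucket': bucket, 'Key': src_key},
    )
```

For example, `rename_object(s3, 'mybucket', 'a', 'b')` would move mybucket/a to mybucket/b.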
The trash folder is a hidden folder into which client users can move folders and files that they want to delete. The files and folders are automatically deleted for the user from the trash folder asynchronously.
The trash folder feature is disabled by default and can be enabled on the cluster. It is supported for use via NFSv3 as well as via S3.
Note
Trash folder cannot be enabled simultaneously with S3 object versioning.
For S3 users, trash folder deletion is done by adding the custom header 'x-amz-delete-contents: true' to a DELETE request. The custom header can be used to move any of the following to the trash folder:
- A non-versioned bucket, even if not empty, along with any objects contained in the bucket. That is, a bucket delete request that would otherwise fail because the bucket is not empty does not fail when using the trash folder header.
- An object. If the header is used with a DeleteObject request and the specified object key denotes an explicit object, that object is deleted.
- All objects that share a specified object key prefix. Effectively, this is like deleting all files under a given hierarchical directory path. The prefix should be specified without a trailing slash. All contents under the prefix are moved to trash.
When using the trash folder via S3, the user requires the same permissions as are otherwise required to delete the data if the trash folder is not used. So, for example, a user deleting a bucket by moving it to the trash folder needs the explicitly granted delete bucket permission that is granted to users via VMS.
The trash folder feature must be enabled globally in the VAST Web UI. It is disabled by default and needs to be enabled in the Cluster settings.
Note
Unlike for NFS hosts, where permission is granted on a host basis through the host-based access rules in the view policy, S3 usage of the trash folder is not affected by the view policy host-based access rules. The same delete permissions are required with the trash folder as without the trash folder.
In order to use the trash folder for an object or bucket deletion, include the following custom header in a DELETE request:
x-amz-delete-contents: true
Note
There is no specific feature for undoing a move to the trash folder to restore data that was accidentally deleted.
Suppose we upload the object myfile to the bucket mybucket using the S3cmd client:
```
[centos@host ~]$ s3cmd put myfile s3://mybucket/
upload: 'myfile' -> 's3://mybucket/myfile'  [1 of 1]
 10485760 of 10485760   100% in    0s   113.09 MB/s  done
```
Normally, a request to delete the bucket would fail because the bucket is not empty. However, a DeleteBucket request that specifies the x-amz-delete-contents header moves the bucket and its contents to the trash folder, and the deletion succeeds.
The following client session of the Boto3 AWS SDK for Python connects to the cluster's S3 service over HTTP at one of the cluster's CNode virtual IPs (198.51.100.255), authenticates with a VMS generated access key pair and deletes the bucket via the trash folder:
```python
import boto3

def set_delete_contents_header(request, **kwargs):
    request.headers.add_header('x-amz-delete-contents', 'true')

sess = boto3.session.Session()
s3 = sess.client(service_name='s3',
                 region_name='vast-1',
                 use_ssl=False,
                 endpoint_url='http://198.51.100.255',
                 aws_access_key_id='YAUY93A7Q91SO2M07BWZ',
                 aws_secret_access_key='NWzyNIqIscgVZlaHqBzBWaejMDHSHbZuXDxbE2Yp',
                 config=boto3.session.Config(
                     signature_version='s3v4',
                     s3={'addressing_style': 'path'}
                 ))
s3.meta.events.register('before-sign.*.*', set_delete_contents_header)
s3.delete_bucket(Bucket='mybucket')
```
Following the deletion, no buckets are found:
```
[centos@host ~]$ s3cmd ls
[centos@host ~]$
```
Suppose two objects a/b/c/object1 and a/b/d/object2 are created in an existing bucket mybucket using the s3cmd client:
```
[centos@host ~]$ s3cmd put 1KB_random_file s3://mybucket/a/b/c/object1
upload: '1KB_random_file' -> 's3://mybucket/a/b/c/object1'  [1 of 1]
 1024 of 1024   100% in    0s    15.16 KB/s  done
[centos@host ~]$ s3cmd put 1KB_random_file s3://mybucket/a/b/d/object2
upload: '1KB_random_file' -> 's3://mybucket/a/b/d/object2'  [1 of 1]
 1024 of 1024   100% in    0s    17.13 KB/s  done
```
These objects can be moved to trash by deleting the common object key prefix a/b: make a DeleteObject request specifying the x-amz-delete-contents header.
Using a client instance of the Boto3 AWS SDK for Python, contacting the S3 service over HTTP at one of the CNode virtual IPs (198.51.100.255), authenticating with a VMS generated access key pair, the request might look like this:
```python
import boto3

def set_delete_contents_header(request, **kwargs):
    request.headers.add_header('x-amz-delete-contents', 'true')

sess = boto3.session.Session()
s3 = sess.client(service_name='s3',
                 region_name='vast-1',
                 use_ssl=False,
                 endpoint_url='http://198.51.100.255',
                 aws_access_key_id='YAUY93A7Q91SO2M07BWZ',
                 aws_secret_access_key='NWzyNIqIscgVZlaHqBzBWaejMDHSHbZuXDxbE2Yp',
                 config=boto3.session.Config(
                     signature_version='s3v4',
                     s3={'addressing_style': 'path'}
                 ))
s3.meta.events.register('before-sign.*.*', set_delete_contents_header)
s3.delete_object(Bucket='mybucket', Key='a/b')
```
Following the deletion, the objects can no longer be listed:
```
[centos@host ~]$ s3cmd ls s3://mybucket/a/b
[centos@host ~]$
```
In an S3 bucket, objects are stored in a flat hierarchy where every object has a unique key. The key is the name of the object and can contain slashes, unlike in other access protocols, where a slash denotes a directory delimiter in a file system path. In S3, directories do not exist, but slashes can be used to organize the namespace and create the illusion of directories. Some S3 applications use objects with trailing slashes in their keys to simulate directory nodes. For example, an application could use an object called 'dev/' as a directory by creating new objects with keys that have 'dev/' as a common prefix. This use of objects as directory nodes may be referred to as 'S3 directories'.
The VAST Cluster S3 implementation supports 'S3 directories' with limitations in terms of multiprotocol access. S3 directories are supported as follows:

- Object keys specified in S3 requests can contain slashes:
  - Object keys can start or end with a slash, such as '/a' or 'a/'.
  - Object keys can contain multiple slashes in series, such as '//a', 'a//', 'a//b', or 'a///'.
- An object created via S3 RPC cannot be accessed via NFS or SMB, neither as a file nor as a directory, if the object key starts with a slash, ends with a slash, or includes multiple slashes in series.
- An object with a key that has a trailing slash is a valid S3 object and is usable as an S3 directory by virtue of the trailing slash. It is otherwise the same as any other S3 object, supporting S3 operations and able to contain data as well as metadata, an ACL, and versions.
- In non-versioned buckets, two objects can have keys that are identical except that one of them has a single trailing slash. This is effectively like having a file and a directory with the same name. For example, you can have an object with key 'a/b' and a 'directory' object with key 'a/b/'. These are valid as two distinct objects. However, you cannot have a file with the same name as a directory if you also have a file 'inside' that directory: although you can have 'key1' and 'key1/' as object keys, you cannot have 'key1' and 'key1/key2'.
- In versioned buckets, object keys must be distinguished by more than a single trailing slash, since a trailing slash is reserved for the object versioning directory. So, for example, the following are valid in a versioned bucket as keys for two distinct objects: 'a/b' and 'a/b//'.
Directories created inside views via other protocols (NFS, SMB), are accessible by S3 clients. To derive the key for accessing a file under a nested path as an S3 object, prepend the file path relative to the view path to the file name.
For example, suppose:

- A view is created on path 'dev/' and mounted by an NFS client.
- The NFS client creates a directory 'src' under the mounted view and saves the file 'component' under that directory.

The key for accessing the file 'component' as an S3 object is 'src/component'.
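The key derivation above is simple string concatenation; a small sketch (the function name is illustrative):

```python
def s3_key_for_file(relative_dir, filename):
    """Derive the S3 object key for a file created via NFS or SMB inside
    a view: prepend the file's directory path, relative to the view path,
    to the file name. An empty relative path yields just the file name."""
    relative_dir = relative_dir.strip('/')
    return f"{relative_dir}/{filename}" if relative_dir else filename
```

For the example above, `s3_key_for_file('src', 'component')` yields 'src/component'.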