The VAST Cluster S3 service supports client connections from any S3 toolkit that can call the REST API over HTTP or HTTPS.
VAST Data-specific guidelines are available for using the S3cmd command line tool or the Boto 3 SDK for Python.
In general, to get started with any client interface, you will need to configure the following:
-
The endpoint for the VAST Cluster S3 service. Specify any of the cluster's Virtual IPs.
-
The connection type. VAST Cluster can listen to S3 service requests on HTTP or on HTTPS. When configuring the client to connect over HTTPS, read about installing an SSL certificate and configuring certificate verification on the client.
Note
It is possible to disable S3 HTTP and/or HTTPS connections. See S3 Connection Settings.
-
Credentials. Specify an access key pair generated for you via VMS.
-
Signature type for authenticated requests. VAST Cluster supports AWS signature version 2 and AWS signature version 4.
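Both signature versions derive an authentication signature from your secret access key. For AWS signature version 4, the signing key is derived by chained HMAC-SHA256 over the date, region, and service. A minimal stdlib sketch for reference (the key, date, and region values below are placeholders; an SDK such as Boto 3 performs this for you):

```python
import hashlib
import hmac

def sigv4_signing_key(secret_key: str, date: str, region: str, service: str = "s3") -> bytes:
    """Derive the AWS Signature Version 4 signing key via chained HMAC-SHA256."""
    def sign(key: bytes, msg: str) -> bytes:
        return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

    k_date = sign(("AWS4" + secret_key).encode("utf-8"), date)  # date as YYYYMMDD
    k_region = sign(k_date, region)                             # e.g. a placeholder 'vast-1'
    k_service = sign(k_region, service)                         # 's3'
    return sign(k_service, "aws4_request")
```

The resulting key signs the request's string-to-sign; the SDK attaches the signature in the Authorization header.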
VAST Cluster supports a subset of S3 API operations. They are listed below, with notes on client-specific behavior where relevant.

| Task | AWS API Operation | Notes |
|---|---|---|
| Bucket Operations | | |
| List the buckets on the S3 server | ListBuckets | |
| Create a bucket | CreateBucket | |
| List the objects in a bucket | ListObjects | |
| Verify that a bucket exists and that you have permission to access it | HeadBucket | Not available in s3cmd |
| Delete a bucket | DeleteBucket | |
| Return the region in which the bucket resides | GetBucketLocation | |
| Bucket Access Control List (ACL) Operations (for information about ACLs, see S3 Access Control Rules (ACLs)) | | |
| Set the ACL of a bucket | PutBucketAcl | |
| Return the ACL of a bucket | GetBucketAcl | |
| Object Operations | | |
| Create an object | PutObject | |
| Create a copy of an object | CopyObject | Object rename is also supported with a VAST rename header. See Renaming Objects. |
| Retrieve an object | GetObject | |
| Retrieve the metadata of an object without returning the object itself | HeadObject | |
| Delete an object | DeleteObject | |
| Delete multiple objects in a bucket | DeleteObjects | Not available in s3cmd |
| Object ACL Operations (for information about ACLs, see S3 Access Control Rules (ACLs)) | | |
| Set ACL permissions on an existing object | PutObjectAcl | |
| Return the ACL of an object | GetObjectAcl | |
| Upload Operations | | |
| Initiate upload of a file in multiple parts (multipart upload) | CreateMultipartUpload | |
| Abort a multipart upload | AbortMultipartUpload | |
| Complete a multipart upload | CompleteMultipartUpload | Executed automatically by s3cmd put for multipart uploads |
| Upload a part in a multipart upload | UploadPart | Executed automatically by s3cmd put for multipart uploads |
| Upload a part by copying data from an existing object as the data source | UploadPartCopy | Not available in s3cmd |
| List the parts that have been uploaded for a specific multipart upload | ListParts | |
| List multipart uploads | ListMultipartUploads | |
| Object Versioning Operations (for information about object versioning, see Object Versioning) | | |
| Set the versioning state of a bucket | PutBucketVersioning | Not available in s3cmd. Once enabled, versioning cannot be disabled or suspended. |
| Get the versioning state of a bucket | GetBucketVersioning | Not available in s3cmd |
| Return metadata about all versions of the objects in a bucket | ListObjectVersions | Not available in s3cmd |
The following VAST custom headers are supported with the following request types:
| Feature | Request | Custom Header |
|---|---|---|
| Trash folder | Delete (DeleteBucket, DeleteObject) | x-amz-delete-contents: true |
| Object rename | CopyObject | x-amz-copy-disposition: replace |
Note
Setting S3 ACLs on buckets and objects via S3 RPCs is supported with S3 Native security flavor.
Retrieving S3 ACLs is supported with NFS security flavor as well.
To read about how permissions are controlled with each security flavor, see Controlling File and Directory Permissions Across Protocols.
Each bucket and object has an ACL attached to it as a subresource. The ACL defines which grantees are granted access and what permission type each grantee has. A grantee can be a user or a predefined group. When a request is received against a resource, VAST Cluster checks the corresponding ACL to verify that the requester has the necessary access permissions.
When you create a bucket or an object, VAST Cluster creates a default ACL that grants the owner full control over the bucket or object (assuming no ACL is specified in the call that creates the bucket).
This default ACL is also used when files and directories are created via NFSv3 or NFSv4.1 clients in a view that is using S3 Native security flavor.
ACLs can grant permissions to individual VAST Cluster users and to predefined groups.
The following groups are predefined:
-
AUTHENTICATED_USERS. This group represents all users who can authenticate to the S3 service. ACLs granted to this group allow access to any user that has network access to the cluster provided the request is signed (authenticated).
-
ALL_USERS. This group represents all users in the world with network connectivity to the cluster. ACLs granted to this group allow access to any user that has network access to the cluster. The requests can be signed (authenticated) or unsigned (anonymous). Unsigned requests omit the Authentication header in the request.
Note
Anonymous requests are blocked unless the Anonymous access setting is enabled for the relevant view.
Caution
If anonymous access is enabled and an ACL grants permission to the ALL_USERS group, any client that sends an unsigned request (also known as sending a request in anonymous mode) to access the object with the requested permission type will be granted the requested access. Therefore, it is good practice to exercise caution with granting permissions to the All Users group. For example, if you assign WRITE permission to this group for accessing a bucket, any requester could store objects in your bucket or delete objects you might want to keep.
To grant ACLs to a user, you can specify the user as one of the following:
-
A principal name in the format user@domain, where user is the user name and domain is a domain configured for an external auth provider on the cluster (LDAP, NIS).
Note
Users on the local provider cannot be specified this way.
-
A VID, which is a VAST ID used in the cluster's internal user database. To retrieve a user's VID, run the user query VAST CLI command, specifying udb as the context of the query. The output includes the user's VID.
You can grant any of the following permissions to any valid grantee in an ACL.
| Permission Type | When granted on a bucket | When granted on an object |
|---|---|---|
| READ | Allows grantee to list the objects in the bucket | Allows grantee to read the object data and its metadata |
| WRITE | Allows grantee to create, overwrite, and delete any object in the bucket | Not applicable |
| READ_ACP | Allows grantee to read the bucket ACL | Allows grantee to read the object ACL |
| WRITE_ACP | Allows grantee to write the ACL for the applicable bucket | Allows grantee to write the ACL for the applicable object |
| FULL_CONTROL | Allows grantee the READ, WRITE, READ_ACP, and WRITE_ACP permissions on the bucket | Allows grantee the READ, READ_ACP, and WRITE_ACP permissions on the object |
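As a sketch, the grants in a PutBucketAcl or PutObjectAcl call can be assembled as a Boto 3-style AccessControlPolicy dictionary. The helper function and the grantee IDs below are hypothetical placeholders:

```python
def build_acl(owner_id: str, grants: list) -> dict:
    """Build an AccessControlPolicy dict from (grantee_id, permission) pairs,
    where permission is one of the types in the table above."""
    allowed = {"READ", "WRITE", "READ_ACP", "WRITE_ACP", "FULL_CONTROL"}
    policy_grants = []
    for grantee_id, permission in grants:
        if permission not in allowed:
            raise ValueError(f"unknown permission: {permission}")
        policy_grants.append({
            "Grantee": {"Type": "CanonicalUser", "ID": grantee_id},
            "Permission": permission,
        })
    return {"Owner": {"ID": owner_id}, "Grants": policy_grants}

acl = build_acl("owner-vid-1", [("user-vid-2", "READ"),
                                ("user-vid-3", "FULL_CONTROL")])
# Usage sketch: s3.put_bucket_acl(Bucket="mybucket", AccessControlPolicy=acl)
```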
It is possible to grant a user 'S3 Superuser' permission. This overrides all ACLs and gives the user full permissions on all objects and buckets. This is useful for applications, such as backup applications, that need to restore object attributes.
To set or retrieve S3 ACLs, you can use the operations listed under Bucket Access Control List (ACL) Operations and Object ACL Operations.
Object versioning is a feature enabled on the bucket level that automatically preserves all previous versions of every object.
Any change to an S3 object is an overwrite that replaces the object with a new one. With versioning enabled, the replaced object is not deleted; it is stored with a unique version number. You can always access any previous version of any object.
If the bucket's view is accessible via other protocols, those protocols have read-only access to the versioned objects.
S3 versioned objects are stored as numbered files under a version directory per object.
-
When versioning is enabled on a bucket, it cannot be disabled.
-
Full multiprotocol access to versioned objects is not supported. NFS and SMB clients can access the S3 versioned objects only in read-only mode.
-
MFA Delete is not supported. MFA Delete is an optional configuration that requires multi-factor authentication of the bucket owner for any request to delete a version or change the versioning state of the bucket.
-
Copy objects from before VAST 4.0 cannot be used as copy sources and cannot be overwritten. The only way to modify a pre-v4.0 copy object is to copy the object onto itself.
Caution
VAST Cluster does not support disabling (suspending) versioning on a bucket. The versioning of a bucket can only be enabled and remain enabled.
You can enable versioning via VMS or via S3 RPC.
You can enable object versioning via VMS when creating a bucket, using the S3 Bucket protocol option in the view creation flow, or on an existing view with an S3 bucket. Otherwise, versioning can be enabled via S3 RPC.
For an existing view with an S3 bucket:
-
On the Views page, open the Actions menu for the view and select Edit. In the Update View dialog, on the S3 tab, enable the S3 Versioning setting and then click Update to save your change.
When creating a new view:
-
When you create the view, in the Create View dialog, enable the S3 Versioning setting on the S3 tab. The S3 Bucket protocol must be selected already in the Protocols dropdown before you can select the S3 tab.
For an existing view with an S3 bucket:
-
Run the view modify command with the --s3-versioning parameter.
For example, this command enables versioning on a bucket whose view has ID 32:
$ view modify --id 32 --s3-versioning
Invoke the PutBucketVersioning S3 command to set the versioning state to Enabled.
Note
The Suspended state is not supported.
The following commands are supported:
-
GetBucketVersioning. Returns information on the state of the bucket's versioning.
-
ListObjectVersions. Retrieves metadata about the versions of all the objects in a bucket.
When you upload an object to a version-enabled bucket, a unique version ID is automatically added to the object.
Each version of the object is stored as '<object-name>/<version-number>' where <object-name> is the object key and <version-number> is the version number. When listing objects, the S3 service returns objects in ascending order. Newer versions of objects have lower numbers than older versions, so the latest version is listed first.
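Because newer versions receive lower numbers, a client scanning a version directory can pick the newest version by taking the minimum version number. A small sketch under that assumption (the helper name is hypothetical):

```python
def latest_version_key(listed_keys: list) -> str:
    """Given keys listed from an object's version directory, e.g.
    ['myfile/1', 'myfile/2', 'myfile/3'], return the key of the newest
    version. Newer versions have LOWER numbers, so the minimum wins."""
    return min(listed_keys, key=lambda k: int(k.rsplit("/", 1)[1]))

print(latest_version_key(["myfile/3", "myfile/1", "myfile/2"]))  # prints "myfile/1"
```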
To list all versions of a given object, send a GET Bucket request to list the object's version directory. Specify the directory by setting the prefix parameter to the object key.
A simple GET Object request retrieves the current version of an object.
To retrieve a specific version, you need to specify its version ID. Set the versionId to the ID of the version of the object you want to retrieve and send a GET Object versionId request.
A simple HEAD Object request retrieves the metadata of the current version of an object.
To retrieve the metadata of a specific version of an object, send a HEAD Object versionId request, specifying the version ID of the version you want to retrieve.
A DeleteObject request for a versioned object does not delete it. It leaves previous versions of the object in the object version directory and creates a delete marker. A delete marker is an object that indicates that the object has been deleted. It has a key name (or key) and version ID like any other object, but it is otherwise not a regular object: it has no data associated with it and no access control list (ACL). It is not retrieved by a GET Object request or returned in a GET Bucket request.
You can delete the delete marker by sending a DeleteObject versionId request with the delete marker's version ID. This has the effect of undoing the object deletion.
To delete multiple versions of a versioned object, send a DeleteObjects request with an array of object identifiers (key, versionId).
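In Boto 3, the identifier array is passed as the Delete parameter of delete_objects. A sketch of assembling that payload (the helper name is hypothetical):

```python
def build_delete_payload(versions: list) -> dict:
    """Build the Delete payload for a DeleteObjects request from
    (key, version_id) pairs, in the shape Boto 3's delete_objects expects."""
    return {
        "Objects": [{"Key": key, "VersionId": vid} for key, vid in versions]
    }

payload = build_delete_payload([("myfile", "1"), ("myfile", "2")])
# Usage sketch: s3.delete_objects(Bucket="mybucket", Delete=payload)
```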
The following operations restore a previous version of an object:
-
Copy the version that you want to restore into the same bucket. The copied object becomes the current version of that object and all object versions are preserved.
-
Permanently delete the current version of the object. The previous version automatically becomes the current version.
In S3, renaming an object requires copying the object and then deleting the source object. VAST Cluster supports renaming an object using a single copy request which specifies to replace the object with its copy.
The rename operation moves the object with its ACL and metadata.
Source names and destination names may include double slashes and leading slashes, with the following implications in terms of directories and regular objects:
| Source Name | Destination Name | Rename Operation |
|---|---|---|
| mybucket/a | mybucket/b | |
| mybucket/a/ | mybucket/b | |
| mybucket/a/ | mybucket/b/ | |
| mybucket/a// | mybucket/b// | |
To rename an object, include the following header in a CopyObject request:
x-amz-copy-disposition: replace
If this header is included in the request, then:
-
The optional x-amz-metadata-directive header may not be used to specify that you want to replace object metadata with metadata provided in the request. Only copying object metadata from the source object is allowed.
-
The following headers are not allowed:
-
x-amz-acl
-
x-amz-grant-full-control
-
x-amz-grant-read
-
x-amz-grant-read-acp
-
x-amz-grant-write-acp
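These constraints can be captured in a small client-side check before sending a rename request. The helper below is a hypothetical sketch, not part of any SDK:

```python
# Headers that may not accompany a rename (x-amz-copy-disposition: replace).
DISALLOWED_WITH_RENAME = {
    "x-amz-acl",
    "x-amz-grant-full-control",
    "x-amz-grant-read",
    "x-amz-grant-read-acp",
    "x-amz-grant-write-acp",
}

def validate_rename_headers(headers: dict) -> None:
    """Raise ValueError if the headers violate the rename constraints above."""
    if headers.get("x-amz-copy-disposition") != "replace":
        return  # not a rename request; no extra constraints apply
    # Metadata may only be copied from the source object, not replaced.
    if headers.get("x-amz-metadata-directive", "COPY").upper() == "REPLACE":
        raise ValueError("rename requests must copy metadata from the source")
    bad = DISALLOWED_WITH_RENAME & set(headers)
    if bad:
        raise ValueError(f"headers not allowed with rename: {sorted(bad)}")
```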
The trash folder is a hidden folder into which client users can move folders and files that they want to delete. The files and folders are automatically deleted for the user from the trash folder asynchronously.
The trash folder feature is disabled by default and can be enabled on the cluster. It is supported for use via NFSv3 as well as via S3.
For S3 users, trash folder deletion is done by adding the custom header 'x-amz-delete-contents: true' to a DELETE request. The custom header can be used to move any of the following to the trash folder:
-
A non-versioned bucket, even if not empty, along with any objects contained in the bucket. That is, a bucket delete request that would otherwise fail if the bucket is not empty does not fail when using the trash folder header.
Note
Versioned buckets cannot be deleted via the trash folder.
-
An object. If the header is used with a DeleteObject request and an object key is specified which denotes an explicit object, that object is deleted.
-
All objects which share a specified object key prefix. Effectively, this is like deleting all files under a given hierarchical directory path. The prefix should be specified without a trailing slash. All contents under the prefix are moved to trash.
When using the trash folder via S3, the user requires the same permissions as are otherwise required to delete the data if the trash folder is not used. So, for example, a user deleting a bucket by moving it to the trash folder needs the explicitly granted delete bucket permission that is granted to users via VMS.
The trash folder feature is disabled by default and must be enabled globally, in the Cluster settings of the VAST Web UI.
Note
Unlike for NFS hosts, where permission is granted on a host basis through the host based access rules in the view policy, S3 usage of the trash folder is not affected by the view policy host based access rules. The same delete permissions are required with the trash folder as without the trash folder.
In order to use the trash folder for an object or bucket deletion, include the following custom header in a DELETE request:
x-amz-delete-contents: true
Note
There is no specific feature for undoing a move to the trash folder to restore data that was accidentally deleted.
Suppose we upload the object myfile to the bucket mybucket using the S3cmd client:
```
[centos@host ~]$ s3cmd put myfile s3://mybucket/
upload: 'myfile' -> 's3://mybucket/myfile'  [1 of 1]
 10485760 of 10485760   100% in    0s   113.09 MB/s  done
```
Normally, a request to delete the bucket would fail because the bucket is not empty. However, by making a DeleteBucket request that specifies the x-amz-delete-contents header for the trash folder, the bucket and its contents are deleted.
The following client session of the Boto3 AWS SDK for Python connects to the cluster's S3 service over HTTP at one of the cluster's CNode virtual IPs (198.51.100.255), authenticates with a VMS-generated access key pair, and deletes the bucket via the trash folder:

```python
import boto3

def set_delete_contents_header(request, **kwargs):
    request.headers.add_header('x-amz-delete-contents', 'true')

sess = boto3.session.Session()
s3 = sess.client(
    service_name='s3',
    region_name='vast-1',
    use_ssl=False,
    endpoint_url='http://198.51.100.255',
    aws_access_key_id='YAUY93A7Q91SO2M07BWZ',
    aws_secret_access_key='NWzyNIqIscgVZlaHqBzBWaejMDHSHbZuXDxbE2Yp',
    config=boto3.session.Config(
        signature_version='s3v4',
        s3={'addressing_style': 'path'}
    )
)

# Add the trash folder header to every request before it is signed
s3.meta.events.register('before-sign.*.*', set_delete_contents_header)
s3.delete_bucket(Bucket='mybucket')
```
Following the deletion, no buckets are found:
```
[centos@host ~]$ s3cmd ls
[centos@host ~]$
```
Suppose two objects a/b/c/object1 and a/b/d/object2 are created in an existing bucket mybucket using the s3cmd client:
```
[centos@host ~]$ s3cmd put 1KB_random_file s3://mybucket/a/b/c/object1
upload: '1KB_random_file' -> 's3://mybucket/a/b/c/object1'  [1 of 1]
 1024 of 1024   100% in    0s    15.16 KB/s  done
[centos@host ~]$ s3cmd put 1KB_random_file s3://mybucket/a/b/d/object2
upload: '1KB_random_file' -> 's3://mybucket/a/b/d/object2'  [1 of 1]
 1024 of 1024   100% in    0s    17.13 KB/s  done
```
These objects can be moved to trash by deleting the common object key prefix a/b: make a DeleteObject request specifying the x-amz-delete-contents header.
Using a client instance of the Boto3 AWS SDK for Python, contacting the S3 service over HTTP at one of the CNode virtual IPs (198.51.100.255), and authenticating with a VMS-generated access key pair, the request might look like this:

```python
import boto3

def set_delete_contents_header(request, **kwargs):
    request.headers.add_header('x-amz-delete-contents', 'true')

sess = boto3.session.Session()
s3 = sess.client(
    service_name='s3',
    region_name='vast-1',
    use_ssl=False,
    endpoint_url='http://198.51.100.255',
    aws_access_key_id='YAUY93A7Q91SO2M07BWZ',
    aws_secret_access_key='NWzyNIqIscgVZlaHqBzBWaejMDHSHbZuXDxbE2Yp',
    config=boto3.session.Config(
        signature_version='s3v4',
        s3={'addressing_style': 'path'}
    )
)

# Add the trash folder header to every request before it is signed
s3.meta.events.register('before-sign.*.*', set_delete_contents_header)
s3.delete_object(Bucket='mybucket', Key='a/b')
```
Following the deletion, the objects can no longer be listed:
```
[centos@host ~]$ s3cmd ls s3://mybucket/a/b
[centos@host ~]$
```
In an S3 bucket, the objects are stored in a flat hierarchy where every object has a unique key. The key is the name of the object and can contain slashes, unlike in other access protocols, where a slash is a directory delimiter in a file system path. In S3, directories do not exist, but slashes can be used to organize the namespace and create the illusion of directories. Some S3 applications use objects with trailing slashes in their keys to simulate directory nodes. For example, an application could use an object called 'dev/' as a directory by creating new objects with keys that have 'dev/' as a common prefix. This use of objects as directory nodes may be referred to as 'S3 directories'.
The VAST Cluster S3 implementation supports 'S3 directories' with limitations in terms of multiprotocol access. S3 directories are supported as follows:
-
Object keys specified in S3 requests can contain slashes:
-
Object keys can start or end with a slash, such as '/a' or 'a/'.
-
Object keys can contain multiple slashes in series, such as '//a', 'a//', 'a//b', or 'a///'.
-
An object created via S3 RPC cannot be accessed via NFS or SMB, neither as a file nor as a directory, if the object key starts with a slash, or ends with a slash, or includes multiple slashes.
-
An object with a key that has a trailing slash is a valid S3 object and is usable as an S3 directory by virtue of the trailing slash. It is otherwise the same as any other S3 object, supporting S3 operations and able to contain data as well as metadata, ACL, and versions.
-
In non-versioned buckets, two objects can have keys that are identical except that one of them has a single trailing slash. This is effectively like having a file and a directory with the same name. For example, you can have an object with key 'a/b' and a 'directory' object with key 'a/b/'. These are valid as two distinct objects. However, there is a limitation that you cannot have a file with the same name as a directory if you also have a file 'inside' that directory. For example: although you can have 'key1' and 'key1/' as object keys, you cannot have 'key1' and 'key1/key2'.
-
In versioned buckets, object keys must be distinguished by more than a single trailing slash, since a trailing slash is reserved for the object versioning directory. So, for example, the following are valid in a versioned bucket as keys for two distinct objects: 'a/b' and 'a/b//'.
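A small sketch of this rule (the helper name is hypothetical): two keys conflict in a versioned bucket only when they are identical except for a single trailing slash.

```python
def single_trailing_slash_conflict(key_a: str, key_b: str) -> bool:
    """Return True if the two keys differ by exactly one trailing slash,
    which is not allowed for distinct objects in a versioned bucket."""
    return key_a != key_b and (key_a + "/" == key_b or key_b + "/" == key_a)

print(single_trailing_slash_conflict("a/b", "a/b/"))   # True: invalid pair in a versioned bucket
print(single_trailing_slash_conflict("a/b", "a/b//"))  # False: distinguished by more than one slash
```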
Directories created inside views via other protocols (NFS, SMB) are accessible by S3 clients. To derive the key for accessing a file under a nested path as an S3 object, prepend the file's path relative to the view path to the file name.
For example, supposing:
-
A view is created on path 'dev/' and mounted by an NFS client.
-
An NFS client creates a directory 'src' under the mounted view and saves the file 'component' under the directory.
The key for accessing the file 'component' as an S3 object is 'src/component'.
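A sketch of that derivation, assuming absolute POSIX paths for the view and the file (the helper name is hypothetical):

```python
import posixpath

def s3_key_for_file(view_path: str, file_path: str) -> str:
    """Derive the S3 object key for a file created via NFS/SMB inside a view:
    the key is the file's path relative to the view path."""
    return posixpath.relpath(file_path, start=view_path)

print(s3_key_for_file("/dev", "/dev/src/component"))  # prints "src/component"
```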