The Data Reduction Probe software is provided at zero cost and with zero warranty to VAST Data’s current and prospective customers so that they can accurately estimate the Data Reduction Rate of specific data not yet stored on VAST Data appliances. The software runs on physical or virtualized customer-maintained hardware and analyzes data that the customer makes accessible through traditional filesystem-based access. The results of the probe are used to determine a Data Reduction Rate, which is often used to project aggregate financial savings for VAST Data’s current and prospective customers.
Where does the Data Reduction Probe originate?
The Data Reduction Probe is a Docker container of scripts and libraries that is assembled and maintained solely by VAST Data engineering. It is updated frequently, usually quarterly. The download links are posted on the VAST Data support website and point to signed Azure Blob storage URLs for different regions across the globe.
Where does the Data Reduction Probe run?
The Data Reduction Probe is designed to run within a customer environment on physical or virtualized customer-maintained equipment. The provided container requires a base Linux operating system, which the customer is expected to install and update before launching the Data Reduction Probe.
What information does the Data Reduction Probe collect?
The Data Reduction Probe generates a series of logs for each iteration of data scanning. By default, these logs are saved on the same physical or virtualized customer-maintained equipment on which the Data Reduction Probe runs. The logs contain references to the paths provided as inputs, and can refer to any path within those directory structures when reporting data reduction results. The analysis log file that is generated when the Data Reduction Probe completes lists each full path along with data reduction rate figures for that path. In addition, a second section of the same analysis log file lists aggregate information for each file extension, with data reduction rate figures for that extension.
What information does the Data Reduction Probe send back to VAST Data?
The Data Reduction Probe has built-in call-home telemetry, which is on by default provided the probe can reach specific AWS S3 buckets over the internet. While the probe is running, telemetry logs are sent approximately every 5 minutes. By default, these telemetry logs omit full paths, with the exception of the root input path, and upload only a percentage-based status of the probe along with any error messages. The final telemetry log is similar to the local analysis log file but, by default, removes full paths with the exception of the root input path. The final telemetry log includes the aggregated data reduction rates by file extension, as illustrated below:
file extension statistics:
file type .xlsx, original_size=143.7GB, global_compression_reduced_size=126.6GB, global_compression_factor=1.14, dedup_percentage=10.34%, similarity_match_percentage=15.12%, similarity_gain=310.9MB, local_compression_only_size=126.9GB
file type .tsv, original_size=291.5GB, global_compression_reduced_size=30.8GB, global_compression_factor=9.47, dedup_percentage=1.95%, similarity_match_percentage=84.83%, similarity_gain=9.6GB, local_compression_only_size=40.4GB
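For readers who want to post-process these lines, the following is a minimal Python sketch that parses the two sample lines above into dictionaries and recomputes the compression factor as original_size divided by global_compression_reduced_size. The helper names (size_to_bytes, parse_line, recomputed_factor) are hypothetical and are not part of the probe. The line format is inferred only from the sample shown here, the reading of global_compression_factor as a simple size ratio is an assumption, and binary unit multiples (1 GB = 1024 MB) are also assumed.

```python
import re

# The two sample lines from the telemetry excerpt above, copied verbatim.
SAMPLE = (
    "file type .xlsx, original_size=143.7GB, "
    "global_compression_reduced_size=126.6GB, global_compression_factor=1.14, "
    "dedup_percentage=10.34%, similarity_match_percentage=15.12%, "
    "similarity_gain=310.9MB, local_compression_only_size=126.9GB\n"
    "file type .tsv, original_size=291.5GB, "
    "global_compression_reduced_size=30.8GB, global_compression_factor=9.47, "
    "dedup_percentage=1.95%, similarity_match_percentage=84.83%, "
    "similarity_gain=9.6GB, local_compression_only_size=40.4GB\n"
)

# Assumption: sizes use binary multiples; the sample does not say which.
UNIT_BYTES = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def size_to_bytes(text):
    """Convert a size string such as '143.7GB' to a number of bytes."""
    match = re.fullmatch(r"([0-9.]+)([KMGT]?B)", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * UNIT_BYTES[match.group(2)]


def parse_line(line):
    """Parse one 'file type ...' line into a dict of strings, else None."""
    head, *pairs = line.strip().split(", ")
    if not head.startswith("file type "):
        return None
    record = {"extension": head[len("file type "):]}
    for pair in pairs:
        key, _, value = pair.partition("=")
        record[key] = value
    return record


def recomputed_factor(record):
    """Assumed definition: original size divided by globally reduced size."""
    original = size_to_bytes(record["original_size"])
    reduced = size_to_bytes(record["global_compression_reduced_size"])
    return original / reduced


records = [r for r in map(parse_line, SAMPLE.splitlines()) if r is not None]
for r in records:
    # Print the reported factor next to the recomputed one for comparison.
    print(r["extension"], r["global_compression_factor"],
          round(recomputed_factor(r), 2))
```

Because the sizes in the log are themselves rounded for display, the recomputed factor may differ from the reported one in the last digit.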
Who can access the logs sent to VAST Data?
Anyone in VAST Data engineering or sales has access to the AWS S3 bucket that is used as the telemetry destination for the Data Reduction Probe.
What actions are performed with the logs sent to VAST Data?
The telemetry logs are primarily used by sales to determine a Data Reduction Rate, which is often used to project aggregate financial savings for VAST Data’s current and prospective customers. Telemetry logs can also be used to estimate an expected Data Reduction Rate for a given industry or use case, for example when a sales team’s customer has similar data but has not run the Data Reduction Probe. VAST Data engineering also uses the telemetry data for bug fixes and overall improvements to the software and user experience.
How do I control what the Data Reduction Probe sends back to VAST Data?
This call home telemetry feature can be disabled at runtime with the added flag:
If you wish to send file names with the default telemetry logs, add the following flag: