DBox replacement is a VMS-enabled procedure for replacing a faulty DBox while the cluster continues to operate.
This DBox replacement procedure is suitable for the following situations:
-
When a DBox has failed in a cluster that has DBox HA capability. The cluster is still running.
-
When a DBox is faulty but running. For example, the DBox has a failed slot. Even if DBox HA is not enabled, the cluster is still running, since the DBox has not failed.
A replacement DBox ships with new DNodes but without SSDs or NVRAMs. During the procedure, you migrate the SSDs and NVRAMs from the faulty DBox to the new DBox.
-
The procedure requires you to connect the new DBox to the switches before disconnecting the old DBox. Therefore, the cluster's network switches must have enough unused ports to accommodate an extra DBox.
Please consult your VAST Data sales engineer for help designating switch ports and ensuring that they are configured with the correct port designations for DNodes as required.
-
Similarly, you'll need rack space and PSUs in order to install the new DBox before physically removing the faulty DBox.
-
Replacement DBox with rail mount kit and four C13/C14 power cables. All SSD slots on the DBox must be empty.
-
4 x 100Gb/s QSFP28 cables for connecting the new DBox to the cluster's switches.
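The requirements above can be collected into a simple pre-flight checklist before any hardware is touched. The following is a minimal sketch, not a VAST tool: `SiteInventory`, its fields, and `preflight_problems` are hypothetical names, and it assumes that each of the four QSFP28 cables needs one unused switch port. The rack space a DBox needs is not stated here, so the caller supplies it.

```python
from dataclasses import dataclass

# Counts taken from the list above.
QSFP28_CABLES_NEEDED = 4
POWER_CABLES_NEEDED = 4


@dataclass
class SiteInventory:
    """Hypothetical record of what is available on site (not a VAST structure)."""
    unused_switch_ports: int
    free_rack_units: int
    qsfp28_cables: int
    c13_c14_cables: int
    new_dbox_slots_empty: bool


def preflight_problems(inv: SiteInventory, rack_units_needed: int) -> list[str]:
    """Return a list of unmet requirements; an empty list means ready to start."""
    problems = []
    # Assumption: one unused switch port per QSFP28 cable.
    if inv.unused_switch_ports < QSFP28_CABLES_NEEDED:
        problems.append("not enough unused switch ports")
    # The new DBox is racked before the faulty one is removed.
    if inv.free_rack_units < rack_units_needed:
        problems.append("not enough free rack space")
    if inv.qsfp28_cables < QSFP28_CABLES_NEEDED:
        problems.append("not enough QSFP28 cables")
    if inv.c13_c14_cables < POWER_CABLES_NEEDED:
        problems.append("not enough C13/C14 power cables")
    if not inv.new_dbox_slots_empty:
        problems.append("replacement DBox has populated SSD slots")
    return problems
```

An empty result means every listed item is in place. Port designations still need to be confirmed with your VAST Data sales engineer.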
-
Without removing the faulty DBox, rack mount the new DBox and add it to the cluster. Follow the instructions in the cluster expansion procedure to add the DBox to the cluster. Make sure to select Empty box in the General Settings screen.
-
On the DBoxes tab, open the Actions menu for the faulty DBox that you want to replace and select Replace.
-
Click Yes to confirm your action.
-
On the Clusters tab of the Infrastructure page, check that the cluster's Raid State is healthy.
-
Prepare to move SSDs from the old DBox into the new DBox. Plan to insert each SSD into the slot in the new DBox that has the same slot number as in the old DBox.
-
Migrate each SSD, one at a time, as follows:
-
Remove the SSD from the faulty DBox.
The SSD's state changes to Failed and the cluster's RAID state changes to Rebuild.
-
Insert the removed SSD into the target slot in the new DBox.
The SSD is activated automatically.
-
Verify that the cluster's RAID state has returned to healthy before proceeding with the next SSD.
-
Prepare to move NVRAMs from the old DBox into the new DBox. Plan to insert each NVRAM into the slot in the new DBox that has the same slot number as in the old DBox.
-
On the Clusters tab of the Infrastructure page, check that the cluster's NVRAM State is healthy.
-
For each NVRAM in turn:
-
In the NVRAMs tab, open the Actions menu for the NVRAM and select Deactivate.
-
When the NVRAM is deactivated, remove the NVRAM from the faulty DBox.
-
Insert it into the target slot in the new DBox. If the NVRAM itself is faulty, insert the replacement NVRAM into the planned slot instead.
-
Verify that the slot is active and the device is healthy.
-
Open the Actions menu for the moved NVRAM and select Activate.
-
Verify that the cluster's NVRAM State is healthy before proceeding with the next NVRAM.
-
Verify that the faulty DBox is empty of devices.
-
Open the Actions menu again for the faulty DBox and select Conclude Replacement.
-
Click Yes to confirm the replacement.
The process of removing the DBox and DNodes takes some time. You can monitor the progress by watching the replace_dbox task in the Activities page.
-
Wait until the task is complete and then physically remove the faulty DBox. Ship it back to VAST Data.
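The key discipline in the SSD and NVRAM steps is to move one device at a time and to wait for the cluster state to return to healthy before touching the next one. The sketch below illustrates that gating logic only. It does not call any VAST API: `get_state` and `move_device` are hypothetical callables that you would back with whatever you use to read the RAID or NVRAM state and to prompt the operator, and the `"healthy"` string is an assumed representation of the state shown on the Clusters tab.

```python
import time
from typing import Callable, Iterable, List


def wait_until_healthy(
    get_state: Callable[[], str],
    timeout_s: float = 3600.0,
    poll_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_state() until it reports 'healthy', or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_state().lower() == "healthy":
            return
        if time.monotonic() >= deadline:
            raise TimeoutError("state did not return to healthy in time")
        sleep(poll_s)


def migrate_one_at_a_time(
    slots: Iterable[int],
    move_device: Callable[[int], None],
    get_state: Callable[[], str],
    **wait_kwargs,
) -> List[int]:
    """Move devices slot by slot, keeping the same slot number in the new DBox.

    move_device(slot) stands for the manual work: remove the device from
    slot N of the faulty DBox and insert it into slot N of the new DBox.
    """
    moved = []
    for slot in slots:
        # Never start a move while the cluster is still rebuilding.
        wait_until_healthy(get_state, **wait_kwargs)
        move_device(slot)
        # Gate the next device on the state returning to healthy.
        wait_until_healthy(get_state, **wait_kwargs)
        moved.append(slot)
    return moved
```

For NVRAMs, `move_device` would also cover the Deactivate and Activate actions around the physical move. If a timeout is raised, stop and investigate rather than continuing with the next device.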