Keep Your Storage Secure and Scrub Data Regularly

According to the principle of entropy, everything that is ordered tends toward disorder. The same happens to your data: once recorded and left with no guard, it slowly decays, regardless of the medium it is stored on. Just as ink on paper fades when exposed to the sun, your data on storage media such as HDDs and SSDs can also become corrupted. This is called data rot or data decay and, depending on the type of drive, can be caused by the loss of the magnetic state of bits on an HDD or by the loss of electrical charge in SSD cells. So do not underestimate the risk this phenomenon poses to your data storage.

ZFS data self-healing

Advanced file systems such as ZFS deal with this problem thanks to their self-healing ability. How does it work? All data saved in ZFS is stored in a redundant manner (as long as you do not disable this function, which we do not recommend). This means that along with the actual data, its checksum and parity data are written to the storage pool as well. Every read of this data involves checking it against the checksum; if it does not match, the data has been corrupted. When that happens, the system goes to the parity data, restores the damaged block, and returns the correct data to the application. Self-healing thus means that all the data we read from the storage is verified on an ongoing basis and, if necessary, repaired.

So what can go wrong? A big part of your data is not read on a daily or even weekly basis (which does not mean it is unimportant), and, as we already know, the probability of data corruption increases over time. In other words, the less frequently we read data from the device, the greater the chance it has become corrupted. This leads to a dangerous situation: the most frequently used data is safe, because it constantly goes through the checking and repairing procedure, while rarely used data (which does not mean it is unimportant) is not only more likely to be damaged by data rot, but also passes through the self-healing procedure less often.

Of course, there is a solution for that as well. For ZFS-based systems such as Open-E JovianDSS, we have the Scrub tool. As the name suggests, it scrubs all the data in the pool. What is the purpose of such a procedure? As previously mentioned, some of the data in our pool is rarely used, which carries the risk of data corruption. To prevent that, during the scrub process the system reads all the data in the pool, checks it against its checksums and, if any corrupted data is spotted, marks the affected drive block as unusable and copies the data recovered from parity or a mirror to a new place on the drive. Scrubbing is therefore a drive hygiene task, and it is necessary to ensure data integrity.

Until quite recently, storage admins may have felt a certain reluctance to run this process because of its duration. The older scrub algorithm was indeed inefficient and resource-hungry, which meant that scrubbing a large volume of data could take a very long time, sometimes even longer than a month. There were two main reasons for this:

- the system is very busy because of the nature of the stored data (scrub is a low-priority task, so it has to wait its turn in such a case);
- the imperfection of the algorithm itself, which relied on random I/O.

Fortunately, this problem was solved by the implementation of a sequential scrub algorithm, which scans the metadata instead of the data itself to build an in-memory list of data blocks. The blocks are then sorted by size and offset to arrange them in the sequential order of their physical location on the drive. They can then be read sequentially and checked against their checksums to identify corrupted data. The sequential scrub algorithm has been available in Open-E JovianDSS since the Up29 version. It significantly improves the process and shortens it to several days (the exact gain depends, of course, on the type of data and how it is distributed on the disk).
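The verify-and-repair read path described in the self-healing section can be sketched in a few lines. This is a toy illustration, not ZFS code: the two-copy "mirror", the SHA-256 checksum (ZFS actually defaults to fletcher4), and the `MirroredBlock` class are all simplifying assumptions made for this example.

```python
import hashlib

def checksum(data: bytes) -> str:
    # Stand-in for the block checksum ZFS stores alongside the data.
    return hashlib.sha256(data).hexdigest()

class MirroredBlock:
    """A toy block with two redundant copies, as on a simple mirror."""
    def __init__(self, data: bytes):
        self.copies = [data, data]            # redundant writes
        self.stored_checksum = checksum(data)

    def read(self) -> bytes:
        # Find a copy that still matches the stored checksum.
        good = None
        for c in self.copies:
            if checksum(c) == self.stored_checksum:
                good = c
                break
        if good is None:
            raise IOError("all copies corrupted: unrecoverable")
        # Self-heal: overwrite any corrupted copy with the good one,
        # then return the verified data to the caller.
        self.copies = [good if checksum(c) != self.stored_checksum else c
                       for c in self.copies]
        return good

blk = MirroredBlock(b"important payload")
blk.copies[0] = b"bit-rotted junk"              # simulate data rot on one copy
assert blk.read() == b"important payload"       # read returns the correct data
assert blk.copies[0] == b"important payload"    # and the bad copy was repaired
```

Note how the repair happens only when the block is actually read, which is exactly why rarely read data needs a periodic scrub.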
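The two-phase idea behind the sequential scrub (collect block pointers from metadata, then sort them by on-disk position before reading) can be sketched as follows. The `BlockPointer` record and the sample offsets and sizes are invented for illustration; real ZFS block pointers carry considerably more information.

```python
from dataclasses import dataclass

@dataclass
class BlockPointer:
    offset: int   # physical position on the drive, in bytes
    size: int     # block size, in bytes

# Phase 1: walk the metadata and collect block pointers into memory,
# in whatever (effectively random) order the traversal yields them.
discovered = [
    BlockPointer(offset=9_000_000, size=128 * 1024),
    BlockPointer(offset=4096,      size=16 * 1024),
    BlockPointer(offset=2_000_000, size=128 * 1024),
]

# Phase 2: sort by physical offset (and size) so the verification pass
# issues sequential reads instead of random I/O.
scan_order = sorted(discovered, key=lambda b: (b.offset, b.size))

for bp in scan_order:
    # A real scrub would now read each block and compare it with its
    # stored checksum; omitted here.
    pass

assert [b.offset for b in scan_order] == [4096, 2_000_000, 9_000_000]
```

Sorting is what turns a metadata-order walk into a near-sequential sweep of the platter, which is where the speedup over the old random-I/O scrub comes from.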