I recently needed to do some maintenance on a RocksDB key-value store. The task was simple enough: delete some keys. The db served as a cache and did not contain any permanent data.
I used the RocksDB CLI administration tool, ldb, to erase the keys. After running a key scan with it, I got this error:
Failed: Corruption: Snappy not supported or corrupted Snappy compressed block contents
So, a damaged database. Fortunately, there is a tool to fix it, and after running it I had access to the db via the admin tool.
All the data was lost, though. Adding and removing keys worked fine, but all the old keys were gone. It turned out that the corrupted data was all the data there was.
The recovery tool had made a backup folder, and I recovered the data by taking the files from that folder and manually editing the CURRENT file to point to the old MANIFEST file, which is apparently how RocksDB knows which sst (table) files to use.
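The CURRENT edit can also be scripted instead of done by hand. The following is only a minimal sketch: it assumes CURRENT is a one-line text file holding the MANIFEST file name followed by a newline, and the function name `point_current_at` and the example manifest name are made up for illustration.

```python
import os
import tempfile


def point_current_at(db_dir, manifest_name):
    """Rewrite db_dir/CURRENT so that it names manifest_name.

    Assumption: CURRENT contains only the manifest file name plus a newline.
    """
    manifest_path = os.path.join(db_dir, manifest_name)
    if not os.path.isfile(manifest_path):
        # Refuse to point CURRENT at a manifest that is not in the db folder.
        raise FileNotFoundError(manifest_path)

    # Write to a temporary file first, then rename it over CURRENT, so a
    # half-written CURRENT file is never left behind.
    fd, tmp_path = tempfile.mkstemp(dir=db_dir, prefix="CURRENT.", suffix=".tmp")
    with os.fdopen(fd, "wb") as tmp:
        tmp.write((manifest_name + "\n").encode("ascii"))
    os.replace(tmp_path, os.path.join(db_dir, "CURRENT"))
```

A call would look like `point_current_at("/path/to/db", "MANIFEST-000005")`, where both the path and the manifest number are placeholders. It is probably wise to do this only while nothing has the database open, and on a copy of the files.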
I could not access the data with the admin tool, but the native C API worked fine, so I exported all the keys as raw key-value text strings with the native API and wrote a script that used the admin tool to recreate the database.
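A rough sketch of what such a recreate script might look like follows. It is not the script I actually used. It assumes the export is a text file with one `key<TAB>value` pair per line (a made-up format; keys or values containing newlines would not survive it) and that each pair is written with ldb's `put` command against the path given in `--db`. The helper names are hypothetical.

```python
import subprocess


def parse_dump_line(line, sep="\t"):
    """Split one exported line into (key, value) at the first separator."""
    key, value = line.rstrip("\n").split(sep, 1)
    return key, value


def build_put_commands(lines, db_path, ldb="ldb"):
    """Turn exported lines into one ldb 'put' command per key."""
    commands = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the export
        key, value = parse_dump_line(line)
        commands.append([ldb, "--db=" + db_path, "put", key, value])
    return commands


def replay(commands):
    """Run the commands one by one, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Building the command lists separately from running them makes it possible to inspect what would be executed before touching the database. For a brand new database directory, ldb may also need to be told to create it (I believe there is a `--create_if_missing` option, but check the help output of your version).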
Everything was working fine now: I could add and remove keys both via the native API, which the actual application uses, and with the admin tool.
I returned to the DB a few days later and, as I had feared, it was corrupted again. Or so the admin tool claimed. The actual application was working fine, so I think it probably has something to do with the data, or there is a bug in the admin tool. The investigation is still ongoing. The next steps are to upgrade the DB to a later version, check whether some configuration parameter is wrong, and try adding some of the newer keys manually to a fresh db to see if I can "break" it.