
It has never happened to me before, but I'm unable to do a simple task such as compressing an 18.5 GB file on Ubuntu Linux 18.04 with any of the popular compression tools such as gzip, bzip2 or 7z. All of them report a similar warning (not error) message claiming that the file size changed during compression, when actually no other process is accessing the file. For example, when trying to "tar-gz" it, the tool reports File shrank by <nnnnnnnn> bytes; padding with zeros and exits with error code 1, which tar's manpage says is due to a file changing while being archived:

exit code 1: Some files differ. If tar was invoked with the --compare (--diff, -d) command line option, this means that some files in the archive differ from their disk counterparts. If tar was given one of the --create, --append or --update options, this exit code means that some files were changed while being archived and so the resulting archive does not contain the exact copy of the file set.
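
For illustration, a command along these lines triggers the warning; the file name below is just a placeholder, not my real path:

    # Illustrative only – the exact invocation may differ; "vm-disk.vmdk" is a placeholder
    tar -czf vm-disk.tar.gz vm-disk.vmdk
    # tar prints something like: "tar: vm-disk.vmdk: File shrank by <nnnnnnnn> bytes; padding with zeros"
    echo $?    # prints 1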

The file is a VMDK, and of course the associated VM is completely shut down when I compress it. I've also noticed that all the compression tools fail when the compressed output reaches a size of around 280 MB.

I've already checked other similar questions on Server Fault, but I still haven't found any hint that explains what's happening. The most upvoted answer to the linked question says that this is not an error and that the compression tool is just "simplifying" a run of zero bytes, but if I try to run the VM after decompressing the VMDK file, it fails, claiming the disk is corrupted.
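
To rule out the "it's just zeros" explanation, one can compare the file's apparent size with the space it actually occupies on disk; a sparse file allocates far fewer blocks than its apparent size suggests. The file name below is again a placeholder:

    ls -lh vm-disk.vmdk                  # apparent size
    du -h --apparent-size vm-disk.vmdk   # same number, via du
    du -h vm-disk.vmdk                   # blocks actually allocated on disk
    stat -c 'size=%s allocated=%b blocks of %B bytes' vm-disk.vmdk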

I'm completely stuck on this. Any idea of what could be happening?

UPDATE

While trying to copy the file to another directory using the cp command, I got an I/O error while reading the file. In addition, dmesg reported I/O errors while reading a specific block of the file. Everything points to a disk error (although e2fsck says everything is OK and there are 0 bad blocks). Since I already have a backup of the VM, I will replace the host computer's disk, reinstall a fresh copy of Ubuntu, and see what happens. I'll keep this question posted until I get some results.
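
For anyone hitting the same symptoms, a full sequential read of the file reproduces the error without writing anything (the file name is a placeholder):

    # Read the whole file and throw the data away; a bad sector shows up as an I/O error
    dd if=vm-disk.vmdk of=/dev/null bs=1M status=progress
    # Then check the kernel log for the underlying disk errors
    dmesg | grep -iE 'i/o error|blk_update_request|ata[0-9]'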

  • To make really sure no process is accessing the file, you can use the lsof command. Mar 22, 2021 at 8:33
  • @JoãoAlves Hello! I already did that! I even restarted the host machine and attempted to compress the file without starting the VM.
    – Claudi
    Mar 22, 2021 at 8:35
  • "actually no other process is accessing to the file ....". How can you be sure? To be really really sure, can do sudo chattr +i my-file.vmdk then try compress. Also please paste example of failing compress command to help us understand exactly.
    – spinkus
    Mar 22, 2021 at 9:45
  • Please execute stat <filename> and triple-check the mtime/ctime timestamps to be sure nothing is changing the file.
    – shodanshok
    Mar 22, 2021 at 10:06
  • Check the disk's SMART data. Run e2fsck -c -f to check for bad blocks; a plain e2fsck won't do a full disk surface scan for bad blocks. Or better, immediately stop using this disk and make a dump of it (with ddrescue) if it holds any valuable information. Mar 22, 2021 at 11:28
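
Pulling the checks suggested in the comments above into one runnable snippet (the path is a placeholder):

    # Is anything holding the file open? (no output means nobody)
    sudo lsof /path/to/vm-disk.vmdk
    # Make the file immutable so that any writer would fail loudly
    sudo chattr +i /path/to/vm-disk.vmdk
    # Note the timestamps before and after a compression attempt
    stat /path/to/vm-disk.vmdk
    # Remove the immutable flag when done
    sudo chattr -i /path/to/vm-disk.vmdk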

1 Answer


OK, I'm answering my own question, in case it helps someone else realize that they are actually facing a hardware problem.

After trying multiple times to compress the problematic file (even with different compressors), I tried to copy it to another directory using cp, which reported an I/O error while reading the file:

cp: reading `filename': Input/output error

A quick glance at dmesg's output confirmed the hardware problem: the kernel reported an I/O error while reading a specific block on the disk.
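
For completeness, this is the kind of SMART check the commenters suggested; smartctl comes from the smartmontools package, and /dev/sda stands for the failing drive:

    # Overall health verdict and the attributes that matter for failing media
    sudo smartctl -H /dev/sda
    sudo smartctl -A /dev/sda | grep -iE 'reallocated|pending|uncorrect'
    # Optionally run a short self-test and read its result a few minutes later
    sudo smartctl -t short /dev/sda
    sudo smartctl -l selftest /dev/sda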

I booted the OS in emergency mode and ran e2fsck -vf /dev/sda1, yet it didn't report any bad blocks. In the comments to my question, user Nikita Kipriyanov suggested running e2fsck -c -f, which I didn't get the chance to try because I had already replaced the disk. The -c flag deals specifically with bad blocks; according to the manpage, it:

causes e2fsck to use badblocks(8) program to do a read-only scan of the device in order to find any bad blocks. If any bad blocks are found, they are added to the bad block inode to prevent them from being allocated to a file or directory. If this option is specified twice, then the bad block scan will be done using a non-destructive read-write test.
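
A sketch of how that scan could have been run in my case (the filesystem must be unmounted, e.g. from a live/rescue system; /dev/sda1 is the partition I checked earlier):

    sudo umount /dev/sda1
    sudo e2fsck -f -c -v /dev/sda1        # read-only badblocks(8) scan
    # sudo e2fsck -f -c -c -v /dev/sda1   # -c given twice: non-destructive read-write test (much slower)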

You can run this command, as Nikita suggests, as a workaround, but when a disk starts giving hardware errors the best option is to save as much information as possible and move the system to a new, healthy disk.
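
If the data matters, the usual ddrescue approach looks something like this (device, image path and map file are placeholders; on Ubuntu the tool ships in the gddrescue package):

    # First pass: copy everything that reads cleanly, keeping a map of the bad areas
    sudo ddrescue -n /dev/sda /mnt/backup/sda.img /mnt/backup/sda.map
    # Second pass: retry the bad areas a few times
    sudo ddrescue -r3 /dev/sda /mnt/backup/sda.img /mnt/backup/sda.map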

Good luck!

  • You can always check the disk after it has been replaced. Just connect it to another computer :) We always do such a post-mortem analysis, just to be sure. And drives always die after some use, so you must have drive health monitoring with alerts! It saves data... Apr 11, 2021 at 17:29
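
A minimal sketch of that kind of monitoring with smartd from smartmontools (the email address is a placeholder; the exact service name varies between releases):

    # /etc/smartd.conf – monitor all drives, schedule short self-tests, e-mail on problems
    DEVICESCAN -a -o on -S on -s (S/../.././02) -m admin@example.com
    # Enable the daemon afterwards
    sudo systemctl enable --now smartd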
