Wednesday, August 16, 2023

Bus Error (Core Dumped)!

I was training a machine learning model written in PyTorch on a Linux system. During the training, I encountered "Bus error (core dumped)." This error produces no stack trace. Eventually, I figured it out that this was resulted in the exhaustion of shared memory whose symptom is that  "/dev/shm" is full. 

To resolve this issue, I simply double the size of "/dev/shm", following the instruction given in this Stack Overflow post,

How to resize /dev/shm?

Basically, it is to edit the /etc/fstab file. If the file already has an entry for /dev/shm, we simply increase its size. If not, we add a line to the file, such as

none /dev/shm tmpfs defaults,size=32G 0 0

To bring it to effect, we remount the file system, as in,

sudo mount /dev/shm