Post-Mortem Analysis for Multiprocessing Program in C

Last updated on December 24, 2023

When a program crashes, the error message sometimes ends with "(core dumped)".

`make: *** [Makefile:33: run] Segmentation fault (core dumped)`

This core dump file is extremely useful for post-mortem analysis.

Enable Core Dump

Here is how to enable core dumps on CentOS.

Set `ulimit`

If ulimit -c prints 0, core dumps are disabled. Remove the limit with:

`ulimit -c unlimited`

This only affects the current shell session, so it is better to add it to the user’s .bashrc.

`echo 'ulimit -c unlimited' >> ~/.bashrc`
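
When the shell configuration cannot be relied on, the same limit can also be raised from inside the process. The following is a minimal sketch, assuming Linux and the POSIX getrlimit/setrlimit calls; the helper name enable_core_dumps is hypothetical and not from the original post.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-size limit to the hard limit.
 * This matches `ulimit -c unlimited` only when the hard limit is itself
 * unlimited. Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;

    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

Calling enable_core_dumps() early in main() would apply the limit to the process and to any children it starts afterwards.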

Configure the dump file location

On CentOS, the core dump destination is defined by /proc/sys/kernel/core_pattern, which requires root privileges to modify.

`echo 'dumps/core.%e.%t.%p' | sudo tee /proc/sys/kernel/core_pattern`

With this pattern, each dump is written to the dumps directory under the crashing process’s current working directory, with the filename core.%e.%t.%p. The dumps directory must already exist, because the kernel does not create it. The placeholders mean:

%e: name of the executable
%t: timestamp of dumping, in seconds since the UNIX Epoch
%p: process ID of the task
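
To make the placeholder substitution concrete, here is a small illustrative sketch that expands the three placeholders above into a filename. It only mimics what the kernel does for %e, %t and %p; the function name expand_core_pattern and the sample values are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: expand %e, %t and %p in `pattern` into `out`.
 * Any other character is copied through unchanged.
 * Returns 0 on success, -1 if `out` is too small. */
int expand_core_pattern(const char *pattern, const char *exe,
                        long timestamp, long pid,
                        char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (const char *p = pattern; *p != '\0'; p++) {
        size_t room = outsz - used;   /* bytes left, including the NUL */
        int n;

        if (p[0] == '%' && p[1] == 'e') {
            n = snprintf(out + used, room, "%s", exe);
            p++;                      /* skip the placeholder letter */
        } else if (p[0] == '%' && p[1] == 't') {
            n = snprintf(out + used, room, "%ld", timestamp);
            p++;
        } else if (p[0] == '%' && p[1] == 'p') {
            n = snprintf(out + used, room, "%ld", pid);
            p++;
        } else {
            n = snprintf(out + used, room, "%c", *p);
        }

        if (n < 0 || (size_t)n >= room)
            return -1;                /* output would be truncated */
        used += (size_t)n;
    }
    return 0;
}
```

For example, expanding dumps/core.%e.%t.%p with the executable name demo, timestamp 1701500000 and PID 9980 yields dumps/core.demo.1701500000.9980.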

Inside a Docker container, /proc/sys/kernel/core_pattern is read-only by default, because the container shares the host’s kernel and this setting belongs to the host. Docker on Windows uses WSL2 as its backend, so changing the value inside WSL2 works. However, the change may be lost after a reboot.

Analyze Core Dump

Now that core dumps are enabled, we can use gdb to analyze the state of a crashed program at the moment it died.

`gdb <executable> <core dump file>`

Here are some useful commands:

bt: backtrace, show the stack trace
info locals: show the local variables of current stack frame
frame <frame id> or f <frame id>: switch to a specific stack frame which is shown in backtrace
list: show the source code of current stack frame

POSIX Threads

info threads: show all threads’ information, current thread is marked with *
thread <thread id>: switch to a specific thread

(gdb) info threads
  Id   Target Id         Frame
  4    Thread 0x7f9f675be700 (LWP 9983) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185  
* 3    Thread 0x7f9f685c0700 (LWP 9981) __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  2    Thread 0x7f9f68dc4740 (LWP 9980) 0x00007f9f68998017 in pthread_join (threadid=140322627389184, thread_return=0x0) at pthread_join.c:90
  1    ...

Mutex

If a thread is stuck in the __lll_lock_wait() function, it is waiting for a mutex.


(gdb) bt
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 ...

Use p (pthread_mutex_t) <mutex> to print a mutex’s value.


(gdb) p (pthread_mutex_t) my_mutex
$0 = {__data = {__lock = 2, __count = 0, __owner = 9982, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
      __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\376&\000\000\001", '\000' <repeats 26 times>, __align = 2}

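__owner: the thread ID (the LWP number shown by info threads) of the thread that currently holds the mutex
__nusers: the number of threads currently using the mutex, that is, holding it (1 here, for the owner), not the number of threads waiting for it
__kind: the type of the mutex; 0 stands for PTHREAD_MUTEX_NORMAL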
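
The __kind field reflects the type chosen when the mutex was initialised. As an illustration, the sketch below creates an error-checking mutex, which would show a non-zero __kind in gdb and reports a self-deadlock as EDEADLK instead of hanging. It assumes a POSIX system; the helper names init_errorcheck_mutex and relock_result are hypothetical.

```c
/* Needed for PTHREAD_MUTEX_ERRORCHECK under strict -std=c11. */
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise `m` as an error-checking mutex.
 * Returns 0 on success or an errno-style code on failure. */
int init_errorcheck_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock the same mutex twice from one thread and return the result
 * of the second attempt. Returns -1 if initialisation fails. */
int relock_result(void)
{
    pthread_mutex_t m;
    int rc;

    if (init_errorcheck_mutex(&m) != 0)
        return -1;
    pthread_mutex_lock(&m);
    rc = pthread_mutex_lock(&m);   /* fails with EDEADLK, does not block */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

With a mutex of the default type, the second lock would typically block forever, which is the situation a backtrace ending in __lll_lock_wait() points to.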

Condition Variable

If a thread is stuck in the pthread_cond_wait() function, it is waiting for a condition variable.


(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  ...

Similarly, use p (pthread_cond_t) <cond> to print a condition variable’s value.

(gdb) p (pthread_cond_t) my_cond1
$20 = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, 
    __broadcast_seq = 0}, __size = '\000' <repeats 47 times>, __align = 0}
(gdb) p (pthread_cond_t) my_cond2
$21 = {__data = {__lock = 0, __futex = 1, __total_seq = 1, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x603160 <my_mutex2>, 
    __nwaiters = 2, __broadcast_seq = 0},
  __size = "\000\000\000\000\001\000\000\000\001", '\000' <repeats 23 times>, "`1`\000\000\000\000\000\002\000\000\000\000\000\000",     
  __align = 4294967296}
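
When reading such a dump, it helps to compare the stuck thread against the usual wait pattern: the waiter holds the associated mutex and re-checks a predicate in a loop, and the signalling side changes the predicate under the same mutex. The following is a minimal sketch of that pattern, not code from the program analysed above; all names (lock, ready_cond, ready, run_handoff) are hypothetical.

```c
#include <stddef.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;               /* predicate, guarded by `lock` */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!ready)                  /* re-check: wakeups may be spurious */
        pthread_cond_wait(&ready_cond, &lock);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Start one waiter, set the predicate, signal, and join the waiter.
 * Returns 0 on success, -1 on failure. */
int run_handoff(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, waiter, NULL) != 0)
        return -1;

    pthread_mutex_lock(&lock);
    ready = 1;                      /* publish the state change first */
    pthread_cond_signal(&ready_cond);
    pthread_mutex_unlock(&lock);

    return pthread_join(t, NULL) == 0 ? 0 : -1;
}
```

Because the predicate is set before signalling, the waiter does not block even if the signal is sent before it reaches pthread_cond_wait(). A thread that remains in pthread_cond_wait() in a core dump suggests checking whether the predicate was ever set and whether the matching signal was sent.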

Analyze Deadlocked Program

Although this post is mainly about analyzing core dumps, here is a short tip for when a running program is deadlocked.

In the commands below, replace <pid> with the process ID of the program, which can be found with the top command.

Use gdb --pid <pid> to attach to the program and debug as usual.
Save the current state of the program as a core dump file, in either of two ways:
1. Inside gdb, use gcore or gcore <name>.
2. From the shell, use gcore <pid> or gcore -o <name> <pid>.


https://lingkang.dev/2023/12/02/core-dump/

Author

Lingkang

Posted on

December 2, 2023
