In this post, we'll learn how to set up PyStack, debug running processes, analyze core dumps, and integrate PyStack with Pytest.
Debugging can be a formidable challenge, especially when dealing with stubborn issues like deadlocks, segmentation faults, crashing applications, or hanging processes. But there's a new player in town - PyStack, a powerful debugger that promises to work its magic and help you navigate these complex problems. In this blog post, we're going to explore how PyStack can be your troubleshooting sidekick for these perplexing scenarios.
You might be wondering: with a multitude of debugging tools at your disposal, including interactive IDE debuggers, why do you need PyStack? The answer lies in the nature of certain elusive bugs that are incredibly challenging to resolve. Here's where PyStack comes to the rescue:
1. Deadlocks and Hanging Processes: When you encounter a hanging process, it's often hard to discern whether it's actively working or stuck in a deadlock. PyStack can provide insights into the state of these processes.
2. Hybrid Applications: Applications that blend Python with C/C++ components, like Python extension modules, or popular libraries such as NumPy or TensorFlow, can be tricky to debug. PyStack can help you tackle issues like NumPy crashes with segfaults.
3. Unique Circumstances: Some issues are peculiar, occurring under specific conditions like heavy load or after an application has been running for a certain duration. PyStack can help you investigate these niche problems.
While other tools like GDB exist, PyStack offers several advantages. It doesn't modify your code, it can inspect core dump files, and it automatically fetches debugging information for your specific distribution, making it a valuable addition to your debugging toolkit.
Before diving into debugging, you'll need to prepare your environment:
Notably, you'll need to enable core dump generation by installing `systemd-coredump` and temporarily enable `ptrace` syscalls with the `echo` command. Lastly, you'll install PyStack itself along with the Pytest plugin.
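Once the setup is done, it can help to confirm that it took effect. The sketch below is not part of PyStack: `environment_report` is a hypothetical helper that checks whether a `pystack` executable is on `PATH` and reads the Yama `ptrace_scope` setting, assuming the kernel exposes it at `/proc/sys/kernel/yama/ptrace_scope` (a value of `0` allows attaching to other processes you own).

```python
import shutil
from pathlib import Path

# Assumed location of the Yama setting; it may be absent on some kernels.
PTRACE_SCOPE_FILE = Path("/proc/sys/kernel/yama/ptrace_scope")


def read_ptrace_scope():
    """Return the current ptrace_scope as an int, or None if unavailable."""
    try:
        return int(PTRACE_SCOPE_FILE.read_text().strip())
    except (OSError, ValueError):
        return None


def environment_report():
    """Summarize whether this machine looks ready for attaching with PyStack."""
    return {
        "pystack_on_path": shutil.which("pystack") is not None,
        "ptrace_scope": read_ptrace_scope(),
    }


if __name__ == "__main__":
    print(environment_report())
```

If `ptrace_scope` comes back as `1` or higher, attaching to an unrelated process will likely be refused until the setting is lowered or the debugger runs with elevated privileges.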
PyStack provides two ways to debug a program: attaching to a running process or analyzing a core dump of a crashed process. Let's start with the former. Consider this code snippet:
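As an illustration, here is a minimal sketch of a program that appears to hang. The file name `hang_demo.py`, the function `process_batch`, and its local variable are hypothetical; the only point is to have a long-lived frame with some locals worth inspecting.

```python
import os
import sys
import time


def process_batch(batch_id, pause_seconds):
    """Pretend to work on a batch, then block the way a stuck worker would."""
    pending = ["a", "b", "c"]  # a local variable for a stack inspector to show
    time.sleep(pause_seconds)  # the frame where the process appears to hang
    return len(pending)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: `python hang_demo.py 3600 &` keeps the process alive for an hour.
    print(f"PID: {os.getpid()}", flush=True)
    process_batch(batch_id=42, pause_seconds=float(sys.argv[1]))
```

The script prints its PID before blocking, which is the value a debugger needs in order to attach.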
You can run this code in the background and use PyStack to inspect it:
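On the command line this is `pystack remote <PID>`, optionally with `--locals` to include local variables. A small sketch of driving that from Python follows; `pystack_remote_command` and `dump_stack` are hypothetical helper names, and the sketch assumes `pystack` is installed and on `PATH`.

```python
import subprocess


def pystack_remote_command(pid, show_locals=True):
    """Build the argument list for `pystack remote <pid>`."""
    command = ["pystack", "remote", str(pid)]
    if show_locals:
        command.append("--locals")  # include local variables in each frame
    return command


def dump_stack(pid):
    """Run PyStack against a live process and return its textual report."""
    result = subprocess.run(
        pystack_remote_command(pid),
        capture_output=True,
        text=True,
        check=False,  # keep the output even if attaching was refused
    )
    return result.stdout or result.stderr
```

Calling `dump_stack(pid)` with the PID of the background process returns the report as a string, which is handy for saving alongside logs.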
PyStack will provide a traceback, highlighting where your program hangs and displaying local variables, offering crucial context.
Unfortunately, PyStack only runs on Linux, but on other platforms you can run it inside a Docker container. For instance, to debug a deadlock in a Docker container:
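A deadlock to experiment with can be sketched as two threads taking the same pair of locks in opposite order. The function `run_lock_order_demo` and its `timeout` parameter are hypothetical; the timeout exists only so the sketch can be exercised without hanging. Attaching inside a container typically also requires the ptrace capability, for example by starting it with `docker run --cap-add=SYS_PTRACE ...`; treat that as an assumption about your container setup.

```python
import threading


def run_lock_order_demo(timeout=-1):
    """Two threads take two locks in opposite order.

    With the default timeout of -1 each thread waits forever for its second
    lock, which is a deadlock. A positive timeout lets both threads give up.
    """
    lock_a, lock_b = threading.Lock(), threading.Lock()
    in_step = threading.Barrier(2)
    acquired_second = {}

    def worker(name, first, second):
        with first:
            in_step.wait()  # both threads now hold their first lock
            got_it = second.acquire(timeout=timeout)
            acquired_second[name] = got_it
            if got_it:
                second.release()
            in_step.wait()  # keep holding `first` until both attempts finish

    threads = [
        threading.Thread(target=worker, args=("t1", lock_a, lock_b)),
        threading.Thread(target=worker, args=("t2", lock_b, lock_a)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return acquired_second
```

Calling `run_lock_order_demo()` with no arguments never returns, and that stuck process is the one to attach to. A stack dump should then show both threads blocked on their second `acquire`.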
Sometimes, you won't be able to debug a live process, and that's where analyzing core dump files becomes vital. Core dumps are snapshots of a process when it crashes, often indicated by messages like "Segmentation fault (core dumped)." You can inspect core dumps with PyStack:
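The command-line form is `pystack core <core-file> [<executable>]`, with `--locals` for local variables and `--native` for C/C++ frames. The sketch below only assembles that command; `pystack_core_command` is a hypothetical helper, and the core file path is whatever your system's core dump handler produced.

```python
def pystack_core_command(core_file, executable=None, show_locals=True, native=False):
    """Build the argument list for `pystack core <core-file> [<executable>]`."""
    command = ["pystack", "core", str(core_file)]
    if executable is not None:
        command.append(str(executable))  # the interpreter that produced the core
    if show_locals:
        command.append("--locals")  # include local variables in each frame
    if native:
        command.append("--native")  # interleave native (C/C++) frames
    return command
```

The resulting list can be passed straight to `subprocess.run`. Core files written by `systemd-coredump` are often compressed, so they may need to be decompressed first.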
This allows you to investigate the state of the program when it crashed, displaying local variables crucial for understanding the issue.
PyStack proves invaluable when debugging issues in libraries like NumPy or PyTorch that have C/C++ components. Consider this NumPy example that triggers a segmentation fault:
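As a dependency-free stand-in for a crash inside a C extension, the sketch below uses the standard library's `ctypes` instead of NumPy: reading a C string from address 0 faults in native code, much as a bug in an extension module would. The names `run_crashing_child` and `describe_exit` are hypothetical, and the crash runs in a child process so that the parent survives to report it.

```python
import signal
import subprocess
import sys

# Dereferencing a NULL pointer from native code; this is the deliberate crash.
CRASH_SNIPPET = "import ctypes; ctypes.string_at(0)"


def run_crashing_child():
    """Run the crashing snippet in a child interpreter; return its exit status."""
    completed = subprocess.run(
        [sys.executable, "-c", CRASH_SNIPPET], capture_output=True
    )
    return completed.returncode


def describe_exit(returncode):
    """On POSIX a negative return code means the child was killed by a signal."""
    if returncode < 0:
        return signal.Signals(-returncode).name
    return f"exit code {returncode}"


if __name__ == "__main__":
    print(describe_exit(run_crashing_child()))
```

With core dumps enabled, the child's crash should leave a core file behind that can be analyzed in the same way as one from a NumPy segfault.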
Running this code will lead to a segfault, but analyzing the core dump with PyStack provides extensive information, making it easier to pinpoint the problem.
If you use Pytest for your test suite, PyStack can be integrated as a plugin to automatically run when a test exceeds a specified timeout:
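A minimal sketch of a test that would trip the plugin, assuming `pytest-pystack` is installed and the suite is run with its `--pystack-threshold` option (in seconds), for example `pytest -s --pystack-threshold=5`. The names `slow_operation` and `PAUSE_SECONDS` are hypothetical.

```python
import time

# Raise this above the threshold (for example to 10) to trigger a stack dump.
PAUSE_SECONDS = 0.1


def slow_operation(seconds):
    """Stand-in for code that sometimes blocks far longer than expected."""
    time.sleep(seconds)
    return "done"


def test_slow_operation():
    # If this test runs longer than the configured threshold, the plugin
    # runs PyStack against the test process and reports the stack.
    assert slow_operation(PAUSE_SECONDS) == "done"
```

The threshold is best set well above the duration of your slowest healthy test, so that only genuinely stuck tests produce a dump.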
This allows you to inspect the process running your tests if they run for too long, providing valuable insights into test failures.
PyStack has a fairly specific set of uses and isn't necessary for debugging typical problems, although it can be incredibly helpful when problems like deadlocks or segfaults do occur. Few other tools handle these cases as well, and being able to determine what a program is doing while it is running, or what it was doing when it crashed, is tremendously useful. Additionally, you should try PyStack if you already use a profiler (such as py-spy or Austin) or other tools that investigate stack information, because they work well together.