Introducing PyStack: Your Ultimate Python Debugger

In this blog, we learn how to set up PyStack, debug running processes, analyze core dumps, and integrate it with Pytest.

Debugging can be a formidable challenge, especially when dealing with stubborn issues like deadlocks, segmentation faults, crashing applications, or hanging processes. But there's a new player in town - PyStack, a powerful debugger that promises to work its magic and help you navigate these complex problems. In this blog post, we're going to explore how PyStack can be your troubleshooting sidekick for these perplexing scenarios.

Why Do We Need PyStack?

You might be wondering, with a multitude of debugging tools, including interactive IDE debuggers at your disposal, why do you need PyStack? The answer lies in the nature of certain elusive bugs and issues that are incredibly challenging to resolve. Here's why PyStack comes to the rescue:
1. Deadlocks and Hanging Processes: When you encounter a hanging process, it's often hard to discern whether it's actively working or stuck in a deadlock. PyStack can provide insights into the state of these processes.
2. Hybrid Applications: Applications that blend Python with C/C++ components, like Python extension modules, or popular libraries such as NumPy or TensorFlow, can be tricky to debug. PyStack can help you tackle issues like NumPy crashes with segfaults.
3. Unique Circumstances: Some issues are peculiar, occurring under specific conditions like heavy load or after an application has been running for a certain duration. PyStack can help you investigate these niche problems.
While other tools like GDB exist, PyStack offers several advantages. It doesn't modify your code, it can inspect core dump files, and automatically fetches debugging information for your specific distribution, making it a valuable addition to your debugging toolkit.

Setting Up PyStack

Before diving into debugging, you'll need to prepare your environment:


# Install core dump support plus pip and venv
sudo apt update
sudo apt install systemd-coredump python3-pip python3.10-venv
# Temporarily allow ptrace attachment to running processes
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
# Create a virtual environment and install PyStack with its Pytest plugin
python3 -m venv ./venv
source ./venv/bin/activate
pip install pystack pytest-pystack

Notably, you'll need to enable core dump collection by installing `systemd-coredump` and temporarily allow attaching to running processes by setting the kernel's `ptrace_scope` to 0 with the `echo` command. Finally, you install PyStack itself along with its Pytest plugin.

Debugging with PyStack

PyStack provides two ways to debug a program: attaching to a running process or analyzing a core dump of a crashed process. Let's start with the former. Consider this code snippet:


# wait.py
from time import sleep

def wait():
    some_var = "data"
    other_var = [1, 2, 3]
    while True:
        sleep(5)
        print("Sleeping...")

wait()

You can run this code in the background and use PyStack to inspect it:


nohup python wait.py &
# [1] 44000
pystack remote 44000 --locals --no-block
# Traceback for thread 44000 (python) [] (most recent call last):
#     (Python) File "/home/.../wait.py", line 18, in <module>
#         wait()
#     (Python) File "/home/.../wait.py", line 15, in wait
#         sleep(5)
#       Locals:
#         other_var: [1, 2, 3]
#         some_var: "data"

PyStack will provide a traceback, highlighting where your program hangs and displaying local variables, offering crucial context.
Unfortunately, PyStack is limited to Linux, but it works seamlessly in Docker. For instance, to debug a deadlock in a Docker container:


# Build the Docker image
docker build -t python-pystack -f Dockerfile .

# Run the container detached, with the ptrace capability PyStack needs
docker run -d --cap-add=SYS_PTRACE --name python-pystack --rm python-pystack

# Enter the container
docker exec -it python-pystack /bin/bash
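
The deadlocking program itself isn't shown here, but a classic two-lock deadlock is enough to reproduce the scenario; a hypothetical `deadlock.py`, assumed to be what the image's Dockerfile runs, might look like this:


# deadlock.py (hypothetical) - two threads take the same locks in opposite order
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def worker_1():
    with lock_a:
        time.sleep(1)       # give worker_2 time to grab lock_b
        with lock_b:        # blocks forever: worker_2 holds lock_b
            pass

def worker_2():
    with lock_b:
        time.sleep(1)       # give worker_1 time to grab lock_a
        with lock_a:        # blocks forever: worker_1 holds lock_a
            pass

threads = [threading.Thread(target=worker_1), threading.Thread(target=worker_2)]
for t in threads:
    t.start()
for t in threads:
    t.join()                # never returns once both workers are stuck

Once inside the container, running `pystack remote` with `--locals` against the process (PID 1, assuming the script is the container's entrypoint) shows each thread blocked trying to acquire the lock the other one holds, which makes the deadlock obvious.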

Analyzing Core Files

Sometimes, you won't be able to debug a live process, and that's where analyzing core dump files becomes vital. Core dumps are snapshots of a process when it crashes, often indicated by messages like "Segmentation fault (core dumped)." You can inspect core dumps with PyStack:


# Force a crash in your code
# ...
# If systemd-coredump captured it, export the most recent core dump to ./core
coredumpctl dump -o ./core
pystack core ./core --locals

This allows you to investigate the state of the program when it crashed, displaying local variables crucial for understanding the issue.
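
If you just need something that crashes so you can experiment, a quick way is to read from a null pointer via `ctypes`:


# crash.py - deliberately read from address 0 to trigger a segmentation fault
import ctypes

ctypes.string_at(0)  # Segmentation fault (core dumped)

Running `python crash.py` prints the "Segmentation fault (core dumped)" message, and the resulting core file can then be fed to `pystack core` exactly as shown above.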

Dealing with Segmentation Faults in Libraries like NumPy

PyStack proves invaluable when debugging issues in libraries like NumPy or PyTorch that have C/C++ components. Consider this NumPy snippet, which backs an array with a block of shared memory, releases that memory, and then writes through the array, triggering a segmentation fault:


# pip install numpy
from multiprocessing import shared_memory
import numpy as np
shm = shared_memory.SharedMemory(create=True, size=80)    # 80-byte shared memory block
arr = np.ndarray((10,), dtype=np.int64, buffer=shm.buf)   # array backed by that block
shm.close()                                               # unmaps the memory...
shm.unlink()
arr[0] = 1    # ...but the array still points at it: Segmentation fault (core dumped)

Running this code leads to a segfault because the write lands in memory that has already been unmapped. Analyzing the resulting core dump with PyStack (adding `--native` also shows the C/C++ frames) provides extensive information, making it easier to pinpoint the problem.

PyStack and Pytest

If you use Pytest for your test suite, PyStack can be integrated as a plugin that automatically captures a stack trace whenever a test exceeds a specified timeout:


pytest -s --pystack-threshold=2 \
    --pystack-args='--locals' \
    --pystack-output-file='./pystack.log'

This lets you inspect the process running your tests whenever they take too long, providing valuable insight into tests that hang or deadlock.
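
For instance, a hypothetical test like the one below would trip the two-second threshold above and have its stack written to `pystack.log`:


# test_hang.py (hypothetical) - exceeds the --pystack-threshold of 2 seconds
import time

def test_waits_too_long():
    time.sleep(10)    # pytest-pystack dumps this test's stack once the threshold passes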

PyStack has a fairly specific set of uses and isn't necessary for everyday debugging, but it can be incredibly helpful when problems like deadlocks or segfaults do occur. Few other tools handle these cases as well, and being able to see what a program is doing while it runs, or what it was doing when it crashed, is tremendously useful. It also pairs nicely with profilers such as py-spy or Austin and other tools that inspect stack information, so if you already use those, PyStack is well worth a try.
