debugpy listen silently crashing #1749

Open
koenlek opened this issue Nov 28, 2024 · 1 comment
Labels: needs repro (Issue has not been reproduced yet)

koenlek commented Nov 28, 2024

Environment data

  • debugpy version: 1.8.8
  • OS and version: A k8s pod running an Ubuntu 20.04.6 based container
  • Python version (& distribution if applicable, e.g. Anaconda): 3.9
  • Using VS Code or Visual Studio: VS Code

Actual behavior

I'm using the Ray Distributed Debugger (their code here) with Ray on k8s. It calls debugpy.listen, but when I check the port it should be listening on, nothing is bound to it (sudo lsof -i :$LISTEN_PORT). I set DEBUGPY_LOG_DIR to get more detailed logs and noticed that debugpy.pydevd.NNNN.log contains the following near the end, which indicates that it did crash:

Traceback (most recent call last):
  File "/my_app/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py", line 422, in _on_run
    cmd.send(self.sock)
  File "/my_app/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_net_command.py", line 109, in send
    sock.sendall(as_bytes)
BrokenPipeError: [Errno 32] Broken pipe

I searched the issue trackers of debugpy, pydevd, and Ray, and did some googling, but unfortunately couldn't find much. The only hint I found is that this may point to a broken connection between the local services (running debugpy on the application side seems to involve a client, a server, a "debug server", and some incoming client). I found this snippet in debugpy.adapter.NNNN.log:

I+00000.071: Listening for incoming Client connections on 10.40.0.130:51507...

I+00000.071: Listening for incoming Server connections on 127.0.0.1:39415...

I+00000.071: Sending endpoints info to debug server at localhost:60997:
             {
                 "client": {
                     "host": "10.40.0.130",
                     "port": 51507
                 },
                 "server": {
                     "host": "127.0.0.1",
                     "port": 39415
                 }
             }

I+00000.076: Accepted incoming Server connection from 127.0.0.1:43864.
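
For scripted checks, the endpoints the adapter reports can be pulled out of debugpy.adapter.NNNN.log. The following is a minimal sketch, assuming the log lines keep the format shown in the snippet above. The name parse_adapter_endpoints is hypothetical and not part of debugpy, and the pattern only handles IPv4 addresses and hostnames.

```python
import re

# Matches lines such as:
#   I+00000.071: Listening for incoming Client connections on 10.40.0.130:51507...
# Assumption: the adapter log keeps this wording; it is not a documented format.
_LISTEN_RE = re.compile(
    r"Listening for incoming (Client|Server) connections on ([^\s:]+):(\d+)"
)


def parse_adapter_endpoints(log_text):
    """Return a dict mapping "client"/"server" to the (host, port) found in the log."""
    endpoints = {}
    for role, host, port in _LISTEN_RE.findall(log_text):
        endpoints[role.lower()] = (host, int(port))
    return endpoints
```

The "client" entry is the address an IDE would attach to, so it is the one to compare against what lsof reports.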

Lastly, I noticed this in debugpy.{adapter,server}.NNNN.log, but that seems to be OK, as I also saw it in healthy local runs:

I+00000.049: Error while enumerating installed packages.
             
Traceback (most recent call last):
  File "/my_app/debugpy/adapter/../../debugpy/common/log.py", line 362, in get_environment_description
    report("    {0}=={1}\n", pkg.name, pkg.version)
AttributeError: 'PathDistribution' object has no attribute 'name'

Stack where logged:
  File "/my_app/python3_x86_64/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/my_app/python3_x86_64/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/my_app/debugpy/adapter/__main__.py", line 227, in <module>
    main(_parse_argv(sys.argv))
  File "/my_app/debugpy/adapter/__main__.py", line 50, in main
    log.describe_environment("debugpy.adapter startup environment:")
  File "/my_app/debugpy/adapter/../../debugpy/common/log.py", line 372, in describe_environment
    info("{0}", get_environment_description(header))
  File "/my_app/debugpy/adapter/../../debugpy/common/log.py", line 364, in get_environment_description
    swallow_exception(
  File "/my_app/debugpy/adapter/../../debugpy/common/log.py", line 215, in swallow_exception
    _exception(format_string, *args, **kwargs)

The crash already happens before I try to connect to the debugger.

I was also able to reproduce this without the Ray Distributed Debugger. I connected to the k8s pod and created a small Python script:

import debugpy
debugpy.listen(5678)
print("before wait_for_client")
debugpy.wait_for_client()
print("after wait_for_client")
print("before breakpoint")
debugpy.breakpoint()
print("after breakpoint")

When I run it and check the log files, I see the same crash (BrokenPipeError: [Errno 32] Broken pipe) in the pydevd logs.
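
Checking the log files by hand can be scripted. This is a rough sketch, assuming the logs are written to the directory named by DEBUGPY_LOG_DIR and follow the debugpy.pydevd.NNNN.log naming seen above. The helper find_broken_pipe_logs is a hypothetical name, not a debugpy API.

```python
import glob
import os


def find_broken_pipe_logs(log_dir):
    """Return sorted paths of pydevd logs in log_dir that mention BrokenPipeError."""
    hits = []
    # glob.escape guards against special characters in the directory name.
    pattern = os.path.join(glob.escape(log_dir), "debugpy.pydevd.*.log")
    for path in sorted(glob.glob(pattern)):
        with open(path, "r", errors="replace") as log_file:
            if "BrokenPipeError" in log_file.read():
                hits.append(path)
    return hits
```

Calling it with os.environ["DEBUGPY_LOG_DIR"] after the script has started would list the affected log files, if any.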

When I run all of this locally, everything works fine. The issue only appears when running on Ray on k8s.

These are the full, lightly redacted, logs:

Questions:

  • Is there a way to detect a crashed listen from code? If so, how?
  • Any ideas on what makes this crash?
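
One possible workaround for the first question, offered as an untested sketch rather than a supported debugpy feature, is to probe the port from code after calling debugpy.listen. The helper is_port_listening is a hypothetical name and uses only the standard library. Note that the probe opens a real TCP connection, so debugpy may log it or treat it as a client attempt.

```python
import socket


def is_port_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0
```

debugpy.listen returns the (host, port) pair it is listening on, so that pair could be passed straight to the probe, giving the same answer as sudo lsof -i :$LISTEN_PORT but from inside the process.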

Expected behavior

debugpy.listen should accept incoming connections on its endpoint.

Steps to reproduce:

I'm afraid it will be hard to reproduce this in an environment other than our "ray on k8s" setup. But details are in the "Actual behavior" section.

rchiodo (Contributor) commented Dec 13, 2024

Not sure what a Ray cluster is, but the broken pipe sometimes happens in our test suite. I believe it usually has one of two causes:

  • Debugger processes shut down during terminate without waiting for the debuggee to finish.
  • The debugger tries to connect to another process and times out before that process starts.

This line in your adapter log makes me think it's the latter:

0.00s - PyDB.dispose_and_kill_all_pydevd_threads (called from: File "/my_app/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py", line 324, in _terminate_on_socket_close)

The connection the debuggee has to the adapter process is being killed.
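
To illustrate the mechanism only (this does not use debugpy or reproduce the issue itself), the sketch below writes to a socket whose peer has already closed, which is what sock.sendall runs into in the traceback. The function name send_after_peer_close is hypothetical. On Linux this typically surfaces as BrokenPipeError; other platforms may raise a different connection error.

```python
import socket


def send_after_peer_close():
    """Send on a socket whose peer has closed; return the exception class name, or None."""
    ours, theirs = socket.socketpair()
    try:
        theirs.close()  # simulate the other side of the connection going away
        try:
            ours.sendall(b"payload")
        except (BrokenPipeError, ConnectionResetError) as exc:
            return type(exc).__name__
        return None
    finally:
        ours.close()
```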

What's bazel_python? It looks like that's the Python being used.
