Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: termios.error: (22, 'Invalid argument') during become on Solaris/Illumos/SmartOS #1089

Merged
merged 4 commits into from
Sep 20, 2024

Conversation

gaige
Copy link
Contributor

@gaige gaige commented Jul 26, 2024

This fixes compatibility with Solaris/Illumos/SmartOS, addressing an issue that shows up most frequently with become. The issue was mostly due to differences in how the TTY driver is handled and the pty driver not supporting echo on both sides of the pipe (as designed, from a Solaris point of view).

Fixes #950

@oetiker
Copy link

oetiker commented Jul 26, 2024

hmmm I tried the patch and it did not suffice to make things work ... what I did find was that probably the

ERROR! [mux  1590247] 11:29:26.373979 E mitogen.[ssh.micd-omnios-adm]: ExternalContext.main() crashed
Traceback (most recent call last):
  File "<stdin>", line 4196, in main
  File "<stdin>", line 3933, in run
  File "<stdin>", line 679, in _profile_hook
  File "<stdin>", line 3916, in _dispatch_calls
  File "<stdin>", line 1237, in __iter__
  File "<stdin>", line 1223, in get
  File "<stdin>", line 2843, in get
  File "<stdin>", line 2807, in _make_cookie
struct.error: int too large to convert

I also tried a patch to make cfmakeraw happier:

def cfmakeraw(tflags):
    """
    Given a list returned by :py:func:`termios.tcgetattr`, return a list
    modified in a manner similar to the `cfmakeraw()` C library function, but
    additionally disabling local echo.
    """
    # BSD: github.com/freebsd/freebsd/blob/master/lib/libc/gen/termios.c#L162
    # Linux: github.com/lattera/glibc/blob/master/termios/cfmakeraw.c#L20
    iflag, oflag, cflag, lflag, ispeed, ospeed, cc = tflags
    if IS_SOLARIS:
        iflag &= ~flags('IGNBRK BRKINT PARMRK ISTRIP INLCR IGNCR ICRNL IXON')
        oflag &= ~flags('OPOST')
        lflag &= ~flags('ECHO ECHONL ICANON ISIG IEXTEN')
        cflag &= ~flags('CSIZE PARENB')
        cflag |= flags('CS8')
    else:
        iflag &= ~flags('IMAXBEL IXOFF INPCK BRKINT PARMRK '
                        'ISTRIP INLCR ICRNL IXON IGNPAR')
        iflag &= ~flags('IGNBRK BRKINT PARMRK')
        oflag &= ~flags('OPOST')
        lflag &= ~flags('ECHO ECHOE ECHOK ECHONL ICANON ISIG '
                    'IEXTEN NOFLSH TOSTOP PENDIN')
        cflag &= ~flags('CSIZE PARENB')
        cflag |= flags('CS8 CREAD')
    return [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]

but because of the cookie problem I did not see whether this helped any

@gaige
Copy link
Contributor Author

gaige commented Jul 26, 2024

@oetiker Interesting failure mode. What's the context of the test you were doing where make_cookie failed? I was running Ansible 2.14.5 on my test machine on very recent SmartOS going against a series of similar-version SmartOS hosts.

@gaige
Copy link
Contributor Author

gaige commented Jul 26, 2024

I reran my tests on Ansible 2.15.12 and was seeing the same success. Although maybe you are seeing a completely different problem that I was here.

@oetiker
Copy link

oetiker commented Jul 26, 2024

yes I am pretty sure my problem is a different one ... maybe related to a different range for process ids on solaris or some such ?

@gaige
Copy link
Contributor Author

gaige commented Jul 27, 2024

yes I am pretty sure my problem is a different one ... maybe related to a different range for process ids on solaris or some such ?

What Solaris or derivative are you running? We're running SmartOS and I'm not seeing crashes in _make_cookie, however since it's basically a pack call, differing sizes would definitely be suspicious.

@moreati
Copy link
Member

moreati commented Aug 12, 2024

Notes to myself,

  • OpenSolaris: initial Open Source release of Solaris by Sun, discontinued by Oracle
  • Solaris Express, Oracle Solaris: proprietary releases
  • Illumos: Foundation/umbrella continuing OpenIndiana (fka OpenSolaris), basis of multiple "Illumos distributions"
  • SmartOS: Illumos distribution that incorporates hypervisors & cloud platform
  • Helios: Illumos distribution by Oxide Computers (WIP)

Other IS_* constants in Mitogen

➜  mitogen git:(master) ✗ ag -s '(\b|\.)IS_[A-Z0-9_]+ +='
mitogen/parent.py
149:IS_LINUX = os.uname()[0] == 'Linux'

mitogen/core.py
170:IS_DEAD = 999
203:    IS_WSL = 'Microsoft' in fp.read()
206:    IS_WSL = False

Questions to ponder (answers are not required to merge this)

  1. Could an Illumos distro be added as a test target in CI (e.g. VM, container voodoo)? Externally hosted?
  2. Is it worth creating another top level IS_* constant? Is it worth keeping the existing ones?
  3. Can disable_echo() be unit tested? Should it?

mitogen/parent.py Outdated Show resolved Hide resolved
@moreati moreati changed the title fix: solaris fix Fix: termios.error: (22, 'Invalid argument') during become on Solaris/Illumos/SmartOS Sep 6, 2024
@gaige
Copy link
Contributor Author

gaige commented Sep 12, 2024

@moreati : not sure what is causing these CI failures, but they don't seem directly related to (or unique to) this patch.

@moreati
Copy link
Member

moreati commented Sep 12, 2024

not sure what is causing these CI failures, but they don't seem directly related to (or unique to) this patch.

They're due to #1058, almost certainly unrelated to your change. To re-run tests for a PR add a comment containing

/AzurePipelines run

Edit: I can also rerun failed jobs within CI run. I'll do so now.

@gaige
Copy link
Contributor Author

gaige commented Sep 14, 2024

@moreati Any other requests for this one now that it's passing checks?

@moreati
Copy link
Member

moreati commented Sep 14, 2024

@moreati Any other requests for this one now that it's passing checks?

Yes. Two suggested changes are still open (see previous comments), and I want to answer "what are the consequences of leaving echo enabled?" before merging.

@gaige
Copy link
Contributor Author

gaige commented Sep 14, 2024

@moreati Any other requests for this one now that it's passing checks?

Yes. Two suggested changes are still open (see previous comments), and I want to answer "what are the consequences of leaving echo enabled?" before merging.

The reason I asked is that I looked through the comments here and believe that I have answered the questions or have addressed them in code. If you can give me a reference to the outstanding suggested changes, I'll gladly address them. Sorry if I'm being dense here.

The question of "what are the consequences of leaving echo enabled?" isn't quite the question for Solaris. In the case of mitogen, you're disabling echo on both sides of the pty, which works in linux. For Solaris, the pty is streams-based and because of the way it works, the termio ioctls are supported by the subsidiary (slave) device on Solaris. Setting termios attributes, such as ECHO on Solaris (and other Unix98 systems of that vintage) is not allowed and will result in the invalid parameter error.

@moreati
Copy link
Member

moreati commented Sep 14, 2024

The reason I asked is that I looked through the comments here and believe that I have answered the questions or have addressed them in code. If you can give me a reference to the outstanding suggested changes

You should be able to see the suggested changes on this page, and accept them using the buttons below each. This is how they look to me

@gaige
Copy link
Contributor Author

gaige commented Sep 14, 2024

You should be able to see the suggested changes on this page, and accept them using the buttons below each. This is how they look to me

Oddly enough, I don't see that on my screen, and searching for those words on the page doesn't find them either. With that said, let me apply the patches.

Copy link
Member

@moreati moreati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still can't find primary documentation that clearly states Solaris-likes only accept the slave fd for tcgetattr() and tcsetattr(), but the exerimental evidence has me satisfied.

mitogen/parent.py Outdated Show resolved Hide resolved
mitogen/parent.py Show resolved Hide resolved
mitogen/parent.py Outdated Show resolved Hide resolved
mitogen/parent.py Outdated Show resolved Hide resolved
@moreati moreati merged commit 2ba1b2b into mitogen-hq:master Sep 20, 2024
44 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

become fails on solaris
3 participants