Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up Python client auth #836

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

danielballan
Copy link
Member

@danielballan danielballan commented Jan 3, 2025

The goal of this PR is to improve the user experience and simplify the implementation for authentication in the Python client. There are no changes to the server in this PR.The new docs page in this PR, docs/source/how-to/authentication.md, is a good place to start. I recommend reading that before reading the summary below.

Regressions fixed

The PR addresses a critical regressions, both released in v0.1.0b12

A trailing-comma-as-tuple bug in #823 which manifests as

>>> from_uri("https://tiled.nsls2.bnl.gov")
...
KeyError: 0

and a deeper bug in #816 that prevents login:

>>> from_uri("https://tiled.nsls2.bnl.gov")
ClientError: 401: Incorrect username or password https://tiled.nsls2.bnl.gov/api/v1/auth/provider/pam/token

Design changes

  • The high-level convenience wrappers no longer collect an optional username. The username kwarg in from_uri(..., username="...") and from_profile(..., username="...") is now completely ignored (with a warning to the user). It will later be removed.
    • The username kwarg never made sense for browser-based (device code grant) auth anyway. (In that case, the username goes to the browser, not Tiled.) It always was a bit of wart.
    • The username kwarg has been the source of a lot of confusion over the years... with unintuitive behavior related to username=None and lately username=UNSET. I take this as a sign that it should not have been merged into the from_uri and from_profile convenience functions.
    • A username prompt like Username [dallan]: can be a "password trap". Tiled is trying to collect your username, but you might reflexively type your password onto the screen in clear text. If we don't have a valid session ready to reuse, we should just ask the user to type their username with plain old Username:.
  • This PR enables users to opt out of stashing tokens with a new keyword argument: from_uri("https://...", remember_me=False). The tokens will be stored only in memory. This is equivalent of logging into GMail in an Incognito Browser.
  • For a given token directory---typically $XDG_CACHE_DIR/tiled/tokens/---Tiled only stashes tokens for one user at a time (per Tiled server, as defined by its netloc).
    • It seems like a mistake to have ever supported stashing more than one. Storing two distinct principals' tokens with the same file permissions invites leakage.
    • The use case I originally had in mind was two processes running in a shared account (e.g. an NSLS2 Operator Account) authenticated as differnet users. But this rather niche use case can be addressed in many other, simpler ways:
      • remember_me=False "incognito mode"
      • setting custom token cache locations using TILED_CACHE_DIR
      • using an API key
    • This removes the notion of "default identity" and simplifies the state on disk. Instead of deeply nested directories like ~/.cache/tiled/{netloc}/{idP}/{username} we now just have one directory per tiled server: ~/.cache/tiled/{netloc}.
    • There is no need for logout(clear_default=True). Now logout() is all you need.

Importantly, it is still straightforward to authenticate to multiple servers in one session because the tokens are still scoped by the server's netloc:

aps_client  = from_uri("https://tiled.aps.anl.gov")  # an aspirational URL
nsls2_client = from_uri("https://tield.nsls2.bnl.gov")

Finally, as mentioned in the docs, this provides a path for applications without access to an interactive terminal. The approach outlined there (which could be fleshed out in the future with a fuller example) will work for both password grant and device code grant modes, as well as for Tiled servers that offer multiple authentication providers. Authentication code has been factored out in to public functions password_grant and device_code_grant, suitable for reuse.

Closes #820
Closes #831
Closes #833

Checklist

  • Add a Changelog entry
  • Add the ticket number which this PR closes to the comment section

Copy link
Contributor

@padraic-shafer padraic-shafer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. I've added some questions/suggestions in the comments.

tiled/client/auth.py Outdated Show resolved Hide resolved
tiled/client/auth.py Outdated Show resolved Hide resolved
tiled/client/constructors.py Outdated Show resolved Hide resolved
tiled/client/constructors.py Outdated Show resolved Hide resolved
docs/source/how-to/authentication.md Outdated Show resolved Hide resolved
tiled/client/context.py Outdated Show resolved Hide resolved
tiled/client/context.py Show resolved Hide resolved
tiled/client/context.py Outdated Show resolved Hide resolved
provider = spec["provider"]
mode = spec["mode"]
if mode == "password":
# Prompt for username, password at terminal.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should external clients also prompt for the username every time, rather than store the name of the last user as a convenience?

In other words, since Tiled python client has made this change, is it a recommendation for all Tiled clients?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you have in mind when you say "external clients"?

Copy link
Contributor

@padraic-shafer padraic-shafer Jan 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of general GUI clients (Qt, JS, etc. which might or might not be using the Tiled python client) and also had not seen the nice summary in "Design changes" when I wrote that question...so I was guessing at the reasons for removing username from the function args.

I see now that the reasons were partly about confusion in the function call and partly about UX (whoops, I just typed my plaintext password into the username field).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose I was asking: Is there a security concern for caching the username as a convenience for the user?

It looks the answer is that it may be a behavioral concern (did the user realize that it's not the right place to enter their password) but not generally a "leakage" concern of Who was the last user before me?

My takeaway is that an app could store the username but that would be up to the app designer, and if so then it should be clear where the password gets entered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Fully on the same page, just reiterating:

The change in behavior in Tiled here is: we used to cache the username ("default identity") and now we only cache tokens. So, you either have a valid session of you don't, and if you don't Tiled itself will not retain any state related to some previously-used identity or session. That would be up to applications to manage as they wish, as you said.

tiled/client/context.py Outdated Show resolved Hide resolved
danielballan and others added 13 commits January 4, 2025 10:40
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
@danielballan
Copy link
Member Author

By the way, regarding the CI failures here, I haven't even bothered running pytest locally yet. I'm sure there is a lot to fix related to API changes. Just looking for feedback on the user interface and the approach. Then I'll tighten it up.

Copy link
Contributor

@maffettone maffettone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually this makes a lot of sense and clears some clutter that could become problematic.

  • I really like the parse_time_string changes. That could (for expediency and clarity) be its own atomic PR, I think.
  • The caching changes make a lot of sense for locking this down. The use case of "two processes running in a shared account" should be accomplished by on-behalf-of flow.
  • This will probably break some of the testing and applications designed around here (https://github.com/bluesky/tiled/pull/816/files), but rightfully so. And that is a recent enough change that it won't be a strong dependency of many. When we are at a confident state, I can update https://github.com/NSLS2/tiled-qt-tools to serve as an on site example of the interaction.

Custom applications should avoid using the convenience functions
`tiled.client.constructors.from_uri` and
`tiled.client.constructors.from_profile`. Instead, follow this pattern:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a good place to link:

def prompt_for_credentials(http_client, providers):
"""
Prompt for credentials or third-party login at an interactive terminal.
"""
spec = _choose_identity_provider(providers)
auth_endpoint = spec["links"]["auth_endpoint"]
provider = spec["provider"]
mode = spec["mode"]
if mode == "password":
# Prompt for username, password at terminal.
username = input("Username: ")
PASSWORD_ATTEMPTS = 3
for _attempt in range(PASSWORD_ATTEMPTS):
password = getpass.getpass()
try:
tokens = password_grant(
http_client, auth_endpoint, provider, username, password
)
except httpx.HTTPStatusError as err:
if err.response.status_code == httpx.codes.UNAUTHORIZED:
print(
"Username or password not recognized. Retry, or press Enter to cancel."
)
continue
else:
# Sucess! We have tokens.
break
else:
# All attempts failed.
raise RuntimeError("Password attempts failed.")
elif mode == "external":
# Display link and access code, and try to open web browser.
# Block while polling the server awaiting confirmation of authorization.
tokens = device_code_grant(http_client, auth_endpoint)
else:
raise ValueError(f"Server has unknown authentication mechanism {mode!r}")
confirmation_message = spec.get("confirmation_message")
if confirmation_message:
username = tokens["identity"]["id"]
print(confirmation_message.format(id=username))
return tokens

as a sensible default.

@jwlodek
Copy link
Member

jwlodek commented Jan 6, 2025

Looks like the correct approach to me as well. I think caching username is a nice convenience, but probably not really needed, and can be done at the application level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants