Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a header with manifest tag to pull and push #358

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

yinonavraham
Copy link

Signed-off-by: Yinon Avraham yinona@jfrog.com

Add an optional request header Docker-Manifest-Tag to all requests in the push and pull operations.
This request header, when added, shall include the tag value of the image being pushed / pulled.

For example, when pushing or pulling image foo:1.2, the following header will be added to all requests:

Docker-Manifest-Tag: 1.2

Registries can then use this header, if it exists, for various use cases. For example:

  • In the /v2/ (ping) request - to identify the specific requested tag and by that limit the authorization request (auth challenge), making the authorization grant specific to that tag
  • In the .../blobs/... requests - optimize locating blobs while they are being pushed or pulled. This is useful in registry implementations which keep a link between tags and their blobs

Signed-off-by: Yinon Avraham <yinona@jfrog.com>
@yinonavraham
Copy link
Author

Also opened a PR with the implementation of adding this header: moby/moby#44359

@sajayantony
Copy link
Member

In the /v2/ (ping) request - to identify the specific requested tag and by that limit the authorization request (auth challenge), making the authorization grant specific to that tag

If the goal is to achieve tag-based authorization, I'm not following how adding header would enable this. The actual token needs to have it in the scope https://www.rfc-editor.org/rfc/rfc6749#section-4.2.2

Currently the distribution spec also doesn't define authentication in any form and kind of leave it to implementations to define this -e.g - https://docs.docker.com/registry/spec/auth/oauth/#example-refreshing-an-access-token

@yinonavraham
Copy link
Author

Because the distribution spec leaves authentication and authorization out of scope (which IMHO is a good decision), and because this information is helpful in several use cases, the suggestion here is to add it as a new request header.

For tag based authorization, having the header with the relevant tag when pushing or pulling a manifest or blob by digest can be very helpful. It gives the registry a hint on the context which points it on which tag to check, which leads the registry to the actual manifest to see if the requested blob is referenced. The auth token only serves as a mean for carrying the granted permissions. The header is used to easily locate the tag's manifest. Without it, the registry needs to get all tagged manifests of the named image, filter only those referencing the blob, and check each one to find the first that the auth token grants permissions on.

@sudo-bmitch
Copy link
Contributor

Some stream of consciousness feedback:

  • the docker headers were kept for legacy reasons, any new headers we add should be OCI headers
  • auth specifications in OCI is being planned, I'll be submitting a new working group proposal for that, hopefully this week
  • on the /v2 API, you don't know which repository is being accessed if you only have a tag
  • registries don't require the ping of the /v2 API, docker does this since it builds the auth headers manually, but you can skip that and get the auth header on your request directly to the repo, possibly skipping a round trip if anonymous access is permitted
  • some images are pushed/pulled by digest
  • multi platform images require recursion to search all child manifests
  • multiple tags can be pushed for the same digest in the same build command
  • on push, the blobs are pushed before the manifest, so there's nothing to verify
  • the user can lie about which tag is being pushed for a blob, then use it in a differently tagged manifest

@yinonavraham
Copy link
Author

@sudo-bmitch thanks for the feedback.

  • the docker headers were kept for legacy reasons, any new headers we add should be OCI headers

Understood. The goal with the suggested header name was to be aligned with similar existing headers like Docker-Content-Digest. But any better suggestion is of course welcome.

  • on the /v2 API, you don't know which repository is being accessed if you only have a tag

Right, mentioning the /v2/ endpoint was slightly wrong and biased by the actual implementation. I will rephrase -
The added section in the spec doesn't mention a specific endpoint, but refers to all requests in the push / pull. In the existing implementation this includes the GET /token request issued by the client following the auth challenge in the /v2/ response. This followup request does include the repository in the scope. So the goal was to make it accompany that request as well. But nevertheless, this is only one part of the goal of adding this header.
Since there is going to be a work group on adding auth to the spec, I think it is good to consider including the tag in addition to the repository and actions.

  • registries don't require the ping of the /v2 API ...

This is one of the reasons the header is added to each request, not relying on the auth. Anonymous access can be viewed at the same as any other principal. A repository owner might want to open specific tags for anonymous access, but require auth on other tags. Having the header on each request, regardless of the auth, supports it.

  • some images are pushed/pulled by digest

True for pull, not sure about push. But anyway - pulling (or pushing) images by digest is a different story, not covered by the suggestion in this PR. I did start to think about it, but decided to not mix the two for now and keep this PR focused on tags.

  • multi platform images require recursion to search all child manifests

True, but knowing which manifest list to start with by its tag is a very big hint which can greatly reduce the search area. The common recursion level is 2 (manifest list -> manifests) and the number of leaf manifests is usually small (e.g. number of supported platforms).

  • multiple tags can be pushed for the same digest in the same build command

Good point. As much as I saw, the reference implementation (moby) takes the tags one-by-one and relies on the HEAD request for checking whether the blob already exists. But if there are other implementations that do it otherwise, I think it is OK to define that the header can have multiple values. But I'm not sure it is really required for now. If needed - it can probably be added later on.

  • on push, the blobs are pushed before the manifest, so there's nothing to verify

You can verify that the authenticated principal is authorized to push the specific tag and fail the upload of the blobs earlier in the process, instead of only when the manifest is uploaded.

  • the user can lie about which tag is being pushed for a blob, then use it in a differently tagged manifest

Registries must take such "lies" under consideration. The suggestion says this is not a hard requirement - clients SHOULD send it, registries MAY use it. This header acts as a hint, enabling the registry to improve some actions based on it.

@sudo-bmitch
Copy link
Contributor

@sudo-bmitch thanks for the feedback.

  • the docker headers were kept for legacy reasons, any new headers we add should be OCI headers

Understood. The goal with the suggested header name was to be aligned with similar existing headers like Docker-Content-Digest. But any better suggestion is of course welcome.

Here's some context from when the spec was being written: #26

Looking back at the suggested use cases:

  • In the /v2/ (ping) request - to identify the specific requested tag and by that limit the authorization request (auth challenge), making the authorization grant specific to that tag

Scoping access to tags instead of repos is certainly something that I've seen interest in. But I don't think depending on user hints is a good way to securely implement it. Maybe I'm missing something.

  • In the .../blobs/... requests - optimize locating blobs while they are being pushed or pulled. This is useful in registry implementations which keep a link between tags and their blobs

Because of how blobs are reused between manifests, and manifests can have multiple tags, this feels like the wrong place to create that relationship, especially since images sharing layers may be pushed long after the blobs they are sharing were first pushed. But I spend most of my time on the client side, so I want leave space for registry operators to weigh in.

For myself, I don't see a clear issue being solved by this that wouldn't be better solved another way. So it may help to work through what problems this solves, and why this is the best way to solve those problems.

@yinonavraham
Copy link
Author

@sudo-bmitch thanks for the link regarding the header naming. As I understand, the suggestion is to name this field: OCI-Manifest-Tag, right? Please correct me if I misunderstood. I saw PR #206 was merged, but the spec in main branch does not include it, so it is not very clear...

Let's put the discussion on /v2/ aside, since it is not the main topic here. It is not mentioned explicitly in the suggested paragraph.

The suggestion is not to rely on the manifest tag in the header for scoping the request, as the header by itself cannot be trusted. The goal of the header is a hint for the registry on the context of the request, i.e. which manifest to check and verify this blob belongs to it.

Today, when a registry receives a request for blob, and the request has authorization which is scoped to specific image repositories and tags (because it is explicit in the token, or implied by other means - specific to the auth provider), in order for the registry to check the request can access the requested blob, it needs to:

  1. Optionally try to extract the permitted tags from the request's authorization, either by parsing the token (which is not something I would expect the registry to do), or by querying the authorization service - which probably requires a dedicated API. Anyway, the result of this extraction can be a list of tags.
  2. Must iterate over all (or just the list from step 1) the repository image tags, read their manifests' config and find the first one that both - (a) the request is authorized on, and (b) references the requested blob.

Having the tag added to the context of the request using a header gives a hint to the registry where to look. If the header is missing - nothing changed, this is the same as before (backward compatible). But if the header exists, the registry can check only that specific tag, avoiding the need to scan all other tags.

If the client lies and hints with a wrong tag there are several options:

  1. the request is not authorized on the hinted tag - the registry rejects the request
  2. the request is authorized on the hinted tag, but the tag doesn't reference this blob - the registry rejects the request
  3. the request is authorized on the hinted tag, and the tag references this blob - the registry serves the request. And this is ok.

Except for tag based authorization, there are other viable use cases for adding this header. Another example is registries which collect statistics on downloads and uploads can use this quite simply in order to give more fine-grained reports such as how many bytes downloaded/uploaded per tag.

@jonjohnsonjr
Copy link
Contributor

X-ref: #328

@yinonavraham
Copy link
Author

@sajayantony , @sudo-bmitch ,
I saw this PR is listed in the agenda items of the OCI weekly discussion on Oct. 27th 2022, but I didn't find any notes.
How can we proceed and push this proposal forward?

Copy link
Contributor

@sudo-bmitch sudo-bmitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments assume we pass a list of references rather than a correlation ID. But which makes the most sense is best worked out in the linked issue.

spec.md Outdated Show resolved Hide resolved
spec.md Outdated Show resolved Hide resolved
spec.md Outdated Show resolved Hide resolved
spec.md Outdated Show resolved Hide resolved
spec.md Outdated Show resolved Hide resolved
@sudo-bmitch
Copy link
Contributor

I saw this PR is listed in the agenda items of the OCI weekly discussion on Oct. 27th 2022, but I didn't find any notes. How can we proceed and push this proposal forward?

It was a quick discussion to raise some visibility, which is when we linked to the existing issue. We should probably work through design discussions in the issue first, before making changes in the PR.

@yinonavraham
Copy link
Author

@sudo-bmitch I started to fix the PR based on the comments, but I have the following dilemma -
The suggestion is to use a header named OCI-Ref.
BUT- In the spec, the term reference usually refers to the tag or digest, while in #328 the suggestion is to put the full named image reference in the header value, i.e. : <registry>/<name><reference> (where <reference> is either :<tag> or @<digest>).

To resolve that and make the header name more explicit, I suggest to use: OCI-Named-Ref.
Although it still leaves it unclear whether the registry is part of the name or not, I think it is good enough and OCI-Full-Named-Ref seems an overkill...

- Rename header to OCI-Named-Ref
- Remove "hint"
- Merge to a single sentence, one for pull,
  one for push
- Change SHOULD to MAY in the push process

Signed-off-by: Yinon Avraham <yinona@jfrog.com>
@sudo-bmitch
Copy link
Contributor

@yinonavraham That hits on #216. We've been rather inconsistent and need to clean that up. I would lean towards reference being the full name (e.g. registry/repo:tag) and coming up with a new name for a tag or descriptor.

Signed-off-by: Yinon Avraham <yinona@jfrog.com>
@yinonavraham
Copy link
Author

@sudo-bmitch how can we push this forward?

@sudo-bmitch
Copy link
Contributor

We need to get some reviews from other mods, but I think most of the focus is on getting a 1.1 release out right now.

@sudo-bmitch sudo-bmitch added this to the v1.2.0 milestone May 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants