Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple cleanup context from obtaining context #313

Open
aplr opened this issue Oct 17, 2024 · 2 comments
Open

Decouple cleanup context from obtaining context #313

aplr opened this issue Oct 17, 2024 · 2 comments

Comments

@aplr
Copy link
Contributor

aplr commented Oct 17, 2024

What version of the package are you using?

v0.21.4

What are you trying to do?

Cleaning up after a failed obtain attempt

What steps did you take?

Configured Certmagic to issue certificates on-demand

What did you expect to happen, and what actually happened instead?

Obtaining an on-demand certificate runs into the configured timeout of 180s:

certmagic/handshake.go

Lines 524 to 526 in c783cbd

var cancel context.CancelFunc
ctx, cancel = context.WithTimeout(ctx, 180*time.Second)
defer cancel()

If that happens, I expect the cleanup routines to succeed.

https://github.com/mholt/acmez/blob/527e47cae3f84fa3a92d1d9b9c21c5eb0b44359a/client.go#L316-L323

https://github.com/mholt/acmez/blob/527e47cae3f84fa3a92d1d9b9c21c5eb0b44359a/client.go#L337

However, the cleanup routines are bound to the cancelled context, and won't run as the context was cancelled.

Show error logs
{
  "insertId": "19ucu8ye3zmtq",
  "jsonPayload": {
    "msg": "cleaning up solver",
    "challenge_type": "http-01",
    "level": "error",
    "ts": 1729163825.755899,
    "identifier": "redacted.com",
    "caller": "v2@v2.0.3/client.go:544",
    "time": "2024-10-17T11:17:05.75616761Z",
    "stacktrace": "github.com/mholt/acmez/v2.(*Client).pollAuthorization\n\t/go/pkg/mod/github.com/mholt/acmez/v2@v2.0.3/client.go:544\ngithub.com/mholt/acmez/v2.(*Client).solveChallenges\n\t/go/pkg/mod/github.com/mholt/acmez/v2@v2.0.3/client.go:378\ngithub.com/mholt/acmez/v2.(*Client).ObtainCertificate\n\t/go/pkg/mod/github.com/mholt/acmez/v2@v2.0.3/client.go:136\ngithub.com/caddyserver/certmagic.(*ACMEIssuer).doIssue\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/acmeissuer.go:477\ngithub.com/caddyserver/certmagic.(*ACMEIssuer).Issue\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/acmeissuer.go:371\ngithub.com/aplr/redacted/tlsx.(*conditionalIssuer).Issue\n\t/app/tlsx/issuer.go:111\ngithub.com/caddyserver/certmagic.(*Config).obtainCert.func2\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/config.go:626\ngithub.com/caddyserver/certmagic.doWithRetry\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/async.go:104\ngithub.com/caddyserver/certmagic.(*Config).obtainCert\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/config.go:700\ngithub.com/caddyserver/certmagic.(*Config).ObtainCertAsync\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/config.go:505\ngithub.com/caddyserver/certmagic.(*Config).obtainOnDemandCertificate\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/handshake.go:531\ngithub.com/caddyserver/certmagic.(*Config).getCertDuringHandshake\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/handshake.go:369\ngithub.com/caddyserver/certmagic.(*Config).GetCertificateWithContext\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/handshake.go:92\ngithub.com/caddyserver/certmagic.(*Config).GetCertificate\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/handshake.go:49\ncrypto/tls.(*Config).getCertificate\n\t/usr/local/go/src/crypto/tls/common.go:1196\ncrypto/tls.(*serverHandshakeStateTLS13).pickCertificate\n\t/usr/local/go/src/crypto/tls/handshake_server_tls13.go:475\ncrypto/tls.(*serverHandshakeStateTLS13).handshake\n\t/usr/local/go/src/crypto/tls/handshake_server_tls13.go:61\ncrypto/tls.(*Conn).serverHandshake\n\t/usr/local/go/src/crypto/tls/handshake_server.go:54\ncrypto/tls.(*Conn).handshakeContext\n\t/usr/local/go/src/crypto/tls/conn.go:1568\ncrypto/tls.(*Conn).HandshakeContext\n\t/usr/local/go/src/crypto/tls/conn.go:1508\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1971",
    "env": "production",
    "cos.googleapis.com/stream": "stderr",
    "error": "deleting object acme/acme.zerossl.com-v2-dv90/challenge_tokens/redacted.com.json: context deadline exceeded",
    "logger": "app.https.tls.certmagic.acme_issuer.acme_client",
  }
}
{
  "insertId": "19ucu8ye3zmts",
  "jsonPayload": {
    "env": "production",
    "stacktrace": "github.com/mholt/acmez/v2.(*Client).solveChallenges.func1\n\t/go/pkg/mod/github.com/mholt/acmez/v2@v2.0.3/client.go:340\ngithub.com/mholt/acmez/v2.(*Client).solveChallenges\n\t/go/pkg/mod/github.com/mholt/acmez/v2@v2.0.3/client.go:380\ngithub.com/mholt/acmez/v2.(*Client).ObtainCertificate\n\t/go/pkg/mod/github.com/mholt/acmez/v2@v2.0.3/client.go:136\ngithub.com/caddyserver/certmagic.(*ACMEIssuer).doIssue\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/acmeissuer.go:477\ngithub.com/caddyserver/certmagic.(*ACMEIssuer).Issue\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/acmeissuer.go:371\ngithub.com/aplr/redacted/tlsx.(*conditionalIssuer).Issue\n\t/app/tlsx/issuer.go:111\ngithub.com/caddyserver/certmagic.(*Config).obtainCert.func2\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/config.go:626\ngithub.com/caddyserver/certmagic.doWithRetry\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/async.go:104\ngithub.com/caddyserver/certmagic.(*Config).obtainCert\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/config.go:700\ngithub.com/caddyserver/certmagic.(*Config).ObtainCertAsync\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/config.go:505\ngithub.com/caddyserver/certmagic.(*Config).obtainOnDemandCertificate\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/handshake.go:531\ngithub.com/caddyserver/certmagic.(*Config).getCertDuringHandshake\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/handshake.go:369\ngithub.com/caddyserver/certmagic.(*Config).GetCertificateWithContext\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/handshake.go:92\ngithub.com/caddyserver/certmagic.(*Config).GetCertificate\n\t/go/pkg/mod/github.com/caddyserver/certmagic@v0.21.4/handshake.go:49\ncrypto/tls.(*Config).getCertificate\n\t/usr/local/go/src/crypto/tls/common.go:1196\ncrypto/tls.(*serverHandshakeStateTLS13).pickCertificate\n\t/usr/local/go/src/crypto/tls/handshake_server_tls13.go:475\ncrypto/tls.(*serverHandshakeStateTLS13).handshake\n\t/usr/local/go/src/crypto/tls/handshake_server_tls13.go:61\ncrypto/tls.(*Conn).serverHandshake\n\t/usr/local/go/src/crypto/tls/handshake_server.go:54\ncrypto/tls.(*Conn).handshakeContext\n\t/usr/local/go/src/crypto/tls/conn.go:1568\ncrypto/tls.(*Conn).HandshakeContext\n\t/usr/local/go/src/crypto/tls/conn.go:1508\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1971",
    "ts": 1729163825.757464,
    "identifier": "redacted.com",
    "caller": "v2@v2.0.3/client.go:340",
    "level": "error",
    "time": "2024-10-17T11:17:05.757641208Z",
    "cos.googleapis.com/stream": "stderr",
    "authz": "https://acme.zerossl.com/v2/DV90/authz/redacted",
    "logger": "app.https.tls.certmagic.acme_issuer.acme_client",
    "msg": "deactivating authorization",
    "error": "attempt 1: https://acme.zerossl.com/v2/DV90/authz/redacted: context deadline exceeded",
  },
}

How do you think this should be fixed?

Allow to provide a long-running context for cleanup routines, which is passed to acmez, if this makes sense.

Bonus: What do you use CertMagic for, and do you find it useful?

Creating on-demand certificates for parked domains

@mholt
Copy link
Member

mholt commented Oct 18, 2024

Ah, yeah, that's tricky. Cleanup probably needs to be given a separate, uncanceled context, but we have to really trust that it won't take long to run. Maybe we could pass in a context with a new cancel (like a new timeout, but long enough).

@aplr
Copy link
Contributor Author

aplr commented Oct 24, 2024

My idea was to allow passing a custom context to acmez.Client, like a context for background tasks:

https://github.com/mholt/acmez/blob/527e47cae3f84fa3a92d1d9b9c21c5eb0b44359a/client.go#L52-L57

If no custom context is provided, it will just fall back to the current behaviour.

This would be pulled through to certmagic as well, where a custom context can be passed to the ACMEIssuer:

certmagic/acmeissuer.go

Lines 43 to 157 in 65cf36c

type ACMEIssuer struct {
// The endpoint of the directory for the ACME
// CA we are to use
CA string
// TestCA is the endpoint of the directory for
// an ACME CA to use to test domain validation,
// but any certs obtained from this CA are
// discarded
TestCA string
// The email address to use when creating or
// selecting an existing ACME server account
Email string
// The PEM-encoded private key of the ACME
// account to use; only needed if the account
// is already created on the server and
// can be looked up with the ACME protocol
AccountKeyPEM string
// Set to true if agreed to the CA's
// subscriber agreement
Agreed bool
// An optional external account to associate
// with this ACME account
ExternalAccount *acme.EAB
// Optionally specify the validity period of
// the certificate(s) here as offsets from the
// approximate time of certificate issuance,
// but note that not all CAs support this
// (EXPERIMENTAL: Subject to change)
NotBefore, NotAfter time.Duration
// Disable all HTTP challenges
DisableHTTPChallenge bool
// Disable all TLS-ALPN challenges
DisableTLSALPNChallenge bool
// The host (ONLY the host, not port) to listen
// on if necessary to start a listener to solve
// an ACME challenge
ListenHost string
// The alternate port to use for the ACME HTTP
// challenge; if non-empty, this port will be
// used instead of HTTPChallengePort to spin up
// a listener for the HTTP challenge
AltHTTPPort int
// The alternate port to use for the ACME
// TLS-ALPN challenge; the system must forward
// TLSALPNChallengePort to this port for
// challenge to succeed
AltTLSALPNPort int
// The solver for the dns-01 challenge;
// usually this is a DNS01Solver value
// from this package
DNS01Solver acmez.Solver
// TrustedRoots specifies a pool of root CA
// certificates to trust when communicating
// over a network to a peer.
TrustedRoots *x509.CertPool
// The maximum amount of time to allow for
// obtaining a certificate. If empty, the
// default from the underlying ACME lib is
// used. If set, it must not be too low so
// as to cancel challenges too early.
CertObtainTimeout time.Duration
// Address of custom DNS resolver to be used
// when communicating with ACME server
Resolver string
// Callback function that is called before a
// new ACME account is registered with the CA;
// it allows for last-second config changes
// of the ACMEIssuer and the Account.
// (TODO: this feature is still EXPERIMENTAL and subject to change)
NewAccountFunc func(context.Context, *ACMEIssuer, acme.Account) (acme.Account, error)
// Preferences for selecting alternate
// certificate chains
PreferredChains ChainPreference
// Set a logger to configure logging; a default
// logger must always be set; if no logging is
// desired, set this to zap.NewNop().
Logger *zap.Logger
// Set a http proxy to use when issuing a certificate.
// Default is http.ProxyFromEnvironment
HTTPProxy func(*http.Request) (*url.URL, error)
config *Config
httpClient *http.Client
// Some fields are changed on-the-fly during
// certificate management. For example, the
// email might be implicitly discovered if not
// explicitly configured, and agreement might
// happen during the flow. Changing the exported
// fields field is racey (issue #195) so we
// control unexported fields that we can
// synchronize properly.
email string
agreed bool
mu *sync.Mutex // protects the above grouped fields, as well as entire struct during NewAccountFunc calls
}

Then, when creating the acmez.Client, the context can just be set:

certmagic/acmeclient.go

Lines 258 to 282 in 65cf36c

func (iss *ACMEIssuer) newBasicACMEClient() (*acmez.Client, error) {
caURL := iss.CA
if caURL == "" {
caURL = DefaultACME.CA
}
// ensure endpoint is secure (assume HTTPS if scheme is missing)
if !strings.Contains(caURL, "://") {
caURL = "https://" + caURL
}
u, err := url.Parse(caURL)
if err != nil {
return nil, err
}
if u.Scheme != "https" && !SubjectIsInternal(u.Host) {
return nil, fmt.Errorf("%s: insecure CA URL (HTTPS required for non-internal CA)", caURL)
}
return &acmez.Client{
Client: &acme.Client{
Directory: caURL,
UserAgent: buildUAString(),
HTTPClient: iss.httpClient,
Logger: iss.Logger.Named("acme_client"),
},
}, nil
}

I think then it is just about documenting this feature so users understand the consequence, and that one has to cancel this context on e.g. application shutdown.

If you don't want to have the API changed, however, I think a new context with a sane (possibly configurable) timeout is good as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants