Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-17541: LBSolrClient implementations should agree on 'getClient()' semantics #2899

Merged
merged 35 commits into from
Dec 23, 2024

Conversation

jdyer1
Copy link
Contributor

@jdyer1 jdyer1 commented Dec 9, 2024

With this PR makes LBHttp2SolrClient maintains an instance of HttpSolrClientBuilderBase per "Base Url". This makes the semantics of LBHttp2SolrClient#getClient consistent with that of the older LBHttpSolrClient.

Behavior changes:

  • LBHttp2SolrClient generates a Http Solr Client per base url
    • (1) at constructon
    • (2) whenever a previously-unseen base url is encountered
    • NOTE: The Map holding the clients may grow unbounded; there is no cleanup short of callng close().
  • LBHttp2SolrClient mutates the baseSolrUrl variable of the caller-supplied instance of HttpSolrClientBuilderBase whenever it creates a new Http Solr Client.
  • Both LBHttp2SolrClient and CloudHttp2SolrClient always own the internal/delegate clients. They are now always closed by us on close().

The following class definitions are changed in the SolrJ public API:

  • LBHttp2SolrClient
    • from: LBHttp2SolrClient<C extends HttpSolrClientBase> extends LBSolrClient
    • to: LBHttp2SolrClient<B extends HttpSolrClientBuilderBase<?, ?>> extends LBSolrClient
  • LBHttp2SolrClient.Builder
    • from: Builder<C extends HttpSolrClientBase>
    • to: Builder<B extends HttpSolrClientBuilderBase<?, ?>>
  • All methods on LBHttp2SolrClient.Builder return a Builder<B> instead of a Builder<C>

The following are removed from the SolrJ public API:

  • method CloudHttp2SolrClient.Builder#withHttpClient
  • constructor LBHttp2SolrClient.Builder(C solrClient, Endpoint... endpoints)
    • replaced with: public Builder(B solrClientBuilder, Endpoint... endpoints)
  • method LBHttp2SolrClient.Builder#withListenerFactory
    • renamed to LBHttp2SolrClient.Builder#withListenerFactories (replace any existing)
    • add LBHttp2SolrClient.Builder#addListenerFactory (add to existing)
  • method HttpJdkSolrClient#requestWithBaseUrl (main only, never released)

The following are marked @lucene.experimental

  • method Http2SolrClient#requestWithBaseUrl (main only, never released)
  • interface SolrClientFunction (main only, never released)

Comment on lines 128 to 136
var client = urlToClient.get(endpoint.toString());
if (client == null) {
String tmpBaseSolrUrl = solrClientBuilder.baseSolrUrl;
solrClientBuilder.baseSolrUrl = endpoint.getBaseUrl();
client = solrClientBuilder.build();
urlToClient.put(endpoint.getBaseUrl(), client);
solrClientBuilder.baseSolrUrl = tmpBaseSolrUrl;
}
return client;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer calling ConcurrentHashMap.computeIfAbsent or similar to get-then-put because of it's nice atomicity properties, and avoids synchronization needs

}
}

private synchronized HttpSolrClientBase buildClient(Endpoint endpoint) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

synchronized is guarding what here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The synchronized keyword was to prevent multiple threads from adding clients for the same Base Url. However, in taking your advice to use computeIfAbsent this removes the need.

if (builder.solrClientBuilder.urlParamNames == null) {
this.urlParamNames = Collections.emptySet();
} else {
this.urlParamNames = Collections.unmodifiableSet(builder.solrClientBuilder.urlParamNames);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably fine but I'd prefer Set.copyOf here

Comment on lines 165 to 167
for (HttpSolrClientBase client : urlToClient.values()) {
IOUtils.closeQuietly(client);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be a one-liner with urlToClient.values().forEach(IOUtils::closeQuiety)

@jdyer1
Copy link
Contributor Author

jdyer1 commented Dec 10, 2024

As a blocker for this PR, I am getting many seemingly-unrelated solr-core test failures. For instance, PostToolTest#testBasicRun.

Maybe this is similar to the problem I describe in #2242, in that by sharing resources between client and server in a unit test, we get automatic PKI authentication, on which the tests depend. If this speculation is true, this would be merely a test bug.

Otherwise, perhaps solr-core needs to be able to share instances of Http2SolrClient across its internals? The major change here being callers can only pass a Builder to LbHttp2SolrClient and CloudHttp2Solrclient and not a pre-built delegate client.

In either case I do not know how to continue to make progress so any advice is appreciated.

@dsmiley
Copy link
Contributor

dsmiley commented Dec 12, 2024

The problem is that org.apache.solr.handler.component.HttpShardHandlerFactory#setSecurityBuilder is being called but it only affects the built client, not clients built separately (in LB). My suspicion is that your conversion of passing the Builder as a template instead of an existing client should switch back to the client as a template and then remember to call org.apache.solr.client.solrj.impl.Http2SolrClient.Builder#withHttpClient from the template.

@jdyer1
Copy link
Contributor Author

jdyer1 commented Dec 12, 2024

your conversion of passing the Builder as a template instead of an existing client should switch back to the client as a template

I am not sure this ticket/PR has much value if we continue having the caller pass in a pre-built delegate client instead of a Builder. LBHttp2solrClient needs to create a client per base url, but the caller only supplies one. Yet LBHttp2SolrClientmight not know how to correctly clone that client, keeping in mind LBHttp2solrClient is now generic. It seems to me that passing in the pre-built client is more of a relic of the days before we used Bulders in solrJ, and not a really great API decision. I was hoping to improve that here.

@dsmiley
Copy link
Contributor

dsmiley commented Dec 12, 2024

Okay with me to stay with the builder. Probably add a new method to org.apache.solr.security.HttpClientBuilderPlugin for the builder, overloading setup.

As an aside, I'm totally confused about our authentication plugins, especially PKIAuthenticationPlugin as it apparently has a pivotal role, even when there's know "PKI" going on! CC @janhoy I suspect you might know

@gerlowskija
Copy link
Contributor

This LGTM, pending figuring out the test-failure issue and adding a CHANGES.txt entry that highlights the deprecations/removals.

The following were not removed, but perhaps should be considered to be marked "internal" or "Experimental"

+1 - Both seems like good candidates for a "lucene.experimental" annotation. (Presumably there should be a "solr.experimental", but I don't think that exists...)

@jdyer1
Copy link
Contributor Author

jdyer1 commented Dec 16, 2024

I am curious to know opinions as to how to handle deprecations here. This PR introduces several API changes (see description) but most of the replacement APIs will never exist in any 9.x version. We can add a @deprecated annotation to CloudHttp2SolrClient.Builder#withHttpClient, with a suggestion to use #withInternalClientBuilder instead. Beyond that, is there anything else we can/should do in a theoretical Solr 9.9 to warn users of the upcoming changes?

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this would be Solr 10?

@@ -872,15 +868,14 @@ public SolrCloudManager getSolrCloudManager() {
if (cloudManager != null) {
return cloudManager;
}
http2SolrClient =
var httpClientBuilder =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, should have a name like "solrClientBuilder" (maybe adding "http") to clarify this is building a SolrClient not an HttpClient.

private final InstrumentedHttpListenerFactory trackHttpSolrMetrics;

HttpSolrClientProvider(UpdateShardHandlerConfig cfg, SolrMetricsContext parentContext) {
trackHttpSolrMetrics = new InstrumentedHttpListenerFactory(getNameStrategy(cfg));
initializeMetrics(parentContext);

Http2SolrClient.Builder httpClientBuilder =
new Http2SolrClient.Builder().withListenerFactory(List.of(trackHttpSolrMetrics));
this.httpClientBuilder = new Http2SolrClient.Builder().addListenerFactory(trackHttpSolrMetrics);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name httpSolrClientBuilder or similar

@@ -34,4 +34,8 @@ public interface HttpClientBuilderPlugin {
public SolrHttpClientBuilder getHttpClientBuilder(SolrHttpClientBuilder builder);

public default void setup(Http2SolrClient client) {}

public default void setup(Http2SolrClient.Builder httpClientBuilder, Http2SolrClient client) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe these should be split up; setup a client and builder separately (2 methods)? Feels weird to take both in setup().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree separate methods would be nicer. But the implementing classes do things on setup that should only be done once. I settled on this ugly API with hopes this is the least-trappy thing to leave for future developers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point me at such so I can appreciate the problem better?

@@ -54,8 +53,8 @@ public class CloudHttp2SolrClient extends CloudSolrClient {
*/
protected CloudHttp2SolrClient(Builder builder) {
super(builder.shardLeadersOnly, builder.parallelUpdates, builder.directUpdatesToLeadersOnly);
this.clientIsInternal = builder.httpClient == null;
this.myClient = createOrGetHttpClientFromBuilder(builder);
var httpClientBuilder = createOrGetHttpClientBuilder(builder);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

httpSolrClientBuilder

* @param httpClient http client
* @return this
*/
public Builder withHttpClient(Http2SolrClient httpClient) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure we want to take this away from users

Copy link
Contributor Author

@jdyer1 jdyer1 Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the most controversial part of this PR, and I do not want to merge a potentially disruptive API change like this without buy-in. I do not think this is much of a burden though, because the Builder has withInternalClientBuilder, which users should be able to easily migrate to use.

The purpose of forcing this is that CloudHttp2SolrClient in turn passes this Client(Builder) to LBHttp2SolrClient. But as noted before, LBHttp2SolrClient doesn't really use this Client either; it clones it using Http2SolrClient.builder#withHttpClient. Yet we've made LBHttp2SolrClient generic so this means that any subclass of HttpSolrClientBase needs to also have a way to be cloned. Personally I think clone methods like this are brittle and a maintenance burden, and I don't want to need to write one for HttpJdkSolrClient. It also seems like a better API: by telling the user we want a Builder, we signal that we will create Clients as needed. Passing in an already-built Client signals that we will use that particular one, which is not always the case here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it clones it using Http2SolrClient.builder#withHttpClient

which still exists, so if a user somehow has a client but not a builder but needs one due to the API change you introduce, they have this option.

by telling the user we want a Builder, we signal that we will create Clients as needed

Well written James!

+1

try (MockHttpSolrClient client = new MockHttpSolrClient("http://base.url", b);
LBHttp2SolrClient<MockHttpSolrClient> testClient =
new LBHttp2SolrClient.Builder<>(client, ep1, ep2).build()) {
var httpClientBuilder =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

httpSolrClientBuilder

try (MockHttpSolrClient client = new MockHttpSolrClient("http://base.url", b);
LBHttp2SolrClient<MockHttpSolrClient> testClient =
new LBHttp2SolrClient.Builder<>(client, ep1, ep2).build()) {
var httpClientBuilder =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

httpSolrClientBuilder

try (MockHttpSolrClient client = new MockHttpSolrClient("http://base.url", b);
LBHttp2SolrClient<MockHttpSolrClient> testClient =
new LBHttp2SolrClient.Builder<>(client, ep1, ep2).build()) {
var httpClientBuilder =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

httpSolrClientBuilder

public Http2SolrClient.Builder withListenerFactory(List<HttpListenerFactory> listenerFactory) {
this.listenerFactory = listenerFactory;
/**
* specify a listener factory, which will be appened to any existing values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* specify a listener factory, which will be appened to any existing values.
* Specify a listener factory, which will be appended to any existing values.

@@ -94,35 +97,63 @@
*
* @since solr 8.0
*/
public class LBHttp2SolrClient<C extends HttpSolrClientBase> extends LBSolrClient {
public class LBHttp2SolrClient<B extends HttpSolrClientBuilderBase<?, ?>> extends LBSolrClient {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an aside -- I wonder if the use of generics here is more annoying than helpful. We aren't compelled to use generics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry too I have gone down the wrong road with this. On the other hand, I do not want consider this with this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in another PR then.

@dsmiley
Copy link
Contributor

dsmiley commented Dec 17, 2024

RE deprecations: A good call out but I don't think we should worry much about deprecations if we can't provide the alternative. At least not for LBSolrClient which is rather internal, I think.

@jdyer1
Copy link
Contributor Author

jdyer1 commented Dec 17, 2024

So this would be Solr 10?

Yes. When I read the description on this ticket, it seemed to me a good opportunity to improve our API with the upcoming major release. It seems wrong we clone the passed-in client. This is especially important if we want to eventually have merely one LB Client, one Cloud Client and one Cluster State Provider that can take any subclass of HttpSolrClientBase.

setup(client, null);
}

private void setup(Http2SolrClient client, Http2SolrClient.Builder builder) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why have this separate and with different parameter order? In other words, lets make this one the @Override with builder then client. Remove the 4 lines earlier on 377.

@@ -34,4 +34,8 @@ public interface HttpClientBuilderPlugin {
public SolrHttpClientBuilder getHttpClientBuilder(SolrHttpClientBuilder builder);

public default void setup(Http2SolrClient client) {}

public default void setup(Http2SolrClient.Builder httpClientBuilder, Http2SolrClient client) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point me at such so I can appreciate the problem better?

@jdyer1
Copy link
Contributor Author

jdyer1 commented Dec 18, 2024

Can you point me at such so I can appreciate the problem better?

My apologies, I think I misstated what is going on earlier. PKIAuthenticationPlugin on setup injects a RequestResponseListener into the passed-in Client. I wanted to ensure the same Listener gets injected both in the Builder and the Client, in case it matters.

@dsmiley
Copy link
Contributor

dsmiley commented Dec 18, 2024

Since the direction is SolrClient immutability, I think this auth/PKI setup method should only the builder (mutable). Is this realistic?

@jdyer1
Copy link
Contributor Author

jdyer1 commented Dec 18, 2024

Since the direction is SolrClient immutability, I think this auth/PKI setup method should only the builder (mutable). Is this realistic?

I fully agree this is the right direction to take. Unfortunately this probably requires a minor redesign to both UpdateShardHandler and HttpSolrClientProvider, which is out-of-scope here.

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay then let's leave a TODO comment to communicate this direction that's out of scope. Maybe even create a JIRA issue and refer to it in the TODO. it'd be good if someone addressed that for Solr 10.

Just some nitpicks on comments / naming here but I'm +1 to merge otherwise.

}

private void setup(Http2SolrClient client, Http2SolrClient.Builder builder) {
@Override
public void setup(Http2SolrClient.Builder httpClientBuilder, Http2SolrClient client) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest naming this param "builder" as it was, or "httpSolrClientBuilder". Note "client" is simply "client".

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHANGES.txt will be interesting here; a number of things happened.

@jdyer1
Copy link
Contributor Author

jdyer1 commented Dec 19, 2024

I added a CHANGES.txt entry to this PR and plan to merge this early next week. I also will separately add a deprecation on CloudHttp2SolrClient#withHttpClient in branch_9x, to be released with Solr 9.9.

@jdyer1 jdyer1 merged commit d3e57aa into apache:main Dec 23, 2024
4 checks passed
@jdyer1 jdyer1 deleted the feature/SOLR-17541 branch December 23, 2024 15:04
asfgit pushed a commit that referenced this pull request Dec 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants