Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: add embed endpoint #151

Closed
wants to merge 3 commits into from
Closed

Draft: add embed endpoint #151

wants to merge 3 commits into from

Conversation

rohanshah18
Copy link
Contributor

Problem

Describe the purpose of this change. What problem is being solved and why?

Solution

Describe the approach you took. Link to any relevant bugs, issues, docs, or other resources.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Infrastructure change (CI configs, etc)
  • Non-code change (docs, etc)
  • None of the above: (explain here)

Test Plan

Describe specific steps for validating this change.

Comment on lines 20 to 28
public EmbeddingsList embed(String model, String truncate, String inputType, List<String> inputs) throws ApiException {
EmbedRequestParameters embedRequestParameters = new EmbedRequestParameters().inputType(inputType);
if(truncate != null && !truncate.isEmpty())
embedRequestParameters.truncate(truncate);
List<EmbedRequestInputsInner> EmbedRequestInputsInnerList = convertInputStringToEmbedRequestInputsInner(inputs);
EmbedRequest embedRequest = new EmbedRequest()
.model(model)
.parameters(embedRequestParameters)
.inputs(EmbedRequestInputsInnerList);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other models might support different so would be better as a map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please elaborate on this more? Are you suggesting List<String> inputs should be a map instead? If so, how does the key value pair be assigned to the inputs since inputs is of type List<EmbedRequestInputsInner> where EmbedRequestInputsInner has only one member which is text of type String.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean for the params like truncate and input_type -- those apply to the multilingual-e5-large model, but they might not apply to the next embedding model we may add.

You can use the Python impl as a reference, where it takes params as basically an optional map (I know, Java optionals are not a thing)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, thank you!

Comment on lines 874 to 876
public Inference getInference() {
return new Inference();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How heavy is it to create this object, or should we cache Inference in Pinecone?

e.g. if someone effectively did this every time:

pc.getInference().embed(...);
pc.getInference().embed(...);
pc.getInference().embed(...);
pc.getInference().embed(...);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see your point! Do you feel like java users are traditionally used to getting the Inference object and reusing it rather than instantiating every single time to call the embed()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a java convention for this, so just thinking about the ergonomics of how we deliver the API to them, and if there's something here that's heavy to construct/connect/etc. then we should try to lazy construct that and reuse across requests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe calling it getInferenceClient() would resolve the ambiguity and encourage use like:

client = pc.getInferenceClient();
client.embed(...);
client.embed(...);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I like this approach. If we hear about performance issues in future of having a lot of unused objects (which should be usually cleaned by the garbage collector), then we can think of caching the object and reusing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not worried about the extra objects if they're lightweight to construct. So I'm asking how expensive are they to construct?

How heavy is it to create this object, or should we cache Inference in Pinecone?

@@ -871,6 +871,10 @@ public AsyncIndex getAsyncIndexConnection(String indexName) throws PineconeValid
return new AsyncIndex(connection, indexName);
}

public Inference getInference() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a little odd to me as a getter. Wondering if just pc.inference().embed(...) is better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think [pc.inference().embed(...) definitely feels more like python usage and it aligns with rest of our SDKs but given its java, do you feel like users will directly call pc.inference().embed()? And pc.inference() will still return the Inference object regardless, so the naming does make sense if not compared with other SDKs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also for dataplane opeartions, we have getIndexConnection() and getAsyncIndexConnection().

Comment on lines 33 to 39
private List<EmbedRequestInputsInner> convertInputStringToEmbedRequestInputsInner(List<String> inputs) {
List<EmbedRequestInputsInner> embedRequestInputsInnerList = new ArrayList<EmbedRequestInputsInner>();
for(String input:inputs) {
embedRequestInputsInnerList.add(new EmbedRequestInputsInner().text(input));
}
return embedRequestInputsInnerList;
}
Copy link
Contributor

@ssmith-pc ssmith-pc Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be better using stream API:

inputs.stream()
        .map(input -> new EmbedRequestInputsInner().text(input))
        .collect(Collectors.toList());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think using stream is better here? Wouldn't it create a new object for intermediate step?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have to look more closely at the underlying impl, but I think using the stream impl gives java flexibility to optimize how/when the underlying list actually gets created, and it may never have to allocate a new list at all depending on the usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There a couple of points we are looking at:

  1. .streams will first and foremost create a stream and then the processing happens using the map function and finally it creates and assigns to the new list while the for-loop approach creates the list and adds the elements to it directly, so there is no intermediate step of creating and allocating extra memory.

I think using the stream impl gives java flexibility to optimize how/when the underlying list actually gets created, and it may never have to allocate a new list at all depending on the usage.

The only relevant optimization I can think of is lazy evaluation but in this case with collect(), the list will still be created as a last step right, since the inputs() accepts List as an argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is your call, and don't think it'll make much difference in how fast things run on such small datasets. I just find the stream api more pleasant to use, and gives java an opportunity to optimize what's happening under the hood more so than if you explicitly instantiate a list. But this is not a showstopper by any means

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the dataset is going to be small so creating intermediate objects is not going to affect the performance and yes, streams are easier to read. Just wanted to make sure I didn't skip any optimization step but I can change this to streams for readability purpose.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants