-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: add embed endpoint #151
Conversation
public EmbeddingsList embed(String model, String truncate, String inputType, List<String> inputs) throws ApiException { | ||
EmbedRequestParameters embedRequestParameters = new EmbedRequestParameters().inputType(inputType); | ||
if(truncate != null && !truncate.isEmpty()) | ||
embedRequestParameters.truncate(truncate); | ||
List<EmbedRequestInputsInner> EmbedRequestInputsInnerList = convertInputStringToEmbedRequestInputsInner(inputs); | ||
EmbedRequest embedRequest = new EmbedRequest() | ||
.model(model) | ||
.parameters(embedRequestParameters) | ||
.inputs(EmbedRequestInputsInnerList); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other models might support different so would be better as a map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please elaborate on this more? Are you suggesting List<String> inputs
should be a map
instead? If so, how does the key value pair be assigned to the inputs
since inputs
is of type List<EmbedRequestInputsInner>
where EmbedRequestInputsInner
has only one member which is text
of type String.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean for the params like truncate and input_type -- those apply to the multilingual-e5-large model, but they might not apply to the next embedding model we may add.
You can use the Python impl as a reference, where it takes params as basically an optional map (I know, Java optionals are not a thing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, thank you!
public Inference getInference() { | ||
return new Inference(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How heavy is it to create this object, or should we cache Inference in Pinecone?
e.g. if someone effectively did this every time:
pc.getInference().embed(...);
pc.getInference().embed(...);
pc.getInference().embed(...);
pc.getInference().embed(...);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I see your point! Do you feel like java users are traditionally used to getting the Inference object and reusing it rather than instantiating every single time to call the embed()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's a java convention for this, so just thinking about the ergonomics of how we deliver the API to them, and if there's something here that's heavy to construct/connect/etc. then we should try to lazy construct that and reuse across requests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe calling it getInferenceClient()
would resolve the ambiguity and encourage use like:
client = pc.getInferenceClient();
client.embed(...);
client.embed(...);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I like this approach. If we hear about performance issues in future of having a lot of unused objects (which should be usually cleaned by the garbage collector), then we can think of caching the object and reusing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not worried about the extra objects if they're lightweight to construct. So I'm asking how expensive are they to construct?
How heavy is it to create this object, or should we cache Inference in Pinecone?
@@ -871,6 +871,10 @@ public AsyncIndex getAsyncIndexConnection(String indexName) throws PineconeValid | |||
return new AsyncIndex(connection, indexName); | |||
} | |||
|
|||
public Inference getInference() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a little odd to me as a getter. Wondering if just pc.inference().embed(...)
is better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think [pc.inference().embed(...) definitely feels more like python usage and it aligns with rest of our SDKs but given its java, do you feel like users will directly call pc.inference().embed()? And pc.inference() will still return the Inference object regardless, so the naming does make sense if not compared with other SDKs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also for dataplane opeartions, we have getIndexConnection() and getAsyncIndexConnection().
private List<EmbedRequestInputsInner> convertInputStringToEmbedRequestInputsInner(List<String> inputs) { | ||
List<EmbedRequestInputsInner> embedRequestInputsInnerList = new ArrayList<EmbedRequestInputsInner>(); | ||
for(String input:inputs) { | ||
embedRequestInputsInnerList.add(new EmbedRequestInputsInner().text(input)); | ||
} | ||
return embedRequestInputsInnerList; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be better using stream API:
inputs.stream()
.map(input -> new EmbedRequestInputsInner().text(input))
.collect(Collectors.toList());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you think using stream is better here? Wouldn't it create a new object for intermediate step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd have to look more closely at the underlying impl, but I think using the stream impl gives java flexibility to optimize how/when the underlying list actually gets created, and it may never have to allocate a new list at all depending on the usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There a couple of points we are looking at:
- .streams will first and foremost create a stream and then the processing happens using the map function and finally it creates and assigns to the new list while the for-loop approach creates the list and adds the elements to it directly, so there is no intermediate step of creating and allocating extra memory.
I think using the stream impl gives java flexibility to optimize how/when the underlying list actually gets created, and it may never have to allocate a new list at all depending on the usage.
The only relevant optimization I can think of is lazy evaluation but in this case with collect(), the list will still be created as a last step right, since the inputs() accepts List as an argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is your call, and don't think it'll make much difference in how fast things run on such small datasets. I just find the stream api more pleasant to use, and gives java an opportunity to optimize what's happening under the hood more so than if you explicitly instantiate a list. But this is not a showstopper by any means
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the dataset is going to be small so creating intermediate objects is not going to affect the performance and yes, streams are easier to read. Just wanted to make sure I didn't skip any optimization step but I can change this to streams for readability purpose.
Problem
Describe the purpose of this change. What problem is being solved and why?
Solution
Describe the approach you took. Link to any relevant bugs, issues, docs, or other resources.
Type of Change
Test Plan
Describe specific steps for validating this change.