Limit the maximum length of the document identifier #583
loiclec
started this conversation in
Feedback & Feature Proposal
-
Hello, a few thoughts about this proposal, from a newcomer's perspective:
Keep in mind I could be missing some aspects of the trade-off, though 🙏
-
I'll start by weighing the pros and cons of limiting the id to 500 bytes. No concrete gain has been given for a more restrictive limit, so I’m not considering one right now.
Pros:
Cons:
-
Do we know if our competitors have this kind of limitation?
-
At the moment, we have not defined the maximum allowed length for a document's identifier.
For example, it is possible to add the following document, with a document id weighing 632 bytes:
Since Meilisearch v1 will be released soon, with a promise of no breaking changes, we need to decide now whether it is sustainable for us to keep accepting very large document ids.
Technically, I could see this lack of limitation causing some problems. For example, at the moment it is not possible to create any LMDB database where the key is the document id, since LMDB keys are limited to 511 bytes (I think). Therefore, to manage the correspondence between the user document identifier (String) and the internal docid (32-bit unsigned integer), we resort to using two FSTs (finite state transducers), which are difficult to work with and slow to update incrementally (we have at least one issue related to it).
I'd therefore like to propose limiting the length of the document id to something close to 500 bytes, which would be easy to remember and would fit into an LMDB key with a few bytes of extra space for potential metadata. This would give us a lot of freedom to work with document ids and allow us to offer better performance guarantees as well.
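To make the proposal concrete, here is a minimal sketch of what such a check could look like. It is not Meilisearch's actual code: the names `MAX_DOCUMENT_ID_BYTES`, `DocumentIdTooLong`, `validate_document_id` and `encode_key`, the exact value 500, and the one-byte metadata prefix are all assumptions made for illustration. The 511-byte figure is the LMDB key limit mentioned above.

```rust
/// Hypothetical limit: the proposal only says "something close to 500 bytes".
const MAX_DOCUMENT_ID_BYTES: usize = 500;

/// LMDB key size limit as quoted in the proposal.
const LMDB_MAX_KEY_BYTES: usize = 511;

/// Hypothetical error returned when a document id exceeds the limit.
#[derive(Debug, PartialEq)]
struct DocumentIdTooLong {
    len: usize,
    max: usize,
}

/// Checks the id's length in bytes (UTF-8), not in characters,
/// because the key limit is expressed in bytes.
fn validate_document_id(id: &str) -> Result<(), DocumentIdTooLong> {
    let len = id.len(); // `str::len` returns the byte length
    if len > MAX_DOCUMENT_ID_BYTES {
        return Err(DocumentIdTooLong { len, max: MAX_DOCUMENT_ID_BYTES });
    }
    Ok(())
}

/// Builds a database key from a validated id: one assumed metadata byte
/// followed by the raw id bytes. With a 500-byte id limit the key is at
/// most 501 bytes, which stays under the 511-byte key limit.
fn encode_key(metadata: u8, id: &str) -> Result<Vec<u8>, DocumentIdTooLong> {
    validate_document_id(id)?;
    let mut key = Vec::with_capacity(1 + id.len());
    key.push(metadata);
    key.extend_from_slice(id.as_bytes());
    debug_assert!(key.len() <= LMDB_MAX_KEY_BYTES);
    Ok(key)
}
```

Under these assumptions, a 632-byte id like the one in the example above would be rejected, while any accepted id could be used directly as a key with room left for a small prefix. Counting bytes rather than characters matters for non-ASCII ids: 300 two-byte characters already exceed the limit.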