--max-index-size and --max-task-db-size are not practical #567
-
Sharing @nicolasvienot's input:
Currently, we set both to the size of the disk we use:
Of course, this does not really make sense, but we mostly wanted to start using those variables in order to someday be able to really limit the size of the Meilisearch DBs to the size of our disk. We can’t really use those variables as it would mean:
When implementing that, yes, we were hoping for a way to actually set a disk limit for Meilisearch. So a variable that could let us just set the max size for the whole of Meilisearch would be a lot better for our use case. Although there might be other use cases where a user wants to limit the size of a specific index, a way to set a limit for all indexes would also be nice. Since there is no way to separate the location of the tasks and the indexes data, I don’t think we really need the
To recapitulate, for our use case, a
-
I've drawn some graphs to measure how often these options are used: the usage rate is below 1% among all the instances that share telemetry data.
-
After a short meeting with @dureuill, @irevoire, and @gmourier, we concluded that this is not a simple task. The current system internally uses the LMDB max size functions to specify the size of the indexes, one by one. We would have to design a smart algorithm to fairly distribute the disk usage across indexes. Note that changing the max map size of an env requires closing it and blocking read and write operations.
I would also like to note that the current implementation of the index-scheduler doesn't store the update files (e.g., JSON documents awaiting processing) in the LMDB env, and therefore the
We will continue the discussion about exposing a much simpler-to-use flag in the future.
-
So, we started preliminary work on this topic, and I would like to raise various points: Do we want a
-
What to do with
-
📡 Update 👋
Cross-posting @dureuill's comment about the limitations brought by this change (removing
So, after discussing this with the team, we decided to move forward with removing the flags and setting each index size to 500GiB for now.
This change has 2 consequences (besides the fact that it is breaking due to removing CLI options):
1. The number of indexes that can exist simultaneously in a Meilisearch DB becomes around
We expect at least (1) to affect some users (it is unclear whether (2) will affect users in practice, however).
This change is currently implemented in meilisearch/meilisearch#3278.
This change is being made now so as to avoid stabilizing
We plan to lift the described limitations (1) and (2) in
Knowing this and looking at our analytics, we estimate that a few users will be affected by the limit on the number of possible indexes and the index size.
📚 If you use many indexes to implement a multi-tenant search, you may have missed the tenant tokens feature; it should help you avoid having one index per tenant.
It is possible that the analytics at our disposal do not give us a complete picture of the pool of users who will be impacted by this change; that's why we are already working to solve these limitations. Some avenues are discussed in meilisearch/meilisearch#3280.
If you encounter a problem caused by these limitations, please mention it here! 🙏
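To make the first consequence concrete, here is a rough sketch of the arithmetic, assuming each index reserves a fixed 500GiB of virtual address space (LMDB memory-maps its environment up to the configured map size). The function name and the 128 TiB address space in the usage note are illustrative assumptions, not figures taken from the comment above, and the result is only an upper bound since the process needs address space for other things too.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def max_simultaneous_indexes(address_space_bytes, per_index_map_bytes, reserved_bytes=0):
    """Upper bound on how many fixed-size memory maps fit in an address space.

    `reserved_bytes` is a hypothetical allowance for everything else the
    process maps (heap, stacks, libraries, the task database, ...).
    """
    if per_index_map_bytes <= 0:
        raise ValueError("per_index_map_bytes must be positive")
    usable = address_space_bytes - reserved_bytes
    if usable <= 0:
        return 0
    # Integer division: a partially fitting map cannot be opened.
    return usable // per_index_map_bytes


if __name__ == "__main__":
    # Assumption: a 47-bit (128 TiB) user address space, as on typical
    # x86-64 Linux. Other platforms and configurations differ.
    print(max_simultaneous_indexes(128 * TIB, 500 * GIB))
```

With these assumed inputs the bound is 262 indexes; a smaller usable address space lowers it proportionally, which is why a fixed per-index size caps the number of indexes that can be open at once.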
-
My For example the website on the primary search page, it is not slow with this size:
-
Hello, everyone following this discussion 👋
We have just released the first RC (release candidate) of Meilisearch, containing the removal of limits on index size and number! You can test it by using:
docker run -it --rm -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:v1.1.0-rc.0
You are welcome to share any feedback about this new implementation in this discussion. If you encounter any bugs, please report them here.
🎉 The official stable release containing this change will be available on 3rd April 2023.
-
Hey folks ✨
The limit introduced in v1.0 when removing the
Meilisearch no longer limits the size or the number of indexes a single instance may contain. You can now create an unlimited number of indexes, whose maximum size is dictated only by the memory address space that the OS devotes to a single process. Under Linux, the limit is about 80TiB.
Note that the task DB is limited to 10GB; we are working on solving a case where reaching that limit blocks write operations on Meilisearch. In the meantime, if you don't need to keep the history of finished tasks, we recommend regularly deleting them to avoid reaching that limit.
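For the recommendation to delete finished tasks regularly, a minimal sketch of the request is shown below. It assumes the task deletion route (`DELETE /tasks` with a `statuses` filter) from the Meilisearch tasks API; the base URL, the API key, and the helper name are placeholders invented for this example. The helper only builds the request so it can be inspected before anything is sent.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Statuses that mark a task as finished.
FINISHED_STATUSES = ("succeeded", "failed", "canceled")


def build_delete_finished_tasks_request(base_url, api_key=None, statuses=FINISHED_STATUSES):
    """Build (but do not send) a DELETE /tasks request for finished tasks.

    `base_url` and `api_key` are placeholders for your own instance.
    """
    # Keep commas literal so the filter reads statuses=succeeded,failed,canceled.
    query = urlencode({"statuses": ",".join(statuses)}, safe=",")
    headers = {}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return Request(f"{base_url.rstrip('/')}/tasks?{query}", method="DELETE", headers=headers)


if __name__ == "__main__":
    request = build_delete_finished_tasks_request("http://localhost:7700", api_key="MASTER_KEY")
    print(request.get_method(), request.full_url)
    # Uncomment to actually send it to a running instance:
    # with urlopen(request) as response:
    #     print(response.read().decode())
```

Running something like this from a scheduled job (cron or similar) is one way to keep the task DB away from its size limit; check the filter against your own retention needs before enabling it.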
-
As a user, to control the space Meilisearch will take on my disk, I would love to have only one option to set instead of two, like a --max-db-size covering the task database AND the index databases.
Also, if there is no choice and I have to set two DB sizes, I would rather set the maximum size of the database for ALL my indexes, not per index. For me, --max-index-size is not practical.
I am trying to think as a user. I am not saying this is easy technically, but I want us to reconsider both of these options we provide to users, which are not really practical or intuitive from my POV.