Ken started Quickjs backed indexer processes freeze #5342
Comments
Thanks for reaching out, Antoine.
Hi nicka, there is no crash in the logs at all. Below is the requested information from remsh.
Thanks @aduchate. Hmm, so far everything looks good.
pid:
indexer_pid:
A stuck indexer never moves again: none of the values change once it is stuck. Which indexers get stuck is completely random. After killing the stuck indexer, the index ends up building correctly. The QuickJS scanner shows no problems. The indexes used to build correctly with SpiderMonkey, but in this current build SpiderMonkey is broken, so I cannot check whether the problem also arises with SpiderMonkey. I'll try building with OTP 25 and let you know. Thanks for the support.
I have an update on this one. Using OTP 25 instead of OTP 27 seems to fix the issue.
Thank you for the update, Antoine. That is an interesting find! We have sometimes seen new Erlang versions expose issues that only show up with CouchDB code, so that's certainly a possibility. I guess it could be either OTP 26 or OTP 27. Let's keep this issue open to serve as a record/reminder in case we narrow down the cause. If you get a chance, it would be interesting at least to see whether it's 26 or 27. In 26, I believe there were some OTP changes related to terminal IO processing, which would be relevant to talking to the indexer couchjs processes (either SpiderMonkey or QuickJS, it's all stdio basically).
Description
On a 6-node cluster (3.4.2) with a fairly large number of databases (~1500, each between 1GB and 150GB in size), we have recently added and modified about 20 design docs per database (30000 design docs in total). We have set up Ken with a concurrency of 5 to let the indexing happen.
About every 10 minutes, we see one indexer process that is no longer being updated. It basically stays stuck forever (we let a few linger for 24 hours). Killing all couchjs_mainjs processes has no impact on the stuck indexer. The only way to get rid of it is to issue, in remsh, an exit(Pid, kill). Pid here is the pid field from /_active_tasks, not indexer_pid.
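For anyone trying to spot these stuck indexers without watching /_active_tasks by hand, here is a minimal sketch in Python. It assumes the task list has already been fetched and decoded from JSON, and that each indexer entry carries the type, pid and updated_on (Unix timestamp) fields that /_active_tasks reports. The function name, the 600-second threshold and the sample values are made up for illustration; they are not taken from this cluster.

```python
from typing import Any


def find_stale_indexers(
    tasks: list[dict[str, Any]],
    now: int,
    max_idle_seconds: int = 600,  # arbitrary threshold, tune as needed
) -> list[dict[str, Any]]:
    """Return indexer tasks whose updated_on is older than max_idle_seconds.

    `tasks` is the decoded JSON array from /_active_tasks and `now` is the
    current Unix timestamp in seconds.
    """
    stale = []
    for task in tasks:
        # Only view indexer tasks are of interest here.
        if task.get("type") != "indexer":
            continue
        updated_on = task.get("updated_on")
        if updated_on is None:
            continue
        # A healthy indexer keeps bumping updated_on as it makes progress.
        if now - updated_on > max_idle_seconds:
            stale.append(task)
    return stale


if __name__ == "__main__":
    # Invented sample data, shaped like /_active_tasks entries.
    sample_tasks = [
        {"type": "indexer", "pid": "<0.100.0>", "updated_on": 1000},
        {"type": "indexer", "pid": "<0.200.0>", "updated_on": 1900},
        {"type": "replication", "pid": "<0.300.0>", "updated_on": 0},
    ]
    for task in find_stale_indexers(sample_tasks, now=2000):
        print(task["pid"])
```

The pid of each returned task is the value that would be passed to exit(Pid, kill) in remsh, as described above. A long-idle updated_on only suggests a stuck indexer, so it is worth checking that the progress values are really frozen before killing anything.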
Steps to Reproduce
Create a lot of databases with a lot of data, create a few design documents per database, start ken.
Expected Behaviour
The indexers shouldn't get stuck.
Your Environment
Additional Context
We can give you access to the infrastructure that causes the problem to happen if needed.