
Parallelize time-consuming parts when there are many local branches #130

Open
wants to merge 1 commit into base: main
Conversation

@achimnol commented Jan 4, 2025

I have more than 30 local branches that I'm working with.
This is partly because we are using Graphite.dev to make stacked, fine-grained, small PRs, which increases the overall number of PRs.

After inserting simple Println() calls around the code, I found that applyCommits() and querying the PRs by commit hashes from GitHub were the two major bottlenecks when there are many local branches.

This PR simply parallelizes them using goroutines and wait groups.
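For reference, the pattern is the standard goroutine + sync.WaitGroup fan-out. Below is a minimal standalone sketch of that pattern; processBranch and the branch names are hypothetical placeholders, not the actual gh-poi code:

```go
package main

import (
	"fmt"
	"sync"
)

// processBranch stands in for the per-branch work (e.g. walking commits);
// the real function names in gh-poi differ.
func processBranch(name string) string {
	return "processed " + name
}

func main() {
	branches := []string{"feature/a", "feature/b", "feature/c"}

	var wg sync.WaitGroup
	results := make([]string, len(branches))

	for i, b := range branches {
		wg.Add(1)
		go func(i int, b string) {
			defer wg.Done()
			// Each branch runs in its own goroutine; each goroutine writes
			// to a distinct slice index, so no extra locking is needed.
			results[i] = processBranch(b)
		}(i, b)
	}

	wg.Wait()
	for _, r := range results {
		fmt.Println(r)
	}
}
```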

A simple benchmark:

Before:

gh poi  2.03s user 0.72s system 16% cpu 16.858 total

After:

gh poi  1.53s user 0.65s system 42% cpu 5.136 total

You can see that most of the time was spent waiting on I/O.

@seachicken (Owner)

It's good that it will be faster. The reason I put off parallelization is that the GitHub API has a rate limit, and if there are too many requests, an error may occur. I'll check the specifications around that.

@achimnol (Author) commented Jan 8, 2025

> It's good that it will be faster. The reason I put off parallelization is that the GitHub API has a rate limit, and if there are too many requests, an error may occur. I'll check the specifications around that.

Yes, I had the same concern. Nonetheless, the GitHub API queries that fetch the corresponding PRs for a set of commit hashes are already chunked into groups of 6 hashes due to the variable length limit (256 characters) in the GraphQL schema. So even with ~30 local branches, a single execution of gh poi makes fewer than 10 actual GitHub API requests.

Practically, I think this shouldn't be an issue as long as the user is logged in, unless they rerun gh poi repeatedly within a short period of time. I've been using my fork for 4 days and haven't experienced any issues yet.
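For illustration, here is a standalone sketch of that chunking behavior; chunkHashes is a hypothetical stand-in for shared.GetQueryHashes, and the chunk size of 6 follows the description above:

```go
package main

import "fmt"

// chunkHashes splits commit hashes into fixed-size groups so that each
// group fits into one GraphQL search query (i.e. one API request).
func chunkHashes(hashes []string, perQuery int) [][]string {
	var chunks [][]string
	for len(hashes) > perQuery {
		chunks = append(chunks, hashes[:perQuery])
		hashes = hashes[perQuery:]
	}
	if len(hashes) > 0 {
		chunks = append(chunks, hashes)
	}
	return chunks
}

func main() {
	// ~30 branch head hashes chunked by 6 -> 5 API requests.
	hashes := make([]string, 30)
	for i := range hashes {
		hashes[i] = fmt.Sprintf("hash%02d", i)
	}
	fmt.Println(len(chunkHashes(hashes, 6))) // 5
}
```

So parallelizing these requests only multiplies concurrency by the number of chunks, which stays small in this scenario.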

@seachicken (Owner) left a comment

Parallelization looks good!
The API rate limit seems most affected by the number of requests per minute, but I don't think it will cause an error unless there are many local branches.
https://docs.github.com/en/graphql/overview/rate-limits-and-node-limits-for-the-graphql-api#secondary-rate-limits

}
oids, err := connection.GetLog(ctx, branch.Name)
if err != nil {
	return

seachicken (Owner):

If the git process returns an error, the user should be informed of the error.
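For reference, one common way to surface such errors from parallel workers (instead of a bare return inside the goroutine) is an error channel or golang.org/x/sync/errgroup. The sketch below uses errgroup with hypothetical names (getLog stands in for connection.GetLog); it is only an illustration, not the actual gh-poi change:

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/sync/errgroup"
)

// getLog is a hypothetical stand-in for connection.GetLog.
func getLog(ctx context.Context, branch string) ([]string, error) {
	if branch == "broken" {
		return nil, fmt.Errorf("git log failed for %s", branch)
	}
	return []string{"abc123"}, nil
}

func main() {
	branches := []string{"main", "broken", "feature/x"}

	g, ctx := errgroup.WithContext(context.Background())
	for _, b := range branches {
		b := b // capture the loop variable (needed before Go 1.22)
		g.Go(func() error {
			// Returning the error (instead of a bare `return`) lets Wait
			// propagate it, so the user can be told what went wrong.
			_, err := getLog(ctx, b)
			return err
		})
	}

	if err := g.Wait(); err != nil {
		fmt.Println("error:", err)
	}
}
```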

}

queryHashes := shared.GetQueryHashes(branches)
results := make(chan pullRequestResult, len(queryHashes))

seachicken (Owner):
I want to unify the naming with the goroutine below, so how about calling it something like prChan?
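For context, the fan-out around that buffered channel presumably looks roughly like the sketch below; pullRequestResult's fields and fetchPullRequests are illustrative, and prChan is the name suggested above:

```go
package main

import (
	"fmt"
	"sync"
)

// pullRequestResult is illustrative; the real struct in gh-poi may differ.
type pullRequestResult struct {
	prs []string
	err error
}

// fetchPullRequests stands in for one chunked GitHub API query.
func fetchPullRequests(hashes []string) ([]string, error) {
	return []string{"PR for " + hashes[0]}, nil
}

func main() {
	queryHashes := [][]string{{"aaa111"}, {"bbb222"}, {"ccc333"}}

	// Buffered to the number of chunks so every worker can send its result
	// without blocking on the reader.
	prChan := make(chan pullRequestResult, len(queryHashes))

	var wg sync.WaitGroup
	for _, hashes := range queryHashes {
		wg.Add(1)
		go func(hashes []string) {
			defer wg.Done()
			prs, err := fetchPullRequests(hashes)
			prChan <- pullRequestResult{prs: prs, err: err}
		}(hashes)
	}

	wg.Wait()
	close(prChan)

	for res := range prChan {
		if res.err != nil {
			fmt.Println("error:", res.err)
			continue
		}
		fmt.Println(res.prs)
	}
}
```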
