feat(benchmark infra) Add beforeEachBatchAsync callback #23391

Draft
wants to merge 9 commits into
base: main

Conversation

markfields
Member

@markfields markfields commented Dec 20, 2024

Description

Fixes AB#19814

This adds an async version of beforeEachBatch, to allow awaiting async code before running a batch of benchmark iterations.

Reviewer Guidance

The merits of this design choice are up for debate, given that you can achieve this with the custom benchmark options.

@markfields markfields requested review from Copilot, CraigMacomber and alexvy86 and removed request for Copilot December 20, 2024 20:28
@github-actions github-actions bot added the base: main PRs targeted against main branch label Dec 20, 2024
@markfields markfields requested a review from Abe27342 December 20, 2024 20:29
Contributor

@alexvy86 alexvy86 left a comment

IIRC there's a pattern somewhere in the benchmark package for how to use the sync vs. async functions. I'm not sure that just replacing beforeEachBatch with beforeEachBatchAsync within runBenchmarkAsync is the right way to go. If the beforeEachBatch function is not async but the benchmarkFn is, I think it's still valid to use sync beforeEachBatch inside runBenchmarkAsync, unless we rewrite the contracts and the docs a bit so that BenchmarkRunningOptionsAsync can only take the async version of everything.

Depending on the final way we go, we might need to document this as a breaking change for the package (bump version to 0.52 and add an entry to CHANGELOG.md). Related: I think somewhere I might have a branch with the updated CHANGELOG that includes the changes in 0.51, I'll see if I can find it.

@markfields
Member Author

> If the beforeEachBatch function is not async but the benchmarkFn is, I think it's still valid to use sync beforeEachBatch inside runBenchmarkAsync,

You're so right! I thought about this briefly then promptly forgot. Should I run both beforeEachBatch's in the async case...?
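
For illustration, here is a minimal sketch of the "run both" option: the two hooks are folded into a single callback that an async runner could await before each batch. `BatchHooks`, `combineBatchHooks`, and the sync-then-async ordering are assumptions made for this example, not the benchmark package's actual code.

```typescript
// Hypothetical option shape; the property names mirror the ones discussed in this PR.
interface BatchHooks {
	beforeEachBatch?: () => void;
	beforeEachBatchAsync?: () => Promise<void>;
}

// Folds both hooks into one callback for an async runner to await before each batch.
// Assumption: the sync hook runs first, then the async hook.
function combineBatchHooks(hooks: BatchHooks): () => Promise<void> {
	return async () => {
		hooks.beforeEachBatch?.();
		await hooks.beforeEachBatchAsync?.();
	};
}

// Usage: record the order in which the hooks are invoked.
const hookCalls: string[] = [];
const combinedHook = combineBatchHooks({
	beforeEachBatch: () => {
		hookCalls.push("sync");
	},
	beforeEachBatchAsync: async () => {
		hookCalls.push("async");
	},
});
const combinedHookDone: Promise<void> = combinedHook();
```

With this shape, existing callers that pass only a sync beforeEachBatch alongside benchmarkFnAsync would keep working, since a missing hook is simply skipped.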

May have the same problem where existing calls
with sync beforeEachBatch + benchmarkFnAsync won't work.
Or maybe it does, but types don't block the converse.
@github-actions github-actions bot added the public api change Changes to a public API label Jan 1, 2025
@markfields markfields requested a review from alexvy86 January 1, 2025 23:54
Contributor

github-actions bot commented Jan 2, 2025

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> fluid-framework-docs-site@0.0.0 ci:check-links /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test "npm run serve -- --no-open" 3000 check-links

1: starting server using command "npm run serve -- --no-open"
and when url "[ 'http://127.0.0.1:3000' ]" is responding with HTTP status code 200
running tests using command "npm run check-links"


> fluid-framework-docs-site@0.0.0 serve
> docusaurus serve --no-open

[SUCCESS] Serving "build" directory at: http://localhost:3000/

> fluid-framework-docs-site@0.0.0 check-links
> linkcheck http://localhost:3000 --skip-file skipped-urls.txt

Crawling...

Stats:
  170486 links
    1603 destination URLs
    1838 URLs ignored
       0 warnings
       0 errors


batches++;
},
benchmarkFnAsync: async (): Promise<void> => {
assert(batches > 0, "beforeEachBatchAsync should be called before test body");
Contributor

Suggested change
- assert(batches > 0, "beforeEachBatchAsync should be called before test body");
+ assert(batches > 0, "beforeEachBatch should be called before test body");

iterations++;
},
after: () => {
// Restore to actual batch count for 'after' hook logic
Contributor

Suggested change
- // Restore to actual batch count for 'after' hook logic
+ // Restore to actual batch count for 'afterEach' hook logic which cares about the number of batches that ran

batches++;
},
benchmarkFn: (): void => {
assert(batches > 0, "beforeEachBatchAsync should be called before test body");
Contributor

Suggested change
- assert(batches > 0, "beforeEachBatchAsync should be called before test body");
+ assert(batches > 0, "beforeEachBatch should be called before test body");

*
* @public
*/
export interface OnBatchAsync {
Contributor

OnBatch and OnBatchAsync leave me a bit confused. Is it strictly necessary to have beforeEachBatchAsync?: never; in the former? And should beforeEachBatch?: () => void; really be deprecated on the latter? Seems like we allow both to be defined (since we call both in the "synthetic" beforeAfterBatch we craft inside validateBenchmarkArguments).

Member Author

The key feature is to block pairing a synchronous benchmark with an asynchronous beforeEachBatch. I tried a number of strategies, and this was the best I came up with.

If you can see a simpler way to keep the tests passing, I'm all for it. Or maybe that requirement I held to isn't worth it? It seems worth it to me, but I'm open to discussion.

Member Author

I deprecated the one in the Async type because it's unnecessary, as you pointed out. Later someone could migrate usages to the new Async function and remove that one.
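
To make the typing approach in this thread concrete, here is a simplified sketch of how an optional `never` property can reject the sync benchmark plus async hook combination at compile time. The interfaces are trimmed-down stand-ins modeled on the API report excerpt in this PR, and `SyncArgs`, `AsyncArgs`, and `describeArgs` are hypothetical names introduced only for this example.

```typescript
// Simplified stand-ins for the option types discussed in this thread.
interface OnBatch {
	beforeEachBatch?: () => void;
	// `never` means no function can be supplied under this name.
	beforeEachBatchAsync?: never;
}

interface OnBatchAsync {
	beforeEachBatch?: () => void;
	beforeEachBatchAsync?: () => Promise<void>;
}

// Hypothetical argument types: one sync, one async.
type SyncArgs = { benchmarkFn: () => void } & OnBatch;
type AsyncArgs = { benchmarkFnAsync: () => Promise<void> } & OnBatchAsync;

// Hypothetical helper that reports which variant it was given.
function describeArgs(args: SyncArgs | AsyncArgs): "sync" | "async" {
	return "benchmarkFnAsync" in args ? "async" : "sync";
}

// Sync benchmark with a sync hook: accepted.
const syncKind = describeArgs({ benchmarkFn: () => {}, beforeEachBatch: () => {} });

// Async benchmark with an async hook: accepted.
const asyncKind = describeArgs({ benchmarkFnAsync: async () => {}, beforeEachBatchAsync: async () => {} });

// @ts-expect-error The async hook is rejected when paired with a sync benchmarkFn.
describeArgs({ benchmarkFn: () => {}, beforeEachBatchAsync: async () => {} });
```

Note that this guards only the beforeEachBatchAsync property name; it does not stop an async function from being passed as beforeEachBatch.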

Comment on lines +115 to +117
export type BenchmarkRunningOptionsSync = BenchmarkSyncArguments;

- export type BenchmarkRunningOptionsAsync = BenchmarkAsyncArguments &
- BenchmarkTimingOptions &
- OnBatch;
+ export type BenchmarkRunningOptionsAsync = BenchmarkAsyncArguments;
Contributor

Nit: these two feel redundant now, maybe we should remove them?

Member Author

Fine with me. @CraigMacomber ?

@CraigMacomber
Contributor

While I'm not generally opposed to this, I'd like to point out a couple things:

  1. This feature should not be considered required to implement anything, since it's perfectly possible to do everything it allows and more using benchmarkFnCustom. Note that even that is not our most flexible option: benchmarkCustom is even more flexible (it can measure things other than the runtime of batches).
  2. A lot of the benchmarks I have seen that need to run code in between batches are very sketchy and were written with incorrect assumptions about how batches work. For example, if the benchmark function accumulates state that needs cleaning up, doing the cleanup between batches is a flawed approach: the amount of accumulated state ends up proportional to batch size, which can impact performance (e.g. GC speed), while the tests assume that the function being tested has performance independent of batch size. I have yet to see a good use of it that would not be more clearly and robustly done with benchmarkFnCustom. I don't particularly want to add more or better support for, and more focus on, this very commonly confusing and misused API.
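
The batch-size concern in point 2 can be shown with a small self-contained sketch. It is not code from the benchmark package: `peakAccumulatedState` is a hypothetical helper that simulates a benchmark body which leaves one item of state behind per iteration, with cleanup only between batches.

```typescript
// Simulates a benchmark whose body accumulates state, cleaned up only between batches.
// Returns the largest amount of state that was live at any point.
function peakAccumulatedState(batchSize: number, batchCount: number): number {
	const accumulated: number[] = [];
	let peak = 0;
	for (let batch = 0; batch < batchCount; batch++) {
		// Cleanup between batches (the beforeEachBatch-style approach).
		accumulated.length = 0;
		for (let i = 0; i < batchSize; i++) {
			// Stand-in for a benchmark body that leaves state behind.
			accumulated.push(i);
			peak = Math.max(peak, accumulated.length);
		}
	}
	return peak;
}

const peakSmallBatches = peakAccumulatedState(10, 5); // 10
const peakLargeBatches = peakAccumulatedState(1000, 5); // 1000
```

The peak tracks the batch size rather than staying constant, so any cost that depends on live state (such as GC work) varies with whatever batch size the runner happens to pick.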

Comment on lines +175 to +182
beforeEachBatchAsync?: never;
}

// @public
export interface OnBatchAsync {
// @deprecated
beforeEachBatch?: () => void;
beforeEachBatchAsync?: () => Promise<void>;
Contributor

This change seems unnecessarily complex. While we have to make the actual function being timed clearly sync or async to avoid the overhead of checking which it is on each iteration, I don't see an issue with making beforeEachBatch?: () => void; just typed as beforeEachBatch?: () => void | Promise<unknown>; then just awaiting whatever it returns unconditionally.

That should keep the API identical, except that async beforeEachBatch will work. This is similar to how mocha deals with hooks: same API for sync and async.
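
A rough sketch of that suggestion, assuming a stripped-down async runner with no timing logic. `AsyncRunOptions`, `runBatchesAsync`, and `demoRun` are hypothetical names for this example and do not reflect how runBenchmarkAsync is actually implemented.

```typescript
// One hook type for both sync and async callers.
type BeforeEachBatch = () => void | Promise<unknown>;

// Hypothetical options for a simplified async runner (timing omitted).
interface AsyncRunOptions {
	beforeEachBatch?: BeforeEachBatch;
	benchmarkFnAsync: () => Promise<void>;
	batchCount: number;
	iterationsPerBatch: number;
}

async function runBatchesAsync(options: AsyncRunOptions): Promise<void> {
	for (let batch = 0; batch < options.batchCount; batch++) {
		// Awaiting a non-promise value is harmless, so both hook styles share one code path.
		await options.beforeEachBatch?.();
		for (let i = 0; i < options.iterationsPerBatch; i++) {
			await options.benchmarkFnAsync();
		}
	}
}

// Usage: the same option accepts a sync hook and an async hook.
async function demoRun(): Promise<string[]> {
	const log: string[] = [];
	await runBatchesAsync({
		beforeEachBatch: () => {
			log.push("sync hook");
		},
		benchmarkFnAsync: async () => {
			log.push("iteration");
		},
		batchCount: 1,
		iterationsPerBatch: 2,
	});
	await runBatchesAsync({
		beforeEachBatch: async () => {
			await Promise.resolve(); // stands in for real async setup
			log.push("async hook");
		},
		benchmarkFnAsync: async () => {
			log.push("iteration");
		},
		batchCount: 1,
		iterationsPerBatch: 1,
	});
	return log;
}
```

In both runs the hook finishes before the first iteration of the batch, because the hook's result is awaited either way.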

Member Author

That's how I started. I was bothered by the lack of a compiler error when passing args with a synchronous benchmarkFn but an async beforeEachBatch.
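
The gap described here can be reproduced in isolation. TypeScript allows a function that returns a promise to be passed where `() => void` is expected, so a synchronous runner drops the promise without any compiler error. `runBatchSync` below is a hypothetical stand-in, not the package's runBenchmarkSync.

```typescript
// Hypothetical sync runner: it has no way to wait for an async hook.
function runBatchSync(beforeEachBatch: () => void, benchmarkFn: () => void): void {
	beforeEachBatch(); // If this returns a promise, it is dropped, not awaited.
	benchmarkFn();
}

let setupFinished = false;
const asyncSetup = async (): Promise<void> => {
	await Promise.resolve(); // stands in for real async work
	setupFinished = true;
};

// Records what the benchmark body saw when it ran.
const observedSetupState: boolean[] = [];

// Compiles without error even though asyncSetup is async.
runBatchSync(asyncSetup, () => {
	observedSetupState.push(setupFinished);
});
// observedSetupState[0] is false: the body ran before the async setup completed.
```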

Contributor

Looking at the implementation, I guess that would be an issue with how a few things are currently factored, like runBenchmarkSync, so some of this refactoring might be needed to support async beforeEachBatch.

@markfields markfields marked this pull request as draft January 8, 2025 19:45
Labels
base: main PRs targeted against main branch public api change Changes to a public API
3 participants