fix: ngram_bench #49
Walkthrough
Recent updates focus on performance and dependency management. Key changes include fixing a counter function in the
Changes
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #49      +/-   ##
==========================================
- Coverage   85.05%   83.96%   -1.09%
==========================================
  Files           4        4
  Lines         368      343      -25
==========================================
- Hits          313      288      -25
  Misses         55       55
Actionable comments posted: 3
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- CHANGELOG.md (1 hunks)
- Cargo.toml (1 hunks)
- src/ngram.rs (3 hunks)
Files skipped from review due to trivial changes (1)
- Cargo.toml
Additional Context Used
LanguageTool (1)
CHANGELOG.md (1)
Near line 11: There seems to be a noun/verb agreement error. Did you mean “counters” or “countered”?
Context: ...eleased] ### Fixed - `ngram` bench use counter lib's function rather than the truly us...
Rule ID: SINGULAR_NOUN_VERB_AGREEMENT
Markdownlint (15)
CHANGELOG.md (15)
18: Expected: 1; Actual: 2
Multiple consecutive blank lines
25: Expected: 1; Actual: 2
Multiple consecutive blank lines
10: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
13: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
19: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
26: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
27: Expected: 1; Actual: 0; Above
Headings should be surrounded by blank lines
27: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
31: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
11: null
Lists should be surrounded by blank lines
14: null
Lists should be surrounded by blank lines
20: null
Lists should be surrounded by blank lines
28: null
Lists should be surrounded by blank lines
32: null
Lists should be surrounded by blank lines
32: null
Files should end with a single newline character
Additional comments not posted (2)
src/ngram.rs (2)
21-21: Ensure test imports are correctly scoped. This import is correctly scoped within the test module, which is good practice in Rust to avoid including test dependencies in the main binary.
60-60: Ensure benchmark imports are correctly scoped. This import is correctly scoped within the benchmark module, which is good practice in Rust to avoid including benchmark dependencies in the main binary.
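For readers unfamiliar with the scoping pattern these two comments refer to, here is a minimal sketch. `count_tokens` is a hypothetical stand-in written for this illustration and is not taken from `src/ngram.rs`; the point is only that a `use` inside a `#[cfg(test)]` module is compiled for `cargo test` and left out of normal builds.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a counter function; the real code in
/// `src/ngram.rs` may look different.
pub fn count_tokens(tokens: &[String]) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for token in tokens {
        *counts.entry(token.as_str()).or_insert(0) += 1;
    }
    counts
}

#[cfg(test)]
mod tests {
    // This import exists only when the crate is compiled for testing,
    // so it adds nothing to the regular library or binary build.
    use super::count_tokens;

    #[test]
    fn counts_repeated_tokens() {
        let tokens = vec!["a".to_string(), "a".to_string(), "b".to_string()];
        let counts = count_tokens(&tokens);
        assert_eq!(counts["a"], 2);
        assert_eq!(counts["b"], 1);
    }
}
```

The same layout applies to a benchmark module: imports needed only by the benchmark go inside the module that is conditionally compiled.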
### Fixed
- `ngram` bench use counter lib's function rather than the truly used function.
Ensure grammatical correctness in the changelog entry.
- `ngram` bench use counter lib's function rather than the truly used function.
+ `ngram` bench uses counter lib's function rather than the truly used function.
Address formatting issues to improve readability.
+
### Fixed
- `ngram` bench uses counter lib's function rather than the truly used function.
+
### Changed
- Use AHash in ngram module
+
## [0.1.2] - 2024-04-29
+
### Changed
- Implement ngram counts with a better way (#33)
- Rust `compute_score` function signature changed to use references (#34)
- Use rayon to run every prediction-references statistics calculation parallely (#38)
- Use ahash for better performance (#39)
+
## [0.1.1] - 2024-04-26
+
### Changed
- Upgrade `cached` version to `0.50.0`
Change crate-type to `["cdylib", "rlib"]`
+
## [0.1.0] - 2024-04-23
+
- BLEU score calculation with `Tokenizer13a` as default tokenizer.
Also applies to: 13-14, 18-20, 25-27
let tokens: Vec<String> = vec![
    "a".to_string(),
    "a".to_string(),
    "b".to_string(),
    "c".to_string(),
];
let max_order = 4;

let iter_num: usize = 100;
b.iter(|| {
    std::hint::black_box(for _ in 1..=iter_num {
        get_ngram_counter(line, max_order);
        get_token_ngram_counter(&tokens, max_order);
Optimize the benchmark setup.
Consider pre-allocating the `tokens` vector outside the benchmark loop to avoid repeated allocations during the benchmarking process.
let tokens: Vec<String> = vec![
    "a".to_string(),
    "a".to_string(),
    "b".to_string(),
    "c".to_string(),
];
let max_order = 4;
let iter_num: usize = 100;
b.iter(|| {
    std::hint::black_box(for _ in 1..=iter_num {
        get_token_ngram_counter(&tokens, max_order);
    });
});
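One detail worth noting about the snippet above: wrapping the whole `for` loop in `std::hint::black_box` only passes the loop's `()` value to `black_box`, which may not stop the compiler from discarding the unused return value of each call. A rough sketch of an alternative follows, using only the standard library so it runs on stable Rust. `count_token_ngrams` and `time_ngram_counting` are hypothetical names written for this illustration; the real `get_token_ngram_counter` may have a different signature and return type.

```rust
use std::collections::HashMap;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `get_token_ngram_counter`: counts every
/// n-gram of order 1..=max_order as a slice of the input tokens.
pub fn count_token_ngrams(tokens: &[String], max_order: usize) -> HashMap<&[String], usize> {
    let mut counts: HashMap<&[String], usize> = HashMap::new();
    for order in 1..=max_order {
        // `windows` yields nothing when `order` exceeds the token count.
        for ngram in tokens.windows(order) {
            *counts.entry(ngram).or_insert(0) += 1;
        }
    }
    counts
}

/// Times `iterations` calls. The inputs are allocated by the caller, once,
/// and each result is passed through `black_box` so the call is not
/// treated as dead code.
pub fn time_ngram_counting(tokens: &[String], max_order: usize, iterations: usize) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        let counts = count_token_ngrams(black_box(tokens), black_box(max_order));
        black_box(counts);
    }
    start.elapsed()
}
```

With the nightly `Bencher` used in this PR, the same idea would amount to building `tokens` before `b.iter` and returning or black-boxing each call's result inside the closure.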
Summary by CodeRabbit
Bug Fixes
- … `ngram` benchmark.
New Features
- … `ngram` module for improved performance.
Dependency Updates
- … `cached` dependency to version `0.51.3`.
- … `regex`, `lazy_static`, `rayon`, and `ahash`.
- … `counter` dependency.
Refactor
- … `ngram` module for better efficiency and clarity.