Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: relax the constraints of floor_sum #99

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 35 additions & 0 deletions src/internal_math.rs
Original file line number Diff line number Diff line change
Expand Up @@ -235,6 +235,41 @@ pub(crate) fn primitive_root(m: i32) -> i32 {
// omitted
// template <int m> constexpr int primitive_root = primitive_root_constexpr(m);

/// # Arguments
/// * `n` `n < 2^32`
/// * `m` `1 <= m < 2^32`
///
/// # Returns
/// `sum_{i=0}^{n-1} floor((ai + b) / m) (mod 2^64)`
/* const */
#[allow(clippy::many_single_char_names)]
pub(crate) fn floor_sum_unsigned(mut n: u64, mut m: u64, mut a: u64, mut b: u64) -> u64 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let this function take Wrapping<u64> instead of u64 to avoid complication; it's an internal function anyway.

let mut ans = 0;
loop {
if a >= m {
if n > 0 {
ans += n * (n - 1) / 2 * (a / m);
}
a %= m;
}
if b >= m {
ans += n * (b / m);
b %= m;
}

let y_max = a * n + b;
if y_max < m {
break;
}
// y_max < m * (n + 1)
// floor(y_max / m) <= n
n = y_max / m;
b = y_max % m;
std::mem::swap(&mut m, &mut a);
}
ans
}

#[cfg(test)]
mod tests {
#![allow(clippy::unreadable_literal)]
Expand Down
46 changes: 31 additions & 15 deletions src/math.rs
Original file line number Diff line number Diff line change
Expand Up @@ -185,25 +185,22 @@ pub fn crt(r: &[i64], m: &[i64]) -> (i64, i64) {
///
/// assert_eq!(math::floor_sum(6, 5, 4, 3), 13);
/// ```
#[allow(clippy::many_single_char_names)]
pub fn floor_sum(n: i64, m: i64, mut a: i64, mut b: i64) -> i64 {
assert!(0 <= n && n < 1i64 << 32);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you just replace it with something like matches!(n, 0..1 << 32)? This is slightly better than (0..1 << 32).contains(&n) because it does not involve taking a reference to n, which may avoid constant-factor performance degradation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the implementation of matches! macro seems to be insufficient, it causes error on parsing. Should I replace it with (0..1<<32).contains(&n) ?

% cargo build
   Compiling ac-library-rs v0.1.0 (/home/user/xxx/ac-library-rs)
error: no rules expected the token `<<`
   --> src/math.rs:190:33
    |
190 |     assert!(matches!(n, 0..1i64 << 32));
    |                                 ^^ no rules expected this token in macro call

error: no rules expected the token `<<`
   --> src/math.rs:191:33
    |
191 |     assert!(matches!(m, 1..1i64 << 32));
    |                                 ^^ no rules expected this token in macro call

error: aborting due to 2 previous errors

error: could not compile `ac-library-rs`.

To learn more, run the command again with --verbose.

assert!(1 <= m && m < 1i64 << 32);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

let mut ans = 0;
Copy link
Author

@nebocco nebocco Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since overflows occur in C++ without special handling, they use unsigned long long here to extend the range of values returned correctly; implementing this in Rust would require a lot of casting (as u64) and functions like u64::overflowing_sub. Since this is cumbersome, I've written the internal computation in i64, but this may have led to more cases of overflowing only in the middle of the computation. Which would be better?

Copy link
Collaborator

@TonalidadeHidrica TonalidadeHidrica Aug 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using wrapping_* variants?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's wrap the variables with std::num::Wrapping<u64>. Note that if you shadow the variables, you need to check if a and b are negative beforehand.

if a >= m {
ans += (n - 1) * n * (a / m) / 2;
a %= m;
}
if b >= m {
ans += n * (b / m);
b %= m;
if a < 0 {
let a2 = internal_math::safe_mod(a, m);
ans -= n * (n - 1) / 2 * ((a2 - a) / m);
a = a2;
}

let y_max = (a * n + b) / m;
let x_max = y_max * m - b;
if y_max == 0 {
return ans;
if b < 0 {
let b2 = internal_math::safe_mod(b, m);
ans -= n * ((b2 - b) / m);
b = b2;
}
ans += (n - (x_max + a - 1) / a) * y_max;
ans += floor_sum(y_max, a, m, (a - x_max % a) % a);
ans
ans + internal_math::floor_sum_unsigned(n as u64, m as u64, a as u64, b as u64) as i64
}

#[cfg(test)]
Expand Down Expand Up @@ -306,5 +303,24 @@ mod tests {
499_999_999_500_000_000
);
assert_eq!(floor_sum(332955, 5590132, 2231, 999423), 22014575);
for n in 0..20 {
for m in 1..20 {
for a in -20..20 {
for b in -20..20 {
assert_eq!(floor_sum(n, m, a, b), floor_sum_naive(n, m, a, b));
}
}
}
}
}

#[allow(clippy::many_single_char_names)]
fn floor_sum_naive(n: i64, m: i64, a: i64, b: i64) -> i64 {
let mut ans = 0;
for i in 0..n {
let z = a * i + b;
ans += (z - internal_math::safe_mod(z, m)) / m;
}
ans
}
}