Expose the number of unbuffered bytes remaining
By exposing the number of unbuffered bytes, unchecked users have fewer
hoops to jump through to ensure their usage is well defined.
nickbabcock committed Dec 18, 2024
1 parent e079f2d commit 0654152
Showing 2 changed files with 81 additions and 18 deletions.
19 changes: 12 additions & 7 deletions README.md
@@ -105,20 +105,18 @@ The above is not an endorsement of the best way to simulate larger reads in Manu

There's one final trick that bitter exposes that dials performance to 11 at the cost of safety and increased assumptions. Welcome to the unchecked refill API (referred to as "unchecked"), which can only be called when there are at least 8 bytes left in the buffer. Anything less than that can cause invalid memory access. The upside is that this API unlocks the holy grail of branchless bit reading.

Always consider guarding unchecked access at a higher level:
Always guard unchecked access at a higher level:

```rust
use bitter::{BitReader, LittleEndianReader};
use bitter::{BitReader, LittleEndianReader, MAX_READ_BITS};

let mut bits = LittleEndianReader::new(&[0u8; 100]);
let objects_to_read = 10;
let object_bits = 56;
let bitter_padding = 64;
let desired_bits = objects_to_read * object_bits;
let bytes_needed = (desired_bits as f64 / 8.0).ceil();

// make sure we have enough data to read all our objects and there is enough
// data leftover so bitter can unalign read 8 bytes without fear of reading past
// the end of the buffer.
if bits.has_bits_remaining(objects_to_read * object_bits + bitter_padding) {
if bits.unbuffered_bytes_remaining() >= bytes_needed as usize {
for _ in 0..objects_to_read {
unsafe { bits.refill_lookahead_unchecked() };
let _field1 = bits.peek(2);
@@ -133,6 +131,13 @@ if bits.has_bits_remaining(objects_to_read * object_bits + bitter_padding) {
let _field4 = bits.peek(18);
bits.consume(18);
}
} else if bits.has_bits_remaining(desired_bits) {
// We have enough bits to read all the objects, just not
// enough to call the unchecked lookahead API every time.
assert!(false);
} else {
// Not enough data.
assert!(false);
}
```
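
To illustrate why the guard above counts unbuffered bytes rather than total bytes remaining, here is a toy model of a reader with a 64-bit lookahead. It is a sketch only: `ToyReader` and its fields are hypothetical and this is not bitter's implementation. The point it shows is that bytes already pulled into the lookahead still count toward `bytes_remaining`, but are no longer in the slice that an unchecked 8-byte load would touch.

```rust
/// Toy model of a bit reader with a 64-bit lookahead buffer.
/// Hypothetical: written for illustration, not bitter's implementation.
struct ToyReader<'a> {
    data: &'a [u8], // bytes not yet pulled into the lookahead
    lookahead: u64, // buffered bits; the next bit to read is bit 0
    bits: u32,      // number of valid bits in `lookahead`
}

impl<'a> ToyReader<'a> {
    fn new(data: &'a [u8]) -> Self {
        ToyReader { data, lookahead: 0, bits: 0 }
    }

    /// Checked refill: pull in whole bytes while they still fit.
    fn refill(&mut self) {
        while self.bits <= 56 {
            match self.data.split_first() {
                Some((&byte, rest)) => {
                    self.lookahead |= u64::from(byte) << self.bits;
                    self.bits += 8;
                    self.data = rest;
                }
                None => break,
            }
        }
    }

    fn read_bit(&mut self) -> Option<bool> {
        if self.bits == 0 {
            self.refill();
        }
        if self.bits == 0 {
            return None;
        }
        let bit = self.lookahead & 1 == 1;
        self.lookahead >>= 1;
        self.bits -= 1;
        Some(bit)
    }

    /// Whole bytes left, counting those already in the lookahead.
    fn bytes_remaining(&self) -> usize {
        self.data.len() + (self.bits / 8) as usize
    }

    /// Bytes still in the slice, excluding the lookahead.
    fn unbuffered_bytes_remaining(&self) -> usize {
        self.data.len()
    }
}
```

On the two-byte input `[0xff, 0x04]`, this model reports 2 and 2 before any read, then 1 whole byte remaining and 0 unbuffered bytes after a single `read_bit`, which is the same shape as the assertions this commit adds to the tests.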

80 changes: 69 additions & 11 deletions src/lib.rs
@@ -97,26 +97,24 @@ bits.consume(hi_len);
assert_eq!(expected, (hi << lo_len) + lo);
```
The above is not an endorsement of the best way to simulate larger reads in Manual mode. For instance, it may be better to drain the lookahead first, or use `MAX_READ_BITS` to calculate `lo` instead of querying `lookahead_bits`. Always profile for your environment.
The above is not an endorsement of the best way to simulate larger reads in Manual mode. For instance, it may be better to drain the lookahead first, or use `MAX_READ_BITS` to calculate `lo` instead of querying `lookahead_bits`. Always profile for your environment.
## Unchecked mode
There's one final trick that bitter exposes that dials performance to 11 at the cost of safety and increased assumptions. Welcome to the unchecked refill API (referred to as "unchecked"), which can only be called when there are at least 8 bytes left in the buffer. Anything less than that can cause invalid memory access. The upside is that this API unlocks the holy grail of branchless bit reading.
Always consider guarding unchecked access at a higher level:
Always guard unchecked access at a higher level:
```rust
use bitter::{BitReader, LittleEndianReader};
use bitter::{BitReader, LittleEndianReader, MAX_READ_BITS};
let mut bits = LittleEndianReader::new(&[0u8; 100]);
let objects_to_read = 10;
let object_bits = 56;
let bitter_padding = 64;
let desired_bits = objects_to_read * object_bits;
let bytes_needed = (desired_bits as f64 / 8.0).ceil();
// make sure we have enough data to read all our objects and there is enough
// data leftover so bitter can unalign read 8 bytes without fear of reading past
// the end of the buffer.
if bits.has_bits_remaining(objects_to_read * object_bits + bitter_padding) {
if bits.unbuffered_bytes_remaining() >= bytes_needed as usize {
for _ in 0..objects_to_read {
unsafe { bits.refill_lookahead_unchecked() };
let _field1 = bits.peek(2);
@@ -131,6 +129,13 @@ if bits.has_bits_remaining(objects_to_read * object_bits + bitter_padding) {
let _field4 = bits.peek(18);
bits.consume(18);
}
} else if bits.has_bits_remaining(desired_bits) {
// We have enough bits to read all the objects, just not
// enough to call the unchecked lookahead API every time.
assert!(false);
} else {
// Not enough data.
assert!(false);
}
```
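
The example above rounds bits up to bytes with `(desired_bits as f64 / 8.0).ceil()`. As an aside, the same rounding can be done in integer arithmetic. The helper below is hypothetical (it is not part of bitter) and is only a sketch of that calculation.

```rust
/// Hypothetical helper: whole bytes needed to hold `bits` bits.
/// Divides first, then adds one for any leftover bits, so it cannot
/// overflow even for very large inputs.
fn bytes_for_bits(bits: usize) -> usize {
    bits / 8 + usize::from(bits % 8 != 0)
}
```

For the 10 objects of 56 bits used above, both approaches give 70 bytes.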
@@ -313,6 +318,18 @@ pub trait BitReader {
/// ```
fn bytes_remaining(&self) -> usize;

/// Returns how many bytes are still left in the passed in buffer.
///
/// How many bytes remain in the original buffer is typically an
/// implementation detail, and one should prefer
/// [`BitReader::bytes_remaining`], which includes bytes in the lookahead
/// buffer.
///
/// However, the bitter unchecked API,
/// [`BitReader::refill_lookahead_unchecked`], requires this same
/// calculation to avoid undefined behavior.
fn unbuffered_bytes_remaining(&self) -> usize;

/// Returns the exact number of bits remaining in the bitstream if the
/// number of bits can fit within a `usize`. For large byte slices,
/// calculating the number of bits can cause an overflow, hence an `Option`
@@ -412,15 +429,37 @@ pub trait BitReader {
/// of data remains unread.
fn lookahead_bits(&self) -> u32;

/// Refills the buffer without bounds checking
/// Refills the lookahead buffer without bounds checking
///
/// Guard any usage with [`BitReader::has_bits_remaining`]
/// After calling, the lookahead buffer is guaranteed to hold between
/// [`MAX_READ_BITS`] and 64 bits available to read.
///
/// # Safety
///
/// This function assumes that there are at least 8 bytes left in the data
/// This function assumes that there are at least 8 bytes left unbuffered
/// for an unaligned read. It is undefined behavior if there are fewer than 8
/// bytes remaining.
///
/// Guard all usages with [`BitReader::unbuffered_bytes_remaining`]
///
/// ```rust
/// # use bitter::{LittleEndianReader, BitReader};
/// let mut bits = LittleEndianReader::new(&[0u8; 100]);
/// let objects_to_read = 7;
/// let object_bits = 39;
/// let desired_bits = objects_to_read * object_bits;
/// let bytes_needed = (desired_bits as f64 / 8.0).ceil();
/// if bits.unbuffered_bytes_remaining() >= bytes_needed as usize {
/// for _ in 0..objects_to_read {
/// unsafe { bits.refill_lookahead_unchecked() };
/// let _field1 = bits.peek(10);
/// bits.consume(10);
///
/// let _field2 = bits.peek(29);
/// bits.consume(29);
/// }
/// }
/// ```
unsafe fn refill_lookahead_unchecked(&mut self);

/// Returns true if the reader is not partway through a byte
@@ -694,6 +733,11 @@ impl<'a, const LE: bool> BitReader for BitterState<'a, LE> {
self.bytes_remaining()
}

#[inline]
fn unbuffered_bytes_remaining(&self) -> usize {
self.unbuffered_bytes()
}

#[inline]
fn bits_remaining(&self) -> Option<usize> {
self.bits_remaining()
@@ -935,6 +979,11 @@ impl<'a> BitReader for LittleEndianReader<'a> {
self.0.bytes_remaining()
}

#[inline]
fn unbuffered_bytes_remaining(&self) -> usize {
self.0.unbuffered_bytes()
}

#[inline]
fn bits_remaining(&self) -> Option<usize> {
self.0.bits_remaining()
@@ -1075,6 +1124,11 @@ impl<'a> BitReader for BigEndianReader<'a> {
self.0.bytes_remaining()
}

#[inline]
fn unbuffered_bytes_remaining(&self) -> usize {
self.0.unbuffered_bytes()
}

#[inline]
fn bits_remaining(&self) -> Option<usize> {
self.0.bits_remaining()
@@ -1441,9 +1495,11 @@ mod tests {
let mut bits = LittleEndianReader::new(&[0xff, 0x04]);
assert!(!bits.is_empty());
assert_eq!(bits.bytes_remaining(), 2);
assert_eq!(bits.unbuffered_bytes_remaining(), 2);
assert!(bits.read_bit().is_some());
assert!(!bits.is_empty());
assert_eq!(bits.bytes_remaining(), 1);
assert_eq!(bits.unbuffered_bytes_remaining(), 0);
assert!(bits.read_bits(6).is_some());
assert!(!bits.is_empty());
assert_eq!(bits.bytes_remaining(), 1);
@@ -1530,7 +1586,9 @@ mod tests {
fn test_bytes_remaining() {
let mut bits = LittleEndianReader::new(&[0xff, 0x04]);
assert_eq!(bits.bytes_remaining(), 2);
assert_eq!(bits.unbuffered_bytes_remaining(), 2);
assert_eq!(bits.read_bit(), Some(true));
assert_eq!(bits.unbuffered_bytes_remaining(), 0);
assert_eq!(bits.bytes_remaining(), 1);
assert_eq!(bits.read_u8(), Some(0x7f));
assert!(bits.has_bits_remaining(7));