Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Significantly improve performance by using a buffered writer #3101

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

MoSal
Copy link

@MoSal MoSal commented Oct 9, 2024

Before

Benchmark 1: seq 10000000 | bat
  Time (mean ± σ):      6.235 s ±  0.052 s    [User: 3.664 s, System: 2.714 s]
  Range (min … max):    6.172 s …  6.355 s    10 runs

After

Benchmark 1: seq 10000000 | ./target/release/bat
  Time (mean ± σ):     215.9 ms ±   5.1 ms    [User: 275.4 ms, System: 38.8 ms]
  Range (min … max):   210.3 ms … 224.9 ms    10 runs

Using less for comparison

Benchmark 1: seq 10000000 | less
  Time (mean ± σ):     637.3 ms ±  43.3 ms    [User: 642.1 ms, System: 95.6 ms]
  Range (min … max):   584.5 ms … 700.1 ms    10 runs

And raw

Benchmark 1: seq 10000000
  Time (mean ± σ):      63.1 ms ±   1.3 ms    [User: 57.1 ms, System: 5.9 ms]
  Range (min … max):    62.1 ms …  66.0 ms    10 runs

Before

```
Benchmark 1: seq 10000000 | bat
  Time (mean ± σ):      6.235 s ±  0.052 s    [User: 3.664 s, System: 2.714 s]
  Range (min … max):    6.172 s …  6.355 s    10 runs
```

After

```
Benchmark 1: seq 10000000 | ./target/release/bat
  Time (mean ± σ):     215.9 ms ±   5.1 ms    [User: 275.4 ms, System: 38.8 ms]
  Range (min … max):   210.3 ms … 224.9 ms    10 runs
```

Using `less` for comparison

```
Benchmark 1: seq 10000000 | less
  Time (mean ± σ):     637.3 ms ±  43.3 ms    [User: 642.1 ms, System: 95.6 ms]
  Range (min … max):   584.5 ms … 700.1 ms    10 runs
```

And raw

```
Benchmark 1: seq 10000000
  Time (mean ± σ):      63.1 ms ±   1.3 ms    [User: 57.1 ms, System: 5.9 ms]
  Range (min … max):    62.1 ms …  66.0 ms    10 runs
```

Signed-off-by: Mohammad AlSaleh <CE.Mohammad.AlSaleh@gmail.com>
@Enselic
Copy link
Collaborator

Enselic commented Oct 25, 2024

Those numbers look very nice 👍

I skimmed over the diff which is quite big. Is it possible to split this up into smaller commits/PRs to make it easier to review/handle?

@sharkdp
Copy link
Owner

sharkdp commented Oct 25, 2024

Didn't have time to look into the code, but didn't we experiment with this a while ago and decided that we can't use a buffered writer, because we need to support interactive use cases where you run bat, enter a single line of input, see the output, enter the next line of input, see the output, …

let mut writer = match output_buffer {
Some(buf) => OutputHandle::FmtWrite(buf),
None => OutputHandle::IoWrite(output_type.handle()?),
None => {
OutputHandle::IoWrite(BufWriter::with_capacity(BUF_W_SZ, output_type.handle()?))
Copy link
Collaborator

@Enselic Enselic Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might very well be the case. If so, it would be good to add a comment e.g. here that explains that. So that anyone that gets the same idea will find the comment when they are about to change the code.

And/or add a regression test that tests this use case so that CI does not become green.

 bat supports interactive use by opening without file arguments or with
 `-`. This uses the terminal stdin for input.

 This won't work with a buffered writer if the writer is not flushed
 after each read line from input.

 So this commit enables such flushing behavior if stdin is detected as
 terminal input, while retaining the benefits of using a buffered writer
 otherwise.

 Note that `Input::from_stdin()` from the `pretty_printer` API will
 assume input is terminal to retain old (unbuffered) behavior while
 the API is kept the same.

Signed-off-by: Mohammad AlSaleh <CE.Mohammad.AlSaleh@gmail.com>
@MoSal
Copy link
Author

MoSal commented Oct 25, 2024

@sharkdp
Is it an acceptable solution to flush on each read line if stdin is detected as terminal?
I just pushed a commit that does just that.

@eth-p
Copy link
Collaborator

eth-p commented Oct 28, 2024

Is it an acceptable solution to flush on each read line if stdin is detected as terminal?

That wouldn't account for other programs piping to bat: sed 's/A/a/' </dev/stdin | bat

If it's possible to flush at the end of every input line regardless of the source, I would be happy with that compromise. We can always optimize it further in the future to add special exceptions for cases such as "the input source is a regular file."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants