Translate non-UTF-8 byte sequences into replacement characters #60

Closed
wants to merge 7 commits into from
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,20 @@
CHANGELOG
=========

## 0.9.0

- Invalid UTF-8 sequences are now translated into replacement characters
in a manner consistent with Rust's `String::from_utf8_lossy` and the resolution to
["How many replacement characters?"](https://hsivonen.fi/broken-utf-8/).
- Add a `Parser::end` function allowing users to mark the end of a stream,
so that an incomplete UTF-8 encoding at the end of the stream can be
reported.
- Remove 8-bit C1 support. 8-bit C1 codes are now interpreted as UTF-8
continuation bytes.
- DCS/SOS/PM/APC recognition may now be disabled with the
`Parser::set_dcs_pm_apc` function. This is useful for UTF-8-only environments
where the 8-bit ST terminator is invalid.
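
For reference, the standard-library behavior that the first entry points to can be observed directly. The following is a minimal sketch that uses only `String::from_utf8_lossy`, not this crate's parser; the helper names `lossy` and `replacement_count` are hypothetical and exist only for illustration.

```rust
/// Decode bytes the way `String::from_utf8_lossy` does, returning an owned String.
/// (Hypothetical helper for illustration; not part of vte.)
fn lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Count how many U+FFFD replacement characters the lossy decoding produces.
/// (Hypothetical helper for illustration; not part of vte.)
fn replacement_count(bytes: &[u8]) -> usize {
    lossy(bytes).chars().filter(|c| *c == '\u{FFFD}').count()
}
```

Under this scheme, a truncated but otherwise valid prefix such as `b"a\xF0\x9Fz"` should yield a single replacement character between `a` and `z`, while `b"\xC0\x80"` should yield two, because `0xC0` can never begin a valid sequence and `0x80` is then a stray continuation byte. An incomplete sequence at the very end of the input, such as `b"\xE2\x82"`, also yields one replacement character, which is the case the `Parser::end` entry is meant to cover for streamed input.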

## 0.8.0

- Remove C1 ST support in OSCs, fixing OSCs with ST in the payload
12 changes: 7 additions & 5 deletions README.md
@@ -7,11 +7,13 @@ vte
Parser for implementing virtual terminal emulators in Rust.

The parser is implemented according to [Paul Williams' ANSI parser state
machine]. The state machine doesn't assign meaning to the parsed data and is
thus not itself sufficient for writing a terminal emulator. Instead, it is
expected that an implementation of the `Perform` trait which does something
useful with the parsed data. The `Parser` handles the book keeping, and the
`Perform` gets to simply handle actions.
machine], modified to work in terms of UTF-8 encodings.

The state machine doesn't assign meaning to the parsed data and is thus not
itself sufficient for writing a terminal emulator. Instead, it is expected that
an implementation of the `Perform` trait will do something useful with the
parsed data. The `Parser` handles the bookkeeping, and the `Perform`
implementation simply handles actions.

See the [docs] for more info.

5 changes: 4 additions & 1 deletion examples/parselog.rs
@@ -60,7 +60,10 @@ fn main() {

loop {
match handle.read(&mut buf) {
Ok(0) => break,
Ok(0) => {
statemachine.end(&mut performer);
break;
},
Ok(n) => {
for byte in &buf[..n] {
statemachine.advance(&mut performer, *byte);
2 changes: 1 addition & 1 deletion src/definitions.rs
@@ -37,7 +37,7 @@ pub enum Action {
EscDispatch = 4,
Execute = 5,
Hook = 6,
Ignore = 7,
CheckDcsSosPmApc = 7,
OscEnd = 8,
OscPut = 9,
OscStart = 10,