Fix linter errors #29

GreyCat · 2024-03-28T22:15:28Z

After upgrade of Go / golangci-lint, a few new errors popped up. This PR attempts to fix them.

Disable unsupported linters
Fix problems with line lengths
Fix error comparison with modern one using errors.Is()
Wrap errors to fix some linter problems
kaitai/stream.go: fix rest of wrapping errors
kaitai/util.go: fix error wrapping problems
kaitai/writer.go: fix wrapping problems
kaitai/stream.go: fix traceless messages for invalid size requested
Rewrite TestStream_ReadBitsIntBe + TestStream_ReadBitsIntLe to avoid duplication
Remove unused/obsolete functions and tests for them: ReadStrEOS, ReadStrByteLimit, ReadBitsArray

generalmimon · 2024-03-28T22:40:20Z

I know I was the one who started writing the .golangci.yml config file this way, but don't you think it would be better to enable-all: true and specifically disable these we don't find useful in our code? The advantage of this approach is that we would always get new useful linters right upon updating to the new version of golangci-lint, and granted, there could also be some new linters we don't want (so would add these to the disable list in a following commit), but at least each linter would get a chance to show itself without us having to regularly check the current list of linters :)

GreyCat · 2024-03-29T10:49:37Z

@generalmimon To be honest, I don't have a strong opinion. We can do this either way.

<rant mode>

So far, from what little I know about Go, experience with linting and best practices in Go seems to be very edgy to me. For example, I understand why one might want to avoid dynamically constructed error messages, e.g:

                if bytesNeeded > 8 {
                        return res, fmt.Errorf("ReadBitsIntBe(%d): more than 8 bytes requested", n)
                }

As far as I can tell, the only alternative solution which passes lint would be:

var ErrMoreThanEightBytesRequested = errors.New("more than 8 bytes requested")

// ...

if bytesNeeded > 8 {
        return nil, fmt.Errorf("ReadBitsIntBe(%d): %w", bytesNeeded, ErrMoreThanEightBytesRequested)
}

However, it's still dynamically constructed string, just for some reason dynamically constructed string with an integer is not ok, but adding %w into the mix magically "fixes" the problem.

Another example that I consider bad would be these changes:

@@ -95,7 +95,7 @@ func (k *Stream) ReadU1() (v uint8, err error) {
 // ReadU2be reads 2 bytes in big-endian order and returns those as uint16.
 func (k *Stream) ReadU2be() (v uint16, err error) {
        if _, err = k.Read(k.buf[:2]); err != nil {
-               return 0, err
+               return 0, fmt.Errorf("ReadU2be: error reading 2 bytes: %w", err)
        }
        return binary.BigEndian.Uint16(k.buf[:2]), nil
 }
@@ -103,7 +103,7 @@ func (k *Stream) ReadU2be() (v uint16, err error) {
 // ReadU4be reads 4 bytes in big-endian order and returns those as uint32.
 func (k *Stream) ReadU4be() (v uint32, err error) {
        if _, err = k.Read(k.buf[:4]); err != nil {
-               return 0, err
+               return 0, fmt.Errorf("ReadU4be: error reading 4 bytes: %w", err)
        }
        return binary.BigEndian.Uint32(k.buf[:4]), nil
 }

This seems to be rather stupid, error-prone and repetitive to me, but it makes linter happy. Instead of just having machinery to do stack trace automatically, it looks like we aim to construct poor man's alternative of a stack manually, risking tons of repetitive copy-paste errors.

</rant mode>

With all that, I kind of see less and less value in linting in go.

generalmimon · 2024-03-29T13:07:59Z

However, it's still dynamically constructed string, just for some reason dynamically constructed string with an integer is not ok, but adding %w into the mix magically "fixes" the problem.

With the little I know about Go, I believe this makes some sense, it's explained in "Go Wiki: Error Values: Frequently Asked Questions":

I am already using fmt.Errorf with %v or %s to provide context for an error. When should I switch to %w?

It’s common to see code like
if err := frob(thing); err != nil {
    return fmt.Errorf("while frobbing: %v", err)
}
With the new error features, that code continues to work exactly as before, constructing a string that includes the text of err. Changing from %v to %w doesn’t change that string, but it does wrap err, allowing the caller to access it using errors.Unwrap, errors.Is or errors.As.

It seems that %w is indeed a magical format specifier that not only interpolates the error message, but creates a new error object with the original err wrapped inside. The difference is that the caller doesn't get a brand new error object each time (where the only option to tell different errors apart is to check the text error message) but can actually use errors.Is to check if it's caused by ErrMoreThanEightBytesRequested (which of course must be part of the public API of the library, otherwise it's useless).

Instead of just having machinery to do stack trace automatically, it looks like we aim to construct poor man's alternative of a stack manually, risking tons of repetitive copy-paste errors.

Welcome to Go with the "errors are values" philosophy :) I get the point of the linter that if you merely return err, you're literally passing the same I/O error object from the Go stdlib unchanged, so any kind of stack trace information that this I/O error originates from our ReadU2be/ReadU4be/... function is lost.

Another example that I consider bad would be these changes:

@@ -95,7 +95,7 @@ func (k *Stream) ReadU1() (v uint8, err error) {
 // ReadU2be reads 2 bytes in big-endian order and returns those as uint16.
 func (k *Stream) ReadU2be() (v uint16, err error) {
        if _, err = k.Read(k.buf[:2]); err != nil {
-               return 0, err
+               return 0, fmt.Errorf("ReadU2be: error reading 2 bytes: %w", err)
        }
        return binary.BigEndian.Uint16(k.buf[:2]), nil
 }
@@ -103,7 +103,7 @@ func (k *Stream) ReadU2be() (v uint16, err error) {
 // ReadU4be reads 4 bytes in big-endian order and returns those as uint32.
 func (k *Stream) ReadU4be() (v uint32, err error) {
        if _, err = k.Read(k.buf[:4]); err != nil {
-               return 0, err
+               return 0, fmt.Errorf("ReadU4be: error reading 4 bytes: %w", err)
        }
        return binary.BigEndian.Uint32(k.buf[:4]), nil
 }

This seems to be rather stupid, error-prone and repetitive to me, but it makes linter happy.

I agree this looks stupid, but unfortunately, at least the Go standard library doesn't really support stack traces, so something along these lines seems to be the recommended way in Go. I'd be happy to be corrected by someone who knows Go better, but it seems like it's either this or no stack traces at all (which in case of EOF errors means that the caller gets just &errors.errorString{s:"EOF"}, which I don't think is very helpful).

…duplication

GreyCat · 2024-03-29T15:51:14Z

@generalmimon Thanks for explanation, I think I understand this better now and I'm more confident that we're moving in the right direction then.

I believe this set of commits should fix at least all breaking errors in lint now. The only ones left is related to functions which, as far as I understand, are not used at all (i.e. ksc has knows nothing about them):

func (k *Stream) ReadStrEOS(encoding string) (string, error)
func (k *Stream) ReadStrByteLimit(limit int, encoding string) (string, error)
func (k *Stream) ReadBitsArray(n uint) error

Any objections if I will just delete them?

generalmimon · 2024-03-29T15:55:50Z

@GreyCat:

Any objections if I will just delete them?

Go ahead, I think they should have been deleted long ago...

…StrByteLimit, ReadBitsArray

generalmimon · 2024-03-29T21:57:51Z

kaitai/util.go

-		return "", e
+	d, err := ioutil.ReadAll(o)
+	if err != nil {
+		return "", fmt.Errorf("BytesToStr: error reading all: %w", err)


I don't like the reading all in the message, it doesn't capture the essence of the error. I suggest something like error decoding bytes, and it turns out we can sometimes reveal which encoding was used by printing the type of decoder.Transformer, so we can include it in the message too:

Suggested change

return "", fmt.Errorf("BytesToStr: error reading all: %w", err)

return "", fmt.Errorf("BytesToStr: error decoding bytes with %T: %w", decoder.Transformer, err)

For example (https://go.dev/play/p/Ljcbzu8EnSw):

package main import ( "bytes" "fmt" "io/ioutil" "golang.org/x/text/encoding" "golang.org/x/text/encoding/unicode" "golang.org/x/text/transform" ) func BytesToStr(in []byte, decoder *encoding.Decoder) (string, error) { i := bytes.NewReader(in) o := transform.NewReader(i, decoder) d, err := ioutil.ReadAll(o) if err != nil { return "", fmt.Errorf("BytesToStr: error decoding bytes with %T: %w", decoder.Transformer, err) } return string(d), nil } func main() { dec := unicode.UTF16(unicode.BigEndian, unicode.ExpectBOM).NewDecoder() s, err := BytesToStr([]uint8{ /* 0xfe, 0xff, */ 0x30, 0x53, 0x30, 0x93}, dec) fmt.Printf("%#v\n", s) fmt.Printf("%v\n", err) }

Output:

"" BytesToStr: error decoding bytes with *unicode.utf16Decoder: encoding: missing byte order mark

(The type of decoder.Transformer isn't always useful - for example, for charmap.CodePage437 from "golang.org/x/text/encoding/charmap", it prints only charmap.charmapDecoder. On the other hand, I'm not sure whether the charmap encodings can fail, they seem to solve any issue with the U+FFFD replacement character.)

Sounds good, let me commit this separately!

See #29 (comment) The idea is that this approach should be easier to maintain in the long run. CI will still be reproducible because we declare a specific `golangci-lint` version in `.github/workflows/go.yml` (`version: v1.58` at the moment), so we'll get a constant set of linters as long as the `golangci-lint` version stays the same. But at the same time, when we bump this `golangci-lint` version, we'll get all the new linters by default without having to change the `.golangci.yml` configuration. In contrast, with the existing `disable-all: true` + `enable` approach, we're not using any of the new linters until we manually review the latest list of linters that `golangci-lint` supports and update `.golangci.yml` accordingly (which is tedious). Until that is done, we're probably missing out on some useful linters that could find other issues in our code. Of course, it's possible that we won't find all new linters useful, but we can simply check the `golangci-lint` output after bumping its version, see which linters we don't care about and then update the `disable` list accordingly.

generalmimon approved these changes Mar 28, 2024

View reviewed changes

GreyCat force-pushed the fix_linter branch from 6cda535 to c7307aa Compare March 29, 2024 10:55

GreyCat added 5 commits March 29, 2024 12:14

Disable unsupported linters

79ac9dd

Fix problems with line lengths

3f6a6fc

Fix error comparison with modern one using errors.Is()

6727c5b

Wrap errors to fix some linter problems

d32b0ae

kaitai/stream.go: fix rest of wrapping errors

736aca6

GreyCat force-pushed the fix_linter branch from c7307aa to 736aca6 Compare March 29, 2024 12:27

kaitai/util.go: fix error wrapping problems

6ab5964

GreyCat added 3 commits March 29, 2024 14:43

kaitai/writer.go: fix wrapping problems

fce5b7c

kaitai/stream.go: fix traceless messages for invalid size requested

90ce202

Rewrite TestStream_ReadBitsIntBe + TestStream_ReadBitsIntLe to avoid …

b7fec8e

…duplication

Remove unused/obsolete functions and tests for them: ReadStrEOS, Read…

5e3c2c2

…StrByteLimit, ReadBitsArray

GreyCat merged commit 296c155 into master Mar 29, 2024
2 checks passed

GreyCat deleted the fix_linter branch March 29, 2024 19:26

generalmimon reviewed Mar 29, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix linter errors #29

Fix linter errors #29

GreyCat commented Mar 28, 2024 •

edited

Loading

generalmimon commented Mar 28, 2024

GreyCat commented Mar 29, 2024

generalmimon commented Mar 29, 2024 •

edited

Loading

I am already using `fmt.Errorf` with `%v` or `%s` to provide context for an error. When should I switch to `%w`?

GreyCat commented Mar 29, 2024

generalmimon commented Mar 29, 2024

generalmimon Mar 29, 2024

GreyCat Mar 29, 2024

	return "", fmt.Errorf("BytesToStr: error reading all: %w", err)
	return "", fmt.Errorf("BytesToStr: error decoding bytes with %T: %w", decoder.Transformer, err)

Fix linter errors #29

Fix linter errors #29

Conversation

GreyCat commented Mar 28, 2024 • edited Loading

generalmimon commented Mar 28, 2024

GreyCat commented Mar 29, 2024

generalmimon commented Mar 29, 2024 • edited Loading

I am already using fmt.Errorf with %v or %s to provide context for an error. When should I switch to %w?

GreyCat commented Mar 29, 2024

generalmimon commented Mar 29, 2024

generalmimon Mar 29, 2024

Choose a reason for hiding this comment

GreyCat Mar 29, 2024

Choose a reason for hiding this comment

GreyCat commented Mar 28, 2024 •

edited

Loading

generalmimon commented Mar 29, 2024 •

edited

Loading

I am already using `fmt.Errorf` with `%v` or `%s` to provide context for an error. When should I switch to `%w`?