Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix linter errors #29

Merged
merged 10 commits into from
Mar 29, 2024
Merged

Fix linter errors #29

merged 10 commits into from
Mar 29, 2024

Conversation

GreyCat
Copy link
Member

@GreyCat GreyCat commented Mar 28, 2024

After upgrade of Go / golangci-lint, a few new errors popped up. This PR attempts to fix them.

  • Disable unsupported linters
  • Fix problems with line lengths
  • Fix error comparison with modern one using errors.Is()
  • Wrap errors to fix some linter problems
  • kaitai/stream.go: fix rest of wrapping errors
  • kaitai/util.go: fix error wrapping problems
  • kaitai/writer.go: fix wrapping problems
  • kaitai/stream.go: fix traceless messages for invalid size requested
  • Rewrite TestStream_ReadBitsIntBe + TestStream_ReadBitsIntLe to avoid duplication
  • Remove unused/obsolete functions and tests for them: ReadStrEOS, ReadStrByteLimit, ReadBitsArray

@generalmimon
Copy link
Member

I know I was the one who started writing the .golangci.yml config file this way, but don't you think it would be better to enable-all: true and specifically disable these we don't find useful in our code? The advantage of this approach is that we would always get new useful linters right upon updating to the new version of golangci-lint, and granted, there could also be some new linters we don't want (so would add these to the disable list in a following commit), but at least each linter would get a chance to show itself without us having to regularly check the current list of linters :)

@GreyCat
Copy link
Member Author

GreyCat commented Mar 29, 2024

@generalmimon To be honest, I don't have a strong opinion. We can do this either way.

<rant mode>

So far, from what little I know about Go, experience with linting and best practices in Go seems to be very edgy to me. For example, I understand why one might want to avoid dynamically constructed error messages, e.g:

                if bytesNeeded > 8 {
                        return res, fmt.Errorf("ReadBitsIntBe(%d): more than 8 bytes requested", n)
                }

As far as I can tell, the only alternative solution which passes lint would be:

var ErrMoreThanEightBytesRequested = errors.New("more than 8 bytes requested")

// ...

if bytesNeeded > 8 {
        return nil, fmt.Errorf("ReadBitsIntBe(%d): %w", bytesNeeded, ErrMoreThanEightBytesRequested)
}

However, it's still dynamically constructed string, just for some reason dynamically constructed string with an integer is not ok, but adding %w into the mix magically "fixes" the problem.

Another example that I consider bad would be these changes:

@@ -95,7 +95,7 @@ func (k *Stream) ReadU1() (v uint8, err error) {
 // ReadU2be reads 2 bytes in big-endian order and returns those as uint16.
 func (k *Stream) ReadU2be() (v uint16, err error) {
        if _, err = k.Read(k.buf[:2]); err != nil {
-               return 0, err
+               return 0, fmt.Errorf("ReadU2be: error reading 2 bytes: %w", err)
        }
        return binary.BigEndian.Uint16(k.buf[:2]), nil
 }
@@ -103,7 +103,7 @@ func (k *Stream) ReadU2be() (v uint16, err error) {
 // ReadU4be reads 4 bytes in big-endian order and returns those as uint32.
 func (k *Stream) ReadU4be() (v uint32, err error) {
        if _, err = k.Read(k.buf[:4]); err != nil {
-               return 0, err
+               return 0, fmt.Errorf("ReadU4be: error reading 4 bytes: %w", err)
        }
        return binary.BigEndian.Uint32(k.buf[:4]), nil
 }

This seems to be rather stupid, error-prone and repetitive to me, but it makes linter happy. Instead of just having machinery to do stack trace automatically, it looks like we aim to construct poor man's alternative of a stack manually, risking tons of repetitive copy-paste errors.

</rant mode>

With all that, I kind of see less and less value in linting in go.

@generalmimon
Copy link
Member

generalmimon commented Mar 29, 2024

However, it's still dynamically constructed string, just for some reason dynamically constructed string with an integer is not ok, but adding %w into the mix magically "fixes" the problem.

With the little I know about Go, I believe this makes some sense, it's explained in "Go Wiki: Error Values: Frequently Asked Questions":

I am already using fmt.Errorf with %v or %s to provide context for an error. When should I switch to %w?

It’s common to see code like

if err := frob(thing); err != nil {
    return fmt.Errorf("while frobbing: %v", err)
}

With the new error features, that code continues to work exactly as before, constructing a string that includes the text of err. Changing from %v to %w doesn’t change that string, but it does wrap err, allowing the caller to access it using errors.Unwrap, errors.Is or errors.As.

It seems that %w is indeed a magical format specifier that not only interpolates the error message, but creates a new error object with the original err wrapped inside. The difference is that the caller doesn't get a brand new error object each time (where the only option to tell different errors apart is to check the text error message) but can actually use errors.Is to check if it's caused by ErrMoreThanEightBytesRequested (which of course must be part of the public API of the library, otherwise it's useless).

Instead of just having machinery to do stack trace automatically, it looks like we aim to construct poor man's alternative of a stack manually, risking tons of repetitive copy-paste errors.

Welcome to Go with the "errors are values" philosophy :) I get the point of the linter that if you merely return err, you're literally passing the same I/O error object from the Go stdlib unchanged, so any kind of stack trace information that this I/O error originates from our ReadU2be/ReadU4be/... function is lost.

Another example that I consider bad would be these changes:

@@ -95,7 +95,7 @@ func (k *Stream) ReadU1() (v uint8, err error) {
 // ReadU2be reads 2 bytes in big-endian order and returns those as uint16.
 func (k *Stream) ReadU2be() (v uint16, err error) {
        if _, err = k.Read(k.buf[:2]); err != nil {
-               return 0, err
+               return 0, fmt.Errorf("ReadU2be: error reading 2 bytes: %w", err)
        }
        return binary.BigEndian.Uint16(k.buf[:2]), nil
 }
@@ -103,7 +103,7 @@ func (k *Stream) ReadU2be() (v uint16, err error) {
 // ReadU4be reads 4 bytes in big-endian order and returns those as uint32.
 func (k *Stream) ReadU4be() (v uint32, err error) {
        if _, err = k.Read(k.buf[:4]); err != nil {
-               return 0, err
+               return 0, fmt.Errorf("ReadU4be: error reading 4 bytes: %w", err)
        }
        return binary.BigEndian.Uint32(k.buf[:4]), nil
 }

This seems to be rather stupid, error-prone and repetitive to me, but it makes linter happy.

I agree this looks stupid, but unfortunately, at least the Go standard library doesn't really support stack traces, so something along these lines seems to be the recommended way in Go. I'd be happy to be corrected by someone who knows Go better, but it seems like it's either this or no stack traces at all (which in case of EOF errors means that the caller gets just &errors.errorString{s:"EOF"}, which I don't think is very helpful).

@GreyCat
Copy link
Member Author

GreyCat commented Mar 29, 2024

@generalmimon Thanks for explanation, I think I understand this better now and I'm more confident that we're moving in the right direction then.

I believe this set of commits should fix at least all breaking errors in lint now. The only ones left is related to functions which, as far as I understand, are not used at all (i.e. ksc has knows nothing about them):

func (k *Stream) ReadStrEOS(encoding string) (string, error)
func (k *Stream) ReadStrByteLimit(limit int, encoding string) (string, error)
func (k *Stream) ReadBitsArray(n uint) error

Any objections if I will just delete them?

@generalmimon
Copy link
Member

@GreyCat:

Any objections if I will just delete them?

Go ahead, I think they should have been deleted long ago...

@GreyCat GreyCat merged commit 296c155 into master Mar 29, 2024
2 checks passed
@GreyCat GreyCat deleted the fix_linter branch March 29, 2024 19:26
return "", e
d, err := ioutil.ReadAll(o)
if err != nil {
return "", fmt.Errorf("BytesToStr: error reading all: %w", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the reading all in the message, it doesn't capture the essence of the error. I suggest something like error decoding bytes, and it turns out we can sometimes reveal which encoding was used by printing the type of decoder.Transformer, so we can include it in the message too:

Suggested change
return "", fmt.Errorf("BytesToStr: error reading all: %w", err)
return "", fmt.Errorf("BytesToStr: error decoding bytes with %T: %w", decoder.Transformer, err)

For example (https://go.dev/play/p/Ljcbzu8EnSw):

package main

import (
	"bytes"
	"fmt"
	"io/ioutil"

	"golang.org/x/text/encoding"
	"golang.org/x/text/encoding/unicode"
	"golang.org/x/text/transform"
)

func BytesToStr(in []byte, decoder *encoding.Decoder) (string, error) {
	i := bytes.NewReader(in)
	o := transform.NewReader(i, decoder)
	d, err := ioutil.ReadAll(o)
	if err != nil {
		return "", fmt.Errorf("BytesToStr: error decoding bytes with %T: %w", decoder.Transformer, err)
	}
	return string(d), nil
}

func main() {
	dec := unicode.UTF16(unicode.BigEndian, unicode.ExpectBOM).NewDecoder()
	s, err := BytesToStr([]uint8{ /* 0xfe, 0xff, */ 0x30, 0x53, 0x30, 0x93}, dec)
	fmt.Printf("%#v\n", s)
	fmt.Printf("%v\n", err)
}

Output:

""
BytesToStr: error decoding bytes with *unicode.utf16Decoder: encoding: missing byte order mark

(The type of decoder.Transformer isn't always useful - for example, for charmap.CodePage437 from "golang.org/x/text/encoding/charmap", it prints only charmap.charmapDecoder. On the other hand, I'm not sure whether the charmap encodings can fail, they seem to solve any issue with the U+FFFD replacement character.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, let me commit this separately!

generalmimon added a commit that referenced this pull request May 4, 2024
See
#29 (comment)

The idea is that this approach should be easier to maintain in the long
run. CI will still be reproducible because we declare a specific
`golangci-lint` version in `.github/workflows/go.yml` (`version: v1.58`
at the moment), so we'll get a constant set of linters as long as the
`golangci-lint` version stays the same. But at the same time, when we
bump this `golangci-lint` version, we'll get all the new linters by
default without having to change the `.golangci.yml` configuration.

In contrast, with the existing `disable-all: true` + `enable` approach,
we're not using any of the new linters until we manually review the
latest list of linters that `golangci-lint` supports and update
`.golangci.yml` accordingly (which is tedious). Until that is done,
we're probably missing out on some useful linters that could find other
issues in our code.

Of course, it's possible that we won't find all new linters useful, but
we can simply check the `golangci-lint` output after bumping its
version, see which linters we don't care about and then update the
`disable` list accordingly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants