-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor!: parsing, revisit short option groups, add support for combined short and value #75
Conversation
- remove now unused experimental code - split isLongOption - rename routines and files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to add “exports”, if we want the utils not to be part of the public api.
Co-authored-by: Jordan Harband <ljharb@gmail.com>
Co-authored-by: Jordan Harband <ljharb@gmail.com>
Co-authored-by: Jordan Harband <ljharb@gmail.com>
I was neglecting the exports from the shim point of view, but makes good sense from the package point of view. Done. Thanks. |
|
Co-authored-by: Aaron Casanova <32409546+aaronccasanova@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do note that adding "exports" is a breaking change, which pre-1.0 increments the second number :-)
Co-authored-by: Jordan Harband <ljharb@gmail.com>
Co-authored-by: Jordan Harband <ljharb@gmail.com>
Co-authored-by: Jordan Harband <ljharb@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work @shadowspawn! Lot to unpack here and didn't get through everything. Submitting this partial review to provide some initial feedback and introduce some potential discussion points/opportunities for clarification. Note: no comments are blocking, so feel free to resolve any of the callouts 👍
}); | ||
|
||
test('when combine string short with value like short option then parsed as value', (t) => { | ||
const passedArgs = ['-a-b']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question (non-blocking): I'm not too familiar with this pattern (e.g. lack of ' '
or '='
delimiters between short option values). Is this common for short options/backed by any guidelines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are from the hard-core Open Group guidelines, which does not even mention long options. But I find useful to understand the origins and underpinnings of many command-line parsing details.
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_01
(Emphasis on last sentence added by me.)
12.1 Utility Argument Syntax
...
2. Option-arguments are shown separated from their options by characters, except when the option-argument is enclosed in the '[' and ']' notation to indicate that it is optional. This reflects the situation in which an optional option-argument (if present) is included within the same argument string as the option; for a mandatory option-argument, it is the next argument. The Utility Syntax Guidelines in Utility Syntax Guidelines require that the option be a separate argument from its option-argument and that option-arguments not be optional, but there are some exceptions in POSIX.1-2017 to ensure continued operation of historical applications:a. If the SYNOPSIS of a standard utility shows an option with a mandatory option-argument (as with [ -c option_argument] in the example), a conforming application shall use separate arguments for that option and its option-argument. However, a conforming implementation shall also permit applications to specify the option and option-argument in the same argument string without intervening characters.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/getopt.html
The description has been written to make it clear that getopt(), like the getopts utility, deals with option-arguments whether separated from the option by characters or not. Note that the requirements on getopt() and getopts are more stringent than the Utility Syntax Guidelines.
https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap12.html
SYNOPSIS Shows: -a arg
Conforming application uses: -a arg
System supports: -a arg
and -aarg
Non-conforming applications may use: -aarg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very helpful reference! And reading further answered another question I had regarding what characters are allowed in option arguments:
Guideline 7 allows any string to be an option-argument; an option-argument can begin with any character, can be
-
or--
, and can be an empty string. For example, the commandspr -h -
,pr -h --
,pr -h -d
,pr -h +2
, andpr -h "
contain the option-arguments-
,--
,-d
,+2
, and an empty string, respectively. Conversely, the commandpr -h -- -d
treats-d
as an option, not as an argument, because the--
is an option-argument here, not a delimiter.
Almost makes me wonder if we should always capture the nextArg
if options[longOption].type === 'string'
. Perhaps we shouldn't open that can of worms 😅 I'm quite satisfied with the current behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed, glad you mentioned it! I have actually been thinking about this and doing some research this week. I opened #76 partly to provide a context for revisiting whether options are "greedy", and was leaning towards opening a new issue about that. Now you have raised it, I definitely will!
Before I was clear on the implications I had opened #25 which covers this ground too, but much of the discussion there is about what to store in strict:false
mode if the value is missing, and not deep coverage of greedy vs non-greedy.
I'm ok with shipping either way, but would like people to be clearer on where the behaviour stands in comparison with other implementations and why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would like people to be clearer on where the behavior stands in comparison with other implementations and why.
Yes! I think it would be super helpful to have a clear understanding/consensus on what standards and implementations we follow and/or influence the parseArgs
API.
test/store-user-intent.js
Outdated
test('when use boolean long option used as string then result as if string', (t) => { | ||
const passedArgs = ['--bool=OOPS']; | ||
const stringOptions = { bool: { short: 'b', type: 'string' } }; | ||
const booleanOptions = { bool: { short: 'b', type: 'boolean' } }; | ||
|
||
const stringConfigResult = parseArgs({ args: passedArgs, options: stringOptions, strict: false }); | ||
const booleanConfigResult = parseArgs({ args: passedArgs, options: booleanOptions, strict: false }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought (non-blocking): I'm surprised by the results of the boolean
assertion. If I the author explicitly say --bool
is type: 'boolean'
, I don't understand why we would capture values from incorrect usage by users.
// node user-intent.js --bool=OOPS --more-flags...
const result = parseArgs({ options: { bool: { type: 'boolean' } } })
for (const [longOption, optionValue] of Object.entries(result.options)) {
if (typeof optionValue === 'string') // process string options
if (typeof optionValue === 'boolean') // process boolean options
}
☝️ This is an incredibly contrived example, but if my intent as the author was to do one thing with string
s and another with boolean
s it would break due to incorrect argument usage.
Note: This makes total sense to me if there was no configuration options, but if provided, I would think you're saying this is the argument contract my program accepts and relies on internally. If your program allows boolean-like
options to accept values, you would simply set the type
to string
and account for that in your implementation. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% if i configure an arg to be a boolean, it should throw if it's not one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have strict:false
and strict:true
.
In strict:true
mode, the "wrong" usage will throw (well, hopefully, soon!). Note, these tests use strict:false
in anticipation of that arriving.
In strict:false
mode we are doing best effort parsing no matter the input. We don't want to throw away information which might be needed for the author to make use of. We hope they pay some attention...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understood the test was for strict: false
and that we would like to capture user intent without throwing away information. However, that seems more applicable to the zero-config behavior to me. If I configure a CLI to accept specific input, I should be confident the args parser will return the type I defined.
TS Playground
I took a shot at implementing some TypeScript definitions for this and it's not feeling very intuitive imo.
Notice bar
and baz
are configured as type: boolean
, yet the results can be a string
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again this is non-blocking, but think this could lead to less than ideal implementations that require checking if type: 'boolean'
options are actually boolean
s. As well as convoluted TypeScript definitions that have different parse args results based on the strict
mode config. Honestly, this isn't a major issue and really just poking at possible holes in the implementation as we get closer to the Node release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding TypeScript, you said:
// Notice
bar
andbaz
are configured astype: boolean
, yet the results can be astring
.
The converse should apply to foo
too, it is configured as type: string
, yet the result can be true
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- What behavior do you have in mind for mis-using configured types? (Throwing, despite being strict:false?)
Not throw, I was thinking we just return a boolean
regardless if it's misconfigured.
// node misconfigured.js --foo=bar
const parsed = parseArgs({ strict: false, options: { foo: { type: 'boolean' } } })
parsed.values.foo // true
The input is already flawed, so might as well return the type I expect (as opposed to a completely different type string
). Missing --foo
would still be undefined
. Thoughts?
and for interest Minimist returns empty string for a configured "string" option which is misused without a value
Not against this either tbh, but don't have strong opinions. I'm more "concerned" about parseArgs
returning different types than the author defined.
- A high level question is why do you want to use
strict:false
mode at all?
I will likely always use strict:true
and haven't put much thought into what use cases folks have for strict:false
. Would be curious to hear your thoughts here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The converse should apply to
foo
too, it is configured astype: string
, yet the result can betrue
.
Oh really!? Maybe I do prefer the Minimist approach. i.e. setting an empty string
for a misconfigured type: string
It would be great if authors didn't have to check the typeof
before using prototype methods for a defined option type: e.g.
// node misconfigured.js --foo
const parsed = parseArgs({ strict: false, options: { foo: { type: 'string' } } })
if (parsed.values.foo?.includes('substring')) // Crashes the program
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if authors didn't have to check the typeof before using prototype methods for a defined option type
Without a throw, coercing the type means the program does not crash, but does not ensure safe and intended behaviour. In that sense crashing is arguably better behaviour than silently doing the wrong thing. Two examples:
// explode is configured as a boolean (but user mis-used)
$ node detonate.js --explode=false
// no-explode is configured as a boolean (but user typo)
$ node detonate.js --no-expode foo bar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great examples! Definitely demonstrates the severity and potential impact of misconfigured options. So I'm realizing strict:false
is probably more suited for bring your own validation and making authors responsible for robustness. Whereas, strict:true
allows authors to rely on parseArgs
for option validation and throwing errors for misuse.
Thanks @aaronccasanova , you have picked up several errata in the comments and some good suggestions and questions. |
Co-authored-by: Aaron Casanova <32409546+aaronccasanova@users.noreply.github.com>
Co-authored-by: Aaron Casanova <32409546+aaronccasanova@users.noreply.github.com>
Co-authored-by: Aaron Casanova <32409546+aaronccasanova@users.noreply.github.com>
Thanks for reviews and comments @ljharb and @Eomm and @aaronccasanova . I'll wait for @bcoe to at least have a quick look before I consider merging. Further comments welcome from any gentle readers. |
Waiting no longer. 😄 |
This incorporates ideas from #64, #68, and #69. Background on short options in #2.
Refactor parsing to use independent blocks of code, rather than nested cascading context. This makes it easier to reason about the behaviour.
Split out small pieces of logic to named routines to improve readability, and allow extra documentation and examples without cluttering the parsing. (Thanks to @aaronccasanova for inspiration.)
Existing tests untouched to make it clear that the tested functionality has not changed.
Be more explicit about short option group expansion, and ready to throw error in strict mode for string option in the middle of the argument. (See Is
strict
actually a goal? #11 and feat: Add strict mode to parser #74.)Add support for short option combined with value (without intervening
=
). This is what Commander and Open Group Utility Conventions do, but is not what Yargs does. I don't want to block PR on this and happy to comment it out for further discussion if needed. (I have found some interesting variations in the wild.) [Edit: see also Syntax for combined short option and value #78]Add support for multiple unit tests files. Expand tests from 33 to 113, but many for internal routines rather than testing exposed API.
Added
.editorconfig
file, mainly for my own convenience!