[HN Gopher] Conventions for Command Line Options
___________________________________________________________________
Conventions for Command Line Options

Author : zdw
Score  : 89 points
Date   : 2020-08-01 15:02 UTC (7 hours ago)

(HTM) web link (nullprogram.com)
(TXT) w3m dump (nullprogram.com)

| mmphosis wrote:
| Do I add more to this code just for convention? The command line
| option parsing (or broken ParseOptions dependency) will become
| orders of magnitude larger and more complex than what the program
| does.
|
|     usage = 0
|     argc = len(sys.argv)
|     if argc == 2 and sys.argv[1] == "-r":
|         hex2bin()
|     else:
|         if argc == 2 and sys.argv[1].startswith('-w', 0, 2):
|             s = sys.argv[1][2::]
|         elif argc == 3 and sys.argv[1] == '-w':
|             s = sys.argv[2]
|         elif argc >= 2:
|             usage = 1
|         if usage == 0:
|             try:
|                 width = int(s)
|             except ValueError:
|                 print("Error: invalid, -w {}".format(s))
|                 usage = 1
|             except NameError:
|                 width = 40
|         if usage == 0:
|             bin2hex(width)
|         else:
|             print("usage: mondump [-r | -w width]")
|             print(" Convert binary to hex or do the reverse.")
|             print(" -r reverse operation: convert hex to binary.")
|             print(" -w maximum width: fit lines within width (default is 40.)")
|     sys.exit(usage)
|
| m463 wrote:
| I really do like argparse.
|
| It will cleanly do just about anything you need done, including
| nice stuff like long/short options, default values, required
| options, types like type=int, help for every option, and even
| complicated stuff like subcommands.
|
| And the namespace stuff is clever, so you can reference arg.debug
| instead of arg['debug']
|
| onei wrote:
| I always found argparse did argument parsing well enough, but it
| felt clunky when you need something more complicated like lots
| of subcommands. I find myself using it exclusively when I'm
| trying to avoid dependencies outside the Python standard
| library.
|
| My choice of argument parsing in Python is Click. It has next
| to no rough edges and it's a breath of fresh air compared to
| argparse.
| I recently recommended it to a colleague who fell in
| love with it with minimal persuasion from me. I recommend it
| highly.
|
| [1] https://click.palletsprojects.com/en/7.x/
|
| juped wrote:
| Try Typer (https://typer.tiangolo.com/) sometime, which is
| built on Click.
|
| onei wrote:
| At a glance it looks a bit too simple. I didn't look too far
| into it, but it seems to be missing short options,
| prompting for options and using environment variables for
| options. As it's built on Click, I'd guess you can call
| into Click to do those, but at that point I don't see a
| major benefit Typer is providing.
|
| juped wrote:
| Look, much of the purpose of posting things in a public
| forum is not to directly converse with the person to whom
| you are replying, but to add something relevant and
| interesting to third-party readers of the thread.
|
| It's up to you whether you're interested in the software
| package linked in the comment or not, but it is harmful
| to third-party readers who might potentially be
| interested in it when you inexplicably reply with
| unsubstantiated lies about it, which they might take as
| true.
|
| Noumenon72 wrote:
| A comment actually directed toward third-party readers
| would read like "For people who like Click but could also
| use X, try Typer." Using the second-person imperative to
| say 'Try this' challenges your interlocutor to ask "Why
| should I?", to evaluate it for their own use case, to say
| "You shouldn't have recommended this to me with no
| qualification, it's not for everyone."
|
| juped wrote:
| To be clear, not only is the package in question not
| "missing short options, prompting for options and using
| environment variables for options", there is not even the
| slightest indication on its docs site that it might be
| missing such features.
|
| I'm not harmed by the reckless indifference of that
| comment, because I know a decent amount about these
| libraries, and it's not my business if people want to
| self-harm by believing false things, but people scrolling
| the thread are potentially harmed ("damn, I like the
| syntax of this, but it doesn't support short options?").
|
| cb321 wrote:
| I feel like argh and plac preceded/inspired Click.
|
| Also, it's not Python, but in Nim there is
| https://github.com/c-blake/cligen which also does
| spellcheck/typo suggestions, and allows --kebab-case,
| --camelCase or --snake_case for long options, among other
| bells & whistles.
|
| onei wrote:
| Spell check is something I'd love to see in Click. As
| complicated as git can be, I always liked the spell check
| it has for subcommands.
|
| As for the different cases, I personally avoid using camel
| or snake case purely because I don't need to reach for the
| shift key. Maybe some people like it, but I find it
| jarring.
|
| cb321 wrote:
| Agreed vis-a-vis SHIFT.
|
| cligen also has a global config file [1] to control
| syntax options like whether '-c=val' requires an '=' (and
| if it should be '=' or ':', etc.), which has been
| discussed here, various colorization/help rendering
| controls, etc. Shell scripts can always export
| CLIGEN=/dev/null to force more strict and/or default
| modes.
|
| [1] https://github.com/c-blake/cligen/wiki/Example-Config-File
|
| alkonaut wrote:
| > program -iinput.txt -ooutput.txt
|
| What good is that? Who wants to save a space? Given -abcfoo.txt I
| can't tell whether it's -abcf oo.txt or -abc foo.txt. So that's a
| definite drawback, and the benefit is?
|
| ucarion wrote:
| Most cli parsers don't fully support the author's suggested model
| because it means you can't parse argv without knowing the flags
| in advance.
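
To make that dependence concrete, here is a minimal sketch using Python's standard `getopt` module. The option string `"abco:"` is an assumption chosen for illustration: it declares `-a`, `-b` and `-c` as plain flags and `-o` as an option that takes a value. Only with that declaration in hand can the parser tell where a cluster of short flags ends and an attached value begins.

```python
import getopt

# Assumed grammar, for illustration only:
# -a, -b, -c are plain flags; the colon marks -o as taking a value.
SHORT_OPTS = "abco:"

def parse(argv):
    """Parse a getopt-style argument list; return (options, positionals)."""
    return getopt.getopt(argv, SHORT_OPTS)

separate, _ = parse(["-abco", "output.txt"])
joined, _ = parse(["-abcooutput.txt"])

# Once the grammar is known, both spellings produce the same option list.
print(separate == joined)
print(joined)
```

If the colon after `o` were dropped from the option string, the same parser would read the joined spelling as a run of single-letter flags and reject it at the first unknown letter, which is the ambiguity being described.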
|
| For example, the author suggests that a cli parser should be able
| to understand these two as the same thing:
|
|     program -abco output.txt
|     program -abcooutput.txt
|
| That's only doable if by the time you're parsing argv, you
| already know `-a`, `-b`, and `-c` don't take a value, and `-o`
| does take a value.
|
| But this is a pain. All it gets you is the ability to save one
| space character, in exchange for a much more complex argv-parsing
| process. The `-abco output.txt` form can be parsed without such
| additional context, and is already a pretty nice user interface.
|
| For those of us who aren't working on ls(1), there's no shame in
| having a less-sexy but easier-to-build-and-debug cli interface.
|
| goto11 wrote:
| Why would someone ever want to use the second syntax? (Genuine
| question, not rhetorical!)
|
| It seems using = for optional option arguments is only allowed
| with long form. Why is that? Wouldn't it be nicer to be able to
| write "program -c=blue" rather than "program -cblue"?
|
| mehrdadn wrote:
| The second syntax is useful when you want to ask a program to
| pass an argument to another program. Having to pass 2 arguments
| is a lot more annoying since you'd often then have to prefix
| each one.
|
| saurik wrote:
| But every single command line parser I have ever used (and I
| have used many over the past 25 years in many programming
| languages) does in fact know before parsing that a, b, and c
| don't take a value and o does: accepting the grammar for the
| flags and then parsing them in this way is like the only job of
| the parser?
|
| rabidrat wrote:
| Some programs may allow for plugins which can have their own
| options. But you may want to provide an option for where to
| load the plugins from.
|
| TeMPOraL wrote:
| I suppose the only reason why you wouldn't want to have a
| centralized description of the command line grammar is if
| you're using flags which alter the set of acceptable flags.
| Like e.g.
| if you put all git subcommands into one single
| executable - the combinations and meanings of commandline
| flags would become absurdly complicated.
|
| rkagerer wrote:
| Nice to see these guidelines all laid out.
|
| I grew up with the DOS / Windows convention of slashes, and
| clicked the link hoping there'd be mention of it.
|
| ur-whale wrote:
| I wish he mentioned which C++ option parser sticks to the rules he
| outlined.
|
| daitangio wrote:
| For Python, the Click library is fantastic and follows all the
| good practices explained in the blog post.
|
| ur-whale wrote:
| I very much like the subcommand paradigm git implements:
|
|     mycmd <global args> subcommand <subcommand specific args>
|
| However, I haven't found a C++ library that implements this
| properly with an easy-to-use API (and what I mean by easy to use
| is: specifying the structure of cmd line options should be easy
| and natural, _and_ retrieving options specified by the user
| should be easy, _and_ the "synopsis" part of the man page should
| be auto-generated).
|
| If anyone knows of one, would love to hear about it.
|
| htfy96 wrote:
| Have you tried https://github.com/CLIUtils/CLI11 ? The
| subcommand example can be found at
| https://github.com/CLIUtils/CLI11/blob/master/examples/subco...
| . It can't generate man pages, though.
|
| exmadscientist wrote:
| Just, whatever you do, please please PLEASE _PLEASE_ support
| `--help`. I don't mean `-h` or `-help`, I mean `--help`.
|
| There's only one thing the long flag can possibly mean: give me
| the help text. I understand that your program might prefer the
| `-help` style, but many do not. And do you know how I figure out
| which style your program likes? That's right, I use `--help`. I
| have to use `--help` rather than just `-help` because of GNU
| userspace tools, among others. It seems unlikely they're going to
| suddenly clean up their act _this_ decade, so I have to default
| to starting with `--help`.
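
For what it's worth, a standard-library parser usually gives you this for free. The sketch below assumes a hypothetical tool called `program` with one made-up `--verbose` flag; Python's `argparse` registers both `-h` and `--help` by default, prints the generated help text, and exits with status 0 for either spelling.

```python
import argparse

# Hypothetical tool and option, for illustration only.
parser = argparse.ArgumentParser(prog="program", description="Hypothetical demo tool.")
parser.add_argument("-v", "--verbose", action="store_true", help="print more detail")

def exit_status(argv):
    """Return the status the parser exits with for argv, or None if parsing succeeds."""
    try:
        parser.parse_args(argv)
    except SystemExit as exc:
        return exc.code
    return None

# Both spellings print the help text and request a clean exit.
print(exit_status(["--help"]))
print(exit_status(["-h"]))
```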
|
| So it's very frustrating when the response to `program --help` is
| "Argument --help not understood, try -help for help." This is
| then often followed by me saying indecent things about the stupid
| program.
|
| misnome wrote:
| Or worse - you pass --help and the script runs and starts doing
| stuff
|
| chriswarbo wrote:
| I write a lot of commandline utilities and scripts for automating
| tasks. I find environment variables a _much_ simpler way to
| provide and accept key/value pairs than arguments. Env vars don't
| depend on order; if present, they always have a value (even if
| it's the empty string); they're also inherited by subprocesses,
| which is usually a bonus too (especially compared to passing
| values to sub-commands via arguments, e.g. when optional
| arguments may or may not have been given).
|
| Using arguments for key/value pairs allows invalid states, where
| the key is given but not the value. It can also cause subsequent,
| semantically distinct, options to get swallowed up in place of
| the missing value. They also force us to depend on the order of
| arguments (i.e. since values must follow immediately after their
| keys), despite it being otherwise irrelevant (at least, if we've
| split on '--'). I also see no point in concatenating options: it
| introduces redundant choices, which imposes a slight mental
| burden that we're better off without.
|
| The only advice I wholeheartedly encourage from this article is
| (a) use libraries rather than hand-rolling (although, given my
| choices above, argv is usually sufficient!) and (b) allow a '--'
| argument for disambiguating flag arguments from arbitrary string
| arguments (which might otherwise parse as flags).
|
| TeMPOraL wrote:
| Some tangential musings:
|
| I can't help but see parallels here between cmdline arguments
| vs. environment variables for programs, and keyword arguments
| vs. dynamic binding for functions in a program (particularly
| in a Lisp one).
|
| That is,
|
|     $ program --foo=bar
|     vs.
|     $ FOO=bar program
|
| seems analogous to:
|
|     (function :foo "bar")
|     ;; vs.
|     (let ((*foo* "bar")) (function))
|
| When writing code (particularly Lisp), keyword arguments are
| preferred to dynamic binding because the function signature is
| then explicitly listing arguments it uses, and dynamic binding
| (the in-process phenomenon analogous to inheriting environment
| from a parent process) is seen as dangerous, and a source of
| external state that may be difficult to trace/spot in the code.
|
| I suppose the latter argument applies to env vars as well - you
| can accidentally pass values differing from expectation because
| you didn't know a parent process changed them. The former
| doesn't, because processes don't have a "signature" specifying
| their options, at least not in a machine-readable form. Which is
| somewhat surprising - pretty much all software written over the
| past decades follows the pattern of accepting argc, argv[] (and
| env[]), and yet the format of arguments is entirely hidden
| inside the actual program. I wonder why there isn't a way to
| specify accepted arguments as e.g. metadata of the executable
| file?
|
| chriswarbo wrote:
| Interesting that you bring up Lisp and dynamic scope, since
| I've previously combined env vars with Racket's "parameters"
| (AKA dynamic bindings):
| https://lobste.rs/s/gnniei/setenv_fiasco_2015#c_owz2pp
|
| TeMPOraL wrote:
| Yeah, I just can't stop thinking about envs / dynamic
| binding when someone brings up the other.
|
| Speaking of using both direct and dynamic arguments, a
| common pattern in (Common) Lisp is using dynamically-bound
| variables (so-called "special" variables) to provide
| default values. Since default values in (Common) Lisp can
| be given as any expression, which gets evaluated at
| function-call time, I can write this:
|
|     (defun function (&key (some-keyword-arg *some-special*))
|       ...)
|
| And then, if I call this function with no arguments, i.e.
| (function), some-keyword-arg will pull its value
| from the current dynamic value of *some-special*. In process
| land, this would be equivalent to having the command line
| parser use values in environment variables as defaults for
| arguments that were not provided.
|
| aidenn0 wrote:
| Lisp will at least warn you if you bind a variable with
| earmuffs that isn't declared special though. Biggest downside
| to environment variables is the lack of warning if you
| misspell something.
|
| vbernat wrote:
| It seems equivalent to long options requiring a value. Also, if
| you mistype an environment variable, you won't get warned about
| it.
|
| enriquto wrote:
| The beautiful thing about environment variables is that you can
| read them whenever you actually need them in your program. They
| pierce a hole through your call stack to the point where you
| need them. By contrast, for command line arguments, you
| need to pass their values from the main function down to
| whatever depth of the call stack needs them.
|
| carlmr wrote:
| I think that's an advantage. You know what goes in and out,
| you can see where it's passed and used. It provides
| transparency and fewer accidental behavioral changes.
|
| E.g. I had a really weird bug in a script, git seemed to
| break for no reason, until I figured out that some other
| script had exported GIT_DIR which pointed to the git top
| level directory. The name of the variable isn't bad if you want
| to save that directory. But git uses GIT_DIR as the location
| of the .git directory it should look at when it is defined.
|
| Using environment variables is kind of like using global
| state, it can often cause weird behavior at a distance
| without a clear path of influence if you don't know exactly
| which environment variables are used by all your scripts.
|
| klhugo wrote:
| Quick comment, not an expert, but environ vars keep their state
| after the program is called.
| From a functional programming
| perspective, or just for my own sanity, wouldn't it be more
| interesting to keep state to a minimum?
|
| rabidrat wrote:
| That's only if you `export` the vars. You can do `FOO=1 BAR=2
| cmd` and FOO and BAR will only have those values for the
| process and children. This is isomorphic to `cmd --foo=1
| --bar=2`.
|
| chriswarbo wrote:
| You're right that mutable state should be kept to a minimum,
| but immutable state is fine. There usually isn't much need to
| mutate env vars.
|
| Some thoughts/remarks:
|
| - If we don't change a variable/value then it's immutable.
| Some languages let us enforce this (e.g. 'const'), which is
| nice, but we shouldn't worry too much if our language
| doesn't.
|
| - We're violating 'single source of truth' if we have some
| parts of our code reading the env var, and some reading from
| a normal language variable. This also applies to arguments
| though.
|
| - Reading from an env var is an I/O effect, which we should
| minimise.
|
| - We can solve the last 2 problems by reading each env var
| once, up-front, then passing the result around as a normal
| value (i.e. functional core, imperative shell)
|
| - Env vars are easy to _shadow_, e.g. if our script uses
| variables FOO and BAR, we can use a different value for BAR
| _within a sub-command_, e.g. in bash:
|
|     BAR=abc someCommand
|
| This will inherit 'FOO' but use a different value of 'BAR'.
| This isn't mutable state, it's a nested context, more like:
|
|     let foo  = 123
|         bar  = "hello"
|         baz  = otherCommand
|         quux = let bar = "abc"
|                 in someCommand
|
| As TeMPOraL notes, env vars are more like _dynamically
| scoped_ variables, whilst most language variables are
| _lexically scoped_.
|
| ucarion wrote:
| Another nice thing about env vars is that they don't appear in
| the process table, unlike argv. For many applications, that
| makes them a suitable place to inject secrets, and makes them
| convenient to inject via wrapper tools.
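
The wrapper-tool pattern can be sketched in a few lines of Python. Everything named here is hypothetical: `DEMO_TOKEN` and its value stand in for a real credential, and the child command is just a one-line Python script that echoes the variable back. The point is only that the value reaches the child through its environment, never through its argv. Note the caveat raised in replies in this thread: on many systems a process's environment can still be read by the same user or by root, so this is not a strong secrecy guarantee.

```python
import os
import subprocess
import sys

def run_with_secret(command, name, value):
    """Run `command` with one extra environment variable set for the child only."""
    child_env = dict(os.environ)  # copy, so the parent's environment is untouched
    child_env[name] = value
    return subprocess.run(command, env=child_env, capture_output=True, text=True)

# Hypothetical variable name and value, for illustration only.
child = [sys.executable, "-c", "import os; print(os.environ['DEMO_TOKEN'])"]
result = run_with_secret(child, "DEMO_TOKEN", "s3cret")

print(result.stdout.strip())  # what the child process saw
print(child)                  # the argv we passed: the value does not appear in it
```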
|
| For instance, the fact that the aws(1) cli tool and the AWS
| SDKs by default take IAM creds from AWS_ACCESS_KEY_ID,
| AWS_SECRET_ACCESS_KEY, etc. means that you can write programs
| like aws-vault (https://github.com/99designs/aws-vault) which
| can acquire a temporary AWS session and then execute another
| command with the creds injected in the standard AWS_... env
| vars. For instance:
|
|     aws-vault exec prod-read -- aws s3 ls
|     aws-vault exec stage-admin -- ./my-script-that-uses-the-aws-sdk-internally
|
| Also, passing arguments as env vars is part of the "12-factor
| app" design (https://12factor.net/config). That page has some
| good guidance.
|
| fanf2 wrote:
| The environment _does_ appear in the process table: see `ps
| e` on Debian at
| https://manpages.debian.org/buster/procps/ps.1.en.html or ps
| -e on BSD at
| https://www.freebsd.org/cgi/man.cgi?query=ps&apropos=0&sekti...
| or pargs -e on Solaris at
| https://www.unix.com/man-page/opensolaris/1/pargs
|
| yjftsjthsd-h wrote:
| It's protected by at least uid, but `/proc/$PID/environ`
| _does_ exist, and it might be exposed other ways, too.
|
| pfranz wrote:
| I think there's a place for both, but env vars can make things
| really annoying to troubleshoot. Already, I often print all env
| vars before running automated commands and it can be a mess to
| dig through. Culling down environment variables when spawning a
| subprocess is difficult. Bad flags can error immediately if you
| made a typo. I've often misspelled an envvar and it's hard to
| tell it did nothing (I think I saw a recent bug where trailing
| white space like "FOO " was the source of a years-long bug).
| `FOO=BAR cmd` is also weird for history (although that's mostly
| a tooling issue).
|
| jpitz wrote:
| In my head there's been a hierarchy for a long time.
|
| When I build command line utilities and I think about the way
| that they'll be used, I tend to use configuration files for
| things that will change very slowly over time, and environment
| variables as a way to override default behaviors, and command
| line arguments to specify things that will often vary from
| invocation to invocation. In fact, most of the time, I use the
| environment variables either for development/testing features
| that I don't really intend to expose to most users, or for
| credentials that don't get persisted.
|
| It's never occurred to me to use environment variables as a
| primary way to configure an application. I'll have to noodle on
| that for a while. My gut says that it's enough of a deviation
| from the Unix convention that I probably won't use that.
|
| AnonC wrote:
| I'm curious if your preference for environment variables is
| only for writing programs or for using programs as well. From a
| user's perspective, would you prefer that common commands use
| environment variables to get options? For example, would you
| prefer a "find" command that uses environment variables instead
| of command line options?
|
| chriswarbo wrote:
| > For example, would you prefer a "find" command that uses
| environment variables instead of command line options?
|
| I use 'find' so much that my muscle memory would hurt if it
| changed, but it's actually a really interesting example. Its
| use of arguments to build a domain-specific language is a
| cool hack, but pretty horrendous; e.g.
|
|     find . -type f -a -not \( -name \*.htm -o -name \*.html \)
|
| We can compare this to tools like 'jq', which use a DSL but
| keep it inside a single string argument:
|
|     key=foo val=bar jq -n '{(env.key): env.val}'
|
| Note that 'jq' also accepts key/value pairs via commandline
| arguments, but that requires _triplets_ of arguments, e.g.
|
|     jq -n --arg key foo --arg val bar '{($key): $val}'
|
| syshum wrote:
| That would depend on the target. The grandparent talks about
| automation, so the target for their utilities is most likely
| the system, not a user.
|
| When writing automation routines, env vars are often better,
| from my experience.
|
| For tools that will be run manually by a user or admin, cli
| options are better.
|
| ohazi wrote:
| > Go's [...] intentionally deviates from the conventions.
|
| _Sigh_
|
| Of course it does.
|
| trasz wrote:
| Go -options sound like what X11 uses.
|
| programd wrote:
| My impression is that nobody bothers with the Go flags package
| and most people use the POSIX compatible pflag [1] library for
| argument parsing, usually via the awesome cobra [2] cli
| framework.
|
| Or they just use viper [3] for both command line and general
| configuration handling. No point reinventing the wheel or
| trying to remember some weird non-standard quirks.
|
| [1] https://github.com/spf13/pflag
|
| [2] https://github.com/spf13/cobra
|
| [3] https://github.com/spf13/viper
|
| crehn wrote:
| Well it does keep things simple, and as a side effect removes
| the cognitive burden of choosing a certain argument style (for
| both the developer and user).
|
| dkrajzew wrote:
| Hello there!
|
| Yep, it's some kind of an advertisement, but for an open source
| project.
|
| I gained some experience with parsing command line options when
| working on an open source traffic simulation named SUMO
| (http://sumo.dlr.de) and decided to re-code the options library.
| It works similarly to Python's argparse library - the options are
| defined first, then you may retrieve the values in a type-aware
| way.
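
For readers who have not used that style, here is a minimal sketch of the "define first, retrieve typed values later" pattern with Python's `argparse` itself (not the library being announced, whose API is not shown in this thread). The `-w/--width` and `-r/--reverse` options are assumptions borrowed from the mondump example at the top of the thread.

```python
import argparse

def build_parser():
    """Step 1: declare every option up front, including its type and default."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("-w", "--width", type=int, default=40,
                        help="maximum line width (default: 40)")
    parser.add_argument("-r", "--reverse", action="store_true",
                        help="reverse the operation")
    return parser

# Step 2: parse, then read the values back already converted to their declared types.
args = build_parser().parse_args(["-w", "72", "--reverse"])
print(args.width + 8, args.reverse)  # width is an int here, so arithmetic just works
```

A non-numeric width is rejected by the parser with a usage message, so that validation does not have to be repeated at each point of use.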
|
| You may find more information on my blog pages:
| http://krajzewicz.de/blog/command-line-options.php
|
| The library itself is hosted on GitHub:
| - cpp-version: https://github.com/dkrajzew/optionslib_cpp
| - java-version: https://github.com/dkrajzew/optionslib_java
|
| Sincerely, Daniel
|
| smusamashah wrote:
| program -c # omitted program -cblue # provided program -c blue #
| omitted
|
| This is confusing
|
| goto11 wrote:
| Your comment is confusing due to missing line breaks :-) Try
| indenting the lines with two spaces.
|
| And I agree, I would prefer: program -c=blue
|
| smusamashah wrote:
| Too late for that. Was on mobile app.
|
| program -c # omitted program -cblue # provided program -c
| blue # omitted
|
| This doesn't make sense. How is -c blue omitted when it
| looks just like an argument for -c?
|
| hibbelig wrote:
| Using the equals sign for long options has become convention,
| but using the equals sign for short options is very unusual.
|
| I find "-c blue" the most intuitive of the possibilities. It
| does mean that you can't have an optional argument for short
| options.
|
| helltone wrote:
| Command-line options have conventions and so does the --help
| message. There are command-line parsing libraries where you
| specify your options and the help gets autogenerated. But I often
| wonder if a better approach would be to do the opposite? I.e. write
| the --help screen (following usual conventions) and let the
| option parsing be generated from it by some library.
|
| barrkel wrote:
| Very opinionated, and IMO without enough justification.
|
| The assertions around short options with arguments - conjoining
| with other short options, for example - are actively harmful to
| legibility in scripts, since there's no lexical distinction
| between the argument and extra short options.
| I don't recommend
| using that syntax when alternatives are available and I
| deliberately don't support it when implementing ad-hoc argument
| parsing (typically when commands are executed as parsing
| proceeds).
|
| Spivak wrote:
| Counterexamples where this is good for legibility.
|
|     tar -xf archive.tar.gz
|     tar -czf archive.tar.gz dir/
|
| curryhoward wrote:
| I'm guessing the comment was talking about examples like
| this:
|
|     program -abcdefg.txt
|
| Just from reading this, you can't tell where the flags end
| and the filename begins unless you have all the flags and
| their arities memorized.
|
| cellularmitosis wrote:
| > tar -czf dir/
|
| You forgot the output filename.
|
| The thing I like about tar is that you only need to learn two
| options: 'c' and 'x':
|
|     tar c somedir | gzip > somedir.tar.gz
|     cat somedir.tar.gz | gunzip | tar x
|
| misnome wrote:
| I recently ran into a case I hadn't seen before with Python's
| argparse. Multiple arguments in a single option, e.g. "--foo bar
| daz" with --foo's nargs set to '*', swallows both bar and daz,
| where I would have expected to have to explicitly specify "--foo
| bar --foo daz" to get that behaviour. I guess this is a side effect
| of treating non-option arguments the same as dash-prefixed
| arguments, but I have no idea what the "standard" behaviour to
| expect here is.
|
| Otherwise, my main bugbear is software using underscore instead
| of dash for long-names, and especially applications or suites
| that mix these cases.
|
| I really like the simplicity that docopt somewhat forces you
| into, which avoids most of these tricky edge cases, but I'm seeing
| less and less usage of it nowadays.
___________________________________________________________________
(page generated 2020-08-01 23:00 UTC)