[HN Gopher] In praise of --dry-run
       ___________________________________________________________________
        
       In praise of --dry-run
        
       Author : Smaug123
       Score  : 295 points
       Date   : 2021-05-24 11:26 UTC (11 hours ago)
        
 (HTM) web link (www.gresearch.co.uk)
 (TXT) w3m dump (www.gresearch.co.uk)
        
       | blt wrote:
       | Abstracting a bit, this is an application of the principle
       | "represent important intermediate results as explicit data
       | structures". It shows up in many places: testing, extensibility,
       | logging, and parallelization are all easier to implement when the
       | result is represented explicitly instead of implicitly in the
       | program's call stack.
       | 
       | Has anyone written about this idea? It seems to have some
       | interesting interaction with laziness.
        
       | nerpderp82 wrote:
       | Even better is to invert the flag, so that the default is to dump
       | the plan and you have to pass `--actually-do-it` to have it make
       | the changes.
        
       | nicoburns wrote:
       | For particularly dangerous operations, even better is to make dry
       | run the default, and force users to explicitly opt-in to actually
       | running it.
        
         | ben509 wrote:
         | I've done that and called the counter option --for-real, but
         | I'd like both args so runbooks can spell out:
         | 1. do-step-one --dry-run
         | 2. (do validation)
         | 3. do-step-one --real-run
         | 4. (do validation)
         | 
         | Many commandline parsing libraries assume you're doing
         | booleans, and I think --dry-run and --no-dry-run is confusing.
         | And, internally, you have a boolean flag so there's always the
         | possibility of some code getting it backwards.
         | 
         | Internally, I'd like an enum flag that's clearly dry_run or
         | real_run, so the guards are using positive logic:
         | switch(run) {
         |   case dry_run:
         |     print("Would do this...");
         |     break;
         |   case real_run:
         |     do_real_thing();
         |     break;
         | }
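A minimal Python sketch of the idea above, assuming a hypothetical `do-step-one` tool: the user must pass exactly one of `--dry-run` or `--real-run`, and the code then branches on a named enum value rather than a boolean. The names `RunMode`, `parse_args`, and `do_step_one` are made up for illustration.

```python
import argparse
from enum import Enum


class RunMode(Enum):
    DRY_RUN = "dry_run"
    REAL_RUN = "real_run"


def parse_args(argv):
    parser = argparse.ArgumentParser(prog="do-step-one")
    # Exactly one of the two flags must be given; passing neither is an error.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--dry-run", dest="mode", action="store_const",
                       const=RunMode.DRY_RUN)
    group.add_argument("--real-run", dest="mode", action="store_const",
                       const=RunMode.REAL_RUN)
    return parser.parse_args(argv)


def do_step_one(mode):
    # Positive logic: each mode is named explicitly, anything else is rejected.
    if mode is RunMode.DRY_RUN:
        return "Would do this..."
    if mode is RunMode.REAL_RUN:
        return "Did the real thing"  # placeholder for the real side effect
    raise ValueError(f"unknown run mode: {mode!r}")
```

With this parser, running the tool with no flag at all exits with a usage error, which is the "specifying neither is an error" behaviour discussed in the replies.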
        
           | Smaug123 wrote:
           | _coughs gently_ I use `DryRunMode.Dry` and `DryRunMode.Wet`
           | in the OP.
        
             | kelnos wrote:
             | I think the point here, though, is that the user needs to
             | be explicit on the command line: they have to specify one
             | of `--wet` or `--dry`; specifying neither is an error. It's
             | not clear from your code if you do that, or if you
             | interpret `--dry-run` as `DryRunMode.Dry` and the absence
             | of an option as `DryRunMode.Wet`.
             | 
             | While I kinda like this idea in principle, I haven't really
             | seen any CLI apps that require an explicit option for both
             | modes, so it might be a bit of unexpected UX for people.
        
         | llimllib wrote:
         | I've written a couple tools that default to dry run and require
         | `--yes-actually-delete` and similar, but I can't think of any
         | publicly available tools that follow this pattern - anybody
         | have suggestions?
        
           | Phlogistique wrote:
           | https://github.com/dmerejkowsky/ruplacer
           | 
           | Uses `--go` to confirm.
        
           | jerf wrote:
           | rm has "--no-preserve-root", which isn't quite
           | "--no-seriously-delete-my-root-directory" but is at least
           | heading in that direction.
        
             | llimllib wrote:
             | if you squint, you can kind of see file permissions as
             | implementing this pattern? rm /usr fails unless you sudo rm
             | /usr
        
             | gbrown_ wrote:
             | Obligatory Cantrill reference
             | https://youtu.be/wTVfAMRj-7E?t=5046. The whole series of
             | these are well worth a listen.
        
           | TFortunato wrote:
           | Not quite the same, but apt commands like apt-get require
           | user interaction to confirm what you are about to do, unless
           | you explicitly add a "-y" / "--yes" to the command.
        
             | stevekemp wrote:
             | apt will sometimes prompt you to confirm really dangerous
             | options too:
             | 
             |     sudo apt-get purge login
             |     ..
             |     WARNING: The following essential packages will be removed.
             |     This should NOT be done unless you know exactly what you are doing!
             |       login
             |     0 upgraded, 0 newly installed, 1 to remove and 303 not upgraded.
             |     After this operation, 1,212 kB disk space will be freed.
             |     You are about to do something potentially harmful.
             |     To continue type in the phrase 'Yes, do as I say!'
             |      ?]
        
           | OskarS wrote:
           | "git clean" sort-of works like this. It errors out unless you
           | explicitly specify if it's a real run or dry run.
        
             | nicoburns wrote:
             | `git push` in cases of conflict too.
        
           | btdmaster wrote:
           | sed does not operate on files directly unless turned
           | (-i)nteractive.
        
             | adrianstoll wrote:
             | I only use sed on version controlled files and immediately
             | run git diff afterwards to check the results.
        
             | llimllib wrote:
             | oh good one, prettier does the same thing - output to
             | stdout unless --write is set
        
             | okl wrote:
             | -i is short for --in-place
        
             | gbrown_ wrote:
             | Minor quibble: the -i option mnemonic is "in-place" rather
             | than "interactive".
        
           | Smaug123 wrote:
           | Also not quite the same, but `git clean` bails out extremely
           | early unless you supply one of -i, -f, or -n (the latter of
           | which is `--dry-run`).
        
           | Izkata wrote:
           | Not exactly the same, but along the same lines: mysql has an
           | --i-am-a-dummy flag so that DELETE queries won't run without
           | a WHERE.
        
           | tshaddox wrote:
           | React has a deliberately inconvenient API for injecting
           | unsanitized HTML markup straight into the DOM:
           | <div dangerouslySetInnerHTML={{ __html: "Some raw HTML"
           | }}></div>
           | 
           | https://reactjs.org/docs/dom-elements.html#dangerouslysetinn...
        
           | matja wrote:
           | `hdparm --yes-i-know-what-i-am-doing --please-destroy-my-drive --fwdownload`
           | 
           | https://github.com/Distrotech/hdparm/blob/4517550db29a91420f...
        
           | wgarvin wrote:
           | Perforce's command line app uses this pattern for one of its
           | most dangerous commands: p4 obliterate //depot/path/... which
           | prints out exactly which files will be obliterated from the
           | server (which not only removes their entire history, but can
           | corrupt other files too if you don't p4 snap them first).
           | 
           | To actually carry out the obliterate action, you have to add
           | the -y option.
        
         | kirkules wrote:
         | Definitely true. In this case, it's important to vet
         | guides/examples to make sure you minimize copypastable examples
         | that have the --no_dry_run flag
        
         | globular-toast wrote:
         | I don't like this because it trains users to assume that
         | commands can never do dangerous stuff by default.
        
         | steego wrote:
         | It's a good opportunity to use long options like:
         | 
         | --i-acknowledge-this-is-dangerous-and-i-gave-this-more-than-cursory-thought
        
         | jakebasile wrote:
         | Yep, this is what I do now. The dry run is the default, you
         | have to flag `--really` to do anything permanent.
        
       | xupybd wrote:
       | Gresearch puts out some interesting articles, and they work in
       | F#. I wish I were in the UK, I'd love to try and get a job there.
        
         | dkarp wrote:
         | They're an interesting organization.
         | 
         | See this past submission:
         | https://news.ycombinator.com/item?id=18499712
        
         | Smaug123 wrote:
         | I believe (though can't swear to it) that we do provide
         | immigration/relocation assistance and money, at least for quant
         | researchers and possibly for all employees. If that would be
         | enough to get you interested, feel free to drop me a line at
         | patrick.stevens@gresearch.co.uk and I can find out actual
         | details rather than half-remembered possibly-false snippets
         | from years ago.
        
       | yuchi wrote:
       | Fantastic article, and I promote a stricter approach: `--dry-run`
       | is the default, and you need to disable with `--no-dry`.
       | 
       | For irreversible (such as deletions) actions I even go one step
       | further with a `--no-dry=<some-non-trivial-value>` in order to
       | force some thinking.
       | 
       | Edit: apparently this is way more common than I thought, by
       | reading comments here!
        
       | kevincox wrote:
       | Their example of pushing it into the type system is nice but
       | requires a language with a fairly powerful type system. One
       | simple way to do this in basically any statically typed language
       | is just requiring a "Run Token" for any bit of code that does
       | something "for real". This method works well if you want the
       | output to be similar or identical for dry run and actual run.
       | 
       |     struct RunToken;
       |     fn do_the_thing(_: RunToken, ...);
       | 
       | The general approach works for just about any statically typed
       | language (although null is the enemy).
       | 
       | You can make it a bit more tedious to generate that RunToken so
       | that a `do_the_thing(RunToken, ...)` looks more natural. But the
       | idea is simply that if the only place you create a RunToken is
       | when parsing the --no-dry-run flag you can't accidentally
       | do_the_thing when you move the code out of the `if is_dry_run`
       | block without noticing.
       | 
       | Of course this is a simple way to get fairly reliable dry run. I
       | do agree that having separate plan and apply steps is the best
       | approach when you can do it.
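A rough Python rendering of the run-token idea, for illustration only. Python is not statically typed, so the annotations only help if a type checker is run over the code; the sketch therefore adds a runtime `isinstance` guard as well. `parse_run_token`, `delete_path`, and `run` are hypothetical names, and the `deleted` list stands in for a real deletion.

```python
from typing import List, Optional


class RunToken:
    """Proof that the user asked for a real run. Construct it in one place only."""


def parse_run_token(argv: List[str]) -> Optional[RunToken]:
    # The only place in the program that constructs a RunToken.
    return RunToken() if "--no-dry-run" in argv else None


def delete_path(token: RunToken, path: str, deleted: List[str]) -> None:
    # Python does not enforce annotations at runtime, so check explicitly.
    if not isinstance(token, RunToken):
        raise TypeError("delete_path requires a RunToken")
    deleted.append(path)  # stand-in for the real deletion


def run(argv: List[str], path: str, deleted: List[str]) -> str:
    token = parse_run_token(argv)
    if token is None:
        return f"Would delete {path}"
    delete_path(token, path, deleted)
    return f"Deleted {path}"
```

The `None` case is where "null is the enemy" shows up: a caller that passes `None` through is only caught by the type checker or by the runtime guard.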
        
       | wheybags wrote:
       | I've made a further extension of this principle for a video game
       | release script, by adding a --unsafe flag. The script was
       | responsible for building the game on a bunch of platforms,
       | uploading it to various storefronts and announcing the release on
       | twitter, reddit etc. There was a --dry-run flag too, that just
       | prints what it would do, but sometimes you want to test part of
       | the script without doing anything public. There were various
       | flags like --no-upload and --no-twitter etc, but it was easy to
       | forget them, so I added in a top level --unsafe flag. At every
       | point where we do something public, at the deepest level of the
       | call stack just before doing it, I check for unsafe, and if it's
       | not present, throw an exception. It worked pretty well, and
       | definitely saved my ass a few times, especially during
       | refactoring.
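A small Python sketch of that kind of guard, assuming a hypothetical `ReleaseContext` object that carries the `--unsafe` flag. The check sits in the innermost function, right before the public action, so forgetting a `--no-twitter` style flag higher up cannot leak a real post.

```python
class UnsafeNotAllowed(RuntimeError):
    """Raised when a public action is attempted without --unsafe."""


class ReleaseContext:
    def __init__(self, unsafe=False):
        self.unsafe = unsafe
        self.posted = []  # records what was "published" in this sketch


def require_unsafe(ctx, action):
    if not ctx.unsafe:
        raise UnsafeNotAllowed(f"refusing to {action} without --unsafe")


def post_announcement(ctx, text):
    # Checked at the deepest level, immediately before the public effect.
    require_unsafe(ctx, "post announcement")
    ctx.posted.append(text)  # stand-in for the real network call
```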
        
       | agumonkey wrote:
       | I wonder how it would feel to have every program dry-runnable but
       | composable.. basically FP/virtual systems. You can (dry-run
       | cmd-0) | (dry-run ...) | (dry-run cmd-n) and it will yield a
       | potential system state.
        
       | notafraudster wrote:
       | How do you all write dry-run logic for utilities that perform
       | file operations (edits, deletions, creations) and have some kind
       | of sequential dependency in operations such that later steps
       | depend on the previous steps having been executed?
       | 
       | I have found it a frustrating nut to crack because it seems like
       | it needs to involve either writing a simulation layer that tracks
       | the state of all the file calls virtually, or else copy all the
       | files to a temp folder (so the dry run isn't dry, it's just on a
       | separate version of the data). Both of these seem like bad
       | solutions.
        
         | Eridrus wrote:
         | Besides having dry-run support, I think this gets to the
         | question of what the level of abstraction should be, and I
         | think that is clearly application-specific.
         | 
         | Maybe in your case the level you are trying to report dry-run
         | information at is too granular.
         | 
         | Or maybe you have too tight coupling between your code that
         | determines what needs to be changed and that which actually
         | does the change and you might want to refactor the code that
         | determines changes to the code that makes the changes.
        
         | gameman144 wrote:
         | I don't see anything wrong with the "copy to a temporary
         | directory" approach if your function actually does operate on
         | files within a directory. In that scenario, copying to a
         | different directory is actually probably _exactly_ what you
         | want for a dry-run, since then the code that you're executing
         | is the same exact code that would run were you to execute in
         | non-dry-run mode (as opposed to virtual operations which are
         | prone to bugs if that virtualization layer and your real
         | operations ever fall out of sync).
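One way the copy-to-a-temporary-directory approach could look in Python. This is a sketch under simplifying assumptions: the operation takes a single directory argument, and the report only compares top-level file names. `rename_txt_to_md` and `dry_run_on_copy` are hypothetical names.

```python
import os
import shutil
import tempfile


def rename_txt_to_md(directory):
    """The real operation: rename every .txt file in `directory` to .md."""
    for name in sorted(os.listdir(directory)):
        if name.endswith(".txt"):
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, name[:-4] + ".md"))


def dry_run_on_copy(operation, directory):
    """Run `operation` on a throwaway copy and report which names would change."""
    with tempfile.TemporaryDirectory() as scratch:
        copy = os.path.join(scratch, "copy")
        shutil.copytree(directory, copy)
        operation(copy)  # the same code path a real run would use
        before = set(os.listdir(directory))
        after = set(os.listdir(copy))
    return {"removed": sorted(before - after), "added": sorted(after - before)}
```

The original directory is never modified; the cost is the time and disk space needed for the copy.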
        
         | ItsMonkk wrote:
         | I don't know of any system that works like this currently, but
         | it seems like you are asking if you can switch branches, do the
         | work, then your "dry-run" is a pull request complete with all
         | file changes which you can approve manually.
        
         | programmarchy wrote:
         | Copying to a temp folder sounds pretty reasonable to me.
        
         | lamontcg wrote:
         | Yes, if you call a black box outside of your code base which
         | does something then subsequent operations can't easily be dry-
         | run'd.
         | 
         | So if you need to install a package which sets up a systemd
         | service, the subsequent dry-run of managing that systemd
         | service would fail because the service doesn't exist yet. Or
         | you need to make assumptions in the service dry run that it
         | should have been set up before.
         | 
         | You can just assume that the service would have been setup and
         | report that you were, say, enabling and starting the service.
         | Or you could require some kind of hint to be setup from the
         | package installation to the specific service configuration (for
         | well known things this could be done by default, but for
         | arbitrary user-supplied packages this cannot be done). Or you
         | could list the entire package contents and try to determine if
         | the manifest looks like it is configuring a service.
         | 
         | And it still may fail because you don't know the contents of
         | that file without extracting it and the service may not parse
         | at all.
         | 
         | And that's just a simple package-service interaction example.
         | You could spend a week noodling on how to do that fairly
         | precisely, then there's a hundred or a thousand other
         | interactions to get correct.
         | 
         | You're being told not to do the thing, so there's fundamentally
         | a black-box there, and you need to figure out how much you're
         | going to cheat, how much you're going to try to crack open the
         | black box into some kind of sandbox so you can figure out its
         | internals, and how much you're going to offload onto the user.
         | Not actually an easy problem at all.
        
           | swiley wrote:
           | Yeah, the ability to indicate a shell command is pure (or
           | that it must be pure) is something that's really missing in
           | POSIX-like APIs. It's something I've certainly missed.
           | Something like fork that disables write outside a handful of
           | file descriptors (like one the parent starts with popen)
           | would be pretty awesome. Maybe BSD jails do that.
        
             | lamontcg wrote:
             | In general a composable dry run API with the ability to
             | make promises would be good. Then it's on the lower level
             | black box to have been tested correctly and make accurate
             | promises.
             | 
             | In practice though what you'll find is that it's easier to
             | treat whole systems as black boxes, and test changes in
             | throwaway virt systems, then test them in development
             | environments, then roll them out to prod (or that whole
             | immutable infrastructure thing and throw it all away and
             | replace so you don't get bitten by dev-prod discrepancies
             | in theory).
        
       | zbentley wrote:
       | Even better than a dry-run flag is to have the noop mode be the
       | default behavior and have e.g. "--execute" enable non-read-only
       | behavior (or, for really scary commands like programmatic data
       | destroyers, "--nuke-my-data-i-solemnly-swear-i-know-what-i-am-
       | doing"). Prevents plenty of accidental mistakes by sleepy
       | operators.
       | 
       | That's a quibble though, the article is well-put and quite
       | correct about the value of a dry-run mode.
        
         | zbentley wrote:
         | Oops, I read the article but not the comments. I see I'm far
         | from the first person to suggest this, apologies for the
         | duplication.
        
       | aranchelk wrote:
       | Dry-run can be implemented succinctly in Python with a function
       | decorator wrapping those functions that should run conditionally.
       | 
       | Nice aspects of this technique:
       | 
       | Inside the decorator you have access to both the function's name
       | and its arguments so you can print descriptive messages to the
       | user e.g. "In dry run, would have run function 'delete' on
       | '/foo/bar'"
       | 
       | You can use the name of the decorator as de facto documentation,
       | e.g. "@destructive"
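A minimal sketch of such a decorator. The `settings` dictionary is a hypothetical global switch that a real tool would set from its `--dry-run` flag, and `fake_files` is an in-memory stand-in for the file system so the example has no real side effects.

```python
import functools

# Hypothetical global switch; a real tool would set it from --dry-run.
settings = {"dry_run": True}

# Stand-in for the file system so the sketch has no real side effects.
fake_files = {"/foo/bar", "/foo/baz"}


def destructive(func):
    """Skip the wrapped function in dry-run mode and say what was skipped."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if settings["dry_run"]:
            shown = ", ".join([repr(a) for a in args] +
                              [f"{k}={v!r}" for k, v in kwargs.items()])
            print(f"In dry run, would have run function {func.__name__!r} on {shown}")
            return None
        return func(*args, **kwargs)
    return wrapper


@destructive
def delete(path):
    fake_files.discard(path)
```

One caveat with this design: in dry-run mode the wrapped function returns `None`, so callers that depend on its return value need their own handling.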
        
       | amirkdv wrote:
       | I'm adamantly pro dry-run and like OP I've found that you really
       | need to design for it. From my experience, OP's two main ideas
       | are spot on: librarification and pushing dry-run logic deeper
       | into library code.
       | 
       | Here are two other things I've found:
       | 
       | 1. Regardless of where you push your run/dry-run dispatch to, the
       | underlying "do it" logic really needs to be factored properly for
       | individual side effects (think functional). Otherwise, you
       | inevitably end up with loads of "if dry-run / else" soup or
       | worse, bugs _caused_ by your dry-run support.
       | 
       | 2. You still need safety nets for when someone is doing dangerous
       | operations without dry-run. Pointing and calling [0] is a great
       | trick for this. For example, say your tool is about to delete X,
       | Y, and Z. Instead of a simple "Yes" confirmation which the user
       | would quickly end up doing on auto-pilot, you could have a more
       | involved confirmation like "Enter the number of resources listed
       | above to proceed" and then only proceed to delete if the user
       | enters "3".
       | 
       | Very curious to hear about design idioms folks have come up with!
       | 
       | [0] https://en.wikipedia.org/wiki/Pointing_and_calling
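The count-based confirmation from point 2 could be sketched like this. `confirm_by_count` is a hypothetical helper; the `read_answer` parameter defaults to `input` and exists only so the prompt can be tested without a terminal.

```python
def confirm_by_count(resources, read_answer=input):
    """Ask the user to type how many resources are listed before proceeding."""
    print("About to delete:")
    for name in resources:
        print(f"  - {name}")
    answer = read_answer("Enter the number of resources listed above to proceed: ")
    # Only an exact match on the count is accepted; "yes" or "y" is not enough.
    return answer.strip() == str(len(resources))
```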
        
         | lytefm wrote:
         | > 2. You still need safety nets for when someone is doing
         | dangerous operations without dry-run.
         | 
         | Another important property is idempotency. Especially if the
         | script involves network requests or moving files around, you'll
         | want to reach the goal by re-running in case something breaks
         | half-way.
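As a small illustration of idempotency for a file move, here is a sketch of a hypothetical `ensure_moved` helper. It treats "destination exists, source gone" as success from an earlier run, so re-running a script that broke half-way does not fail on steps that already completed.

```python
import os
import shutil


def ensure_moved(src, dst):
    """Move src to dst and report what happened; re-running is harmless."""
    src_exists = os.path.exists(src)
    dst_exists = os.path.exists(dst)
    if dst_exists and not src_exists:
        return "already moved"      # a previous run got this far
    if not src_exists:
        raise FileNotFoundError(src)
    if dst_exists:
        raise FileExistsError(dst)  # ambiguous state: let a human decide
    shutil.move(src, dst)
    return "moved"
```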
        
         | IgorPartola wrote:
         | What I like is a system where you first create a plan for what
         | you will do, then there is an executor that will execute the
         | plan. So --dry-run just doesn't run the executor. Of course,
         | depending on what you are working with that might not be
         | possible, but if you can design it like so, do it. It also
         | makes everything nicely decoupled.
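A minimal sketch of the plan-then-execute split, assuming a toy tool that creates a directory and writes some files. `Action`, `make_plan`, `describe`, `execute`, and `run` are hypothetical names; the point is that only `execute` has side effects, and a dry run simply never calls it.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str          # "mkdir" or "write" in this sketch
    path: str
    content: str = ""


def make_plan(target_dir, names):
    """Pure planning step: decides what to do but touches nothing."""
    plan = [Action("mkdir", target_dir)]
    for name in names:
        plan.append(Action("write", os.path.join(target_dir, name), f"hello {name}\n"))
    return plan


def describe(plan):
    return [f"{action.kind} {action.path}" for action in plan]


def execute(plan):
    """The only function with side effects."""
    for action in plan:
        if action.kind == "mkdir":
            os.makedirs(action.path, exist_ok=True)
        elif action.kind == "write":
            with open(action.path, "w", encoding="utf-8") as handle:
                handle.write(action.content)
        else:
            raise ValueError(f"unknown action: {action.kind}")


def run(target_dir, names, dry_run):
    plan = make_plan(target_dir, names)
    if not dry_run:
        execute(plan)  # --dry-run simply skips the executor
    return describe(plan)
```

Because the plan is plain data, it can also be logged, diffed, or unit-tested without touching the disk.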
        
           | renewiltord wrote:
           | I'm sure there have been older tools with this, but Terraform
           | is the first I encountered with it and I really like this
           | model.
        
             | ficklepickle wrote:
             | rsync is where I first encountered it. I've made it a habit
             | to always do a dry run first as a sanity check and it has
             | saved me some headaches.
        
               | renewiltord wrote:
               | rsync has a plan mode? I had no idea.
        
           | cortesoft wrote:
           | Yeah, not sure how this would work with any workflow that
           | calls out to another service and uses the result of that to
           | determine the next step, if that intermediate call changes
           | state somewhere.
        
             | dundarious wrote:
             | Then you can only really dry run that first phase anyway.
        
             | minitoar wrote:
             | Indeed, the other service then needs to support dry runs.
        
             | bcbrown wrote:
             | Are you worried about that intermediate call changing
             | between the dry run and the actual run? That can be avoided
             | by saving the execution plan and allowing the actual run to
             | load the execution plan from the dry run.
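A sketch of saving the plan during the dry run and replaying it in the real run, assuming the plan can be represented as JSON-serializable steps. The function names and the step format are hypothetical.

```python
import json


def save_plan(plan, path):
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(plan, handle, indent=2)


def load_plan(path):
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def dry_run(compute_plan, plan_path):
    """Compute the plan once (this may query other services) and save it."""
    plan = compute_plan()
    save_plan(plan, plan_path)
    return plan


def real_run(plan_path, apply_step):
    """Apply exactly the saved steps instead of recomputing them."""
    plan = load_plan(plan_path)
    for step in plan:
        apply_step(step)
    return plan
```

The real run then does what the reviewed plan said, even if the intermediate service would answer differently by the time it executes. Whether the saved plan is still valid at that point is a separate question this sketch does not address.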
        
             | ItsMonkk wrote:
             | I think a large part of the next 10 years is us figuring
             | out that frameworks are an anti-pattern and that everything
             | is going to have to move to libraries that abide by CQRS.
        
           | klodolph wrote:
           | Yes, I like this a lot. You often end up with a main code
           | path that looks like this:
           | 
           |     do a bunch of stuff
           |     make a plan
           |     if dryRun {
           |         plan.Print()
           |         exit()
           |     }
           |     plan.Execute()
        
             | toxik wrote:
             | Lines 1 & 3 had side effects, sad panda
        
               | toxik wrote:
               | I guess you really do have to preface your jokes with
               | joke disclaimers these days.
        
               | klodolph wrote:
               | It's impossible to tell if someone is being sarcastic on
               | the internet, because no matter how stupid your comment
               | is at face value, there's always a percentage of people
               | on the forum stupid enough to say it in earnest.
               | 
               | https://en.wikipedia.org/wiki/Poe%27s_law
               | 
               | Use a winking smiley face emoticon ;-) or /s
        
               | klodolph wrote:
               | Reading from disk, spending CPU cycles, allocating
               | memory, and making queries on the network are all
               | technically side effects--but they're not really cause
               | for "sad panda".
               | 
               | Everything on a computer has _some_ side effects, and the
               | purpose of --dry-run is to stop execution without certain
               | effects that we care about. It's impossible to eliminate
               | side effects entirely, so this is not a goal.
        
               | numpad0 wrote:
               | I had http:// 192.168.100.123 :7654/lights_on.sh that
               | idempotently turns on lights in my room. Trying to
               | query or even pre-fetching does it. I was aware it's not
               | wisest, wasn't as stupid as to expose it, also it broke
               | so it is no more, but there's always the guy who does
               | that.
        
               | klodolph wrote:
               | The "n" key on your keyboard is connected to some
               | dynamite attached under the desk, if you type "--dry-run"
               | or even just "-n", it plays the song "Those Endearing
               | Young Charms" and then explodes, killing everyone in the
               | room.
               | 
               | https://www.youtube.com/watch?v=ZLNduq2pnN0
               | 
               | At some point it doesn't matter if there's "the guy who
               | does that", because the right choice is to let "the guy
               | who does that" deal with the consequences of their own
               | crazy setup.
        
         | GuB-42 wrote:
         | Dry run is among the things you really need to design for,
         | others are:
         | 
         | - cancel
         | 
         | - undo/redo
         | 
         | - progress bars
         | 
         | Progress bars with accurate timing are notoriously difficult,
         | or even impossible to get right. But even regardless of timing,
         | having a progress bar that really shows progress and doesn't
         | freeze is hard. Every slow operation has to have some sort of
         | callback mechanism to update progress, and you have to know the
         | number of steps in advance.
         | 
         | Undo requires, for each operation, to know how to roll back.
         | You also need to have every operation go through the undo
         | stack. Another option is to have an efficient snapshot system,
         | which may be just as hard or even harder.
         | 
         | Cancel is actually the hardest to do right because it combines
         | the difficulties of both the progress bar and undo. You have to
         | have a way to interrupt a long operation at any time and get
         | back to before it started. And because "cancel" is most often
         | used when things go wrong (ex: disk full, bad connection, ...)
         | you have to be very careful with error handling.
        
         | whateveracct wrote:
         | > Regardless of where you push your run/dry-run dispatch to,
         | the underlying "do it" logic really needs to be factored
         | properly for individual side effects (think functional).
         | Otherwise, you inevitably end up with loads of "if dry-run /
         | else" soup or worse, bugs _caused_ by your dry-run support.
         | 
         | --dry-run is one of the best motivations for using granular,
         | extensible effects
         | 
         | when you constrain literally every bit of IO your program is
         | allowed to do, you can now 100% know you've stubbed them all
         | out when doing a dry run
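A rough Python approximation of constraining IO behind an explicit interface. It is only a sketch: Python cannot stop code from calling `open` directly, so the guarantee relies on convention, unlike a typed effect system. `FileSystem`, `RealFileSystem`, `DryRunFileSystem`, and `rotate_report` are hypothetical names.

```python
import os
from typing import List, Protocol


class FileSystem(Protocol):
    """Every file effect the program is allowed to perform."""
    def write_text(self, path: str, text: str) -> None: ...
    def remove(self, path: str) -> None: ...


class RealFileSystem:
    def write_text(self, path: str, text: str) -> None:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)

    def remove(self, path: str) -> None:
        os.remove(path)


class DryRunFileSystem:
    """Stub that records each effect instead of performing it."""
    def __init__(self) -> None:
        self.log: List[str] = []

    def write_text(self, path: str, text: str) -> None:
        self.log.append(f"would write {len(text)} characters to {path}")

    def remove(self, path: str) -> None:
        self.log.append(f"would remove {path}")


def rotate_report(fs: FileSystem, old_path: str, new_path: str, text: str) -> None:
    # Business logic only talks to the interface, never to os or open directly.
    fs.remove(old_path)
    fs.write_text(new_path, text)
```

Passing `DryRunFileSystem()` gives a dry run of `rotate_report` with no `if dry_run` branches inside the logic itself.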
        
         | hinkley wrote:
         | > I've found that you really need to design for it
         | 
         | It's an architectural step a lot of people skip, and by doing
         | so you often end up in a situation where the only way to make
         | things faster is via caching, and chasing caching bugs for the
         | rest of your tenure.
         | 
         | One of the less appreciated aspects of model-view-controller is
         | that usually it demands that you plan out your action before
         | you do it - especially when that action is read-only. Like a
         | cooking recipe, you gather up all of the ingredients at the
         | beginning and then use them afterward.
         | 
         | By fetching all of the data early, you reduce the length of the
         | read transactions to the database, allowing MVCC to work more
         | efficiently. You also paint a picture of the dependency tree in
         | your application, up high where people can spot performance
         | regressions as they show up in the architecture - where there's
         | a better opportunity to intercede, _and_ to create teachable
         | moments.
         | 
         | To make dry-run work you are best served by book-ending your
         | reads and your writes, because if the writes are spread across
         | your codebase how do you disable all of the writes? And if any
         | reads are after writes, how do you simulate that in your dry-
         | run code? It won't be easy, that's for sure.
         | 
         | The problem is that we tend to write code stream-of-
         | consciousness, grabbing things just as we need them instead of
         | planning things out. This results in a web of data dependencies
         | that is scattered throughout the code and difficult or
         | impossible to reason about. This is where your caching insanity
         | really kicks into high gear.
         | 
         | To my eye, Dependency Injection was sort of a compromise. You
         | still get a good deal of the ability to statically analyze the
         | code but you can work more stream-of-consciousness on a new
         | feature. But it does rob you of some more powerful caching
         | mechanisms (memoization, hand tuning of concurrency control,
         | etc)
        
           | Smaug123 wrote:
           | This is the first comment I've used HN's "favorite" feature
           | on. How wonderfully insightful.
        
         | kbenson wrote:
         | Usually I just use dry run directives to make sure stuff is
         | working right when I first develop it, so all that is worked
         | out from the very beginning. I also like to develop with a lot
         | of debug output gated behind debug as well, and just leave that
         | there for the inevitable bug at some point that would make me
         | add it if not already present.
         | 
         | What I don't like is passing those as params or args. Currying
         | this stuff around is very error prone and cumbersome. I prefer
         | to set environment variables for DRY_RUN and DEBUG (with debug
         | accepting higher integer levels for more debugging).
         | 
         | This works wonderfully for me since I never have to worry I
         | didn't pass something along correctly, I'm always asking the
         | global set at runtime.
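A sketch of reading `DRY_RUN` and `DEBUG` from the environment in Python. The accepted truthy spellings and the fallback to level 0 on a non-integer `DEBUG` are assumptions of this example, not something specified in the comment above. The `environ` parameter exists so the helpers can be tested with a plain dictionary.

```python
import os
import sys


def is_dry_run(environ=os.environ):
    # Assumption: "1", "true", and "yes" (any case) all mean dry run.
    return environ.get("DRY_RUN", "").strip().lower() in {"1", "true", "yes"}


def debug_level(environ=os.environ):
    # Higher integers mean more debugging; anything unparsable counts as 0.
    try:
        return max(0, int(environ.get("DEBUG", "0")))
    except ValueError:
        return 0


def debug(level, message, environ=os.environ):
    """Print `message` to stderr if DEBUG is at least `level`."""
    if debug_level(environ) >= level:
        print(f"[debug{level}] {message}", file=sys.stderr)
        return True
    return False
```

Usage from a shell would look like `DRY_RUN=1 DEBUG=2 ./tool`, with no flags to thread through the call stack.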
        
           | d4mi3n wrote:
           | There are other approaches to this as well:
           | 
           | 1. Passing around a configuration or context (similar to your
           | gripe around params)
           | 
           | 2. Referencing some kind of global configuration (a la env
           | vars)
           | 
           | 3. Referencing a local or scoped configuration (for example,
           | a method of an object can check instance variables that
           | dictate behavior)
           | 
           | I personally prefer either a passed context or a local
           | configuration; I find both easier to test in isolation.
           | Global contexts have their uses, but tend to become
           | problematic when they clash with other libraries or tools
           | that may also be present in the execution environment.
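
For comparison, a rough Python sketch of the passed-context option (approach 1 in the list above). `RunContext` and `remove_user` are hypothetical names; the point is only that a test can build its own context without touching any global state.

```python
from dataclasses import dataclass, field

@dataclass
class RunContext:
    """Hypothetical context object passed explicitly instead of read from a global."""
    dry_run: bool = True
    log: list = field(default_factory=list)

def remove_user(ctx: RunContext, users: set, name: str) -> None:
    """Remove a user, or only record the intent when ctx.dry_run is set."""
    if ctx.dry_run:
        ctx.log.append(f"would remove {name}")
        return
    users.discard(name)
    ctx.log.append(f"removed {name}")
```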
        
             | kbenson wrote:
             | Those are good approaches, and I wouldn't try to use
             | manually set env vars for most things. I do tend to think
             | they work very well for debug and dry run options though,
             | because those are generally ephemeral, and things you might
             | want to set ad-hoc in different environments easily without
             | changing the config for everything in that environment, or
             | passing around a special config which may not be updated
             | when the real one is.
             | 
             | That said, I'm not married to it, if I saw something that
             | seemed obviously better, I would switch. I also suspect
             | that different languages may make one approach
             | easier/better than others based on their capabilities,
             | idioms, etc. In many scripting languages, accessing an
             | environment variable is extremely easy. In some compiled or
             | more strictly typed languages, the access and conversion to
             | the expected type might be cumbersome enough to do on site
             | that it's worse, and if you are standardizing in some
             | parsing routine, that might tip the benefits in favor of
             | some global context that is used instead.
        
           | hinkley wrote:
           | I have some code that lacks permissions to run to completion
           | except on Bamboo or other servers, so I practically have to
           | have a dry-run mode anyway.
        
           | Buttons840 wrote:
           | So, for a top level bit of code to pass arguments to lower
           | level bits of code you set environment variables? Seems a bit
           | odd to me
           | 
           | Using environment variables for the user to configure their
           | environment or pass information into a program seems normal,
           | but code passing arguments to other bits of code in the same
           | language with environment variables seems odd.
        
             | kbenson wrote:
             | The only two things I use this for are for setting things
             | that I consider runtime flags that I want honored
             | throughout the instance of that application run.
             | 
             | In the instance of dry run, I think it's far more important
             | that it's correctly seen and honored than that globals
             | should be avoided. I view dry run mode as a contract with
             | the person running, and if I can avoid having to pass
             | arguments along every time, and avoid having to go change
             | the arguments of all the utility functions I call and then
             | all their call sites, then that's a win because if that
             | needs to be done to correctly honor that sometimes it
             | _won't_ be done and some worse solution will happen, if
             | anything.
             | 
             | > but code passing arguments to other bits of code in the
             | same language with environment variables seems odd.
             | 
             | I don't pass the environment variables around, they exist
             | as part of the program state. if foo() calls bar(), I don't
             | pass the environment variable to bar, it exists as a global
             | flag, I just check for it in bar(). That's the benefit,
             | it's a global and set at runtime.
             | 
             | The other benefit is that as an environment variable, it's
             | inherited by child processes. Even if I system out to
             | another utility script, I don't have to pass a debug flag
             | there either, it's inherited as part of the environment the
             | child runs in.
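
The inheritance point can be checked with a small Python sketch. The helper names are hypothetical; the child is started without an explicit `env` argument, so it receives a copy of the parent's environment.

```python
import os
import subprocess
import sys

def child_sees(name: str) -> str:
    """Spawn a child Python process with no explicit env and return its view of `name`."""
    child_code = f"import os; print(os.environ.get({name!r}, ''))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

def set_flag_and_ask_child(name: str, value: str) -> str:
    """Set the flag in this process only, then ask a child what it observes."""
    old = os.environ.get(name)
    os.environ[name] = value
    try:
        return child_sees(name)
    finally:
        # Restore the parent's environment to what it was before.
        if old is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = old
```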
        
               | [deleted]
        
             | raziel2p wrote:
             | In many programming languages, environment variables are a
             | mutable dictionary/map, which makes it the ultimate place
             | to store global variables.
             | 
             | It's definitely my guilty go-to-hack when I'm not up for
             | refactoring everything to be more functional and/or take a
             | dry_run function parameter everywhere.
        
               | Buttons840 wrote:
               | It's a global mutable map that multiple uncoordinated
               | processes can change. If you want a global map, just make
               | a global map. Or just make global variables, since the
               | variable namespace itself is a map.
        
               | kbenson wrote:
               | All that assumes you have some stuff to set up that
               | global map, and that means that setup code is a
               | requirement.
               | 
               | I end up writing lots of library code. Sometimes that
               | library code is called from within a command line utility
               | I created, sometimes it's called from a web service,
               | sometimes it's just a small driver script because I'm not
               | developing or testing the utility, but the library code
               | itself.
               | 
               | I can make that include to set up the shared global and
               | try to make sure it's included in all instances and all
               | ways I want to use the code, or make all the call sites
               | resilient to it not existing, or I can just use the
               | included OS mechanism for doing this and since that's
               | _always_ available, I get it for free.
               | 
               | Also, dry run mode isn't necessarily something you want
               | set in a config. It's generally something you run once or
               | twice prior to running for real (the normal case) or
               | while in development/debugging. It's not something you
               | would want to set in code and accidentally forget and
               | push live, and generally a good dry run mode will look
               | like it succeeded without actually succeeding, mocking
               | responses that would fail along the way, because you
               | aren't testing one small thing you're testing a workflow
               | of some sort generally which has a few steps.
               | 
               | That said, I fully admit the trade-off might go a
               | different way for different languages. Using a compiled
               | strongly typed language may mean there's enough bits to
               | check that you need to write a debug/dry run helper
               | function to make it convenient, so there's not a lot lost
               | by requiring setup in that as well. But for something
               | like Perl (and I assume Python and Ruby and JS, to almost
               | the same degree) where I can do:
               | 
               |     warn "Calling out to foo() with args: " . Dumper($args)
               |         if $ENV{DEBUG} and $ENV{DEBUG} >= 2;
               |     foo($args) if not $ENV{DRY_RUN};
               | 
               | and it will be completely valid, obvious and idiomatic
               | with zero additional work, there's a real draw to using
               | environment vars for these two specific cases (even if
               | not for all config).
        
       | voakbasda wrote:
       | Simulation is a great thing. I use this wrapper, even in my most
       | trivial shell scripts:
       | 
       |     pretend=${scriptname_pretend:-false}
       |     run() { echo "$@"; $pretend || "$@"; }
       | 
       | Sure, this doesn't give copy/paste re-usable output, but it's
       | really just a quick and dirty sanity check when things start
       | getting complicated.
        
       | deckard1 wrote:
       | If you're doing a lot of modifications to a filesystem or a
       | database, one pattern I've used and liked in the past is to have
       | your code simply dump out the commands _to_ run. Usually to
       | stdout and you save it as a script.
       | 
       | That way you don't need to worry about issues with bugs in the
       | dry run logic _and_ you have a record (a script) of what you
       | actually ran.
       | 
       | Even better if you can extend this output script so that it's
       | idempotent and fails on the first failed command.
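
One way to sketch the "emit a script instead of acting" pattern in Python. `plan_cleanup` is a hypothetical planner, and choosing `rm -f` for idempotence and `set -euo pipefail` for fail-fast behaviour are assumptions, not something the comment specifies.

```python
import shlex

def plan_cleanup(paths: list) -> list:
    """Hypothetical planner: 'rm -f' succeeds even if the file is already gone."""
    return [["rm", "-f", "--", path] for path in paths]

def render_script(commands: list) -> str:
    """Render argv lists as a bash script that stops at the first failing command."""
    lines = ["#!/usr/bin/env bash", "set -euo pipefail", ""]
    # shlex.join quotes each argument so the output is safe to paste or run.
    lines += [shlex.join(argv) for argv in commands]
    return "\n".join(lines) + "\n"
```

Printing `render_script(plan_cleanup(paths))` to stdout gives something that can be reviewed, saved, and then run as-is, so the record of what ran is the script itself.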
        
       | slver wrote:
       | Dry run is OK, but if we want to look at it closer, it occurs in
       | tools with mixed query + command responsibilities.
       | 
       | For example
       | 
       |     rm PATTERN
       | 
       | Is really a shortcut for something like (pseudo code):
       | 
       |     find -name PATTERN | rm
       | 
       | If things were always factored for command/query segregation
       | you'd not need dry-run, you'd simply just run the query.
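
A small Python sketch of that split, with hypothetical function names. Running only the query half is the dry run; the command half deletes exactly what it is handed and nothing else.

```python
import fnmatch
import os

def find_files(directory: str, pattern: str) -> list:
    """Query: return matching file paths without modifying anything."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if fnmatch.fnmatch(name, pattern)
    )

def delete_files(paths: list) -> None:
    """Command: remove exactly the paths it is given."""
    for path in paths:
        os.remove(path)
```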
        
         | rwl wrote:
         | This is a good point, but notice that what makes the
         | segregation easy in your example is that the query and the
         | command have a shared way to refer to what the command will
         | operate on: the filename.
         | 
         | Consider a conceptually very similar case: instead of finding
         | and deleting files in a filesystem, think about finding and
         | deleting lines matching a pattern within a file. Then the query
         | is something like grep...but what do you put for the command on
         | the other side of the pipe?
         | 
         | Of course, you can tell grep to output line numbers and use a
         | command that operates on line numbers, or similar. The point
         | is, in order to achieve this kind of segregation, you need
         | _some_ common way of naming operands on both sides of the
         | divide. And naming things is hard, so segregating things this
         | way is hard, and thus there's a lot of tools with mixed
         | query/command responsibilities.
        
           | slver wrote:
           | PowerShell solved this problem by outputting easy to map
           | objects instead of just plain text. Unfortunately they made
           | it a bit more verbose than it could've been, and the industry
           | didn't recognize the potential, so it remained a niche
           | product.
           | 
           | I still love PowerShell.
        
           | ItsMonkk wrote:
           | Finding is a query. Deleting is destructive. Make sure that
           | you are not nesting your destructions within a query.
           | 
           | You would first build a pure function that read the files and
           | returned matching lines. You could then display the count of
           | the matched lines, and if the user wants they can see those
           | lines and what line number it is. This looks a lot like a
           | dry-run.
           | 
           | You can then pass these results into an executor. You then
           | build a system that combines all deletes of a file and passes
           | that to your DeleteLinesInFile function that does so in a
           | single transaction. Cycle through each file and voila.
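
A sketch of those two stages in Python, assuming for brevity that files are modelled as an in-memory dict of name to text. `find_matches` and `apply_deletes` are hypothetical names, and `apply_deletes` stands in for the DeleteLinesInFile executor mentioned above.

```python
import re
from collections import defaultdict

def find_matches(files: dict, pattern: str) -> list:
    """Pure query: return (filename, line_number) pairs for lines matching pattern."""
    regex = re.compile(pattern)
    return [
        (name, number)
        for name, text in files.items()
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]

def apply_deletes(files: dict, matches: list) -> dict:
    """Executor: group deletions per file and apply each file's batch in one pass."""
    doomed = defaultdict(set)
    for name, number in matches:
        doomed[name].add(number)
    result = {}
    for name, text in files.items():
        kept = [
            line
            for number, line in enumerate(text.splitlines(), start=1)
            if number not in doomed[name]
        ]
        result[name] = "\n".join(kept)
    return result
```

Printing the output of `find_matches` is the dry run; nothing changes until the same list is handed to the executor.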
        
       | lbhdc wrote:
       | I have been incorporating dry run flags into the tools I have
       | been working on (typically etl like tools). Typically this flag
       | only stops the final call to external services that would change
       | the internal state of those services.
       | 
       | It's been really helpful when hacking on things, adjusting logic,
       | or validating that input data does what I think it will.
        
       | JustSomeNobody wrote:
       | I use --dry-run with rsync almost every time I use rsync. I don't
       | really use rsync enough to remember if I should use a trailing
       | slash or not...
        
       | overshard wrote:
       | At my work I've gone through and set up a variety of "replacement"
       | commands for commands that may have dire impact on production
       | systems. All of these replacement commands dry run by default (or
       | fake dry run/try to create a dry run the best they can) and
       | require a "--do-it" (think emperor palpatine voice) flag to do
       | the intended operation. I've had multiple coworkers thank me for
       | this setup as it's saved people from silly typos and mistakes,
       | one such being the different handling of programs in the use of
       | an ending "/" in folders. ex. "/etc" is different than "/etc/"
       | sometimes for certain operations.
       | 
       | I'd like to one day see a global flag on operating systems to dry
       | run nearly all commands that changed anything.
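
A minimal argparse sketch of a wrapper that is dry-run by default. The `--do-it` flag name comes from the comment above; the `target` argument and the "purge" operation are made up for illustration.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Hypothetical wrapper that is dry-run by default"
    )
    parser.add_argument("target")
    parser.add_argument(
        "--do-it",
        action="store_true",
        help="actually perform the operation instead of describing it",
    )
    return parser

def main(argv: list) -> str:
    args = build_parser().parse_args(argv)
    if not args.do_it:
        # Default path: describe the action and change nothing.
        return f"DRY RUN: would purge {args.target} (pass --do-it to execute)"
    return f"purged {args.target}"  # the real operation would go here
```

Because the destructive path needs an extra flag, a typo or a forgotten argument falls back to the harmless behaviour.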
        
         | Hjfrf wrote:
         | It's official best practice for powershell to enable -WhatIf
         | in any potentially dangerous scripts.
         | 
         | Seems to be followed most of the time.
        
         | jonnycomputer wrote:
         | Good idea. I'm writing some db scripts, and I think I'll do
         | this.
        
         | Twirrim wrote:
         | I've largely switched to writing all my tools dry-run by
         | default. We've had a few undesirable events from tools that had
         | --dry-run modes and accidentally got run without. Making --run
         | or similar necessary almost guarantees it's never run without
         | intention.
        
         | greggyb wrote:
         | It should be a parameter that takes as an argument the current
         | user and logs the invocation, "--if-it-breaks-i-take-personal-
         | responsibility $USER" to really make sure folks understand what
         | they're doing.
        
         | nicoburns wrote:
         | > All of these replacement commands dry run by default
         | 
         | Same here. In addition if there's never a use case for running
         | in production (e.g. at one point we had a command that reset a
         | dev environment to a clean slate) then the command will
         | actually specifically check for the production environment and
         | refuse to run as an extra failsafe.
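
A sketch of that extra failsafe in Python. How the environment name is obtained is left out, and the names treated as production are an assumption; `reset_environment` and `RefusedError` are hypothetical.

```python
class RefusedError(RuntimeError):
    """Raised when a command must never run in the given environment."""

# Assumption: these are the names that identify production.
PRODUCTION_NAMES = {"prod", "production"}

def reset_environment(env_name: str, do_it: bool = False) -> str:
    """Reset a dev environment; refuse outright on production, even with do_it."""
    if env_name.strip().lower() in PRODUCTION_NAMES:
        raise RefusedError(f"refusing to reset {env_name!r}")
    if not do_it:
        return f"DRY RUN: would reset {env_name}"
    return f"reset {env_name}"  # the real reset would happen here
```

The production check comes before the dry-run check, so no combination of flags can reach the destructive branch there.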
        
         | IgorPartola wrote:
         | --make-it-so is even better.
        
           | andrewshadura wrote:
           | I sometimes create a makefile with "it so: build" so that I
           | can make it so :)
        
         | danmur wrote:
         | Ditto :). I use --really though.
        
           | MAGZine wrote:
           | --void-warranty
        
             | andrewshadura wrote:
             | --deliberately-do-insane-thing
        
               | matsemann wrote:
               | React.__SECRET_DOM_DO_NOT_USE_OR_YOU_WILL_BE_FIRED
        
               | wtetzner wrote:
               | --launch-the-missiles
        
         | maherbeg wrote:
         | +1. I prefer having a --real flag that you have to invoke.
        
       | ayoisaiah wrote:
       | This resonates with me. I recently published a cross-platform
       | tool [1] for bulk renaming files with dry run as the default. In
       | this mode, it shows the changes about to be made and any possible
       | problems that may occur (such as overwriting existing files). To
       | actually carry out the changes a flag must be used.
       | 
       | I felt it was a good design decision since the effects of bulk
       | renaming can be substantial. Most other similar tools have it
       | backwards with their dry-run mode relegated to a secondary
       | action.
       | 
       | [1]: https://github.com/ayoisaiah/f2
        
       | swyx wrote:
       | all CLIs that do substantial work should have dry runs.
       | 
       | more examples I've collected:
       | 
       | - Angular CLI https://malcoded.com/posts/angular-fundamentals-
       | cli/#--dry-r...
       | 
       | - AWS CLI https://docs.aws.amazon.com/cli/latest/userguide/cli-
       | usage-h...
       | 
       | - Rspec https://relishapp.com/rspec/rspec-
       | core/v/3-8/docs/command-li...
       | 
       | - Serverless framework https://forum.serverless.com/t/dry-run-
       | with-serverless-frame...
       | 
       | at netlify we didn't do a dry run but Netlify Dev is close:
       | https://news.ycombinator.com/item?id=19615546
       | 
       | I'm requesting one at Gatsby:
       | https://github.com/gatsbyjs/gatsby/discussions/16384
        
       | mr-wendel wrote:
       | For all you Bash heads out there:
       | 
       |     DRYRUN=1
       |     VERBOSE=1
       | 
       |     cmd() {
       |         [ $DRYRUN -eq 1 -o $VERBOSE -eq 1 ] \
       |             && echo -e "\033[0;33;40m# $(printf "'%s' " "$@")\033[0;0m" >&2
       |         if [ $DRYRUN -eq 0 ]; then
       |             "$@"
       |         fi
       |     }
       | 
       | And now you can put this all over your scripts to very easily
       | implement dry-run behavior without any quoting worries. Note that
       | you can't invoke aliases this way.
       | 
       |     cmd echo "don't worry about quoting"
       |     cmd do_something_dangerous "$scary_arg" "$scary_arg2"
        
         | tremon wrote:
         | Why not simply
         | 
         |     DRYRUN=echo
         | 
         |     cmd() {
         |         $DRYRUN "$@"
         |     }
         | 
         | ?
        
           | croehrig wrote:
           | I use this all the time (I call it $EXEC instead), and have
           | gotten into the habit of starting all my scripts with
           | EXEC=echo and put it on every command that does any file
           | changes. It has saved my bacon many times.
           | 
           | It doesn't work all the time as the previous poster noted,
           | but it is very low friction which is especially important
           | when writing quick throw-away scripts.
        
           | mr-wendel wrote:
           | The printf version will give you automatic quoting making
           | copy'n'paste really easy to test. It also mixes well w/ a
           | verbose option for "--verbose" or such.
        
             | ebeip90 wrote:
             | For those using zsh, you can get better quoting (and only
             | when necessary, and handles non-printable characters and
             | newlines better) than using printf.
             | 
             |     run() {
             |         if dry; then
             |             echo "${(q-)@}"
             |         else
             |             "$@"
             |         fi
             |     }
        
       | Smaug123 wrote:
       | (Author here.)
       | 
       | A whistlestop tour of the technique "insert a lightweight API
       | boundary down the middle of a tool, for fun and profit". It leads
       | you towards a structure that has lots of benefits, such as
       | meaningful `--dry-run` output and more ready librarification.
        
         | tasogare wrote:
         | Nice to see F# in posts from time to time. I might try the
         | technique in the future.
        
         | b3morales wrote:
         | This was a great read!
         | 
         | It's not material to the point, but I think there's a small
         | typo on line 12 of the second "Finishing the example" snippet:
         | it looks like
         | 
         |     let instructions = gather args
         | 
         | should be
         | 
         |     let instructions = gather inputGlobs
        
           | Smaug123 wrote:
           | Thanks!
           | 
           | You're quite right; thanks for letting me know. I'll get that
           | fixed.
        
       | gm wrote:
       | What do you guys think of "dry run by default" where you have to
       | turn dry run OFF via an argument? I've done this on some of my
       | utilities just because screwing up the arguments would result in
       | much grief (lost data).
        
       | gorgoiler wrote:
       | I don't even trust dry run for some things.
       | 
       | Yesterday I wrote a cleanup tool. In fact it's two tools: one
       | that outputs a parseable list of things to clean up...
       | 
       |     # snapshotX # keep
       |     snapshotY # delete
       | 
       | ...and another that consumes the list and does stuff. It's the
       | only way to be sure.
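
A sketch of the two halves in Python, assuming the list format works the way the example suggests: a line starting with `#` is commented out (kept), and anything after a `#` on a live line is just a note. Both function names are hypothetical.

```python
def render_plan(decisions: dict) -> str:
    """First tool: turn {name: should_delete} into a reviewable, editable list."""
    lines = []
    for name, should_delete in decisions.items():
        lines.append(f"{name} # delete" if should_delete else f"# {name} # keep")
    return "\n".join(lines)

def parse_plan(text: str) -> list:
    """Second tool: return only the names that are still live in the list."""
    names = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and trailing notes
        if entry:
            names.append(entry)
    return names
```

The list can be inspected or hand-edited between the two steps, and the second tool never decides anything on its own.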
        
       | im3w1l wrote:
       | First I found myself nodding along, but then a troublesome
       | question popped into my mind. What about TOC/TOU race conditions?
        
         | iudqnolq wrote:
         | I think this can be mitigated with more abstract descriptions,
         | like:
         | 
         | "Would have gotten all servers tagged foo (currently 1,341) and
         | deleted them"
        
           | Smaug123 wrote:
           | That's all very well for just `--dry-run`, but it's still a
           | problem if you use the method in OP to _architect the entire
           | tool_ (and in particular the non-dry-run execution) around
           | the existence of a `--dry-run` phase.
        
             | iudqnolq wrote:
             | I don't think so? The planning stage generates something
             | like DeleteOp(ServerSpec { tag: "foo" }).
        
               | Smaug123 wrote:
               | I think the problem is that describing it as "would have
               | deleted 1378 servers", and then going off and deleting
               | those 1378 servers, doesn't solve the race condition
               | problem that leaves you with some servers left undeleted
               | because they popped into existence after the check.
               | 
               | With server deletion, of course, the problem is less
               | visible because it's naturally idempotent (unless you're
               | referring to servers by some non-unique key). In that
               | case, you can safely just pretend the leftover servers
               | came into existence after the entire tool came into
               | being.
               | 
               | Of course, if you're referring to servers by name or IP
               | or something, then you hit exactly the problem.
               | Concretely: I run a command that deletes all servers
               | older than one day. The 'dry-run' section determines what
               | servers need to be deleted and it turns out that server
               | "foo" needs to be deleted. Elsewhere, someone spots that
               | "foo" exists, deletes it, and creates a new server with
               | the name "foo". Then the 'execute' flow deletes the new
               | "foo", which is entirely not the server we wanted to
               | delete.
        
               | cmeacham98 wrote:
               | Your example inaccurately describes what would happen
               | with GP's approach.
               | 
               | Planning stage would generate
               | `Delete(Server{CreatedBefore: 1621887357})`.
               | 
               | It would output something like "Deleting servers created
               | before $time$, currently $servers$."
               | 
               | The point is the planning stage would encode the
               | operation being done, not a list of servers to delete.
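
A Python sketch of a plan entry that stores the rule rather than a list of names. The class name and the inventory shape (server name mapped to a creation timestamp) are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DeleteServersCreatedBefore:
    """Hypothetical plan entry: stores the rule, not a list of server names."""
    cutoff: int  # Unix timestamp

    def select(self, servers: dict) -> list:
        """Evaluate the rule against the inventory as it is right now."""
        return sorted(n for n, created in servers.items() if created < self.cutoff)

    def describe(self, servers: dict) -> str:
        current = ", ".join(self.select(servers)) or "none"
        return f"Deleting servers created before {self.cutoff}, currently: {current}"

    def execute(self, servers: dict) -> list:
        doomed = self.select(servers)  # re-evaluated at execution time
        for name in doomed:
            del servers[name]
        return doomed
```

If "foo" is deleted and recreated between `describe` and `execute`, the new "foo" has a later creation time and no longer matches the rule, so it is left alone.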
        
               | Smaug123 wrote:
               | Ohh, sorry - I got the wrong end of the stick entirely.
               | Yes, in that case I'm satisfied!
        
         | Smaug123 wrote:
         | Indeed, that's the thing I rather swept under the rug. I'll
         | definitely take precautions if it's at all easy to avoid the
         | assumption that just because my `--dry-run` stage said
         | something was safe, it will definitely be safe by the time
         | execution passes to the "do-it" stage. However, in general you
         | are indeed doomed the instant you decided to split the tool up
         | this way.
         | 
         | I have never actually encountered such a race condition while
         | using any tool I've written this way. But I do remember one
         | place where I specifically didn't add a feature to the tool
         | because it was so vulnerable to a TOC/TOU race condition.
         | Instead, I turned that feature into a "did I just manage to do
         | the thing successfully, and if not, why not" check at the end
         | of the "do-it" stage.
         | 
         | So yes, that concern is extremely valid, and if you're doing an
         | operation whose validity is liable to change after you've done
         | the check, then you do just need to follow a different approach
         | for that operation. (But then there's probably no possibility
         | even in principle of a `--dry-run` for such an operation
         | anyway!)
        
           | mst wrote:
           | Something I've found can work is "check, output plan, prompt
           | for confirmation, if 'yes' execute plan in a way that
           | maximises the odds of bailing out if anything changed while
           | you were displaying the prompt" - for bonus points,
           | recalculate after each action that had to be simulated and do
           | something policy-defined if anything differs from your prior
           | expectations.
           | 
           | Obviously the effort of doing this becomes a trade-off but
           | it's worth considering.
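
A rough Python sketch of the check, plan, confirm, execute loop, assuming state is a simple dict and that "bail out" means raising before touching anything that no longer matches the plan. The `confirm` callable stands in for the interactive prompt; all names are hypothetical.

```python
def make_plan(state: dict, cutoff: int) -> dict:
    """Record each doomed key together with the value observed at planning time."""
    return {key: value for key, value in state.items() if value < cutoff}

def execute_plan(state: dict, plan: dict, confirm) -> list:
    """Apply the plan only if confirmed, aborting as soon as reality has drifted."""
    if not confirm(plan):
        return []
    done = []
    for key, expected in plan.items():
        # Re-check right before acting; a mismatch means something changed
        # while the prompt was being displayed.
        if state.get(key) != expected:
            raise RuntimeError(f"{key} changed since planning; aborting")
        del state[key]
        done.append(key)
    return done
```

In a command-line tool, `confirm` could be something like `lambda plan: input("Proceed? ") == "yes"`.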
        
       | da39a3ee wrote:
       | This is obvious but a good way to implement dry run in some
       | settings is to obtain a read-only connection to the relevant
       | datastore / roll back a transaction without committing, etc.
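
The rollback variant can be sketched with Python's built-in sqlite3 module. The `events` table and the `purge_old_rows` helper are hypothetical; the sketch relies on sqlite3's default behaviour of opening a transaction before a DELETE.

```python
import sqlite3

def purge_old_rows(conn: sqlite3.Connection, cutoff: int, dry_run: bool = True) -> int:
    """Delete rows older than cutoff; roll back instead of committing on a dry run."""
    cursor = conn.execute("DELETE FROM events WHERE created < ?", (cutoff,))
    affected = cursor.rowcount
    if dry_run:
        conn.rollback()  # the delete really ran, but nothing is persisted
    else:
        conn.commit()
    return affected
```

The dry run exercises the same statement as the real run, so the reported row count reflects what would actually be deleted.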
        
       | SavantIdiot wrote:
       | Duplicity has this option. I love it. Especially when something
       | bad can happen, like blowing up my disk because an ISO image is
       | in the backup area.
        
       ___________________________________________________________________
       (page generated 2021-05-24 23:00 UTC)