[HN Gopher] Dockerfile Security Best Practices
       ___________________________________________________________________
        
       Dockerfile Security Best Practices
        
       Author : gbrindisi
       Score  : 287 points
       Date   : 2020-10-14 14:17 UTC (8 hours ago)
        
 (HTM) web link (cloudberry.engineering)
 (TXT) w3m dump (cloudberry.engineering)
        
       | usr1106 wrote:
       | I think he mixes 2 aspects. There is security and there is
       | reproducibility/traceability/reliability.
       | 
       | For security using the latest versions of both base images and
        | packages is typically a good thing. The cases where the newest
        | package is more vulnerable than something 1, 2, or 3 years old
        | are not that common.
       | 
       | However, if your process requires reproducibility/traceability
       | (medical and other regulated domains) you cannot just deploy the
       | latest and greatest. You need to pass it through some release
       | process first. That should not be an excuse to run outdated,
       | vulnerable software though. The same holds if you require high
       | availability. Even if you might not need to document what you are
       | using, you want to test whether it causes performance issues
       | (zero performance is the worst one...).
        
       | jtchang wrote:
       | How do you get around sometimes needing root inside the container
       | to build things? For example building a container with buildroot
       | inside.
        
         | vitalysh wrote:
          | Using an intermediate container to build, then copying over
          | the resulting binaries, should work.
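          | 
          | A minimal multi-stage sketch (hypothetical Go app and paths;
          | adjust to your toolchain):
          | 
          |   FROM golang:1.15 AS builder
          |   WORKDIR /src
          |   COPY . .
          |   # root is fine here - this stage is thrown away
          |   RUN go build -o /app .
          | 
          |   FROM debian:buster-slim
          |   COPY --from=builder /app /usr/local/bin/app
          |   USER nobody
          |   ENTRYPOINT ["/usr/local/bin/app"]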
        
       | EdSchouten wrote:
       | If you want to create container images for an application you
       | wrote yourself in a commonly used programming language (Go,
       | Python), consider using Bazel with rules_docker:
       | 
       | https://github.com/bazelbuild/rules_docker
       | 
       | rules_docker allows you to create byte-for-byte reproducible
       | container images on your system, without even having a Docker
       | daemon installed. So much cleaner to use than 'docker build' once
       | you get the hang of it!
        
       | chias wrote:
       | > Do not upgrade your system packages
       | 
       | Ah, the joys of working somewhere that isn't required to
       | document, answer for, and ultimately remediate _every CVE that is
       | present in any package installed on any of your containers_
        | within your production application. Sadly, compliance and
        | regulatory oversight don't leave this option open to everyone.
        
         | cmwelsh wrote:
         | Is this a good argument for building containers as "bare metal"
         | as possible? You don't have to remedy CVEs (and rebuild your
         | containers) for anything that isn't actually your application.
        
       | vorticalbox wrote:
        | Quick question: if you're not meant to use env for secrets,
        | then how are you meant to get secrets into your application?
       | 
       | What's the best way to handle this?
        
         | keikun17 wrote:
          | Using BuildKit: https://pythonspeed.com/articles/docker-build-
          | secrets/
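          | 
          | Roughly (the secret id and file name here are made up; this
          | needed the experimental BuildKit syntax at the time):
          | 
          |   # syntax=docker/dockerfile:experimental
          |   RUN --mount=type=secret,id=npm_token \
          |       NPM_TOKEN=$(cat /run/secrets/npm_token) npm install
          | 
          | built with:
          | 
          |   DOCKER_BUILDKIT=1 docker build --secret id=npm_token,src=.npm_token .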
        
         | tynorf wrote:
         | Probably either via a third party service (such as AWS secrets
         | manager), or mounted as files scoped to the user your process
         | is running as (which is not root, right? :) ).
        
         | raxor53 wrote:
         | Typically your host will have a service specifically for
         | secrets. For example: https://docs.github.com/en/free-pro-
         | team@latest/rest/referen...
        
       | tofflos wrote:
       | The number one advice should be to use a linter. The number two
       | advice should be to use an image security scanner. These tools
       | combined will prevent most issues. Integrate them with CI to
       | enforce a common set of best practices across an organization and
       | to prevent security bike shedding.
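        | 
        | A minimal CI step could look like this (assuming hadolint and
        | trivy, but any equivalent tools work):
        | 
        |   hadolint Dockerfile
        |   docker build -t myapp:ci .
        |   trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:ci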
        
       | minimaxir wrote:
       | Google's distroless containers are an interesting approach for
       | both security and performance as well, albeit with limited
       | language support:
       | https://github.com/GoogleContainerTools/distroless
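        | 
        | The usual shape is a multi-stage build that drops everything
        | but the binary (a sketch, assuming a static Go binary):
        | 
        |   FROM golang:1.15 AS build
        |   WORKDIR /src
        |   COPY . .
        |   RUN CGO_ENABLED=0 go build -o /app .
        | 
        |   FROM gcr.io/distroless/static
        |   COPY --from=build /app /app
        |   ENTRYPOINT ["/app"]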
        
         | _nhynes wrote:
         | Relatedly, rust-musl-builder [0] is useful for getting Rust
         | binaries to run on the `static` instead of `cc` base image.
         | 
         | [0] https://github.com/emk/rust-musl-builder
        
       | m0zg wrote:
       | Why would you write such an article and not show what the actual
       | _recommended_ practices look like?
        
       | pella wrote:
        | imho: always check what is upgrading before making a decision..
        | 
        | ubuntu:20.10 now wants to upgrade "libssl1.1"!
        | 
        |   docker run --rm -it ubuntu:20.10 bash -c "apt update && apt upgrade"
        |   ...
        |   The following packages will be upgraded:
        |     debianutils diffutils findutils gcc-10-base libgcc-s1
        |     libgnutls30 libprocps8 libssl1.1 libstdc++6 libsystemd0
        |     libudev1 procps sed zlib1g
        | 
        | ubuntu:20.04 is better:
        | 
        |   docker run --rm -it ubuntu:20.04 bash -c "apt update && apt upgrade"
        |   ...
        |   The following packages will be upgraded:
        |     gcc-10-base libgcc-s1 libstdc++6 zlib1g
        
         | mfontani wrote:
          | One could do worse than using a (more) stable distribution
          | for their base images.
          | 
          | 20.10 is not an LTS, so it's more likely to get semi-spurious
          | updates than an LTS version like 20.04 or, say, debian stable.
          | 
          | For most of my non-alpine-based images, I use
          | debian:buster-slim as base, as it's got a fairly stable base
          | and gets quite routinely updated:
          | 
          |   $ docker run --rm -it debian:buster-slim bash -c \
          |       "apt update >/dev/null 2>&1 && apt upgrade"
          |   Reading package lists... Done
          |   Building dependency tree
          |   Reading state information... Done
          |   Calculating upgrade... Done
          |   0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
        
           | pella wrote:
            | > I use debian:buster-slim as base
            | 
            | this image was updated "39 hours ago".. so you have to
            | check again 1-2 months later
            | 
            |   debian  buster-slim  f49666103347   39 hours ago   69.2MB
        
             | mfontani wrote:
             | That's part of my point ;) It's updated very often.
             | 
             | I check for base image updates every day, and it's one of
             | the most oft-updated ones -- hence my preference for using
             | it as "the" base of all others.
        
               | TimWolla wrote:
               | Monthly typically (and when important security updates
               | are released): https://github.com/docker-
               | library/official-images/pulls?q=is...
        
       | tetha wrote:
       | Besides the Docker practices - does someone have experience with
       | OPA? This is the first time I've heard of that tool, but an
       | extensible policy tool like that might solve a lot of challenges
       | we have at work.
        
         | andyroid wrote:
         | We use OPA for use cases ranging from kubernetes admission
         | control, to microservice authorization and CI/CD pipeline
          | policies. It's one of those tools you wonder how you ever
          | lived without once you start using it.
        
       | blago wrote:
       | I would add one more general security tip. Always restrict your
       | mapped ports to localhost (as in 127.0.0.1:3306:3306) unless you
       | really want to expose it to the world.
       | 
       | I think it's counterintuitive, but I learned the hard way that
       | 3306:3306 will automatically add a rule to open the firewall on
       | linux and make MySQL publicly accessible.
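        | 
        | Concretely:
        | 
        |   # reachable from the host only
        |   docker run -p 127.0.0.1:3306:3306 mysql
        | 
        |   # bound on all interfaces - Docker punches through the
        |   # host firewall for this
        |   docker run -p 3306:3306 mysql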
        
         | junon wrote:
         | > automatically add a rule to open the firewall on linux
         | 
         | There's no way this is true. This completely defeats the
         | purpose of a firewall.
         | 
          | If it is happening, then it's Docker doing this - not the
          | linux firewall that comes stock with most distributions
          | (iptables and the like). They would never simply add rules
          | to the chains just because something was listening on that
          | port.
         | 
         | Really, the best security advice for using Docker is to not use
         | Docker. Unfortunately, there aren't very many "hold your hand"
         | alternatives available. Aside from LXD and that family of
         | technologies, which are criminally underused.
        
           | hxtk wrote:
           | It has to do with the implementation of the networking. The
           | DOCKER chain in the nat table gets hit by inbound traffic on
           | the PREROUTING chain before UFW's chains on the filter table.
            | IIRC, you can get around this by directing UFW to write
            | its rules to the DOCKER-USER chain.
           | 
           | Firewalld is implemented differently and will exhibit the
           | expected blocking behavior: traffic targeting docker ports
           | will encounter firewalld before it encounters docker.
        
           | Nullabillity wrote:
           | Docker uses iptables for port forwarding, and those rules
           | typically end up ahead of the rules inserted by your firewall
           | (firewalld/ufw/manual scripts/whatever).
           | 
           | It's not so much that they explicitly open a firewall rule,
           | as that they take a networking path that isn't really covered
           | by traditional firewalls.
           | 
           | Another way of viewing it is that Docker "is" your firewall
           | for your container workloads, and that adding a port-forward
           | is equivalent to adding a firewall rule. Of course, that
           | doesn't change that public-by-default is a bad default.
        
             | l3s2d wrote:
             | This has been a major pain point for me. Despite my
             | `firewalld` configuration only allowing specific traffic,
             | all my containers were exposed.
             | 
             | My current policy is to set `"iptables": false` in Docker's
             | `daemon.json` on any public machine. I don't understand why
             | this isn't the default.
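              | 
              | That is, in /etc/docker/daemon.json:
              | 
              |   { "iptables": false }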
        
               | Nullabillity wrote:
               | > My current policy is to set `"iptables": false` in
               | Docker's `daemon.json` on any public machine. I don't
               | understand why this isn't the default.
               | 
               | If you don't muck with iptables then you need a (slow)
               | userspace proxy to expose your pods. That also means
               | losing things like the source IP address for any incoming
               | connections.
        
             | junon wrote:
             | This is right, I remember now - docker does mangle your
             | iptables chains. I remember fighting with this a while
             | back.
             | 
             | Terrible practice, in my opinion. Docker shouldn't be
             | touching firewall stuff.
        
               | krab wrote:
               | Iptables magic is essential to how a lot of container
               | networking stuff is implemented, though.
        
               | peterwwillis wrote:
               | This is (imho) a huge flaw in the concept of a
               | "container". I don't think most people comprehend how
               | much crap is going on in the background.
               | 
               | For _most_ container purposes, host networking and the
               | default process namespace is absolutely fine, and reduces
               | a lot of problems with interacting with containerized
               | apps. 95% of the use case of containers is effectively
               | just a chroot wrapper. If you _need_ more features, this
               | should be optional. This would also make rootless
               | federated containerized apps just work. But nobody wants
               | to go back to incremental features if Docker gives them
               | everything at once.
        
               | thr0w3345 wrote:
               | If you think that's bad, wait til you see what the
               | iptables-save output is like on an istio-proxy sidecar ;)
        
               | unilynx wrote:
               | I've resorted to adding my own firewall rules to the
               | 'raw' table, which pretty much preempts all the rules
               | Docker or the distribution inserts.
               | 
               | It's not as powerful as the later tables in the chain
               | (see https://upload.wikimedia.org/wikipedia/commons/3/37/
               | Netfilte... ) but a lot more robust.
        
           | [deleted]
        
           | acdha wrote:
           | If this is a surprise to you, scan your servers and see what
           | else has helpful behavior which you didn't expect. Services
           | like shodan.io will even do this regularly and email you when
           | things change.
           | 
           | It's easy to blame Docker but I've seen this failure mode
           | happen many times over the years - even experienced admins
            | make mistakes. As always, defense in depth and continuous
            | monitoring trump raging about software.
        
             | junon wrote:
             | I don't disagree with any of your points, but they aren't
             | relevant to anything I've said. I never said Docker is the
             | only one doing this. It's also not a surprise Docker in
             | particular is doing this - Docker has a long history of
             | doing bad things.
        
               | acdha wrote:
               | It is relevant because you said "there's no way this is
               | true" when it is in fact true, which means that your
               | understanding of how the system works doesn't match the
               | actual behaviour. I mentioned the importance of scanning
               | to catch those situations quickly.
        
           | kspacewalk2 wrote:
           | >Aside from LXD and that family of technologies, which are
           | criminally underused.
           | 
           | Criminally underused indeed. I have no idea why it's not more
           | popular for 'average' users/orgs. I don't know what issues
           | may come up with scaling this up, but in our small org we've
           | been running 20-30 (mostly unprivileged) LXD containers in
           | production for several years now for all sorts of intranet
           | and external-facing services (auth, DB, web, etc). Sure, it
           | requires a bit more thought to set up than Docker, but it's
           | well-documented (for most people's uses at least), secure,
           | stable and lightweight.
        
             | scns wrote:
             | >I have no idea why it's not more popular for 'average'
             | users/orgs.
             | 
              | Maybe because many devs use Macs / Windows? Maybe WSL may
              | tilt the balance in LXD's favour, but on OSX? Run it in a
              | VM yourself without the conveniences of docker-compose
              | up?
              | 
              | As a Linux user myself, I looked at the competing
              | solutions (podman, LXD, and the one by Canonical whose
              | name I forgot) and thought: "Ain't gonna fly in a mixed
              | environment."
             | 
             | They may be technologically superior, worse is better once
             | again i guess. Would prefer them to Docker too.
        
         | gchamonlive wrote:
          | I like creating a security choke point, like a firewall in a
          | VM serving as a NAT gateway, or actual cloud security groups
          | and network ACLs.
         | 
         | This way you can make all your servers private and manage the
         | firewall in a single access point to the outside world.
         | 
         | Making your servers public to the net by default and without a
         | separate firewall solution is not so advisable in the first
         | place.
        
           | chousuke wrote:
           | People who know enough to consider architectures like this
           | aren't the ones most likely to accidentally expose databases
           | to the internet. It happens, but most often these mistakes
           | are made by people who just don't have the experience to be
           | wary.
           | 
            | I think software like Docker has a responsibility to
            | encourage secure-by-default configurations, but
            | unfortunately "easy" is often the default that wins
            | mindshare.
        
             | carlmr wrote:
              | I agree with you, but since Docker is kind of a given,
              | how can one learn the necessary stuff about networking so
              | as not to make these mistakes?
              | 
              | I always see best practices like this, but they don't
              | really help in grokking what's happening and why. I'd
              | like to know more about the networking stuff, but
              | whenever I look something up it's very specific, so you
              | don't really learn why it's bad.
             | 
             | How can a regular user understand how the network stack
             | works? At least enough to get an instinct why something
             | would be bad.
        
               | gchamonlive wrote:
                | I guess you are cutting straight to the chase and
                | overlooking the fundamentals. I took a lot from the
                | Well-Architected framework from AWS and applied it in
                | all my projects.
               | 
               | https://aws.amazon.com/architecture/well-architected/?wa-
               | len...
               | 
               | Take a look at the security pillar with extra care. For
               | the cloud I would suggest you take a basic practitioner
               | exam, or at least a preparation course in a platform like
                | whizlabs. There you would get a basic understanding of
                | how networking is laid out in the cloud.
               | 
               | For private, on-premises projects, it really comes down
               | to what you have at hand. In this case maybe the Google
               | SRE book would be good. You take good practices in
               | maintaining a data center and apply the distilled
               | knowledge to what makes sense to your infrastructure:
               | 
               | https://landing.google.com/sre/sre-book/toc/index.html
               | 
                | Read this book by topic, not sequentially, coming back
                | to the fundamentals when you feel lost, otherwise you
                | might end up lost in technicalities that make little
                | sense to your work.
               | 
                | Also take a look at the shared responsibility
                | principle. It spells out which responsibilities fall to
                | the client and which to the cloud provider. When you
                | have a private on-premises project, all you have to do
                | is implement the entire responsibility stack that the
                | cloud handles for you.
        
             | gchamonlive wrote:
              | I am not sure why you were downvoted. I agree with that.
              | I prefer technologies that are restrictive by default,
              | with more flexible and potentially harmful configurations
              | hidden behind explicit and well structured options.
              | 
              | Either an exception should be raised or a safe default
              | behaviour should be adopted when such a case is
              | encountered. I prefer breaking as soon as possible
              | because the alternative is harder to debug.
        
           | optimuspaul wrote:
           | Exactly. A public facing ip for a server, especially a
           | database, is just a bad idea. You need something a bit more
           | hardened to route the traffic. And a publicly accessible
           | database is just asking for trouble.
        
         | optimuspaul wrote:
         | Why would you have a database running on a machine with a
         | public ip?
        
         | BossingAround wrote:
         | > I learned the hard way that 3306:3306 will automatically add
         | a rule to open the firewall on linux and make MySQL publicly
         | accessible.
         | 
          | Is that true? _3306:3306_ would bind the port on all
          | interfaces, but I was under the assumption you'd have to
          | explicitly enable firewall port 3306 for the machine to
          | accept traffic to port 3306 from outside of your machine.
         | 
         | I'll have to test that.
        
           | blago wrote:
           | https://github.com/docker/for-linux/issues/690
        
           | isbvhodnvemrwvn wrote:
           | From memory and a quick glance at one of my servers:
           | 
           | When an IP packet related to your container arrives (<host
           | ip>:<host port>):
           | 
           | - docker rewrites it to target <your container's ip
           | address>:<your container's port> (NAT table, chain PREROUTING
           | delegates to DOCKER chain with the relevant DNAT entry)
           | 
           | - since the IP does not match your host, the packet is
           | forwarded (there's a relevant routing entry pointing to
           | docker's virtual interface)
           | 
           | - the first thing you encounter in the FORWARD chain of the
           | FILTER table is a few jumps to the docker-related chains,
           | DOCKER chain in particular accepts all packets destined to
           | <your container's ip address>:<your container's port>
           | 
           | So a few takeaways:
           | 
           | - your standard firewall might not be involved because its
           | chains are plugged in _after_ the docker chains in the
           | FORWARD chain (e.g. ufw under Ubuntu)
           | 
            | - if the above is true and you want your firewall to
            | matter, you have to add stuff to the DOCKER-USER chain in
            | the FILTER table (see the sketch below)
            | 
            | - at that point the host port and IP don't matter since
            | they've already been mapped in the NAT table's PREROUTING
            | chain at the beginning of processing - write your firewall
            | rules to address specific containers
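            | 
            | A minimal DOCKER-USER sketch (container IP and trusted
            | subnet are made up):
            | 
            |   iptables -I DOCKER-USER -d 172.17.0.2 ! -s 10.0.0.0/8 -j DROP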
        
         | nickjj wrote:
          | Yep, this was especially deadly a few years back with Redis
          | because by default it bound to 0.0.0.0 and allowed
          | connections without a password.
         | 
         | So if you had 6379:6379 published anyone from the internet
         | could connect to your Redis container. Oops.
         | 
         | > Always restrict your mapped ports to localhost (as in
         | 127.0.0.1:3306:3306) unless you really want to expose it to the
         | world
         | 
         | It's also worth pointing out that typically you don't even need
         | to publish your database port. If you omit it, containers on
         | the same network can still communicate with each other.
         | 
         | Very rarely do I end up publishing ports in production. If you
         | wanted to connect to postgres you could still docker exec into
         | the container on the box it's running on and interact with your
         | db using psql that's already installed in that container.
         | 
         | A while back I wrote a blog post on the difference vs exposing
         | and publishing ports at https://nickjanetakis.com/blog/docker-
         | tip-59-difference-betw....
        
           | amanzi wrote:
           | This is why I like to use both a host-based firewall __as
           | well as__ a network-based firewall. For the VPS's that I have
           | running on the internet, I always use the hosting provider's
           | firewall offering in addition to iptables (or ufw).
        
             | MaxBarraclough wrote:
             | This is something AWS gets right, and Digital Ocean and
             | Linode both get wrong (they offer no cloud firewall of any
             | sort, to my knowledge).
             | 
             | It should be trivial for me to lock down the ports of my
             | instance, from the VPS web UI. AWS lets me create a new
             | instance which is entirely closed to incoming connections
             | except for port 22, which is closed except for whitelisting
             | my IP address. This gives me good assurances even if my
             | instance is running a vulnerable SSH server. It's also
             | trivial to block outgoing connections, where that's
             | appropriate.
             | 
             | It also means my instance is spared from constant probing,
             | which keeps the logs clear.
        
               | nickjj wrote:
               | > Digital Ocean and Linode both get wrong (they offer no
               | cloud firewall of any sort, to my knowledge)
               | 
               | DO has had a cloud firewall for a while now (a year or
               | 2?)
               | https://www.digitalocean.com/docs/networking/firewalls/.
               | It's free too.
               | 
               | Looks like Linode has one coming soon
               | https://www.linode.com/products/firewall/.
        
               | MaxBarraclough wrote:
               | That's good news, thanks.
        
       | tigrezno wrote:
       | About the "apt upgrade". In major distributions, that's safe.
       | They won't introduce major package versions, only minor (and
       | security fixes).
       | 
       | It should be totally OK to upgrade the packages.
       | 
       | Also I don't get that it's advised to use "apt update" when you
       | can't do "apt upgrade". What's the point??
        
         | diggan wrote:
         | It's not entirely clear from the article but my guess would be
         | that you should really update the repository + base
         | dependencies in the base image, not the "application" image or
         | whatever you wanna call it. But not the end of the world if
         | you're not reusing that base image. Feels weird that an article
         | on "Security Best Practices" would advise _against_ latest
         | versions of your software, especially using apt as an example
         | which is usually used in stable distributions (major version
          | upgrades happen seldom) like Ubuntu/Debian.
         | 
         | But yeah, that last part doesn't make any sense. If you're not
         | running `apt-get upgrade`, it doesn't make sense to run `apt-
         | get update` as nothing is using the newly fetched data
         | anyways...
        
         | kortex wrote:
         | Apt update just updates the cache of the package lists -
         | /var/lib/apt/lists/ - based on your lists -
         | /etc/apt/sources.list.d/. "upgrade" actually upgrades the code
         | on the system.
         | 
         | `rm -rf /var/lib/apt/lists/*` in the same RUN decreases bloat
         | and I think possibly decreases cache misses as well.
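          | 
          | i.e. the usual pattern, all in one layer ("some-package"
          | being whatever you actually need):
          | 
          |   RUN apt-get update \
          |    && apt-get install -y --no-install-recommends some-package \
          |    && rm -rf /var/lib/apt/lists/*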
        
           | Nullabillity wrote:
           | > and I think possibly decreases cache misses as well.
           | 
           | Unlikely, you'd still have misses because of stuff like file
           | metadata mismatches (think modification times).
        
           | richardwhiuk wrote:
           | Base images usually ship with that done. `apt-get update`
           | just undoes the good work of removing irrelevant lists.
        
             | acdha wrote:
             | Some base images do, but some don't, and they update on
             | their schedule rather than yours. Doing the update on your
             | schedule is trivial, has no meaningful downside, and means
             | it's done and tested as quickly as you need.
        
             | kortex wrote:
             | Huh, TIL. I'll give that a try. Usually, that's the least
             | of my problems. In my line of work, I'm often doing things
             | which would likely horrify most webapp devs, like building
             | a container with multiple conda envs in it. OTOH, most of
             | these containers run on airgapped systems with petabytes of
             | storage, so shipping a bit of bloat barely hurts the end
             | user. But every time I do, I die a bit on the inside.
        
         | shaded-enmity wrote:
          | If you upgrade your packages during a build, then 2 people
          | building your Dockerfile at two distinct times can
          | inadvertently end up with 2 different images. So rather than
          | focusing on
         | "we're using base-image:1.0.2" you need to start asking
         | questions like "list all packages and versions and start
         | comparing those".
         | 
         | If there's a security issue you need to rebuild your images
         | anyway and optimally you have a system in place that represents
         | images and their dependencies as some sort of dependency graph
         | structure, so when you upgrade your base image all dependent
         | images get rebuilt automatically.
        
       | nomadiccoder wrote:
       | This article is interesting but it would be far more helpful to
       | link to a solution for each point.
        
       | sgnnseven wrote:
       | Solid post though there are a couple of things I would disagree
       | with:
       | 
       | > Do not upgrade your system packages
       | 
       | Most distros will have smooth upgrades and provide you with
       | patched libs that your app may need and the latest image may not
       | provide. It's slightly more prone to breaks but it creates a less
       | vulnerable runtime app env.
       | 
       | > Do not use 'latest' tag for base image
       | 
        | Depends on the org, but sometimes pinning means that you will
        | likely end up using an end-of-life image because it requires
        | proactive work to maintain. If you leave it as 'latest' this
       | won't happen but you will get out-of-band breaks to keep that
       | working. Choose wisely.
       | 
       | A few things I would add too:
       | 
       | - Don't mount Docker socket into _any_ container unless
       | absolutely necessary
       | 
       | - Your biggest security threat will be from your app's
       | dependencies, not the container's setup
       | 
       | - Do not run a full init system unless absolutely necessary as
       | this is just a security disaster waiting to happen. There are
       | valid use cases for it but they're rare.
        
         | TheGallopedHigh wrote:
         | Can you explain your last point further please?
        
           | daxelrod wrote:
           | "Full" init systems tend to need to do things that are hard
           | to secure in a container.
           | 
           | Many must run as root, and the reasons not to do that are
           | discussed in the article this HN thread is discussing.
           | 
           | Systemd is particularly tricky because it needs to be able to
           | control the cgroups of its child processes, which means the
           | container needs to be granted that capability. See
           | https://developers.redhat.com/blog/2019/04/24/how-to-run-
           | sys... about how to run systemd in a container via Podman,
           | and is a follow up to
           | https://developers.redhat.com/blog/2016/09/13/running-
           | system... which discusses why the Docker case is even more
           | difficult.
           | 
           | That said, if you just want a process supervisor for a multi
           | process container, there are several more minimal init
           | systems that will work well, for example, supervisord.
        
       | andmarios wrote:
        | Dockerfile security best practices: treat it as you would
        | treat any linux server. :)
        
       | azangru wrote:
       | A very nice web page, too! Snappy, no javascript apart from
       | analytics, tastefully designed, mobile-friendly.
        
       | lipanski wrote:
       | One of the most common mistakes I see is not using a
       | .dockerignore file or, better said, relying on .gitignore when
       | calling `COPY` on entire directories. Without a .dockerignore
       | file in place, you could be copying over your local .env files
       | and other unwanted things into the final image.
       | 
       | On top of that, you might also want to add `.git/` to your
       | .dockerignore file, as it could significantly reduce the size of
       | your image when calling `COPY`.
       | 
       | A more subtle issue I've noticed is the fact that `COPY`
       | operations don't honour the user set when calling `USER`. The
       | `COPY` command has its own `--chown` argument, which needs to be
       | set if you'd like to override the default user (which is root or
       | a root-enabled user in most cases).
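        | 
        | For example (the "app" user here is hypothetical - use
        | whatever unprivileged user your image creates):
        | 
        |   # .dockerignore
        |   .git/
        |   .env
        | 
        |   # Dockerfile
        |   USER app
        |   COPY --chown=app:app . /app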
       | 
       | I wrote up a similar article a while back, though it's focused on
       | general best practices: https://lipanski.com/posts/dockerfile-
       | ruby-best-practices
        
         | silverwind wrote:
          | Better yet, only COPY the actually needed files instead of
          | the whole working directory. That way, there's no need for a
          | `.dockerignore`.
        
           | lipanski wrote:
           | I'm not sure that's always practical. Consider the average
           | Rails or Symfony app - you'd have to include quite a few
           | files (even if you add entire directories at a time).
        
           | raju wrote:
           | While this is a good idea, having a `.dockerignore` reduces
           | how much Docker has to load into the build context. For
           | projects with large histories, the `.git` directory itself
           | can be rather large. Add to that directories that hold build
           | artifacts, documentation, and you are unnecessarily
           | increasing the time it takes to start the build process.
        
           | unleashit wrote:
            | Wondering why you think this is better. Not sure a messy
            | dockerfile and/or adding a bunch of layers (possibly
            | bloating the image size) is worth the trade off if the
            | concern is just about forgetting to update the
            | dockerignore. The same could be said about gitignore.
        
             | kinghajj wrote:
             | Not the person you replied to, but personally, I like
             | having control over what exactly gets into the final image,
             | and (IME) have found that devs aren't great about
             | remembering to update .dockerignore files. Re: extra
             | layers, if you use multi-stage builds to separate the
             | builder and final app images, you can avoid that.
        
       | naranha wrote:
       | > Using ENV to store secrets is bad practice because Dockerfiles
       | are usually distributed with the application, so there is no
       | difference from hard coding secrets in code.
       | 
       | This sounds wrong. If secrets are in the environment they are not
       | in the Dockerfile, so they are NOT distributed with the
       | application.
        
         | LawnGnome wrote:
         | What they're referring to is more specific than environment
         | variables in general: you can use the ENV command in a
         | Dockerfile to bake in a secret at build time, and that's what
         | you generally shouldn't do.
         | 
         | Injecting environment variables at runtime, however (through
         | docker run -e or whatever orchestration system you're using),
         | is good.
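            | 
            | To make it concrete (the key is a placeholder):
            | 
            |   # Bad: baked into the image; visible in `docker history`
            |   # and `docker inspect`
            |   ENV API_KEY=abc123
            | 
            |   # Better: injected at runtime
            |   docker run -e API_KEY=... myimage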
        
           | naranha wrote:
           | Ah yes, that makes sense. I didn't understand they were
           | talking about the ENV command at build time.
           | 
           | It was the heading that got me on the wrong path, I think
           | that should be clarified further:
           | 
           | > Do not store secrets in environment variables
        
             | pvtmert wrote:
              | I'm a bit against having env secrets inside the
              | container.
              | 
              | Because PID 1 has that env, all processes spawned from it
              | can read all of those.
              | 
              | I prefer mounting them to /run/secrets via tmpfs, which
              | can also have an SELinux policy attached.
              | 
              | This way, someone else cannot read them by spawning a
              | shell inside the container.
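              | 
              | e.g. (a sketch - the entrypoint still has to fetch the
              | secrets from somewhere and write them there):
              | 
              |   docker run \
              |     --mount type=tmpfs,destination=/run/secrets,tmpfs-mode=0700 \
              |     myimage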
        
         | sethhochberg wrote:
         | I don't think the author is talking about loading secrets from
         | the environment - I think they're specifically talking about
         | hardcoding secrets into the Dockerfile and using the Dockerfile
         | ENV directive to set secrets for the processes running in the
         | image, instead of passing them at runtime, which sounds just
         | horrifying enough that I'm sure people do it in real codebases.
        
       | rzimmerman wrote:
       | > Do not store secrets in environment variables
       | 
       | Yes, definitely don't put secrets in the Dockerfile itself. I'm
       | curious if there are reasons not to use a .env file though?
       | 
       | > Only use trusted base images
       | 
       | This is a good sentiment, but docker hub also has plenty of
       | images that are built directly from a github repo. You can
       | inspect the Dockerfile and (as long as you trust Docker) trust
       | that it was built as written. In this case I recommend pinning to
       | a specific container SHA (image_name@sha256:...) in case the
       | source repo gets compromised. For official images, you can pin to
       | a tag IMO.
       | 
       | > Do not use 'latest' tag for base image
       | 
        | Regardless of the security concerns, relying on latest will
        | probably bite you when there's a major version bump (or even
        | maybe a minor one). Imagine you built a container on
        | ubuntu:latest when latest was 20.04 and some new employee gets
        | hit with 22.04. That's a bad surprise.
       | 
       | > Avoid curl bashing
       | 
       | Evergreen fight here, but I agree with the author that you should
       | at least validate checksums on external files before executing or
       | using them.
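        | 
        | Something like (URL and digest are placeholders):
        | 
        |   RUN curl -fsSLo /tmp/install.sh https://example.com/install.sh \
        |    && echo "<expected-sha256>  /tmp/install.sh" | sha256sum -c - \
        |    && sh /tmp/install.sh \
        |    && rm /tmp/install.sh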
       | 
       | > Do not upgrade your system packages
       | 
       | This is where the throne of lies about Docker idempotency
       | crumbles. You have to apt-get update because Debian/Ubuntu
       | repositories remove old versions when there are security fixes
       | (not 100% sure on this, feel free to correct me). So if the
       | ubuntu:20.04 image is released and there's a security update in
       | openssh-client, running "apt-get install openssh-client" without
       | "apt-get update" will fail. So we all run "apt-get update" and
       | pretend that the containers we build are time-invariant. They're
       | not, and in fact we occasionally get security updates snuck in
       | there. Luckily Debian and Ubuntu do a good job not breaking
       | things and no one complains. But if you build a container on
       | Tuesday it's not guaranteed to be the same as on Wednesday and
       | there's nothing you can do about that with a Debian distro. But
       | it's actually fine in practice so we pretend not to notice. The
       | point about not running "apt-get upgrade" is kind of moot -
       | you're going to effectively be upgrading whatever you "apt-get
       | install", so it's probably worth taking the same trade on the
       | built-in packages.
       | 
       | Importantly, there's no security risk - just a configuration one.
       | 
       | > Do not use ADD if possible
       | 
       | Same as above - you should be checksumming external dependencies
       | or self-hosting them.
       | 
       | > Do not root
       | 
       | > Do not sudo
       | 
       | I haven't thought deeply about these. I'd prefer to trust the
       | container system isolation rather than playing with users and
       | permissions. I don't understand this risk well enough to have a
       | well-formed opinion.
        
         | kelnos wrote:
          | > _Yes, definitely don't put secrets in the Dockerfile
          | itself. I'm curious if there are reasons not to use a .env
          | file though?_
         | 
         | Because often that .env file will get checked into source
         | control (sometimes intentionally, sometimes accidentally), and
         | then you have secrets in your git history that you need to
         | rotate.
         | 
         | In general the best thing to do is store your secrets in a
         | place/service specifically designed for secrets, and either
         | fetch them directly into the container in your entrypoint, or
         | pull them into a location that you mount as a volume in the
         | container.
         | 
          | > _I'd prefer to trust the container system isolation rather
          | than playing with users and permissions._
         | 
         | Don't. Containers don't give you full isolation. The 'root'
         | user inside the container is the same 'root' user as outside,
         | and container escapes may be possible. A good defense-in-depth
         | strategy suggests that you should run things using the least
         | privileges possible, and that doesn't change just because
         | something is running in a container.
        
       | PhantomBKB wrote:
        | You can also use talisman to make sure you are not checking
        | secrets into Dockerfiles:
        | https://github.com/thoughtworks/talisman
        
       | jose_zap wrote:
        | Hadolint, a Dockerfile linter, checks for most of those best
        | practices:
       | 
       | https://github.com/hadolint/hadolint
        
       | 3pt14159 wrote:
       | Overall great post.
       | 
       | > If you rely on latest you might silently inherit updated
       | packages that in the best worst case might impact your
       | application reliability, in the worst worst case might introduce
       | a vulnerability.
       | 
        | On this one I disagree. Your CI should handle reliability (and
        | if not, you have a bigger problem), you're more likely to
        | patch a vulnerability than to introduce a new one, and it's
        | unlikely that by the time a PR hits production the version is
        | compromised.
       | 
       | I understand that updating cuts both ways when it comes to
       | security, but I agree with Matt Tait's ultimate conclusion when
       | he spoke on this issue a few years ago: For most medium size and
       | smaller companies constantly updating is safer than delaying. He
       | had real world data and graphs of compromise windows, etc. Short
       | answer was that attackers are more time motivated than defenders.
        
         | cooljacob204 wrote:
         | Would you have a link to the talk? I couldn't find it with a
         | google search.
        
           | 3pt14159 wrote:
           | I don't have time to track down a link to a video at the
           | moment, but it was at Infiltrate, an offensive-minded
           | cybersecurity conference out of Miami.
        
             | kronin wrote:
             | Is it this one? https://vimeo.com/267445424
        
         | tomphoolery wrote:
         | Pinning the version tag for base images in Dockerfile is a good
         | idea beyond the scope of security. It helps with onboarding as
         | well. I've been in situations where depending on the `:latest`
         | tag of a base image caused different versions of that image to
         | be used on different machines, resulting in developers having
         | weird issues that no one else was having ("I thought Docker was
         | supposed to solve this!"). Now, I only use non-specific tags
         | like `:latest` or `:12-alpine` before I distribute the Docker
         | image, by either collaborating with others or pushing it to
         | Docker Hub. It just gives me peace of mind to know that others
         | are building on the exact same stack as I was building on.
        
           | 3pt14159 wrote:
           | This is a reasonable choice to make depending on the
           | complexity of your project.
           | 
           | For the projects I've been on, the latest version plus the
           | test suite is enough to catch weirdness creeping in and, as a
           | side benefit, it gets fixed faster than if it were pinned.
           | Sometimes the issue really is caused by the base image and it
           | is easier to get a fix merged if the issue was caused quite
           | recently because the developers responsible see early reports
           | of issues as more endemic than if they're reported days or
           | weeks later.
        
         | 5d749d7da7d5 wrote:
         | This makes me wonder, why there has not been a bigger push
         | towards microkernel/minimal OS with audited toolchains that
         | were "done". Minimal features and minimal surface area. A plug
         | and play distribution with security at the forefront which
         | rarely needed updating because only the essential was
         | available.
         | 
         | I would be fine taking a healthy performance hit if I knew that
         | the base OS was secure. (At this point I expect the BSD folks
         | to chime in that they have had this for years)
        
         | michaelperel wrote:
         | Shameless Plug: I wrote a cli-plugin for docker, docker-lock,
         | that will track the image digests (SHA's) of your images in a
         | separate file that you can check in to git, so you will always
         | know exactly which image you are using (even if you are using
         | the latest tag). It can even rewrite all your Dockerfiles and
         | docker-compose files to use those digests.
         | 
         | https://github.com/safe-waters/docker-lock
         | 
         | Would love for anyone dealing with this issue to check it out!
        
         | dbrgn wrote:
         | Operating system updates are the responsibility of the base
         | image. Use a base image that is regularly updated. For your own
         | images, ensure that they are also being rebuilt regularly.
         | 
         | There are a lot of semi-official Docker images on Docker Hub
         | that are published once and never updated until the next
         | software release. That is a huge anti-pattern in my view, those
         | images should not be relied on for production.
        
         | donmcronald wrote:
         | I always thought `:latest` was a bit of a special case. Ex:
         | Pushing `2.0.1` followed by `1.1.4` would leave `1.1.4` as the
         | `latest` image which could be an issue by itself. Is that
         | wrong?
         | 
         | I've always tried to pick stable tags [1] if they're available.
         | 
         | 1. https://docs.microsoft.com/en-
         | us/archive/blogs/stevelasker/d...
        
           | sfoley wrote:
           | There is absolutely nothing special about `latest`, it's just
           | a tag like any other. It simply happens to be the default tag
           | when omitted, just like how `origin` is the default remote
           | name when performing a `git clone`.
        
       | sergioisidoro wrote:
       | A note about not running as root: In certain systems if you break
       | out of the application and have access to a shell, it might
       | already be game over. An attacker probably already has access to
       | secrets, to the database, and all the assets worth protecting.
       | 
       | On the other hand, changing the docker user to non root might
       | introduce some failure scenarios (eg file ownership) which might
       | lead to other problems like availability incidents.
       | 
       | Security should start with threat modelling, and taking a risk-
       | based approach. You can spend hours fearing that someone might
       | break out of the Docker virtual machine through a zero day,
       | instead of using that time to fix much more likely and plausible
       | threat scenarios. Pick your battles.
        
         | felipelemos wrote:
         | > On the other hand, changing the docker user to non root might
         | introduce some failure scenarios (eg file ownership)
         | 
          | If your application needs root to execute, with very few
          | exceptions, it is already wrong.
        
           | sergioisidoro wrote:
           | Agreed. But when copying files to docker and building the
           | image, you will have to take care that files are not written
           | with root ownership in any stage of the build, which would
           | make them inaccessible to the application running as non
           | root.
           | 
           | That's the case I had in mind when writing that quote.
        
             | junon wrote:
             | > Agreed. But when copying files to docker and building the
             | image, you will have to take care that files are not
             | written with root ownership in any stage of the build,
             | which would make them inaccessible to the application
             | running as non root.
             | 
             | That's not the case, either. And root inside the container
             | != root outside the container. A completely new user:group
             | namespace is created inside the container. This is, in very
             | large part, what Linux namespaces are for.
             | 
             | Further, you can certainly have a root-owned file
             | accessible to non-root users, via chmod bits.
             | 
             | There are only a handful of excuses, ever, to run a
             | privileged container. If you're not 100% sure, then it is
             | not one of those excuses.
        
               | lawnchair_larry wrote:
               | _A completely new user:group namespace is created inside
               | the container. This is, in very large part, what Linux
               | namespaces are for._
               | 
               | No. root inside is root outside (if you can get outside).
               | The behavior you describe only applies if you enable user
               | namespace remapping, which docker doesn't by default.
        
           | freedomben wrote:
           | If you are are writing the app, I agree with you.
           | Unfortunately in some cases the person/team that wrote the
           | app has been gone for a long time. I've even seen a case
           | where the source code was missing and nobody knew where it
           | was, yet the service had to continue running.
           | 
           | If you're in that boat, there isn't much you can do except
           | work with it.
        
       | dkarp wrote:
       | Something really useful I recently discovered is multi-stage
        | Dockerfiles. Using _FROM_ and then _COPY --from_ to copy from
        | a previous stage keeps unwanted intermediate build steps that
        | might expose secrets or just bloat your images out of the
        | final image.
       | 
       | Two ways this was useful for us. Firstly, we needed a private key
       | in the image to pull some private git repos. By doing this in a
       | previous stage, they're not included in the final image layers.
        | Secondly, we have a python backend and small react app served
        | from the same image. By splitting their build steps into a
        | _backend_ and _frontend_ stage, changes to frontend code don't
        | break caching for the later backend steps or vice versa. E.g.
        | 
        |   FROM python:3.8.3-slim-buster AS frontend
        |   # Do frontend build steps
        | 
        |   FROM python:3.8.3-slim-buster AS backend
        |   # Do backend build steps
        | 
        |   FROM python:3.8.3-slim-buster AS final
        |   COPY --from=frontend /app /app/frontend
        |   COPY --from=backend /app /app/backend
        
         | junon wrote:
         | This is how C/C++ services are done in production, too. You
         | have a build step that does `apk add --update alpine-sdk git
         | cmake` etc., builds the service, and then you start again fresh
         | with a FROM (new stage) and `COPY --from` over the build
         | artifacts.
         | 
         | Reading this thread I'm surprised this isn't common knowledge
         | by now given how it's so incredibly paramount to efficient
         | production releases.
        
           | dkarp wrote:
           | It may well be common knowledge now. I discovered it when
           | writing our Dockerfiles earlier this year and before that
           | hadn't seen it. It looks like multi-stage builds were added
           | to the best practices in 2018 [https://github.com/docker/dock
           | er.github.io/blob/master/devel...] which is probably why I
           | missed it before
        
       | lysium wrote:
       | Great post. Two points:
       | 
        | .. If you don't inspect the wget script, you might as well
        | pipe it into bash.
        | 
        | .. How to distribute secrets if not by env? (with which I
        | agree! Honest question)
        
         | freedomben wrote:
          | _Disclaimer: I work for Red Hat as an OpenShift consultant
          | so I'm biased_
         | 
         | There are competing pieces of advice for secure distribution of
         | secrets, but my current preference comes down to one of these
         | ways, depending on the organization:
         | 
         | 1. OpenShift/Kubernetes Secrets mounted into the Pod at
         | runtime.
         | 
         | 2. Hashicorp Vault (has a really well designed API. It's very
         | usable just with curl, which makes using it a joy)
         | 
         | 3. Sealed Secrets (less experience here but it's looking
         | positive right now) - https://github.com/bitnami-labs/sealed-
         | secrets
         | 
         | If you're using a different PaaS besides OpenShift, it may also
         | offer options worth considering (although do think about
         | portability. These days apps move platforms every few years on
         | average, though I think that may be changing now that K8s is
         | becoming the standard).
        
           | csunbird wrote:
           | > 1. OpenShift/Kubernetes Secrets mounted into the Pod at
           | runtime.
           | 
           | Do you recommend mounting secrets as environment variables to
           | the kubernetes pods instead of files?
        
             | freedomben wrote:
             | Yes, that is by far my preference. Much more 12 factor app-
             | ish and framework independent. A lot of Java apps will want
             | files though, so sometimes it isn't possible.
        
           | lysium wrote:
           | Thank you for the pointers! I'll have a look!
        
         | drablyechoes wrote:
         | We use `sops`[1] to do this and it works really well.
         | 
         | There is a Google Cloud KMS keyring (for typical usage) and a
         | GPG key (for emergency/offline usage) set up to handle the
         | encryption/decryption of files that store secrets for each
         | application's deployment. I have some bash scripts that run on
         | CI which are essentially just glorified wrappers to `sops` CLI
         | to generate the appropriate `.env` file for the application,
         | which is put into the container by the `Dockerfile`.
         | 
         | Applications are already configured to read
         | configuration/secrets from a `.env` file (or YAML/JSON,
         | depending on context), so this works pretty easily and avoids
         | depending on secrets being set in the `ENV` at build time.
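         | 
         | The CI step is essentially just this sketch (file name
         | illustrative):
         | 
         |   # decrypt the committed, encrypted env file into the
         |   # plaintext .env the application reads
         |   sops -d secrets/production.enc.env > .env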
         | 
         | You can also, of course, pass any decrypted values from `sops`
         | as arguments to your container deployment tool of choice (e.g.
         | `helm deploy foo --set
         | myapp.db.username=${decrypted_value_from_sops}`) and not bundle
         | any secrets at build time at all.
         | 
         | [1] https://github.com/mozilla/sops
        
         | nicolast wrote:
         | > How to distribute secrets if not by env? (with which I
         | agree! Honest question)
         | 
         | You'll want to use BuildKit (`docker buildx`), see
         | https://docs.docker.com/develop/develop-images/build_enhance...
         | 
         | [edit] My bad, that works for secrets needed at build time, not
         | at runtime of course.
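         | 
         | For build-time secrets it looks roughly like this (secret id
         | and file name are illustrative; older Docker versions need
         | the experimental syntax directive instead):
         | 
         |   # syntax=docker/dockerfile:1
         |   FROM alpine
         |   # the secret is mounted only for this RUN step and never
         |   # lands in a layer
         |   RUN --mount=type=secret,id=npm_token \
         |       cat /run/secrets/npm_token
         | 
         | built with:
         | 
         |   DOCKER_BUILDKIT=1 docker build \
         |     --secret id=npm_token,src=token.txt .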
        
         | sgnnseven wrote:
         | Keep them away from the container and use one or more of the
         | following:
         | 
         | - A vault (Conjur, HCV, something else)
         | 
         | - A built-in credential service that comes with your cloud
         | 
         | - A sidecar that injects credentials or authenticates
         | connections to the backend directly (Secretless Broker, service
         | meshes, etc)
         | 
         | If you are doing a poor man's solution, mounted tmpfs volumes
         | that contain secrets are not terrible (but they're not really
         | that much safer than env vars).
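         | 
         | The poor man's version boils down to something like (paths
         | illustrative):
         | 
         |   docker run --mount type=tmpfs,destination=/run/secrets \
         |     myapp
         | 
         | with the entrypoint fetching secrets into /run/secrets at
         | startup, so they never land on disk or in the image.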
        
           | richardwhiuk wrote:
           | Keep them away from the container _image_
        
             | sgnnseven wrote:
             | Keep them away from _both the image and the container_!
             | Getting env var values dumped for a process is trivial
             | outside of the process and even easier within the container
             | process space.
        
         | funkaster wrote:
         | I use docker secrets[0] and a script like this[1] to inject
         | them in the ENV hashmap in my app.
         | 
         | [0]: https://www.docker.com/blog/docker-secrets-management/
         | 
         | [1]: https://gitlab.com/-/snippets/2029832
        
         | KaiserPro wrote:
         | Secrets are an afterthought in Docker. When I first started
         | using Docker I was surprised at how _rubbish_ it was.
         | 
         | I've found it's best to use the secrets provider that comes
         | with your cloud provider.
         | 
         | For AWS, using SSM's get_parameter seems the best option. But
         | it means you need a custom shim in your container that goes
         | and fetches the secrets and puts them where they are needed.
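         | 
         | The shim can be as small as this sketch (parameter name and
         | variable are illustrative):
         | 
         |   #!/bin/sh
         |   # fetch the secret at container start, then exec the app
         |   DB_PASSWORD="$(aws ssm get-parameter \
         |     --name /myapp/db_password --with-decryption \
         |     --query Parameter.Value --output text)"
         |   export DB_PASSWORD
         |   exec "$@"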
        
           | cle wrote:
           | There's also Secrets Manager which integrates with other
           | services and has hooks for custom secret-fetching and
           | rotation, so your application doesn't need to.
        
         | [deleted]
        
         | evilduck wrote:
         | I think they meant to not ship secrets inside the container
         | using the Dockerfile keyword ENV, because they're retrievable.
         | If you must ship an ENV value in an image to the public (it's
         | quite useful for config values that need a default), then know
         | that it isn't secret anymore.
         | 
         | If you need to provide a secret value to an image and it needs
         | to remain secret (like a database password), you most commonly
         | would set the env values at runtime or volume mount a config
         | file at runtime.
         | 
         | On a different side of this, if you need a secret at image
         | build time (like an SSH key to access a private repo), you can
         | use build arguments with the ARG keyword; they won't persist as
         | ENV in the final image, though note they can still surface in
         | `docker history`, so BuildKit secrets are safer for truly
         | sensitive values. Multi-stage Dockerfiles are also a great way
         | to keep your final image lean and clean.
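         | 
         | Concretely, the runtime options look like (names
         | illustrative):
         | 
         |   docker run -e DB_PASSWORD="$DB_PASSWORD" myapp
         |   docker run --env-file prod.env myapp
         |   docker run -v /host/conf:/etc/myapp:ro myapp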
        
           | lysium wrote:
           | Thank you for your pointers and the clarification about ENV!
           | I actually misunderstood.
        
       | surfer7837 wrote:
       | I also use Red Hat UBI (Universal Base Images). If you pull down
       | almost any image from Docker Hub (even official images), you'll
       | be surprised at how many vulnerabilities it carries.
        
         | freedomben wrote:
         | Second this. The UBI images are meticulously maintained by Red
         | Hat and are freely available and redistributable without a
         | subscription. Some of the largest companies in the world are
         | using these in production and putting dollars behind them, so
         | you can be pretty confident that they will work and be
         | maintained.
        
       | kelnos wrote:
       | Regarding using `USER somenonrootuser`
       | 
       | ... how do people deal with the need to write things out on
       | container start? Some of my services require config files, and I
       | need to interpolate the values of environment variables into
       | those config files, write them out, and then start the
       | application. (Also assume the application itself doesn't have
       | options for dropping privileges.) I'd rather not make the
       | filesystem locations in question writable by `somenonrootuser`.
       | 
       | My current strategy is to let the container entrypoint start as
       | root, but then I have a wrapper program installed that drops
       | privileges before it exec()s the actual service. It works, but is
       | there a better / more accepted way of doing this?
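       | 
       | Concretely, the wrapper amounts to something like this (su-exec
       | is the Alpine equivalent of gosu; paths and names are
       | illustrative):
       | 
       |   #!/bin/sh
       |   # runs as root: render config from env vars (envsubst is
       |   # from gettext), then drop privileges before starting
       |   envsubst < /etc/myapp/config.tmpl > /etc/myapp/config.yml
       |   exec su-exec somenonrootuser /usr/local/bin/myapp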
        
         | Propaganda_ wrote:
         | If the services can't be modified to load their config directly
         | from env vars, write the config to an off-root scratch volume
         | (e.g. mounted to /tmp/) and have them load from that. The root
         | volume should be mounted read-only either way to prevent
         | modification of your services should something get RCE.
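         | 
         | For example:
         | 
         |   docker run --read-only --tmpfs /tmp myapp
         | 
         | The entrypoint renders its config under /tmp, and everything
         | else stays immutable.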
        
       | itamarst wrote:
       | 1. "Don't update system packages" is bad advice. There are base
       | Docker images with out-of-date packages that need security
       | updates. CentOS for example doesn't update their base image for
       | months on end.
       | 
       | 2. Given you do want system package updates, you need to deal
       | with Docker caching breaking your security updates. So you need
       | to rebuild your image from scratch, without caching, either every
       | time there is a security update or on a weekly basis.
       | https://pythonspeed.com/articles/docker-cache-insecure-image...
       | 
       | 3. Not mentioned: don't use build secrets via ARG.
       | 
       | Some approaches to secure build secrets:
       | 
       | 1. BuildKit supports them:
       | https://docs.docker.com/develop/develop-images/build_enhance...
       | 
       | 2. Via the network, which is a hack, but it works:
       | https://pythonspeed.com/articles/docker-build-secrets/
       | 
       | 3. Via multi-stage builds, but this destroys caching.
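       | 
       | For point 2, the uncached rebuild is just (image name
       | illustrative), run from cron or a scheduled CI job:
       | 
       |   docker build --no-cache --pull -t myapp:latest .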
        
         | BossingAround wrote:
         | > "Don't update system packages" is bad advice. There are base
         | Docker images with out-of-date packages that need security
         | updates. CentOS for example doesn't update their base image for
         | months on end.
         | 
         | Generally, if you want to update system packages, rebuild the
         | container image and make that the new base for your builds.
         | Updating with every build yields a potentially non-reproducible
         | build.
        
           | itamarst wrote:
           | It's a tradeoff, yes. You can do more work to rebuild a base
           | image, or you can say "technically having a slightly newer
           | version of glibc isn't reproducible, but in practice I don't
           | expect that to break anything, so I'll live with the risk".
        
       | freedomben wrote:
       | Really solid post.
       | 
       | I've helped a lot of people with Dockerfiles, ranging from
       | horrendous security issues to simple bad practices that make
       | lives hell, and much of this is solid advice.
       | 
       | A lot of what I tell people boils down to: keep your container
       | lean and clean. Don't do things in the container that you
       | wouldn't do on the host (like curl-ing from the internet into
       | bash as root :-D, or using questionable base images).
       | 
       | My deployment life has been vastly improved by shipping in
       | containers, but I have seen a lot of security regressions because
       | people feel safe being reckless (like running the app as root)
       | behind the container guard rails. Don't think this way.
        
       | xaduha wrote:
       | > Using ENV to store secrets is bad practice because Dockerfiles
       | are usually distributed with the application
       | 
       | Never mind Dockerfiles, env variables are preserved in stopped
       | containers that are hanging around (even when you use
       | docker-machine, AFAIK); you can easily `docker inspect` them.
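       | 
       | For example:
       | 
       |   docker inspect -f '{{.Config.Env}}' <container-id>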
        
         | tsumnia wrote:
         | Is this referring to ENVs inside the Docker image or on the
         | system running it as a whole?
        
           | wadkar wrote:
           | Aren't they the same thing (sort of)? The ENV in Dockerfile
           | (or .env if you're doing docker-compose) will be available
           | during the build as well as runtime.
        
         | move-on-by wrote:
         | Always using the `--rm` flag to automatically remove a
         | container when it exits could probably be another best
         | practice.
        
       ___________________________________________________________________
       (page generated 2020-10-14 23:00 UTC)