[HN Gopher] Container networking is simple (2020)
       ___________________________________________________________________
        
       Container networking is simple (2020)
        
       Author : zdw
       Score  : 163 points
       Date   : 2021-01-19 15:36 UTC (7 hours ago)
        
 (HTM) web link (iximiuz.com)
 (TXT) w3m dump (iximiuz.com)
        
       | caymanjim wrote:
       | This is...not simple. I agree that container networking is not
       | all that much different from other Linux networking, but that
       | doesn't make it simple. A lot of application developers are
       | switching to working with containers and haven't historically had
       | to do any manual network configuration. It's all new to them.
       | Linux networking conventions change every few years and simply
       | keeping up with the basics is a chore. netplan is the current
       | flavor of the week but it's still new to plenty of people who've
       | never had to worry about anything more than auto-configured DHCP
       | or cloud-provider-default VPS networking.
       | 
       | There are also important security considerations with container
       | networking. Docker, by default, punches massive holes through
       | your firewall in non-obvious ways. People don't realize that with
       | a default Docker configuration, containers are ignoring any
       | normal firewall rules you may have set up with iptables or ufw.
       | Locking that down is only easy if you already know iptables well,
       | and even if you do, managing it is a pain.
       | 
       | The article doesn't touch on Kubernetes, but that's a whole other
       | can of worms. You have to pick a CNI and manually configure it;
       | DNS doesn't just magically work; default CIDR allocations often
       | conflict with existing networks; load balancer ingress for a
       | development/single-host/non-cloud environment is a horror show.
       | 
       | This is a good and helpful article, but container networking is
       | not simple by any stretch.
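       | 
       | A minimal Python sketch of the CIDR conflict check mentioned
       | above, using only the standard ipaddress module. The network
       | names and ranges are illustrative assumptions, not taken from
       | the article: 172.17.0.0/16, 10.96.0.0/12 and 10.244.0.0/16 are
       | commonly seen Docker, kubeadm and flannel defaults, and the
       | "existing" networks are made up.

```python
import ipaddress

# Hypothetical networks already in use at a site.
EXISTING = {
    "office-lan": "10.96.4.0/22",
    "vpn": "192.168.100.0/24",
}
# Ranges a container setup might claim (assumed defaults, check your own).
PLANNED = {
    "docker-bridge": "172.17.0.0/16",
    "k8s-services": "10.96.0.0/12",
    "k8s-pods": "10.244.0.0/16",
}


def find_conflicts(existing, planned):
    """Return (planned_name, existing_name) pairs whose CIDRs overlap."""
    conflicts = []
    for p_name, p_cidr in planned.items():
        p_net = ipaddress.ip_network(p_cidr)
        for e_name, e_cidr in existing.items():
            if p_net.overlaps(ipaddress.ip_network(e_cidr)):
                conflicts.append((p_name, e_name))
    return conflicts


if __name__ == "__main__":
    for planned_name, existing_name in find_conflicts(EXISTING, PLANNED):
        print(f"{planned_name} overlaps {existing_name}")
```

       | Running a check like this before picking pod and service
       | ranges is cheaper than debugging unreachable hosts afterwards.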
        
         | mauflows wrote:
         | I hope k8s (or some managed subset) eventually provides a
         | stable networking abstraction. There is so much to consider
         | when it comes to network hardening and I'd like it to not have
         | to constantly stress about it, feeling like I don't know
         | enough.
        
         | kenniskrag wrote:
         | > through your firewall in non-obvious ways. People don't
         | realize that with a default Docker configuration, containers
         | are ignoring any normal firewall rules you may have setup with
         | iptables or ufw.
         | 
         | What do you mean by non-obvious? If I bind a port of the
         | container to the host, e.g. -p 8080:80, only this port is a
         | hole. Do you have something different in mind? (I'm a Docker
         | beginner.)
        
           | mmgutz wrote:
           | Docker's custom iptables chains, upon restart of the network
           | or Docker, can and most likely WILL override rules that used
           | to work. Docker adds the DOCKER-USER chain for the purpose of
           | re-adding rules _after_ Docker has set up its own rules.
           | Most engineers never have to deal with this unless they're
           | deploying to production boxes.
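           | 
           | A rough Python sketch of what a DOCKER-USER rule can look
           | like. It only builds the iptables argument list and does not
           | run it. The interface name and source range are placeholder
           | assumptions, and the helper name docker_user_drop_rule is
           | hypothetical.

```python
import shlex


def docker_user_drop_rule(ext_if, allowed_source):
    """Build an iptables command that drops forwarded traffic arriving on
    ext_if unless it comes from allowed_source.

    The rule goes into DOCKER-USER rather than INPUT because published
    container ports are handled on the forwarding path.
    """
    return [
        "iptables", "-I", "DOCKER-USER",
        "-i", ext_if,
        "!", "-s", allowed_source,
        "-j", "DROP",
    ]


if __name__ == "__main__":
    # eth0 and 203.0.113.0/24 are placeholder values.
    print(shlex.join(docker_user_drop_rule("eth0", "203.0.113.0/24")))
```

           | The printed command still has to be run as root and persisted
           | by whatever mechanism the host already uses for firewall
           | rules.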
        
         | skrtskrt wrote:
         | Any good books on Linux networking?
         | 
         | I have started spinning up "bare metal" k8s on a cloud VM and
         | it's not that hard to get going until you get to anything
         | networking related then I feel like I've just jumped off a
         | cliff. I have no knowledge there and the online resources seem
         | to be nonexistent because you're expected to just use a
         | prebaked solution from cloud providers.
         | 
         | I ended up just installing k3s, but I have yet to figure out
         | why the Traefik packaged in k3s can listen directly on ports
         | 80 and 443, but the basic Traefik installed via Helm cannot.
        
           | kazen44 wrote:
           | To be fair, starting with something like a BSD might be
           | easier in terms of networking. Mainly because the tooling
           | hasn't been all over the place in the last 2 decades.
           | 
           | Also, a lot of network knowledge is not OS specific. Learning
           | about IP, ethernet, routing protocols etc is valuable no
           | matter which OS you use.
        
             | skrtskrt wrote:
             | I have done some learning about more generalized network
             | concepts (ip addresses, protocols, routing) but I want to
             | make the jump to actually apply it in some VMs now
             | 
             | I think my next stop is the network section of Unix and
             | Linux System Administration Handbook
        
           | yrro wrote:
           | https://access.redhat.com/documentation/en-
           | us/red_hat_enterp... though it's a bit task based
           | 
           | http://policyrouting.org/PolicyRoutingBook/ONLINE/TOC.html
           | but chapter 4 which introduces/explains how to use the ip
           | command never got written!
        
         | Hackbraten wrote:
         | I think it's time for us to get past "X is simple" or "X is
         | easy." It's a highly subjective thing to say, no matter the
         | topic. It's never helpful, it just makes other people feel bad.
        
           | iximiuz wrote:
           | Well, to be honest, some level of sarcasm was meant to be in
           | this title because it precedes a 4000+ word explanation of
           | the topic... But it seems like I'm pretty bad at writing good
           | titles. For sure I didn't mean to make anyone feel bad.
        
         | stonesweep wrote:
         | > _netplan is the current flavor of the week_
         | 
         | ... for Ubuntu, and Ubuntu only. It is an invention of
         | Canonical not adopted by the rest of Linux distros (except
         | Ubuntu derivatives), generally speaking.
        
           | pnutjam wrote:
           | Long live wicked, easily managed with Yast on OpenSuse.
        
       | jVinc wrote:
       | I knew a guy, he hated networking, so he decided to set up a
       | Kubernetes cluster on his home server to abstract all his
       | networking difficulties away. Now he occasionally complains that
       | the network inside the network inside the network is annoying
       | because it doesn't play nicely with the network outside the
       | network on top of the main network. Overall I feel the rate at
       | which he talks about networking issues has increased, but his
       | enthusiasm for talking about them has gone up, so I guess that's
       | a win.
        
       | gyre007 wrote:
       | This blog post reminded me of the post [1] I wrote about 7 years
       | ago o_O
       | 
       | [1] https://cybernetist.com/2013/11/19/lxc-networking/
        
         | jnwatson wrote:
         | A lot of discussion around Docker in general is rehashing stuff
         | that lxc did a long time ago, so this fits the mold.
        
       | zekica wrote:
       | Why does no one talk about IPv6 in this day and age?
        
         | ggm wrote:
         | Came here to say this. ULA would have been stacks better than
         | this.
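         | 
         | For anyone who wants to try it, a small Python sketch of
         | picking a ULA prefix and carving container subnets out of it.
         | Assumption to note: RFC 4193 derives the 40-bit Global ID from
         | a hash of a timestamp and an interface identifier; this sketch
         | just draws 40 random bits. The function names are made up.

```python
import ipaddress
import secrets


def random_ula_prefix():
    """Return a random /48 inside fd00::/8 (locally assigned ULA space)."""
    global_id = secrets.randbits(40)
    # fd00::/8 fills the top 8 bits, followed by the 40-bit Global ID.
    prefix_int = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))


def container_subnet(site_prefix, subnet_id):
    """Return the /64 with the given 16-bit subnet ID inside a /48."""
    if site_prefix.prefixlen != 48:
        raise ValueError("expected a /48 site prefix")
    if not 0 <= subnet_id < 2**16:
        raise ValueError("subnet_id must fit in 16 bits")
    base = int(site_prefix.network_address)
    return ipaddress.IPv6Network((base | (subnet_id << 64), 64))


if __name__ == "__main__":
    site = random_ula_prefix()
    print("site prefix:  ", site)
    print("bridge subnet:", container_subnet(site, 1))
```

         | Each bridge or namespace can then get its own /64 without
         | colliding with the 172.x or 10.x ranges already in use.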
        
       | cooervo wrote:
       | how is that simple? :O
        
       | avmich wrote:
       | > Working with containers always feels like magic.
       | 
       | Containers are just applications - which have existed for many
       | decades - in a single-process (roughly) operating system. Like
       | MS DOS.
       | 
       | Nothing magical except containerized processes can't talk to each
       | other directly - there are security boundaries.
        
         | dangerbird2 wrote:
         | > in a single-process (roughly) operating system.
         | 
         | Containers don't (generally) run in their own operating system,
         | just their own userland. It can actually be a source of
         | security vulnerabilities to assume that containers are
         | completely isolated from each other, such as running a root
         | user container in production, assuming it can't get privileged
         | access to the host. It's less similar to a bare-metal MS DOS
         | application than it is a glorified chroot jail
        
           | avmich wrote:
           | > Containers don't (generally) run in their own operating
           | system
           | 
           | Right, but a containerized application can't (ideally) talk
           | to other applications on the same machine; that's how it's
           | similar to a single-process OS with a single app running. Of
           | course there are details, like a single application may still
           | contain multiple processes from the OS standpoint, but the
           | overall comparison stands.
           | 
           | > It's less similar to a bare-metal MS DOS application than
           | it is a glorified chroot jail
           | 
           | These two cases are similar enough from the containerized
           | application's standpoint (only the OS services differ from
           | those of MS DOS).
        
       | gorgoiler wrote:
       | Brilliant write up. Lots of this topic is so much easier to
       | understand when it's presented from first principles, without any
       | of the LXC or Docker helpfulness hiding the details.
       | 
       | (If the author is reading, thank you! I'll likely use this
       | material for the pupils in my computer club.)
       | 
       | Managing an IPv6 stack alongside IPv4 is also very informative.
       | IPv6 is still not widely deployed -- SMTP is likely tied to v4
       | for all eternity -- but it's incredibly useful for managing
       | multiple sites of inventory over the internet. Seeing RFC1918
       | style 10.x.y.z private addresses and IPAM in use by internal ops
       | and IT _in 2021_ brings tears to my eyes.
       | 
       | Adding a section on using _conntrack_ to watch the way in which
       | the kernel handles MASQUERADE and DNAT would be illustrative as
       | well.
       | 
       | I really like the diagrams too.
        
         | iximiuz wrote:
         | Thank you very much for your feedback! I appreciate it a lot
         | because at the end of the day that's what keeps me motivated!
        
           | gorgoiler wrote:
           | The _ip_ tool chain is a worthy thing to promote.
           | 
           | I especially like all of the _replace_ verbs, which I wish
           | I'd known about sooner. They make idempotency much simpler to
           | express without any _if ! ip thing get <long list of route
           | args>; then ip add <same list of args> ; fi_ stuff.
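           | 
           | A short Python sketch of the idea: describe the desired
           | state as a list of replace commands that can be re-run
           | safely. It only builds and prints the argument lists. The
           | bridge name, addresses and helper names are placeholder
           | assumptions.

```python
import shlex


def route_replace(dest, via, dev):
    """`ip route replace`: add the route, or overwrite it if it exists."""
    return ["ip", "route", "replace", dest, "via", via, "dev", dev]


def addr_replace(cidr, dev):
    """`ip addr replace`: add the address, or update it if already present."""
    return ["ip", "addr", "replace", cidr, "dev", dev]


def desired_state():
    # Placeholder interface names and addresses, chosen for illustration.
    return [
        addr_replace("172.18.0.1/16", "br0"),
        route_replace("10.20.0.0/16", "172.18.0.254", "br0"),
    ]


if __name__ == "__main__":
    for argv in desired_state():
        print(shlex.join(argv))
```

           | Because every entry is a replace, running the whole list a
           | second time should leave the host in the same state.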
        
       | NexRebular wrote:
       | Indeed... VIMAGE and Crossbow are such great technologies for
       | virtualizing container network stacks. Couldn't imagine life
       | without them anymore.
        
       | gegtik wrote:
       | This is the first article I've seen on HN with (2020) in parens
       | to show that it's a dated article
       | 
       | =\
        
       | z3t4 wrote:
       | You can have a lot of fun with network namespaces in Linux, like
       | running some programs in their own network/VPN.
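       | 
       | A hedged Python sketch of the steps involved, expressed as the
       | list of ip commands one might run as root. It does not execute
       | anything. Addressing, routing and the VPN itself are left out,
       | and the namespace, interface and function names are
       | placeholders.

```python
import shlex


def isolated_run_plan(ns, host_if, ns_if, program):
    """Return the ip commands that create namespace `ns`, wire it to the
    host with a veth pair, and start `program` inside it."""
    return [
        ["ip", "netns", "add", ns],
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", ns_if],
        ["ip", "link", "set", ns_if, "netns", ns],
        ["ip", "netns", "exec", ns, "ip", "link", "set", "lo", "up"],
        ["ip", "netns", "exec", ns, *program],
    ]


if __name__ == "__main__":
    # All names below are placeholders.
    plan = isolated_run_plan("vpn0", "veth0", "ceth0", ["curl", "https://example.com"])
    for argv in plan:
        print(shlex.join(argv))
```

       | The program started by the last command sees only the
       | interfaces that were moved into or created inside the
       | namespace.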
        
       | m463 wrote:
       | It is if you use: --net=host
        
         | 29athrowaway wrote:
         | Which defeats the purpose of containerization because now the
         | container can sniff traffic on the host, hijack TCP
         | connections, etc.
        
           | stingraycharles wrote:
           | I know many people use single tenant hardware to run
           | containers, so I wouldn't say that it defeats the purpose of
           | containerization, just one of the benefits.
        
             | jeffbee wrote:
             | The confusion and disagreement on this topic is rooted in
             | faulty assumptions about other organizations' requirements.
             | Many people use containers with host networking, without
             | uid namespaces, without pid namespaces, without bind
             | mounts. What you think is "a container" may not be at all
             | universal. Also the idea that host networking grants all
             | processes CAP_NET_RAW is just weird and wrong.
        
           | MaxBarraclough wrote:
           | If you're running untrusted code in your container, you've
           | pretty much already lost.
           | 
           | Containers are useful for deployment and configuration, they
           | are _not_ a robustly secure sandbox. For that you still need
           | to go with a VM.
           | 
           | No cloud provider will offer to run your containers alongside
           | other customers' containers, on a shared kernel. Your
           | containers always run within your own VM.
        
             | effie wrote:
             | No, it is a matter of levels of security, not a
             | black-and-white issue. For some customers and providers,
             | containerized processes are enough isolation. For many small
             | businesses, even shared hosting with chroot is enough.
             | 
             | Generally speaking, Xen or Firecracker VMs do have a smaller
             | attack surface than containerized processes on a shared
             | Linux kernel. But configuration and exposed capabilities
             | matter - it is possible to have a container better secured
             | than a VM (e.g. a minimal Zones/jails env + correct MAC
             | config vs. a general Qemu/VMware VM with many default legacy
             | devices and bad/no MAC config).
             | 
             | Motivated attackers can escape even these VMs. So they are
             | not a magical solution.
             | 
             | Common hypervisors are too big and buggy to be pronounced
             | a security panacea. From time to time, VM escapes surface
             | publicly, but most are probably guarded and quietly
             | exploited. As we know after Spectre and Meltdown,
             | standard computing technology is buggy/bugged all the way
             | down to hardware.
             | 
             | If you want a _really_ "robustly secure" server environment,
             | such things do exist: for example, separation kernels like
             | the L4 family or the Green Hills INTEGRITY systems. But for
             | web apps, almost nobody bothers.
             | 
             | > No cloud provider will offer to run your containers
             | alongside other customer's containers, on a shared kernel.
             | Your containers always run within your own VM.
             | 
             | Joyent does - via SmartOS zones.
        
             | jeffbee wrote:
             | I have to say, it would be cool if they did. If I could get
             | a dirt-cheap rate for running batch workloads in a
             | potentially antagonistic environment, I could make use of
             | that. Not all data is sensitive.
        
               | MaxBarraclough wrote:
               | It wouldn't save you that much compared to just going
               | with a VM, especially if it's computationally intensive.
        
       | KaiserPro wrote:
       | This is a very good article.
       | 
       | Container networking is as simple as linux networking, which
       | isn't.
       | 
       | It's as much of a faff now as it was when I was doing KVM
       | virtualisation professionally (don't, pay and use VMware. You'll
       | be much happier, and have lots of free time)
       | 
       | It's a massive arse to debug and lacks any useful and friendly
       | debugging tools.
       | 
       | One thing that does help is either to use VLANs or a second
       | adaptor to separate container traffic from control. It makes
       | debugging _slightly_ easier. By easier, I mean that when you
       | accidentally misconfigure it, there is a better chance that
       | you'll still be able to get control of the host.
        
         | CameronNemo wrote:
         | Besides VLANs, another way to separate control and data plane
         | traffic is using VRFs.
         | 
         | https://people.kernel.org/dsahern/management-vrf-and-dns
         | 
         | https://people.kernel.org/dsahern/docker-and-management-vrf
        
       ___________________________________________________________________
       (page generated 2021-01-19 23:00 UTC)