[HN Gopher] Container networking is simple (2020)
___________________________________________________________________
Container networking is simple (2020)
Author : zdw
Score : 163 points
Date : 2021-01-19 15:36 UTC (7 hours ago)
(HTM) web link (iximiuz.com)
(TXT) w3m dump (iximiuz.com)
| caymanjim wrote:
| This is...not simple. I agree that container networking is not all that much different from other Linux networking, but that doesn't make it simple. A lot of application developers are switching to working with containers and haven't historically had to do any manual network configuration. It's all new to them. Linux networking conventions change every few years and simply keeping up with the basics is a chore. netplan is the current flavor of the week but it's still new to plenty of people who've never had to worry about anything more than auto-configured DHCP or cloud-provider-default VPS networking.
|
| There are also important security considerations with container networking. Docker, by default, punches massive holes through your firewall in non-obvious ways. People don't realize that with a default Docker configuration, containers are ignoring any normal firewall rules you may have set up with iptables or ufw. Locking that down is only easy if you already know iptables well, and even if you do, managing it is a pain.
|
| The article doesn't touch on Kubernetes, but that's a whole other can of worms. You have to pick a CNI and manually configure it; DNS doesn't just magically work; default CIDR allocations often conflict with existing networks; load balancer ingress for a development/single-host/non-cloud environment is a horror show.
|
| This is a good and helpful article, but container networking is not simple by any stretch.
| mauflows wrote:
| I hope k8s (or some managed subset) eventually provides a stable networking abstraction.
| There is so much to consider when it comes to network hardening and I'd like to not have to constantly stress about it, feeling like I don't know enough.
| kenniskrag wrote:
| > through your firewall in non-obvious ways. People don't realize that with a default Docker configuration, containers are ignoring any normal firewall rules you may have set up with iptables or ufw.
|
| What do you mean by non-obvious? If I bind a port of the container to the host, e.g. -p 8080:80, only this port is a hole. Do you have something different in mind? (I'm a docker beginner)
| mmgutz wrote:
| Docker's custom iptables chains, upon restart of network or docker, can and most likely WILL override rules that used to work. Docker adds the DOCKER-USER user chain for the purpose of re-adding rules _after_ docker has set up its rules. Most engineers never have to deal with it unless you're deploying to production boxes.
| skrtskrt wrote:
| Any good books on Linux networking?
|
| I have started spinning up "bare metal" k8s on a cloud VM and it's not that hard to get going until you get to anything networking related, then I feel like I've just jumped off a cliff. I have no knowledge there and the online resources seem to be nonexistent because you're expected to just use a prebaked solution from cloud providers.
|
| I ended up just installing k3s but I have yet to figure out why the Traefik packaged in k3s can listen directly on port 80 and 443, but the basic Traefik installed via Helm cannot.
| kazen44 wrote:
| To be fair, starting with something like a BSD might be easier in terms of networking. Mainly because the tooling hasn't been all over the place in the last 2 decades.
|
| Also, a lot of network knowledge is not OS specific. Learning about IP, ethernet, routing protocols etc is valuable no matter which OS you use.
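One OS-independent check that follows from the CIDR conflicts mentioned upthread: before creating a bridge or cluster network, compare the planned range against the networks already in use. This is only a minimal sketch using Python's standard ipaddress module. The function name find_conflicts and the list of existing networks are hypothetical, and 172.17.0.0/16 is used as the candidate only because it is the usual default range of Docker's bridge network.

```python
import ipaddress


def find_conflicts(candidate, existing):
    """Return the entries of `existing` whose address range overlaps `candidate`.

    Both arguments use CIDR notation, e.g. "172.17.0.0/16".
    """
    cand = ipaddress.ip_network(candidate)
    return [net for net in existing if ipaddress.ip_network(net).overlaps(cand)]


# Hypothetical networks already routed at a site (assumption for illustration).
site_networks = ["10.0.0.0/8", "172.16.0.0/12", "192.168.1.0/24"]

# Candidate range for a new container bridge.
conflicts = find_conflicts("172.17.0.0/16", site_networks)
print(conflicts)
```

An empty result means the candidate range does not overlap anything in the list; it says nothing about networks that are not listed.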
| skrtskrt wrote:
| I have done some learning about more generalized network concepts (ip addresses, protocols, routing) but I want to make the jump to actually apply it in some VMs now.
|
| I think my next stop is the network section of Unix and Linux System Administration Handbook.
| yrro wrote:
| https://access.redhat.com/documentation/en-us/red_hat_enterp... though it's a bit task based
|
| http://policyrouting.org/PolicyRoutingBook/ONLINE/TOC.html but chapter 4 which introduces/explains how to use the ip command never got written!
| Hackbraten wrote:
| I think it's time for us to get past "X is simple" or "X is easy." It's a highly subjective thing to say, no matter the topic. It's never helpful, it just makes other people feel bad.
| iximiuz wrote:
| Well, to be honest, some level of sarcasm was meant to be in this title because it precedes a 4000+ words explanation of the topic... But seems like I'm pretty bad at writing good titles. For sure I didn't mean to make anyone feel bad.
| stonesweep wrote:
| > _netplan is the current flavor of the week_
|
| ... for Ubuntu, and Ubuntu only. It is an invention of Canonical not adopted by the rest of Linux distros (except Ubuntu derivatives), generally speaking.
| pnutjam wrote:
| Long live wicked, easily managed with YaST on openSUSE.
| jVinc wrote:
| I knew a guy, he hated networking, so he decided to set up a Kubernetes cluster on his home server to abstract all his networking difficulties away. Now he occasionally complains that the network inside the network inside the network is annoying because it doesn't play nicely with the network outside the network on top of the main network. Overall I feel the rate at which he talks about networking issues has increased, but his enthusiasm for talking about them has gone up, so I guess that's a win.
| gyre007 wrote:
| This blog post reminded me of the post [1] I wrote about 7 years ago o_O
|
| [1] https://cybernetist.com/2013/11/19/lxc-networking/
| jnwatson wrote:
| A lot of discussion around Docker in general is rehashing stuff that lxc did a long time ago, so this fits the mold.
| zekica wrote:
| Why does no one talk about IPv6 in this day and age?
| ggm wrote:
| Came here to say this. ULA would have been stacks better than this.
| cooervo wrote:
| how is that simple? :O
| avmich wrote:
| > Working with containers always feels like magic.
|
| Containers are just applications - which existed for many decades - in a single-process (roughly) operating system. Like MS DOS.
|
| Nothing magical except containerized processes can't talk to each other directly - there are security boundaries.
| dangerbird2 wrote:
| > in a single-process (roughly) operating system.
|
| Containers don't (generally) run in their own operating system, just their own userland. It can actually be a source of security vulnerabilities to assume that containers are completely isolated from each other, such as running a root user container in production, assuming it can't get privileged access to the host. It's less similar to a bare-metal MS DOS application than it is to a glorified chroot jail.
| avmich wrote:
| > Containers don't (generally) run in their own operating system
|
| Right, but a containerized application can't (ideally) talk to other applications on the same machine, that's how it's similar to a single-process OS with a single app running. Of course there are details like a single application may still contain multiple processes from the OS standpoint, but the overall comparison stands.
|
| > It's less similar to a bare-metal MS DOS application than it is a glorified chroot jail
|
| These two cases are similar enough from the containerized application's standpoint (only OS services are different than those of MS DOS).
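For the ULA suggestion in the IPv6 subthread above, a rough sketch of picking a random /48 inside fd00::/8 and carving per-bridge /64 subnets out of it, using only Python's standard library. RFC 4193 suggests deriving the 40-bit Global ID from a hash of a timestamp and an interface identifier; this sketch substitutes secrets.randbits for simplicity, which is an assumption of the example and not the RFC's sample algorithm. The function names are hypothetical.

```python
import ipaddress
import secrets


def random_ula_prefix():
    """Return a random /48 network inside fd00::/8 (locally assigned ULA space)."""
    global_id = secrets.randbits(40)  # 40-bit Global ID (random here, see lead-in)
    # Address layout: 8 bits 0xfd | 40 bits Global ID | 16 bits Subnet ID | 64 bits host
    prefix_int = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))


def container_subnet(prefix, subnet_id):
    """Return the /64 with the given 16-bit Subnet ID inside a /48 prefix."""
    if prefix.prefixlen != 48:
        raise ValueError("expected a /48 prefix")
    if not 0 <= subnet_id < 2 ** 16:
        raise ValueError("subnet_id must fit in 16 bits")
    base = int(prefix.network_address) | (subnet_id << 64)
    return ipaddress.IPv6Network((base, 64))


site_prefix = random_ula_prefix()
bridge_net = container_subnet(site_prefix, 1)  # e.g. one /64 per container bridge
print(site_prefix, bridge_net)
```

Because the Global ID is random, two independently configured sites are unlikely to pick the same /48, which is the property that makes ULA less collision-prone than everyone reusing the same RFC1918 ranges.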
| gorgoiler wrote:
| Brilliant write-up. Lots of this topic is so much easier to understand when it's presented from first principles, without any of the LXC or Docker helpfulness hiding the details.
|
| (If the author is reading, thank you! I'll likely use this material for the pupils in my computer club.)
|
| Managing an IPv6 stack alongside IPv4 is also very informative. IPv6 is still not widely deployed -- SMTP is likely tied to v4 for all eternity -- but it's incredibly useful for managing multiple sites of inventory over the internet. Seeing RFC1918 style 10.x.y.z private addresses and IPAM in use by internal ops and IT _in 2021_ brings tears to my eyes.
|
| Adding a section on using _conntrack_ to watch the way in which the kernel handles MASQUERADE and DNAT would be illustrative as well.
|
| I really like the diagrams too.
| iximiuz wrote:
| Thank you very much for your feedback! I appreciate it a lot because at the end of the day that's what keeps me motivated!
| gorgoiler wrote:
| The _ip_ tool chain is a worthy thing to promote.
|
| I especially like all of the _replace_ verbs, which I wish I'd known about sooner. They make idempotency much simpler to express without any _if ! ip thing get <long list of route args>; then ip add <same list of args> ; fi_ stuff.
| NexRebular wrote:
| Indeed... VIMAGE and Crossbow are such great technologies for virtualizing container network stacks. Couldn't imagine life without them anymore.
| gegtik wrote:
| This is the first article I see on HN with (2020) in parens to show that it's a dated article
|
| =\
| z3t4 wrote:
| You can have a lot of fun with network namespaces in Linux, like running some programs in their own network/VPN.
| m463 wrote:
| It is if you use: --net=host
| 29athrowaway wrote:
| Which defeats the purpose of containerization because now the container can sniff traffic on the host, hijack TCP connections, etc.
| stingraycharles wrote:
| I know many people use single tenant hardware to run containers, so I wouldn't say that it defeats the purpose of containerization, just one of the benefits.
| jeffbee wrote:
| The confusion and disagreement on this topic is rooted in faulty assumptions about other organizations' requirements. Many people use containers with host networking, without uid namespaces, without pid namespaces, without bind mounts. What you think is "a container" may not be at all universal. Also the idea that host networking grants all processes CAP_NET_RAW is just weird and wrong.
| MaxBarraclough wrote:
| If you're running untrusted code in your container, you've pretty much already lost.
|
| Containers are useful for deployment and configuration, they are _not_ a robustly secure sandbox. For that you still need to go with a VM.
|
| No cloud provider will offer to run your containers alongside other customers' containers, on a shared kernel. Your containers always run within your own VM.
| effie wrote:
| No, it is levels of security, not a white/black issue. For some customers and providers, containered processes are enough isolation. For many small businesses, even shared hosting with chroot is enough.
|
| Generally speaking, Xen or Firecracker VMs do have a smaller attack surface than containered processes on a shared Linux kernel. But configuration and exposed capabilities matter - it is possible to have a container better secured than a VM (e.g. minimal Zones/jails env + correct MAC config vs. general Qemu/VMware VM with many default legacy devices and bad/no MAC config).
|
| Motivated attackers can escape even these VMs. So they are not a magical solution.
|
| Common hypervisors are too big and buggy to be pronounced a security panacea. From time to time, VM escapes resurface in public, but most are probably guarded and being exploited in quiet.
| As we know after Spectre and Meltdown, standard computing technology is buggy/bugged all the way down to hardware.
|
| If you want a _really_ "robustly secure" server environment, such things do exist: for example, separation kernels like the L4 family or the Green Hills INTEGRITY systems. But for web apps, almost nobody bothers.
|
| > No cloud provider will offer to run your containers alongside other customers' containers, on a shared kernel. Your containers always run within your own VM.
|
| Joyent does - via SmartOS zones.
| jeffbee wrote:
| I have to say, it would be cool if they did. If I could get a dirt-cheap rate for running batch workloads in a potentially antagonistic environment, I could make use of that. Not all data is sensitive.
| MaxBarraclough wrote:
| It wouldn't save you that much compared to just going with a VM, especially if it's computationally intensive.
| KaiserPro wrote:
| This is a very good article.
|
| Container networking is as simple as Linux networking, which isn't.
|
| It's as much of a faff now as it was when I was doing KVM virtualisation professionally (don't, pay and use VMware. You'll be much happier, and have lots of free time)
|
| To debug it's a massive arse and lacks any useful and friendly debugging tools.
|
| One thing that does help is either to use VLANs or a second adaptor to separate container traffic from control. It makes debugging _slightly_ easier. By easier, I mean that when you accidentally misconfigure it, there is a better chance that you'll be able to get control of the host still.
| CameronNemo wrote:
| Besides VLANs, another way to separate control and data plane traffic is using VRFs.
|
| https://people.kernel.org/dsahern/management-vrf-and-dns
|
| https://people.kernel.org/dsahern/docker-and-management-vrf
___________________________________________________________________
(page generated 2021-01-19 23:00 UTC)