[HN Gopher] How 1500 bytes became the MTU of the internet
       ___________________________________________________________________
        
       How 1500 bytes became the MTU of the internet
        
       Author : petercooper
       Score  : 357 points
       Date   : 2020-02-19 12:13 UTC (10 hours ago)
        
 (HTM) web link (blog.benjojo.co.uk)
 (TXT) w3m dump (blog.benjojo.co.uk)
        
       | rcarmo wrote:
       | I used to glue stuff together to FDDI rings and Token Ring
       | networks back in the day (I used Xylan switches, which had ATM-25
       | UTP line cards among other long-forgotten oddities), and MTU
       | sizes always struck me as being particularly arbitrary.
       | 
       | But I'm not really sure about the clock sync limitations being a
       | factor here. It was way back in the deepest past.
       | 
       | What I do remember vividly is the mess that physical layer
       | networking evolved into over the years thanks to dial-up and DSL
       | (ever had to set your MTU to 1492 to accommodate an extra PPP
       | header?).
       | 
       | And something is obviously wrong today, since we're still using
       | the same baseline value for our gigabit fiber to the home
       | connections, our 3/4/5G (scratch to taste) mobile phones, etc.
        
         | kalleboo wrote:
         | > _ever had to set your MTU to 1492 to accommodate an extra PPP
         | header?_
         | 
         | I had to replace my Apple AirPort Extreme when I got gigabit
         | fiber since it didn't have a manual MTU setting and it didn't
         | autodetect the MTU properly over PPPoE... In 2020 I still need
         | to manually set the MTU on my Ubiquiti USG...
        
         | mrkstu wrote:
         | Oh, and throw IPsec and a bunch of other protocols into the
         | mix- networking is often a fragile beast:
         | 
         | https://www.networkworld.com/article/2224654/mtu-size-issues...
        
         | hylaride wrote:
         | PPPoE was such an ugly mess. In its early days there were
         | various server-side OSes (IRIX and one other I can't remember)
         | that had piss-poor TCP/IP MTU implementations. The result was
          | that you ran into random websites that just took forever to
          | load, as only packets that happened not to fill the full 1492
          | limit eventually got the data through. By around 2003 I rarely
         | encountered it anymore, but then I moved to a place with cable
         | and never had to deal with it again.
        
         | neurostimulant wrote:
         | > ever had to set your MTU to 1492 to accommodate an extra PPP
         | header?
         | 
         | Ah, I was always wondering why my ISP configured my fiber
         | modem's mtu to 1492. So it's due to using PPPoE? Is there no
         | way to use bigger mtu when using PPPoE?
        
           | jburgess777 wrote:
           | RFC 4638 provides a mechanism for this. It relies on the
            | Ethernet devices supporting a 'baby jumbo' MTU of 1508
            | bytes, and support for it is still a bit scarce.
           | 
           | https://tools.ietf.org/html/rfc4638
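The arithmetic behind 1492 and 1508 is the same 8 bytes seen from opposite directions: PPPoE puts a 6-byte PPPoE header plus a 2-byte PPP protocol ID inside each Ethernet payload. A quick sketch:

```python
# PPPoE encapsulation overhead inside the Ethernet payload:
# 6-byte PPPoE header + 2-byte PPP protocol ID.
PPPOE_OVERHEAD = 6 + 2
ETHERNET_MTU = 1500

# Standard Ethernet MTU: the IP payload shrinks to 1492.
pppoe_mtu = ETHERNET_MTU - PPPOE_OVERHEAD

# RFC 4638 "baby jumbo": grow the Ethernet MTU to 1508 instead,
# so a full 1500-byte IP packet still fits after encapsulation.
baby_jumbo_mtu = ETHERNET_MTU + PPPOE_OVERHEAD

print(pppoe_mtu, baby_jumbo_mtu)  # 1492 1508
```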
        
           | toast0 wrote:
           | Nowadays, there's PPPoA (over ATM) which wraps at a lower
           | level, and allows 1500 byte ethernet payloads through. But
           | running the ethernet over ATM at 1508 MTU so that PPPoE would
           | be 1500 was probably out of reach --- when PPPoE was
           | introduced, the customer endpoint was often the customer PC,
           | and some of those were using fairly old nics that might not
           | have supported larger packets.
           | 
           | Sadly, smaller than 1500 byte MTUs still cause issues for
           | some people to this day. It's all fine if everything is
           | properly configured, or if at least everything sends and
           | receives ICMP, but if something is silently dropping packets,
           | you're in for a bad day. These days, I think it's usually
           | problems with customers sending large packets, as opposed to
           | early days where receiving large packets would routinely
           | fail, but a lot of that is because large sites gave up on
           | sending large packets.
        
             | rcarmo wrote:
             | Yes, PPPoA was also a thing I dealt with, and another
             | source of irritating MTU issues.
        
       | franga2000 wrote:
       | Not my field, so I might be making an obvious error here, but:
       | 
       | If there are efficiency gains to be had from using jumbo frames,
        | wouldn't setting my MTU to a multiple of 1500 still be of some
       | benefit? If my PC, my switch and my router all support it, that
       | would still be a tiiiny improvement. If the server's network does
       | as well and let's say both of our direct providers, even if none
       | of the exchanges or backbones in between do, that would still be
       | an efficiency gain for ~10% of the link, right?
        
         | benjojo12 wrote:
         | Locally you can set your MTU to larger than 1500, but if you
         | (generally) try and send a packet towards the internet larger
         | than 1500 it will be dropped without a trace, or it will be
         | dropped and an ICMP message will be generated to tell your
         | system to lower the MTU. Assuming you have not firewalled off
         | ICMP ;)
         | 
         | As a handy feature on Linux at least, you can set your MTU to
         | 9000 locally, and then set the default (internet generally)
         | route to have a MTU of 1500 to prevent issues:
         | 
         | ip route add 0.0.0.0/0 via 10.11.11.1 mtu 1500
        
           | luma wrote:
           | Over-sized packets can (and generally will) be fragmented by
            | your router. They shouldn't be dropped unless you've
            | intentionally set the DF (don't fragment) bit.
        
             | zajio1am wrote:
              | AFAIK, most OSes today set the DF bit by default.
        
             | benjojo12 wrote:
             | Fragments are very hit or miss on the internet,
             | https://blog.cloudflare.com/ip-fragmentation-is-broken/
        
             | duxup wrote:
             | They will fragment them but many times you will see
             | performance or other misc issues ... eventually.
        
           | apexalpha wrote:
           | Oh I never knew this. I wonder if I could enable Jumbo Frames
           | to stream 4k content more efficiently on my local LAN.
        
             | a_t48 wrote:
             | You could...if the software on both your server and your
              | media PC support large frames. And you're willing to
              | deal with the occasional piece of software doing the
              | wrong thing: sending every packet at the large MTU
              | without detecting the max packet size first.
        
         | duxup wrote:
         | Potentially but troubleshooting performance issues from
         | mismatched MTU can be brutal so most providers drop anything
         | over 1500.
         | 
         | Many devices can do over 1500 but anyone who has done so
         | without careful consideration knows the outcome isn't
         | predictable unless everyone on the network is prepared to do
         | so.
         | 
         | A dedicated / controlled SAN type environment can do it just
         | fine, beyond that it can be difficult.
        
       | tambourine_man wrote:
       | Nah, it's 1492 forever!
        
         | dredmorbius wrote:
         | Found the ADSL user.
        
       | willis936 wrote:
       | IEEE 802 history is disappearing without a trace? Afaik it's
       | pretty well documented, you just need to be a member for some of
       | the stuff.
       | 
       | http://www.ieee802.org/
       | 
       | I feel like the last piece we're missing in this story is the
       | performance impact of fragmentation. Like why not just set all
       | new hardware to an MTU of 9000 and wait ten years?
        
         | teddyh wrote:
         | My favorite Ethernet resource is Charles Spurgeon's Ethernet
         | (IEEE 802.3) Web Site: http://www.ethermanage.com/resources/
         | 
         | It used to have even more stuff, but I think he removed a lot
         | when he got his book published.
        
         | cesarb wrote:
         | > Like why not just set all new hardware to an MTU of 9000 and
         | wait ten years?
         | 
         | The hardware in question is Ethernet NICs. However, for you to
         | set the MTU on an Ethernet NIC to 9000, _every_ device on the
         | same Ethernet network (at least the same Ethernet VLAN),
         | including all other NICs and switches, including ones which
          | aren't connected yet, must also support and be configured for
         | that MTU. And this also means you cannot use WiFi on that
         | Ethernet network (since, at least last time I looked, WiFi
         | cannot use a MTU that large).
        
           | willis936 wrote:
           | Sending a jumbo frame down a line that has hardware that
           | doesn't support jumbo frames somewhere along the way does not
           | mean the packet gets dropped. The NIC that would send the
           | jumbo frame fragments the packet down to the lower MTU. So
           | what's the performance impact of that fragmentation? If it
           | isn't higher than the difference in bandwidth overhead from
           | headers of 9000 MTU traffic vs. 1500 MTU traffic then why not
           | transition to 9000 MTU?
        
             | tyingq wrote:
             | It does mean packets sent to another local, non-routed,
             | non-jumbo-frame interface would get lost. So you could, for
              | example, _maybe_ talk to the internet, but you couldn't
             | print anything to the printer down the hall.
        
             | cesarb wrote:
             | AFAIK, Ethernet has no support for fragmentation; I've
             | never seen, in the Ethernet standards I've read (though I
             | might have missed it), a field saying "this is a fragment
             | of a larger frame". There's fragmentation in the IP layer,
             | but it needs: (a) that the frame contains an IP packet; (b)
             | that the IP packet can be fragmented (no "don't fragment"
             | on IPv4, or a special header on IPv6); (c) that the sending
             | host knows the receiving host's MTU; (d) that it's not a
             | broadcast or multicast packet (which have no singular
             | "receiving host").
             | 
             | You can have working fragmentation if you have two separate
             | Ethernet segments, one for 1500 and the other for 9000,
             | connected by an IP router; the cost (assuming no broken
             | firewalls blocking the necessary ICMP packets, which sadly
             | is still too common) is that the initial transmission will
             | be _resent_ since most modern IP stacks set the  "don't
             | fragment" bit (or don't include the extra header for IPv6
             | fragmentation).
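To make the IP-layer mechanics concrete, here is a minimal sketch (not any particular stack's implementation) of how an IPv4 router fragments a packet: every fragment except the last carries a multiple of 8 payload bytes, and offsets are stored in 8-byte units:

```python
def fragment_ipv4(payload_len, link_mtu, header_len=20):
    """Split an IPv4 payload for a smaller link MTU.

    Returns (payload_bytes, fragment_offset_in_8_byte_units,
    more_fragments_flag) tuples."""
    # Each fragment's payload must fit under the MTU and, except
    # for the last fragment, be a multiple of 8 bytes.
    max_data = (link_mtu - header_len) // 8 * 8
    frags, offset = [], 0
    while offset < payload_len:
        data = min(max_data, payload_len - offset)
        more = offset + data < payload_len
        frags.append((data, offset // 8, more))
        offset += data
    return frags

# An 8980-byte payload from a 9000-MTU host crossing a 1500-byte link:
frags = fragment_ipv4(8980, 1500)
print(len(frags), frags[0], frags[-1])
```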
        
             | vlan0 wrote:
             | PMTU-D will save their ass in some cases. But it's not safe
             | to assume all routers in the path will respond to ICMP.
        
               | toast0 wrote:
               | It doesn't matter that the routers respond to ICMP, it
               | matters that they generate them, and that they're
               | addressed properly, and that intermediate routers don't
               | drop them.
               | 
               | Some routers will generate the ICMPs, but are rate
               | limited, and the underlying poor configuration means that
                | the rate limits are hit continuously and most connections
               | are effectively in a path mtu blackhole.
        
               | vlan0 wrote:
               | >It doesn't matter that the routers respond to ICMP, it
               | matters that they generate them, and that they're
               | addressed properly, and that intermediate routers don't
               | drop them.
               | 
               | >Some routers will generate the ICMPs, but are rate
               | limited, and the underlying poor configuration means that
                | the rate limits are hit continuously and most connections
               | are effectively in a path mtu blackhole.
               | 
               | Sure. But I'm not about to sit here and name all the
               | different reasons for folks. And since most here do not
               | have a strong networking background running consumer
               | grade routers at home, it seemed most applicable.
               | 
               | I could have used a more encompassing term like PMTU-D
               | blackhole, but I didn't.
        
             | sathackr wrote:
             | But how does the NIC know that, 11 hops away, there is a
             | layer 2 device, which cannot communicate with the
              | NIC (switches do not typically have the ability to
             | communicate directly with the devices generating the
             | packets), that only supports a 1500 byte frame?
             | 
             | Now you need Path MTU discovery, which as the article
             | indicates, has its own set of issues. (Overhead from trial
             | and error, ICMP being blocked due to security concerns,
             | etc...)
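Path MTU discovery itself is simple when the ICMP feedback loop works. A toy model (assuming every hop faithfully generates ICMP "Fragmentation Needed" carrying its link MTU, which the thread above shows is exactly what can't be assumed):

```python
def discover_path_mtu(link_mtus, initial_mtu=9000):
    """Toy PMTUD: send DF-set packets at the current size; each time a
    hop's link is smaller, treat that as an ICMP type 3 code 4 reporting
    the link MTU, shrink, and retry."""
    mtu = initial_mtu
    while True:
        for link_mtu in link_mtus:
            if mtu > link_mtu:
                mtu = link_mtu  # "Fragmentation Needed" tells us the limit
                break           # retry end-to-end at the smaller size
        else:
            return mtu          # the packet fit every hop

# Jumbo-capable endpoints with one 1500-byte hop in the middle:
print(discover_path_mtu([9000, 9000, 1500, 4470, 9000]))  # 1500
```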
        
               | willis936 wrote:
               | Why should it need to? Ethernet is designed to have non-
               | deterministic paths (except in cases of automotive,
               | industrial, and time sensitive networks). If you get to a
               | hop that doesn't support jumbo frames then break it into
               | smaller frames and send them individually. The higher
               | layers don't care if the data comes in one frame or ten.
        
               | [deleted]
        
               | wbl wrote:
               | If you block ICMP you deserve what you get. Don't do
               | this. (Edit: don't block ICMP)
        
               | oarsinsync wrote:
               | So now you're trying to communicate from your home
               | machine to some random host on the internet (website,
               | VPS, streaming service), and you're configured for MTU
               | 9000, the remote service is also configured for MTU 9000,
               | but some transit provider in the middle is not, and
               | they've disabled ICMP for $reasons.
               | 
               | They blocked ICMP, do you deserve what you get?
        
               | wbl wrote:
               | Transit providers should push packets and generally do.
               | With PMTU failures it's usually clueless network admins
               | on firewalls nearer endpoints. And no, you don't and I
               | wish the admin responsible could feel your pain.
        
               | oarsinsync wrote:
               | > Transit providers should
               | 
               | Agreed
               | 
               | > and generally do
               | 
               | Agreed.
               | 
               | Now if you can make it 'will always just push packets',
               | we'll be golden.
               | 
               | Unfortunately, there are enough ATM/MPLS/SONET/etc
               | networks being run by people who no longer understand
               | what they're doing, that we're never going to get there.
               | 
               | To make matters more entertaining, IPv6 depends on icmp6
               | even more.
        
             | [deleted]
        
             | toast0 wrote:
             | > Sending a jumbo frame down a line that has hardware that
             | doesn't support jumbo frames somewhere along the way does
             | not mean the packet gets dropped
             | 
             | Almost all IP packets on the internet at large have the 'do
             | not fragment' flag set. IP defragmentation performance
             | ranges from pretty bad to an easy DDoS vector, so a lot of
             | high traffic hosts drop fragments without processing them.
             | 
             | If we had truncation (with a flag) instead of
             | fragmentation, that might have been usable, because the
             | endpoints could determine in-band the max size datagram and
             | communicate it and use that; but that's not what we have.
        
             | zamadatix wrote:
              | Fragmentation/reassembly is an L3 concept, and even
              | where it exists it isn't guaranteed to support large
              | MTUs.
        
           | Avamander wrote:
            | The worst case I recently encountered with jumbo frames
            | was NetworkManager trying to follow the local DNS server's
            | advertised MTU: when the local interface doesn't support
            | jumbo frames, it just dies and keeps looping.
           | 
           | Even if you really want devices to use JF, some fail
           | miserably because it's just not well thought out.
        
         | tyingq wrote:
         | _" why not just set all new hardware to an MTU of 9000"_
         | 
         | Routers can fragment the packets, switches can't. So that would
         | be pretty chaotic for non-techie installed equipment.
        
           | kitteh wrote:
            | Plenty of routers today can't fragment packets. And they
            | have rate limiters where they can only generate a small
            | number of ICMP type 3 code 4 (Fragmentation Needed)
            | messages (maybe 50 a second).
        
           | brutt wrote:
            | Last time I saw a hardware Ethernet switch was 20 years
           | 8-[ ]
        
             | tyingq wrote:
             | There's one in your house probably. It won't frag packets
             | between your wired PC and your wired printer.
             | 
             | There are also certainly a shit load of them in closets and
             | top-of-rack all over where I work.
        
         | vlan0 wrote:
         | >I feel like the last piece we're missing in this story is the
         | performance impact of fragmentation. Like why not just set all
         | new hardware to an MTU of 9000 and wait ten years?
         | 
          | Because a node with an MTU of 9000 will very likely be
          | unable to determine the MTU of every link in its path. At
          | best, you'll see fragmentation. At worst, the node's packets
          | will be registered as interface errors when they hit an
          | interface with an MTU lower than 9k. Neither outcome is
          | desirable.
        
           | [deleted]
        
         | phicoh wrote:
         | The problem seems to be that both the IEEE and the IETF don't
         | want to do anything.
         | 
         | IEEE could define a way to support larger frames. 'just wait 10
         | years' doesn't strike me as the best solution, but at least it
          | is a solution. In my opinion a better way would be for all
          | devices to report the max frame length they support. Bridges would
         | just report the minimum over all ports on the same vlan. When
         | there are legacy devices that don't report anything, just stay
         | at 1500.
         | 
          | IETF can also do something today by having hosts probe the
         | effective max. frame length. There are drafts but they don't go
         | anywhere because too few people care.
        
       | hinkley wrote:
       | > If we look at data from a major internet traffic exchange point
       | (AMS-IX), we see that at least 20% of packets transiting the
       | exchange are the maximum size.
       | 
       | He's so optimistic. My brain heard this as " _only_ 20% of
       | packets [...] are the maximum size"
       | 
       | What are all of those 64 byte packets? Interactive shells, or
       | some other low bitrate protocol?
        
         | wmf wrote:
         | Probably mostly ACKs.
        
           | hinkley wrote:
           | Well now I feel dumb.
        
         | labawi wrote:
         | Note that maximum size is defined as >=1514, with 50% of
         | packets being >= 1024. It very well may be that ~45% of packets
         | are >= 1400 bytes.
         | 
          | The transfer graph is wrong - it shows packet count
          | distribution, not size. Quick math says roughly 90% of
          | transferred bytes are in >= 1024 byte packets.
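The packet-count versus byte-volume distinction is easy to illustrate with made-up bucket shares (these are not AMS-IX's real numbers): a modest fraction of maximum-size packets can still carry most of the bytes.

```python
# Hypothetical share of packets per size bucket (percent of packet count).
buckets = {64: 40, 512: 10, 1024: 30, 1514: 20}

total_bytes = sum(size * share for size, share in buckets.items())
by_bytes = {size: 100 * size * share / total_bytes
            for size, share in buckets.items()}

for size in sorted(by_bytes):
    print(f"{size:5d} B packets: {by_bytes[size]:5.1f}% of bytes")
```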
        
       | mhandley wrote:
       | For 802.11, the biggest overhead is not packet headers but the
        | randomized medium acquisition time, used to minimize
        | collisions. 1500 bytes is way too small here with modern
        | 802.11, so if you only send one packet for each medium
        | acquisition, you end up with something upwards of 90% overhead.
        | The solution 802.11n and later use here is Aggregate MPDUs
        | (AMPDUs). For each medium acquisition, the sender can send
        | multiple packets in a contiguous
       | burst, up to 64 KBytes. This ends up adding a lot of mechanism,
       | including a sliding window block ack, and it impacts queuing
       | disciplines, rate adaptation and pretty much everything else.
       | Life would be so much simpler if the MTU had simply grown over
       | time in proportion to link speeds.
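The medium-acquisition overhead argument can be sketched with rough, illustrative numbers (the ~100 µs per-acquisition cost and 600 Mbit/s PHY rate below are assumptions, not figures from the standard):

```python
def airtime_efficiency(burst_bytes, phy_rate_bps=600e6, acq_overhead_s=100e-6):
    """Fraction of airtime spent on payload, given a fixed per-medium-
    acquisition overhead (contention, preambles, ACKs lumped together)."""
    tx_time = burst_bytes * 8 / phy_rate_bps
    return tx_time / (tx_time + acq_overhead_s)

print(f"one 1500 B frame per acquisition: {airtime_efficiency(1500):.0%}")
print(f"64 KB A-MPDU per acquisition:     {airtime_efficiency(64 * 1024):.0%}")
```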
        
         | saber6 wrote:
         | > Life would be so much simpler if the MTU had simply grown
         | over time in proportion to link speeds.
         | 
         | For every thing besides real-time maybe.
        
           | sjwright wrote:
           | The M in MTU stands for maximum, not mandatory.
        
             | Avamander wrote:
             | Tell that to NetworkManager
        
             | LeifCarrotson wrote:
             | It ends up being mandatory if you're sharing a non-MIMO
             | link with other systems that are using large packets.
        
             | saber6 wrote:
             | I understand. I have architected networks for over a decade
             | now. The real issue is serialization delay. If I have a
             | tiny voice packet that has to wait to be physically
             | transmitted behind a huge dump truck packet (big), it can
             | still be a problem even with high speed links with regards
             | to microbursts.
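Serialization delay is straightforward to quantify: a latency-sensitive packet queued behind one big frame waits at least that frame's transmission time. A sketch (link speeds chosen as examples):

```python
def serialization_delay_us(frame_bytes, link_bps):
    """Time to clock one frame onto the wire, in microseconds."""
    return frame_bytes * 8 / link_bps * 1e6

for mbps in (10, 100, 1000):
    d = serialization_delay_us(9000, mbps * 1e6)
    print(f"{mbps:4d} Mb/s: a 9000 B frame occupies the link for {d:7.1f} us")
```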
        
         | wtallis wrote:
         | > Life would be so much simpler if the MTU had simply grown
         | over time in proportion to link speeds.
         | 
         | The problem is that the world went wireless, so _maximum_ link
         | speeds grew a lot but _minimum_ link speeds are still
         | relatively low. A single 64kB packet tying up a link for
         | multiple milliseconds--unconditionally delaying everything else
         | in the queue by at least that much--is not what we want.
        
           | mhandley wrote:
           | 802.11 AMPDUs already tie up the link for ~4ms in normal
           | operation. Without this, the medium acquisition overheads
           | kill throughput. But you're correct that a single 64KB packet
           | sent at MCS-0 would take a lot longer than that.
           | 
           | 802.11 already includes a fragmentation and reassembly
           | mechanism at the 802.11 level, distinct from any end-to-end
           | IP fragmentation. Unlike IP fragmentation, fragments are
           | retransmitted if lost. So you could use 802.11 fragmentation
           | for large packets sent at slow link speeds to avoid tying up
           | the link for a long time.
        
           | btown wrote:
           | Especially since there are a lot of low latency applications
           | (games, etc.) that take advantage of being able to fit data
           | in a single packet that will not be held up due to other
           | applications sharing the link that might try to stuff larger
           | packets down the link.
        
           | inetknght wrote:
           | > _The problem is that the world went wireless, so maximum
           | link speeds grew a lot but minimum link speeds are still
           | relatively low._
           | 
           | I would argue: the problem is that the MTU isn't negotiated
           | at all, but especially not based on link availability.
        
             | snuxoll wrote:
             | IPv6 tries to solve this with path MTU discovery.
        
               | inetknght wrote:
               | Yes, but IPv6 is still at a higher level than Ethernet,
               | Wifi, et al and is therefore subject to the limitations
               | of the lower level framing
        
               | jandrese wrote:
               | Sure, I mean that's what pMTUd is all about. One big
               | difference with IPv6: Routers can't fragment packets.
               | They either send or they don't.
        
               | snuxoll wrote:
               | Sure?
               | 
               | At this point 1500 is the standard, we can't ever hope to
               | increase it without a way to negotiate the acceptable
               | value across the entire transmission path - that's what
               | IPv6 gives us.
        
               | inetknght wrote:
               | I'm not sure that negotiating the acceptable value across
               | the entire transmission path is a reasonable thing to do.
               | I'm not sure that IPv6 _should_ be aware of a
               | minimum/maximum MTU of underlying transmission path
               | particularly since that path can often change
               | transparently and each segment is subject to different
               | requirements.
        
         | [deleted]
        
         | fulafel wrote:
         | Is there a way to set a bigger MTU with wireless, like it is
         | with wired ethernet?
        
       | fireattack wrote:
        | Probably a dumb question: why is the maximum size (and the
        | bucket with the most packets) in the AMS-IX graph 1514 bytes
        | instead of the 1500 bytes discussed in the article?
        
         | ra1n85 wrote:
         | 1500 bytes is the MTU of IP, in most cases. It often excludes
         | the Ethernet header, which is 14 bytes excluding the FCS,
         | preamble, IFG, and any VLANs.
         | 
          | If we have a 1500 byte MTU for IP, then we need at least a 1514
         | byte MTU for IP + Ethernet. We often call the > 1514B MTU the
         | "interface MTU". It's unnecessarily confusing.
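Stacking up the layers makes the numbers concrete, and also shows where the jumbo-frame efficiency argument elsewhere in the thread comes from (on the wire each frame costs even more than 1514 bytes):

```python
IP_MTU = 1500       # what "MTU 1500" usually means
ETH_HEADER = 14     # dst MAC (6) + src MAC (6) + EtherType (2)
FCS = 4             # frame check sequence
PREAMBLE_SFD = 8    # preamble + start-of-frame delimiter
IFG = 12            # inter-frame gap, in byte times

capture_size = IP_MTU + ETH_HEADER                   # 1514: what captures show
wire_cost = capture_size + FCS + PREAMBLE_SFD + IFG  # 1538 byte times on-wire

eff_1500 = IP_MTU / wire_cost
eff_9000 = 9000 / (9000 + ETH_HEADER + FCS + PREAMBLE_SFD + IFG)
print(capture_size, wire_cost)
print(f"payload efficiency: {eff_1500:.2%} at 1500, {eff_9000:.2%} at 9000")
```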
        
       | Animats wrote:
       | The original MTU was 576 bytes, enough for 512 bytes of payload
       | plus 64 bytes for the IP and TCP header with a few options. 1500
       | bytes is a Berkeleyism, because their TCP was originally
       | Ethernet-only.
        
         | wmf wrote:
         | Yeah, didn't T1 and ISDN use 576 to limit serialization delay
         | and jitter? The backbone probably switched to 1500 when OC-3
         | was adopted.
        
           | tssva wrote:
           | The default MTU for a T1/E1 was usually 1500. The default for
           | HSSI was 4470 which meant the default for DS3 circuits was
           | 4470. This was also the usual default MTU for IP over ATM
           | which is what most OC-3 circuits would have been using when
           | they were initially rolled out for backbone use. This
           | remained the usual default MTU all the way through OC-192
            | circuits running packet over SONET.
           | 
            | I left the ISP backbone and large enterprise WAN field around
           | that time and can't speak to more recent technologies.
        
       | jleahy wrote:
       | As others have said, with Manchester encoding 10BASE2 is self-
       | clocking, you can use the data to keep your PLL locked, just as
       | you would on modern ethernet standards. However I imagine with
       | these standards you may not even have needed an expensive/power-
       | hungry PLL, probably you could just multi-sample at a higher
       | clock rate like a UART did (I don't actually know how this
       | silicon was designed in practice).
       | 
        | Further, PLLs have not gotten better, but a lot worse. Maybe
       | back when 10BASE2 was introduced you could train a PLL on 16
       | transitions and then have acquired lock but there's no way you
       | can do that anymore (at modern data rates). PCI express takes
       | thousands of transitions to exit L0s->L0, which is all to allow
       | for PLL lock.
       | 
       | My best guess for the 1500 number is that with a 200ppm clock
       | difference between the sender and receiver (the maximum allowed
       | by the spec, which says your clock must be +-100ppm) then after
       | 1500 bytes you have slipped 0.3 bytes. You don't want to slip
        | more than half a byte during a packet, as it may result in a
        | duplicated or skipped byte in your system clock domain:
        | (200 * 1e-6) * 1500 = 0.3.
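That back-of-the-envelope calculation, spelled out (following the commenter's assumption of a worst-case 200 ppm offset between two +/-100 ppm clocks):

```python
WORST_CASE_OFFSET = 200e-6   # both ends at opposite +/-100 ppm extremes

def slip_bytes(frame_bytes, offset=WORST_CASE_OFFSET):
    """Accumulated clock drift over one frame, measured in bytes."""
    return frame_bytes * offset

print(slip_bytes(1500))         # ~0.3 bytes over a max-size frame

# Frame length at which the drift would reach the half-byte danger zone:
print(0.5 / WORST_CASE_OFFSET)  # 2500 bytes
```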
        
         | Unklejoe wrote:
          | I thought most Ethernet PHYs don't actually lock to the clock,
         | but instead use a FIFO that starts draining once it's half way
         | full. The size of this FIFO is such that it doesn't under or
         | overflow given the largest frame size and worst case 200 PPM
         | difference.
         | 
         | I figured this is what the interframe gap is for - to allow the
         | FIFO to completely drain.
        
           | saber6 wrote:
            | The IFG is really more to let the receiver know where one
            | stream of bits stops and the next stream of bits starts.
            | How they
           | handle the incoming spray of data is up to them on a
           | queue/implementation level.
        
       | zamadatix wrote:
        | I've always wondered how 9000 became "jumbo". Technically
        | anything over 1500 is considered jumbo and there is no
        | standard. The largest I've seen is 16k. I think there are some
        | CRC accuracy concerns at larger sizes but 9k still seems quite
        | arbitrary for computer land.
        
         | ajross wrote:
         | Ethernet frame size was never strictly limited. The way the
         | packet length works with Ethernet II frames (802.3 is more
         | explicit, but never really caught on) is that the hardware
         | needs to read all the way to the end of the packet and detect a
         | valid CRC and a gap at the end before it knows the thing is
         | done. So there's no reason beyond buffer size to put a fixed
         | limit on it, and different hardware had different SRAM
          | configurations.
         | 
         | Wikipedia has this link showing that 9000 bytes was picked by
         | one site c. 2003 simply because it was generally well-supported
         | by their existing hardware:
         | https://noc.net.internet2.edu/i2network/jumbo-frames/rrsum-a...
        
         | cesarb wrote:
         | The explanation according to
         | https://web.archive.org/web/20010221204734/http://sd.wareone...
         | is: "First because ethernet uses a 32 bit CRC that loses its
         | effectiveness above about 12000 bytes. And secondly, 9000 was
         | large enough to carry an 8 KB application datagram (e.g. NFS)
         | plus packet header overhead."
         | 
         | That is, 9000 is the first multiple of 1500 which can carry an
         | 8192-byte NFS packet (plus headers), while still being small
         | enough that the Ethernet CRC has a good probability to detect
         | errors.
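The arithmetic in that explanation can be checked directly. A short sketch, assuming the usual 20-byte IPv4 and 8-byte UDP headers for NFS-over-UDP (the header sizes are assumptions, not from the quoted source):

```python
# Smallest multiple of 1500 that can carry an 8 KB NFS datagram plus
# headers. Header sizes assumed: 20-byte IPv4 + 8-byte UDP.
NFS_DATAGRAM = 8192
IP_HEADER = 20
UDP_HEADER = 8

needed = NFS_DATAGRAM + IP_HEADER + UDP_HEADER  # 8220 bytes of MTU required

mtu = 1500
while mtu < needed:  # step through multiples of the classic MTU
    mtu += 1500

print(mtu)  # 9000
```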
        
       | gargs wrote:
       | This reminds me of various Windows applications back in the day
       | (Windows 3.1 and 95) that claimed to fine tune your connection
       | and one of the tricks they used was changing the MTU setting, as
       | far as I can recall. Could anyone share how that worked?
        
         | ndespres wrote:
         | If your computer sends a larger MTU than the next device
         | upstream can handle, the packets will be fragmented leading to
         | increased CPU usage, increased work by the driver, higher I/O
         | on the network interface, higher CPU load on your router or
         | modem, etc depending on where the bottleneck is. For example if
         | you connect over Ethernet to a DSL modem, or to a router that
         | has a DSL uplink, all your packets will be fragmented. This
         | is because PPPoE encapsulation adds 8 bytes of header to
         | every packet. So if you send a 1500 byte packet to the modem,
         | it will get broken up by the modem into two packets: one
         | carrying the first ~1492 bytes and a second tiny fragment
         | carrying the remainder.
         | 
         | But your PC is still sending more packets.. the modem is
         | struggling to fragment them all and send them upstream.. its
         | memory buffer is filling up.. your computer is retrying packets
         | that it never got a response on..
         | 
         | By lowering your computer's MTU to 1492 to start with, you
         | avoid the extra work by the modem, which can yield a
         | considerable speed increase.
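For the curious, IPv4 fragmentation can be sketched in a few lines. Non-final fragment payloads must be multiples of 8 bytes (RFC 791 fragment-offset units), which is why the split over a 1492-byte link is so lopsided. This is a simplification that ignores options and flag handling:

```python
def fragment(total_len, link_mtu, ip_header=20):
    """Split an IPv4 packet of total_len bytes into fragment sizes
    that fit link_mtu. Non-final fragment payloads must be a
    multiple of 8 bytes (RFC 791 fragment-offset units)."""
    payload = total_len - ip_header
    max_payload = (link_mtu - ip_header) // 8 * 8  # round down to 8-byte units
    frags = []
    while payload > max_payload:
        frags.append(max_payload + ip_header)  # each fragment gets its own header
        payload -= max_payload
    frags.append(payload + ip_header)
    return frags

print(fragment(1500, 1492))  # [1492, 28]: a near-full fragment plus a tiny one
```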
        
       | 2rsf wrote:
       | I remembered something different, related to the shared medium
       | and CSMA/CD, where 1500 ensured fairness, and the minimum
       | payload of 46 bytes related to propagation time over the
       | longest allowable cable.
       | 
       | More at:
       | 
       | https://networkengineering.stackexchange.com/questions/2962/...
        
       | CGamesPlay wrote:
       | I think you're saying that the smallest bucket of packets is
       | all packets that would have been combined into a larger packet
       | if that had been an option... but that doesn't make sense.
       | That class of packets includes TCP SYN, ACK, RST, and 128
       | bytes could fit an entire instant message on many protocols.
        
       | MrLeap wrote:
       | For.. reasons, I found myself having to make a 'driver' for a
       | PoE+ sensing device this month. The manufacturer had an SDK, but
       | compiling it requires an old version of Visual Studio - a bouquet
       | of dependencies, and it had no OSX support. None of the bundled
       | applications would do what I needed (namely, let me forward the
       | raw sensing data to another application.. _SOMEHOW_ ).
       | 
       | The data isn't encoded in the usual ways, so even 4 hours of
       | begging FFMPEG were to no avail.
       | 
       | A few glances at wireshark payloads, the roughly translated
       | documentation, and weighing my options, I embarked on a harrowing
       | journey to synthesize the correct incantation of bytes to get the
       | device to give me what I needed.
       | 
       | I've never worked with RTP/RTSP prior to this -- and I was
       | disheartened to see nodejs didn't have any nice libraries for
       | them. Oh well, it's just udp when it comes down to it, right?
       | 
       | SO MY NAIVETE BEGOT A JOURNEY INTO THE DARKNESS. Being a bit of
       | an unknown-unknown, this project did _not_ budget time for the
       | effort this relatively impromptu initiative required. An element
       | of sentimentality for the customer, and perhaps delusions of
       | grandeur, I convinced myself I could just crunch it out in a few
       | days.
       | 
       | A blur of coffee and 7 days straight crunch later, I built a
       | daisy chain of crazy that achieved the goal I set out for. I read
       | rfc3550 so many times I nearly have it committed to memory. The
       | final task was to figure out how to forward the stream I had
       | ensorcelled to another application. UDP seemed like the "right"
       | choice, if I could preserve the heavy lifting I had accomplished
       | to reassemble the frames of data.. MTU sizes are not big enough
       | to accommodate this (hence probably why the device uses RTP,
       | LOL.). OSX supports some hilariously massive MTU's (It's been a
       | few days, but I want to say something like 13,000 bytes?) Still,
       | I'd have to chunk and reassemble each frame into quarters. Having
       | to write _additional_ client logic to handle drops and OOO and
       | relying on OSX's embiggened MTU's when I wanted this to be
       | relatively OS independent... and the SHIP OR DIE pressure from
       | above made me do bad. At this point, I was so crunched out that
       | the idea of writing reconnect logic and doing it with TCP was
       | painful so I'm here to confess... I did bad...
       | 
       | The client application spawns a webserver, and the clients poll
       | via HTTP at about 30HZ. Ahhh it's gross...
       | 
       | I'm basically adrift on a misery raft of my own manufacture.
       | Maybe protobufs would be better? I've slept enough nights to take
       | a melon baller to the bad parts..
        
         | sneak wrote:
         | What does it sense that changes >=30 times a second?
        
           | dahfizz wrote:
           | You only use the Real Time Protocol (RTP) when you need time
           | sensitive data streaming (typically audio or video)
        
           | ses1984 wrote:
           | I'm guessing video frames given ffmpeg was part of the story.
        
           | Craighead wrote:
           | 60 Hz electricity in American electric systems maybe? Thus
           | it's polling every other wave?
        
           | jsight wrote:
           | I was curious about that too. Lots of references to
           | video-related standards imply it's a PoE camera, but then
           | why isn't the data encoded in the usual ways? What does that
           | mean?
        
             | MrLeap wrote:
             | What codec would you use for a camera that captures not
             | RGB, but poetry of the soul?
             | 
             | CONTEXTLESS, HEADERLESS, ENDLESS BYTE STREAMS OF COURSE,
             | where the literal, idealized (remember udp) position of
             | each byte is part of a vector in a non euclidean coordinate
             | system.
        
               | cfallin wrote:
               | > What codec would you use for a camera that captures not
               | RGB, but poetry of the soul?
               | 
               | I would love to read a collaborative work between you and
               | James Mickens -- this genre of writing seems sadly under-
               | present in the computing world...
        
         | jtbayly wrote:
         | This needs to be its own post. Lol.
        
         | hinkley wrote:
         | https://en.m.wikipedia.org/wiki/Jumbo_frame
         | 
         | The wiki page talks about getting 5% more data through at full
         | saturation but it doesn't mention an important detail that I
         | recall from when it was proposed.
         | 
         | It turned out with gigabit Ethernet or higher that a single TCP
         | connection cannot saturate the channel with an MTU of 1500
         | bytes. The bandwidth went up but the latency did not go down,
         | and ACKs don't arrive fast enough to keep the sender from
         | getting throttled by the TCP windowing algorithm.
         | 
         | If I have a typical network with a bunch of machines on it
         | nattering at each other, that might not sound so bad. But when
         | I really just need to get one big file or stream from one
         | machine to another, it becomes a problem.
         | 
         | So they settled on a multiple of 1500 bytes to avoid uneven
         | packet fragmentation (if you get half packets every nth packet
         | you lose that much throughput). Somehow that multiple became 6.
         | 
         | And then other people wanted bigger or smaller and I'm not
         | quite sure how OS X ended up with 13000. You're gonna get
         | 8x1500 + 1000 there. Or worse, 9000 + 4000.
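The windowing ceiling described above can be made concrete with the bandwidth-delay product: a sender can have at most one window of unacknowledged data in flight per round trip. The numbers below are illustrative only (the classic 64 KB window with no window scaling, and a 1 ms RTT):

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound imposed by TCP windowing: at most one full window
    of unacknowledged data per round trip."""
    return window_bytes * 8 / rtt_seconds

# Classic 64 KB window (no RFC 1323 window scaling) over a 1 ms RTT:
cap = max_throughput_bps(65535, 0.001)
print(cap / 1e6)  # ~524 Mbit/s, well short of gigabit line rate
```

With RFC 1323 window scaling the window, rather than the MTU, grows to cover the bandwidth-delay product, which is how modern stacks saturate gigabit links even at 1500 bytes.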
        
         | hinkley wrote:
         | In college I only had one group project, which scandalized me
         | but apparently lots of others found this normal. We had to fire
         | UDP packets over the network and feed them to an MJPeG card.
         | You got more points based on the quality of the video stream.
         | 
         | My very industrious teammate did 75% of the work (4 man team, I
         | did 20%, if you are generous with the value of debugging). One
         | of the things we/he tried was to just drop packets that arrived
         | out of order rather than reorder them. Turned out the
         | reordering logic was reducing framerates. So he ran some trials
         | and looked at OOO traffic, and across the three or so routers
         | between source and sink he never observed a single packet
         | arriving out of order. So we just dropped them instead and got
         | ourselves a few more frames per second.
        
           | pantalaimon wrote:
           | Tbh that's what most real time video/audio applications will
           | do. Reordering adds latency and that is worse than the
           | occasional dropped frame.
        
             | MrLeap wrote:
             | I can drop a frame, I can't casually drop misordered
             | packets. It takes many packets to build a frame. I have to
             | reorder interframe packets (actually I just insert-in-
             | order). If I drop packets, I get data scrolling like a
             | busted CRT raster.
             | 
             | I'm using a KoalaBarrel. Koalas receive envelopes full of
             | eucalyptus leaves. Koalas have to eat their envelopes in
             | order. First koala to get his full subscription becomes fat
             | enough to crush all the koalas beneath him. Keep adding
             | koalas. Disregard letters addressed to dead koalas.
        
         | anticensor wrote:
         | > embiggened
         | 
         | For non-native speakers: embiggened means huge, enlarged,
         | overgrown.
         | 
         |  _I am not a native speaker of English either_
        
           | squiggleblaz wrote:
           | *For non-Simpsons watchers
           | 
           | The word was created as a joke in a Simpsons episode, a word
           | used in Springfield only. It is described as "perfectly
           | cromulent" by a Springfielder, which is evidently meant to
           | mean "acceptable" or "ordinary" but is another
           | Springfieldism.
           | 
           | The joke may be lost on future generations who don't realise
           | they're not normal words.
        
             | skykooler wrote:
             | Actually, "embiggened" is a real word, though an archaic
             | one; it's been around for over 130 years. The coinage of
             | "cromulent" to describe it was the joke there, not
             | "embiggen" itself.
             | 
             | Source: https://en.wiktionary.org/wiki/embiggen
        
               | kahirsch wrote:
               | It was used _once_ in 1884 and the writer there
               | specifically said he invented it. There are no other
               | recorded uses of the word before The Simpsons.
        
               | kalleboo wrote:
               | The show writers thought they came up with the word on
               | their own, they didn't know about the previous usage of
               | the word in 1884 (the episode was written in 1996, the
               | internet wasn't quite as full of facts back then),
               | "embiggen" was still supposed to be a joke.
        
           | MrLeap wrote:
           | To be fair to everyone, I've had native English speakers tell
           | me what I speak is barely English.
        
       | IshKebab wrote:
       | I don't think the Ethernet Frame Overhead graph is correct.
       | Surely the overhead is proportionally higher, per amount of data,
       | for smaller packets. That graph shows that the overhead is just
       | proportional to the amount of data sent, irrespective of the
       | packet size, which can't be right.
        
       | tartoran wrote:
       | I find that technology cements in strata (the archaeology term)
       | just as the layers that accumulate as the result of natural
       | processes and human activity. The dynamics are not exactly the
       | same but the tendency is similar. I wonder whether we'll always
       | be capable of digging down deeper to the beginnings as things get
       | more and more complicated.
        
       | smoyer wrote:
       | The article talks about how the 1500 byte MTU came about but
       | doesn't mention that the problem of clock recovery was solved by
       | using 4b/5b or 8b/10b encoding when sending Ethernet through
       | twisted-pair wiring. This encoding technique also provides a
       | neutral voltage bias.
       | 
       | EDIT: As pointed out below, I failed to account for the clock-
       | rate being 25% faster than the bit-rate in my original assertion
       | that Ethernet over twisted-pair was only 80% efficient due to the
       | encoding (see below)
        
         | mchristen wrote:
         | There has to be something else going on here because I
         | routinely achieve > 800mbps on my gigabit network over copper.
        
           | jws wrote:
           | Yes, the wire symbol rates are higher. For instance, 100mbit
           | Ethernet has a 125 million symbols per second wire rate.
        
           | adrianmonk wrote:
           | I'm not a hardware engineer, but from some quick research it
           | appears that 100 megabit ethernet ("fast ethernet") transmits
           | at effectively 125 MHz. So the 100 megabit number describes
           | the usable bit rate, not the electrical pulses on the wire.
           | 
           | Gigabit Ethernet is more complicated, and it uses multiple
           | voltages and all four pairs of wires bidirectionally. So it
           | is not just a single serial stream of on/off.
        
         | Unklejoe wrote:
         | > Ethernet through twisted-pair wiring only provides 80% of the
         | listed bit-rate
         | 
         | Actually, they already accounted for this in the advertised
         | speed.
         | 
         | In other words, a 1 GbE SerDes runs at 1.250 Gbit/s, so you end
         | up with an actual 1 Gbit/s bandwidth.
         | 
         | The reason you don't actually hit 1 Gbit/s in practice is due
         | to other overheads such as the interframe gaps, preambles, FCS,
         | etc.
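The encoding ratios discussed in this subthread work out as follows (standard 4b/5b and 8b/10b ratios assumed):

```python
def line_rate(data_rate, coded_bits, data_bits):
    """Raw symbol rate on the wire for a block line code that maps
    data_bits of payload into coded_bits of wire symbols."""
    return data_rate * coded_bits / data_bits

print(line_rate(100e6, 5, 4))  # 4b/5b Fast Ethernet: 125e6 baud
print(line_rate(1e9, 10, 8))   # 8b/10b 1000BASE-X SerDes: 1.25e9 bit/s
```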
        
           | blattimwind wrote:
           | > The reason you don't actually hit 1 Gbit/s in practice is
           | due to other overheads such as the interframe gaps,
           | preambles, FCS, etc.
           | 
           | Actually Gigabit Ethernet is highly efficient; it can
           | actually give you 98-99 % of line rate as the payload rate.
        
           | smoyer wrote:
           | You're absolutely correct ... it's been a long time since I
           | was designing fiber transceivers but I should have remembered
           | this. Ultimately efficiency is also affected by other layers
           | of the protocol stack too (UDP versus TCP headers) which also
           | explains why larger frames can be more efficient. In the
           | early days of RTP and RTSP, there were many discussions about
           | frame size, how it affected contention and prioritization and
           | whether it actually helped to have super-frames if the
           | intermediate networks were splitting and combining the frames
           | anyway.
        
       | anonymousiam wrote:
       | Minor factoid the article does not mention. ATM is an alternative
       | to Ethernet that's used in many optical fiber environments. The
       | "transfer unit" size of the ATM "cell" is 53 bytes (5 for the
       | header and 48 for the payload). This is much smaller than 1500.
       | 
       | Another quirky story from the past: Sometime around 20 years ago
       | I was having a bizarre networking problem. I could telnet into a
       | host with no trouble, and the interactive session would be going
       | just fine until I did something that produced a large volume of
       | output (such as 'cat' on a large file). At that point the session
       | would freeze and I would eventually get disconnected. After
       | troubleshooting for a while I identified the problem as one of
       | the Ethernet NICs on the client host. It was a premium NIC (3Com
       | 3C509). Nonetheless, the NIC crystal oscillator frequency had
       | drifted sufficiently that it would lose clock synchronization to
       | the incoming frame if the MTU was larger than about 1000.
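For comparison with Ethernet framing overhead, ATM's fixed "cell tax" is easy to compute. The 8-byte AAL5 trailer below is an assumption about the common encapsulation:

```python
import math

# Every 53-byte ATM cell spends 5 bytes on its header, so best-case
# payload efficiency is fixed regardless of how much data you send.
CELL, HEADER = 53, 5
efficiency = (CELL - HEADER) / CELL
print(f"best case: {efficiency:.1%}")  # 90.6%

# A 1500-byte packet over AAL5 (8-byte trailer assumed) is sliced
# into 48-byte cell payloads:
cells = math.ceil((1500 + 8) / 48)
print(cells, "cells =", cells * CELL, "bytes on the wire")  # 32 cells = 1696 bytes
```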
        
         | saber6 wrote:
         | ATM is mostly dead. The only places it exists now is legacy
         | deployments. Everyone has been deploying MPLS/IP instead of ATM
         | for the past 15-20 years.
        
         | gerdesj wrote:
         | The picture in the article looks like a 3c509 - three media
         | types and 100Mbs-1. Cor! I have loads of them somewhere. Plus a
         | fair few 905 and 595s.
        
       | gugagore wrote:
       | It would be nice to corroborate this reason with another source,
       | because my understanding is that clock synchronization was not a
       | factor in determining the MTU, which seems really more like an
       | OSI layer 2/3 consideration.
       | 
       | I am surprised the PLLs could not maintain the correct clocking
       | signal, since the signal encodings for early ethernet were "self-
       | clocking" [1,2,3] (so even if you transmitted all 0s or all 1s,
       | you'd still see plenty of transitions on the wire).
       | 
       | Note that this is different from, for example, the color burst at
       | the beginning of each line in color analog TV transmission [4].
       | It is also used to "train" a PLL, which is used to demodulate the
       | color signal transmission. After the color burst is over, the PLL
       | has nothing to synchronize to. But the 10base2/5/etc have a
       | carrier throughout the entire transmission.
       | 
       | [1]
       | https://en.wikipedia.org/wiki/Ethernet_physical_layer#Early_...
       | 
       | [2] https://en.wikipedia.org/wiki/10BASE2#Signal_encoding
       | 
       | [3] http://www.aholme.co.uk/Ethernet/EthernetRx.htm
       | 
       | [4] https://en.wikipedia.org/wiki/Colorburst
        
         | stripline wrote:
         | I also don't believe this is the reason. Early Ethernet
         | physical standards used Manchester encoding to recover the data
         | clock.
        
           | peteri wrote:
           | I would agree. Given that I worked on an Ethernet chipset
           | back in 1988/9, keeping the PLL synched was not a problem.
           | I can't remember what the maximum packet size we supported
           | was (my guess is 2048) but that was more a matter of
           | buffering to SRAM and needing more space for counters.
           | 
           | The datasheet for the NS8391 has no such requirement for PLL
           | sync.
           | 
           | https://archive.org/details/bitsavers_nationaldaDataCommunic.
           | ..
        
       | trixie_ wrote:
       | Kind of expected an article titled 'How 1500 bytes became the MTU
       | of the internet' to tell us how 1500 bytes became the MTU of the
       | internet.
       | 
       | Even I could have told you: 'the engineers at the time picked
       | 1500 bytes'.
        
       | leroman wrote:
       | Looks like a ripe low-hanging fruit for SpaceX Starlink to pick..
        
         | leroman wrote:
         | Why the downvote? Possibly controlling the end-to-end
         | transport will allow them to offer jumbo packets.
        
           | ekimekim wrote:
           | This would only be possible if you were talking from a jumbo-
           | configured client (let's say you've set up your laptop
           | correctly), across a jumbo-configured network (Starlink, in
           | your scenario), to a jumbo-configured server (here's the
           | problem).
           | 
           | The problem is that Starlink only controls the steps from
           | your router to "the internet". If you're trying to talk to
           | spacex.com it'd be possible, but if you're trying to talk to
           | google.com then now you need Starlink to be peering with ISPs
           | that have jumbo frames, and they need to peer with ISPs with
           | jumbo frames, etc etc and then also google's servers need to
           | support jumbo frames.
           | 
           | Basically, the problem is that Starlink is not actually end
           | to end, if you're trying to reach arbitrary servers on the
           | internet. It just connects you to the rest of the internet,
           | and you're back to where you started.
           | 
           | This is also true for any other ISP, Starlink is not special
           | in this regard.
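The end-to-end constraint is just a minimum over every hop: one 1500-byte link anywhere on the path caps the whole connection. A trivial sketch, with invented hop names and MTUs:

```python
# Path MTU is the minimum link MTU along the route; a single
# 1500-byte hop defeats jumbo frames end to end. (Hypothetical hops.)
path_mtus = {"laptop": 9000, "starlink": 9000, "transit-isp": 1500, "server": 9000}
print(min(path_mtus.values()))  # 1500
```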
        
             | Avamander wrote:
             | True, you'd expect endpoints to support jumbo frames as
             | well, but why not at least start making it possible? It's
             | a chicken-and-egg problem otherwise. IPv6 was the same at
             | the start.
        
           | saber6 wrote:
           | Because you don't know what you're talking about and are
           | engaging in "what if"-isms? There is no business case to
           | solve with jumbo frames over the Internet. I've been in this
           | business for 20 years. Seen this argument a dozen times. It
           | never changes.
        
           | dooglius wrote:
           | Most connections will not be peer-to-peer over Starlink, so
           | you need to deal with the least common denominator.
        
           | hylaride wrote:
           | Well, depending on the quality of the connections, a re-
           | transmit of a jumbo frame could mean having to re-transmit a
           | lot more data.
           | 
           | But since the local network and the end network where the
           | servers are located will almost certainly be 1500, the point
           | is almost all but moot.
        
       | bjornsing wrote:
       | Actually, MTUs below 1500 bytes are pretty common, e.g. with PPP
       | over Ethernet or other forms of encapsulation/tunneling.
        
       | russfink wrote:
       | IIRC it was called "thinnet" (10B2). I loved the vampire taps on
       | thick net.
        
       | alexforencich wrote:
       | I think the author may have made a mistake in some of the math.
       | The frame size distribution plots are likely based on the number
       | of frames, not the amount of data contained in said frames. The
       | 1500 byte and other large frames should therefore account for the
       | lion's share of the actual data transferred. Correcting this
       | error will totally change the final two graphs.
        
         | labawi wrote:
         | Yes. But only the "AMS-IX traffic by packet size range" graph
         | is wildly inaccurate. Ethernet frame overhead is per-packet and
         | presumably right.
        
           | alexforencich wrote:
           | Ah yeah, that's probably true. According to some back of the
           | envelope math, it seems like the distribution should be more
           | like 5%, 1%, 1%, 3%, 50%, 39%, ignoring the first and last
           | size bins.
        
       | afandian wrote:
       | Off-topic but looking at that old network card picture reminded
       | me of a very vague memory of more than one card with a component
       | that looked like a capacitor, except it looked cracked.
       | 
       | Is my mind playing tricks? Were they faulty units or was there
       | meant to be a crack?
       | 
       | This picture could be the same thing:
       | 
       | https://www.vogonswiki.com/images/3/37/Viglen_Ethergen_PnP_2...
        
         | gerdesj wrote:
         | Old network card eh? Its a 3Com 3C509:
         | 
         | https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
         | 
         | I still have a load of them gathering dust somewhere. However a
         | system with ISA on it is a bit rare now and I'm not sure I can
         | be bothered to compile a modern kernel small enough to boot on
         | one. Besides, it will probably need cross compiling on
         | something with some grunt that has heard of the G unit prefix.
        
       | mertenVan wrote:
       | Software developer talking confidently about electrical
       | engineering issues he knows nothing about. How cute. /s
       | 
       | All Ethernet adapters since the first Alto card had self-clocking
       | data recovery [1].
       | 
       | Clock accuracy was never a problem, as long as it was within
       | the acceptable range required for the PLL lock/track loop.
       | 
       | The reason for the 1500 MTU is that for packet-based systems,
       | you don't want infinitely large packets. You want _small_
       | packets, but large enough that packet overhead is
       | insignificant, which in engineering terms means less than
       | 2%-5% overhead. Thus the 1500-byte max packet size. Everything
       | above that just makes switching and buffering needlessly
       | expensive; SRAM was hella expensive back then. Still is today
       | (in terms of silicon area).
       | 
       | Look at all the memory chips on Xerox Alto's Ethernet board
       | (below) - memory chips were already taking ~50% of the board
       | area!
       | 
       | [1] Schematic of the original Alto Ethernet card clock recovery
       | circuit: https://www.righto.com/2017/11/fixing-ethernet-board-
       | from-vi...
       | 
       | EDIT: Lol! Author has completely replaced erroneous explanation
       | with correct explanation, including link to seminal paper about
       | packet switching. Good.
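The 2%-5% figure is easy to reproduce: Ethernet spends a fixed ~38 bytes per frame (preamble+SFD 8, MAC header 14, FCS 4, minimum interframe gap 12), so the overhead fraction falls as the payload grows:

```python
# Fixed per-frame cost: preamble+SFD (8) + MAC header (14) + FCS (4)
# + minimum interframe gap (12) = 38 bytes.
PER_FRAME = 8 + 14 + 4 + 12

def overhead_pct(payload_bytes):
    """Fraction of wire time spent on framing rather than payload."""
    return PER_FRAME / (PER_FRAME + payload_bytes)

for p in (46, 1500, 9000):
    print(p, f"{overhead_pct(p):.1%}")
# 46 -> 45.2%, 1500 -> 2.5%, 9000 -> 0.4%
```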
        
         | mlyle wrote:
         | > How cute. /s
         | 
         | > EDIT: Lol! Author has completely replaced erroneous
         | explanation with correct explanation, including link to seminal
         | paper about packet switching. Good.
         | 
         | Don't be a jerk. Being right doesn't give you the right to make
         | fun of people.
        
           | mertenVan wrote:
           | Noted. I find the over-confidence over a completely imagined
           | issue funny and interesting. I make that mistake too. It's
           | always interesting to do a post-mortem: why was I so
           | confident? how did I miss the correct answer? I respect the
           | author for doing such a fast turn around :)
        
             | hackmiester wrote:
             | Good to know that your intent wasn't malicious, but fwiw, I
             | also didn't find the tone particularly appropriate for HN,
             | either.
        
               | contingencies wrote:
               | The hardware world is full of this.
        
               | generatorguy wrote:
               | I think because in the hardware world the cost for being
               | wrong is so much higher since you can't push out an over
               | the air update or update your saas or whatever. So if you
               | don't know you stay quiet instead of being wrong.
        
         | kingosticks wrote:
         | > Everything above that just makes switching and buffering
         | needlessly expensive, SRAM was hella expensive back then. Still
         | is today (in terms of silicon area).
         | 
         | Why does a larger MTU make switching more expensive?
         | 
          | And why does it affect buffering? Won't the internal buses and
         | data buffers of networking chips be disconnected from the MTU?
         | Surely they'll be buffering in much smaller chunks, maybe
         | dictated by their SRAM/DRAM technology. Otherwise, when you
         | consider the vast amount of 64B packets, buffering with 1500B
         | granularity would be extremely expensive.
        
           | mertenVan wrote:
           | I suggest you read the paper linked by the blog author
           | ("Ethernet: Distributed Packet Switching for Local Computer
           | Networks"), specifically Section 6 (performance and
           | efficiency 6.3). It will answer all your questions.
           | 
           | > Why does a larger MTU make switching more expensive?
           | 
           | Switching requires storage of the entire packet in SRAM.
           | 
           | Larger MTU = More SRAM chips
           | 
           | If existing MTU is already 95% network efficient (see paper),
           | then larger MTU is simply wasted money.
        
             | mprovost wrote:
             | Traditionally it's been true that you need SRAM for the
             | entire packet, which also increases latency since you have
             | to wait for the entire packet to arrive down the wire
             | before retransmitting it. But modern switches are often
             | cut-through to reduce latency and start transmitting as
             | soon as they see enough of the headers to make a decision
             | about where to send it. This also means that they can't
             | checksum the entire packet, which was another nice feature
             | with having it all in memory. So if it detects corruption
             | towards the end of the incoming packet it's too late since
             | the start has already been sent - most switches will
             | typically stamp over the remaining contents and send
             | garbage so it fails a CRC check on the receiver.
             | 
             | Which raises another point in relation to the 1500 MTU -
             | all of the CRC checks in various protocols were designed
             | around that number. Even the checksum in the TCP header
             | stops being effective with larger frames, so you end up
             | having to do checksums at the application level if you care
             | about end to end data integrity.
             | 
             | https://tools.ietf.org/html/draft-ietf-tcpm-anumita-tcp-
             | stro...
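One concrete illustration of how weak the TCP/IP checksum is: the RFC 1071 one's-complement sum is commutative, so any reordering of 16-bit words inside a segment goes completely undetected (a failure mode a CRC would catch):

```python
def inet_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum: one's-complement sum of 16-bit
    words, with carries folded back in."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return ~total & 0xFFFF

original = b"\x12\x34\xab\xcd"
swapped = b"\xab\xcd\x12\x34"  # same 16-bit words, different order
print(inet_checksum(original) == inet_checksum(swapped))  # True
```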
        
               | mertenVan wrote:
               | You're describing cut-through switching [1]. Because of
               | its disadvantage, it is usually limited to uses that
               | require pure performance, such as HFT (High Frequency
               | Trading). Traditional store-and-forward switching is
               | still commonly used (or some hybrid approach).
               | 
               | "The advantage of this technique is speed; the
               | disadvantage is that even frames with integrity problems
               | are forwarded. Because of this disadvantage, cut-through
               | switches were limited to specific positions within the
               | network that required pure performance, and typically
               | they were not tasked with performing extended
               | functionality (core)." [2]
               | 
               | [1] https://en.wikipedia.org/wiki/Cut-through_switching
               | 
               | [2] http://www.pearsonitcertification.com/articles/articl
               | e.aspx?...
        
               | mlyle wrote:
               | > Which raises another point in relation to the 1500 MTU
               | - all of the CRC checks in various protocols were
               | designed around that number.
               | 
                | Hmm. Why is this? A CRC-32 in Ethernet (and most other
                | layer 2 protocols) is guaranteed to reject certain
                | classes of errors entirely... But mostly we're relying
                | on the fact that a random bad frame has only a 1 in ~4
                | billion chance of being accepted. Having a bigger MTU
                | means fewer frames to pass the same data, so it would
                | seem to me we have a lower chance of accepting a bad
                | frame per amount of end-user data passed.
               | 
                | TCP itself has a weak checksum at any length. The real
                | risk is hosts corrupting the frame between the actual
                | CRCs in the link-layer protocols: e.g. you receive a
                | frame, the NIC sees it is good in its memory, but it
                | gets corrupted when DMA'd into bad host memory. TCP's
                | checksum is not great protection against this at any
                | frame length.
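The numbers behind this back-of-envelope argument, as a Python sketch. The 2^-32 figure treats a corrupted frame as passing CRC-32 like a random value, which real error patterns need not follow; the MTU values are just the standard and common-jumbo sizes.

```python
# CRC-32 lets a random corrupted frame through with probability
# ~2^-32, and the number of frames needed to move a fixed amount
# of data scales inversely with the MTU.
P_PASS = 2.0 ** -32        # ~1 in 4.3 billion per corrupted frame
TB = 10 ** 12              # one terabyte of end-user data, in bytes

frames_1500 = TB // 1500   # frames needed at the standard MTU
frames_9000 = TB // 9000   # frames needed at a jumbo-frame MTU

print(f"frames per TB @1500: {frames_1500:,}")   # ~667 million
print(f"frames per TB @9000: {frames_9000:,}")   # ~111 million
print(f"P(pass CRC | corrupt): {P_PASS:.3e}")
```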
        
               | mprovost wrote:
                | The risk is that enough bits in the same frame are
                | flipped that the CRC can no longer detect the damage.
                | If the bit error rate of the medium is constant, then
                | the larger the frame, the more likely that is to occur.
                | Also, as Ethernet speeds increase, the underlying BER
                | stays the same (or gets worse), so the chance of
                | encountering errors in a given time period goes up.
                | 100G Ethernet transmits a scary number of bits, so
                | something that would have been rare on 10Base-T might
                | happen every few minutes.
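To put rough numbers on this, assuming the commonly cited IEEE 802.3 worst-case BER objective of 1e-12 (a healthy link usually does much better):

```python
BER = 1e-12      # assumed worst-case bit error rate per the 802.3 spec
RATE = 100e9     # 100 Gb/s line rate

# Expected bit errors per second at full line rate:
errors_per_sec = RATE * BER                # 0.1 -> one error every ~10 s

# Chance that any single frame contains at least one bit error
# grows roughly linearly with its size:
p_err_1500 = 1 - (1 - BER) ** (1500 * 8)   # ~1.2e-8 per 1500-byte frame
p_err_9000 = 1 - (1 - BER) ** (9000 * 8)   # ~7.2e-8, ~6x more per frame

print(errors_per_sec)
print(p_err_1500)
print(p_err_9000)
```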
        
       | thehappypm wrote:
       | When networks were new, computers connected to each other using a
        | shared trunk that you _physically_ drilled into. It's a non-
        | trivial problem to send data over a shared channel; it's very
        | easy for two systems to clobber each other. A primitive but
        | somewhat effective mechanism is ALOHA
       | (https://en.wikipedia.org/wiki/ALOHAnet), where multiple senders
       | randomly try to send their message to a single receiver. The
       | single receiver then repeats back any messages it successfully
       | receives. In that way the sender is able to confirm its message
       | got through -- an ack. After a certain amount of time with no
       | ack, senders repeat their messages. As you can imagine, shorter
       | packets are less likely to cause collisions.
       | 
        | Ethernet uses something similar, but can also detect whether
        | someone else is already using the wire (carrier sense, the CS
        | in CSMA/CD). Capping frames at 1500 bytes kept each
        | transmission short, which reduced the likelihood of collisions.
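The effect of frame length on collisions can be sketched with the classic pure-ALOHA model (illustrative only; Ethernet's CSMA/CD behaves differently in detail, and the offered load below is an arbitrary assumption):

```python
import math

RATE = 10e6    # 10 Mb/s shared channel, as on classic Ethernet
LAM = 100.0    # assumed offered load: 100 frames/second (illustrative)

def p_success(frame_bytes: int) -> float:
    """Pure-ALOHA success probability for one frame.

    A frame sent at time t collides with anything sent in
    (t - T, t + T), so with Poisson arrivals at rate LAM the
    chance of no overlap is exp(-2 * LAM * T): it decays with
    the frame's airtime T.
    """
    T = frame_bytes * 8 / RATE        # frame airtime in seconds
    return math.exp(-2 * LAM * T)

print(p_success(64))     # short frame: near-certain success
print(p_success(1500))   # 1500-byte frame: noticeably worse
```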
        
         | blitmap wrote:
         | Does multiplexing over Ethernet exist?
        
           | 5436436347 wrote:
            | Not for all practical purposes anymore, but it once did
            | with the very old 10Base-2 standard for Ethernet over
            | coaxial cable. That is essentially why the old MII Ethernet
            | PHY interface had collision-detect lines telling the MAC to
            | stop sending when it saw incoming data, in an attempt to
            | minimize collisions.
           | 
           | https://en.wikipedia.org/wiki/10BASE2
        
             | blitmap wrote:
             | This is very cool history, and something I never would have
             | stumbled upon myself. Thank you for sharing! :-)
        
       | throw0101a wrote:
        | Unreliable IP fragmentation, and the brokenness of Path MTU
        | Discovery (PMTUD), are leading the DNS folks to clamp the
        | (E)DNS message size:
       | 
       | * https://dnsflagday.net/2020/
        
       ___________________________________________________________________
       (page generated 2020-02-19 23:00 UTC)