
Small-Scale Question Sunday for January 5, 2025

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

What's a good book about TCP/IP networking? I am currently redoing my home network setup and I've realized my knowledge of networking is very fragmented. I know the right incantations, but I have no idea what they actually do.

  • what does "default gateway" actually do? What happens when it's the "wrong" IP? When it's blank?
  • what happens when two machines claim to have the same IP?
  • how exactly does DHCP work?
  • how does UDP go through NAT?
  • what are VLANs?

Questions like this are pretty much in the wheelhouse of things like ChatGPT. It's really good at answering these high-level questions and providing good direction with the ability to dive deeper into each of the topics.

I asked on your behalf, and the answers look pretty much like what I would've written. https://chatgpt.com/share/677bd93a-310c-8004-9dcc-9b36c30fde8c

My take:

For home networking, unless you're setting up a homelab, you can probably ignore VLANs. Honestly, most of these are pretty much ignorable for what I expect your home network use case to be.

Anything vaguely modern in terms of a home router should handle all of these pretty transparently. Without getting into packet-level stuff, DHCP from the router will configure the clients and configure the default gateway to itself as well as prevent duplicate IPs (unless you're configuring them manually). DHCP itself tends to just work out of the box. UDP NATing, similarly, tends to just work. VLANs, at what I'm expecting is your scale, should likely just be ignored.

In my case, I have a small server rack that has a couple of NASes living in it along with a few switches (1GbE and 10GbE). The switches support VLANs, but even for what I'm doing, I'm far from needing any of the functionality they would provide. The routers I'm using are a set of Eeros -- they can provide a mesh network, but in my setup all of them are hardwired to the switch.

If you're looking to experiment from a homelab perspective, that's another story. But it could be a really fun story. A common way to get started and build a solid grounding in the fundamentals is to set up a Raspberry Pi cluster and play with it. It's a cheap and approachable way to learn these concepts.

I can already set up my home network (which is currently an x86 router, a custom-built NAS, a router working as a wireless AP and another router working as a wireless extender plus all the end-user devices). What I want is to understand why I am doing the things I am doing. I am sorry, but your ChatGPT log didn't exactly help with that. I'll see if asking it for a more textbook explanation from the ground up will work.

Much of this is really building on many decades' worth of tech, and it's hard to understand the why until you understand much of the whole stack.

Here's some of the whys, from my perspective in the order I would talk about them:

DHCP: when a device joins a network, it can broadcast on the network and ask how it should configure its network stack. Implicit in the request is the MAC (Media Access Control) address of the interface itself, which provides the physical address of the interface. The DHCP server (in a home setting, usually in the router) assigns an IP from a block it manages and gives the rest of the networking details (gateway, subnet, etc) to the client. DHCP isn't strictly needed as the clients can be configured manually in many cases. Cheap IoT devices tend to rely on it.
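
To make the "assigns an IP from a block it manages" part concrete, here's a rough sketch in Python. It's a toy, not how any real DHCP server is written: the class name, the pool range, and the dictionary it returns are all made up for illustration, and it skips the actual discover/offer/request/ack exchange and lease expiry entirely.

```python
import ipaddress


class ToyDhcpServer:
    """Toy model: hands out addresses from a pool, keyed by client MAC."""

    def __init__(self, network, gateway, first_host, last_host):
        self.network = ipaddress.ip_network(network)
        self.gateway = ipaddress.ip_address(gateway)
        start = int(ipaddress.ip_address(first_host))
        end = int(ipaddress.ip_address(last_host))
        # Addresses not yet handed out, lowest first
        self.free = [ipaddress.ip_address(n) for n in range(start, end + 1)]
        self.leases = {}  # MAC -> IP

    def request(self, mac):
        # A client we've already seen gets the same address back,
        # which is also what keeps two clients off the same IP.
        if mac not in self.leases:
            if not self.free:
                raise RuntimeError("pool exhausted")
            self.leases[mac] = self.free.pop(0)
        return {
            "ip": str(self.leases[mac]),
            "netmask": str(self.network.netmask),
            "gateway": str(self.gateway),
        }
```

Calling `request()` twice with the same MAC returns the same address, and two different MACs never get the same one, which is the duplicate-IP protection mentioned earlier.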

Default Gateway: When you're sending any packet to something outside your local network, you send the packet to the gateway and it figures out how to get the packet to the destination. In a home setting, this will just be forwarding the packet upstream to your ISP. In a larger scale setting, it's going to consult things like BGP routing to figure out where to send things to. The beauty of IP is that the client doesn't need to worry about it and it's completely abstracted into the gateway itself.

Duplicate IPs: As mentioned before, every interface has a MAC address. When you're sending a packet on the network to another machine (i.e. not broadcast), you send the packet to the MAC address. But we're dealing with IP, not MACs. To translate from an IP address to a MAC address we send out a broadcast ARP (Address Resolution Protocol) request asking basically "will the device with IP xxx respond?" Broadcasts are received by all the machines on the network. The machine with the requested IP will respond. If there are multiple machines that are configured with the same IP, they'll all respond. What usually happens is that the first reply wins, but that depends on timing and on the sender's operating system, so traffic for that IP can flip between the two machines from one lookup to the next. Switches don't sort this out for you: they learn which MACs are on each of their ports, not which IPs, so they just deliver each frame to whichever MAC the sender picked, and weird things may happen. Lesson: don't do it, things break.
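
A tiny sketch of the ARP side of this, in Python. The host list and function names are invented for illustration, and the list order stands in for the order in which replies arrive on the wire, which in real life you don't control.

```python
def arp_replies(hosts, wanted_ip):
    """Every host configured with wanted_ip answers the broadcast.

    hosts is a list of (mac, ip) pairs; list order models arrival order.
    """
    return [mac for mac, ip in hosts if ip == wanted_ip]


def resolve(hosts, wanted_ip):
    """Return the MAC the sender ends up caching (first reply wins here)."""
    replies = arp_replies(hosts, wanted_ip)
    return replies[0] if replies else None


def has_conflict(hosts, wanted_ip):
    """More than one responder means two machines claim the same IP."""
    return len(arp_replies(hosts, wanted_ip)) > 1
```

With two hosts on 192.168.1.50, `resolve()` returns whichever one happens to be first, and swapping the order swaps the winner. That's the whole problem in miniature.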

VLANs: From a switch perspective, it just controls what ports can talk to which other ports. If you have a 24-port switch, you can configure multiple VLANs such that, say, ports 1-12 can talk to each other, and 13-24 can talk to each other. It's setting up two "Virtual LANs." You can have a router that attaches to both of the VLANs to handle routing between them if you want. These are typically used to prioritize certain network traffic, or for security (e.g. a guest network can't talk to your servers).
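
The 24-port example, sketched in Python. The VLAN IDs 10 and 20 and the function names are arbitrary choices for illustration; a real switch also deals with tagged (trunk) ports, which this ignores.

```python
# Ports 1-12 in one VLAN, ports 13-24 in another (IDs are arbitrary)
PORT_VLAN = {port: 10 for port in range(1, 13)}
PORT_VLAN.update({port: 20 for port in range(13, 25)})


def can_talk(port_a, port_b, port_vlan=PORT_VLAN):
    """Two ports can exchange frames directly only if they share a VLAN."""
    return port_vlan[port_a] == port_vlan[port_b]


def flood_ports(ingress_port, port_vlan=PORT_VLAN):
    """A broadcast only goes to the other ports in the sender's VLAN."""
    vlan = port_vlan[ingress_port]
    return sorted(p for p, v in port_vlan.items()
                  if v == vlan and p != ingress_port)
```

So a broadcast arriving on port 1 reaches ports 2-12 and nothing else; getting from port 1 to port 13 needs a router attached to both VLANs.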

UDP and NAT: Since there's no connection in UDP, the NAT device just remembers things like "when device XX using port YY sends a packet to internet address AA port BB, I sent the packet from my port PP. Later, if I get a packet from AA:BB on port PP, I'll look that up and forward the packet to XX:YY." The key here is that all of these packets carry the source IP and port and the destination IP and port. When it's doing NATing, it replaces the local IP (which isn't going to be publicly routable) with its own address and port. On the way back, it just does the reverse and replaces the destination IP/port (which is how the packet got to it in the first place) with the local network's addresses and ports and forwards.
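
Here's that "remembers things" table as a rough Python sketch. The class, the starting port 40000, and the tuple layout are assumptions for illustration. Real NAT devices also expire idle mappings after a timeout and vary in how strictly they match the remote address, neither of which is modeled here.

```python
class ToyNat:
    """Toy UDP NAT: rewrites source on the way out, destination on the way back."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}   # (local_ip, local_port, remote_ip, remote_port) -> public_port
        self.back = {}  # (public_port, remote_ip, remote_port) -> (local_ip, local_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """XX:YY -> AA:BB becomes public_ip:PP -> AA:BB."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[(self.next_port, dst_ip, dst_port)] = (src_ip, src_port)
            self.next_port += 1
        return (self.public_ip, self.out[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_port):
        """AA:BB -> public_ip:PP becomes AA:BB -> XX:YY, or None if unknown."""
        local = self.back.get((dst_port, src_ip, src_port))
        if local is None:
            return None  # nobody inside asked for this; drop it
        return (src_ip, src_port, local[0], local[1])
```

The `None` case is why unsolicited UDP from the internet doesn't reach your machines: there's no table entry to reverse.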

Thanks, that was helpful!

DHCP: when a device joins a network, it can broadcast on the network and ask how it should configure its network stack. Implicit in the request is the MAC (Media Access Control) address of the interface itself, which provides the physical address of the interface. The DHCP server (in a home setting, usually in the router) assigns an IP from a block it manages and gives the rest of the networking details (gateway, subnet, etc) to the client. DHCP isn't strictly needed as the clients can be configured manually in many cases. Cheap IoT devices tend to rely on it.

How does it broadcast its request if it doesn't have an IP address?

Default Gateway: When you're sending any packet to something outside your local network, you send the packet to the gateway and it figures out how to get the packet to the destination. In a home setting, this will just be forwarding the packet upstream to your ISP. In a larger scale setting, it's going to consult things like BGP routing to figure out where to send things to. The beauty of IP is that the client doesn't need to worry about it and it's completely abstracted into the gateway itself.

The local network is defined by the network mask, right? So with 255.255.255.0 if I send something from 192.168.1.2 to 192.168.1.3 there's no need for the gateway to be set up, but 192.168.2.3 is outside the network and the packets will be routed to the gateway?

This makes me wonder how the packets are routed within the local network, actually. Let's say I'm sending a request from my PC (192.168.1.5) to my NAS (192.168.1.2). The PC is connected to my wireless switch/AP (192.168.1.4), and both the switch/AP and the NAS are connected to the wired router (192.168.1.1). How does the switch/AP know it should send the request to the wired router and not to one of its other LAN ports?

How does it broadcast its request if it doesn't have an IP address?

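This is where IP and ethernet get a bit blurry. Broadcasts like ARP operate at the raw ethernet level: the machine sends a raw ethernet frame to the ethernet broadcast address. In an ARP request, the frame carries the sender's IP and the requested IP, and implicit in the frame is the MAC address of the requesting machine. A DHCP request rides on the same ethernet broadcast; since the client has no address yet, the IP header just uses 0.0.0.0 as the source and 255.255.255.255 as the destination. (Deeper dive: https://en.wikipedia.org/wiki/Ethernet_frame)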

In most cases you think "I'm IP xxx sending something to IP yyy," but the reality is that at the ethernet level, the IP stuff is all payload the network really doesn't care about. Internally, everything on the actual network level is working with MAC addresses. IPs are just a really convenient abstraction on top of it. (in this case "network" is the layer 2 of the entire stack -- the data link layer)
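
If it helps to see the bytes, here's a sketch in Python that builds an ARP request frame by hand. The function names are mine; the field layout (14-byte ethernet header, then a 28-byte ARP body) follows the standard frame format. It only builds the bytes and doesn't send anything, since sending raw frames needs elevated privileges and is platform-specific.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6   # ff:ff:ff:ff:ff:ff, "everyone on this segment"
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac):
    """'aa:bb:cc:dd:ee:01' -> 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac, sender_ip, target_ip):
    """Return the raw ethernet frame asking 'who has target_ip?'."""
    # Ethernet header: destination MAC, source MAC, payload type
    eth_header = (BROADCAST_MAC
                  + mac_to_bytes(sender_mac)
                  + struct.pack("!H", ETHERTYPE_ARP))
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # MAC address length
        4,                            # IPv4 address length
        1,                            # opcode 1 = request
        mac_to_bytes(sender_mac),     # who is asking (MAC)
        socket.inet_aton(sender_ip),  # who is asking (IP)
        b"\x00" * 6,                  # target MAC: unknown, that's the question
        socket.inet_aton(target_ip),  # the IP we want resolved
    )
    return eth_header + arp_body
```

Notice the IP addresses only show up inside the payload. The part the switch looks at is just the two MACs at the front.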

The local network is defined by the network mask, right? So with 255.255.255.0 if I send something from 192.168.1.2 to 192.168.1.3 there's no need for the gateway to be set up, but 192.168.2.3 is outside the network and the packets will be routed to the gateway?

That's correct. Anything on the local subnet stays on your local network. Anything outside gets punted to the gateway to deal with.
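
The decision your machine makes for each outgoing packet looks roughly like this in Python. The function name and the choice to raise an error for a blank gateway are my own simplifications; a real routing table can have more entries than "local subnet" and "default".

```python
import ipaddress


def next_hop(src_ip, netmask, dst_ip, gateway):
    """Return the IP whose MAC we need to ARP for to deliver this packet."""
    # strict=False lets us pass a host address and get its network back
    local_net = ipaddress.ip_network(f"{src_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(dst_ip) in local_net:
        return dst_ip       # same subnet: deliver directly
    if gateway is None:
        # Blank gateway: local traffic still works, everything else fails
        raise OSError("network unreachable: no default gateway")
    return gateway          # off-subnet: hand it to the gateway
```

With your numbers, `next_hop("192.168.1.2", "255.255.255.0", "192.168.1.3", "192.168.1.1")` gives back 192.168.1.3, while a destination of 192.168.2.3 gives back the gateway.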

This makes me wonder how the packets are routed within the local network, actually. Let's say I'm sending a request from my PC (192.168.1.5) to my NAS (192.168.1.2). The PC is connected to my wireless switch/AP (192.168.1.4), and both the switch/AP and the NAS are connected to the wired router (192.168.1.1). How does the switch/AP know it should send the request to the wired router and not to one of its other LAN ports?

I'm going to cavalierly ignore WiFi in this because it muddies things up and deal with layer 2 of the stack and up and just treat it as a switch. This is what's in my mental model of what's happening in some detail.

  1. You try to access "nas.orthoxerox.com"
  2. DNS lookup for that. Oops, we only have the IP of the DNS server: 192.168.1.254 (making something up)
  3. ARP on ethernet to get the MAC for ...254.
  4. This gets to the switch. It'll broadcast this packet to all its ports. (Once the switch knows that a certain MAC is on a port it remembers it. Most home-grade switches can remember a few thousand MAC addresses)
  5. The DNS server responds, and then the switch and your machine know the MAC of the DNS server.
  6. DNS lookup (several round-trips to do this) -- you now know the IP of the NAS. (Since the switch now knows the MAC of the DNS server, it sends the frames directly to the port that server is on)
  7. ARP for the IP of the NAS. (same as before)
  8. Finally, send an ethernet packet from your machine to the NAS. (Again, from the ethernet perspective, this is sending from your machine to the NAS based on its MAC address when we're at the low level)

If there are multiple switches between you and the destination, the broadcast just keeps going.
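
The switch's part of that mental model fits in a few lines of Python. This is a toy with invented names: it has no table size limit, no entry aging, and no VLANs, but the flood-then-learn behavior is the core of it.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class ToySwitch:
    """Toy learning switch: floods unknown destinations, learns from sources."""

    def __init__(self, num_ports):
        self.ports = list(range(1, num_ports + 1))
        self.mac_table = {}  # MAC -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn: whoever sent this frame lives behind in_port
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcast or unknown destination: every port except the sender's
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the same port; nothing to forward
        return [out_port]
```

That's also the answer to "how does the switch/AP know to send it toward the router": it doesn't at first. It floods, sees the reply come back in on the router-facing port, and remembers that port for the NAS's MAC from then on.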

If you want to have some "fun," look up "ARP storm." It's likely one of the few times most networking folks (I'm a programmer) even think about things at that level.

Thanks a lot! How does Ethernet deal with someone pulling a Spartacus and spoofing MAC addresses of existing nodes?

By default, absolutely nothing... you've found one of the common attack surfaces of ethernet! You can use this to do all sorts of malicious things. You can overload the switches by just spamming them with new MAC addresses. You can intercept traffic. General denial of service attacks. Circumventing security. All sorts of mayhem.

So, ways of dealing with this... you can have switches that are configured to only allow an interface with a certain MAC to connect to certain ports. Or you can have softer ways of dealing with this by feeding information from the switch to some variety of intrusion detection system. Similarly, a switch can be configured to ensure that a device DHCPing for an address can't suddenly start using a different MAC.
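
The first of those (tying MACs to ports) can be sketched in Python like this. It's a generic illustration with made-up names, not any vendor's actual feature or configuration: the switch remembers the first MAC(s) it sees on a port and refuses frames from anything else there.

```python
class PortSecurity:
    """Toy per-port MAC allowlist with learn-on-first-sight behavior."""

    def __init__(self, max_macs_per_port=1):
        self.max_macs = max_macs_per_port
        self.seen = {}  # port -> set of MACs allowed on that port

    def allow(self, port, src_mac):
        """Return True if a frame from src_mac is acceptable on this port."""
        macs = self.seen.setdefault(port, set())
        if src_mac in macs:
            return True
        if len(macs) < self.max_macs:
            macs.add(src_mac)  # still room: learn this MAC for the port
            return True
        return False           # port is full: likely a spoof or a new device
```

This stops someone from spraying new MACs through one port, but note it does nothing if the attacker unplugs the legitimate device and clones its MAC.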

There's a host of enterprise-y tech being built in this arms race if you want to fund some hardcore security-focused teams. That said, I don't think I've ever encountered (maybe because I'm not an attacker) these in the run-of-the-mill office environments. This is including working at Amazon, which is a bit persnickety on security. I'm quite sure that they're running these things in the data centers though. For something like AWS, they have segregated networks for control-plane traffic (the back-end of the services and how they are configured) and customer traffic. And for customer traffic, everything is on its own VLAN to ensure that I can't make a malicious service that would attack neighboring instances on the same machine or subnet. They also have a bunch of security in place to ensure only trusted clients can connect to services and verify the servers' authenticity.

This is one of the underlying reasons that having good physical security is essential. Once you have access to a network you want to attack, you have a lot more surface area that you can use to attack it while (preferably from the attacker's perspective) remaining undetected.

There are an annoying number of shops that used to love Cisco's port security option, which will lock down an interface on a switch to a certain set of MAC addresses (usually configured in adaptive modes). It's... not as unmanageable as it sounds, though it is very unmanageable and very much something that's usually only helpful against very specific threat models and when paired with a lot of other stuff.