Matt Dillon has made some changes to the xl driver that apparently solve a mysterious bug; I’m quoting from his changelog message below:
“Turn off hardware assisted transmit checksums by default. In buildworld loop tests this has been conclusively shown to corrupt transmit packets about one out of every million packets. The receiver will not know the packet is bad because hardware assist also applies the correct checksum to the corrupted packet. The result is random failures or corruption of network data in certain situations. On DragonFly, for some reason, doing a ‘resident /usr/bin/*’ seems to bring the problem out every few buildworlds with (primarily) mkdep’s cpp complaining about odd errors trying to open non-existent header files (during a header file search), such as EPROTONOSUPPORT. A tcpdump on both NFS client and server showed the client transmitting an access RPC and the server seeing a corrupted access RPC on its end, and then responding with EPROTONOSUPPORT. Other uncaught errors are also almost certainly occurring. mkdep is more likely to catch them because it actually checks the errno of a failed open() and does a huge number of open()’s (and as an NFS client this generates a huge amount of packet traffic).”
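To see why the receiver cannot detect this kind of corruption, here is a rough Python sketch of the mechanism the changelog describes. It is only a toy model: it uses the standard 16-bit ones' complement Internet checksum, and the payload, the flipped bit, and the function names are all made up for illustration. Where the xl hardware actually corrupts the data is not stated in the changelog.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style used by IP/TCP/UDP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def receiver_accepts(data: bytes, checksum: int) -> bool:
    """The receiver recomputes the checksum and compares it."""
    return internet_checksum(data) == checksum


# Hypothetical payload and a hypothetical single-bit corruption on the NIC.
payload = b"NFS ACCESS request"
damaged = bytearray(payload)
damaged[4] ^= 0x01
damaged = bytes(damaged)

# Software checksum: computed by the host before the data is corrupted.
software_sum = internet_checksum(payload)
# Hardware assist (as modelled here): computed after the data is corrupted.
hardware_sum = internet_checksum(damaged)

caught_with_software_sum = not receiver_accepts(damaged, software_sum)
caught_with_hardware_sum = not receiver_accepts(damaged, hardware_sum)
```

In this model the corrupted packet is rejected when the checksum was computed in software, but passes verification when the checksum was computed over the already-damaged bytes, which matches the behavior described in the changelog.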