So a few weeks ago some of our CentOS 5.4 and OEL 5.5 servers started exhibiting strange connectivity problems. Monitoring started alerting that hosts were down when they weren't; some boxes could ping target hosts and some couldn't; some boxes became unresponsive when interfaces were failed over; and, strangest of all, some of the boxes would magically "repair" themselves. Like I said, strange.
Over the next week or so we ran into the issue a few more times and were able to see a pattern emerge. All the affected servers were running CentOS 5.4 or Oracle Linux 5.5 and had Broadcom (bnx2) adapters that were on the receiving end of some pretty decent traffic. Most importantly, all had a good number of dropped received packets, and that count was continuously, albeit slowly, increasing.
A bit of Google research led us to this bugzilla, which suggested changing the adapter's coalescence settings… so, a bit on coalescence.
In your network adapter, coalescence is all about interrupts. Traditionally, interrupt coalescence (or IC) is used to reduce the number of interrupts generated by the system by delaying the generation of an interrupt by a very short period of time… think less than a millisecond. In turn, more traffic is received by the host before each interrupt fires, so each interrupt covers a larger batch of packets. You can find out more than you would ever want to know about coalescence here.
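If you want to confirm the same symptom on your own boxes, a minimal sketch like the one below samples the receive-drop counter twice and prints the difference. It assumes your kernel exposes the counter under /sys/class/net/<iface>/statistics/rx_dropped and that it tracks the same drops ifconfig reports; the function names and the SYS_NET override are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names and SYS_NET are hypothetical, not from any tool.
# SYS_NET can be pointed elsewhere for testing; default is the usual sysfs path.
SYS_NET=${SYS_NET:-/sys/class/net}

# Print the current rx_dropped counter for one interface.
read_rx_dropped() {
    cat "$SYS_NET/$1/statistics/rx_dropped"
}

# Print how much rx_dropped grew for interface $1 over $2 seconds (default 10).
rx_dropped_delta() {
    before=$(read_rx_dropped "$1") || return 1
    sleep "${2:-10}"
    after=$(read_rx_dropped "$1") || return 1
    echo $((after - before))
}

# Example: rx_dropped_delta eth0 60
```

A number that keeps coming back non-zero under normal load is the slow, steady climb described above.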
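To put rough numbers on that delay: if the adapter holds each interrupt back by N microseconds, it can raise at most about 1,000,000 / N interrupts per second. The sketch below just does that arithmetic. It is a back-of-the-envelope bound that assumes the timer is the only thing triggering interrupts (no frame-count threshold), and the function name is made up.

```shell
#!/bin/sh
# Back-of-the-envelope sketch; max_irqs_per_sec is a hypothetical helper.
# Given a coalescing delay in microseconds, print the rough ceiling on
# interrupts per second if the timer were the only thing firing them.
max_irqs_per_sec() {
    case $1 in
        # Reject empty input, non-digits, zero, and leading zeros.
        ''|*[!0-9]*|0*)
            echo "delay must be a positive integer" >&2
            return 1 ;;
    esac
    echo $((1000000 / $1))
}

# Example: max_irqs_per_sec 8
```

A delay of 8 microseconds works out to a ceiling of 125000 interrupts per second; real drivers will behave differently depending on their other thresholds.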
So apparently the Broadcom IC settings were not aggressive enough. Packets would come in, fill up the receive queue, and get dropped before they could be sent off for processing via an interrupt. This takes us back to the bugzilla above and the suggested settings below, which you set with the ethtool command:
ethtool -C ethX rx-usecs 8 rx-usecs-irq 8 rx-frames 0 rx-frames-irq 0
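After applying the change, it is worth checking that it took. Running ethtool -c ethX (lowercase c) prints the current coalescence settings as "name: value" lines, and the sketch below pulls a single value out of that output. The helper name is made up, and it assumes the "name: value" layout holds for your version of ethtool.

```shell
#!/bin/sh
# Sketch only: coalesce_value is a hypothetical helper.
# Reads `ethtool -c` style output on stdin and prints the value of the
# setting named in $1 (e.g. rx-usecs). Assumes "name: value" lines.
coalesce_value() {
    awk -F': *' -v key="$1" '$1 == key { print $2; exit }'
}

# Example (on a real box): ethtool -c eth0 | coalesce_value rx-usecs
```

Matching on the whole name keeps rx-usecs from also picking up the rx-usecs-irq line.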
Note that this was not an issue on any server running CentOS 5.6, any server with Intel adapters, or any server with 10G adapters. As a matter of fact, those servers had IC settings even more aggressive than those above. See the Intel 82599EB 10-gigabit settings below.
So now that we know the fix, we need to make it permanent, which is not as easy as editing a config file for the device, as the coalescence config is set at boot and is part of the installed driver for the device. Rather than muck around with trying to modify the driver itself, we decided to configure our devices at boot time with an rc script that checks each network interface on the box and modifies its IC settings if it is using the bnx2 (Broadcom) driver. We dropped the script below into /etc/rc.d and created a symbolic link to it in /etc/rc3.d.
#!/bin/sh
# Apply the coalescence settings to every bnx2 (Broadcom) interface at boot.
case "$1" in
start)
    IFACE=$(ls /etc/sysconfig/network-scripts/ifcfg-eth* | grep -v bak | cut -d - -f 3)
    for ETH in $IFACE; do
        if ethtool -i "$ETH" | grep -qw bnx2; then
            echo "Changing settings for $ETH"
            ethtool -C "$ETH" rx-usecs 8 rx-usecs-irq 8 rx-frames 0 rx-frames-irq 0
        else
            echo "$ETH is not a broadcom"
        fi
    done ;;
stop)
    echo " hammer time" ;;
*)
    echo "usage: $0 (start|stop)" ;;
esac
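The install step described above might look something like the sketch below. The script name (bnx2-coalesce), the S99 prefix on the link, and the optional root-prefix argument are all assumptions for illustration; the actual names we used are not shown here. On these releases init runs the S-prefixed links in a runlevel directory with "start", in numeric order, so a high number keeps it after networking has come up.

```shell
#!/bin/sh
# Sketch only: install_rc_script and the S99 prefix are hypothetical choices.
# $1 = path to the rc script, $2 = optional root prefix (empty on a real box).
install_rc_script() {
    src=$1
    root=${2:-}
    name=$(basename "$src")
    # Copy the script into place and make it executable.
    cp "$src" "$root/etc/rc.d/$name" || return 1
    chmod 755 "$root/etc/rc.d/$name" || return 1
    # Link it into runlevel 3 so init calls it with "start" at boot.
    ln -sf "/etc/rc.d/$name" "$root/etc/rc3.d/S99$name"
}

# Example (as root): install_rc_script ./bnx2-coalesce
```

The root prefix is only there so the function can be tried against a scratch directory before touching a live box.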