CEPH: TCP Performance Tuning


Below are a few TCP tunables that I ran into when looking into TCP performance tuning for CEPH.

Note that there are two separate sections for 10GE connectivity, so you will want to test with both to find what works best for your environment.

To implement, we just add what is below to /etc/sysctl.d/99-sysctl.conf and run “sysctl -p“. Changes are persistent across reboots. Ideally these TCP tunables should be deployed to all CEPH nodes (OSD most importantly).

[code language=”css”]
## Increase Linux autotuning TCP buffer limits
## Set max to 16MB (16777216) for 1GE
## 32MB (33554432) or 54MB (56623104) for 10GE

# 1GE/16MB (16777216)
#net.core.rmem_max = 16777216
#net.core.wmem_max = 16777216
#net.core.rmem_default = 16777216
#net.core.wmem_default = 16777216
#net.core.optmem_max = 40960
#net.ipv4.tcp_rmem = 4096 87380 16777216
#net.ipv4.tcp_wmem = 4096 65536 16777216

# 10GE/32MB (33554432)
#net.core.rmem_max = 33554432
#net.core.wmem_max = 33554432
#net.core.rmem_default = 33554432
#net.core.wmem_default = 33554432
#net.core.optmem_max = 40960
#net.ipv4.tcp_rmem = 4096 87380 33554432
#net.ipv4.tcp_wmem = 4096 65536 33554432

# 10GB/54MB (56623104)
net.core.rmem_max = 56623104
net.core.wmem_max = 56623104
net.core.rmem_default = 56623104
net.core.wmem_default = 56623104
net.core.optmem_max = 40960
net.ipv4.tcp_rmem = 4096 87380 56623104
net.ipv4.tcp_wmem = 4096 65536 56623104

## Increase number of incoming connections. The value can be raised to bursts of request, default is 128
net.core.somaxconn = 1024

## Increase number of incoming connections backlog, default is 1000
net.core.netdev_max_backlog = 50000

## Maximum number of remembered connection requests, default is 128
net.ipv4.tcp_max_syn_backlog = 30000

## Increase the tcp-time-wait buckets pool size to prevent simple DOS attacks, default is 8192
net.ipv4.tcp_max_tw_buckets = 2000000

# Recycle and Reuse TIME_WAIT sockets faster, default is 0 for both
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1

## Decrease TIME_WAIT seconds, default is 30 seconds
net.ipv4.tcp_fin_timeout = 10

## Tells the system whether it should start at the default window size only for TCP connections
## that have been idle for too long, default is 1
net.ipv4.tcp_slow_start_after_idle = 0

#If your servers talk UDP, also up these limits, default is 4096
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192

## Disable source redirects
## Default is 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.accept_redirects = 0

## Disable source routing, default is 0
net.ipv4.conf.all.accept_source_route = 0

Reference here

RHEL6 – Getting Up Close and Personal With Rsyslog

LogRsyslog has replaced Syslog as the default logging daemon in RHEL6. Rsyslog was designed to complete with syslog-ng and has several enhancements over plain old syslog. This includes but is not limited to more granularity with timestamps, direct database logging,   TCP support, and  relay server names in host fields which makes it easier to track the path a message has taken.

Below we are going to take a look at a few simple rsyslog configuration items.

Configure Rsyslog to Accept Remote Logs.

Within /etc/rsyslog.conf, comment out either the TCP or UDP syslog reception lines below. TCP is more reliable, however UDP is more widely supported.

# Provides UDP syslog reception
#$ModLoad imudp.so
#$UDPServerRun 514

# Provides TCP syslog reception
#$ModLoad imtcp.so
#$InputTCPServerRun 514

Configure a Server to Send Logs to a Remote Host.

To send all messages of info priority or higher to a remote host via udp, use the following format. Note that is the remote server that I want to send logs to.

*.info    @

To send the same priorities to the remote host via TCP, use two "@@"

*.info    @@

Note that you can specify the port number on which to send by using IP:PORT. When no port is specified the default port of 514 is used.

Note that depending on your configuration you may need to alter your IPtables configuration on your sending and/or receiving server. In my case I needed to allow UDP on port 514 on my remote syslog server. To accomplish this I used system-config-firewall-tui which added the following line to /etc/sysconfig/iptables.

-A INPUT -m state –state NEW -m udp -p udp –dport 514 -j ACCEPT

Which shows up as what you see below in the output of 'iptables -L'

ACCEPT     udp  –  anywhere             anywhere            state NEW udp dpt:syslog

Testing Your Configuration

Ok lets send a test to our remote syslog server. Note that rsyslog has been restarted on both hosts.

# logger "testing to remote rsyslog server"

Checking the messages file on the remote host we can see that the test message has arrived.

Aug 13 14:55:26 vfatmin02 root: testing to remote rsyslog server