It is recommended to set nf_conntrack_max as shown below on all nodes in the cloud environment (compute, controller, and storage nodes):
Set nf_conntrack_max to 1048576 (default is 65536):
# sysctl -w net.netfilter.nf_conntrack_max=1048576
# echo 24576 > /sys/module/nf_conntrack/parameters/hashsize
And add the below line to /etc/modprobe.conf
Note: hashsize can be set to nf_conntrack_max divided by 4 or 8.
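The ratio in the note above can be worked out with a small shell helper. This is only a sketch: the function name conntrack_hashsize is hypothetical, and the divisors 4 and 8 are taken from the note, not from any kernel requirement.

```shell
#!/bin/sh
# Hypothetical helper: derive a candidate hashsize from nf_conntrack_max.
# $1 = nf_conntrack_max, $2 = divisor (assumed to be 4 or 8, default 8)
conntrack_hashsize() {
    max=$1
    divisor=${2:-8}
    echo $(( max / divisor ))
}

conntrack_hashsize 1048576 4   # prints 262144
conntrack_hashsize 1048576 8   # prints 131072
```

Either result could then be written to /sys/module/nf_conntrack/parameters/hashsize in the same way as the echo command above.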
NOTE: Check the currently configured value as below:
# grep nf_conntrack_max /etc/sysctl.conf
Why does the PG number need to be reduced?
- The default pools' PG (placement group) number may be higher than needed.
- Higher PG numbers negatively impact Ceph cluster usage, recovery, and rebalance time.
Warning: This process requires a maintenance window for the Ceph cluster and can mean a significant amount of downtime, depending on how much data the pools hold.
In general, Ceph does not allow decreasing the pg/pgp number of an existing pool, so the data is copied into a new pool created with the lower PG number.
- Schedule a maintenance window during which no client uses the pool whose PG number is to be decreased. If needed, stop the appropriate client services to stop the IO on this pool.
- If the pool contains a significant amount of data, the steps will take a while, because all data for that pool is copied into a new pool and the old pool is then deleted.
- In the below script, edit the parameters pool, pool_new and new_pg as per requirements.
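pool='.users'           # pool whose PG number is to be reduced
pool_new='.users.new'   # temporary new pool
new_pg=64               # new PG number required, for example 64
# Create the new pool with the reduced PG number
ceph osd pool create "$pool_new" "$new_pg"
# Copy all objects from the old pool into the new pool
rados cppool "$pool" "$pool_new"
# Delete the old pool, then give the new pool the original name
ceph osd pool delete "$pool" "$pool" --yes-i-really-really-mean-it
ceph osd pool rename "$pool_new" "$pool"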
NOTE: Make sure the new pool uses the same CRUSH ruleset as the current pool.
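Before running the steps for real, it can help to print the exact commands for review. The sketch below does only that and does not touch the cluster. The function name print_pg_reduce_plan and the ".new" suffix for the temporary pool are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the pool-copy commands without running them.
# $1 = pool to shrink, $2 = new PG number
print_pg_reduce_plan() {
    pool=$1
    new_pg=$2
    pool_new="${pool}.new"   # assumed naming for the temporary pool
    echo "ceph osd pool create ${pool_new} ${new_pg}"
    echo "rados cppool ${pool} ${pool_new}"
    echo "ceph osd pool delete ${pool} ${pool} --yes-i-really-really-mean-it"
    echo "ceph osd pool rename ${pool_new} ${pool}"
}

print_pg_reduce_plan .users 64
```

The printed lines can be checked against the pool name and PG number agreed for the maintenance window before anything is executed.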
Notes: Please keep the below items in mind before running the above steps.
- Do not use the above steps for a pool which contains snapshots.
- Please test this on the staging environment before doing it on production.
- Ensure you have enough storage capacity to hold a copy of the biggest pool whose PG number you plan to reduce. Use “ceph df” to check the free/available raw space.
- Make sure there are no near-full OSDs on the Ceph cluster. They may reach the “full” state during the pool data copy, which may stop the above process and may cause a Ceph storage outage.
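The capacity check in the notes above can be reduced to simple arithmetic once the numbers have been read from “ceph df”. The sketch below assumes the operator supplies both values by hand in the same unit and compares raw usage against raw availability. The function name has_room_for_copy and the 20 percent safety margin are hypothetical choices, not Ceph recommendations.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the available space covers a full copy
# of the pool plus a safety margin. Both sizes must use the same unit.
# $1 = space used by the pool, $2 = available space, $3 = margin in percent (default 20)
has_room_for_copy() {
    pool_used=$1
    avail=$2
    margin_pct=${3:-20}
    needed=$(( pool_used + pool_used * margin_pct / 100 ))
    [ "$avail" -ge "$needed" ]
}

# Example values only: 500 units used by the pool, 1000 units available.
if has_room_for_copy 500 1000; then
    echo "enough space"
else
    echo "not enough space"
fi
```

A failing check is a reason to free up or add capacity before the maintenance window, since the old pool is only deleted after the copy completes.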