We run rsync as a cron job (every 10 minutes) on a production VM with 16 GB of RAM to sync a multi-gigabyte folder from AWS EFS to local storage. After a few days we noticed that the VM was running low on memory: buff/cache had grown to more than 4 GB, and vmtouch confirmed that the entire folder we sync was being cached.
Since this is production, we have an alerting system that fires whenever available memory on the VM drops below 20%.
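For context, the figure such an alert looks at can be approximated as follows. This is only a minimal sketch, assuming the alert is based on the MemAvailable field of /proc/meminfo (our monitoring tool may compute it differently); the function name mem_available_pct is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print available memory as a whole-number percentage
# of total memory, read from a meminfo-format file (default: /proc/meminfo).
# MemAvailable is the kernel's estimate of memory usable without swapping,
# and it already counts page cache that can be reclaimed.
mem_available_pct() {
    meminfo="${1:-/proc/meminfo}"
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END { if (total > 0) printf "%d\n", avail * 100 / total }
    ' "$meminfo"
}

# Usage: mem_available_pct            (reads /proc/meminfo)
#        mem_available_pct some_file  (reads a saved copy)
```

If the alert is instead derived from the "free" column, a large buff/cache would trigger it even when most of that memory is reclaimable.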
So, as a quick fix, we clear the cache after each rsync run using the command echo 2 > /proc/sys/vm/drop_caches.
I'm completely against this, because clearing the cache will hurt performance. However, a few articles on the internet suggest clearing the cache after running rsync jobs (for example 1, 2), while many other resources say not to clear it.
Considering only rsync:
- Do we really need to worry about its aggressive caching, which is what makes our buff/cache so high?
- How does the rest of the world deal with this? Do you clear the cache each time you run rsync?
The rsync command we are using:
rsync -aA --delete /... /...