Unexplained memory leak? Free RAM goes down over time
Hi LowEndHelpDesk,
I have a VPS with 2GB of memory. I don't use it a lot, leaving it mostly idle except for the IPv6 BGP sessions (over 6in4) I have with he.net and NetAssist. I don't load the routing table; I just route everything to he.net.
However, I noticed that the memory usage increases slowly over time, and I need to hard-reset the VPS every few weeks to restore it. This behaviour started around the beginning of 2020, but I haven't made any significant changes to the configuration.
Hopefully this graph from HetrixTools explains it better:
Memory usage is quite high:
[me@lax2 ~]$ free -h
              total        used        free      shared  buff/cache   available
Mem:          2.1Gi       1.6Gi        95Mi       1.0Mi       366Mi       289Mi
Swap:         303Mi        20Mi       283Mi
You can see that there's actually no application actively consuming memory:
[me@lax2 ~]$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.3 109744 8264 ? Ss May13 5:58 /sbin/init
root 2 0.0 0.0 0 0 ? S May13 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? I< May13 0:00 [rcu_gp]
root 4 0.0 0.0 0 0 ? I< May13 0:00 [rcu_par_gp]
root 6 0.0 0.0 0 0 ? I< May13 0:00 [kworker/0:0H-kblockd]
root 8 0.0 0.0 0 0 ? I< May13 0:00 [mm_percpu_wq]
root 9 0.0 0.0 0 0 ? S May13 1:39 [ksoftirqd/0]
root 10 0.0 0.0 0 0 ? S May13 0:00 [rcuc/0]
root 11 0.0 0.0 0 0 ? I May13 11:11 [rcu_preempt]
root 12 0.0 0.0 0 0 ? S May13 0:00 [rcub/0]
root 13 0.0 0.0 0 0 ? S May13 0:04 [migration/0]
root 14 0.0 0.0 0 0 ? S May13 0:00 [idle_inject/0]
root 16 0.0 0.0 0 0 ? S May13 0:00 [cpuhp/0]
root 17 0.0 0.0 0 0 ? S May13 0:00 [kdevtmpfs]
root 18 0.0 0.0 0 0 ? I< May13 0:00 [netns]
root 19 0.0 0.0 0 0 ? S May13 0:00 [rcu_tasks_kthre]
root 20 0.0 0.0 0 0 ? S May13 0:03 [kauditd]
root 21 0.0 0.0 0 0 ? S May13 0:00 [khungtaskd]
root 22 0.0 0.0 0 0 ? S May13 0:00 [oom_reaper]
root 23 0.0 0.0 0 0 ? I< May13 0:00 [writeback]
root 24 0.0 0.0 0 0 ? S May13 0:01 [kcompactd0]
root 25 0.0 0.0 0 0 ? SN May13 0:00 [ksmd]
root 26 0.0 0.0 0 0 ? SN May13 0:00 [khugepaged]
root 114 0.0 0.0 0 0 ? I< May13 0:00 [kintegrityd]
root 115 0.0 0.0 0 0 ? I< May13 0:00 [kblockd]
root 116 0.0 0.0 0 0 ? I< May13 0:00 [blkcg_punt_bio]
root 117 0.0 0.0 0 0 ? I< May13 0:00 [ata_sff]
root 118 0.0 0.0 0 0 ? I< May13 0:00 [edac-poller]
root 119 0.0 0.0 0 0 ? I< May13 0:00 [devfreq_wq]
root 120 0.0 0.0 0 0 ? S May13 0:00 [watchdogd]
root 121 0.0 0.0 0 0 ? S May13 4:59 [kswapd0]
root 124 0.0 0.0 0 0 ? I< May13 0:00 [kthrotld]
root 125 0.0 0.0 0 0 ? I< May13 0:00 [acpi_thermal_pm]
root 126 0.0 0.0 0 0 ? I< May13 0:00 [nvme-wq]
root 127 0.0 0.0 0 0 ? I< May13 0:00 [nvme-reset-wq]
root 128 0.0 0.0 0 0 ? I< May13 0:00 [nvme-delete-wq]
root 129 0.0 0.0 0 0 ? I< May13 0:00 [ipv6_addrconf]
root 140 0.0 0.0 0 0 ? I< May13 0:00 [kstrp]
root 146 0.0 0.0 0 0 ? I< May13 0:00 [zswap-shrink]
root 147 0.0 0.0 0 0 ? I< May13 0:00 [kworker/u3:0]
root 158 0.0 0.0 0 0 ? I< May13 0:00 [charger_manager]
root 188 0.0 0.0 0 0 ? S May13 0:00 [scsi_eh_0]
root 189 0.0 0.0 0 0 ? I< May13 0:00 [scsi_tmf_0]
root 190 0.0 0.0 0 0 ? S May13 0:00 [scsi_eh_1]
root 191 0.0 0.0 0 0 ? I< May13 0:00 [scsi_tmf_1]
root 195 0.0 0.0 0 0 ? I< May13 0:15 [kworker/0:1H-kblockd]
root 206 0.0 0.0 0 0 ? S May13 0:10 [jbd2/vda2-8]
root 207 0.0 0.0 0 0 ? I< May13 0:00 [ext4-rsv-conver]
root 234 0.0 2.2 172832 47940 ? Ss May13 3:28 /usr/lib/systemd/systemd-journald
root 242 0.0 0.0 78076 688 ? Ss May13 0:00 /usr/bin/lvmetad -f
root 247 0.0 0.2 30940 4732 ? Ss May13 0:02 /usr/lib/systemd/systemd-udevd
systemd+ 250 0.0 0.2 26260 5208 ? Ss May13 0:05 /usr/lib/systemd/systemd-networkd
systemd+ 276 0.0 0.1 91784 4276 ? Ssl May13 0:03 /usr/lib/systemd/systemd-timesyncd
root 289 0.0 0.0 6588 1968 ? Ss May13 0:07 /usr/bin/crond -n
dbus 290 0.0 0.1 6780 2772 ? Ss May13 3:35 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root 291 0.0 0.2 17504 5660 ? Ss May13 1:47 /usr/lib/systemd/systemd-logind
v2ray 294 0.1 0.6 1168500 13268 ? Ssl May13 14:52 /usr/bin/v2ray -config /etc/v2ray/0.json
bird 295 0.0 0.0 7156 1268 ? Ss May13 2:00 /usr/bin/bird -s /run/bird/bird.ctl
root 296 0.0 0.0 5424 1304 tty1 Ss+ May13 0:00 /sbin/agetty -o -p -- \u --noclear tty1 linux
http 315 0.0 0.5 1255500 12164 ? Ssl May13 0:54 /usr/bin/caddy -log stdout -agree -conf /etc/caddy/caddy.conf -root=/usr/share/caddy
root 3555765 0.0 0.0 0 0 ? I 21:27 0:00 [kworker/0:0-rcu_gp]
root 3566553 0.0 0.0 0 0 ? I 21:33 0:00 [kworker/u2:1-ext4-rsv-conversion]
root 3567234 0.0 0.0 0 0 ? I 21:33 0:00 [kworker/0:1-events]
root 3569037 0.0 0.3 10704 7448 ? Ss 21:34 0:00 sshd: me [priv]
me 3569138 0.0 0.4 18972 9776 ? Ss 21:34 0:00 /usr/lib/systemd/systemd --user
me 3569140 0.0 0.0 113348 2116 ? S 21:34 0:00 (sd-pam)
me 3569145 0.0 0.1 10704 4156 ? S 21:34 0:00 sshd: me@pts/0
me 3569146 0.0 0.1 7488 4000 pts/0 Ss 21:34 0:00 -bash
root 3577610 0.0 0.0 0 0 ? I 21:38 0:00 [kworker/u2:0-flush-254:0]
root 3584705 0.1 0.0 0 0 ? I 21:42 0:00 [kworker/0:2-events]
root 3588276 0.0 0.0 0 0 ? I 21:44 0:00 [kworker/u2:2-events_unbound]
root 3591768 0.0 0.1 9636 3600 ? S 21:46 0:00 /usr/bin/CROND -n
hetrixt+ 3591769 0.0 0.1 7144 2860 ? Ss 21:46 0:00 /bin/sh -c bash /etc/hetrixtools/hetrixtools_agent.sh >> /etc/hetrixtools/hetrixtools_cron.log 2>&1
hetrixt+ 3591770 0.3 0.1 7144 3392 ? S 21:46 0:00 bash /etc/hetrixtools/hetrixtools_agent.sh
hetrixt+ 3593070 0.0 0.0 7144 1816 ? S 21:46 0:00 bash /etc/hetrixtools/hetrixtools_agent.sh
hetrixt+ 3593071 0.0 0.0 7972 1184 ? S 21:46 0:00 vmstat 3 2
hetrixt+ 3593072 0.0 0.0 5340 580 ? S 21:46 0:00 tail -1
me 3593073 0.0 0.1 9500 3596 pts/0 R+ 21:46 0:00 ps aux
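For anyone who wants a number instead of eyeballing the list, the RSS column can be summed. This is only a rough sketch: `sum_rss_mib` is a made-up helper name, and RSS counts shared pages once per process, so the total is an upper bound on what userspace actually holds.

```shell
# sum_rss_mib (hypothetical helper): read one RSS value in KiB per line
# from stdin and print the total in MiB.
sum_rss_mib() {
    awk '{ total += $1 } END { printf "%.1f\n", total / 1024 }'
}

# Live usage: "rss=" asks ps for the RSS column only, with no header line.
ps -eo rss= | sum_rss_mib
```

If the total is far below the "used" figure from free, the missing memory is not held by any process and is more likely on the kernel side.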
[me@lax2 ~]$ cat /proc/meminfo
MemTotal: 2163084 kB
MemFree: 111300 kB
MemAvailable: 302496 kB
Buffers: 5500 kB
Cached: 70908 kB
SwapCached: 1008 kB
Active: 56160 kB
Inactive: 54484 kB
Active(anon): 10912 kB
Inactive(anon): 22148 kB
Active(file): 45248 kB
Inactive(file): 32336 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 311292 kB
SwapFree: 289952 kB
Dirty: 8 kB
Writeback: 0 kB
AnonPages: 33544 kB
Mapped: 46160 kB
Shmem: 1624 kB
KReclaimable: 284608 kB
Slab: 1881652 kB
SReclaimable: 284608 kB
SUnreclaim: 1597044 kB
KernelStack: 1520 kB
PageTables: 1616 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1392832 kB
Committed_AS: 476668 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 10888 kB
VmallocChunk: 0 kB
Percpu: 37680 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 1665012 kB
DirectMap2M: 563200 kB
Any ideas?
Comments
Looks like it's all being eaten by Slab (about 1.8 GB of your 2.1 GB, most of it SUnreclaim)? Which IIRC is the kernel's own object cache. Try
slabtop
and maybe that can give you some hints, or you can share the output here for someone much smarter than myself to help with. lol🦍🍌
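To put the Slab observation into numbers, here is a small sketch that reads the MemTotal, Slab and SUnreclaim fields from /proc/meminfo and prints the slab figures as a share of total memory. `slab_share` is a hypothetical helper name, not an existing tool.

```shell
# slab_share [FILE] (hypothetical helper): print Slab and SUnreclaim as a
# percentage of MemTotal. FILE defaults to /proc/meminfo.
slab_share() {
    awk '
        $1 == "MemTotal:"   { total = $2 }
        $1 == "Slab:"       { slab  = $2 }
        $1 == "SUnreclaim:" { unrec = $2 }
        END {
            if (total > 0)
                printf "Slab %.1f%% SUnreclaim %.1f%%\n",
                       100 * slab / total, 100 * unrec / total
        }
    ' "${1:-/proc/meminfo}"
}

# Live usage on the affected machine.
slab_share
```

With the values posted above this comes out to roughly 87% for Slab and 74% for SUnreclaim. The unreclaimable part is the one the kernel cannot drop under memory pressure, so that is the figure to watch.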
Thanks for the pointers! I never realised slab is a thing. Here's the output from slabtop and I will continue Googling tomorrow:
What does cat /proc/slabinfo give?
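If the raw file is too noisy, a rough ranking helps. The sketch below assumes the slabinfo version 2.1 column order (name, active_objs, num_objs, objsize, ...) and estimates each cache as num_objs times objsize, which ignores per-slab overhead. `top_slabs` is a made-up name.

```shell
# top_slabs [FILE] [N] (hypothetical helper): list the N largest slab caches
# by approximate size in KiB, estimated as num_objs * objsize.
# Assumes slabinfo 2.1 columns: name active_objs num_objs objsize ...
top_slabs() {
    awk '
        /^slabinfo/ || /^#/ { next }          # skip the two header lines
        { printf "%d %s\n", $3 * $4 / 1024, $1 }
    ' "${1:-/proc/slabinfo}" | sort -rn | head -n "${2:-10}"
}

# Live usage (reading /proc/slabinfo normally requires root):
#   top_slabs /proc/slabinfo 10
```

Whichever cache sits at the top and keeps growing between runs is the one worth searching for.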
V2ray and caddy in-memory cache?
What is
cred_jar
under slabinfo? I have to reboot my box every week, otherwise the box will become unresponsive after 8 or 9 days.
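As far as I know, cred_jar is the kernel's cache for process credential structures (struct cred). To check whether one cache really grows without bound, its object count can be sampled over time. A minimal sketch, assuming the same slabinfo 2.1 column order; `cache_objs` and `watch_cache` are hypothetical helper names and the one-hour interval is arbitrary.

```shell
# cache_objs NAME [FILE] (hypothetical helper): print num_objs (third column)
# for the slab cache called NAME. FILE defaults to /proc/slabinfo.
cache_objs() {
    awk -v name="$1" '$1 == name { print $3 }' "${2:-/proc/slabinfo}"
}

# watch_cache NAME [SECONDS] (hypothetical helper): print "epoch count"
# once per interval until interrupted. Run as root, for example:
#   watch_cache cred_jar 3600 >> /root/cred_jar.log
watch_cache() {
    while :; do
        printf '%s %s\n' "$(date +%s)" "$(cache_objs "$1")"
        sleep "${2:-3600}"
    done
}
```

A count that only ever climbs across several days, on an otherwise idle box, would point at a leak in that cache rather than ordinary caching.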