Ramblings of an aging IT geek
homelab

Nextcloud, the third install and the one that stuck

Reinstalling Nextcloud for the third time, this time on Postgres with Redis and a proper reverse proxy, and what finally made it stop being slow.

[Image: a server rack with cabling in a homelab]

This is my third Nextcloud install. The first two were fine until they weren't, and the failure mode both times was the same: it got slow, I lost patience, and I let it rot. So before I did it again I went and read the admin manual properly, the bit everyone skips, and I think this one is going to last.

The headline lesson from the first two attempts is that the default install lies to you about what it needs. Nextcloud on SQLite with no cache and PHP's defaults will run, and will keep running, right up until you have a few thousand files and two users, at which point every page load feels like it is being assembled by hand. None of that is Nextcloud's fault exactly. It is just that the easy path and the good path diverge early, and the installer never tells you which one you're on.

What I changed this time

Postgres instead of SQLite. This is the single biggest difference. SQLite is genuinely fine for a single user kicking the tyres, but file sync is a write-heavy workload with a lot of small transactions. SQLite lets only one writer in at a time and locks the whole database file while it writes; a real database handles concurrent writes without locking the whole thing up.
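For reference, a minimal sketch of what the database section of config.php tends to look like on a Postgres-backed install. The keys are Nextcloud's own; the database name, role, host and password are made-up placeholders, not the values from this install.

```php
// config.php fragment: database settings for a Postgres-backed install.
// Every value below is an illustrative placeholder (an assumption).
'dbtype' => 'pgsql',
'dbname' => 'nextcloud',      // hypothetical database name
'dbhost' => 'localhost',      // assumes Postgres runs on the same machine
'dbuser' => 'nextcloud',      // hypothetical database role
'dbpassword' => 'change-me',  // placeholder, never commit a real one
'dbtableprefix' => 'oc_',     // Nextcloud's default table prefix
```

The installer normally writes these entries itself when Postgres is picked at setup time, so the fragment is mostly useful for checking what ended up in the file. An existing SQLite instance can be migrated with occ db:convert-type instead of a reinstall.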

Redis for both file locking and the distributed cache. The transactional file locking warning in the admin overview was the thing that finally got me to do this. With Redis configured it goes away, and uploads of many small files stop stepping on each other.

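'memcache.local' => '\OC\Memcache\APCu',         // local cache on this server
'memcache.locking' => '\OC\Memcache\Redis',      // transactional file locking
'memcache.distributed' => '\OC\Memcache\Redis',  // shared cache
'redis' => [
  'host' => '/run/redis/redis.sock',             // Unix socket rather than TCP
  'port' => 0,                                   // 0 because the host is a socket path
],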

APCu for the local cache, Redis for the distributed and locking caches. That combination is what the manual actually recommends, and it is the difference between the admin page nagging me and the admin page being quiet.

[Image: a homelab setup with mini PCs and a switch]

A real reverse proxy in front, terminating TLS, with the headers Nextcloud expects. I am running it behind nginx and the security scan in the admin panel now comes back clean, which it never did when I was hand-rolling the config and missing Strict-Transport-Security.
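On the Nextcloud side, a reverse proxy usually needs a few config.php entries as well, so that generated links use https and the real client address is read from the forwarded headers. A minimal sketch, assuming nginx runs on the same host and using cloud.example.com as a stand-in hostname; both are assumptions, not details from this install.

```php
// config.php fragment: settings commonly needed behind a TLS-terminating proxy.
// The IP address and hostname are placeholders (assumptions).
'trusted_proxies' => ['127.0.0.1'],                  // address the proxy connects from
'overwriteprotocol' => 'https',                      // build https:// links if the backend hop is plain HTTP
'overwrite.cli.url' => 'https://cloud.example.com',  // base URL used by occ and cron
'trusted_domains' => ['cloud.example.com'],          // hostnames Nextcloud will answer to
```

The Strict-Transport-Security header itself is set in the proxy (an add_header line in nginx), not in config.php.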

The bits that bite everyone

Cron. The default AJAX mode only runs a background job when someone loads a page, which means background jobs basically never happen on a personal instance. Switch it to system cron, a real crontab entry for the web server user calling cron.php every five minutes, and the preview generation and trash cleanup actually run.

PHP memory limit. PHP's stock 128M is too low for generating previews of large photos, and the symptom is a blank thumbnail with a 500 buried in the log. Bumping memory_limit to 512M, the minimum the manual recommends, sorted it.

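The occ command is your friend and the web UI is not. Anything that matters, adding the trusted domain, running the database index check, repairing after an upgrade, is faster and clearer from occ than clicking around: config:system:set for trusted_domains, db:add-missing-indices, maintenance:repair, and files:scan --all when files have changed behind Nextcloud's back.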

I am sure I will find something else that is slow in a month. But this time I built it on the foundations the manual asked for rather than the foundations the installer handed me, and the difference in feel is night and day. The page loads are instant, the mobile sync stopped silently dropping files, and I have stopped resenting the thing. Third time, properly. We'll see.