Skip to content

Troubleshooting

Most appliance issues fall into four buckets: it's not reporting at all, a TLS error, a device that won't authenticate, or the local AI didn't install. Each has a concrete check. SSH to the box as networkops for these.

Where the logs live

What Where
Live collector activity journalctl -u network-collector -f
Collector service state systemctl status network-collector
First-boot provisioning log /opt/acutis-collector/firstboot.log
Local AI install log + status /opt/acutis-collector/ollama-install.log, /opt/acutis-collector/ai-status
Runtime config (secrets, 0600) /etc/acutis-collector.env
Offline telemetry buffer /opt/acutis-collector/collector_cache.db (SQLite)

The collector logs are verbose and plain-spoken — they tell you why something failed (e.g. "Backend REJECTED this collector's TENANT_API_KEY"), so journalctl is almost always the fastest answer.

The appliance isn't reporting

  1. Is the service running?

    systemctl status network-collector
    
    If it's dead, journalctl -u network-collector -n 100 will show why. It runs Restart=always, so a crash normally self-heals within ~10 s.

  2. Did first boot finish? Check /opt/acutis-collector/firstboot.log for === Acutis collector firstboot done ===. If it stopped early (e.g. "BACKEND_URL not set" or "could not download firstboot.sh"), provisioning didn't complete — re-check provision.conf and that the box can reach the backend.

  3. Can it reach the backend?

    curl -fsS -k https://app.acutisgo.com/        # use YOUR backend URL
    
    No response = a network/firewall/DNS problem between the appliance and the backend. The appliance only needs outbound 443.

  4. Is telemetry being buffered? If you see FastAPI server offline. Telemetry cached locally, the appliance is healthy but can't reach the backend — it's queuing data (up to MAX_CACHED_PAYLOADS, newest kept) and will flush automatically once the backend returns.

TLS errors

  • Backend cert errors after setting BACKEND_CA_CERT: the file must exist and contain the backend's cert/CA. If the path is wrong or the file is missing, the appliance silently falls back to unvalidated HTTPS — so a "works without pinning, fails with it" symptom points at the cert file. Verify the path in /etc/acutis-collector.env, then sudo systemctl restart network-collector.
  • Device API cert warnings (DEVICE_TLS_VERIFY is off …): this is an expected one-time warning when device cert validation is disabled (the default for LAN self-signed gear). To silence it and harden, set DEVICE_CA_CERT to your device CA bundle, or DEVICE_TLS_VERIFY=true.
  • Cleartext concern: make sure BACKEND_URL/FASTAPI_COLLECTOR_URL is https://, not http:// — that's what keeps collector↔backend credentials encrypted.

Device authentication failures

  1. "Vault REJECTED this collector's TENANT_API_KEY" (HTTP 401/403): the appliance's own backend key is stale, rotated, or bound to a different tenant. Re-provision TENANT_API_KEY (download a fresh installer bundle, which rotates the key). The appliance will keep reusing its last known-good device list meanwhile rather than guessing.
  2. "No vault credential stored for device" (HTTP 404): you onboarded the device but never stored its credential. Add one in the dashboard vault (see Onboarding devices).
  3. Device login rejected: confirm the username/secret in the vault is correct and that the auth method matches — a PAN-OS API key stored as password (or vice-versa) is a common cause. The appliance will try the other interpretation once for legacy entries, but re-saving with the correct auth_method is the clean fix.
  4. Cisco SSH negotiation fails on old IOS: the appliance already disables rsa-sha2-256/512 so paramiko falls back to ssh-rsa for legacy boxes — if it still fails, the device likely refuses the credentials or the account lacks privilege.

The local AI didn't install

Check the status file:

cat /opt/acutis-collector/ai-status
  • ready — model installed and serving.
  • installing — still working (first pull can take a while).
  • failed:ollama_install / failed:model_pull / etc. — see /opt/acutis-collector/ollama-install.log. Usual cause: the box couldn't reach ollama.ai or the package mirror, or it's low on RAM/disk.
  • not_installed — the installer never ran.

This is non-fatal: the collector polls and reports normally without the local AI. The AI install unit (acutis-ollama.service) is enabled, so an interrupted pull retries on the next boot and disables itself once it reaches ready.

Still stuck?

Grab the machine's support code (or the relevant device id) and the last ~50 lines of journalctl -u network-collector and reach out — the logs almost always pinpoint the cause.

Next: Security.