Troubleshooting¶
Most appliance issues fall into four buckets: it's not reporting at all, a TLS error, a device that won't authenticate, or the local AI didn't install. Each has a concrete check. SSH to the box as networkops for these.
Where the logs live¶
| What | Where |
|---|---|
| Live collector activity | journalctl -u network-collector -f |
| Collector service state | systemctl status network-collector |
| First-boot provisioning log | /opt/acutis-collector/firstboot.log |
| Local AI install log + status | /opt/acutis-collector/ollama-install.log, /opt/acutis-collector/ai-status |
Runtime config (secrets, 0600) |
/etc/acutis-collector.env |
| Offline telemetry buffer | /opt/acutis-collector/collector_cache.db (SQLite) |
The collector logs are verbose and plain-spoken — they tell you why something failed (e.g. "Backend REJECTED this collector's TENANT_API_KEY"), so journalctl is almost always the fastest answer.
The appliance isn't reporting¶
-
Is the service running?
If it's dead,journalctl -u network-collector -n 100will show why. It runsRestart=always, so a crash normally self-heals within ~10 s. -
Did first boot finish? Check
/opt/acutis-collector/firstboot.logfor=== Acutis collector firstboot done ===. If it stopped early (e.g. "BACKEND_URL not set" or "could not download firstboot.sh"), provisioning didn't complete — re-checkprovision.confand that the box can reach the backend. -
Can it reach the backend?
No response = a network/firewall/DNS problem between the appliance and the backend. The appliance only needs outbound 443. -
Is telemetry being buffered? If you see
FastAPI server offline. Telemetry cached locally, the appliance is healthy but can't reach the backend — it's queuing data (up toMAX_CACHED_PAYLOADS, newest kept) and will flush automatically once the backend returns.
TLS errors¶
- Backend cert errors after setting
BACKEND_CA_CERT: the file must exist and contain the backend's cert/CA. If the path is wrong or the file is missing, the appliance silently falls back to unvalidated HTTPS — so a "works without pinning, fails with it" symptom points at the cert file. Verify the path in/etc/acutis-collector.env, thensudo systemctl restart network-collector. - Device API cert warnings (
DEVICE_TLS_VERIFY is off …): this is an expected one-time warning when device cert validation is disabled (the default for LAN self-signed gear). To silence it and harden, setDEVICE_CA_CERTto your device CA bundle, orDEVICE_TLS_VERIFY=true. - Cleartext concern: make sure
BACKEND_URL/FASTAPI_COLLECTOR_URLishttps://, nothttp://— that's what keeps collector↔backend credentials encrypted.
Device authentication failures¶
- "Vault REJECTED this collector's TENANT_API_KEY" (HTTP 401/403): the appliance's own backend key is stale, rotated, or bound to a different tenant. Re-provision
TENANT_API_KEY(download a fresh installer bundle, which rotates the key). The appliance will keep reusing its last known-good device list meanwhile rather than guessing. - "No vault credential stored for device" (HTTP 404): you onboarded the device but never stored its credential. Add one in the dashboard vault (see Onboarding devices).
- Device login rejected: confirm the username/secret in the vault is correct and that the auth method matches — a PAN-OS API key stored as
password(or vice-versa) is a common cause. The appliance will try the other interpretation once for legacy entries, but re-saving with the correctauth_methodis the clean fix. - Cisco SSH negotiation fails on old IOS: the appliance already disables
rsa-sha2-256/512so paramiko falls back tossh-rsafor legacy boxes — if it still fails, the device likely refuses the credentials or the account lacks privilege.
The local AI didn't install¶
Check the status file:
ready— model installed and serving.installing— still working (first pull can take a while).failed:ollama_install/failed:model_pull/ etc. — see/opt/acutis-collector/ollama-install.log. Usual cause: the box couldn't reachollama.aior the package mirror, or it's low on RAM/disk.not_installed— the installer never ran.
This is non-fatal: the collector polls and reports normally without the local AI. The AI install unit (acutis-ollama.service) is enabled, so an interrupted pull retries on the next boot and disables itself once it reaches ready.
Still stuck?¶
Grab the machine's support code (or the relevant device id) and the last ~50 lines of journalctl -u network-collector and reach out — the logs almost always pinpoint the cause.
Next: Security.