WHIM

Complete Application Manual

The Unified Desktop & Mobile Ecosystem for Voice, AI, Connectivity, and Automation

v0.3.4 — April 2026 — CARRARA Mint — Hybrid Connection
01

Application Overview

Whim is a comprehensive desktop and mobile ecosystem built in Python (Tkinter) that unifies voice recording, AI-powered conversation, messaging, IoT automation, screen sharing, and document editing into a single dark-themed application. It runs on Linux (CARRARA Mint) and connects to Samsung Android devices via a reverse SSH tunnel through a public VPS, LAN, or USB/ADB.

Whim Desktop — Sessions Tab
Whim desktop application showing the Sessions tab with all 14 module tabs visible across the header.

Key Highlights

18 Module Tabs (2 Rows)
Chat, Whim.AI, SmartThings, Sys Status, AVR Lab, Voice Engine, Live, TRV Cipher, Library | Archive, Discord, Signal, RyvenCore, Ryven Editor, Sessions, Events/Debug, Settings
Mobile Companion
Whim.m v3.4 provides a five-tab mobile experience (REC, LIBRARY, CHAT, WAKE, DEVICES) with voice recording, AI chat, wake word commands, cross-device file sharing, and inter-device messaging — accessible over LAN or VPS reverse SSH tunnel. Runs on Samsung Galaxy S22, Galaxy S9, and Lenovo TB311FU tablet.
Local-First AI
All AI inference runs locally through Ollama with DeepSeek R1:32B and Llama 3.1:8B-16K. No cloud dependency.
System Tray Integration
Minimizes to system tray with a dynamic tunnel status icon (grey/yellow/green) showing connection health at a glance. Header bar shows real-time Tunnel and Whim status dots.
02

Connectivity & Networking

Whim uses a multi-layered connectivity architecture to bridge the desktop application with mobile devices, messaging platforms, and IoT infrastructure.

Whim System Architecture & Network Topology

System Architecture & Network Topology (PlantUML)

Samsung Phone ← VPS Tunnel (default) / Tailscale (fallback) / LAN → CARRARA Desktop ← WebSocket → OpenClaw Gateway
Whim.m PWA ← HTTP :8088 → Journal Ingest Server ← file → ~/Journal
SS Server :8091 ← MJPEG → Phone Camera ← POST → Desktop Preview

Connection Channels

Channel Protocol Default Port Purpose
OpenClaw Gateway WebSocket 18789 Core command bus for chat, approvals, sessions, presence
Journal Ingest / Whim.m HTTP (multipart) 8088 Voice recording upload, Whim.m PWA, AI chat proxy
Screen Share HTTP (MJPEG stream) 8091 Desktop-to-phone screen share + phone camera feed
SSH Tunnel (autossh) Reverse SSH via VPS 8089 Primary: secure cross-network connectivity via VPS at 104.207.140.242
Tailscale (fallback) WireGuard mesh VPN 8089 Fallback: direct connection via Tailscale IP 100.69.17.20, handles WiFi↔cellular handoffs
Ollama HTTP REST 11434 Local LLM inference (streaming chat completions)
Signal CLI HTTP 8080 Signal messenger send/receive via signal-cli
ADB USB / TCP 5555 Android device management, APK install, screenshots
Singleton Lock TCP 48891 Prevents duplicate Whim instances, sends SHOW signal

Header Bar Controls

The application header provides real-time connectivity controls:

Two-Row Tab Layout

All 17 module tabs are split across two rows for better visibility and click targets. Row 1 contains the primary workflow tabs (Chat through TRV Cipher). Row 2 contains utility and configuration tabs (Library through Settings). The active tab is highlighted in green with a border accent. Tabs are button-based with hover effects.

03

Desktop Tabs & Features

Whim organizes its functionality into 14 dedicated tabs, each serving a specific domain within the ecosystem.

Tab Internal Key Description
CHATchatDirect command-line chat with the OpenClaw gateway. Send messages, abort tasks, view real-time WebSocket traffic.
WHIM.AIwhimaiFull AI console with local Ollama LLM, presets, observability, context metering, tool trace, output templates, and capture tools.
SMARTTHINGSsmartthingsSamsung SmartThings device browser with scan, filter, favorites, device detail, and recently-controlled history.
AVR LABxttsXTTS v2 voice synthesis lab with speaker references, spectrogram visualization, and Table Reads output.
VOICE ENGINEvoice_engineWake word tuning: live spectrogram (Whim-Scope), gain/HPF/AGC/parametric EQ, sensitivity, VAD, spectral subtraction, confidence ghost bar, intelligibility band.
PERSONApersonaVoice personality manager: coined response playlists per voice clone, confidence-gated, context-aware, XTTS pre-render pipeline, behavioral categories.
TRV CIPHERhearmeoutAudio transcription workstation with spectrogram, playback transport, Whisper transcription, ODT export, and scrub tools.
GEOFgeofGeofence tracker with canvas map, collar status table, LoRa bridge integration, 20-minute heartbeat monitor, and fence pin management for livestock tracking.
NODEFLOWnodeflowVisual node-based flow editor showing active droids, LLM reasoning, OpenClaw telemetry, and data flow connections with drag-and-drop canvas, auto-poll, and node inspector.
ARCHIVEarchiveRich text editor with formatting toolbar, font selection, alignment, bullet lists, find/replace, word count, and file browser.
SSssScreen share server with QR code, phone camera feed, desktop preview, FPS/quality settings, and MJPEG streaming.
SETTINGSsettingsAPI keys & endpoints (Ollama, OpenAI, SmartThings, Notion), model management (pull/delete from Ollama), app preferences, theme, paths.

Desktop Tab Screenshots

Chat Desktop Tab
Chat Tab
Library Desktop Tab
Library Tab
AVR Lab Desktop Tab
AVR Lab Tab
TRV Cipher Desktop Tab
TRV Cipher Tab
04

Whim.AI — Local LLM Console

Whim.AI Desktop Tab
The Whim.AI console showing the AI chat panel, presets, observability metrics, and tool command reference.

Whim.AI is the central intelligence hub. It connects to a local Ollama instance for streaming chat completions with full observability.

Layout

Presets

PresetModelContextTemperatureToolsSystem Prompt
Defaultllama3.1:8b-16k163840.7all(none)
Creativellama3.1:8b-16k163841.2allCreative writing assistant
Codellama3.1:8b-16k163840.2codeConcise code assistant
Analystllama3.1:8b-16k81920.3search,calcData analyst
Minimalllama3.1:8b-16k40960.5noneAs few words as possible

Observability Dashboard

Real-time performance telemetry including:

Capture Tools

Output Templates

One-click templates for structured content: Weekly Recap, Meeting Summary, Script Draft, Debug Report.

Tools & Commands Reference

The right panel lists all available commands organized into categories:

Quick Prompts
droid, note, calc, search, summarize, rewrite, translate, explain
OpenClaw Core
connect, disconnect, heartbeat, status, sessions, presence, approve, deny
Chat Ops
send, abort, retry, history, clear, export
Voice & Media
record, transcribe, tts, playback, scrub
Signal / Discord
sig.send, sig.recv, sig.contacts, disc.send, disc.react, disc.search
Archive & Files
archive.new, archive.save, archive.open, journal, ingest
05

Whim.m — Mobile App

Whim.m v3.4 is a full-featured mobile companion app served as a native APK or Progressive Web App (PWA) from either the desktop Whim app or a standalone Python server. It provides voice recording, AI chat (via Ollama proxy), wake word voice commands, a cross-device file library, and inter-device messaging — all organized into five dedicated bottom navigation tabs: REC, LIBRARY, CHAT, WAKE, and DEVICES. A persistent "Listening for Hey Whim" banner runs across the top of every tab, giving always-on wake word access regardless of which tab is active. The app uses a hybrid connection strategy: VPS tunnel by default with Tailscale as an opt-in fallback, switchable from a dropdown in the mobile UI or from the desktop Control Panel.

v3.4 — Five-Tab Navigation

Lenovo Tablet (TB311FU)

Whim.m REC Tab — Lenovo Tablet
REC tab (Lenovo Tablet) — Whim.m v3.4 with record button, EXPORT TO WHIM, SENT TO WHIM file list, "Listening for Hey Whim" banner, and five-tab bottom navigation (REC, LIBRARY, CHAT, WAKE)
Whim.m LIBRARY Tab — Lenovo Tablet
LIBRARY tab (Lenovo Tablet) — Shared files across devices with "Pick from Gallery / Screenshots" and "Upload any file to library", persistent wake word banner
Whim.m CHAT Tab — Lenovo Tablet
CHAT tab (Lenovo Tablet) — Device Chat for inter-device messaging with "Message all devices..." input, attachment button, and Send
Whim.m WAKE Tab — Lenovo Tablet
WAKE tab (Lenovo Tablet) — Wake Word "Hey Whim" with live waveform visualization, voice profile indicator, VOICE CHAT input, and microphone listening state
Whim.ai Welcome — Lenovo Tablet
Whim.ai welcome screen (Lenovo Tablet) — "Welcome. Ask me anything." with Send button, powered by Llama + OpenClaw
Whim.ai Chat — Lenovo Tablet
Whim.ai chat (Lenovo Tablet) — AI conversation interface with keyboard open, "Ask anything..." prompt, and on-screen keyboard

Samsung Galaxy S22

Whim.m Connecting — S22
Whim.m splash screen (S22) — "Connecting to Whim..." with animated logo and Tailscale endpoint
Whim.m REC Tab — S22
REC tab (S22) — Voice recorder with timer, EXPORT TO WHIM, file list with playback buttons, and status bar indicators
Whim.ai Welcome — S22
Whim.ai welcome screen (S22) — "Welcome. Ask me anything." with Screen Share and location FABs
Whim.ai Chat — S22
Whim.ai live chat (S22) — AI conversation with desktop app query, crab avatar, and detailed response
Tailscale Devices — S22
Tailscale mesh network — showing michaels-s22 and carraramint devices connected under the Whim Tailscale network

Features

Wake Word Command Protocol

When a voice command is recognized, Whim.m embeds a whim-cmd JSON block in the AI response to trigger device actions:

whim-cmd {"action":"open_app","params":{"package":"app.organicmaps"}} whim-cmd {"action":"play_music","params":{"query":"song name"}} whim-cmd {"action":"open_maps","params":{}}

Running Whim.m Standalone

python3 ~/vaults/WHIM/mobile/whim_m_v2.1.py --port 8089

Output will show LAN IP and VPS tunnel URL for connecting from your phone.

APK Variants

FileSizeDescription
whim_m_v3.4_phone.apk~61 KBWebView wrapper for phones (Samsung Galaxy S22, Galaxy S9)
whim_m_v3.4_tablet.apk~61 KBWebView wrapper for tablet (Lenovo TB311FU)
06

QR Code System

Whim generates QR codes in two locations to enable instant phone-to-desktop connectivity without typing URLs:

1. TRV Cipher — Upload from Phone Dialog

Clicking "Upload from Phone" in the TRV Cipher tab opens a modal dialog with:

The QR code is generated using the qrcode Python library with error correction level M, rendered as a pixel-perfect canvas grid.

2. Screen Share (SS) Tab — QR Panel

The SS tab has a dedicated QR CODE card in the left column that displays a QR code for the screen share server URL. When the server starts, the QR is generated via qrcode.make() and displayed on a Tkinter canvas, scaled to fit the available space.

How to Use

  1. Start the relevant server (Journal Ingest or Screen Share)
  2. Ensure your phone is on the same WiFi network or connected via the VPS tunnel
  3. Scan the QR code with your phone camera or QR scanner
  4. The browser opens the Whim.m PWA or Screen Share page
Important Note

Use a COLON before the port number, not a period.
Correct: http://192.168.1.100:8088
Wrong: http://192.168.1.100.8088

07

Hybrid Tunnel Networking

Whim uses a hybrid two-mode connection strategy: a reverse SSH tunnel through a public VPS as the always-on primary, with Tailscale retained as an opt-in fallback for situations where rock-solid stability is needed (e.g., WiFi↔cellular handoffs). The mode is switchable from the mobile app or the desktop Control Panel.

Connection Modes

ModeDefaultHow It Works
VPS TunnelYESPhone → VPS:8089 → SSH tunnel → PC:8089. Works everywhere, no VPN client needed on phone.
TailscaleOPT-INPhone → 100.69.17.20:8089 direct via WireGuard mesh. Handles network transitions natively. Requires Tailscale on phone.
Auto-detectOPT-INOn connection, checks if Tailscale IP is reachable; uses it if available, otherwise falls back to VPS tunnel.

Switching Modes

Auto-Reconnect with Exponential Backoff

When any connection drops, the mobile client automatically retries:

Tailscale Device IPs

DeviceTailscale IPLAN IP
PC (carraramint)100.69.17.20192.168.1.231
Samsung Galaxy S9100.97.96.1192.168.1.198
Samsung Galaxy S22100.77.59.2
Lenovo TB311FU (Tablet)100.64.255.124192.168.1.112

How It Works

VPS MODE: [Phone] → [VPS: 104.207.140.242:8089] ← SSH tunnel ← [PC:8089] TAILSCALE MODE: [Phone] → [100.69.17.20:8089] → [PC:8089] (direct WireGuard) AUTO MODE: Tries Tailscale first, falls back to VPS if unreachable

VPS Tunnel (default): The CARRARA desktop opens an outbound SSH connection to a public VPS. The VPS accepts inbound connections from mobile devices and forwards them through the tunnel back to the desktop. No ports need to be opened on the home router.

Tailscale (fallback): When enabled, phones connect directly to the PC's Tailscale IP (100.69.17.20) via WireGuard mesh. This is more stable during WiFi↔cellular transitions because Tailscale handles NAT traversal and connection migration natively.

Infrastructure

ItemValue
VPS104.207.140.242 (Vultr)
Tunnel port8089
Servicewhim-tunnel.service (systemd, starts on boot)
Toolautossh (auto-reconnects on failure)
AuthSSH key only (~/.ssh/id_ed25519), passwords disabled
Firewallufw: ports 22 (SSH) + 8089 (tunnel)
sshd configGatewayPorts yes

Tunnel Service

The tunnel runs as a persistent systemd service on CARRARA:

# Service file: /etc/systemd/system/whim-tunnel.service ExecStart=/usr/bin/autossh -M 0 -N -R 8089:localhost:8089 root@104.207.140.242 \ -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" # Check status: sudo systemctl status whim-tunnel.service

Desktop Status Indicators

The Whim Terminal header bar displays two auto-updating status dots that poll every 10 seconds:

IndicatorGreenRed/Grey
Tunnelwhim-tunnel.service active AND VPS:8089 reachableService down or VPS unreachable
WhimWhim.m server responding on localhost:8089Server not running
TailscaleTailscale daemon running (BackendState: Running)Tailscale stopped or not installed
OllamaOllama responding on localhost:11434Ollama not running

System Tray Status

Whim also displays a system tray icon with three states:

StateIcon ColorTray Tooltip
Tunnel downGreyTunnel: Down | Whim: Offline
Tunnel up, Whim unreachableYellowTunnel: Connected | Whim: Offline
Both connectedGreenTunnel: Connected | Whim: Online

Mobile Health Bar

The Whim.m mobile app health bar shows five indicators: tunnel, server, mic, ollama, and TS (Tailscale). The tunnel dot turns green when the phone can reach the Whim server through the VPS, confirming end-to-end tunnel connectivity. The TS dot turns green when Tailscale is running on the PC.

A connection mode dropdown in the top-right corner allows switching between VPS Tunnel, Tailscale, and Auto-detect modes. When disconnected, a pulsing red banner appears: "Connection lost — reconnecting..." with automatic exponential backoff retries.

Mobile Access via Tunnel

When the tunnel is active, the Whim.m standalone server prints the VPS URL alongside the LAN IP:

VPS Tunnel : http://104.207.140.242:8089 LAN IP : 192.168.1.231 Listening on : 0.0.0.0:8089 Open on phone : http://192.168.1.231:8089 Via VPS tunnel: http://104.207.140.242:8089

Troubleshooting

08

LLMs & AI Stack

Whim uses a fully local AI inference stack powered by Ollama. All models run on the CARRARA machine's GPU with no external API calls.

Models in Use

ModelRoleContextNotes
DeepSeek R1:32B Primary agent model Varies Default model for OpenClaw gateway agents. Reasoning-optimized with chain-of-thought.
Llama 3.1:8B-16K Fallback / Whim.AI default 16384 Used for the Whim.AI console and mobile Whim.m chat. Fast inference, 16K context window.

Ollama Configuration

{ "models": { "mode": "merge", "providers": { "ollama": { "baseUrl": "http://127.0.0.1:11434", "api": "ollama" } } }, "agents": { "defaults": { "model": { "primary": "ollama/deepseek-r1:32b", "fallbacks": ["ollama/llama3.1:8b-16k"] } } } }

AI Endpoints

Health Monitoring

Both desktop and mobile clients poll /health to check Ollama availability. The health endpoint returns {"status": "ok", "ollama": true/false} by probing http://localhost:11434/api/tags.

09

Voice & Audio Pipeline

AVR Lab (XTTS Voice Synthesis)

The AVR Lab tab provides text-to-speech synthesis using Coqui XTTS v2 running in a dedicated conda environment (xtts).

TRV Cipher (Hear Me Out)

The TRV Cipher tab is a complete audio transcription workstation:

Audio Flow

Phone Mic Whim.m Record → HTTP upload → ~/Journal TRV Cipher → Whisper → Transcript ODT
10

Voice Engine — Wake Word Tuning

The VOICE ENGINE tab is a dedicated audio diagnostics and wake word calibration environment built for use in noisy environments such as vehicles, outdoor settings, or anywhere ambient noise interferes with "Hey Whim" detection. It provides a real-time spectrogram, signal processing controls, and wake word sensitivity tuning — all in a three-column layout.

Voice Engine Desktop Tab

Whim-Scope (Live Spectrogram)

The top half of the tab displays a real-time frequency heatmap covering the 300 Hz – 8 kHz range, driven by a 512-point Hanning FFT at 16 kHz mono. Key visual features:

Column A: Gain & Noise Floor (Pre-Amp)

ControlRangeDescription
Dynamic Gain0.1x – 5.0xAdjusts input volume before processing. Drop if mic is near a vent to avoid clipping.
Noise Floor Gate-80 to 0 dBSilence threshold. Anything below is ignored, preventing wake word hallucinations from static.
High-Pass FilterToggle (150 Hz cutoff)Cuts engine vibration and road hum. Critical for vehicles. Hotkey: H
Spectral SubtractionToggle + Capture"Capture Noise Profile" learns ambient/keyboard sound and subtracts that frequency profile from mic input.
Automatic Gain ControlToggleAuto-levels gain based on ambient noise. Raises gain at highway speed, lowers at idle. Smooth tracking with -20 dB target.
Parametric EQ (400 Hz)Toggle + depth (-24 to 0 dB)Narrow notch dip at ~400 Hz to reduce cabin reverb "boxiness" that masks the "W" sound in "Whim".

Column B: "Hey Whim" Sensitivity

ControlRangeDescription
Sensitivity Threshold0.0 – 1.0Lower = fewer false starts but must shout. Higher = hears whispers but sneezes may trigger. Hotkey: S
Phonetic Trigger Delay200 – 1500 msHow long the engine waits after "Hey" to hear "Whim." Bump up to ~800 ms for slow speech.
Voice Activity DetectionToggleOnly runs the expensive AI wake-word check when human-like speech patterns are detected. Saves CPU.
Wake Word EngineSelectorChoose: placeholder (energy-based), openWakeWord, or Porcupine. The latter two support custom "Hey Whim" phrase.
Intelligibility BandToggleHighlights 1–3 kHz on the Whim-Scope to visualize the critical voice frequency range.

Column C: Optimization & Hardware

StatValue
Sample Rate16,000 Hz (16 kHz) — optimal for voice; higher wastes CPU, lower loses "s" and "sh" sounds
Bit Depth16-bit PCM Mono
FFT Window512-point Hanning
Freq Range300 Hz – 8,000 Hz
Buffer SizeAdjustable 256 – 4096 frames (80–100 ms standard)

Live readouts include inference latency (ms), buffer frame count, CPU usage, and active audio device name. All settings persist across sessions to ~/.openclaw/voice_engine.json.

Hotkeys

KeyAction
GCycle Gain (0.5 → 1.0 → 2.0 → 5.0)
SCycle Sensitivity (0.3 → 0.5 → 0.7 → 0.9)
HToggle High-Pass Filter on/off

Audio Backend

Uses sounddevice (PortAudio) at 16 kHz mono with float32 samples. The audio callback pipeline processes in order: gain → HPF → parametric EQ → AGC → spectral subtraction → FFT → spectrogram → wake word detection. The wake word function is a placeholder (_ve_detect_wake_word) returning energy-based confidence, ready to swap in openWakeWord or Porcupine for actual custom "Hey Whim" inference.

Vehicle Tip: Keyboard Noise

If mechanical keyboard clacks trigger the wake word, use "Capture Noise Profile" to learn the keyboard's frequency signature, then enable Spectral Subtraction to remove it from the mic input.

11

Persona — Voice Personalities

The PERSONA tab is a voice personality manager that treats coined responses like playlists. Each voice clone (MillyAI, Revy, future voices) gets its own persona profile with a curated set of responses organized by behavioral situation. When Whim needs to respond to a trigger — wake word, command acknowledgment, error, idle chatter — it pulls from that persona's playlist instead of generating a generic response.

Persona Desktop Tab

Three-Column Layout

Behavioral Categories

CategoryColorWhen It Fires
Wake WordGreenImmediately after "Hey Whim" is detected (e.g., "Yeah?")
AcknowledgmentCyanAfter a command is successfully parsed (e.g., "On it.")
MisheardOrangeWhen confidence is below threshold (e.g., "The road's loud. One more time?")
ErrorRedWhen a command fails (e.g., "Can't reach the PC. Tunnel might be down.")
NarrativePurpleDuring table read sessions in AVR Lab (e.g., "Rolling.")
AmbientGreySystem events: boot, reconnect, idle timeout (e.g., "Tunnel's back up.")
CustomBlueUser-defined triggers for future expansion

Confidence-Gated Selection

Each response has a confidence range (e.g., 40–60%). The Voice Engine's wake word confidence score determines which response fires. At 90%+ confidence, wake responses fire. At 40–60%, partial-match misheard responses fire. Below 20%, the strongest "speak up" responses fire. This maps directly to the Confidence Ghost Bar in the Voice Engine tab.

Context-Aware Selection

The context field enables situational awareness. A response tagged "driving" only fires when connected via VPS tunnel (implying mobile/vehicle use). "Morning" fires between 5–10am. "table_read" only fires when AVR Lab is active. Multiple responses matching the same trigger + context are selected randomly to prevent repetition.

Pre-Render Pipeline

Responses are pre-rendered as cached WAV files via the XTTS conda environment (same GPU-accelerated pipeline as AVR Lab). Render All batch-processes every unrendered entry, skipping existing cache. Cached clips play in <100ms instead of waiting 2–5 seconds for live XTTS generation. Cache is stored at ~/voices/personas/[name]/cache/.

Default Persona: MillyAI

Ships with 42 coined responses across all 7 categories: 6 wake word, 8 acknowledgment, 8 misheard, 7 error, 6 narrative, 7 ambient. Ready to render with the MillyAI voice clone.

Why Not Just Prompt the LLM?

Coined responses are deterministic — they fire the same way every time. LLMs drift, get verbose, add qualifiers. The LLM handles open conversation; the persona handles mechanical reflexes. "Hey Whim" → "Yeah?" is not a conversation, it's a reflex. Reflexes should be fast, consistent, and characteristic.

10

Signal & Discord Integration

Signal Messenger

Whim integrates with Signal via signal-cli running as a local HTTP service:

Discord (OpenClaw Bot)

The Discord tab manages the OpenClaw bot (Enoch persona) with full action control:

11

SmartThings IoT Control

The SmartThings tab provides a complete dashboard for Samsung SmartThings device management via the OpenClaw gateway.

SmartThings Desktop Tab
SmartThings Hub
Samsung SmartThings hub hardware (photographed from phone, Feb 21, 2026) -- the physical IoT bridge controlled via the Whim SmartThings tab

Features

12

GeoF — Geofence Tracker

The GEOF tab is a geofencing and livestock tracking system designed for hilly terrain (Ozarks). It combines a canvas-based map with real-time collar monitoring via LoRa radio, GPS point-in-polygon fence checking, and ESP32-S3 collar firmware with deep sleep power management.

GeoF Desktop Tab
Not Just for Livestock

GeoF works just as well for the four-legged family members who think the backyard fence is more of a suggestion than a rule. If your dog has mastered the art of the great escape — or simply can't resist chasing squirrels into the neighbor's yard — a lightweight GPS collar with GeoF gives you peace of mind without the drama. You'll get a gentle heads-up the moment your adventurous pup wanders past the boundary, so you can call them back before they make it three blocks down the street. Same LoRa collar, same map, same alerts — just swap "Cow-1" for "Biscuit" and you're set.

Architecture

Phone (GPS Pins) → JSON sync → Whim GeoF Tab ← LoRa Bridge ← LoRa Gateway ← SF12 radio ← ESP32-S3 Collars

Layout

PanelContent
ToolbarSync Pins, Load/Save/Clear Fence, Start/Stop Bridge, Start/Stop Heartbeat
Left (60%)Canvas map with pan, zoom, grid lines, fence polygon, pin markers, and collar positions (color-coded by status)
Right (40%)Collar status treeview, detail panel, and LoRa log

Collar Status Indicators

StatusColorCondition
OKGreenHeartbeat received within 20 minutes and inside fence
STALEYellowNo heartbeat for 20–40 minutes
OFFLINERedNo heartbeat for >40 minutes
ALERTBright RedCollar reported position outside the geofence boundary

Pin Sync & Fence Management

LoRa Bridge Service

The LoRa bridge (services/lora_bridge.py) runs as a subprocess managed from the GeoF tab. It supports three modes:

ModeFlagDescription
Serial--port /dev/ttyUSB0Reads from a hardware LoRa gateway via serial (default 115200 baud)
TCP--tcp 0.0.0.0:9600Accepts collar packets over TCP sockets
Simulated--simulateGenerates synthetic collar data for testing without hardware

The bridge performs ray-casting point-in-polygon geofence checks on every packet. If a collar reports a position outside the fence boundary, the packet is tagged with OUTSIDE_FENCE alert.

LoRa Configuration

ParameterDefaultNote
Frequency915 MHzUS ISM band
Spreading FactorSF12Maximum range for hilly Ozarks terrain. Slower data rate but signals “bend” over ridges.
TX Power20 dBmMaximum allowed for LoRa in US
CRCEnabledError detection on all packets

ESP32-S3 Collar Firmware

Each livestock collar runs on an ESP32-S3 with GPS, LoRa radio (SX1276), and IMU accelerometer. The firmware (Collar/firmware/main.cpp) uses a deep sleep cycle:

Packet Format

Collars transmit CSV over LoRa: COLLAR_ID,LAT,LON,BATTERY,NAME[,OUTSIDE_FENCE]

C001,36.350123,-93.200456,87,Cow-1 C002,36.341000,-93.195000,62,Cow-2,OUTSIDE_FENCE

File Structure

PathPurpose
services/lora_bridge.pyLoRa bridge service (serial/TCP/simulated)
Collar/firmware/main.cppESP32-S3 Arduino firmware
Collar/config/fence.jsonDefault fence config (flash to ESP32 SPIFFS)
~/.openclaw/fence_config.jsonActive fence config (desktop)
~/.openclaw/geof_pins.jsonCached pin data from mobile sync

Heartbeat Monitor

The heartbeat monitor runs as a background timer in the Whim Terminal. Every 20 minutes it scans all registered collars and flags any that have gone silent as STALE or OFFLINE. Alerts appear in the LoRa Log panel and collar table rows change color accordingly.

Ozarks Terrain Tip

SF12 (Spreading Factor 12) is critical for hilly terrain. It trades data rate for range, significantly increasing the chance of a signal clearing ridgelines between the collar and your antenna mast. Expect 2–5 km line-of-sight range, or 500m–1.5 km over hills with SF12 + 20 dBm.

13

NodeFlow — Visual Node Editor

The NodeFlow tab is a visual node-based flow editor that maps the real-time data pipeline inside Whim. It renders each active component — User Input, Whim Brain (LLM), Opus Droid, OpenClaw Telemetry, and Wisp/GPS — as draggable nodes on an infinite canvas, with dashed edges showing how data flows between them.

NodeFlow Desktop Tab
NodeFlow — Full Graph with Wisp/GPS Node
Full node graph showing all five default nodes connected with dashed edge lines and the Wisp/GPS endpoint visible
NodeFlow — Node Inspector Detail
Selecting the Wisp/GPS node populates the Node Inspector with type, position, description, and upstream connections

Architecture

User Input Whim Brain (LLM) Opus Droid Wisp / GPS
Whim Brain (LLM) OpenClaw Telemetry Wisp / GPS

Default Nodes

NodeTypeDescription
User InputinputPrompt and command entry point for the pipeline
Whim Brain (LLM)brainLocal Ollama model handling reasoning, tool calls, and token streaming
Opus DroiddroidCode execution, syntax analysis, and active path highlighting
OpenClaw TelemetryopenclawHardware telemetry: RSSI, battery level, heartbeat status
Wisp / GPSwispGPS coordinates, geofence status, and LoRa packet data

Layout

PanelContent
HeaderTitle, Refresh / Auto-Poll / Reset View buttons, idle/active status indicator
Canvas (left, 75%)Infinite dark canvas with grid lines, color-coded draggable nodes, dashed edge connections, zoom (scroll wheel), and pan (right-click drag)
Node Inspector (right top)Detail card showing the selected node’s label, type, metadata, and connection list
Flow Log (right bottom)Timestamped event log with color-coded severity (info, ok, warn, err)

Interaction

Node Color Scheme

TypeBorder ColorPurpose
brainPurpleLLM reasoning engine
droidGreenCode execution agents
openclawOrangeHardware telemetry sources
wispBlueGPS and geofence endpoints
inputTanUser entry points
14

Archive Tab Editor

The Archive tab is a full-featured document editor that saves files to ~/ARCHIVE. All documents created in Whim are stored in this directory.

Archive Desktop Tab

Document Actions

Formatting Toolbar

File Browser

The right column shows all files in ~/ARCHIVE with refresh, open, and double-click-to-load. A changelog panel at the bottom tracks all document actions with timestamps.

Document Header Format

--- Archive Entry --- Date: 2026-03-15 Notes: User notes here --- (document content)
15

Screen Share System

The SS (Screen Share) tab enables bidirectional visual communication between the desktop and mobile devices.

Screen Share / Live Desktop Tab

Architecture

Layout

ColumnContent
LeftSettings (FPS, Quality, Camera selection) + QR Code for phone connection
CenterPhone Camera Feed canvas (receives phone frames)
RightDesktop Preview canvas (shows what the phone sees)

Settings

Endpoints

PathMethodDescription
/GETMobile HTML page with camera capture + desktop stream viewer
/desktop_streamGETMJPEG stream of desktop screen
/phone_streamGETMJPEG relay of phone camera frames
/phone_framePOSTAccepts JPEG frame from phone camera
/ss_healthGETJSON health check with capture status
16

ADB Portal & Emulator

The Whim ADB Portal is a standalone GUI (whim_adb_portal.py) for managing APK installs and Android emulators, matching the Whim dark theme.

Device Management

Emulator Profiles

ProfileResolutionDPIRAMAPI Level
Samsung Galaxy S91440 x 29605704096 MB30 (Android 11)
Samsung Galaxy S221080 x 23404258192 MB33 (Android 13)

SDK Management

The portal can download and set up the full Android SDK command-line tools (~2 GB), accept licenses, install platform-tools, emulator, and system images, create AVDs with custom device profiles, and launch emulators with GPU acceleration.

17

OpenClaw Gateway

The OpenClaw Gateway is the central command bus that connects the Whim desktop client to the AI agent infrastructure via WebSocket.

Protocol

Connection Flow

  1. Connect to WebSocket URL (default: ws://127.0.0.1:18789)
  2. Receive challenge with nonce and timestamp
  3. Send connect request with protocol version, client info, auth token, and device signature
  4. Enter bidirectional message loop (incoming events displayed in Events/Debug tab)

Sessions & Presence

The Sessions tab manages active OpenClaw sessions with auto-refresh, presets, crash recovery, and a Notion integration for session notes. The Presence tab shows real-time online status with heartbeat pings to each connected component.

Events/Debug Log

The Events/Debug tab provides a structured, filterable event log with:

18

Settings Tab

The SETTINGS tab provides a three-column configuration panel for managing API keys, LLM models, and application preferences. All settings persist to ~/.openclaw/whim_settings.json.

Settings Desktop Tab

Column 1: API Keys & Endpoints

FieldDescription
Ollama URLBase URL for the local Ollama LLM server (default: http://localhost:11434)
OpenAI API KeyAPI key for optional OpenAI integration (masked input, stored locally)
SmartThingsPersonal access token for Samsung SmartThings API
Notion TokenIntegration token for Notion session tracking

Column 2: Model Management

Manages Ollama models directly from the Whim Terminal:

Column 3: App Preferences

LLM Model Dropdown (Header Bar)

A global model selector in the header bar lets you switch between local LLMs at any time without opening Settings. It shows all models available in Ollama (fetched on startup and via the refresh button). Selecting a model immediately updates Whim.AI's active model for the next prompt. Currently available:

19

Audio Capture Tool

A floating always-on-top tool window for capturing system audio output as lightweight audio files — no video, just audio. Designed for the use case of turning YouTube videos, podcasts, or any playing audio into portable files you can listen to in the car.

How It Works

Clicking the 🎧 Capture button in the header bar opens a compact floating window that stays on top of all other windows. It captures audio from PipeWire/PulseAudio monitor sources — virtual loopback devices that tap into whatever audio is playing through your speakers or HDMI output. No screen recording, no video — just the audio stream, encoded to a small file.

Controls

ControlDescription
SourceDropdown listing all PipeWire monitor sources. Auto-selects HDMI if available. Options include USB speakers, headphones, S/PDIF, and HDMI.
FormatOutput codec: MP3 (default, car-compatible), Opus, OGG Vorbis, M4A (AAC), WAV (lossless)
Bitrate64k – 320k. Default 128k gives ~1 MB/min for MP3 (good for podcasts/speech).
Record / StopStart/stop capture. Header button flashes red while recording.
VU MeterLive level indicator (green/yellow/red).
TimerRunning elapsed time (HH:MM:SS) and live file size.
Name / RenameInline rename of the output file after stopping.

Output

Files save to ~/Journal/audio_captures/ with timestamps (e.g. capture_20260317_143022.mp3). The folder link in the tool opens the directory in the file manager. At 128k MP3, a 1-hour podcast capture is roughly 60 MB.

Typical Workflow

  1. Start playing a YouTube video or podcast in the browser
  2. Click 🎧 Capture in the Whim header bar
  3. Select the HDMI or speaker monitor source
  4. Click Record — the tool captures audio while you watch/listen
  5. Click Stop when done — rename the file to something meaningful
  6. Transfer the MP3 to your phone (via Whim.m Library, ADB, or file share) for car listening
Audio Backend

Uses ffmpeg -f pulse to read from PipeWire/PulseAudio monitor sources. The monitor sources are virtual loopback devices created automatically by PipeWire for every output sink. No additional driver or loopback configuration is needed.

20

Technical Stack & Configuration

Runtime Environment

ComponentTechnology
OSLinux Mint (CARRARA machine)
Python3.12+ (system) + conda env: xtts (3.10+)
GUI FrameworkTkinter with ttk (Azure dark theme)
AI RuntimeOllama (local GPU inference)
Voice SynthesisCoqui XTTS v2 (conda env: xtts)
TranscriptionOpenAI Whisper
NetworkingReverse SSH tunnel via VPS (autossh + systemd)
Messagingsignal-cli (Signal) + discord.py/nextcord (Discord)
IoTSamsung SmartThings via OpenClaw gateway
AndroidADB + Android SDK command-line tools
Screen Capturemss (Python)
Image ProcessingPillow (PIL)
QR Codesqrcode (Python library)
System Traypystray
Document Exportodf (OpenDocument Format) + LibreOffice Writer
Audio ProcessingFFmpeg, NumPy, wave

Key Directories

PathPurpose
~/vaults/WHIM/Main Whim project vault
~/vaults/WHIM/app/Desktop application source code
~/vaults/WHIM/mobile/Mobile app, APKs, build artifacts
~/vaults/WHIM/assets/Fonts, icons, logos
~/.openclaw/OpenClaw config, Whim icon, sessions store
~/.openclaw/WhimUI/Custom fonts and icon packs (Papirus, Mint-Y)
~/Journal/Voice recordings and notes uploaded from phone
~/ARCHIVE/Documents created in the Archive Tab Editor
~/TRANSCRIPT/Exported ODT transcripts
~/TableReads/XTTS voice synthesis output
~/voices/Speaker reference files for XTTS
~/Incoming/fire.pngFlame logo used in the header and taskbar

Configuration File

The main configuration lives at ~/.openclaw/openclaw.json and controls:

Singleton Instance

Whim enforces a single instance by binding TCP port 48891. If a second instance is launched, it sends a SHOW signal to the existing instance, which restores and focuses its window.

21

Desktop Environment Customizations

The CARRARA desktop runs Linux Mint with Cinnamon. The following customizations have been applied to the desktop environment for a cleaner workflow and ergonomic comfort.

Start Menu Cleanup

All non-pinned application entries have been removed from the start menu. Only taskbar-pinned favorites remain accessible via the start menu:

Application.desktop IDStatus
Firefoxfirefox.desktopPinned
Software Managermintinstall.desktopPinned
System Settingscinnamon-settings.desktopPinned
Terminalorg.gnome.Terminal.desktopPinned
Files (Nemo)nemo.desktopPinned
Google Chromegoogle-chrome.desktopPinned

Removed .desktop overrides are backed up at ~/.local/share/applications/_backup_removed/. Custom app entries removed include: OpenClaw, Whim ADB Portal, Control Panel, Droid, Revy Acousto, and OnlineChat webapp. System app overrides (Discord, Signal, Audacity, LibreOffice, etc.) were also removed, reverting them to default system entries.

Additionally, all Preferences and Administration category entries (65 items) have been hidden from the start menu via NoDisplay=true overrides. This includes all Cinnamon settings sub-panels (Backgrounds, Themes, Keyboard, Display, etc.), system tools (Firewall, Timeshift, Driver Manager, Update Manager, etc.), and utility launchers. The main System Settings app remains accessible from the pinned taskbar for when settings changes are needed.

ALT Key Shortcuts Disabled

All Cinnamon keyboard shortcuts that use the ALT key have been disabled for ergonomic reasons (wrist rest positioning). This includes:

Window Management (removed)

ActionPrevious Shortcut
Switch windowsAlt+Tab
Switch windows backwardShift+Alt+Tab
Close windowAlt+F4
Toggle maximizedAlt+F10
UnmaximizeAlt+F5
Window menuAlt+Space
Move windowAlt+F7
Resize windowAlt+F8
Run dialogAlt+F2
Switch groupAlt+Above_Tab

Workspace Navigation (removed)

ActionPrevious Shortcut
Switch workspace up/down/left/rightCtrl+Alt+Arrow
Move window to workspaceCtrl+Shift+Alt+Arrow
Switch panelsCtrl+Alt+Tab

System & Media (removed)

ActionPrevious ShortcutRetained Non-ALT Binding
LogoutCtrl+Alt+Delete
TerminalCtrl+Alt+T
Lock screenCtrl+Alt+LXF86ScreenSaver
ShutdownCtrl+Alt+EndXF86PowerOff
Restart CinnamonCtrl+Alt+Escape
Toggle recordingCtrl+Shift+Alt+R
Window screenshotAlt+Print
Magnifier zoomAlt+Super+=/−/0
Restoring ALT Shortcuts

To restore all Cinnamon ALT shortcuts to defaults, run:
gsettings reset-recursively org.cinnamon.desktop.keybindings

22

Windows 11 Support

Whim Terminal runs natively on Windows 11 via a platform compatibility layer that abstracts OS-specific calls (paths, services, audio). The same core codebase powers both the Linux and Windows builds.

Prerequisites

SoftwareRequiredInstall From
Python 3.10+Requiredpython.org (check "Add to PATH")
Ollama for WindowsRequiredollama.com
TailscaleOptionaltailscale.com
ffmpegOptionalffmpeg.org (add to PATH)
Signal DesktopOptionalsignal.org

Quick Start

Option A — PowerShell Setup (Recommended)
git clone https://github.com/scarter84/Whim.git cd Whim Set-ExecutionPolicy -Scope CurrentUser RemoteSigned .\scripts\setup_windows.ps1

This creates a virtual environment, installs dependencies, sets up data directories, and creates a desktop shortcut.

Option B — Batch Setup
git clone https://github.com/scarter84/Whim.git cd Whim scripts\setup_windows.bat
Launch
scripts\launch_whim.bat

Or use the desktop shortcut created by the PowerShell setup.

Windows Path Mapping

Whim stores data in Windows-native locations:

Linux PathWindows Path
~/.openclaw/%APPDATA%\OpenClaw\
~/Journal/Documents\Whim\Journal\
~/ARCHIVE/Documents\Whim\ARCHIVE\
~/TRANSCRIPT/Documents\Whim\TRANSCRIPT\
~/TableReads/Documents\Whim\TableReads\
~/voices/Documents\Whim\voices\
~/Incoming/Documents\Whim\Incoming\

Platform Differences

FeatureLinuxWindows 11
File openerxdg-openos.startfile()
Service checksystemctl is-activesc query
Audio sourcespactl (PulseAudio/PipeWire)sounddevice (Windows Audio)
SSH Tunnelsystemd whim-tunnel.serviceManual SSH or Tailscale direct
DPI scalingSystem nativePer-monitor DPI aware (auto-set)
Control PanelCustom Cinnamon panelUse Windows Settings directly
TTS EngineXTTS via conda envXTTS via pip or system Python

Architecture on Windows

File Structure
app/ openclaw_tkui.py ← Main terminal (cross-platform) whim_windows.py ← Windows 11 entry point platform_compat.py ← OS abstraction layer requirements_windows.txt scripts/ setup_windows.bat ← Batch setup setup_windows.ps1 ← PowerShell setup launch_whim.bat ← Quick launcher

The platform_compat.py module detects the OS at import time and provides correct path defaults, service checkers, audio source enumeration, and file-open commands. The whim_windows.py launcher sets DPI awareness, verifies Ollama, patches path constants, then loads the main app.

Connectivity on Windows

On Windows, the preferred connection method to mobile devices is Tailscale (direct mesh VPN). The Linux systemd SSH tunnel is not available natively on Windows, but Tailscale provides the same end-to-end encrypted connectivity with zero configuration.

Alternatively, use Windows OpenSSH to create a manual tunnel:

ssh -N -R 8089:localhost:8089 user@YOUR_VPS_IP

Known Limitations

22.5

iOS Support (Tahoe)

Whim.m is accessible on iOS devices via Safari or Chrome as a Progressive Web App (PWA). The iOS variant, codenamed Tahoe, connects to the same Whim server backend and provides the same five-tab experience (REC, LIBRARY, CHAT, WAKE, DEVICES) with platform-specific adaptations for Apple hardware.

Prerequisites

SoftwareRequiredNotes
iOS 16+RequiredPWA support requires iOS 16 or later
Safari / ChromeRequiredSafari recommended for best PWA integration (Add to Home Screen)
Tailscale for iOSOptionalRequired for direct Tailscale mesh connection mode

Configuration

SettingValue
ConnectionVPS Tunnel (default) or Tailscale (requires Tailscale iOS app)
URLhttp://104.207.140.242:8089 (VPS) or http://100.69.17.20:8089 (Tailscale)
PWA InstallSafari → Share → Add to Home Screen
Audio RecordingWebRTC MediaRecorder API (Safari 14.5+)
Wake WordRequires microphone permission grant; iOS may suspend background audio
NotificationsWeb Push supported on iOS 16.4+ (requires PWA mode)

Platform Differences — iOS vs Android

FeatureAndroid (APK)iOS (PWA / Tahoe)
App DeliveryNative APK via ADB sideloadPWA via Safari "Add to Home Screen"
WebView EngineChromium (Android WebView)WebKit (Safari)
Audio FormatWebM/Opus (native)MP4/AAC (Safari MediaRecorder default)
Background AudioSupported (WebView keeps running)Limited — iOS may suspend after ~30s in background
Wake WordAlways-on via WebViewActive only while app is in foreground
File UploadFull filesystem access via intentPhoto Library + Files app picker
Camera AccessDirect WebRTC + Screen ShareWebRTC supported; no Screen Share capture
NotificationFirebase / localWeb Push (iOS 16.4+ in PWA mode only)
Install Size~61 KB APK~0 KB (bookmark/PWA shell)
TailscaleTailscale Android appTailscale iOS app (App Store)

Known Limitations on iOS

23

Multi-Terminal Sync

The SYNC tab enables state synchronization across multiple Whim Terminal instances running on different machines (Linux + Windows). Seven sync approaches are available, managed through a unified engine.

Multi-Terminal Sync Desktop Tab

Sync Approaches

#ApproachTransportReal-timeOffline
1WebSocket DaemonTailscale meshYesNo
2VPS rsyncSSH to VPSNoYes
3CRDT CollaborationWebSocketYesNo
4Git SyncGit remoteNoYes
5Hybrid (1+2)Tailscale + VPSYesYes
6Session MirrorWebSocketYesNo
7Phone BridgeHTTP (Whim.m)BufferedYes

What Gets Synced

DataFileSync Default
Session Historywhim_sessions.jsonOn
Settingswhim_settings.jsonOn
Voice Engine Configvoice_engine.jsonOn
Device Locationsdevice_locations.jsonOn
Personaspersonas.jsonOn
Journal Manifest~/Journal/*.jsonOn
Archive Text~/ARCHIVE/*.txtOn
API Keys / TokensNever

Sync Modes

Hybrid Mode (Recommended)

Combines WebSocket + VPS for maximum reliability:

  • When both machines are on Tailscale: live WebSocket sync (sub-second)
  • When one is offline: changes queue locally
  • On close: auto-push to VPS. On open: auto-pull from VPS.
  • VPS acts as tie-breaker for conflicts
WebSocket Only

Real-time peer-to-peer sync via Tailscale. Both machines must be online. Heartbeat every 10s, full reconciliation every 5 min.

VPS Only

Async push/pull via rsync over SSH. Works even when the other machine is off. Manual or auto-triggered.

Git Mode

Auto-commit every 60s, push/pull from a private Git repo. Full version history and easy rollback.

Conflict Resolution

The sync engine uses vector clocks for last-writer-wins conflict resolution. Each node maintains a logical clock that increments on every local change. When merging, the node with the higher clock value wins. For simultaneous edits at equal clocks, the node with the lexicographically higher ID wins (deterministic tie-breaking).

The CRDT layer (Approach 3) provides conflict-free merging for structured data like session lists and chat histories, ensuring eventual consistency without data loss.

Session Mirror

Cast a live Whim Terminal session to another machine for read-only viewing. Enter the host's Tailscale IP in the SYNC tab and click WATCH. The mirror updates in real-time. Optional control handoff allows the viewer to operate the remote session.

Phone Bridge

Uses connected Whim.m phones as store-and-forward relays. When desktop A pushes changes, the phone stores them. When desktop B comes online, it pulls buffered changes from the phone. Leverages the existing Whim.m HTTP server on port 8089.

Security

Configuration

Sync config is stored at:

PlatformPath
Linux~/.openclaw/whim_sync.json
Windows%APPDATA%\OpenClaw\whim_sync.json

Quick Start

Enable Sync
  1. Open the SYNC tab in Whim Terminal
  2. Select your preferred mode (Hybrid recommended)
  3. Toggle Enable Sync on
  4. Click START
  5. Enter the Tailscale IP of your other Whim instance and click CONNECT