GuyWithGames

Welcome to GuyWithGames — home of the Story Universe and Halfax AI.

Gaming, infrastructure, and software creation in one place.

I love gaming of all sorts, and I am a Systems Architect and Developer by trade. When I'm not gaming, I run a personal hybrid lab focused on AI experimentation, game development, and practical automation. This page highlights some of my projects and the homelab setup behind them.

Story Universe · Halfax AI · Hybrid homelab · Kubernetes + Docker · Game dev experiments

Home Lab Overview

Five hosts split by what each one’s good at — one big Linux box for compute and AI, a gaming/dev laptop for daily-driver work, a Raspberry Pi for the always-on stuff, and two Linode VPSes for the public faces. They all talk to each other over the same private mesh.

Betelgeuse Primary Linux host

  • Where the heavy work happens — LLM inference, container orchestration, GitLab, the canonical data
  • Strix Halo iGPU pulls double duty for the AI stack and the desktop, no discrete card
  • Most other things hang off this one, which is exactly why the recovery plan is so detailed

UYScuti Gaming + dev laptop

  • My daily driver — code, games, anything that wants a real keyboard and an RTX 5080
  • Some projects only run here (the kernel-driver telemetry, Hyper-V VMs, Windows-native experiments)
  • Doubles as the manual third-copy disaster-recovery surface for when both Linux machines are gone

hera-pi + Cloud Utility + public services

  • The Pi runs the always-on stuff — Chronicle Keeper for the Story Universe, the DDNS updater, the off-box backup target
  • Guy (Linode) hosts the public sites, mail, and the live World Browser; Shop is a secondary web host
  • All five hosts share the same NetBird mesh, so internal traffic stays internal even when machines are continents apart

Hardware Highlights

What’s actually on the rack (well, shelf and Linode dashboard). Specs, not stories — the why is in the project cards.

Betelgeuse Primary server

  • CPU: AMD Ryzen AI MAX+ 395 (16C/32T)
  • GPU: Radeon 8060S, 64 GiB UMA VRAM (paired 64/64 split with system RAM, set in BIOS)
  • RAM: 62 GiB with 186 GiB swap
  • Storage: 2 TB SN7100 + 2 TB Lexar NVMe + 8 TB HDD
  • Platform: GMKtec NucBox EVO-X2
  • OS: Ubuntu 24.04.4 LTS, kernel 6.17.0-20 HWE, GPU via Vulkan (Mesa RADV)

UYScuti Gaming + dev laptop

  • System: Alienware 18 Area-51 AA18250
  • CPU: Intel Core Ultra 9 275HX (24C/24T, up to 5.4 GHz)
  • RAM: 31.5 GiB
  • GPU: NVIDIA GeForce RTX 5080 Laptop GPU with 15.92 GB VRAM
  • Storage: 1.88 TB C: + 1.80 TB D: + 7.28 TB external
  • OS: Windows 11 Pro 25H2, build 26200

hera-pi Utility node

  • Model: Raspberry Pi 5
  • CPU: ARM Cortex-A76 (4 cores)
  • RAM: 8 GiB
  • Storage: 477 GB NVMe (boot + root) — microSD as recovery fallback
  • OS: Debian 13 (Trixie)

Guy VPS Public host

  • Provider: Linode 4 GB
  • CPU: AMD EPYC (2 vCPU)
  • RAM: 3.6 GiB
  • Storage: 80 GB SSD
  • OS: Rocky Linux 9.6

Shop VPS Secondary web host

  • Provider: Linode Nanode 1 GB
  • CPU: AMD EPYC (1 vCPU)
  • RAM: 765 MiB
  • Storage: 25 GB SSD
  • OS: Rocky Linux 9.6

Virtualization & Orchestration Stack

Different workloads want different abstractions — containers for the everyday stuff, full VMs when something needs its own kernel, and a Kubernetes cluster on standby for when a project actually earns the complexity.

Docker container runtime

  • Where almost everything on Betelgeuse lives — GitLab, the homepage dashboard, Portainer, SearXNG, the GitLab Runner.
  • Every image is pinned to a specific version. No :latest rolls; if a container restarts in the middle of the night, it comes back exactly the same.
  • Every docker-compose.yml is captured in the recoveryplan tree, so a clean rebuild reproduces the fleet by checking out a directory and running docker compose up -d in each subdir.
  • If the box dies tonight, every container can be back up by morning.
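
The rebuild pass is mechanical enough to sketch. A hedged outline (the function name, the DRY_RUN guard, and the default recoveryplan path are illustrative, not the real script):

```shell
# Sketch of the clean-rebuild pass: walk the recoveryplan tree and bring
# each captured compose stack back up. Names here are illustrative.
restore_stacks() {
  root="$1"
  for compose in "$root"/*/docker-compose.yml; do
    [ -e "$compose" ] || continue            # no stacks under this root
    dir=$(dirname "$compose")
    echo "restoring stack: $dir"
    # The real pass runs the pinned images exactly as captured:
    [ "${DRY_RUN:-1}" = 1 ] || (cd "$dir" && docker compose up -d)
  done
}
restore_stacks "${RECOVERY_ROOT:-recoveryplan}"
```

Because every image is version-pinned, re-running this on clean metal reproduces the fleet rather than whatever `:latest` happens to be that day.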

Kubernetes (k3d) scratch cluster, currently idle

  • I keep a k3d cluster on Betelgeuse for orchestration experiments — multi-service stacks with rolling updates, ingress routing, and scaled workers.
  • How: declarative manifests define desired state; controller loops keep workloads healthy.
  • Why: when a project genuinely needs the lifecycle features Kubernetes gives you, it’s already wired up and ready.
  • Honest current state: dormant. I don’t have an active workload that justifies the runtime overhead today, so the cluster sits unused; when I do need it again, I’ll tear it down and recreate it fresh rather than revive a stale one.

Hyper-V Windows virtualization

  • Comes with Windows Pro, so it’s already there — mostly used for spinning up clean Windows VMs to test installer behavior on a fresh OS without polluting my actual workstation.
  • Native virtual switching and snapshot support, so trial-and-error is cheap.
  • Most of my VM time is spent in VMware Workstation instead, but Hyper-V is the right tool for Windows-native scenarios where I don’t want a third hypervisor in the mix.

VMware Workstation cross-OS VM lab

  • Where I run Linux VMs that need their own kernel — cross-distro toolchain testing, kernel-module experiments, distro comparisons.
  • Snapshots are the killer feature. I can break a VM badly and roll it back in seconds; that’s the right cost for “let me try this thing” experiments.
  • For multi-node integration tests where I need real network behavior between nodes, this is what I reach for instead of containers.

Network Stack

Two layers that do most of the work: a private DNS zone that gives every service a stable name, and a mesh VPN that lets the hosts talk to each other without anything being on the public internet.

BIND9 Implementation private DNS authority

  • I run an authoritative private zone (lab.guywithbeer.com) on Betelgeuse so every internal service has a real name.
  • Things like keysecrets.lab.guywithbeer.com, chronicle-keeper.lab.guywithbeer.com, betelgeuse.lab.guywithbeer.com resolve from any host on the mesh.
  • If I move a service to a different host, I update one DNS record and the rest of the lab figures it out — nothing is hardcoded against an IP.
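
The zone itself stays small. A hedged fragment of what such a zone file can look like (the records, TTLs, and mesh addresses below are illustrative, not a dump of the real zone):

```
$ORIGIN lab.guywithbeer.com.
$TTL 300                       ; short TTL so a moved service propagates fast
@                 IN SOA  betelgeuse.lab.guywithbeer.com. admin.guywithbeer.com. (
                          2026010101 3600 600 86400 300 )
betelgeuse        IN A    100.87.0.1   ; mesh addresses are illustrative
keysecrets        IN A    100.87.0.1   ; moving a service = editing one line
chronicle-keeper  IN A    100.87.0.3
```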

NetBird VPN Usage WireGuard mesh overlay

  • NetBird is a managed WireGuard mesh. Every host gets a 100.87.x.x address, and they all reach each other directly — no central VPN gateway, no NAT-busting, no dynamic-DNS gymnastics.
  • SSH, MongoDB, the KeySecrets API, the Story Universe pub/sub channel — all reachable across hosts as if they were on the same LAN, but none of them touch the public internet.
  • The home machines (Betelgeuse + UYScuti + hera-pi), the two Linode VPSes, and the laptop wherever it is — one flat private fabric.

Hardware Interaction Model right workload, right node

  • Each host does what its hardware is good at, not what’s convenient at the moment.
  • Heavy compute on Betelgeuse, daily-driver work on UYScuti, always-on lightweight stuff on the Pi (where idle power draw matters), public entry points on the Linode VPSes (where uptime matters more than throughput).
  • The Story Universe is the canonical example: Pi holds the canonical state because it never goes down, the big box generates events because it has the GPU, the VPS renders the public dashboard because it’s reachable from the internet.

How It Supports Projects design rationale

  • Stable DNS + mesh routing means a project doesn’t care which host it runs on — it just needs the name and the port.
  • The same conventions work whether the workload is in a container, a VM, or running directly on the host.
  • Public-facing entrypoints stay on the Linode VPSes; private backend traffic stays on the mesh and never touches the internet.
  • Adding a new project is mostly “edit one zone file, write one compose file” — the networking layer stays out of the way.

Let's Encrypt TLS automated certificate authority

  • Every public-facing service uses free TLS certificates from Let's Encrypt (ISRG).
  • How: certbot handles issuance and renewal via ACME v2 with zero downtime (Apache plugin for web, standalone for mail).
  • Coverage: www.guywithgames.com, www.guywithbeer.net, mail.guywithgames.com (SMTP/IMAP), www.shopmom.net — all ECDSA.
  • Why: HTTPS for everyone, no cost, no friction. Auto-renews 30 days before expiry via systemd timers.

Let's Encrypt is a service everyone on the internet benefits from. Consider donating to ISRG to keep it free.

Projects

I build across AI tools, game mechanics, story systems, automation dashboards, and infrastructure helpers, blending experimentation, long-form worldbuilding, infrastructure design, and hands-on software development.

AI & Story

Story Universe Ecosystem distributed narrative system

An AI-driven story world that keeps running on its own across three computers — characters, factions, and events that evolve in the background while I’m doing other things. Like an always-on TV show whose next episode the AI writes by reacting to what happened yesterday.

Three machines split the work. A Raspberry Pi 5 (the Chronicle Keeper) holds the world’s canonical state — every character, faction, location, and accepted event — in SQLite, and broadcasts a world clock over ZeroMQ pub/sub. My main server (the Narrative Engine) subscribes to those ticks and asks the Dark Champion LLM to generate the next arc or event. A public VPS (the World Browser) renders the live chronicle as a website that reads straight from the canonical store. Each host does what its hardware is good at, and a single machine going down doesn’t take the others with it.

The story actually remembers itself. Every accepted event and arc update is written into MemPalace — a ChromaDB-backed semantic memory framework co-created by actress Milla Jovovich and engineer Ben Sigman, organized into wings and rooms to mimic human spatial recall. Before each generation call, the engine searches MemPalace for events relevant to the current scene and injects them as context, so the LLM builds on what already happened instead of restarting cold every tick. When I archive a story, MemPalace gets purged and the next one starts with a clean slate.

The prose-quality feedback loop. Generated events run through a regex filter that rejects atmospheric-formula openings (“Dawn sunlight filters through the windows…”), tension-family clichés, and time-of-day contradictions. Each rejected attempt is fed back to the LLM as a real negative example before the next retry, with rising temperature, up to six tries before force-accept. Combined with grammar-constrained JSON decoding (output is physically prevented from being malformed), parse failures dropped from ~51% of LLM calls to 0% — every retry now does prose-quality work instead of being wasted on broken JSON.
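
The shape of that loop, as a hedged Python sketch (the banned patterns, temperature schedule, and `generate()` signature are stand-ins, not the production code):

```python
import re

# Illustrative model of the filter-and-retry loop; patterns and the
# temperature schedule are assumptions, not the engine's real config.
BANNED = [
    re.compile(r"^(Dawn|Morning) sunlight filters", re.I),
    re.compile(r"tension (hangs|crackles) in the air", re.I),
]

def generate_event(generate, base_temp=0.7, step=0.1, max_tries=6):
    """Retry with rising temperature; every rejection becomes a negative
    example for the next attempt; force-accept on the final try."""
    rejected = []
    for attempt in range(1, max_tries + 1):
        text = generate(temperature=base_temp + step * (attempt - 1),
                        negative_examples=rejected)
        if not any(p.search(text) for p in BANNED):
            return text, attempt          # clean prose: accept
        rejected.append(text)             # fed back before the next retry
    return rejected[-1], max_tries        # force-accept rather than stall
```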

It pauses, it never lies. A HealthMonitor probes the Chronicle Keeper and the LLM endpoint every tick. If either is down, the engine pauses generation entirely instead of faking events from a template. When the dependencies recover, it resumes silently. There is no fallback path that would let a fabricated event into the canonical world — the universe simply waits.
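
The pause-don't-fabricate rule is small enough to show directly. A hedged sketch (probe and generator callables are stand-ins for the real components):

```python
class HealthGate:
    """Model of the pause-don't-fabricate rule: if any dependency probe
    fails, generation pauses; there is no fallback event path at all."""
    def __init__(self, probes):
        self.probes = probes
        self.paused = False
    def tick(self, generate):
        if not all(probe() for probe in self.probes):
            self.paused = True            # the universe simply waits
            return None
        self.paused = False               # dependencies back: resume silently
        return generate()
```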

What I’m proud of here: most “AI story generators” are one prompt and one reply. I built a persistent world that runs across three machines with synchronized state, a semantic memory the LLM actually reads from, a measurable prose-quality feedback loop, and a hard rule that the world pauses rather than fabricates. The architecture I ended up with is closer to an MMO backend than a chatbot.

Halfax AI Stack multi-model serving + ops

A private AI server I run on my own hardware. Two large language models live in GPU memory at once and answer questions through the same API shape (OpenAI-compatible) that ChatGPT, Copilot, and Cursor use — so any tool that knows how to talk to OpenAI talks to mine instead, with nothing leaving the house.

Two models concurrent on one GPU. Qwen3-Coder-30B handles code (lm1, ~22 GiB resident), Dark Champion V2 21B handles narrative and reasoning (lm2, ~16 GiB); a Mistral-7B slot is configured for fast utility prompts and loads on demand. The active pair lives in the AMD Strix Halo’s 64 GiB UMA VRAM via Vulkan, with ~12 GiB headroom on top — the ROCm/HIP path is broken on this GPU until 7.10+ ships, so I built around that. Models load largest-first to minimize fragmentation. Each runs as a supervised llama-server child process — if one crashes, a watchdog thread polls every 10 seconds and restarts the dead child as soon as it’s detected; per request I can target a specific model or let the API auto-failover to the next available.
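
The watchdog is a plain poll-and-respawn loop. A hedged miniature (interval and restart policy are simplified; the real supervisor runs one thread per model slot and also logs):

```python
import subprocess, time

def supervise(cmd, poll=10.0, stop=None):
    """Minimal watchdog sketch: poll the llama-server child and restart it
    as soon as an exit is detected. Returns the restart count on stop."""
    proc = subprocess.Popen(cmd)
    restarts = 0
    while stop is None or not stop.is_set():
        time.sleep(poll)
        if proc.poll() is not None:       # child exited
            proc = subprocess.Popen(cmd)  # bring it straight back
            restarts += 1
    proc.terminate()
    proc.wait()
    return restarts
```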

Three independent memory layers, opt-in per request. RAG knowledge store: 99,000 chunks indexed from my PDFs, an offline encyclopedia ZIM, and notes; queried through an hnswlib approximate-nearest-neighbor index at ~4 ms median latency (down from ~50 ms linear scan), with embeddings cached by SHA-256 of each chunk’s text — so adding or removing a single document only re-embeds that document, never the whole library. MemPalace: the same ChromaDB-backed semantic memory the Story Universe uses, exposed so any client can search the full narrative timeline. Two-tier personal memory: a 60-bullet always-loaded CORE file for rules / identity / preferences / active context, plus an unbounded archive auto-retrieved per turn by cosine similarity above a configurable threshold.
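
The SHA-256 caching trick is worth a sketch: keying embeddings by the hash of the chunk text makes re-indexing incremental for free. A hedged model (`embed()` stands in for the Vulkan llama-server call):

```python
import hashlib

class EmbeddingCache:
    """Embeddings keyed by SHA-256 of the chunk text: adding or removing
    one document only re-embeds that document, never the whole library."""
    def __init__(self, embed):
        self.embed = embed
        self.store = {}                   # sha256 hex -> vector
    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.store:         # unseen text: embed once
            self.store[key] = self.embed(text)
        return self.store[key]
```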

Structured output that physically can’t fail. The API forwards json_schema and grammar fields straight through to llama.cpp, which compiles them to GBNF and masks any token that would break the schema before it can be sampled — the model is blocked, token by token, from emitting malformed output. Wiring this into the Story Universe event generator dropped the parse-failure retry rate from ~51% of LLM calls to 0%, roughly doubling effective throughput.
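
From the client side, the constraint is just one extra field on the request. A hedged sketch of the payload shape (field names follow the OpenAI-compatible `json_schema` convention that llama.cpp understands; treat the exact shape and schema as assumptions, not the production client):

```python
import json

# Hypothetical event schema; the real Story Universe schema is richer.
event_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "actors": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "actors"],
}

payload = {
    "model": "lm2",                       # the narrative model slot
    "messages": [{"role": "user", "content": "Generate the next world event."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "world_event", "schema": event_schema},
    },
}
body = json.dumps(payload)                # POST body for /v1/chat/completions
```

The server compiles the schema to GBNF and masks non-conforming tokens during sampling, which is why the reply can be parsed without a retry path for malformed JSON.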

Vulkan-accelerated embeddings via a separate supervised process. A second llama-server on port 5001 runs all-MiniLM-L6-v2 (converted to GGUF F16, mean-pooled) on the same Vulkan backend — necessary because Python sentence-transformers hangs forever on HIP device init on this GPU. Same watchdog, same auto-restart, same Vulkan path the chat models use; just engineered around the broken driver instead of giving up on GPU embeddings.

Halfax AI VSCode Extension native IDE client — TypeScript

A from-scratch VSCode extension that does what GitHub Copilot, Cursor, and Windsurf Cascade do — chat with an AI in the sidebar, ask it to autonomously edit code, get autocomplete as I type, apply suggested edits with one click — but it talks to my AI server on my GPU, with everything logged to plain JSON files I own. No account, no telemetry, no cloud, dead-internet still works.

Not a Continue.dev fork. Written from nothing in TypeScript: custom React webview UI, Zustand state, an in-process MCP stdio client over @modelcontextprotocol/sdk, OpenAI-compatible streaming straight to the local llama.cpp server. Distributed only as a sideloaded VSIX from my homelab — never the marketplace.

The agent loop is a real state machine. Six states — pending → in_progress → completed | awaiting_user | error | cancelled — driving multi-step work through a strict JSON tool-call protocol. The extractor is forgiving about how the LLM formats its output (stray braces, fence variations, missing args), the validator normalizes no-arg calls before rejecting, and consecutive-error thresholds pause the agent into awaiting_user instead of spinning. Crucially: every successful turn flushes the agent’s history to disk before the next. If VSCode quits or crashes mid-task, reopening it auto-resumes the agent from where it stopped. None of the three commercial agents do this.

Single-slot priority queue across the whole extension. The local GPU has one inference slot at a time, so the queue prioritizes FIM=0 < chat=1 < agent=2: a long-running agent task can never starve typing-time autocomplete, and a chat reply doesn’t wait behind a background completion. Apply on suggested edits previews via vscode.diff and writes through the MCP file_write tool with mtime conflict detection — if I edit the file myself between the model proposing the change and the apply click, the write fails instead of clobbering my work.
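
The queue discipline fits in a few lines. A Python stand-in for the extension's TypeScript queue (class names are illustrative): one GPU slot, lowest class number served first, FIFO within a class.

```python
import heapq, itertools

PRIORITY = {"fim": 0, "chat": 1, "agent": 2}   # lower number wins the slot

class InferenceQueue:
    """Single-slot priority queue sketch: autocomplete (FIM) always jumps
    ahead of chat, chat ahead of agent work."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()     # tie-breaker keeps arrival order
    def submit(self, kind, request):
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._seq), request))
    def next_request(self):
        return heapq.heappop(self._heap)[2] if self._heap else None
```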

Skills, hooks, and self-update — all file-resident. Skills are markdown files in ~/.halfax/skills/ with a tiny frontmatter (name / description / mode) — they appear instantly as slash commands with a popover palette in the composer. Lifecycle hooks (on-session-start, on-tool-call, on-stop) run shell commands of my choice with structured event variables. Self-update pulls a new VSIX from my homelab and installs it through workbench.extensions.installExtension — no marketplace, no update server, no vendor in the loop.

How it differs from Copilot, Cursor, and Cascade. Those three send every keystroke and chat to a corporate cloud — account-required, telemetry-by-default, per-token or per-seat billing, hard rate and context limits, dead without internet. Mine: every inference on my own GPU, every chat saved as a plain JSON under ~/.halfax-sessions/, every memory bullet a line in a CORE.md I can edit. The agent FSM is visible in the UI and pauseable mid-loop; every destructive tool call hits a per-call permission prompt by default.

Infrastructure & Operations

Homelab MCP Server infrastructure monitoring — Python + Paramiko

A piece of glue software that lets an AI assistant (mine, Cascade, Copilot — anything that speaks Model Context Protocol) inspect and operate my entire homelab in plain English. “Is hera-pi healthy?” “Which containers restarted in the last hour?” “Deploy this site.” — all answered as structured tool calls instead of shell commands.

Five hosts, 35+ services, under two seconds. Betelgeuse, hera-pi, Guy VPS, Shop VPS, UYScuti — checked concurrently through Paramiko SSH with connection pooling (30-second keepalive reuses the same SSH session across tool calls) and a ThreadPoolExecutor with 8 workers. The first call cold-starts the pool; every call after is one round-trip per host.
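
The concurrency pattern is ordinary but load-bearing. A hedged sketch of the fan-out (in the real server, `check(host)` runs a command over a pooled Paramiko session; here it is any callable so the shape is testable without SSH):

```python
from concurrent.futures import ThreadPoolExecutor

def check_fleet(hosts, check, workers=8):
    """Fan one health check out to every host concurrently and collect
    the results keyed by host. Worker count mirrors the 8-thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {host: pool.submit(check, host) for host in hosts}
        return {host: future.result() for host, future in futures.items()}
```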

Local Betelgeuse dashboard. systemd services, Docker containers, my AI model endpoints with per-model response timing, GPU telemetry (temperature, power, VRAM via rocm-smi), disk / memory / load, BIND9 DNS, NetBird mesh status, k3d Kubernetes cluster health, and the Narrative Engine API.

Fleet tools. systemd units, Docker containers, and processes on Linux; PowerShell Get-Service on Windows. Remote log retrieval, disk and memory inspection, arbitrary-command execution behind a destructive-pattern denylist (filesystem wipes, disk formats, reboots, fork bombs). SFTP upload and download between any two fleet hosts via the same pooled SSH. The host/service map lives in YAML and hot-reloads on file change — adding a host doesn’t restart the server. Service restarts are gated behind an explicit environment variable.

Story Universe orchestration. Dedicated tools that drive the multi-host narrative stack end-to-end: storyverse_status snapshots every component across hosts in parallel; storyverse_check parses the engine log from the last restart marker and reports ticks, accept/reject counts, top filter patterns, and active arcs in one call; storyverse_sync_engine rsyncs the running engine to the canonical mirror; and storyverse_new_story archives the current story (HTML + JSON + DB), purges MemPalace, resets the world, and restarts services — a complete universe reset from one natural-language request.

Site deploy. Local mirrors of my web properties (this page is one). I edit files locally, then the server uploads via SFTP as root and automatically fixes ownership (chown apache:apache) and permissions (chmod 644). Pull, deploy, and status-compare keep local and remote in sync.

Worth saying: this page was edited on my workstation and pushed live through these tools. The same MCP handles deploys, log retrieval, restarts, and the Story Universe orchestration — one natural-language interface for five hosts and 35+ services.

Halfax AI MCP Server AI bridge — Python + MCP

The other half of the bridge: a Model Context Protocol server that gives any MCP-capable client (my own VSCode extension, Windsurf Cascade, Continue.dev) direct access to the Halfax AI stack — inference, memory, file operations, code analysis. 49 tools registered against my hardware.

Inference and code. Per-model targeting (lm1, lm2, lm3) with multi-turn OpenAI-style messages, side-by-side comparison across all loaded models on the same prompt, automatic RAG context injection from the knowledge store. code_analyze wraps a file in a language-fenced prompt for review. project_analyze walks a directory tree, classifies files by language, finds entry points, and builds an import graph (Python via ast, JS/TS via regex). code_validate syntax-checks Python, JSON, and YAML.

File operations with safety nets. Path-validated file_read; file_write with atomic temp+rename, timestamped .bak backups, a dry_run preview, and mtime conflict detection that refuses the write if the file changed between read and save; file_list with glob filtering. A configurable allowlist (HALFAX_FILE_OPS_ROOTS) plus symlink resolution keeps the agent inside the directories I tell it about.
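
The write guard combines two standard techniques: an mtime check against the value captured at read time, then a temp-file-plus-rename so the target is replaced atomically. A hedged sketch (the backup and dry-run layers are omitted; names are illustrative):

```python
import os, tempfile

def safe_write(path, data, expected_mtime=None):
    """Refuse the write if the file changed since it was read (mtime
    conflict); otherwise write via temp file + atomic rename so a crash
    never leaves a half-written file behind."""
    if expected_mtime is not None and os.path.exists(path):
        if os.path.getmtime(path) != expected_mtime:
            raise RuntimeError("conflict: file changed since it was read")
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
        os.replace(tmp, path)             # atomic on POSIX
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)
```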

File-backed sessions. JSON sessions with a reserved-key namespace, automatic trimming above 80% capacity, tool-execution logging (last 200 entries), and archive-after-N-days — the API server auto-injects a session context block whenever a session_id is passed. The autonomous task FSM has been hoisted out of this server and into my VSCode extension; this MCP now focuses on the inference, file-ops, and analysis surface.

AI Collaboration Memory ADR pattern + off-box DR — Markdown + git

I don’t want my AI assistants to start cold every session, and I don’t want their memory locked into one vendor’s storage layer. So I treat assistant rules the way teams treat Architecture Decision Records (the Nygard-2011 pattern) — flat markdown at the project root, off-boxed like production data, used as common ground across whatever tool I’m running today.

DECISIONS.md — load-bearing rules. 16 entries today, each in the ADR shape: rule, why (the empirical incident or evidence that drove it), how to verify or apply. The file is symlinked into a betelgeuse-docs git repo, off-boxed nightly via the Tier-A encrypted archive (the same one the rest of my stack rides), and recoverable from a separate physical location. Nothing in it depends on a vendor’s persistence layer.

One substrate, multiple tools. I actively use Claude Code, Cascade, Continue.dev, and my own VSCode extension on the same codebase. Each has its own per-conversation working memory — Claude Code’s auto-memory in ~/.claude/projects/, my extension’s session JSON in ~/.halfax-sessions/, Cascade’s project memory. There’s no universal convention for project-root rule files in 2026, so I cover the major ones with symlinks: the same file is reachable as DECISIONS.md, .cursorrules (Cursor), AGENTS.md (Windsurf and a growing community standard), and CLAUDE.md (Anthropic). Each tool finds its own convention; I write to one source of truth in git. Plaintext at the root only matters if each tool actually picks it up, and the symlinks are what make that real.
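
The fan-out itself is three symlinks. A hedged sketch (the file names follow the conventions named above; the function name and guard are illustrative):

```shell
# One source of truth, per-tool names as symlinks into the project root.
link_rules() {                            # $1 = root holding DECISIONS.md
  ln -sf DECISIONS.md "$1/.cursorrules"   # Cursor
  ln -sf DECISIONS.md "$1/AGENTS.md"      # Windsurf + community standard
  ln -sf DECISIONS.md "$1/CLAUDE.md"      # Anthropic tooling
}
if [ -e DECISIONS.md ]; then link_rules .; fi
```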

The promote-to-durable rule. When an assistant learns something load-bearing mid-session — a corrected approach, a new invariant — the convention is to promote that fact from per-tool working memory into DECISIONS.md. Working memory is for ephemeral state; DECISIONS.md is for things that should outlive the tool that learned them. The promotion workflow itself is written down as a feedback memory the harness loads into every session, so the rule applies recursively to itself.

Two-tier semantic memory in my AI server. A 60-bullet always-loaded CORE file for rules, identity, preferences, active context, plus an unbounded archive auto-retrieved per turn by cosine similarity above a configurable threshold. The archive uses ChromaDB; embeddings come from the same Vulkan-backed llama-server on port 5001 that the rest of the stack uses. /api/memory/search exposes it to any client.

What’s not mine, and what is. I didn’t invent ADRs, semantic memory, or project-root rule files — mem0, MemGPT, .cursorrules, CLAUDE.md, Cascade’s project memory all exist and all work. What’s mine is the synthesis: ADRs applied to AI assistant rules with the same DR rigor I apply to production data, plus an explicit promote-to-durable workflow that bridges per-tool ephemeral memory to a shared portable substrate.

Self-Hosted DevOps Stack GitLab + Prometheus + Grafana — Docker

I didn’t build GitLab, Prometheus, or Grafana — but I run them on my own hardware as the backbone of how I ship code and watch the lab. They’re here for honesty: every other project on this page is hosted, monitored, and visualized through this stack.

GitLab CE in Docker on Betelgeuse is the authoritative remote for every project on this page — halfax-ai, halfax-ai-vscode, both MCP servers, the Story Universe components, the recovery plan tree. LAN-only on port 99, integrated into my private DNS zone. A self-hosted GitLab Runner (Docker executor, default python:3.12) is registered against the same instance so CI runs on the same host. Nightly at 02:30 it produces a paired data dump plus a config/secrets bundle — the file GitLab’s own backup tool deliberately doesn’t include — both off-boxed to hera-pi at 03:15.

Prometheus on port 9095 scrapes node-level metrics, MongoDB (via exporter), GitLab’s embedded endpoints, and containerized apps. Persistent volume for the time-series store, restart unless-stopped.

Grafana on port 3020 points at Prometheus as its primary data source. Dashboards cover host vitals (CPU / memory / disk / network across five machines), service status, and application metrics (AI model response times, Story Universe activity). Dashboard JSON lives in the recovery-plan tree so a clean rebuild reproduces them automatically.

Why it’s wired in this way: the point of self-hosting isn’t the existence of these services — SaaS versions exist and are fine. The point is recoverability. The recoveryplan tree at the root of this homelab can rebuild this whole stack from compose files and secrets bundles in a single afternoon, with no vendor in the loop. That’s the property no hosted alternative gives you.

Recovery Plan + Drift Detector DR + invariant guard — Bash + Python

A bare-metal rebuild tree under recoveryplan/ that can put Betelgeuse back into its current state in an afternoon: every docker-compose file, every systemd unit, every backup script, the BIND zone, the netplan, the sudoers stanza, a current package inventory. If the box dies tomorrow and I’m starting from an Ubuntu live USB, this tree is the only thing I need.

The backup chain. Mongo dump nightly at 02:15. GitLab backup + paired config/secrets bundle (the file GitLab’s own backup tool deliberately doesn’t include) at 02:30. The Tier-A encrypted archive (AES-256-CBC + PBKDF2, 30-day retention) twice daily at 02:45 and 14:45 — three local mirrors on three different paths, off-box rsync to hera-pi (separate hardware, separate disk, separate room) at 03:15. The recoveryplan tree itself off-boxes too, so post-disaster the runbook isn’t dying with the host. A manual third-copy snapshot lives on my Windows workstation at C:\Users\…\RECOVERY\ for the “both Linux machines gone” scenario — refreshed by hand when the laptop’s up, since I can’t put it on the cron.

The bootstrap chain. The Tier-A archive doesn’t just carry SSH keys and TLS; it also captures the KeySecrets server’s own bootstrap (/etc/keysecrets.env) and the kss_ auth tokens that halfax-secrets uses to call the API. So recovery from cold metal goes: decrypt archive → restore those files → mongorestore using the URI from keysecrets.env → start the KeySecrets server → ks-get works → clone projects from GitLab using the PAT pulled from the freshly-up vault. Each step uses what the previous step just made available, no Betelgeuse-only knowledge needed. The Tier-A passphrase (the keystone) lives on three machines and in the off-boxed runbook itself.

Drift-detect runs paired with the chain: 03:30 (45 min after the morning A1) and 15:30 (45 min after the afternoon A1). Eight checks: failed units, flapping units (NRestarts above threshold), halfax crontab vs captured baseline, enabled-on-host systemd units vs captured (alias-aware via Names=, skips files with leading # DISABLED), untracked source files in the recoveryplan tree, disk + memory pressure, and backup-log freshness — each of the mongo / gitlab / secrets / offbox logs must show its OK marker with mtime within 26 hours, scanned by content rather than timestamp parsing. State is deduped to a JSON file so the second daily run stays silent if nothing changed. Alerts go out by email via mail.guywithgames.com with proper SPF + DKIM + DMARC alignment, so they reliably land in my recipient inboxes rather than spam.
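
The freshness check is a good example of the content-over-timestamp-parsing rule. A hedged miniature (file and marker names are illustrative):

```python
import os, time

def backup_log_fresh(path, marker="OK", max_age_hours=26):
    """One drift check in miniature: the backup log must contain its OK
    marker and have been written inside the freshness window. The scan is
    by content, not by parsing timestamps out of the log text."""
    if not os.path.exists(path):
        return False
    fresh = (time.time() - os.path.getmtime(path)) < max_age_hours * 3600
    with open(path, errors="replace") as handle:
        marked = marker in handle.read()
    return fresh and marked
```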

Sudoers, tightly scoped. NOPASSWD only for systemctl read-only + journalctl -u/-n/--since/--no-pager + docker ps/logs/inspect/stats + apt list --installed/--upgradable + iptables/nft list. Anything destructive (start, stop, disable, copy) needs the password. The drift detector and the homelab MCP can self-elevate for read-only checks without prompting; nothing destructive runs without me.

What I’m proud of here: most homelabs treat backups as the recovery story. Mine is a recovery tree: the backups plus the runbook to use them, both off-boxed, both reproducible, with a drift detector that pairs to the chain so a failed archive surfaces in the matching email. Earlier this week the detector caught a service that had been silently dead for six months. That’s the system working.

Network Intelligence discovery + enrichment + security

A discovery-and-correlation pipeline that turns raw network scans into a coherent picture of what’s actually on my network and how it’s behaving. Most homelab setups end up with five separate tools (nmap output here, DHCP leases there, an audit log somewhere else); this one joins them so a single device shows up once, with everything any tool knows about it.

What it joins. Active host scanning (open ports, services, banners), DNS records pulled from BIND9, DHCP leases scraped from the router, security audit hits, and NetBird mesh peer state — all keyed against the same device identity. Re-running a scan flags drift since the last run: new hosts, disappeared services, unexpected ports, mesh peers that went dark.

Live dashboard. A web UI that shows the current correlated view, lets me drill from one device into every record the pipeline has on it, and surfaces the difference between “a new device” and “an old device with new behavior” — the distinction security tooling exists to make.

Halfax System Reporter kernel driver + hardware telemetry

A hardware monitoring platform with a custom Windows kernel driver (WDM, Ring 0) that exposes privileged hardware primitives — MSR read/write, PCI config space, SMBus/I2C — via an IOCTL interface with admin-only security descriptors. Not a wrapper around existing tools; the driver (halfax_telemetry_driver.c) loads into the Windows kernel directly. Same access level commercial tools like HWiNFO use, written from scratch.

Three-layer architecture. The kernel driver provides raw hardware access. A C++ user-mode broker (halfax_kernel_broker.cpp) wraps IOCTLs into structured JSON output. A set of native C helper executables handle specialized hardware reads: cpuid_helper (CPUID intrinsics, P/E core topology, APIC IDs, tile/die/module mapping), spd_helper (RAM SPD EEPROM via I2C — DDR4/DDR5 timings, CAS latency, rank/bank config), nvme_helper (NVMe SMART via kernel IOCTLs — temperature, wear, power-on hours), edid_helper (display EDID parsing via SetupAPI).

What it pulls from MSRs: per-core temperatures with thermal margins, RAPL energy counters for real-time power, turbo ratio analysis, C-state residency tracking, IPC efficiency metrics, microcode version detection. Multi-method fallback chains on every subsystem (e.g., memory: SPD helper → WMI → dmidecode → kernel) — if the deepest method fails, the app tries each fallback in turn before giving up, so even on locked-down systems most readings are still available.
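
The fallback chain is the same shape on every subsystem. A hedged sketch (method names below are illustrative; each callable stands in for one acquisition layer):

```python
def read_with_fallbacks(methods):
    """Fallback-chain sketch: try the deepest method first and fall through
    on any failure, so locked-down systems still yield a reading."""
    for name, method in methods:
        try:
            return name, method()
        except Exception:
            continue                      # this layer unavailable: next one
    return None, None
```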

Tkinter GUI with tabbed display across CPU, memory, storage, network, GPU, battery, and system. Cross-platform with full feature parity: Windows (kernel driver + WMI), Linux (sysfs + dmidecode), Raspberry Pi (device tree). Closer to HWiNFO than to a typical Python sysinfo script.

What I’m proud of here: most system monitors stop at WMI or /proc and call it done. I wrote a custom WDM driver that reads MSR registers and PCI config space directly — the same approach commercial tools use, but built from nothing. A kernel driver that loads into Ring 0 is a fundamentally different level of work from calling an API, and it’s the part of this project that took me the longest to get right.

Systems & Security

HalfaxOS bare-metal x86_64 OS — from scratch

A 64-bit capability-based operating system I’m writing entirely from scratch in C and x86_64 Assembly. No Linux, no BSD, no borrowed kernel code — every line is original. Boots via GRUB2 Multiboot2 on both VMware Workstation and physical hardware.

Capability security model. Replaces UNIX file descriptors with typed, permission-checked handles. 512 global kernel objects with refcounting; 128 handles per process with 12-bit permission bitmasks. cap_dup() can only attenuate (reduce) permissions — never escalate. This is the foundational design decision: it makes whole classes of security bugs structurally impossible. Same direction as Google Fuchsia and ARM CHERI.
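Attenuate-only duplication is easy to model: duplicating a handle intersects the requested permissions with the ones already held, so bits can only be cleared. A toy model (the permission names are invented; HalfaxOS itself is C):

```python
# Toy model of attenuate-only handle duplication with a 12-bit mask.

PERM_READ, PERM_WRITE, PERM_EXEC = 0x001, 0x002, 0x004

class Capability:
    def __init__(self, obj, perms):
        self.obj = obj
        self.perms = perms & 0xFFF  # 12-bit permission bitmask

    def dup(self, requested):
        # Intersection with the existing mask: attenuation is the only
        # possible outcome; escalation is structurally impossible.
        return Capability(self.obj, self.perms & requested)

full = Capability("console", PERM_READ | PERM_WRITE)
ro = full.dup(PERM_READ)                   # drops write
sneaky = ro.dup(PERM_READ | PERM_WRITE)    # write cannot come back
```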

Kernel internals. 38 system calls (vs Linux’s 450+). SMP-aware preemptive scheduler with per-process CR3 switching. 4-level paging VMM, bitmap PMM, kernel heap. Message-passing IPC with named ports and structured messages. Ring 3 userspace with ELF64 loader and 5 user programs. ACPI/APIC multi-core CPU detection and IOAPIC routing.

Drivers and networking. Intel E1000 NIC driver. Full from-scratch TCP/IP stack: ARP, IPv4, UDP, TCP, ICMP, DNS, HTTP, DHCP. PS/2 keyboard and mouse. PCI enumeration. PIT timer.

GUI and filesystem. Framebuffer window manager with free-form drag/resize, desktop, taskbar, multi-window terminal with scrollback. VFS layer with RAM filesystem, device filesystem, and exFAT driver.

What I’m proud of here: this isn’t a bootloader I called an OS. I started from a real security thesis — capability handles that can only attenuate, never escalate — in the same direction Fuchsia and CHERI are going. On top of that I built a working TCP/IP stack, a window manager, ELF64 user programs running in Ring 3 isolation, and IOAPIC-routed multicore scheduling. It’s the largest single thing I’ve written, and every line of it is mine.

ComboServer BGP + Steam forensic analytics

An analytics platform that asks an unusual question: when the internet itself misbehaves — a routing leak, an undersea cable cut, a country going offline — what happens to the people playing video games on top of it? It correlates real-time BGP routing data with Steam behavioral data (player counts, review volume, patch impact) to surface patterns nobody else has built tooling for: BGP outages causing player-count drops, route leaks triggering latency spikes, country-level shutdowns shifting review volumes.

BGP side (10 features): real-time anomaly detection, hijack-vs-misconfiguration classification, AS “personality profiles” (stability/chattiness/weirdness scores), internet weather radar showing instability clusters, time-travel replay of historical outages, prefix ghosting tracker (announced ranges that don’t actually carry traffic), AS relationship drama detection.

Steam side (11 features): game similarity engine based on player behavior rather than publisher tags, sentiment-drift visualizer, coordinated review-bomb detector, patch-impact shockwave maps, player migration constellations, game life-cycle autopsy with narrative classification, review DNA fingerprinting that matches similar reviewers across games.

Storage and ingest. TimescaleDB for the time-series side, with a state manager that auto-saves every 5 minutes. RIPE RIS Live WebSocket for real-time BGP, RouteViews MRT dumps for historical replay. WebSocket dashboard, REST API, TextBlob sentiment analysis, geographic mapping. Auto-bootstrap pulls 4 weeks of BGP (~2,800 files at 15-min intervals) and 60 days of Steam history on first run, then catches up intelligently after any downtime — BGP if the gap is over an hour, Steam if over six.
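The catch-up decision reduces to two thresholds from the text: backfill BGP if the gap exceeds an hour, Steam if it exceeds six. The function shape is illustrative, not ComboServer's actual code:

```python
# Sketch of the post-downtime catch-up decision.

BGP_GAP_SECONDS = 60 * 60          # backfill BGP past one hour
STEAM_GAP_SECONDS = 6 * 60 * 60    # backfill Steam past six hours

def catchup_plan(gap_seconds):
    """Decide which feeds need a historical backfill after a gap."""
    return {
        "bgp": gap_seconds > BGP_GAP_SECONDS,
        "steam": gap_seconds > STEAM_GAP_SECONDS,
    }

plan = catchup_plan(3 * 60 * 60)   # a three-hour outage: BGP only
```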

What I’m proud of here: as far as I can tell, nobody else builds this. BGP routing analysis and Steam behavioral analytics exist as separate disciplines — correlating them to detect how internet infrastructure events ripple through gaming populations is the original framing I went looking for and didn’t find. The forensic side (AS personality profiles, prefix ghosting, review DNA fingerprinting) is where it stops being a dashboard and starts being actual pattern analysis.

Halfax KeySecrets multi-user secrets vault — HTTPS-only

A self-hosted, end-to-end encrypted secrets vault for me, my family, and a couple of trusted friends. Everyone has their own account, their own master password, and their own keypair. The server only ever sees ciphertext — even with full database access, an admin cannot read anyone’s passwords.

Per-recipient cryptographic sharing. Sharing is the part most password managers get wrong. I do it cryptographically: every secret has its own data key, and that key is sealed individually to each recipient. Revoking access is one wrapped-key delete — no re-encryption, no key churn, no leakage. As of this year the wrap is hybrid post-quantum by default (X25519 + ML-KEM-768 AND-construction, KSH1 wire format) — the server won’t even start without ML-KEM available. The older SealedBox path is read-only now, kept just so legacy entries are still openable.
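The data layout is the interesting part, so here is a structural sketch. The "seal" below is a deliberately fake placeholder, NOT real cryptography (the real wrap is X25519 + ML-KEM-768); the point is one data key per secret, one sealed copy per recipient, and revocation as a single delete:

```python
import os

def seal_for(recipient_pubkey, data_key):
    # Fake stand-in for the hybrid post-quantum seal; illustrative only.
    return ("sealed", recipient_pubkey, data_key)

secret = {
    "ciphertext": b"...",   # encrypted once, under data_key
    "wrapped_keys": {},     # recipient -> sealed copy of data_key
}
data_key = os.urandom(32)
for user in ("alice", "bob"):
    secret["wrapped_keys"][user] = seal_for(user, data_key)

# Revoking bob is one wrapped-key delete; the ciphertext is untouched,
# so there is no re-encryption and no key churn.
del secret["wrapped_keys"]["bob"]
```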

Conservative modern crypto stack. Argon2id for password hashing (memory-hard, the current OWASP pick — GPU-resistant). ChaCha20-Poly1305 for authenticated encryption everywhere else. Master passwords are never stored; they only ever exist as the input that derives a session key, in memory, for the duration of a session.

Login + 2FA. Master password plus mandatory TOTP from any standard authenticator app — new accounts start in a pending_2fa state and can’t do anything except finish enrollment until they confirm a valid code. The 2FA secret itself is encrypted with a key derived from the user’s identity key, so even a stolen database doesn’t leak working codes. Failed logins are rate-limited; every register / login / 2FA / share event lands in an audit log.
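TOTP itself is small enough to show whole. This is a minimal RFC 6238 implementation of the algorithm standard authenticator apps use, included to illustrate what the server verifies at login; a real deployment also checks a clock-skew window and rate-limits attempts:

```python
import hashlib
import hmac
import struct

def totp(secret: bytes, for_time: int, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP over a 30-second time counter."""
    counter = int(for_time) // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                       # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 Appendix B test vector: this secret at t=59 yields "94287082"
code = totp(b"12345678901234567890", 59, digits=8)
```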

Production hygiene most homelab projects skip. Python/Flask app fronted by cheroot (production WSGI server with native TLS). HTTPS only — no plaintext port. Locked-down systemd service with database credentials in a root-owned env file (never in the unit, never in ps). MongoDB sits behind it, holding only ciphertext. Browser side: HSTS, strict Content-Security-Policy, clickjacking protection, Secure/HttpOnly/SameSite cookies. Three interfaces — web UI, Click-based CLI, documented Bearer-token REST API. Seven secret types (logins, API keys, SSH keys, notes, certificates, cards, free-form), each with its own encrypted fields.

The consumption story — halfax-secrets. A vault is just a website if my programs still carry .env files. So I wrote a tiny Python lib + ks-get CLI that fetches credentials over the API at runtime: get_password("MongoDB — myHalfax…") and the password lives only in process memory, never on disk. Per-host service tokens (long-lived kss_ bearers, role-separated read vs change) authenticate unattended scripts — the human master password is never in a script’s environment. Migrating the fleet (HF tokens on the AI host, the Mongo backup credential, the Namecheap dynamic-DNS hash on hera-pi, SMTP passwords on two hosts) means there are now exactly five named files on disk that still hold plaintext credentials — the bootstrap files the vault needs before it can serve anything else, and the trust-anchor tokens themselves. Everything else fetches at runtime. That’s the line I wanted to draw, and now it’s actually drawn.

Games

Halcity city builder — Python + Pygame

A SimCity-style city builder I wrote from scratch in Python and Pygame. 150×100 procedural map (grass, forest, water), 16 building types with subtypes — 5 power plants (Coal, Gas, Nuclear, Solar, Wind), 3 residential, 3 commercial, 3 industrial, 3 farms — and roads that auto-tile at intersections with smart joins (straight, corner, T-junction, crossroad). Buildings have to connect to a road to function.
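Road auto-tiling is usually done with a neighbor bitmask: encode which of the four adjacent tiles are also roads, then map the mask to a sprite class. A sketch of that technique (names and classes are illustrative, not Halcity's actual code):

```python
N, E, S, W = 1, 2, 4, 8   # neighbor bits

def classify(mask):
    """Map a 4-bit neighbor mask to a road sprite class."""
    count = bin(mask).count("1")
    if mask in (N | S, E | W):
        return "straight"
    if count == 4:
        return "crossroad"
    if count == 3:
        return "t_junction"
    if count == 2:
        return "corner"
    return "end" if count == 1 else "isolated"

def road_shape(roads, x, y):
    """Pick the tile shape for the road at (x, y) from its neighbors."""
    mask = 0
    if (x, y - 1) in roads: mask |= N
    if (x + 1, y) in roads: mask |= E
    if (x, y + 1) in roads: mask |= S
    if (x - 1, y) in roads: mask |= W
    return classify(mask)

roads = {(1, 1), (1, 0), (1, 2), (0, 1), (2, 1)}   # a plus-shaped junction
```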

The underground layer (Tab toggle) lets you lay water pipes and sewer pipes with a 4-tile Manhattan coverage radius. Water pumps near water tiles produce +15 supply; sewage plants process waste. Both systems affect happiness and pollution.

Turn-based economy simulates population, jobs, food, energy, money, pollution, happiness, water, and sewage every End Turn. Threshold events fire automatically (smog alerts, trade booms, festivals). A prestige system lets you spend points on city decrees (Golden Harvest, Civic Vision, Clean Future). 48×48 pixel-art sprites for every building variant. Cursor-anchored zoom, right-click pan, hover tooltips, and full save/load.

Pygame rewrite of an earlier Tkinter prototype, organized into six modules: game.py (main loop + world gen), renderer.py (map + UI), game_logic.py (economy + events), sprites.py (pixel art), constants.py (data), ui.py (widgets).

Roguelike Dungeon Crawlers 4 iterations — mybad → newmybad

A roguelike dungeon crawler that evolved through four major iterations, each building on the last:

  • mybad (PyQt6 GUI) — first playable: dungeon gen, combat, inventory, town, save/load
  • cmdmybad (Terminal / curses) — stripped to pure CLI; same engine, keyboard-only input
  • myband (Terminal) — CSV-driven item/monster data experiment (Angband-inspired)
  • newmybad (Tkinter GUI) — full rewrite: magic system, Borg AI agent, telemetry, Godot port

newmybad is the current version. It adds a magic system, expanded combat in fight.py, a resource loader with JSON data files for weapons, armor, potions, scrolls, wands, jewelry, and monsters, and a full autonomous AI agent:

The Borg AI (borglib.py + borg_policy.py) plays the game autonomously using a goal-stack architecture (LIFO priorities + FIFO deferred goals), BFS pathfinding, stuck/oscillation detection and recovery, and priority-weighted targeting (monsters 40, stairs 30, healing 25, items 20, equip 15, explore 5). Every step is recorded by borg_instrument.py as JSONL telemetry for post-mortem analysis. A partial Godot 4 port exists in godot_version/.
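The goal-stack pattern can be sketched in a few lines: urgent goals push LIFO, deferred goals queue FIFO, the stack always drains first, and targets are chosen by priority weight. The weights come from the text; the class shape is illustrative, not borglib.py itself:

```python
from collections import deque

# Priority weights from the text: monsters first, exploration last.
PRIORITY = {"monster": 40, "stairs": 30, "healing": 25,
            "item": 20, "equip": 15, "explore": 5}

def pick_target(visible):
    """Choose the highest-weighted goal type currently visible."""
    return max(visible, key=PRIORITY.get)

class GoalStack:
    def __init__(self):
        self.stack = []            # LIFO: interrupt-style goals
        self.deferred = deque()    # FIFO: goals to revisit later

    def push(self, goal):
        self.stack.append(goal)

    def defer(self, goal):
        self.deferred.append(goal)

    def next_goal(self):
        # Work the stack before touching the deferred queue.
        if self.stack:
            return self.stack.pop()
        return self.deferred.popleft() if self.deferred else None

borg = GoalStack()
borg.defer("explore")
borg.push("stairs")
borg.push("monster")      # most recent urgent goal wins
order = [borg.next_goal() for _ in range(3)]
```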

Skyhaven RPG browser RPG + Godot 4 port

A Skyrim-inspired browser RPG built in pure HTML, CSS, and JavaScript — no frameworks, no build tools. Full character creation with 10 playable races (Nord, Imperial, Dark Elf, High Elf, Wood Elf, Khajiit, Argonian, Orc, Breton, Redguard) and 5 classes (Warrior, Mage, Thief, Ranger, Paladin), each with unique stat bonuses and starting gear.

Turn-based combat (attack, magic, defend, potion, flee), 12 skills across combat/magic/stealth trees, quest log with multi-stage quests, equipment slots, HP/MP/SP resource bars, level-up with attribute point allocation, NPC dialogue trees, world map navigation, and a full save system.

A parallel Godot 4 port is in progress with GDScript reimplementations of the game data, state management, and core scenes (main menu, character creation, game world).

Betelgeuse Projects active + dev

  • Halfax AI System
  • Narrative Engine
  • Homelab MCP Server
  • Halfax AI MCP Server
  • Halfax Learning System
  • ComboServer
  • Halfax System Reporter

hera-pi Projects service stack

  • AI Web Proxy (Interfaces)
  • Chronicle Keeper
  • Namescan / Network Scanner
  • Photo Upload Server
  • Halfax Image Generator (PictureAI)
  • Minecraft Server

UYScuti Projects 55+ builds

  • HalfaxOS
  • Interactive Story Creator
  • Halfax KeySecrets
  • Personal Finance App
  • Network Intelligence
  • Halcity (City Builder)
  • Skyhaven RPG
  • Roguelike Dungeon Crawler + Borg AI
  • Classic Roguelike Variants
  • ML / AI Experiments

Cloud Projects public-facing

  • World Browser
  • Daily Email Sender
  • Simple Blog App Prototype
  • Site publishing + operations workflow
  • Cross-host automation scripts
Security note: this homepage intentionally omits credentials, private IPs, host access details, DNS internals, and service ports not required for public navigation.