Real experiments. Published findings. No filler.
We test AI systems in production and write about what holds up. Security, architecture, model evaluation, and tooling decisions from active builds.
Latest
7 notes
Openclaw: security and deployment best practices with Docker
Airgapped Docker containers, hardened networking, pinned images, and zero outbound access for agent deployments that handle real data.
Tutor architecture: how we structure AI tutoring systems
Session handling, knowledge routing, adaptive difficulty, and the feedback loop.
Nanoclaw vs Openclaw: which one to deploy
A decision framework for choosing the right agent runtime based on team shape and operational constraints.
Opencode: an open alternative to Claude Code
Where open coding assistants work today, and where managed tools still win.
ChatGPT Codex 5.3 vs Claude Opus 4.6 for core build
A comparison of edit accuracy, recovery speed, and architecture-level task performance.
RAG without hallucination
Guardrails for retrieval-only responses that stay faithful to approved documentation.
Aesthetic scoring for early childhood
Scoring rubric for generated visuals intended for young children.