General Purpose



Some ideas make sense to write about. Some make sense to make into a little experiment. These are the latter. All my fumblings are on GitHub.

design-gan

Autoresearch-style dual-agent loop that evolves single-page website designs. A generator agent produces a site from a short brief; a critic agent scores it on the System Usability Scale (SUS) alongside objective accessibility signals (axe-core); the orchestrator feeds feedback back into the generator and repeats until the composite score plateaus.

demo / repo
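
For the curious, the generate, critique, repeat loop can be sketched roughly like this. It is a minimal sketch under assumptions, not the code in the repo: `generate`, `critique`, `composite`, the per-violation penalty, and the `patience` / `min_gain` plateau rule are all hypothetical names and choices made up for illustration. The only part taken from outside is the standard SUS scoring formula (ten 1-5 answers mapped to a 0-100 score).

```python
from typing import Callable, List, Tuple


def sus_score(responses: List[int]) -> float:
    """Standard SUS scoring: ten 1-5 Likert answers mapped to 0-100."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses between 1 and 5")
    total = 0
    for i, r in enumerate(responses):
        # Items 1, 3, 5, 7, 9 are positively worded and contribute r - 1.
        # Items 2, 4, 6, 8, 10 are negatively worded and contribute 5 - r.
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5


def composite(sus: float, violations: int, penalty: float = 2.0) -> float:
    # Hypothetical weighting: a flat penalty per accessibility violation.
    return max(0.0, sus - penalty * violations)


def evolve(
    brief: str,
    generate: Callable[[str, str], str],
    critique: Callable[[str], Tuple[List[int], int, str]],
    patience: int = 2,
    min_gain: float = 1.0,
    max_rounds: int = 20,
) -> Tuple[str, float, List[float]]:
    """Run generator and critic until the composite score stops improving."""
    best_design, best_score = "", float("-inf")
    feedback, stale = "", 0
    history: List[float] = []
    for _ in range(max_rounds):
        design = generate(brief, feedback)
        responses, violations, feedback = critique(design)
        score = composite(sus_score(responses), violations)
        history.append(score)
        if score >= best_score + min_gain:
            best_design, best_score, stale = design, score, 0
        else:
            # Assumed plateau rule: stop after `patience` rounds without a gain.
            stale += 1
            if stale >= patience:
                break
    return best_design, best_score, history
```

In this sketch both agents are plain callables, so they can be swapped for stubs when testing the orchestration on its own.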

society-of-researchers

A multi-agent research orchestration system that runs research through 6 stages, each staffed by a panel of AI agents with deliberately conflicting perspectives. A conflict-detection pass surfaces where the agents agree, disagree, and contradict themselves — and a human researcher reviews, edits, and approves at every checkpoint before advancing.
demo / repo

back-and-forth-annotations

A new way to talk with an AI about an image. Instead of an image being disposable context that scrolls out of view, the image is pinned beside the conversation as the shared object you're both discussing — and both sides can point at it. You drop pins, lasso regions, and draw arrows to show the model exactly what you mean; the model sees the image and points back with its own marks. Every turn that points at something becomes a layer you can scrub through, replay, and export as a collage.
repo

adaptive-explainer

Playing with mid-conversation adaptations. A learning app that creates personalized, multi-step explanations for any topic. It builds a structured learning path, maintains a dynamic model of the learner's knowledge, adapts explanations in real time, and lets users ask follow-up questions at any point during a lesson.
demo / repo

belief-tracker

Uses LLM analysis of conversations to classify a predicted user belief, then uses that classification to improve future model responses.
demo / repo

diamonds-annotation

A tool for annotating LLM conversations across the DIAMONDS psychological framework. Chat with Claude, then Claude rates the resulting conversation across 8 situational dimensions, and you can adjust any of those ratings if you disagree. Useful for model output steering mid-conversation.
repo

tom-benchmark

A benchmark for evaluating Large Language Models on Theory of Mind (ToM) tasks. The suite is organized around 6 cognitive categories scored through a 3-layer evaluation pipeline that combines fast deterministic matching with LLM-based semantic judging and structured output analysis.
repo
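
A rough sketch of how a layered evaluation like this might fit together, assuming the model replies with a JSON object containing an `answer` field. The function names, the JSON shape, the layer ordering, and the `judge` callable (a stand-in for an LLM call) are all hypothetical here; the actual pipeline lives in the repo.

```python
import json
import re
from typing import Callable, Dict, Union


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(cleaned.split())


def evaluate(
    raw_output: str,
    expected: str,
    judge: Callable[[str, str], bool],
) -> Dict[str, Union[bool, str]]:
    # Structured output analysis: is the reply a JSON object with "answer"?
    answer, well_formed = raw_output, False
    try:
        data = json.loads(raw_output)
        if isinstance(data, dict) and isinstance(data.get("answer"), str):
            answer, well_formed = data["answer"], True
    except json.JSONDecodeError:
        pass  # Fall back to treating the raw text as the answer.

    # Fast deterministic layer: normalized exact match, no model call.
    if normalize(answer) == normalize(expected):
        return {"correct": True, "layer": "deterministic", "well_formed": well_formed}

    # Semantic layer: only reached when the cheap check fails.
    verdict = judge(answer, expected)
    return {"correct": verdict, "layer": "judge", "well_formed": well_formed}
```

Running the cheap check first means the judge is only consulted for answers that are phrased differently from the expected one, which is one plausible reason to layer things this way.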

tom-negotiation

A negotiation simulator where you bargain against AI agents that build and update Theory of Mind (ToM) models of you in real time, inferring your priorities from your moves and adapting their strategy accordingly.
demo / repo

report-builder

A web app for building shareable UX research reports, with interactive before/after comparisons, pinned annotations, PDF export, and a separate AI-native report mode that surfaces methodology, prompts, model versions, and reasoning behind every finding.
repo

new-interaction-primitives-for-gen-AI

Seven proposed interaction primitives for working with language models beyond the chatbot. Drop in any text and click through the tabs to feel how each primitive reshapes the same input. Based on a talk of mine by the same name.
repo

talk-to-me

Turn a Raspberry Pi (or any computer with a mic and speaker) into a conversational object. Speak to it, and it speaks back — powered by Azure Speech and Azure OpenAI. Drop it inside a 3D-printed lamp, a stuffed animal, a stapler, anything. Change the SYSTEM_PROMPT and the object takes on a personality.
repo

plus-max-go-one

Discover products with "Plus", "Max", "Go", or "One" in their names. Because tech companies really love to name products and services with one of these four words. Like, a lot.

demo / repo

shader-fun

Shaders are fun. Upload an image and play around (or just look at them).
demo / repo