NOTES FROM THE LAB
Writing
Short, honest write-ups on the reliability side of AI engineering — evals and anti-hallucination, observability that doesn't lie, agents and MCP, and the GEO / AI-search visibility niche.
4 posts · 4 series
One MCP server, four clients
Pointing Cursor, a raw SDK client, LangChain, and LangGraph at one MCP server to test "write once, call from any agent" — and why transport choice, structured I/O, and not being an SSRF proxy are the real work.
Teaching RAG to say "I don't know"
Two gates — a model-free score floor and an evidence-required reranker that must name what is missing — plus a negative control in the eval set, so a retrieval system can honestly return "nothing here fits."
The bug that made our alerts lie for months
A severity-1 observability blind spot: a log-shipping prefix broke level extraction, so "no errors" really meant "we stopped being able to see errors." Why a zero is a question, not an answer.
Mentioned but not cited: the five states of AI-answer visibility
A five-state model for AI-answer visibility — full, mention-only, citation-only, third-party, invisible — and why the mention–citation gap is the cheapest thing to measure and the most expensive to ignore.