Now

Last updated: Apr 24, 2026

Building

Rebuilding narapa.dev from a resume-shaped portfolio into a terminal-style working space. Notes, experiments, and a chat-with-the-site backed by my own AI Gateway.
RAG eval harness — I had hand-wavy confidence in the Prudential GPT-4 assistant's answers; I'm formalizing retrieval MRR and hallucination rate so regression is a number, not a feeling.
AI Gateway Platform — multi-provider LLM routing (Azure OpenAI + Bedrock), per-tenant cost tracking, policy-based guardrails. Adding prompt/response caching this week.

Honeycomb's blog — their writing on observability philosophy keeps changing how I instrument systems.
Anthropic's research posts on eval methodology — specifically the stuff on how to measure LLM reliability under distribution shift.
Designing Data-Intensive Applications — fifth re-read. Different things land each time.

How much of the "agent" pattern is actually useful vs. a clean RAG + a good prompt.
Where the seam between legacy and modern systems should sit when you're in the middle of a 3-year migration with no option to pause either side.
Why every AI infra team is converging on the same shape (gateway + vector DB + eval + guardrails) — and what that tells us about the next layer up.