01 — Start with data that doesn't lie.
Postgres with read replicas and indexing strategy. Redis out front as a distributed cache. Baseline: 60% fewer DB queries, 50% faster APIs, reads 60% snappier.
Software engineer with a decade of curiosity and a taste for systems that refuse to fall over. I spend my days making agentic software useful, measurable, and reliable — not just impressive in a demo.
The gap between a flashy demo and a production system is measured in observability, evaluations, and a thousand small decisions nobody tweets about.
They deserve SLOs, regression suites, and a paging rotation — same as any other service with users.
Retrieval strategy, conversation state, and prompt shape determine whether a user trusts you twice.
Offline evals gate the deploy. Online evals gate the rollout. Vibes are for the launch post.
Postgres, Kafka, Redis, a circuit breaker. The exciting part sits on top of a foundation that never surprises you.
Postgres with read replicas and indexing strategy. Redis out front as a distributed cache. Baseline: 60% fewer DB queries, 50% faster APIs, reads 60% snappier.
Kafka for event-driven flow. +30% throughput, −10% inter-service latency. OAuth 2.0 for 5M+ daily logins with 20% fewer failures.
High-throughput microservices in Spring Boot + Gin on Kubernetes. 2M+ concurrent users. 99.99% uptime. Legacy translation for 50M requests/day at 25% lower cost.
LlamaIndex over internal tools, code, and knowledge bases. Ingestion, semantic chunking, embeddings, vector search. Answers arrive with citations — or they don't arrive at all.
Prototyped in Dify, hardened with ADK. Specialized agents plan steps, call tools, and aggregate. ~70% less time finding internal info. 17% fewer tokens. Engineers & PMs actually use it.
An agentic RAG assistant over our private tools, code, and knowledge. Engineers ask, answers come back with citations — and the bill stays reasonable.
* caught by the eval harness before rollout. That's the whole point.
High-throughput microservices powering in-game commerce for 2M+ concurrent players with four nines of uptime. Boring on purpose.
Outside of 2K, I've been working with a Gen-AI startup on the infrastructure that turns an agent from a party trick into a product:
| faithfulness | 0.92 | |
| citation_recall | 0.88 | |
| task_completion | 0.81 | |
| tool_accuracy | 0.95 | |
| p95_latency | 1.8s | |
| cost/query | $0.014 |
If an agent can't be measured, it can't be shipped.
If it can't be shipped, it isn't real.
Engineering and arranging share the same bones — voice-leading is just dependency resolution in a nicer font. A few pieces I've been carrying around.
video2:09(Hit ▶. R2 ships the bytes; the rest was shipped after hours.)
Hi — I'm Haoyang's digital twin. Ask me about his agent work at 2K, the commerce platform, evals, or the music on the way down.