Replicating The Circuit Kings

replicating ‘circuit tracing: revealing computational graphs in language models’ by the absolute beasts over at anthropic
Read more →

Multi Query Low Rank Attention

a bit of stream of consciousness poasting
Read more →

Constitutional Mech

Legal Mech Interp
Read more →

Verify Verify Verify

verify the unverifiable
Read more →

Why So Hard (Negative) On Your Self (Reinforcement)?

Exploring hard negative mining with BM25, self-selection, bandits, and Faiss
Read more →