🤖 Reddit r/MachineLearning
Predict GPT-2 Edges from Weights Alone
💡 125x faster edge-importance prediction for GPT-2 circuits from weights alone: an interpretability breakthrough.
⚡ 30-Second TL;DR
What Changed
Edge scores predicted from weights alone reach a Spearman correlation of ρ = 0.623 with path patching.
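For readers unfamiliar with the metric, here is a minimal sketch of a Spearman rank correlation between two lists of per-edge scores. The function names and the sample numbers are hypothetical placeholders, not values or code from the original post; the only assumption is that each edge has one cheap weight-based score and one path patching score.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(vx * vy)


# Hypothetical placeholder scores for five edges (not data from the post).
cheap_scores = [0.91, 0.12, 0.55, 0.40, 0.78]
patching_scores = [0.80, 0.05, 0.30, 0.45, 0.60]
rho = spearman_rho(cheap_scores, patching_scores)
```

Because only ranks matter, the cheap score does not need to match the magnitude of the path patching effect; it only needs to order the edges similarly.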
Why It Matters
Enables fast prioritization of edges for investigation or pruning in transformer circuits, so that expensive causal analysis is spent only on the most promising candidates. Promising for scaling mechanistic interpretability.
What To Do Next
Compute Cheap Anchor scores on your transformer model's induction heads using the described spectral and path metrics.
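This summary does not define the Cheap Anchor score, so the sketch below is not that method. It only illustrates what a weight-only edge score can look like: given an upstream head's output map and a downstream head's input map (both hypothetical arrays here, using a column-vector convention), it computes a Frobenius-normalized composition score and the spectral norm of the composed map. All names and shapes are assumptions made for illustration.

```python
import numpy as np


def composition_score(w_ov_up, w_in_down):
    """Frobenius norm of the composed map, normalized by the factors' norms.

    w_ov_up:   (d_model, d_model) map the upstream head writes to the residual stream.
    w_in_down: (d_head, d_model) map the downstream head reads from the residual stream.
    Both are illustrative assumptions, not the post's definitions.
    """
    product = w_in_down @ w_ov_up
    denom = np.linalg.norm(w_in_down) * np.linalg.norm(w_ov_up)
    return float(np.linalg.norm(product) / denom) if denom > 0 else 0.0


def spectral_score(w_ov_up, w_in_down):
    """Largest singular value of the composed map (matrix 2-norm)."""
    return float(np.linalg.norm(w_in_down @ w_ov_up, ord=2))


# Toy example with random weights; a real model would supply these matrices.
rng = np.random.default_rng(0)
d_model, d_head = 16, 4
w_ov = rng.normal(size=(d_model, d_model))
w_q = rng.normal(size=(d_head, d_model))
scores = {
    "composition": composition_score(w_ov, w_q),
    "spectral": spectral_score(w_ov, w_q),
}
```

Scores of this kind need no forward passes, which is why a weight-only ranking can be far cheaper than causal methods; whether it tracks causal importance has to be checked against path patching.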
Who should care: Researchers & Academics
Original source: Reddit r/MachineLearning