API Probes Estimate Secret LLM Params

💡 Black-box parameter estimator claims GPT-5.5 has ~9T parameters; test it on your APIs now (debate raging)
⚡ 30-Second TL;DR
What Changed
IKP dataset: 1400 rare questions across 7 scarcity levels, tested on 188 models.
Why It Matters
Enables reverse-engineering of closed models, but wide CIs and method critiques limit reliability. May shift focus to factual capacity over benchmarks in model comparisons.
What To Do Next
Build IKP dataset and probe your black-box LLM APIs to estimate param counts.
Who should care: Researchers & Academics
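The probe-and-score step recommended above can be sketched as a simple loop: ask the model each rare question and check whether the expected fact appears in its reply. Everything here is illustrative: `query_model` stands in for whatever API wrapper you use, and the stub model exists only to make the sketch runnable.

```python
def estimate_accuracy(query_model, probes):
    """Score a black-box model on rare-fact probes.

    query_model: callable str -> str (hypothetical API wrapper).
    probes: list of (question, expected_answer) pairs.
    Returns the fraction of probes answered correctly, using a
    naive substring match as the correctness check.
    """
    correct = 0
    for question, answer in probes:
        reply = query_model(question)
        if answer.lower() in reply.lower():
            correct += 1
    return correct / len(probes)

# Toy demonstration with a stub model that "knows" half the facts.
probes = [("Q%d" % i, "fact%d" % i) for i in range(10)]
stub = lambda q: "fact" + q[1:] if int(q[1:]) % 2 == 0 else "unknown"
print(estimate_accuracy(stub, probes))  # → 0.5
```

In practice the correctness check would need to be more robust than substring matching (e.g. normalization or an LLM judge), and each probe would be repeated to average over sampling noise.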
🧠 Deep Insight
AI-generated analysis for this event.
🔑 Enhanced Key Takeaways
- The IKP (Information Knowledge Probing) framework uses a 'knowledge-density' metric, which posits that as parameter counts grow, a model's ability to retrieve long-tail, low-frequency factual data follows a predictable power-law distribution.
- Critics of the IKP method note that Mixture-of-Experts (MoE) architectures, now standard in top-tier models like GPT-5.5, produce 'sparse' parameter counts that blur the distinction between 'total parameters' and 'active parameters' during inference.
- The research team behind IKP acknowledges that their 90% confidence interval (0.3x to 3x) is far wider than what hardware-based profiling achieves, chiefly due to the confounding effects of data contamination and specialized post-training optimization.
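The reported 0.3x-to-3x interval is easier to interpret on a log scale: because 3 / 0.3 = 10, the interval spans exactly one order of magnitude, i.e. about ±0.5 orders around the point estimate. A small sketch, assuming the interval is purely multiplicative and using the claimed 9T figure as the point estimate:

```python
import math

point_estimate = 9e12            # the claimed 9T-parameter figure
low_mult, high_mult = 0.3, 3.0   # reported 90% CI multipliers

# The interval spans 2.7e12 .. 2.7e13 parameters.
ci = (point_estimate * low_mult, point_estimate * high_mult)

# Half-width in log10 space: ±0.5 orders of magnitude.
half_width = (math.log10(high_mult) - math.log10(low_mult)) / 2
print(half_width)  # → 0.5
```

Put differently, the method cannot distinguish a 3T model from a 27T one at 90% confidence, which is the crux of the reliability critique.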
🛠️ Technical Deep Dive
- The IKP framework relies on a log-linear regression model, log(Accuracy) = α * log(Parameters) + β, where α represents the scaling efficiency of knowledge acquisition.
- The dataset consists of 1,400 'rare' factual queries curated from obscure academic papers, niche historical archives, and specialized technical documentation, chosen to minimize the likelihood that these facts appear in standard training corpora.
- The methodology accounts for 'Model Age' and 'Training Compute' as latent variables, using a normalization factor to adjust for models that have undergone extensive synthetic data distillation.
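The regression above can be calibrated on open models of known size and then inverted to back out a parameter estimate for a closed model. A minimal sketch with synthetic data: the constants `alpha_true` and `beta_true`, the model sizes, and the ordinary-least-squares fit are all assumptions for illustration, not values from the IKP report.

```python
import numpy as np

# Synthetic calibration set: probe accuracies for open models of
# known size, generated from the paper's log-linear form
# log(acc) = alpha * log(params) + beta with illustrative constants.
alpha_true, beta_true = 0.25, -8.0
params = np.array([1e9, 7e9, 70e9, 400e9])
log_acc = alpha_true * np.log(params) + beta_true

# Fit alpha and beta by ordinary least squares.
A = np.vstack([np.log(params), np.ones_like(params)]).T
alpha, beta = np.linalg.lstsq(A, log_acc, rcond=None)[0]

def implied_params(accuracy):
    """Invert the fit: map an observed probe accuracy on a
    closed model to its implied parameter count."""
    return np.exp((np.log(accuracy) - beta) / alpha)
```

On real data the calibration points would be noisy, and the fitted line's residual scatter is exactly what produces the wide 0.3x-to-3x interval discussed above.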
🔮 Future Implications
AI analysis grounded in cited sources
Model providers will increasingly obfuscate API response latency to prevent parameter estimation.
As parameter estimation techniques become more accurate, companies will likely introduce artificial jitter or latency to mask the computational footprint of their models.
Standardized 'Knowledge Density' benchmarks will emerge as a new industry metric.
The success of IKP suggests that factual recall efficiency is becoming a more reliable proxy for model capability than traditional MMLU or GSM8K benchmarks, which suffer from saturation.
⏳ Timeline
2025-03
Initial release of the IKP methodology whitepaper focusing on small-scale open-source models.
2025-11
Expansion of the IKP dataset to include 1,400 queries, enabling testing on closed-source API models.
2026-04
Publication of the comprehensive report estimating parameters for GPT-5.5 and Claude 4.7.
AI-curated news aggregator. All content rights belong to original publishers.
Original source: 虎嗅 ↗



