MIT introduces Self-Distillation Fine-Tuning to reduce catastrophic forgetting; the method uses student-teacher demonstrations and requires 2.5x the compute.
Knowledge Distillation (KD) has been established as an effective technique for reducing the resource requirements of models when tackling computer vision tasks. Prior work has studied how to distill ...
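The abstract above mentions KD only at a high level. For context, here is a minimal sketch of the standard distillation objective (Hinton et al., 2015), not code from the paper being excerpted; the function name and the default values of the temperature `T` and mixing weight `alpha` are illustrative assumptions.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Classic knowledge-distillation loss: a weighted sum of the
    hard-label cross-entropy and the KL divergence between
    temperature-softened teacher and student distributions."""
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    # The T**2 factor rescales gradients so the soft term's magnitude
    # stays comparable across temperatures.
    distill = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1 - alpha) * hard
```

Higher `T` exposes more of the teacher's "dark knowledge" in the non-argmax classes; `alpha` trades that signal off against the ground-truth labels.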
The US is dominating headlines with frontier AI models, multi-billion-dollar investments and powerful chips, while China is making AI cheaper, widely deployable at home and abroad ...
Microsoft researchers have developed On-Policy Context Distillation (OPCD), a training method that permanently embeds ...
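The snippet above is cut off before it explains what OPCD embeds or how it trains. As background only, here is a hedged sketch of generic on-policy context distillation, the family of methods the name suggests: the same model, prompted with a context, supervises itself when prompted without it, on sequences it sampled itself. Nothing here is taken from Microsoft's recipe; the HuggingFace-style `generate`/`logits` interface, the function name, and all hyperparameters are assumptions.

```python
import torch
import torch.nn.functional as F

def context_distillation_step(model, ctx_ids, bare_ids, gen_len=64, T=1.0):
    """One generic context-distillation step, assuming a HuggingFace-style
    causal LM. 'On-policy' here means the training sequences are sampled
    from the student's own (context-free) distribution."""
    # Sample a continuation on-policy from the context-free prompt.
    with torch.no_grad():
        sampled = model.generate(bare_ids, max_new_tokens=gen_len,
                                 do_sample=True)
    cont = sampled[:, bare_ids.shape[1]:]
    c = cont.shape[1]

    # Teacher pass: same weights, but with the full context prepended.
    # logits[:, i] predicts token i+1, so the predictions for the
    # continuation tokens live at positions [-c-1:-1].
    with torch.no_grad():
        teacher_logits = model(
            torch.cat([ctx_ids, cont], dim=1)).logits[:, -c - 1:-1]

    # Student pass: no context; match the teacher's token distributions.
    student_logits = model(
        torch.cat([bare_ids, cont], dim=1)).logits[:, -c - 1:-1]

    return F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean")
```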
Artificial intelligence developers are accusing Chinese firms of stealing their intellectual property following a spate of ‘distillation attacks’, despite their own alleged theft of training data.
Recently, two of the world's most important artificial intelligence (AI) companies, Google and OpenAI, have launched a ...
This month Anthropic and OpenAI each disclosed evidence that leading Chinese AI labs have illicitly used American models to ...
Anthropic accused three Chinese artificial intelligence enterprises of engaging in coordinated distillation campaigns, the ...
Anthropic alleges Chinese AI labs including DeepSeek, Moonshot and MiniMax used fake accounts to distill Claude, raising new concerns about AI model theft, proxies and U.S. export controls.
Anthropic said it is investing heavily in defences designed to make distillation attacks harder to execute and easier to identify.
Anthropic is accusing three Chinese artificial intelligence companies of "industrial-scale campaigns" to "illicitly extract" ...