The main cost is training the first version of the model. It is very easy to just tr...

bastawhiz · 2026-06-17T22:25:29 1781735129

That doesn't make any sense. If the cost is training, the cost is compute, not data. Distilling another model means paying for that model, which isn't obviously less expensive than crawling the web and curating a dataset. Regardless of where your data comes from, the training process costs the same.

what · 2026-06-18T00:39:33 1781743173

> paying for that model

They probably just used 100000000 free plans.

mswphd · 2026-06-17T22:53:46 1781736826

This is a justification for why one frontier lab would have extremely high spend rates. There is more than one frontier lab, though.

More explicitly: if the Chinese can get near-leading performance models at a fraction of the cost by stealing from Anthropic, why doesn't OpenAI? Lord knows they could use the extra cash.