MI Történik?

Artificial intelligence news in Hungarian, updated daily


Mistral released the Ministral 3 model family, built with cascade distillation

Mistral AI compressed Mistral Small 3.1 into much smaller versions, yielding a family of relatively small, open-weights vision-language models that, by some measures, outperform competing models of similar size. The method combines pruning and distillation. The company released weights for the Ministral 3 family at parameter counts of 14 billion, 8 billion, and 3 billion, and each size comes in base, instruction-tuned, and reasoning variants.

The team detailed its distillation recipe in a paper. Starting with a larger parent model, it alternately pruned (removed less-important parameters) and distilled (trained a smaller model to mimic the larger model's outputs) the model into progressively smaller children.
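To make the alternating prune-and-distill loop concrete, here is a minimal, hypothetical sketch in PyTorch: a small MLP stands in for the parent model, prune_width drops the hidden units whose incoming weights matter least, and distill trains each pruned child to match its parent's softened outputs before that child becomes the parent of the next, smaller model. The architecture, importance score, loss, and sizes are illustrative assumptions, not Mistral's published recipe.

```python
# Illustrative sketch of cascade distillation: alternately prune a parent's
# hidden width and distill the pruned child on the parent's outputs.
# Model, data, and hyperparameters are toy placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MLP(nn.Module):
    def __init__(self, d_in, d_hidden, d_out):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))


def prune_width(parent: MLP, new_hidden: int) -> MLP:
    """Structured pruning: keep the hidden units whose incoming weights
    have the largest L2 norm, and copy them into a narrower child."""
    importance = parent.fc1.weight.norm(dim=1)        # one score per hidden unit
    keep = importance.topk(new_hidden).indices.sort().values
    child = MLP(parent.fc1.in_features, new_hidden, parent.fc2.out_features)
    with torch.no_grad():
        child.fc1.weight.copy_(parent.fc1.weight[keep])
        child.fc1.bias.copy_(parent.fc1.bias[keep])
        child.fc2.weight.copy_(parent.fc2.weight[:, keep])
        child.fc2.bias.copy_(parent.fc2.bias)
    return child


def distill(teacher: MLP, student: MLP, steps=200, batch=64, d_in=32, T=2.0):
    """Train the student to match the teacher's softened output distribution."""
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    teacher.eval()
    for _ in range(steps):
        x = torch.randn(batch, d_in)                  # stand-in for real training data
        with torch.no_grad():
            t_logits = teacher(x)
        s_logits = student(x)
        loss = F.kl_div(
            F.log_softmax(s_logits / T, dim=-1),
            F.softmax(t_logits / T, dim=-1),
            reduction="batchmean",
        ) * T * T
        opt.zero_grad()
        loss.backward()
        opt.step()
    return student


# Cascade: each child becomes the parent of the next, smaller child.
parent = MLP(d_in=32, d_hidden=256, d_out=10)
for hidden in (128, 64, 32):                          # stand-ins for 14B / 8B / 3B
    child = prune_width(parent, hidden)
    child = distill(parent, child)
    parent = child
```

Each stage reuses the previous child's weights as its starting point, which is what lets the cascade produce several sizes more cheaply than training each one from scratch.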
Why it matters

Cascade distillation offers a way to produce a high-performance model family from a single parent at a fraction of the usual cost. The training runs were shorter, and the algorithm is relatively simple, potentially enabling developers to build multiple model sizes without proportionately higher training costs.

View the original source (English) →