The version numbers are a little confusing and deserve some explanation.
Internally, we are working on version 9 of our new foundation model, which is 1.5T params. This is substantially better in every way than v8: data curation, training recipe, size, etc. It is also optimized to run on Blackwells.
The public-facing v4.2 is based on foundation model v8, trained on Hoppers, with significant shortfalls in training data quality, comprehensiveness and proportionality. It is also only 0.5T params in size.
The difference between Grok foundation model 8 and 9 is gigantic.