@Datta Nimmaturi @Gavrish Prabhu. 2 March 2024
A new model from mistral.ai. It ranks just behind GPT-4 on MMLU. Though I’m not a real fan of MMLU, this achievement is still commendable. There isn’t a paper yet. Maybe soon?
Also, this model correctly answers the “Pound of feathers vs Kilo of bricks” question but fails to answer LeCun’s 7 gears problem.
And as usual, this is instruction tuned and also tuned for function calling.
With this, Mistral is slowly moving away from open source (open weights) AI. They rebranded their models to Tiny (7B), Small (8x7B), Medium (70B) and Large.
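For anyone who wants to check the feathers-vs-bricks question themselves, here is a tiny sketch of the arithmetic. It assumes the question means one avoirdupois pound of feathers against one kilogram of bricks; the variable names are just for illustration.

```python
# One avoirdupois pound is defined as exactly 0.45359237 kg.
KG_PER_POUND = 0.45359237

feathers_kg = 1 * KG_PER_POUND  # a pound of feathers, converted to kg
bricks_kg = 1.0                 # a kilo of bricks

# The material is a distraction; only the units matter.
heavier = "bricks" if bricks_kg > feathers_kg else "feathers"
print(heavier)
```

The trap is that the well-known riddle compares a pound with a pound, so a model that pattern-matches on the riddle answers "they weigh the same".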
An 11B param foundation model trained on videos from the internet. This model generates playable (imagine Mario or Sonic) worlds from text or an image/photograph/sketch as input. The videos aren’t annotated with actions, but the model still learns the mapping from actions to what happens next on screen.
The name Genie appears to be an amalgamation of Generative Interactive Environments.
This is a first step toward generating action-video pairs for training agents. And things will only get better from here on. Remember, CLIP was used in training DALL-E 2.
The best part is, this isn’t just restricted to game agents. You can take videos of robots performing tasks and this model can learn the action space mapping to emulate that. In fact, they mention the same in the paper. Cool, huh?
Another emergent capability is emulating parallax: nearer objects move more than farther ones. In fact, if you look closely at Sora generated videos, you can observe the same. This has existed as a side effect or emergent capability ever since the days of GAN based video generators.
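To make the parallax point concrete, here is a minimal sketch using the textbook pinhole camera model. This is not how Genie or Sora work internally (they learn the effect from data). The function name and the scene numbers are made up for illustration.

```python
def image_shift(focal_px: float, camera_shift_m: float, depth_m: float) -> float:
    """Image shift in pixels of a static point when a pinhole camera
    slides sideways by camera_shift_m; the point sits depth_m away."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    # Pinhole projection: on-screen shift is inversely proportional to depth.
    return focal_px * camera_shift_m / depth_m


# Hypothetical scene: 500 px focal length, camera slides 0.1 m sideways.
near = image_shift(500.0, 0.1, 2.0)   # object 2 m away
far = image_shift(500.0, 0.1, 20.0)   # object 20 m away
print(near, far)
```

The object that is ten times closer moves ten times as far on screen. That is the cue a video model has to reproduce for a scrolling background to look right.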