Monday, May 27, 2024

Foundation models in generative AI

 Scaling theory podcast with Yann LeCun

Foundation models will be customised per use case instead of one giant catch-all model spanning all languages. Building AI models is faster and cheaper than you probably think. Y Combinator companies used two levers to reduce computation: better architectures or less data.

Under EU law, models are presumed to have high impact when the cumulative amount of compute used for training exceeds 10^25 floating point operations (FLOPs).[23] But as noted just above, model builders will try to use as little compute as possible.
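To get a feel for that threshold, a rough back-of-the-envelope sketch using the common approximation that training compute is about 6 × parameters × tokens (the figures below are hypothetical examples, not any specific model's actual numbers):

```python
# Rough training-compute estimate via the common approximation:
# FLOPs ≈ 6 * N * D, where N = parameter count, D = training tokens.
EU_THRESHOLD = 1e25  # FLOPs threshold cited in the EU AI Act

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute in FLOPs."""
    return 6 * params * tokens

# Hypothetical example: a 70B-parameter model trained on 2T tokens
flops = training_flops(70e9, 2e12)
print(f"{flops:.2e} FLOPs")            # ~8.4e23 FLOPs
print(flops > EU_THRESHOLD)           # False: still well under 10^25
```

Even a fairly large model in this sketch sits an order of magnitude below the threshold, which is why efficiency-focused builders may simply stay under it.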

Regulation could be a threat to Meta's open models. Open models imply oversight and hence safer AI.

open models

open software stack

open OS - Linux servers, Apache server-side frameworks

PyTorch is open

Fine-tuning foundation models per language is a real task, though. John Schulman, in a chat with Dwarkesh Patel, mentioned an interesting finding: if you do all your fine-tuning with English data, the model will automatically behave well in other languages. This can be extended to robots too. The collaborators' theory is that learning about the physical world in one robot body should help an AI operate another, in the same way that learning in English can help a language model generate Chinese, because the underlying concepts about the world that the words describe are the same.

Schulman says a version of this holds with multimodal data: if you do text-only fine-tuning, you also get reasonable behavior with images.

It's language time. All languages need to provide their data open source. If linguists can point out common rules of language, this can go further and faster.

Small-scale AI startups fine-tuning a foundation model should show a figure of merit.

Is a vision foundation model the future, instead of ever more billions of parameters? Consider that humans learn far more from far less data: a four-year-old knows more about the world than models trained on vastly larger datasets over a few years.

