LLaMA: Open and Efficient Foundation Language Models
Watch: Yannic’s Video
Note: However, recent work from Hoffmann et al. (2022) shows that, for a given compute budget, the best performance is achieved not by the largest models, but by smaller models trained on more data.
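This compute-optimal trade-off can be sketched numerically. Hoffmann et al. (2022) model training cost as roughly C ≈ 6·N·D FLOPs (N parameters, D training tokens), and their results imply a compute-optimal ratio of about 20 tokens per parameter. The sketch below is an approximation under those two assumptions, not the paper's exact fitted scaling law; the function name and the ratio constant are illustrative.

```python
def chinchilla_optimal(compute_budget_flops, tokens_per_param=20):
    """Approximate compute-optimal model/data sizes (Hoffmann et al., 2022).

    Assumes training cost C ~ 6 * N * D FLOPs and a compute-optimal
    token-to-parameter ratio D/N ~ 20 (both are rough approximations).
    """
    # From C = 6 * N * (20 * N)  =>  N = sqrt(C / 120)
    n_params = (compute_budget_flops / (6 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Roughly Chinchilla's training budget (~5.76e23 FLOPs) recovers a model
# close to its reported size: ~70B parameters trained on ~1.4T tokens.
n, d = chinchilla_optimal(5.76e23)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")
```

Under this approximation, doubling compute increases both model size and data size by about sqrt(2), rather than putting all extra compute into a larger model.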