Why are LLMs so small?

source link: https://willschenk.com/fragments/2024/why_are_ll_ms_so_small/

March 1, 2024 9:42 am

LLMs compress information in a wildly different way than anything I understand. If we compare a few open-source LLMs to Wikipedia, they all weigh in at roughly 20-25% of the size of compressed English Wikipedia. And yet you can ask the LLM questions, it can, in a sense, reason about things, and it knows how to code.

NAME            SIZE
gemma:7b        5.2 GB
llava:latest    4.7 GB
mistral:7b      4.1 GB
zephyr:latest   4.1 GB

Contrast that with the size of compressed English Wikipedia: 22 GB, and that's without media or images.
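
As a quick back-of-the-envelope check on that ratio, here is a minimal Python sketch using only the sizes quoted above; nothing here comes from the original post beyond those numbers:

```python
# Compare each model's size to compressed English Wikipedia (~22 GB, text only).
models_gb = {
    "gemma:7b": 5.2,
    "llava:latest": 4.7,
    "mistral:7b": 4.1,
    "zephyr:latest": 4.1,
}
WIKIPEDIA_GB = 22.0

for name, size in models_gb.items():
    # Print each model's size as a fraction of the Wikipedia dump.
    print(f"{name:14s} {size:.1f} GB  ({size / WIKIPEDIA_GB:.0%} of Wikipedia)")
```

Running it, each model lands between roughly 19% and 24% of the compressed encyclopedia.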

Shannon entropy is a measure of information density, and whatever happens during LLM training gets a lot closer to that limit than our current ways of sharing information do.
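
For a concrete sense of what Shannon entropy measures, here is a minimal sketch that estimates the entropy of a byte stream from its frequency distribution; the sample string is my own illustration, not anything from the post:

```python
import math
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy H = -sum(p * log2 p) over byte frequencies."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

sample = ("the quick brown fox jumps over the lazy dog " * 10).encode("utf-8")
print(f"{entropy_bits_per_byte(sample):.2f} bits per byte")
# A frequency-only estimate like this ignores context; English text comes out
# around 4 bits/byte here, while good compressors (and, presumably, whatever
# LLM training does) exploit longer-range structure to get much lower.
```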

