Why are LLMs so small?

source link: https://willschenk.com/fragments/2024/why_are_ll_ms_so_small/

March 1, 2024 9:42 am

LLMs compress information in a wildly different way than anything I understand. If we compare a few open-source LLMs to Wikipedia, they all weigh in at roughly 20-25% of the size of compressed English Wikipedia. And yet you can ask the LLM questions, it can, in a sense, reason about things, and it knows how to code.

NAME            SIZE
gemma:7b        5.2 GB
llava:latest    4.7 GB
mistral:7b      4.1 GB
zephyr:latest   4.1 GB

Contrast that with the size of compressed English Wikipedia: 22 GB, and that's without media or images.
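
As a quick back-of-the-envelope check on that ratio, here is a minimal Python sketch using only the sizes quoted above; nothing here comes from the original post beyond those numbers:

```python
# Compare each model's size to compressed English Wikipedia (~22 GB, text only).
models_gb = {
    "gemma:7b": 5.2,
    "llava:latest": 4.7,
    "mistral:7b": 4.1,
    "zephyr:latest": 4.1,
}
WIKIPEDIA_GB = 22.0

for name, size in models_gb.items():
    # Print each model's size as a fraction of the Wikipedia dump.
    print(f"{name:14s} {size:.1f} GB  ({size / WIKIPEDIA_GB:.0%} of Wikipedia)")
```

Running it, each model lands between roughly 19% and 24% of the compressed encyclopedia.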

Shannon entropy is a measure of information density, and whatever happens during LLM training gets a lot closer to that limit than our current ways of sharing information do.
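
For a concrete sense of what Shannon entropy measures, here is a minimal sketch that estimates the entropy of a byte stream from its frequency distribution; the sample string is my own illustration, not anything from the post:

```python
import math
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy H = -sum(p * log2 p) over byte frequencies."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

sample = ("the quick brown fox jumps over the lazy dog " * 10).encode("utf-8")
print(f"{entropy_bits_per_byte(sample):.2f} bits per byte")
# A frequency-only estimate like this ignores context; English text comes out
# around 4 bits/byte here, while good compressors (and, presumably, whatever
# LLM training does) exploit longer-range structure to get much lower.
```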

