Mistral 7B, billed as the strongest 7B model so far

Saw "Mistral 7B (mistral.ai)" on Hacker News. Mistral 7B is currently billed as the strongest 7B model.

The announcement claims it beats Llama 2 13B on every benchmark and Llama 1 34B on many of them:

Outperforms Llama 2 13B on all benchmarks
Outperforms Llama 1 34B on many benchmarks

The important part is that it is released under an open source license, specifically the Apache License, Version 2.0:

We’re releasing Mistral 7B under the Apache 2.0 license, it can be used without restrictions.

A model of this size can run on a CPU, and someone promptly pushed a patch to llama.cpp: "Added the fact that llama.cpp supports Mistral AI release 0.1 #3362".
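As a rough back-of-the-envelope check on why a 7B model is CPU-friendly, the sketch below estimates the memory needed for the weights alone at a few precisions. The parameter count (exactly 7e9) and the flat bits-per-weight figures are assumptions for illustration, not numbers from the Mistral announcement; real quantized files carry extra metadata, and inference needs additional memory beyond the weights.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: "7B" is taken as roughly 7 billion parameters.
N_PARAMS = 7e9

# Assumption: idealized precisions with no per-block metadata overhead.
for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions the 4-bit estimate lands at a few GiB, which fits in the RAM of an ordinary desktop or laptop.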

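As I recall, the output from Llama 2 13B was still a bit hit-or-miss, but if Mistral 7B really beats it across the board, the quality may be worth looking forward to...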


Author: Gea-Suan Lin
Posted on: September 28, 2023
Categories: Computer, Murmuring
Tags: 7b, ai, language, large, learning, llm, machine, mistral, model
