Mistral 推出新的 Mixtral 8x7B，另外也開始提供付費 API 服務

首先是「Mixtral of experts」這個公告，Mistral 推出了新的 model，叫做 Mixtral 8x7B。這是一個 46.7B 的 model，但計算每個 token 時只需要計算 12.9B 的值：

Concretely, Mixtral has 46.7B total parameters but only uses 12.9B parameters per token. It, therefore, processes input and generates output at the same speed and for the same cost as a 12.9B model.

這就可以大幅降低計算需要的量，卻可以達到與 Llama 2 70B 或是 GPT-3.5 同一個等級的品質：

另外在「Mistral: Our first AI endpoints are available in early access (mistral.ai)」這邊看到官方推出了 API，正式的公告在「La plateforme」這邊，同時價錢也已經出來了：「Pricing」。

依照官方的說明，Mistral Small 用的是上面提到的 Mixtral 8x7B，所以表現的數字是一樣的。另外一個 Mistral Medium 沒有提到太多細節，只丟出簡單的說明：

然後 API 看起來會相容 OpenAI 家的 API：

Our API follows the specifications of the popular chat interface initially proposed by our dearest competitor.

Hacker News 上面的 id=38599156 有整理出重點，把價格數字都換算成 1m output 的費用，可以看到表現相近的 Mistral Small (Mixtral 8x7B) 與 GPT-3.5 在價錢上也差不多：

Per 1 million output tokens:
Mistral-medium $8
Mistral-small $1.94
gpt-3.5-turbo-1106 $2
gpt-4-1106-preview $30
gpt-4 $60
gpt-4-32k $120

沒有拿 GPT-4 的數字來比，代表 Mistral Medium 與 GPT-4 有一定的差距，這點也可以從價錢上面看出來。

不過我更在意的是 Mistral Medium 的 model 會不會放出來？

用 llama.cpp 玩 Mistral 7B Instruct，補一下 llama.cpp 的發展

看到「Workers AI Update: Hello Mistral 7B」這篇想到的，先前有提到「號稱目前最強的 Mistral 7B」，加上有一陣子沒看 llama.cpp 最近的發展，跳下去重新測試時發現有不少進展。一個比較大的進展是 llama.cpp 推出 gguf 格式，取代之前的 ggml 格式。新的格式可以想像是在檔案裡面放了通用性的 feature flag，就不會遇到新的 model 用到新的方法，沒辦法在 ggml 裡面指定 + 新增 feature，就得把 llama.cpp 整包 fork 拉出出去大改。這差不多是三個月前的事情，蠻多 model 都已經支援了，像是 maddes8cht 這邊就整理了很多 OSL model (open source license) 可以直接下載下來用，不需要自己轉檔。像是 Falcon 40B 與標題提到的 Mistral 7B，以及對應的 Instruct 版本…

November 23, 2023

In "Computer"

Georgi Gerganov 給了在 AWS 上面用 GPU instance 跑 llama.cpp 的說明

Georgi Gerganov 寫了一篇怎麼在 AWS 上面用 GPU instance 跑 llama.cpp 的說明：「Using llama.cpp with AWS instances #4225」。先跳到最後面的懶人套件，直接提供了 shell script 幫你弄完： bash -c "$(curl -s https://ggml.ai/server-llm.sh)" 回到開頭的部分，機器的選擇上面，他選了一台最便宜的 4 vCPU + 16GB RAM + 16GB VRAM 的機器來跑。然後他提到了 OpenHermes-2.5-Mistral-7B 這個模型最近很紅，也許有機會看一下： We have just 16GB VRAM to work with, so we likely want to…

November 28, 2023

In "API"

uBO Lite：另外一個方向的嘗試

兩個禮拜前在 Hacker News 上看到的東西，算是 uBlock Origin 對 Manifest V3 (MV3) 的另外一種嘗試：「uBlock Origin Lite: Description (github.com/gorhill)」，專案的說明在「uBO Lite (uBOL), an experimental permission-less MV3 API-based content blocker.」這邊。先前在「因應 Manifest V3 而推出的 uBlock Minus (MV3)」這邊提到的 uBlock Minus 是在 MV3 環境下的一個嘗試，但這個版本只是把 MV3 做不到的事情先拔掉，所以缺了很多重要的功能，像是 cosmetic filtering (主要是針對瀏覽器不支援的 css selector，像是最近才剛支援的 :has()，而這些 css selector 對於選擇要幹掉的 html 元素很好用)。 uBO…

October 4, 2022

In "Browser"

Author Gea-Suan LinPosted on December 12, 2023Categories API, Computer, Murmuring, Network, ServiceTags 8x7b, ai, api, dnn, gpt, learning, machine, medium, mistral, mixtral, model, network, neural, nn, service, small

Your email address will not be published. Required fields are marked *

Comment *

Name *

Email *

Website

Notify me of follow-up comments by email.

Notify me of new posts by email.

To respond on your own website, enter the URL of your response which should contain a link to this post's permalink URL. Your response will then appear (possibly after moderation) on this page. Want to update or remove your response? Update or delete your post and re-enter your post's URL again. (Learn More)

Mistral 推出新的 Mixtral 8x7B，另外也開始提供付費 API 服務

Mistral 推出新的 Mixtral 8x7B，另外也開始提供付費 API 服務

Related

用 llama.cpp 玩 Mistral 7B Instruct，補一下 llama.cpp 的發展

Georgi Gerganov 給了在 AWS 上面用 GPU instance 跑 llama.cpp 的說明

uBO Lite：另外一個方向的嘗試

Leave a Reply

Post navigation

Recommend

pydantic/pydantic/_internal/_typing_extra.py at d7ab7d6e1f9bc694d47c82d786cf3051...

抖音上的虚拟手机号怎么弄？抖音中国虚拟手机号注册流程

纳斯达克联席总裁：公司注意到机构对加密货币兴趣减弱

Tim Sweeney: Why Epic did better against Google than Apple in court

UX Design and the power and challenges of AI and ML Integration

快速入门：使用 .NET Aspire 组件实现缓存

Everyone will have a company of 10000 experts: Sam Altman #atomicIdeas

中瓴智行获得战略投资

想上市的杨国福，让网友破防：合成肉、“蝙蝠”触目惊心

Facing the reality of AI

About Joyk