
UC Berkeley Releases A Free Alternative Of Meta’s LLaMA

Source: https://www.theinsaneapp.com/2023/05/open-llama.html

Researchers at UC Berkeley have unveiled OpenLLaMA, a 7B-parameter open-source reproduction of Meta’s LLaMA language model. The researchers trained the model on the RedPajama dataset; the current preview checkpoint has seen 200 billion tokens.

The model’s weights are available in both PyTorch and JAX formats under the permissive Apache 2.0 license. With this release, the many non-commercial fine-tunes built on top of LLaMA can be reproduced on a base model that allows redistribution.
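As a quick illustration, the PyTorch weights can be loaded through Hugging Face Transformers. This is a minimal sketch, not from the original announcement; the hub identifier openlm-research/open_llama_7b is an assumption and should be replaced with whatever checkpoint name the repo documents:

```python
# Hedged sketch: load the OpenLLaMA PyTorch weights with Hugging Face
# Transformers. The hub id "openlm-research/open_llama_7b" is an
# assumption, not stated in this article.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_7b")
model = LlamaForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b",
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```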

The full RedPajama dataset comprises a staggering 1.2 trillion tokens, matching the size of LLaMA’s original training data. The researchers trained OpenLLaMA on a cloud TPU v4 pod, combining data parallelism with fully sharded data parallelism (FSDP, also known as ZeRO stage 3) to balance memory usage and throughput. The training run sustained over 1,900 tokens per second per TPU v4 chip.
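To make the parallelism strategy concrete, here is a minimal JAX sketch of the FSDP/ZeRO-3 idea: parameters and batch are sharded along the same device axis, and the compiler gathers weight shards on demand. This is an illustration under invented shapes, not the authors’ training code:

```python
import jax
import jax.numpy as jnp
from jax.experimental import mesh_utils
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# One 1-D device mesh axis serves both roles: the batch is split across
# it (data parallelism) and so are the parameters (the core of
# FSDP / ZeRO stage 3).
mesh = Mesh(mesh_utils.create_device_mesh((jax.device_count(),)), ("data",))

# Toy "layer": a weight matrix sharded along its first dimension,
# so each device stores only a 1/N slice of the parameters.
w = jax.device_put(jnp.ones((1024, 1024)),
                   NamedSharding(mesh, P("data", None)))

# The batch is sharded along the same axis; its leading dimension
# must be divisible by the device count.
x = jax.device_put(jnp.ones((8, 1024)),
                   NamedSharding(mesh, P("data", None)))

@jax.jit
def forward(x, w):
    # XLA inserts the all-gather of weight shards automatically,
    # mirroring how FSDP materializes full parameters per layer.
    return x @ w

print(forward(x, w).shape)  # (8, 1024), output still sharded across devices
```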

EleutherAI’s lm-evaluation-harness was used to evaluate OpenLLaMA’s performance. The results showed that OpenLLaMA performs comparably to LLaMA and GPT-J on most tasks and even surpasses them on some.
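For reference, an evaluation of this kind might look like the sketch below. The API and backend names vary across harness releases (older versions exposed a main.py CLI and an "hf-causal" backend), and the hub id and task list here are assumptions:

```python
# Hedged sketch of scoring a checkpoint with EleutherAI's
# lm-evaluation-harness; names assume a recent harness release.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",  # Hugging Face causal-LM backend in recent releases
    model_args="pretrained=openlm-research/open_llama_7b",  # assumed hub id
    tasks=["hellaswag", "arc_easy", "piqa"],  # any subset of harness tasks
)
print(results["results"])  # per-task accuracy metrics
```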

The team predicts that OpenLLaMA’s performance will improve significantly once training completes on the full 1 trillion tokens, and evaluations are underway to confirm that the finished model is on par with, or better than, the original in most cases.

The team is also training a 3B model, which will be released shortly after it completes.

Because Meta’s LLaMA weights are bound by a non-commercial research license, models built on them could not be directly redistributed; OpenLLaMA removes that obstacle. There have been numerous attempts to open up these models, and OpenLLaMA is not the first of its kind.

Hugging Face, an open-source AI platform, released HuggingChat, an open-source alternative to ChatGPT, less than two weeks ago. The chatbot uses OpenAssistant’s latest LLaMA-based model, which is distributed as XOR weights: published files that must be combined with the original LLaMA weights to reconstruct the OpenAssistant model, so Meta’s weights are never redistributed directly.
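To illustrate the XOR-weight scheme, here is a minimal sketch of the idea (not OpenAssistant’s actual conversion script; the tensors and values are invented stand-ins):

```python
import numpy as np

# XOR of two byte strings of equal length.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

original = np.random.rand(4, 4).astype(np.float32)  # stand-in for a LLaMA tensor
finetuned = original + 0.01                         # stand-in for the OA tensor

# What gets published: original XOR finetuned. On its own it is noise;
# only someone who already holds the original weights can invert it.
delta = xor_bytes(original.tobytes(), finetuned.tobytes())

# Reconstruction by a user with the original LLaMA weights.
recovered = np.frombuffer(xor_bytes(original.tobytes(), delta),
                          dtype=np.float32).reshape(4, 4)
assert np.array_equal(recovered, finetuned)
```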

Moreover, Databricks worked around the licensing problem entirely with Dolly 2.0. Unlike many other “open source” models, Dolly 2.0 is available for commercial use without paying for API access or sharing data with third parties.

Check out: GitHub Repo and Model Weights
