The LLama Effect: How an Accidental Leak Sparked a Series of Impressive Open Sou...


Next Week in The Sequence

Edge 281: Our series about federated learning (FL) continues with an overview of cross-device FL, Google’s research on FL and differential privacy, and the FedLab framework for FL simulation.
Edge 282: We take a deep dive into LangChain, the hugely popular framework for LLM-based development.

TheSequence is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

📝 Editorial: The LLama Effect: How an Accidental Leak Sparked a Series of Impressive Open Source Alternatives to ChatGPT

The friction between open source and API-based distribution is one of the most interesting battles looming in the generative AI ecosystem. In the text-to-image domain, the release of Stable Diffusion clearly signaled that open source was a viable distribution mechanism for foundation models. However, the same cannot be said of the large language model (LLM) space, where the biggest breakthroughs are coming from models like GPT-4, Claude, and those from Cohere, which are only available via APIs. The open source alternatives to these models haven’t shown the same level of performance, specifically in their ability to follow human instructions. However, an unexpected research breakthrough and a leaked release are starting to change that.

A few weeks ago, Meta AI announced Llama, an LLM designed to advance research in the space. Llama was released in several sizes, including 7B, 13B, 33B, and 65B parameters, and despite being notably smaller than alternative models, it was able to match the performance of GPT-3 across many tasks. Llama was not initially open-sourced, but a week after its release, the model was leaked on 4chan, sparking thousands of downloads.

What could have been seen as an unfortunate incident has become one of the most interesting sources of innovation in the LLM space in the last few weeks. Since the leak of Llama, we have seen an explosion of innovation in LLM agents built on it. Just to cite a few examples:

Stanford University released Alpaca, an instruction-following model based on the Llama 7B model.
Researchers from UC Berkeley, CMU, Stanford, and UC San Diego open sourced Vicuna, a fine-tuned version of Llama that, according to its authors, reaches about 90% of ChatGPT quality as judged by GPT-4.
Berkeley AI Research (BAIR) released Koala, a version of Llama fine-tuned using internet dialogs.
Nebuly open sourced ChatLLama, a framework for creating conversational assistants using your own data.
FreedomGPT is an open source conversational agent based on Alpaca, which is in turn based on Llama.
The Colossal-AI project from UC Berkeley released ColossalChat, a ChatGPT-type model with a complete RLHF pipeline based on Llama.

Several other projects are worth mentioning in this list, and I am sure more will be released soon. One thing is certain: the accidental leak of Llama might have turned out to be one of the biggest sparks of innovation in the open source LLM space.

🔎 ML Research

OpenAI Safety

OpenAI published a detailed blog post outlining some of the principles used to ensure safety in their models. The post emphasizes areas such as privacy, factual accuracy, and harmful content prevention, which are essential for the wide adoption of foundation models —> Read more.

BloombergGPT

Bloomberg published a paper introducing BloombergGPT, a 50-billion-parameter LLM for the financial domain. The model is based on BLOOM and trained on a 363-billion-token financial dataset —> Read more.

Segment Anything

Meta AI published a paper outlining the Segment Anything Model (SAM), a large-scale model for image segmentation. The model was open sourced together with the Segment Anything 1-Billion mask dataset (SA-1B), the largest computer vision segmentation dataset ever released —> Read more.

Koala

Berkeley AI Research (BAIR) released a paper detailing Koala, a dialogue model fine-tuned for academic research. The model is based on Meta AI’s Llama and, according to its authors, is competitive with ChatGPT —> Read more.

BayesOpt for Hyperparameter Optimization

Google Research published a paper that models hyperparameter optimization as a Bayesian optimization problem. The paper proposes Hyper BayesOpt, a hyperparameter optimization algorithm that removes the need to quantify model parameters for Gaussian processes in BayesOpt —> Read more.

🤖 Cool AI Tech Releases

Vicuna

Vicuna is an open source chatbot based on Meta AI’s Llama that, according to its authors, approaches ChatGPT quality —> Read more.

ColossalChat

The team from the Colossal-AI project released ColossalChat, an open source clone of ChatGPT with RLHF capabilities —> Read more.

🛠 Real World ML

Generative AI at LinkedIn

LinkedIn discusses some of the lessons learned and best practices for building generative AI applications —> Read more.

Lyft Recommendations

Lyft discusses the ML models and architecture used in their recommendation systems —> Read more.

📡 AI Radar

AI legends Andrew Ng and Yann LeCun recorded a session expressing their opposition to the AI moratorium proposal.
Quantexa raised $129 million for its AI-based financial fraud prevention platform.
Adthos launched its platform for creating audio ads using generative AI.
Meta discussed their initiatives to use generative AI to create ads.
Robotics company Covariant raised another $75 million.
AI-search company Glean incorporated generative AI capabilities into its search platform.
Leaked documents revealed that Anthropic, an OpenAI competitor, intends to raise about $5 billion over the next two years.

