
Meet 'Smaug-72B': The New King of Open-Source AI - Slashdot

source link: https://news.slashdot.org/story/24/02/07/1753257/meet-smaug-72b-the-new-king-of-open-source-ai



Meet 'Smaug-72B': The New King of Open-Source AI (venturebeat.com) 20

Posted by msmash

on Wednesday February 07, 2024 @03:01PM from the new-king-in-town dept.
An anonymous reader shares a report: A new open-source language model has claimed the throne of the best in the world, according to the latest rankings from Hugging Face, one of the leading platforms for natural language processing (NLP) research and applications.

The model, called "Smaug-72B," was released publicly today by the startup Abacus AI, which helps enterprises solve difficult problems in the artificial intelligence and machine learning space. Smaug-72B is technically a fine-tuned version of "Qwen-72B," another powerful language model that was released just a few months ago by Qwen, a team of researchers at Alibaba Group.

What's most noteworthy about today's release is that Smaug-72B outperforms GPT-3.5 and Mistral Medium, two of the most advanced proprietary large language models developed by OpenAI and Mistral, respectively, in several of the most popular benchmarks. Smaug-72B also surpasses Qwen-72B, the model from which it was derived, by a significant margin in many of these evaluations.
  • by 93 Escort Wagon ( 326346 ) on Wednesday February 07, 2024 @03:03PM (#64222810)

    With a name like "Smaug", it can't help but eventually turn evil...

    Good job, guys.

    • Re:

      One ring of fire to rule them all...

      • Re:

        I fell into a burning ring of fire.

  • by ceoyoyo ( 59147 ) on Wednesday February 07, 2024 @03:13PM (#64222836)

    More impressive are the models that score almost as high, probably within error, and are 1/5th the size.

    • Re:

      My question would be, how does one explain the difference in size? Is it possible that the smaller models ace the tests because their models happen to cover the things that are part of the test? Would these smaller models struggle with a broader range of test inputs or subject matter?

    • Re:

      A word of caution about these leaderboards: you can train to the test. You can literally just download the eval datasets, reformat them, and then use them as datasets in your finetune. I don't trust these leaderboards much at all.

      Also, "open source" is kind of a stretch. It's based on Qwen, whose license is similar to the LLaMA 2 license: viral (generations can only be used to train derivatives of itself), and if any project becomes big enough, you have to negotiate with the owner (in this case, Alibaba),

  • I know it's a trained neural net, but is that simply data or does it include the actual runtime execution software too? Or do they all use the same base software, in the same sense that all Linux binaries require the Linux kernel to run?

    • They distribute the model in the standardized Safetensors format. You can actually run the model on any software that can load a text model like that. Most of them are Python-based because of the easier GPU compute access.
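As a rough illustration of why such a file is "just data": a Safetensors file starts with an 8-byte little-endian header length, followed by a JSON header describing each tensor (dtype, shape, byte offsets), followed by the raw tensor bytes. The sketch below uses only the Python standard library to write a toy file in that layout and read it back. The helper names and the toy tensors are hypothetical, it handles only float32 vectors, and it is not the official `safetensors` library; real files can also carry an optional `__metadata__` entry that this sketch ignores.

```python
import json
import struct
import tempfile
from pathlib import Path


def write_tiny_safetensors(path, tensors):
    """Write 1-D float32 tensors in the Safetensors layout (toy helper)."""
    header = {}
    payload = b""
    for name, values in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": [len(values)],
            # Offsets are relative to the start of the data section.
            "data_offsets": [len(payload), len(payload) + len(raw)],
        }
        payload += raw
    header_bytes = json.dumps(header).encode("utf-8")
    # 8-byte little-endian header length, then JSON header, then raw data.
    Path(path).write_bytes(struct.pack("<Q", len(header_bytes)) + header_bytes + payload)


def read_safetensors_header(path):
    """Return the JSON header without touching the tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def read_f32_tensor(path, name):
    """Read one float32 tensor by name as a list of Python floats."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        start, end = header[name]["data_offsets"]
        f.seek(8 + header_len + start)
        raw = f.read(end - start)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


if __name__ == "__main__":
    demo_path = Path(tempfile.mkdtemp()) / "tiny.safetensors"
    write_tiny_safetensors(demo_path, {"w": [0.5, -1.25], "b": [2.0]})
    print(read_safetensors_header(demo_path))
    print(read_f32_tensor(demo_path, "w"))
```

Because the header is plain JSON, any inference program can list the tensors and their shapes before deciding how to load them, which is part of why the format is portable across tools.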

      • Re:

        It's also about 130 GB to download, so unless you have a powerful computer with a lot of memory, it's not really usable. There are other 7-billion-parameter models that are just as useful.

        • Re:

          Sure, 7-billion-parameter models are smaller. This one has 72 billion. I don't think the whole model has to be loaded into RAM with the way this is set up.

          • Re:

            Yes, as a general rule, the whole model has to be loaded (there are some setups out there for having layers out on disk, but if you think running on system memory is slow...).

            Note that you can get quantizations of most models, which are much, much smaller.

        • Re:

          "powerful computer and memory" needs to be defined in some system requirements area.
          In general, I think these models need a lot of work to be made easier for people to understand and set up. I've had success with Stable Diffusion and its numerous models, running them locally to generate some nice images, but that's mostly thanks to excellent step-by-step documentation I was able to find.

          I looked at Smaug's ancestor:

          That is all. I admit I understand almost none of it.

          • Re:

            "Powerful computer" is an understatement at 72 billion parameters. There's a reason there are no simple how-to instructions: the only people running these are experts already. Even 24GB of VRAM is probably half the minimum size.

            • Re:

              I see. That makes sense.

              • Re:

                But the instructions seem to relate to using it with PyTorch, which is a pretty widely used tool for running inference on models.

        • Re:

          This is like some weird slap in the face from the universe. My new PC has 128GB of memory. :)

        • Re:

          You don't have to download the 16-bit versions of the models if you're just doing inference. Grab the 5-bit quantized version, which drops the size down to only about 40 GB or so. There is very little quality difference between 16 and 5 bits.

          With something like llama.cpp you can split roughly 40 GB between RAM/VRAM so part runs on the GPU and the rest spills over to the CPU.

          If you don't have a huge system 70B models will be slow but people should at least still be able to use it without spending a fortune on hardware. Ove
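The sizes quoted in this thread follow from simple arithmetic: parameters times bits per weight, divided by 8, gives bytes. The sketch below works that out and then estimates how many layers could sit in VRAM when the rest spills to system RAM. Several things here are assumptions rather than facts from the thread: the function names are hypothetical, the 80-layer count and the 2 GB VRAM reserve are illustrative, layers are treated as equal in size, and real quantized files carry some per-block overhead, so actual downloads will differ somewhat.

```python
def model_size_gb(n_params, bits_per_weight):
    """Approximate weight storage in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def layers_on_gpu(model_gb, n_layers, vram_gb, reserve_gb=2.0):
    """Estimate how many equally sized layers fit in VRAM.

    reserve_gb is an assumed allowance for context and scratch buffers.
    """
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb // per_layer_gb))


if __name__ == "__main__":
    n_params = 72e9  # 72 billion parameters, as stated in the story
    for bits in (16, 8, 5, 4):
        print(f"{bits:>2}-bit: ~{model_size_gb(n_params, bits):.0f} GB")

    # Assumed figures: an 80-layer model and a 24 GB GPU.
    q5_gb = model_size_gb(n_params, 5)
    print("layers on GPU:", layers_on_gpu(q5_gb, n_layers=80, vram_gb=24.0))
```

With llama.cpp, an estimate like this would typically feed the `--n-gpu-layers` option; the remaining layers run on the CPU from system RAM, which is what makes large models usable, if slow, on modest hardware.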

    • Re:

      A model is a big equation. The parameters are the unknown variables. Training is estimating values for them. When somebody says "the model has 72 billion parameters and you can download it here" what they mean is that there are 72 billion of those variables, estimated values for them are stored in a file, and you can download that file.

      You also need some code to take those parameters, substitute them into the equation, and evaluate it. The evaluation is pretty standard, but you need a bit of code that acuta
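To make the "parameters in a file plus a bit of code" split concrete, here is a deliberately tiny sketch. The "model" is the equation y = w*x + b, training estimates w and b by least squares, the estimates are stored in a JSON file, and a separate function loads them and evaluates the equation. Everything here (function names, the JSON format, the toy data) is hypothetical and chosen for illustration; a real language model has billions of parameters and a far larger equation, but the division of labor is the same.

```python
import json
import tempfile
from pathlib import Path


def train(xs, ys):
    """Estimate w and b for y = w*x + b by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b = mean_y - w * mean_x
    return {"w": w, "b": b}


def save_params(params, path):
    """The 'model download': just the estimated values, no code."""
    Path(path).write_text(json.dumps(params))


def load_params(path):
    return json.loads(Path(path).read_text())


def evaluate(params, x):
    """The 'inference code': substitute the parameters and evaluate."""
    return params["w"] * x + params["b"]


if __name__ == "__main__":
    path = Path(tempfile.mkdtemp()) / "params.json"
    save_params(train([0, 1, 2, 3], [1, 3, 5, 7]), path)
    print(evaluate(load_params(path), 10))
```

Anyone with the parameter file and a compatible `evaluate` can run the model, which is why the same downloaded weights work across different inference programs.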

    • Re:

      Any inference software that supports Qwen. If you're looking for a web interface, try text-generation-webui.

