Google Unveils MusicLM, an AI That Can Generate Music from Text Prompts

source link: https://www.infoq.com/news/2023/02/google-musiclm-ai-music/?itm_source=infoq&itm_medium=popular_widget&itm_campaign=popular_content_list&itm_content=

Feb 01, 2023

Google researchers have introduced MusicLM, an AI model that can generate high-fidelity music from text. MusicLM treats conditional music generation as a hierarchical sequence-to-sequence modeling problem and generates music at 24 kHz that remains consistent over several minutes.

According to the research paper, MusicLM was trained on a dataset of 280,000 hours of music to produce songs that make sense for complex descriptions. The researchers also claim their model outperforms previous systems both in audio quality and adherence to the text description.

MusicLM samples include five-minute pieces generated from only one or two words, such as "melodic techno", as well as 30-second samples that sound like entire songs and are generated from paragraph-long descriptions that prescribe a genre, a vibe, and even specific instruments.

MusicLM can also turn a sequence of written descriptions into a musical story or narrative, and it can build on existing melodies, whether they are whistled, hummed, sung, or played on an instrument.

AI-generated music has a long history and has been credited with writing hit songs and enhancing live performances. In one more recent approach, the AI image generator Stable Diffusion is used to turn written prompts into spectrograms, which are then converted into music.

Text-to-image machine learning, where tools such as Stable Diffusion and OpenAI's DALL-E have sparked a surge of interest from the general public, is said to owe much of its recent progress to large datasets. AI music generation, by contrast, is held back by the scarcity of paired audio and text data. The fact that music is structured along a temporal dimension presents another difficulty. Consequently, it is far harder to convey the intention of a music track with simple text than it is to describe a still image.

As with its earlier ventures into this kind of AI, Google is being more cautious with MusicLM than some of its competitors may be with comparable technology. The paper ends with the statement, "We have no plans to release models at this point."

About the Author

Daniel Dominguez

Daniel has worked in the software industry for over 14 years, gaining experience in product development for companies ranging from Silicon Valley startups to Fortune 500 firms. As a seasoned engineer, he is passionate about using cloud computing to deliver innovative software solutions. In addition to software product management, he is also involved in artificial intelligence and machine learning.
