Can AI work backwards from a text description to generate a cohesive song? That’s the premise of MusicLM, the AI-powered music creation tool Google released yesterday to kick off its I/O conference.
MusicLM, which was trained on hundreds of thousands of hours of audio to learn how to make new music in different styles, is available in preview through Google’s AI Test Kitchen app. I’ve been playing with it for the past day, as have some of my colleagues.
The verdict? Let’s just say MusicLM isn’t coming for musicians’ jobs anytime soon.
Using MusicLM in Test Kitchen is quite simple. Once you’ve been approved for entry, you’ll be greeted with a text box where you can enter a description of the song – as detailed as you like – and let the system generate two versions of the song. Both can be downloaded for offline listening, but Google encourages you to give one of the tracks a thumbs up to help improve the AI’s performance.
When I first covered MusicLM back in January, before it was released, I wrote that the system’s songs sounded something like what a human artist might compose – albeit not necessarily as musically inventive or cohesive. Now I can’t say I fully stand by those words, as it seems clear there was some serious cherry-picking going on with the samples from earlier this year.
Most of the songs I’ve made with MusicLM sound decent at best, and at worst like a four-year-old let loose on a DAW. I’ve mostly stuck to EDM, trying to produce something with structure and a discernible (and ideally pleasant) melody. But no matter how decent (even good!) the beginning of a MusicLM song sounds, there comes a point where it falls apart in a very obvious, musically unpleasant way.
Take this sample, for example, generated using the prompt “EDM song in a light, happy and light-hearted style, good for dancing.” It starts off promising, with a striking bassline and elements of a classic Daft Punk single. But towards the middle of the song it veers far off course, practically into a different genre.
Here’s a piano solo from a simpler prompt – “romantic and emotional piano music.” Parts of it sound perfectly fine, even exceptional, at least as far as the finger work is concerned. But then it is as if the pianist becomes possessed by mania. A jumble of notes later, the song takes a radically different direction, as if it’s being played from new sheet music – albeit along the lines of the original.
I tried MusicLM’s hand at chiptunes for fun, figuring the AI might have an easier time with songs of more basic construction. No dice. The result (below), while catchy in parts, ended up just as random as the other samples.
On the plus side, MusicLM generally fares much better than Jukebox, OpenAI’s attempt several years ago to create an AI music generator. Unlike MusicLM’s output, the songs Jukebox produced lacked typical musical elements such as repeating choruses and often contained nonsensical lyrics. Songs produced by MusicLM also contain fewer artifacts and generally feel like a step up when it comes to fidelity.
MusicLM’s usefulness is also a bit limited, thanks to artificial restrictions on the prompt side. It won’t generate music featuring specific artists or vocals, or even music in the style of particular musicians. Type a prompt like “along the lines of Barry Manilow” and you’ll just get an error.
The reason is probably legal. After all, deepfake music stands on shaky legal ground, with some in the music industry claiming that AI music generators like MusicLM violate music copyright. It won’t be long before there is some clarity on the matter – several lawsuits making their way through the courts are likely to affect music-generating AI, including one involving the rights of artists whose work is used to train AI systems without their knowledge or consent. Time will tell.