Markus Schwarzer (Cyanite): ‘What you should know before using ChatGPT for your music search’



OpenAI’s chatbot ChatGPT continues to reshape the creative sector. Amid copyright infringement lawsuits, the AI tool has been a central point of focus in the past year. In this guest op-ed, Markus Schwarzer, CEO of Berlin-based AI-powered music analysis and recommendation platform Cyanite, outlines some of the pitfalls and risks of using ChatGPT.

Cyanite’s Markus Schwarzer

ChatGPT has been the driving force in 2023’s AI boom. It has become somewhat synonymous with generative AI models, as one of the most powerful large language models (LLMs). Its accessibility to the public through its free model has given hundreds of millions of people the opportunity to use sophisticated AI assistance in their daily lives.

The ChatGPT API has sparked a new era in human-machine interaction, making it easier to build a powerful language-based search for almost anything — including music. Many companies have since set up their music search using text prompts.

Despite its convenience and user-friendly platform, there are risks to using ChatGPT for your music search. First, ChatGPT’s ability to get access to valuable music data for free is concerning. Second, there are concerns around OpenAI’s dependency on Microsoft. Third, ChatGPT’s progression to a monopoly will make it difficult for other genAI companies to set the tone in the market, and will give OpenAI a uniquely strong negotiating position. Finally, there are ethical concerns about ChatGPT’s use of training data.

Using ChatGPT in music search

Before exploring these risks, it is useful to see how and why people are using ChatGPT in their music search. There are a couple of ways in which it is being used.

1 > In the first scenario, a user tells ChatGPT to translate text prompts into any given tagging taxonomy. For instance: “I need a song that sounds like Pirates of The Caribbean and it needs to feature a trumpet”. ChatGPT does a reasonably good job of translating it into keywords such as orchestral, adventurous, swashbuckling, dramatic, and trumpet. Based on these keywords, the user can search for suitable content. It’s a fairly straightforward music search but only works for very limited use cases.

2 > The second scenario requires a pre-trained AI system that extracts comprehensive information from audio files, in the same way that ChatGPT extracts information from text prompts. Based on this, users can build a more seamless prompt search, comparing the text embedding to the audio embeddings in their music catalog. It’s a bit harder to build but translates into better results.
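The two scenarios above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how any particular product works: the catalog, its tags and the three-dimensional vectors are invented toy data, and the function names `search_by_tags` and `rank_by_embedding` are hypothetical. In a real system the keywords would come from the language model, and the embeddings from pre-trained text and audio models that share one vector space.

```python
import math

def search_by_tags(keywords, tagged_catalog):
    """Scenario 1: rank tracks by how many of the keywords appear in their tags."""
    wanted = {k.lower() for k in keywords}
    scored = [(len(wanted & set(tags)), title) for title, tags in tagged_catalog.items()]
    # Highest overlap first; ties broken alphabetically; drop tracks with no match.
    ordered = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [title for score, title in ordered if score > 0]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_embedding(prompt_embedding, audio_embeddings, top_k=3):
    """Scenario 2: rank tracks by similarity between the prompt vector and each audio vector."""
    scored = [(cosine_similarity(prompt_embedding, emb), title)
              for title, emb in audio_embeddings.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]

# Toy data, invented for illustration only.
tagged_catalog = {
    "Sea Shanty Overture": {"orchestral", "adventurous", "trumpet"},
    "Lo-fi Study Beat": {"chill", "lo-fi"},
    "Desert Slide Blues": {"sparse", "blues", "slide guitar", "dramatic"},
}
audio_embeddings = {
    "Sea Shanty Overture": [0.9, 0.1, 0.2],
    "Lo-fi Study Beat": [0.1, 0.9, 0.1],
    "Desert Slide Blues": [0.2, 0.2, 0.9],
}

# Scenario 1: keywords as a language model might return them for the pirate prompt.
keywords = ["orchestral", "adventurous", "swashbuckling", "dramatic", "trumpet"]
tag_results = search_by_tags(keywords, tagged_catalog)

# Scenario 2: a made-up vector standing in for the embedded text prompt.
prompt_embedding = [0.8, 0.0, 0.3]
embedding_results = rank_by_embedding(prompt_embedding, audio_embeddings, top_k=2)

print(tag_results)
print(embedding_results)
```

The tag search can only find what the taxonomy happens to name, which is why it covers limited use cases. The embedding search compares the prompt with the audio directly, so it does not depend on a track having been tagged with the right words.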

Feeding data to ChatGPT

Whenever someone uses ChatGPT for music search, they are essentially providing ChatGPT with free access to valuable data: they teach it how humans describe music, what we find particularly important in music, how we perceive it, and how it makes us feel. Every music search prompt is essentially a glimpse into our musical minds.

OpenAI, ChatGPT’s parent company, has already released generative AI models that can generate music. These systems are operated by text prompts, but unlike ChatGPT, they generate music instead of text.

It is generally believed that generative AI in music such as Stable Audio, Google’s MusicLM, or Meta’s MusicGen lacks sophistication compared to human music creations — not only in terms of their sound quality but also how well the music fits the prompt. This is due to the training data that is available.

Need for full-text music descriptions

These systems need full-text music descriptions and the corresponding audio files. The more complex and detailed the description, the better. But it is usually very expensive and time-consuming to create or acquire this data, which is why genAI in music still lacks the quality of LLMs.

However, conventionally tagged music is widely available, and enough to train good genAI models. The data gathered from a prompt-based music search can help OpenAI make much better connections between full-text prompts and tags, and thus generate music that suits the text prompts.

For instance, if a user describes a film scene in a search prompt, such as “Give me a song that fits a scene where someone walks along Route 66 with their thumb out, trying to hitchhike”, and gets underwhelming results, the user might then add specifics such as “sparse, blues, slide guitar.” At that moment, ChatGPT has made the connection between the film scene and the music tags.
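To make that connection concrete, here is a minimal sketch of how a search session could be turned into a description-and-tags pair. It is purely hypothetical: the `TrainingPair` class and the `pair_from_session` function are invented for illustration, and nothing here describes how OpenAI actually processes prompts. It assumes the first prompt in a session is the free-text description and the last one is a comma-separated refinement.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TrainingPair:
    description: str   # the user's free-text scene description
    tags: List[str]    # the conventional tags the user fell back on

def pair_from_session(prompts: List[str]) -> Optional[TrainingPair]:
    """Pair the first prompt (description) with the last prompt (comma-separated tags).

    Hypothetical heuristic: a session with fewer than two prompts reveals no
    link between a description and tags, so it yields nothing.
    """
    if len(prompts) < 2:
        return None
    tags = [t.strip().lower() for t in prompts[-1].split(",") if t.strip()]
    return TrainingPair(description=prompts[0], tags=tags)

session = [
    "Give me a song that fits a scene where someone walks along Route 66 "
    "with their thumb out, trying to hitchhike",
    "sparse, blues, slide guitar",
]
pair = pair_from_session(session)
print(pair)
```

Each such pair links a rich description to tags that already exist across large music catalogs, which is what makes the interaction data valuable.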

OpenAI’s dependency on Microsoft

Generating text at the volume ChatGPT does is computationally heavy and expensive. According to reports, its daily operating cost is close to three-quarters of a million US dollars.

Through their cloud service Azure, OpenAI’s biggest investor Microsoft is financing 99% of the cost. It’s not unreasonable to assume that eventually Microsoft will want to see a return on its investment, and that could mean a price increase for users of ChatGPT.

To make matters worse, the recent turmoil at OpenAI, which saw CEO Sam Altman dismissed only to be reinstated days later, raises questions about the company’s internal cohesion and strategy. While things appear friendly on the outside, it is not far-fetched to imagine that this has led to an even bigger divide between OpenAI and Microsoft, which will undoubtedly have repercussions for the ordinary users of ChatGPT.

Progressing monopolisation

ChatGPT is undoubtedly the leader of the pack in the textual genAI game. Sure, competitors and tech journalists are keen to convince us that models such as LLaMA, Grok, and Gemini will lead to ChatGPT’s demise, but I’m not convinced. They may be formidable models, but it is unlikely that they can generate similar public attention and user numbers in the same amount of time.

This carries a substantial risk. There is a finite amount of training data on the internet. Most of the models above are trained largely on the same information, so they generate comparable answers to prompts. This is particularly problematic for music-related searches, where it becomes hard to differentiate between the LLMs.

To achieve differentiation, companies need to acquire data sources that will enable unique answers from their AI model. The only way this is possible is if they are generating training data proprietary to their company.

Ethical concerns

One of the most scalable ways to generate such proprietary data is to harvest information from user interactions, which is why the elevated use of ChatGPT is a cause for concern. It stands far ahead of the competition purely based on the amount of accessible data it holds.

There will be a big difference in negotiating power for the music industry if one genAI company, rather than several, sets the tone for the entire market.

Finally, there are significant concerns around the ethical use of training data. Many music companies were up in arms about genAI models because they claimed they were trained on unethically sourced datasets. Universal Music Group even urged Apple and Spotify to block genAI companies from accessing training data through their APIs.

Consent of copyright owners

The recent lawsuit from The New York Times against OpenAI does not give the impression that the training data was sourced with the consent of the copyright owners. In dubio pro reo, but what kind of message does it send if you are a music company that condemns using copyrighted music for training while using AI that was likely trained on unlicensed copyrighted text?

Text prompts will likely become the dominant way of interacting with machines, and the music industry should not close its mind to this. However, when adopting this new trend, there are risks to consider.

Using systems like ChatGPT can be a quick and easy way to let users search for music. However, there are clear downsides to giving up valuable data for free while paving the way for a monopoly in this important area at the same time. Apart from ChatGPT, there are search systems that allow for natural language search specifically for music. For instance, Azure OpenAI promises not to use any user data for retraining, yet its economic feasibility remains a question mark.

By Markus Schwarzer

(Picture under license from AdobeStock)
