AMI: a music generation platform
Our aim is to develop new tools that support and inspire musicians. We are working closely with a number of musicians to develop deep learning methods for composition, lyric generation and the synthesis of musical sound, building artificial intelligence tools that support and enhance the human ability to make music.
AMI (Artificial Musical Intelligence) is a deep neural network that can generate compositions for a variety of musical instruments and styles with a coherent long-term structure. It uses a state-of-the-art attention-based architecture to discover patterns of musical structure, such as melody, chords and rhythm, from tens of thousands of MIDI files.
We encode music data in a way that is similar to reading a music score, which enables the model to better capture music structures. Furthermore, we enhance the learning of musical structures by adding embeddings at different time scales.
As a result, the model is able to maintain a coherent long-term structure and even occasionally transition to a different movement. AMI has been trained on classical music as well as rock, jazz, electronic and film music.
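To make the score-like encoding concrete, here is a minimal sketch of turning notes into event tokens that a sequence model can consume. The token names, the note data and the `(pitch, duration)` representation are illustrative assumptions, not AMI's actual vocabulary or encoding.

```python
# Hypothetical sketch of score-like event encoding; token names and the
# note data below are illustrative, not AMI's actual vocabulary.

def encode_notes(notes):
    """Turn (MIDI pitch, duration in 16th notes) pairs into event tokens."""
    tokens = []
    for pitch, dur in notes:
        tokens.append(f"NOTE_{pitch}")  # which key is played
        tokens.append(f"DUR_{dur}")     # how long it sounds
    return tokens

melody = [(60, 4), (62, 4), (64, 8)]    # C, D, E
print(encode_notes(melody))
# ['NOTE_60', 'DUR_4', 'NOTE_62', 'DUR_4', 'NOTE_64', 'DUR_8']
```

Reading music as a stream of discrete events like this is what lets an attention-based model attend to structure at the level of notes, bars and phrases.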
Listen to some examples of music generated by AMI.
We have also developed a deep neural network for generating lyrics.
The model is based on the state-of-the-art GPT-2 model, which has been trained on a huge amount of text available on the internet. We adapted GPT-2 for lyric generation by fine-tuning it on a large corpus of poetry before training it further on selected lyrics. The model also learns the correlation between song titles and lyrics, enabling it to generate coherent lyrics that broadly follow the meaning of a given title.
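One common way to teach a language model a title-lyric correlation is to format each training example so the title always precedes its lyrics, delimited by special tokens. The tokens and pairing format below are an assumed illustration of this idea, not AMI's exact scheme.

```python
# Illustrative title-conditioned training format; the special tokens
# <|title|>, <|lyrics|> and <|end|> are assumptions for this sketch.

def make_example(title, lyrics):
    """Pair a song title with its lyrics so the model learns the link."""
    return f"<|title|>{title}<|lyrics|>{lyrics}<|end|>"

ex = make_example("Open Doors", "A small café with open doors...")
print(ex)
```

At generation time, prompting the fine-tuned model with `<|title|>Some Title<|lyrics|>` steers the continuation toward lyrics that match the title's meaning.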
As a check against possible copyright infringement, the system can also check the generated lyrics against the training lyrics, highlighting those that have high similarity.
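A similarity check of this kind can be done by comparing word n-grams between a generated lyric and each training lyric. The Jaccard-overlap measure below is a minimal sketch of one such check, assuming a simple whitespace tokenisation; the actual system may use a different metric.

```python
# Minimal n-gram overlap check between generated and training lyrics.
# Jaccard similarity over word trigrams is an assumed choice of metric.

def ngrams(text, n=3):
    """Return the set of word n-grams in a lowercased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(generated, source, n=3):
    """Jaccard similarity of the two texts' n-gram sets, in [0, 1]."""
    a, b = ngrams(generated, n), ngrams(source, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

score = similarity("no room for life or tears", "no room for pain")
# a high score would flag the generated line for manual review
```

Lyrics whose score exceeds a chosen threshold can then be highlighted for a human to inspect before release.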
An example of a generated lyric:
The world was a perfect place no room for pain
No room for life or tears
A small café with open doors
No food on the table
Who'll steal the fairest clothes
On a Saturday night
No windows for the sleeping child
No moon above the mountain stream
No living in this jungle dream
Then they came walking like two giving hands
Taking off in the great heat
Taking off to spread their wings
We have also been working on generating novel sounds using deep neural networks with different approaches.
Variational autoencoders are used to encode representations of different drum sounds in a latent space; novel sounds can then be generated by sampling the latent space and mixing different sound classes in a probabilistic manner.
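The latent-space mixing can be sketched as follows: assume the trained VAE has associated each drum class with a mean vector in latent space, blend those means with chosen weights, and add Gaussian noise before decoding. The 2-D class means and weights here are made-up toy values, and a real decoder network is omitted.

```python
import random

# Toy sketch: assume a trained VAE has mapped each drum class to a mean
# vector in latent space. These 2-D means are invented for illustration.
CLASS_MEANS = {"kick": [0.0, -1.0], "snare": [1.0, 1.0]}

def sample_latent(weights, sigma=0.1, rng=random):
    """Mix class means by the given weights, then add Gaussian noise."""
    dim = len(next(iter(CLASS_MEANS.values())))
    z = [0.0] * dim
    for name, w in weights.items():
        mean = CLASS_MEANS[name]
        for i in range(dim):
            z[i] += w * mean[i]
    return [zi + rng.gauss(0.0, sigma) for zi in z]

z = sample_latent({"kick": 0.7, "snare": 0.3})
# z would then be passed through the VAE decoder to synthesise a new
# drum hit that blends kick and snare characteristics
```

Varying the class weights moves the sample around the latent space, which is what produces hybrids between sound classes rather than copies of training sounds.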
Another approach to generating novel sounds is cross-synthesis using autoencoders. After training an autoencoder on a large sound database, novel sounds can be produced by exciting the encoder and the decoder with two different sounds as input.
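The idea can be illustrated with deliberately toy "networks": one sound drives the encoder while the other excites the decoder, so the output inherits characteristics of both. The one-number encoder and the averaging-style decoder below are stand-ins invented for this sketch, not the trained networks used in practice.

```python
# Conceptual cross-synthesis sketch with toy stand-in networks.
# A real system would use trained neural encoder/decoder on audio frames.

def encoder(frame):
    """Toy encoder: compress a frame of samples to one summary value."""
    return sum(frame) / len(frame)

def decoder(latent, excitation):
    """Toy decoder: shape the excitation frame around A's latent code."""
    return [latent + 0.5 * x for x in excitation]

sound_a = [0.25, 0.75, 0.25, 0.75]   # hypothetical audio frame A
sound_b = [1.0, -1.0, 1.0, -1.0]     # hypothetical audio frame B
hybrid = decoder(encoder(sound_a), sound_b)
# hybrid carries A's overall level shaped by B's fine structure
```

In the real system the interplay is learned rather than hand-coded, but the principle is the same: two different inputs excite the two halves of the autoencoder to produce a hybrid output.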