
Music with RNN: myth or reality?


Last week I shared my excitement about my involvement in NTR’s new work on neural networks/RNNs and promised to share what I learn, plus backstories about the project itself.

Remember when I told you about my “other job” as front woman for Vkhore? Well, like most bands, we compose a lot of our own music.


I know from my own experience how hard that is — composing isn’t some off-the-shelf hobby.

But what if people with no musical training (formal or not) and no technical skills could use a computer application, choose the style of music to generate and listen to the results right then and there?

It sounds like science fiction, but so do a lot of AI projects.

On a more personal level, training our own RNN to compose music means I’ve been getting a crash course about Machine Learning.

Over the past week I’ve been reading up on recent developments in Machine Learning in general and, more specifically, neural networks composing music.


There are already a number of them in existence. Google’s Magenta, for example, offers a tutorial that lets people generate music with a recurrent neural network. But it’s a simple model without stellar musical results.

What I wanted to know is whether an RNN can actually learn to compose music that has well-defined parts, i.e., the structure of music: verses, choruses, bridges, codas, etc.

Based on my research, there’s already been a good deal of development to make that happen. Originally, music generation was mainly focused on creating a single melody. You might be interested in the discussions on Hacker News and Reddit about a year ago. More recently, work on polyphonic music modeling, centered around time series probability density estimation, has met with partial success.

NTR’s goal is to build a generative model using a deep neural network architecture that will create music with both harmony and melody.

We want our RNN to be able to create music that is as close to music composed by humans as possible.

I asked my colleague and friend Natasha Kazachenko, who is responsible for training our neural network to generate music, several questions to better understand exactly what we are doing. (It’s much easier to learn about a highly technical subject when you work in tech with good friends who are patient enough to explain stuff to a non-techie.)  I will share her answers next week.

I learned long ago that it is normal human psychology to attribute human traits, emotions, even gender, to non-human entities, and techies (yes, they are human) are no different.

NTR Lab’s neural network is female.

Her name is Isadei.


What do you know about neural networks?

As some of you know, I started my career as a professional journalist, so when my boss asked me if I thought I could distinguish the original text of a contemporary Russian poet from text generated by artificial intelligence, I said yes. I took the test and correctly identified 8 out of 10.

Score one for us humans, but I have to say that it got me curious about AI and how it works. One thing I learned is that there are many examples of machine-generated creative writing on the internet, such as CuratedA.

Curiosity + journalist = research, so I started reading. But remember: I am not a techie, although I work for a very cool tech company, so just because I wrote this doesn’t mean I really understand it. But I’m learning!

First, I learned that artificial neural networks are mathematical or computational models inspired by the structure and functioning of the biological neural networks that are found in the nervous system of living organisms. Dr. Robert Hecht-Nielsen, inventor of one of the first neurocomputers, defines a neural network as “…a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs.”

Next, I learned about recurrent neural networks (RNNs). They are a class of artificial neural network architecture inspired by the cyclical connectivity of neurons in the brain that uses iterative function loops to store information.
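To get a feel for what that “loop” looks like, here is a minimal sketch of a single RNN step in Python/NumPy. It is purely illustrative: the weights are random, nothing is trained, and this is not the architecture NTR is actually using. The point is just that the new hidden state depends on both the current input and the previous hidden state, which is how the network “stores” information across time steps:

```python
import numpy as np

# A toy RNN cell, for illustration only.
rng = np.random.default_rng(0)
input_size, hidden_size = 4, 8

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (the recurrence)
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """One tick of the loop: the new state mixes input x_t with the previous state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Run a short sequence through the cell, carrying the state forward.
h = np.zeros(hidden_size)
sequence = [rng.normal(size=input_size) for _ in range(5)]
for x_t in sequence:
    h = rnn_step(x_t, h)

print(h.shape)  # (8,)
```

Each step reuses the same weights, so the same small cell can process a sequence of any length.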

RNNs have several properties that make them an attractive choice for sequence labeling. Because they can learn what to store and what to ignore, they are well suited to handling context information. They also have several drawbacks, however, that have limited their application to real-world sequence labeling problems.

In plain language, this means that while they accept many different types and representations of data and can recognize sequential patterns even in the presence of sequential distortions, they are a long way from being perfect.

That said, there are already many real-world situations where RNNs are being used, such as time series analysis, natural language processing, speech recognition and medical event detection.

Okay, so we have an idea of what RNNs are, why they are exciting and how they work. But what was most exciting to me was finding out that there are already RNNs making art, and that people have figured out how to train RNN character-level language models.
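For the curious, “character-level” roughly means the text is chopped into single characters, each mapped to an index and then a one-hot vector the network reads one step at a time, learning to predict the next character. Here is a hypothetical toy sketch of that encoding step (the text and names are made up for illustration; a real model like char-rnn would feed these vectors into an RNN):

```python
import numpy as np

text = "la la la"
chars = sorted(set(text))                      # the model's character "vocabulary"
char_to_ix = {c: i for i, c in enumerate(chars)}

def one_hot(c):
    """Encode one character as a vector with a single 1 at its index."""
    v = np.zeros(len(chars))
    v[char_to_ix[c]] = 1.0
    return v

# Training pairs: (current character, next character to predict)
pairs = [(text[i], text[i + 1]) for i in range(len(text) - 1)]
inputs = np.stack([one_hot(c) for c, _ in pairs])

print(chars)         # [' ', 'a', 'l']
print(inputs.shape)  # (7, 3)
```

The same idea works for music if you replace characters with notes or other musical events.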

There is the algorithm developed at the University of Tübingen in Germany, which can extract the style from one image (say, a painting by Van Gogh) and apply it to the content of a different image, or Google’s inceptionism technique, which transforms images.


(Google, “Inceptionism” experiment)

Neural networks have operated in visual arts as a creative mechanism and in the study of human aesthetic preferences, but, most exciting to me as a musician, artificial neural networks have been used to actually generate music.

I learned that neural networks are fundamental to artificial life and that the entities created can become interactive art pieces or even create the art itself.

These lifelike forms rely on generative models, which are a rapidly advancing area of research. As these models improve and the training and datasets scale, we can expect samples that depict entirely plausible images or videos.

Deus ex machina!, I thought when I read that the net had begun to sound human — like Her, instead of Siri.  Of course, if your background is technical you are probably more interested in the way this stuff actually works.

As I said at the start, I’m not a techie, and reading through this stuff sometimes makes my head hurt, so why am I excited? Because I work for a company that has started training our own recurrent neural network and I’m actually involved in the project!

I will tell you more next week.