HOW AI WORKS
Okay, so maybe it's a bit more complex than this... Let's break it down.
Inspiration was initially drawn from the DeepDream project, created by Google engineer Alexander Mordvintsev, which uses a neural network to find and enhance patterns in images via algorithmic pareidolia. The images below were processed by a neural network trained to perceive dogs and to enhance dog-like features. If we can do this to images, why not sound?
(source image, unedited)
(after applying ten iterations of DeepDream)
(after applying fifty iterations of DeepDream)
Images Credit: Martin Thoma
Magenta is an open source, ongoing research project that explores the role of machine learning as a tool in the process of creating art and music.
It was originally started by researchers and engineers from the Google Brain team, but many others have since contributed significantly to the project.
NSynth is one product of this research. It investigates making music with new sounds generated by machine learning.
HOW COMPUTERS MAKE MUSIC
If you look up a guide to how neural networks work, you'll find descriptions at varying levels of complexity. To avoid over-explaining the concept, I'll walk through an example from tensorflow.org of how neural networks are already being used to create lifelike images, and then describe how the same approach can be used for audio.
Generative Adversarial Networks (GANs) are one of the most interesting ideas in computer science today. Two models are trained simultaneously by an adversarial process. A generator ("the artist") learns to create images that look real, while a discriminator ("the art critic") learns to tell real images apart from fakes.
During training, the generator progressively becomes better at creating images that look real, while the discriminator becomes better at telling them apart. The process reaches equilibrium when the discriminator can no longer distinguish real images from fakes.
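To make the tug-of-war between artist and critic concrete, here is a minimal sketch of the two scores each side tries to minimize, using the standard binary cross-entropy GAN losses. It is plain Python rather than TensorFlow, the function names are my own, and the discriminator outputs are made-up numbers for illustration, not results from a trained model.

```python
import math

def discriminator_loss(real_scores, fake_scores):
    """Critic's loss: low when real samples score near 1 and fakes near 0.

    Scores are probabilities strictly between 0 and 1.
    """
    real_term = -sum(math.log(p) for p in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - p) for p in fake_scores) / len(fake_scores)
    return real_term + fake_term

def generator_loss(fake_scores):
    """Artist's loss: low when the critic scores the fakes as real (near 1)."""
    return -sum(math.log(p) for p in fake_scores) / len(fake_scores)

# Early in training (illustrative numbers): the critic easily spots the fakes.
early_d = discriminator_loss([0.9, 0.95], [0.1, 0.05])
early_g = generator_loss([0.1, 0.05])

# At equilibrium the critic can only guess, so every sample scores 0.5.
eq_d = discriminator_loss([0.5, 0.5], [0.5, 0.5])
eq_g = generator_loss([0.5, 0.5])

print(f"early:       critic loss {early_d:.3f}, artist loss {early_g:.3f}")
print(f"equilibrium: critic loss {eq_d:.3f}, artist loss {eq_g:.3f}")
```

When the critic is reduced to guessing, its loss settles at 2 * ln(2) and the artist's at ln(2), which is the equilibrium described above.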
Paradolia's conceptual model follows the same idea but replaces graphical data with sound, and images with music. The generator takes environmental noise and uses it to compose music, while the discriminator plays the role of the audiophile and critic.
Real Jazz Music
Sample: Miles Davis
Latent random variable:
Sample: Generative Jazz
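The flow of data through this conceptual model can be sketched in a few lines of plain Python. This is only an illustration of the interfaces, not an actual implementation: the frame size, the single-layer "networks", the random untrained weights, and the function names are all assumptions made for the example.

```python
import math
import random

FRAME_SIZE = 16  # assumed number of audio samples per frame

def environmental_noise(n, rng):
    """Stand-in for recorded ambient noise: n random samples in [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def generator(noise, weights):
    """Hypothetical generator: one linear layer, tanh keeps samples in [-1, 1]."""
    return [math.tanh(sum(w * z for w, z in zip(row, noise))) for row in weights]

def discriminator(frame, weights):
    """Hypothetical critic: returns a score in (0, 1); higher means 'sounds real'."""
    logit = sum(w * x for w, x in zip(weights, frame))
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)

# Untrained, randomly initialised weights; training would adjust these.
gen_weights = [[rng.gauss(0.0, 0.5) for _ in range(FRAME_SIZE)]
               for _ in range(FRAME_SIZE)]
disc_weights = [rng.gauss(0.0, 0.5) for _ in range(FRAME_SIZE)]

noise = environmental_noise(FRAME_SIZE, rng)   # the latent random variable
generated = generator(noise, gen_weights)      # "generative jazz" frame
score = discriminator(generated, disc_weights) # the critic's verdict

print(f"critic score for the generated frame: {score:.3f}")
```

In a real system the critic would also be shown frames of recorded music, and both sets of weights would be updated from the resulting scores.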