How AI Works

DATA

AI

MAGIC

Okay, so maybe it's a bit more complex than this... Let's break it down.

CONCEPT

Inspiration was initially drawn from the DeepDream project, created by Google engineer Alexander Mordvintsev, which uses a neural network to find and enhance patterns in images via algorithmic pareidolia. The images below were processed by a neural network trained to perceive dogs and to enhance dog-like features. If we can do this to images, why not sound?

ORIGINAL

(source image, unedited)

10X

(after applying ten iterations of DeepDream)

50X

(after applying fifty iterations of DeepDream)

Images Credit: Martin Thoma

MAGENTA

As an open source and ongoing research project, Magenta explores the role of machine learning as a tool in the process of creating art and music.

It was originally started by researchers and engineers from the Google Brain team, but many others have since contributed significantly to the project.

NSynth is an example of a product that this research has contributed to. It investigates making music using new sounds generated with machine learning.

HOW COMPUTERS MAKE MUSIC

If you look up a guide to how neural networks work, you'll find several descriptions with varying levels of complexity. To avoid over-explaining the concepts, I'll walk through an example from tensorflow.org of how these neural networks are already being used to create lifelike images, and then describe how the same approach can be used for audio.

Generative Adversarial Networks (GANs) are one of the most interesting ideas in computer science today. Two models are trained simultaneously by an adversarial process. A generator ("the artist") learns to create images that look real, while a discriminator ("the art critic") learns to tell real images apart from fakes.

During training, the generator progressively becomes better at creating images that look real, while the discriminator becomes better at telling them apart. The process reaches equilibrium when the discriminator can no longer distinguish real images from fakes.
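To make the artist-and-critic contest concrete, here is a minimal sketch of how the two models might be scored. It assumes binary cross-entropy, a common choice for GAN losses; the function names are illustrative and are not taken from any particular library. The discriminator outputs a probability that a sample is real, and each model is penalized according to how wrong that probability is from its own point of view.

```python
import math


def binary_cross_entropy(label, prob, eps=1e-12):
    """Penalty for predicting `prob` when the true label is `label` (0 or 1)."""
    prob = min(max(prob, eps), 1.0 - eps)  # keep log() away from zero
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def discriminator_loss(real_probs, fake_probs):
    """The critic wants real samples scored near 1 and fakes near 0."""
    real = sum(binary_cross_entropy(1.0, p) for p in real_probs) / len(real_probs)
    fake = sum(binary_cross_entropy(0.0, p) for p in fake_probs) / len(fake_probs)
    return real + fake


def generator_loss(fake_probs):
    """The artist wants its fakes scored near 1, as if they were real."""
    return sum(binary_cross_entropy(1.0, p) for p in fake_probs) / len(fake_probs)
```

Under this assumed scoring, a critic that can no longer tell the two apart outputs 0.5 for every sample, which gives a discriminator loss of 2 ln 2 (about 1.386) and a generator loss of ln 2 (about 0.693).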

Paradolia's conceptual model follows the same idea, but replaces graphical data with sound and images with music. The generator takes environmental noise and uses it to compose music, while the discriminator plays the role of the audiophile and critic.
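The data flow of that model can be sketched with two untrained placeholders. Everything here is an assumption for illustration: the pitch set, the function names, and the scoring rule are hypothetical and are not Paradolia's implementation. In an actual GAN, both functions would be neural networks whose weights are learned during training.

```python
# Hypothetical pitch set (MIDI note numbers), chosen only for illustration.
SCALE = [60, 63, 65, 67, 70]


def generator(noise):
    """Placeholder artist: map noise samples in [-1.0, 1.0] to MIDI pitches."""
    notes = []
    for sample in noise:
        clipped = min(max(sample, -1.0), 1.0)
        # Scale to an index into SCALE, rounding to the nearest slot.
        index = int((clipped + 1.0) / 2.0 * (len(SCALE) - 1) + 0.5)
        notes.append(SCALE[index])
    return notes


def discriminator(notes, reference):
    """Placeholder critic: share of pitches that also occur in the reference."""
    if not notes:
        return 0.0
    known = set(reference)
    return sum(1 for note in notes if note in known) / len(notes)


def verdict(score, threshold=0.5):
    """Turn the critic's score into the Real / Fake label."""
    return "Real" if score >= threshold else "Fake"
```

For example, `verdict(discriminator(generator(noise), reference))` traces one pass through the pipeline: environmental noise goes in, a candidate phrase comes out, and the critic labels it.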

Real Jazz Music

Sample: Miles Davis

Latent random variable:

Environmental noise

Sample: Generative Jazz

Generator

Discriminator

Real

Fake
