Generating Chirps with Neural Networks

Grafting models together and iteratively calling a generator

The sound of birdsong is varied, beautiful, and relaxing. In pre-Covid times I made a pomodoro timer that played recorded bird sounds during breaks, but I always wondered whether such sounds could be generated. Below is a proof of concept that can reproduce a single chirp and has parameters that can be adjusted to alter the generated sound.

The approach in theory

The generator will be composed of two parts. The first part will take the entire sound and encode key pieces of information about its overall shape in a small number of parameters.

The second part will take a small bit of sound, along with the information about the overall shape, and predict the next little bit of sound.

The second part can be called iteratively on itself to produce an entirely new chirp!
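The iterative idea can be sketched in a few lines. This is only an illustration of the feedback loop, not the model built in this post: `generate`, `toy_predict_next`, and `shape_params` are hypothetical names, and the toy predictor (which just scales the last sample) stands in for the trained second part.

```python
import numpy as np

def generate(predict_next, seed_window, shape_params, n_steps):
    """Extend seed_window by feeding each prediction back in as input."""
    window_len = len(seed_window)
    audio = list(seed_window)
    for _ in range(n_steps):
        # Only the most recent window_len samples are shown to the predictor
        window = np.asarray(audio[-window_len:])
        next_bit = predict_next(window, shape_params)
        audio.extend(np.atleast_1d(next_bit))
    return np.asarray(audio)

# Hypothetical stand-in for the trained second part (assumption, not the real model):
# it multiplies the last sample by a decay factor taken from the shape parameters.
def toy_predict_next(window, shape_params):
    decay = shape_params[0]
    return np.array([decay * window[-1]])

chirp = generate(
    toy_predict_next,
    seed_window=np.array([1.0, 0.5]),
    shape_params=np.array([0.5]),
    n_steps=3,
)
print(chirp)
```

Swapping `toy_predict_next` for a trained network's prediction call would leave the loop itself unchanged; changing `shape_params` is what would alter the generated sound.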

SEED = 4567

try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
except Exception:
    pass

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
import numpy as np
from tensorflow.keras import layers, utils, callbacks
import IPython.display as ipd
import pandas as pd
import librosa
from librosa import display as rosadisplay
import soundfile as sf
import s3fs
import io
from urllib.request import urlopen
import random
import matplotlib.pyplot as plt

Num GPUs Available:  0

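# So many different random processes: seed Python, NumPy, and TensorFlow together
def set_seeds(s):
    """Reset Keras state and seed each RNG with a distinct value derived from s."""
    tf.keras.backend.clear_session()
    random.seed(s)
    np.random.seed(s*2)
    tf.random.set_seed(s*3)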
    
set_seeds(SEED)