Key words and phrases
Before examining models, review the following terms and definitions. You can find the full list of terms in the topic on AWS DeepComposer concepts and terminology.
- discriminator

  A classifier model, which is one part of a generative adversarial network (GAN). The discriminator distinguishes real data from generated data. In the AWS DeepComposer use case, this model tries to determine whether a given piano roll is a real piano roll or an artificially created piano roll image. The discriminator is trained on real data.
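  The following sketch, written in PyTorch purely for illustration (it is not the AWS DeepComposer model architecture, and the layer sizes and `piano_roll_size` are made up), shows the basic idea of a discriminator: a classifier that outputs the probability that an input piano roll is real rather than generated.

  ```python
  import torch
  import torch.nn as nn

  # Illustrative discriminator: a binary classifier over flattened piano rolls.
  # Sizes are arbitrary placeholders, not the DeepComposer architecture.
  piano_roll_size = 128 * 32          # e.g. 128 pitches x 32 time steps, flattened

  discriminator = nn.Sequential(
      nn.Linear(piano_roll_size, 256),
      nn.LeakyReLU(0.2),
      nn.Linear(256, 1),
      nn.Sigmoid(),                   # probability that the input is real
  )

  sample_roll = torch.rand(1, piano_roll_size)  # stands in for one piano roll image
  print(discriminator(sample_roll))             # near 1 for real, near 0 for generated (after training)
  ```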
- epoch

  One complete pass through the training dataset by the neural network. For example, if you have 10,000 music tracks in the training dataset, one epoch represents one pass through all 10,000 tracks. The number of epochs required for a model to converge varies based on the training data. An iterative algorithm converges when the loss function stabilizes.
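  A minimal sketch of what one epoch means, using placeholder data (the list of integers simply stands in for 10,000 music tracks; nothing here is DeepComposer-specific):

  ```python
  # One epoch = every item in the training dataset is seen exactly once.
  training_dataset = list(range(10_000))   # stands in for 10,000 music tracks
  num_epochs = 3

  for epoch in range(num_epochs):
      for track in training_dataset:       # one full pass over the data
          pass                             # a model update would happen here
      print(f"epoch {epoch + 1}: processed {len(training_dataset)} tracks")
  ```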
- generator

  One part of a GAN. The generator creates new data and improves it by incorporating feedback from the discriminator. A GAN is made up of two models, a generator and a discriminator, each with its own loss function. The goal of the generator is to create data that is realistic enough to fool the discriminator. The goal of the discriminator is to determine whether the data it receives is real or generated.
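  As a rough illustration (again PyTorch, with made-up layer sizes rather than the DeepComposer architecture), a generator maps random noise to a piano-roll-shaped output that the discriminator then scores:

  ```python
  import torch
  import torch.nn as nn

  # Illustrative generator: random noise in, piano-roll-shaped output out.
  noise_dim = 100
  piano_roll_size = 128 * 32

  generator = nn.Sequential(
      nn.Linear(noise_dim, 256),
      nn.ReLU(),
      nn.Linear(256, piano_roll_size),
      nn.Sigmoid(),                    # per-cell note probabilities
  )

  noise = torch.randn(1, noise_dim)
  generated_roll = generator(noise)    # during training, this is scored by the discriminator
  print(generated_roll.shape)          # torch.Size([1, 4096])
  ```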
- learning rate

  A training hyperparameter that controls the step size at each iteration, that is, how much the model weights change with each update. A small learning rate requires more epochs because each update makes only a small change. A larger learning rate produces rapid changes. A learning rate that is too large can cause the model to converge too quickly and overshoot the optimal point. A learning rate that is too small can cause the training process to get stuck before reaching the optimal point.
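  The role of the learning rate can be seen in a single gradient descent update; the numbers below are arbitrary:

  ```python
  # One hypothetical gradient descent step on a single weight.
  learning_rate = 0.001     # hyperparameter chosen before training
  weight = 0.8              # current model parameter
  gradient = 2.5            # gradient of the loss with respect to the weight

  # The step size is learning_rate * gradient: a smaller learning rate
  # means a smaller change to the weight at each update.
  weight = weight - learning_rate * gradient
  print(weight)             # 0.7975
  ```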
- loss function

  Measures how far a prediction is from the desired target. For example, if an autonomous car misidentifies a pedestrian as street markings, the result is harmful for both parties. Loss functions help prevent these kinds of mistakes by quantifying how much the algorithm has missed the desired target so that training can reduce that error. In this example, an appropriate loss function would keep the car's algorithm from mistaking the pedestrian for street markings by pushing its predictions to be as accurate as possible.
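  As a concrete, purely illustrative example, binary cross-entropy, a loss function commonly used for GAN discriminators, is small when a prediction matches its target label and large when it does not:

  ```python
  import torch
  import torch.nn as nn

  loss_fn = nn.BCELoss()                    # binary cross-entropy

  target = torch.tensor([1.0])              # label: this piano roll is real
  good_prediction = torch.tensor([0.9])     # discriminator is mostly right
  bad_prediction = torch.tensor([0.1])      # discriminator is mostly wrong

  print(loss_fn(good_prediction, target))   # ~0.105 (small loss)
  print(loss_fn(bad_prediction, target))    # ~2.303 (large loss)
  ```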
- training dataset

  A dataset that the model learns from during training; it determines the style of music the model learns to generate. You choose the training dataset based on the two model types provided: MuseGAN or U-Net. If you want to train a model for the symphony, jazz, pop, or rock genres, use a MuseGAN model. Datasets for each of those genres are provided. The dataset for U-Net models is limited to the Bach genre.
- update ratio

  The number of discriminator updates per generator update in a GAN. During training, the generator and the discriminator are updated in alternation. For example, an update ratio of 5 means that the discriminator is updated five times for every one generator update.
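  The following training-loop skeleton sketches how an update ratio of 5 plays out; `update_discriminator` and `update_generator` are placeholder functions, not DeepComposer APIs:

  ```python
  update_ratio = 5      # discriminator updates per generator update
  num_steps = 100

  def update_discriminator():
      pass  # one gradient update on a batch of real and generated piano rolls

  def update_generator():
      pass  # one gradient update based on the discriminator's feedback

  for step in range(num_steps):
      for _ in range(update_ratio):
          update_discriminator()
      update_generator()
  ```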