"No gradients provided for any variable" error in Wasserstein GAN
"No gradients provided for any variable" error in Wasserstein GAN
#1
I have been working on a GAN model that is designed to output SMILES sequences. During my training step I get the error "ValueError: No gradients provided for any variable"; it occurs when the gan_model computes gradients with respect to the generator's trainable variables. (A simplified sketch of my training step is at the end of this post.)

I have implemented my loss function with TensorFlow ops (tf.reduce_mean) and checked that it is differentiable, but I still run into the "no gradients provided" error. My generator uses an attention mechanism, and as far as I can tell none of the functions in the generator are non-differentiable or receive incorrect input.

The Wasserstein loss function returns plausible values and the output shapes look correct.

Here is the relevant code I used to define the loss function and the generator.

LOSS FUNCTION

import tensorflow as tf

def wasserstein_loss(y_true, y_pred):
    # Wasserstein loss: mean of labels times critic scores
    loss = tf.reduce_mean(y_true * y_pred)
    return loss


# Quick differentiability check on dummy tensors
y_true = tf.constant([1.0, -1.0, 1.0], dtype=tf.float32)
y_pred = tf.constant([0.5, -0.5, 0.2], dtype=tf.float32)

with tf.GradientTape() as tape:
    tape.watch(y_pred)  # y_pred is a constant, so it has to be watched explicitly
    loss = wasserstein_loss(y_true, y_pred)

gradient = tape.gradient(loss, y_pred)

print("Loss:", loss.numpy())
print("Gradient:", gradient.numpy())
GENERATOR

from tensorflow.keras.layers import (Input, Embedding, Bidirectional, LSTM,
                                     Concatenate, TimeDistributed, Dense)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import RMSprop

def generator(latent_dim, num_protein_tokens, num_smiles_tokens, max_protein_seq_length, max_smiles_length):

    # Initial hidden and cell states for the post-attention LSTM
    init_hidden_state = Input(shape=(max_smiles_length,), name='s0')
    init_cell_state = Input(shape=(max_smiles_length,), name='c0')
    hidden_state = init_hidden_state
    cell_state = init_cell_state

    input_latent = Input(shape=(max_smiles_length, latent_dim,), name='input_latent')
    input_protein = Input(shape=(max_protein_seq_length,), name='input_protein')
    embedding_protein = Embedding(num_protein_tokens, 25, mask_zero=True, input_length=max_protein_seq_length)(input_protein)

    # Encoder over the protein sequence
    lstm_protein = Bidirectional(LSTM(75, return_sequences=True))(embedding_protein)

    # Decode one SMILES timestep at a time with attention over the encoder outputs
    lstm_combined_outputs = []
    for t in range(max_smiles_length):
        context = one_step_attention(lstm_protein, hidden_state)  # defined elsewhere (sketch below)
        lstm_combined, hidden_state, cell_state = post_activation_LSTM_cell(
            inputs=context, initial_state=[hidden_state, cell_state])  # shared LSTM layer defined elsewhere
        lstm_combined_outputs.append(lstm_combined)

    lstm_combined_outputs = Concatenate(axis=1)(lstm_combined_outputs)
    concat_layer = Concatenate(axis=2)([input_latent, lstm_combined_outputs])
    generated_smiles_array = TimeDistributed(Dense(num_smiles_tokens, activation='softmax', name='output_smiles'))(concat_layer)
    output_smiles = softargmax(generated_smiles_array, beta=1e10)  # differentiable argmax approximation, defined elsewhere

    generator_model = Model(inputs=[input_protein, init_hidden_state, init_cell_state, input_latent],
                            outputs=output_smiles, name='generator')

    return generator_model

generator_model = generator(latent_dim, num_protein_tokens, num_smiles_tokens, max_protein_seq_length, max_smiles_length)
gen_optimizer = RMSprop(learning_rate=0.00005)
generator_model.compile(loss=wasserstein_loss, optimizer=gen_optimizer)
I have defined a separate attention mechanism which uses RepeatVector followed by two Dense layers, a softmax and a dot product; a simplified sketch of it is below, followed by a sketch of my training step. I am happy to post the full code for the attention and the training loop if needed.
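
Here is a simplified sketch of the attention step. The layer names and sizes are illustrative rather than my exact code, but the structure (RepeatVector, two Dense layers, softmax, dot product) is the same, and the layer objects are created once outside the function so the weights are shared across decoder timesteps.

ONE STEP ATTENTION (SIMPLIFIED SKETCH)

from tensorflow.keras.layers import RepeatVector, Concatenate, Dense, Softmax, Dot

# Shared attention layers, created once so the same weights are reused at every decoder timestep
repeator = RepeatVector(max_protein_seq_length)
concatenator = Concatenate(axis=-1)
densor1 = Dense(10, activation='tanh')
densor2 = Dense(1, activation='relu')
activator = Softmax(axis=1, name='attention_weights')  # normalise over the protein timesteps
dotor = Dot(axes=1)

def one_step_attention(encoder_outputs, prev_hidden_state):
    # Repeat the decoder hidden state across every encoder timestep
    s_prev = repeator(prev_hidden_state)
    # Concatenate with the encoder outputs and score each timestep with two Dense layers
    concat = concatenator([encoder_outputs, s_prev])
    energies = densor2(densor1(concat))
    # Turn the scores into attention weights and take the weighted sum of encoder outputs
    alphas = activator(energies)
    context = dotor([alphas, encoder_outputs])
    return context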
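
And here is a stripped-down sketch of the generator training step where the error shows up. The batch construction is placeholder data and names like batch_size are illustrative; my real loop is more involved, but the gradient step over generator_model.trainable_variables is where the "No gradients provided for any variable" error is raised.

TRAINING STEP (SIMPLIFIED SKETCH)

import numpy as np
import tensorflow as tf

def train_generator_step(batch_size):
    # Placeholder generator inputs: protein sequence, initial LSTM states, latent noise
    protein_batch = np.random.randint(0, num_protein_tokens, size=(batch_size, max_protein_seq_length))
    s0 = np.zeros((batch_size, max_smiles_length), dtype=np.float32)
    c0 = np.zeros((batch_size, max_smiles_length), dtype=np.float32)
    noise = np.random.normal(size=(batch_size, max_smiles_length, latent_dim)).astype(np.float32)

    # Labels that push the critic score in the "real" direction for the Wasserstein loss
    misleading_labels = -np.ones((batch_size, 1), dtype=np.float32)

    with tf.GradientTape() as tape:
        critic_scores = gan_model([protein_batch, s0, c0, noise], training=True)
        loss = wasserstein_loss(misleading_labels, critic_scores)

    # tape.gradient comes back as all None here, so apply_gradients raises
    # "ValueError: No gradients provided for any variable"
    grads = tape.gradient(loss, generator_model.trainable_variables)
    gen_optimizer.apply_gradients(zip(grads, generator_model.trainable_variables))
    return loss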