Embedding Size: The dimensionality of the vectors produced by the embedding layer.
Hidden Size: The number of units in each hidden layer.
Learning Rate: The step size the optimizer uses for each parameter update.
Epochs: The number of full passes over the training data.
Batch Size: The number of training examples in each batch.
Dropout Rate: The fraction of units randomly dropped during training, used for regularization.
Gradient Clipping: The threshold at which gradients are clipped to prevent exploding gradients.
Optimizer: The type of optimizer to use (e.g., SGD, Adam).
Activation Function: The activation function to use in the neural network layers.
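
As a rough illustration, the settings above could be collected into a single configuration object. This is only a sketch: the class name TrainingConfig, the field names, and every default value are hypothetical placeholders chosen for the example, not values taken from this project. The range checks reflect the usual meaning of each setting (for instance, a dropout rate is a fraction between 0 and 1).

```python
import math
from dataclasses import dataclass, asdict


@dataclass
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed above.

    All defaults are illustrative assumptions, not recommended values.
    """
    embedding_size: int = 128
    hidden_size: int = 256
    learning_rate: float = 1e-3
    epochs: int = 10
    batch_size: int = 32
    dropout_rate: float = 0.1
    gradient_clipping: float = 1.0
    optimizer: str = "adam"
    activation_function: str = "relu"

    def __post_init__(self):
        # Sizes and counts must be positive integers.
        for name in ("embedding_size", "hidden_size", "epochs", "batch_size"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be a positive integer")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        # A dropout rate is a fraction of units, so it must lie in [0, 1).
        if not 0.0 <= self.dropout_rate < 1.0:
            raise ValueError("dropout_rate must be in [0, 1)")
        if self.gradient_clipping <= 0:
            raise ValueError("gradient_clipping must be positive")

    def steps_per_epoch(self, num_examples: int) -> int:
        """Number of batches needed to cover num_examples once."""
        return math.ceil(num_examples / self.batch_size)


if __name__ == "__main__":
    config = TrainingConfig(learning_rate=5e-4, batch_size=64)
    print(asdict(config))
    print(config.steps_per_epoch(1000))
```

Validating the values when the object is created means a mistyped setting (such as a dropout rate of 1.5) fails immediately rather than partway through training. The steps_per_epoch helper shows how Batch Size and Epochs interact: the total number of parameter updates is the steps per epoch multiplied by the number of epochs.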