Debugging Neural Networks with PyTorch and W&B Using ...
I used gradient clipping to overcome this problem in the linked notebook. Gradient clipping caps the gradients at a threshold value to prevent them from getting too large. In PyTorch you can do this with one line of code: torch.nn.utils.clip_grad_norm_(model.parameters(), 4.0). Here 4.0 is the threshold: if the combined norm of all the gradients exceeds it, the gradients are scaled down so that their norm matches the threshold.
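To make the arithmetic behind norm-based clipping concrete, here is a minimal pure-Python sketch. The helper name clip_by_global_norm and the toy gradient values are made up for illustration; this is not PyTorch's implementation, only the idea it is based on: measure one L2 norm across all gradients and, if it exceeds the threshold, multiply every gradient by the same factor.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradient vectors so their combined L2 norm is at most max_norm.

    `grads` is a list of lists, one inner list per (hypothetical) parameter.
    Returns the possibly rescaled gradients and the norm measured before clipping.
    """
    # One norm over every gradient value, not one norm per parameter.
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm <= max_norm:
        # Already within the threshold: return an unchanged copy.
        return [list(vec) for vec in grads], total_norm
    # Same scale factor for every value, so the gradient direction is preserved.
    scale = max_norm / total_norm
    return [[g * scale for g in vec] for vec in grads], total_norm


# Toy example: two "parameters" whose combined gradient norm is 5.0.
grads = [[3.0], [4.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=4.0)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))
print(norm_before, norm_after)
```

In an actual PyTorch training loop, the clip_grad_norm_ call goes after loss.backward(), which computes the gradients, and before optimizer.step(), which consumes them.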