Sampled softmax loss - PyTorch Forums
discuss.pytorch.org › t › sampled-softmax-loss
Feb 02, 2017 · EDIT: sorry, I see that the original link is to a page with a number of different softmax approximations, and NCE is one of them. I personally would be more interested in sampled softmax, as it tends to work better for me. EDIT2: here is a TF implementation of sampled softmax and NCE; hopefully they can be implemented using existing PyTorch functions.
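Since the thread asks whether TF-style sampled softmax can be built from existing PyTorch functions, here is a minimal sketch of that idea. The function name `sampled_softmax_loss` and the uniform negative sampling are assumptions, not a PyTorch API; it mirrors the spirit of TF's `tf.nn.sampled_softmax_loss` but omits the sampling-probability correction and accidental-hit removal for brevity.

```python
import torch
import torch.nn.functional as F

def sampled_softmax_loss(weight, bias, hidden, targets, num_sampled):
    """Hypothetical sampled-softmax helper (not a torch API).

    weight: (n_classes, dim) output-layer weights
    bias:   (n_classes,)     output-layer biases
    hidden: (batch, dim)     final hidden states
    targets: (batch,)        true class indices
    """
    n_classes = weight.size(0)
    # Draw shared negative class ids uniformly (with replacement, for
    # simplicity; a real implementation would correct for the proposal
    # distribution and remove negatives that collide with the target).
    neg = torch.randint(0, n_classes, (num_sampled,), device=hidden.device)
    # Column 0 holds each example's true class; the rest hold negatives.
    classes = torch.cat([targets.unsqueeze(1),
                         neg.unsqueeze(0).expand(targets.size(0), -1)], dim=1)
    w = weight[classes]                        # (batch, 1 + num_sampled, dim)
    b = bias[classes]                          # (batch, 1 + num_sampled)
    logits = torch.einsum('bd,bkd->bk', hidden, w) + b
    # The true class always sits at index 0 of the sampled logits.
    return F.cross_entropy(logits, torch.zeros_like(targets))

# Example: 50k-class vocabulary, but only 64 negatives scored per step.
weight = torch.randn(50_000, 128)
bias = torch.zeros(50_000)
hidden = torch.randn(32, 128)
targets = torch.randint(0, 50_000, (32,))
loss = sampled_softmax_loss(weight, bias, hidden, targets, num_sampled=64)
```

The payoff is that each step computes logits for 1 + num_sampled classes instead of all 50k, which is where sampled softmax gets its speedup over the full cross-entropy.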
AdaptiveLogSoftmaxWithLoss - PyTorch
https://pytorch.org/docs/stable/generated/torch.nn.AdaptiveLogSoftmaxWithLoss.html
class torch.nn.AdaptiveLogSoftmaxWithLoss(in_features, n_classes, cutoffs, div_value=4.0, head_bias=False, device=None, dtype=None) [source]
Efficient softmax approximation as described in Efficient softmax approximation for GPUs by Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, and Hervé Jégou.
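A short usage sketch of the class from the docs snippet above; the cutoff values and sizes here are illustrative, with the convention that frequent classes get the low indices covered by the first cutoff.

```python
import torch
import torch.nn as nn

n_classes = 10_000
asm = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=128,
    n_classes=n_classes,
    cutoffs=[100, 1000, 5000],  # head of frequent classes, then tail clusters
)

hidden = torch.randn(32, 128)
targets = torch.randint(0, n_classes, (32,))

out = asm(hidden, targets)        # NamedTuple with fields (output, loss)
loss = out.loss                   # mean negative log-likelihood of targets
log_probs = asm.log_prob(hidden)  # full (32, n_classes) log-probabilities
```

The efficiency comes from the cutoffs: rare classes are scored through small tail clusters rather than one giant softmax, so class frequency ordering matters.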
CrossEntropyLoss — PyTorch 1.10.1 documentation
pytorch.org › torch
class torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean', label_smoothing=0.0) [source]
This criterion computes the cross entropy loss between input and target. It is useful when training a classification problem with C classes. If provided, the optional argument weight should be a 1D ...
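A minimal example of the criterion described in the snippet above: it takes raw (unnormalized) logits as input and class indices as targets; the label_smoothing value is optional and chosen arbitrarily here.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(label_smoothing=0.1)  # smoothing is optional

logits = torch.randn(8, 5, requires_grad=True)  # batch of 8, C = 5 classes
targets = torch.randint(0, 5, (8,))             # class indices in [0, C)

loss = criterion(logits, targets)
loss.backward()  # gradients flow back through the logits
```

Unlike the sampled and adaptive approximations above, this computes the exact softmax over all C classes, which is fine for small C but is exactly what the approximations in the earlier results try to avoid for large vocabularies.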