Losses - PyTorch Metric Learning
kevinmusgrave.github.io › pytorch-metric-learning
In this implementation, we use -g(A) as the loss. Parameters: softmax_scale: the exponent multiplier in the loss's softmax expression. The paper uses softmax_scale = 1, which is why it does not appear in the above equations. Default distance: LpDistance(normalize_embeddings=True, p=2, power=2). Default reducer: MeanReducer. Reducer input:
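The snippet quotes the documented defaults for one of the library's losses; a minimal usage sketch follows, assuming the loss in question is `pytorch_metric_learning.losses.NCALoss` (the library's loss whose docs use the -g(A) formulation and the `softmax_scale` parameter). The loss name and tensor shapes are assumptions, not stated in the snippet itself.

```python
import torch
from pytorch_metric_learning import losses

# Assumption: the snippet describes NCALoss; softmax_scale=1 matches the paper's setting.
loss_func = losses.NCALoss(softmax_scale=1)

embeddings = torch.randn(32, 128)      # (batch_size, embedding_dim), illustrative shapes
labels = torch.randint(0, 10, (32,))   # one class label per embedding

# Embeddings are L2-normalized by the default LpDistance, and the per-pair losses
# are averaged by the default MeanReducer, as quoted above.
loss = loss_func(embeddings, labels)
```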
AdaptiveLogSoftmaxWithLoss — PyTorch 1.10.1 documentation
https://pytorch.org/.../generated/torch.nn.AdaptiveLogSoftmaxWithLoss.html
AdaptiveLogSoftmaxWithLoss: class torch.nn.AdaptiveLogSoftmaxWithLoss(in_features, n_classes, cutoffs, div_value=4.0, head_bias=False, device=None, dtype=None). Efficient softmax approximation as described in "Efficient softmax approximation for GPUs" by Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, and Hervé Jégou. …
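A short usage sketch of the constructor signature quoted above; the feature size, vocabulary size, and cutoff values are illustrative, not taken from the docs.

```python
import torch
import torch.nn as nn

# 10,000-class vocabulary split by `cutoffs` into a frequent head plus three tail clusters.
asm = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=64, n_classes=10_000, cutoffs=[100, 1_000, 5_000], div_value=4.0
)

hidden = torch.randn(32, 64)                 # (batch, in_features) activations from the model
targets = torch.randint(0, 10_000, (32,))    # target class index per example

out = asm(hidden, targets)                   # NamedTuple: (output, loss)
print(out.output.shape)                      # per-example log-probability of the target class
print(out.loss)                              # mean negative log-likelihood over the batch
```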
Sampled softmax loss - PyTorch Forums
discuss.pytorch.org › t › sampled-softmax-loss
Feb 02, 2017 · EDIT: sorry, I see that the original link is to a page with a number of different softmax approximations, and NCE is one of them. I personally would be more interested in sampled softmax, as it tends to work better for me. EDIT2: here is a TF implementation of sampled softmax and NCE; hopefully they can be implemented using existing PyTorch functions.
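Since the thread asks whether sampled softmax can be built from existing PyTorch ops, here is one minimal sketch of uniform sampled softmax. It is not the forum's answer, and it omits the log-expected-count correction and accidental-hit removal that TensorFlow's sampled_softmax_loss performs; the function and parameter names are invented for illustration.

```python
import torch
import torch.nn.functional as F

def sampled_softmax_loss(hidden, targets, weight, bias, num_sampled):
    """Uniform sampled softmax: score each example's true class against a
    shared random subset of negatives instead of the full output vocabulary.

    hidden:  (batch, dim) activations; targets: (batch,) true class ids;
    weight:  (num_classes, dim) output projection; bias: (num_classes,).
    """
    batch_size, num_classes = hidden.size(0), weight.size(0)

    # Draw negative class ids uniformly at random (with replacement, no hit removal).
    neg_ids = torch.randint(0, num_classes, (num_sampled,), device=hidden.device)

    # Logit of each example's true class: w_t . h + b_t
    true_logits = (hidden * weight[targets]).sum(dim=-1) + bias[targets]   # (batch,)
    # Logits of the shared negative samples for every example.
    neg_logits = hidden @ weight[neg_ids].t() + bias[neg_ids]              # (batch, num_sampled)

    logits = torch.cat([true_logits.unsqueeze(1), neg_logits], dim=1)      # (batch, 1 + num_sampled)
    # The true class occupies column 0 of every row.
    target_idx = torch.zeros(batch_size, dtype=torch.long, device=hidden.device)
    return F.cross_entropy(logits, target_idx)
```

In practice, `weight` and `bias` would be the parameters of the model's final projection layer, and the full softmax would still be used at evaluation time.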