Definition

The Wasserstein distance is the minimum cost required to transform one distribution into another.
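Assume that two probability distributions $P$ and $Q$ are given. Then the Wasserstein distance between $P$ and $Q$ is defined as
$$W(P, Q) = \inf_{\gamma \in \Pi(P, Q)} \mathbb{E}_{(x, y) \sim \gamma}\big[\lVert x - y \rVert\big]$$
where $\Pi(P, Q)$ is the set of all joint distributions with marginals $P$ and $Q$ respectively. Intuitively, $\gamma(x, y)$ indicates how much mass must be transported from $x$ to $y$ in order to transform $P$ into $Q$.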
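As a rough illustration of the definition, the sketch below computes the distance for a deliberately simple special case: two samples of equal size on the real line, each point carrying the same mass, with cost $|x - y|$. Under those assumptions an optimal transport plan can be taken to be a one-to-one matching, and matching the sorted samples is optimal. The function names `wasserstein_1d` and `wasserstein_brute_force` are made up for this example and do not come from any library.

```python
from itertools import permutations


def wasserstein_1d(xs, ys):
    """Distance between two equal-size, equally weighted samples on the real line.

    Assumes cost |x - y|; in this special case pairing the sorted samples
    gives a minimum-cost transport plan.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)


def wasserstein_brute_force(xs, ys):
    """Same quantity, found by trying every one-to-one transport plan.

    Only feasible for tiny inputs; included to mirror the infimum over
    joint distributions in the definition.
    """
    n = len(xs)
    return min(
        sum(abs(x - y) for x, y in zip(xs, perm)) / n
        for perm in permutations(ys)
    )


source = [0.0, 2.0]
target = [3.0, 1.0]
print(wasserstein_1d(source, target))
print(wasserstein_brute_force(source, target))
```

Each candidate matching plays the role of one joint distribution $\gamma$ with the required marginals, and the minimum over matchings is the transport cost. For general weights or higher dimensions this shortcut no longer applies and a linear-programming solver would be needed.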
p-Wasserstein Metric
Facts
Let $(P_n)_{n \in \mathbb{N}}$ be a sequence of distributions and $P$ a distribution, all on a compact metric space. Then, as $n \to \infty$:
1. The convergence of the KL-Divergence $\mathrm{KL}(P_n \,\|\, P)$ to zero implies that the JS-Divergence $\mathrm{JS}(P_n, P)$ also converges to zero.
2. The convergence of the JS-Divergence $\mathrm{JS}(P_n, P)$ to zero is equivalent to the convergence of the Total Variation Distance $\mathrm{TV}(P_n, P)$ to zero.
3. The convergence of the Total Variation Distance $\mathrm{TV}(P_n, P)$ to zero implies that the Wasserstein Distance $W(P_n, P)$ also converges to zero.
4. The convergence of the Wasserstein Distance $W(P_n, P)$ to zero is equivalent to the Convergence in Distribution of $P_n$ to $P$.
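The implication from Total Variation to Wasserstein runs in one direction only. A small sketch of the standard counterexample, assuming point masses at $1/n$ converging to a point mass at $0$: the Total Variation Distance stays at 1 for every $n$, while the Wasserstein Distance shrinks to zero. The helper names below are hypothetical and written just for this note.

```python
def total_variation(p, q):
    """TV distance between two discrete distributions given as {point: probability}."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def wasserstein_point_masses(a, b):
    """Wasserstein distance between point masses at a and b on the real line.

    The only joint distribution with these marginals moves all mass
    from a to b, so the cost is |a - b|.
    """
    return abs(a - b)


for n in (1, 10, 100, 1000):
    tv = total_variation({1.0 / n: 1.0}, {0.0: 1.0})
    w = wasserstein_point_masses(1.0 / n, 0.0)
    print(n, tv, w)
```

The supports never overlap, so Total Variation cannot register that the two distributions are getting closer, whereas the transport cost does.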