Some boring equations


Covariance

$$cov_{x,y}=\frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})$$
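
As a quick sanity check, the formula above can be sketched in NumPy. The function name `sample_covariance` is made up for this example, and the sketch assumes two 1-D samples of equal length with at least two points.

```python
import numpy as np

def sample_covariance(x, y):
    """Sample covariance with the 1/(N-1) normalization (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must be 1-D, equal length, with N >= 2")
    # Sum of products of deviations from the means, divided by N - 1
    return float(np.sum((x - x.mean()) * (y - y.mean())) / (x.size - 1))

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(sample_covariance(x, y))   # should agree with np.cov(x, y)[0, 1]
```

`np.cov` uses the same N-1 normalization by default, so its off-diagonal entry should match this value.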

Pearson's Correlation

Pearson's correlation coefficient is the covariance of the two variables divided by the product of their standard deviations. It ranges from -1 to 1 and measures the strength of the linear relationship between the variables. It is usually applied under the assumption that the samples come from a Gaussian-like distribution.

$$c_p = \frac{cov_{x,y}}{\sigma(x)*\sigma(y)}$$
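
A minimal sketch of this ratio in NumPy, assuming 1-D samples of equal length and non-constant inputs. The name `pearson_correlation` is illustrative only.

```python
import numpy as np

def pearson_correlation(x, y):
    """Covariance divided by the product of sample standard deviations (illustrative)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must be 1-D, equal length, with N >= 2")
    cov = np.sum((x - x.mean()) * (y - y.mean())) / (x.size - 1)
    # ddof=1 so the standard deviations use the same N-1 normalization as cov
    denom = x.std(ddof=1) * y.std(ddof=1)
    if denom == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return float(cov / denom)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(pearson_correlation(x, y))   # should agree with np.corrcoef(x, y)[0, 1]
```

The N-1 factors cancel between numerator and denominator, so the result does not depend on whether the sample or population normalization is used, as long as it is used consistently.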

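Spearman's Correlation

Spearman's rank correlation does not assume a Gaussian distribution. It is Pearson's correlation computed on the ranks r(x) and r(y) of the samples rather than on the raw values, so it captures monotonic relationships, not only linear ones.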

$$c_s = \frac{cov_{r(x),r(y)}}{\sigma(r(x))*\sigma(r(y))}$$
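
The rank-based formula can be sketched as follows. The helper names `average_ranks` and `spearman_correlation` are made up for this example, and tied values are assumed to share the average of their ranks (a common convention, not something stated above).

```python
import numpy as np

def average_ranks(values):
    """1-based ranks; tied values get the mean of their ranks (assumed convention)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(values.size, dtype=float)
    ranks[order] = np.arange(1, values.size + 1)
    # Replace the ranks of each group of equal values by their average
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman_correlation(x, y):
    """Pearson's correlation applied to the ranks of x and y."""
    rx = average_ranks(x)
    ry = average_ranks(y)
    return float(np.corrcoef(rx, ry)[0, 1])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 3   # monotonic but non-linear
print(spearman_correlation(x, y))   # close to 1, since the ranks agree exactly
```

Because `y` is a monotonic function of `x`, the ranks are identical and the coefficient is 1 up to floating-point error, while Pearson's coefficient on the raw values would be below 1.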

Cosine Similarity

Cosine similarity measures how similar two documents are irrespective of their size. Mathematically, it is the cosine of the angle between two vectors in a multi-dimensional space, so it depends only on their direction and not on their magnitude.

It follows directly from the definition of the dot product:

$$\cos\theta = \frac{\vec{a}\cdot\vec{b}}{\|\vec{a}\|\|\vec{b}\|}$$
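
A short NumPy sketch of the same expression. The function name `cosine_similarity` is chosen for illustration, and the vectors are assumed to be non-zero and of equal length.

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product divided by the product of Euclidean norms (illustrative)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)

# Same direction, different length: similarity close to 1
print(cosine_similarity([1, 2, 3], [2, 4, 6]))
# Orthogonal vectors: similarity 0
print(cosine_similarity([1, 0], [0, 1]))
```

The first call illustrates the "irrespective of size" property: scaling a vector does not change the angle it makes with another.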

Triplet Loss

Triplet loss can be written in terms of a Euclidean distance. It is commonly used to train embedding networks, such as face embedding models.

$$\mathcal{L}(A,P,N) = \max\bigg(\|f(A)-f(P)\|^2-\|f(A)-f(N)\|^2+\alpha,0\bigg)$$

where A is an anchor input (reference face), P is a positive input of the same class as A, N is a negative input of a different class from A, \(\alpha\) is a margin between positive and negative pairs, and f is an embedding.
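
For a single triplet, the loss can be sketched in NumPy as below. This assumes the embeddings f(A), f(P), f(N) have already been computed and are passed in as vectors. The function name `triplet_loss` and the default margin `alpha=0.2` are illustrative choices, not values taken from the text.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """Triplet loss for one (anchor, positive, negative) triple of embeddings.

    alpha=0.2 is an assumed default margin, chosen only for this example.
    """
    f_a = np.asarray(f_a, dtype=float)
    f_p = np.asarray(f_p, dtype=float)
    f_n = np.asarray(f_n, dtype=float)
    d_pos = np.sum((f_a - f_p) ** 2)   # squared distance anchor-positive
    d_neg = np.sum((f_a - f_n) ** 2)   # squared distance anchor-negative
    return float(max(d_pos - d_neg + alpha, 0.0))

anchor = [0.0, 0.0]
# Negative is far away: the margin is already satisfied, so the loss is zero
print(triplet_loss(anchor, [0.0, 1.0], [3.0, 0.0]))
# Negative is as close as the positive: the loss equals the margin
print(triplet_loss(anchor, [1.0, 0.0], [0.0, 1.0]))
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only triplets that violate the margin contribute to training.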
