Lp-norm (LP)
The Lp-norm (LP) measures the p-norm distance between the facet distributions of the observed labels in a training dataset. This metric is non-negative and so cannot detect reverse bias.
The formula for the Lp-norm is as follows:
L_p(P_a, P_d) = ( ∑_y |P_a(y) - P_d(y)|^p )^(1/p)
Where the p-norm distance between n-dimensional points x and y is defined as follows:
L_p(x, y) = ( |x_1 - y_1|^p + |x_2 - y_2|^p + … + |x_n - y_n|^p )^(1/p)
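To make the general formula concrete, the following Python sketch computes the Lp-norm between the observed label distributions of two facets. The `lp_norm` function and its arguments are illustrative names, not part of any library API; it assumes the distributions P_a and P_d are estimated as normalized label proportions.

```python
from collections import Counter


def lp_norm(labels_a, labels_d, p=2):
    """p-norm distance between the label distributions of two facets.

    P_a and P_d are estimated as normalized label proportions;
    Counter returns 0 for categories a facet never sees.
    """
    counts_a = Counter(labels_a)
    counts_d = Counter(labels_d)
    categories = set(counts_a) | set(counts_d)
    return sum(
        abs(counts_a[y] / len(labels_a) - counts_d[y] / len(labels_d)) ** p
        for y in categories
    ) ** (1 / p)
```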
The 2-norm is the Euclidean norm. Assume you have an outcome distribution with three categories, for example, y_i ∈ {y_0, y_1, y_2} = {accepted, waitlisted, rejected}, in a college admissions multicategory scenario. You take the sum of the squares of the differences between the outcome counts for facets a and d. The resulting Euclidean distance is calculated as follows (a worked numeric sketch follows the definitions below):
L_2(P_a, P_d) = [ (n_a(0) - n_d(0))^2 + (n_a(1) - n_d(1))^2 + (n_a(2) - n_d(2))^2 ]^(1/2)
Where:
- n_a(i) is the number of ith-category outcomes in facet a: for example, n_a(0) is the number of facet a acceptances.
- n_d(i) is the number of ith-category outcomes in facet d: for example, n_d(2) is the number of facet d rejections.
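Here is a worked sketch of the three-category calculation above; the admission counts are hypothetical, chosen only for illustration. Note that the count-based formula can produce values of any magnitude, while normalizing the counts to proportions keeps the result in the [0, √2) range quoted below.

```python
import math

# Hypothetical outcome counts for the categories
# {accepted, waitlisted, rejected}; the numbers are made up.
n_a = [60, 25, 15]  # facet a counts: n_a(0), n_a(1), n_a(2)
n_d = [40, 30, 30]  # facet d counts: n_d(0), n_d(1), n_d(2)

# Count-based Euclidean distance, as in the formula above.
l2_counts = math.sqrt(sum((a - d) ** 2 for a, d in zip(n_a, n_d)))
print(l2_counts)  # sqrt(20**2 + 5**2 + 15**2) = sqrt(650) ≈ 25.50

# Normalizing counts to proportions bounds the result by sqrt(2).
p_a = [a / sum(n_a) for a in n_a]
p_d = [d / sum(n_d) for d in n_d]
l2_props = math.sqrt(sum((a - d) ** 2 for a, d in zip(p_a, p_d)))
print(l2_props)  # sqrt(0.2**2 + 0.05**2 + 0.15**2) ≈ 0.25
```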
The range of LP values for binary, multicategory, and continuous outcomes is [0, √2), where:
- Values near zero mean the labels are similarly distributed.
- Positive values mean the label distributions diverge; the larger the value, the greater the divergence.
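Using the `lp_norm` sketch from earlier, identical label distributions give zero, while a skewed facet gives a clearly positive value; the labels below are hypothetical.

```python
# Two facets with identical label proportions, and one skewed facet.
same = ["accepted"] * 50 + ["rejected"] * 50
skewed = ["accepted"] * 80 + ["rejected"] * 20

print(lp_norm(same, same))    # 0.0 -- labels similarly distributed
print(lp_norm(same, skewed))  # ≈ 0.42 -- distributions diverge
```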