
Entropy Transform

The entropy transform computes the entropy of a distribution. Entropy is a measure of the uncertainty of a random variable; for a discrete distribution it is defined as:

\[ H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i) \]

where \(p(x_i)\) is the probability of the \(i\)-th outcome. The entropy is maximized when all outcomes are equally likely, and it is zero when the distribution is deterministic.
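
To make the definition concrete, here is a minimal NumPy sketch of the formula; the shannon_entropy helper is illustrative only and is not part of AutonFeat.

import numpy as np

def shannon_entropy(pk, base=None):
    # H(X) = -sum_i p(x_i) * log p(x_i); natural log unless a base is given
    pk = np.asarray(pk, dtype=float)
    pk = pk / pk.sum()          # normalize to a probability distribution
    pk = pk[pk > 0]             # terms with p(x_i) = 0 contribute nothing
    h = -np.sum(pk * np.log(pk))
    return h / np.log(base) if base is not None else h

print(shannon_entropy([1, 1, 1, 1], base=2))  # 2.0, maximal for 4 equally likely outcomes
print(shannon_entropy([1, 0, 0, 0], base=2))  # 0.0, deterministic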

We can use the entropy transform to compute the Shannon entropy of a single discrete probability distribution. We can also use it to compute the relative entropy between two discrete probability distributions, also known as the Kullback-Leibler (KL) divergence, which is defined as:

\[ D_{KL}(p \| q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)} \]

where \(p\) and \(q\) are the two probability distributions. The relative entropy is always non-negative, and it is zero if and only if the two distributions are identical.
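
As a quick illustration, the following sketch computes the KL divergence directly with NumPy; the kl_divergence helper is illustrative only and is not part of AutonFeat.

import numpy as np

def kl_divergence(pk, qk):
    # D_KL(p || q) = sum_x p(x) * log(p(x) / q(x))
    pk = np.asarray(pk, dtype=float)
    qk = np.asarray(qk, dtype=float)
    mask = pk > 0               # terms with p(x) = 0 contribute nothing
    return np.sum(pk[mask] * np.log(pk[mask] / qk[mask]))

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])
print(kl_divergence(p, p))  # 0.0, identical distributions
print(kl_divergence(p, q))  # ~0.511, always non-negative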

Bases: Transform

Compute the entropy of a distribution, or the KL divergence between two distributions.

__call__(pk, qk=None, base=None, where=lambda x: not np.isnan(x))

Compute the entropy of the values in pk for which where returns True.

Parameters:

pk (ndarray, required):
    The discrete probability distribution to find the entropy of.

qk (Optional[numpy.ndarray], default None):
    The second discrete probability distribution to find the relative entropy with.

base (Optional[Union[int, int_]], default None):
    The base of the logarithm used to compute the entropy. If None, the natural logarithm is used.

where (Callable[[Union[int, float, int_, float_]], Union[bool, bool_]], default lambda x: not np.isnan(x)):
    A function that takes a value and returns True or False. By default, a measurement is valid if it is not a NaN value.

Returns:

Union[float, float_]:
    The entropy of the values in pk for which where is True, optionally relative to qk (relative entropy).
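
Beyond the sliding-window examples below, the transform can also be called directly with the signature documented above. A minimal sketch, assuming the documented defaults, is:

import numpy as np
import autonfeat as aft

# Discrete distribution with one missing (NaN) measurement
pk = np.array([0.2, 0.3, np.nan, 0.5])

# Create transform
tf = aft.EntropyTransform()

# NaN values are skipped by the default where predicate;
# base=2 reports the entropy in bits
h = tf(pk, base=2)
print(h)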

Examples

Shannon Entropy

import numpy as np
import autonfeat as aft

# Random data
n_samples = 100
x = np.random.randint(0, 10, n_samples)

# Sliding window
ws = 10
ss = 10
window = aft.SlidingWindow(window_size=ws, step_size=ss)

# Create transform
tf = aft.EntropyTransform()

# Get featurizer
featurizer = window.use(tf)

# Get features
features = featurizer(x)

# Print features
print(features)

KL Divergence

import numpy as np
import autonfeat as aft

# Random data
n_samples = 100
x1 = np.random.rand(n_samples)
x2 = np.random.rand(n_samples)

# Sliding window
ws = 10
ss = 10
window = aft.SlidingWindow(window_size=ws, step_size=ss)

# Create transform
tf = aft.EntropyTransform()

# Get featurizer
featurizer = window.use(tf)

# Get features
features = featurizer(x1, x2)

# Print features
print(features)

If you enjoy using AutonFeat, please consider starring the repository ⭐️.