
Entropy Function

The entropy function computes the entropy of a probability distribution. Entropy measures the uncertainty of a random variable and is defined as:

\[ H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i) \]

where \(p(x_i)\) is the probability of the \(i\)-th outcome. The entropy is maximized when all outcomes are equally likely. The entropy is zero when the distribution is deterministic.
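As a quick sanity check of these two limiting cases, the following plain-NumPy sketch (independent of AutonFeat) evaluates the formula directly for a uniform and a deterministic distribution:

```python
import numpy as np

def shannon_entropy(pk, base=None):
    # H(X) = -sum p(x_i) * log p(x_i); terms with p(x_i) = 0 contribute 0.
    pk = np.asarray(pk, dtype=float)
    nonzero = pk > 0
    h = -np.sum(pk[nonzero] * np.log(pk[nonzero]))
    if base is not None:
        h /= np.log(base)  # change of base from natural log
    return h

uniform = np.full(4, 0.25)                     # all outcomes equally likely
deterministic = np.array([1.0, 0.0, 0.0, 0.0]) # a single certain outcome

print(shannon_entropy(uniform, base=2))        # 2.0 bits -> maximal for 4 outcomes
print(shannon_entropy(deterministic, base=2))  # 0.0 -> no uncertainty
```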

The entropy function can compute the Shannon entropy of a single discrete probability distribution, or the relative entropy between two discrete probability distributions, also known as the Kullback-Leibler (KL) divergence. The relative entropy is defined as:

\[ D_{KL}(p \| q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)} \]

where \(p\) and \(q\) are the two probability distributions. The relative entropy is always non-negative and is zero if and only if the two distributions are identical.
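A similar plain-NumPy sketch (again independent of AutonFeat) evaluates the relative-entropy formula directly and illustrates both properties:

```python
import numpy as np

def kl_divergence(pk, qk):
    # D_KL(p || q) = sum p(x) * log(p(x) / q(x)); terms with p(x) = 0 contribute 0.
    # Assumes q(x) > 0 wherever p(x) > 0.
    pk = np.asarray(pk, dtype=float)
    qk = np.asarray(qk, dtype=float)
    nonzero = pk > 0
    return np.sum(pk[nonzero] * np.log(pk[nonzero] / qk[nonzero]))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

print(kl_divergence(p, p))  # 0.0 -> identical distributions
print(kl_divergence(p, q))  # > 0 -> non-negative in general
```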

Compute the entropy of the values in pk for which the where predicate is True.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| pk | ndarray | The discrete probability distribution to find the entropy of. | required |
| qk | Optional[numpy.ndarray] | The second discrete probability distribution to compute the relative entropy with respect to. | None |
| base | Optional[Union[int, int_]] | The base of the logarithm used to compute the entropy. If None, the natural logarithm is used. | None |
| where | Callable[[Union[int, float, int_, float_]], Union[bool, bool_]] | A function that takes a value and returns True or False. By default, a measurement is valid if it is not a NaN value. | lambda x: not np.isnan(x) |

Returns:

| Type | Description |
| --- | --- |
| Union[float, float_] | The entropy of the values in pk, optionally with respect to qk (relative entropy), computed over the values for which the where predicate is True. |
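Based on the signature documented above, the transform can also be applied directly to a single window of values. The snippet below is a sketch under that assumption (the keyword names pk, base, and where are taken from the parameter table rather than verified usage):

```python
import numpy as np
import autonfeat.functional as F

# A discrete probability distribution with one missing measurement.
pk = np.array([0.25, 0.25, np.nan, 0.25, 0.25])

# Entropy in bits; the `where` predicate skips the NaN entry,
# mirroring the default behaviour described above.
h = F.entropy_tf(pk, base=2, where=lambda x: not np.isnan(x))
print(h)
```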

Examples

Shannon Entropy

```python
import numpy as np
import autonfeat as aft
import autonfeat.functional as F

# Random data
n_samples = 100
x = np.random.randint(0, 10, n_samples)

# Sliding window
ws = 10
ss = 10
window = aft.SlidingWindow(window_size=ws, step_size=ss)

# Get featurizer
featurizer = window.use(F.entropy_tf)

# Get features
features = featurizer(x)

# Print features
print(features)
```
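With 100 samples, a window size of 10, and a step size of 10, the sliding window covers 10 non-overlapping windows, so features should contain one entropy value per window.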

KL Divergence

```python
import numpy as np
import autonfeat as aft
import autonfeat.functional as F

# Random data
n_samples = 100
x1 = np.random.rand(n_samples)
x2 = np.random.rand(n_samples)

# Sliding window
ws = 10
ss = 10
window = aft.SlidingWindow(window_size=ws, step_size=ss)

# Get featurizer
featurizer = window.use(F.entropy_tf)

# Get features
features = featurizer(x1, x2)

# Print features
print(features)
```
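Here each window of x1 is paired with the corresponding window of x2, so features should contain one relative-entropy value per window pair. The uniform random samples are used purely for illustration; in practice pk and qk are expected to be discrete probability distributions.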

If you enjoy using AutonFeat, please consider starring the repository ⭐️.