komanawa.BASE.generators#

created matt_dumont on: 1/12/23

The generators module contains the classes that generate the parameters for the model, and the BaseSampledParam class that is used to pass prior parameters to the model.

Classes#

BaseSampledParam

Schema for a sampled parameter which captures the distribution and arguments for the distribution to check that the distribution is valid and has not changed.

NormalPathGenerator

Generates a random path of length n: a uniformly drawn starting point followed by normally distributed changes, truncated to the lower and upper bounds.

PseudoUniformChangeGenerator

Generates a random path of length n: a uniformly drawn starting point followed by n-1 changes drawn from a uniform distribution bounded by lower and upper half normal distributions.

UniformPathGenerator

Generates a random path of length n by drawing each point from a uniform distribution between low and up.

Package Contents#

class BaseSampledParam(distribution, *args, **kwargs)#

Schema for a sampled parameter which captures the distribution and arguments for the distribution to check that the distribution is valid and has not changed.

Parameters:
  • distribution – The distribution to use (e.g. scipy.stats.uniform, scipy.stats.norm)

  • args – The arguments to pass to the distribution

  • kwargs – The keyword arguments to pass to the distribution

serialize()#

Serialize the object to a unique string to capture metadata

Returns:

a unique string
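To illustrate the idea of capturing a distribution and its arguments so that a later change can be detected, here is a minimal sketch. SampledParamSketch and its string format are hypothetical stand-ins, not the BaseSampledParam implementation, and the distribution is identified by name only so that the example needs nothing beyond the standard library.

```python
class SampledParamSketch:
    """Hypothetical stand-in for a sampled-parameter schema (not the library class)."""

    def __init__(self, distribution, *args, **kwargs):
        # Assumption: the distribution is identified by a name such as "norm".
        self.distribution = str(distribution)
        self.args = tuple(args)
        self.kwargs = dict(kwargs)

    def serialize(self):
        # Sort keyword arguments so the string does not depend on call order.
        parts = [repr(a) for a in self.args]
        parts += [f"{k}={self.kwargs[k]!r}" for k in sorted(self.kwargs)]
        return f"{self.distribution}({', '.join(parts)})"


prior = SampledParamSketch("norm", loc=0.0, scale=2.0)
same_prior = SampledParamSketch("norm", scale=2.0, loc=0.0)
changed_prior = SampledParamSketch("norm", loc=0.0, scale=3.0)
```

Comparing prior.serialize() with changed_prior.serialize() shows that the scale has changed, while prior and same_prior serialize identically.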

class NormalPathGenerator(start_bounds, low, up, scale, n, delta_center=0)#

Bases: komanawa.BASE.generators.parent_path_change.ParentPathChange

Inheritance diagram of komanawa.BASE.generators.NormalPathGenerator

A base class for the path change generators which defines the KS test for the path change generator.

Generate a random path of length n, with a starting point in the interval start_bounds. The process is to first pick a value from the uniform distribution over start_bounds, then for each additional point in the series pick a value from a normal distribution centered on the previous value with a scale of scale. If a picked value is outside the interval set by the lower and upper bounds, it is truncated to the nearest bound.

The delta_center parameter can be used to set the mean difference between points in the path to a value other than 0. This can be useful for generating paths that represent a known increase in the source data (e.g., intensification).

The up, low, scale, and delta_center parameters can be scalars or lists of length n-1. If they are scalars, the value is used for all points in the path. If they are lists, each value is used for the corresponding point in the path. The first point in the path is always generated from the uniform distribution over start_bounds.
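The generation process described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the NormalPathGenerator implementation: normal_path_sketch, as_list, and the seed argument are hypothetical names introduced for this example.

```python
import random


def as_list(value, length):
    """Broadcast a scalar to a list, or check that a list has the expected length."""
    if isinstance(value, (int, float)):
        return [value] * length
    values = list(value)
    if len(values) != length:
        raise ValueError(f"expected {length} values, got {len(values)}")
    return values


def normal_path_sketch(start_bounds, low, up, scale, n, delta_center=0, seed=None):
    """Illustrative only: uniform start, then n-1 truncated normal steps."""
    rng = random.Random(seed)
    low, up = as_list(low, n - 1), as_list(up, n - 1)
    scale, center = as_list(scale, n - 1), as_list(delta_center, n - 1)
    path = [rng.uniform(*start_bounds)]
    for i in range(n - 1):
        # Draw around the previous value, shifted by the mean change.
        value = rng.gauss(path[-1] + center[i], scale[i])
        # Truncate to the bounds that apply to this change.
        path.append(min(max(value, low[i]), up[i]))
    return path


path = normal_path_sketch((1.0, 2.0), low=0.0, up=5.0, scale=0.5, n=10, seed=1)
```

Passing a positive delta_center in this sketch shifts every step upward on average, which corresponds to the known-increase use case mentioned above.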

Parameters:
  • start_bounds – (low, high) bounds for the starting point

  • up – upper bound for the path, can be a scalar or a list of length n-1 (one per change)

  • low – lower bound for the path, can be a scalar or a list of length n-1 (one per change)

  • scale – scale for the normal distribution used to generate the changes along the path, can be a scalar or a list of length n-1 (one per change)

  • n – length of the path

  • delta_center – center of the normal distribution used to generate the changes along the path, can be a scalar or a list of length n-1 (one per change)

comptest(sample, test=kstest, n_gen=None, random_state=None, **kwargs)#

Calculate the scipy test for the path change generator. The test is calculated for each time step in the path. The first time step is the starting point of the path, the second time step is the difference between the first and second points in the path, the third time step is the difference between the second and third points in the path, etc.

The test can be any of the scipy tests like kstest, Mann-Whitney U test, wasserstein_distance_nd, etc.

Parameters:
  • sample – sample to test shape=(nsims, self.n)

  • n_gen – Number of samples to generate for the KS test, if None use the length of the sample

  • kwargs – other keyword arguments to pass to scipy.stats test

Returns:

[testResult, …] for each time step in the path (see scipy function)
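The per-time-step layout described for comptest can be sketched as follows. This is a simplified illustration, not the library's code: to_steps and ks_statistic are hypothetical helpers, and the hand-written two-sample KS statistic stands in for the scipy test that the real method applies.

```python
def to_steps(sample):
    """Convert paths (nsims x n) to per-time-step columns: start, then differences."""
    n = len(sample[0])
    steps = [[path[0] for path in sample]]
    for i in range(1, n):
        steps.append([path[i] - path[i - 1] for path in sample])
    return steps


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    gap = 0.0
    for x in list(a) + list(b):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


observed = [[1.0, 1.5, 1.25], [2.0, 2.0, 3.0]]
generated = [[1.0, 1.5, 1.25], [2.0, 2.0, 3.0]]
# One statistic per time step, comparing the sample with generated paths.
stats = [ks_statistic(o, g) for o, g in zip(to_steps(observed), to_steps(generated))]
```

Because the two sets of paths are identical here, every per-step statistic is zero; in practice the generated paths would come from the generator's rvs method.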

logpdf(q0)#

Calculate the log probability of the path (q0) given the parameters of the generator.

Parameters:

q0

Returns:

plot_path(x=None, nsigmas=2, ax=None, round_lab=2, lab_rotation=0)#

Plot the path generated by the generator

Parameters:
  • x – None or an array of length n to use as the x axis; if None, the x axis will be the integers from 0 to n

  • nsigmas – number of standard deviations to plot around the center path (default 2, approximately the 95% interval)

  • ax – None or an axis to plot on

  • round_lab – number of decimal places to round the labels to

  • lab_rotation – rotation of the x axis labels

Returns:

fig, ax

plot_test(test_results, ax, x=None, plot_stats=('statistic', 'pvalue'), plot_on_twinx=True, colors=None, **kwargs)#

Plot the results of the KS test for the path change generator.

Parameters:
  • test_results – [testResult, …] for each time step in the path

  • ax – axis to plot on or None (if None a new figure is created)

  • plot_stats – tuple subset of result values to plot or None (assumes results are float)

  • plot_on_twinx – if True plot the first of plot_stats on the ax and then each subsequent statistic on individual twinx axes

  • colors – dictionary of colors for each statistic or None (if None use default colors)

  • kwargs – other keyword arguments to pass to the ax.plot

Returns:

fig, ax, (handles, labels)

rvs(random_state=None)#

Generate a random path of length n, with a starting point in the interval start_bounds. The process is to first pick a value from the uniform distribution over start_bounds, then for each additional point in the series pick a value from a normal distribution centered on the previous value with a scale of scale. If a picked value is outside the interval set by the lower and upper bounds, it is truncated to the nearest bound.

Parameters:

random_state – None to use the global random state, or an integer to seed the random state. As this is a Markov chain process, the random state is used to generate a suite of integer seeds to make each delta deterministic.
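The idea of deriving a suite of integer seeds from one random state can be sketched as below. This is an assumption about the general technique, not the library's code: step_seeds and seeded_deltas are hypothetical helpers, and unlike the documented behavior, None here seeds from system entropy rather than the global random state.

```python
import random


def step_seeds(random_state, n):
    """Derive one integer seed per path element from a single master seed."""
    master = random.Random(random_state)  # None seeds from system entropy here
    return [master.randrange(2**32) for _ in range(n)]


def seeded_deltas(random_state, n, scale=1.0):
    """Draw each delta from its own generator, seeded from the suite."""
    return [random.Random(s).gauss(0.0, scale) for s in step_seeds(random_state, n)]


first = seeded_deltas(42, 5)
second = seeded_deltas(42, 5)
```

With the same random_state, first and second are identical, and the delta at each position depends only on its own seed.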

Returns:

class PseudoUniformChangeGenerator(start_bounds, low, up, scale_low, scale_up, uniform_up, uniform_low, n)#

Bases: komanawa.BASE.generators.parent_path_change.ParentPathChange

Inheritance diagram of komanawa.BASE.generators.PseudoUniformChangeGenerator

A base class for the path change generators which defines the KS test for the path change generator.

Generate a random path of length n, with a starting point in the interval start_bounds and then a series of n-1 deltas picked from a uniform distribution bounded by lower and upper half normal distributions.

The process is to first pick a value from the uniform distribution over start_bounds, then for each additional point in the series pick a value from a distribution (equally probable) defined by:

  • -inf -> uniform_low : half normal distribution with scale scale_low

  • uniform_low -> uniform_up : uniform distribution

  • uniform_up -> inf : half normal distribution with scale scale_up
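One reading of the three regions listed above can be sketched with the standard library. The sketch assumes that "equally probable" means each region is chosen with probability 1/3, which the text does not state explicitly; pseudo_uniform_delta is a hypothetical helper, not part of the package.

```python
import random


def pseudo_uniform_delta(rng, uniform_low, uniform_up, scale_low, scale_up):
    """Illustrative draw of one delta; assumes the three regions are equally likely."""
    region = rng.choice(("lower", "middle", "upper"))
    if region == "lower":
        # Half normal tail extending below uniform_low.
        return uniform_low - abs(rng.gauss(0.0, scale_low))
    if region == "upper":
        # Half normal tail extending above uniform_up.
        return uniform_up + abs(rng.gauss(0.0, scale_up))
    # Flat region between the two tails.
    return rng.uniform(uniform_low, uniform_up)


rng = random.Random(0)
deltas = [pseudo_uniform_delta(rng, -1.0, 1.0, 0.5, 0.5) for _ in range(3000)]
# Fraction of draws that fall in the flat region.
inside = sum(-1.0 <= d <= 1.0 for d in deltas) / len(deltas)
```

Under this assumption roughly one third of the deltas land between uniform_low and uniform_up, with the rest spread over the two half normal tails.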

Parameters:
  • start_bounds – (low, high) bounds for the starting point

  • up – upper bound for the path, can be a scalar or a list of length n-1 (one per change)

  • low – lower bound for the path, can be a scalar or a list of length n-1 (one per change)

  • scale_low – scale (always positive) for the half normal distribution from -inf -> uniform_low, can be a scalar or a list of length n-1 (one per change)

  • scale_up – scale (always positive) for the half normal distribution from uniform_up -> inf, can be a scalar or a list of length n-1 (one per change)

  • uniform_up – upper bound for the uniform distribution from uniform_low -> uniform_up, can be a scalar or a list of length n-1 (one per change)

  • uniform_low – lower bound for the uniform distribution from uniform_low -> uniform_up, can be a scalar or a list of length n-1 (one per change)

  • n – length of the path

comptest(sample, test=kstest, n_gen=None, random_state=None, **kwargs)#

Calculate the scipy test for the path change generator. The test is calculated for each time step in the path. The first time step is the starting point of the path, the second time step is the difference between the first and second points in the path, the third time step is the difference between the second and third points in the path, etc.

The test can be any of the scipy tests like kstest, Mann-Whitney U test, wasserstein_distance_nd, etc.

Parameters:
  • sample – sample to test shape=(nsims, self.n)

  • n_gen – Number of samples to generate for the KS test, if None use the length of the sample

  • kwargs – other keyword arguments to pass to scipy.stats test

Returns:

[testResult, …] for each time step in the path (see scipy function)

logpdf(q0)#

Calculate the log probability of the path (q0) given the parameters of the generator.

Parameters:

q0

Returns:

plot_path(x=None, nsigmas=2, ax=None, ax_delta=None, round_lab=2, lab_rotation=0)#

Plot the path generated by the generator

Parameters:
  • x – None or an array of length n to use as the x axis; if None, the x axis will be the integers from 0 to n

  • nsigmas – number of standard deviations to plot around the center path (default 2, approximately the 95% interval)

  • ax – None or an axis to plot the paths on

  • ax_delta – None or an axis to plot the deltas on

  • round_lab – number of decimal places to round the labels to

  • lab_rotation – rotation of the x axis labels

Returns:

fig, ax

plot_test(test_results, ax, x=None, plot_stats=('statistic', 'pvalue'), plot_on_twinx=True, colors=None, **kwargs)#

Plot the results of the KS test for the path change generator.

Parameters:
  • test_results – [testResult, …] for each time step in the path

  • ax – axis to plot on or None (if None a new figure is created)

  • plot_stats – tuple subset of result values to plot or None (assumes results are float)

  • plot_on_twinx – if True plot the first of plot_stats on the ax and then each subsequent statistic on individual twinx axes

  • colors – dictionary of colors for each statistic or None (if None use default colors)

  • kwargs – other keyword arguments to pass to the ax.plot

Returns:

fig, ax, (handles, labels)

rvs(random_state=None)#

Generate a random path of length n, with a starting point in the interval start_bounds followed by n-1 deltas picked from a uniform distribution bounded by lower and upper half normal distributions, as described for the class. If a picked value is outside the interval set by the lower and upper bounds, it is truncated to the nearest bound.

Parameters:

random_state – None to use the global random state, or an integer to seed the random state. As this is a Markov chain process, the random state is used to generate a suite of integer seeds to make each delta deterministic.

Returns:

class UniformPathGenerator(low, up, n)#

Bases: komanawa.BASE.generators.parent_path_change.ParentPathChange

Inheritance diagram of komanawa.BASE.generators.UniformPathGenerator

A base class for the path change generators which defines the KS test for the path change generator.

Generate a random path of length n, within the bounds of low and up. The path is generated by picking a value from the uniform distribution between low[i] and up[i] for each point in the path.

The low and up parameters can be scalars or lists of length n. If they are scalars, the value is used for all points in the path. If they are lists, each value is used for the corresponding point in the path.
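The description above can be sketched in a few lines of standard-library Python. This is an illustration only, not the UniformPathGenerator implementation: uniform_path_sketch and its seed argument are hypothetical names.

```python
import random


def uniform_path_sketch(low, up, n, seed=None):
    """Illustrative only: draw each point independently between low[i] and up[i]."""
    rng = random.Random(seed)
    # Scalars are reused for every point; lists must supply one value per point.
    lows = [low] * n if isinstance(low, (int, float)) else list(low)
    ups = [up] * n if isinstance(up, (int, float)) else list(up)
    if len(lows) != n or len(ups) != n:
        raise ValueError("low and up must be scalars or have length n")
    return [rng.uniform(lo, hi) for lo, hi in zip(lows, ups)]


path = uniform_path_sketch(low=[0.0, 1.0, 2.0], up=3.0, n=3, seed=7)
```

Here low varies per point while the scalar up is shared by all three points.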

Parameters:
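  • low – lower bound for the path, can be a scalar or a list of length n

  • up – upper bound for the path, can be a scalar or a list of length n

  • n – length of the path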

comptest(sample, test=kstest, n_gen=None, random_state=None, **kwargs)#

Calculate the scipy test for the path change generator. The test is calculated for each time step in the path. The first time step is the starting point of the path, the second time step is the difference between the first and second points in the path, the third time step is the difference between the second and third points in the path, etc.

The test can be any of the scipy tests like kstest, Mann-Whitney U test, wasserstein_distance_nd, etc.

Parameters:
  • sample – sample to test shape=(nsims, self.n)

  • n_gen – Number of samples to generate for the KS test, if None use the length of the sample

  • kwargs – other keyword arguments to pass to scipy.stats test

Returns:

[testResult, …] for each time step in the path (see scipy function)

logpdf(q0)#

Calculate the log probability of the path (q0) given the parameters of the generator.
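For a path whose points are independent uniform draws, the log probability is the sum of the log densities of the individual points. The sketch below shows that standard calculation; uniform_path_logpdf is a hypothetical helper and may differ from how the library computes logpdf.

```python
import math


def uniform_path_logpdf(q0, lows, ups):
    """Log density of a path whose points are independent uniform draws."""
    total = 0.0
    for value, lo, hi in zip(q0, lows, ups):
        if not lo <= value <= hi:
            return -math.inf  # zero density outside the bounds
        # Each point contributes log(1 / (hi - lo)).
        total -= math.log(hi - lo)
    return total


logp = uniform_path_logpdf([0.5, 1.5], lows=[0.0, 1.0], ups=[2.0, 3.0])
```

Both intervals in this example have width 2, so logp equals -2 * log(2); a path with any point outside its bounds gets a log probability of negative infinity.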

Parameters:

q0

Returns:

plot_path(x=None, ax=None, round_lab=2, lab_rotation=0, xlab=None)#

Plot the path with the bounds.

Parameters:
  • x – None or an array of length n to use as the x axis, if None the x axis will be the integers from 0 to n

  • ax – None or an axis to plot on

  • round_lab – the number of decimal places to round the labels to

  • lab_rotation – rotation of the x axis labels

  • xlab – None or strings to use as the xtick labels

Returns:

plot_test(test_results, ax, x=None, plot_stats=('statistic', 'pvalue'), plot_on_twinx=True, colors=None, **kwargs)#

Plot the results of the KS test for the path change generator.

Parameters:
  • test_results – [testResult, …] for each time step in the path

  • ax – axis to plot on or None (if None a new figure is created)

  • plot_stats – tuple subset of result values to plot or None (assumes results are float)

  • plot_on_twinx – if True plot the first of plot_stats on the ax and then each subsequent statistic on individual twinx axes

  • colors – dictionary of colors for each statistic or None (if None use default colors)

  • kwargs – other keyword arguments to pass to the ax.plot

Returns:

fig, ax, (handles, labels)

rvs(random_state=None)#

Generate a random path of length n, within the bounds of low and up, by picking a value from the uniform distribution between low[i] and up[i] for each point in the path.

Parameters:

random_state – None to use the global random state, or an integer to seed the random state. As this is a Markov chain process, the random state is used to generate a suite of integer seeds to make each delta deterministic.

Returns: