Window utilities
allel.stats.window.moving_statistic(values, statistic, size, start=0, stop=None, step=None)
Calculate a statistic in a moving window over values.
Parameters: values : array_like
The data to summarise.
statistic : function
The statistic to compute within each window.
size : int
The window size (number of values).
start : int, optional
The index at which to start.
stop : int, optional
The index at which to stop.
step : int, optional
The distance between start positions of windows. If not given, defaults to the window size, i.e., non-overlapping windows.
Returns: out : ndarray, shape (n_windows,)
The value of the statistic for each window.
Examples
>>> import numpy as np
>>> import allel
>>> values = [2, 5, 8, 16]
>>> allel.stats.moving_statistic(values, np.sum, size=2)
array([ 7, 24])
>>> allel.stats.moving_statistic(values, np.sum, size=2, step=1)
array([ 7, 13, 24])
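The windowing logic described above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the names `moving_windows` and `moving_statistic_sketch` are hypothetical, and the sketch assumes that a trailing window with fewer than `size` values is dropped.

```python
def moving_windows(n, size, start=0, stop=None, step=None):
    """Yield (begin, end) index pairs for full windows over n values."""
    if stop is None:
        stop = n
    if step is None:
        step = size  # non-overlapping windows by default
    # Assumption: only windows holding exactly `size` values are produced.
    for begin in range(start, stop - size + 1, step):
        yield begin, begin + size  # end is exclusive, as in Python slicing


def moving_statistic_sketch(values, statistic, size, start=0, stop=None, step=None):
    """Apply `statistic` to each full window of `values`."""
    return [
        statistic(values[begin:end])
        for begin, end in moving_windows(len(values), size, start, stop, step)
    ]


values = [2, 5, 8, 16]
non_overlapping = moving_statistic_sketch(values, sum, size=2)
sliding = moving_statistic_sketch(values, sum, size=2, step=1)
```

With the values from the example above, `non_overlapping` holds the sums of the two disjoint pairs and `sliding` holds the sums of the three adjacent pairs.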
-
allel.stats.window.
windowed_count
(pos, size=None, start=None, stop=None, step=None, windows=None)[source]¶ Count the number of items in windows over a single chromosome/contig.
Parameters: pos : array_like, int, shape (n_items,)
The item positions in ascending order, using 1-based coordinates.
size : int, optional
The window size (number of bases).
start : int, optional
The position at which to start (1-based).
stop : int, optional
The position at which to stop (1-based).
step : int, optional
The distance between start positions of windows. If not given, defaults to the window size, i.e., non-overlapping windows.
windows : array_like, int, shape (n_windows, 2), optional
Manually specify the windows to use as a sequence of (window_start, window_stop) positions, using 1-based coordinates. Overrides the size/start/stop/step parameters.
Returns: counts : ndarray, int, shape (n_windows,)
The number of items in each window.
windows : ndarray, int, shape (n_windows, 2)
The windows used, as an array of (window_start, window_stop) positions, using 1-based coordinates.
Notes
The window stop positions are included within a window.
The final window will be truncated to the specified stop position, and so may be smaller than the other windows.
Examples
Non-overlapping windows:
>>> import allel
>>> pos = [1, 7, 12, 15, 28]
>>> counts, windows = allel.stats.windowed_count(pos, size=10)
>>> counts
array([2, 2, 1])
>>> windows
array([[ 1, 10],
       [11, 20],
       [21, 28]])
Half-overlapping windows:
>>> counts, windows = allel.stats.windowed_count(pos, size=10, step=5)
>>> counts
array([2, 3, 2, 0, 1])
>>> windows
array([[ 1, 10],
       [ 6, 15],
       [11, 20],
       [16, 25],
       [21, 28]])
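The two points in the Notes (inclusive stop positions and a truncated final window) can be illustrated with a small standard-library sketch. The function names are hypothetical and this is not the library's code. It assumes that `start` and `stop` default to the first and last item positions, which is consistent with the examples above.

```python
from bisect import bisect_left, bisect_right


def position_windows_sketch(pos, size, start=None, stop=None, step=None):
    """Build inclusive, 1-based (window_start, window_stop) pairs."""
    # Assumption: defaults are taken from the first and last positions.
    if start is None:
        start = pos[0]
    if stop is None:
        stop = pos[-1]
    if step is None:
        step = size  # non-overlapping windows by default
    windows = []
    for window_start in range(start, stop + 1, step):
        # Truncate the final window to the overall stop position.
        window_stop = min(window_start + size - 1, stop)
        windows.append((window_start, window_stop))
        if window_stop == stop:
            break
    return windows


def windowed_count_sketch(pos, windows):
    """Count sorted positions falling in each inclusive window."""
    return [
        bisect_right(pos, window_stop) - bisect_left(pos, window_start)
        for window_start, window_stop in windows
    ]


pos = [1, 7, 12, 15, 28]
windows = position_windows_sketch(pos, size=10, step=5)
counts = windowed_count_sketch(pos, windows)
```

Because `pos` is sorted, each count is a pair of binary searches: `bisect_left` finds the first item at or after the window start, and `bisect_right` finds the point just past the last item at or before the window stop.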
-
allel.stats.window.
windowed_statistic
(pos, values, statistic, size=None, start=None, stop=None, step=None, windows=None, fill=nan)[source]¶ Calculate a statistic from items in windows over a single chromosome/contig.
Parameters: pos : array_like, int, shape (n_items,)
The item positions in ascending order, using 1-based coordinates.
values : array_like, int, shape (n_items,)
The values to summarise. May also be a tuple of values arrays, in which case each array will be sliced and passed through to the statistic function as separate arguments.
statistic : function
The statistic to compute.
size : int, optional
The window size (number of bases).
start : int, optional
The position at which to start (1-based).
stop : int, optional
The position at which to stop (1-based).
step : int, optional
The distance between start positions of windows. If not given, defaults to the window size, i.e., non-overlapping windows.
windows : array_like, int, shape (n_windows, 2), optional
Manually specify the windows to use as a sequence of (window_start, window_stop) positions, using 1-based coordinates. Overrides the size/start/stop/step parameters.
fill : object, optional
The value to use where a window is empty, i.e., contains no items.
Returns: out : ndarray, shape (n_windows,)
The value of the statistic for each window.
windows : ndarray, int, shape (n_windows, 2)
The windows used, as an array of (window_start, window_stop) positions, using 1-based coordinates.
counts : ndarray, int, shape (n_windows,)
The number of items in each window.
Notes
The window stop positions are included within a window.
The final window will be truncated to the specified stop position, and so may be smaller than the other windows.
Examples
Count non-zero (i.e., True) items in non-overlapping windows:
>>> import numpy as np
>>> import allel
>>> pos = [1, 7, 12, 15, 28]
>>> values = [True, False, True, False, False]
>>> nnz, windows, counts = allel.stats.windowed_statistic(
...     pos, values, statistic=np.count_nonzero, size=10
... )
>>> nnz
array([1, 1, 0])
>>> windows
array([[ 1, 10],
       [11, 20],
       [21, 28]])
>>> counts
array([2, 2, 1])
Compute a sum over items in half-overlapping windows:
>>> values = [3, 4, 2, 6, 9]
>>> x, windows, counts = allel.stats.windowed_statistic(
...     pos, values, statistic=np.sum, size=10, step=5, fill=0
... )
>>> x
array([ 7, 12,  8,  0,  9])
>>> windows
array([[ 1, 10],
       [ 6, 15],
       [11, 20],
       [16, 25],
       [21, 28]])
>>> counts
array([2, 3, 2, 0, 1])
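The handling of `fill` and of a tuple of values arrays can be sketched as follows. This is a standard-library illustration under stated assumptions, not the library's implementation: `windowed_statistic_sketch` and `weighted_sum` are hypothetical names, and the sketch takes explicit inclusive windows rather than deriving them from `size` and `step`.

```python
from bisect import bisect_left, bisect_right


def windowed_statistic_sketch(pos, values, statistic, windows, fill=float("nan")):
    """Apply `statistic` to the items falling in each inclusive window."""
    # Accept one values sequence or a tuple of parallel sequences.
    arrays = values if isinstance(values, tuple) else (values,)
    out, counts = [], []
    for window_start, window_stop in windows:
        lo = bisect_left(pos, window_start)
        hi = bisect_right(pos, window_stop)
        n = hi - lo
        counts.append(n)
        if n == 0:
            out.append(fill)  # empty window: use the fill value
        else:
            # Each array is sliced and passed as a separate argument.
            out.append(statistic(*(a[lo:hi] for a in arrays)))
    return out, counts


def weighted_sum(x, w):
    """Hypothetical two-argument statistic used for illustration."""
    return sum(xi * wi for xi, wi in zip(x, w))


pos = [1, 7, 12, 15, 28]
overlapping = [(1, 10), (6, 15), (11, 20), (16, 25), (21, 28)]
sums, counts = windowed_statistic_sketch(pos, [3, 4, 2, 6, 9], sum, overlapping, fill=0)

disjoint = [(1, 10), (11, 20), (21, 28)]
weighted, _ = windowed_statistic_sketch(
    pos, ([3, 4, 2, 6, 9], [1, 0, 1, 0, 1]), weighted_sum, disjoint
)
```

In the second call the statistic receives two slices per window, one from each array in the tuple, which is how a statistic needing several aligned inputs can be computed per window.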
-
allel.stats.window.
per_base
(x, windows, is_accessible=None, fill=nan)[source]¶ Calculate the per-base value of a windowed statistic.
Parameters: x : array_like, shape (n_windows,)
The statistic to average per-base.
windows : array_like, int, shape (n_windows, 2)
The windows used, as an array of (window_start, window_stop) positions, using 1-based coordinates.
is_accessible : array_like, bool, shape (len(contig),), optional
Boolean array indicating accessibility status for all positions in the chromosome/contig.
fill : object, optional
Use this value where there are no accessible bases in a window.
Returns: y : ndarray, float, shape (n_windows,)
The input array divided by the number of (accessible) bases in each window.
n_bases : ndarray, int, shape (n_windows,)
The number of (accessible) bases in each window.
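The per-base calculation can be sketched in plain Python. This is an illustration, not the library's code: `per_base_sketch` is a hypothetical name, and the sketch assumes that index 0 of `is_accessible` describes position 1 of the contig, so that the inclusive 1-based window (window_start, window_stop) maps to the slice `[window_start - 1:window_stop]`.

```python
def per_base_sketch(x, windows, is_accessible=None, fill=float("nan")):
    """Divide each windowed value by the number of (accessible) bases."""
    n_bases = []
    for window_start, window_stop in windows:
        if is_accessible is None:
            # Window stops are inclusive, hence the + 1.
            n_bases.append(window_stop - window_start + 1)
        else:
            # Assumption: is_accessible[0] corresponds to position 1.
            n_bases.append(sum(is_accessible[window_start - 1:window_stop]))
    # Use the fill value where a window has no accessible bases.
    y = [xi / n if n > 0 else fill for xi, n in zip(x, n_bases)]
    return y, n_bases


windows = [(1, 10), (11, 20), (21, 28)]
x = [2, 2, 1]

y_all, n_all = per_base_sketch(x, windows)

# Positions 1-10 accessible, 11-20 inaccessible, 21-28 alternating.
is_accessible = [True] * 10 + [False] * 10 + [True, False] * 4
y_acc, n_acc = per_base_sketch(x, windows, is_accessible, fill=0.0)
```

Without an accessibility mask the denominators are simply the window lengths. With the mask, the middle window has no accessible bases and so takes the fill value.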