Selection

allel.stats.selection.ehh_decay(h, truncate=False)
Compute the decay of extended haplotype homozygosity (EHH) moving away from the first variant.
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
truncate : bool, optional
If True, the return array will exclude trailing zeros.
Returns: ehh : ndarray, float, shape (n_variants, )
EHH at successive variants from the first variant.
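The following is a minimal pure-Python sketch of the quantity described above, not the library's implementation. It assumes EHH at variant i is the fraction of haplotype pairs that are identical across variants 0 through i, and that the input is a list of rows (one row per variant, one column per haplotype). The name `ehh_decay_sketch` is hypothetical.

```python
from collections import Counter


def ehh_decay_sketch(h, truncate=False):
    """Illustrative EHH decay; h is a list of rows, one row per variant."""
    n_variants = len(h)
    n_haplotypes = len(h[0])
    n_pairs = n_haplotypes * (n_haplotypes - 1) // 2
    ehh = []
    for i in range(n_variants):
        # Count haplotypes (columns) by their prefix over variants 0..i.
        prefixes = Counter(
            tuple(h[v][j] for v in range(i + 1)) for j in range(n_haplotypes)
        )
        # Number of haplotype pairs that share an identical prefix.
        shared = sum(c * (c - 1) // 2 for c in prefixes.values())
        ehh.append(shared / n_pairs)
    if truncate:
        # Drop trailing zeros, mirroring the documented truncate option.
        while ehh and ehh[-1] == 0:
            ehh.pop()
    return ehh


# Toy data (assumed): 3 variants x 4 haplotypes.
h = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]
print(ehh_decay_sketch(h))
print(ehh_decay_sketch(h, truncate=True))
```

In this toy array all haplotypes agree at the first variant, two of the six pairs still agree after the second, and none agree after the third.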

allel.stats.selection.voight_painting(h)
Paint haplotypes, assigning a unique integer to each shared haplotype prefix.
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
Returns: painting : ndarray, int, shape (n_variants, n_haplotypes)
Painting array.
indices : ndarray, int, shape (n_haplotypes,)
Haplotype indices after sorting by prefix.
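As a rough illustration of what "sorting by prefix" means for the returned indices, the sketch below orders haplotypes lexicographically by their allele sequence, starting from the first variant. It covers only the ordering step, not the painting itself, and assumes a list-of-rows input. The name `prefix_argsort_sketch` is hypothetical and the library's exact tie-breaking may differ.

```python
def prefix_argsort_sketch(h):
    """Return haplotype (column) indices sorted lexicographically by prefix."""
    n_haplotypes = len(h[0])
    # The sort key for haplotype j is its full allele sequence from variant 0 on.
    return sorted(range(n_haplotypes), key=lambda j: tuple(row[j] for row in h))


# Toy data (assumed): 3 variants x 4 haplotypes.
h = [
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
]
indices = prefix_argsort_sketch(h)
print(indices)
```

After sorting, haplotypes that share a long prefix end up adjacent, which is what makes the painted blocks contiguous when plotted.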

allel.stats.selection.plot_voight_painting(painting, palette='colorblind', flank='right', ax=None, height_factor=0.01)
Plot a painting of shared haplotype prefixes.
Parameters: painting : array_like, int, shape (n_variants, n_haplotypes)
Painting array.
ax : axes, optional
The axes on which to draw. If not provided, a new figure will be created.
palette : string, optional
A Seaborn palette name.
flank : {‘right’, ‘left’}, optional
If ‘left’, the painting will be reversed along the first axis.
height_factor : float, optional
If no axes are provided, determine the height of the figure by multiplying the height of the painting array by this number.
Returns: ax : axes

allel.stats.selection.fig_voight_painting(h, index=None, palette='colorblind', height_factor=0.01, fig=None)
Make a figure of shared haplotype prefixes for both left and right flanks, centred on some variant of choice.
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
index : int, optional
Index of the variant within the haplotype array to centre on. If not provided, the middle variant will be used.
palette : string, optional
A Seaborn palette name.
height_factor : float, optional
If no figure is provided, determine the height of the figure by multiplying the height of the painting array by this number.
fig : figure, optional
The figure on which to draw. If not provided, a new figure will be created.
Returns: fig : figure
Notes
N.B., the ordering of haplotypes on the left and right flanks will be different. This means that haplotypes on the right flank will not correspond to haplotypes on the left flank at the same vertical position.

allel.stats.selection.xpehh(h1, h2, pos, min_ehh=0.05)
Compute the unstandardized cross-population extended haplotype homozygosity score (XPEHH) for each variant.
Parameters: h1 : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array for the first population.
h2 : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array for the second population.
pos : array_like, int, shape (n_variants,)
Variant positions on physical or genetic map.
min_ehh : float, optional
Minimum EHH beyond which to truncate integrated haplotype homozygosity calculation.
Returns: score : ndarray, float, shape (n_variants,)
Unstandardized XPEHH scores.
Notes
This function will calculate XPEHH for all variants. To exclude variants below a given minor allele frequency, filter the input haplotype arrays before passing to this function.
This function returns NaN for any EHH calculations where haplotype homozygosity does not decay below min_ehh before reaching the first or last variant. To disable this behaviour, set min_ehh to None.
This function currently does nothing to account for large gaps between variants. There will be edge effects near any large gaps.
Note that the unstandardized score is returned. Usually these scores are then normalised in different allele frequency bins.
Haplotype arrays from the two populations may have different numbers of haplotypes.
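The minor allele frequency filtering mentioned in the notes could be sketched as follows. This is an illustration under stated assumptions: alleles are coded 0 (reference) and 1 (alternate) with no missing calls, and the frequency is computed on the two populations pooled, which is one possible choice rather than a documented requirement. The name `maf_filter_sketch` and the `min_maf` parameter are hypothetical.

```python
def maf_filter_sketch(h1, h2, pos, min_maf=0.05):
    """Keep variants whose pooled minor allele frequency is at least min_maf."""
    kept_h1, kept_h2, kept_pos = [], [], []
    for row1, row2, p in zip(h1, h2, pos):
        alleles = list(row1) + list(row2)
        # Alternate allele frequency, assuming biallelic 0/1 coding.
        alt_freq = sum(alleles) / len(alleles)
        maf = min(alt_freq, 1 - alt_freq)
        if maf >= min_maf:
            # Apply the same selection to both arrays and to the positions.
            kept_h1.append(row1)
            kept_h2.append(row2)
            kept_pos.append(p)
    return kept_h1, kept_h2, kept_pos


# Toy data (assumed): the populations have 4 and 3 haplotypes respectively.
h1 = [[0, 0, 0, 0], [0, 1, 1, 0], [1, 1, 1, 1]]
h2 = [[0, 0, 0], [1, 0, 1], [1, 1, 1]]
pos = [10, 20, 30]
f1, f2, fpos = maf_filter_sketch(h1, h2, pos)
print(fpos)
```

Filtering both arrays and the position array with the same selection keeps the variants aligned, which the function signature requires.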

allel.stats.selection.ihs(h, pos, min_ehh=0.05)
Compute the unstandardized integrated haplotype score (IHS) for each variant, comparing integrated haplotype homozygosity between the reference and alternate alleles.
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
pos : array_like, int, shape (n_variants,)
Variant positions on physical or genetic map.
min_ehh : float, optional
Minimum EHH beyond which to truncate integrated haplotype homozygosity calculation.
Returns: score : ndarray, float, shape (n_variants,)
Unstandardized IHS scores.
Notes
This function will calculate IHS for all variants. To exclude variants below a given minor allele frequency, filter the input haplotype array before passing to this function.
This function computes IHS comparing the reference and alternate alleles. These can be polarised by switching the sign for any variant where the reference allele is derived.
This function returns NaN for any IHS calculations where haplotype homozygosity does not decay below min_ehh before reaching the first or last variant. To disable this behaviour, set min_ehh to None.
This function currently does nothing to account for large gaps between variants. There will be edge effects near any large gaps.
Note that the unstandardized score is returned. Usually these scores are then normalised in different allele frequency bins.
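One way to normalise unstandardized scores within allele frequency bins is sketched below. It is not part of the documented API: the equal-width bins, the use of the population standard deviation, and the name `standardize_by_bins_sketch` are all assumptions made for illustration. NaN scores are left as NaN and excluded from the bin statistics.

```python
import math
from statistics import mean, pstdev


def standardize_by_bins_sketch(scores, alt_freqs, n_bins=2):
    """Standardise scores within equal-width alternate allele frequency bins."""
    # Assign each variant to a bin; a frequency of 1.0 falls in the last bin.
    bins = [min(int(f * n_bins), n_bins - 1) for f in alt_freqs]
    out = [math.nan] * len(scores)
    for b in range(n_bins):
        # Indices of variants in this bin with a usable (non-NaN) score.
        idx = [
            i for i, (s, k) in enumerate(zip(scores, bins))
            if k == b and not math.isnan(s)
        ]
        if len(idx) < 2:
            continue  # not enough data to estimate a spread
        vals = [scores[i] for i in idx]
        mu, sd = mean(vals), pstdev(vals)
        if sd == 0:
            continue  # all scores identical; leave as NaN
        for i in idx:
            out[i] = (scores[i] - mu) / sd
    return out


# Toy data (assumed): five variants with one undefined score.
scores = [1.0, 3.0, math.nan, 10.0, 14.0]
alt_freqs = [0.1, 0.2, 0.3, 0.7, 0.9]
z = standardize_by_bins_sketch(scores, alt_freqs, n_bins=2)
print(z)
```

Real analyses typically use many more bins and variants; two bins are used here only to keep the example small.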

allel.stats.selection.nsl(h)
Compute the unstandardized number of segregating sites by length (nSl) for each variant, comparing the reference and alternate alleles, after Ferrer-Admetlla et al. (2014).
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
Returns: score : ndarray, float, shape (n_variants,)
Unstandardized nSl scores.
Notes
This function will calculate nSl for all variants. To exclude variants below a given minor allele frequency, filter the input haplotype array before passing to this function.
This function expects only segregating sites, so remove any non-segregating sites before passing in the haplotype array.
This function computes nSl by comparing the reference and alternate alleles. These can be polarised by switching the sign for any variant where the reference allele is derived.
This function does nothing about nSl calculations where haplotype homozygosity extends up to the first or last variant. There will be edge effects.
This function currently does nothing to account for large gaps between variants. There will be edge effects near any large gaps.
This function returns unstandardized scores. Typically nSl scores are normalised by subtracting the mean and dividing by the standard deviation.
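The polarisation and normalisation steps described in the notes could look roughly like this. It is a sketch under assumptions: `ref_is_derived` is a hypothetical per-variant boolean list supplied by the caller, the population standard deviation is used, and NaN scores are ignored when computing the mean and standard deviation. The function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def polarize_sketch(scores, ref_is_derived):
    """Flip the sign of scores where the reference allele is derived."""
    return [-s if flip else s for s, flip in zip(scores, ref_is_derived)]


def standardize_sketch(scores):
    """Subtract the mean and divide by the standard deviation, ignoring NaN."""
    vals = [s for s in scores if not math.isnan(s)]
    mu, sd = mean(vals), pstdev(vals)
    # NaN inputs stay NaN because arithmetic on NaN yields NaN.
    return [(s - mu) / sd for s in scores]


# Toy data (assumed): four variants, one with an undefined score.
scores = [0.5, -1.5, math.nan, 2.0]
ref_is_derived = [False, True, False, False]
polarized = polarize_sketch(scores, ref_is_derived)
standardized = standardize_sketch(polarized)
print(polarized)
print(standardized)
```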

allel.stats.selection.haplotype_diversity(h)
Estimate haplotype diversity.
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
Returns: hd : float
Haplotype diversity.
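For orientation, a common estimator of haplotype diversity is (1 - sum of squared haplotype frequencies) scaled by n / (n - 1), where n is the number of haplotypes. The sketch below implements that estimator in pure Python. Whether the library uses exactly this form is an assumption here, and the name `haplotype_diversity_sketch` is hypothetical.

```python
from collections import Counter


def haplotype_diversity_sketch(h):
    """Haplotype diversity from a list of rows (one row per variant)."""
    n = len(h[0])
    # Count each distinct haplotype (column) across all variants.
    counts = Counter(tuple(row[j] for row in h) for j in range(n))
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    # Scale by n / (n - 1) to correct for finite sample size.
    return (1 - sum_sq) * n / (n - 1)


# Toy data (assumed): two distinct haplotypes, each seen twice.
h_two = [[0, 0, 1, 1], [0, 0, 1, 1]]
# Toy data (assumed): four haplotypes, all distinct.
h_all = [[0, 0, 1, 1], [0, 1, 0, 1]]
print(haplotype_diversity_sketch(h_two))
print(haplotype_diversity_sketch(h_all))
```

With this estimator the value is 1 when every sampled haplotype is distinct and 0 when all are identical.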

allel.stats.selection.moving_haplotype_diversity(h, size, start=0, stop=None, step=None)
Estimate haplotype diversity in moving windows.
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
size : int
The window size (number of variants).
start : int, optional
The index at which to start.
stop : int, optional
The index at which to stop.
step : int, optional
The number of variants between start positions of windows. If not given, defaults to the window size, i.e., non-overlapping windows.
Returns: hd : ndarray, float, shape (n_windows,)
Haplotype diversity.
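The way `size`, `start`, `stop` and `step` combine into windows can be sketched as below. This assumes that only complete windows are produced, so a trailing partial window is dropped; that behaviour is an assumption, not something stated above. The name `index_windows_sketch` is hypothetical.

```python
def index_windows_sketch(n_variants, size, start=0, stop=None, step=None):
    """Return (window_start, window_stop) index pairs over the variants."""
    if stop is None:
        stop = n_variants
    if step is None:
        step = size  # default: non-overlapping windows
    windows = []
    for w_start in range(start, stop, step):
        w_stop = w_start + size
        if w_stop > stop:
            break  # assumed: incomplete trailing windows are dropped
        windows.append((w_start, w_stop))
    return windows


# Toy data (assumed): 10 variants x 2 haplotypes.
h = [[i % 2, (i + 1) % 2] for i in range(10)]
non_overlapping = index_windows_sketch(len(h), size=4)
overlapping = index_windows_sketch(len(h), size=4, step=2)
# Each window selects a block of rows (variants) from the haplotype array.
blocks = [h[a:b] for a, b in non_overlapping]
print(non_overlapping)
print(overlapping)
```

A per-window statistic is then computed on each block of rows, giving one value per window.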

allel.stats.selection.garud_h(h)
Compute the H1, H12, H123 and H2/H1 statistics for detecting signatures of soft sweeps, as defined in Garud et al. (2015).
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
Returns: h1 : float
H1 statistic (sum of squares of haplotype frequencies).
h12 : float
H12 statistic (sum of squares of haplotype frequencies, combining the two most common haplotypes into a single frequency).
h123 : float
H123 statistic (sum of squares of haplotype frequencies, combining the three most common haplotypes into a single frequency).
h2_h1 : float
H2/H1 statistic, indicating the “softness” of a sweep.
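The four statistics follow directly from the sorted haplotype frequencies, as this pure-Python sketch shows. It takes H2 to be H1 minus the squared frequency of the most common haplotype, and it assumes a list-of-rows input. The name `garud_h_sketch` is hypothetical and this is not the library's implementation.

```python
from collections import Counter


def garud_h_sketch(h):
    """Return (H1, H12, H123, H2/H1) from a list of rows (one per variant)."""
    n = len(h[0])
    # Count each distinct haplotype (column) and sort frequencies, largest first.
    counts = Counter(tuple(row[j] for row in h) for j in range(n))
    f = sorted((c / n for c in counts.values()), reverse=True)
    h1 = sum(x ** 2 for x in f)
    # Combine the two most common haplotypes into a single frequency.
    h12 = sum(f[:2]) ** 2 + sum(x ** 2 for x in f[2:])
    # Combine the three most common haplotypes into a single frequency.
    h123 = sum(f[:3]) ** 2 + sum(x ** 2 for x in f[3:])
    # H2 excludes the most common haplotype from the sum of squares.
    h2 = h1 - f[0] ** 2
    return h1, h12, h123, h2 / h1


# Toy data (assumed): haplotype frequencies of 0.5, 0.25 and 0.25.
h = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
h1, h12, h123, h2_h1 = garud_h_sketch(h)
print(h1, h12, h123, h2_h1)
```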

allel.stats.selection.moving_garud_h(h, size, start=0, stop=None, step=None)
Compute the H1, H12, H123 and H2/H1 statistics for detecting signatures of soft sweeps, as defined in Garud et al. (2015), in moving windows.
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
size : int
The window size (number of variants).
start : int, optional
The index at which to start.
stop : int, optional
The index at which to stop.
step : int, optional
The number of variants between start positions of windows. If not given, defaults to the window size, i.e., non-overlapping windows.
Returns: h1 : ndarray, float, shape (n_windows,)
H1 statistics (sum of squares of haplotype frequencies).
h12 : ndarray, float, shape (n_windows,)
H12 statistics (sum of squares of haplotype frequencies, combining the two most common haplotypes into a single frequency).
h123 : ndarray, float, shape (n_windows,)
H123 statistics (sum of squares of haplotype frequencies, combining the three most common haplotypes into a single frequency).
h2_h1 : ndarray, float, shape (n_windows,)
H2/H1 statistics, indicating the “softness” of a sweep.

allel.stats.selection.plot_haplotype_frequencies(h, palette='Paired', singleton_color='w', ax=None)
Plot haplotype frequencies.
Parameters: h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
palette : string, optional
A Seaborn palette name.
singleton_color : string, optional
Color to paint singleton haplotypes.
ax : axes, optional
The axes on which to draw. If not provided, a new figure will be created.
Returns: ax : axes

allel.stats.selection.plot_moving_haplotype_frequencies(pos, h, size, start=0, stop=None, n=None, palette='Paired', singleton_color='w', ax=None)
Plot haplotype frequencies in moving windows over the genome.
Parameters: pos : array_like, int, shape (n_items,)
Variant positions, using 1-based coordinates, in ascending order.
h : array_like, int, shape (n_variants, n_haplotypes)
Haplotype array.
size : int
The window size (number of variants).
start : int, optional
The index at which to start.
stop : int, optional
The index at which to stop.
n : int, optional
Color only the n most frequent haplotypes (by default, all non-singleton haplotypes are colored).
palette : string, optional
A Seaborn palette name.
singleton_color : string, optional
Color to paint singleton haplotypes.
ax : axes, optional
The axes on which to draw. If not provided, a new figure will be created.
Returns: ax : axes