Selection

allel.stats.selection.ehh_decay(h, truncate=False)

Compute the decay of extended haplotype homozygosity (EHH) moving away from the first variant.

Parameters:

h : array_like, int, shape (n_variants, n_haplotypes)

Haplotype array.

truncate : bool, optional

If True, the returned array excludes trailing zeros.

Returns:

ehh : ndarray, float, shape (n_variants,)

EHH at successive variants from the first variant.
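As a rough illustration of the quantity described above, the sketch below computes, at each variant, the fraction of haplotype pairs that are still identical from the first variant onwards. It is a plain-Python sketch under that reading of EHH, not the library's implementation; the name `ehh_decay_sketch` is hypothetical, and it assumes at least two haplotypes.

```python
from itertools import combinations


def ehh_decay_sketch(h, truncate=False):
    """Illustrative EHH decay over a haplotype array given as nested lists.

    h has one row per variant and one column per haplotype.
    """
    n_haplotypes = len(h[0])
    pairs = list(combinations(range(n_haplotypes), 2))
    n_pairs = len(pairs)
    ehh = []
    for row in h:
        # Keep only the pairs that are still identical at this variant.
        pairs = [(a, b) for a, b in pairs if row[a] == row[b]]
        ehh.append(len(pairs) / n_pairs)
    if truncate:
        # Drop trailing zeros, mirroring the documented `truncate` option.
        while ehh and ehh[-1] == 0:
            ehh.pop()
    return ehh


h = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
]
print(ehh_decay_sketch(h))                 # three values: 1, then 1/3, then 0
print(ehh_decay_sketch(h, truncate=True))  # the trailing zero is dropped
```

In this toy array all six haplotype pairs agree at the first variant, two pairs still agree after the second, and none agree after the third.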

allel.stats.selection.voight_painting(h)

Paint haplotypes, assigning a unique integer to each shared haplotype prefix.

Parameters:

h : array_like, int, shape (n_variants, n_haplotypes)

Haplotype array.

Returns:

painting : ndarray, int, shape (n_variants, n_haplotypes)

Painting array.
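One way to picture the painting is sketched below. At each variant, haplotypes are grouped by their prefix from the first variant up to that point, and each group of two or more haplotypes receives its own integer label. This is a simplified sketch and not the library's algorithm: the name `paint_prefixes_sketch`, the use of 0 for haplotypes that share a prefix with no other haplotype, and the reuse of a label while a group stays unchanged are all assumptions. The library may order haplotypes and number labels differently.

```python
from collections import defaultdict


def paint_prefixes_sketch(h):
    """Label groups of haplotypes that share an identical prefix.

    h has one row per variant and one column per haplotype.
    """
    n_haplotypes = len(h[0])
    prefixes = [() for _ in range(n_haplotypes)]
    labels = {}      # group membership -> integer label
    next_label = 1
    painting = []
    for row in h:
        # Extend every haplotype's prefix by its allele at this variant.
        prefixes = [p + (a,) for p, a in zip(prefixes, row)]
        groups = defaultdict(list)
        for j, p in enumerate(prefixes):
            groups[p].append(j)
        painted = [0] * n_haplotypes
        for members in groups.values():
            if len(members) < 2:
                continue  # prefix not shared with any other haplotype
            key = tuple(members)
            if key not in labels:
                labels[key] = next_label
                next_label += 1
            for j in members:
                painted[j] = labels[key]
        painting.append(painted)
    return painting


h = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
]
for painted_row in paint_prefixes_sketch(h):
    print(painted_row)
```

The result has the same shape as the input, so it can be drawn as an image with variants on one axis and haplotypes on the other.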

allel.stats.selection.xpehh(h1, h2, pos, min_ehh=0)

Compute the unstandardized cross-population extended haplotype homozygosity score (XPEHH) for each variant.

Parameters:

h1 : array_like, int, shape (n_variants, n_haplotypes)

Haplotype array for the first population.

h2 : array_like, int, shape (n_variants, n_haplotypes)

Haplotype array for the second population.

pos : array_like, int, shape (n_variants,)

Variant positions on physical or genetic map.

min_ehh : float, optional

EHH threshold. Integration of haplotype homozygosity is truncated once EHH falls below this value.

Returns:

score : ndarray, float, shape (n_variants,)

Unstandardized XPEHH scores.

Notes

This function will calculate XPEHH for all variants. To exclude variants below a given minor allele frequency, filter the input haplotype arrays before passing to this function.

This function does not correct for cases where haplotype homozygosity extends up to the first or last variant, so scores near either end of the array are subject to edge effects.

This function currently does not account for large gaps between variants, so scores near any large gap are also subject to edge effects.

The returned scores are unstandardized. XPEHH scores are usually standardized afterwards, typically across all variants genome-wide rather than within allele frequency bins.

Haplotype arrays from the two populations may have different numbers of haplotypes.
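The roles of `pos` and `min_ehh` can be sketched as follows. The sketch assumes that integrated haplotype homozygosity (IHH) is the trapezoidal area under an EHH curve plotted against position, and that the unstandardized score is the natural log of the first population's IHH over the second's. It integrates in one direction only and takes hypothetical EHH curves as input, so it illustrates the idea rather than the library's implementation. The function names are hypothetical.

```python
import math


def integrate_ehh_sketch(ehh, pos, min_ehh=0):
    """Trapezoidal area under an EHH curve, truncated at min_ehh.

    `ehh` and `pos` are parallel sequences running away from a focal
    variant in one direction.
    """
    ihh = 0.0
    for i in range(1, len(ehh)):
        if ehh[i] < min_ehh:
            # EHH has decayed below the threshold: stop integrating.
            break
        gap = abs(pos[i] - pos[i - 1])
        ihh += gap * (ehh[i] + ehh[i - 1]) / 2
    return ihh


def xpehh_unstandardized_sketch(ihh1, ihh2):
    """Log ratio of IHH in the first population to IHH in the second."""
    return math.log(ihh1 / ihh2)


pos = [100, 110, 120]
ehh_pop1 = [1.0, 0.5, 0.0]  # hypothetical EHH curve, first population
ehh_pop2 = [1.0, 0.2, 0.0]  # hypothetical EHH curve, second population

ihh1 = integrate_ehh_sketch(ehh_pop1, pos)
ihh2 = integrate_ehh_sketch(ehh_pop2, pos)
score = xpehh_unstandardized_sketch(ihh1, ihh2)
print(ihh1, ihh2, score)  # positive score: homozygosity extends further in population 1
```

Because the area is measured against `pos`, a large gap between neighbouring variants contributes a large trapezoid, which is one way to see why gaps distort the score.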

allel.stats.selection.ihs(h, pos, min_ehh=0)

Compute the unstandardized integrated haplotype score (IHS) for each variant, comparing integrated haplotype homozygosity between the reference and alternate alleles.

Parameters:

h : array_like, int, shape (n_variants, n_haplotypes)

Haplotype array.

pos : array_like, int, shape (n_variants,)

Variant positions on physical or genetic map.

min_ehh : float, optional

EHH threshold. Integration of haplotype homozygosity is truncated once EHH falls below this value.

Returns:

score : ndarray, float, shape (n_variants,)

Unstandardized IHS scores.

Notes

This function will calculate IHS for all variants. To exclude variants below a given minor allele frequency, filter the input haplotype array before passing to this function.
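A minimal sketch of such a filter is shown below, assuming biallelic variants coded 0 (reference) and 1 (alternate) and plain nested lists as input. The name `filter_by_maf_sketch` and the threshold are hypothetical. Positions are filtered together with the haplotype rows so that the two inputs stay aligned.

```python
def filter_by_maf_sketch(h, pos, min_maf):
    """Keep variants whose minor allele frequency is at least min_maf.

    Assumes alleles are coded 0 (reference) and 1 (alternate).
    """
    kept_h, kept_pos = [], []
    for row, p in zip(h, pos):
        alt_freq = sum(row) / len(row)
        maf = min(alt_freq, 1 - alt_freq)
        if maf >= min_maf:
            kept_h.append(row)
            kept_pos.append(p)
    return kept_h, kept_pos


h = [
    [0, 0, 0, 1],  # MAF 0.25
    [0, 1, 1, 0],  # MAF 0.5
    [0, 0, 0, 0],  # MAF 0.0 (monomorphic)
]
pos = [10, 20, 30]
h_filtered, pos_filtered = filter_by_maf_sketch(h, pos, min_maf=0.3)
print(h_filtered, pos_filtered)  # only the second variant remains
```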

This function computes IHS comparing the reference and alternate alleles. These can be polarised by switching the sign for any variant where the reference allele is derived.
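The sign switch can be sketched as below, assuming a boolean flag per variant that says whether the reference allele is derived. Both the flag sequence and the name `polarise_sketch` are hypothetical; how ancestral states are obtained is outside the scope of this function.

```python
def polarise_sketch(scores, ref_is_derived):
    """Flip the sign of each score where the reference allele is derived."""
    return [-s if flip else s for s, flip in zip(scores, ref_is_derived)]


scores = [0.5, -1.2, 0.3]
ref_is_derived = [False, True, False]
print(polarise_sketch(scores, ref_is_derived))  # only the second score changes sign
```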

This function does not correct for cases where haplotype homozygosity extends up to the first or last variant, so scores near either end of the array are subject to edge effects.

This function currently does not account for large gaps between variants, so scores near any large gap are also subject to edge effects.

The returned scores are unstandardized. They are usually standardized afterwards within allele frequency bins.
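Standardizing within frequency bins could look roughly like the sketch below. It assumes equal-width bins on the alternate allele frequency and a z-score (subtract the bin mean, divide by the bin standard deviation) within each bin. The name `standardize_by_bin_sketch`, the binning scheme, and the handling of missing scores are assumptions rather than a description of any library routine.

```python
import math
from collections import defaultdict


def standardize_by_bin_sketch(scores, alt_freqs, n_bins=20):
    """Z-score each value against the other variants in its frequency bin."""
    bins = defaultdict(list)
    for i, (s, f) in enumerate(zip(scores, alt_freqs)):
        if math.isnan(s):
            continue  # missing scores stay missing
        # Equal-width bins on [0, 1]; a frequency of 1.0 goes in the last bin.
        b = min(int(f * n_bins), n_bins - 1)
        bins[b].append(i)
    out = [math.nan] * len(scores)
    for idx in bins.values():
        vals = [scores[i] for i in idx]
        mean = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        if sd == 0:
            continue  # cannot standardize a bin with no spread
        for i in idx:
            out[i] = (scores[i] - mean) / sd
    return out


scores = [1.0, 3.0, 10.0, 14.0]
alt_freqs = [0.11, 0.12, 0.51, 0.52]
standardized = standardize_by_bin_sketch(scores, alt_freqs, n_bins=10)
print(standardized)
```

In the toy data the first two variants fall in one bin and the last two in another, so each pair is standardized separately even though their raw scores are on very different scales.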