Utility functions

allel.create_allele_mapping(ref, alt, alleles, dtype='i1')

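Create an array that maps each variant's alleles (reference first, then alternates) to their indices in a different allele indexing system. Alleles that are absent from the new system are mapped to -1.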

Parameters:

ref : array_like, S1, shape (n_variants,)

Reference alleles.

alt : array_like, S1, shape (n_variants, n_alt_alleles)

Alternate alleles.

alleles : array_like, S1, shape (n_variants, n_alleles)

Alleles defining the new allele indexing.

dtype : dtype, optional

Output dtype.

Returns:

mapping : ndarray, int8, shape (n_variants, n_alt_alleles + 1)

See also

GenotypeArray.map_alleles, HaplotypeArray.map_alleles, AlleleCountsArray.map_alleles

Examples

Example with biallelic variants:

>>> import allel
>>> ref = [b'A', b'C', b'T', b'G']
>>> alt = [b'T', b'G', b'C', b'A']
>>> alleles = [[b'A', b'T'],  # no transformation
...            [b'G', b'C'],  # swap
...            [b'T', b'A'],  # alt allele missing (mapped to -1)
...            [b'A', b'C']]  # ref allele missing (mapped to -1)
>>> mapping = allel.create_allele_mapping(ref, alt, alleles)
>>> mapping
array([[ 0,  1],
       [ 1,  0],
       [ 0, -1],
       [-1,  0]], dtype=int8)

Example with multiallelic variants:

>>> ref = [b'A', b'C', b'T']
>>> alt = [[b'T', b'G'],
...        [b'A', b'T'],
...        [b'G', b'.']]
>>> alleles = [[b'A', b'T'],
...            [b'C', b'T'],
...            [b'G', b'A']]
>>> mapping = allel.create_allele_mapping(ref, alt, alleles)
>>> mapping
array([[ 0,  1, -1],
       [ 0, -1,  1],
       [-1,  0, -1]], dtype=int8)
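
The lookup rule that the two examples above illustrate can be sketched in plain Python. This is only an illustration of the semantics inferred from those examples, not the scikit-allel implementation; the helper name sketch_allele_mapping is hypothetical, and it uses nested lists rather than NumPy arrays to stay dependency-free.

```python
def sketch_allele_mapping(ref, alt, alleles):
    """Hypothetical pure-Python illustration of the mapping rule.

    For each variant, the old alleles are ordered as [ref, alt1, alt2, ...].
    Each one is mapped to its index in the new allele list, or to -1 if it
    does not appear there.
    """
    mapping = []
    for r, a, new in zip(ref, alt, alleles):
        # Accept either a single alt allele or a sequence of alt alleles.
        alts = [a] if isinstance(a, bytes) else list(a)
        row = []
        for allele in [r] + alts:
            # Index in the new system, or -1 when the allele is absent.
            row.append(new.index(allele) if allele in new else -1)
        mapping.append(row)
    return mapping


# Same inputs as the multiallelic example above.
ref = [b'A', b'C', b'T']
alt = [[b'T', b'G'], [b'A', b'T'], [b'G', b'.']]
alleles = [[b'A', b'T'], [b'C', b'T'], [b'G', b'A']]
result = sketch_allele_mapping(ref, alt, alleles)
print(result)  # [[0, 1, -1], [0, -1, 1], [-1, 0, -1]]
```

In this sketch the placeholder b'.' receives -1 simply because it does not occur in the new allele list; whether the library treats it specially is not stated here.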

allel.locate_fixed_differences(ac1, ac2)

Locate variants with no shared alleles between two populations.

Parameters:

ac1 : array_like, int, shape (n_variants, n_alleles)

Allele counts array from the first population.

ac2 : array_like, int, shape (n_variants, n_alleles)

Allele counts array from the second population.

Returns:

loc : ndarray, bool, shape (n_variants,)

See also

allel.stats.diversity.windowed_df

Examples

>>> import allel
>>> g = allel.GenotypeArray([[[0, 0], [0, 0], [1, 1], [1, 1]],
...                          [[0, 1], [0, 1], [0, 1], [0, 1]],
...                          [[0, 1], [0, 1], [1, 1], [1, 1]],
...                          [[0, 0], [0, 0], [1, 1], [2, 2]],
...                          [[0, 0], [-1, -1], [1, 1], [-1, -1]]])
>>> ac1 = g.count_alleles(subpop=[0, 1])
>>> ac2 = g.count_alleles(subpop=[2, 3])
>>> loc_df = allel.locate_fixed_differences(ac1, ac2)
>>> loc_df
array([ True, False, False,  True,  True], dtype=bool)
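
To make the criterion concrete, the following plain-Python sketch reproduces the result above from the allele counts. It is an illustration only, not the library's implementation, and the helper name sketch_fixed_differences is hypothetical. It assumes that a variant qualifies only if both populations have at least one called allele; that handling of fully missing populations is an assumption and should be checked against the library source.

```python
def sketch_fixed_differences(ac1, ac2):
    """Hypothetical illustration: True where two populations share no allele."""
    loc = []
    for counts1, counts2 in zip(ac1, ac2):
        # Assumption: both populations need at least one called allele.
        both_called = sum(counts1) > 0 and sum(counts2) > 0
        # An allele is shared if it is observed in both populations.
        shared = any(c1 > 0 and c2 > 0 for c1, c2 in zip(counts1, counts2))
        loc.append(both_called and not shared)
    return loc


# Allele counts corresponding to the genotype example above:
# samples 0 and 1 form the first population, samples 2 and 3 the second.
ac1 = [[4, 0, 0], [2, 2, 0], [2, 2, 0], [4, 0, 0], [2, 0, 0]]
ac2 = [[0, 4, 0], [2, 2, 0], [0, 4, 0], [0, 2, 2], [0, 2, 0]]
loc_df = sketch_fixed_differences(ac1, ac2)
print(loc_df)  # [True, False, False, True, True]
```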

allel.locate_private_alleles(*acs)

Locate alleles that are found only in a single population.

Parameters:

*acs : array_like, int, shape (n_variants, n_alleles)

Allele counts arrays from each population.

Returns:

loc : ndarray, bool, shape (n_variants, n_alleles)

Boolean array where elements are True if allele is private to a single population.

Examples

>>> import allel
>>> g = allel.GenotypeArray([[[0, 0], [0, 0], [1, 1], [1, 1]],
...                          [[0, 1], [0, 1], [0, 1], [0, 1]],
...                          [[0, 1], [0, 1], [1, 1], [1, 1]],
...                          [[0, 0], [0, 0], [1, 1], [2, 2]],
...                          [[0, 0], [-1, -1], [1, 1], [-1, -1]]])
>>> ac1 = g.count_alleles(subpop=[0, 1])
>>> ac2 = g.count_alleles(subpop=[2])
>>> ac3 = g.count_alleles(subpop=[3])
>>> loc_private_alleles = allel.locate_private_alleles(ac1, ac2, ac3)
>>> loc_private_alleles
array([[ True, False, False],
       [False, False, False],
       [ True, False, False],
       [ True,  True,  True],
       [ True,  True, False]], dtype=bool)
>>> import numpy as np
>>> loc_private_variants = np.any(loc_private_alleles, axis=1)
>>> loc_private_variants
array([ True, False,  True,  True,  True], dtype=bool)
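
The rule behind this output can be sketched in plain Python: an allele is treated as private when exactly one population has a non-zero count for it. This is an illustration of the semantics shown in the example, not the library's implementation, and the helper name sketch_private_alleles is hypothetical.

```python
def sketch_private_alleles(*acs):
    """Hypothetical illustration: True where an allele occurs in one population only."""
    loc = []
    # Each step yields one variant's count rows, one row per population.
    for variant_counts in zip(*acs):
        row = []
        # Transpose to get the counts of a single allele across populations.
        for allele_counts in zip(*variant_counts):
            n_pops = sum(1 for c in allele_counts if c > 0)
            row.append(n_pops == 1)
        loc.append(row)
    return loc


# Allele counts corresponding to the genotype example above:
# samples 0 and 1, sample 2, and sample 3 form the three populations.
ac1 = [[4, 0, 0], [2, 2, 0], [2, 2, 0], [4, 0, 0], [2, 0, 0]]
ac2 = [[0, 2, 0], [1, 1, 0], [0, 2, 0], [0, 2, 0], [0, 2, 0]]
ac3 = [[0, 2, 0], [1, 1, 0], [0, 2, 0], [0, 0, 2], [0, 0, 0]]
private_alleles = sketch_private_alleles(ac1, ac2, ac3)
# A variant is flagged if any of its alleles is private.
private_variants = [any(row) for row in private_alleles]
print(private_variants)  # [True, False, True, True, True]
```

An allele that is unobserved in every population (count zero everywhere) is not private under this rule, which is why the last column of the first three variants is False.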

allel.sample_to_haplotype_selection(indices, ploidy)