# Release notes

## v0.15

- Added functions to estimate Fst with standard error via a block-jackknife: `allel.stats.fst.blockwise_weir_cockerham_fst()`, `allel.stats.fst.blockwise_hudson_fst()`, `allel.stats.fst.blockwise_patterson_fst()`.
- Fixed a serious bug in `allel.stats.fst.weir_cockerham_fst()` related to incorrect estimation of heterozygosity, which manifested if the subpopulations being compared were not a partition of the total population (i.e., there were one or more samples in the genotype array that were not included in the subpopulations to compare).
- Added method `allel.model.AlleleCountsArray.max_allele()` to determine the highest allele index for each variant.
- Changed first return value from admixture functions `allel.stats.admixture.blockwise_patterson_f3()` and `allel.stats.admixture.blockwise_patterson_d()` to return the estimator from the whole dataset.
- Added utility functions to the `allel.stats.distance` module for transforming coordinates between condensed and uncondensed forms of a distance matrix.
- Classes previously available from the `allel.model` and `allel.bcolz` modules are now aliased from the root `allel` module for convenience. These modules have been reorganised into an `allel.model` package with sub-modules `allel.model.ndarray` and `allel.model.bcolz`.
- All functions in the `allel.model.bcolz` module use cparams from input carray as default for output carray (convenient if you, e.g., want to use zlib level 1 throughout).
- All classes in the `allel.model.ndarray` and `allel.model.bcolz` modules have changed the default value for the copy keyword argument to False. This means that **not** copying the input data, just wrapping it, is now the default behaviour.
- Fixed bug in `GenotypeArray.to_gt()` where maximum allele index is zero.
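
The transformation between condensed and uncondensed distance matrix coordinates mentioned above can be illustrated with a short pure-Python sketch. It assumes the row-major upper-triangle ordering that SciPy uses for condensed distance matrices. The function names `condensed_index` and `square_coords` are hypothetical and are not the names or the implementation found in `allel.stats.distance`.

```python
def condensed_index(i, j, n):
    """Map coordinates (i, j) of an n-by-n distance matrix to the index
    of the same entry in the condensed (upper-triangle, row-major) form."""
    if i == j:
        raise ValueError("diagonal entries are not stored in condensed form")
    if i > j:
        i, j = j, i  # the matrix is symmetric, so use the upper triangle
    return n * i - i * (i + 1) // 2 + (j - i - 1)


def square_coords(k, n):
    """Inverse mapping: condensed index k back to (i, j) with i < j."""
    if not 0 <= k < n * (n - 1) // 2:
        raise IndexError("condensed index out of range")
    i = 0
    row_len = n - 1  # row i contributes n - i - 1 entries
    while k >= row_len:
        k -= row_len
        i += 1
        row_len -= 1
    return i, i + 1 + k
```

For four observations the six pairwise distances are stored in the order (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), so `condensed_index(1, 3, 4)` gives 4 and `square_coords(4, 4)` gives `(1, 3)`.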

## v0.14

- Added a new module `allel.stats.admixture` with statistical tests for admixture between populations, implementing the f2, f3 and D statistics from Patterson (2012). Functions include `allel.stats.admixture.blockwise_patterson_f3()` and `allel.stats.admixture.blockwise_patterson_d()`, which compute the f3 and D statistics respectively in blocks of a given number of variants and perform a block-jackknife to estimate the standard error.
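
As a rough illustration of the block-jackknife idea, the sketch below estimates a ratio statistic from per-block numerator and denominator sums and derives a standard error by leaving out one block at a time. It assumes equally weighted blocks, and the function name `block_jackknife_ratio` and its input layout are hypothetical. It is not the code used by the `allel.stats.admixture` functions.

```python
import math


def block_jackknife_ratio(num_blocks, den_blocks):
    """Estimate sum(num) / sum(den) with a delete-one-block jackknife
    standard error.

    num_blocks and den_blocks hold per-block numerator and denominator
    sums (hypothetical inputs for illustration)."""
    n = len(num_blocks)
    if n < 2 or n != len(den_blocks):
        raise ValueError("need at least two blocks of matching length")
    total_num = sum(num_blocks)
    total_den = sum(den_blocks)
    estimate = total_num / total_den  # estimator from the whole dataset

    # Recompute the estimator n times, leaving out one block each time.
    leave_one_out = [
        (total_num - a) / (total_den - b)
        for a, b in zip(num_blocks, den_blocks)
    ]
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((v - mean) ** 2 for v in leave_one_out)
    return estimate, math.sqrt(variance)
```

Dividing the estimate by the standard error gives a Z score, which is the usual way such statistics are assessed for significance.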

## v0.12

- Added functions for principal components analysis of genotype data. Functions in the new module `allel.stats.decomposition` include `allel.stats.decomposition.pca()` to perform a PCA via full singular value decomposition, and `allel.stats.decomposition.randomized_pca()` which uses an approximate truncated singular value decomposition to speed up computation. In tests with real data the randomized PCA is around 5 times faster and uses half as much memory as the conventional PCA, producing highly similar results.
- Added function `allel.stats.distance.pcoa()` for principal coordinate analysis (a.k.a. classical multi-dimensional scaling) of a distance matrix.
- Added new utility module `allel.stats.preprocessing` with classes for scaling genotype data prior to use as input for PCA or PCoA. By default the scaling (i.e., normalization) of Patterson (2006) is used with principal components analysis functions in the `allel.stats.decomposition` module. Scaling functions can improve the ability to resolve population structure via PCA or PCoA.
- Added method `allel.model.GenotypeArray.to_n_ref()`. Also added `dtype` argument to `allel.model.GenotypeArray.to_n_ref()` and `allel.model.GenotypeArray.to_n_alt()` methods to enable direct output as float arrays, which can be convenient if these arrays are then going to be scaled for use in PCA or PCoA.
- Added `allel.model.GenotypeArray.mask` property which can be set with a Boolean mask to filter genotype calls from genotype and allele counting operations. A similar property is available on the `allel.bcolz.GenotypeCArray` class. Also added method `allel.model.GenotypeArray.fill_masked()` and similar method on the `allel.bcolz.GenotypeCArray` class to fill masked genotype calls with a value (e.g., -1).
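
To make the Patterson (2006) scaling concrete, here is a minimal pure-Python sketch. It assumes a variants-by-samples matrix of alternate allele counts with no missing calls, and the function name `patterson_scale` is hypothetical. The classes in `allel.stats.preprocessing` may handle edge cases differently.

```python
import math


def patterson_scale(n_alt, ploidy=2):
    """Scale a variants-by-samples matrix of alternate allele counts.

    Each row (variant) is centred on its mean and divided by
    sqrt(p * (1 - p)), where p = mean / ploidy is the estimated
    alternate allele frequency."""
    scaled = []
    for row in n_alt:
        mean = sum(row) / len(row)
        p = mean / ploidy
        if p <= 0 or p >= 1:
            # Monomorphic variants carry no signal and would divide by zero.
            raise ValueError("remove non-segregating variants before scaling")
        scale = math.sqrt(p * (1 - p))
        scaled.append([(x - mean) / scale for x in row])
    return scaled
```

The effect is to give rarer variants more weight than a plain centring would, which is one reason scaling can change how clearly population structure shows up in a PCA.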

## v0.11

- Added functions for calculating Watterson’s theta (proportional to the number of segregating variants): `allel.stats.diversity.watterson_theta()` for calculating over a given region, and `allel.stats.diversity.windowed_watterson_theta()` for calculating in windows over a chromosome/contig.
- Added functions for calculating Tajima’s D statistic (balance between nucleotide diversity and number of segregating sites): `allel.stats.diversity.tajima_d()` for calculating over a given region and `allel.stats.diversity.windowed_tajima_d()` for calculating in windows over a chromosome/contig.
- Added `allel.stats.diversity.windowed_df()` for calculating the rate of fixed differences between two populations.
- Added function `allel.model.locate_fixed_differences()` for locating variants that are fixed for different alleles in two different populations.
- Added function `allel.model.locate_private_alleles()` for locating alleles and variants that are private to a single population.
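
The relationship between Watterson's theta and Tajima's D can be sketched with the textbook formulas. The functions below work on summary values (number of segregating sites, number of sampled chromosomes, and mean pairwise differences summed over sites) rather than on genotype arrays. Their names and signatures are chosen for illustration and are not those of the `allel.stats.diversity` functions, which also differ in details such as normalising per base.

```python
import math


def watterson_theta(n_seg_sites, n_chromosomes):
    """Watterson's estimator: S divided by the harmonic number a1."""
    a1 = sum(1 / i for i in range(1, n_chromosomes))
    return n_seg_sites / a1


def tajima_d(pi, n_seg_sites, n_chromosomes):
    """Tajima's D from mean pairwise differences (pi, summed over sites),
    the number of segregating sites S and the sample size n."""
    n, s = n_chromosomes, n_seg_sites
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two diversity estimators, standardised.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

When the pairwise diversity equals Watterson's estimate, D is zero. An excess of pairwise diversity gives a positive D and a deficit gives a negative D.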

## v0.10

- Added functions implementing the Weir and Cockerham (1984) estimators for F-statistics: `allel.stats.fst.weir_cockerham_fst()` and `allel.stats.fst.windowed_weir_cockerham_fst()`.
- Added functions implementing the Hudson (1992) estimator for Fst: `allel.stats.fst.hudson_fst()` and `allel.stats.fst.windowed_hudson_fst()`.
- Added new module `allel.stats.ld` with functions for calculating linkage disequilibrium estimators, including `allel.stats.ld.rogers_huff_r()` for pairwise variant LD calculation, `allel.stats.ld.windowed_r_squared()` for windowed LD calculations, and `allel.stats.ld.locate_unlinked()` for locating variants in approximate linkage equilibrium.
- Added function `allel.plot.pairwise_ld()` for visualising a matrix of linkage disequilibrium values between pairs of variants.
- Added function `allel.model.create_allele_mapping()` for creating a mapping of alleles into a different index system, i.e., if you want 0 and 1 to represent something other than REF and ALT, e.g., ancestral and derived. Also added methods `allel.model.GenotypeArray.map_alleles()`, `allel.model.HaplotypeArray.map_alleles()` and `allel.model.AlleleCountsArray.map_alleles()` which will perform an allele transformation given an allele mapping.
- Added function `allel.plot.variant_locator()` ported across from anhima.
- Refactored the `allel.stats` module into a package with sub-modules for easier maintenance.
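
For readers unfamiliar with the Hudson estimator, the following sketch computes Fst as one minus the ratio of within-population to between-population mean pairwise differences, summing the numerator and denominator over variants before dividing. It assumes biallelic variants, at least two sampled alleles per population, and at least one variant at which the populations differ. The input layout and this standalone `hudson_fst` function are hypothetical and should not be read as the implementation in `allel.stats.fst`.

```python
def hudson_fst(ac1, ac2):
    """Ratio-of-averages Hudson Fst from biallelic allele counts.

    ac1 and ac2 are lists of (ref_count, alt_count) pairs, one pair per
    variant, for two populations (hypothetical input layout)."""
    num_total = 0.0
    den_total = 0.0
    for (r1, a1), (r2, a2) in zip(ac1, ac2):
        n1, n2 = r1 + a1, r2 + a2
        # Mean pairwise difference within each population.
        within1 = 2 * r1 * a1 / (n1 * (n1 - 1))
        within2 = 2 * r2 * a2 / (n2 * (n2 - 1))
        within = (within1 + within2) / 2
        # Mean pairwise difference between the two populations.
        between = (r1 * a2 + a1 * r2) / (n1 * n2)
        num_total += between - within
        den_total += between
    return num_total / den_total
```

Two populations fixed for different alleles give an Fst of 1, while populations with identical allele counts give a slightly negative value because of the sample-size correction in the within-population term.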

## v0.9

- Added documentation for the functions `allel.bcolz.carray_from_hdf5()`, `allel.bcolz.carray_to_hdf5()`, `allel.bcolz.ctable_from_hdf5_group()`, `allel.bcolz.ctable_to_hdf5_group()`.
- Refactoring of internals within the `allel.bcolz` module.

## v0.8

- Added subpop argument to `allel.model.GenotypeArray.count_alleles()` and `allel.model.HaplotypeArray.count_alleles()` to enable counting alleles within a sub-population without subsetting the array.
- Added functions `allel.model.GenotypeArray.count_alleles_subpops()` and `allel.model.HaplotypeArray.count_alleles_subpops()` to enable counting alleles in multiple sub-populations in a single pass over the array, without subsetting.
- Added classes `allel.model.FeatureTable` and `allel.bcolz.FeatureCTable` for storing and querying data on genomic features (genes, etc.), with functions for parsing from a GFF3 file.
- Added convenience function `allel.stats.distance.pairwise_dxy()` for computing a distance matrix using Dxy as the metric.
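
The idea of counting alleles for several sub-populations in one pass can be sketched in plain Python. The nested-list genotype layout, the `-1` missing-call convention and this standalone `count_alleles_subpops` function are assumptions made for illustration, not the library's array-based implementation.

```python
def count_alleles_subpops(genotypes, subpops, n_alleles=2):
    """Count alleles per sub-population in a single pass over the variants.

    genotypes: nested list indexed [variant][sample][ploidy], with -1
    marking a missing call. subpops: dict mapping a label to a list of
    sample indices. Both layouts are assumptions for this sketch."""
    counts = {name: [] for name in subpops}
    for variant in genotypes:  # one pass over the data
        for name, samples in subpops.items():
            ac = [0] * n_alleles
            for s in samples:
                for allele in variant[s]:
                    if 0 <= allele < n_alleles:  # skip missing calls
                        ac[allele] += 1
            counts[name].append(ac)
    return counts
```

Because each variant is visited once and samples are selected by index, no subset copy of the genotype data is needed.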

## v0.7

- Added function `allel.io.write_fasta()` for writing a nucleotide sequence stored as a NumPy array out to a FASTA format file.
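
For reference, a FASTA record is a header line starting with `>` followed by the sequence wrapped to a fixed width. The sketch below formats one record from a plain string. The helper name `format_fasta` is hypothetical, and it assumes the sequence has already been converted from an array to a `str`. It does not reproduce the signature of `allel.io.write_fasta()`.

```python
def format_fasta(name, sequence, width=80):
    """Return a FASTA record: a '>' header line followed by the sequence
    wrapped to a fixed line width."""
    lines = [">" + name]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"
```

The returned string can be written to an open text file with `f.write(format_fasta(name, sequence))`.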

## v0.6

- Added method `allel.model.VariantTable.to_vcf()` for writing a variant table to a VCF format file.