# Pairwise distance and ordination

`allel.stats.distance.pairwise_distance(x, metric, chunked=False, blen=None)`

Compute pairwise distance between individuals (e.g., samples or haplotypes).

Parameters:

- `x` : array_like, shape (n, m, ...). Array of m observations (e.g., samples or haplotypes) in a space with n dimensions (e.g., variants). Note that the order of the first two dimensions is swapped compared to what is expected by `scipy.spatial.distance.pdist()`.
- `metric` : string or function. Distance metric. See the documentation for `scipy.spatial.distance.pdist()` for a list of built-in distance metrics.
- `chunked` : bool, optional. If True, use a block-wise implementation to avoid loading the entire input array into memory. A distance matrix is calculated for each block of the input array, and the results are summed to produce the final output. For some distance metrics this will return a different result from the standard implementation.
- `blen` : int, optional. Block length to use for the chunked implementation.

Returns:

- `dist` : ndarray, shape (m * (m - 1) / 2,). Distance matrix in condensed form.

Examples

```
>>> import allel
>>> g = allel.GenotypeArray([[[0, 0], [0, 1], [1, 1]],
...                          [[0, 1], [1, 1], [1, 2]],
...                          [[0, 2], [2, 2], [-1, -1]]])
>>> d = allel.stats.pairwise_distance(g.to_n_alt(), metric='cityblock')
>>> d
array([ 3.,  4.,  3.])
>>> import scipy.spatial
>>> scipy.spatial.distance.squareform(d)
array([[ 0.,  3.,  4.],
       [ 3.,  0.,  3.],
       [ 4.,  3.,  0.]])
```
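The `chunked` option is described as summing per-block distance matrices. The NumPy-only sketch below illustrates why that can change the result for some metrics. It is not the library's implementation: the helpers `condensed_distance` and `blockwise_sum` are hypothetical names introduced here, and the input array is assumed to be the alternate-allele counts that `g.to_n_alt()` produces in the example above.

```python
import numpy as np


def condensed_distance(x, metric):
    """Condensed pairwise distances between the columns of x, shape (n, m)."""
    n, m = x.shape
    out = []
    for i in range(m):
        for j in range(i + 1, m):
            diff = x[:, i] - x[:, j]
            if metric == 'cityblock':
                out.append(np.abs(diff).sum())
            elif metric == 'euclidean':
                out.append(np.sqrt((diff ** 2).sum()))
            else:
                raise ValueError('metric not supported in this sketch')
    return np.array(out, dtype=float)


def blockwise_sum(x, metric, blen):
    """Sum the condensed distances computed on consecutive blocks of rows."""
    n, m = x.shape
    total = np.zeros(m * (m - 1) // 2)
    for start in range(0, n, blen):
        total += condensed_distance(x[start:start + blen], metric)
    return total


# Assumed to equal g.to_n_alt() from the example above (rows are variants).
x = np.array([[0, 1, 2],
              [1, 2, 2],
              [1, 2, 0]])

full_cityblock = condensed_distance(x, 'cityblock')
block_cityblock = blockwise_sum(x, 'cityblock', blen=1)
full_euclidean = condensed_distance(x, 'euclidean')
block_euclidean = blockwise_sum(x, 'euclidean', blen=1)
```

City-block distance is a plain sum over dimensions, so the block-wise total matches the full computation. Euclidean distance takes a square root of the sum, so adding per-block distances gives a different value.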

`allel.stats.distance.plot_pairwise_distance(dist, labels=None, colorbar=True, ax=None, imshow_kwargs=None)`

Plot a pairwise distance matrix.

Parameters:

- `dist` : array_like. The distance matrix in condensed form.
- `labels` : sequence of strings, optional. Sample labels for the axes.
- `colorbar` : bool, optional. If True, add a colorbar to the current figure.
- `ax` : axes, optional. The axes on which to draw. If not provided, a new figure will be created.
- `imshow_kwargs` : dict-like, optional. Additional keyword arguments passed through to `matplotlib.pyplot.imshow()`.

Returns:

- `ax` : axes. The axes on which the plot was drawn.

`allel.stats.distance.pairwise_dxy(pos, gac, start=None, stop=None, is_accessible=None)`

Convenience function to calculate a pairwise distance matrix using nucleotide divergence (a.k.a. Dxy) as the distance metric.

Parameters:

- `pos` : array_like, int, shape (n_variants,). Variant positions.
- `gac` : array_like, int, shape (n_variants, n_samples, n_alleles). Per-genotype allele counts.
- `start` : int, optional. Start position of region to use.
- `stop` : int, optional. Stop position of region to use.
- `is_accessible` : array_like, bool, shape (len(contig),), optional. Boolean array indicating accessibility status for all positions in the chromosome/contig.

Returns:

- `dist` : ndarray. Distance matrix in condensed form.

See also: `allel.model.ndarray.GenotypeArray.to_allele_counts`
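To make the Dxy metric concrete, here is a minimal sketch for a single pair of samples. It assumes the usual definition of nucleotide divergence: at each variant, the fraction of allele pairs (one allele from each sample) that differ, summed over variants and divided by the number of bases in the region. The function name `dxy_between_samples` and the `n_bases` argument are hypothetical, and skipping variants where either sample has no called alleles is an assumption of this sketch rather than documented library behaviour.

```python
import numpy as np


def dxy_between_samples(gac, i, j, n_bases):
    """Nucleotide divergence between samples i and j (illustrative sketch).

    gac has shape (n_variants, n_samples, n_alleles); n_bases is the number
    of bases in the region, which this sketch takes as given.
    """
    gac = np.asarray(gac)
    ac1 = gac[:, i, :].astype(float)
    ac2 = gac[:, j, :].astype(float)
    # Number of allele pairs (one from each sample) at each variant.
    n_pairs = ac1.sum(axis=1) * ac2.sum(axis=1)
    # Number of those pairs that carry the same allele.
    n_same = (ac1 * ac2).sum(axis=1)
    # Assumption: variants with no called alleles in either sample are skipped.
    called = n_pairs > 0
    mpd = (n_pairs[called] - n_same[called]) / n_pairs[called]
    return mpd.sum() / n_bases


# Two variants, two diploid samples, two alleles.
gac = np.array([[[2, 0], [0, 2]],   # all four allele pairs differ
                [[1, 1], [2, 0]]])  # two of four allele pairs differ
d = dxy_between_samples(gac, 0, 1, n_bases=10)
```

Here the per-variant values are 1.0 and 0.5, so the divergence over a 10-base region is 0.15.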

`allel.stats.distance.pcoa(dist)`

Perform principal coordinate analysis of a distance matrix, a.k.a. classical multi-dimensional scaling.

Parameters:

- `dist` : array_like. Distance matrix in condensed form.

Returns:

- `coords` : ndarray, shape (n_samples, n_dimensions). Transformed coordinates for the samples.
- `explained_ratio` : ndarray, shape (n_dimensions). Variance explained by each dimension.
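For readers unfamiliar with classical multi-dimensional scaling, the following is a textbook sketch in plain NumPy: double-centre the squared distances, eigendecompose, and scale the eigenvectors. The function `classical_mds` is a hypothetical name, and details such as how non-positive eigenvalues are handled or how the explained ratio is normalised are assumptions that may differ from what `pcoa` does.

```python
import numpy as np


def classical_mds(dist_condensed, n):
    """Classical MDS on a condensed distance vector for n samples (sketch)."""
    # Expand the condensed vector into a symmetric square matrix.
    d = np.zeros((n, n))
    d[np.triu_indices(n, k=1)] = dist_condensed
    d = d + d.T

    # Double-centre the squared distances.
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j

    # Eigendecompose and order axes by decreasing eigenvalue.
    evals, evecs = np.linalg.eigh(b)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Assumption: keep only axes with clearly positive eigenvalues.
    keep = evals > 1e-9 * evals.max()
    coords = evecs[:, keep] * np.sqrt(evals[keep])
    explained_ratio = evals[keep] / evals[keep].sum()
    return coords, explained_ratio


# Three points on a line at positions 0, 3 and 7.
dist = np.array([3.0, 7.0, 4.0])
coords, explained_ratio = classical_mds(dist, n=3)
```

Because the three points lie on a line, a single axis is enough to reproduce the input distances from the returned coordinates.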

`allel.stats.distance.condensed_coords(i, j, n)`

Transform square distance matrix coordinates to the corresponding index into a condensed, 1D form of the matrix.

Parameters:

- `i` : int. Row index.
- `j` : int. Column index.
- `n` : int. Size of the square matrix (length of first or second dimension).

Returns:

- `ix` : int
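The mapping from square coordinates to a condensed index can be written in closed form. The sketch below assumes the SciPy condensed layout, in which the upper triangle is stored row by row. The function `condensed_index` is a hypothetical stand-in, and raising an error for diagonal or out-of-range entries is a choice made for this sketch, not documented behaviour of `condensed_coords`.

```python
from itertools import combinations


def condensed_index(i, j, n):
    """Index of square-matrix entry (i, j) in the condensed vector (sketch)."""
    if i == j:
        raise ValueError('diagonal entries are not stored in condensed form')
    if not (0 <= i < n and 0 <= j < n):
        raise ValueError('indices out of range')
    # The matrix is symmetric, so put the smaller index first.
    if i > j:
        i, j = j, i
    # Entries stored before row i, plus the offset within row i.
    return n * i - i * (i + 1) // 2 + (j - i - 1)


n = 4
pairs = list(combinations(range(n), 2))
indices = [condensed_index(i, j, n) for i, j in pairs]
```

Enumerating the pairs in row-major order yields consecutive indices, which is a quick way to check the formula against the condensed layout.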

`allel.stats.distance.condensed_coords_within(pop, n)`

Return indices into a condensed distance matrix for all pairwise comparisons within the given population.

Parameters:

- `pop` : array_like, int. Indices of samples or haplotypes within the population.
- `n` : int. Size of the square matrix (length of first or second dimension).

Returns:

- `indices` : ndarray, int

`allel.stats.distance.condensed_coords_between(pop1, pop2, n)`

Return indices into a condensed distance matrix for all pairwise comparisons between two populations.

Parameters:

- `pop1` : array_like, int. Indices of samples or haplotypes within the first population.
- `pop2` : array_like, int. Indices of samples or haplotypes within the second population.
- `n` : int. Size of the square matrix (length of first or second dimension).

Returns:

- `indices` : ndarray, int
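The within-population and between-population selections can both be sketched as enumerating sample pairs and mapping each pair to its condensed index. The helpers `condensed_index`, `indices_within` and `indices_between` below are hypothetical names for illustration only. The sketch assumes the SciPy condensed layout and that the two populations do not share samples. The order of the returned indices is a choice of this sketch and may differ from the library's.

```python
from itertools import combinations, product


def condensed_index(i, j, n):
    """Condensed index of the pair (i, j), with i != j, for n samples."""
    i, j = min(i, j), max(i, j)
    return n * i - i * (i + 1) // 2 + (j - i - 1)


def indices_within(pop, n):
    """Condensed indices for every pair of samples inside one population."""
    return [condensed_index(i, j, n) for i, j in combinations(sorted(pop), 2)]


def indices_between(pop1, pop2, n):
    """Condensed indices for every pair with one sample from each population."""
    return [condensed_index(i, j, n) for i, j in product(pop1, pop2)]


n = 5
pop1, pop2 = [0, 1, 2], [3, 4]
within = indices_within(pop1, n)
between = indices_between(pop1, pop2, n)
```

Index lists like these can be used to select entries from a condensed distance array, for example to average the distances within one population or between two.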