Calculates a distance matrix from a matrix of probability distributions using Jensen-Shannon divergence. Adapted from https://enterotype.embl.de/.

Usage

jsd(M, pseudocount = 1e-06, normalizeCounts = FALSE)

Arguments

M

a probability distribution matrix with one distribution per column (each column sums to 1), e.g., normalized transcript compatibility counts.

pseudocount

a small positive number used to avoid division-by-zero errors.

normalizeCounts

logical, whether to normalize M by dividing each column by its sum. Set to TRUE if M is, e.g., a count matrix rather than a matrix of probabilities.

Value

A Jensen-Shannon divergence-based distance matrix between the columns of M, in a form that can be passed directly to hclust().
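
For reference, the quantity behind this distance can be written out as follows. This is the standard Jensen-Shannon definition, and the final square root is an assumption based on the enterotype tutorial this function is adapted from, not something stated on this page. Here p and q denote two columns of M and m is their midpoint.

```latex
\begin{aligned}
m &= \tfrac{1}{2}\,(p + q), \\
\mathrm{KL}(p \,\|\, m) &= \sum_i p_i \log\frac{p_i}{m_i}, \\
\mathrm{JSD}(p, q) &= \tfrac{1}{2}\,\mathrm{KL}(p \,\|\, m) + \tfrac{1}{2}\,\mathrm{KL}(q \,\|\, m), \\
d(p, q) &= \sqrt{\mathrm{JSD}(p, q)} .
\end{aligned}
```

A term with p_i = 0 would involve log(0), which is presumably why the pseudocount argument exists.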

Examples

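set.seed(42)
# Simulate a 20 x 5 count matrix (genes in rows, samples in columns)
M <- matrix(rpois(100, lambda=100), ncol=5)
colnames(M) <- paste0("sample", 1:5)
rownames(M) <- paste0("gene", 1:20)
# Normalize each column to sum to 1 so that it is a probability distribution
Mnorm <- apply(M, 2, function(x) x/sum(x))
Mjsd <- jsd(Mnorm)
# equivalently, let jsd() do the column normalization
Mjsd <- jsd(M, normalizeCounts=TRUE)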
Mjsd
#>            sample1    sample2    sample3    sample4
#> sample2 0.04351841                                 
#> sample3 0.06114582 0.04682573                      
#> sample4 0.05485535 0.04587213 0.04641441           
#> sample5 0.04869326 0.04697849 0.03939286 0.04894781
plot(hclust(Mjsd))
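
To make the computation concrete, here is a minimal base-R sketch of a Jensen-Shannon distance between the columns of a matrix. It is not the package's implementation: `jsd_sketch` is a hypothetical name, and the recipe (replace zeros with the pseudocount, use the natural log, take the square root) is assumed from the enterotype tutorial the function is adapted from.

```r
# Hypothetical helper, NOT the package's jsd(). Assumes the enterotype
# tutorial recipe: zeros -> pseudocount, natural log, square root of JSD.
jsd_sketch <- function(M, pseudocount = 1e-06) {
  # Replace exact zeros so that log(x / y) stays finite
  M[M == 0] <- pseudocount
  # Kullback-Leibler divergence between two vectors
  kld <- function(x, y) sum(x * log(x / y))
  n <- ncol(M)
  D <- matrix(0, n, n, dimnames = list(colnames(M), colnames(M)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      m <- (M[, i] + M[, j]) / 2  # midpoint distribution
      js <- 0.5 * kld(M[, i], m) + 0.5 * kld(M[, j], m)
      # Guard against tiny negative values caused by rounding
      D[i, j] <- sqrt(max(js, 0))
    }
  }
  as.dist(D)  # lower-triangle "dist" object, accepted by hclust()
}

set.seed(42)
M <- matrix(rpois(100, lambda = 100), ncol = 5,
            dimnames = list(paste0("gene", 1:20), paste0("sample", 1:5)))
Mnorm <- sweep(M, 2, colSums(M), "/")  # columns now sum to 1
d <- jsd_sketch(Mnorm)
```

The double loop computes each pair twice, which keeps the sketch short; for many samples it would be cheaper to fill only one triangle.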