The BetaML.Utils Module
BetaML.Utils — Module

Utils module

Provides shared utility functions for various machine learning algorithms. You don't usually need to import this module directly, as the other modules (Nn, Perceptron, Clusters, ...) re-export it.
Module Index
BetaML.Utils.accuracy
BetaML.Utils.accuracy
BetaML.Utils.accuracy
BetaML.Utils.aic
BetaML.Utils.autoJacobian
BetaML.Utils.batch
BetaML.Utils.bic
BetaML.Utils.celu
BetaML.Utils.cosine_distance
BetaML.Utils.dcelu
BetaML.Utils.delu
BetaML.Utils.dmish
BetaML.Utils.dplu
BetaML.Utils.drelu
BetaML.Utils.dsigmoid
BetaML.Utils.dsoftmax
BetaML.Utils.dsoftplus
BetaML.Utils.dtanh
BetaML.Utils.elu
BetaML.Utils.getScaleFactors
BetaML.Utils.l1_distance
BetaML.Utils.l2_distance
BetaML.Utils.l2²_distance
BetaML.Utils.logNormalFixedSd
BetaML.Utils.lse
BetaML.Utils.makeMatrix
BetaML.Utils.meanRelError
BetaML.Utils.mish
BetaML.Utils.normalFixedSd
BetaML.Utils.oneHotEncoder
BetaML.Utils.pca
BetaML.Utils.plu
BetaML.Utils.polynomialKernel
BetaML.Utils.radialKernel
BetaML.Utils.relu
BetaML.Utils.scale
BetaML.Utils.sigmoid
BetaML.Utils.softmax
BetaML.Utils.softplus
BetaML.Utils.sterling
Detailed API
Base.error — Method

error(ŷ,y) - Categorical error (Int vs Int)

Base.error — Method

error(ŷ,y) - Categorical error with probabilistic prediction of a single datapoint (PMF vs Int).

Base.error — Method

error(ŷ,y) - Categorical error with probabilistic predictions of a dataset (PMF vs Int).

Base.reshape — Method

reshape(myNumber, dims...) - Reshape a number as an n-dimensional Array
BetaML.Utils.accuracy — Method

accuracy(ŷ,y) - Categorical accuracy (Int vs Int)

BetaML.Utils.accuracy — Method

accuracy(ŷ,y;tol)

Categorical accuracy with probabilistic prediction of a single datapoint (PMF vs Int).

Use the parameter tol [def: 1] to set the tolerance of the prediction: with tol = 1, only a prediction where the value with the highest probability is the true value is considered "correct"; with higher values, the true value may be anywhere within the set of the tol highest-probability values.

BetaML.Utils.accuracy — Method

accuracy(ŷ,y;tol,ignoreLabels)

Categorical accuracy with probabilistic predictions of a dataset (PMF vs Int).

Parameters:
- ŷ: An (N,K) matrix of probabilities that each $\hat y_n$ record with $n \in 1,...,N$ is of category $k$ with $k \in 1,...,K$.
- y: The N-element array with the correct category for each point $n$.
- tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of the tol highest-probability values [def: 1].
- ignoreLabels: Whether to ignore the specific label order in y. Useful for unsupervised learning algorithms where the specific label order doesn't make sense [def: false]
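To make the tol parameter concrete, here is a minimal self-contained sketch of this kind of tolerance-based accuracy. The function name sketchAccuracy and the data are made up for the example; this is an illustration of the idea, not BetaML's implementation (which also handles ignoreLabels):

```julia
# A prediction counts as "correct" if the true class is among the
# `tol` highest-probability classes of that record's PMF.
function sketchAccuracy(ŷ::Matrix{Float64}, y::Vector{Int}; tol=1)
    correct = 0
    for n in 1:size(ŷ, 1)
        ranking = sortperm(ŷ[n, :], rev=true)  # classes by decreasing probability
        correct += y[n] in ranking[1:tol]
    end
    return correct / size(ŷ, 1)
end

ŷ = [0.7 0.2 0.1;
     0.1 0.5 0.4]
y = [1, 3]
acc1 = sketchAccuracy(ŷ, y, tol=1)  # only record 1 is top-1 correct → 0.5
acc2 = sketchAccuracy(ŷ, y, tol=2)  # record 2's true class is in its top 2 → 1.0
```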
BetaML.Utils.aic — Method

aic(lL,k) - Akaike information criterion (lower is better)

BetaML.Utils.autoJacobian — Method

autoJacobian(f,x;nY)

Evaluate the Jacobian using AD in the form of a (nY,nX) matrix of first derivatives.

Parameters:
- f: The function for which to compute the Jacobian
- x: The input to the function where the Jacobian has to be computed
- nY: The number of outputs of the function f [def: length(f(x))]

Return values:
- An Array{Float64,2} of the locally evaluated Jacobian

Notes:
- The nY parameter is optional. If provided, it avoids having to compute f(x).
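To illustrate what a Jacobian evaluation returns, the self-contained sketch below computes the same (nY,nX) matrix of ∂fᵢ/∂xⱼ numerically with central finite differences. BetaML's autoJacobian uses automatic differentiation instead; the function name numJacobian is made up for the example:

```julia
# Numeric (nY,nX) Jacobian of f at x via central differences.
function numJacobian(f, x; nY=length(f(x)), h=1e-6)
    nX = length(x)
    J  = Array{Float64,2}(undef, nY, nX)
    for j in 1:nX
        xp = copy(x); xp[j] += h
        xm = copy(x); xm[j] -= h
        J[:, j] = (f(xp) .- f(xm)) ./ (2h)  # column j: derivatives w.r.t. x[j]
    end
    return J
end

f(x) = [x[1]^2 + x[2], 3x[1]]
J = numJacobian(f, [2.0, 1.0])  # the exact Jacobian here is [4 1; 3 0]
```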
BetaML.Utils.batch — Method

batch(n,bSize;sequential=false)

Return a vector of bSize indices from 1 to n. The indices are picked randomly unless the optional parameter sequential is set to true.
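A common use of such batching is splitting the indices 1:n into minibatches for training. A minimal self-contained sketch of that idea (the name sketchBatch is made up; this is not BetaML's implementation):

```julia
using Random

# Split the indices 1:n into groups of (at most) bSize, shuffled
# unless `sequential` is requested.
function sketchBatch(n, bSize; sequential=false, rng=Random.default_rng())
    idx = sequential ? collect(1:n) : shuffle(rng, 1:n)
    return [idx[i:min(i + bSize - 1, n)] for i in 1:bSize:n]
end

batches = sketchBatch(10, 3, sequential=true)
# → [[1,2,3], [4,5,6], [7,8,9], [10]]
```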
BetaML.Utils.bic — Method

bic(lL,k,n) - Bayesian information criterion (lower is better)

BetaML.Utils.celu — Method

celu(x; α=1)

https://arxiv.org/pdf/1704.07483.pdf

BetaML.Utils.cosine_distance — Method

Cosine distance

BetaML.Utils.dcelu — Method

dcelu(x; α=1)

https://arxiv.org/pdf/1704.07483.pdf

BetaML.Utils.delu — Method

delu(x; α=1) with α > 0

https://arxiv.org/pdf/1511.07289.pdf

BetaML.Utils.dmish — Method

dmish(x)

https://arxiv.org/pdf/1908.08681v1.pdf

BetaML.Utils.dplu — Method

dplu(x;α=0.1,c=1)

Piecewise Linear Unit derivative

https://arxiv.org/pdf/1809.09534.pdf

BetaML.Utils.drelu — Method

drelu(x)

Rectified Linear Unit derivative

https://www.cs.toronto.edu/~hinton/absps/reluICML.pdf

BetaML.Utils.dsigmoid — Method

dsigmoid(x)

BetaML.Utils.dsoftmax — Method

dsoftmax(x; β=1)

Derivative of the softmax function

https://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/

BetaML.Utils.dsoftplus — Method

dsoftplus(x)

https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus

BetaML.Utils.dtanh — Method

dtanh(x)

BetaML.Utils.elu — Method

elu(x; α=1) with α > 0

https://arxiv.org/pdf/1511.07289.pdf
BetaML.Utils.getScaleFactors — Method

getScaleFactors(x;skip)

Return the scale factors (for each dimension) needed to scale a matrix X (n,d) such that each dimension has mean 0 and variance 1.

Parameters
- x: the (n × d) dimension matrix to scale on each dimension d
- skip: an array of dimension indices to skip in the scaling [def: []]

Return
- A tuple whose first element is the shift and the second the multiplicative term of the scaling.
BetaML.Utils.l1_distance — Method

L1 norm distance (aka Manhattan distance)

BetaML.Utils.l2_distance — Method

Euclidean (L2) distance

BetaML.Utils.l2²_distance — Method

Squared Euclidean (L2) distance

BetaML.Utils.logNormalFixedSd — Method

log-PDF of a multidimensional normal with no covariance and shared variance across dimensions

BetaML.Utils.lse — Method

LogSumExp for efficiently computing log(sum(exp.(x)))
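The key to computing log(sum(exp.(x))) efficiently and safely is the standard LogSumExp trick: subtract the maximum before exponentiating so large inputs don't overflow. A minimal sketch of the technique (not BetaML's exact code; lse_sketch is a made-up name):

```julia
# Numerically stable log(sum(exp.(x))): exp is only ever applied to
# non-positive values, so it cannot overflow.
lse_sketch(x) = maximum(x) + log(sum(exp.(x .- maximum(x))))

lse_sketch([1000.0, 1000.0])  # ≈ 1000 + log(2), while the naive formula gives Inf
```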
BetaML.Utils.makeMatrix — Method

Transform an Array{T,1} into an Array{T,2} and leave an Array{T,2} unchanged.
BetaML.Utils.meanRelError — Method

meanRelError(ŷ,y;normDim=true,normRec=true,p=1)

Compute the mean relative error (l-1 based by default) between ŷ and y.

There are many ways to compute a mean relative error. In particular, if normRec (normDim) is set to true, the records (dimensions) are normalised, in the sense that it doesn't matter if a record (dimension) is bigger or smaller than the others: the relative error is first computed for each record (dimension) and then averaged. With both normDim and normRec set to false the function returns the relative mean error; with both set to true (the default) it returns the mean relative error (i.e. with p=1 the "mean absolute percentage error", MAPE). The parameter p [def: 1] controls the p-norm used to define the error.
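To make the default case concrete, the sketch below computes the p=1, per-record-normalised variant (the MAPE-like case described above). The name mape_sketch is made up; this is an illustration, not BetaML's full function:

```julia
using Statistics

# Mean of the per-record relative errors: each record is normalised by
# its own true value before averaging.
mape_sketch(ŷ, y) = mean(abs.(ŷ .- y) ./ abs.(y))

y   = [100.0, 200.0]
ŷ   = [110.0, 180.0]
err = mape_sketch(ŷ, y)  # (0.10 + 0.10) / 2 = 0.10
```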
BetaML.Utils.mish — Method

mish(x)

https://arxiv.org/pdf/1908.08681v1.pdf

BetaML.Utils.normalFixedSd — Method

PDF of a multidimensional normal with no covariance and shared variance across dimensions

BetaML.Utils.oneHotEncoder — Function

oneHotEncoder(y,d;count)

Encode arrays (or arrays of arrays) of integer data as 0/1 matrices

Parameters:
- y: The data to convert (an integer, an array of integers, or an array of arrays of integers)
- d: The number of dimensions in the output matrix [def: maximum(maximum.(Y))]
- count: Whether to count multiple instances on the same dimension/record or just indicate presence [def: false]
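A minimal self-contained sketch of what one-hot encoding integer labels into a 0/1 matrix means (oneHot_sketch is a made-up name; BetaML's oneHotEncoder additionally handles arrays of arrays and the count mode):

```julia
# One row per label, one column per class; a 1 marks the label's class.
function oneHot_sketch(y::Vector{Int}; d=maximum(y))
    out = zeros(Int, length(y), d)
    for (n, yn) in enumerate(y)
        out[n, yn] = 1
    end
    return out
end

oneHot_sketch([2, 1, 3])
# 3×3 matrix:
# 0 1 0
# 1 0 0
# 0 0 1
```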
BetaML.Utils.pca — Method

pca(X;K,error)

Perform Principal Component Analysis, returning the matrix reprojected among the dimensions of maximum variance.

Parameters:
- X: The (N,D) data to reproject
- K: The number of dimensions to retain (with K<=D) [def: nothing]
- error: The maximum approximation error that we are willing to accept [def: 0.05]

Return:
- A named tuple with:
  - X: The reprojected (NxK) matrix
  - K: The number of dimensions retained
  - error: The actual proportion of variance not explained in the reprojected dimensions
  - P: The (D,K) matrix of the eigenvectors associated with the K largest eigenvalues, used to reproject the data matrix

Note that if K is indicated, the parameter error has no effect.
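Conceptually, the reprojection works by taking the eigendecomposition of the data's covariance matrix and projecting the centred data onto the K eigenvectors with the largest eigenvalues. A self-contained sketch of that idea (pca_sketch is a made-up name; BetaML's pca additionally handles the error stopping criterion and returns the named tuple described above):

```julia
using LinearAlgebra, Statistics

# Project centred data onto the K top-variance directions.
function pca_sketch(X; K=2)
    Xc = X .- mean(X, dims=1)            # centre each dimension
    E  = eigen(Symmetric(cov(Xc)))       # eigenvalues come out in ascending order
    P  = E.vectors[:, end:-1:end-K+1]    # K eigenvectors of the largest eigenvalues
    return Xc * P                        # the (N,K) reprojected matrix
end

X  = [1.0 2.0; 3.0 6.1; 5.0 10.0; 7.0 13.9]
Xr = pca_sketch(X, K=1)  # 4×1 matrix along the direction of maximum variance
```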
BetaML.Utils.plu — Method

plu(x;α=0.1,c=1)

Piecewise Linear Unit

https://arxiv.org/pdf/1809.09534.pdf

BetaML.Utils.polynomialKernel — Method

Polynomial kernel parametrised with c=0 and d=2 (i.e. a quadratic kernel). For other cᵢ and dᵢ use K = (x,y) -> polynomialKernel(x,y,c=cᵢ,d=dᵢ) as the kernel function in the supporting algorithms

BetaML.Utils.radialKernel — Method

Radial kernel (aka RBF kernel) parametrised with γ=1/2. For other gammas γᵢ use K = (x,y) -> radialKernel(x,y,γ=γᵢ) as the kernel function in the supporting algorithms
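The two parametrisations described above can be sketched directly (polyK and radialK are made-up names standing in for the BetaML functions, just to show the formulas and the custom-parameter closure pattern):

```julia
# Polynomial kernel (x·y + c)^d and RBF kernel exp(-γ‖x-y‖²).
polyK(x, y; c=0, d=2) = (sum(x .* y) + c)^d
radialK(x, y; γ=1/2)  = exp(-γ * sum((x .- y) .^ 2))

x, y = [1.0, 2.0], [3.0, 1.0]
polyK(x, y)                        # (1*3 + 2*1)^2 = 25.0
K = (x, y) -> radialK(x, y, γ=2.0)  # a custom-γ kernel closure, as suggested above
K(x, y)
```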
BetaML.Utils.relu — Method

relu(x)

Rectified Linear Unit

https://www.cs.toronto.edu/~hinton/absps/reluICML.pdf
BetaML.Utils.scale — Function

scale(x,scaleFactors;rev)

Perform a linear scaling of x using the scaling factors scaleFactors.

Parameters
- x: The (n × d) dimension matrix to scale on each dimension d
- scaleFactors: A tuple of the constant and multiplicative scaling factors, respectively [def: the scaling factors needed to scale x to mean 0 and variance 1]
- rev: Whether to invert the scaling [def: false]

Return
- The scaled matrix

Notes:
- Also available is scale!(x,scaleFactors) for in-place scaling.
- Retrieve the scale factors with the getScaleFactors() function.
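The overall workflow of getScaleFactors and scale can be sketched as follows: compute a per-dimension shift (constant term) and multiplicative term once, then apply or invert them. The names with the _sketch suffix are made up for the example; this illustrates the idea rather than reproducing BetaML's implementation (which also supports skipping dimensions):

```julia
using Statistics

# Shift = -mean, multiplier = 1/std, per column.
function scaleFactors_sketch(x)
    shift = -mean(x, dims=1)
    mult  = 1 ./ std(x, dims=1, corrected=false)
    return (shift, mult)
end

# Forward: (x + shift) * mult.  Reverse: undo both steps in opposite order.
scale_sketch(x, (shift, mult); rev=false) =
    rev ? x ./ mult .- shift : (x .+ shift) .* mult

x  = [1.0 10.0; 2.0 20.0; 3.0 30.0]
sf = scaleFactors_sketch(x)
xs = scale_sketch(x, sf)             # each column now has mean 0 and variance 1
x2 = scale_sketch(xs, sf, rev=true)  # recovers the original x
```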
BetaML.Utils.sigmoid — Method

sigmoid(x)

BetaML.Utils.softmax — Method

softmax(x; β=1)

The input x is a vector. Return a PMF.
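A minimal sketch of a numerically stable softmax (softmax_sketch is a made-up name; subtracting the maximum before exponentiating prevents overflow and does not change the result):

```julia
# softmax(x)ᵢ = exp(βxᵢ) / Σⱼ exp(βxⱼ), computed stably.
function softmax_sketch(x; β=1)
    e = exp.(β .* (x .- maximum(x)))
    return e ./ sum(e)
end

p = softmax_sketch([1.0, 2.0, 3.0])
sum(p)  # 1.0 — the output is a valid probability mass function
```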
BetaML.Utils.softplus — Method

softplus(x)

https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus

BetaML.Utils.sterling — Method

Stirling number: the number of partitions of a set of n elements in k sets