# The BetaML.Perceptron Module

`BetaML.Perceptron` — Module

Provides linear and kernel classifiers.

See a runnable example on myBinder

* `perceptron`: Train data using the classical perceptron
* `kernelPerceptron`: Train data using the kernel perceptron
* `pegasos`: Train data using the pegasos algorithm
* `predict`: Predict data using parameters from one of the above algorithms

All algorithms are multiclass, with `perceptron` and `pegasos` employing a one-vs-all strategy, while `kernelPerceptron` employs a *one-vs-one* approach, and return a "probability" for each class in terms of a dictionary for each record. Use `mode(ŷ)` to return a single class prediction per record.
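As a minimal sketch of this workflow (assuming BetaML is installed and that `using BetaML` re-exports both the `Perceptron` and `Utils` functions; the data is purely illustrative):

```julia
using BetaML

xtrain = [1.1 2.1; 5.3 4.2; 1.8 1.7; 7.5 5.2]
ytrain = ["a","b","a","b"]

model  = perceptron(xtrain, ytrain)                         # one-vs-all multiclass training
ŷ      = predict(xtrain, model.θ, model.θ₀, model.classes)  # one Dict label=>score per record
labels = mode(ŷ)                                            # single class prediction per record
```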

The binary equivalent algorithms, accepting only `{-1,+1}` labels, are available as `perceptronBinary`, `kernelPerceptronBinary` and `pegasosBinary`. They are slightly faster, as they don't need to be wrapped in the multi-class equivalent, and return a more informative output.

The multi-class versions are available in the MLJ framework as `PerceptronClassifier`, `KernelPerceptronClassifier` and `PegasosClassifier` respectively.
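A sketch of loading one of these models through MLJ (assuming both `MLJ` and `BetaML` are installed; `make_blobs` generates an illustrative toy dataset):

```julia
using MLJ

# Load the BetaML implementation of the multiclass perceptron
PerceptronClassifier = @load PerceptronClassifier pkg=BetaML

X, y = make_blobs(100, 2, centers=3)      # toy dataset from MLJ
mach = machine(PerceptronClassifier(), X, y)
fit!(mach)
ŷ = predict(mach, X)                      # probabilistic predictions per record
```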

## Module Index

* `BetaML.Perceptron.kernelPerceptron`
* `BetaML.Perceptron.kernelPerceptronBinary`
* `BetaML.Perceptron.pegasos`
* `BetaML.Perceptron.pegasosBinary`
* `BetaML.Perceptron.perceptron`
* `BetaML.Perceptron.perceptronBinary`

## Detailed API

`BetaML.Api.predict` — Function

`predict(x,θ,θ₀)`

Predict a binary label {-1,1} given the feature vector and the linear coefficients.

**Parameters:**

* `x`: Feature matrix of the data to predict (n × d)
* `θ`: The trained parameters
* `θ₀`: The trained bias parameter [def: `0`]

**Return:**

* `y`: Vector of the predicted labels

**Example:**

`julia> predict([1.1 2.1; 5.3 4.2; 1.8 1.7], [3.2,1.2])`

`BetaML.Api.predict` — Method

`predict(x,xtrain,ytrain,α;K)`

Predict a binary label {-1,1} given the feature vector and the training data together with their errors (as trained by a kernel perceptron algorithm).

**Parameters:**

* `x`: Feature matrix of the data to predict (n × d)
* `xtrain`: The feature vectors used for the training
* `ytrain`: The labels of the training set
* `α`: The errors associated to each record
* `K`: The kernel function used for the training and to be used for the prediction [def: `radialKernel`]

**Return:**

* `y`: Vector of the predicted labels

**Example:**

`julia> model = kernelPerceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])`

`julia> ŷ = predict([10 10; 2.2 2.5], model.x, model.y, model.α)`

`BetaML.Api.predict` — Method

`predict(x,xtrain,ytrain,α,classes;K)`

Predict a multiclass label given the new feature vector and a trained kernel perceptron model.

**Parameters:**

* `x`: Feature matrix of the data to predict (n × d)
* `xtrain`: A vector of the feature matrices used for training each of the one-vs-one class matches (i.e. `model.x`)
* `ytrain`: A vector of the label vectors used for training each of the one-vs-one class matches (i.e. `model.y`)
* `α`: A vector of the errors associated to each record (i.e. `model.α`)
* `classes`: The overall classes encountered in training (i.e. `model.classes`)
* `K`: The kernel function used for the training and to be used for the prediction [def: `radialKernel`]

**Return:**

* `ŷ`: Vector of dictionaries `label => probability` (warning: it isn't really a probability, it is just the standardized number of matches "won" by this class compared with the other classes)

**Notes:**

* Use `mode(ŷ)` if you want a single predicted label per record

**Example:**

```julia
julia> model = kernelPerceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷtrain = Perceptron.predict([10 10; 2.2 2.5],model.x,model.y,model.α, model.classes,K=model.K)
```

`BetaML.Api.predict` — Method

`predict(x,θ,θ₀,classes)`

Predict a multiclass label given the feature vector, the linear coefficients and the classes vector.

**Parameters:**

* `x`: Feature matrix of the data to predict (n × d)
* `θ`: Vector of the trained parameters for each one-vs-all model (i.e. `model.θ`)
* `θ₀`: Vector of the trained bias parameters for each one-vs-all model (i.e. `model.θ₀`)
* `classes`: The overall classes encountered in training (i.e. `model.classes`)

**Return:**

* `ŷ`: Vector of dictionaries `label => probability`

**Notes:**

* Use `mode(ŷ)` if you want a single predicted label per record

**Example:**

```julia
julia> model = perceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷtrain = predict([10 10; 2.5 2.5],model.θ,model.θ₀, model.classes)
```

`BetaML.Perceptron.kernelPerceptron` — Method

`kernelPerceptron(x,y;K,T,α,nMsgs,shuffle)`

Train a multiclass kernel classifier "perceptron" algorithm based on x and y.

`kernelPerceptron` is a (potentially) non-linear perceptron-style classifier employing user-defined kernel functions. Multiclass is supported using a one-vs-one approach.

**Parameters:**

* `x`: Feature matrix of the training data (n × d)
* `y`: Associated labels of the training data, in the format of ± 1
* `K`: Kernel function to employ. See `?radialKernel` or `?polynomialKernel` for details, or check `?BetaML.Utils` to verify if other kernels are defined (you can always define your own kernel) [def: `radialKernel`]
* `T`: Maximum number of iterations (aka "epochs") across the whole set (if the set is not fully classified earlier) [def: 100]
* `α`: Initial distribution of the errors [def: `zeros(length(y))`]
* `nMsgs`: Maximum number of messages to show if all iterations are done [def: `0`]
* `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
* `rng`: Random Number Generator (see `FIXEDSEED`) [default: `Random.GLOBAL_RNG`]

**Return a named tuple with:**

* `x`: The x data (eventually shuffled if `shuffle=true`)
* `y`: The labels
* `α`: The errors associated to each record
* `classes`: The label classes encountered in the training

**Notes:**

* The trained model can then be used to make predictions using the function `predict()`.
* This model is available in the MLJ framework as the `KernelPerceptronClassifier`

**Example:**

```julia
julia> model = kernelPerceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷtrain = Perceptron.predict(xtrain,model.x,model.y,model.α, model.classes,K=model.K)
julia> ϵtrain = error(ytrain, mode(ŷtrain))
```

`BetaML.Perceptron.kernelPerceptronBinary` — Method

`kernelPerceptronBinary(x,y;K,T,α,nMsgs,shuffle)`

Train a binary kernel classifier "perceptron" algorithm based on x and y.

**Parameters:**

* `x`: Feature matrix of the training data (n × d)
* `y`: Associated labels of the training data, in the format of ± 1
* `K`: Kernel function to employ. See `?radialKernel` or `?polynomialKernel` for details, or check `?BetaML.Utils` to verify if other kernels are defined (you can always define your own kernel) [def: `radialKernel`]
* `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
* `α`: Initial distribution of the errors [def: `zeros(length(y))`]
* `nMsgs`: Maximum number of messages to show if all iterations are done
* `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
* `rng`: Random Number Generator (see `FIXEDSEED`) [default: `Random.GLOBAL_RNG`]

**Return a named tuple with:**

* `x`: the x data (eventually shuffled if `shuffle=true`)
* `y`: the labels
* `α`: the errors associated to each record
* `errors`: the number of errors in the last iteration
* `besterrors`: the minimum number of errors in classifying the data ever reached
* `iterations`: the actual number of iterations performed
* `separated`: a flag indicating whether the data has been successfully separated

**Notes:**

* The trained data can then be used to make predictions using the function `predict()`. If the option `shuffle` has been used, it is important to use there the returned (x,y,α), as these will have been shuffled compared with the original (x,y).
* Please see `kernelPerceptron` for a multi-class version

**Example:**

`julia> model = kernelPerceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])`
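Because the shuffle caveat above matters in practice, here is a minimal sketch of making predictions with the returned (possibly shuffled) data (assuming BetaML is installed; the data is illustrative):

```julia
using BetaML

xtrain = [1.1 2.1; 5.3 4.2; 1.8 1.7]
ytrain = [-1, 1, -1]

model = kernelPerceptronBinary(xtrain, ytrain, shuffle=true)
# Use the returned model.x / model.y / model.α rather than the original
# xtrain/ytrain, as the returned copies reflect the in-training shuffling:
ŷ = predict([2.1 3.1; 7.3 5.2], model.x, model.y, model.α)
```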

`BetaML.Perceptron.pegasos` — Method

`pegasos(x,y;θ,θ₀,λ,η,T,nMsgs,shuffle,forceOrigin,returnMeanHyperplane)`

Train the multiclass classifier "pegasos" algorithm according to x (features) and y (labels).

Pegasos is a *linear*, gradient-based classifier. Multiclass is supported using a one-vs-all approach.

**Parameters:**

* `x`: Feature matrix of the training data (n × d)
* `y`: Associated labels of the training data, can be in any format (string, integers..)
* `θ`: Initial value of the weights (parameter) [def: `zeros(d)`]
* `θ₀`: Initial value of the weight (parameter) associated to the constant term [def: `0`]
* `λ`: Multiplicative term of the learning rate
* `η`: Learning rate [def: `(t -> 1/sqrt(t))`]
* `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
* `nMsgs`: Maximum number of messages to show if all iterations are done
* `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
* `forceOrigin`: Whether to force `θ₀` to remain zero [def: `false`]
* `returnMeanHyperplane`: Whether to return the average hyperplane coefficients instead of the final ones [def: `false`]
* `rng`: Random Number Generator (see `FIXEDSEED`) [default: `Random.GLOBAL_RNG`]

**Return a named tuple with:**

* `θ`: The weights of the classifier
* `θ₀`: The weight of the classifier associated to the constant term
* `classes`: The classes (unique values) of y

**Notes:**

* The trained parameters can then be used to make predictions using the function `predict()`.
* This model is available in the MLJ framework as the `PegasosClassifier`

**Example:**

```julia
julia> model = pegasos([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀, model.classes)
```

`BetaML.Perceptron.pegasosBinary` — Method

`pegasosBinary(x,y;θ,θ₀,λ,η,T,nMsgs,shuffle,forceOrigin)`

Train the binary pegasos algorithm based on x and y (labels).

**Parameters:**

* `x`: Feature matrix of the training data (n × d)
* `y`: Associated labels of the training data, in the format of ± 1
* `θ`: Initial value of the weights (parameter) [def: `zeros(d)`]
* `θ₀`: Initial value of the weight (parameter) associated to the constant term [def: `0`]
* `λ`: Multiplicative term of the learning rate
* `η`: Learning rate [def: `(t -> 1/sqrt(t))`]
* `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
* `nMsgs`: Maximum number of messages to show if all iterations are done
* `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
* `forceOrigin`: Whether to force `θ₀` to remain zero [def: `false`]

**Return a named tuple with:**

* `θ`: The final weights of the classifier
* `θ₀`: The final weight of the classifier associated to the constant term
* `avgθ`: The average weights of the classifier
* `avgθ₀`: The average weight of the classifier associated to the constant term
* `errors`: The number of errors in the last iteration
* `besterrors`: The minimum number of errors in classifying the data ever reached
* `iterations`: The actual number of iterations performed
* `separated`: Whether the data has been successfully separated

**Notes:**

* The trained parameters can then be used to make predictions using the function `predict()`.

**Example:**

`julia> model = pegasosBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])`
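A minimal sketch of feeding the returned parameters to the binary `predict` method (assuming BetaML is installed; the data is illustrative):

```julia
using BetaML

model = pegasosBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1, 1, -1])
ŷ = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀)  # binary {-1,1} predictions
```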

`BetaML.Perceptron.perceptron` — Method

`perceptron(x,y;θ,θ₀,T,nMsgs,shuffle,forceOrigin,returnMeanHyperplane)`

Train the multiclass classifier "perceptron" algorithm based on x and y (labels).

The perceptron is a *linear* classifier. Multiclass is supported using a one-vs-all approach.

**Parameters:**

* `x`: Feature matrix of the training data (n × d)
* `y`: Associated labels of the training data, can be in any format (string, integers..)
* `θ`: Initial value of the weights (parameter) [def: `zeros(d)`]
* `θ₀`: Initial value of the weight (parameter) associated to the constant term [def: `0`]
* `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
* `nMsgs`: Maximum number of messages to show if all iterations are done [def: `0`]
* `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
* `forceOrigin`: Whether to force `θ₀` to remain zero [def: `false`]
* `returnMeanHyperplane`: Whether to return the average hyperplane coefficients instead of the final ones [def: `false`]
* `rng`: Random Number Generator (see `FIXEDSEED`) [default: `Random.GLOBAL_RNG`]

**Return a named tuple with:**

* `θ`: The weights of the classifier
* `θ₀`: The weight of the classifier associated to the constant term
* `classes`: The classes (unique values) of y

**Notes:**

* The trained parameters can then be used to make predictions using the function `predict()`.
* This model is available in the MLJ framework as the `PerceptronClassifier`

**Example:**

```julia
julia> model = perceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀, model.classes)
```

`BetaML.Perceptron.perceptronBinary` — Method

`perceptronBinary(x,y;θ,θ₀,T,nMsgs,shuffle,forceOrigin)`

Train the binary classifier "perceptron" algorithm based on x and y (labels).

**Parameters:**

* `x`: Feature matrix of the training data (n × d)
* `y`: Associated labels of the training data, in the format of ± 1
* `θ`: Initial value of the weights (parameter) [def: `zeros(d)`]
* `θ₀`: Initial value of the weight (parameter) associated to the constant term [def: `0`]
* `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
* `nMsgs`: Maximum number of messages to show if all iterations are done
* `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
* `forceOrigin`: Whether to force `θ₀` to remain zero [def: `false`]
* `rng`: Random Number Generator (see `FIXEDSEED`) [default: `Random.GLOBAL_RNG`]

**Return a named tuple with:**

* `θ`: The final weights of the classifier
* `θ₀`: The final weight of the classifier associated to the constant term
* `avgθ`: The average weights of the classifier
* `avgθ₀`: The average weight of the classifier associated to the constant term
* `errors`: The number of errors in the last iteration
* `besterrors`: The minimum number of errors in classifying the data ever reached
* `iterations`: The actual number of iterations performed
* `separated`: Whether the data has been successfully separated

**Notes:**

* The trained parameters can then be used to make predictions using the function `predict()`.

**Example:**

`julia> model = perceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])`
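Since both the final and the averaged parameters are returned, either can be fed to the binary `predict` method; averaged weights are often more stable on noisy data. A minimal sketch (assuming BetaML is installed; the data is illustrative):

```julia
using BetaML

model = perceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1, 1, -1])

# Predict with the final weights...
ŷfinal = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀)
# ...or with the averaged weights accumulated across iterations:
ŷavg   = predict([2.1 3.1; 7.3 5.2], model.avgθ, model.avgθ₀)
```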