The MLJ interface to BetaML Models
BetaML.Bmlj — Module
MLJ interface for BetaML models
This module defines the interface to several BetaML models so that they can be used within the MLJ framework.
Note that the MLJ models (whose names may coincide with those of the underlying BetaML models) are not exported. You can access them as BetaML.Bmlj.ModelXYZ.
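For instance, a minimal sketch of using one of these wrapped models directly (the iris dataset is used here purely for illustration):
julia> using MLJ; import BetaML
julia> X, y = @load_iris;
julia> model = BetaML.Bmlj.DecisionTreeClassifier();  # not exported, hence the fully qualified name
julia> mach = machine(model, X, y);
julia> fit!(mach, verbosity=0);
julia> ŷ = predict(mach, X);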
Models available through MLJ
BetaML.Bmlj.mljverbosity_to_betaml_verbosity
BetaML.Bmlj.AutoEncoder
BetaML.Bmlj.DecisionTreeClassifier
BetaML.Bmlj.DecisionTreeRegressor
BetaML.Bmlj.GaussianMixtureClusterer
BetaML.Bmlj.GaussianMixtureImputer
BetaML.Bmlj.GaussianMixtureRegressor
BetaML.Bmlj.GeneralImputer
BetaML.Bmlj.KMeansClusterer
BetaML.Bmlj.KMedoidsClusterer
BetaML.Bmlj.KernelPerceptronClassifier
BetaML.Bmlj.MultitargetGaussianMixtureRegressor
BetaML.Bmlj.MultitargetNeuralNetworkRegressor
BetaML.Bmlj.NeuralNetworkClassifier
BetaML.Bmlj.NeuralNetworkRegressor
BetaML.Bmlj.PegasosClassifier
BetaML.Bmlj.PerceptronClassifier
BetaML.Bmlj.RandomForestClassifier
BetaML.Bmlj.RandomForestImputer
BetaML.Bmlj.RandomForestRegressor
BetaML.Bmlj.SimpleImputer
Detailed models documentation
BetaML.Bmlj.AutoEncoder — Type
mutable struct AutoEncoder <: MLJModelInterface.Unsupervised
A ready-to-use AutoEncoder, from the Beta Machine Learning Toolkit (BetaML), for encoding and decoding of data using neural networks.
Parameters:
- `encoded_size`: The number of neurons (i.e. dimensions) of the encoded data. If the value is a float it is considered a percentage (to be rounded) of the dimensionality of the data [def: `0.33`] (a sketch follows the notes below)
- `layers_size`: Inner layers dimension (i.e. number of neurons). If the value is a float it is considered a percentage (to be rounded) of the dimensionality of the data [def: `nothing`, i.e. apply a specific heuristic]. Consider that the underlying neural network is trying to predict multiple values at the same time; this normally requires many more neurons than a scalar prediction. If `e_layers` or `d_layers` are specified, this parameter is ignored for the respective part.
- `e_layers`: The layers (vector of `AbstractLayer`s) responsible for the encoding of the data [def: `nothing`, i.e. two dense layers with the inner one of `layers_size`]. See `subtypes(BetaML.AbstractLayer)` for supported layers
- `d_layers`: The layers (vector of `AbstractLayer`s) responsible for the decoding of the data [def: `nothing`, i.e. two dense layers with the inner one of `layers_size`]. See `subtypes(BetaML.AbstractLayer)` for supported layers
- `loss`: Loss (cost) function [def: `BetaML.squared_cost`]. Should always assume y and ŷ as (n x d) matrices. Warning: if you change the parameter `loss`, you need to either provide its derivative in the parameter `dloss` or use autodiff with `dloss=nothing`.
- `dloss`: Derivative of the loss function [def: `BetaML.dsquared_cost` if `loss==squared_cost`, `nothing` otherwise, i.e. use the derivative of the squared cost or autodiff]
- `epochs`: Number of epochs, i.e. passages through the whole training sample [def: `200`]
- `batch_size`: Size of each individual batch [def: `8`]
- `opt_alg`: The optimisation algorithm to update the gradient at each batch [def: `BetaML.ADAM()`]. See `subtypes(BetaML.OptimisationAlgorithm)` for supported optimizers
- `shuffle`: Whether to randomly shuffle the data at each iteration (epoch) [def: `true`]
- `tunemethod`: The method, and its parameters, to employ for hyperparameter autotuning. See `SuccessiveHalvingSearch` for the default method. To implement automatic hyperparameter tuning during the (first) `fit!` call simply set `autotune=true` and eventually change the default `tunemethod` options (including the parameter ranges, the resources to employ and the loss function to adopt).
- `descr`: An optional title and/or description for this model
- `rng`: Random Number Generator (see `FIXEDSEED`) [default: `Random.GLOBAL_RNG`]
Notes:
- data must be numerical
- use `transform` to obtain the encoded data, and `inverse_transform` to decode to the original data
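A sketch of the percentage form of `encoded_size` mentioned above (here 25% of the input dimensionality, rounded; the parameter values are illustrative only):
julia> using MLJ
julia> modelType = @load AutoEncoder pkg = "BetaML" verbosity=0;
julia> model = modelType(encoded_size = 0.25, epochs = 100);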
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load AutoEncoder pkg = "BetaML" verbosity=0;
julia> model = modelType(encoded_size=2,layers_size=10);
julia> mach = machine(model, X)
untrained Machine; caches model-specific representations of data
model: AutoEncoder(e_layers = nothing, …)
args:
1: Source @334 ⏎ Table{AbstractVector{Continuous}}
julia> fit!(mach,verbosity=2)
[ Info: Training machine(AutoEncoder(e_layers = nothing, …), …).
***
*** Training for 200 epochs with algorithm BetaML.Nn.ADAM.
Training.. avg loss on epoch 1 (1): 35.48243542158747
Training.. avg loss on epoch 20 (20): 0.07528042222678126
Training.. avg loss on epoch 40 (40): 0.06293071729378613
Training.. avg loss on epoch 60 (60): 0.057035588828991145
Training.. avg loss on epoch 80 (80): 0.056313167754822875
Training.. avg loss on epoch 100 (100): 0.055521461091809436
Training the Neural Network... 52%|██████████████████████████████████████ | ETA: 0:00:01Training.. avg loss on epoch 120 (120): 0.06015206472927942
Training.. avg loss on epoch 140 (140): 0.05536835903285201
Training.. avg loss on epoch 160 (160): 0.05877560142428245
Training.. avg loss on epoch 180 (180): 0.05476302769966953
Training.. avg loss on epoch 200 (200): 0.049240864053557445
Training the Neural Network... 100%|█████████████████████████████████████████████████████████████████████████| Time: 0:00:01
Training of 200 epoch completed. Final epoch error: 0.049240864053557445.
trained Machine; caches model-specific representations of data
model: AutoEncoder(e_layers = nothing, …)
args:
1: Source @334 ⏎ Table{AbstractVector{Continuous}}
julia> X_latent = transform(mach, X)
150×2 Matrix{Float64}:
7.01701 -2.77285
6.50615 -2.9279
6.5233 -2.60754
⋮
6.70196 -10.6059
6.46369 -11.1117
6.20212 -10.1323
julia> X_recovered = inverse_transform(mach,X_latent)
150×4 Matrix{Float64}:
5.04973 3.55838 1.43251 0.242215
4.73689 3.19985 1.44085 0.295257
4.65128 3.25308 1.30187 0.244354
⋮
6.50077 2.93602 5.3303 1.87647
6.38639 2.83864 5.54395 2.04117
6.01595 2.67659 5.03669 1.83234
julia> BetaML.relative_mean_error(MLJ.matrix(X),X_recovered)
0.03387721261716176
BetaML.Bmlj.DecisionTreeClassifier — Type
mutable struct DecisionTreeClassifier <: MLJModelInterface.Probabilistic
A simple Decision Tree model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
- `max_depth::Int64`: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: `0`, i.e. no limits]
- `min_gain::Float64`: The minimum information gain to allow for a node's partition [def: `0`]
- `min_records::Int64`: The minimum number of records a node must hold to be considered for partitioning [def: `2`]
- `max_features::Int64`: The maximum number of (random) features to consider at each partitioning [def: `0`, i.e. look at all features]
- `splitting_criterion::Function`: The function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: `gini`]. Either `gini`, `entropy` or a custom function; it can also be an anonymous function (see the sketch after this list).
- `rng::Random.AbstractRNG`: A Random Number Generator to be used in stochastic parts of the code [default: `Random.GLOBAL_RNG`]
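A sketch of passing a non-default splitting criterion (here BetaML's entropy instead of the default gini; the hyperparameter values are illustrative only):
julia> using MLJ; import BetaML
julia> modelType = @load DecisionTreeClassifier pkg = "BetaML" verbosity=0;
julia> model = modelType(splitting_criterion = BetaML.Utils.entropy, max_depth = 5);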
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load DecisionTreeClassifier pkg = "BetaML" verbosity=0
BetaML.Trees.DecisionTreeClassifier
julia> model = modelType()
DecisionTreeClassifier(
max_depth = 0,
min_gain = 0.0,
min_records = 2,
max_features = 0,
splitting_criterion = BetaML.Utils.gini,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
[ Info: Training machine(DecisionTreeClassifier(max_depth = 0, …), …).
julia> cat_est = predict(mach, X)
150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:
UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
⋮
UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
BetaML.Bmlj.DecisionTreeRegressor — Type
mutable struct DecisionTreeRegressor <: MLJModelInterface.Deterministic
A simple Decision Tree model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
- `max_depth::Int64`: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: `0`, i.e. no limits]
- `min_gain::Float64`: The minimum information gain to allow for a node's partition [def: `0`]
- `min_records::Int64`: The minimum number of records a node must hold to be considered for partitioning [def: `2`]
- `max_features::Int64`: The maximum number of (random) features to consider at each partitioning [def: `0`, i.e. look at all features]
- `splitting_criterion::Function`: The function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: `variance`]. Either `variance` or a custom function; it can also be an anonymous function.
- `rng::Random.AbstractRNG`: A Random Number Generator to be used in stochastic parts of the code [default: `Random.GLOBAL_RNG`]
Example:
julia> using MLJ
julia> X, y = @load_boston;
julia> modelType = @load DecisionTreeRegressor pkg = "BetaML" verbosity=0
BetaML.Trees.DecisionTreeRegressor
julia> model = modelType()
DecisionTreeRegressor(
max_depth = 0,
min_gain = 0.0,
min_records = 2,
max_features = 0,
splitting_criterion = BetaML.Utils.variance,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
[ Info: Training machine(DecisionTreeRegressor(max_depth = 0, …), …).
julia> ŷ = predict(mach, X);
julia> hcat(y,ŷ)
506×2 Matrix{Float64}:
24.0 26.35
21.6 21.6
34.7 34.8
⋮
23.9 23.75
22.0 22.2
11.9 13.2
BetaML.Bmlj.GaussianMixtureClusterer — Type
mutable struct GaussianMixtureClusterer <: MLJModelInterface.Unsupervised
An Expectation-Maximisation clustering algorithm with customisable mixtures, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
- `n_classes::Int64`: Number of mixtures (latent classes) to consider [def: 3]
- `initial_probmixtures::AbstractVector{Float64}`: Initial probabilities of the categorical distribution (n_classes x 1) [default: `[]`]
- `mixtures::Union{Type, Vector{<:AbstractMixture}}`: An array (of length `n_classes`) of the mixtures to employ (see the `?GMM` module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the `initialisation_strategy` parameter is set to "given". This parameter can also be given simply as a type; in this case it is automatically extended to a vector of `n_classes` mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: `[DiagonalGaussian() for i in 1:n_classes]`] (see the sketch after this list)
- `tol::Float64`: Tolerance to stop the algorithm [default: 10^(-6)]
- `minimum_variance::Float64`: Minimum variance for the mixtures [default: 0.05]
- `minimum_covariance::Float64`: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance (see notes).
- `initialisation_strategy::String`: The computation method of the vector of the initial mixtures. One of the following:
  - "grid": using a grid approach
  - "given": using the mixtures provided in the fully qualified `mixtures` parameter
  - "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]
  Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.
- `maximum_iterations::Int64`: Maximum number of iterations [def: `typemax(Int64)`, i.e. ∞]
- `rng::Random.AbstractRNG`: Random Number Generator [default: `Random.GLOBAL_RNG`]
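As a sketch of the two ways of specifying `mixtures` mentioned above, either as a bare type (automatically extended to `n_classes` mixtures) or as an explicit vector (the values below are illustrative only):
julia> using MLJ; import BetaML
julia> modelType = @load GaussianMixtureClusterer pkg = "BetaML" verbosity=0;
julia> model_bytype   = modelType(n_classes = 2, mixtures = BetaML.GMM.SphericalGaussian);
julia> model_byvector = modelType(n_classes = 2, mixtures = [BetaML.GMM.DiagonalGaussian() for i in 1:2]);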
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load GaussianMixtureClusterer pkg = "BetaML" verbosity=0
BetaML.GMM.GaussianMixtureClusterer
julia> model = modelType()
GaussianMixtureClusterer(
n_classes = 3,
initial_probmixtures = Float64[],
mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)],
tol = 1.0e-6,
minimum_variance = 0.05,
minimum_covariance = 0.0,
initialisation_strategy = "kmeans",
maximum_iterations = 9223372036854775807,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(GaussianMixtureClusterer(n_classes = 3, …), …).
Iter. 1: Var. of the post 10.800150114964184 Log-likelihood -650.0186451891216
julia> classes_est = predict(mach, X)
150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:
UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>4.17e-15, 3=>2.1900000000000003e-31)
UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>1.25e-13, 3=>5.87e-31)
UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>4.5e-15, 3=>1.55e-32)
UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>6.93e-14, 3=>3.37e-31)
⋮
UnivariateFinite{Multiclass{3}}(1=>5.39e-25, 2=>0.0167, 3=>0.983)
UnivariateFinite{Multiclass{3}}(1=>7.5e-29, 2=>0.000106, 3=>1.0)
UnivariateFinite{Multiclass{3}}(1=>1.6e-20, 2=>0.594, 3=>0.406)
BetaML.Bmlj.GaussianMixtureImputer — Type
mutable struct GaussianMixtureImputer <: MLJModelInterface.Unsupervised
Impute missing values using a probabilistic approach (Gaussian Mixture Models) fitted using the Expectation-Maximisation algorithm, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
- `n_classes::Int64`: Number of mixtures (latent classes) to consider [def: 3]
- `initial_probmixtures::Vector{Float64}`: Initial probabilities of the categorical distribution (n_classes x 1) [default: `[]`]
- `mixtures::Union{Type, Vector{<:AbstractMixture}}`: An array (of length `n_classes`) of the mixtures to employ (see the `?GMM` module in BetaML). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the `initialisation_strategy` parameter is set to "given". This parameter can also be given simply as a type; in this case it is automatically extended to a vector of `n_classes` mixtures of the specified type. Note that mixing of different mixture types is not currently supported and that the currently implemented mixtures are `SphericalGaussian`, `DiagonalGaussian` and `FullGaussian`. [def: `DiagonalGaussian`]
- `tol::Float64`: Tolerance to stop the algorithm [default: 10^(-6)]
- `minimum_variance::Float64`: Minimum variance for the mixtures [default: 0.05]
- `minimum_covariance::Float64`: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance.
- `initialisation_strategy::String`: The computation method of the vector of the initial mixtures. One of the following:
  - "grid": using a grid approach
  - "given": using the mixtures provided in the fully qualified `mixtures` parameter
  - "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]
  Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.
- `rng::Random.AbstractRNG`: A Random Number Generator to be used in stochastic parts of the code [default: `Random.GLOBAL_RNG`]
Example:
julia> using MLJ
julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
julia> modelType = @load GaussianMixtureImputer pkg = "BetaML" verbosity=0
BetaML.Imputation.GaussianMixtureImputer
julia> model = modelType(initialisation_strategy="grid")
GaussianMixtureImputer(
n_classes = 3,
initial_probmixtures = Float64[],
mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)],
tol = 1.0e-6,
minimum_variance = 0.05,
minimum_covariance = 0.0,
initialisation_strategy = "grid",
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(GaussianMixtureImputer(n_classes = 3, …), …).
Iter. 1: Var. of the post 2.0225921341714286 Log-likelihood -42.96100103213314
julia> X_full = transform(mach) |> MLJ.matrix
9×2 Matrix{Float64}:
1.0 10.5
1.5 14.7366
1.8 8.0
1.7 15.0
3.2 40.0
2.51842 15.1747
3.3 38.0
2.47412 -2.3
5.2 -2.4
BetaML.Bmlj.GaussianMixtureRegressor — Type
mutable struct GaussianMixtureRegressor <: MLJModelInterface.Deterministic
A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.
This is the single-target version of the model. If you want to predict several labels (y) at once, use the MLJ model `MultitargetGaussianMixtureRegressor`.
Hyperparameters:
- `n_classes::Int64`: Number of mixtures (latent classes) to consider [def: 3]
- `initial_probmixtures::Vector{Float64}`: Initial probabilities of the categorical distribution (n_classes x 1) [default: `[]`]
- `mixtures::Union{Type, Vector{<:AbstractMixture}}`: An array (of length `n_classes`) of the mixtures to employ (see the `?GMM` module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the `initialisation_strategy` parameter is set to "given". This parameter can also be given simply as a type; in this case it is automatically extended to a vector of `n_classes` mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: `[DiagonalGaussian() for i in 1:n_classes]`]
- `tol::Float64`: Tolerance to stop the algorithm [default: 10^(-6)]
- `minimum_variance::Float64`: Minimum variance for the mixtures [default: 0.05]
- `minimum_covariance::Float64`: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance (see notes).
- `initialisation_strategy::String`: The computation method of the vector of the initial mixtures. One of the following:
  - "grid": using a grid approach
  - "given": using the mixtures provided in the fully qualified `mixtures` parameter
  - "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]
  Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.
- `maximum_iterations::Int64`: Maximum number of iterations [def: `typemax(Int64)`, i.e. ∞]
- `rng::Random.AbstractRNG`: Random Number Generator [default: `Random.GLOBAL_RNG`]
Example:
julia> using MLJ
julia> X, y = @load_boston;
julia> modelType = @load GaussianMixtureRegressor pkg = "BetaML" verbosity=0
BetaML.GMM.GaussianMixtureRegressor
julia> model = modelType()
GaussianMixtureRegressor(
n_classes = 3,
initial_probmixtures = Float64[],
mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)],
tol = 1.0e-6,
minimum_variance = 0.05,
minimum_covariance = 0.0,
initialisation_strategy = "kmeans",
maximum_iterations = 9223372036854775807,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
[ Info: Training machine(GaussianMixtureRegressor(n_classes = 3, …), …).
Iter. 1: Var. of the post 21.74887448784976 Log-likelihood -21687.09917379566
julia> ŷ = predict(mach, X)
506-element Vector{Float64}:
24.703442835305577
24.70344283512716
⋮
17.172486989759676
17.172486989759644
BetaML.Bmlj.GeneralImputer — Type
mutable struct GeneralImputer <: MLJModelInterface.Unsupervised
Impute missing values using arbitrary learning models, from the Beta Machine Learning Toolkit (BetaML).
Impute missing values using a vector (one per column) of arbitrary learning models (classifiers/regressors, not necessarily from BetaML) that implement the interface `m = Model([options])`, `train!(m,X,Y)` and `predict(m,X)`.
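As a sketch of this contract, a hypothetical column-mean regressor (the struct and the my_fit!/my_predict names below are illustrative only, not part of BetaML) could be wired in through the fit_function and predict_function hyperparameters described below:
julia> using Statistics
julia> mutable struct MeanRegressor    # hypothetical estimator conforming to the interface
           mean::Float64
           MeanRegressor() = new(NaN)
       end
julia> my_fit!(m::MeanRegressor, X, y) = (m.mean = mean(y); m)       # would be passed as fit_function
julia> my_predict(m::MeanRegressor, X) = fill(m.mean, size(X, 1))    # would be passed as predict_function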
Hyperparameters:
- `cols_to_impute::Union{String, Vector{Int64}}`: Columns in the matrix for which to create an imputation model, i.e. to impute. It can be a vector of column IDs (positions), or the keywords "auto" (default) or "all". With "auto" the model automatically detects the columns with missing data and imputes only them. You may manually specify the columns or use "all" if you want to create an imputation model for these columns during training, even if all training data are non-missing, so that the trained model can then be applied to further data with possibly missing values.
- `estimator::Any`: An estimator model (regressor or classifier), eventually with its options (hyper-parameters), to be used to impute the various columns of the matrix. It can also be a `cols_to_impute`-length vector of different estimators to use a different estimator for each column (dimension) to impute, for example when some columns are categorical (and will hence require a classifier) and some others are numerical (hence requiring a regressor). [default: `nothing`, i.e. use BetaML random forests, handling classification and regression jobs automatically].
- `missing_supported::Union{Bool, Vector{Bool}}`: Whether the estimator(s) used to predict the missing data themselves support missing data in the training features (X). If not, when the model for a certain dimension is fitted, dimensions with missing data in the same rows as those where imputation is needed are dropped, and then only non-missing rows of the other remaining dimensions are considered. It can be a vector of boolean values to specify this property for each individual estimator, or a single boolean value to apply to all the estimators [default: `false`]
- `fit_function::Union{Function, Vector{Function}}`: The function used by the estimator(s) to fit the model. It should take as first argument the model itself, as second argument a matrix representing the features, and as third argument a vector representing the labels. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: `BetaML.fit!`]
- `predict_function::Union{Function, Vector{Function}}`: The function used by the estimator(s) to predict the labels. It should take as first argument the model itself and as second argument a matrix representing the features. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: `BetaML.predict`]
- `recursive_passages::Int64`: Define the number of times to go through the various columns to impute their data. Useful when there are data to impute in multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: `1`].
- `rng::Random.AbstractRNG`: A Random Number Generator to be used in stochastic parts of the code [default: `Random.GLOBAL_RNG`]. Note that this influences only the GeneralImputer-specific code; the individual estimators may have their own rng (or similar) parameter.
Examples:
- Using BetaML models:
julia> using MLJ;
julia> import BetaML # The library from which to get the individual estimators to be used for each column imputation
julia> X = ["a" 8.2;
"a" missing;
"a" 7.8;
"b" 21;
"b" 18;
"c" -0.9;
missing 20;
"c" -1.8;
missing -2.3;
"c" -2.4] |> table ;
julia> modelType = @load GeneralImputer pkg = "BetaML" verbosity=0
BetaML.Imputation.GeneralImputer
julia> model = modelType(estimator=BetaML.DecisionTreeEstimator(),recursive_passages=2);
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(GeneralImputer(cols_to_impute = auto, …), …).
julia> X_full = transform(mach) |> MLJ.matrix
10×2 Matrix{Any}:
"a" 8.2
"a" 8.0
"a" 7.8
"b" 21
"b" 18
"c" -0.9
"b" 20
"c" -1.8
"c" -2.3
"c" -2.4
- Using third party packages (in this example `DecisionTree`):
julia> using MLJ;
julia> import DecisionTree # An example of external estimators to be used for each column imputation
julia> X = ["a" 8.2;
"a" missing;
"a" 7.8;
"b" 21;
"b" 18;
"c" -0.9;
missing 20;
"c" -1.8;
missing -2.3;
"c" -2.4] |> table ;
julia> modelType = @load GeneralImputer pkg = "BetaML" verbosity=0
BetaML.Imputation.GeneralImputer
julia> model = modelType(estimator=[DecisionTree.DecisionTreeClassifier(),DecisionTree.DecisionTreeRegressor()], fit_function=DecisionTree.fit!,predict_function=DecisionTree.predict,recursive_passages=2);
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(GeneralImputer(cols_to_impute = auto, …), …).
julia> X_full = transform(mach) |> MLJ.matrix
10×2 Matrix{Any}:
"a" 8.2
"a" 7.51111
"a" 7.8
"b" 21
"b" 18
"c" -0.9
"b" 20
"c" -1.8
"c" -2.3
"c" -2.4
BetaML.Bmlj.KMeansClusterer — Type
mutable struct KMeansClusterer <: MLJModelInterface.Unsupervised
The classical KMeansClusterer clustering algorithm, from the Beta Machine Learning Toolkit (BetaML).
Parameters:
- `n_classes::Int64`: Number of classes to discriminate the data [def: 3]
- `dist::Function`: Function to employ as distance. Defaults to the Euclidean distance. Can be one of the predefined distances (`l1_distance`, `l2_distance`, `l2squared_distance`, `cosine_distance`), any user defined function accepting two vectors and returning a scalar, or an anonymous function with the same characteristics (see the sketch after the notes below). Note that, contrary to `KMedoidsClusterer`, the `KMeansClusterer` algorithm is not guaranteed to converge with distances other than the Euclidean one.
- `initialisation_strategy::String`: The computation method of the vector of the initial representatives. One of the following:
  - "random": randomly in the X space
  - "grid": using a grid approach
  - "shuffle": selecting randomly within the available points [default]
  - "given": using a provided set of initial representatives given in the `initial_representatives` parameter
- `initial_representatives::Union{Nothing, Matrix{Float64}}`: Provided (K x D) matrix of initial representatives (useful only with `initialisation_strategy="given"`) [default: `nothing`]
- `rng::Random.AbstractRNG`: Random Number Generator [default: `Random.GLOBAL_RNG`]
Notes:
- data must be numerical
- online fitting (re-fitting with new data) is supported
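A sketch of passing a custom distance (here the Manhattan/L1 distance written as an anonymous function; any function taking two vectors and returning a scalar should do, keeping in mind the convergence caveat above):
julia> using MLJ
julia> modelType = @load KMeansClusterer pkg = "BetaML" verbosity=0;
julia> model = modelType(n_classes = 3, dist = (x, y) -> sum(abs.(x .- y)));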
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load KMeansClusterer pkg = "BetaML" verbosity=0
BetaML.Clustering.KMeansClusterer
julia> model = modelType()
KMeansClusterer(
n_classes = 3,
dist = BetaML.Clustering.var"#34#36"(),
initialisation_strategy = "shuffle",
initial_representatives = nothing,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(KMeansClusterer(n_classes = 3, …), …).
julia> classes_est = predict(mach, X);
julia> hcat(y,classes_est)
150×2 CategoricalArrays.CategoricalArray{Union{Int64, String},2,UInt32}:
"setosa" 2
"setosa" 2
"setosa" 2
⋮
"virginica" 3
"virginica" 3
"virginica" 1
BetaML.Bmlj.KMedoidsClusterer — Type
mutable struct KMedoidsClusterer <: MLJModelInterface.Unsupervised
Parameters:
- `n_classes::Int64`: Number of classes to discriminate the data [def: 3]
- `dist::Function`: Function to employ as distance. Defaults to the Euclidean distance. Can be one of the predefined distances (`l1_distance`, `l2_distance`, `l2squared_distance`, `cosine_distance`), any user defined function accepting two vectors and returning a scalar, or an anonymous function with the same characteristics.
- `initialisation_strategy::String`: The computation method of the vector of the initial representatives. One of the following:
  - "random": randomly in the X space
  - "grid": using a grid approach
  - "shuffle": selecting randomly within the available points [default]
  - "given": using a provided set of initial representatives given in the `initial_representatives` parameter
- `initial_representatives::Union{Nothing, Matrix{Float64}}`: Provided (K x D) matrix of initial representatives (useful only with `initialisation_strategy="given"`) [default: `nothing`]
- `rng::Random.AbstractRNG`: Random Number Generator [default: `Random.GLOBAL_RNG`]
The K-medoids clustering algorithm with customisable distance function, from the Beta Machine Learning Toolkit (BetaML).
Similar to K-Means, but the "representatives" (the centroids) are guaranteed to be one of the training points. The algorithm works with any arbitrary distance measure.
Notes:
- data must be numerical
- online fitting (re-fitting with new data) is supported
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load KMedoidsClusterer pkg = "BetaML" verbosity=0
BetaML.Clustering.KMedoidsClusterer
julia> model = modelType()
KMedoidsClusterer(
n_classes = 3,
dist = BetaML.Clustering.var"#39#41"(),
initialisation_strategy = "shuffle",
initial_representatives = nothing,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(KMedoidsClusterer(n_classes = 3, …), …).
julia> classes_est = predict(mach, X);
julia> hcat(y,classes_est)
150×2 CategoricalArrays.CategoricalArray{Union{Int64, String},2,UInt32}:
"setosa" 3
"setosa" 3
"setosa" 3
⋮
"virginica" 1
"virginica" 1
"virginica" 2
BetaML.Bmlj.KernelPerceptronClassifier — Type
mutable struct KernelPerceptronClassifier <: MLJModelInterface.Probabilistic
The kernel perceptron algorithm using one-vs-one for multiclass, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
- `kernel::Function`: Kernel function to employ. See `?radial_kernel` or `?polynomial_kernel` (once the BetaML package is loaded) for details, or check `?BetaML.Utils` to verify if other kernels are defined (you can always define your own kernel) [def: `radial_kernel`] (see the sketch after this list)
- `epochs::Int64`: Maximum number of epochs, i.e. passages through the whole training sample [def: `100`]
- `initial_errors::Union{Nothing, Vector{Vector{Int64}}}`: Initial distribution of the number of errors [def: `nothing`, i.e. zeros]. If provided, this should be an nModels-length vector of nRecords-long integer vectors, where nModels is computed as `(n_classes * (n_classes - 1)) / 2`
- `shuffle::Bool`: Whether to randomly shuffle the data at each iteration (epoch) [def: `true`]
- `rng::Random.AbstractRNG`: A Random Number Generator to be used in stochastic parts of the code [default: `Random.GLOBAL_RNG`]
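A sketch of a custom kernel (here a simple quadratic polynomial kernel written as an anonymous function; any function of two vectors returning a scalar should work, and the values below are illustrative only):
julia> using MLJ
julia> modelType = @load KernelPerceptronClassifier pkg = "BetaML" verbosity=0;
julia> model = modelType(kernel = (x, y) -> (1 + x' * y)^2, epochs = 50);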
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load KernelPerceptronClassifier pkg = "BetaML"
[ Info: For silent loading, specify `verbosity=0`.
import BetaML ✔
BetaML.Perceptron.KernelPerceptronClassifier
julia> model = modelType()
KernelPerceptronClassifier(
kernel = BetaML.Utils.radial_kernel,
epochs = 100,
initial_errors = nothing,
shuffle = true,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
julia> est_classes = predict(mach, X)
150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:
UnivariateFinite{Multiclass{3}}(setosa=>0.665, versicolor=>0.245, virginica=>0.09)
UnivariateFinite{Multiclass{3}}(setosa=>0.665, versicolor=>0.245, virginica=>0.09)
⋮
UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.245, virginica=>0.665)
UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.665, virginica=>0.245)
BetaML.Bmlj.MultitargetGaussianMixtureRegressor — Type
mutable struct MultitargetGaussianMixtureRegressor <: MLJModelInterface.Deterministic
A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.
This is the multi-target version of the model. If you want to predict a single label (y), use the MLJ model `GaussianMixtureRegressor`.
Hyperparameters:
- `n_classes::Int64`: Number of mixtures (latent classes) to consider [def: 3]
- `initial_probmixtures::Vector{Float64}`: Initial probabilities of the categorical distribution (n_classes x 1) [default: `[]`]
- `mixtures::Union{Type, Vector{<:AbstractMixture}}`: An array (of length `n_classes`) of the mixtures to employ (see the `?GMM` module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the gaussian ones). Fully qualified mixtures are useful only if the `initialisation_strategy` parameter is set to "given". This parameter can also be given simply as a type; in this case it is automatically extended to a vector of `n_classes` mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: `[DiagonalGaussian() for i in 1:n_classes]`]
- `tol::Float64`: Tolerance to stop the algorithm [default: 10^(-6)]
- `minimum_variance::Float64`: Minimum variance for the mixtures [default: 0.05]
- `minimum_covariance::Float64`: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance (see notes).
- `initialisation_strategy::String`: The computation method of the vector of the initial mixtures. One of the following:
  - "grid": using a grid approach
  - "given": using the mixtures provided in the fully qualified `mixtures` parameter
  - "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]
  Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.
- `maximum_iterations::Int64`: Maximum number of iterations [def: `typemax(Int64)`, i.e. ∞]
- `rng::Random.AbstractRNG`: Random Number Generator [default: `Random.GLOBAL_RNG`]
Example:
julia> using MLJ
julia> X, y = @load_boston;
julia> ydouble = hcat(y, y .*2 .+5);
julia> modelType = @load MultitargetGaussianMixtureRegressor pkg = "BetaML" verbosity=0
BetaML.GMM.MultitargetGaussianMixtureRegressor
julia> model = modelType()
MultitargetGaussianMixtureRegressor(
n_classes = 3,
initial_probmixtures = Float64[],
mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)],
tol = 1.0e-6,
minimum_variance = 0.05,
minimum_covariance = 0.0,
initialisation_strategy = "kmeans",
maximum_iterations = 9223372036854775807,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, ydouble);
julia> fit!(mach);
[ Info: Training machine(MultitargetGaussianMixtureRegressor(n_classes = 3, …), …).
Iter. 1: Var. of the post 20.46947926187522 Log-likelihood -23662.72770575145
julia> ŷdouble = predict(mach, X)
506×2 Matrix{Float64}:
23.3358 51.6717
23.3358 51.6717
⋮
16.6843 38.3686
16.6843 38.3686
BetaML.Bmlj.MultitargetNeuralNetworkRegressor — Type
mutable struct MultitargetNeuralNetworkRegressor <: MLJModelInterface.Deterministic
A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of multiple dimensional targets.
Parameters:
- `layers`: Array of layer objects [def: `nothing`, i.e. basic network]. See `subtypes(BetaML.AbstractLayer)` for supported layers
- `loss`: Loss (cost) function [def: `BetaML.squared_cost`]. Should always assume y and ŷ as matrices. Warning: if you change the parameter `loss`, you need to either provide its derivative in the parameter `dloss` or use autodiff with `dloss=nothing`.
- `dloss`: Derivative of the loss function [def: `BetaML.dsquared_cost`, i.e. use the derivative of the squared cost]. Use `nothing` for autodiff.
- `epochs`: Number of epochs, i.e. passages through the whole training sample [def: `300`]
- `batch_size`: Size of each individual batch [def: `16`]
- `opt_alg`: The optimisation algorithm to update the gradient at each batch [def: `BetaML.ADAM()`]. See `subtypes(BetaML.OptimisationAlgorithm)` for supported optimizers
- `shuffle`: Whether to randomly shuffle the data at each iteration (epoch) [def: `true`]
- `descr`: An optional title and/or description for this model
- `cb`: A callback function to provide information during training [def: `BetaML.fitting_info`]
- `rng`: Random Number Generator (see `FIXEDSEED`) [default: `Random.GLOBAL_RNG`]
Notes:
- data must be numerical
- the label should be an n-records by n-dimensions matrix
Example:
julia> using MLJ
julia> X, y = @load_boston;
julia> ydouble = hcat(y, y .*2 .+5);
julia> modelType = @load MultitargetNeuralNetworkRegressor pkg = "BetaML" verbosity=0
BetaML.Nn.MultitargetNeuralNetworkRegressor
julia> layers = [BetaML.DenseLayer(12,50,f=BetaML.relu),BetaML.DenseLayer(50,50,f=BetaML.relu),BetaML.DenseLayer(50,50,f=BetaML.relu),BetaML.DenseLayer(50,2,f=BetaML.relu)];
julia> model = modelType(layers=layers,opt_alg=BetaML.ADAM(),epochs=500)
MultitargetNeuralNetworkRegressor(
layers = BetaML.Nn.AbstractLayer[BetaML.Nn.DenseLayer([-0.2591582523441157 -0.027962845131416225 … 0.16044535560124418 -0.12838827994676857; -0.30381834909561184 0.2405495243851402 … -0.2588144861880588 0.09538577909777807; … ; -0.017320292924711156 -0.14042266424603767 … 0.06366999105841187 -0.13419651752478906; 0.07393079961409338 0.24521350531110264 … 0.04256867886217541 -0.0895506802948175], [0.14249427336553644, 0.24719379413682485, -0.25595911822556566, 0.10034088778965933, -0.017086404878505712, 0.21932184025609347, -0.031413516834861266, -0.12569076082247596, -0.18080140982481183, 0.14551901873323253 … -0.13321995621967364, 0.2436582233332092, 0.0552222336976439, 0.07000814133633904, 0.2280064379660025, -0.28885681475734193, -0.07414214246290696, -0.06783184733650621, -0.055318068046308455, -0.2573488383282579], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.0395424111703751 -0.22531232360829911 … -0.04341228943744482 0.024336206858365517; -0.16481887432946268 0.17798073384748508 … -0.18594039305095766 0.051159225856547474; … ; -0.011639475293705043 -0.02347011206244673 … 0.20508869536159186 -0.1158382446274592; -0.19078069527757857 -0.007487540070740484 … -0.21341165344291158 -0.24158671316310726], [-0.04283623889330032, 0.14924461547060602, -0.17039563392959683, 0.00907774027816255, 0.21738885963113852, -0.06308040225941691, -0.14683286822101105, 0.21726892197970937, 0.19784321784707126, -0.0344988665714947 … -0.23643089430602846, -0.013560425201427584, 0.05323948910726356, -0.04644175812567475, -0.2350400292671211, 0.09628312383424742, 0.07016420995205697, -0.23266392927140334, -0.18823664451487, 0.2304486691429084], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.11504184627266828 0.08601794194664503 … 0.03843129724045469 -0.18417305624127284; 0.10181551438831654 0.13459759904443674 … 0.11094951365942118 -0.1549466590355218; … ; 0.15279817525427697 0.0846661196058916 … -0.07993619892911122 0.07145402617285884; -0.1614160186346092 -0.13032002335149 … -0.12310552194729624 -0.15915773071049827], [-0.03435885900946367, -0.1198543931290306, 0.008454985905194445, -0.17980887188986966, -0.03557204910359624, 0.19125847393334877, -0.10949700778538696, -0.09343206702591, -0.12229583511781811, -0.09123969069220564 … 0.22119233518322862, 0.2053873143308657, 0.12756489387198222, 0.11567243705173319, -0.20982445664020496, 0.1595157838386987, -0.02087331046544119, -0.20556423263489765, -0.1622837764237961, -0.019220998739847395], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.25796717031347993 0.17579536633402948 … -0.09992960168785256 -0.09426177454620635; -0.026436330246675632 0.18070899284865127 … -0.19310119102392206 -0.06904005900252091], [0.16133004882307822, -0.3061228721091248], BetaML.Utils.relu, BetaML.Utils.drelu)],
loss = BetaML.Utils.squared_cost,
dloss = BetaML.Utils.dsquared_cost,
epochs = 500,
batch_size = 32,
opt_alg = BetaML.Nn.ADAM(BetaML.Nn.var"#90#93"(), 1.0, 0.9, 0.999, 1.0e-8, BetaML.Nn.Learnable[], BetaML.Nn.Learnable[]),
shuffle = true,
descr = "",
cb = BetaML.Nn.fitting_info,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, ydouble);
julia> fit!(mach);
julia> ŷdouble = predict(mach, X);
julia> hcat(ydouble,ŷdouble)
506×4 Matrix{Float64}:
24.0 53.0 28.4624 62.8607
21.6 48.2 22.665 49.7401
34.7 74.4 31.5602 67.9433
33.4 71.8 33.0869 72.4337
⋮
23.9 52.8 23.3573 50.654
22.0 49.0 22.1141 48.5926
11.9 28.8 19.9639 45.5823
BetaML.Bmlj.NeuralNetworkClassifier — Type
mutable struct NeuralNetworkClassifier <: MLJModelInterface.Probabilistic
A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for classification problems.
Parameters:
- `layers`: Array of layer objects [def: `nothing`, i.e. basic network]. See `subtypes(BetaML.AbstractLayer)` for supported layers. The last "softmax" layer is automatically added.
- `loss`: Loss (cost) function [def: `BetaML.crossentropy`]. Should always assume y and ŷ as matrices. Warning: if you change the parameter `loss`, you need to either provide its derivative in the parameter `dloss` or use autodiff with `dloss=nothing`.
- `dloss`: Derivative of the loss function [def: `BetaML.dcrossentropy`, i.e. the derivative of the cross-entropy]. Use `nothing` for autodiff.
- `epochs`: Number of epochs, i.e. passages through the whole training sample [def: `200`]
- `batch_size`: Size of each individual batch [def: `16`]
- `opt_alg`: The optimisation algorithm to update the gradient at each batch [def: `BetaML.ADAM()`]. See `subtypes(BetaML.OptimisationAlgorithm)` for supported optimizers
- `shuffle`: Whether to randomly shuffle the data at each iteration (epoch) [def: `true`]
- `descr`: An optional title and/or description for this model
- `cb`: A callback function to provide information during training [def: `BetaML.fitting_info`]
- `categories`: The categories to represent as columns. [def: `nothing`, i.e. unique training values]
- `handle_unknown`: How to handle categories not seen in training or not present in the provided `categories` array? "error" (default) raises an error, "infrequent" adds a specific column for these categories.
- `other_categories_name`: Which value to assign during prediction to this "other" category (i.e. categories not seen in training or not present in the provided `categories` array)? [def: `nothing`, i.e. `typemax(Int64)` for integer vectors and "other" for other types]. This setting is active only if `handle_unknown="infrequent"` and in that case it MUST be specified if Y is neither integer nor string.
- `rng`: Random Number Generator [default: `Random.GLOBAL_RNG`]
Notes:
- data must be numerical
- the label should be an n-records by n-dimensions matrix (e.g. one-hot-encoded data for classification), where the output columns should be interpreted as the probabilities for each category.
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load NeuralNetworkClassifier pkg = "BetaML" verbosity=0
BetaML.Nn.NeuralNetworkClassifier
julia> layers = [BetaML.DenseLayer(4,8,f=BetaML.relu),BetaML.DenseLayer(8,8,f=BetaML.relu),BetaML.DenseLayer(8,3,f=BetaML.relu),BetaML.VectorFunctionLayer(3,f=BetaML.softmax)];
julia> model = modelType(layers=layers,opt_alg=BetaML.ADAM())
NeuralNetworkClassifier(
layers = BetaML.Nn.AbstractLayer[BetaML.Nn.DenseLayer([-0.376173352338049 0.7029289511758696 -0.5589563304592478 -0.21043274001651874; 0.044758889527899415 0.6687689636685921 0.4584331114653877 0.6820506583840453; … ; -0.26546358457167507 -0.28469736227283804 -0.164225549922154 -0.516785639164486; -0.5146043550684141 -0.0699113265130964 0.14959906603941908 -0.053706860039406834], [0.7003943613125758, -0.23990840466587576, -0.23823126271387746, 0.4018101580410387, 0.2274483050356888, -0.564975060667734, 0.1732063297031089, 0.11880299829896945], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.029467850439546583 0.4074661266592745 … 0.36775675246760053 -0.595524555448422; 0.42455597698371306 -0.2458082732997091 … -0.3324220683462514 0.44439454998610595; … ; -0.2890883863364267 -0.10109249362508033 … -0.0602680568207582 0.18177278845097555; -0.03432587226449335 -0.4301192922760063 … 0.5646018168286626 0.47269177680892693], [0.13777442835428688, 0.5473306726675433, 0.3781939472904011, 0.24021813428130567, -0.0714779477402877, -0.020386373530818958, 0.5465466618404464, -0.40339790713616525], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([0.6565120540082393 0.7139211611842745 … 0.07809812467915389 -0.49346311403373844; -0.4544472987041656 0.6502667641568863 … 0.43634608676548214 0.7213049952968921; 0.41212264783075303 -0.21993289366360613 … 0.25365007887755064 -0.5664469566269569], [-0.6911986792747682, -0.2149343209329364, -0.6347727539063817], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.VectorFunctionLayer{0}(fill(NaN), 3, 3, BetaML.Utils.softmax, BetaML.Utils.dsoftmax, nothing)],
loss = BetaML.Utils.crossentropy,
dloss = BetaML.Utils.dcrossentropy,
epochs = 100,
batch_size = 32,
opt_alg = BetaML.Nn.ADAM(BetaML.Nn.var"#90#93"(), 1.0, 0.9, 0.999, 1.0e-8, BetaML.Nn.Learnable[], BetaML.Nn.Learnable[]),
shuffle = true,
descr = "",
cb = BetaML.Nn.fitting_info,
categories = nothing,
handle_unknown = "error",
other_categories_name = nothing,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
julia> classes_est = predict(mach, X)
150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:
UnivariateFinite{Multiclass{3}}(setosa=>0.575, versicolor=>0.213, virginica=>0.213)
UnivariateFinite{Multiclass{3}}(setosa=>0.573, versicolor=>0.213, virginica=>0.213)
⋮
UnivariateFinite{Multiclass{3}}(setosa=>0.236, versicolor=>0.236, virginica=>0.529)
UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.254, virginica=>0.492)
BetaML.Bmlj.NeuralNetworkRegressor — Type
mutable struct NeuralNetworkRegressor <: MLJModelInterface.Deterministic
A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of a single dimensional target.
Parameters:
- `layers`: Array of layer objects [def: `nothing`, i.e. basic network]. See `subtypes(BetaML.AbstractLayer)` for supported layers
- `loss`: Loss (cost) function [def: `BetaML.squared_cost`]. Should always assume y and ŷ as matrices, even if the regression task is 1-D. Warning: if you change the parameter `loss`, you need to either provide its derivative in the parameter `dloss` or use autodiff with `dloss=nothing`.
- `dloss`: Derivative of the loss function [def: `BetaML.dsquared_cost`, i.e. use the derivative of the squared cost]. Use `nothing` for autodiff.
- `epochs`: Number of epochs, i.e. passages through the whole training sample [def: `200`]
- `batch_size`: Size of each individual batch [def: `16`]
- `opt_alg`: The optimisation algorithm to update the gradient at each batch [def: `BetaML.ADAM()`]. See `subtypes(BetaML.OptimisationAlgorithm)` for supported optimizers
- `shuffle`: Whether to randomly shuffle the data at each iteration (epoch) [def: `true`]
- `descr`: An optional title and/or description for this model
- `cb`: A callback function to provide information during training [def: `fitting_info`]
- `rng`: Random Number Generator (see `FIXEDSEED`) [default: `Random.GLOBAL_RNG`]
Notes:
- data must be numerical
- the label should be an n-records vector.
Example:
julia> using MLJ
julia> X, y = @load_boston;
julia> modelType = @load NeuralNetworkRegressor pkg = "BetaML" verbosity=0
BetaML.Nn.NeuralNetworkRegressor
julia> layers = [BetaML.DenseLayer(12,20,f=BetaML.relu),BetaML.DenseLayer(20,20,f=BetaML.relu),BetaML.DenseLayer(20,1,f=BetaML.relu)];
julia> model = modelType(layers=layers,opt_alg=BetaML.ADAM());
NeuralNetworkRegressor(
layers = BetaML.Nn.AbstractLayer[BetaML.Nn.DenseLayer([-0.23249759178069676 -0.4125090172711131 … 0.41401934928739 -0.33017881111237535; -0.27912169279319965 0.270551221249931 … 0.19258414323473344 0.1703002982374256; … ; 0.31186742456482447 0.14776438287394805 … 0.3624993442655036 0.1438885872964824; 0.24363744610286758 -0.3221033024934767 … 0.14886090419299408 0.038411663101909355], [-0.42360286004241765, -0.34355377040029594, 0.11510963232946697, 0.29078650404397893, -0.04940236502546075, 0.05142849152316714, -0.177685375947775, 0.3857630523957018, -0.25454667127064756, -0.1726731848206195, 0.29832456225553444, -0.21138505291162835, -0.15763643112604903, -0.08477044513587562, -0.38436681165349196, 0.20538016429104916, -0.25008157754468335, 0.268681800562054, 0.10600581996650865, 0.4262194464325672], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.08534180387478185 0.19659398307677617 … -0.3413633217504578 -0.0484925247381256; 0.0024419192794883915 -0.14614102508129 … -0.21912059923003044 0.2680725396694708; … ; 0.25151545823147886 -0.27532269951606037 … 0.20739970895058063 0.2891938885916349; -0.1699020711688904 -0.1350423717084296 … 0.16947589410758873 0.3629006047373296], [0.2158116357688406, -0.3255582642532289, -0.057314442103850394, 0.29029696770539953, 0.24994080694366455, 0.3624239027782297, -0.30674318230919984, -0.3854738338935017, 0.10809721838554087, 0.16073511121016176, -0.005923262068960489, 0.3157147976348795, -0.10938918304264739, -0.24521229198853187, -0.307167732178712, 0.0808907777008302, -0.014577497150872254, -0.0011287181458157214, 0.07522282588658086, 0.043366500526073104], BetaML.Utils.relu, BetaML.Utils.drelu), BetaML.Nn.DenseLayer([-0.021367697115938555 -0.28326652172347155 … 0.05346175368370165 -0.26037328415871647], [-0.2313659199724562], BetaML.Utils.relu, BetaML.Utils.drelu)],
loss = BetaML.Utils.squared_cost,
dloss = BetaML.Utils.dsquared_cost,
epochs = 100,
batch_size = 32,
opt_alg = BetaML.Nn.ADAM(BetaML.Nn.var"#90#93"(), 1.0, 0.9, 0.999, 1.0e-8, BetaML.Nn.Learnable[], BetaML.Nn.Learnable[]),
shuffle = true,
descr = "",
cb = BetaML.Nn.fitting_info,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
julia> ŷ = predict(mach, X);
julia> hcat(y,ŷ)
506×2 Matrix{Float64}:
24.0 30.7726
21.6 28.0811
34.7 31.3194
⋮
23.9 30.9032
22.0 29.49
11.9 27.2438
BetaML.Bmlj.PegasosClassifier — Type
mutable struct PegasosClassifier <: MLJModelInterface.Probabilistic
The gradient-based linear "pegasos" classifier using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
- `initial_coefficients::Union{Nothing, Matrix{Float64}}`: N-classes by D-dimensions matrix of initial linear coefficients [def: `nothing`, i.e. zeros]
- `initial_constant::Union{Nothing, Vector{Float64}}`: N-classes vector of initial constant terms [def: `nothing`, i.e. zeros]
- `learning_rate::Function`: Learning rate [def: (epoch -> 1/sqrt(epoch))] (see the sketch after this list)
- `learning_rate_multiplicative::Float64`: Multiplicative term of the learning rate [def: `0.5`]
- `epochs::Int64`: Maximum number of epochs, i.e. passages through the whole training sample [def: `1000`]
- `shuffle::Bool`: Whether to randomly shuffle the data at each iteration (epoch) [def: `true`]
- `force_origin::Bool`: Whether to force the parameter associated with the constant term to remain zero [def: `false`]
- `return_mean_hyperplane::Bool`: Whether to return the average hyperplane coefficients instead of the final ones [def: `false`]
- `rng::Random.AbstractRNG`: A Random Number Generator to be used in stochastic parts of the code [default: `Random.GLOBAL_RNG`]
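A sketch of a custom learning-rate schedule (any function of the epoch number returning a scalar; the values below are illustrative only):
julia> using MLJ
julia> modelType = @load PegasosClassifier pkg = "BetaML" verbosity=0;
julia> model = modelType(learning_rate = epoch -> 1/epoch, epochs = 500);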
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load PegasosClassifier pkg = "BetaML" verbosity=0
BetaML.Perceptron.PegasosClassifier
julia> model = modelType()
PegasosClassifier(
initial_coefficients = nothing,
initial_constant = nothing,
learning_rate = BetaML.Perceptron.var"#71#73"(),
learning_rate_multiplicative = 0.5,
epochs = 1000,
shuffle = true,
force_origin = false,
return_mean_hyperplane = false,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
julia> est_classes = predict(mach, X)
150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:
UnivariateFinite{Multiclass{3}}(setosa=>0.817, versicolor=>0.153, virginica=>0.0301)
UnivariateFinite{Multiclass{3}}(setosa=>0.791, versicolor=>0.177, virginica=>0.0318)
⋮
UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.5, virginica=>0.246)
UnivariateFinite{Multiclass{3}}(setosa=>0.283, versicolor=>0.51, virginica=>0.207)
BetaML.Bmlj.PerceptronClassifier — Type
mutable struct PerceptronClassifier <: MLJModelInterface.Probabilistic
The classical perceptron algorithm using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
- `initial_coefficients::Union{Nothing, Matrix{Float64}}`: N-classes by D-dimensions matrix of initial linear coefficients [def: `nothing`, i.e. zeros]
- `initial_constant::Union{Nothing, Vector{Float64}}`: N-classes vector of initial constant terms [def: `nothing`, i.e. zeros]
- `epochs::Int64`: Maximum number of epochs, i.e. passages through the whole training sample [def: `1000`]
- `shuffle::Bool`: Whether to randomly shuffle the data at each iteration (epoch) [def: `true`]
- `force_origin::Bool`: Whether to force the parameter associated with the constant term to remain zero [def: `false`]
- `return_mean_hyperplane::Bool`: Whether to return the average hyperplane coefficients instead of the final ones [def: `false`]
- `rng::Random.AbstractRNG`: A Random Number Generator to be used in stochastic parts of the code [default: `Random.GLOBAL_RNG`]
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load PerceptronClassifier pkg = "BetaML"
[ Info: For silent loading, specify `verbosity=0`.
import BetaML ✔
BetaML.Perceptron.PerceptronClassifier
julia> model = modelType()
PerceptronClassifier(
initial_coefficients = nothing,
initial_constant = nothing,
epochs = 1000,
shuffle = true,
force_origin = false,
return_mean_hyperplane = false,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
[ Info: Training machine(PerceptronClassifier(initial_coefficients = nothing, …), …).
*** Avg. error after epoch 2 : 0.0 (all elements of the set has been correctly classified)
julia> est_classes = predict(mach, X)
150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt8, Float64}:
UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>2.53e-34, virginica=>0.0)
UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>1.27e-18, virginica=>1.86e-310)
⋮
UnivariateFinite{Multiclass{3}}(setosa=>2.77e-57, versicolor=>1.1099999999999999e-82, virginica=>1.0)
UnivariateFinite{Multiclass{3}}(setosa=>3.09e-22, versicolor=>4.03e-25, virginica=>1.0)
BetaML.Bmlj.RandomForestClassifier — Type
mutable struct RandomForestClassifier <: MLJModelInterface.Probabilistic
A simple Random Forest model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
- `n_trees::Int64`: Number of (decision) trees in the forest [def: `30`]
- `max_depth::Int64`: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: `0`, i.e. no limits]
- `min_gain::Float64`: The minimum information gain to allow for a node's partition [def: `0`]
- `min_records::Int64`: The minimum number of records a node must hold to be considered for partitioning [def: `2`]
- `max_features::Int64`: The maximum number of (random) features to consider at each partitioning [def: `0`, i.e. square root of the data dimensions]
- `splitting_criterion::Function`: The function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: `gini`]. Either `gini`, `entropy` or a custom function; it can also be an anonymous function.
- `β::Float64`: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def: `0`, i.e. uniform weights]
- `rng::Random.AbstractRNG`: A Random Number Generator to be used in stochastic parts of the code [default: `Random.GLOBAL_RNG`]
Example:
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load RandomForestClassifier pkg = "BetaML" verbosity=0
BetaML.Trees.RandomForestClassifier
julia> model = modelType()
RandomForestClassifier(
n_trees = 30,
max_depth = 0,
min_gain = 0.0,
min_records = 2,
max_features = 0,
splitting_criterion = BetaML.Utils.gini,
β = 0.0,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
[ Info: Training machine(RandomForestClassifier(n_trees = 30, …), …).
julia> cat_est = predict(mach, X)
150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:
UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
⋮
UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0667, virginica=>0.933)
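Building on the example above, the following is a minimal sketch (hyperparameter values are purely illustrative, not recommended settings) of constructing the model with a depth limit and a non-zero β, and obtaining point predictions with MLJ's predict_mode:
julia> using MLJ
julia> X, y = @load_iris;
julia> RFC = @load RandomForestClassifier pkg = "BetaML" verbosity=0;
julia> model = RFC(n_trees=50, max_depth=10, β=0.01);  # illustrative values: 50 trees, depth limit 10, mild tree weighting
julia> mach = machine(model, X, y);
julia> fit!(mach, verbosity=0);
julia> ŷ = predict_mode(mach, X);                      # point (mode) predictions derived from the probabilistic output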
BetaML.Bmlj.RandomForestImputer
— Typemutable struct RandomForestImputer <: MLJModelInterface.Unsupervised
Impute missing values using Random Forests, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
n_trees::Int64
: Number of (decision) trees in the forest [def:30
]max_depth::Union{Nothing, Int64}
: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def:nothing
, i.e. no limits]min_gain::Float64
: The minimum information gain to allow for a node's partition [def:0
]min_records::Int64
: The minimum number of records a node must hold to be considered for partitioning [def:2
]max_features::Union{Nothing, Int64}
: The maximum number of (random) features to consider at each partitioning [def:nothing
, i.e. square root of the data dimension]forced_categorical_cols::Vector{Int64}
: Specify the positions of the integer columns to treat as categorical instead of cardinal. [Default: empty vector (all numerical cols are treated as cardinal by default and the others as categorical)]splitting_criterion::Union{Nothing, Function}
: Eithergini
,entropy
orvariance
. This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def:nothing
, i.e.gini
for categorical labels (classification task) andvariance
for numerical labels (regression task)]. It can be an anonymous function.recursive_passages::Int64
: Define the number of times to go through the various columns to impute their data. Useful when data to impute are spread over multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are in random order. A sketch with multiple passages follows the example below [default:1
].rng::Random.AbstractRNG
: A Random Number Generator to be used in stochastic parts of the code [default:Random.GLOBAL_RNG
]
Example:
julia> using MLJ
julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
julia> modelType = @load RandomForestImputer pkg = "BetaML" verbosity=0
BetaML.Imputation.RandomForestImputer
julia> model = modelType(n_trees=40)
RandomForestImputer(
n_trees = 40,
max_depth = nothing,
min_gain = 0.0,
min_records = 2,
max_features = nothing,
forced_categorical_cols = Int64[],
splitting_criterion = nothing,
recursive_passages = 1,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(RandomForestImputer(n_trees = 40, …), …).
julia> X_full = transform(mach) |> MLJ.matrix
9×2 Matrix{Float64}:
1.0 10.5
1.5 10.3909
1.8 8.0
1.7 15.0
3.2 40.0
2.88375 8.66125
3.3 38.0
3.98125 -2.3
5.2 -2.4
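As a variation on the example above, here is a minimal sketch (values are illustrative) using more than one recursive passage over the columns, which can help when several columns contain missing values:
julia> using MLJ
julia> X = [1 10.5; 1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table;
julia> RFI = @load RandomForestImputer pkg = "BetaML" verbosity=0;
julia> model = RFI(n_trees=40, recursive_passages=3);  # illustrative: three imputation passes over the columns
julia> mach = machine(model, X);
julia> fit!(mach, verbosity=0);
julia> X_full = transform(mach) |> MLJ.matrix;         # imputed version of the training data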
BetaML.Bmlj.RandomForestRegressor
— Typemutable struct RandomForestRegressor <: MLJModelInterface.Deterministic
A simple Random Forest model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
n_trees::Int64
: Number of (decision) trees in the forest [def:30
]max_depth::Int64
: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def:0
, i.e. no limits]min_gain::Float64
: The minimum information gain to allow for a node's partition [def:0
]min_records::Int64
: The minimum number of records a node must hold to be considered for partitioning [def:2
]max_features::Int64
: The maximum number of (random) features to consider at each partitioning [def:0
, i.e. square root of the data dimension]splitting_criterion::Function
: This is the name of the function to be used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def:variance
]. Eithervariance
or a custom function. It can also be an anonymous function.β::Float64
: Parameter that regulates the weighting of each tree's score, to be (optionally) used in prediction, based on the error of the individual trees computed on the records on which they have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def:0
, i.e. uniform weights]rng::Random.AbstractRNG
: A Random Number Generator to be used in stochastic parts of the code [default:Random.GLOBAL_RNG
]
Example:
julia> using MLJ
julia> X, y = @load_boston;
julia> modelType = @load RandomForestRegressor pkg = "BetaML" verbosity=0
BetaML.Trees.RandomForestRegressor
julia> model = modelType()
RandomForestRegressor(
n_trees = 30,
max_depth = 0,
min_gain = 0.0,
min_records = 2,
max_features = 0,
splitting_criterion = BetaML.Utils.variance,
β = 0.0,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X, y);
julia> fit!(mach);
[ Info: Training machine(RandomForestRegressor(n_trees = 30, …), …).
julia> ŷ = predict(mach, X);
julia> hcat(y,ŷ)
506×2 Matrix{Float64}:
24.0 25.8433
21.6 22.4317
34.7 35.5742
33.4 33.9233
⋮
23.9 24.42
22.0 22.4433
11.9 15.5833
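As a follow-up to the example above, here is a minimal sketch (values are illustrative) of estimating out-of-sample performance with MLJ's evaluate and a cross-validation resampling strategy:
julia> using MLJ
julia> X, y = @load_boston;
julia> RFR = @load RandomForestRegressor pkg = "BetaML" verbosity=0;
julia> model = RFR(n_trees=50);   # illustrative value
julia> evaluate(model, X, y, resampling=CV(nfolds=3, shuffle=true, rng=123), measure=rms, verbosity=0);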
BetaML.Bmlj.SimpleImputer
— Typemutable struct SimpleImputer <: MLJModelInterface.Unsupervised
Impute missing values using feature (column) mean, with optional record normalisation (using l-norm
norms), from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
statistic::Function
: The descriptive statistic of the column (feature) to use as imputed value [def:mean
]norm::Union{Nothing, Int64}
: Normalise the feature mean by l-norm
norm of the records [default:nothing
]. Use it (e.g. norm=1
to use the l-1 norm) if the records are highly heterogeneous (e.g. quantity exports of different countries).
Example:
julia> using MLJ
julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
julia> modelType = @load SimpleImputer pkg = "BetaML" verbosity=0
BetaML.Imputation.SimpleImputer
julia> model = modelType(norm=1)
SimpleImputer(
statistic = Statistics.mean,
norm = 1)
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(SimpleImputer(statistic = mean, …), …).
julia> X_full = transform(mach) |> MLJ.matrix
9×2 Matrix{Float64}:
1.0 10.5
1.5 0.295466
1.8 8.0
1.7 15.0
3.2 40.0
0.280952 1.69524
3.3 38.0
0.0750839 -2.3
5.2 -2.4
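A further minimal sketch (assuming the Statistics standard library for median) using a different column statistic and no record normalisation:
julia> using MLJ, Statistics
julia> X = [1 10.5; 1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table;
julia> SI = @load SimpleImputer pkg = "BetaML" verbosity=0;
julia> model = SI(statistic=median);   # impute each missing cell with its column median
julia> mach = machine(model, X);
julia> fit!(mach, verbosity=0);
julia> X_full = transform(mach) |> MLJ.matrix;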
BetaML.Bmlj.mljverbosity_to_betaml_verbosity
— Methodmljverbosity_to_betaml_verbosity(i::Integer) -> Verbosity
Convert any integer (short scale) to one of the defined BetaML verbosity levels. Currently the "steps" are 0, 1, 2 and 3.
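A minimal usage sketch (the concrete Verbosity level returned for each integer is determined by the function's own mapping and is not shown here):
julia> using BetaML
julia> v = BetaML.Bmlj.mljverbosity_to_betaml_verbosity(1);   # MLJ integer verbosity -> BetaML Verbosity level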
MLJModelInterface.fit
— Methodfit(
m::BetaML.Bmlj.AutoEncoder,
verbosity,
X
) -> Tuple{AutoEncoder, Nothing, Nothing}
For the verbosity
parameter see Verbosity
MLJModelInterface.fit
— Methodfit(
m::BetaML.Bmlj.MultitargetNeuralNetworkRegressor,
verbosity,
X,
y
) -> Tuple{NeuralNetworkEstimator, Nothing, Nothing}
For the verbosity
parameter see Verbosity
MLJModelInterface.fit
— MethodMMI.fit(model::NeuralNetworkClassifier, verbosity, X, y)
For the verbosity
parameter see Verbosity
MLJModelInterface.fit
— Methodfit(
m::BetaML.Bmlj.NeuralNetworkRegressor,
verbosity,
X,
y
) -> Tuple{NeuralNetworkEstimator, Nothing, Nothing}
For the verbosity
parameter see Verbosity
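These fit methods are normally not called directly: MLJ invokes them when fit! is called on a machine wrapping the corresponding model. A minimal sketch for the AutoEncoder case (the encoded_size value is illustrative):
julia> using MLJ
julia> X, _ = @load_iris;
julia> AE = @load AutoEncoder pkg = "BetaML" verbosity=0;
julia> mach = machine(AE(encoded_size=2), X);
julia> fit!(mach, verbosity=0);          # dispatches to the MLJModelInterface.fit method documented above
julia> X_enc = transform(mach, X);       # encoded (2-dimensional) representation of X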
MLJModelInterface.predict
— Methodpredict(m::KMeansClusterer, fitResults, X) - Given a fitted clustering model and some observations, predict the class of each observation
MLJModelInterface.transform
— Methodtransform(m, fitResults, X)
Given a trained imputer model, fill the missing data of some new observations. Note that with multiple recursive imputations and inner estimators that don't support missing data, this function works only for the X on which the model has been trained, i.e. it cannot be applied to new matrices with missing values using a model trained on other matrices.
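A short sketch of the limitation described above (the model choice and data are illustrative):
julia> using MLJ
julia> X = [1 10.5; 1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table;
julia> RFI = @load RandomForestImputer pkg = "BetaML" verbosity=0;
julia> mach = machine(RFI(recursive_passages=3), X);
julia> fit!(mach, verbosity=0);
julia> X_full = transform(mach);   # fills the data the machine was trained on
julia> # transform(mach, X_new)   # with recursive passages and inner estimators not supporting missings,
julia> #                          # transforming new data with missing values may not be possible (see note above)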
MLJModelInterface.transform
— Methodtransform(m, fitResults, X) - Given a trained imputer model, fill the missing data of some new observations
MLJModelInterface.transform
— Methodtransform(m::KMeansClusterer, fitResults, X) - Given a fitted clustering model and some observations, return the distances to each centroid
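A minimal sketch contrasting the KMeansClusterer predict and transform methods documented above (the n_classes value is illustrative):
julia> using MLJ
julia> X, _ = @load_iris;
julia> KMC = @load KMeansClusterer pkg = "BetaML" verbosity=0;
julia> mach = machine(KMC(n_classes=3), X);
julia> fit!(mach, verbosity=0);
julia> classes = predict(mach, X);    # assigned class for each observation
julia> dists = transform(mach, X);    # distances of each observation from each centroid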