Feature Normalization

Warning

This section will likely be subject to larger changes and/or redesigns. The preprocessing functionality may be outsourced into a back-end package (see #29).

This package contains a simple model called FeatureNormalizer that can be used to normalize training and test data with the parameters computed from the training data.

x = collect(-5:.1:5)
X = [x x.^2 x.^3]'

# Derives the model from the given data
cs = fit(FeatureNormalizer, X)

# Normalizes the given data using the derived parameters
X_norm = predict(cs, X)
3x101 Array{Float64,2}:
 -1.70647  -1.67235  -1.63822  -1.60409   …  1.56996  1.60409  1.63822  1.67235  1.70647
  2.15985   2.03026   1.90328   1.77893      1.65719  1.77893  1.90328  2.03026  2.15985
 -2.55607  -2.40576  -2.26145  -2.12303      1.99038  2.12303  2.26145  2.40576  2.55607
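To illustrate why the parameters are derived from the training data only, the following is a minimal sketch of the same idea using just the Statistics standard library. The helper names `fit_params` and `apply_params` are hypothetical and not part of this package; the sketch assumes that rows are features, columns are observations, and that the sample (corrected) standard deviation is used.

```julia
using Statistics

# Hypothetical helpers, not part of the package API.
# Assumption: each row is a feature, each column an observation.
fit_params(X) = (vec(mean(X, dims=2)), vec(std(X, dims=2)))
apply_params(X, μ, σ) = (X .- μ) ./ σ

x_train = collect(-5:0.1:5)
X_train = [x_train x_train.^2 x_train.^3]'
x_test = collect(-1:0.5:1)
X_test = [x_test x_test.^2 x_test.^3]'

# Derive μ and σ from the training data only ...
μ, σ = fit_params(X_train)

# ... and reuse the same parameters for both sets
X_train_norm = apply_params(X_train, μ, σ)
X_test_norm = apply_params(X_test, μ, σ)
```

The test set is transformed with the training-set parameters, so both sets end up on the same scale even though the test features themselves will generally not have zero mean and unit variance afterwards.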

The underlying functions can also be used directly.

Centering

center!(X[, μ][, obsdim])

Center X along obsdim around the corresponding entry in the vector μ. In other words, it performs feature-wise centering.

Parameters:
  • X (Array) – Feature matrix that should be centered in-place.
  • μ (Vector) – Vector of means. If not specified, it defaults to the feature-specific means.
  • obsdim

    Optional. If it makes sense for the type of X, then obsdim can be used to specify which dimension of X denotes the observations. It can be specified in a type-stable manner as a positional argument, or as a more convenient keyword parameter. See Observation Dimension for more information.

Returns:

Returns the parameter vector μ itself.

μ = center!(X, μ)
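As a rough illustration of what feature-wise, in-place centering amounts to, here is a minimal sketch built only on the Statistics standard library. The function `center_rows!` is a hypothetical stand-in, not the package's implementation, and it assumes that each row of X is a feature and each column an observation.

```julia
using Statistics

# Hypothetical stand-in for center!, not the package's implementation.
# Assumption: rows are features, columns are observations.
function center_rows!(X::AbstractMatrix,
                      μ::AbstractVector = vec(mean(X, dims=2)))
    X .-= μ      # subtract μ[i] from every entry of row i, in place
    return μ     # hand back the parameters so they can be reused
end

X = [1.0 2.0 3.0; 10.0 20.0 30.0]
μ = center_rows!(X)   # X is modified; μ holds the row means
```

Returning μ makes it possible to center further data, such as a test set, around the same values by passing μ explicitly.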

Rescaling

rescale!(X[, μ][, σ][, obsdim])

Center X along obsdim around the corresponding entry in the vector μ and then rescale each feature using the corresponding entry in the vector σ.

Parameters:
  • X (Array) – Feature matrix that should be centered and rescaled in-place.
  • μ (Vector) – Vector of means. If not specified, it defaults to the feature-specific means.
  • σ (Vector) – Vector of standard deviations. If not specified, it defaults to the feature-specific standard deviations.
  • obsdim

    Optional. If it makes sense for the type of X, then obsdim can be used to specify which dimension of X denotes the observations. It can be specified in a type-stable manner as a positional argument, or as a more convenient keyword parameter. See Observation Dimension for more information.

Returns:

Returns the parameters μ and σ themselves.

μ, σ = rescale!(X, μ, σ)
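The combined centering and rescaling step can be sketched along the same lines. The function `rescale_rows!` below is hypothetical and only meant to show the arithmetic; it assumes rows are features and uses the sample (corrected) standard deviation as the default for σ. It does not guard against a constant feature, for which σ would be zero.

```julia
using Statistics

# Hypothetical stand-in for rescale!, not the package's implementation.
# Assumption: rows are features, columns are observations.
function rescale_rows!(X::AbstractMatrix,
                       μ::AbstractVector = vec(mean(X, dims=2)),
                       σ::AbstractVector = vec(std(X, dims=2)))
    # Center each row around μ[i], then divide by σ[i], in place.
    # Note: a constant row has σ[i] == 0 and is not handled here.
    X .= (X .- μ) ./ σ
    return μ, σ
end

X = [1.0 2.0 3.0; 10.0 20.0 30.0]
μ, σ = rescale_rows!(X)
```

After the call, both rows of X have zero mean and unit standard deviation, even though they started out on scales that differ by a factor of ten.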

Basis Expansion

expand_poly(x[, degree])

Performs a polynomial basis expansion of the given degree for the vector x.

Parameters:
  • x (Vector) – Feature vector that should be expanded.
  • degree (Int) – The number of polynomial terms (powers of x) that should be included as rows of the resulting matrix X.
Returns:

Result of the expansion. A matrix of size (degree, length(x)). Note that all the features of X are centered and rescaled.

X = expand_poly(x; degree = 5)
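For intuition about the shape of the result, the following sketch builds a matrix of size (degree, length(x)) from the powers of x. The helper `poly_rows` is hypothetical; it assumes that the rows correspond to x.^1 through x.^degree, and it deliberately leaves out any centering or rescaling of the resulting features.

```julia
# Hypothetical helper, not the package's implementation.
# Assumption: row d holds x.^d for d = 1, ..., degree (no constant term).
function poly_rows(x::AbstractVector, degree::Integer)
    X = Matrix{Float64}(undef, degree, length(x))
    for d in 1:degree
        X[d, :] .= x .^ d
    end
    return X   # raw powers only; no normalization is applied
end

x = [1.0, 2.0, 3.0]
X = poly_rows(x, 3)
```

Because higher powers grow quickly in magnitude, normalizing the rows afterwards is a common way to keep the expanded features on comparable scales.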