# Feature Normalization

Warning

This section will likely be subject to larger changes and/or redesigns. The preprocessing functionality may be outsourced into a back-end package (see #29).

This package contains a simple model called FeatureNormalizer that can be used to normalize training and test data with parameters computed from the training data.

    x = collect(-5:.1:5)
    X = [x x.^2 x.^3]'

    # Derives the model from the given data
    cs = fit(FeatureNormalizer, X)

    # Normalizes the given data using the derived parameters
    X_norm = predict(cs, X)

    3x101 Array{Float64,2}:
     -1.70647  -1.67235  -1.63822  -1.60409   …  1.56996  1.60409  1.63822  1.67235  1.70647
      2.15985   2.03026   1.90328   1.77893      1.65719  1.77893  1.90328  2.03026  2.15985
     -2.55607  -2.40576  -2.26145  -2.12303      1.99038  2.12303  2.26145  2.40576  2.55607
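To make the idea of "parameters computed from the training data" concrete, here is a minimal sketch in plain Julia that does not use the package at all. The helpers `fit_params` and `apply_params` are hypothetical names introduced only for illustration, and the sketch assumes that features are rows, observations are columns, and the standard deviation is the corrected sample estimate.

```julia
using Statistics

# Hypothetical helpers, NOT the package API. They only illustrate deriving
# parameters from training data and reusing them on other data.
# Assumption: features are rows, observations are columns.
fit_params(X) = (vec(mean(X, dims=2)), vec(std(X, dims=2)))
apply_params(X, μ, σ) = (X .- μ) ./ σ

# Training data: same construction as in the example above
x_train = collect(-5:0.1:5)
X_train = [x_train x_train.^2 x_train.^3]'

# Some unseen test data with the same three features
x_test = collect(-2:0.5:2)
X_test = [x_test x_test.^2 x_test.^3]'

# Parameters are derived from the training data only ...
μ, σ = fit_params(X_train)

# ... and then applied to both training and test data
X_train_norm = apply_params(X_train, μ, σ)
X_test_norm = apply_params(X_test, μ, σ)
```

Under these assumptions each row of `X_train_norm` has mean 0 and standard deviation 1, while the rows of `X_test_norm` generally do not, because the test data is shifted and scaled by the training statistics rather than its own.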

The underlying functions can also be used directly.

## Centering

center!(X[, μ][, obsdim])

Center X along obsdim around the corresponding entry in the vector μ. In other words, perform feature-wise centering.

Parameters:

- X (Array) – Feature matrix that should be centered in-place.
- μ (Vector) – Vector of means. If not specified then it defaults to the feature specific means.
- obsdim – Optional. If it makes sense for the type of X, then obsdim can be used to specify which dimension of X denotes the observations. It can be specified in a type-stable manner as a positional argument, or as a more convenient keyword parameter. See Observation Dimension for more information.

Returns: The parameter vector μ itself.

    μ = center!(X, μ)
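As a rough illustration of feature-wise centering, the following plain-Julia sketch subtracts a per-feature mean from every observation in place. `center_sketch!` is a hypothetical stand-in rather than the package's implementation, and it assumes that features are rows and observations are columns.

```julia
using Statistics

# Hypothetical stand-in for illustration only, NOT the package's center!.
# Assumption: features are rows, observations are columns.
function center_sketch!(X::AbstractMatrix,
                        μ::AbstractVector = vec(mean(X, dims=2)))
    X .-= μ      # subtract each feature's mean from its row, in place
    return μ     # hand back the means that were used
end

X = [ 1.0  2.0  3.0;
     10.0 20.0 30.0]

μ = center_sketch!(X)
# μ is now [2.0, 20.0] and X is [-1.0 0.0 1.0; -10.0 0.0 10.0]
```

Returning μ makes it possible to center further data with the same means, for example `center_sketch!(X_test, μ)` for a hypothetical `X_test`.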

## Rescaling

rescale!(X[, μ][, σ][, obsdim])

Center X along obsdim around the corresponding entry in the vector μ and then rescale each feature using the corresponding entry in the vector σ.

Parameters:

- X (Array) – Feature matrix that should be centered and rescaled in-place.
- μ (Vector) – Vector of means. If not specified then it defaults to the feature specific means.
- σ (Vector) – Vector of standard deviations. If not specified then it defaults to the feature specific standard deviations.
- obsdim – Optional. If it makes sense for the type of X, then obsdim can be used to specify which dimension of X denotes the observations. It can be specified in a type-stable manner as a positional argument, or as a more convenient keyword parameter. See Observation Dimension for more information.

Returns: The parameter vectors μ and σ themselves.

    μ, σ = rescale!(X, μ, σ)
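The combined center-and-rescale step can be sketched in plain Julia as follows. `rescale_sketch!` is a hypothetical name, not the package's implementation. The sketch assumes that features are rows, that rescaling means dividing by the standard deviation, and that no feature has a standard deviation of zero.

```julia
using Statistics

# Hypothetical stand-in for illustration only, NOT the package's rescale!.
# Assumptions: features are rows, observations are columns, and rescaling
# divides each centered feature by its standard deviation (nonzero).
function rescale_sketch!(X::AbstractMatrix,
                         μ::AbstractVector = vec(mean(X, dims=2)),
                         σ::AbstractVector = vec(std(X, dims=2)))
    X .= (X .- μ) ./ σ   # center, then rescale, writing back into X
    return μ, σ          # hand back the parameters that were used
end

X = [ 1.0  2.0  3.0;
     10.0 20.0 30.0]

μ, σ = rescale_sketch!(X)
# Both rows of X now hold approximately [-1.0, 0.0, 1.0]
```

A constant feature would lead to a division by zero in this sketch, so real code would need to decide how to treat that case.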

## Basis Expansion

expand_poly(x[, degree])

Performs a polynomial basis expansion of the given degree for the vector x.

Parameters:

- x (Vector) – Feature vector that should be expanded.
- degree (Int) – The number of polynomial terms that should be augmented into the resulting matrix X.

Returns: Result of the expansion. A matrix of size (degree, length(x)). Note that all the features of X are centered and rescaled.

    X = expand_poly(x; degree = 5)
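The shape of the result may be easier to picture with a small sketch. The hypothetical function `expand_poly_sketch` below is not the package's implementation: it assumes that row d of the result holds the element-wise power x.^d, and it deliberately leaves out any centering or rescaling of the resulting features.

```julia
# Hypothetical sketch for illustration only, NOT the package's expand_poly.
# Assumption: row d of the result holds x.^d. No centering or rescaling
# is applied here.
function expand_poly_sketch(x::AbstractVector{<:Number}; degree::Integer = 5)
    X = Matrix{Float64}(undef, degree, length(x))
    for (i, xi) in enumerate(x), d in 1:degree
        X[d, i] = xi^d   # d-th power of the i-th observation
    end
    return X
end

x = [1.0, 2.0, 3.0]
X = expand_poly_sketch(x; degree = 3)
# X is a 3x3 matrix with rows x, x.^2 and x.^3:
#  1.0  2.0   3.0
#  1.0  4.0   9.0
#  1.0  8.0  27.0
```

Each column still corresponds to one observation, so the result has size (degree, length(x)) as described above.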