As described in this previous post, feature scaling can be quite important in machine learning.
Since version 0.97, nkululeko offers a multitude of scaling methods.
You simply state in the config:
[FEATS]
scale = xxx
For xxx, the available scaling methods are:
- standard: z-transformation (mean of 0 and std of 1) based on the training set
- robust: robust scaler
- speaker: like standard, but computed separately for each speaker's samples (also in the test set)
- bins: convert feature values into 0, .5 and 1 (for low, mid and high)
- minmax: rescales the data set such that all feature values are in the range [0, 1]
- maxabs: similar to minmax, but divides each feature by its maximum absolute value, so the resulting range depends on the signs present: [0, 1] for only positive values, [-1, 0] for only negative values, and [-1, 1] for both
- normalizer: scales each sample (row) individually to have unit norm (e.g., L2 norm)
- powertransformer: applies a power transformation to each feature to make the data more Gaussian-like in order to stabilize variance and minimize skewness
- quantiletransformer: applies a non-linear transformation such that the probability density function of each feature is mapped to a uniform distribution (range [0, 1]) or a Gaussian distribution
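
To make a few of these options concrete, here is a minimal sketch that uses scikit-learn and pandas directly. It is an assumption that this mirrors what nkululeko does internally; the toy data, the scale_train_test helper and the column names are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MaxAbsScaler, MinMaxScaler, StandardScaler

# Toy feature matrices (rows = samples, columns = features); values are made up.
train = np.array([[1.0, -2.0], [2.0, 0.0], [3.0, 4.0]])
test = np.array([[2.0, 2.0], [4.0, -4.0]])


def scale_train_test(scaler, train, test):
    """Hypothetical helper: fit on the training set only, then transform both sets."""
    scaler.fit(train)
    return scaler.transform(train), scaler.transform(test)


# standard: zero mean and unit standard deviation on the training set
train_std, test_std = scale_train_test(StandardScaler(), train, test)

# minmax: training values end up in [0, 1]; unseen test values may fall outside
train_mm, test_mm = scale_train_test(MinMaxScaler(), train, test)

# maxabs: each feature is divided by its maximum absolute training value
train_ma, test_ma = scale_train_test(MaxAbsScaler(), train, test)

# speaker: z-transformation computed per speaker instead of per training set
df = pd.DataFrame(
    {"speaker": ["a", "a", "b", "b"], "f0": [100.0, 120.0, 200.0, 240.0]}
)
df["f0_scaled"] = df.groupby("speaker")["f0"].transform(
    lambda x: (x - x.mean()) / x.std(ddof=0)
)
```

Note that every scaler except the speaker variant is fitted on the training data only, so test values are not guaranteed to stay inside the nominal range.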