This is the first in a series of posts to support my lecture "Speech processing with machine learning".
The focus is an introduction to related topics, mainly machine learning, as I teach phoneticians who already know a lot about speech.
This page is the landing page and serves as a table of contents for the posts. I will try to put the posts in a meaningful order, but reading them sequentially is not required. As said, the material is introductory anyway, and it is very easy to find much deeper posts on the net. E.g. here's a great list with pictures.
Links that are marked with (nkulu) are for posts that use Nkululeko as a hands-on exercise.
- How does it work in general? -> learning from data
- Supervised or not? (nkulu): Main distinctions for machine learning
- learning by example (Supervised)
- Unsupervised
- clustering
- representation learning / Self-Supervised
- learning by interaction -> Reinforcement Learning
- Splits: test, train and dev (nkulu): How to learn what from data
- Evaluation (nkulu): Kinds of evaluation metrics
- Meta parameter tuning (nkulu): How to tune your predictor
- Augmentation (nkulu): Enhance generalization by adding altered training samples
- Feature normalization/scaling (nkulu): Shift the feature values to a common value range.
- Kinds of machine learning: A taxonomy of buzzwords around artificial neural nets.
- Different machine learners: Introducing the most common approaches to machine learning
- Transformation architectures: Introducing the architectural differences of input/output processing
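To give a first taste of two of the topics listed above (splits and feature normalization), here is a minimal sketch in plain Python. It is not taken from Nkululeko or from the linked posts. The function names, the split fractions and the toy pitch values are all made up for illustration. The idea it shows is that the scaling parameters are estimated on the training part only and then reused for the other parts.

```python
import random
import statistics


def split_indices(n, dev_frac=0.1, test_frac=0.2, seed=42):
    """Shuffle the indices 0..n-1 and cut them into train/dev/test parts.

    The fractions are arbitrary example values, not a recommendation.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(n * test_frac)
    n_dev = int(n * dev_frac)
    test = idx[:n_test]
    dev = idx[n_test:n_test + n_dev]
    train = idx[n_test + n_dev:]
    return train, dev, test


def fit_scaler(values):
    """Estimate mean and standard deviation (z-score parameters)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        std = 1.0  # avoid division by zero for constant features
    return mean, std


def apply_scaler(values, mean, std):
    """Shift and scale values with previously estimated parameters."""
    return [(v - mean) / std for v in values]


# Hypothetical toy feature: one mean pitch value (Hz) per recording.
pitch = [110.0, 125.0, 210.0, 190.0, 98.0, 230.0, 145.0, 175.0, 160.0, 205.0]

train, dev, test = split_indices(len(pitch))

# Fit on the training part only, then apply the same parameters everywhere.
mean, std = fit_scaler([pitch[i] for i in train])
train_scaled = apply_scaler([pitch[i] for i in train], mean, std)
dev_scaled = apply_scaler([pitch[i] for i in dev], mean, std)
test_scaled = apply_scaler([pitch[i] for i in test], mean, std)
```

With speech data, a random split like this one is often too optimistic, because recordings of the same speaker can end up in both train and test. The posts on splits and evaluation discuss this in more detail.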