This is the first in a series of posts to support my lecture "Speech processing with machine learning".
The focus is an introduction to related topics, mainly machine learning, as I teach phoneticians who already know a lot about speech.
This page is the landing page and serves as a table of contents for the posts. I will try to put the posts in a meaningful order, but reading them sequentially is not required. As said, it's introductory anyway, and it's very easy to find much deeper posts on the net. E.g., here's a great list with pictures.
You can try some of these concepts with Nkululeko using these Google Colab scripts.
Links that are marked with (nkulu) are for posts that use Nkululeko as a hands-on exercise.
- How does it work in general? -> learning from data
- Supervised or not? (nkulu): Main distinctions in machine learning
  - learning by example (Supervised)
  - Unsupervised
    - clustering
    - representation learning / Self-Supervised
  - learning by interaction -> Reinforcement Learning
- Splits: test, train and dev (nkulu): How to learn what from data
- Evaluation (nkulu): Kinds of evaluation metrics
- Meta parameter tuning (nkulu): How to tune your predictor
- Augmentation (nkulu): Enhance generalization by adding altered training samples
- Feature normalization/scaling (nkulu): Shift the feature values to a common value range.
- Kinds of machine learning: A taxonomy of buzzwords around artificial neural nets.
- Different machine learners: Introducing the most common approaches to machine learning
- Transformation architectures: Introducing the architectural differences of input/output processing
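
To give a first taste of how some of these topics fit together, here is a minimal sketch in plain Python that touches on splits, feature scaling and evaluation. It does not use Nkululeko; the toy data (two invented features loosely standing in for pitch and energy), the nearest-centroid classifier and all function names are assumptions made up for this illustration, not part of any of the posts.

```python
import random
import statistics


def make_toy_data(n_per_class=50, seed=0):
    """Create synthetic 2-feature samples for two classes (invented values)."""
    rng = random.Random(seed)
    data = []
    # Hypothetical class means for (pitch-like, energy-like) features.
    for label, (mu_a, mu_b) in enumerate([(120.0, 0.2), (220.0, 0.6)]):
        for _ in range(n_per_class):
            features = [rng.gauss(mu_a, 15.0), rng.gauss(mu_b, 0.1)]
            data.append((features, label))
    rng.shuffle(data)
    return data


def split(data, train=0.6, dev=0.2):
    """Split into train, dev and test parts; test gets the remainder."""
    n_train = int(len(data) * train)
    n_dev = int(len(data) * dev)
    return data[:n_train], data[n_train:n_train + n_dev], data[n_train + n_dev:]


def fit_scaler(samples):
    """Estimate mean and standard deviation per feature (on training data only)."""
    columns = list(zip(*(x for x, _ in samples)))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]


def scale(samples, scaler):
    """Apply z-score normalization with previously estimated statistics."""
    return [([(v - m) / s for v, (m, s) in zip(x, scaler)], y) for x, y in samples]


def fit_centroids(samples):
    """'Train' a nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for label in {y for _, y in samples}:
        rows = [x for x, y in samples if y == label]
        centroids[label] = [statistics.mean(c) for c in zip(*rows)]
    return centroids


def predict(x, centroids):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def accuracy(samples, centroids):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(x, centroids) == y for x, y in samples)
    return hits / len(samples)


data = make_toy_data()
train_set, dev_set, test_set = split(data)

# The scaler is fitted on the training split only, then applied to all splits.
scaler = fit_scaler(train_set)
train_s, dev_s, test_s = (scale(s, scaler) for s in (train_set, dev_set, test_set))

centroids = fit_centroids(train_s)
print("dev accuracy: ", accuracy(dev_s, centroids))
print("test accuracy:", accuracy(test_s, centroids))
```

The dev split would be the place to compare settings (for example, with or without scaling), while the test split is only looked at once at the end.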