These are exam preparation notes, subpar in quality and certainly not of divine quality.
In a classification problem, the goal is to reduce the generalization error \(E^G\). Unfortunately, during training the classifier can only be evaluated against a limited amount of data: the training data set. Therefore we can only measure the training error \(E^T\).
The problem we want to solve is to estimate how well our classifier actually performs on unseen data, without having any additional data.
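To make the gap between the two errors concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the 1D data distribution, the 20% label noise, the 1-nearest-neighbour classifier and the helper names `sample`, `predict_1nn` and `error_rate`. Fresh data can be drawn here only because the distribution is known; in a real problem that is exactly what is missing.

```python
import random

def sample(n, rng, noise=0.2):
    """Draw n points x in [0, 1); label is 1 if x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y  # label noise
        data.append((x, y))
    return data

def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def error_rate(train, data):
    """Fraction of points in `data` misclassified by 1-NN built on `train`."""
    wrong = sum(predict_1nn(train, x) != y for x, y in data)
    return wrong / len(data)

rng = random.Random(0)
train = sample(200, rng)
fresh = sample(2000, rng)  # stands in for "all unseen data"

e_train = error_rate(train, train)  # E^T: measured on the training set
e_fresh = error_rate(train, fresh)  # estimate of E^G on fresh samples
print(f"E^T = {e_train:.3f}, estimated E^G = {e_fresh:.3f}")
```

The 1-nearest-neighbour classifier memorizes the training set, so its training error says nothing about how it behaves on new points. This is the situation in which \(E^T\) alone is misleading.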
Statistical learning theory provides an upper bound on the overfitting, i.e. on the gap \(E^G - E^T\).
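For orientation, bounds of this kind usually take the following shape. The concrete form below is Vapnik's VC bound, which is an assumption about which bound these notes refer to. The symbols \(N\) (number of training samples), \(h\) (VC dimension, a measure of the classifier's capacity) and \(\eta\) (allowed failure probability) are introduced here and are not taken from the notes. With probability at least \(1 - \eta\):

\[
E^G \le E^T + \sqrt{\frac{h\left(\ln\frac{2N}{h} + 1\right) - \ln\frac{\eta}{4}}{N}}
\]

The square-root term is the bound on the overfitting: it shrinks as \(N\) grows and, for \(h\) well below \(N\), it grows with \(h\).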
A good introduction to this upper bound and the upcoming topic can be found on YouTube.
The bound relies on the concept of the capacity of a classifier.