The variance is a property of the predicted model f`(x) only. It has nothing to do with the function being estimated, i.e. f(x). It tells us how much our candidate function (in the case above, a polynomial of degree n) varies when it is fitted on different training datasets. Hence the correct variance expression is E( (f`(x) − E(f`(x)))² ), where the expectation is taken over training datasets.
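To make the expression concrete, here is a minimal simulation sketch. It is not taken from the derivation itself; the target function `true_f`, the noise level, the sample sizes, and the helper name `prediction_variance` are all assumptions chosen for illustration. The sketch repeatedly draws a noisy training set, fits a polynomial of degree n, records the prediction f`(x0) at one point, and then averages the squared deviation from the mean prediction. Note that f(x) is used only to generate data; it never appears in the variance itself.

```python
import numpy as np


def true_f(x):
    # Hypothetical target function, used only to generate training data.
    return np.sin(2 * np.pi * x)


def prediction_variance(degree, x0, n_datasets=500, n_points=30,
                        noise_std=0.3, seed=0):
    """Estimate E[(f`(x0) - E[f`(x0)])^2] over simulated training sets."""
    rng = np.random.default_rng(seed)
    # Assumption: inputs are fixed on a grid; only the noise differs
    # from one training dataset to the next.
    x = np.linspace(0.0, 1.0, n_points)
    preds = np.empty(n_datasets)
    for i in range(n_datasets):
        y = true_f(x) + rng.normal(0.0, noise_std, n_points)
        coeffs = np.polyfit(x, y, degree)    # fit polynomial of given degree
        preds[i] = np.polyval(coeffs, x0)    # prediction f`(x0) for this dataset
    mean_pred = preds.mean()                 # E[f`(x0)]
    return np.mean((preds - mean_pred) ** 2)


var_low = prediction_variance(degree=1, x0=0.5)
var_high = prediction_variance(degree=9, x0=0.5)
print(f"degree 1 variance at x0=0.5: {var_low:.4f}")
print(f"degree 9 variance at x0=0.5: {var_high:.4f}")
```

Under these assumptions, the higher-degree polynomial should show the larger variance, since it has more freedom to follow the noise in each individual training set.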

The complete derivation can be found at the following link:

https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff#Derivation

PhD candidate @ GeorgiaTech | ML Engineer | Writer | Researcher | Traveler | www.aqeel-anwar.com | www.twitter.com/_aqeelanwar
