PAC Learnability

A hypothesis class H is Probably Approximately Correct (PAC) learnable with respect to a set Z:=X×Yand a loss function l:H×ZR+, if there exists a function mH:(0,1)2N and a learning algorithm with the following property: For every ϵ,δ(0,1) and for every distribution D over Z, running the learning algorithm on m>mH(ϵ,δ) i.i.d. examples generated by D returns a hypothesis hH such that, with probability of at least 1δ (over the choice of the m training examples),

LD(h)minhHLD(h)+ϵ,

where LD(h)=EzD[l(h,z)]. The function mH determines the sample complexity of learning H.

Under a realizability assumption (hH s.t. LD,f(h)=0), we can guarantee an absolute level of error instead of just a relative one w.r.t. some minimizer within the hypothesis class. For example, under the realizability assumption, every finite hypothesis class for a binary classification task is PAC learnable (LD(h)ϵ with probability 1δ) with sample complexity:

mH(ϵ,δ)log|H|δϵ
Links

Sources