Quizzes for Machine Learning Course CS405 and CS329 at SUSTech. (Before 2024 Fall Semester)
Quiz 1
Question
$y = Ax + v$, where $v$ is Gaussian noise.
- What is the optimal solution for $x$?
- What is the optimal solution for $x$ if $v \sim \mathcal{N}(0, R)$?
- What is the optimal solution for $x$ if $v \sim \mathcal{N}(0, R)$ and $x \sim \mathcal{N}(0, aI)$?
- If $A$ and $x$ are both unknown, what are the optimal solutions for $x$ and $A$?
Answer
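- $J(x) = \frac{1}{2}(y - Ax)^T (y - Ax)$, $\frac{\partial J}{\partial x} = 0$, $x = (A^T A)^{-1} A^T y$
- $J(x) = \frac{1}{2}(y - Ax)^T R^{-1} (y - Ax)$, $\frac{\partial J}{\partial x} = 0$, $x = (A^T R^{-1} A)^{-1} A^T R^{-1} y$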
- $J(x) = \frac{1}{2}(y - Ax)^T R^{-1} (y - Ax) + \frac{1}{2} x^T (aI)^{-1} x$, $\frac{\partial J}{\partial x} = 0$, $x = (A^T R^{-1} A + a^{-1} I)^{-1} A^T R^{-1} y$
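- We can distinguish two cases, each solved with the other unknown held fixed:
  - For $x$: $J(x) = \frac{1}{2}(y - Ax)^T R^{-1} (y - Ax) + \frac{1}{2} x^T (aI)^{-1} x$, $\frac{\partial J}{\partial x} = 0$, $x = (A^T R^{-1} A + a^{-1} I)^{-1} A^T R^{-1} y$
  - For $A$: transpose the model, $Y^T = X^T A^T + V^T$, so that $A^T$ plays the role of the unknown. Then $J(A) = \frac{1}{2}(Y^T - X^T A^T)^T R^{-1} (Y^T - X^T A^T)$, $\frac{\partial J}{\partial A} = 0$, $A^T = (X R^{-1} X^T)^{-1} X R^{-1} Y^T$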
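The first three estimators can be compared numerically. The following is a minimal NumPy sketch, not part of the original quiz: the problem sizes, `x_true`, the diagonal `R`, and the prior variance `a` are all hypothetical values chosen for illustration. Note that the prior term $\frac{1}{2} x^T (aI)^{-1} x$ contributes $I/a$ to the normal equations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem: 50 noisy observations of a 3-dimensional x.
n_obs, n_dim = 50, 3
A = rng.normal(size=(n_obs, n_dim))
x_true = np.array([1.0, -2.0, 0.5])
noise_var = rng.uniform(0.05, 0.5, size=n_obs)
R = np.diag(noise_var)            # noise covariance, v ~ N(0, R)
a = 4.0                           # prior variance, x ~ N(0, a I)
y = A @ x_true + rng.normal(size=n_obs) * np.sqrt(noise_var)

R_inv = np.linalg.inv(R)

# 1) Least squares: x = (A^T A)^{-1} A^T y
x_ls = np.linalg.solve(A.T @ A, A.T @ y)

# 2) Weighted least squares: x = (A^T R^{-1} A)^{-1} A^T R^{-1} y
x_wls = np.linalg.solve(A.T @ R_inv @ A, A.T @ R_inv @ y)

# 3) MAP with Gaussian prior: x = (A^T R^{-1} A + I/a)^{-1} A^T R^{-1} y
x_map = np.linalg.solve(A.T @ R_inv @ A + np.eye(n_dim) / a, A.T @ R_inv @ y)

print(x_ls, x_wls, x_map)
```

Using `np.linalg.solve` on the normal equations avoids forming the inverse explicitly; the prior acts as a ridge-style regularizer that shrinks `x_map` toward zero relative to `x_wls`.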
Quiz 2
Question
$Y = AX + \omega$, where $\omega \sim \mathcal{N}(0, Q)$ and $X \sim \mathcal{N}(\mu_0, \Sigma_0)$
- What is $p(Y \mid X)$?
- What is $p(Y)$?
- What is $p(X \mid Y)$?
- What is $p(Y' \mid Y)$?
Answer
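- $p(Y \mid X) = \mathcal{N}(AX, Q)$, since $X$ is treated as a constant under the conditional probability.
- $p(Y) = \int p(Y \mid X)\, p(X)\, dX = \mathcal{N}(A\mu_0, A\Sigma_0 A^T + Q)$, because
  $E[Y] = A\,E[X] + E[\omega] = A\mu_0$ and $\mathrm{var}[Y] = \mathrm{var}[AX] + \mathrm{var}[\omega] = A\Sigma_0 A^T + Q$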
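- Assume that $p(X \mid Y) = \mathcal{N}(m, L)$; then we can match the quadratic forms in the exponents to solve for $m$ and $L$.
  - $p(X \mid Y) \propto p(Y \mid X)\, p(X) = \mathcal{N}(Y \mid AX, Q)\, \mathcal{N}(X \mid \mu_0, \Sigma_0)$
  - $-\frac{1}{2}(x - m)^T L^{-1} (x - m) = -\frac{1}{2}(y - Ax)^T Q^{-1} (y - Ax) - \frac{1}{2}(x - \mu_0)^T \Sigma_0^{-1} (x - \mu_0) + \text{const}$
  - Matching the terms that are quadratic and linear in $x$, we get:
    $L^{-1} = A^T Q^{-1} A + \Sigma_0^{-1}$, $\quad L^{-1} m = A^T Q^{-1} y + \Sigma_0^{-1} \mu_0$
  - By applying $[A + BCD]^{-1} = A^{-1} - A^{-1} B [C^{-1} + D A^{-1} B]^{-1} D A^{-1}$:
    $L = (I - KA)\Sigma_0$, $\quad m = \mu_0 + K(y - A\mu_0)$,
    where $K = \Sigma_0 A^T (A \Sigma_0 A^T + Q)^{-1}$
- $p(Y' \mid Y) = \int p(Y' \mid X)\, p(X \mid Y)\, dX = \mathcal{N}(Am, A L A^T + Q)$. This has the same form as question 2.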
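As a sanity check on the two equivalent expressions for the posterior, the sketch below computes $m$ and $L$ once from the information form ($L^{-1}$, $L^{-1}m$) and once from the gain form ($K$), then forms the predictive moments. All numerical values (`A`, `Q`, `mu0`, `Sigma0`, `y`) are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical model: 3-dimensional X, 2-dimensional Y.
dx, dy = 3, 2
A = rng.normal(size=(dy, dx))
Q = np.array([[0.5, 0.1], [0.1, 0.3]])          # noise covariance
mu0 = np.array([1.0, 0.0, -1.0])                # prior mean
B = rng.normal(size=(dx, dx))
Sigma0 = B @ B.T + np.eye(dx)                   # prior covariance (SPD)
y = np.array([0.7, -0.2])                       # one observed Y

Q_inv = np.linalg.inv(Q)
Sigma0_inv = np.linalg.inv(Sigma0)

# Information form: L^{-1} = A^T Q^{-1} A + Sigma0^{-1},
#                   L^{-1} m = A^T Q^{-1} y + Sigma0^{-1} mu0
L_info = np.linalg.inv(A.T @ Q_inv @ A + Sigma0_inv)
m_info = L_info @ (A.T @ Q_inv @ y + Sigma0_inv @ mu0)

# Gain form: K = Sigma0 A^T (A Sigma0 A^T + Q)^{-1}
S = A @ Sigma0 @ A.T + Q                        # covariance of p(Y)
K = Sigma0 @ A.T @ np.linalg.inv(S)
L_gain = (np.eye(dx) - K @ A) @ Sigma0
m_gain = mu0 + K @ (y - A @ mu0)

# Predictive p(Y' | Y) = N(A m, A L A^T + Q)
pred_mean = A @ m_gain
pred_cov = A @ L_gain @ A.T + Q

print(np.allclose(L_info, L_gain), np.allclose(m_info, m_gain))
```

The gain form only inverts a matrix of the size of $Y$, which is why it is usually preferred when the observation is lower-dimensional than the state.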
Quiz 3
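- Learning: $p(\theta \mid D) \propto p(D \mid \theta)\, p(\theta)$
- Prediction: $p(D_{new} \mid D) = \int p(D_{new} \mid \theta)\, p(\theta \mid D)\, d\theta$
- Evaluation: $p(D) = \int p(D \mid \theta)\, p(\theta)\, d\theta$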
Question
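Given $t = \Phi(x)\,\omega + v$ where $\Phi(x) = [1, x, x^2, \dots, x^M]$ and $v \sim \mathcal{N}(0, \beta^{-1})$, $D = \{[x_1, \dots, x_N], [t_1, \dots, t_N]\}$
- What is the solution of $\omega_{ML}$?
- What is the solution of $\omega_{MAP}$ if $\omega \sim \mathcal{N}(0, \alpha^{-1} I)$?
- What is the predictive distribution if $D_{new} = \{x_{new}, t_{new}\}$?
- What is the model evaluation?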
Answer
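- $J(\omega) = \frac{\beta}{2}(T - \Phi\omega)^T (T - \Phi\omega) \rightarrow \omega_{ML} = (\Phi^T \Phi)^{-1} \Phi^T T$
- $J(\omega) = \frac{\beta}{2}(T - \Phi\omega)^T (T - \Phi\omega) + \frac{\alpha}{2}\omega^T \omega \rightarrow \omega_{MAP} = (\beta \Phi^T \Phi + \alpha I)^{-1} \beta \Phi^T T$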
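- $p(t_{new} \mid x_{new}, D) = \mathcal{N}\big(\Phi(x_{new})\,\omega_{MAP},\ \Phi(x_{new})\,\Sigma_{MAP}\,\Phi(x_{new})^T + \beta^{-1}\big)$, where $\Sigma_{MAP} = (\beta \Phi^T \Phi + \alpha I)^{-1}$ is the posterior covariance of $\omega$
- $p(T) = \mathcal{N}(0,\ \alpha^{-1} \Phi \Phi^T + \beta^{-1} I)$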
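The four answers can be traced through a small polynomial regression. The sketch below is illustrative only: the degree `M`, the precisions `alpha` and `beta`, the weights `w_true`, and the helper `design` are hypothetical choices, not part of the quiz. The noise variance in the predictive distribution is $\beta^{-1}$ because $\beta$ is a precision.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical settings: cubic polynomial, 30 samples.
M, N = 3, 30
alpha, beta = 2.0, 25.0           # prior precision and noise precision

def design(x, M):
    """Rows [1, x, x^2, ..., x^M] for each input in x (hypothetical helper)."""
    return np.vander(np.atleast_1d(x), M + 1, increasing=True)

x = rng.uniform(-1.0, 1.0, size=N)
w_true = np.array([0.5, -1.0, 0.0, 2.0])
Phi = design(x, M)
T = Phi @ w_true + rng.normal(scale=beta ** -0.5, size=N)

# Maximum likelihood: w_ML = (Phi^T Phi)^{-1} Phi^T T
w_ml = np.linalg.solve(Phi.T @ Phi, Phi.T @ T)

# MAP: Sigma_MAP = (beta Phi^T Phi + alpha I)^{-1}, w_MAP = beta Sigma_MAP Phi^T T
Sigma_map = np.linalg.inv(beta * Phi.T @ Phi + alpha * np.eye(M + 1))
w_map = beta * Sigma_map @ Phi.T @ T

# Predictive distribution at a new input
phi_new = design(0.3, M)                          # shape (1, M + 1)
pred_mean = (phi_new @ w_map).item()
pred_var = (phi_new @ Sigma_map @ phi_new.T).item() + 1.0 / beta

# Model evidence: T ~ N(0, Phi Phi^T / alpha + I / beta)
C = Phi @ Phi.T / alpha + np.eye(N) / beta
_, logdet_C = np.linalg.slogdet(C)
log_evidence = -0.5 * (N * np.log(2 * np.pi) + logdet_C + T @ np.linalg.solve(C, T))

print(w_ml, w_map, pred_mean, pred_var, log_evidence)
```

Comparing `log_evidence` across different values of `M` is one way to use the evaluation formula for model selection.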
Quiz 4
Question
For $y = \sigma(\Phi(x)\,w)$, and $D = \{[x_1, \dots, x_N], [t_1, \dots, t_N]\}$, where $\sigma(x) = \frac{1}{1 + e^{-x}}$.
- What is the solution of $w_{ML}$?
- What is the solution of $w_{MAP}$ if $w \sim \mathcal{N}(m_0, S_0)$?
- What is the predictive distribution if $D_{new} = \{x_{new}, t_{new} = 1\}$?
- What is the model evaluation?
Answer
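With $\phi_n = \Phi(x_n)$ and $y_n = \sigma(\phi_n w)$:
$J(w) = -\sum_{n=1}^{N} \{ t_n \log y_n + (1 - t_n) \log(1 - y_n) \}$
$b = \nabla J(w) = \sum_{n=1}^{N} (y_n - t_n)\, \phi_n^T$
$H = \nabla\nabla J(w) = \sum_{n=1}^{N} y_n (1 - y_n)\, \phi_n^T \phi_n$
Because $\sigma$ is nonlinear, there is no closed-form solution for $w_{ML}$. We can use the Newton-Raphson method to find the solution iteratively:
$w^{new} = w - H^{-1} b$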
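For $w_{MAP}$, the Gaussian prior adds a quadratic term to the cost:
$J(w) = -\sum_{n=1}^{N} \{ t_n \log y_n + (1 - t_n) \log(1 - y_n) \} + \frac{1}{2}(w - m_0)^T S_0^{-1} (w - m_0)$
Therefore, $b = \nabla J(w) = \sum_{n=1}^{N} (y_n - t_n)\, \phi_n^T + S_0^{-1}(w - m_0)$ and $H = \nabla\nabla J(w) = \sum_{n=1}^{N} y_n (1 - y_n)\, \phi_n^T \phi_n + S_0^{-1}$, and the same iterative update is used.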
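The iteration $w - H^{-1} b$ is a Newton-Raphson step, since it uses the Hessian. A minimal sketch with the MAP gradient and Hessian is shown here on synthetic data; the data, `w_true`, the prior `N(0, 10 I)`, and the helper names are hypothetical, and no line search is included, so this is an illustration rather than a robust solver.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Hypothetical synthetic data with features phi_n = [1, x_n].
N = 200
x = rng.uniform(-3.0, 3.0, size=N)
Phi = np.column_stack([np.ones(N), x])
w_true = np.array([-0.5, 1.5])
t = (rng.uniform(size=N) < sigmoid(Phi @ w_true)).astype(float)

# Hypothetical prior w ~ N(m0, S0) with S0 = 10 I.
m0 = np.zeros(2)
S0_inv = np.eye(2) / 10.0

def grad_hess(w):
    """Gradient b and Hessian H of the MAP cost J(w)."""
    y = sigmoid(Phi @ w)
    b = Phi.T @ (y - t) + S0_inv @ (w - m0)
    H = Phi.T @ (Phi * (y * (1.0 - y))[:, None]) + S0_inv
    return b, H

# Newton-Raphson iterations: w <- w - H^{-1} b
w = m0.copy()
for _ in range(25):
    b, H = grad_hess(w)
    step = np.linalg.solve(H, b)
    w = w - step
    if np.linalg.norm(step) < 1e-10:
        break

w_map = w
print(w_map)
```

Setting `S0_inv` to zero recovers the maximum likelihood iteration, although without the prior the Hessian can become ill-conditioned when the classes are linearly separable.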
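$p(t_{new} = 1 \mid x_{new}, D) = \int p(t_{new} = 1 \mid x_{new}, w)\, p(w \mid D)\, dw \approx \int \sigma(\phi_{new} w)\, \mathcal{N}(w \mid w_{MAP}, H^{-1})\, dw$
$\approx \sigma\big(\kappa(\sigma_a^2)\, \mu_a\big)$, where $\mu_a = \phi_{new} w_{MAP}$, $\sigma_a^2 = \phi_{new} H^{-1} \phi_{new}^T$, and $\kappa(\sigma^2) = (1 + \pi\sigma^2/8)^{-1/2}$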
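The closed-form expression $\sigma(\kappa(\sigma_a^2)\mu_a)$ approximates the integral of the sigmoid against a Gaussian over the activation $a = \phi_{new} w$, using $\kappa(\sigma^2) = (1 + \pi\sigma^2/8)^{-1/2}$. The sketch below compares it with a brute-force numerical integral for one hypothetical pair $(\mu_a, \sigma_a^2)$; the function names and the integration grid are illustrative assumptions.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def kappa(var):
    """kappa(sigma^2) = (1 + pi * sigma^2 / 8)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 + math.pi * var / 8.0)

def approx_predictive(mu_a, var_a):
    """Closed-form approximation sigma(kappa(var_a) * mu_a)."""
    return sigmoid(kappa(var_a) * mu_a)

def numeric_predictive(mu_a, var_a, n=20001, width=10.0):
    """Riemann sum of sigmoid(a) * N(a | mu_a, var_a) over mu_a +/- width std."""
    std = math.sqrt(var_a)
    lo = mu_a - width * std
    h = 2.0 * width * std / (n - 1)
    total = 0.0
    for i in range(n):
        a = lo + i * h
        pdf = math.exp(-0.5 * ((a - mu_a) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))
        total += sigmoid(a) * pdf * h
    return total

# Hypothetical activation mean and variance.
mu_a, var_a = 1.2, 2.5
p_approx = approx_predictive(mu_a, var_a)
p_numeric = numeric_predictive(mu_a, var_a)
print(p_approx, p_numeric)
```

Because $\kappa < 1$ whenever $\sigma_a^2 > 0$, the approximation pulls the prediction toward 0.5 compared with the plug-in value $\sigma(\mu_a)$, reflecting the uncertainty in $w$.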
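Using the Laplace approximation with $M$ the dimension of $w$:
$\ln p(D) \approx \ln p(D \mid w_{MAP}) + \ln p(w_{MAP}) + \frac{M}{2}\ln 2\pi - \frac{1}{2}\ln |H|_{MAP}$
$= \sum_{n=1}^{N} [t_n \ln y_n + (1 - t_n)\ln(1 - y_n)]_{MAP} - \frac{1}{2}(w_{MAP} - m_0)^T S_0^{-1} (w_{MAP} - m_0) - \frac{1}{2}\ln |S_0| - \frac{1}{2}\ln |H|_{MAP}$
The $\frac{M}{2}\ln 2\pi$ term cancels against the normalization constant of the prior $\mathcal{N}(m_0, S_0)$, which also contributes $-\frac{1}{2}\ln |S_0|$.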
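The Laplace estimate $\ln p(D) \approx \ln p(D \mid w_{MAP}) + \ln p(w_{MAP}) + \frac{M}{2}\ln 2\pi - \frac{1}{2}\ln|H|$ can be checked against direct numerical integration when $w$ is a scalar ($M = 1$). The toy data, the prior `N(0, 4)`, and the integration range below are hypothetical choices made only so that the brute-force integral is cheap.

```python
import math

def log_sigmoid(a):
    """Stable enough log(sigmoid(a)) for the moderate range used here."""
    return -math.log1p(math.exp(-a))

# Hypothetical toy data: scalar feature phi_n = x_n, binary targets t_n.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 1.5, -1.5]
ts = [0, 0, 1, 0, 1, 1, 1, 0]
m0, s0 = 0.0, 4.0                 # prior w ~ N(m0, s0), s0 is a variance

def log_lik(w):
    # log(1 - sigmoid(a)) equals log(sigmoid(-a))
    return sum(t * log_sigmoid(w * x) + (1 - t) * log_sigmoid(-w * x)
               for x, t in zip(xs, ts))

def log_prior(w):
    return -0.5 * math.log(2.0 * math.pi * s0) - 0.5 * (w - m0) ** 2 / s0

def grad_hess(w):
    """Gradient b and Hessian H of the negative log posterior (scalars)."""
    ys = [math.exp(log_sigmoid(w * x)) for x in xs]
    b = sum((y - t) * x for y, t, x in zip(ys, ts, xs)) + (w - m0) / s0
    H = sum(y * (1.0 - y) * x * x for y, x in zip(ys, xs)) + 1.0 / s0
    return b, H

# Newton-Raphson for w_MAP
w_map = m0
for _ in range(50):
    b, H = grad_hess(w_map)
    w_map -= b / H
_, H_map = grad_hess(w_map)

# Laplace approximation with M = 1
log_evidence_laplace = (log_lik(w_map) + log_prior(w_map)
                        + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(H_map))

# Brute-force evidence: integrate p(D | w) p(w) over a wide grid
n, lo, hi = 6001, -15.0, 15.0
h = (hi - lo) / (n - 1)
evidence = sum(math.exp(log_lik(lo + i * h) + log_prior(lo + i * h)) for i in range(n)) * h
log_evidence_numeric = math.log(evidence)

print(log_evidence_laplace, log_evidence_numeric)
```

The two values should be close but not identical, since the Laplace approximation replaces the true posterior with a Gaussian centered at $w_{MAP}$.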