• 如果您觉得本站非常有看点，那么赶紧使用Ctrl+D 收藏吧

# 深入理解SVM，软间隔与对偶问题

4周前 (09-02) 24次浏览

[left{begin{align*}
&min_{omega , b} frac{1}{2}||omega||^2\
s.t.& quad y_i(omega^Tx + b) ge 1, &i=1,2,3ldots,m\
end{align*}right.]

## 求解过程

QP问题其实有专门的QP计算包可以求它的极值，我也曾经用过，但这并不是唯一的解法，并且这种解法有一个很大的缺点在于没办法套用核函数。所以我们一般不使用QP规划的方法来求解。

[left{begin{align*}
&min_{x} f(x)\ s.t.& g_i(x) le 0, i=1, 2, cdots, k\
&h_j(x) = 0, j=1, 2, cdots, m\ & end{align*}right.]

[L(x, alpha, beta)=f(x) + sum_{i=1}^k alpha_ig_i(x)+sum_{j=1}^m beta_jh_j(x), quad alpha_i ge 0
]

## 对偶问题

[max_{alpha_i ge 0} theta_D(alpha, beta) = max_{alpha_i ge 0} min_{x} L(x, alpha, beta)
]

[theta_D(alpha, beta) = max_{alpha_i ge 0} min_x L(x, alpha, beta) le min_x max_{alpha_i ge 0} L(x, alpha, beta) = theta_P(x)
]

[min_x L(x, alpha, beta) le L(x, alpha, beta) le max_{alpha_i ge 0} L(x, alpha, beta)
]

KTT条件有这么几条，看起来多其实并不复杂。

[nabla_xL(x^*,alpha^*, beta^*)=0
]

[nabla_alpha L(x^*,alpha^*, beta^*)=0
]

[nabla_beta L(x^*,alpha^*, beta^*)=0
]

[alpha_i^*g_i(x*)=0, i = 1, 2, 3ldots m
]

[g_i(x*)=0, i = 1, 2, 3ldots m
]

[alpha_igeq0, i = 1, 2, 3ldots m
]

[h_i(x*)=0, j = 1, 2, 3ldots l
]

[begin{aligned}
L(x, omega, b) &= frac{1}{2}||omega||^2 + sum_{i=1}^malpha_i(1 – y_i(omega^T x_i + b))\
&= frac{1}{2}||omega||^2 + sum_{i=1}^m (alpha_i – alpha_i y_i omega^Tx_i – alpha_i y_ib) \
&= frac{1}{2}||omega||^2 + sum_{i=1}^m alpha_i – sum_{i=1}^m alpha_i y_iomega^Tx_i – sum{i=1}^m alpha_i y_ib
end{aligned}
]

[frac{partial L}{partial omega} = frac{1}{2} * 2 * omega + 0 – sum_{i=1}^m alpha_i y_i x_i – 0 = 0 rightarrow omega = sum_{i=1}^m alpha_i y_ix_i
]

[frac{partial L}{partial b} = 0 + 0 – 0 – sum{i=1}^malpha_i y_i = 0 rightarrow sum_{i=1}^malpha_i y_i = 0
]

[begin{aligned}
min_{omega, b} L(x, omega, b) &= frac{1}{2}omega^Tomega + sum_{i=1}^m alpha_i – sum_{i=1}^m alpha_iy_iomega^Tx_i – sum_{i=1}^m alpha_iy_i b \
&= -frac{1}{2}omega^Tsum_{i=1}^m alpha_iy_i x_i + sum_{i=1}^m alpha_i – bsum_{i=1}^m alpha_iy_i \
& = -frac{1}{2}omega^Tsum_{i=1}^m alpha_iy_i x_i + sum_{i=1}^m alpha_i \
& = sum_{i=1}^m alpha_i – frac{1}{2}sum_{i=1}^msum_{j=1}^m alpha_i alpha_j y_i y_j x_i^T x_j
end{aligned}
]