


For the moment, however, it is instructive to continue with the current approach and to consider how in practice we can apply it to data sets of limited size where we may wish to use relatively complex and flexible models. One technique that is often used to control the over-fitting phenomenon in such cases is that of regularization, which involves adding a penalty term to the error function (1.2) in order to discourage the coefficients from reaching large values. The simplest such penalty term takes the form of a sum of squares of all of the coefficients, leading to a modified error function of the form


$$\tilde{E}(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{\lambda}{2}\|\mathbf{w}\|^2 \qquad (1.4)$$

where $\|\mathbf{w}\|^2 \equiv \mathbf{w}^{\mathrm{T}}\mathbf{w} = w_0^2 + w_1^2 + \ldots + w_M^2$, and the coefficient λ governs the relative importance of the regularization term compared with the sum-of-squares error term. Note that often the coefficient w_0 is omitted from the regularizer because its inclusion causes the results to depend on the choice of origin for the target variable (Hastie et al., 2001), or it may be included but with its own regularization coefficient (we shall discuss this topic in more detail in Section 5.5.1). Again, the error function in (1.4) can be minimized exactly in closed form (Exercise 1.2). Techniques such as this are known in the statistics literature as shrinkage methods because they reduce the value of the coefficients. The particular case of a quadratic regularizer is called ridge regression (Hoerl and Kennard, 1970). In the context of neural networks, this approach is known as weight decay.
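As a minimal sketch of the closed-form minimization mentioned above: setting the gradient of (1.4) to zero gives the linear system (ΦᵀΦ + λI) w = Φᵀt, where Φ is the matrix of powers of x. The function names, the noise level of 0.3, the random seed, and the use of NumPy are assumptions made for illustration and are not part of the original text. The sketch penalizes w_0 along with the other coefficients, as (1.4) does.

```python
import numpy as np

def design_matrix(x, M):
    """Return the N x (M+1) matrix whose columns are x**0, x**1, ..., x**M."""
    return np.vander(x, M + 1, increasing=True)

def fit_ridge(x, t, M, lam):
    """Minimize 0.5*sum((Phi w - t)**2) + 0.5*lam*||w||**2 in closed form."""
    Phi = design_matrix(x, M)
    A = Phi.T @ Phi + lam * np.eye(M + 1)
    return np.linalg.solve(A, Phi.T @ t)

# Synthetic data (assumed): 10 evenly spaced points of sin(2*pi*x) plus noise.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.3, size=x_train.size)

w_small = fit_ridge(x_train, t_train, M=9, lam=np.exp(-18.0))  # ln(lambda) = -18
w_large = fit_ridge(x_train, t_train, M=9, lam=1.0)            # ln(lambda) = 0

# A larger lambda shrinks the coefficient vector.
print("||w|| at ln(lambda) = -18:", np.linalg.norm(w_small))
print("||w|| at ln(lambda) =   0:", np.linalg.norm(w_large))
```

Solving the linear system directly is the simplest option for a problem this small; for very small λ the matrix becomes ill-conditioned and a least-squares solver may be preferable.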



Figure 1.7 shows the results of fitting the polynomial of order M = 9 to the same data set as before but now using the regularized error function given by (1.4). We see that, for a value of ln λ = −18, the over-fitting has been suppressed and we now obtain a much closer representation of the underlying function sin(2πx). If, however, we use too large a value for λ then we again obtain a poor fit, as shown in Figure 1.7 for ln λ = 0. The corresponding coefficients from the fitted polynomials are given in Table 1.2, showing that regularization has the desired effect of reducing the magnitude of the coefficients. The impact of the regularization term on the generalization error can be seen by plotting the value of the RMS error (1.3) for both training and test sets against ln λ, as shown in Figure 1.8. We see that in effect λ now controls the effective complexity of the model and hence determines the degree of over-fitting.
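The dependence of the RMS error on ln λ can be explored with a short sweep. This is only a sketch under assumed settings: the data are regenerated synthetically (10 training points and 100 test points of sin(2πx) with Gaussian noise of standard deviation 0.3), the grid of ln λ values is arbitrary, and the helper names are hypothetical, so the numbers will not reproduce the figures or the table referred to above.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_ridge(x, t, M, lam):
    """Closed-form minimizer of the regularized error (1.4)."""
    Phi = np.vander(x, M + 1, increasing=True)
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ t)

def rms_error(x, t, w):
    """RMS error: root of the mean squared residual, without the penalty term."""
    return float(np.sqrt(np.mean((P.polyval(x, w) - t) ** 2)))

rng = np.random.default_rng(0)

def noisy_sine(x):
    """Assumed data-generating process: sin(2*pi*x) plus Gaussian noise."""
    return np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

x_train = np.linspace(0.0, 1.0, 10)
t_train = noisy_sine(x_train)
x_test = rng.uniform(0.0, 1.0, 100)
t_test = noisy_sine(x_test)

rows = []
for ln_lam in range(-25, 1, 5):  # -25, -20, ..., 0
    w = fit_ridge(x_train, t_train, M=9, lam=np.exp(ln_lam))
    rows.append((ln_lam, rms_error(x_train, t_train, w), rms_error(x_test, t_test, w)))

for ln_lam, e_train, e_test in rows:
    print(f"ln(lambda) = {ln_lam:4d}  train RMS = {e_train:.3f}  test RMS = {e_test:.3f}")
```

The training error can only grow as λ increases, since the penalty pulls the fit away from the unregularized minimum; the test error is the quantity that indicates whether a given λ over-fits or under-fits.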





The issue of model complexity is an important one and will be discussed at length in Section 1.3. Here we simply note that, if we were trying to solve a practical application using this approach of minimizing an error function, we would have to find a way to determine a suitable value for the model complexity. The results above suggest a simple way of achieving this, namely by taking the available data and partitioning it into a training set, used to determine the coefficients w, and a separate validation set, also called a hold-out set, used to optimize the model complexity (either M or λ). In many cases, however, this will prove to be too wasteful of valuable training data, and we have to seek more sophisticated approaches.
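The hold-out procedure described above might be sketched as follows. Everything specific here is an assumption for illustration: the function name `select_ln_lambda`, the 30% validation fraction, the candidate grid, the 30 synthetic data points, and the seeds. The coefficients w are fitted on the training part only, and ln λ is chosen by the RMS error on the held-out part.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_ridge(x, t, M, lam):
    """Closed-form minimizer of the regularized error (1.4)."""
    Phi = np.vander(x, M + 1, increasing=True)
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ t)

def rms_error(x, t, w):
    """RMS error: root of the mean squared residual, without the penalty term."""
    return float(np.sqrt(np.mean((P.polyval(x, w) - t) ** 2)))

def select_ln_lambda(x, t, M, candidates, val_fraction=0.3, seed=0):
    """Pick ln(lambda) by RMS error on a hold-out (validation) set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(x.size)            # random partition of the data
    n_val = max(1, int(round(val_fraction * x.size)))
    val, train = order[:n_val], order[n_val:]
    scores = [
        rms_error(x[val], t[val], fit_ridge(x[train], t[train], M, np.exp(c)))
        for c in candidates
    ]
    return candidates[int(np.argmin(scores))], scores

# Synthetic data (assumed): 30 points of sin(2*pi*x) plus Gaussian noise.
rng = np.random.default_rng(42)
x_all = rng.uniform(0.0, 1.0, 30)
t_all = np.sin(2 * np.pi * x_all) + rng.normal(scale=0.3, size=x_all.size)

candidates = [-25, -20, -15, -10, -5, 0]
best_ln_lam, val_scores = select_ln_lambda(x_all, t_all, M=9, candidates=candidates)
print("selected ln(lambda):", best_ln_lam)
```

Note that the validation points never contribute to fitting w, which is exactly the cost in training data that the paragraph above points out.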




So far our discussion of polynomial curve fitting has appealed largely to intuition. We now seek a more principled approach to solving problems in pattern recognition by turning to a discussion of probability theory. As well as providing the foundation for nearly all of the subsequent developments in this book, it will also give us some important insights into the concepts we have introduced in the context of polynomial curve fitting and will allow us to extend these to more complex situations.

