Author: xljlg | Source: Internet | 2023-10-09 19:37
1. Import the modules
import numpy as np
import pandas as pd
from pandas import Series,DataFrame
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
2. Create the training and test data
data = pd.read_csv('../data/digits.csv')
train = data.iloc[:,1:]
target = data['label']
X_train,x_test,y_train,y_true = train_test_split(train,target,test_size=0.2)
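The split above is random, so the results change from run to run. As a minimal sketch, assuming a synthetic stand-in for digits.csv (the array shapes, the `random_state` value, and the use of `stratify` are illustrative choices, not part of the original post), the split can be made reproducible and class-balanced like this:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the digits data (assumption): 1000 rows of
# 784 pixel values, with 100 samples for each of the labels 0-9.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(1000, 784))
y = np.repeat(np.arange(10), 100)

# random_state fixes the shuffle; stratify=y keeps the label
# proportions the same in the training and test parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_tr.shape, X_te.shape)
print(np.bincount(y_te))  # number of test samples per label
```

With `test_size=0.2`, 80% of the rows go to training and 20% to testing, as in the original call.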
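3. Reduce the dimensionality of the data
- PCA reduces the dimensionality of the data, which cuts computation time and helps avoid overfitting
- The n_components parameter sets how many components to keep; if it is a float between 0 and 1, it is the fraction of the variance to retain
pca = PCA(n_components=150,whiten=True)
pca.fit(X_train)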
X_train_pca = pca.transform(X_train)
x_test_pca = pca.transform(x_test)
The data goes from the original 784 features down to 150 features
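To make the two forms of n_components concrete, here is a small sketch on synthetic data (the data, its shape, and the values 10 and 0.95 are assumptions chosen for illustration, not taken from the post). An integer fixes the output width, while a float between 0 and 1 lets PCA pick as many components as needed to reach that fraction of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumption): 500 samples, 50 features, built from
# 5 hidden factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 50))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 50))

# Integer: keep exactly 10 components; whiten=True rescales each
# component to unit variance.
pca_int = PCA(n_components=10, whiten=True).fit(X)
Z = pca_int.transform(X)
print(Z.shape)

# Float in (0, 1): keep enough components to retain 95% of the variance.
pca_frac = PCA(n_components=0.95).fit(X)
print(pca_frac.n_components_)
print(pca_frac.explained_variance_ratio_.sum())
```

After fitting, `n_components_` reports how many components were actually kept, and `explained_variance_ratio_` shows how much variance each one carries.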
4. Create the model
svc = SVC(kernel = 'rbf')
5. Train the model on the reduced data
svc.fit(X_train_pca,y_train)
6. Predict
y_pre_svc = svc.predict(x_test_pca)
7. Display the results
samples = x_test.iloc[:100]
y_pre = y_pre_svc[:100]
plt.figure(figsize=(12,18))
for i in range(100):
    plt.subplot(10,10,i+1)
    # .values turns the pandas row into a NumPy array so it can be reshaped to 28x28
    plt.imshow(samples.iloc[i].values.reshape(28,28),cmap='gray')
    title = 'True:'+str(y_true.iloc[i])+'\nSVC:'+str(y_pre[i])
    plt.title(title)
    plt.axis('off')
8. Score the model on the reduced data
svc.score(x_test_pca[:100],y_true[:100])
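The score above only covers the first 100 test samples. As a rough sketch of scoring the whole test set, the example below uses scikit-learn's bundled 8x8 digits dataset as a stand-in for digits.csv, and wraps PCA and SVC in a pipeline. The dataset, `n_components=30`, and the `random_state` are assumptions for illustration, so the numbers will differ from the 784-pixel data in the post:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Stand-in data (assumption): 8x8 digit images, 64 features each.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# The pipeline fits PCA on the training data only, then feeds the
# reduced features to the RBF-kernel SVC.
model = make_pipeline(PCA(n_components=30, whiten=True), SVC(kernel='rbf'))
model.fit(X_tr, y_tr)
y_pred = model.predict(X_te)

acc_full = accuracy_score(y_te, y_pred)             # whole test set
acc_100 = accuracy_score(y_te[:100], y_pred[:100])  # first 100 only
print(acc_full, acc_100)
```

Comparing the two numbers gives a feel for how much a 100-sample score can move around relative to the full test set.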