Sklearn: Classifying the iris dataset with SVC

news 2025/8/6 11:15:37

Loading the iris dataset
Training the SVC model
Printing the confusion matrix and classification report
Using a Pipeline for the fixed steps
- Without Pipeline
- With Pipeline

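This post uses SVC to classify the iris dataset and predict on a held-out test set.
It covers:

Loading the dataset and splitting it into training and test sets
Training the SVC model and predicting on the test set
Printing the confusion matrix and classification report
Running the steps through a Pipeline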

Loading the iris dataset

Load the dataset
Show the data as a DataFrame
Split it into training and test sets

from sklearn.datasets import load_iris

iris = load_iris()

iris.keys()

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename'])

data = iris['data']
target = iris['target']

# Show all the data as a DataFrame
import pandas as pd
df = pd.DataFrame(data, columns=iris['feature_names'])
df['target'] = target  # add the target column
df

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)	target
0	5.1	3.5	1.4	0.2	0
1	4.9	3.0	1.4	0.2	0
2	4.7	3.2	1.3	0.2	0
3	4.6	3.1	1.5	0.2	0
4	5.0	3.6	1.4	0.2	0
...	...	...	...	...	...
145	6.7	3.0	5.2	2.3	2
146	6.3	2.5	5.0	1.9	2
147	6.5	3.0	5.2	2.0	2
148	6.2	3.4	5.4	2.3	2
149	5.9	3.0	5.1	1.8	2

150 rows × 5 columns

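# Split the dataset into a training set and a test set
from sklearn.model_selection import train_test_split
# 30% goes to the test set and 70% to the training set.
# No random_state is passed, so the split (and the scores below) change from run to run.
x_train, x_test, y_train, y_test = train_test_split(data, target, test_size=0.3)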
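
The split above is random, so repeating it gives different rows each time. As a small sketch of how to make it repeatable, the variant below passes random_state=42 and stratify=target, the same arguments the Pipeline section of this post uses. With stratification, each of the three classes contributes the same number of samples to the test set.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
data, target = iris['data'], iris['target']

# random_state fixes the shuffle; stratify keeps the class ratio in both parts
x_train, x_test, y_train, y_test = train_test_split(
    data, target, test_size=0.3, random_state=42, stratify=target
)

print(x_train.shape, x_test.shape)   # (105, 4) (45, 4)
print(Counter(y_test.tolist()))      # 15 samples of each class
```

Without stratify, the class counts in the test set can be uneven, as in the 12/16/17 split that appears in the confusion matrix later in the post.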

Training the SVC model

Import the library
Initialize the SVC
Train the SVC

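from sklearn.svm import SVC
# Initialize the SVC with its default hyperparameters
svc = SVC()
# Train
svc.fit(x_train, y_train)
# Accuracy on the training set
print("Training set accuracy:", svc.score(x_train, y_train))
# Accuracy on the test set
print("Test set accuracy:", svc.score(x_test, y_test))

# Predict on the test set
y_pre = svc.predict(x_test)
# Put predictions and true labels side by side in a table
df2 = pd.DataFrame(data={'predict': y_pre, 'true': y_test})

Training set accuracy: 0.9714285714285714
Test set accuracy: 0.9555555555555556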
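
The df2 table built above is never displayed in the post. One possible use for it, sketched below, is to filter out the rows where the prediction and the true label disagree. The random_state=0 argument is an assumption added here only to make the run repeatable; the original split does not set it, so the misclassified rows will differ from the ones behind the scores above.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

data, target = load_iris(return_X_y=True)
# random_state=0 is an assumption, not part of the original post
x_train, x_test, y_train, y_test = train_test_split(
    data, target, test_size=0.3, random_state=0
)

svc = SVC().fit(x_train, y_train)
y_pre = svc.predict(x_test)

df2 = pd.DataFrame({'predict': y_pre, 'true': y_test})
# Keep only the rows where the prediction differs from the true label
mismatches = df2[df2['predict'] != df2['true']]
print(mismatches)
print("misclassified:", len(mismatches), "of", len(df2))
```

The number of mismatched rows divided by the test set size is one minus the accuracy that svc.score reports.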

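Printing the confusion matrix and classification report

Confusion matrix: shows, for each class, how many predictions were right and how many were wrong
Classification report: summarizes per-class precision, recall and F1-score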

from sklearn.metrics import confusion_matrix

# Print the confusion matrix (rows: true class, columns: predicted class)
con_matrix = confusion_matrix(y_test, y_pre)
print(con_matrix)

[[12  0  0]
 [ 0 15  1]
 [ 0  1 16]]

from sklearn.metrics import classification_report
# Print the classification report
report = classification_report(y_test, y_pre, target_names=iris['target_names'])
print(report)

              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        12
  versicolor       0.94      0.94      0.94        16
   virginica       0.94      0.94      0.94        17

    accuracy                           0.96        45
   macro avg       0.96      0.96      0.96        45
weighted avg       0.96      0.96      0.96        45
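
The numbers in the report can be recomputed by hand from the confusion matrix printed earlier. The sketch below copies that matrix into a NumPy array and derives precision (diagonal over column sums), recall (diagonal over row sums) and overall accuracy (diagonal sum over total). Only the array literal comes from the post; the variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted)
cm = np.array([[12, 0, 0],
               [0, 15, 1],
               [0, 1, 16]])

tp = np.diag(cm)                    # correctly classified samples per class
precision = tp / cm.sum(axis=0)     # column sums = samples predicted as each class
recall = tp / cm.sum(axis=1)        # row sums = true samples of each class
f1 = 2 * precision * recall / (precision + recall)
accuracy = tp.sum() / cm.sum()

print(np.round(precision, 2))       # 1.0, 0.94, 0.94
print(np.round(recall, 2))          # 1.0, 0.94, 0.94
print(round(float(accuracy), 2))    # 0.96
```

For versicolor, for example, 15 of the 16 true samples were found (recall 15/16) and 15 of the 16 samples predicted as versicolor were correct (precision 15/16), which both round to the 0.94 shown in the report.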

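Using a Pipeline for the fixed steps

Add feature standardization (StandardScaler) as a preprocessing step
Put the standardization step and the training step together in a pipeline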

Without Pipeline

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

iris = load_iris()
data = iris['data']
target = iris['target']

# Split the dataset: 30% test set, 70% training set
x_train, x_test, y_train, y_test = train_test_split(
    data, target, test_size=0.3, random_state=42, stratify=target)

# Standardize the features.
# A support vector machine can be sensitive to the scale of its features, so the
# scaler is fitted on the training set and then applied to both sets.
scaler = StandardScaler().fit(x_train)
x_train_s = scaler.transform(x_train)
x_test_s = scaler.transform(x_test)

# Train the model
svm = SVC()
svm.fit(x_train_s, y_train)
print("Accuracy:", svm.score(x_test_s, y_test))

Accuracy: 0.9333333333333333
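
To see what the scaler does to the data, the following sketch repeats the same split and checks the column statistics after the transform. Because the scaler is fitted on the training set only, the training columns come out with mean 0 and standard deviation 1, while the test columns are only close to that.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data, target = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    data, target, test_size=0.3, random_state=42, stratify=target
)

# Fit on the training set only, then apply the same shift and scale to both sets
scaler = StandardScaler().fit(x_train)
x_train_s = scaler.transform(x_train)
x_test_s = scaler.transform(x_test)

print(x_train_s.mean(axis=0).round(6))  # each column is 0 up to rounding
print(x_train_s.std(axis=0).round(6))   # each column is 1
print(x_test_s.mean(axis=0).round(2))   # near 0, but not exactly
```

Fitting the scaler on the training set alone keeps information about the test set out of the preprocessing step.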

With Pipeline

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# from sklearn.pipeline import make_pipeline

iris = load_iris()
data = iris['data']
target = iris['target']

# Split the dataset: 30% test set, 70% training set
x_train, x_test, y_train, y_test = train_test_split(
    data, target, test_size=0.3, random_state=42, stratify=target)

# Build the pipeline: standardize the features, then fit the SVC
pipe = Pipeline([('std_scaler', StandardScaler()), ('svc', SVC())])
# Fit the pipeline; the scaler is fitted on the training set only
pipe.fit(x_train, y_train)
# Score on the test set; the pipeline applies the fitted scaler before predicting
print("Accuracy:", pipe.score(x_test, y_test))

Accuracy: 0.9333333333333333
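
The code above imports make_pipeline only in a commented-out line. As a rough sketch of how that helper compares with the explicit Pipeline constructor, the example below builds both on the same split. make_pipeline takes the steps without names and generates each name from the lowercased class name.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

data, target = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    data, target, test_size=0.3, random_state=42, stratify=target
)

# Explicit step names, as in the post
named = Pipeline([('std_scaler', StandardScaler()), ('svc', SVC())])
# Step names generated automatically from the class names
auto = make_pipeline(StandardScaler(), SVC())

named.fit(x_train, y_train)
auto.fit(x_train, y_train)

print([name for name, _ in auto.steps])  # ['standardscaler', 'svc']
print(named.score(x_test, y_test) == auto.score(x_test, y_test))  # True
```

Both objects chain the same two estimators, so the choice mostly comes down to whether custom step names are wanted, for instance when referring to a step later through named_steps.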
