
[Artificial Intelligence] Python in Machine Learning and Artificial Intelligence

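Python is widely used in machine learning and artificial intelligence (AI) thanks to its simple syntax, rich library support, and strong community. This tutorial uses practical code examples and short explanations to take you from scratch through the basics of using Python for machine learning and AI.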


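1. The Python Ecosystem for Machine Learning and AI

Python has many libraries that support machine learning and AI. The core ones are:

  • NumPy: efficient array and matrix operations.
  • Pandas: tools for data manipulation and analysis.
  • Matplotlib/Seaborn: data visualization.
  • Scikit-learn: the core machine learning library, with algorithms for classification, regression, clustering, and more.
  • TensorFlow/PyTorch: deep learning frameworks for building and training neural networks.

Installation: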

pip install numpy pandas matplotlib seaborn scikit-learn tensorflow

2. Data Preprocessing

Loading data

import pandas as pd

# Sample data
data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4, 5],
    'Feature2': [5, 4, 3, 2, 1],
    'Target': [1, 0, 1, 0, 1]
})
print(data)

Output:

   Feature1  Feature2  Target
0         1         5       1
1         2         4       0
2         3         3       1
3         4         2       0
4         5         1       1
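
Before scaling or modeling, it often helps to check the table for missing values and for class balance. The following is a minimal sketch on the same sample data; the particular checks chosen here are an illustration and are not part of the original walkthrough.

```python
import pandas as pd

# Same sample data as above
data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4, 5],
    'Feature2': [5, 4, 3, 2, 1],
    'Target': [1, 0, 1, 0, 1]
})

# Number of missing values in each column
missing = data.isna().sum()
print(missing)

# Number of rows in each class of the target column
class_counts = data['Target'].value_counts()
print(class_counts)
```

If a column did contain missing values, they would need to be filled or dropped before being passed to most Scikit-learn estimators.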

Feature scaling

Normalizing or standardizing features often helps model performance, particularly for algorithms that are sensitive to feature scale.

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4, 5],
    'Feature2': [5, 4, 3, 2, 1],
    'Target': [1, 0, 1, 0, 1]
})

# Rescale each feature to the [0, 1] range
scaler = MinMaxScaler()
scaled_features = scaler.fit_transform(data[['Feature1', 'Feature2']])
print(scaled_features)

Output:

[[0.   1.  ]
 [0.25 0.75]
 [0.5  0.5 ]
 [0.75 0.25]
 [1.   0.  ]]
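
The example above shows normalization to the [0, 1] range. Standardization, the other option mentioned, rescales each feature to zero mean and unit variance. A minimal sketch using Scikit-learn's StandardScaler on the same sample data (the choice of StandardScaler here is an illustration, not something the original example uses):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4, 5],
    'Feature2': [5, 4, 3, 2, 1],
    'Target': [1, 0, 1, 0, 1]
})

# Standardize each feature: subtract its mean, divide by its standard deviation
scaler = StandardScaler()
standardized = scaler.fit_transform(data[['Feature1', 'Feature2']])
print(standardized)

# After standardization each column has mean ~0 and standard deviation ~1
col_means = standardized.mean(axis=0)
col_stds = standardized.std(axis=0)
print("Means:", col_means)
print("Standard deviations:", col_stds)
```

In a real project the scaler would normally be fitted on the training split only and then applied to the test split, so that no information from the test data leaks into training.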

3. Data Visualization

Use Matplotlib and Seaborn to plot the distribution of the data.

import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4, 5],
    'Feature2': [5, 4, 3, 2, 1],
    'Target': [1, 0, 1, 0, 1]
})

scaler = MinMaxScaler()
scaled_features = scaler.fit_transform(data[['Feature1', 'Feature2']])
print(scaled_features)

# Scatter plot of the two features, colored by target class
sns.scatterplot(x='Feature1', y='Feature2', hue='Target', data=data)
plt.title('Feature Scatter Plot')
plt.show()


4. Building Your First Machine Learning Model

Use Scikit-learn to build a classification model.

Splitting the data

import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4, 5],
    'Feature2': [5, 4, 3, 2, 1],
    'Target': [1, 0, 1, 0, 1]
})

X = data[['Feature1', 'Feature2']]
y = data['Target']

# Hold out 20% of the rows as a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print('X_train:')
print(X_train)
print('X_test:')
print(X_test)
print('y_train:')
print(y_train)
print('y_test:')
print(y_test)

Output:

X_train:
   Feature1  Feature2
4         5         1
2         3         3
0         1         5
3         4         2
X_test:
   Feature1  Feature2
1         2         4
y_train:
4    1
2    1
0    1
3    0
Name: Target, dtype: int64
y_test:
1    0
Name: Target, dtype: int64

Training the model

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4, 5],
    'Feature2': [5, 4, 3, 2, 1],
    'Target': [1, 0, 1, 0, 1]
})

X = data[['Feature1', 'Feature2']]
y = data['Target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Random forest classifier
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Predict on the test set
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

Output:

Accuracy: 0.0
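
The test set here contains a single row, so the reported accuracy can only be 0.0 or 1.0 and says little about the model. On very small datasets, one common alternative is cross-validation. The sketch below uses leave-one-out cross-validation on the same five rows; the use of LeaveOneOut and the fixed random_state are assumptions made for illustration and are not part of the original example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4, 5],
    'Feature2': [5, 4, 3, 2, 1],
    'Target': [1, 0, 1, 0, 1]
})

X = data[['Feature1', 'Feature2']]
y = data['Target']

# random_state is fixed only so that repeated runs give the same result
model = RandomForestClassifier(random_state=42)

# Each row is used once as the test set while the other four rows train the model
scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print("Per-fold accuracy:", scores)
print("Mean accuracy:", scores.mean())
```

Even with cross-validation, five rows are far too few to draw conclusions; the point is only that every row gets a turn as test data.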

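5. Deep Learning and Neural Networks

Build a simple neural network for a classification task.

Installing TensorFlow

conda install tensorflow

If the installation fails with a "Could not solve for environment spec" error, first create and activate a fresh environment with the commands below:

conda create -n tf_env python=3.8
conda activate tf_env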

Building the model

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Build the neural network
model = Sequential([
    Dense(8, input_dim=2, activation='relu'),
    Dense(4, activation='relu'),
    Dense(1, activation='sigmoid')
])

Compiling and training

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50, batch_size=1, verbose=1)

Evaluating the model

loss, accuracy = model.evaluate(X_test, y_test)
print("Loss:", loss)
print("Accuracy:", accuracy)

Full code

import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4, 5],
    'Feature2': [5, 4, 3, 2, 1],
    'Target': [1, 0, 1, 0, 1]
})

X = data[['Feature1', 'Feature2']]
y = data['Target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build the neural network
model = Sequential([
    Dense(8, input_dim=2, activation='relu'),
    Dense(4, activation='relu'),
    Dense(1, activation='sigmoid')
])

# Compile and train
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50, batch_size=1, verbose=1)

# Evaluate on the test set
loss, accuracy = model.evaluate(X_test, y_test)
print("Loss:", loss)
print("Accuracy:", accuracy)

Output:

Epoch 1/50
4/4 [==============================] - 1s 1ms/step - loss: 0.6867 - accuracy: 0.5000
Epoch 2/50
4/4 [==============================] - 0s 997us/step - loss: 0.6493 - accuracy: 0.5000
Epoch 3/50
4/4 [==============================] - 0s 997us/step - loss: 0.6183 - accuracy: 0.5000
Epoch 4/50
4/4 [==============================] - 0s 665us/step - loss: 0.5920 - accuracy: 0.5000
Epoch 5/50
4/4 [==============================] - 0s 1ms/step - loss: 0.5702 - accuracy: 0.5000
Epoch 6/50
4/4 [==============================] - 0s 997us/step - loss: 0.5612 - accuracy: 0.7500
Epoch 7/50
4/4 [==============================] - 0s 998us/step - loss: 0.5405 - accuracy: 0.7500
Epoch 8/50
4/4 [==============================] - 0s 665us/step - loss: 0.5223 - accuracy: 0.7500
Epoch 9/50
4/4 [==============================] - 0s 1ms/step - loss: 0.5047 - accuracy: 0.7500
Epoch 10/50
4/4 [==============================] - 0s 665us/step - loss: 0.4971 - accuracy: 0.7500
Epoch 11/50
4/4 [==============================] - 0s 997us/step - loss: 0.4846 - accuracy: 0.7500
Epoch 12/50
4/4 [==============================] - 0s 997us/step - loss: 0.4762 - accuracy: 0.7500
Epoch 13/50
4/4 [==============================] - 0s 665us/step - loss: 0.4753 - accuracy: 0.7500
Epoch 14/50
4/4 [==============================] - 0s 997us/step - loss: 0.4623 - accuracy: 1.0000
Epoch 15/50
4/4 [==============================] - 0s 998us/step - loss: 0.4563 - accuracy: 1.0000
Epoch 16/50
4/4 [==============================] - 0s 998us/step - loss: 0.4530 - accuracy: 1.0000
Epoch 17/50
4/4 [==============================] - 0s 997us/step - loss: 0.4469 - accuracy: 1.0000
Epoch 18/50
4/4 [==============================] - 0s 997us/step - loss: 0.4446 - accuracy: 0.7500
Epoch 19/50
4/4 [==============================] - 0s 665us/step - loss: 0.4385 - accuracy: 0.7500
Epoch 20/50
4/4 [==============================] - 0s 998us/step - loss: 0.4355 - accuracy: 0.7500
Epoch 21/50
4/4 [==============================] - 0s 997us/step - loss: 0.4349 - accuracy: 0.7500
Epoch 22/50
4/4 [==============================] - 0s 665us/step - loss: 0.4290 - accuracy: 0.7500
Epoch 23/50
4/4 [==============================] - 0s 997us/step - loss: 0.4270 - accuracy: 0.7500
Epoch 24/50
4/4 [==============================] - 0s 997us/step - loss: 0.4250 - accuracy: 0.7500
Epoch 25/50
4/4 [==============================] - 0s 665us/step - loss: 0.4218 - accuracy: 0.7500
Epoch 26/50
4/4 [==============================] - 0s 997us/step - loss: 0.4192 - accuracy: 0.7500
Epoch 27/50
4/4 [==============================] - 0s 997us/step - loss: 0.4184 - accuracy: 0.7500
Epoch 28/50
4/4 [==============================] - 0s 665us/step - loss: 0.4152 - accuracy: 0.7500
Epoch 29/50
4/4 [==============================] - 0s 997us/step - loss: 0.4129 - accuracy: 0.7500
Epoch 30/50
4/4 [==============================] - 0s 997us/step - loss: 0.4111 - accuracy: 0.7500
Epoch 31/50
4/4 [==============================] - 0s 997us/step - loss: 0.4095 - accuracy: 0.7500
Epoch 32/50
4/4 [==============================] - 0s 997us/step - loss: 0.4070 - accuracy: 0.7500
Epoch 33/50
4/4 [==============================] - 0s 997us/step - loss: 0.4053 - accuracy: 0.7500
Epoch 34/50
4/4 [==============================] - 0s 997us/step - loss: 0.4033 - accuracy: 0.7500
Epoch 35/50
4/4 [==============================] - 0s 998us/step - loss: 0.4028 - accuracy: 0.7500
Epoch 36/50
4/4 [==============================] - 0s 997us/step - loss: 0.3998 - accuracy: 0.7500
Epoch 37/50
4/4 [==============================] - 0s 1ms/step - loss: 0.3978 - accuracy: 0.7500
Epoch 38/50
4/4 [==============================] - 0s 997us/step - loss: 0.3966 - accuracy: 0.7500
Epoch 39/50
4/4 [==============================] - 0s 665us/step - loss: 0.3946 - accuracy: 0.7500
Epoch 40/50
4/4 [==============================] - 0s 997us/step - loss: 0.3926 - accuracy: 0.7500
Epoch 41/50
4/4 [==============================] - 0s 997us/step - loss: 0.3918 - accuracy: 0.7500
Epoch 42/50
4/4 [==============================] - 0s 997us/step - loss: 0.3898 - accuracy: 0.7500
Epoch 43/50
4/4 [==============================] - 0s 997us/step - loss: 0.3877 - accuracy: 0.7500
Epoch 44/50
4/4 [==============================] - 0s 997us/step - loss: 0.3861 - accuracy: 0.7500
Epoch 45/50
4/4 [==============================] - 0s 665us/step - loss: 0.3842 - accuracy: 0.7500
Epoch 46/50
4/4 [==============================] - 0s 665us/step - loss: 0.3830 - accuracy: 0.7500
Epoch 47/50
4/4 [==============================] - 0s 997us/step - loss: 0.3815 - accuracy: 0.7500
Epoch 48/50
4/4 [==============================] - 0s 665us/step - loss: 0.3790 - accuracy: 0.7500
Epoch 49/50
4/4 [==============================] - 0s 665us/step - loss: 0.3778 - accuracy: 0.7500
Epoch 50/50
4/4 [==============================] - 0s 997us/step - loss: 0.3768 - accuracy: 0.7500
1/1 [==============================] - 0s 277ms/step - loss: 2.8638 - accuracy: 0.0000e+00
Loss: 2.863826274871826
Accuracy: 0.0
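
To read the final numbers: the test set holds one sample with label 0, and binary cross-entropy for a single sample with label 0 is -log(1 - p), where p is the predicted probability of class 1. The sketch below works backwards from the reported loss to the probability the network must have predicted. The helper name binary_cross_entropy is hypothetical, and the calculation ignores the small numerical clipping that Keras applies internally.

```python
import numpy as np

def binary_cross_entropy(y_true, p):
    """Binary cross-entropy for one sample with predicted probability p of class 1."""
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

reported_loss = 2.863826274871826  # test loss printed in the output above
y_true = 0                         # label of the single test sample

# For y_true = 0 the loss is -log(1 - p), so p = 1 - exp(-loss)
p = 1 - np.exp(-reported_loss)
recomputed_loss = binary_cross_entropy(y_true, p)

print("Implied predicted probability of class 1:", p)
print("Recomputed loss:", recomputed_loss)
```

The implied probability is well above 0.5, so the network confidently predicted class 1 for a sample whose label is 0, which is why the test accuracy is 0.0 even though training accuracy reached 0.75.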

6. Clustering

Implement a K-Means clustering model:

from sklearn.cluster import KMeans

# Data
data_points = [[1, 2], [2, 3], [3, 4], [8, 7], [9, 8], [10, 9]]

# K-Means with two clusters
kmeans = KMeans(n_clusters=2)
kmeans.fit(data_points)

# Print the cluster centers
print("Cluster Centers:", kmeans.cluster_centers_)

Output (the order of the two centers may differ between runs):

Cluster Centers: [[9. 8.]
 [2. 3.]]
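
Besides the centers, a fitted K-Means model also exposes the cluster assigned to each training point and can assign new points to the nearest center. A minimal sketch on the same data; the n_init and random_state values and the new point [0, 1] are assumptions chosen for illustration.

```python
from sklearn.cluster import KMeans

data_points = [[1, 2], [2, 3], [3, 4], [8, 7], [9, 8], [10, 9]]

# n_init and random_state are set explicitly only to make runs repeatable
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
kmeans.fit(data_points)

# Cluster index assigned to each training point
labels = kmeans.labels_
print("Labels:", labels)

# Assign a new, unseen point to the nearest cluster center
new_label = kmeans.predict([[0, 1]])[0]
print("Cluster for [0, 1]:", new_label)
```

The numeric cluster indices (0 or 1) are arbitrary; what matters is which points share the same index.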

7. Natural Language Processing (NLP)

Use NLTK to process text data:

pip install nltk

Tokenization

import nltk
from nltk.tokenize import word_tokenize

# Download the tokenizer data (only needed once)
nltk.download('punkt_tab')
nltk.download('punkt')

text = "Machine learning is amazing!"
tokens = word_tokenize(text)
print(tokens)

Output:

['Machine', 'learning', 'is', 'amazing', '!']

Bag-of-words model

from sklearn.feature_extraction.text import CountVectorizer

texts = ["I love Python", "Python is great for AI"]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Each row is a document, each column a vocabulary word
print(X.toarray())

Output:

[[0 0 0 0 1 1]
 [1 1 1 1 0 1]]

8. Practical Example: House Price Prediction

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load the dataset
data = fetch_california_housing(as_frame=True)
X = data.data
y = data.target

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)
print("Model Coefficients:", model.coef_)

# Evaluate
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

Output:

Model Coefficients: [ 4.48674910e-01  9.72425752e-03 -1.23323343e-01  7.83144907e-01
 -2.02962058e-06 -3.52631849e-03 -4.19792487e-01 -4.33708065e-01]
Mean Squared Error: 0.5558915986952442
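
Mean squared error is the average of the squared differences between true and predicted values, and its square root (RMSE) is in the same units as the target. The sketch below computes both by hand on a few made-up values (y_true and y_pred here are hypothetical, not taken from the housing data) and then takes the square root of the MSE reported above.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical true and predicted values, for illustration only
y_true = np.array([3.0, 1.5, 2.0])
y_pred = np.array([2.5, 1.5, 3.0])

# MSE by hand: mean of the squared errors
manual_mse = np.mean((y_true - y_pred) ** 2)
sklearn_mse = mean_squared_error(y_true, y_pred)
print("Manual MSE:", manual_mse)
print("Scikit-learn MSE:", sklearn_mse)

# RMSE of the housing model, from the MSE printed in the output above
reported_mse = 0.5558915986952442
reported_rmse = np.sqrt(reported_mse)
print("RMSE for the housing model:", reported_rmse)
```

The California housing target is expressed in units of $100,000, so an RMSE of roughly 0.75 corresponds to a typical prediction error on the order of $75,000.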

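Summary

This tutorial covered the basics of using Python for machine learning and artificial intelligence: data preprocessing, visualization, model building and evaluation, and a first deep learning implementation. Working through these examples gives you a foundation for developing machine learning and AI projects in Python.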
