[Python] Study Notes Summary 8 (Classic Algorithms)

冷不防 2023-03-01 13:29

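Table of contents

8. Classic Algorithms in Python
- 0. Plotting in Python
- 1. Regression: linear regression
- 2. Classification: k-nearest neighbors
- 3. Clustering
  - 3.1. K-means (k-means clustering algorithm)
  - 3.2. DBSCAN (density-based clustering algorithm)
  - 3.3. Hierarchical clustering algorithm
- 4. Dimensionality reduction
  - 4.1. PCA algorithm
  - 4.2. FA algorithm
- 5. Learning (neural networks)
  - 5.1. BP neural network
- 6. Recommendation algorithms
- 7. Time series (video course)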

8. Classic Algorithms in Python

0. Plotting in Python

Plotting in Python

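1. Regression: linear regression

Regression: course review
Purpose: find a line that fits the data points as closely as possible, build a linear regression model from it, and use the model to make predictions.
What kind of problem it solves: if the task is to predict a continuous value, it is a regression task; if the value is discrete, it is a classification task.
Fitting: whether the curve describes the given samples well while still generalizing reasonably well.
Overfitting: the model hugs the features of the training data too closely. It performs extremely well on the training set, predicting or separating almost all of the data perfectly, but does only mediocre work on a new test set. It does not generalize and cannot judge new samples accurately.
Underfitting: the characteristics of the samples have not been learned, or the model is too simple to fit or separate the samples.
An illustration of overfitting and underfitting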

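import matplotlib.pyplot as plt
import numpy as np

# Sample data: a linear trend plus a sine wave
t = np.arange(1, 10, 1)
y = 0.9 * t + np.sin(t)
# plt.plot(t, y, "o")
# plt.show()

# Fit a third-degree polynomial to the samples
model = np.polyfit(t, y, deg=3)
# Evaluate the fitted model on a denser grid that extends beyond the data range
t2 = np.arange(-2, 12, 0.5)
y2predict = np.polyval(model, t2)
# Circles: original samples; crosses: model predictions
plt.plot(t, y, "o", t2, y2predict, "x")
plt.show()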
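
To make the overfitting and underfitting definitions above more concrete, here is a minimal sketch that fits polynomials of several degrees to the same training data and compares the error on the training points with the error on held-out points. The held-out grid `t_test`, the helper `mse`, and the choice of degrees 1, 3, and 8 are assumptions made for this illustration; they are not part of the original notes.

```python
import numpy as np

# Same training data as the example above.
t_train = np.arange(1, 10, 1)
y_train = 0.9 * t_train + np.sin(t_train)

# Held-out points (an assumption of this sketch): half-steps between and
# slightly beyond the training points, generated by the same function.
t_test = np.arange(0.5, 11.0, 1.0)
y_test = 0.9 * t_test + np.sin(t_test)


def mse(y_true, y_pred):
    """Mean squared error between two arrays."""
    return float(np.mean((y_true - y_pred) ** 2))


# errors[deg] = (training MSE, held-out MSE)
errors = {}
for deg in (1, 3, 8):
    coeffs = np.polyfit(t_train, y_train, deg=deg)
    errors[deg] = (
        mse(y_train, np.polyval(coeffs, t_train)),
        mse(y_test, np.polyval(coeffs, t_test)),
    )
    print(f"deg={deg}: train MSE={errors[deg][0]:.6f}, test MSE={errors[deg][1]:.6f}")
```

Degree 1 is an ordinary straight-line fit, which is too simple to follow the sine component. Degree 8 has as many coefficients as there are training points, so it can pass through all nine of them; a training error near zero in that case says little about how the curve behaves between or outside those points, which is why the held-out error is worth printing alongside it.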

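2. Classification: k-nearest neighbors

Classification: course review
Purpose: learn to classify from known samples, build a model, and predict the labels of test samples.
What kind of problem it solves: the features and the label of each individual sample are known; a model is built from them and then used to predict the test samples.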

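# Classification algorithm
# k-nearest neighbors: "you become like the company you keep"
import os
import pandas as pd
from sklearn import neighbors

# Work in the current directory, where ScoreData.csv is expected to be
thisFilePath = os.path.abspath('.')
os.chdir(thisFilePath)
# print(os.getcwd())
df = pd.read_csv('ScoreData.csv')
# print(df.head())

# Training set: first 8 rows; columns 2-3 are the features, column 4 is the label
train_x = df.iloc[0:8, 2:4]
# print(train_x.head())
train_y = df.iloc[0:8, 4]
# print(train_y.head())
model = neighbors.KNeighborsClassifier()
model.fit(train_x, train_y)

# Test set: rows 8-10
test_x = df.iloc[8:11, 2:4]
test_y = df.iloc[8:11, 4].values
test_p = model.predict(test_x)
print(test_p)  # predicted labels
print(test_y)  # true labels
print(model.score(test_x, test_y))  # mean accuracy on the test set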
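
The idea behind k-nearest neighbors can also be sketched without scikit-learn: measure the distance from a new sample to every training sample, take the k closest ones, and let them vote. The function `knn_predict`, the score values, and the labels "good" and "poor" below are all hypothetical, made up for illustration because ScoreData.csv is not included in these notes. The sketch uses k=3, whereas `KNeighborsClassifier()` defaults to `n_neighbors=5`.

```python
from collections import Counter

import numpy as np


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    train_x = np.asarray(train_x, dtype=float)
    # Euclidean distance from the query to every training sample.
    distances = np.linalg.norm(train_x - np.asarray(query, dtype=float), axis=1)
    # Indices of the k closest training samples.
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbors' labels.
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical scores (two subjects per student) and made-up labels.
scores = [[90, 95], [85, 88], [92, 80], [55, 60], [48, 52], [60, 45]]
labels = ["good", "good", "good", "poor", "poor", "poor"]

print(knn_predict(scores, labels, [88, 90]))  # → good
print(knn_predict(scores, labels, [50, 58]))  # → poor
```

This version treats all neighbors equally and uses plain Euclidean distance; features on very different scales would usually need rescaling first.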

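3. Clustering

Clustering: course review
Purpose: group the samples by similarity, build a model, and assign test samples to the resulting groups.
What kind of problem it solves: the label of each individual sample is not known in advance; a model is built from the sample features alone and then used to predict the group of the test samples.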

3.1. K-means (k-means clustering algorithm)

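import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
# Reuse train_x from the k-nearest-neighbors example above;
# 'yuwen' (Chinese) and 'shuxue' (math) are the two score columns
train_x2 = np.array(train_x[['yuwen', 'shuxue']])
print(train_x2)
# Group the samples into 3 clusters
model2 = KMeans(n_clusters=3)
model2 = model2.fit(train_x2)
# Collect each sample's cluster label, keeping the original row index
clusterResult = pd.DataFrame(model2.labels_, index=train_x.index, columns=['clusterResult'])
print(clusterResult.head())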
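
What `KMeans` does internally can be approximated by a short NumPy sketch of the standard iteration (often called Lloyd's algorithm): assign every point to its nearest center, move each center to the mean of its points, and repeat until the centers stop moving. The function `kmeans`, its parameters, and the score values below are hypothetical and chosen for illustration; scikit-learn's implementation uses a smarter initialization and other refinements that this sketch leaves out.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Minimal k-means sketch: returns (labels, centers)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct samples chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its points
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Hypothetical scores forming two well-separated groups.
scores = [[90, 92], [88, 90], [92, 91], [50, 52], [48, 55], [52, 50]]
labels, centers = kmeans(scores, k=2)
print(labels)
print(centers)
```

Here `labels` plays the role of `model2.labels_` in the example above, and `centers` corresponds to the fitted model's `cluster_centers_`. Which group gets label 0 and which gets label 1 depends on the random initialization, so only the grouping itself is meaningful.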