当前位置: 首页 > news >正文

预测脱碳企业的信用评级-论文代码复现

文献来源

【Forecasting credit ratings of decarbonized firms: Comparative assessmentof machine learning models】

文章有代码复现有两个基本工作,1.是提取每个算法的重要性;2.计算每个算法的评价指标

算法有 CRT 分类决策树 ANN 人工神经网络 RFE 随机森林 SVM支持向量机

评价指标有F1 Score ;Specificity ;Accuracy

1.准备数据

分类标签【信誉等级CR1-CR7】和特征向量【变量】

[特征-变量]Probability of Default违约概率
Coverages覆盖范围
Capital Structure资本结构
Liquidity流动性
Profitability盈利能力
Operating Efficiency运营效率
Scale, Scope, and Diversity规模、范围和多样性
Competitive Advantage竞争优势
Fiscal Strength and Credit Conditions财政实力和信用状况
Systemic Governance and Effectiveness系统治理和有效性
Economic Strength经济实力

from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, accuracy_score, confusion_matrix
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier# Regenerating the dataset with binary classification
num_companies = 500
features = {'Probability_of_Default': np.random.uniform(0, 1, num_companies),'Coverages': np.random.uniform(0, 1, num_companies),'Capital_Structure': np.random.uniform(0, 1, num_companies),'Liquidity': np.random.uniform(0, 1, num_companies),'Profitability': np.random.uniform(0, 1, num_companies),'Operating_Efficiency': np.random.uniform(0, 1, num_companies),'Scale_Scope_and_Diversity': np.random.uniform(0, 1, num_companies),'Competitive_Advantage': np.random.uniform(0, 1, num_companies),'Fiscal_Strength_and_Credit_Conditions': np.random.uniform(0, 1, num_companies),'Systemic_Governance_and_Effectiveness': np.random.uniform(0, 1, num_companies),'Economic_Strength': np.random.uniform(0, 1, num_companies),
}
features['Rating'] = np.random.choice([1, 2], num_companies)# Create a DataFrame
df = pd.DataFrame(features)# Preview the data
df.head()
df.to_excel(r'C:\Users\12810\Desktop\组会\组会记录\2024-0103寒假集训材料\看论文任务\要讲的\模拟数据.xlsx')
df.head()
Probability_of_DefaultCoveragesCapital_StructureLiquidityProfitabilityOperating_EfficiencyScale_Scope_and_DiversityCompetitive_AdvantageFiscal_Strength_and_Credit_ConditionsSystemic_Governance_and_EffectivenessEconomic_StrengthRating
00.2985310.3284230.6655250.9851690.1074560.9338630.7142930.6816080.5083980.4726750.5085592
10.8414300.7363880.9990200.6899460.3482650.9299350.0660770.6095160.7979290.0483730.4248582
20.4628970.1057830.7162920.9128550.5644820.8505070.7740660.8800070.7378170.7293970.2834052
30.0627240.0735370.6117610.2137030.4832200.6687490.0528950.9245320.1340430.1262610.9101671
40.2422000.0897230.8747930.6599270.1592410.3484620.8285900.2735720.1177960.1548200.3240182
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import classification_report, accuracy_score
from sklearn.model_selection import train_test_split# Define features and labels
X = df.drop('Rating', axis=1)
y = df['Rating']# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)# Initialize the classifiers
classifiers = {'CRT': DecisionTreeClassifier(random_state=42),'ANN': MLPClassifier(random_state=42, max_iter=1000),'RFE': RandomForestClassifier(random_state=42),'SVM': SVC(random_state=42)
}# Dictionary to store models' performance metrics
performance_metrics = {}# Training and evaluating classifiers
for name, clf in classifiers.items():clf.fit(X_train, y_train)  # Train the modely_pred = clf.predict(X_test)  # Predict on test set# Calculate performance metricsperformance_metrics[name] = classification_report(y_test, y_pred, output_dict=True)performance_metrics[name]['accuracy'] = accuracy_score(y_test, y_pred)# Since ANN and SVM don't provide a direct method for feature importance, we will skip that part as per user's instructions.performance_metrics# ... (previous code to generate data and split sets)# Initialize classifiers
decision_tree = DecisionTreeClassifier(random_state=42)
random_forest = RandomForestClassifier(random_state=42)
ann = MLPClassifier(random_state=42, max_iter=1000)
svm = SVC(random_state=42)# Train and predict using each classifier
for clf in [decision_tree, random_forest, ann, svm]:clf.fit(X_train, y_train)y_pred = clf.predict(X_test)print(f"{clf.__class__.__name__} metrics:")print(classification_report(y_test, y_pred))print(f"Accuracy: {accuracy_score(y_test, y_pred)}\n")# Feature importance for Decision Tree and Random Forest
print(f"Decision Tree Feature Importance: {decision_tree.feature_importances_}")
print(f"Random Forest Feature Importance: {random_forest.feature_importances_}")
D:\install_file\Anaconda3\lib\site-packages\sklearn\neural_network\_multilayer_perceptron.py:691: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (1000) reached and the optimization hasn't converged yet.warnings.warn(DecisionTreeClassifier metrics:precision    recall  f1-score   support1       0.36      0.38      0.37        452       0.47      0.45      0.46        55accuracy                           0.42       100macro avg       0.42      0.42      0.42       100
weighted avg       0.42      0.42      0.42       100Accuracy: 0.42RandomForestClassifier metrics:precision    recall  f1-score   support1       0.39      0.40      0.40        452       0.50      0.49      0.50        55accuracy                           0.45       100macro avg       0.45      0.45      0.45       100
weighted avg       0.45      0.45      0.45       100Accuracy: 0.45MLPClassifier metrics:precision    recall  f1-score   support1       0.40      0.44      0.42        452       0.50      0.45      0.48        55accuracy                           0.45       100macro avg       0.45      0.45      0.45       100
weighted avg       0.46      0.45      0.45       100Accuracy: 0.45SVC metrics:precision    recall  f1-score   support1       0.45      0.42      0.44        452       0.55      0.58      0.57        55accuracy                           0.51       100macro avg       0.50      0.50      0.50       100
weighted avg       0.51      0.51      0.51       100Accuracy: 0.51Decision Tree Feature Importance: [0.1373407  0.1034318  0.09762338 0.047583   0.13000749 0.047780610.04700612 0.09306039 0.13848067 0.05285945 0.1048264 ]
Random Forest Feature Importance: [0.09758143 0.09809294 0.09033973 0.08525976 0.09331244 0.082810670.094648   0.10032612 0.08863182 0.08533643 0.08366067]D:\install_file\Anaconda3\lib\site-packages\sklearn\neural_network\_multilayer_perceptron.py:691: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (1000) reached and the optimization hasn't converged yet.warnings.warn(
from sklearn.inspection import permutation_importance# For ANN
ann.fit(X_train, y_train)
perm_importance_ann = permutation_importance(ann, X_test, y_test, n_repeats=30, random_state=42)# For SVM
svm.fit(X_train, y_train)
perm_importance_svm = permutation_importance(svm, X_test, y_test, n_repeats=30, random_state=42)# Store the permutation importances in a dictionary or DataFrame
feature_importances_ann = perm_importance_ann.importances_mean
feature_importances_svm = perm_importance_svm.importances_mean# You can then display these importances or further analyze them
print("Feature importances from ANN:", feature_importances_ann)
print("Feature importances from SVM:", feature_importances_svm)
D:\install_file\Anaconda3\lib\site-packages\sklearn\neural_network\_multilayer_perceptron.py:691: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (1000) reached and the optimization hasn't converged yet.warnings.warn(Feature importances from ANN: [-0.04366667 -0.04866667 -0.06166667 -0.01633333 -0.06166667 -0.069-0.059      -0.07333333 -0.02866667 -0.041      -0.04266667]
Feature importances from SVM: [-0.002      -0.00633333  0.019       0.035       0.00666667  0.00233333-0.00233333 -0.01466667 -0.02        0.00833333  0.02      ]
http://www.lryc.cn/news/306364.html

相关文章:

  • 目标检测——KITTI目标跟踪数据集
  • 25-k8s集群中-RBAC用户角色资源权限
  • Android 面试问题 2024 版(其二)
  • SpringMVC的异常处理
  • 【计算机网络】1 因特网概述
  • 【Ubuntu】Anaconda的安装和使用
  • OpenAI推出首个AI视频模型Sora:重塑视频创作与体验
  • mybatis总结传参三
  • JSONVUE
  • OSCP靶机--Medjed
  • 【Unity】Unity与安卓交互
  • QYFB-02 无线风力报警仪 风速风向超限声光报警
  • css知识:盒模型盒子塌陷BFC
  • Nginx的反向代理:实现灵活的请求转发和内容缓存
  • 免费享受企业级安全:雷池社区版WAF,高效专业的Web安全的方案
  • 基于SpringBoot的航班进出港管理系统
  • Odoo系统安装部署并结合内网穿透实现固定域名访问本地ERP系统
  • 幻兽帕鲁(Palworld 1.4.1)私有服务器搭建(docker版)
  • 好书推荐丨细说Python编程:从入门到科学计算
  • 智慧城市与数字孪生:共创未来城市新篇章
  • Java数据结构---初识集合框架
  • Spring Cloud学习
  • 【计算机网络】1.4 接入网和物理媒体
  • 关于螺栓的基本拧紧技术了解多少——SunTorque智能扭矩系统
  • C# .Net 发布后,把dll全部放在一个文件夹中,让软件目录更整洁
  • [更新]ARCGIS之土地耕地占补平衡、进出平衡系统报备坐标txt格式批量导出工具(定制开发版)
  • todolist
  • 【Java程序设计】【C00307】基于Springboot的基Hadoop的物品租赁管理系统(有论文)
  • GIT中对子仓库的使用方法介绍
  • ClickHouse 指南(三)最佳实践 -- 跳数索引