当前位置：首页 > article >正文

格兰杰因果分析

article 2025/9/13 20:58:39

定义：如果变量X无助于预测Y，则说明X不是Y的原因；相反，若X是Y的原因，则需要满足2个条件：（1）X应该有助于预测Y，即在Y关于Y的过去值的回归中，添加X的过去值作为独立变量应当显著增加回归的解释能力；（2）Y不应当有助于预测X，因为X有助于预测Y，Y也有助于预测X，则很有可能存在一个或多个其他变量是引起X变化的原因也是引起Y变化的原因。

Granger因果分析适用于平稳性的时间序列数据模型的检验。

对于变量X和Y，granger因果检验要求估计以下回归：

$Y_t=\sum_{i=1}^{m}\alpha_i X_{t-i}+\sum_{i=1}^{m}\beta_i Y_{t-i}+\mu _{1t}\left ( 1 \right )$

$X_t=\sum_{i=1}^{m}\lambda _i Y_{t-i}+\sum_{i=1}^{m}\delta _i X_{t-i}+\mu _{2t}\left ( 2 \right )$

假设白噪声 $\mu _{1t}$ 和 $\mu _{2t}$ 是互不相关的，式（1）假定y取值与y和x过去值有关，式（2）假定x取值与y和x过去值有关。

式（1）的零假设为α1=α2=…=αm=0，式（2）的假设为β1=β2=…=βm=0。

在式（1）中若x的滞后项系数显著不为0，在式（2）中若y的滞后项系数显著为0，就说明：x是y的granger原因。同理，式（1）和式（2）中若x的滞后项系数和y的滞后项系数显著不为0，说明x和y互为granger因果，当它们显著为0时，说明x和y互为独立，彼此不存在grange因果关系。

步骤：

（1）平稳性检验，一般采用单位根检验。

（2）若（1）通过检验后，进行赤池信息准则AIC或贝叶斯信息准则BIC求阶数，granger因果检验。

（3）若（1）未通过，一阶差分或协整性检验，进行赤池信息准则AIC或贝叶斯信息准则BIC求阶数，granger因果检验

def grangercausality2_feature(readpath, savepath):"""该方法接收一个包含2列的2维的数组作为主要参数：第一列是当前要预测未来值的序列A，第二列是另一个序列B,该方法就是看B对A的预测是否有帮助。该方法的零假设是：B对A没有帮助。如果所有检验下的P-Values都小于显著水平0.05，则可以拒绝零假设，并推断出B确实对A的预测有用。第二个参数maxlag是设定测试用的lags的最大值。"""filename = os.listdir(readpath)f = open(savepath + 'log_GrangerCausalityRelation.txt','a+', encoding='utf-8')for name in filename[:1]:df = pd.read_csv(readpath + name)data1 = df[['NOX', 'erci_wind']]data2 = df[['NOX', 'yici_wind']]data3 = df[['NOX', 'anshui']]data4 = df[['NOX', 'fuhe']]data5 = df[['NOX', 'in_Omean']]result_1 = grangercausalitytests(data1, maxlag=10, addconst=True, verbose=True)result_list_1 = []for lag in result_1:for test in result_1[lag][0]:lag_info = {}lag_info["lag"] = laglag_info["p_value"] = result_1[lag][0][test][1]lag_info["test_type"] = testresult_list_1.append(lag_info)test_result_1 = pd.DataFrame(result_list_1)result_2 = grangercausalitytests(data2, maxlag=10, addconst=True, verbose=True)result_list_2 = []for lag in result_2:for test in result_2[lag][0]:lag_info = {}lag_info["lag"] = laglag_info["p_value"] = result_2[lag][0][test][1]lag_info["test_type"] = testresult_list_2.append(lag_info)test_result_2 = pd.DataFrame(result_list_2)result_3 = grangercausalitytests(data3, maxlag=10, addconst=True, verbose=True)result_list_3 = []for lag in result_3:for test in result_3[lag][0]:lag_info = {}lag_info["lag"] = laglag_info["p_value"] = result_3[lag][0][test][1]lag_info["test_type"] = testresult_list_3.append(lag_info)test_result_3 = pd.DataFrame(result_list_3)result_4 = grangercausalitytests(data4, maxlag=10, addconst=True, verbose=True)result_list_4 = []for lag in result_4:for test in result_4[lag][0]:lag_info = {}lag_info["lag"] = laglag_info["p_value"] = result_4[lag][0][test][1]lag_info["test_type"] = testresult_list_4.append(lag_info)test_result_4 = pd.DataFrame(result_list_4)result_5 = grangercausalitytests(data5, maxlag=10, addconst=True, verbose=True)result_list_5 = []for lag in result_5:for test in result_5[lag][0]:lag_info = {}lag_info["lag"] = laglag_info["p_value"] = result_5[lag][0][test][1]lag_info["test_type"] = testresult_list_5.append(lag_info)test_result_5 = pd.DataFrame(result_list_5)print(result_1, file=f)print(result_2, file=f)print(result_3, file=f)print(result_4, file=f)print(result_5, file=f)# 制图plt.figure(figsize=(25, 20), dpi=200)ax1 = plt.subplot(2, 3, 1)for t in test_result_1['test_type'].unique():t_data = test_result_1[test_result_1['test_type'] == t]ax1.plot(t_data['lag'],  # x轴数据t_data['p_value'],  # y轴数据linestyle='-',  # 折线类型linewidth=2,  # 折线宽度color='b',  # 折线颜色marker='o',  # 点的形状markersize=6,  # 点的大小markeredgecolor='r',  # 点的边框色markerfacecolor='r')  # 点的填充色ax1.axhline(y=0.05, color='c', linestyle='--')ax1.set_title('GrangerCausalityResults1', fontsize=15)ax1.set_xlabel('lag', fontsize=15)ax1.set_ylabel('p_value', fontsize=15)ax2 = plt.subplot(2, 3, 2)for t in test_result_2['test_type'].unique():t_data = test_result_2[test_result_2['test_type'] == t]ax2.plot(t_data['lag'],t_data['p_value'],linestyle='-',linewidth=2,color='b',marker='o',markersize=6,markeredgecolor='r',markerfacecolor='r')ax2.axhline(y=0.05, color='c', linestyle='--')ax2.set_title('GrangerCausalityResults2', fontsize=15)ax2.set_xlabel('lag', fontsize=15)ax2.set_ylabel('p_value', fontsize=15)ax3 = plt.subplot(2, 3, 3)for t in test_result_3['test_type'].unique():t_data = test_result_3[test_result_3['test_type'] == t]ax3.plot(t_data['lag'],t_data['p_value'],linestyle='-',linewidth=2,color='b',marker='o',markersize=6,markeredgecolor='r',markerfacecolor='r')ax3.axhline(y=0.05, color='c', linestyle='--')ax3.set_title('GrangerCausalityResults3', fontsize=15)ax3.set_xlabel('lag', fontsize=15)ax3.set_ylabel('p_value', fontsize=15)ax4 = plt.subplot(2, 3, 4)for t in test_result_4['test_type'].unique():t_data = test_result_4[test_result_4['test_type'] == t]ax4.plot(t_data['lag'],t_data['p_value'],linestyle='-',linewidth=2,color='b',marker='o',markersize=6,markeredgecolor='r',markerfacecolor='r')ax4.axhline(y=0.05, color='c', linestyle='--')ax4.set_title('GrangerCausalityResults4')ax4.set_xlabel('lag', fontsize=15)ax4.set_ylabel('p_value')ax5 = plt.subplot(2, 3, 5)for t in test_result_5['test_type'].unique():t_data = test_result_5[test_result_5['test_type'] == t]ax5.plot(t_data['lag'],  # x轴数据t_data['p_value'],  # y轴数据linestyle='-',  # 折线类型linewidth=2,  # 折线宽度color='b',  # 折线颜色marker='o',  # 点的形状markersize=6,  # 点的大小markeredgecolor='r',  # 点的边框色markerfacecolor='r')  # 点的填充色ax5.axhline(y=0.05, color='c', linestyle='--')ax5.set_title('GrangerCausalityResults5', fontsize=15)ax5.set_xlabel('lag', fontsize=15)ax5.set_ylabel('p_value', fontsize=15)plt.savefig(save_path + 'grangercausality.png')if __name__ == '__main__':read_path = "path1"save_path = "path2"grangercausality2_feature(read_path, save_path)

验证结果

grangercausalitytests(df[[‘b’, ‘a’]], maxlag=2)Granger Causaliy
Number of lags （no zero）1
Ssr based F test：   F=0.5019， p=0.6541， df_denom=3, df_num=1
Ssr based chi2 test：chi2=2.7152, p=0.2583, df=1
Likelihood ratio test: chi2=2.3377, p=0.3327, df=1
Parameter F test: F=0.5019, p=0.6541, df_denom=3, df_num=1Granger Causaliy
Number of lags （no zero）2
Ssr based F test：   F=0.5019， p=0.6541， df_denom=3, df_num=2
Ssr based chi2 test：chi2=2.7152, p=0.2583, df=2
Likelihood ratio test: chi2=2.3377, p=0.3327, df=2
Parameter F test: F=0.5019, p=0.6541, df_denom=3, df_num=2

说明：

主要看p值，所有p值小于0.05才能证明b对a有效。

Number of lags （no zero）2：当lags=2时的检测结果

Ssr based F test：残差平方和F检验

Ssr based chi2 test：残差平方和卡方检验

Likelihood ratio test：似然估计检验结果

Parameter F test：参数F检验结果

查看全文

http://www.lryc.cn/news/2414641.html