Spark broadcast table size exceeds Spark's built-in 8GB limit
Error:
org.apache.hive.service.cli.HiveSQLException: Error running query: java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Cannot broadcast the table that is larger than 8GB: 10 GB
Solutions
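Option 1: Adjust the broadcast threshold
Set the parameter, for example SET spark.sql.autoBroadcastJoinThreshold = 10485760; and change the value to suit the workload. The value is in bytes, so 10485760 is 10 MB (Spark's default), not 10 GB. The 8GB cap in the error above is a fixed limit inside Spark, so no threshold value makes a table larger than 8GB broadcastable. Keep the threshold below the size of the table that triggers the error so that Spark does not pick it for broadcast.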
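As a rough sketch of how the threshold can be inspected and set in a session: the value is interpreted as bytes, so sizes have to be converted first. The 100 MB figure below is an illustrative assumption, not a recommendation.

```sql
-- Show the value currently in effect for this session.
SET spark.sql.autoBroadcastJoinThreshold;

-- 10485760 bytes = 10 * 1024 * 1024 = 10 MB.
SET spark.sql.autoBroadcastJoinThreshold = 10485760;

-- Illustrative only: 104857600 bytes = 100 * 1024 * 1024 = 100 MB.
SET spark.sql.autoBroadcastJoinThreshold = 104857600;
```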
Option 2: Disable broadcast joins
Set the parameter SET spark.sql.autoBroadcastJoinThreshold = -1;
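A minimal sketch of how this might look in a session, assuming Spark 3.0 or later for the MERGE join hint. The tables orders and customers and their columns are hypothetical. Setting the threshold to -1 only turns off automatic broadcast selection, so an explicit BROADCAST hint left in a query can still request a broadcast and may need to be removed or replaced.

```sql
-- Turn off automatic broadcast joins for this session.
SET spark.sql.autoBroadcastJoinThreshold = -1;

-- Hypothetical tables: explicitly request a sort-merge join
-- for the customers side instead of a broadcast.
SELECT /*+ MERGE(c) */ o.order_id, c.customer_name
FROM orders o
JOIN customers c
  ON o.customer_id = c.customer_id;
```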