当前位置: 首页 > news >正文

实际并行workers数量不等于postgresql.conf中设置的max_parallel_workers_per_gather数量

1 前言

  • 本文件的源码来自PostgreSQL 14.5,其它版本略有不同

PostgreSQL的并行workers是由compute_parallel_worker函数决定的,compute_parallel_worker是估算扫描所需的并行工作线程数,并不是您在postgresql.conf中设置的max_parallel_workers_per_gather数量,compute_parallel_worker会根据heap_pages、index_pages、max_workers(max_parallel_workers_per_gather)来决定并行工作线程数量。

2 源码和调用位置

compute_parallel_worker共有4个地方调用

src\backend\optimizer\path\allpaths.c(801,21)
src\backend\optimizer\path\allpaths.c(3724,21)
src\backend\optimizer\path\costsize.c(707,33)
src\backend\optimizer\plan\planner.c(5953,21)

compute_parallel_worker的声明

src\include\optimizer\paths.h(59,12)

compute_parallel_worker的实现

src\backend\optimizer\path\allpaths.c(3750,1)

compute_parallel_worker的源码

/** Compute the number of parallel workers that should be used to scan a* relation.  We compute the parallel workers based on the size of the heap to* be scanned and the size of the index to be scanned, then choose a minimum* of those.** "heap_pages" is the number of pages from the table that we expect to scan, or* -1 if we don't expect to scan any.** "index_pages" is the number of pages from the index that we expect to scan, or* -1 if we don't expect to scan any.** "max_workers" is caller's limit on the number of workers.  This typically* comes from a GUC.* "max_workers"就是postgresql.conf中max_parallel_workers_per_gather的值*/
int
compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,int max_workers)
{int			parallel_workers = 0;/** If the user has set the parallel_workers reloption, use that; otherwise* select a default number of workers.* 不需要优化,直接来自表级存储参数parallel_workers* 详见第3节直接使用postgresql.conf中设置的max_parallel_workers_per_gather数量*/if (rel->rel_parallel_workers != -1)parallel_workers = rel->rel_parallel_workers;else{/** If the number of pages being scanned is insufficient to justify a* parallel scan, just return zero ... unless it's an inheritance* child. In that case, we want to generate a parallel path here* anyway.  It might not be worthwhile just for this relation, but* when combined with all of its inheritance siblings it may well pay* off.*/if (rel->reloptkind == RELOPT_BASEREL &&((heap_pages >= 0 && heap_pages < min_parallel_table_scan_size) ||(index_pages >= 0 && index_pages < min_parallel_index_scan_size)))return 0;if (heap_pages >= 0){int			heap_parallel_threshold;int			heap_parallel_workers = 1;/** Select the number of workers based on the log of the size of* the relation.  This probably needs to be a good deal more* sophisticated, but we need something here for now.  Note that* the upper limit of the min_parallel_table_scan_size GUC is* chosen to prevent overflow here.*/heap_parallel_threshold = Max(min_parallel_table_scan_size, 1);while (heap_pages >= (BlockNumber) (heap_parallel_threshold * 3)){heap_parallel_workers++;heap_parallel_threshold *= 3;if (heap_parallel_threshold > INT_MAX / 3)break;		/* avoid overflow */}parallel_workers = heap_parallel_workers;}if (index_pages >= 0){int			index_parallel_workers = 1;int			index_parallel_threshold;/* same calculation as for heap_pages above */index_parallel_threshold = Max(min_parallel_index_scan_size, 1);while (index_pages >= (BlockNumber) (index_parallel_threshold * 3)){index_parallel_workers++;index_parallel_threshold *= 3;if (index_parallel_threshold > INT_MAX / 3)break;		/* avoid overflow */}if (parallel_workers > 0)parallel_workers = Min(parallel_workers, index_parallel_workers);elseparallel_workers = index_parallel_workers;}}/* In no case use more than caller supplied maximum number of workers */parallel_workers = Min(parallel_workers, max_workers);return parallel_workers;
}

3 直接使用postgresql.conf中设置的max_parallel_workers_per_gather数量

如果要使用指定数量的并行工作线程数,必须使用表级存储参数parallel_workers。

alter table tab set (parallel_workers=8);

注意:如果不设置表级存储参数parallel_workers,实际的并行工作线程数由compute_parallel_worker根据会根据heap_pages、index_pages、max_workers来决定并行工作线程数量。
因此会出现实际并行工作数量不等于postgresql.conf中设置的max_parallel_workers_per_gather数量的情况。

http://www.lryc.cn/news/146419.html

相关文章:

  • java定位问题工具
  • 【Java】基础入门 (十六)--- 异常
  • [javaWeb]Socket网络编程
  • <MySon car=“宝马“ :money=“money“></MySon>有没有冒号
  • netty(三):NIO——多线程优化
  • Linux操作系统--linux概述
  • 数组中出现次数超过一半的数字
  • 网络优化工程师,你真的了解吗?
  • git 的常用命令
  • linux如何拷贝文件,删除多余的一级目录,用*号代替所有文件
  • springboot使用properties
  • Android中获取手机SIM卡的各种信息
  • matlab 根据索引提取点云
  • 蓝芯、四川邦辰面试(部分)
  • openCV实战-系列教程13:文档扫描OCR识别下(图像轮廓/模版匹配)项目实战、源码解读
  • SpringBootWeb案例 Part 4
  • 什么是ChatGPT水印,ChatGPT生成的内容如何不被检测出来,原理什么?
  • Android 6.0 Settings中添加虚拟键开关
  • Yolov8小目标检测(12):动态稀疏注意力BiFormer | CVPR 2023
  • C# VS调试技巧
  • VS的调试技巧
  • lucene国内镜像 极速下载
  • Qt 信号槽连接方式
  • (线程池) 100行以内的简单线程池
  • Mysql按姓氏从小到大排序的正确sql
  • 【C++】详细介绍模版初阶—函数模版、类模板
  • BananaPi BPI-6202工业控制板全志科技A40i、24V DC输入、RS485接口
  • Python - functools.partial设置回调函数处理异步任务基本使用
  • phpspreadsheet导出excel自动获得列,数字下标
  • 结算日-洛谷