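kubernetes | Cloud native | Fixing "Deployment does not have minimum availability" (and why the pods seem to be hidden)

Preface:

I ran into this problem recently while deploying Prometheus. It is a classic case, so it is worth writing down.

The symptom: after deploying the main Prometheus service, I could see the Deployment but no pods. In a panic I briefly assumed the cluster itself was broken and rebuilt it. The actual situation is shown below:

Note: UP-TO-DATE is the number of replicas already updated to the current pod template, and AVAILABLE is the number of replicas available to serve. Both are 0 here, which means no pod has been created at all.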

[root@node4 yaml]# k get deployments.apps -n monitor-sa 
NAME                READY   UP-TO-DATE   AVAILABLE   AGE
prometheus-server   0/2     0            0           2m5s
[root@node4 yaml]# k get po -n monitor-sa 
NAME                                 READY   STATUS        RESTARTS   AGE
node-exporter-6ttbl                  1/1     Running       0          23h
node-exporter-7ls5t                  1/1     Running       0          23h
node-exporter-r287q                  1/1     Running       0          23h
node-exporter-z85dm                  1/1     Running       0          23h

The deployment file is as follows.

Note that it references a ServiceAccount: serviceAccountName: monitor

[root@node4 yaml]# cat prometheus-deploy.yaml 
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitor-sa
  labels:
    app: prometheus
spec:
  replicas: 2
  selector:
    matchLabels:
      app: prometheus
      component: server
    #matchExpressions:
    #- {key: app, operator: In, values: [prometheus]}
    #- {key: component, operator: In, values: [server]}
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      nodeName: node4
      serviceAccountName: monitor
      containers:
      - name: prometheus
        image: prom/prometheus:v2.2.1
        imagePullPolicy: IfNotPresent
        command:
        - prometheus
        - --config.file=/etc/prometheus/prometheus.yml
        - --storage.tsdb.path=/prometheus
        - --storage.tsdb.retention=720h
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/prometheus/prometheus.yml
          name: prometheus-config
          subPath: prometheus.yml
        - mountPath: /prometheus/
          name: prometheus-storage-volume
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
          items:
          - key: prometheus.yml
            path: prometheus.yml
            mode: 0644
      - name: prometheus-storage-volume
        hostPath:
          path: /data
          type: Directory

 

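Solution:

So what should we do in this situation? First, don't panic. Second, the Deployment object has an easily overlooked feature: it records its own current state in its status field, which carries very detailed information. Opening the Deployment for editing displays it.

The command is kubectl edit deployment -n <namespace> <deployment-name>. In this example the output looks like this: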

...... (omitted)
            path: prometheus.yml
          name: prometheus-config
        name: prometheus-config
      - hostPath:
          path: /data
          type: Directory
        name: prometheus-storage-volume
status:
  conditions:
  - lastTransitionTime: "2023-11-22T15:21:06Z"
    lastUpdateTime: "2023-11-22T15:21:06Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2023-11-22T15:21:06Z"
    lastUpdateTime: "2023-11-22T15:21:06Z"
    message: 'pods "prometheus-server-78bbb77dd7-" is forbidden: error looking up
      service account monitor-sa/monitor: serviceaccount "monitor" not found'
    reason: FailedCreate
    status: "True"
    type: ReplicaFailure
  - lastTransitionTime: "2023-11-22T15:31:07Z"
    lastUpdateTime: "2023-11-22T15:31:07Z"
    message: ReplicaSet "prometheus-server-78bbb77dd7" has timed out progressing.
    reason: ProgressDeadlineExceeded
    status: "False"
    type: Progressing
  observedGeneration: 1
  unavailableReplicas: 2

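There are three messages. The first is the error from the title; when something goes wrong, this is the one the dashboard shows first. The second explains where the failure actually is: in this case the ServiceAccount named monitor cannot be found in the monitor-sa namespace. The third only says that the ReplicaSet managed by this Deployment timed out while rolling out, which is a consequence rather than a cause. The second message is the key to solving the problem.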
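The same conditions can also be read without opening an editor. The sketch below wraps two standard kubectl calls in small helper functions; the function names deploy_conditions and namespace_events are made up for this example, while the namespace and Deployment name in the usage comment come from this article.

```shell
# Print one line per Deployment condition: type, status, reason, message.
# Arguments: <namespace> <deployment-name>
deploy_conditions() {
  local ns="$1" name="$2"
  kubectl get deployment "$name" -n "$ns" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}

# List the events of a namespace, oldest first. FailedCreate events emitted
# for the ReplicaSet usually carry the same "forbidden" message.
# Arguments: <namespace>
namespace_events() {
  local ns="$1"
  kubectl get events -n "$ns" --sort-by=.lastTimestamp
}

# Usage with the names from this article:
#   deploy_conditions monitor-sa prometheus-server
#   namespace_events monitor-sa
```

Reading the status this way is read-only, so there is no risk of saving an accidental change to the Deployment.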

Appendix: the status of a healthy Deployment:

This status shows a Deployment with one replica that rolled out successfully, so the first message is "Deployment has minimum availability."

      serviceAccount: kube-state-metrics
      serviceAccountName: kube-state-metrics
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2023-11-21T14:56:14Z"
    lastUpdateTime: "2023-11-21T14:56:14Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2023-11-21T14:56:13Z"
    lastUpdateTime: "2023-11-21T14:56:14Z"
    message: ReplicaSet "kube-state-metrics-57794dcf65" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1

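The concrete fix:

According to the error above, we need to create the ServiceAccount. If you do not want to grant broad permissions, you have to write your own role definition. Here I took the lazy route and bound cluster-admin, which is acceptable on a test cluster but far too broad for production. (ClusterRoleBinding is cluster-scoped, so the -n flag on the second command has no effect.) The commands are: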

[root@node4 yaml]# k create sa -n monitor-sa monitor
serviceaccount/monitor created
[root@node4 yaml]# k create clusterrolebinding monitor-clusterrolebinding -n monitor-sa --clusterrole=cluster-admin  --serviceaccount=monitor-sa:monitor
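For anyone who would rather not bind cluster-admin, a narrower binding could look like the sketch below. It rests on assumptions: the rule list is the read-only set commonly granted to Prometheus for Kubernetes service discovery, and whether it is sufficient depends on the prometheus.yml in use, which this article does not show. The object name prometheus-read and the helper function names are hypothetical.

```shell
# Emit a read-only ClusterRole plus a binding for the monitor ServiceAccount.
# The name "prometheus-read" is an assumption made for this example.
prometheus_rbac_manifest() {
  cat <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-read
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/proxy", "nodes/metrics", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-read
subjects:
- kind: ServiceAccount
  name: monitor
  namespace: monitor-sa
EOF
}

# Apply the manifest above to the cluster.
apply_prometheus_rbac() {
  prometheus_rbac_manifest | kubectl apply -f -
}

# Ask the API server whether the ServiceAccount may perform <verb> on <resource>.
# Arguments: <verb> <resource>
sa_can_i() {
  kubectl auth can-i "$1" "$2" -n monitor-sa \
    --as=system:serviceaccount:monitor-sa:monitor
}

# Usage:
#   apply_prometheus_rbac
#   sa_can_i list pods
```

The sa_can_i helper impersonates the ServiceAccount, so it is a quick way to confirm what a binding actually grants before redeploying.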

After redeploying, it works:

[root@node4 yaml]# k get po -n monitor-sa  -owide
NAME                                 READY   STATUS      RESTARTS        AGE   IP               NODE    NOMINATED NODE   READINESS GATES
node-exporter-6ttbl                  1/1     Running     0               24h   192.168.123.12   node2   <none>           <none>
node-exporter-7ls5t                  1/1     Running     0               24h   192.168.123.11   node1   <none>           <none>
node-exporter-r287q                  1/1     Running     1 (2m57s ago)   24h   192.168.123.14   node4   <none>           <none>
node-exporter-z85dm                  1/1     Running     0               24h   192.168.123.13   node3   <none>           <none>
prometheus-server-78bbb77dd7-6smlt   1/1     Running     0               20s   10.244.41.19     node4   <none>           <none>
prometheus-server-78bbb77dd7-fhf5k   1/1     Running     0               20s   10.244.41.18     node4   <none>           <none>

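Summary:

A missing ServiceAccount makes the pods look hidden: the Deployment is accepted, but its pods are rejected at creation time. In other words, the ServiceAccount is a required but implicit dependency of the Deployment. A ConfigMap that the manifest references but that was not created beforehand is a similar implicit dependency: the Deployment is created, but its pods cannot start. In that case the pods do appear, typically stuck in ContainerCreating or CreateContainerConfigError, rather than being absent. A namespace, by contrast, is a required explicit dependency: if it does not exist, the Deployment itself is rejected.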
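Since these implicit dependencies fail silently, one option is to check for them before applying the Deployment. The following is only a sketch: check_exists and precheck_prometheus are hypothetical helper names, and the object names are the ones referenced by the Prometheus manifest in this article.

```shell
# Report whether a namespaced object exists.
# Arguments: <namespace> <kind> <name>
check_exists() {
  local ns="$1" kind="$2" name="$3"
  if kubectl get "$kind" "$name" -n "$ns" >/dev/null 2>&1; then
    echo "ok: $kind/$name"
  else
    echo "MISSING: $kind/$name"
  fi
}

# Check the objects that prometheus-deploy.yaml refers to but does not create.
precheck_prometheus() {
  check_exists monitor-sa serviceaccount monitor
  check_exists monitor-sa configmap prometheus-config
}

# Usage:
#   precheck_prometheus
```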

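A resource quota is a required implicit dependency in the same way as the ServiceAccount.

For example, the following file defines a quota for the default namespace:

It limits the default namespace to at most 4 pods and 20 services, with a total of 10Gi of memory and 5.5 CPUs (for both requests and limits).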

[root@node4 yaml]# cat quota-nginx.yaml 
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota
  namespace: default
spec:
  hard:
    requests.cpu: "5.5"
    limits.cpu: "5.5"
    requests.memory: 10Gi
    limits.memory: 10Gi
    pods: "4"
    services: "20"

Next, create an nginx Deployment with 6 replicas:

[root@node4 yaml]# cat nginx.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2023-11-22T16:13:33Z"
  generation: 1
  labels:
    app: nginx
  name: nginx
  namespace: default
  resourceVersion: "16411"
  uid: e9a5cdc5-c6f0-45fb-a001-fcdd695eb925
spec:
  progressDeadlineSeconds: 600
  replicas: 6
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.18
        imagePullPolicy: IfNotPresent
        name: nginx
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        resources:
          limits:
            cpu: 1
            memory: 1Gi
          requests:
            cpu: 500m
            memory: 512Mi
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30

After creating it, only four pods exist, so the quota is being enforced:

[root@node4 yaml]# k get po
NAME                     READY   STATUS    RESTARTS   AGE
nginx-54f9858f64-g65pk   1/1     Running   0          4m50s
nginx-54f9858f64-h42vf   1/1     Running   0          4m50s
nginx-54f9858f64-s776t   1/1     Running   0          4m50s
nginx-54f9858f64-wl7wz   1/1     Running   0          4m50s

So where are the other two pods?

[root@node4 yaml]# k get deployments.apps nginx -oyaml |grep message
    message: Deployment does not have minimum availability.
    message: 'pods "nginx-54f9858f64-p8rxf" is forbidden: exceeded quota: quota, requested:
    message: ReplicaSet "nginx-54f9858f64" is progressing.

The fix is simple: adjust the quota. I won't go into the details of how here.
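For completeness, one possible way to inspect and adjust the quota is sketched below. The helper names quota_usage and set_quota_hard are made up for this example; the quota name and namespace are the ones defined in quota-nginx.yaml above.

```shell
# Show used versus hard limits for a ResourceQuota.
# Arguments: <namespace> <quota-name>
quota_usage() {
  kubectl describe resourcequota "$2" -n "$1"
}

# Change a single entry under spec.hard with a JSON merge patch.
# Arguments: <namespace> <quota-name> <key> <value>
set_quota_hard() {
  local ns="$1" name="$2" key="$3" value="$4"
  kubectl patch resourcequota "$name" -n "$ns" --type merge \
    -p "{\"spec\":{\"hard\":{\"$key\":\"$value\"}}}"
}

# Usage with the names from this article:
#   quota_usage default quota
#   set_quota_hard default quota pods 6
#   set_quota_hard default quota limits.cpu 6
```

Raising only the pod count would not be enough in this example: each nginx pod has a CPU limit of 1, so six replicas need limits.cpu of at least 6, which is above the 5.5 set in the quota. Once the quota has room, the ReplicaSet controller should retry creating the missing pods on its own; kubectl rollout status deployment/nginx can be used to watch for that.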
