Prometheus Monitoring Techniques: Auto-Discovery Configuration

叁歲伎倆 · 2023-08-17

I. Auto-Discovery Types

In the previous post I left a loose end:


When monitoring a StatefulSet service, I defined an Endpoints object alongside the Service and hard-coded the pod IPs in the configuration file. As soon as a pod restarts, its IP changes and no data is scraped any more, which is clearly not a workable setup.

And if our Kubernetes cluster contains many Services/Pods, do we really have to create a matching ServiceMonitor object for every single one of them? That would be just as tedious.

To solve both problems, Prometheus Operator provides an additional scrape configuration mechanism, which lets us monitor Kubernetes resources (Pods, Services, Nodes, and so on) through extra, hand-written scrape configs.

Prometheus supports many kinds of service discovery.

[screenshot: the service-discovery mechanisms supported by Prometheus]

Among them, kubernetes_sd_configs gives us exactly what we want: discovery of the cluster's various resources. The Kubernetes SD configuration retrieves scrape targets from the Kubernetes REST API and always stays synchronized with the cluster state. Any of the following role types can be configured to discover the objects we care about (translated from the official documentation).

1. Node

The node role discovers one target per cluster node, with the address defaulting to the kubelet's HTTP port. The target address defaults to the first existing address of the node in the address type order of NodeInternalIP, NodeExternalIP, NodeLegacyHostIP, and NodeHostName.

Available meta labels:

  1. __meta_kubernetes_node_name: The name of the node object.
  2. __meta_kubernetes_node_label_<labelname>: Each label from the node object.
  3. __meta_kubernetes_node_labelpresent_<labelname>: true for each label from the node object.
  4. __meta_kubernetes_node_annotation_<annotationname>: Each annotation from the node object.
  5. __meta_kubernetes_node_annotationpresent_<annotationname>: true for each annotation from the node object.
  6. __meta_kubernetes_node_address_<address_type>: The first address for each node address type, if it exists.

In addition, the instance label for the node will be set to the node name as retrieved from the API server.
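For reference, here is a minimal sketch (not from the original post) of a scrape job using the node role; the job name is illustrative, and the TLS and bearer-token paths are the standard in-cluster service-account paths:

    - job_name: 'kubernetes-nodes'   # illustrative name
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      # expose node labels (e.g. kubernetes.io/hostname) as target labels
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)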

2. Service

The service role discovers a target for each service port of each service. This is generally useful for blackbox monitoring of a service. The address will be set to the Kubernetes DNS name of the service and the respective service port.

Available meta labels:

  1. __meta_kubernetes_namespace: The namespace of the service object.
  2. __meta_kubernetes_service_annotation_<annotationname>: Each annotation from the service object.
  3. __meta_kubernetes_service_annotationpresent_<annotationname>: "true" for each annotation of the service object.
  4. __meta_kubernetes_service_cluster_ip: The cluster IP address of the service. (Does not apply to services of type ExternalName)
  5. __meta_kubernetes_service_external_name: The DNS name of the service. (Applies to services of type ExternalName)
  6. __meta_kubernetes_service_label_<labelname>: Each label from the service object.
  7. __meta_kubernetes_service_labelpresent_<labelname>: true for each label of the service object.
  8. __meta_kubernetes_service_name: The name of the service object.
  9. __meta_kubernetes_service_port_name: Name of the service port for the target.
  10. __meta_kubernetes_service_port_protocol: Protocol of the service port for the target.
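As a sketch of how the service role is typically paired with a blackbox exporter for that kind of probing (the job name and the exporter address blackbox-exporter:9115 are assumptions for illustration):

    - job_name: 'kubernetes-services'   # illustrative name
      kubernetes_sd_configs:
      - role: service
      metrics_path: /probe
      params:
        module: [http_2xx]
      relabel_configs:
      # hand the service DNS name to the exporter as the probe target
      - source_labels: [__address__]
        target_label: __param_target
      # scrape the exporter itself rather than the service
      - target_label: __address__
        replacement: blackbox-exporter:9115   # assumed exporter address
      # keep the probed service visible as the instance label
      - source_labels: [__param_target]
        target_label: instance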

  

3. Pod

The pod role discovers all pods and exposes their containers as targets. For each declared port of a container, a single target is generated. If a container has no declared ports, a port-free target per container is created, and a port can then be assigned manually via relabeling.

Available meta labels:

  1. __meta_kubernetes_namespace: The namespace of the pod object.
  2. __meta_kubernetes_pod_name: The name of the pod object.
  3. __meta_kubernetes_pod_ip: The pod IP of the pod object.
  4. __meta_kubernetes_pod_label_<labelname>: Each label from the pod object.
5. __meta_kubernetes_pod_labelpresent_<labelname>: true for each label from the pod object.
  6. __meta_kubernetes_pod_annotation_<annotationname>: Each annotation from the pod object.
  7. __meta_kubernetes_pod_annotationpresent_<annotationname>: true for each annotation from the pod object.
  8. __meta_kubernetes_pod_container_init: true if the container is an InitContainer
  9. __meta_kubernetes_pod_container_name: Name of the container the target address points to.
  10. __meta_kubernetes_pod_container_port_name: Name of the container port.
  11. __meta_kubernetes_pod_container_port_number: Number of the container port.
  12. __meta_kubernetes_pod_container_port_protocol: Protocol of the container port.
  13. __meta_kubernetes_pod_ready: Set to true or false for the pod's ready state.
  14. __meta_kubernetes_pod_phase: Set to Pending, Running, Succeeded, Failed or Unknown in the lifecycle.
  15. __meta_kubernetes_pod_node_name: The name of the node the pod is scheduled onto.
  16. __meta_kubernetes_pod_host_ip: The current host IP of the pod object.
  17. __meta_kubernetes_pod_uid: The UID of the pod object.
  18. __meta_kubernetes_pod_controller_kind: Object kind of the pod controller.
  19. __meta_kubernetes_pod_controller_name: Name of the pod controller.
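Besides filtering on pod labels (the approach used in Part II below), a widely used convention is to let pods opt in through annotations. A sketch, where the prometheus.io/* annotation names are community convention rather than built-in Prometheus behavior:

    - job_name: 'kubernetes-pods'   # illustrative name
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      # keep only pods annotated prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # let a pod override the metrics path via prometheus.io/path
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)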

  

4. Endpoints

The endpoints role discovers targets from the listed endpoints of a service. For each endpoint address, one target is discovered per port. If the endpoint is backed by a pod, all additional container ports of the pod that are not bound to an endpoint port are discovered as targets as well.

Available meta labels:

  1. __meta_kubernetes_namespace: The namespace of the endpoints object.
  2. __meta_kubernetes_endpoints_name: The names of the endpoints object.

For all targets discovered directly from the endpoints list (those not additionally inferred from underlying pods), the following labels are attached:

  1. __meta_kubernetes_endpoint_hostname: Hostname of the endpoint.
  2. __meta_kubernetes_endpoint_node_name: Name of the node hosting the endpoint.
  3. __meta_kubernetes_endpoint_ready: Set to true or false for the endpoint's ready state.
  4. __meta_kubernetes_endpoint_port_name: Name of the endpoint port.
  5. __meta_kubernetes_endpoint_port_protocol: Protocol of the endpoint port.
  6. __meta_kubernetes_endpoint_address_target_kind: Kind of the endpoint address target.
  7. __meta_kubernetes_endpoint_address_target_name: Name of the endpoint address target.

If the endpoints belong to a service, all labels of the role: service discovery are attached. For all targets backed by a pod, all labels of the role: pod discovery are attached.
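A sketch in the style of the classic kubernetes-service-endpoints example from the Prometheus repository (the annotation name is, again, convention):

    - job_name: 'kubernetes-service-endpoints'   # illustrative name
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      # keep only endpoints whose service opts in via annotation
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # record which service the target came from
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name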

  

5. Ingress

The ingress role discovers a target for each path of each ingress. This is generally useful for blackbox monitoring of an ingress. The address will be set to the host specified in the ingress spec.

Available meta labels:

  1. __meta_kubernetes_namespace: The namespace of the ingress object.
  2. __meta_kubernetes_ingress_name: The name of the ingress object.
  3. __meta_kubernetes_ingress_label_<labelname>: Each label from the ingress object.
  4. __meta_kubernetes_ingress_labelpresent_<labelname>: true for each label from the ingress object.
  5. __meta_kubernetes_ingress_annotation_<annotationname>: Each annotation from the ingress object.
  6. __meta_kubernetes_ingress_annotationpresent_<annotationname>: true for each annotation from the ingress object.
  7. __meta_kubernetes_ingress_scheme: Protocol scheme of ingress, https if TLS config is set. Defaults to http.
  8. __meta_kubernetes_ingress_path: Path from ingress spec. Defaults to /.
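A sketch of pairing the ingress role with a blackbox exporter; the exporter address is an assumption, and the relabeling rebuilds the full probe URL from the scheme, host, and path meta labels:

    - job_name: 'kubernetes-ingresses'   # illustrative name
      kubernetes_sd_configs:
      - role: ingress
      metrics_path: /probe
      params:
        module: [http_2xx]
      relabel_configs:
      # join scheme://host/path into the probe target
      - source_labels: [__meta_kubernetes_ingress_scheme, __address__, __meta_kubernetes_ingress_path]
        regex: (.+);(.+);(.+)
        replacement: ${1}://${2}${3}
        target_label: __param_target
      # scrape the exporter itself rather than the ingress
      - target_label: __address__
        replacement: blackbox-exporter:9115   # assumed exporter address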

  

II. Auto-Discovery Configuration for Pods

Say the business runs a microservice as a StatefulSet with a replica count of 2, and each pod exposes its metrics at http://pod_ip:7000/metrics. Because the pod IPs change on every restart, the only reliable way to collect the data is auto-discovery.

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      labels:
        run: jx3recipe
      name: jx3recipe
      annotations:
        prometheus.io/scrape: "true"
    spec:
      selector:
        matchLabels:
          app: jx3recipe
      serviceName: jx3recipe-service
      replicas: 2
      template:
        metadata:
          labels:
            app: jx3recipe
            appCluster: jx3recipe-cluster
        spec:
          terminationGracePeriodSeconds: 20
          containers:
          - image: hub.kce.ooo.com/jx3pvp/jx3recipe:qa-latest
            imagePullPolicy: Always
            securityContext:
              runAsUser: 1000
            name: jx3recipe
            lifecycle:
              preStop:
                exec:
                  command: ["kill","-s","SIGINT","1"]
            volumeMounts:
            - name: config-volume
              mountPath: /data/conf.yml
              subPath: conf.yml
            resources:
              requests:
                cpu: "100m"
                memory: "500Mi"
            env:
            - name: JX3PVP_ENV
              value: "qa"
            - name: JX3PVP_RUN_MODE
              value: "k8s"
            - name: JX3PVP_SERVICE_ID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: JX3PVP_LOCAL_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: JX3PVP_CONSUL_IP
              value: $(CONSUL_AGENT_SERVICE_HOST)
            ports:
            - name: biz
              containerPort: 8000
              protocol: "TCP"
            - name: admin
              containerPort: 7000
              protocol: "TCP"
          volumes:
          - name: config-volume
            configMap:
              name: app-configure-file-jx3recipe
              items:
              - key: jx3recipe.yml
                path: conf.yml
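Note that serviceName above refers to a governing headless Service that the original post does not show; a minimal sketch of what it could look like:

    apiVersion: v1
    kind: Service
    metadata:
      name: jx3recipe-service
    spec:
      clusterIP: None   # headless: only stable per-pod DNS is needed, no VIP
      selector:
        app: jx3recipe
      ports:
      - name: admin
        port: 7000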

  

1. Create the discovery rule

Define the pod discovery rules in a file named prometheus-additional.yaml:

  • write each pod's container name into a label named jx3recipe
  • keep only pods whose appCluster label matches jx3recipe-cluster
  • keep only addresses of the form (.*):7000, i.e. the http://pod_ip:7000/metrics endpoint

    - job_name: 'kubernetes-service-pod'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      # record the container name in a label called jx3recipe
      - source_labels: [__meta_kubernetes_pod_container_name]
        action: replace
        target_label: jx3recipe
      # map all pod labels onto the target
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      # keep only pods labeled appCluster=jx3recipe-cluster
      - source_labels: ["__meta_kubernetes_pod_label_appCluster"]
        regex: "jx3recipe-cluster"
        action: keep
      # keep only targets scraping port 7000
      - source_labels: [__address__]
        action: keep
        regex: '(.*):7000'

  

2. Create the corresponding Secret object

    kubectl create secret generic additional-configs --from-file=prometheus-additional.yaml -n monitoring

 

Once created, the configuration above is stored base64-encoded as the value of the prometheus-additional.yaml key:

    apiVersion: v1
    data:
      prometheus-additional.yaml: LSBqb2JfbmFtZTogJ2t1YmVybmV0ZXMtc2VydmljZS1wb2QnCiAga3ViZXJuZXRlc19zZF9jb25maWdzOgogIC0gcm9sZTogcG9kCiAgcmVsYWJlbF9jb25maWdzOgogIC0gc291cmNlX2xhYmVsczogW19fbWV0YV9rdWJlcm5ldGVzX3BvZF9jb250YWluZXJfbmFtZV0KICAgIGFjdGlvbjogcmVwbGFjZQogICAgdGFyZ2V0X2xhYmVsOiBqeDNyZWNpcGUKICAtIGFjdGlvbjogbGFiZWxtYXAKICAgIHJlZ2V4OiBfX21ldGFfa3ViZXJuZXRlc19wb2RfbGFiZWxfKC4rKQogIC0gc291cmNlX2xhYmVsczogIFsiX19tZXRhX2t1YmVybmV0ZXNfcG9kX2xhYmVsX2FwcENsdXN0ZXIiXQogICAgcmVnZXg6ICJqeDNyZWNpcGUtY2x1c3RlciIKICAgIGFjdGlvbjoga2VlcAogIC0gc291cmNlX2xhYmVsczogW19fYWRkcmVzc19fXQogICAgYWN0aW9uOiBrZWVwCiAgICByZWdleDogJyguKik6NzAwMCcK
    kind: Secret
    metadata:
      creationTimestamp: "2019-09-10T09:32:22Z"
      name: additional-configs
      namespace: monitoring
      resourceVersion: "1004681"
      selfLink: /api/v1/namespaces/monitoring/secrets/additional-configs
      uid: e455d657-d3ad-11e9-95b4-fa163e3c10ff
    type: Opaque
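To double-check the stored value, it can be decoded back out (the backslash escapes the dot in the key name for jsonpath):

    kubectl get secret additional-configs -n monitoring \
      -o jsonpath='{.data.prometheus-additional\.yaml}' | base64 -d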

Then we only need to reference this extra configuration in the file that declares the Prometheus resource object (prometheus-prometheus.yaml):

3. Reference it from the Prometheus resource object

Edit the prometheus-prometheus.yaml file:

    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      labels:
        prometheus: k8s
      name: k8s
      namespace: monitoring
    spec:
      alerting:
        alertmanagers:
        - name: alertmanager-main
          namespace: monitoring
          port: web
      baseImage: quay.io/prometheus/prometheus
      nodeSelector:
        beta.kubernetes.io/os: linux
      replicas: 2
      secrets:
      - etcd-certs
      resources:
        requests:
          memory: 400Mi
      ruleSelector:
        matchLabels:
          prometheus: k8s
          role: alert-rules
      securityContext:
        fsGroup: 2000
        runAsNonRoot: true
        runAsUser: 1000
      additionalScrapeConfigs:
        name: additional-configs
        key: prometheus-additional.yaml
      serviceAccountName: prometheus-k8s
      serviceMonitorNamespaceSelector: {}
      serviceMonitorSelector: {}
      version: v2.5.0

The only addition is this block:

    additionalScrapeConfigs:
      name: additional-configs
      key: prometheus-additional.yaml

  

4. Apply the configuration

    kubectl apply -f prometheus-prometheus.yaml

After a little while, refresh the config page in the Prometheus UI and the configuration highlighted in the red box below will appear.

[screenshot: the kubernetes-service-pod job visible on the Prometheus config page]

5. Grant permissions

On the configuration page of the Prometheus dashboard we can now see the new configuration, but switching to the targets page shows no corresponding scrape job. Check the Prometheus pod logs:
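For example, assuming the default kube-prometheus pod naming:

    kubectl logs prometheus-k8s-0 -n monitoring -c prometheus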

[screenshot: Prometheus pod logs full of "... is forbidden" errors]

The logs contain many errors of the form "xxx is forbidden", which points to an RBAC permission problem. From the Prometheus resource object we know that Prometheus is bound to a ServiceAccount named prometheus-k8s, and that ServiceAccount is bound to a ClusterRole named prometheus-k8s (prometheus-clusterRole.yaml).
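To confirm that chain, the binding and role can be inspected directly (object names as shipped by kube-prometheus):

    kubectl describe clusterrolebinding prometheus-k8s
    kubectl get clusterrole prometheus-k8s -o yaml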

Change it to:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: prometheus-k8s
    rules:
    - apiGroups:
      - ""
      resources:
      - nodes
      - services
      - endpoints
      - pods
      - nodes/proxy
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - ""
      resources:
      - configmaps
      - nodes/metrics
      verbs:
      - get
    - nonResourceURLs:
      - /metrics
      verbs:
      - get

Update the ClusterRole object and then recreate all of the Prometheus pods. The targets page should then show the kubernetes-service-pod job:

[screenshot: the kubernetes-service-pod targets on the Prometheus targets page]

With that, auto-discovery of pods is fully configured. The other resource types (service, endpoints, ingress, node) can be discovered automatically in the same way.
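As a final sanity check, the new job can also be queried through the Prometheus HTTP API; the Service name prometheus-k8s is the kube-prometheus default:

    kubectl -n monitoring port-forward svc/prometheus-k8s 9090 &
    curl -sG http://localhost:9090/api/v1/query \
      --data-urlencode 'query=up{job="kubernetes-service-pod"}'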

Reposted from: https://www.cnblogs.com/skyflask/p/11498834.html
