Installing Prometheus and Grafana monitoring with prometheus-operator

olivee ⋅ 5 years ago ⋅ 1288 views

Reference:

1. Overview

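Prometheus Operator is a Prometheus-based monitoring solution for Kubernetes, developed by CoreOS. (As I understand it, it is a tool that helps operate and manage Prometheus inside k8s, and it makes Prometheus easier to use.)

The Operator is built on Kubernetes CRDs (Custom Resource Definitions).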

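The Operator defines the following CRD resources:

  • Prometheus: declares a desired Prometheus deployment. Creating this resource causes the Operator to deploy the Prometheus Pods it describes (the Operator manages them as a StatefulSet).

  • ServiceMonitor: declares which Services Prometheus should monitor. The Operator generates the Prometheus scrape configuration automatically from the ServiceMonitor definitions.

  • PodMonitor: declares which Pods Prometheus should monitor. The Operator generates the Prometheus scrape configuration automatically from the PodMonitor definitions.

  • PrometheusRule: declares alerting (and recording) rules. Prometheus loads the rules defined in PrometheusRule resources and fires alerts according to them.

  • Alertmanager: declares a desired Alertmanager deployment. Creating this resource causes the Operator to deploy the Alertmanager Pods it describes (also managed as a StatefulSet).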
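
The usage example in section 4 only exercises the Prometheus and ServiceMonitor kinds. As a rough sketch of what the other three can look like, the manifests below show a PodMonitor, a PrometheusRule, and an Alertmanager. Every name, label, and the alert expression (example-app, team: frontend, ExampleAppDown, example) is an illustrative assumption, not something the chart installs, and the labels only take effect if they match the selectors of your Prometheus resource.

```yaml
# PodMonitor: scrape Pods directly, without going through a Service.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app              # hypothetical name
  labels:
    team: frontend               # assumed label; must match the Prometheus podMonitorSelector
spec:
  selector:
    matchLabels:
      app: example-app           # selects Pods carrying this label
  podMetricsEndpoints:
  - port: web                    # name of the container port to scrape
---
# PrometheusRule: a single group with one alerting rule.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-app-rules        # hypothetical name
  labels:
    team: frontend               # assumed label; must match the Prometheus ruleSelector
spec:
  groups:
  - name: example-app.rules
    rules:
    - alert: ExampleAppDown      # hypothetical alert name
      # Assumes the scraped targets carry the label job="example-app".
      expr: up{job="example-app"} == 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: An example-app target has been down for 5 minutes.
---
# Alertmanager: a single-replica Alertmanager managed by the Operator.
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example                  # hypothetical name
spec:
  replicas: 1
```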

1.1 Architecture diagram:

[Figure: p1.png, Prometheus Operator architecture diagram]

2. Installing prometheus-operator with Helm

helm install --name prometheus-operator --set rbacEnable=true --namespace=monitoring stable/prometheus-operator

The installation output is:

NAME:   prometheus-operator
LAST DEPLOYED: Wed Dec  4 16:44:59 2019
NAMESPACE: monitoring
STATUS: DEPLOYED

RESOURCES:
==> v1/Alertmanager
NAME                              AGE
prometheus-operator-alertmanager  32s

==> v1/ClusterRole
NAME                                              AGE
prometheus-operator-alertmanager                  33s
prometheus-operator-grafana-clusterrole           33s
prometheus-operator-operator                      33s
prometheus-operator-operator-psp                  33s
prometheus-operator-prometheus                    33s
prometheus-operator-prometheus-psp                33s
psp-prometheus-operator-kube-state-metrics        33s
psp-prometheus-operator-prometheus-node-exporter  33s

==> v1/ClusterRoleBinding
NAME                                              AGE
prometheus-operator-alertmanager                  33s
prometheus-operator-grafana-clusterrolebinding    33s
prometheus-operator-operator                      33s
prometheus-operator-operator-psp                  33s
prometheus-operator-prometheus                    33s
prometheus-operator-prometheus-psp                33s
psp-prometheus-operator-kube-state-metrics        33s
psp-prometheus-operator-prometheus-node-exporter  33s

==> v1/ConfigMap
NAME                                                   AGE
prometheus-operator-etcd                               33s
prometheus-operator-grafana                            33s
prometheus-operator-grafana-config-dashboards          33s
prometheus-operator-grafana-datasource                 33s
prometheus-operator-grafana-test                       33s
prometheus-operator-k8s-cluster-rsrc-use               33s
prometheus-operator-k8s-node-rsrc-use                  33s
prometheus-operator-k8s-resources-cluster              33s
prometheus-operator-k8s-resources-namespace            33s
prometheus-operator-k8s-resources-pod                  33s
prometheus-operator-k8s-resources-workload             33s
prometheus-operator-k8s-resources-workloads-namespace  33s
prometheus-operator-nodes                              33s
prometheus-operator-persistentvolumesusage             33s
prometheus-operator-pods                               33s
prometheus-operator-statefulset                        33s

==> v1/DaemonSet
NAME                                          AGE
prometheus-operator-prometheus-node-exporter  32s

==> v1/Deployment
NAME                                    AGE
prometheus-operator-grafana             32s
prometheus-operator-kube-state-metrics  32s
prometheus-operator-operator            32s

==> v1/Pod(related)
NAME                                                    AGE
prometheus-operator-grafana-685f4b98b5-jzttt            32s
prometheus-operator-kube-state-metrics-746dc6ccc-l78g7  32s
prometheus-operator-operator-76b5d88594-hrwsw           32s
prometheus-operator-prometheus-node-exporter-k5d8v      32s
prometheus-operator-prometheus-node-exporter-xmrwf      32s
prometheus-operator-prometheus-node-exporter-z4knj      32s

==> v1/Prometheus
NAME                            AGE
prometheus-operator-prometheus  32s

==> v1/PrometheusRule
NAME                                                      AGE
prometheus-operator-alertmanager.rules                    32s
prometheus-operator-etcd                                  32s
prometheus-operator-general.rules                         32s
prometheus-operator-k8s.rules                             32s
prometheus-operator-kube-apiserver.rules                  32s
prometheus-operator-kube-prometheus-node-alerting.rules   32s
prometheus-operator-kube-prometheus-node-recording.rules  32s
prometheus-operator-kube-scheduler.rules                  32s
prometheus-operator-kubernetes-absent                     32s
prometheus-operator-kubernetes-apps                       32s
prometheus-operator-kubernetes-resources                  32s
prometheus-operator-kubernetes-storage                    32s
prometheus-operator-kubernetes-system                     32s
prometheus-operator-node-network                          32s
prometheus-operator-node-time                             32s
prometheus-operator-node.rules                            32s
prometheus-operator-prometheus-operator                   32s
prometheus-operator-prometheus.rules                      32s

==> v1/Role
NAME                              AGE
prometheus-operator-grafana-test  33s

==> v1/RoleBinding
NAME                              AGE
prometheus-operator-grafana-test  33s

==> v1/Secret
NAME                                           AGE
alertmanager-prometheus-operator-alertmanager  33s
prometheus-operator-grafana                    33s

==> v1/Service
NAME                                          AGE
prometheus-operator-alertmanager              32s
prometheus-operator-coredns                   33s
prometheus-operator-grafana                   33s
prometheus-operator-kube-controller-manager   33s
prometheus-operator-kube-etcd                 33s
prometheus-operator-kube-proxy                33s
prometheus-operator-kube-scheduler            33s
prometheus-operator-kube-state-metrics        32s
prometheus-operator-operator                  32s
prometheus-operator-prometheus                33s
prometheus-operator-prometheus-node-exporter  32s

==> v1/ServiceAccount
NAME                                          AGE
prometheus-operator-alertmanager              33s
prometheus-operator-grafana                   33s
prometheus-operator-grafana-test              33s
prometheus-operator-kube-state-metrics        33s
prometheus-operator-operator                  33s
prometheus-operator-prometheus                33s
prometheus-operator-prometheus-node-exporter  33s

==> v1/ServiceMonitor
NAME                                         AGE
prometheus-operator-alertmanager             32s
prometheus-operator-apiserver                32s
prometheus-operator-coredns                  32s
prometheus-operator-grafana                  32s
prometheus-operator-kube-controller-manager  32s
prometheus-operator-kube-etcd                32s
prometheus-operator-kube-proxy               32s
prometheus-operator-kube-scheduler           32s
prometheus-operator-kube-state-metrics       32s
prometheus-operator-kubelet                  32s
prometheus-operator-node-exporter            32s
prometheus-operator-operator                 32s
prometheus-operator-prometheus               32s

==> v1beta1/ClusterRole
NAME                                    AGE
prometheus-operator-kube-state-metrics  33s

==> v1beta1/ClusterRoleBinding
NAME                                    AGE
prometheus-operator-kube-state-metrics  33s

==> v1beta1/MutatingWebhookConfiguration
NAME                           AGE
prometheus-operator-admission  32s

==> v1beta1/PodSecurityPolicy
NAME                                          AGE
prometheus-operator-alertmanager              33s
prometheus-operator-grafana                   33s
prometheus-operator-grafana-test              33s
prometheus-operator-kube-state-metrics        33s
prometheus-operator-operator                  33s
prometheus-operator-prometheus                33s
prometheus-operator-prometheus-node-exporter  33s

==> v1beta1/Role
NAME                         AGE
prometheus-operator-grafana  33s

==> v1beta1/RoleBinding
NAME                         AGE
prometheus-operator-grafana  33s

==> v1beta1/ValidatingWebhookConfiguration
NAME                           AGE
prometheus-operator-admission  32s


NOTES:
The Prometheus Operator has been installed. Check its status by running:
  kubectl --namespace monitoring get pods -l "release=prometheus-operator"

Visit https://github.com/coreos/prometheus-operator for instructions on how
to create & configure Alertmanager and Prometheus instances using the Operator.

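To uninstall the release, run the following (with Helm 2, add --purge if you plan to reuse the release name later):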

helm delete prometheus-operator 

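If a reinstall fails because the CRDs already exist, delete the CRDs and then install again. Note that deleting a CRD also deletes every resource of that kind in the cluster. The command to delete the CRDs is: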

kubectl delete --ignore-not-found customresourcedefinitions \
  prometheuses.monitoring.coreos.com \
  servicemonitors.monitoring.coreos.com \
  podmonitors.monitoring.coreos.com \
  alertmanagers.monitoring.coreos.com \
  prometheusrules.monitoring.coreos.com

2.1 Retrieve the Grafana password

kubectl get secret \
    --namespace monitoring prometheus-operator-grafana \
    -o jsonpath="{.data.admin-password}" \
    | base64 --decode ; echo

3. Create the Ingresses

Create the Ingress for grafana.testdomain.com with kubectl create -f prometheus-operator-grafana-ingress.yaml. The contents of prometheus-operator-grafana-ingress.yaml are:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-operator-grafana
  namespace: monitoring
spec:
  rules:
  - host: grafana.testdomain.com
    http:
      paths:
      - backend:
          serviceName: prometheus-operator-grafana
          servicePort: 80
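
The extensions/v1beta1 Ingress API used in this section is no longer served as of Kubernetes 1.22. On newer clusters the same rule can be written against networking.k8s.io/v1, which has been available since Kubernetes 1.19. The following is a sketch of the Grafana Ingress in that form. It assumes a path of / with pathType: Prefix, which the original manifest leaves implicit. The Prometheus and Alertmanager Ingresses in this section would change in the same way.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-operator-grafana
  namespace: monitoring
spec:
  rules:
  - host: grafana.testdomain.com
    http:
      paths:
      - path: /                  # assumption: route every path to Grafana
        pathType: Prefix         # required field in networking.k8s.io/v1
        backend:
          service:
            name: prometheus-operator-grafana
            port:
              number: 80
```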

Create the Ingress for prometheus.testdomain.com with kubectl create -f prometheus-operator-prometheus-ingress.yaml. The contents of prometheus-operator-prometheus-ingress.yaml are:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-operator-prometheus
  namespace: monitoring
spec:
  rules:
  - host: prometheus.testdomain.com
    http:
      paths:
      - backend:
          serviceName: prometheus-operator-prometheus
          servicePort: 9090

Create the Ingress for prometheus-alert.testdomain.com with kubectl create -f prometheus-operator-alertmanager-ingress.yaml. The contents of prometheus-operator-alertmanager-ingress.yaml are:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-operator-alertmanager
  namespace: monitoring
spec:
  rules:
  - host: prometheus-alert.testdomain.com
    http:
      paths:
      - backend:
          serviceName: prometheus-operator-alertmanager
          servicePort: 9093
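
As an alternative to hand-written Ingress manifests, the chart can usually create the Ingresses itself. The values fragment below is only a sketch: it assumes the ingress.enabled and ingress.hosts keys exposed by the stable/prometheus-operator chart and its Grafana subchart, and the file name values-ingress.yaml is hypothetical. Check the values.yaml of the chart version you install before relying on it.

```yaml
# values-ingress.yaml (hypothetical file name)
grafana:
  ingress:
    enabled: true
    hosts:
    - grafana.testdomain.com
prometheus:
  ingress:
    enabled: true
    hosts:
    - prometheus.testdomain.com
alertmanager:
  ingress:
    enabled: true
    hosts:
    - prometheus-alert.testdomain.com
```

Such a file would be passed to Helm with -f values-ingress.yaml on helm install or helm upgrade.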

4. Usage example

Reference:

4.1 First, create a Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080

4.2 Expose the Deployment through a Service

kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080

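4.3 Configure monitoring of the example-app Service with a ServiceMonitor

The selector.matchLabels below selects Services labeled app: example-app, and endpoints.port refers to the Service port named web: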

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web

4.4 Create the RBAC objects used by the Prometheus deployment

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default

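4.5 Create the Prometheus resource

Once the Prometheus resource has been created, the Operator automatically deploys a Pod that runs under the prometheus ServiceAccount (authorized in the previous step). This Prometheus instance scrapes the targets of every ServiceMonitor whose labels match team: frontend: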

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false

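The enableAdminAPI parameter controls whether the Prometheus admin API, which allows operations such as deleting series and taking snapshots, is exposed.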
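
To reach the web UI of this Prometheus instance from outside the cluster, one option is a NodePort Service. The sketch below assumes that the Operator labels the Pods it creates with prometheus: prometheus (the name of the Prometheus resource) and names the container port web. The nodePort value 30900 is an arbitrary choice within the default NodePort range.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  selector:
    prometheus: prometheus       # assumed Pod label set by the Operator
  ports:
  - name: web
    port: 9090
    targetPort: web              # assumed name of the Prometheus container port
    nodePort: 30900              # arbitrary port in the default 30000-32767 range
```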