References:
- Helm installation parameters: https://github.com/helm/charts/blob/master/stable/prometheus-operator/README.md#configuration
- Installation and deployment guide (in Chinese): https://www.qikqiak.com/post/first-use-prometheus-operator/#%E4%BB%8B%E7%BB%8D
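1. Overview
Prometheus Operator is a Prometheus-based monitoring solution for Kubernetes, developed by CoreOS. (As I understand it, it is a tool that helps operate and manage Prometheus inside Kubernetes and makes Prometheus easier to use.)
The Operator is built on Kubernetes CustomResourceDefinitions (CRDs).
It defines the following CRD resources:
- Prometheus: declares a Prometheus deployment. Creating this resource causes the Operator to deploy the Prometheus Pods it describes (the Operator manages them through a StatefulSet).
- ServiceMonitor: declares which Service resources Prometheus should monitor. The Operator generates the Prometheus scrape configuration automatically from the ServiceMonitor definitions.
- PodMonitor: declares which Pod resources Prometheus should monitor. The Operator generates the Prometheus scrape configuration automatically from the PodMonitor definitions.
- PrometheusRule: defines alerting rules. Prometheus loads the PrometheusRule configuration and fires alerts according to it.
- Alertmanager: declares an Alertmanager deployment. Creating this resource causes the Operator to deploy the Alertmanager Pods it describes (also managed through a StatefulSet).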
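The rest of this post shows ServiceMonitor and Prometheus manifests but no PrometheusRule, so here is a minimal sketch of one. The resource name, the `role: alert-rules` label, the job name `my-app` and the alert itself are all hypothetical. A Prometheus resource only loads the rule if its `ruleSelector` matches the labels on the PrometheusRule.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-app-rules            # hypothetical name
  labels:
    role: alert-rules           # hypothetical label; must match the Prometheus ruleSelector
spec:
  groups:
    - name: my-app.rules
      rules:
        # Fire when a scrape target of the (hypothetical) job my-app
        # has been unreachable for five minutes.
        - alert: MyAppTargetDown
          expr: up{job="my-app"} == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: A my-app scrape target is down
```

The `groups` section uses the same structure as a plain Prometheus rule file, so existing rule files can usually be moved under `spec` with little change.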
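1.1 Architecture diagram
2. Installing prometheus-operator with Helm
The command below uses Helm 2 syntax (Helm 3 takes the release name as a positional argument instead of --name):
helm install --name prometheus-operator --set rbacEnable=true --namespace=monitoring stable/prometheus-operator
The installation output is as follows: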
NAME: prometheus-operator
LAST DEPLOYED: Wed Dec 4 16:44:59 2019
NAMESPACE: monitoring
STATUS: DEPLOYED
RESOURCES:
==> v1/Alertmanager
NAME AGE
prometheus-operator-alertmanager 32s
==> v1/ClusterRole
NAME AGE
prometheus-operator-alertmanager 33s
prometheus-operator-grafana-clusterrole 33s
prometheus-operator-operator 33s
prometheus-operator-operator-psp 33s
prometheus-operator-prometheus 33s
prometheus-operator-prometheus-psp 33s
psp-prometheus-operator-kube-state-metrics 33s
psp-prometheus-operator-prometheus-node-exporter 33s
==> v1/ClusterRoleBinding
NAME AGE
prometheus-operator-alertmanager 33s
prometheus-operator-grafana-clusterrolebinding 33s
prometheus-operator-operator 33s
prometheus-operator-operator-psp 33s
prometheus-operator-prometheus 33s
prometheus-operator-prometheus-psp 33s
psp-prometheus-operator-kube-state-metrics 33s
psp-prometheus-operator-prometheus-node-exporter 33s
==> v1/ConfigMap
NAME AGE
prometheus-operator-etcd 33s
prometheus-operator-grafana 33s
prometheus-operator-grafana-config-dashboards 33s
prometheus-operator-grafana-datasource 33s
prometheus-operator-grafana-test 33s
prometheus-operator-k8s-cluster-rsrc-use 33s
prometheus-operator-k8s-node-rsrc-use 33s
prometheus-operator-k8s-resources-cluster 33s
prometheus-operator-k8s-resources-namespace 33s
prometheus-operator-k8s-resources-pod 33s
prometheus-operator-k8s-resources-workload 33s
prometheus-operator-k8s-resources-workloads-namespace 33s
prometheus-operator-nodes 33s
prometheus-operator-persistentvolumesusage 33s
prometheus-operator-pods 33s
prometheus-operator-statefulset 33s
==> v1/DaemonSet
NAME AGE
prometheus-operator-prometheus-node-exporter 32s
==> v1/Deployment
NAME AGE
prometheus-operator-grafana 32s
prometheus-operator-kube-state-metrics 32s
prometheus-operator-operator 32s
==> v1/Pod(related)
NAME AGE
prometheus-operator-grafana-685f4b98b5-jzttt 32s
prometheus-operator-kube-state-metrics-746dc6ccc-l78g7 32s
prometheus-operator-operator-76b5d88594-hrwsw 32s
prometheus-operator-prometheus-node-exporter-k5d8v 32s
prometheus-operator-prometheus-node-exporter-xmrwf 32s
prometheus-operator-prometheus-node-exporter-z4knj 32s
==> v1/Prometheus
NAME AGE
prometheus-operator-prometheus 32s
==> v1/PrometheusRule
NAME AGE
prometheus-operator-alertmanager.rules 32s
prometheus-operator-etcd 32s
prometheus-operator-general.rules 32s
prometheus-operator-k8s.rules 32s
prometheus-operator-kube-apiserver.rules 32s
prometheus-operator-kube-prometheus-node-alerting.rules 32s
prometheus-operator-kube-prometheus-node-recording.rules 32s
prometheus-operator-kube-scheduler.rules 32s
prometheus-operator-kubernetes-absent 32s
prometheus-operator-kubernetes-apps 32s
prometheus-operator-kubernetes-resources 32s
prometheus-operator-kubernetes-storage 32s
prometheus-operator-kubernetes-system 32s
prometheus-operator-node-network 32s
prometheus-operator-node-time 32s
prometheus-operator-node.rules 32s
prometheus-operator-prometheus-operator 32s
prometheus-operator-prometheus.rules 32s
==> v1/Role
NAME AGE
prometheus-operator-grafana-test 33s
==> v1/RoleBinding
NAME AGE
prometheus-operator-grafana-test 33s
==> v1/Secret
NAME AGE
alertmanager-prometheus-operator-alertmanager 33s
prometheus-operator-grafana 33s
==> v1/Service
NAME AGE
prometheus-operator-alertmanager 32s
prometheus-operator-coredns 33s
prometheus-operator-grafana 33s
prometheus-operator-kube-controller-manager 33s
prometheus-operator-kube-etcd 33s
prometheus-operator-kube-proxy 33s
prometheus-operator-kube-scheduler 33s
prometheus-operator-kube-state-metrics 32s
prometheus-operator-operator 32s
prometheus-operator-prometheus 33s
prometheus-operator-prometheus-node-exporter 32s
==> v1/ServiceAccount
NAME AGE
prometheus-operator-alertmanager 33s
prometheus-operator-grafana 33s
prometheus-operator-grafana-test 33s
prometheus-operator-kube-state-metrics 33s
prometheus-operator-operator 33s
prometheus-operator-prometheus 33s
prometheus-operator-prometheus-node-exporter 33s
==> v1/ServiceMonitor
NAME AGE
prometheus-operator-alertmanager 32s
prometheus-operator-apiserver 32s
prometheus-operator-coredns 32s
prometheus-operator-grafana 32s
prometheus-operator-kube-controller-manager 32s
prometheus-operator-kube-etcd 32s
prometheus-operator-kube-proxy 32s
prometheus-operator-kube-scheduler 32s
prometheus-operator-kube-state-metrics 32s
prometheus-operator-kubelet 32s
prometheus-operator-node-exporter 32s
prometheus-operator-operator 32s
prometheus-operator-prometheus 32s
==> v1beta1/ClusterRole
NAME AGE
prometheus-operator-kube-state-metrics 33s
==> v1beta1/ClusterRoleBinding
NAME AGE
prometheus-operator-kube-state-metrics 33s
==> v1beta1/MutatingWebhookConfiguration
NAME AGE
prometheus-operator-admission 32s
==> v1beta1/PodSecurityPolicy
NAME AGE
prometheus-operator-alertmanager 33s
prometheus-operator-grafana 33s
prometheus-operator-grafana-test 33s
prometheus-operator-kube-state-metrics 33s
prometheus-operator-operator 33s
prometheus-operator-prometheus 33s
prometheus-operator-prometheus-node-exporter 33s
==> v1beta1/Role
NAME AGE
prometheus-operator-grafana 33s
==> v1beta1/RoleBinding
NAME AGE
prometheus-operator-grafana 33s
==> v1beta1/ValidatingWebhookConfiguration
NAME AGE
prometheus-operator-admission 32s
NOTES:
The Prometheus Operator has been installed. Check its status by running:
kubectl --namespace monitoring get pods -l "release=prometheus-operator"
Visit https://github.com/coreos/prometheus-operator for instructions on how
to create & configure Alertmanager and Prometheus instances using the Operator.
To uninstall the release, run:
helm delete prometheus-operator
If a reinstall fails because the CRDs already exist, delete the CRDs first and then install again. The command to delete the CRDs is:
kubectl delete --ignore-not-found customresourcedefinitions \
  prometheuses.monitoring.coreos.com \
  servicemonitors.monitoring.coreos.com \
  podmonitors.monitoring.coreos.com \
  alertmanagers.monitoring.coreos.com \
  prometheusrules.monitoring.coreos.com
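2.1 Retrieving the Grafana password
The admin password is stored base64-encoded in the prometheus-operator-grafana Secret:
kubectl get secret \
  --namespace monitoring prometheus-operator-grafana \
  -o jsonpath="{.data.admin-password}" \
  | base64 --decode ; echo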
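Instead of reading the password back from the Secret, it can probably be set at install time through the chart's values. The sketch below assumes the chart exposes a `grafana.adminPassword` key (check it against the configuration table linked in the references). The file name `values.yaml` and the password are placeholders.

```yaml
# values.yaml (hypothetical file name)
grafana:
  # Assumed key: the Grafana admin password.
  # Replace the placeholder before use.
  adminPassword: change-me
```

The file would then be passed to the install command with `-f values.yaml`, in addition to the flags used earlier.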
3. Creating the Ingresses
Create the Ingress for grafana.testdomain.com with kubectl create -f prometheus-operator-grafana-ingress.yaml, where prometheus-operator-grafana-ingress.yaml contains:
# Note: extensions/v1beta1 Ingress is only served by Kubernetes versions before 1.22.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-operator-grafana
  namespace: monitoring
spec:
  rules:
    - host: grafana.testdomain.com
      http:
        paths:
          - backend:
              serviceName: prometheus-operator-grafana
              servicePort: 80
Create the Ingress for prometheus.testdomain.com with kubectl create -f prometheus-operator-prometheus-ingress.yaml, where prometheus-operator-prometheus-ingress.yaml contains:
# Note: extensions/v1beta1 Ingress is only served by Kubernetes versions before 1.22.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-operator-prometheus
  namespace: monitoring
spec:
  rules:
    - host: prometheus.testdomain.com
      http:
        paths:
          - backend:
              serviceName: prometheus-operator-prometheus
              servicePort: 9090
Create the Ingress for prometheus-alert.testdomain.com with kubectl create -f prometheus-operator-alertmanager-ingress.yaml, where prometheus-operator-alertmanager-ingress.yaml contains:
# Note: extensions/v1beta1 Ingress is only served by Kubernetes versions before 1.22.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-operator-alertmanager
  namespace: monitoring
spec:
  rules:
    - host: prometheus-alert.testdomain.com
      http:
        paths:
          - backend:
              serviceName: prometheus-operator-alertmanager
              servicePort: 9093
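On newer clusters the extensions/v1beta1 Ingress API is no longer available, and the same rules have to be written against networking.k8s.io/v1. Below is a sketch of the Alertmanager Ingress in that form. The `path: /` and `pathType: Prefix` values are my assumptions (the original manifest had no path), and depending on the ingress controller an `ingressClassName` may also be required.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-operator-alertmanager
  namespace: monitoring
spec:
  rules:
    - host: prometheus-alert.testdomain.com
      http:
        paths:
          - path: /                 # assumption: route everything under the host
            pathType: Prefix        # required field in networking.k8s.io/v1
            backend:
              service:
                name: prometheus-operator-alertmanager
                port:
                  number: 9093
```

The Grafana and Prometheus Ingresses would follow the same pattern, with their own host, Service name and port.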
4. Usage example
4.1 First, create the Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          image: fabxc/instrumented_app
          ports:
            - name: web
              containerPort: 8080
4.2 Expose the Deployment through a Service
kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
    - name: web
      port: 8080
4.3 Monitor the example-app Service with a ServiceMonitor
The selector.matchLabels below selects the Service labeled app: example-app, and endpoints.port refers to the Service port named web:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: web
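For workloads that have no Service in front of them, the PodMonitor CRD from section 1 plays the same role by selecting Pods directly. Here is a sketch for the same example-app Pods. The `team: frontend` label is reused only for illustration, and a Prometheus resource would need a matching `podMonitorSelector` (not shown in this post) to pick it up.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app
  labels:
    team: frontend          # matched by the Prometheus podMonitorSelector
spec:
  selector:
    matchLabels:
      app: example-app      # selects Pods, not Services
  podMetricsEndpoints:
    - port: web             # name of the container port to scrape
```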
4.4 Create the RBAC resources needed to deploy the Prometheus resource
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
# rbac.authorization.k8s.io/v1 replaces the v1beta1 API, which was removed in Kubernetes 1.22.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - services
      - endpoints
      - pods
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources:
      - configmaps
    verbs: ["get"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: default
4.5 Create the Prometheus resource
Once the Prometheus resource is created, the Operator deploys a Pod for it automatically. The Pod runs under the prometheus ServiceAccount (authorized in the previous step), and this Prometheus monitors the ServiceMonitors whose labels match team: frontend:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
The enableAdminAPI parameter controls whether the Prometheus admin API is exposed.
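To reach this Prometheus instance from outside the cluster, one option is a Service in front of its Pods. The sketch below makes two assumptions: that the Operator labels the Pods with `prometheus: prometheus` (the label key `prometheus` plus the name of the resource), and that node port 30900 is free. Verify both on your cluster, for example by listing the Pod labels.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  selector:
    prometheus: prometheus    # assumed Pod label: "prometheus: <resource name>"
  ports:
    - name: web
      port: 9090
      targetPort: web         # named port on the Prometheus container
      nodePort: 30900         # hypothetical node port
      protocol: TCP
```

An Ingress like the ones in section 3 could point at this Service instead of using a NodePort.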