References:
- How to elegantly monitor Kubernetes cluster services: https://www.kancloud.cn/huyipow/prometheus/527093
- Prometheus 2.13 official documentation (Chinese translation): https://www.kancloud.cn/nicefo71/prometheus-doc-zh/1331204
Unlike prometheus-operator and metrics-server, this approach uses neither the aggregation layer nor CRDs: prometheus-operator is built on CRD technology, while metrics-server relies on the aggregation layer.
1. Create the monitoring namespace
kubectl create namespace monitoring
2. Install prometheus
Run the following install command:
helm install stable/prometheus --namespace monitoring --name prometheus
The install output looks like this:
NAME: prometheus
LAST DEPLOYED: Tue Dec 3 14:11:08 2019
NAMESPACE: monitoring
STATUS: DEPLOYED
RESOURCES:
==> v1/ConfigMap
NAME AGE
prometheus-alertmanager 1s
prometheus-server 1s
==> v1/DaemonSet
NAME AGE
prometheus-node-exporter 1s
==> v1/Deployment
NAME AGE
prometheus-alertmanager 1s
prometheus-kube-state-metrics 1s
prometheus-pushgateway 1s
prometheus-server 1s
==> v1/PersistentVolumeClaim
NAME AGE
prometheus-alertmanager 1s
prometheus-server 1s
==> v1/Pod(related)
NAME AGE
prometheus-alertmanager-74ffdf8bd6-xnbt4 1s
prometheus-kube-state-metrics-77757854cf-s5hkz 1s
prometheus-node-exporter-fhbl4 1s
prometheus-node-exporter-txr7z 1s
prometheus-pushgateway-57688d8875-wcjjq 1s
prometheus-server-5c8b68f5cd-tkv5b 1s
==> v1/Service
NAME AGE
prometheus-alertmanager 1s
prometheus-kube-state-metrics 1s
prometheus-node-exporter 1s
prometheus-pushgateway 1s
prometheus-server 1s
==> v1/ServiceAccount
NAME AGE
prometheus-alertmanager 1s
prometheus-kube-state-metrics 1s
prometheus-node-exporter 1s
prometheus-pushgateway 1s
prometheus-server 1s
==> v1beta1/ClusterRole
NAME AGE
prometheus-alertmanager 1s
prometheus-kube-state-metrics 1s
prometheus-pushgateway 1s
prometheus-server 1s
==> v1beta1/ClusterRoleBinding
NAME AGE
prometheus-alertmanager 1s
prometheus-kube-state-metrics 1s
prometheus-pushgateway 1s
prometheus-server 1s
NOTES:
The Prometheus server can be accessed via port 80 on the following DNS name from within your cluster:
prometheus-server.monitoring.svc.cluster.local
Get the Prometheus server URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 9090
The Prometheus alertmanager can be accessed via port 80 on the following DNS name from within your cluster:
prometheus-alertmanager.monitoring.svc.cluster.local
Get the Alertmanager URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus,component=alertmanager" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 9093
#################################################################################
###### WARNING: Pod Security Policy has been moved to a global property. #####
###### use .Values.podSecurityPolicy.enabled with pod-based #####
###### annotations #####
###### (e.g. .Values.nodeExporter.podSecurityPolicy.annotations) #####
#################################################################################
The Prometheus PushGateway can be accessed via port 9091 on the following DNS name from within your cluster:
prometheus-pushgateway.monitoring.svc.cluster.local
Get the PushGateway URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus,component=pushgateway" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 9091
For more information on running Prometheus, visit:
https://prometheus.io/
Prometheus uses two PVCs, so first create the server PVC config file (vi prometheus-server-pvc.yaml):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
    - kubernetes.io/pvc-protection
  labels:
    app: prometheus
  name: prometheus-server
  namespace: monitoring
  annotations:
    volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  volumeMode: Filesystem
Then apply it (delete any existing PVC first, then create):
kubectl delete -f prometheus-server-pvc.yaml
kubectl create -f prometheus-server-pvc.yaml
Next, create the Alertmanager PVC config file (vi prometheus-alertmanager-pvc.yaml):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
    - kubernetes.io/pvc-protection
  labels:
    app: prometheus
  name: prometheus-alertmanager
  namespace: monitoring
  annotations:
    volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  volumeMode: Filesystem
Then apply it (delete any existing PVC first, then create):
kubectl delete -f prometheus-alertmanager-pvc.yaml
kubectl create -f prometheus-alertmanager-pvc.yaml
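Both PVCs request the managed-nfs-storage storage class through the beta annotation, which assumes an NFS provisioner is already running in the cluster. As a hedged sketch only, the backing StorageClass might look like the fragment below; the provisioner name is an assumption and must match whatever name your NFS client provisioner was deployed with:

```yaml
# Hypothetical StorageClass backing the two PVCs above.
# The provisioner value is an assumption; it must match the name
# configured in your NFS client provisioner deployment.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
provisioner: fuseim.pri/ifs  # assumption: the nfs-client-provisioner default
```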
Once everything is installed, you should see the following pods:
$ kubectl get pods -n monitoring
NAME READY STATUS
prometheus-alertmanager-5c5958dcb7-bq2fw 2/2 Running
prometheus-kube-state-metrics-76d649cdf9-v5qg5 1/1 Running
prometheus-node-exporter-j74zq 1/1 Running
prometheus-node-exporter-x5xnq 1/1 Running
prometheus-pushgateway-6744d69d4-27dxb 1/1 Running
prometheus-server-669b987bcd-swcxh 2/2 Running
Where:
- alertmanager: the alerting component.
- kube-state-metrics: a simple service that listens to the Kubernetes API server and generates metrics for objects such as pods, deployments, and nodes, exposing them on a /metrics endpoint.
- node-exporter: monitors host-level resources on each cluster node: CPU, memory, network, disk, and so on. It is deployed as a DaemonSet so that a monitoring agent runs on every node automatically.
- pushgateway: clients push metrics to the Pushgateway, and Prometheus periodically scrapes them from it. It is mainly used because:
  - Prometheus uses a pull model, and targets in a different subnet or behind a firewall may be unreachable for direct scraping.
  - When monitoring business data, metrics from different sources may need to be aggregated in one place for Prometheus to collect.
- server: the main Prometheus server.
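The push flow described above can be sketched with a metric in the Prometheus text exposition format; the metric, label, and job names here are hypothetical, while the in-cluster URL matches the prometheus-pushgateway service created by the chart:

```shell
# Compose a metric in the Prometheus text exposition format
# (metric name, labels, value); the names are hypothetical.
payload='demo_jobs_processed_total{instance="worker-1"} 42'
echo "$payload"

# In-cluster, a client would push it to the Pushgateway like this
# (commented out here because it needs cluster network access):
# echo "$payload" | curl --data-binary @- \
#   http://prometheus-pushgateway.monitoring.svc.cluster.local:9091/metrics/job/demo_job
```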
(Figure: Prometheus architecture diagram)
3. Install grafana
Create a ConfigMap that configures Grafana's data source to point at Prometheus:
kubectl apply -f config.yaml
config.yaml contains the following:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-grafana-datasource
  namespace: monitoring
  labels:
    grafana_datasource: '1'
data:
  datasource.yaml: |-
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        orgId: 1
        url: http://prometheus-server.monitoring.svc.cluster.local
Edit the values file for the Grafana Helm install (vi values.yml):
sidecar:
  datasources:
    enabled: true
    label: grafana_datasource
  dashboards:
    enabled: true
    label: grafana_dashboard
Where:
- sidecar.datasources.enabled: data sources can be configured via ConfigMaps; label selects which ConfigMaps (by label) the sidecar reads.
- sidecar.dashboards.enabled: dashboards can be configured via ConfigMaps; label selects which ConfigMaps (by label) the sidecar reads.
Run the Helm install command:
helm install stable/grafana -f values.yml --namespace monitoring --name grafana
The install output looks like this:
NAME: grafana
LAST DEPLOYED: Tue Dec 3 15:57:26 2019
NAMESPACE: monitoring
STATUS: DEPLOYED
RESOURCES:
==> v1/ClusterRole
NAME AGE
grafana-clusterrole 0s
==> v1/ClusterRoleBinding
NAME AGE
grafana-clusterrolebinding 0s
==> v1/ConfigMap
NAME AGE
grafana 0s
grafana-test 0s
==> v1/Deployment
NAME AGE
grafana 0s
==> v1/Pod(related)
NAME AGE
grafana-577d8d9c79-j5q2p 0s
==> v1/Role
NAME AGE
grafana-test 0s
==> v1/RoleBinding
NAME AGE
grafana-test 0s
==> v1/Secret
NAME AGE
grafana 0s
==> v1/Service
NAME AGE
grafana 0s
==> v1/ServiceAccount
NAME AGE
grafana 0s
grafana-test 0s
==> v1beta1/PodSecurityPolicy
NAME AGE
grafana 0s
grafana-test 0s
==> v1beta1/Role
NAME AGE
grafana 0s
==> v1beta1/RoleBinding
NAME AGE
grafana 0s
NOTES:
1. Get your 'admin' user password by running:
kubectl get secret --namespace monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:
grafana.monitoring.svc.cluster.local
Get the Grafana URL to visit by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=grafana,release=grafana" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 3000
3. Login with the password from step 1 and the username: admin
#################################################################################
###### WARNING: Persistence is disabled!!! You will lose your data when #####
###### the Grafana pod is terminated. #####
#################################################################################
Retrieve the Grafana password:
kubectl get secret \
--namespace monitoring grafana \
-o jsonpath="{.data.admin-password}" \
| base64 --decode ; echo
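The admin-password value in the Secret is stored base64-encoded, which is why the command above pipes it through base64 --decode. A quick local illustration of the round trip, using a made-up password:

```shell
# Kubernetes Secrets store values base64-encoded;
# "s3cr3t-pass" is a made-up example, not a real credential.
encoded=$(printf 's3cr3t-pass' | base64)
echo "$encoded"                            # the form stored in the Secret
echo "$encoded" | base64 --decode ; echo   # recovers s3cr3t-pass
```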
The username for this password is: admin
4. Configure ingress access
Create the Ingress for Grafana (kubectl create -f grafana-ingress.yaml); grafana-ingress.yaml contains:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana
  namespace: monitoring
spec:
  rules:
    - host: grafana.testdomain.com
      http:
        paths:
          - backend:
              serviceName: grafana
              servicePort: 80
Create the Ingress for Prometheus (kubectl create -f prometheus-ingress.yaml); prometheus-ingress.yaml contains:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus
  namespace: monitoring
spec:
  rules:
    - host: prometheus.testdomain.com
      http:
        paths:
          - backend:
              serviceName: prometheus-server
              servicePort: 80
Create the Ingress for Alertmanager (kubectl create -f prometheus-alert-ingress.yaml); prometheus-alert-ingress.yaml contains:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-alert
  namespace: monitoring
spec:
  rules:
    - host: prometheus-alert.testdomain.com
      http:
        paths:
          - backend:
              serviceName: prometheus-alertmanager
              servicePort: 80
5. Usage
Open the Grafana URL: http://grafana.testdomain.com:<ingress nodePort>/.
5.1 Import a dashboard
Go to Dashboards > Manage > + Import. Enter 1860 as the Grafana.com dashboard ID and click Load.
Next, give the dashboard a name, select the Prometheus data source, and click Import.
You can then open and view the imported dashboard.
5.2 Configure a dashboard via ConfigMap
Download the dashboard JSON from https://grafana.com/api/dashboards/3662/revisions/2/download:
wget https://grafana.com/api/dashboards/3662/revisions/2/download
mv download prometheus-2-0-overview_rev2.json
Replace the ${DS_THEMIS} parameter with Prometheus:
sed -i 's/${DS_THEMIS}/Prometheus/g' prometheus-2-0-overview_rev2.json
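The effect of this substitution can be checked on a small sample before touching the real file; sample.json below is a hypothetical stand-in for the downloaded dashboard JSON:

```shell
# Minimal stand-in for the dashboard JSON, which references the data
# source through the template variable ${DS_THEMIS}.
printf '{ "datasource": "${DS_THEMIS}" }\n' > sample.json

# Same substitution as above, applied to the sample file.
sed -i 's/${DS_THEMIS}/Prometheus/g' sample.json
cat sample.json   # → { "datasource": "Prometheus" }
```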
Then create the dashboard ConfigMap (kubectl apply -f prometheus-overview-dashboard-configmap.yml); prometheus-overview-dashboard-configmap.yml contains the following (note: paste in the substituted prometheus-2-0-overview_rev2.json):
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-prometheus-overview-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: '1'
data:
  prometheus-dashboard.json: |-
    <contents of the substituted prometheus-2-0-overview_rev2.json>
Once this ConfigMap is applied, the corresponding dashboard becomes available in Grafana.
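Because the dashboard JSON is large, pasting it into the ConfigMap by hand is error-prone. A hedged shell sketch that generates the manifest by indenting the JSON under the data key (dashboard.json here is a stand-in for the real substituted file):

```shell
# Stand-in for the substituted prometheus-2-0-overview_rev2.json.
printf '{"title": "Prometheus 2.0 Overview"}\n' > dashboard.json

# Emit the ConfigMap header, then the JSON indented four spaces so it
# nests under the |- block scalar.
{
  cat <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-prometheus-overview-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: '1'
data:
  prometheus-dashboard.json: |-
EOF
  sed 's/^/    /' dashboard.json
} > prometheus-overview-dashboard-configmap.yml

cat prometheus-overview-dashboard-configmap.yml
```

The generated file can then be applied with kubectl apply -f as above.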