Installing Prometheus and Grafana for monitoring

olivee ⋅ 5 years ago ⋅ 1190 reads


Unlike prometheus-operator and metrics-server, this approach uses neither the aggregation layer nor CRDs: prometheus-operator relies on CRDs, and metrics-server relies on the aggregation layer.

1. Create the monitoring namespace

kubectl create namespace monitoring

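2. Install Prometheus

Run the following install command (this is Helm 2 syntax; Helm 3 drops the --name flag and takes the release name as the first argument):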

helm install stable/prometheus  --namespace monitoring  --name prometheus

The install output looks like this:

NAME:   prometheus
LAST DEPLOYED: Tue Dec  3 14:11:08 2019
NAMESPACE: monitoring
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME                     AGE
prometheus-alertmanager  1s
prometheus-server        1s

==> v1/DaemonSet
NAME                      AGE
prometheus-node-exporter  1s

==> v1/Deployment
NAME                           AGE
prometheus-alertmanager        1s
prometheus-kube-state-metrics  1s
prometheus-pushgateway         1s
prometheus-server              1s

==> v1/PersistentVolumeClaim
NAME                     AGE
prometheus-alertmanager  1s
prometheus-server        1s

==> v1/Pod(related)
NAME                                            AGE
prometheus-alertmanager-74ffdf8bd6-xnbt4        1s
prometheus-kube-state-metrics-77757854cf-s5hkz  1s
prometheus-node-exporter-fhbl4                  1s
prometheus-node-exporter-txr7z                  1s
prometheus-pushgateway-57688d8875-wcjjq         1s
prometheus-server-5c8b68f5cd-tkv5b              1s

==> v1/Service
NAME                           AGE
prometheus-alertmanager        1s
prometheus-kube-state-metrics  1s
prometheus-node-exporter       1s
prometheus-pushgateway         1s
prometheus-server              1s

==> v1/ServiceAccount
NAME                           AGE
prometheus-alertmanager        1s
prometheus-kube-state-metrics  1s
prometheus-node-exporter       1s
prometheus-pushgateway         1s
prometheus-server              1s

==> v1beta1/ClusterRole
NAME                           AGE
prometheus-alertmanager        1s
prometheus-kube-state-metrics  1s
prometheus-pushgateway         1s
prometheus-server              1s

==> v1beta1/ClusterRoleBinding
NAME                           AGE
prometheus-alertmanager        1s
prometheus-kube-state-metrics  1s
prometheus-pushgateway         1s
prometheus-server              1s


NOTES:
The Prometheus server can be accessed via port 80 on the following DNS name from within your cluster:
prometheus-server.monitoring.svc.cluster.local


Get the Prometheus server URL by running these commands in the same shell:
  export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
  kubectl --namespace monitoring port-forward $POD_NAME 9090


The Prometheus alertmanager can be accessed via port 80 on the following DNS name from within your cluster:
prometheus-alertmanager.monitoring.svc.cluster.local


Get the Alertmanager URL by running these commands in the same shell:
  export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus,component=alertmanager" -o jsonpath="{.items[0].metadata.name}")
  kubectl --namespace monitoring port-forward $POD_NAME 9093
#################################################################################
######   WARNING: Pod Security Policy has been moved to a global property.  #####
######            use .Values.podSecurityPolicy.enabled with pod-based      #####
######            annotations                                               #####
######            (e.g. .Values.nodeExporter.podSecurityPolicy.annotations) #####
#################################################################################


The Prometheus PushGateway can be accessed via port 9091 on the following DNS name from within your cluster:
prometheus-pushgateway.monitoring.svc.cluster.local


Get the PushGateway URL by running these commands in the same shell:
  export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus,component=pushgateway" -o jsonpath="{.items[0].metadata.name}")
  kubectl --namespace monitoring port-forward $POD_NAME 9091

For more information on running Prometheus, visit:
https://prometheus.io/

The chart uses two PVCs, so first edit the PVC manifest for the server (vi prometheus-server-pvc.yaml):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: prometheus
  name: prometheus-server
  namespace: monitoring
  annotations:
    volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  volumeMode: Filesystem

Then apply it (delete the existing PVC first, then create it):

kubectl delete -f prometheus-server-pvc.yaml
kubectl create -f prometheus-server-pvc.yaml

Next, edit the PVC manifest for Alertmanager (vi prometheus-alertmanager-pvc.yaml):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: prometheus
  name: prometheus-alertmanager
  namespace: monitoring
  annotations:
    volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  volumeMode: Filesystem

Then apply it (delete the existing PVC first, then create it):

kubectl delete -f prometheus-alertmanager-pvc.yaml
kubectl create -f prometheus-alertmanager-pvc.yaml
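
The two PVC manifests differ only in the claim name, so they could also be generated from a single template. The following is a minimal sketch under that assumption; render_pvc is a hypothetical helper name, and the size and storage class defaults are simply the values used in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a PVC manifest for the given claim name.
# Size and storage class default to the values used in this post.
render_pvc() {
  local name="$1"
  local size="${2:-2Gi}"
  local storage_class="${3:-managed-nfs-storage}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: prometheus
  name: ${name}
  namespace: monitoring
  annotations:
    volume.beta.kubernetes.io/storage-class: "${storage_class}"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: ${size}
  volumeMode: Filesystem
EOF
}

# Usage (needs cluster access, so it is left commented out):
#   for claim in prometheus-server prometheus-alertmanager; do
#     render_pvc "$claim" | kubectl create -f -
#   done
```

Keeping one template avoids the two files drifting apart, for example when the storage class changes.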

Once installed, the following pods should be visible:

$ kubectl get pods -n monitoring
NAME                                             READY   STATUS
prometheus-alertmanager-5c5958dcb7-bq2fw         2/2     Running
prometheus-kube-state-metrics-76d649cdf9-v5qg5   1/1     Running
prometheus-node-exporter-j74zq                   1/1     Running
prometheus-node-exporter-x5xnq                   1/1     Running
prometheus-pushgateway-6744d69d4-27dxb           1/1     Running
prometheus-server-669b987bcd-swcxh               2/2     Running

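Where:

  • alertmanager: the alerting module.

  • kube-state-metrics: a simple service that listens to the Kubernetes API server and generates metrics for objects such as pods, deployments, and nodes. It exposes these metrics through its /metrics endpoint.

  • node-exporter: monitors the physical hosts (nodes) of the Kubernetes cluster: CPU, memory, network, disk, and other basic resources. It runs as a DaemonSet, so a monitoring agent is deployed on every node automatically.

  • pushgateway: clients push their monitoring data to the pushgateway, and Prometheus periodically scrapes the data from it. The main reasons to use it are:

    • Prometheus uses a pull model, and it may be unable to scrape some targets directly because they sit in a different subnet or behind a firewall.

    • When monitoring business data, data from different sources needs to be aggregated and then collected by Prometheus in one place.

  • server: the main Prometheus service.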
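
To make the push model concrete, here is a minimal sketch of a client pushing one sample to the pushgateway with curl. The in-cluster address comes from the chart NOTES above; the function names, job name, and metric name are made up for illustration.

```shell
#!/usr/bin/env bash
# In-cluster address taken from the chart NOTES above; override if yours differs.
PUSHGATEWAY="${PUSHGATEWAY:-http://prometheus-pushgateway.monitoring.svc.cluster.local:9091}"

# Build the push URL for a job (the job name is part of the path).
pushgateway_url() {
  printf '%s/metrics/job/%s\n' "$PUSHGATEWAY" "$1"
}

# Format one sample as "<metric_name> <value>" (text exposition format).
format_sample() {
  printf '%s %s\n' "$1" "$2"
}

# Push a single sample; requires curl and network access to the pushgateway.
push_sample() {
  local job="$1" metric="$2" value="$3"
  format_sample "$metric" "$value" \
    | curl --fail --silent --data-binary @- "$(pushgateway_url "$job")"
}

# Example (commented out because it needs a reachable pushgateway):
#   push_sample demo_batch_job demo_last_run_duration_seconds 12.5
```

Prometheus then scrapes the pushed sample from the pushgateway on its next scrape, like any other target.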

The Prometheus architecture is shown below: prometheus-arth.png

3. Install Grafana

Create a ConfigMap that points Grafana's data source at Prometheus:

kubectl apply -f config.yaml

The contents of config.yaml are as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-grafana-datasource
  namespace: monitoring
  labels:
    grafana_datasource: '1'
data:
  datasource.yaml: |-
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      access: proxy
      orgId: 1
      url: http://prometheus-server.monitoring.svc.cluster.local

Edit the values file for the Grafana Helm install (vi values.yml):

sidecar:
  datasources:
    enabled: true
    label: grafana_datasource
  dashboards:
    enabled: true
    label: grafana_dashboard

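Where:

  • sidecar.datasources.enabled allows data sources to be configured through ConfigMaps; label specifies which label a ConfigMap must carry to be read.

  • sidecar.dashboards.enabled allows dashboards to be configured through ConfigMaps; label specifies which label a ConfigMap must carry to be read.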
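
To see which ConfigMaps carry a given sidecar label, a label selector can be used. This is a small sketch; list_sidecar_configmaps is a hypothetical helper, and it only checks that the label key is present, which is an assumption about how the sidecar matches rather than something confirmed in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list ConfigMaps in a namespace that carry a label key.
# "-l <key>" selects objects on which the label exists, whatever its value.
list_sidecar_configmaps() {
  local label="$1"
  local namespace="${2:-monitoring}"
  kubectl get configmap --namespace "$namespace" -l "$label" -o name
}

# Usage (needs cluster access, so it is left commented out):
#   list_sidecar_configmaps grafana_datasource
#   list_sidecar_configmaps grafana_dashboard
```

If a ConfigMap is missing from the output, its label is the first thing to check.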

Run the Helm install command:

helm install stable/grafana -f values.yml --namespace monitoring  --name grafana

The install output looks like this:

NAME:   grafana
LAST DEPLOYED: Tue Dec  3 15:57:26 2019
NAMESPACE: monitoring
STATUS: DEPLOYED

RESOURCES:
==> v1/ClusterRole
NAME                 AGE
grafana-clusterrole  0s

==> v1/ClusterRoleBinding
NAME                        AGE
grafana-clusterrolebinding  0s

==> v1/ConfigMap
NAME          AGE
grafana       0s
grafana-test  0s

==> v1/Deployment
NAME     AGE
grafana  0s

==> v1/Pod(related)
NAME                      AGE
grafana-577d8d9c79-j5q2p  0s

==> v1/Role
NAME          AGE
grafana-test  0s

==> v1/RoleBinding
NAME          AGE
grafana-test  0s

==> v1/Secret
NAME     AGE
grafana  0s

==> v1/Service
NAME     AGE
grafana  0s

==> v1/ServiceAccount
NAME          AGE
grafana       0s
grafana-test  0s

==> v1beta1/PodSecurityPolicy
NAME          AGE
grafana       0s
grafana-test  0s

==> v1beta1/Role
NAME     AGE
grafana  0s

==> v1beta1/RoleBinding
NAME     AGE
grafana  0s


NOTES:
1. Get your 'admin' user password by running:

   kubectl get secret --namespace monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:

   grafana.monitoring.svc.cluster.local

   Get the Grafana URL to visit by running these commands in the same shell:

     export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=grafana,release=grafana" -o jsonpath="{.items[0].metadata.name}")
     kubectl --namespace monitoring port-forward $POD_NAME 3000

3. Login with the password from step 1 and the username: admin
#################################################################################
######   WARNING: Persistence is disabled!!! You will lose your data when   #####
######            the Grafana pod is terminated.                            #####
#################################################################################

Retrieve the Grafana password:

kubectl get secret \
    --namespace monitoring grafana \
    -o jsonpath="{.data.admin-password}" \
    | base64 --decode ; echo

The username for this password is: admin
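
The base64 --decode step is needed because Kubernetes stores Secret values base64-encoded. The sketch below shows the round trip on a made-up value; decode_secret_value is a hypothetical helper name, and the value is not a real password.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode one base64-encoded Secret value.
decode_secret_value() {
  printf '%s' "$1" | base64 --decode
}

# Demo with a made-up value (not a real password):
encoded="$(printf '%s' 'example-password' | base64)"
echo "encoded: $encoded"
echo "decoded: $(decode_secret_value "$encoded")"
```

Base64 is an encoding, not encryption, so anyone who can read the Secret can recover the password.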

4. Configure Ingress access

Create the Ingress for Grafana (kubectl create -f grafana-ingress.yaml). The contents of grafana-ingress.yaml are as follows:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana
  namespace: monitoring
spec:
  rules:
  - host: grafana.testdomain.com
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 80

Create the Ingress for Prometheus (kubectl create -f prometheus-ingress.yaml). The contents of prometheus-ingress.yaml are as follows:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus
  namespace: monitoring
spec:
  rules:
  - host: prometheus.testdomain.com
    http:
      paths:
      - backend:
          serviceName: prometheus-server
          servicePort: 80

Create the Ingress for Alertmanager (kubectl create -f prometheus-alert-ingress.yaml). The contents of prometheus-alert-ingress.yaml are as follows:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-alert
  namespace: monitoring
spec:
  rules:
  - host: prometheus-alert.testdomain.com
    http:
      paths:
      - backend:
          serviceName: prometheus-alertmanager
          servicePort: 80
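
If the test hostnames are not in DNS yet, the host-based routing can still be checked by sending the Host header explicitly. This is a rough sketch; check_ingress_host is a hypothetical helper, and the address in the usage comment is a placeholder for a node IP and the ingress controller's NodePort.

```shell
#!/usr/bin/env bash
# Hypothetical helper: request "/" through the ingress controller with an
# explicit Host header and print only the HTTP status code.
check_ingress_host() {
  local host="$1"
  local ingress_addr="$2"
  curl --silent --output /dev/null --write-out '%{http_code}\n' \
    --header "Host: ${host}" "http://${ingress_addr}/"
}

# Usage (placeholder address; needs network access, so it is commented out):
#   check_ingress_host grafana.testdomain.com 192.0.2.10:30080
#   check_ingress_host prometheus.testdomain.com 192.0.2.10:30080
#   check_ingress_host prometheus-alert.testdomain.com 192.0.2.10:30080
```

A 404 from the ingress controller usually means no rule matched the Host header.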

5. Usage

Open the Grafana URL: http://grafana.testdomain.com:<NodePort of the ingress controller>/.

5.1 Import a dashboard

Select Dashboards > Manage > + Import. In the Grafana.com dashboard field, enter the dashboard ID 1860 and click Load.

Next, choose a name for the dashboard, select Prometheus as the data source, and click Import.

1_ZNAdGPDA_j_lqDOsKaFIwQ.png

The dashboard page looks like this: 1_PamVZs2tcN8b8P2cUbL2qw.png

5.2 Configure a dashboard through a ConfigMap

Download the JSON: https://grafana.com/api/dashboards/3662/revisions/2/download

wget https://grafana.com/api/dashboards/3662/revisions/2/download
mv download prometheus-2-0-overview_rev2.json

Replace the DS_THEMIS placeholder with Prometheus:

sed -i 's/${DS_THEMIS}/Prometheus/g' prometheus-2-0-overview_rev2.json
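
To preview what the substitution does before editing the real file in place, the same sed expression can be run on a throwaway sample. This is a sketch only: the sample JSON is made up, replace_datasource is a hypothetical helper, and the dollar sign is escaped here so it is treated as a literal character.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the file with the placeholder replaced,
# leaving the file itself untouched (no -i).
replace_datasource() {
  sed 's/\${DS_THEMIS}/Prometheus/g' "$1"
}

# Work on a throwaway sample so the real dashboard file is not modified.
sample="$(mktemp)"
printf '%s\n' '{"datasource": "${DS_THEMIS}", "title": "demo"}' > "$sample"

replace_datasource "$sample"

# Count lines that still contain the placeholder after substitution.
remaining="$(replace_datasource "$sample" | grep -c 'DS_THEMIS' || true)"
echo "lines still containing the placeholder: $remaining"
rm -f "$sample"
```

The single quotes around the sed expression matter: they stop the shell from expanding ${DS_THEMIS} as a variable.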

Create the dashboard ConfigMap from it (kubectl apply -f prometheus-overview-dashboard-configmap.yml). The contents of prometheus-overview-dashboard-configmap.yml are as follows (remember to paste in the modified prometheus-2-0-overview_rev2.json):

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-prometheus-overview-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: '1'
data:
  prometheus-dashboard.json: |-
    <contents of the modified prometheus-2-0-overview_rev2.json>

Once this is configured, the dashboard defined by this ConfigMap is available in Grafana.
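
Pasting a large JSON file into the manifest by hand is error-prone, because every line must be indented under the block scalar. The following sketch generates the manifest instead; render_dashboard_configmap is a hypothetical helper, and the ConfigMap name and key are the ones used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the dashboard ConfigMap with the given JSON file
# embedded under the prometheus-dashboard.json key.
render_dashboard_configmap() {
  local json_file="$1"
  cat <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-prometheus-overview-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: '1'
data:
  prometheus-dashboard.json: |-
EOF
  # Indent every JSON line by four spaces so it sits inside the block scalar.
  sed 's/^/    /' "$json_file"
}

# Usage (needs cluster access, so it is left commented out):
#   render_dashboard_configmap prometheus-2-0-overview_rev2.json \
#     | kubectl apply -f -
```

Generating the manifest also makes it easy to rerun after downloading a newer dashboard revision.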