<8> Prometheus Study Notes: Monitoring an elasticsearch Cluster with Prometheus

Prometheus | eryajf | 6 months ago (05-02)

Prometheus monitors es through an exporter, the same approach used for other services.

  • Project page:
    • elasticsearch_exporter:https://github.com/justwatchcom/elasticsearch_exporter

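1. Installation and deployment

The existing es cluster has three nodes; the environment is roughly as follows:

Host                   Components
10.3.6.30 (es-node1)   es, nginx
10.3.6.125 (es-node2)  es
10.3.6.124 (es-node3)  es, kibana

Then perform the following steps on each of the three hosts above: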

wget https://github.com/justwatchcom/elasticsearch_exporter/releases/download/v1.1.0/elasticsearch_exporter-1.1.0.linux-amd64.tar.gz

tar xf elasticsearch_exporter-1.1.0.linux-amd64.tar.gz

mv elasticsearch_exporter-1.1.0.linux-amd64 /usr/local/elasticsearch_exporter

Start the exporter:

nohup /usr/local/elasticsearch_exporter/elasticsearch_exporter --web.listen-address ":9109" --es.uri http://10.3.6.30:9200 &

Manage it with systemd:

cat /lib/systemd/system/es_exporter.service

[Unit]
Description=The es_exporter
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/elasticsearch_exporter/elasticsearch_exporter --web.listen-address ":9109" --es.uri http://127.0.0.1:9200
Restart=on-failure

[Install]
WantedBy=multi-user.target

Start the service:

systemctl daemon-reload
systemctl start es_exporter

View the metrics:

curl 127.0.0.1:9109/metrics
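
The full metrics output is long, so a quick sanity check is to pull out just the cluster health color. The sketch below assumes the exporter exposes `elasticsearch_cluster_health_status` with a `color` label, as the 1.x releases appear to; `cluster_color` is a hypothetical helper name, not part of the exporter.

```shell
# cluster_color: read Prometheus text-format metrics on stdin and print the
# color whose elasticsearch_cluster_health_status sample has the value 1.
# (Helper name is illustrative; the metric name is an assumption about exporter 1.x.)
cluster_color() {
  awk '/^elasticsearch_cluster_health_status\{/ && $NF == 1 {
         if (match($0, /color="[a-z]+"/)) {
           # skip the 7 characters of color=" and drop the closing quote
           print substr($0, RSTART + 7, RLENGTH - 8)
         }
       }'
}

# Usage on a monitored host:
#   curl -s 127.0.0.1:9109/metrics | cluster_color
```

If this prints nothing, the exporter is probably not reaching es; check the `--es.uri` value first.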

2. Configure prometheus.yml to add the scrape target

$ vim /usr/local/prometheus/prometheus.yml

  - job_name: 'elasticsearch'
    scrape_interval: 60s
    scrape_timeout:  30s
    metrics_path: "/metrics"
    static_configs:
    - targets:
      - '10.3.0.41:9109'
      labels:
        service: elasticsearch

Restart the service.

$ systemctl restart prometheus

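Or hot-reload the configuration with the following command (this requires Prometheus to be started with --web.enable-lifecycle):

curl -XPOST localhost:9090/-/reload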
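
A broken prometheus.yml makes a restart fail, and a reload simply keeps the old configuration, so it can help to validate the file first. The following is a minimal sketch, assuming `promtool` (shipped in the Prometheus tarball) is on the PATH; `safe_reload` is a hypothetical helper name, and the default path and URL are taken from the commands above.

```shell
# safe_reload: validate the config file first, then ask Prometheus to reload.
# Returns non-zero if validation fails or the reload does not answer HTTP 200.
# (Helper name is illustrative; promtool is assumed to be on the PATH.)
safe_reload() {
  local config="${1:-/usr/local/prometheus/prometheus.yml}"
  local base_url="${2:-http://localhost:9090}"
  local code

  # Refuse to reload a file that does not pass validation.
  promtool check config "$config" || return 1

  # -w prints only the HTTP status code; the response body is discarded.
  code=$(curl -s -o /dev/null -w '%{http_code}' -X POST "${base_url}/-/reload")
  [ "$code" = "200" ]
}

# Usage:
#   safe_reload && echo "reloaded"
```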

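5. Configure the Grafana dashboard template

The dashboard is imported from a JSON file, which is included in the extracted package.

Reference: https://shenshengkun.github.io/posts/550bdf86.html

Alternatively, import it by dashboard ID: 2322, among others.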

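6. Starting the exporter with authentication enabled

If authentication is enabled on es, the username and password must be passed in at startup:

elasticsearch_exporter --web.listen-address ":9308" --es.uri http://username:password@192.168.10.3:9200 &

The credentials used here are those of the monitoring user.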

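Besides starting it from the command line, the exporter can also be run as a managed service, as above (here through an init script rather than a systemd unit). The authentication arguments go into the following file: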

$ cat /etc/default/elasticsearch_exporter
EXPORTER_ARGS="--es.uri=http://username:password@192.168.10.3:9200"
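
A password containing characters such as `@`, `:` or `/` will break the URI unless it is percent-encoded. The sketch below builds the `--es.uri` value in plain POSIX shell; `urlencode` and `build_es_uri` are hypothetical helper names, and it assumes the exporter decodes percent-encoded credentials the way standard URL parsers do.

```shell
# urlencode: percent-encode every byte that is not an RFC 3986 unreserved
# character (A-Z a-z 0-9 - . _ ~). Works byte by byte through od.
urlencode() {
  printf '%s' "$1" | od -An -v -tx1 | tr -s ' ' '\n' | while read -r hex; do
    [ -n "$hex" ] || continue
    case "$hex" in
      3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a]|2d|2e|5f|7e)
        # unreserved byte: print it back as the literal character
        printf "\\$(printf '%03o' "0x$hex")" ;;
      *)
        # anything else becomes %HH with uppercase hex digits
        printf '%%%s' "$(printf '%s' "$hex" | tr 'a-f' 'A-F')" ;;
    esac
  done
}

# build_es_uri USER PASSWORD HOST:PORT -> http://user:password@host:port
build_es_uri() {
  printf 'http://%s:%s@%s\n' "$(urlencode "$1")" "$(urlencode "$2")" "$3"
}

# Usage:
#   build_es_uri monitoring 'p@ss:w/rd' 192.168.10.3:9200
```

Since the file holds a password, restricting it with `chmod 600 /etc/default/elasticsearch_exporter` is a sensible precaution.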

Then wrap the startup logic in the following init script:

$ cat /etc/init.d/elasticsearch_exporter
#!/bin/bash
# bash is required: the script uses &>, ${IP//./} and local, which plain sh may not support

# chkconfig: 2345 60 20
# description: elasticsearch_exporter

NAME=elasticsearch_exporter
SCRIPT="/usr/bin/${NAME}"
PIDFILE="/var/run/${NAME}.pid"
LOGFILE="/var/log/${NAME}.log"
ENVFILE="/etc/default/${NAME}"
USER="root"

URL='http://192.10.10.1'
EXPORTER_NAME=$NAME
EXPORTER_PORT="9114"

# Get the local IP address
IP=$(grep "IPADDR" /etc/sysconfig/network-scripts/ifcfg-eth0 | grep -Eo "([0-9]{1,3}\.){3}[0-9]{1,3}")


register_exporter() {
    json_data='{"service_id":"'${EXPORTER_NAME}${IP//./}'","job":"'${EXPORTER_NAME}'","ip":"'${IP}'","port":"'$EXPORTER_PORT'","tags":"","meta": {"hostname": "'$(hostname)'"}}'

    # Redirect stdout first, then stderr, so that both are actually discarded
    curl --connect-timeout 2 -s -X POST -H "Content-type: application/json" -d "${json_data}" "${URL}" > /dev/null 2>&1
}

start() {
  if [ -f "${PIDFILE}" ] && kill -0 $(cat "${PIDFILE}") &> /dev/null; then
    echo "${NAME} already running with PID $(cat ${PIDFILE})" >&2
    return 1
  fi

  echo "Starting ${NAME}" >&2
  . "${ENVFILE}"
  CMD="${SCRIPT} --web.listen-address=${IP}:${EXPORTER_PORT} --log.level=error ${EXPORTER_ARGS}"
  su - "${USER}" -c "${CMD} &> ${LOGFILE} & echo \$! > ${PIDFILE}"
  # echo "${NAME} started with PID $(cat ${PIDFILE})" >&2
  sleep 1

  if [ -f "${PIDFILE}" ] && kill -0 $(cat "${PIDFILE}") &> /dev/null; then
    echo "${NAME} started successfully." >&2
    register_exporter
  else
    echo "${NAME} was not started OK"
    return 1
  fi
}

stop() {
  if [ ! -f "$PIDFILE" ] || ! kill -0 $(cat "$PIDFILE") &> /dev/null; then
    echo "${NAME} not running" >&2
    return 1
  fi
  echo "Stopping ${NAME}..." >&2
  kill -15 $(cat "$PIDFILE")
  rm -f "$PIDFILE"
  echo "${NAME} stopped" >&2
}

status() {
  if [ ! -f "$PIDFILE" ] || ! kill -0 $(cat "$PIDFILE") &> /dev/null; then
    echo "${NAME} is not running" >&2
  else
    echo "${NAME} is running" >&2
  fi

}

uninstall() {
  echo -n "Are you really sure you want to uninstall ${NAME}? That cannot be undone. [yes|No] "
  local SURE
  read SURE
  if [ "$SURE" = "yes" ]; then
    stop
    rm -f "$PIDFILE"
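    echo "Notice: log file was not removed: '$LOGFILE'" >&2
    # Deregister the init script (chkconfig on RHEL-style hosts, update-rc.d on Debian-style ones)
    chkconfig --del "${NAME}" 2>/dev/null || update-rc.d -f "${NAME}" remove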
    rm -fv "$0"
  fi
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  uninstall)
    uninstall
    ;;
  restart)
    stop
    start
    ;;
  status)
    status
    ;;
  register)
    register_exporter
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|status|register|uninstall}"
esac

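With this script, the service registers itself with the central registry as soon as it starts, so the scrape target no longer has to be added to the configuration by hand.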
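
To see what `register_exporter` actually sends, the payload can be built and printed without posting it. This is a minimal sketch that mirrors the JSON assembled in the script; `build_payload` is a hypothetical helper name, and the registry behind `$URL` is specific to the author's environment.

```shell
# build_payload NAME IP PORT HOSTNAME
# Prints the same JSON document that register_exporter posts to the registry.
# (Helper name is illustrative; field names are copied from the init script.)
build_payload() {
  # service_id is the exporter name followed by the IP with its dots removed
  service_id="$1$(printf '%s' "$2" | tr -d '.')"
  printf '{"service_id":"%s","job":"%s","ip":"%s","port":"%s","tags":"","meta": {"hostname": "%s"}}\n' \
    "$service_id" "$1" "$2" "$3" "$4"
}

# Usage:
#   build_payload elasticsearch_exporter 10.3.6.30 9114 "$(hostname)"
```

One thing to keep in mind with this scheme: removing the dots can make two different addresses produce the same service_id (10.3.6.30 and 10.36.3.0 both give 103630), so it is only safe when the address plan rules out such collisions.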


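二丫讲梵. All rights reserved. Content is original unless otherwise noted. This site is licensed under BY-NC-SA; when reposting, please credit <8> Prometheus Study Notes: Monitoring an elasticsearch Cluster with Prometheus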