
<14> ELK Study Notes: Introducing a Kafka Cluster into the ELK Log Collection Pipeline


Once the log volume grows large enough, it becomes necessary to introduce Kafka. This post records the full configuration process.

The log collection flow, in brief, looks like this:

filebeat -> kafka -> logstash -> elasticsearch -> kibana

The versions used:

  • filebeat: 6.5.4
  • zookeeper: 3.4.10
  • kafka: kafka_2.12-2.0.0
  • logstash: 6.5.4
  • elasticsearch: 6.5.4
  • kibana: 6.5.4

It might look like filebeat should be deployed first. But once you are familiar with what each stage's component is responsible for, you can deploy in dependency order instead.

Host                 Components
192.168.3.3 (node1)  elk, zookeeper, kafka
192.168.3.4 (node2)  nginx, zookeeper, kafka
192.168.3.5 (node3)  zookeeper, kafka

1. Preparation.

  • Install some dependency packages and sync the clock.
yum -y install lrzsz vim curl wget java ntpdate && ntpdate -u cn.pool.ntp.org
  • Add a hosts entry.
echo "192.168.3.3 elk-node1" >> /etc/hosts

The Java environment is essential here. Installing from a tarball instead of yum also works, but make sure the environment variables are configured.
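
If the nodes should also resolve one another by name, the same pattern extends to all three hosts (elk-node2 and elk-node3 are assumed names following the entry above):

cat >> /etc/hosts << EOF
192.168.3.4 elk-node2
192.168.3.5 elk-node3
EOF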

  • Configure the yum repo.

Add the repo:

cat > /etc/yum.repos.d/elk.repo << EOF
[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

Import the GPG key:

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

If this step throws an error, it may be a host clock problem; sync the time and try again!

2. Kafka cluster.

Link the three hosts together to form a cluster.

1. Install ZooKeeper.

First install and configure ZooKeeper; run this on all three hosts:

wget http://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
tar xf zookeeper-3.4.10.tar.gz -C /usr/local
cd /usr/local
mv zookeeper-3.4.10/ zookeeper

Configure ZooKeeper; the configuration is identical on all three nodes, so it is listed only once:

$ egrep -v "^$|^#" conf/zoo.cfg

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs
clientPort=2181
maxClientCnxns=1000
autopurge.snapRetainCount=6
autopurge.purgeInterval=3
server.1=192.168.3.3:2888:3888
server.2=192.168.3.4:2888:3888
server.3=192.168.3.5:2888:3888

Create the corresponding directories on all three machines:

mkdir -p /usr/local/zookeeper/data
mkdir -p /usr/local/zookeeper/logs

Add the myid host identifier:

node1

echo "1" > /usr/local/zookeeper/data/myid

node2

echo "2" > /usr/local/zookeeper/data/myid

node3

echo "3" > /usr/local/zookeeper/data/myid

Add the systemd unit file on each node; the file is identical everywhere, so it is not repeated:

$ cat /lib/systemd/system/zookeeper.service

[Unit]
Description=zookeeper.service
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/zookeeper/bin/zkServer.sh start-foreground
PrivateTmp=true

[Install]
WantedBy=multi-user.target

Start ZooKeeper:

systemctl start zookeeper
systemctl status zookeeper
systemctl enable zookeeper
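
Once all three nodes are up, a quick sanity check is to ask each node for its role; one should report leader and the other two follower:

/usr/local/zookeeper/bin/zkServer.sh status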

2. Install Kafka.

Then install Kafka, performing the following on all three hosts:

$ wget https://archive.apache.org/dist/kafka/2.0.0/kafka_2.12-2.0.0.tgz
$ tar xf kafka_2.12-2.0.0.tgz -C /usr/local
$ cd /usr/local
$ mv kafka_2.12-2.0.0/ kafka

The three configuration files differ only slightly, so node1's file is listed in full, and for node2 and node3 only the node-specific lines are shown:

node1

$ cat config/server.properties

#------------------------------------
# eryajf Kafka Config
#------------------------------------

# Common configuration
broker.id=19
port=9092
host.name=192.168.3.3
log.dirs=/usr/local/kafka/logs/kafka
listeners=PLAINTEXT://192.168.3.3:9092
advertised.host.name=192.168.3.3

# Log configuration
num.partitions=6
num.recovery.threads.per.data.dir=1
message.max.bytes=1000000
auto.create.topics.enable=true
auto.leader.rebalance.enable=true
compression.type=snappy
log.index.interval.bytes=4096
log.index.size.max.bytes=10485760
log.retention.hours=336
log.flush.interval.ms=10000
log.flush.interval.messages=20000
log.flush.scheduler.interval.ms=2000
log.roll.hours=336
log.retention.check.interval.ms=300000
log.segment.bytes=1073741824

# Replication configurations
num.replica.fetchers=2
replica.fetch.max.bytes=1048576
replica.fetch.wait.max.ms=500
replica.high.watermark.checkpoint.interval.ms=5000
replica.socket.timeout.ms=30000
replica.socket.receive.buffer.bytes=65536
replica.lag.time.max.ms=10000
controller.socket.timeout.ms=30000
controller.message.queue.size=10

# Socket server configuration
num.io.threads=8
num.network.threads=4
socket.request.max.bytes=104857600
socket.receive.buffer.bytes=1048576
socket.send.buffer.bytes=1048576
queued.max.requests=100
fetch.purgatory.purge.interval.requests=100
producer.purgatory.purge.interval.requests=100

# Topic configuration
delete.topic.enable=true

# ZK configuration
zookeeper.connect=192.168.3.4:2181,192.168.3.5:2181,192.168.3.3:2181
zookeeper.connection.timeout.ms=6000
zookeeper.sync.time.ms=2000

max.poll.interval.ms=300000
session.timeout.ms=300000

node2 (only the node-specific lines are shown; everything else matches node1)

broker.id=20
host.name=192.168.3.4
listeners=PLAINTEXT://192.168.3.4:9092
advertised.host.name=192.168.3.4

node3 (likewise, only the differing lines)

broker.id=21
host.name=192.168.3.5
listeners=PLAINTEXT://192.168.3.5:9092
advertised.host.name=192.168.3.5

Create the log directory on each node:

$ mkdir -p /usr/local/kafka/logs/kafka

Configure the systemd unit file:

$ cat /lib/systemd/system/kafka.service

[Unit]
Description=kafka.service
After=network.target remote-fs.target zookeeper.service

[Service]
Type=simple
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties

[Install]
WantedBy=multi-user.target

Start:

systemctl start kafka
systemctl status kafka
systemctl enable kafka
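
To confirm that all three brokers registered, you can list the broker IDs in ZooKeeper; with the configs above this should return [19, 20, 21]:

/usr/local/kafka/bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids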

List the topics:

$ cat list.sh
bin/kafka-topics.sh --list --zookeeper localhost:2181

$ chmod +x list.sh

$ ./list.sh
__consumer_offsets
nginx-access

View a topic's contents:

$ cat topic.sh
bin/kafka-console-consumer.sh --bootstrap-server  localhost:9092 --topic $1  --from-beginning

$ chmod +x topic.sh

$ ./topic.sh nginx-access  # shows the messages in the given topic
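
To verify the produce path as well, a test message can be pushed from any broker (the topic name matches the one filebeat will use later):

echo "hello kafka" | /usr/local/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic nginx-access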

3. Configure a front proxy.

In production you would put a proper load balancer in front; since this is a test setup, NGINX handles the proxying. This is a layer-4 (TCP) proxy, so the following configuration must be placed outside the http block:

stream {
    upstream test_codis_proxy_1 {
        server 192.168.3.3:9092 max_fails=3 fail_timeout=2s weight=10;
        server 192.168.3.4:9092 max_fails=3 fail_timeout=2s weight=10;
        server 192.168.3.5:9092 max_fails=3 fail_timeout=2s weight=10;
    }
    server {
        listen 9095;
        proxy_connect_timeout 3s;
        proxy_timeout 60s;
        proxy_buffer_size 16k;
        proxy_pass test_codis_proxy_1;
    }
}

Once configured, it is ready to use.
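
A quick way to verify the proxy is to consume through it. Keep in mind that Kafka clients only bootstrap through this address and then connect to the brokers' advertised listeners directly, so the brokers themselves must stay reachable from the client:

/usr/local/kafka/bin/kafka-console-consumer.sh --bootstrap-server 192.168.3.4:9095 --topic nginx-access --from-beginning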

3. filebeat.

Install the package:

yum -y install filebeat-6.5.4

Minimal configuration:

$ cd /etc/filebeat
$ cat filebeat.yml

filebeat.config_dir: /etc/filebeat/conf.d/
filebeat.shutdown_timeout: 5s
fields_under_root: true
fields:
  ip: "10.3.2.12"
  groups: nginx

output.kafka:
  enabled: true
  hosts: ["192.168.3.4:9095"]
  topic: "%{[log_topic]}"
  worker: 2
  max_retries: 3
  bulk_max_size: 2048
  timeout: 30s
  broker_timeout: 10s
  channel_buffer_size: 256
  keep_alive: 30
  compression: gzip
  max_message_bytes: 1000000
  required_acks: 1

logging.level: warning
#logging.level: debug

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

All log-collection config files live under /etc/filebeat/conf.d/; as an example, here is one that collects NGINX logs (a matching NGINX log_format sketch follows the config).

$ mkdir conf.d

$ cat conf.d/nginx.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /data/log/tmp.log
  fields_under_root: true
  json.keys_under_root: true
  json.overwrite_keys: true
  json.message_key: message
  ignore_older: 4h
  scan_frequency: 10s
  clean_inactive: 5h
  close_inactive: 4h
  clean_removed: true
  close_removed: true
  close_renamed: true
  tail_files: true
  fields:
    type: log
    log_topic: nginx-access
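
The input above assumes the file already contains one JSON object per line (hence json.keys_under_root and friends). Here is a minimal NGINX log_format sketch that would produce such a file, assuming nginx 1.11.8+ for escape=json; the remote_addr field matches what the logstash geoip filter reads later:

log_format json escape=json '{"@timestamp":"$time_iso8601",'
                            '"remote_addr":"$remote_addr",'
                            '"request":"$request",'
                            '"status":"$status",'
                            '"body_bytes_sent":"$body_bytes_sent",'
                            '"http_user_agent":"$http_user_agent"}';
access_log /data/log/tmp.log json;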

Then start the service:

systemctl restart filebeat
systemctl status filebeat
systemctl enable filebeat
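
Filebeat can also validate its own configuration before a restart (a test output subcommand exists as well, though not every output type supports it):

filebeat test config -c /etc/filebeat/filebeat.yml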

4. logstash.

Install the package:

yum -y install logstash-6.5.4

Configure it:

$ cat /etc/logstash/logstash.yml

pipeline.workers: 2
pipeline.output.workers: 2
# number of events sent per batch
pipeline.batch.size: 800

http.host: "0.0.0.0"
log.level: warn
path.logs: /var/log/logstash
xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch.url: ["http://192.168.3.3:9200"]
xpack.monitoring.collection.interval: 10s
slowlog.threshold.warn: 2s
slowlog.threshold.info: 1s
slowlog.threshold.debug: 500ms
slowlog.threshold.trace: 100ms

Add the NGINX log pipeline config:

$ cat /etc/logstash/conf.d/nginx.yml

input {
  kafka {
      bootstrap_servers  => "192.168.3.4:9095"
      group_id          => "nginx"
      consumer_threads => 6
      topics            => "nginx-access"
      codec             => "json"
      client_id => "nginx"
   }
}

filter {
    geoip {
        source => "remote_addr"
        fields => ["city_name", "country_code2", "country_name", "latitude", "longitude", "region_name"]
        remove_field => ["[geoip][latitude]", "[geoip][longitude]"]
    }
    json {
        source => "message"
        target => "jsoncontent"
    }
}

output {
   elasticsearch {
      hosts => ["http://192.168.3.3:9200"]
      index => "logstash-nginx-%{+YYYY.MM}"
   }
}

Reference it from the pipelines file:

$ cat /etc/logstash/pipelines.yml

# This file is where you define your pipelines. You can define multiple.
# For more information on multiple pipelines, see the documentation:
#   https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html

- pipeline.id: nginx
  path.config: "/etc/logstash/conf.d/nginx.yml"
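
Before starting, the pipeline definitions can be syntax-checked; the path below assumes the standard RPM layout:

/usr/share/logstash/bin/logstash --path.settings /etc/logstash -t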

Start the service:

systemctl start logstash
systemctl status logstash
systemctl enable logstash

5. elasticsearch.

Install the package:

yum -y install elasticsearch-6.5.4

Configure it:

$ cat /etc/elasticsearch/elasticsearch.yml

cluster.name: my-application
node.name: node-1
path.data: /logs/elasticsearch6
path.logs: /logs/elasticsearch6/log
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["elk-node1"]
discovery.zen.minimum_master_nodes: 1
xpack.security.enabled: false

Create the corresponding directories:

mkdir -p /logs/elasticsearch6/log
cd /logs
chown -R elasticsearch.elasticsearch elasticsearch6/

Start:

systemctl start elasticsearch
systemctl status elasticsearch
systemctl enable elasticsearch
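
A quick health check; with a single data node, a status of green or yellow both indicate the node is up:

curl -s http://192.168.3.3:9200/_cluster/health?pretty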

6. kibana.

Install the package:

yum -y install kibana-6.5.4

Configure it:

$ cat /etc/kibana/kibana.yml

server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://192.168.3.3:9200"
kibana.index: ".kibana"

A note on the configuration: after writing the settings above into kibana.yml and starting the service, the status looked normal, but every page request returned "Kibana server is not ready yet". This seems to be a classic error, yet one that gives you little to go on. After some testing, I picked up a small lesson.

Namely: for this file, don't wipe the original config and replace it wholesale with the lines above. That approach caused no trouble for elasticsearch and logstash earlier, but it does not seem to sit well with Kibana. If you are already stuck on that error, my suggestion is to uninstall Kibana, reinstall it, and then edit just the four settings above in place in the stock config file. Start Kibana, wait two or three minutes before visiting it, and the problem magically disappears.

Start:

systemctl start kibana
systemctl status kibana
systemctl enable kibana

Then open Kibana in a browser, and you should see the corresponding index and its logs.
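
If the index does not appear, you can check on the elasticsearch side whether logstash has created it:

curl -s 'http://192.168.3.3:9200/_cat/indices?v'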

7. Two plugins.

In daily use, two plugins make management and maintenance easier: one for Kafka administration, the other for ES monitoring.

Both are deployed directly with Docker.

1. kafka-manager

$ cat docker-compose.yml
version: '2'
services:
  kafka-manager:
    image: sheepkiller/kafka-manager                ## open-source web UI for managing Kafka clusters
    environment:
        ZK_HOSTS: 192.168.3.3                   ## change to the host IP
    ports:
      - "9000:9000"                                 ## exposed port

Once it is running, access it in the browser.

2. ES monitoring.

docker run -d -p 9001:9000 lmenezes/cerebro

Then access it in the browser (port 9001).

Note:

  • Mind the Kafka version compatibility of the logstash kafka input plugin: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html
