SpringCloud微服务实战教程中,如何实现企业级ELK日志采集与分布式架构配置?

2026-05-27 15:021阅读0评论SEO资源
  • 内容介绍
  • 文章标签
  • 相关推荐

本文共计6294个文字,预计阅读时间需要26分钟。

SpringCloud微服务实战教程中,如何实现企业级ELK日志采集与分布式架构配置?

一套优秀的日志分析系统可以详细记录系统的运行情况,便于我们定位分析系统性能瓶颈、查找定位系统问题。以下一篇讲述了日志的多种业务场景及日志记录的实现方式。那么,如何记录这些日志呢?

  一套好的日志分析系统可以详细记录系统的运行情况,方便我们定位分析系统性能瓶颈、查找定位系统问题。上一篇说明了日志的多种业务场景以及日志记录的实现方式,那么日志记录下来,相关人员就需要对日志数据进行处理与分析,基于E(ElasticSearch)L(Logstash)K(Kibana)组合的日志分析系统可以说是目前各家公司普遍的首选方案。

  • Elasticsearch: 分布式、RESTful 风格的搜索和数据分析引擎,可快速存储、搜索、分析海量的数据。在ELK中用于存储所有日志数据。

  • Logstash: 开源的数据采集引擎,具有实时管道传输功能。Logstash 能够将来自单独数据源的数据动态集中到一起,对这些数据加以标准化并传输到您所选的地方。在ELK中用于将采集到的日志数据进行处理、转换然后存储到Elasticsearch。

  • Kibana: 免费且开放的用户界面,能够让您对 Elasticsearch 数据进行可视化,并让您在 Elastic Stack 中进行导航。您可以进行各种操作,从跟踪查询负载,到理解请求如何流经您的整个应用,都能轻松完成。在ELK中用于通过界面展示存储在Elasticsearch中的日志数据。

  作为微服务集群,必须要考虑当微服务访问量暴增时的高并发场景,此时系统的日志数据同样是爆发式增长,我们需要通过消息队列做流量削峰处理,Logstash官方提供Redis、Kafka、RabbitMQ等输入插件。Redis虽然可以用作消息队列,但其各项功能显示不如单一实现的消息队列,所以通常情况下并不使用它的消息队列功能;Kafka的性能要优于RabbitMQ,通常在日志采集,数据采集时使用较多,所以这里我们采用Kafka实现消息队列功能。
  ELK日志分析系统中,数据传输、数据保存、数据展示、流量削峰功能都有了,还少一个组件,就是日志数据的采集,虽然log4j2可以将日志数据发送到Kafka,甚至可以将日志直接输入到Logstash,但是基于系统设计解耦的考虑,业务系统运行不会影响到日志分析系统,同时日志分析系统也不会影响到业务系统,所以,业务只需将日志记录下来,然后由日志分析系统去采集分析即可,Filebeat是ELK日志系统中常用的日志采集器,它是 Elastic Stack 的一部分,因此能够与 Logstash、Elasticsearch 和 Kibana 无缝协作。

  • Kafka: 高吞吐量的分布式发布订阅消息队列,主要应用于大数据的实时处理。

  • Filebeat: 轻量型日志采集器。在 Kubernetes、Docker 或云端部署中部署 Filebeat,即可获得所有的日志流:信息十分完整,包括日志流的 pod、容器、节点、VM、主机以及自动关联时用到的其他元数据。此外,Beats Autodiscover 功能可检测到新容器,并使用恰当的 Filebeat 模块对这些容器进行自适应监测。

软件下载:

  因经常遇到在内网搭建环境的问题,所以这里习惯使用下载软件包的方式进行安装,虽没有使用Yum、Docker等安装方便,但是可以对软件目录、配置信息等有更深的了解,在后续采用Yum、Docker等方式安装时,也能清楚安装了哪些东西,安装配置的文件是怎样的,即使出现问题,也可以快速的定位解决。

Elastic Stack全家桶下载主页: www.elastic.co/cn/downloads/

我们选择如下版本:

  • Elasticsearch8.0.0,下载地址:artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.0.0-linux-x86_64.tar.gz

  • Logstash8.0.0,下载地址:artifacts.elastic.co/downloads/logstash/logstash-8.0.0-linux-x86_64.tar.gz

  • Kibana8.0.0,下载地址:artifacts.elastic.co/downloads/kibana/kibana-8.0.0-linux-x86_64.tar.gz

  • Filebeat8.0.0,下载地址:artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.0.0-linux-x86_64.tar.gz

Kafka下载:

  • Kafka3.1.0,下载地址:dlcdn.apache.org/kafka/3.1.0/kafka_2.13-3.1.0.tgz
安装配置:

  安装前先准备好三台CentOS7服务器用于集群安装,这是IP地址为:172.16.20.220、172.16.20.221、172.16.20.222,然后将上面下载的软件包上传至三台服务器的/usr/local目录。因服务器资源有限,这里所有的软件都安装在这三台集群服务器上,在实际生产环境中,请根据业务需求设计规划进行安装。
  在集群搭建时,如果能够编写shell安装脚本就会很方便,如果不能编写,就需要在每台服务器上执行安装命令,多数ssh客户端提供了多会话同时输入的功能,这里一些通用安装命令可以选择启用该功能。

一、安装Elasticsearch集群 1、Elasticsearch是使用Java语言开发的,所以需要在环境上安装jdk并配置环境变量。
  • 下载jdk软件包安装,www.oracle.com/java/technologies/downloads/#java8

新建/usr/local/java目录

mkdir /usr/local/java

将下载的jdk软件包jdk-8u64-linux-x64.tar.gz上传到/usr/local/java目录,然后解压

tar -zxvf jdk-8u77-linux-x64.tar.gz

配置环境变量/etc/profile

vi /etc/profile

在底部添加以下内容

JAVA_HOME=/usr/local/java/jdk1.8.0_64 PATH=$JAVA_HOME/bin:$PATH CLASSPATH=$JAVA_HOME/jre/lib/ext:$JAVA_HOME/lib/tools.jar export PATH JAVA_HOME CLASSPATH

使环境变量生效

source /etc/profile

  • 另外一种十分快捷的方式,如果不是内网环境,可以直接使用命令行安装,这里安装的是免费版本的openjdk

yum install java-1.8.0-openjdk* -y 2、安装配置Elasticsearch

  • 进入/usr/local目录,解压Elasticsearch安装包,请确保执行命令前已将环境准备时的Elasticsearch安装包上传至该目录。

tar -zxvf elasticsearch-8.0.0-linux-x86_64.tar.gz

  • 重命名文件夹

mv elasticsearch-8.0.0 elasticsearch

  • elasticsearch不能使用root用户运行,这里创建运行elasticsearch的用户组和用户

# 创建用户组 groupadd elasticsearch # 创建用户并添加至用户组 useradd elasticsearch -g elasticsearch # 更改elasticsearch密码,设置一个自己需要的密码,这里设置为和用户名一样:El12345678 passwd elasticsearch

  • 新建elasticsearch数据和日志存放目录,并给elasticsearch用户赋权限

mkdir -p /data/elasticsearch/data mkdir -p /data/elasticsearch/log chown -R elasticsearch:elasticsearch /data/elasticsearch/* chown -R elasticsearch:elasticsearch /usr/local/elasticsearch/*

  • elasticsearch默认启用了x-pack,集群通信需要进行安全认证,所以这里需要用到SSL证书。注意:这里生成证书的命令只在一台服务器上执行,执行之后copy到另外两台服务器的相同目录下。

# 提示输入密码时,直接回车 ./elasticsearch-certutil ca -out /usr/local/elasticsearch/config/elastic-stack-ca.p12 # 提示输入密码时,直接回车 ./elasticsearch-certutil cert --ca /usr/local/elasticsearch/config/elastic-stack-ca.p12 -out /usr/local/elasticsearch/config/elastic-certificates.p12 -pass "" # 如果使用root用户生成的证书,记得给elasticsearch用户赋权限 chown -R elasticsearch:elasticsearch /usr/local/elasticsearch/config/elastic-certificates.p12

  • 设置密码,这里在出现输入密码时,所有的都是输入的123456

./elasticsearch-setup-passwords interactive Enter password for [elastic]: Reenter password for [elastic]: Enter password for [apm_system]: Reenter password for [apm_system]: Enter password for [kibana_system]: Reenter password for [kibana_system]: Enter password for [logstash_system]: Reenter password for [logstash_system]: Enter password for [beats_system]: Reenter password for [beats_system]: Enter password for [remote_monitoring_user]: Reenter password for [remote_monitoring_user]: Changed password for user [apm_system] Changed password for user [kibana_system] Changed password for user [kibana] Changed password for user [logstash_system] Changed password for user [beats_system] Changed password for user [remote_monitoring_user] Changed password for user [elastic]

  • 修改elasticsearch配置文件

vi /usr/local/elasticsearch/config/elasticsearch.yml

# 修改配置 # 集群名称 cluster.name: log-elasticsearch # 节点名称 node.name: node-1 # 数据存放路径 path.data: /data/elasticsearch/data # 日志存放路径 path.logs: /data/elasticsearch/log # 当前节点IP network.host: 192.168.60.201 # 对外端口 172.16.20.220:9200/
172.16.20.221:9200/
172.16.20.222:9200/

  • 正常运行没有问题后,Ctrl+c关闭服务,然后使用后台启动命令
  • ./elasticsearch -d

    备注:后续可通过此命令停止elasticsearch运行

    # 查看进程id ps -ef | grep elastic # 关闭进程 kill -9 1376(进程id) 3、安装ElasticSearch界面管理插件elasticsearch-head,只需要在一台服务器上安装即可,这里我们安装到172.16.20.220服务器上

    • 配置nodejs环境
      下载地址: (nodejs.org/dist/v16.14.0/node-v16.14.0-linux-x64.tar.xz)[nodejs.org/dist/v16.14.0/node-v16.14.0-linux-x64.tar.xz],将node-v16.14.0-linux-x64.tar.xz上传到服务器172.16.20.220的/usr/local目录

    # 解压 tar -xvJf node-v16.14.0-linux-x64.tar.xz # 重命名 mv node-v16.14.0-linux-x64 nodejs # 配置环境变量 vi /etc/profile # 新增以下内容 export NODE_HOME=/usr/local/nodejs PATH=$JAVA_HOME/bin:$NODE_HOME/bin:/usr/local/mysql/bin:/usr/local/subversion/bin:$PATH export PATH JAVA_HOME NODE_HOME JENKINS_HOME CLASSPATH # 使配置生效 source /etc/profile # 测试是否配置成功 node -v

    • 配置elasticsearch-head
      项目开源地址:github.com/mobz/elasticsearch-head
      zip包下载地址:github.com/mobz/elasticsearch-head/archive/master.zip
      下载后上传至172.16.20.220的/usr/local目录,然后进行解压安装

    # 解压 unzip elasticsearch-head-master.zip # 重命名 mv elasticsearch-head-master elasticsearch-head # 进入到elasticsearch-head目录 cd elasticsearch-head #切换软件源,可以提升安装速度 npm config set registry registry.npm.taobao.org # 执行安装命令 npm install -g npm@8.5.1 npm install phantomjs-prebuilt@2.1.16 --ignore-scripts npm install # 启动命令 npm run start

    • 浏览器访问172.16.20.220:9100/?auth_user=elastic&auth_password=123456 ,需要加上我们上面设置的用户名密码,就可以看到我们的Elasticsearch集群状态了。

    SpringCloud微服务实战教程中,如何实现企业级ELK日志采集与分布式架构配置?

    二、安装Kafka集群
    • 环境准备:

      新建kafka的日志目录和zookeeper数据目录,因为这两项默认放在tmp目录,而tmp目录中内容会随重启而丢失,所以我们自定义以下目录:

    mkdir /data/zookeeper mkdir /data/zookeeper/data mkdir /data/zookeeper/logs mkdir /data/kafka mkdir /data/kafka/data mkdir /data/kafka/logs

    • zookeeper.properties配置

    vi /usr/local/kafka/config/zookeeper.properties

    修改如下:

    # 修改为自定义的zookeeper数据目录 dataDir=/data/zookeeper/data # 修改为自定义的zookeeper日志目录 dataLogDir=/data/zookeeper/logs # 端口 clientPort=2181 # 注释掉 #maxClientCnxns=0 # 设置连接参数,添加如下配置 # 为zk的基本时间单元,毫秒 tickTime=2000 # Leader-Follower初始通信时限 tickTime*10 initLimit=10 # Leader-Follower同步通信时限 tickTime*5 syncLimit=5 # 设置broker Id的服务地址,本机ip一定要用0.0.0.0代替 server.1=0.0.0.0:2888:3888 server.2=172.16.20.221:2888:3888 server.3=172.16.20.222:2888:3888

    • 在各台服务器的zookeeper数据目录/data/zookeeper/data添加myid文件,写入服务broker.id属性值

    在data文件夹中新建myid文件,myid文件的内容为1(一句话创建:echo 1 > myid)

    cd /data/zookeeper/data vi myid #添加内容:1 其他两台主机分别配置 2和3 1

    • kafka配置,进入config目录下,修改server.properties文件

    vi /usr/local/kafka/config/server.properties

    # 每台服务器的broker.id都不能相同 broker.id=1 # 是否可以删除topic delete.topic.enable=true # topic 在当前broker上的分片个数,与broker保持一致 num.partitions=3 # 每个主机地址不一样: listeners=PLAINTEXT://172.16.20.220:9092 advertised.listeners=PLAINTEXT://172.16.20.220:9092 # 具体一些参数 log.dirs=/data/kafka/kafka-logs # 设置zookeeper集群地址与端口如下: zookeeper.connect=172.16.20.220:2181,172.16.20.221:2181,172.16.20.222:2181

    • Kafka启动

    kafka启动时先启动zookeeper,再启动kafka;关闭时相反,先关闭kafka,再关闭zookeeper。
    1、zookeeper启动命令

    ./zookeeper-server-start.sh ../config/zookeeper.properties &

    后台运行启动命令:

    nohup ./zookeeper-server-start.sh ../config/zookeeper.properties >/data/zookeeper/logs/zookeeper.log 2>1 &

    或者

    ./zookeeper-server-start.sh -daemon ../config/zookeeper.properties &

    查看集群状态:

    ./zookeeper-server-start.sh status ../config/zookeeper.properties

    2、kafka启动命令

    ./kafka-server-start.sh ../config/server.properties &

    后台运行启动命令:

    nohup bin/kafka-server-start.sh ../config/server.properties >/data/kafka/logs/kafka.log 2>1 &

    或者

    ./kafka-server-start.sh -daemon ../config/server.properties &

    3、创建topic,最新版本已经不需要使用zookeeper参数创建。

    ./kafka-topics.sh --create --replication-factor 2 --partitions 1 --topic test --bootstrap-server 172.16.20.220:9092

    参数解释:
    复制两份
      --replication-factor 2
    创建1个分区
      --partitions 1
    topic 名称
      --topic test

    4、查看已经存在的topic(三台设备都执行时可以看到)

    ./kafka-topics.sh --list --bootstrap-server 172.16.20.220:9092

    5、启动生产者:

    ./kafka-console-producer.sh --broker-list 172.16.20.220:9092 --topic test

    6、启动消费者:

    ./kafka-console-consumer.sh --bootstrap-server 172.16.20.221:9092 --topic test ./kafka-console-consumer.sh --bootstrap-server 172.16.20.222:9092 --topic test

    添加参数 --from-beginning 从开始位置消费,不是从最新消息

    ./kafka-console-consumer.sh --bootstrap-server 172.16.20.221 --topic test --from-beginning

    7、测试:在生产者输入test,可以在消费者的两台服务器上看到同样的字符test,说明Kafka服务器集群已搭建成功。

    三、安装配置Logstash

    Logstash没有提供集群安装方式,相互之间并没有交互,但是我们可以配置同属一个Kafka消费者组,来实现统一消息只消费一次的功能。

    • 解压安装包

    tar -zxvf logstash-8.0.0-linux-x86_64.tar.gz mv logstash-8.0.0 logstash

    • 配置kafka主题和组

    cd logstash # 新建配置文件 vi logstash-kafka.conf # 新增以下内容 input { kafka { codec => "json" group_id => "logstash" client_id => "logstash-api" topics_pattern => "api_log" type => "api" bootstrap_servers => "172.16.20.220:9092,172.16.20.221:9092,172.16.20.222:9092" auto_offset_reset => "latest" } kafka { codec => "json" group_id => "logstash" client_id => "logstash-operation" topics_pattern => "operation_log" type => "operation" bootstrap_servers => "172.16.20.220:9092,172.16.20.221:9092,172.16.20.222:9092" auto_offset_reset => "latest" } kafka { codec => "json" group_id => "logstash" client_id => "logstash-debugger" topics_pattern => "debugger_log" type => "debugger" bootstrap_servers => "172.16.20.220:9092,172.16.20.221:9092,172.16.20.222:9092" auto_offset_reset => "latest" } kafka { codec => "json" group_id => "logstash" client_id => "logstash-nginx" topics_pattern => "nginx_log" type => "nginx" bootstrap_servers => "172.16.20.220:9092,172.16.20.221:9092,172.16.20.222:9092" auto_offset_reset => "latest" } } output { if [type] == "api"{ elasticsearch { hosts => ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] index => "logstash_api-%{+YYYY.MM.dd}" user => "elastic" password => "123456" } } if [type] == "operation"{ elasticsearch { hosts => ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] index => "logstash_operation-%{+YYYY.MM.dd}" user => "elastic" password => "123456" } } if [type] == "debugger"{ elasticsearch { hosts => ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] index => "logstash_operation-%{+YYYY.MM.dd}" user => "elastic" password => "123456" } } if [type] == "nginx"{ elasticsearch { hosts => ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] index => "logstash_operation-%{+YYYY.MM.dd}" user => "elastic" password => "123456" } } }

    • 启动logstash

    # 切换到bin目录 cd /usr/local/logstash/bin # 启动命令 nohup ./logstash -f ../config/logstash-kafka.conf & #查看启动日志 tail -f nohup.out 四、安装配置Kibana

    • 解压安装文件

    tar -zxvf kibana-8.0.0-linux-x86_64.tar.gz mv kibana-8.0.0 kibana

    • 修改配置文件

    cd /usr/local/kibana/config vi kibana.yml # 修改以下内容 server.port: 5601 server.host: "172.16.20.220" elasticsearch.hosts: ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] elasticsearch.username: "kibana_system" elasticsearch.password: "123456"

    • 启动服务

    cd /usr/local/kibana/bin # 默认不允许使用root运行,可以添加 --allow-root 参数使用root用户运行,也可以跟Elasticsearch一样新增一个用户组用户 nohup ./kibana --allow-root &

    • 访问172.16.20.220:5601/,并使用elastic / 123456登录。

    五、安装Filebeat

      Filebeat用于安装在业务软件运行服务器,收集业务产生的日志,并推送到我们配置的Kafka、Redis、RabbitMQ等消息中间件,或者直接保存到Elasticsearch,下面来讲解如何安装配置:

    1、进入到/usr/local目录,执行解压命令

    tar -zxvf filebeat-8.0.0-linux-x86_64.tar.gz mv filebeat-8.0.0-linux-x86_64 filebeat

    2、编辑配置filebeat.yml
      配置文件中默认是输出到elasticsearch,这里我们改为kafka,同文件目录下的filebeat.reference.yml文件是所有配置的实例,可以直接将kafka的配置复制到filebeat.yml

    • 配置采集开关和采集路径:

    # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. # enable改为true enabled: true # Paths that should be crawled and fetched. Glob based paths. # 修改微服务日志的实际路径 paths: - /data/gitegg/log/gitegg-service-system/*.log - /data/gitegg/log/gitegg-service-base/*.log - /data/gitegg/log/gitegg-service-oauth/*.log - /data/gitegg/log/gitegg-service-gateway/*.log - /data/gitegg/log/gitegg-service-extension/*.log - /data/gitegg/log/gitegg-service-bigdata/*.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering #fields: # level: debug # review: 1

    • Elasticsearch 模板配置

    # ======================= Elasticsearch template setting ======================= setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 1 #index.codec: best_compression #_source.enabled: false # 允许自动生成index模板 setup.template.enabled: true # # 生成index模板时字段配置文件 setup.template.fields: fields.yml # # 如果存在模块则覆盖 setup.template.overwrite: true # # 生成index模板的名称 setup.template.name: "api_log" # # 生成index模板匹配的index格式 setup.template.pattern: "api-*" #索引生命周期管理ilm功能默认开启,开启的情况下索引名称只能为filebeat-*, 通过setup.ilm.enabled: false进行关闭; setup.ilm.pattern: "{now/d}" setup.ilm.enabled: false

    • 开启仪表盘并配置使用Kibana仪表盘:

    # ================================= Dashboards ================================= # These settings control loading the sample dashboards to the Kibana index. Loading # the dashboards is disabled by default and can be enabled either by setting the # options here or by using the `setup` command. setup.dashboards.enabled: true # The URL from where to download the dashboards archive. By default this URL # has a value which is computed based on the Beat name and version. For released # versions, this URL points to the dashboard archive on the artifacts.elastic.co # website. #setup.dashboards.url: # =================================== Kibana =================================== # Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API. # This requires a Kibana endpoint configuration. setup.kibana: # Kibana Host # Scheme and port can be left out and will be set to the default (localhost:5601/path # IPv6 addresses should always be defined as: [2001:db8::1]:5601 host: "172.16.20.220:5601" # Kibana Space ID # ID of the Kibana Space into which the dashboards should be loaded. By default, # the Default Space will be used. #space.id:

    • 配置输出到Kafka,完整的filebeat.yml如下

    ###################### Filebeat Configuration Example ######################### # This file is an example configuration file highlighting only the most common # options. The filebeat.reference.yml file from the same directory contains all the # supported options with more comments. You can use it as a reference. # # You can find the full configuration reference here: # www.elastic.co/guide/en/beats/filebeat/index.html # For more available modules and options, please see the filebeat.reference.yml sample # configuration file. # ============================== Filebeat inputs =============================== filebeat.inputs: # Each - is an input. Most options can be set at the input level, so # you can use different inputs for various configurations. # Below are the input specific configurations. # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. enabled: true # Paths that should be crawled and fetched. Glob based paths. paths: - /data/gitegg/log/*/*operation.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering fields: topic: operation_log # level: debug # review: 1 # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. enabled: true # Paths that should be crawled and fetched. Glob based paths. paths: - /data/gitegg/log/*/*api.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering fields: topic: api_log # level: debug # review: 1 # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. enabled: true # Paths that should be crawled and fetched. Glob based paths. paths: - /data/gitegg/log/*/*debug.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering fields: topic: debugger_log # level: debug # review: 1 # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. enabled: true # Paths that should be crawled and fetched. Glob based paths. paths: - /usr/local/nginx/logs/access.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering fields: topic: nginx_log # level: debug # review: 1 # ============================== Filebeat modules ============================== filebeat.config.modules: # Glob pattern for configuration loading path: ${path.config}/modules.d/*.yml # Set to true to enable config reloading reload.enabled: false # Period on which files under path should be checked for changes #reload.period: 10s # ======================= Elasticsearch template setting ======================= setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 1 #index.codec: best_compression #_source.enabled: false # 允许自动生成index模板 setup.template.enabled: true # # 生成index模板时字段配置文件 setup.template.fields: fields.yml # # 如果存在模块则覆盖 setup.template.overwrite: true # # 生成index模板的名称 setup.template.name: "gitegg_log" # # 生成index模板匹配的index格式 setup.template.pattern: "filebeat-*" #索引生命周期管理ilm功能默认开启,开启的情况下索引名称只能为filebeat-*, 通过setup.ilm.enabled: false进行关闭; setup.ilm.pattern: "{now/d}" setup.ilm.enabled: false # ================================== General =================================== # The name of the shipper that publishes the network data. It can be used to group # all the transactions sent by a single shipper in the web interface. #name: # The tags of the shipper are included in their own field with each # transaction published. #tags: ["service-X", "web-tier"] # Optional fields that you can specify to add additional information to the # output. #fields: # env: staging # ================================= Dashboards ================================= # These settings control loading the sample dashboards to the Kibana index. Loading # the dashboards is disabled by default and can be enabled either by setting the # options here or by using the `setup` command. setup.dashboards.enabled: true # The URL from where to download the dashboards archive. By default this URL # has a value which is computed based on the Beat name and version. For released # versions, this URL points to the dashboard archive on the artifacts.elastic.co # website. #setup.dashboards.url: # =================================== Kibana =================================== # Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API. # This requires a Kibana endpoint configuration. setup.kibana: # Kibana Host # Scheme and port can be left out and will be set to the default (localhost:5601/path # IPv6 addresses should always be defined as: [2001:db8::1]:5601 host: "172.16.20.220:5601" # Optional protocol and basic auth credentials. #protocol: "cloud.elastic.co/). # The cloud.id setting overwrites the `output.elasticsearch.hosts` and # `setup.kibana.host` options. # You can find the `cloud.id` in the Elastic Cloud web UI. #cloud.id: # The cloud.auth setting overwrites the `output.elasticsearch.username` and # `output.elasticsearch.password` settings. The format is `<user>:<pass>`. #cloud.auth: # ================================== Outputs =================================== # Configure what output to use when sending the data collected by the beat. # ---------------------------- Elasticsearch Output ---------------------------- #output.elasticsearch: # Array of hosts to connect to. hosts: ["localhost:9200"] # Protocol - either `localhost:8200 # API Key for the APM Server(s). # If api_key is set then secret_token will be ignored. #api_key: # Secret token for the APM Server(s). #secret_token: # ================================= Migration ================================== # This allows to enable 6.7 migration aliases #migration.6_to_7.enabled: true

    • 执行filebeat启动命令

    ./filebeat -e -c filebeat.yml

    后台启动命令

    nohup ./filebeat -e -c filebeat.yml >/dev/null 2>&1 &

    停止命令

    ps -ef |grep filebeat kill -9 进程号 六、测试配置是否正确 1、测试filebeat是否能够采集log文件并发送到Kafka

    • 在kafka服务器开启消费者,监听api_log主题和operation_log主题

    ./kafka-console-consumer.sh --bootstrap-server 172.16.20.221:9092 --topic api_log ./kafka-console-consumer.sh --bootstrap-server 172.16.20.222:9092 --topic operation_log

    • 手动写入日志文件,按照filebeat配置的采集目录写入

    echo "api log1111" > /data/gitegg/log/gitegg-service-system/api.log echo "operation log1111" > /data/gitegg/log/gitegg-service-system/operation.log

    • 观察消费者是消费到日志推送内容



    2、测试logstash是消费Kafka的日志主题,并将日志内容存入Elasticsearch

    • 手动写入日志文件

    echo "api log8888888888888888888888" > /data/gitegg/log/gitegg-service-system/api.log echo "operation loggggggggggggggggggg" > /data/gitegg/log/gitegg-service-system/operation.log

    • 打开Elasticsearch Head界面 172.16.20.220:9100/?auth_user=elastic&auth_password=123456 ,查询Elasticsearch是否有数据。

    自动新增的两个index,规则是logstash中配置的

    数据浏览页可以看到Elasticsearch中存储的日志数据内容,说明我们的配置已经生效。

    七、配置Kibana用于日志统计和展示
    • 依次点击左侧菜单Management -> Kibana -> Data Views -> Create data view , 输入logstash_* ,选择@timestamp,再点击Create data view按钮,完成创建。



    • 点击日志分析查询菜单Analytics -> Discover,选择logstash_* 进行日志查询

    源码地址:

    Gitee: gitee.com/wmz1930/GitEgg
    GitHub: github.com/wmz1930/GitEgg

    本文共计6294个文字,预计阅读时间需要26分钟。

    SpringCloud微服务实战教程中,如何实现企业级ELK日志采集与分布式架构配置?

    一套优秀的日志分析系统可以详细记录系统的运行情况,便于我们定位分析系统性能瓶颈、查找定位系统问题。以下一篇讲述了日志的多种业务场景及日志记录的实现方式。那么,如何记录这些日志呢?

      一套好的日志分析系统可以详细记录系统的运行情况,方便我们定位分析系统性能瓶颈、查找定位系统问题。上一篇说明了日志的多种业务场景以及日志记录的实现方式,那么日志记录下来,相关人员就需要对日志数据进行处理与分析,基于E(ElasticSearch)L(Logstash)K(Kibana)组合的日志分析系统可以说是目前各家公司普遍的首选方案。

    • Elasticsearch: 分布式、RESTful 风格的搜索和数据分析引擎,可快速存储、搜索、分析海量的数据。在ELK中用于存储所有日志数据。

    • Logstash: 开源的数据采集引擎,具有实时管道传输功能。Logstash 能够将来自单独数据源的数据动态集中到一起,对这些数据加以标准化并传输到您所选的地方。在ELK中用于将采集到的日志数据进行处理、转换然后存储到Elasticsearch。

    • Kibana: 免费且开放的用户界面,能够让您对 Elasticsearch 数据进行可视化,并让您在 Elastic Stack 中进行导航。您可以进行各种操作,从跟踪查询负载,到理解请求如何流经您的整个应用,都能轻松完成。在ELK中用于通过界面展示存储在Elasticsearch中的日志数据。

      作为微服务集群,必须要考虑当微服务访问量暴增时的高并发场景,此时系统的日志数据同样是爆发式增长,我们需要通过消息队列做流量削峰处理,Logstash官方提供Redis、Kafka、RabbitMQ等输入插件。Redis虽然可以用作消息队列,但其各项功能显示不如单一实现的消息队列,所以通常情况下并不使用它的消息队列功能;Kafka的性能要优于RabbitMQ,通常在日志采集,数据采集时使用较多,所以这里我们采用Kafka实现消息队列功能。
      ELK日志分析系统中,数据传输、数据保存、数据展示、流量削峰功能都有了,还少一个组件,就是日志数据的采集,虽然log4j2可以将日志数据发送到Kafka,甚至可以将日志直接输入到Logstash,但是基于系统设计解耦的考虑,业务系统运行不会影响到日志分析系统,同时日志分析系统也不会影响到业务系统,所以,业务只需将日志记录下来,然后由日志分析系统去采集分析即可,Filebeat是ELK日志系统中常用的日志采集器,它是 Elastic Stack 的一部分,因此能够与 Logstash、Elasticsearch 和 Kibana 无缝协作。

    • Kafka: 高吞吐量的分布式发布订阅消息队列,主要应用于大数据的实时处理。

    • Filebeat: 轻量型日志采集器。在 Kubernetes、Docker 或云端部署中部署 Filebeat,即可获得所有的日志流:信息十分完整,包括日志流的 pod、容器、节点、VM、主机以及自动关联时用到的其他元数据。此外,Beats Autodiscover 功能可检测到新容器,并使用恰当的 Filebeat 模块对这些容器进行自适应监测。

    软件下载:

      因经常遇到在内网搭建环境的问题,所以这里习惯使用下载软件包的方式进行安装,虽没有使用Yum、Docker等安装方便,但是可以对软件目录、配置信息等有更深的了解,在后续采用Yum、Docker等方式安装时,也能清楚安装了哪些东西,安装配置的文件是怎样的,即使出现问题,也可以快速的定位解决。

    Elastic Stack全家桶下载主页: www.elastic.co/cn/downloads/

    我们选择如下版本:

    • Elasticsearch8.0.0,下载地址:artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.0.0-linux-x86_64.tar.gz

    • Logstash8.0.0,下载地址:artifacts.elastic.co/downloads/logstash/logstash-8.0.0-linux-x86_64.tar.gz

    • Kibana8.0.0,下载地址:artifacts.elastic.co/downloads/kibana/kibana-8.0.0-linux-x86_64.tar.gz

    • Filebeat8.0.0,下载地址:artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.0.0-linux-x86_64.tar.gz

    Kafka下载:

    • Kafka3.1.0,下载地址:dlcdn.apache.org/kafka/3.1.0/kafka_2.13-3.1.0.tgz
    安装配置:

      安装前先准备好三台CentOS7服务器用于集群安装,这是IP地址为:172.16.20.220、172.16.20.221、172.16.20.222,然后将上面下载的软件包上传至三台服务器的/usr/local目录。因服务器资源有限,这里所有的软件都安装在这三台集群服务器上,在实际生产环境中,请根据业务需求设计规划进行安装。
      在集群搭建时,如果能够编写shell安装脚本就会很方便,如果不能编写,就需要在每台服务器上执行安装命令,多数ssh客户端提供了多会话同时输入的功能,这里一些通用安装命令可以选择启用该功能。

    一、安装Elasticsearch集群 1、Elasticsearch是使用Java语言开发的,所以需要在环境上安装jdk并配置环境变量。
    • 下载jdk软件包安装,www.oracle.com/java/technologies/downloads/#java8

    新建/usr/local/java目录

    mkdir /usr/local/java

    将下载的jdk软件包jdk-8u64-linux-x64.tar.gz上传到/usr/local/java目录,然后解压

    tar -zxvf jdk-8u77-linux-x64.tar.gz

    配置环境变量/etc/profile

    vi /etc/profile

    在底部添加以下内容

    JAVA_HOME=/usr/local/java/jdk1.8.0_64 PATH=$JAVA_HOME/bin:$PATH CLASSPATH=$JAVA_HOME/jre/lib/ext:$JAVA_HOME/lib/tools.jar export PATH JAVA_HOME CLASSPATH

    使环境变量生效

    source /etc/profile

    • 另外一种十分快捷的方式,如果不是内网环境,可以直接使用命令行安装,这里安装的是免费版本的openjdk

    yum install java-1.8.0-openjdk* -y 2、安装配置Elasticsearch

    • 进入/usr/local目录,解压Elasticsearch安装包,请确保执行命令前已将环境准备时的Elasticsearch安装包上传至该目录。

    tar -zxvf elasticsearch-8.0.0-linux-x86_64.tar.gz

    • 重命名文件夹

    mv elasticsearch-8.0.0 elasticsearch

    • elasticsearch不能使用root用户运行,这里创建运行elasticsearch的用户组和用户

    # 创建用户组 groupadd elasticsearch # 创建用户并添加至用户组 useradd elasticsearch -g elasticsearch # 更改elasticsearch密码,设置一个自己需要的密码,这里设置为和用户名一样:El12345678 passwd elasticsearch

    • 新建elasticsearch数据和日志存放目录,并给elasticsearch用户赋权限

    mkdir -p /data/elasticsearch/data mkdir -p /data/elasticsearch/log chown -R elasticsearch:elasticsearch /data/elasticsearch/* chown -R elasticsearch:elasticsearch /usr/local/elasticsearch/*

    • elasticsearch默认启用了x-pack,集群通信需要进行安全认证,所以这里需要用到SSL证书。注意:这里生成证书的命令只在一台服务器上执行,执行之后copy到另外两台服务器的相同目录下。

    # 提示输入密码时,直接回车 ./elasticsearch-certutil ca -out /usr/local/elasticsearch/config/elastic-stack-ca.p12 # 提示输入密码时,直接回车 ./elasticsearch-certutil cert --ca /usr/local/elasticsearch/config/elastic-stack-ca.p12 -out /usr/local/elasticsearch/config/elastic-certificates.p12 -pass "" # 如果使用root用户生成的证书,记得给elasticsearch用户赋权限 chown -R elasticsearch:elasticsearch /usr/local/elasticsearch/config/elastic-certificates.p12

    • 设置密码,这里在出现输入密码时,所有的都是输入的123456

    ./elasticsearch-setup-passwords interactive Enter password for [elastic]: Reenter password for [elastic]: Enter password for [apm_system]: Reenter password for [apm_system]: Enter password for [kibana_system]: Reenter password for [kibana_system]: Enter password for [logstash_system]: Reenter password for [logstash_system]: Enter password for [beats_system]: Reenter password for [beats_system]: Enter password for [remote_monitoring_user]: Reenter password for [remote_monitoring_user]: Changed password for user [apm_system] Changed password for user [kibana_system] Changed password for user [kibana] Changed password for user [logstash_system] Changed password for user [beats_system] Changed password for user [remote_monitoring_user] Changed password for user [elastic]

    • 修改elasticsearch配置文件

    vi /usr/local/elasticsearch/config/elasticsearch.yml

    # 修改配置 # 集群名称 cluster.name: log-elasticsearch # 节点名称 node.name: node-1 # 数据存放路径 path.data: /data/elasticsearch/data # 日志存放路径 path.logs: /data/elasticsearch/log # 当前节点IP network.host: 192.168.60.201 # 对外端口 172.16.20.220:9200/
    172.16.20.221:9200/
    172.16.20.222:9200/

  • 正常运行没有问题后,Ctrl+c关闭服务,然后使用后台启动命令
  • ./elasticsearch -d

    备注:后续可通过此命令停止elasticsearch运行

    # 查看进程id ps -ef | grep elastic # 关闭进程 kill -9 1376(进程id) 3、安装ElasticSearch界面管理插件elasticsearch-head,只需要在一台服务器上安装即可,这里我们安装到172.16.20.220服务器上

    • 配置nodejs环境
      下载地址: (nodejs.org/dist/v16.14.0/node-v16.14.0-linux-x64.tar.xz)[nodejs.org/dist/v16.14.0/node-v16.14.0-linux-x64.tar.xz],将node-v16.14.0-linux-x64.tar.xz上传到服务器172.16.20.220的/usr/local目录

    # 解压 tar -xvJf node-v16.14.0-linux-x64.tar.xz # 重命名 mv node-v16.14.0-linux-x64 nodejs # 配置环境变量 vi /etc/profile # 新增以下内容 export NODE_HOME=/usr/local/nodejs PATH=$JAVA_HOME/bin:$NODE_HOME/bin:/usr/local/mysql/bin:/usr/local/subversion/bin:$PATH export PATH JAVA_HOME NODE_HOME JENKINS_HOME CLASSPATH # 使配置生效 source /etc/profile # 测试是否配置成功 node -v

    • 配置elasticsearch-head
      项目开源地址:github.com/mobz/elasticsearch-head
      zip包下载地址:github.com/mobz/elasticsearch-head/archive/master.zip
      下载后上传至172.16.20.220的/usr/local目录,然后进行解压安装

    # 解压 unzip elasticsearch-head-master.zip # 重命名 mv elasticsearch-head-master elasticsearch-head # 进入到elasticsearch-head目录 cd elasticsearch-head #切换软件源,可以提升安装速度 npm config set registry registry.npm.taobao.org # 执行安装命令 npm install -g npm@8.5.1 npm install phantomjs-prebuilt@2.1.16 --ignore-scripts npm install # 启动命令 npm run start

    • 浏览器访问172.16.20.220:9100/?auth_user=elastic&auth_password=123456 ,需要加上我们上面设置的用户名密码,就可以看到我们的Elasticsearch集群状态了。

    SpringCloud微服务实战教程中,如何实现企业级ELK日志采集与分布式架构配置?

    二、安装Kafka集群
    • 环境准备:

      新建kafka的日志目录和zookeeper数据目录,因为这两项默认放在tmp目录,而tmp目录中内容会随重启而丢失,所以我们自定义以下目录:

    mkdir /data/zookeeper mkdir /data/zookeeper/data mkdir /data/zookeeper/logs mkdir /data/kafka mkdir /data/kafka/data mkdir /data/kafka/logs

    • zookeeper.properties配置

    vi /usr/local/kafka/config/zookeeper.properties

    修改如下:

    # 修改为自定义的zookeeper数据目录 dataDir=/data/zookeeper/data # 修改为自定义的zookeeper日志目录 dataLogDir=/data/zookeeper/logs # 端口 clientPort=2181 # 注释掉 #maxClientCnxns=0 # 设置连接参数,添加如下配置 # 为zk的基本时间单元,毫秒 tickTime=2000 # Leader-Follower初始通信时限 tickTime*10 initLimit=10 # Leader-Follower同步通信时限 tickTime*5 syncLimit=5 # 设置broker Id的服务地址,本机ip一定要用0.0.0.0代替 server.1=0.0.0.0:2888:3888 server.2=172.16.20.221:2888:3888 server.3=172.16.20.222:2888:3888

    • 在各台服务器的zookeeper数据目录/data/zookeeper/data添加myid文件,写入服务broker.id属性值

    在data文件夹中新建myid文件,myid文件的内容为1(一句话创建:echo 1 > myid)

    cd /data/zookeeper/data vi myid #添加内容:1 其他两台主机分别配置 2和3 1

    • kafka配置,进入config目录下,修改server.properties文件

    vi /usr/local/kafka/config/server.properties

    # 每台服务器的broker.id都不能相同 broker.id=1 # 是否可以删除topic delete.topic.enable=true # topic 在当前broker上的分片个数,与broker保持一致 num.partitions=3 # 每个主机地址不一样: listeners=PLAINTEXT://172.16.20.220:9092 advertised.listeners=PLAINTEXT://172.16.20.220:9092 # 具体一些参数 log.dirs=/data/kafka/kafka-logs # 设置zookeeper集群地址与端口如下: zookeeper.connect=172.16.20.220:2181,172.16.20.221:2181,172.16.20.222:2181

    • Kafka启动

    kafka启动时先启动zookeeper,再启动kafka;关闭时相反,先关闭kafka,再关闭zookeeper。
    1、zookeeper启动命令

    ./zookeeper-server-start.sh ../config/zookeeper.properties &

    后台运行启动命令:

    nohup ./zookeeper-server-start.sh ../config/zookeeper.properties >/data/zookeeper/logs/zookeeper.log 2>1 &

    或者

    ./zookeeper-server-start.sh -daemon ../config/zookeeper.properties &

    查看集群状态:

    ./zookeeper-server-start.sh status ../config/zookeeper.properties

    2、kafka启动命令

    ./kafka-server-start.sh ../config/server.properties &

    后台运行启动命令:

    nohup bin/kafka-server-start.sh ../config/server.properties >/data/kafka/logs/kafka.log 2>1 &

    或者

    ./kafka-server-start.sh -daemon ../config/server.properties &

    3、创建topic,最新版本已经不需要使用zookeeper参数创建。

    ./kafka-topics.sh --create --replication-factor 2 --partitions 1 --topic test --bootstrap-server 172.16.20.220:9092

    参数解释:
    复制两份
      --replication-factor 2
    创建1个分区
      --partitions 1
    topic 名称
      --topic test

    4、查看已经存在的topic(三台设备都执行时可以看到)

    ./kafka-topics.sh --list --bootstrap-server 172.16.20.220:9092

    5、启动生产者:

    ./kafka-console-producer.sh --broker-list 172.16.20.220:9092 --topic test

    6、启动消费者:

    ./kafka-console-consumer.sh --bootstrap-server 172.16.20.221:9092 --topic test ./kafka-console-consumer.sh --bootstrap-server 172.16.20.222:9092 --topic test

    添加参数 --from-beginning 从开始位置消费,不是从最新消息

    ./kafka-console-consumer.sh --bootstrap-server 172.16.20.221 --topic test --from-beginning

    7、测试:在生产者输入test,可以在消费者的两台服务器上看到同样的字符test,说明Kafka服务器集群已搭建成功。

    三、安装配置Logstash

    Logstash没有提供集群安装方式,相互之间并没有交互,但是我们可以配置同属一个Kafka消费者组,来实现统一消息只消费一次的功能。

    • 解压安装包

    tar -zxvf logstash-8.0.0-linux-x86_64.tar.gz mv logstash-8.0.0 logstash

    • 配置kafka主题和组

    cd logstash # 新建配置文件 vi logstash-kafka.conf # 新增以下内容 input { kafka { codec => "json" group_id => "logstash" client_id => "logstash-api" topics_pattern => "api_log" type => "api" bootstrap_servers => "172.16.20.220:9092,172.16.20.221:9092,172.16.20.222:9092" auto_offset_reset => "latest" } kafka { codec => "json" group_id => "logstash" client_id => "logstash-operation" topics_pattern => "operation_log" type => "operation" bootstrap_servers => "172.16.20.220:9092,172.16.20.221:9092,172.16.20.222:9092" auto_offset_reset => "latest" } kafka { codec => "json" group_id => "logstash" client_id => "logstash-debugger" topics_pattern => "debugger_log" type => "debugger" bootstrap_servers => "172.16.20.220:9092,172.16.20.221:9092,172.16.20.222:9092" auto_offset_reset => "latest" } kafka { codec => "json" group_id => "logstash" client_id => "logstash-nginx" topics_pattern => "nginx_log" type => "nginx" bootstrap_servers => "172.16.20.220:9092,172.16.20.221:9092,172.16.20.222:9092" auto_offset_reset => "latest" } } output { if [type] == "api"{ elasticsearch { hosts => ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] index => "logstash_api-%{+YYYY.MM.dd}" user => "elastic" password => "123456" } } if [type] == "operation"{ elasticsearch { hosts => ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] index => "logstash_operation-%{+YYYY.MM.dd}" user => "elastic" password => "123456" } } if [type] == "debugger"{ elasticsearch { hosts => ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] index => "logstash_operation-%{+YYYY.MM.dd}" user => "elastic" password => "123456" } } if [type] == "nginx"{ elasticsearch { hosts => ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] index => "logstash_operation-%{+YYYY.MM.dd}" user => "elastic" password => "123456" } } }

    • 启动logstash

    # 切换到bin目录 cd /usr/local/logstash/bin # 启动命令 nohup ./logstash -f ../config/logstash-kafka.conf & #查看启动日志 tail -f nohup.out 四、安装配置Kibana

    • 解压安装文件

    tar -zxvf kibana-8.0.0-linux-x86_64.tar.gz mv kibana-8.0.0 kibana

    • 修改配置文件

    cd /usr/local/kibana/config vi kibana.yml # 修改以下内容 server.port: 5601 server.host: "172.16.20.220" elasticsearch.hosts: ["172.16.20.220:9200","172.16.20.221:9200","172.16.20.222:9200"] elasticsearch.username: "kibana_system" elasticsearch.password: "123456"

    • 启动服务

    cd /usr/local/kibana/bin # 默认不允许使用root运行,可以添加 --allow-root 参数使用root用户运行,也可以跟Elasticsearch一样新增一个用户组用户 nohup ./kibana --allow-root &

    • 访问172.16.20.220:5601/,并使用elastic / 123456登录。

    五、安装Filebeat

      Filebeat用于安装在业务软件运行服务器,收集业务产生的日志,并推送到我们配置的Kafka、Redis、RabbitMQ等消息中间件,或者直接保存到Elasticsearch,下面来讲解如何安装配置:

    1、进入到/usr/local目录,执行解压命令

    tar -zxvf filebeat-8.0.0-linux-x86_64.tar.gz mv filebeat-8.0.0-linux-x86_64 filebeat

    2、编辑配置filebeat.yml
      配置文件中默认是输出到elasticsearch,这里我们改为kafka,同文件目录下的filebeat.reference.yml文件是所有配置的实例,可以直接将kafka的配置复制到filebeat.yml

    • 配置采集开关和采集路径:

    # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. # enable改为true enabled: true # Paths that should be crawled and fetched. Glob based paths. # 修改微服务日志的实际路径 paths: - /data/gitegg/log/gitegg-service-system/*.log - /data/gitegg/log/gitegg-service-base/*.log - /data/gitegg/log/gitegg-service-oauth/*.log - /data/gitegg/log/gitegg-service-gateway/*.log - /data/gitegg/log/gitegg-service-extension/*.log - /data/gitegg/log/gitegg-service-bigdata/*.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering #fields: # level: debug # review: 1

    • Elasticsearch 模板配置

    # ======================= Elasticsearch template setting ======================= setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 1 #index.codec: best_compression #_source.enabled: false # 允许自动生成index模板 setup.template.enabled: true # # 生成index模板时字段配置文件 setup.template.fields: fields.yml # # 如果存在模块则覆盖 setup.template.overwrite: true # # 生成index模板的名称 setup.template.name: "api_log" # # 生成index模板匹配的index格式 setup.template.pattern: "api-*" #索引生命周期管理ilm功能默认开启,开启的情况下索引名称只能为filebeat-*, 通过setup.ilm.enabled: false进行关闭; setup.ilm.pattern: "{now/d}" setup.ilm.enabled: false

    • 开启仪表盘并配置使用Kibana仪表盘:

    # ================================= Dashboards ================================= # These settings control loading the sample dashboards to the Kibana index. Loading # the dashboards is disabled by default and can be enabled either by setting the # options here or by using the `setup` command. setup.dashboards.enabled: true # The URL from where to download the dashboards archive. By default this URL # has a value which is computed based on the Beat name and version. For released # versions, this URL points to the dashboard archive on the artifacts.elastic.co # website. #setup.dashboards.url: # =================================== Kibana =================================== # Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API. # This requires a Kibana endpoint configuration. setup.kibana: # Kibana Host # Scheme and port can be left out and will be set to the default (localhost:5601/path # IPv6 addresses should always be defined as: [2001:db8::1]:5601 host: "172.16.20.220:5601" # Kibana Space ID # ID of the Kibana Space into which the dashboards should be loaded. By default, # the Default Space will be used. #space.id:

    • 配置输出到Kafka,完整的filebeat.yml如下

    ###################### Filebeat Configuration Example ######################### # This file is an example configuration file highlighting only the most common # options. The filebeat.reference.yml file from the same directory contains all the # supported options with more comments. You can use it as a reference. # # You can find the full configuration reference here: # www.elastic.co/guide/en/beats/filebeat/index.html # For more available modules and options, please see the filebeat.reference.yml sample # configuration file. # ============================== Filebeat inputs =============================== filebeat.inputs: # Each - is an input. Most options can be set at the input level, so # you can use different inputs for various configurations. # Below are the input specific configurations. # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. enabled: true # Paths that should be crawled and fetched. Glob based paths. paths: - /data/gitegg/log/*/*operation.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering fields: topic: operation_log # level: debug # review: 1 # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. enabled: true # Paths that should be crawled and fetched. Glob based paths. paths: - /data/gitegg/log/*/*api.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering fields: topic: api_log # level: debug # review: 1 # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. enabled: true # Paths that should be crawled and fetched. Glob based paths. paths: - /data/gitegg/log/*/*debug.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering fields: topic: debugger_log # level: debug # review: 1 # filestream is an input for collecting log messages from files. - type: filestream # Change to true to enable this input configuration. enabled: true # Paths that should be crawled and fetched. Glob based paths. paths: - /usr/local/nginx/logs/access.log #- c:\programdata\elasticsearch\logs\* # Exclude lines. A list of regular expressions to match. It drops the lines that are # matching any regular expression from the list. #exclude_lines: ['^DBG'] # Include lines. A list of regular expressions to match. It exports the lines that are # matching any regular expression from the list. #include_lines: ['^ERR', '^WARN'] # Exclude files. A list of regular expressions to match. Filebeat drops the files that # are matching any regular expression from the list. By default, no files are dropped. #prospector.scanner.exclude_files: ['.gz$'] # Optional additional fields. These fields can be freely picked # to add additional information to the crawled log files for filtering fields: topic: nginx_log # level: debug # review: 1 # ============================== Filebeat modules ============================== filebeat.config.modules: # Glob pattern for configuration loading path: ${path.config}/modules.d/*.yml # Set to true to enable config reloading reload.enabled: false # Period on which files under path should be checked for changes #reload.period: 10s # ======================= Elasticsearch template setting ======================= setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 1 #index.codec: best_compression #_source.enabled: false # 允许自动生成index模板 setup.template.enabled: true # # 生成index模板时字段配置文件 setup.template.fields: fields.yml # # 如果存在模块则覆盖 setup.template.overwrite: true # # 生成index模板的名称 setup.template.name: "gitegg_log" # # 生成index模板匹配的index格式 setup.template.pattern: "filebeat-*" #索引生命周期管理ilm功能默认开启,开启的情况下索引名称只能为filebeat-*, 通过setup.ilm.enabled: false进行关闭; setup.ilm.pattern: "{now/d}" setup.ilm.enabled: false # ================================== General =================================== # The name of the shipper that publishes the network data. It can be used to group # all the transactions sent by a single shipper in the web interface. #name: # The tags of the shipper are included in their own field with each # transaction published. #tags: ["service-X", "web-tier"] # Optional fields that you can specify to add additional information to the # output. #fields: # env: staging # ================================= Dashboards ================================= # These settings control loading the sample dashboards to the Kibana index. Loading # the dashboards is disabled by default and can be enabled either by setting the # options here or by using the `setup` command. setup.dashboards.enabled: true # The URL from where to download the dashboards archive. By default this URL # has a value which is computed based on the Beat name and version. For released # versions, this URL points to the dashboard archive on the artifacts.elastic.co # website. #setup.dashboards.url: # =================================== Kibana =================================== # Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API. # This requires a Kibana endpoint configuration. setup.kibana: # Kibana Host # Scheme and port can be left out and will be set to the default (localhost:5601/path # IPv6 addresses should always be defined as: [2001:db8::1]:5601 host: "172.16.20.220:5601" # Optional protocol and basic auth credentials. #protocol: "cloud.elastic.co/). # The cloud.id setting overwrites the `output.elasticsearch.hosts` and # `setup.kibana.host` options. # You can find the `cloud.id` in the Elastic Cloud web UI. #cloud.id: # The cloud.auth setting overwrites the `output.elasticsearch.username` and # `output.elasticsearch.password` settings. The format is `<user>:<pass>`. #cloud.auth: # ================================== Outputs =================================== # Configure what output to use when sending the data collected by the beat. # ---------------------------- Elasticsearch Output ---------------------------- #output.elasticsearch: # Array of hosts to connect to. hosts: ["localhost:9200"] # Protocol - either `localhost:8200 # API Key for the APM Server(s). # If api_key is set then secret_token will be ignored. #api_key: # Secret token for the APM Server(s). #secret_token: # ================================= Migration ================================== # This allows to enable 6.7 migration aliases #migration.6_to_7.enabled: true

    • 执行filebeat启动命令

    ./filebeat -e -c filebeat.yml

    后台启动命令

    nohup ./filebeat -e -c filebeat.yml >/dev/null 2>&1 &

    停止命令

    ps -ef |grep filebeat kill -9 进程号 六、测试配置是否正确 1、测试filebeat是否能够采集log文件并发送到Kafka

    • 在kafka服务器开启消费者,监听api_log主题和operation_log主题

    ./kafka-console-consumer.sh --bootstrap-server 172.16.20.221:9092 --topic api_log ./kafka-console-consumer.sh --bootstrap-server 172.16.20.222:9092 --topic operation_log

    • 手动写入日志文件,按照filebeat配置的采集目录写入

    echo "api log1111" > /data/gitegg/log/gitegg-service-system/api.log echo "operation log1111" > /data/gitegg/log/gitegg-service-system/operation.log

    • 观察消费者是消费到日志推送内容



    2、测试logstash是消费Kafka的日志主题,并将日志内容存入Elasticsearch

    • 手动写入日志文件

    echo "api log8888888888888888888888" > /data/gitegg/log/gitegg-service-system/api.log echo "operation loggggggggggggggggggg" > /data/gitegg/log/gitegg-service-system/operation.log

    • 打开Elasticsearch Head界面 172.16.20.220:9100/?auth_user=elastic&auth_password=123456 ,查询Elasticsearch是否有数据。

    自动新增的两个index,规则是logstash中配置的

    数据浏览页可以看到Elasticsearch中存储的日志数据内容,说明我们的配置已经生效。

    七、配置Kibana用于日志统计和展示
    • 依次点击左侧菜单Management -> Kibana -> Data Views -> Create data view , 输入logstash_* ,选择@timestamp,再点击Create data view按钮,完成创建。



    • 点击日志分析查询菜单Analytics -> Discover,选择logstash_* 进行日志查询

    源码地址:

    Gitee: gitee.com/wmz1930/GitEgg
    GitHub: github.com/wmz1930/GitEgg