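This article walks you through Filebeat, a handy tool for log collection! Oldboy SRE engineer training
Filebeat is a lightweight shipper for forwarding and centralizing log data.
Filebeat monitors the log file paths you specify, collects log events, and forwards them to outputs such as Elasticsearch, Logstash, Redis, or Kafka.
When hundreds, thousands, or even tens of thousands of servers, virtual machines, and containers are generating logs, it is time to say goodbye to SSH.
Filebeat gives you a lightweight way to forward and centralize logs and files, so that simple things stay simple.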
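| Filebeat components
Filebeat has two main components, the input and the harvester. They work together to tail files and send the newest data to the output. A spooler (buffer) sits between the harvesters and the output.
Harvester: reads the content of a single file line by line and sends it to the output.
Input: manages the harvesters and finds all the sources to read from. The source file paths must be configured by the user.
Spooler (buffer): buffers the events collected by the harvesters and ships them to the destination, which can be Elasticsearch, Logstash, Kafka, Redis, and so on.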
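| How Filebeat works
The Filebeat workflow is as follows:
1. After Filebeat starts, the input looks for files under the log paths you configured;
2. Filebeat starts a harvester for each log file it finds. Each harvester reads the new content of one log file and sends the new log data to the spooler;
3. The spooler aggregates these events, and Filebeat finally sends the aggregated data to the output you configured.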
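The three steps above can be sketched in a few lines of Python. This is only an illustration of the data flow, not Filebeat's actual implementation (Filebeat is written in Go); the names `find_sources`, `harvest`, `Spooler`, and `run_once` are hypothetical and exist only for this example.

```python
import glob


def find_sources(patterns):
    """Input: expand the configured glob patterns into concrete file paths."""
    paths = []
    for pattern in patterns:
        paths.extend(sorted(glob.glob(pattern)))
    return paths


def harvest(path, offset=0):
    """Harvester: read complete lines from one file, starting at a byte offset."""
    events = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a half-written line that is left for later
            offset = f.tell()
            message = line.rstrip(b"\r\n").decode("utf-8")
            events.append({"source": path, "message": message})
    return events, offset


class Spooler:
    """Spooler: buffer events and hand them to the output in batches."""

    def __init__(self, output, batch_size=2):
        self.output = output          # any callable that accepts a list of events
        self.batch_size = batch_size
        self.buffer = []

    def add(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.output(list(self.buffer))
            self.buffer.clear()


def run_once(patterns, output, batch_size=2):
    """Run one pass: find files, harvest each one, spool events to the output."""
    spooler = Spooler(output, batch_size)
    for path in find_sources(patterns):
        events, _ = harvest(path)
        for event in events:
            spooler.add(event)
    spooler.flush()
```

Calling `run_once(["/var/log/nginx/*.log"], print)` would print the batches; in a real deployment the output callable would be a network client rather than `print`.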
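How does Filebeat keep track of file state?
Filebeat keeps the state of each file and frequently flushes that state to disk in the registry file (data/registry/filebeat/log.json).
The state records the last offset a harvester read, which ensures that all log lines are sent.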
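A minimal sketch of the offset idea follows, assuming a single JSON file that maps each log path to a byte offset. The real registry has a different, richer format; `load_offset`, `save_offset`, and `read_new_lines` are hypothetical helpers written for this illustration.

```python
import json
import os


def load_offset(registry_path, log_path):
    """Return the saved byte offset for log_path, or 0 if none is recorded."""
    if not os.path.exists(registry_path):
        return 0
    with open(registry_path, "r", encoding="utf-8") as f:
        return json.load(f).get(log_path, 0)


def save_offset(registry_path, log_path, offset):
    """Persist the offset; write to a temp file first, then swap it in atomically."""
    state = {}
    if os.path.exists(registry_path):
        with open(registry_path, "r", encoding="utf-8") as f:
            state = json.load(f)
    state[log_path] = offset
    tmp_path = registry_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp_path, registry_path)


def read_new_lines(log_path, registry_path):
    """Read only the lines appended since the offset stored in the registry."""
    offset = load_offset(registry_path, log_path)
    lines = []
    with open(log_path, "rb") as f:
        f.seek(offset)
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # incomplete last line: pick it up on the next run
            offset += len(raw)
            lines.append(raw.rstrip(b"\r\n").decode("utf-8"))
    save_offset(registry_path, log_path, offset)
    return lines
```

Because the offset survives a restart, a second call returns nothing until the log grows again.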
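How does Filebeat guarantee at-least-once delivery?
Filebeat guarantees that events are delivered to the configured output at least once, with no data loss.
It can do this because it stores the delivery state of each event in the registry file.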
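The consequence of "at least once" is that an event may arrive more than once, but never zero times. The sketch below illustrates why, under the simplifying assumption that the saved position only advances after the output acknowledges a batch. `ship` and `FlakyOutput` are hypothetical names, and the in-memory `state` dictionary stands in for the registry.

```python
def ship(lines, state, send, batch_size=2):
    """Send lines from state["offset"] onward; advance only after an acknowledgement."""
    while state["offset"] < len(lines):
        start = state["offset"]
        batch = lines[start:start + batch_size]
        if not send(batch):
            return False  # no ack: keep the old offset so this batch is retried
        state["offset"] += len(batch)
    return True


class FlakyOutput:
    """Test double: stores every batch, but loses the ack on one chosen call."""

    def __init__(self, lose_ack_on_call):
        self.received = []
        self.calls = 0
        self.lose_ack_on_call = lose_ack_on_call

    def __call__(self, batch):
        self.calls += 1
        self.received.extend(batch)  # the data did arrive...
        return self.calls != self.lose_ack_on_call  # ...but the ack may be lost


lines = ["a", "b", "c"]
state = {"offset": 0}
output = FlakyOutput(lose_ack_on_call=2)

ship(lines, state, output)   # second batch is delivered, but its ack is lost
ship(lines, state, output)   # retry resends that batch
print(output.received)       # ['a', 'b', 'c', 'c']
```

Nothing is lost, but `c` is delivered twice. If duplicates matter, they have to be handled downstream.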
Deploying the Filebeat environment
| Installing Filebeat
# Install Filebeat from the binary tarball (no compilation needed)
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.12.1-linux-x86_64.tar.gz
tar xf filebeat-7.12.1-linux-x86_64.tar.gz -C /oldboyedu/softwares/
cd /oldboyedu/softwares/
ln -s filebeat-7.12.1-linux-x86_64 filebeat
vim /etc/profile.d/filebeat.sh
# Add the Filebeat environment variables
cat /etc/profile.d/filebeat.sh
#!/bin/bash
export FILE_BEAT=/oldboyedu/softwares/filebeat
export PATH=$PATH:$FILE_BEAT
# Apply the environment variables
source /etc/profile.d/filebeat.sh
# Check that the environment variables took effect
which filebeat
| Filebeat parameters
| Running a first example
Send data from standard input to standard output
vim stdin-to-console.yaml
filebeat.inputs:
- type: stdin
  enabled: true
output.console:
  pretty: true
  enabled: true
# Run Filebeat and watch its output
filebeat -e -c stdin-to-console.yaml
Hands-on enterprise examples
| nginx log collection
Install nginx
yum -y install epel-release
yum -y install nginx
Create the configuration file
vim /etc/nginx/conf.d/elk103.oldboyedu.com.conf
server {
	listen 80;
	
	server_name es.oldboyedu.com;
	root /oldboyedu/data/nginx/;
	location / {
		index index.html;
	}
}
Create test data
mkdir -p /oldboyedu/data/nginx/
echo "<h1>老男孩教育</h1>" > /oldboyedu/data/nginx/index.html
Check the configuration file
nginx -t
Start the nginx service
systemctl start nginx
Test the nginx service
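# Write a script
vim /server/scripts/nginx.sh
#!/bin/bash
# Request the site in an endless loop, sleeping a random 1-5 seconds between requests
while true
do
  Time=$((RANDOM % 5 + 1))
  echo "Interval for this round: ${Time}s"
  curl elk103.oldboyedu.com
  sleep "$Time"
done
Configure nginx to log in JSON and restart nginx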
# Edit the nginx configuration file
vim /etc/nginx/nginx.conf
...
# Define a custom nginx log format that emits JSON
log_format oldboyedu_nginx_json '{"@timestamp":"$time_iso8601",' 
                          '"host":"$server_addr",' 
                          '"clientip":"$remote_addr",' 
                          '"size":$body_bytes_sent,' 
                          '"responsetime":$request_time,' 
                          '"upstreamtime":"$upstream_response_time",' 
                          '"upstreamhost":"$upstream_addr",' 
                          '"http_host":"$host",' 
                          '"uri":"$uri",' 
                          '"domain":"$host",' 
                          '"xff":"$http_x_forwarded_for",' 
                          '"referer":"$http_referer",' 
                          '"tcp_xff":"$proxy_protocol_addr",' 
                          '"http_user_agent":"$http_user_agent",' 
                          '"status":"$status"}';
access_log  /var/log/nginx/access.log  oldboyedu_nginx_json;
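To see what this format means for the consumer, here is a small Python sketch that decodes one line of that shape. The sample line is made up for illustration, and `parse_access_line` is a hypothetical helper. Fields written without quotes in the format (`size`, `responsetime`) decode as numbers, while quoted ones such as `status` stay strings.

```python
import json

# A made-up access-log line in the shape produced by the log_format above.
SAMPLE = (
    '{"@timestamp":"2022-01-01T12:00:00+08:00","host":"192.168.56.132",'
    '"clientip":"192.168.56.1","size":25,"responsetime":0.000,'
    '"upstreamtime":"-","upstreamhost":"-","http_host":"es.oldboyedu.com",'
    '"uri":"/index.html","domain":"es.oldboyedu.com","xff":"-","referer":"-",'
    '"tcp_xff":"-","http_user_agent":"curl/7.29.0","status":"200"}'
)


def parse_access_line(line):
    """Return the decoded event, or None when the line is not a JSON object."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None
    return event if isinstance(event, dict) else None


event = parse_access_line(SAMPLE)
print(type(event["size"]).__name__)          # int
print(type(event["responsetime"]).__name__)  # float
print(type(event["status"]).__name__)        # str
```

nginx logs an empty variable as `-`, which is why `upstreamtime` has to be quoted: an unquoted `-` would not be valid JSON. With the default escaping, a double quote inside a value (for example in a user agent) is written as `\x22`, which JSON parsers reject, so a line like that would fail to decode. nginx 1.11.8 and later accept an `escape=json` parameter on `log_format` that may be worth considering for this reason.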
# Check that the configuration file is valid
nginx -t
# Restart nginx
systemctl restart nginx
Write the Filebeat configuration file
vim 01-nginx-to-es.yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log
  tags: "nginx"
  # Defaults to false. Set it to true so the decoded JSON keys are placed at the top level of the event instead of under a "json" key.
  json.keys_under_root: true
output.elasticsearch:
  hosts: ["192.168.56.130:9200","192.168.56.131:9200","192.168.56.132:9200"]
  #index: "oldboy-2022-%{[agent.version]}-%{+yyyy.MM.dd}"
  indices:
    - index: "oldboyedu-nginx2022-%{+yyyy.MM.dd}"
      when.contains:
        tags: "nginx"
# Disable index lifecycle management (ILM)
setup.ilm.enabled: false
# Name of the index template
setup.template.name: "oldboyedu"
# Index pattern the template applies to
setup.template.pattern: "oldboyedu-nginx*"
# Shard settings for the index template
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
Collect the nginx error log
vim 02-nginx-to-es.yaml 
filebeat.inputs:
- type: log
  paths: 
    - /var/log/nginx/access.log
  tags: "nginx-access"
  # Defaults to false. Set it to true so the decoded JSON keys are placed at the top level of the event instead of under a "json" key.
  json.keys_under_root: true
- type: log
  paths: 
    - /var/log/nginx/error.log
  tags: "nginx-error"
output.elasticsearch:
  hosts: ["192.168.56.130:9200","192.168.56.131:9200","192.168.56.132:9200"]
  #index: "oldboy-2022-%{[agent.version]}-%{+yyyy.MM.dd}"
  indices:
    - index: "oldboyedu-nginx-access-%{+yyyy.MM.dd}"
      when.contains:
        tags: "nginx-access"
    - index: "oldboyedu-nginx-error-%{+yyyy.MM.dd}"
      when.contains:
        tags: "nginx-error"
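# Disable index lifecycle management (ILM)
setup.ilm.enabled: false
# Name of the index template
setup.template.name: "oldboyedu"
# Index pattern the template applies to
setup.template.pattern: "oldboyedu-nginx*"
# Shard settings for the index template
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
| nginx with multiple virtual hosts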
Configure multiple nginx virtual hosts
vim /etc/nginx/conf.d/bbs.oldboyedu.com.conf
server {
	listen 80;
	
	server_name bbs.oldboyedu.com;
	root /oldboyedu/data/nginx/bbs;
	# Path and format of this virtual host's access log
	access_log /var/log/nginx/bbs.log oldboyedu_nginx_json;
	location / {
		index index.html;
	}
}
vim /etc/nginx/conf.d/blog.oldboyedu.com.conf
server {
	listen 80;
	
	server_name blog.oldboyedu.com;
	root /oldboyedu/data/nginx/blog;
	# Path and format of this virtual host's access log
	access_log /var/log/nginx/blog.log oldboyedu_nginx_json;
	location / {
		index index.html;
	}
}
Create test data
mkdir -p /oldboyedu/data/nginx/{blog,bbs}
echo "<h1>blog</h1>" > /oldboyedu/data/nginx/blog/index.html
echo "<h1>bbs</h1>" > /oldboyedu/data/nginx/bbs/index.html
# Check the configuration file syntax
nginx -t
# Add host name mappings
vim /etc/hosts
...
192.168.56.132 blog.oldboyedu.com
192.168.56.132 bbs.oldboyedu.com
# Restart the nginx service
systemctl restart nginx
# Test the service
curl blog.oldboyedu.com
curl bbs.oldboyedu.com
Write the Filebeat YAML
vim nginx_vm_host.yaml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log
  # Place the decoded JSON keys at the top level of the event instead of under a "json" key
  json.keys_under_root: true
  # On conflict, let values from the decoded JSON overwrite the fields Filebeat normally adds
  json.overwrite_keys: true
  # Tag the default access log (access.log)
  tags: ["nginx-access"]
- type: log
  enabled: true
  paths:
    - /var/log/nginx/blog.log
  json.keys_under_root: true
  json.overwrite_keys: true
  # Tag the blog virtual host's access log (blog.log)
  tags: ["nginx-blog"]
- type: log
  enabled: true
  paths:
    - /var/log/nginx/bbs.log
  json.keys_under_root: true
  json.overwrite_keys: true
  # Tag the bbs virtual host's access log (bbs.log)
  tags: ["nginx-bbs"]
- type: log
  enabled: true
  paths:
    - /var/log/nginx/error.log
  # Tag the error log (error.log)
  tags: ["nginx-error"]
output.elasticsearch:
  hosts: ["192.168.56.130:9200","192.168.56.131:9200","192.168.56.132:9200"]
  # index: "nginx-access-%{[agent.version]}-%{+yyyy.MM.dd}"
  # Note: the key below is "indices", not "index"
  indices:
    - index: "nginx-access-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        tags: "nginx-access"
    - index: "nginx-error-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        tags: "nginx-error"
    - index: "nginx-blog-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        tags: "nginx-blog"
    - index: "nginx-bbs-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        tags: "nginx-bbs"
setup.ilm.enabled: false
# Name of the index template
setup.template.name: "nginx"
# Index pattern the template applies to
setup.template.pattern: "nginx-*"
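The routing idea behind `indices` can be sketched as a list of rules checked in order, where the first rule whose tag is present on the event decides the index name. The Python below is only an approximation written for this article: `pick_index` and `RULES` are hypothetical, the condition is simplified to "the tag list includes this tag", and the fallback name assumes Filebeat 7.x's default index pattern.

```python
from datetime import datetime, timezone

# (tag, index template) pairs, checked in order; the first match wins.
RULES = [
    ("nginx-access", "nginx-access-{version}-{date}"),
    ("nginx-error", "nginx-error-{version}-{date}"),
    ("nginx-blog", "nginx-blog-{version}-{date}"),
]

# Assumed fallback, mirroring the default "filebeat-<version>-<date>" naming.
DEFAULT = "filebeat-{version}-{date}"


def pick_index(tags, timestamp, version, rules=RULES, default=DEFAULT):
    """Return the index name for an event with the given tags and timestamp."""
    date = timestamp.strftime("%Y.%m.%d")  # same shape as %{+yyyy.MM.dd}
    for tag, template in rules:
        if tag in tags:
            return template.format(version=version, date=date)
    return default.format(version=version, date=date)


when = datetime(2022, 1, 2, tzinfo=timezone.utc)
print(pick_index(["nginx-blog"], when, "7.12.1"))  # nginx-blog-7.12.1-2022.01.02
print(pick_index(["other"], when, "7.12.1"))       # filebeat-7.12.1-2022.01.02
```

Because the date is part of the name, each tag gets one index per day, which makes it easy to delete old data by index.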
[root@oldboy-es03 project]# filebeat -e -c nginx_vm_host.yaml
| Tomcat log collection
Deploy Tomcat
tar zxf apache-tomcat-10.0.6.tar.gz -C /oldboyedu/softwares/
cd  /oldboyedu/softwares/
ln -s apache-tomcat-10.0.6 tomcat
# Configure the Tomcat environment variables
vim  /etc/profile.d/tomcat.sh
#!/bin/bash
export TOMCAT_HOME=/oldboyedu/softwares/tomcat
export PATH=$PATH:$TOMCAT_HOME/bin
# Apply the environment variables
.  /etc/profile.d/tomcat.sh
catalina.sh
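# Configure Tomcat to write its access log as JSON
vim /oldboyedu/softwares/tomcat/conf/server.xml
... (around line 133)
      <Host name="tomcat.oldboyedu.com"  appBase="webapps"
            unpackWARs="true" autoDeploy="true">
... (comment out the original Valve by hand)
<!--
        <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
               prefix="localhost_access_log" suffix=".txt"
               pattern="%h %l %u %t &quot;%r&quot; %s %b" />
-->
... (then add the Valve below; inside an XML attribute, double quotes must be written as &quot;)
<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
            prefix="tomcat.oldboyedu.com_access_log" suffix=".txt"
            pattern="{&quot;clientip&quot;:&quot;%h&quot;,&quot;ClientUser&quot;:&quot;%l&quot;,&quot;authenticated&quot;:&quot;%u&quot;,&quot;AccessTime&quot;:&quot;%t&quot;,&quot;request&quot;:&quot;%r&quot;,&quot;status&quot;:&quot;%s&quot;,&quot;SendBytes&quot;:&quot;%b&quot;,&quot;Query?string&quot;:&quot;%q&quot;,&quot;partner&quot;:&quot;%{Referer}i&quot;,&quot;AgentVersion&quot;:&quot;%{User-Agent}i&quot;}"/>
...
# Configure host name resolution
vim /etc/hosts
...
192.168.56.132 tomcat.oldboyedu.com
# Start the Tomcat service
catalina.sh start
# Verify the service
	(omitted)
Collect the logs with Filebeat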
vim ~/conf/project/tomcat/01.tomcat-to-es.yaml
filebeat.inputs:
- type: log
  paths:
    - /oldboyedu/softwares/tomcat/logs/tomcat.oldboyedu.com_access_log.*.txt
  # Place the decoded JSON keys at the top level of the event instead of under a "json" key
  json.keys_under_root: true
  # Tag the Tomcat access log
  tags: "tomcat-access"
output.elasticsearch:
  hosts: ["192.168.56.130:9200","192.168.56.131:9200","192.168.56.132:9200"]
  # Note: the key below is "indices", not "index"
  indices:
    - index: "tomcat-access-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        tags: "tomcat-access"
setup.ilm.enabled: false
# Name of the index template
setup.template.name: "tomcat"
# Index pattern the template applies to
setup.template.pattern: "tomcat-*"
# Shard settings for the index template
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0
Collect the error logs
vim  ~/conf/project/tomcat/03.tomcat-to-es.yaml
filebeat.inputs:
- type: log
  paths:
    - /oldboyedu/softwares/tomcat/logs/tomcat.oldboyedu.com_access_log.*.txt
  json.keys_under_root: true
  tags: "tomcat-access"
- type: log
  paths:
    - /oldboyedu/softwares/tomcat/logs/catalina*
  tags: "tomcat-error"
  multiline.type: pattern
  multiline.pattern: '^\d{2}'
  multiline.negate: true
  multiline.match: after
  multiline.max_lines: 1000
output.elasticsearch:
  hosts: ["192.168.56.130:9200","192.168.56.131:9200","192.168.56.132:9200"]
  indices:
    - index: "tomcat-access-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        tags: "tomcat-access"
    - index: "tomcat-error-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        tags: "tomcat-error"
setup.ilm.enabled: false
setup.template.name: "tomcat"
setup.template.pattern: "tomcat-*"
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0
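The `multiline` settings in the second input (`pattern: '^\d{2}'`, `negate: true`, `match: after`) mean that any line that does not start with two digits is appended to the preceding line that does. That keeps a Java stack trace together with the timestamped line that introduced it. The Python below is a rough sketch of that grouping rule only, not Filebeat's implementation; `merge_multiline` is a hypothetical name and the log lines are made up.

```python
import re


def merge_multiline(lines, pattern=r"^\d{2}", max_lines=1000):
    """Group lines into events: a line matching the pattern starts a new event,
    and every non-matching line is appended to the event before it."""
    start = re.compile(pattern)
    events = []
    current = []
    for line in lines:
        if start.search(line) or not current:
            if current:
                events.append("\n".join(current))
            current = [line]
        elif len(current) < max_lines:
            current.append(line)
        # Lines beyond max_lines are dropped from the event.
    if current:
        events.append("\n".join(current))
    return events


# Made-up catalina log lines: one error with a stack trace, then one info line.
SAMPLE = [
    "02-Jan-2022 10:00:00.123 SEVERE [main] Failed to initialize connector",
    "\torg.apache.catalina.LifecycleException: Protocol handler initialization failed",
    "\t\tat org.apache.catalina.connector.Connector.initInternal(Connector.java:1059)",
    "02-Jan-2022 10:00:01.456 INFO [main] Server startup in [512] milliseconds",
]

events = merge_multiline(SAMPLE)
print(len(events))  # 2
```

The four physical lines become two events, so the stack trace is indexed as part of the SEVERE entry instead of as separate, context-free documents.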





