ELK: Logstash Configuration Syntax

Data types

Logstash supports the following data types:

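array
An array can hold a single string value or several.
path => [ "/var/log/messages", "/var/log/*.log" ]
path => "/data/mysql/mysql.log"
If the setting is specified more than once, the values are appended to the array. In this example, path ends up containing three string elements.
boolean
A boolean must be true or false. The true and false keywords are written without quotes.
ssl_enable => true
bytes
A string that gives a number of bytes with a unit. Both SI units (k M G T P E Z Y) and binary units (Ki Mi Gi Ti Pi Ei Zi Yi) are supported. Binary units are base 1024 and SI units are base 1000. The value is case-insensitive, and a space between the number and the unit is ignored. If no unit is given, the number is taken as bytes.
my_bytes => "1113" # 1113 bytes
my_bytes => "10MiB" # 10485760 bytes
my_bytes => "100kib" # 102400 bytes
my_bytes => "180 mb" # 180000000 bytes
codec
The name of a Logstash codec, which describes how the data is encoded. Codecs can be used in both input and output sections and simplify data handling: when the input and output use suitable codecs, no separate filter is needed to parse the data.
codec => "json"
hash
A collection of key-value pairs. Note that multiple entries are separated by whitespace, not commas.
match => {
  "field1" => "value1"
  "field2" => "value2"
  ...
}
number
A number must be a valid numeric value, either floating point or integer.
port => 33
password
A single string value that is not logged or printed.
my_password => "password"
path
A string that represents a valid operating system path.
my_path => "/tmp/logstash"
string
A sequence of characters enclosed in double or single quotes.
name => "Hello world"
name => 'It\'s a beautiful day'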
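
As a rough illustration of how these types look side by side, the sketch below sets several of them on a single tcp input. The port number, type, tag names, and added fields are hypothetical, and ssl_enable is simply carried over from the boolean example above. Which options a plugin accepts depends on the plugin and its version, so this is an assumption to check against the plugin documentation, not a reference configuration.

```conf
input {
  tcp {
    port => 5000                  # number (hypothetical port)
    ssl_enable => false           # boolean, written without quotes
    codec => "json"               # codec
    type => "demo"                # string
    tags => [ "tcp", "demo" ]     # array
    add_field => {                # hash: entries separated by whitespace, not commas
      "env" => "test"
      "team" => "ops"
    }
  }
}
```

Writing the hash one entry per line makes the missing commas less surprising to readers used to JSON.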

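Field references

Logstash has its own syntax for referring to fields. To use the value of a field in a Logstash configuration, write the field name in square brackets []; this is called a field reference. Field nesting matters. For a top-level field you can omit the [] and give just the field name. For a nested field you must give the full path, such as [top-level field][nested field].

The following event has five top-level fields (agent, ip, request, response, ua) and three nested fields (status, bytes, os).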

{
  "agent": "Mozilla/5.0 (compatible; MSIE 9.0)",
  "ip": "192.168.24.44",
  "request": "/index.html",
  "response": {
    "status": 200,
    "bytes": 52353
  },
  "ua": {
    "os": "Windows 7"
  }
}

To reference the os field, write [ua][os]. To reference a top-level field such as request, the bare name request is enough.
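
A minimal sketch of field references inside a filter, assuming an event shaped like the sample above. The rename and remove_field options belong to the mutate filter; the target name client_os is a hypothetical choice made for illustration.

```conf
filter {
  mutate {
    # Move the nested field [ua][os] to a top-level field (hypothetical name).
    rename => { "[ua][os]" => "client_os" }
    # Drop one nested field and leave the rest of [response] in place.
    remove_field => [ "[response][bytes]" ]
  }
}
```

Note that the nested paths are quoted strings here because they appear as hash keys and array elements, not as bare expressions.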

sprintf format

Field references also work inside what Logstash calls the sprintf format, which lets you embed field values in other strings. For example:

output {
  statsd {
    increment => "apache.%{[response][status]}"
  }
}

The sprintf format can also format the event timestamp. For example:

output {
  file {
    path => "/var/log/%{type}.%{+yyyy.MM.dd.HH}"
  }
}
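
Field values and a formatted timestamp can appear in the same configuration. The following sketch uses the add_field option of the mutate filter; the field names summary and day are hypothetical, and the event is assumed to look like the sample in the field references section.

```conf
filter {
  mutate {
    add_field => {
      # Top-level and nested fields interpolated into one string.
      "summary" => "%{request} returned %{[response][status]}"
      # %{+FORMAT} formats the event's @timestamp.
      "day" => "%{+yyyy-MM-dd}"
    }
  }
}
```

For the sample event, summary would read "/index.html returned 200".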

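Conditionals

Use conditionals to make a filter or output process only certain events.

Logstash conditionals look and behave like those in programming languages. They support if, else if, and else statements, and they can be nested.

The syntax is: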

if EXPRESSION {
  ...
} else if EXPRESSION {
  ...
} else {
  ...
}

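The comparison operators are:

Comparison: ==, !=, <, >, <=, >=
Regular expression: =~ (matches), !~ (does not match)
Inclusion: in (contained in), not in (not contained in)

The boolean operators are:

and, or, nand (not and), xor (exclusive or)

The unary operators are:

! (negation)
() (groups a compound expression), !() (negates the result of a compound expression)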

For example, the following mutate filter removes the secret field when the action field is login. The option is written as remove_field here, the name used by current versions of the mutate filter; very old releases called it remove.

filter {
  if [action] == "login" {
    mutate { remove_field => "secret" }
  }
}

Multiple expressions can be combined in a single condition:

output {
  # Send production errors to pagerduty
  if [loglevel] == "ERROR" and [deployment] == "production" {
    pagerduty {
      ...
    }
  }
}

The in operator can test a field value against another field, a string, or a list:

filter {
  if [foo] in [foobar] {
    mutate { add_tag => "field in field" }
  }
  if [foo] in "foo" {
    mutate { add_tag => "field in string" }
  }
  if "hello" in [greeting] {
    mutate { add_tag => "string in field" }
  }
  if [foo] in ["hello", "world", "foo"] {
    mutate { add_tag => "field in list" }
  }
  if [missing] in [alsomissing] {
    mutate { add_tag => "shouldnotexist" }
  }
  if !("foo" in ["hello", "world"]) {
    mutate { add_tag => "shouldexist" }
  }
}

The not in operator works the same way. For example, to send events to Elasticsearch only when grok parsing succeeded:

output {
  if "_grokparsefailure" not in [tags] {
    elasticsearch { ... }
  }
}
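
The examples above do not exercise the regular expression operator or an else branch, so here is a small sketch that does. It assumes events with the request and [response][status] fields from the field references section; the path prefix and tag names are hypothetical.

```conf
filter {
  if [request] =~ /^\/admin/ {
    # Request path starts with /admin (slash escaped inside the regex literal).
    mutate { add_tag => [ "admin" ] }
  } else if [response][status] >= 500 {
    # Numeric comparison on a nested field.
    mutate { add_tag => [ "server_error" ] }
  } else {
    mutate { add_tag => [ "ok" ] }
  }
}
```

Branches are evaluated in order, so an admin request that also returns a 500 status receives only the admin tag.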

Field references, the sprintf format, and conditionals can be used only in filter and output sections, not in input sections.

The @metadata field

Starting with Logstash 1.5 there is a special field called @metadata. Whatever @metadata contains is not emitted as part of the event at output time.

input { stdin { } }

filter {
  mutate { add_field => { "show" => "This data will be in the output" } }
  mutate { add_field => { "[@metadata][test]" => "Hello" } }
  mutate { add_field => { "[@metadata][no_show]" => "This data will not be in the output" } }
}

output {
  if [@metadata][test] == "Hello" {
    stdout { codec => rubydebug }
  }
}

Check the output:

$ bin/logstash -f ../test.conf
Logstash startup completed
asdf
{
       "message" => "asdf",
      "@version" => "1",
    "@timestamp" => "2015-03-18T23:09:29.595Z",
          "host" => "www.ttlsa.com",
          "show" => "This data will be in the output"
}

The typed line "asdf" became the content of the message field. The condition on the test field nested in @metadata evaluated to true, yet the output shows neither the @metadata field nor its contents.

However, the rubydebug codec can display the contents of @metadata when metadata => true is set:

stdout { codec => rubydebug { metadata => true } }

The output is now:

$ bin/logstash -f ../test.conf
Logstash startup completed
asdf
{
       "message" => "asdf",
      "@version" => "1",
    "@timestamp" => "2015-03-18T23:10:19.859Z",
          "host" => "www.ttlsa.com",
          "show" => "This data will be in the output",
     "@metadata" => {
           "test" => "Hello",
        "no_show" => "This data will not be in the output"
    }
}

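The @metadata field and its subfields are now visible.

Note: only the rubydebug codec can display the contents of the @metadata field.

Use @metadata for fields that are needed only temporarily and should not appear in the final output. The most common case is the date filter, which needs a temporary timestamp field. For example: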

input { stdin { } }

filter {
  grok { match => [ "message", "%{HTTPDATE:[@metadata][timestamp]}" ] }
  date { match => [ "[@metadata][timestamp]", "dd/MMM/yyyy:HH:mm:ss Z" ] }
}

output {
  stdout { codec => rubydebug }
}

Output:

$ bin/logstash -f ../test.conf
Logstash startup completed
02/Mar/2014:15:36:43 +0100
{
       "message" => "02/Mar/2014:15:36:43 +0100",
      "@version" => "1",
    "@timestamp" => "2014-03-02T14:36:43.000Z",
          "host" => "example.com"
}
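
The same idea extends to routing. The sketch below, offered as an assumption-laden illustration and not as part of the original walkthrough, stores an index prefix in @metadata and uses it through the sprintf format in the elasticsearch output. The prefix value, the field name index_prefix, and the host address are all hypothetical.

```conf
filter {
  # Kept in @metadata so it is not emitted as part of the event.
  mutate { add_field => { "[@metadata][index_prefix]" => "apache" } }
}

output {
  elasticsearch {
    hosts => [ "localhost:9200" ]                             # hypothetical address
    index => "%{[@metadata][index_prefix]}-%{+YYYY.MM.dd}"    # prefix plus event date
  }
}
```

Because the prefix lives under @metadata, it steers the output without being emitted as a field of the event.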