Data types
Logstash supports the following data types:
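- array
An array can hold a single string value or several. If the same setting is specified more than once, the values are appended to the array.
    path => [ "/var/log/messages", "/var/log/*.log" ]
    path => "/data/mysql/mysql.log"
In this example, the path array contains three string elements.
- boolean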
A boolean must be either true or false. The true and false keywords are not enclosed in quotes.
    ssl_enable => true
- bytes
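A bytes value is a string that represents a number of bytes with an optional unit. Both SI units (k M G T P E Z Y) and binary units (Ki Mi Gi Ti Pi Ei Zi Yi) are supported. Binary units are base 1024 and SI units are base 1000. The value is case-insensitive, and spaces between the number and the unit are ignored. If no unit is specified, the number is taken as a count of bytes.
    my_bytes => "1113"   # 1113 bytes
    my_bytes => "10MiB"  # 10485760 bytes
    my_bytes => "100kib" # 102400 bytes
    my_bytes => "180 mb" # 180000000 bytes
- codec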
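A codec is the name of the Logstash codec used to represent the data. Codecs can be used in both input and output sections. With a suitable codec on the input or output, the data is decoded or encoded there, and no separate filter is needed for that step.
    codec => "json"
- hash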
A hash is a collection of key-value pairs. Note that multiple entries are separated by whitespace, not commas.
    match => {
      "field1" => "value1"
      "field2" => "value2"
      ...
    }
- number
A number must be a valid numeric value, either floating point or integer.
    port => 33
- password
A password is a single string value. Unlike an ordinary string, it is not logged or printed.
    my_password => "password"
- path
A path is a string that represents a valid operating system path.
    my_path => "/tmp/logstash"
- string
A string is a single character sequence enclosed in double or single quotes.
    name => "Hello world"
    name => 'It\'s a beautiful day'
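To show how several of these value types sit together, here is a minimal sketch of a single input block. The tcp input and its port, codec, tags, and add_field settings are real plugin options; the concrete values (port 5000, the tag names, the environment and team fields) are made up for illustration.

```
input {
  tcp {
    port      => 5000                   # number (hypothetical port)
    codec     => "json"                 # codec
    tags      => [ "app", "staging" ]   # array (hypothetical tags)
    add_field => {                      # hash: entries separated by whitespace, not commas
      "environment" => "staging"
      "team"        => "platform"
    }
  }
}
```

Each setting takes exactly one of the types listed above, so a value of the wrong type (for example a quoted string where a number is expected) is a configuration error rather than something Logstash silently converts.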
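Field references
This is the Logstash field reference syntax. To use the value of a field in the Logstash configuration, write the field name in square brackets []; this is called a field reference. Pay attention to the field hierarchy. To refer to a top-level field, you can omit the [] and use the field name alone. To refer to a nested field, give the full path to it, as in [top-level field][nested field].
The event below has five top-level fields (agent, ip, request, response, ua) and three nested fields (status, bytes, os).
    {
      "agent": "Mozilla/5.0 (compatible; MSIE 9.0)",
      "ip": "192.168.24.44",
      "request": "/index.html",
      "response": {
        "status": 200,
        "bytes": 52353
      },
      "ua": {
        "os": "Windows 7"
      }
    }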
To refer to the os field, write [ua][os]. To refer to a top-level field such as request, the name request alone is enough.
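As a small sketch of a field reference used inside a plugin setting, the block below renames the nested os field of the sample event to a top-level field. The mutate filter's rename option is a real setting; the target name os_name is an assumption chosen for this example.

```
filter {
  mutate {
    # [ua][os] is the full path to the nested field;
    # os_name is a hypothetical top-level target name
    rename => { "[ua][os]" => "os_name" }
  }
}
```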
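sprintf format
The field reference format is also used in what Logstash calls sprintf format. This format lets you refer to field values from within other strings. For example:
    output {
      statsd {
        increment => "apache.%{[response][status]}"
      }
    }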
The same syntax can format the event timestamp. For example:
    output {
      file {
        path => "/var/log/%{type}.%{+yyyy.MM.dd.HH}"
      }
    }
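sprintf references can also be combined in an ordinary string setting. The sketch below assumes the sample event from the field references section and builds one field from two references; the field name summary is hypothetical.

```
filter {
  mutate {
    add_field => {
      # mixes a top-level reference and a nested reference in one string
      "summary" => "%{request} returned %{[response][status]}"
    }
  }
}
```

For that sample event, summary would hold the text "/index.html returned 200".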
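Conditionals
Conditionals let you control whether a filter or output processes a given event.
Logstash conditionals look and behave like those in programming languages. They support if, else if, and else statements, and they can be nested.
The syntax is:
    if EXPRESSION {
      ...
    } else if EXPRESSION {
      ...
    } else {
      ...
    }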
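The comparison operators are:
- equality and ordering: ==, !=, <, >, <=, >=
- regexp: =~ (matches the pattern), !~ (does not match the pattern)
- inclusion: in, not in
The boolean operators are: and, or, nand (not and), xor (exclusive or).
The unary operator is ! (negation). Expressions can be grouped with parentheses (...) to form compound expressions, and !(...) negates the result of a compound expression.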
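A short sketch of several of these operators in one filter block, using the fields of the sample event from the field references section. The tag names and the regular expression are assumptions made up for illustration.

```
filter {
  if [request] =~ /\.html$/ and [response][status] == 200 {
    # regexp match combined with an equality test
    mutate { add_tag => [ "page_view" ] }
  } else if [response][status] >= 500 {
    # ordering comparison on a numeric field
    mutate { add_tag => [ "server_error" ] }
  } else if !([ip] in ["127.0.0.1", "192.168.24.44"]) {
    # negated compound expression using the inclusion operator
    mutate { add_tag => [ "external" ] }
  }
}
```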
For example, the following conditional uses the mutate filter to remove the secret field when the action field has the value login:
    filter {
      if [action] == "login" {
        mutate { remove_field => "secret" }
      }
    }
You can specify multiple expressions in a single condition:
    output {
      # Send production errors to pagerduty
      if [loglevel] == "ERROR" and [deployment] == "production" {
        pagerduty {
          ...
        }
      }
    }
The in operator can test a field value against another field, a string, or a list:
    filter {
      if [foo] in [foobar] {
        mutate { add_tag => "field in field" }
      }
      if [foo] in "foo" {
        mutate { add_tag => "field in string" }
      }
      if "hello" in [greeting] {
        mutate { add_tag => "string in field" }
      }
      if [foo] in ["hello", "world", "foo"] {
        mutate { add_tag => "field in list" }
      }
      if [missing] in [alsomissing] {
        mutate { add_tag => "shouldnotexist" }
      }
      if !("foo" in ["hello", "world"]) {
        mutate { add_tag => "shouldexist" }
      }
    }
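The not in operator works the same way. Here an event is sent to elasticsearch only if its tags do not contain _grokparsefailure:
    output {
      if "_grokparsefailure" not in [tags] {
        elasticsearch { ... }
      }
    }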
Field references, sprintf format, and conditionals can only be used in filter and output blocks, not in input blocks.
The @metadata field
Starting with Logstash 1.5, there is a special field called @metadata. Its contents are not part of the event at output time.
    input { stdin { } }

    filter {
      mutate { add_field => { "show" => "This data will be in the output" } }
      mutate { add_field => { "[@metadata][test]" => "Hello" } }
      mutate { add_field => { "[@metadata][no_show]" => "This data will not be in the output" } }
    }

    output {
      if [@metadata][test] == "Hello" {
        stdout { codec => rubydebug }
      }
    }
The output looks like this:
    $ bin/logstash -f ../test.conf
    Logstash startup completed
    asdf
    {
      "message" => "asdf",
      "@version" => "1",
      "@timestamp" => "2015-03-18T23:09:29.595Z",
      "host" => "www.ttlsa.com",
      "show" => "This data will be in the output"
    }
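The typed line "asdf" became the content of the message field. The conditional on the test field nested in @metadata evaluated to true, yet the output shows neither the @metadata field nor its contents.
However, the rubydebug codec can display the contents of @metadata if you set metadata => true:
    stdout { codec => rubydebug { metadata => true } }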
The output then becomes:
    $ bin/logstash -f ../test.conf
    Logstash startup completed
    asdf
    {
      "message" => "asdf",
      "@version" => "1",
      "@timestamp" => "2015-03-18T23:10:19.859Z",
      "host" => "www.ttlsa.com",
      "show" => "This data will be in the output",
      "@metadata" => {
        "test" => "Hello",
        "no_show" => "This data will not be in the output"
      }
    }
Now the @metadata field and its subfields are visible.
Note: only the rubydebug codec can display the contents of the @metadata field.
Use the @metadata field whenever you need a field only temporarily and do not want it in the final output. Perhaps the most common case is the date filter, which needs a temporary timestamp field. For example:
    input { stdin { } }

    filter {
      grok { match => [ "message", "%{HTTPDATE:[@metadata][timestamp]}" ] }
      date { match => [ "[@metadata][timestamp]", "dd/MMM/yyyy:HH:mm:ss Z" ] }
    }

    output {
      stdout { codec => rubydebug }
    }
The output:
    $ bin/logstash -f ../test.conf
    Logstash startup completed
    02/Mar/2014:15:36:43 +0100
    {
      "message" => "02/Mar/2014:15:36:43 +0100",
      "@version" => "1",
      "@timestamp" => "2014-03-02T14:36:43.000Z",
      "host" => "example.com"
    }
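Another way @metadata can be used is to carry a routing decision from the filter stage to the output stage without storing it in the event. The sketch below is one possible arrangement, not a prescribed one: the subfield name index_prefix and the type value apache are assumptions, and all other elasticsearch settings are left at their defaults.

```
filter {
  if [type] == "apache" {
    mutate { add_field => { "[@metadata][index_prefix]" => "apache" } }
  } else {
    mutate { add_field => { "[@metadata][index_prefix]" => "misc" } }
  }
}

output {
  elasticsearch {
    # sprintf reference to the temporary field plus a date pattern
    index => "%{[@metadata][index_prefix]}-%{+YYYY.MM.dd}"
  }
}
```

Because index_prefix lives under @metadata, it steers the index name but is not written into the stored document.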