Author: 星仔走天涯k | Source: Internet | 2023-05-19 00:20
I am trying to use Logstash to convert XML into JSON for Elasticsearch. I can read the values and send them to Elasticsearch. The issue is that all the values come out as arrays; I would like them to come out as plain strings. I know I can do a replace for each field individually, but then I run into an issue with nested fields that are 3 levels deep.
XML
Location Id
User Id
My Name
2015-08-07
10.5
Logstash Config
input {
  file {
    path => "/var/log/logstash/test.xml"
  }
}
filter {
  multiline {
    pattern => "^\s\s(\s\s|\<\/acs2:SubmitTestResult\>)"
    what => "previous"
  }
  if "multiline" in [tags] {
    mutate {
      replace => ["message", '%{message}']
    }
    xml {
      target => "SubmitTestResult"
      source => "message"
    }
    mutate {
      remove_field => ["message", "@version", "host", "@timestamp", "path", "tags", "type"]
      remove_field => ["entry", "[SubmitTestResult][xmlns:acs2]", "[SubmitTestResult][xmlns:acs]", "[SubmitTestResult][xmlns:acs1]"]
      # This works
      replace => [ "[SubmitTestResult][locationId]", "%{[SubmitTestResult][locationId]}" ]
      # This does NOT work
      replace => [ "[SubmitTestResult][TestResult][CreatedBy]", "%{[SubmitTestResult][TestResult][CreatedBy]}" ]
    }
  }
}
output {
  stdout {
    codec => "rubydebug"
  }
  elasticsearch {
    index => "xmltest"
    cluster => "logstash"
  }
}
Example Output
{
  "_index": "xmltest",
  "_type": "logs",
  "_id": "AU8IZBURkkRvuur_3YDA",
  "_version": 1,
  "found": true,
  "_source": {
    "SubmitTestResult": {
      "locationId": "Location Id",
      "userId": [
        "User Id"
      ],
      "TestResult": [
        {
          "CreatedBy": [
            "My Name"
          ],
          "CreatedDate": [
            "2015-08-07"
          ],
          "Output": [
            "10.5"
          ]
        }
      ]
    }
  }
}
As you can see, every element comes out as an array (except for locationId, which I overwrote with a replace). I am trying to avoid doing a replace for each element. Is there a way to adjust the config so the output comes out properly? If not, how do I reach 3 levels deep in the replace?
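One way to avoid a replace per field is to unwrap every single-element array in one recursive pass. The following is a minimal plain-Ruby sketch of that logic, assuming the parsed XML is an ordinary nested Hash/Array structure shaped like the example output. `unwrap_singletons` is a hypothetical helper name, not part of Logstash, and wiring it into a `ruby` filter is left out because the event accessor syntax differs between Logstash versions.

```ruby
# Recursively collapse single-element arrays into their sole element.
# `unwrap_singletons` is a hypothetical helper name, not a Logstash API.
def unwrap_singletons(value)
  case value
  when Hash
    # Rebuild the hash with every value unwrapped.
    value.each_with_object({}) { |(k, v), out| out[k] = unwrap_singletons(v) }
  when Array
    items = value.map { |v| unwrap_singletons(v) }
    # Keep genuine multi-value arrays; collapse only arrays of length one.
    items.length == 1 ? items.first : items
  else
    value
  end
end

# Same shape as the parsed XML in the example output above.
parsed = {
  "locationId" => ["Location Id"],
  "userId" => ["User Id"],
  "TestResult" => [
    {
      "CreatedBy" => ["My Name"],
      "CreatedDate" => ["2015-08-07"],
      "Output" => ["10.5"]
    }
  ]
}

flat = unwrap_singletons(parsed)
puts flat.inspect
```

Note the trade-off: an element that appears once in some documents and several times in others would be a string in one event and an array in another. Newer releases of the xml filter plugin also document a `force_array` setting that may make this pass unnecessary; check the plugin documentation for the version in use.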
--UPDATE--
I figured out how to get to the 3rd level in Test Results. The replace is:
replace => [ "[SubmitTestResult][TestResult][0][CreatedBy]", "%{[SubmitTestResult][TestResult][0][CreatedBy]}" ]
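The `[0]` is needed because `TestResult` is itself an array holding one object, so the path has to step through an index before it can reach `CreatedBy`. The sketch below shows the same idea in plain Ruby on a hash shaped like the example output; it is an analogy for the data structure only, not a reproduction of how Logstash resolves field references internally.

```ruby
# Same shape as the "_source" document in the example output above.
source = {
  "SubmitTestResult" => {
    "locationId" => "Location Id",
    "TestResult" => [
      { "CreatedBy" => ["My Name"] }
    ]
  }
}

test_result = source["SubmitTestResult"]["TestResult"]
puts test_result.class # an Array, so a key lookup needs an index first

# With the index, the path resolves to the inner single-element array.
created_by = source.dig("SubmitTestResult", "TestResult", 0, "CreatedBy")
puts created_by.inspect

# Without the index, Ruby cannot look up a string key on an Array.
begin
  test_result["CreatedBy"]
rescue TypeError => e
  puts "lookup without index failed: #{e.class}"
end
```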
1 Solution