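Author: 何处逐梦_273 | Source: Internet | 2023-10-13 08:28
Parsing nested JSON objects with jq: using select to match key values in nested objects while preserving the existing structure
I am trying to take a complex JSON file of more than 20,000 lines and extract specific keys while keeping the surrounding metadata, which supplies the context a human reader needs to make sense of the result.
Data source (complex structure):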
{
"Marketplace": [
{
"Level1Name": "Company A Products","Level1Array": [
{
"Level2Name": "USA Products List","Level2Contents": [
{
"Level3Name": "ALL","Level3URL": "https://a.com/products"
},{
"Level3Name": "Subset1001","Level3URL": "https://a.com/products/subset1001"
}
]
}
]
},{
"Level1Name": "Company B Products","Level3URL": "https://b.com/products"
},{
"Level3Name": "Subset500","Level3URL": "https://b.com/products/subset500"
}
]
},{
"Level2Name": "EU Products List","Level3URL": "https://b.eu/products"
},{
"Level3Name": "Subset200","Level3URL": "https://b.eu/products/subset200"
}
]
}
]
},{
"Level1Name": "Company X Products","Level1Array": [
{
"Level2Name": "Deleted Products","Level2URL": "https://internal.x.com/products"
}
]
}
]
}
The jq command I currently use for the extraction strips out all of the surrounding context metadata...
jq -r '(
.Marketplace[].Level1Array[].Level2Contents[]
| select (.Level3Name | index("ALL"))
| [.]
)'
Output it produces...
[
{
"Level3Name": "ALL","Level3URL": "https://a.com/products"
}
]
[
{
"Level3Name": "ALL","Level3URL": "https://b.com/products"
}
]
[
{
"Level3Name": "ALL","Level3URL": "https://b.eu/products"
}
]
Desired output, option 1: the same JSON structure with every non-matching object removed, where the filter condition is the string "ALL"
{
"Marketplace":
[
{
"Level1Name": "Company A Products","Level1Array": [
{
"Level2Name": "USA Products List","Level2Contents": [
{
"Level3Name": "ALL","Level3URL": "https://a.com/products"
}
]
}
]
},{
"Level1Name": "Company B Products","Level3URL": "https://b.com/products"
}
]
},{
"Level2Name": "EU Products List","Level3URL": "https://b.eu/products"
}
]
}
]
}
]
}
Desired output, option 2: any similar format that can be iterated over in a loop, for example:
{
"Marketplace":
[
{
"Level1Name": "Company A Products","Level2Name": "USA Products List","Level3Name": "ALL","Level3URL": "https://a.com/products"
},"Level3URL": "https://b.com/products"
},"Level2Name": "EU Products List","Level3URL": "https://b.eu/products"
}
]
}
The following filter produces the "option 2" output:
.Marketplace |= map(
{Level1Name} as $Level1Name
| .Level1Array[]
| {Level2Name} as $Level2Name
| .Level2Contents[]?
| select(.Level3Name == "ALL")
| $Level1Name + $Level2Name + . )
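Breaking it down...
One way to understand this is to consider the filter without the surrounding map, written with an explicit if/else: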
.Marketplace[]
| {Level1Name} as $Level1Name
| .Level1Array[]
| {Level2Name} as $Level2Name
| .Level2Contents[]? # in case .Level2Contents is missing
| if (.Level3Name == "ALL")
then $Level1Name + $Level2Name + .
else empty
end
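For readers who want to cross-check the logic outside jq, the same nested iteration can be sketched in Python. This is only an illustration under stated assumptions: the function name `flatten_matches` and the trimmed sample data are hypothetical, and missing arrays are treated as empty (in jq only `.Level2Contents[]?` is guarded, so a missing `Level1Array` would raise an error there).

```python
import json

# Trimmed sample in the shape of the question's data (assumption: two companies only).
doc = json.loads("""
{
  "Marketplace": [
    {
      "Level1Name": "Company A Products",
      "Level1Array": [
        {
          "Level2Name": "USA Products List",
          "Level2Contents": [
            {"Level3Name": "ALL", "Level3URL": "https://a.com/products"},
            {"Level3Name": "Subset1001", "Level3URL": "https://a.com/products/subset1001"}
          ]
        }
      ]
    },
    {
      "Level1Name": "Company X Products",
      "Level1Array": [
        {"Level2Name": "Deleted Products", "Level2URL": "https://internal.x.com/products"}
      ]
    }
  ]
}
""")


def flatten_matches(document, wanted="ALL"):
    """Hypothetical helper: one flat row per level-3 object whose Level3Name equals `wanted`."""
    rows = []
    for level1 in document["Marketplace"]:
        # Like `{Level1Name} as $Level1Name`: remember the outer name.
        for level2 in level1.get("Level1Array", []):
            # Like `.Level2Contents[]?`: a missing list simply yields nothing.
            for level3 in level2.get("Level2Contents", []):
                if level3.get("Level3Name") == wanted:
                    # Like `$Level1Name + $Level2Name + .`: merge the three objects.
                    rows.append({
                        "Level1Name": level1.get("Level1Name"),
                        "Level2Name": level2.get("Level2Name"),
                        **level3,
                    })
    return {"Marketplace": rows}


result = flatten_matches(doc)
print(json.dumps(result, indent=2))
```

Each row carries its parent names, so a consumer can loop over `result["Marketplace"]` without walking the original nesting.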
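Addendum: "Name"
The OP subsequently asked what to do if the name key at all three levels is simply called "Name". The answer is easily obtained by adapting the filter above: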
.Marketplace |= map(
{Level1Name: .Name} as $Level1Name
| .Level1Array[]
| {Level2Name: .Name} as $Level2Name
| .Level2Contents[]?
| select(.Name == "ALL")
| $Level1Name + $Level2Name + . )
Output
In this case, the output is as follows:
{
"Marketplace": [
{
"Level1Name": "Company A Products","Level2Name": "USA Products List","Name": "ALL","Level3URL": "https://a.com/products"
},{
"Level1Name": "Company B Products","Level3URL": "https://b.com/products"
},"Level2Name": "EU Products List","Level3URL": "https://b.eu/products"
}
]
}
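Here is another way you could approach this. As I understand it, you want a way to search a recursive tree of objects for some value and remove every object that does not have a property with that value.
What you can do is find the paths to all the values you want to keep (the ones holding the value you are searching for), and then delete every other object that does not lie on the path to one of those keepers.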
def is_subpath($paths): [.,length] as [$path,$length] |
any($paths[]; $length <= length and $path == .[:$length]);
[paths(strings == "ALL")[:-1]] as $keepers
| delpaths([paths(objects) | select(is_subpath($keepers) | not)])
{
"Marketplace": [
{
"Level1Name": "Company A Products","Level1Array": [
{
"Level2Name": "USA Products List","Level2Contents": [
{
"Level3Name": "ALL","Level3URL": "https://a.com/products"
}
]
}
]
},"Level3URL": "https://b.com/products"
}
]
},{
"Level2Name": "EU Products List","Level3URL": "https://b.eu/products"
}
]
}
]
}
]
}
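The keep-the-paths idea can also be sketched in Python, as a rough cross-check rather than a translation of the jq filter. The names `on_keeper_path` and `prune` and the trimmed sample data are hypothetical. Two assumptions differ from the jq version: the top-level value is always kept, and the tree is re-scanned for each object, which is simple but not efficient for very large files.

```python
import json

# Trimmed sample in the shape of the question's data (assumption: two companies only).
doc = json.loads("""
{
  "Marketplace": [
    {
      "Level1Name": "Company A Products",
      "Level1Array": [
        {
          "Level2Name": "USA Products List",
          "Level2Contents": [
            {"Level3Name": "ALL", "Level3URL": "https://a.com/products"},
            {"Level3Name": "Subset1001", "Level3URL": "https://a.com/products/subset1001"}
          ]
        }
      ]
    },
    {
      "Level1Name": "Company X Products",
      "Level1Array": [
        {"Level2Name": "Deleted Products", "Level2URL": "https://internal.x.com/products"}
      ]
    }
  ]
}
""")


def on_keeper_path(node, wanted):
    """True if node directly holds the wanted string, or some container below it does."""
    if isinstance(node, dict):
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return False
    return any(
        (child == wanted) if isinstance(child, str) else on_keeper_path(child, wanted)
        for child in children
    )


def prune(node, wanted="ALL"):
    """Copy node, dropping every object that neither holds nor leads to a match."""
    if isinstance(node, dict):
        return {
            key: prune(value, wanted)
            for key, value in node.items()
            if not isinstance(value, dict) or on_keeper_path(value, wanted)
        }
    if isinstance(node, list):
        return [
            prune(value, wanted)
            for value in node
            if not isinstance(value, dict) or on_keeper_path(value, wanted)
        ]
    return node  # scalars are kept unchanged


pruned = prune(doc)
print(json.dumps(pruned, indent=2))
```

As in the jq answer, only objects are removed; arrays and scalar properties of kept objects stay in place, so the original nesting survives around each match.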