Author: 520那孩HAPPY | Source: Internet | 2023-08-27 13:30
Hello, I'm a MongoDB beginner. I have a database containing an IRC chat log. The document structure is very simple:
{
"_id" : ObjectId("000"),
"user" : "username",
"message" : "foobar foobar potato idontknow",
"time" : NumberLong(1451775601469)
}
I have thousands of these, and I want to count the number of occurrences of the string "foobar". I googled the problem and found material about aggregations. It looks very complicated, and I haven't found any question as "simple" as this one. I'd be glad if someone pointed me in the right direction on what to research, and I wouldn't mind an example command that does exactly what I want. Thank you.
1 solution
There is no built-in operator that does this directly.
You can try this query, but it performs very poorly because every document is fetched and scanned on the client:
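// Client-side scan: print each user's per-message count of "foobar".
db.chat.find().forEach(function (doc) {
    // Treat a missing message as empty; match() returns null when nothing matches.
    var matches = (doc.message || "").match(/foobar/g) || [];
    print(doc.user + " > " + matches.length);
});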
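The query above prints one count per message, while the question asks for the number of occurrences overall. A minimal sketch of the summing step in plain JavaScript follows; the `docs` array is a hypothetical in-memory stand-in for the `chat` collection, and `countOccurrences` is an illustrative name, not a MongoDB function.

```javascript
// Hypothetical sample data standing in for the documents in db.chat.
const docs = [
    { user: "alice", message: "foobar foobar potato idontknow" },
    { user: "bob", message: "no match here" },
    { user: "carol", message: "one foobar only" },
];

// Count non-overlapping occurrences of `word` across all messages.
function countOccurrences(documents, word) {
    // Escape regex metacharacters so `word` is matched literally.
    const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    const pattern = new RegExp(escaped, "g");
    let total = 0;
    for (const doc of documents) {
        // A missing message counts as empty; match() returns null on no match.
        total += ((doc.message || "").match(pattern) || []).length;
    }
    return total;
}

console.log(countOccurrences(docs, "foobar")); // prints 3
```

The same accumulation could presumably be done inside the `forEach` callback in the shell by adding to a running total instead of printing each line; it would still scan every document on the client.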
If you could change your message field to an array, then we could apply aggregation...
EDIT:
If you add an array of the split words to each document, we can apply aggregation.
Sample:
{
"_id" : ObjectId("569bb7040586bcb40f7d2539"),
"user" : "username",
"fullmessage" : "foobar foobar potato idontknow",
"message" : [
"foobar",
"foobar",
"potato",
"idontknow"
],
"time" : NumberLong(1451775601469)
}
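A document in the shape of the sample above could be derived from the original one roughly as follows. This is only a sketch: `toWordDocument` is a hypothetical helper, and splitting on whitespace alone is an assumption (punctuation stays attached to words).

```javascript
// Hypothetical helper: keep the original text in `fullmessage`
// and replace `message` with the array of its words.
function toWordDocument(doc) {
    return {
        ...doc,
        fullmessage: doc.message,
        // Split on runs of whitespace and drop empty strings
        // produced by leading or trailing spaces.
        message: doc.message.split(/\s+/).filter((w) => w.length > 0),
    };
}

const original = {
    user: "username",
    message: "foobar foobar potato idontknow",
    time: 1451775601469,
};

console.log(toWordDocument(original).message);
// prints [ 'foobar', 'foobar', 'potato', 'idontknow' ]
```

Doing the split when the message is written keeps the cost out of the query, at the price of storing the text twice.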
Aggregation: we create a new document for each array element ($unwind), keep the ones that match the given word (foobar, in this case), and then count the matches per original message.
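db.chat.aggregate([
    // One output document per word in the message array
    {"$unwind" : "$message"},
    // Keep only words containing "foobar" (case-insensitive, unanchored)
    {"$match" : {"message" : {"$regex" : "foobar", "$options" : "i"}}},
    // Regroup by original document and count the matched words
    {"$group" : {_id:{"_id" : "$_id", "user" : "$user", "time" : "$time", "fullmessage" : "$fullmessage"}, "count" : {$sum:1}}},
    // Flatten the compound _id back into top-level fields
    {"$project" : {_id:"$_id._id", "user" : "$_id.user", "time" : "$_id.time", "fullmessage" : "$_id.fullmessage", "count" : "$count"}}
])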
Result:
[
{
"_id" : ObjectId("569bb7040586bcb40f7d2539"),
"count" : 2,
"user" : "username",
"time" : NumberLong(1451775601469),
"fullmessage" : "foobar foobar potato idontknow"
}
]
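To make the stages easier to follow, here is a plain-JavaScript emulation of the pipeline over in-memory data. It is an illustration under stated assumptions, not how MongoDB executes the query: the `chat` array and its simplified numeric `_id` values are hypothetical, and `countPerMessage` is an illustrative name.

```javascript
// Hypothetical in-memory documents in the restructured shape.
const chat = [
    { _id: 1, user: "username", fullmessage: "foobar foobar potato idontknow",
      message: ["foobar", "foobar", "potato", "idontknow"] },
    { _id: 2, user: "other", fullmessage: "potato Foobar",
      message: ["potato", "Foobar"] },
    { _id: 3, user: "third", fullmessage: "nothing", message: ["nothing"] },
];

// `pattern` should not carry the "g" flag, since test() would then be stateful.
function countPerMessage(documents, pattern) {
    // Like $unwind: one record per word.
    const unwound = documents.flatMap((doc) =>
        doc.message.map((word) => ({
            _id: doc._id, user: doc.user, fullmessage: doc.fullmessage, word,
        })));
    // Like $match: keep only the words matching the pattern.
    const matched = unwound.filter((rec) => pattern.test(rec.word));
    // Like $group and $project: one entry per original document, with a count.
    const groups = new Map();
    for (const rec of matched) {
        const entry = groups.get(rec._id) ||
            { _id: rec._id, user: rec.user, fullmessage: rec.fullmessage, count: 0 };
        entry.count += 1;
        groups.set(rec._id, entry);
    }
    return Array.from(groups.values());
}

const perMessage = countPerMessage(chat, /foobar/i);
console.log(perMessage.map((e) => e._id + ": " + e.count)); // prints [ '1: 2', '2: 1' ]
```

Note that the third document does not appear at all: messages with no matching word are filtered out before grouping, so they never show up with a count of 0. For a single total over the whole collection, the per-message counts would still need to be summed, for example with a final `$group` stage whose `_id` is `null`.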