热门标签 | HotTags
当前位置:  开发笔记 > 编程语言 > 正文

MongoDB聚合操作(aggregate)

聚合操作聚合函数:db.collection.aggregate()语法:db.collection.aggregate(pipeline,options)文档

聚合操作

聚合函数:db.collection.aggregate()



  • 语法: db.collection.aggregate(pipeline, options)

    • 文档定义了操作中使用的聚合管道阶段和聚合操作符


    • 文档声明了一些和聚合操作的参数

    • 返回值是游标




聚合表达式



  1. 字段路径表达式

    $: 使用 $ 来表示路径

    $. 使用 $ 和 . 来指示内嵌文档字段路径



  2. 系统变量表达式

    $$: 使用 $$ 指示系统变量

    $$CURRENT: 指示管道中当前操作的文档, $$CURRENT.$ 是等效的



  3. 常量表达式

    $literal: : 指示常量

    $literal: "$name": 指示常量字符串 "\(name"(这里的\)被当做常量处理,而不是字段路径表达式)




聚合管道阶段

$project: 对输入文档进行再次投影



  • 可以用了灵活的的控制输出文档的格式



  • 也可用了剔除不相关的字段,以优化聚合管道操作的性能

    > db.accounts.find()
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20 }
    { "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20 }
    >
    // 对账户文档进行重新映射
    > db.accounts.aggregate([
    ... {
    ... $project: {
    ... _id: 0,
    ... balance: 1,
    ... clientName: "$name.firstName"
    ... }
    ... }
    ... ])
    { "balance" : 20, "clientName" : "alice" }
    { "balance" : 20, "clientName" : "bob" }
    > db.accounts.aggregate([
    ... {
    ... $project: {
    ... _id: 0,
    ... balance: 1,
    ... nameArrage: ["$name.firstName", "$name.middleName", "$name.lastName"]
    ... }
    ... }
    ... ])
    { "balance" : 20, "nameArrage" : [ "alice", null, "wong" ] }
    { "balance" : 20, "nameArrage" : [ "bob", null, "yang" ] }



$match: 对输入文档进行筛选



  • $match 中使用的文档筛选语法,和读取文档时的筛选语法相同



  • 应该尽量在聚合管道开始的时候使用 $match 这样可以减少后续阶段中需要处理的文档数量,优化聚合操作的性能

    > db.accounts.aggregate([
    ... {
    ... $match: {
    ... $or: [
    ... {balance: {$gt: 10, $lt: 30}},
    ... {"name.lastName": "yang"}
    ... ]
    ... }
    ... },
    ... {
    ... $project: {
    ... _id: 0
    ... }
    ... }
    ... ])
    { "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20 }
    { "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20 }



$limit: 筛选出管道中前 N 篇文档


$skip: 跳过管道内前 N 篇文档

> db.accounts.aggregate([
... {
... $limit: 1
... }
... ])
{ "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20 }
>
>
> db.accounts.aggregate([
... {
... $skip: 1
... }
... ])
{ "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20 }
>

$unwind: 展开输入文档中的数组字段



  • 就是把指定数据的元素单独拿出来,与其他文档字段组成新的文档,一变多

  • 默认展开文档时,如果文档中指定数组字段不存在,或者为null, 或者为空[], 将会被过滤掉

  • 如果展示数组时保留空数组,或者不存在数组字段的文档, 同(preserveNullAndEmptyArrays: true)实现

    > db.accounts.find()
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : [ "CNY", "USD" ] }
    { "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP" }
    >
    >
    > db.accounts.aggregate([
    ... {
    ... $unwind: {
    ... path: "$currency"
    ... }
    ... }
    ... ])
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "CNY" }
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "USD" }
    { "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP" }
    >
    // 展开数组时添加元素的位置
    > db.accounts.aggregate([
    ... {
    ... $unwind: {
    ... path: "$currency",
    ... includeArrayIndex: "ccyIndex"
    ... }
    ... }
    ... ])
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "CNY", "ccyIndex" : NumberLong(0) }
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "USD", "ccyIndex" : NumberLong(1) }
    { "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP", "ccyIndex" : null }
    >
    // 默认展开文档时,如果文档中指定数组字段不存在,或者为null, 或者为空[], 将会被过滤掉
    > db.accounts.find()
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : [ "CNY", "USD" ] }
    { "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP" }
    { "_id" : ObjectId("634d6ec45799d43c905c7197"), "name" : { "firstName" : "charlie", "lastName" : "gordon" }, "balance" : 40 }
    { "_id" : ObjectId("634d6ec45799d43c905c7198"), "name" : { "firstName" : "david", "lastName" : "wu" }, "balance" : 60, "currency" : [ ] }
    { "_id" : ObjectId("634d6ec45799d43c905c7199"), "name" : { "firstName" : "eddie", "lastName" : "kim" }, "balance" : 100, "currency" : null }
    >
    > db.accounts.aggregate([
    ... {
    ... $unwind: {
    ... path: "$currency"
    ... }
    ... }
    ... ])
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "CNY" }
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "USD" }
    { "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP" }
    // 如果展示数组时保留空数组,或者不存在数组字段的文档, 同(preserveNullAndEmptyArrays: true)实现
    > db.accounts.aggregate([
    ... {
    ... $unwind: {
    ... path: "$currency",
    ... preserveNullAndEmptyArrays: true
    ... }
    ... }
    ... ])
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "CNY" }
    { "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "USD" }
    { "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP" }
    { "_id" : ObjectId("634d6ec45799d43c905c7197"), "name" : { "firstName" : "charlie", "lastName" : "gordon" }, "balance" : 40 }
    { "_id" : ObjectId("634d6ec45799d43c905c7198"), "name" : { "firstName" : "david", "lastName" : "wu" }, "balance" : 60 }
    { "_id" : ObjectId("634d6ec45799d43c905c7199"), "name" : { "firstName" : "eddie", "lastName" : "kim" }, "balance" : 100, "currency" : null }



$sort: 对输入文档进行排序

> db.accounts.aggregate([
... {
... $sort: {balance: -1, "name.LastName": 1}
... }
... ])
{ "_id" : ObjectId("634d6ec45799d43c905c7199"), "name" : { "firstName" : "eddie", "lastName" : "kim" }, "balance" : 100, "currency" : null }
{ "_id" : ObjectId("634d6ec45799d43c905c7198"), "name" : { "firstName" : "david", "lastName" : "wu" }, "balance" : 60, "currency" : [ ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7197"), "name" : { "firstName" : "charlie", "lastName" : "gordon" }, "balance" : 40 }
{ "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : [ "CNY", "USD" ] }
{ "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP" }

$lookup: 类似于join操作,对其他 collection 进行查询,结果写入管道文档中

官方文档地址

语法1:

{
$lookup:
{
from: ,
localField: ,
foreignField: ,
as:
}
}

语法2:

{
$lookup:
{
from: ,
let: { : , …, : },
pipeline: [

],
as:
}
}


  • pipeline: 对查询集合中的文档使用聚合阶段进行处理

  • let(可参数): 对查询集合中的文档使用聚合阶段进行处理时,如果需要参考管道文档中的字段,则必须使用let参数先对字段进行声明

示例1:

> db.forex.find()
{ "_id" : ObjectId("634d73585799d43c905c719a"), "ccy" : "USD", "rate" : 6.91, "data" : ISODate("2018-12-12T00:00:00Z") }
{ "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") }
{ "_id" : ObjectId("634d73585799d43c905c719c"), "ccy" : "CNY", "rate" : 1, "data" : ISODate("2018-12-12T00:00:00Z") }
>
> db.account.find()
> db.accounts.find()
{ "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : [ "CNY", "USD" ] }
{ "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP" }
{ "_id" : ObjectId("634d6ec45799d43c905c7197"), "name" : { "firstName" : "charlie", "lastName" : "gordon" }, "balance" : 40 }
{ "_id" : ObjectId("634d6ec45799d43c905c7198"), "name" : { "firstName" : "david", "lastName" : "wu" }, "balance" : 60, "currency" : [ ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7199"), "name" : { "firstName" : "eddie", "lastName" : "kim" }, "balance" : 100, "currency" : null }
>
> db.accounts.aggregate([
... {
... $lookup: {
... from: "forex",
... localField: "currency",
... foreignField: "ccy",
... as: "forexData"
... }
... }
... ])
{ "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : [ "CNY", "USD" ], "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719a"), "ccy" : "USD", "rate" : 6.91, "data" : ISODate("2018-12-12T00:00:00Z") }, { "_id" : ObjectId("634d73585799d43c905c719c"), "ccy" : "CNY", "rate" : 1, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
{ "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP", "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7197"), "name" : { "firstName" : "charlie", "lastName" : "gordon" }, "balance" : 40, "forexData" : [ ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7198"), "name" : { "firstName" : "david", "lastName" : "wu" }, "balance" : 60, "currency" : [ ], "forexData" : [ ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7199"), "name" : { "firstName" : "eddie", "lastName" : "kim" }, "balance" : 100, "currency" : null, "forexData" : [ ] }
> db.accounts.aggregate([
... {
... $unwind: {
... path: "$currency"
... }
... },
... {
... $lookup: {
... from: "forex",
... localField: "currency",
... foreignField: "ccy",
... as: "forexData"
... }
... }
... ])
{ "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "CNY", "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719c"), "ccy" : "CNY", "rate" : 1, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
{ "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : "USD", "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719a"), "ccy" : "USD", "rate" : 6.91, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
{ "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP", "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") } ] }

示例2:

# 将特定日期外汇汇率写入银行账户文档 (不相关查询 3.6版本开始支持)
db.forex.find()
{ "_id" : ObjectId("634d73585799d43c905c719a"), "ccy" : "USD", "rate" : 6.91, "data" : ISODate("2018-12-13T00:00:00Z") }
{ "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") }
{ "_id" : ObjectId("634d73585799d43c905c719c"), "ccy" : "CNY", "rate" : 1, "data" : ISODate("2018-12-13T00:00:00Z") }
>
> db.accounts.find()
{ "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : [ "CNY", "USD" ] }
{ "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP" }
{ "_id" : ObjectId("634d6ec45799d43c905c7197"), "name" : { "firstName" : "charlie", "lastName" : "gordon" }, "balance" : 40 }
{ "_id" : ObjectId("634d6ec45799d43c905c7198"), "name" : { "firstName" : "david", "lastName" : "wu" }, "balance" : 60, "currency" : [ ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7199"), "name" : { "firstName" : "eddie", "lastName" : "kim" }, "balance" : 100, "currency" : null }
>
> db.accounts.aggregate([
... {
... $lookup: {
... from: "forex",
... pipeline: [
... {
... $match: {
... data: new Date("2018-12-12")
... }
... }
... ],
... as: "forexData"
... }
... }
... ])
{ "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : [ "CNY", "USD" ], "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
{ "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP", "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7197"), "name" : { "firstName" : "charlie", "lastName" : "gordon" }, "balance" : 40, "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7198"), "name" : { "firstName" : "david", "lastName" : "wu" }, "balance" : 60, "currency" : [ ], "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7199"), "name" : { "firstName" : "eddie", "lastName" : "kim" }, "balance" : 100, "currency" : null, "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
>
# 将特定日期的外汇汇率写入余额大于50的银行账户
> db.accounts.aggregate([
... {
... $lookup: {
... from: "forex",
... let: {bal: "$balance"},
... pipeline: [
... { $match:
... { $expr:
... { $and:
... [
... {$eq: ["$data", new Date("2018-12-12")]},
... {$gt: ["$$bal", 50]}
... ]
...
... }
...
... }
... }
... ],
... as: "forexData"
... }
... }
... ])
{ "_id" : ObjectId("634d622b5799d43c905c7195"), "name" : { "firstName" : "alice", "lastName" : "wong" }, "balance" : 20, "currency" : [ "CNY", "USD" ], "forexData" : [ ] }
{ "_id" : ObjectId("634d622b5799d43c905c7196"), "name" : { "firstName" : "bob", "lastName" : "yang" }, "balance" : 20, "currency" : "GBP", "forexData" : [ ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7197"), "name" : { "firstName" : "charlie", "lastName" : "gordon" }, "balance" : 40, "forexData" : [ ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7198"), "name" : { "firstName" : "david", "lastName" : "wu" }, "balance" : 60, "currency" : [ ], "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") } ] }
{ "_id" : ObjectId("634d6ec45799d43c905c7199"), "name" : { "firstName" : "eddie", "lastName" : "kim" }, "balance" : 100, "currency" : null, "forexData" : [ { "_id" : ObjectId("634d73585799d43c905c719b"), "ccy" : "GBP", "rate" : 8.72, "data" : ISODate("2018-12-12T00:00:00Z") } ] }

$group: 对输入文档进行分组

官方文档地址

语法:

{
$group:
{
_id: , // Group key
: { : },
...
}
}


  • _id: 定义分组规则, 就是通过那个字段进行分组

  • : 可以使用聚合操作符来定义新字段(求和,平均 ...)

  • $group: 可以返回文档中某一字段的所有(不重复)值

  • 使用聚合操作符计算分组聚合值

  • 使用聚合操作符计算所有文档的聚合值(_id: null), 相当于只分了一组

  • 使用聚合操作符创建数组字段,并将指定字段,push到数组中

# 按照交易货币来分组交易记录(可以返回文档中某一字段的所有(不重复)值)
> db.transactions.find()
{ "_id" : ObjectId("635d46341b8b77383133f015"), "symbol" : "600519", "qty" : 100, "price" : 567.4, "currency" : "CNY" }
{ "_id" : ObjectId("635d46341b8b77383133f016"), "symbol" : "AMZN", "qty" : 1, "price" : 1377.5, "currency" : "USD" }
{ "_id" : ObjectId("635d46341b8b77383133f017"), "symbol" : "AAPL", "qty" : 2, "price" : 150.7, "currency" : "USD" }
>
> db.transactions.aggregate([
... {
... $group: {
... _id: "$currency"
... }
... }
... ])
{ "_id" : "USD" }
{ "_id" : "CNY" }
# 使用聚合操作符计算分组聚合值
> db.transactions.aggregate([
... {
... $group: {
... _id: "$currency",
... totalQty: {$sum: "$qty"},
... totalNotional: {$sum: { $multiply: ["$price", "$qty"]}},
... avgPrice: {$avg: "$price"},
... count: {$sum: 1},
... maxNotional: {$max: {$multiply: ["$price", "$qty"]}},
... minNotional: {$min: {$multiply: ["$price", "$qty"]}}
... }
... }
... ])
{ "_id" : "USD", "totalQty" : 3, "totalNotional" : 1678.9, "avgPrice" : 764.1, "count" : 2, "maxNotional" : 1377.5, "minNotional" : 301.4 }
{ "_id" : "CNY", "totalQty" : 100, "totalNotional" : 56740, "avgPrice" : 567.4, "count" : 1, "maxNotional" : 56740, "minNotional" : 56740 }
# 使用聚合操作符计算所有文档的聚合值(`_id: null`), 相当于只分了一个组
> db.transactions.aggregate([
... {
... $group: {
... _id: null,
... totalQty: {$sum: "$qty"},
... totalNotional: {$sum: { $multiply: ["$price", "$qty"]}},
... avgPrice: {$avg: "$price"},
... count: {$sum: 1},
... maxNotional: {$max: {$multiply: ["$price", "$qty"]}},
... minNotional: {$min: {$multiply: ["$price", "$qty"]}}
... }
... }
... ])
{ "_id" : null, "totalQty" : 103, "totalNotional" : 58418.9, "avgPrice" : 698.5333333333333, "count" : 3, "maxNotional" : 56740, "minNotional" : 301.4 }
# 使用聚合操作符创建数组字段,并将指定字段,push到数组中
> db.transactions.aggregate([
... {
... $group: {
... _id: "$currency",
... "symbols": {$push: "$symbol"}
... }
... }
... ])
{ "_id" : "USD", "symbols" : [ "AMZN", "AAPL" ] }
{ "_id" : "CNY", "symbols" : [ "600519" ] }

$out: 获取聚合管道返回的文档并将它们写入指定的集合。从 MongoDB 4.4 开始,您可以指定输出数据库

官方文档地址

语法:

{ $out: { db: "", coll: "" } }

# 将聚合管道中的文档写入一个新集合中
> db.transactions.aggregate([
... {
... $group: {
... _id: "$currency",
... "symbols": {$push: "$symbol"}
... }
... },
... {$out: "output"}
... ])
>
> db.output.find()
{ "_id" : "USD", "symbols" : [ "AMZN", "AAPL" ] }
{ "_id" : "CNY", "symbols" : [ "600519" ] }
# 将聚合管道中的文档写入一个已经存在的集合中,已存在的集合会被清空,然后写入的值,原集合中的数据会丢失,非常危险
> db.transactions.aggregate([
... {
... $group: {
... _id: "$symbol",
... totalNotional: {$sum: {$multiply: ["$price", "$qty"]}}
... }
... },
... {$out: "output"}
... ])
> db.output.find()
{ "_id" : "AMZN", "totalNotional" : 1377.5 }
{ "_id" : "AAPL", "totalNotional" : 301.4 }
{ "_id" : "600519", "totalNotional" : 56740 }
>

聚合操作



  • 语法: db..aggregate(

    , )


    • 文档定义了操作中使用的聚合管道阶段和聚合操作符


    • 文档声明了一些和聚合操作的参数




allowDiskUse:



  • 每个聚合管道阶段使用的内存不能超过100MB

  • 如果数据量较大,为了防止聚合管道阶段超出内存上限并抛出错误,可以启用allowDiskUse选项

  • allowDiskUse 启用后,聚合阶段可以在内存容量不足时,将操作数据写入临时文件中

  • 临时文件会被写入 dbPath 下的 _tmp 文件夹,dbPath 的默认值为 /data/db

> db.transactions.aggregate([
... {
... $group: {
... _id: "$currency",
... "symbols": {$push: "$symbol"}
... }
... }
... ],
... {allowDiskUse: true}
... )
{ "_id" : "CNY", "symbols" : [ "600519" ] }
{ "_id" : "USD", "symbols" : [ "AMZN", "AAPL" ] }

聚合操作的优化



  • 聚合阶段顺序的优化



    • $project + $match, $match 阶段会在 $project 阶段前运行


      db.transactions.aggregate([
      {
      $project: {
      _id: 0, symbol: 1, currency: 1,
      notional: {$multiply: ["$price", "$qty"]}
      }
      },
      {
      $match: {
      currency: "USD",
      notional: {$gt: 1000}
      }
      }
      ])
      # 实际执行时,会优化成如下
      db.transactions.aggregate([
      {
      $match: {
      currency: "USD",
      }
      },
      {
      $project: {
      _id: 0, symbol: 1, currency: 1,
      notional: {$multiply: ["$price", "$qty"]}
      }
      },
      {
      $match: {
      notional: {$gt: 1000}
      }
      }
      ])


    • $sort + $match, $match 阶段会在 $sort 阶段之前运行

    • $project + $skip, $skip 阶段会在 $project 阶段之前运行



  • 聚合阶段合并的优化



    • $sort + $limit, 如果两者之间没有夹杂会改变文档数量的聚合阶段, $sort 和 $limit 阶段可以合并

      db.transactions.aggregate([
      {
      $sort: {price: 1}
      },
      {
      $project: {
      _id: 0, symbol: 1, currency: 1,
      notional: {$multiply: ["$price", "$qty"]}
      }
      },
      {
      $limit: 2
      }
      ])
      # 实际执行
      db.transactions.aggregate([
      {
      $sort: {price: 1} + $limit: 2 # 伪代码
      },
      {
      $project: {
      _id: 0, symbol: 1, currency: 1,
      notional: {$multiply: ["$price", "$qty"]}
      }
      }
      ])


    • $limit + $limit $skip + $skip $match + $match: 连续 \(limit,\)skip,$match 阶段排列在一起时,可以合并为一个阶段

      { $limit: 10} + { $limit: 5} = { $limit: 5}
      { $skip: 10} + { $skip: 5} = { $skip: 15}
      # $match 合并
      { $match: {currency: "USD"}}
      { $match: {qty: 1}
      {
      $match: {
      $and: [
      {"currency": "USD"},
      {"qty": 1}
      ]
      }
      }


    • $lookup + $unwind, 连续排列在一起的 $lookup 和 $unwind 阶段,如果 $unwind 应用在 $lookup 阶段创建的 as 字段上,则两者可以合并

      db.accounts.aggregate([
      {
      $lookup: {
      from: "forex",
      localField: "currency",
      foreignField: "ccy",
      as: "forexData"
      }
      },
      {
      $unwind: "$forexData"
      }
      ])
      db.accounts.aggregate([
      {
      $lookup: {
      from: "forex",
      localField: "currency",
      foreignField: "ccy",
      as: "forexData"
      }
      }
      + // 伪码,合并后可以节省临时性文档的创建和更改
      {
      $unwind: "$forexData"
      }
      ])






推荐阅读
  • 使用Ubuntu中的Python获取浏览器历史记录原文: ... [详细]
  • vue使用
    关键词: ... [详细]
  • Iamtryingtomakeaclassthatwillreadatextfileofnamesintoanarray,thenreturnthatarra ... [详细]
  • 本文介绍了数据库的存储结构及其重要性,强调了关系数据库范例中将逻辑存储与物理存储分开的必要性。通过逻辑结构和物理结构的分离,可以实现对物理存储的重新组织和数据库的迁移,而应用程序不会察觉到任何更改。文章还展示了Oracle数据库的逻辑结构和物理结构,并介绍了表空间的概念和作用。 ... [详细]
  • Java序列化对象传给PHP的方法及原理解析
    本文介绍了Java序列化对象传给PHP的方法及原理,包括Java对象传递的方式、序列化的方式、PHP中的序列化用法介绍、Java是否能反序列化PHP的数据、Java序列化的原理以及解决Java序列化中的问题。同时还解释了序列化的概念和作用,以及代码执行序列化所需要的权限。最后指出,序列化会将对象实例的所有字段都进行序列化,使得数据能够被表示为实例的序列化数据,但只有能够解释该格式的代码才能够确定数据的内容。 ... [详细]
  • 本文介绍了Hyperledger Fabric外部链码构建与运行的相关知识,包括在Hyperledger Fabric 2.0版本之前链码构建和运行的困难性,外部构建模式的实现原理以及外部构建和运行API的使用方法。通过本文的介绍,读者可以了解到如何利用外部构建和运行的方式来实现链码的构建和运行,并且不再受限于特定的语言和部署环境。 ... [详细]
  • 怀疑是每次都在新建文件,具体代码如下 ... [详细]
  • sklearn数据集库中的常用数据集类型介绍
    本文介绍了sklearn数据集库中常用的数据集类型,包括玩具数据集和样本生成器。其中详细介绍了波士顿房价数据集,包含了波士顿506处房屋的13种不同特征以及房屋价格,适用于回归任务。 ... [详细]
  • 利用Visual Basic开发SAP接口程序初探的方法与原理
    本文介绍了利用Visual Basic开发SAP接口程序的方法与原理,以及SAP R/3系统的特点和二次开发平台ABAP的使用。通过程序接口自动读取SAP R/3的数据表或视图,在外部进行处理和利用水晶报表等工具生成符合中国人习惯的报表样式。具体介绍了RFC调用的原理和模型,并强调本文主要不讨论SAP R/3函数的开发,而是针对使用SAP的公司的非ABAP开发人员提供了初步的接口程序开发指导。 ... [详细]
  • 本文讨论了clone的fork与pthread_create创建线程的不同之处。进程是一个指令执行流及其执行环境,其执行环境是一个系统资源的集合。在调用系统调用fork创建一个进程时,子进程只是完全复制父进程的资源,这样得到的子进程独立于父进程,具有良好的并发性。但是二者之间的通讯需要通过专门的通讯机制,另外通过fork创建子进程系统开销很大。因此,在某些情况下,使用clone或pthread_create创建线程可能更加高效。 ... [详细]
  • 模板引擎StringTemplate的使用方法和特点
    本文介绍了模板引擎StringTemplate的使用方法和特点,包括强制Model和View的分离、Lazy-Evaluation、Recursive enable等。同时,还介绍了StringTemplate语法中的属性和普通字符的使用方法,并提供了向模板填充属性的示例代码。 ... [详细]
  • 本文介绍了操作系统的定义和功能,包括操作系统的本质、用户界面以及系统调用的分类。同时还介绍了进程和线程的区别,包括进程和线程的定义和作用。 ... [详细]
  • EzPP 0.2发布,新增YAML布局渲染功能
    EzPP发布了0.2.1版本,新增了YAML布局渲染功能,可以将YAML文件渲染为图片,并且可以复用YAML作为模版,通过传递不同参数生成不同的图片。这个功能可以用于绘制Logo、封面或其他图片,让用户不需要安装或卸载Photoshop。文章还提供了一个入门例子,介绍了使用ezpp的基本渲染方法,以及如何使用canvas、text类元素、自定义字体等。 ... [详细]
  • 本文讨论了如何使用GStreamer来删除H264格式视频文件中的中间部分,而不需要进行重编码。作者提出了使用gst_element_seek(...)函数来实现这个目标的思路,并提到遇到了一个解决不了的BUG。文章还列举了8个解决方案,希望能够得到更好的思路。 ... [详细]
  • MySQL多表数据库操作方法及子查询详解
    本文详细介绍了MySQL数据库的多表操作方法,包括增删改和单表查询,同时还解释了子查询的概念和用法。文章通过示例和步骤说明了如何进行数据的插入、删除和更新操作,以及如何执行单表查询和使用聚合函数进行统计。对于需要对MySQL数据库进行操作的读者来说,本文是一个非常实用的参考资料。 ... [详细]
author-avatar
孤火自燃
这个家伙很懒,什么也没留下!
PHP1.CN | 中国最专业的PHP中文社区 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved | 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有