Author: Hyukjae333 | Source: Internet | 2023-08-27 23:07
Gist
I implemented a like button in my application. Let's imagine users are able to like other users' products.
Issue
I am now wondering which of the following is the most efficient and robust way to store those likes in a non-relational database (in my case MongoDB). It's important that no user can like a product twice.
Possible Solutions
(1) Store the user IDs of those who liked the product on the product itself, and keep track of the number of likes via likes.length
// Product in database
{
likes: [
'userId1',
'userId2',
'userId3',
...
],
...
}
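A minimal sketch of how option (1) might prevent double likes, assuming the official MongoDB Node.js driver and a hypothetical `products` collection handle. The helper names are illustrative and not from the question; only the `$addToSet` and `$pull` operators are MongoDB's own. `$addToSet` adds a value to an array only if it is not already present, so a repeated like leaves the array unchanged.

```javascript
// Hypothetical helpers that build the arguments for collection.updateOne().
// Nothing here talks to a database; these are plain MongoDB filter/update documents.

// Like: $addToSet appends userId only if the array does not contain it yet.
function buildLikeUpdate(productId, userId) {
  return {
    filter: { _id: productId },
    update: { $addToSet: { likes: userId } },
  };
}

// Unlike: $pull removes every occurrence of userId from the array.
function buildUnlikeUpdate(productId, userId) {
  return {
    filter: { _id: productId },
    update: { $pull: { likes: userId } },
  };
}

// "Has this user liked the product?" A scalar matched against an array
// field matches when any element of the array equals it.
function buildHasLikedFilter(productId, userId) {
  return { _id: productId, likes: userId };
}

const likeArgs = buildLikeUpdate('productId1', 'userId1');
console.log(JSON.stringify(likeArgs));
// Assumed usage with a driver collection handle named `products`:
//   await products.updateOne(likeArgs.filter, likeArgs.update);
```

Because the check and the write happen in a single update on one document, two concurrent likes from the same user should not produce a duplicate entry.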
(2) Store all products that a user liked on the user itself, and keep track of the number of likes through a counter on the product
// User in database
{
likedProducts: [
'productId1',
'productId2',
'productId3',
...
],
...
}
// Product in database
{
numberOfLikes: 42,
...
}
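One way option (2) could keep the list and the counter in step is sketched below. The helper names and the `users`/`products` collection handles are assumptions for illustration. The idea is to record the like on the user only if it is not there yet, and to increment `numberOfLikes` only when that first write actually modified a document. Note that the two writes touch different documents, so they are not atomic as a pair unless wrapped in a transaction.

```javascript
// Step 1: record the like on the user, but only if it is not recorded yet.
// The $ne condition makes the filter match nothing for a repeated like.
function buildUserLikeStep(userId, productId) {
  return {
    filter: { _id: userId, likedProducts: { $ne: productId } },
    update: { $push: { likedProducts: productId } },
  };
}

// Step 2: bump the denormalized counter on the product.
function buildCounterStep(productId) {
  return {
    filter: { _id: productId },
    update: { $inc: { numberOfLikes: 1 } },
  };
}

// Step 2 should run only when step 1 changed exactly one user document.
function shouldIncrement(modifiedCount) {
  return modifiedCount === 1;
}

// Assumed usage with driver collection handles `users` and `products`:
//   const s1 = buildUserLikeStep(userId, productId);
//   const res = await users.updateOne(s1.filter, s1.update);
//   if (shouldIncrement(res.modifiedCount)) {
//     const s2 = buildCounterStep(productId);
//     await products.updateOne(s2.filter, s2.update);
//   }
console.log(JSON.stringify(buildUserLikeStep('userId1', 'productId1')));
```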
(3) Maybe there is an even better solution for this?
Either way, if the product has many likes or the user liked many products, a large amount of data has to be loaded just to show the likes and check whether the user has already liked the product.
2 Solutions
1
Which approach to use, (1) or (2), depends on your use case. Specifically, think about which data you will need to access more often: all products liked by a particular user (2), or all users who liked a particular product (1). It seems likely that (1) is the more frequent case. That way you can easily tell whether the user has already liked the product, and the number of likes for the product is simply the array length.
I would argue that any further improvement would likely be premature optimization. It's better to optimize with a problem in hand.
If showing the number of likes, for example, turns out to be a bottleneck, you can denormalize your data further by storing the array length as a separate key-value pair. That way, displaying the product list wouldn't require fetching the array of likes with userIds from the database.
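The denormalized counter could be maintained roughly as follows. This is a sketch under assumptions: the counter field is named `numberOfLikes` (borrowed from the question's option (2)), and the helper and the `products` handle are hypothetical. Since both the array and the counter live in the same document, a single update can change them together, and the `$ne` condition keeps a repeated like from matching at all.

```javascript
// Push the userId and bump the counter in one update on one document.
// For a repeated like the filter matches nothing, so neither field changes.
function buildLikeWithCounter(productId, userId) {
  return {
    filter: { _id: productId, likes: { $ne: userId } },
    update: { $push: { likes: userId }, $inc: { numberOfLikes: 1 } },
  };
}

// Projection for product lists: exclude the (possibly large) likes array
// so that only the counter and the other product fields are returned.
const listProjection = { likes: 0 };

const counterArgs = buildLikeWithCounter('productId1', 'userId1');
console.log(JSON.stringify(counterArgs), JSON.stringify(listProjection));
// Assumed usage with a driver collection handle named `products`:
//   await products.updateOne(counterArgs.filter, counterArgs.update);
//   const list = await products.find({}, { projection: listProjection }).toArray();
```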
Even more unlikely: with millions of likes on a single product, you'll see a significant slowdown from looping through the likes array to check whether the userId is already in it. You can, of course, keep the likes in a sorted array, but database communication would still be slow (slower than looping through an array in memory anyway). It's better to let the database index do the binary search: instead of storing the likes as an array embedded in the product (or user), you can store them in a separate collection:
{
_id: $oid1,
productId: $oid2,
userId: $oid3
}
Assuming the product also has a key holding the number of likes, this should be the fastest way of accessing likes if all 3 keys are indexed.
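As a hedged variation on indexing each key separately, a unique compound index on the (productId, userId) pair would let the database itself reject a second like from the same user. The sketch below only describes the index specifications and a small error check; the `likes` collection handle and the helper names are assumptions. Error code 11000 is MongoDB's duplicate key error.

```javascript
// Index specifications for the separate likes collection.
// The compound index also serves lookups by productId alone (its prefix);
// the second index serves "all products liked by this user".
const likeIndexes = [
  { keys: { productId: 1, userId: 1 }, options: { unique: true } },
  { keys: { userId: 1 }, options: {} },
];

// True only for MongoDB's duplicate key error (code 11000).
function isDuplicateKeyError(err) {
  return Boolean(err) && err.code === 11000;
}

// Treat a duplicate key as "already liked"; rethrow anything else so that
// unrelated failures are not silently swallowed.
function interpretInsertFailure(err) {
  if (isDuplicateKeyError(err)) return 'already-liked';
  throw err;
}

console.log(JSON.stringify(likeIndexes));
// Assumed usage with a driver collection handle named `likes`:
//   for (const ix of likeIndexes) await likes.createIndex(ix.keys, ix.options);
//   try { await likes.insertOne({ productId, userId }); }
//   catch (err) { interpretInsertFailure(err); }
```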
You can also be creative and use the concatenation of $oid2+$oid3 as $oid1, which would automatically enforce uniqueness of the user-product pair. You'd then just try saving the like and ignore the database error. That might lead to subtle bugs, though, so it'd be safer to check that the like exists when a save fails.
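A minimal sketch of the concatenated `_id`, assuming the IDs are handled as 24-character hex strings (the usual text form of an ObjectId) and that the resulting `_id` is stored as a string. The function names are hypothetical. Because `_id` is always uniquely indexed, inserting the same pair twice would fail with a duplicate key error (code 11000) rather than create a second like.

```javascript
const HEX24 = /^[0-9a-f]{24}$/i;

// Concatenate productId and userId into one deterministic _id.
// Both parts have a fixed length, so the concatenation is unambiguous.
function makeLikeId(productIdHex, userIdHex) {
  if (!HEX24.test(productIdHex) || !HEX24.test(userIdHex)) {
    throw new TypeError('expected two 24-character hex strings');
  }
  return (productIdHex + userIdHex).toLowerCase();
}

// Build the like document; productId and userId stay as separate fields
// so they can still be queried and indexed on their own.
function buildLikeDocument(productIdHex, userIdHex) {
  return {
    _id: makeLikeId(productIdHex, userIdHex),
    productId: productIdHex,
    userId: userIdHex,
  };
}

const sampleProduct = 'a'.repeat(24); // placeholder IDs for illustration only
const sampleUser = 'b'.repeat(24);
console.log(buildLikeDocument(sampleProduct, sampleUser)._id.length);
// Assumed usage with a driver collection handle named `likes`:
//   await likes.insertOne(buildLikeDocument(productId, userId));
```

On a failed save, checking that the error really is a duplicate key error, or that the like document exists, follows the answer's advice not to ignore every database error blindly.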