Storing Likes in a Non-Relational Database

Author: Hyukjae333 | Source: Internet | 2023-08-27 23:07

Gist

I implemented a like button in my application. Let's imagine users are able to like other users' products.

Issue

I am now wondering which of the following is the most effective and robust way to store those likes in a non-relational database (in my case MongoDB). It's important that no user can like a product twice.

Possible Solutions

(1) Store the ids of the users who liked the product on the product itself, and keep track of the number of likes via likes.length

// Product in database
{
    likes: [
        'userId1',
        'userId2',
        'userId3',
        ...
    ],
    ...
}

(2) Store all products that a user liked on the user itself, and keep track of the number of likes through a counter on the product

// User in database
{
    likedProducts: [
        'productId1',
        'productId2',
        'productId3',
        ...
    ],
    ...
}
// Product in database
{
    numberOfLikes: 42,
    ...
}

(3) Maybe there is an even better solution for this?

Either way, if the product has many likes or the user has liked many products, a large amount of data has to be loaded just to show the likes and to check whether the user has already liked the product.

2 Answers

#1

Which approach to use, (1) or (2), depends on your use case. Specifically, think about which data you will need to access more often: all products liked by a particular user (2) or all users who liked a particular product (1). It seems likely that (1) is the more frequent case: that way you can easily tell whether the user has already liked the product, and the number of likes for the product is simply the array length.
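
As a rough illustration of approach (1), the sketch below builds the arguments for a like and for an "already liked?" check. The helper names and the `products` collection are assumptions made for this example, not part of the original answer; `$addToSet` is MongoDB's operator for adding a value to an array only if it is not already present, which is one way to keep a user from liking a product twice.

```javascript
// Hypothetical helpers: they only build the arguments for MongoDB calls.

// Arguments for updateOne(): $addToSet appends userId only if it is not
// already in the likes array, so repeating a like changes nothing.
function buildLikeUpdate(productId, userId) {
  return {
    filter: { _id: productId },
    update: { $addToSet: { likes: userId } },
  };
}

// Filter that matches the product only if its likes array contains userId.
function buildHasLikedFilter(productId, userId) {
  return { _id: productId, likes: userId };
}

// Assumed usage with a `products` collection from the official Node.js driver:
//   const { filter, update } = buildLikeUpdate(productId, userId);
//   await products.updateOne(filter, update);
//   const liked =
//     (await products.countDocuments(buildHasLikedFilter(productId, userId))) > 0;
```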

I would argue that any further improvement would likely be premature optimization; it's better to optimize once you have an actual problem in hand.

If showing the number of likes, for example, turns out to be a bottleneck, you can denormalize your data further by storing the array length as a separate key-value pair. That way, displaying the product list wouldn't require fetching the array of userIds from the database.
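
One possible way to keep such a counter in step with the array is to change both in a single update that only matches when the user has not liked the product yet. This is a sketch under assumptions: the helper name is hypothetical, and `numberOfLikes` is borrowed from the question's option (2).

```javascript
// Hypothetical helper: builds the arguments for one atomic updateOne() call.
function buildCountedLikeUpdate(productId, userId) {
  return {
    // $ne on an array field matches only documents whose array lacks userId,
    // so a repeated like matches nothing and the counter is not incremented.
    filter: { _id: productId, likes: { $ne: userId } },
    update: { $push: { likes: userId }, $inc: { numberOfLikes: 1 } },
  };
}

// Projection for product lists: return everything except the likes array.
const PRODUCT_LIST_PROJECTION = { likes: 0 };

// Assumed usage with a `products` collection from the official Node.js driver:
//   const { filter, update } = buildCountedLikeUpdate(productId, userId);
//   const result = await products.updateOne(filter, update);
//   const wasNewLike = result.modifiedCount === 1;
//   const list = await products
//     .find({}, { projection: PRODUCT_LIST_PROJECTION })
//     .toArray();
```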

Even less likely, with millions of likes on a single product, you'll see a significant slowdown from looping through the likes array to check whether the userId is already in it. You could, of course, keep the likes array sorted, but database communication would still be slow (slower than looping through an array in memory, anyway). It's better to let database indexing do the search: instead of storing likes as an array embedded in the product (or user), you can store them in a separate collection:

{
    _id: $oid1,
    productId: $oid2,
    userId: $oid3
}

Assuming the product also has a key holding the number of likes, this should be the fastest way of accessing likes, provided all 3 keys are indexed.
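
A minimal sketch of how the indexes for such a collection might be declared follows. The `likes` collection name, the helper, and the choice of a unique compound index are assumptions for illustration. MongoDB indexes `_id` automatically, and a compound index on `productId` and `userId` can also serve queries on `productId` alone.

```javascript
// Hypothetical index definitions for a separate `likes` collection.
const LIKE_INDEXES = [
  // One like per (product, user) pair; also covers lookups by productId.
  { keys: { productId: 1, userId: 1 }, options: { unique: true } },
  // Lookups of everything a given user has liked.
  { keys: { userId: 1 }, options: {} },
];

// `likes` is assumed to be a collection object from the official Node.js
// driver, whose createIndex(keys, options) method creates one index.
async function ensureLikeIndexes(likes) {
  for (const { keys, options } of LIKE_INDEXES) {
    await likes.createIndex(keys, options);
  }
}
```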

You can also be creative and use the concatenation of $oid2+$oid3 as $oid1, which would automatically enforce uniqueness of the user-product pair. You'd then just try saving the like and ignore the database error. That might lead to subtle bugs, though, so it'd be safer to check whether the like exists when a save fails.
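
The concatenated-id idea could look roughly like the sketch below. The function names and the `:` separator are assumptions for this example. Rather than ignoring every error, it treats only MongoDB's duplicate-key error (code 11000) as "already liked" and rethrows anything else, which is one way to sidestep the subtle bugs mentioned above.

```javascript
// Build a like document whose _id encodes the (product, user) pair.
// The ":" separator is an assumption; it keeps ids of varying length unambiguous.
function buildLikeDocument(productId, userId) {
  return { _id: `${productId}:${userId}`, productId, userId };
}

// MongoDB reports unique-index violations, including on _id, with code 11000.
function isDuplicateKeyError(err) {
  return err != null && err.code === 11000;
}

// Returns true if the like was newly stored and false if it already existed.
// `likes` is assumed to be a collection from the official Node.js driver.
async function saveLike(likes, productId, userId) {
  try {
    await likes.insertOne(buildLikeDocument(productId, userId));
    return true;
  } catch (err) {
    if (isDuplicateKeyError(err)) return false;
    throw err; // any other failure is a real error
  }
}
```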

#2

Why not simply amend the requirements and use a relational database or an RDBMS-like solution? Basically, use the right tool for the right job:

Create another table, Likes, that keeps the pair of productId and userId as a unique key. For example:

userId1 - productId2
userId2 - productId3
userId2 - productId2
userId1 - productId5
userId3 - productId2

Then you can query by userId to get the number of likes per user, or query by productId to get the number of likes per product.

Moreover, the unique key userId_productId will guarantee that a user can like a given product only once.

Additionally, you can keep extra information in other columns, such as a timestamp of when the user liked the product.
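
To make the semantics of such a table concrete without committing to a particular database, here is a small in-memory model. The `LikesTable` class and its method names are purely illustrative; a real system would rely on the database's unique constraint rather than on application code.

```javascript
// In-memory stand-in for a Likes table with a unique (userId, productId) key.
class LikesTable {
  constructor() {
    this.rows = new Map(); // composite key -> row
  }

  // Returns true if the like was stored, false if the pair already exists.
  like(userId, productId, likedAt = new Date()) {
    const key = JSON.stringify([userId, productId]);
    if (this.rows.has(key)) return false;
    this.rows.set(key, { userId, productId, likedAt });
    return true;
  }

  countByProduct(productId) {
    let count = 0;
    for (const row of this.rows.values()) {
      if (row.productId === productId) count++;
    }
    return count;
  }

  countByUser(userId) {
    let count = 0;
    for (const row of this.rows.values()) {
      if (row.userId === userId) count++;
    }
    return count;
  }
}

// The five example rows from above.
const table = new LikesTable();
table.like("userId1", "productId2");
table.like("userId2", "productId3");
table.like("userId2", "productId2");
table.like("userId1", "productId5");
table.like("userId3", "productId2");

console.log(table.countByProduct("productId2")); // 3
console.log(table.countByUser("userId2")); // 2
console.log(table.like("userId1", "productId2")); // false (duplicate pair)
```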
