You might want to look at batch() too. The reason multi() would be slower is that it's transactional: if anything failed, nothing would be executed. That may be what you want, but you do have a choice for speed here.
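Something like this with the node_redis client (just a sketch; the keys and values are placeholders):

var redis = require('redis');
var client = redis.createClient();

// multi() wraps the queued commands in MULTI/EXEC, so they run as one
// transaction - if anything fails, none of them take effect.
client.multi()
    .set('key1', 'value1')
    .set('key2', 'value2')
    .exec(function (err, replies) { /* ... */ });

// batch() queues and pipelines the same way but skips the MULTI/EXEC
// transaction, which is why it tends to be faster for bulk writes.
client.batch()
    .set('key1', 'value1')
    .set('key2', 'value2')
    .exec(function (err, replies) { /* ... */ });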
The redis-stream package doesn't seem to make use of Redis' mass insert functionality, so it's also slower than the mass insert approach that Redis' site goes on to describe with redis-cli.
Another idea would be to use redis-cli and give it a file to stream from, which this NPM package does: https://github.com/almeida/redis-mass
Not keen on writing to a file on disk first? This repo: https://github.com/eugeneiiim/node-redis-pipe/blob/master/example.js
...also streams to Redis, but without writing to file. It streams to a spawned process and flushes the buffer every so often.
On Redis' site under mass insert (http://redis.io/topics/mass-insert) you can see a little Ruby example. The repo above basically ported that to Node.js and then streamed it directly to the spawned redis-cli process.
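A rough Node.js equivalent of that Ruby gen_redis_proto function (a sketch, not the repo's exact code) just encodes each command in the Redis protocol before writing it out:

function genRedisProto(args) {
    // Encode one command in the Redis protocol (RESP), e.g.
    // genRedisProto(['SET', 'key1', 'value1'])
    // => "*3\r\n$3\r\nSET\r\n$4\r\nkey1\r\n$6\r\nvalue1\r\n"
    var proto = '*' + args.length + '\r\n';
    args.forEach(function (arg) {
        arg = String(arg);
        proto += '$' + Buffer.byteLength(arg) + '\r\n' + arg + '\r\n';
    });
    return proto;
}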
So in Node.js, we have:
var spawn = require('child_process').spawn;
var redisPipe = spawn('redis-cli', ['--pipe']);
spawn() returns a reference to a child process that you can pipe to via its stdin. For example: redisPipe.stdin.write().
You can just keep writing to a buffer, streaming it to the child process, and then clearing it every so often. The buffer then won't fill up, so this should be a bit lighter on memory than perhaps the node_redis package (whose docs literally say that data is held in memory), though I haven't looked into it that deeply, so I don't know what the memory footprint ends up being. It could be doing the same thing.
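A sketch of that idea, reusing redisPipe from the snippet above and genRedisProto from earlier (records is just a stand-in for wherever your data actually comes from):

var buffer = '';
var FLUSH_EVERY = 10000; // commands per flush - tune to taste

records.forEach(function (record, i) {
    buffer += genRedisProto(['SET', record.key, record.value]);
    if ((i + 1) % FLUSH_EVERY === 0) {
        redisPipe.stdin.write(buffer); // stream this chunk to redis-cli
        buffer = '';                   // clear it so memory stays bounded
    }
});

redisPipe.stdin.write(buffer); // flush whatever is left over
redisPipe.stdin.end();         // let redis-cli finish and report back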
Of course keep in mind that if something goes wrong, it all fails. That's what tools like fluentd were created for (and that's yet another option: http://www.fluentd.org/plugins/all - it has several Redis plugins)...But again, it means you're backing data on disk somewhere to some degree. I've personally used Embulk to do this too (which required a file on disk), but it did not support mass inserts, so it was slow. It took nearly 2 hours for 30,000 records.
One benefit of a streaming approach (not backed by disk) is when you're doing a huge insert from another data source: if that source returns a lot of data and your server doesn't have the hard disk space to hold all of it, you can stream it instead. Again, you risk failures.
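For instance, if the source exposes a readable object stream (sourceStream below is hypothetical), you could pipe it straight into redis-cli and let backpressure do the flushing - again just a sketch, assuming genRedisProto from above:

var spawn = require('child_process').spawn;
var Transform = require('stream').Transform;

var redisPipe = spawn('redis-cli', ['--pipe']);

// Convert each incoming row to Redis protocol; row.key / row.value are
// assumptions about the source's shape.
var toRedisProto = new Transform({
    objectMode: true,
    transform: function (row, encoding, callback) {
        callback(null, genRedisProto(['SET', row.key, row.value]));
    }
});

sourceStream.pipe(toRedisProto).pipe(redisPipe.stdin);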
I find myself in this position as I'm building a Docker image that will run on a server with not enough disk space to accommodate large data sets. Of course it's a lot easier if you can fit everything on the server's hard disk... But if you can't, streaming to redis-cli may be your only option.
If you are really pushing a lot of data around on a regular basis, I would probably recommend fluentd, to be honest. It comes with many great features for ensuring your data makes it to where it's going, and if something fails, it can resume.
One problem with all of these Node.js approaches is that if something fails, you either lose it all or have to insert it all over again.