Author: 朱劭文_850 | Source: Internet | 2023-05-19 16:32
I have a few million records that I need indexed in Solr. Once they're indexed, they won't be changed, and the collections are used only for reads. I'm following the usual pattern of posting XML docs to the REST API, and it works fine... even though it takes some time (the configs are optimized for reads and caching).
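One common speed-up that still goes over HTTP is to batch documents and stream them with SolrJ instead of posting one XML doc per request. Below is a minimal sketch assuming SolrJ's ConcurrentUpdateSolrClient; the URL, core name, field names, and record loop are placeholders, not anything from the original question:

```java
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

import java.util.ArrayList;
import java.util.List;

public class BulkIndexer {
    public static void main(String[] args) throws Exception {
        // ConcurrentUpdateSolrClient buffers documents and streams them to Solr
        // over background threads, which is much faster than one HTTP POST per
        // XML document. URL and core name are placeholders for this sketch.
        try (ConcurrentUpdateSolrClient client =
                     new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/mycore")
                             .withQueueSize(10_000)
                             .withThreadCount(4)
                             .build()) {
            List<SolrInputDocument> batch = new ArrayList<>();
            for (int i = 0; i < 1_000_000; i++) {   // stand-in for the real record source
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Integer.toString(i));
                doc.addField("title_s", "record " + i);
                batch.add(doc);
                if (batch.size() == 1_000) {        // send in chunks, not one at a time
                    client.add(batch);
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) {
                client.add(batch);
            }
            client.commit();                        // single commit at the very end
        }
    }
}
```

Disabling autoCommit during the load and committing once at the end, as above, avoids paying for intermediate searcher reopens.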
But I was wondering... is there a better/faster approach, maybe one that avoids the HTTP/network layer? Something like building the collection locally, copying it to the Solr server, and then adding/swapping the collection?
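The "add/swap" part maps onto the CoreAdmin SWAP action in standalone (non-SolrCloud) Solr. A minimal sketch via SolrJ follows; the admin URL and the core names "live" and "staging" are placeholders:

```java
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.CoreAdminRequest;
import org.apache.solr.common.params.CoreAdminParams.CoreAdminAction;

public class SwapCores {
    public static void main(String[] args) throws Exception {
        // Point the client at the Solr root (no core in the URL) so the
        // CoreAdmin handler is reachable. Names below are placeholders.
        try (HttpSolrClient admin =
                     new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            CoreAdminRequest swap = new CoreAdminRequest();
            swap.setAction(CoreAdminAction.SWAP);
            swap.setCoreName("live");         // core currently serving queries
            swap.setOtherCoreName("staging"); // freshly built core
            swap.process(admin);
            // Equivalent raw HTTP call:
            // /solr/admin/cores?action=SWAP&core=live&other=staging
        }
    }
}
```

After the swap, queries hit the new index immediately while the old core stays available for rollback.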
One choice could be a custom DIH writing into a second/backup core, swapping when done, but this would mean "eating" the memory Solr uses for caching, which would slow down searches.
I'm searching/hoping for a disconnected solution: something like a command-line tool running on a different machine with a configuration optimized for writing, after which I'd copy the core to production and swap the old one for the new one.
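One way to get exactly this disconnected, no-network build is EmbeddedSolrServer (shipped in the solr-core artifact rather than solrj), which runs Solr in-process and writes the index straight to disk. A minimal sketch, assuming a local Solr home at /data/solr-home that already contains a "staging" core with the same schema as production (both path and core name are placeholders):

```java
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;

import java.nio.file.Paths;

public class OfflineIndexBuilder {
    public static void main(String[] args) throws Exception {
        // Runs Solr in-process: documents never cross the network, and the
        // build machine's config can be tuned purely for write throughput.
        try (EmbeddedSolrServer solr =
                     new EmbeddedSolrServer(Paths.get("/data/solr-home"), "staging")) {
            for (int i = 0; i < 1_000_000; i++) {   // stand-in for the real record source
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Integer.toString(i));
                solr.add(doc);
            }
            solr.commit();
        }
        // The finished index now sits under /data/solr-home/staging/data/;
        // copy that directory to a spare core on production and SWAP as above.
    }
}
```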
Any ideas?
1 Solution