Author: yatho802_201 | Source: Internet | 2023-06-29 18:59
Docker-based elasticdump.
$ elasticdump --version
6.15.5
$ node --version
v12.13.0
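For reference, the versions above can also be read directly from the Docker image; a minimal sketch, assuming the taskrabbit/elasticsearch-dump image ships a shell and has elasticdump on its PATH:
$ docker run --rm --entrypoint sh taskrabbit/elasticsearch-dump -c 'elasticdump --version; node --version'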
The Elasticsearch version is older due to constraints in the environment that uses Elasticsearch.
$ curl -XGET http://node:9200
{
....
"version" : {
"number" : "2.4.6",
"build_hash" : "5376dca9f70f3abef96a77f4bb22720ace8240fd",
"build_timestamp" : "2017-07-18T12:17:44Z",
"build_snapshot" : false,
"lucene_version" : "5.5.4"
},
....
}
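If jq is available on the host, the version number alone can be extracted from that response (a convenience one-liner, not part of the original check):
$ curl -s http://node:9200 | jq -r '.version.number'
2.4.6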
$ docker run --rm -ti -v /path/dump:/dump_files -e NODE_OPTIONS="--max-old-space-size=16384" --name elasticsearch-dump taskrabbit/elasticsearch-dump --quiet --input=http://node:9200/index_n --output=/dump_files/index_n.data.json --type=data --fileSize=1g --limit=10000
No dataset available.
Description:
While reading data from a largish index I see memory usage grow by roughly the size of the file being written. When the "--fileSize" limit is reached a new file is started for subsequent data, but unfortunately memory usage keeps growing; it ends up roughly equal to the combined size of all files written so far. The run eventually fails either with an out-of-heap error or with "terminate called after throwing an instance of 'std::bad_alloc'".
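For what it's worth, the growth is easy to watch from the host; a minimal sampling loop, assuming the container is named elasticsearch-dump as in the command above:
# Sample the container's memory every ten seconds while the dump is running.
while docker ps --format '{{.Names}}' | grep -q '^elasticsearch-dump$'; do
  docker stats --no-stream --format '{{.Name}} {{.MemUsage}}' elasticsearch-dump
  sleep 10
done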
To reproduce: Dump an index whose size on disk is larger than "--max-old-space-size"; in my case the indices range from about 6GB to 70GB when dumped to disk.
Current behaviour: Memory is not released on "--fileSize"-splits.
Expected behaviour: Memory should be released when a new file is triggered by "--fileSize".
Additional context: I see the "bad_alloc" being discussed elsewhere as a bug/limit in V8 regarding array sizes, for example at https://github.com/nodejs/node/issues/27715.
I imagine that array-limits should be a non-issue if memory gets released on fileSize-splits.
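Until memory is released on "--fileSize" splits, one possible workaround is to slice the dump into several smaller runs so that no single process has to hold more than the heap allows. A rough sketch, assuming this elasticdump version supports --searchBody and that the index has a date field to slice on; the field name "@timestamp" and the date bounds are placeholders:
# Dump the index in date slices; adjust the ranges and the field to your data.
for range in "2017-01-01 2017-04-01" "2017-04-01 2017-07-01"; do
  set -- $range  # split the pair into $1 (gte) and $2 (lt)
  docker run --rm -v /path/dump:/dump_files \
    -e NODE_OPTIONS="--max-old-space-size=16384" \
    taskrabbit/elasticsearch-dump \
    --quiet --type=data --limit=10000 \
    --input=http://node:9200/index_n \
    --output=/dump_files/index_n.$1.data.json \
    --searchBody="{\"query\":{\"range\":{\"@timestamp\":{\"gte\":\"$1\",\"lt\":\"$2\"}}}}"
done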
This question comes from the open-source project: elasticsearch-dump/elasticsearch-dump
The new version plays nicely with larger indices. Memory usage for 'node' stays between approximately 110 and 170 MB for my test run, with samples taken every ten seconds. Memory is released continuously, not only when reaching the "--fileSize" limit. No errors about "bad_alloc" either.
I have not looked very closely at data correctness, but in a limited number of checks the data looks complete and uncorrupted.
I'll open a new issue if I see something else.
Thanks for the fix and for a very useful tool.
Cheers.