# Question
This is the code I am using to test the memory allocation:
import pycurl
import io
url = "http://www.stackoverflow.com"
buf = io.BytesIO()
print(len(buf.getvalue())) # here I am getting 0 as the length
c = pycurl.Curl()
c.setopt(c.URL, url)
c.setopt(c.CONNECTTIMEOUT, 10)
c.setopt(c.TIMEOUT, 10)
c.setopt(c.ENCODING, 'gzip')
c.setopt(c.FOLLOWLOCATION, True)
c.setopt(c.IPRESOLVE, c.IPRESOLVE_V4)
c.setopt(c.USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0')
c.setopt(c.WRITEFUNCTION, buf.write)
c.perform()
c.close()
print(len(buf.getvalue())) # here: length of the downloaded file
print(buf.getvalue())
buf.close()
How do I get the buffer/memory length allocated by BytesIO?
What am I doing wrong here? Doesn't Python allocate a fixed buffer length?
# Answer 1
I am not sure what you mean by allocated buffer/memory length, but if you want the length of the user data stored in the BytesIO object, you can do:
>>> import io
>>> bio = io.BytesIO()
>>> bio.getbuffer().nbytes
0
>>> bio.write(b'here is some data')
17
>>> bio.getbuffer().nbytes
17
But this seems equivalent to the len(buf.getvalue()) that you are currently using.
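There is one practical difference worth noting, though: getvalue() builds a brand-new bytes copy of the whole contents, while getbuffer() returns a memoryview over the internal buffer without copying. A quick sketch:

```python
import io

bio = io.BytesIO(b'x' * 1000)

# getvalue() returns a new bytes object copying the entire contents
print(len(bio.getvalue()))   # 1000

# getbuffer() returns a memoryview over the same memory -- no copy
view = bio.getbuffer()
print(view.nbytes)           # 1000
view.release()               # allow the BytesIO to be resized again
```

Note that while a memoryview from getbuffer() is alive, the BytesIO cannot be resized (further writes raise BufferError); call release() or let the view go out of scope first.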
The actual size of the BytesIO object can be found using sys.getsizeof():
>>> import sys
>>> bio = io.BytesIO()
>>> sys.getsizeof(bio)
104
Or you could be nasty and call __sizeof__() directly (which is like sys.getsizeof() but without the garbage collector overhead applicable to the object):
>>> bio = io.BytesIO()
>>> bio.__sizeof__()
72
Memory for BytesIO is allocated as required, and some over-allocation does take place:
>>> bio = io.BytesIO()
>>> for i in range(20):
...     _ = bio.write(b'a')
...     print(bio.getbuffer().nbytes, sys.getsizeof(bio), bio.__sizeof__())
...
1 106 74
2 106 74
3 108 76
4 108 76
5 110 78
6 110 78
7 112 80
8 112 80
9 120 88
10 120 88
11 120 88
12 120 88
13 120 88
14 120 88
15 120 88
16 120 88
17 129 97
18 129 97
19 129 97
20 129 97
# Answer 2
io.BytesIO() returns a file-like object that has a tell() method. It reports the current position in the stream and does not copy the whole buffer out to compute the total size, the way len(bio.getvalue()) does. It is a very fast and simple way to get the exact size of the data written to the buffer, as long as the position is at the end of the stream.
I posted example code and a more detailed answer here.
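A minimal sketch of the tell() approach (assuming all writes are sequential and nothing has moved the position):

```python
import io

buf = io.BytesIO()
buf.write(b'0123456789')

# tell() returns the current stream position; immediately after
# sequential writes this equals the number of bytes stored
print(buf.tell())            # 10

# but it tracks position, not length: seeking moves it
buf.seek(0)
print(buf.tell())            # 0
print(len(buf.getvalue()))   # still 10
```

So tell() only reflects the buffer size if you have not seeked elsewhere in the stream.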
# Answer 3
You can also use tracemalloc to get indirect information about the size of objects, by reading tracemalloc.get_traced_memory() before and after an allocation.
Do note that active threads (if any) and other side effects of your program will affect the output, but taking many samples can make the result more representative of the real memory cost, as shown below.
>>> import tracemalloc
>>> from io import BytesIO
>>> tracemalloc.start()
>>>
>>> memory_traces = []
>>>
>>> with BytesIO() as bytes_fh:
...     # returns (current memory usage, peak memory usage),
...     # ..but only since calling .start()
...     memory_traces.append(tracemalloc.get_traced_memory())
...     bytes_fh.write(b'a' * (1024**2))  # create 1MB of 'a'
...     memory_traces.append(tracemalloc.get_traced_memory())
...
1048576
>>> print("used_memory = {}b".format(memory_traces[1][0] - memory_traces[0][0]))
used_memory = 1048870b
>>> 1048870 - 1024**2 # show small overhead
294