It really is slower. If you use a ramdisk, you have fewer problems with caching and buffers during testing.
But I got similar results.
Here is my test code:
import io
import os
import shlex
import timeit
from pathlib import Path
from shutil import rmtree
from subprocess import call
from contextlib import contextmanager


RAMDISK = Path("ramdisk")
TESTFILE = RAMDISK / "testfile.bin"


def mount_ramfs():
    RAMDISK.mkdir(exist_ok=True)
    cmd = shlex.split(f"sudo mount -t ramfs ramfs {RAMDISK}")
    call(cmd)


def umount_ramfs():
    cmd = shlex.split(f"sudo umount {RAMDISK}")
    call(cmd)
    RAMDISK.rmdir()


def make_test_file():
    with TESTFILE.open("wb") as fd:
        fd.write(os.urandom(4 * 1024 ** 2))


@contextmanager
def with_testfile():
    mount_ramfs()
    make_test_file()
    try:
        yield TESTFILE
    except Exception as e:
        print(repr(e))
    umount_ramfs()


## testcode ##

class FileIO_new(io.FileIO):
    pass


class BufferedReader_new(io.BufferedReader):
    pass


def built_in_io(testfile):
    raw = io.FileIO(testfile, 'r')
    buffer = io.BufferedReader(raw)
    lines = buffer.readlines()
    buffer.close()


def new_io(testfile):
    raw = FileIO_new(testfile, 'r')
    buffer = BufferedReader_new(raw)
    lines = buffer.readlines()
    buffer.close()


with with_testfile() as tf:
    print(tf, tf.stat().st_size / 1024 ** 2, "MiB")
    result_built_in = timeit.timeit("built_in_io(tf)", globals=globals(), number=1000)
    result_new_io = timeit.timeit("new_io(tf)", globals=globals(), number=1000)
    print(f"BuiltIn: {result_built_in:.3f}")
    print(f"NewIO: {result_new_io:.3f}")
Output:
ramdisk/testfile.bin 4.0 MiB
BuiltIn: 3.578
NewIO: 5.005
The built-in _io module is implemented in C. If you inherit from it, then Python comes into play.
I guess this is why it's much slower.
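One way to check that guess is a small sketch like the following. It overrides readline() in a subclass and counts how often it gets called while readlines() runs. CountingReader and count_readline_calls are hypothetical names made up for this illustration, and the in-memory io.BytesIO stands in for a real file. If the counter goes up, the subclass is paying for a Python-level method call per line, at least on CPython.

```python
import io


class CountingReader(io.BufferedReader):
    """Hypothetical subclass that counts Python-level readline() calls."""

    def __init__(self, raw):
        super().__init__(raw)
        self.calls = 0

    def readline(self, size=-1):
        # Every call that lands here went through a Python method lookup.
        self.calls += 1
        return super().readline(size)


def count_readline_calls(data):
    """Read all lines from in-memory data and report the readline() call count."""
    reader = CountingReader(io.BytesIO(data))
    lines = reader.readlines()
    reader.close()
    return lines, reader.calls


if __name__ == "__main__":
    lines, calls = count_readline_calls(b"a\nb\nc\n")
    print(len(lines), "lines,", calls, "readline() calls")
```

This only shows that the override is reached during readlines(); it does not prove that this is the whole cost difference measured above.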
The time spent creating new instances doesn't seem to be the problem.
I changed the test code a bit so that I create the instances only once and use seek(0) instead.
Same result, no improvement.
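A minimal sketch of that variant could look like this. It is not the exact code used for the numbers above: time_reuse and make_temp_file are hypothetical helpers, and a temporary file replaces the ramdisk so it runs without sudo. The instances are created once and only seek(0) plus readlines() are timed.

```python
import io
import os
import tempfile
import timeit


class FileIO_new(io.FileIO):
    pass


class BufferedReader_new(io.BufferedReader):
    pass


def make_temp_file(size):
    """Write `size` random bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size))
    return path


def time_reuse(raw_cls, buffer_cls, path, number=20):
    """Create the reader once, then time only seek(0) + readlines()."""
    buffer = buffer_cls(raw_cls(path, "r"))

    def read_all():
        buffer.seek(0)
        return buffer.readlines()

    try:
        return timeit.timeit(read_all, number=number)
    finally:
        buffer.close()  # also closes the underlying raw file


if __name__ == "__main__":
    path = make_temp_file(1024 ** 2)
    try:
        built_in = time_reuse(io.FileIO, io.BufferedReader, path)
        subclassed = time_reuse(FileIO_new, BufferedReader_new, path)
        print(f"BuiltIn: {built_in:.3f}")
        print(f"NewIO: {subclassed:.3f}")
    finally:
        os.remove(path)
```

Because instance creation is outside the timed function, any remaining gap between the two numbers comes from the read path itself.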
I guess that's why the project Borgbackup implemented the functions for file access/chunking etc. with Cython and C.
Borgbackup is a tool for backups with deduplication written in Python: https://github.com/borgbackup/borg
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!