Reading Large Files and Perf

One thing that has often confused me is how little good advice there is on reading large files efficiently in code.

Most people use whatever the canonical file-reading approach for their language is, until they need to read large files and it turns out to be too slow. Then they google “efficiently reading large files in ” and are pointed to a buffered reader of some sort, and that’s that.
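For reference, the "buffered reader of some sort" advice usually boils down to something like the following. This is only a minimal Python sketch: the function name `count_newlines` and the 1 MiB chunk size are my own illustrative choices, not a recommendation from any particular source.

```python
def count_newlines(path, chunk_size=1024 * 1024):
    """Count newline bytes by streaming the file in fixed-size chunks.

    Only one chunk is held in memory at a time, so this works for files
    far larger than RAM. chunk_size (1 MiB) is an arbitrary assumption.
    """
    total = 0
    with open(path, "rb") as f:  # binary mode: no decoding overhead
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += chunk.count(b"\n")
    return total
```

Note that this is a single sequential pass: one reader, one position in the file, moving front to back.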

However, in Halvar’s recent QCon talk he had several slides on how most code is written around the old assumptions of spinning disks. With non-SSD hard drives there’s usually a single read head, so you can’t do much in parallel. This pushes code to optimise for single reads, minimal seeks, and large read-ahead of data laid out contiguously on disk. But modern SSDs are much more comfortable with seeks and parallelism.
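To make the contrast concrete, here is a rough sketch of what "seeks and parallelism" can look like in code: several threads each issue positioned reads against different offsets of the same file. It is my own illustration, not something taken from the talk. It assumes a Unix-like system (`os.pread` is not available on Windows), and the names `read_range` and `count_newlines_parallel`, the 8 MiB chunk size, and the worker count are all hypothetical choices. Whether it actually beats a sequential read depends on the device, the filesystem, and whether the file is already in the page cache.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_range(fd, offset, length):
    """Read up to `length` bytes starting at `offset` (fewer only at EOF).

    os.pread takes an explicit offset and does not move a shared file
    position, so many threads can use the same descriptor safely.
    """
    parts = []
    while length > 0:
        data = os.pread(fd, length, offset)
        if not data:  # reached end of file
            break
        parts.append(data)
        offset += len(data)
        length -= len(data)
    return b"".join(parts)


def count_newlines_parallel(path, chunk_size=8 * 1024 * 1024, workers=8):
    """Count newline bytes by reading chunks of the file concurrently."""
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        offsets = range(0, size, chunk_size)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Each task reads its own byte range and counts independently.
            counts = pool.map(
                lambda off: read_range(fd, off, chunk_size).count(b"\n"),
                offsets,
            )
            return sum(counts)
    finally:
        os.close(fd)
```

Counting newlines is convenient here because each chunk can be processed on its own; work that spans chunk boundaries, such as parsing whole lines, would need extra care at the edges.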

Tuesday, September 19. 2023
Posted by Dominic White in
