
Remember that your hardware is non-deterministic: CPU cache behavior, CPU pipelining, superscalar processors with out-of-order execution, external interrupts (timers, network, USB, disk, ...), and perhaps the CPU frequency (throttled when the chip is too hot) all change without software control. Hence the kernel scheduler behaves differently from one run to the next (because of preemptive scheduling, ...). Read also Operating Systems: Three Easy Pieces for more about OSes. Some software layers (e.g. ASLR) can add more non-determinism.

In your case, I believe you want to consider the average time over several runs. In practice, it is very likely that some of the data is already "here" (e.g. in the page cache) when you actually use your program.
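
To make the averaging concrete, here is a minimal sketch in C, assuming a POSIX system with `clock_gettime`. The names `workload`, `seconds_between` and `mean_runtime` are hypothetical and only serve the illustration; `workload` stands in for whatever your program really does. It also reports the fastest and slowest run, since the spread between them shows how much the timing varies.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the real processing being timed. */
void workload(void)
{
    volatile unsigned long sink = 0;   /* volatile so the loop is not optimized away */
    for (unsigned long i = 0; i < 1000000UL; i++)
        sink += i;
}

/* Elapsed time from a to b, in seconds. */
double seconds_between(struct timespec a, struct timespec b)
{
    return (double)(b.tv_sec - a.tv_sec) + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

/* Calls fn `runs` times and returns the mean wall-clock time in seconds.
 * The fastest and slowest runs are stored in *min_out and *max_out
 * when those pointers are not NULL. */
double mean_runtime(void (*fn)(void), int runs, double *min_out, double *max_out)
{
    double total = 0.0, min = 0.0, max = 0.0;

    assert(fn != NULL && runs > 0);
    for (int i = 0; i < runs; i++) {
        struct timespec start, stop;
        double t;

        clock_gettime(CLOCK_MONOTONIC, &start);
        fn();
        clock_gettime(CLOCK_MONOTONIC, &stop);

        t = seconds_between(start, stop);
        if (i == 0 || t < min) min = t;
        if (i == 0 || t > max) max = t;
        total += t;
    }
    if (min_out != NULL) *min_out = min;
    if (max_out != NULL) *max_out = max;
    return total / runs;
}
```

Calling `mean_runtime(workload, 10, &fastest, &slowest)` would time ten runs. The first run is often the slowest because caches are still cold, which is one reason to look at the average rather than a single measurement.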

Finally, your problem (splitting huge files of hundreds of gigabytes each) is probably disk-I/O-bound, not CPU-bound, so the actual way of coding should not matter that much, at least if your buffers have suitable sizes (at least 128 kilobytes, and more likely a few megabytes; see setvbuf(3)...). If the files are not huge and could entirely fit in the page cache (e.g. if most files are a few gigabytes), things could be different.
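
As an illustration of the buffer-size advice, here is a minimal sketch in C that gives two stdio streams large, fully buffered buffers with `setvbuf` before copying a file. The function name `copy_with_big_buffers` and the 4 MiB value of `IO_BUFFER_SIZE` are assumptions made for this example, not measured recommendations; benchmark on your own disks to choose a size.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define IO_BUFFER_SIZE (4u * 1024u * 1024u)  /* 4 MiB: an assumed value, tune by measuring */

/* Copies src_path to dst_path through stdio streams that use large buffers.
 * Returns the number of bytes copied, or -1 on any error. */
long long copy_with_big_buffers(const char *src_path, const char *dst_path)
{
    long long result = -1, copied = 0;
    FILE *in = NULL, *out = NULL;
    char chunk[64 * 1024];
    char *inbuf, *outbuf;
    size_t n;

    assert(src_path != NULL && dst_path != NULL);

    /* Buffers are supplied explicitly: some C libraries ignore the size
     * argument of setvbuf when the buffer pointer is NULL. */
    inbuf = malloc(IO_BUFFER_SIZE);
    outbuf = malloc(IO_BUFFER_SIZE);
    if (inbuf == NULL || outbuf == NULL)
        goto done;

    in = fopen(src_path, "rb");
    if (in == NULL)
        goto done;
    out = fopen(dst_path, "wb");
    if (out == NULL)
        goto done;

    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(in, inbuf, _IOFBF, IO_BUFFER_SIZE) != 0 ||
        setvbuf(out, outbuf, _IOFBF, IO_BUFFER_SIZE) != 0)
        goto done;

    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        if (fwrite(chunk, 1, n, out) != n)
            goto done;
        copied += (long long)n;
    }
    if (!ferror(in))
        result = copied;

done:
    /* Close the streams before freeing the buffers they still use. */
    if (out != NULL && fclose(out) != 0)
        result = -1;
    if (in != NULL)
        fclose(in);
    free(outbuf);
    free(inbuf);
    return result;
}
```

With buffers this large, most calls to `fread` and `fwrite` are served from memory, and the underlying system calls happen only once per few megabytes. For an I/O-bound job like yours, that is about as much as the code itself can contribute.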
