It is well known that accessing memory non-sequentially is slower because it causes many cache misses. This micro-project was made to evaluate the difference in performance.
An array of 100 000 000 elements (size_t) is created. It represents a typical data structure that you will access in your program. A random list of 1 000 000 offsets into that array is generated.
I compare the time needed to fetch these million offsets in random order with the time needed to fetch the same offsets in ascending (or descending) order.
These results have been obtained on an Intel i5-4570:
```
Create Data: 187ms.
Create offsetAccess: 56ms.
Find Max: 2ms.
data.size=100000000, offsetAccess.size=1000000, max(offsetAccess)=99999927
Pure random access: 16 ms.
... (done 20 times)
Pure random access: 17 ms.
Sort: 86 ms.
Pseudo sequential access: 9 ms.
... (done 20 times)
Pseudo sequential access: 9 ms.
Sort (reversed): 15 ms.
Pseudo sequential access (reversed): 9 ms.
... (done 20 times)
Pseudo sequential access (reversed): 9 ms.
Total time random: 317ms (mean: 15ms)
Total time sequential: 180ms (mean: 9ms)
Total time sequential reversed: 186ms (mean: 9ms)
Performance (sequential/random): 0.567823
```
This shows a roughly twofold improvement when the data is fetched in ascending (or descending) order, since sequential access makes much better use of cache lines and the hardware prefetcher.
On a dual-core ARM Cortex-A15, the improvement is threefold.
The source code is available on GitHub.
- Use a proper benchmarking library like Google Benchmark.