These 2 commands seem to have the same time complexity right? Iterating through a row or column will take O(N), however, one of them is faster than the other. The reason is due to the way data is stored and accessed in memory.
In most modern computer systems, memory is organized in row-major order, meaning that elements of the same row are stored in contiguous memory locations. When you iterate row-wise, the data elements are accessed sequentially, which aligns with the memory layout. This sequential access pattern enables more efficient use of the cache, a small and fast memory component in the CPU. The cache stores recently accessed memory data, so when you access adjacent elements (as in row-wise iteration), the chances are higher that the required data is already in the cache or can be fetched in large contiguous blocks, reducing the time for memory access.
On the other hand, column-wise iteration involves jumping across memory locations, as elements in the same column are not stored next to each other in memory. This non-sequential access pattern leads to more cache misses, where the CPU has to wait longer to fetch data from the main memory. Cache misses significantly slow down the computation, making column-wise iteration generally slower compared to row-wise iteration. The impact of this difference is particularly noticeable in large datasets where efficient memory access and cache utilization become crucial for performance.