-
-
Notifications
You must be signed in to change notification settings - Fork 236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why are matrices 2-d arrays instead of 1-d? #81
Comments
Hi @Elronnd Thank you for your feedbacks/contributions Representing matrices as [m][n] makes things easier to manage; you can access column vectors as matrix[0], matrix[1], matrix[2] and matrix[3].
this is not true. Why do you think it would give better performance?
matrices are two dimensional arrays but still continuous arrays (check type definitions: https://github.com/recp/cglm/blob/master/include/cglm/types.h). Two dimensional arrays do not take more memory, why do you think that? Also why do you think flat arrays don't require "many dereferences to access"? What is the difference to compiler to use flat array or two-dimensional array? Probably compiler will generate similar (probably same) assembly or cpu instructions... Am I wrong? To multiply two matrices cglm do not access matrix elements randomly, it loads columns to SSE/AVX registers then shuffles them if required. I see no benefit to use flat arrays for now, but if you convince us then things may change 🤔 if you can make matrix flat array then we can compare results, for instance for |
I thought -- for instance, |
Thank you very much for explanation, typedef float vec3[3];
typedef vec3 mat3[3];
typedef float mat3_flat[9];
float
get1(mat3 mat, int m, int n) {
return mat[m][n];
}
float
get2(mat3_flat mat, int m, int n) {
return mat[m * 3 + n];
}
float
get3(mat3 mat, int m, int n) {
return ((float *)mat)[m * 3 + n];
}
float
get1_literal(mat3 mat, int m, int n) {
return mat[1][2];
}
float
get2_literal(mat3_flat mat) {
return mat[1 * 3 + 2];
} Output (https://godbolt.org/z/YXn4Ms): get1:
movsx rsi, esi
movsx rdx, edx
lea rax, [rsi+rsi*2]
lea rax, [rdi+rax*4]
movss xmm0, DWORD PTR [rax+rdx*4]
ret
get2:
lea eax, [rsi+rsi*2]
add eax, edx
cdqe
movss xmm0, DWORD PTR [rdi+rax*4]
ret
get3:
lea eax, [rsi+rsi*2]
add eax, edx
cdqe
movss xmm0, DWORD PTR [rdi+rax*4]
ret
get1_literal:
movss xmm0, DWORD PTR [rdi+20]
ret
get2_literal:
movss xmm0, DWORD PTR [rdi+20]
ret I got same result if m and n are known at compile time. But got different result if they unkown at compile time (which makes your right) On the other hand I got same result for get2 and get3 So maybe instead of changing all design, we could provide a macro (or inline function) like, to access matrix elements at runtime: #define GLM_MAT3_GET(mat, m, n) (((float *)mat)[m * 3 + n]) What are your suggestions? |
That sounds good. Although, now that I think about it further, I'm not sure if it's such a good idea. Disassembly is good but I should probably do an actual benchmark. Also, I'm pretty sure that that relies on undefined behaviour--I don't think that multidimensional arrays are guaranteed to be laid out in memory like that. EDIT: that definitely wouldn't work. Since mat is a |
That would be really perfect!
Okay you are right, let's ignore EDIT:
Added to TODOs, I should do some research for this. |
You're both very confused about how arrays in C work. A two-dimensional array is not the same as an array of pointers! The reason why you seemingly get a pointer from dereferencing a 2D array once is because you actually get an array, but in C arrays decays into pointers to their first element in most context. Here's an attempt at an explanation: https://godbolt.org/z/W_R-b9
They are guaranteed to be contiguous. Any array of objects is contiguous in memory. A 2D array is just an array of arrays, and is therefore also contiguous. There is no special exception here, and this is not undefined behavior.
This is wrong. Dereferencing a |
@Akaricchi thanks for your comments, I'll check godbolt results later.
One thing here; in cglm typedef CGLM_ALIGN_IF(16) float vec4[4];
typedef CGLM_ALIGN_MAT vec4 mat4[4]; since If 2d array is not contiguous at all, then we should update the statements in the docs (https://cglm.readthedocs.io/en/latest/opengl.html), because it suggests to pass cglm's types to OpenGL directly. What do you think about the design of cglm? (1-d array vs 2d?) 2d array makes things easier to manage I think (for me) PS: Users can disable alignment of course, by defining |
Alignment can only introduce padding when the size isn't an integer multiple of the alignment.
Clang does compile it, but does not actually add any padding to the inner
They are contiguous. I use cglm with OpenGL myself, and it's working fine.
I think 2D arrays are perfectly fine. If you need alignment for vec3/mat3 types for whatever reason, you need to ask yourself what exactly do you want to align there - the whole matrix or each individual vector in it? If it's the later, that's not OpenGL-compatible. If it's the former, that is what clang seems to do, but it's not reliable - probably better to resort to a 1D array in this case. |
Awesome 🎉
Currently we don't. simd/sse2/mat3.h is already used SIMD with IIRC, I was used alignment for It seems using 2d array and alignment configuration seem good for now. Nothing to change. PS: #124 introduces new way to multiply 3 or 4 general and special matrices which probably will make things faster and more readable. Any feedback is also welcome for that too (in there). |
Hi, this is not an issue so much as just a question I have. Why are matrices represented as an array[m][n] instead of a flat array[m*n]? It seems like doing it the second way would be better for performance since it takes up less memory and doesn't require as many dereferences to access, and you can still trivially get individual vectors out of flat-array-represented matrices.
The text was updated successfully, but these errors were encountered: