Suppose a caller needs to pass one of its local variables of a primitive type to a callee, and it doesn't need to observe any changes to that variable after the callee returns. I know the usual advice is to pass it by value, and that makes total sense for primitive types whose size is equal to or smaller than the size of a pointer.

But I'm curious, from an assembly point of view (performance-wise or space-wise): could there be a situation where passing a primitive type that is larger than a pointer by reference would be more efficient? For example, on my platform long double has a size of 8 bytes, while pointers have a size of 4 bytes.

I'm imagining a situation where the caller can load the pointer directly into a register, but not the primitive itself. The callee would then not need to load the pointer from its stack frame into a register, and 8 fewer bytes of stack memory would be used compared to passing by value.
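To make the comparison concrete, here is a minimal sketch of the two signatures I have in mind. The function names are hypothetical and only serve as an illustration. Compiling something like this with, for example, `g++ -O2 -S` and reading the generated assembly should show how each argument is actually passed on a given platform.

```cpp
#include <cassert>

// Hypothetical callee that receives the long double itself.
long double scale_by_value(long double x) {
    return x * 2.0L;
}

// Hypothetical callee that receives a reference, which is typically
// passed as a pointer at the machine level, so the callee has to
// dereference it to read the value.
long double scale_by_ref(const long double& x) {
    return x * 2.0L;
}

// Both conventions must produce the same result; only the way the
// argument travels from caller to callee differs.
bool same_result(long double v) {
    assert(scale_by_value(v) == scale_by_ref(v));
    return scale_by_value(v) == scale_by_ref(v);
}
```

Note that if the compiler inlines these calls, any difference between the two versions may disappear entirely, so the comparison is only meaningful when the callee is actually called out of line.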
If passing by reference might ever be more efficient in this specific case, how can we know whether to pass by reference or by value?