I've always assumed that, because I grew up in a family of engineers, spatial awareness and, particularly, seeing 2-D information in 3-D was just something I learnt, like learning to read. It came as a bit of a shock to find that most people don't look at a contour map or an engineering drawing and see a 3-D structure.
What's really ironic, though, is that due to very unequal vision in my eyes, I don't perceive depth or real 3-D well at all. Those photographic stereo pairs beloved of geography and geology classes never turn into a 3-D picture for me - I'm much better off with a map I can imagine into 3-D.
I think the flatpack furniture problem is something different though - too often what's missing is any organisation of the process. The OP talked about laying out all the bits, identifying what's what, etc. It's amazing how often people don't start from this basis but just pick up the first item out of the box and go from there...
