VASP.4 was optimized for the T3D (and T3E). To make use of the efficient T3D (T3E) shmem communication scheme, specify T3D_SMA in the makefile. This can speed up communication by up to a factor of 2. Note, however, that this can cause problems on the T3E if VASP is used with data streams enabled:
export SCACHE_D_STREAMS=1

The default makefile on the T3E therefore does not use the optimized communication routines, because the performance improvement due to data streams is usually more important than optimized communication. With the default makefile it is thus safe to switch on data streaming on the T3E, e.g. by typing export SCACHE_D_STREAMS=1.