After looking at the Cholesky factorization problem, time to have a look at the eigenvalue decomposition of a symmetric matrix. Depending on the function and the blas library used, lots of differences appear.
The functions which are benchmarked here are:
- SSYEV: symmetric matrix
- SSYEVD: symmetric matrix, divide and conquer
- SSPEV: symmetric matrix, packed storage
- SSPEVD: symmetric matrix, packed storage, divide and conquer
Library tested:
The size of the matrix used is 4000×4000. The cpu is an Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz. The maximum threads used is 8 which is equal to the number of cores.
lapack as provided by archlinux for reference
[pparois@orion SSPEVD]$ ./test
ssyev (symmetric): 143393 ms
ssyevd (symmetric, divide): 81614 ms
sspev (symmetric, packed): 141414 ms
sspevd (symmetric, packed, divide): 82028 ms
Openblas 0.2.8 (target=sandybridge)
[pparois@orion SSPEVD]$ ./test
ssyev (symmetric): 77752 ms
ssyevd (symmetric, divide): 2630 ms
sspev (symmetric, packed): 83840 ms
sspevd (symmetric, packed, divide): 13901 ms
intel mkl (xe composer 2013)
[pparois@orion SSPEVD]$ ./test
ssyev (symmetric): 6769 ms
ssyevd (symmetric, divide): 1602 ms
sspev (symmetric, packed): 9560 ms
sspevd (symmetric, packed, divide): 4386 ms
acml 5.3.1
[pparois@orion SSPEVD]$ ./test
ssyev (symmetric): 14010 ms
ssyevd (symmetric, divide): 4290 ms
sspev (symmetric, packed): 29772 ms
sspevd (symmetric, packed, divide): 31016 ms
Conclusion
Same conclusion as for the Cholesky factorization, the packed versions are less efficient and very often not optimised. The MKL library is also significantly faster than any other implementation tested here. Openblas is also a good alternative when using ssyevd.
Source code
program truc implicit none real, dimension(:,:), allocatable :: nmatrix, temp, eigenvectors real, dimension(:), allocatable :: packed, eigenvalues, work integer, parameter :: n=4000 integer info, lwork integer :: starttime integer, dimension(8) :: measuredtime integer, dimension(:), allocatable :: iwork integer liwork lwork= 1 + 6*n + 2*n**2 liwork = 3+5*n allocate(nmatrix(n,n)) allocate(temp(n,n)) allocate(eigenvectors(n,n)) allocate(packed(n*(n+1)/2)) allocate(eigenvalues(n)) allocate(work(lwork)) allocate(iwork(liwork)) call random_number(temp) call sgemm ('N', 'T', n, n, n, 1.0, temp, n, temp, n, 0.0, nmatrix, n) ! real symmetric matrice temp=nmatrix call date_and_time(VALUES=measuredtime) starttime=((measuredtime(5)*3600+measuredtime(6)*60)+measuredtime(7))*1000+measuredtime(8) call ssyev('V', 'L', n, temp, n, eigenvalues, work, lwork, info) call date_and_time(VALUES=measuredtime) print *, 'ssyev (symmetric): ', ((measuredtime(5)*3600+measuredtime(6)*60)+measuredtime(7))*1000& & +measuredtime(8)-starttime, 'ms' print *, '' ! real symmetric matrice, divide and conquer temp=nmatrix call date_and_time(VALUES=measuredtime) starttime=((measuredtime(5)*3600+measuredtime(6)*60)+measuredtime(7))*1000+measuredtime(8) call ssyevd ('V', 'L', n, temp, n, eigenvalues, work, lwork, iwork, liwork, info) call date_and_time(VALUES=measuredtime) print *, 'ssyevd (symmetric, divide): ', ((measuredtime(5)*3600+measuredtime(6)*60)+measuredtime(7))*1000+& & measuredtime(8)-starttime, 'ms' print *, '' call strttp ('L', n, nmatrix, n, packed, INFO) ! packaed storage call date_and_time(VALUES=measuredtime) starttime=((measuredtime(5)*3600+measuredtime(6)*60)+measuredtime(7))*1000+measuredtime(8) call sspev ('V', 'L', n, packed, eigenvalues, eigenvectors, n, work, info) call date_and_time(VALUES=measuredtime) print *, 'sspev (symmetric, packed): ', ((measuredtime(5)*3600+measuredtime(6)*60)+measuredtime(7))*1000+& & measuredtime(8)-starttime, 'ms' print *, '' !packed storage, divide and conquer call strttp ('L', n, nmatrix, n, packed, INFO) call date_and_time(VALUES=measuredtime) starttime=((measuredtime(5)*3600+measuredtime(6)*60)+measuredtime(7))*1000+measuredtime(8) call sspevd ('V', 'L', n, packed, eigenvalues, eigenvectors, n, work, lwork, iwork, liwork, info) call date_and_time(VALUES=measuredtime) print *, 'sspevd (symmetric, packed, divide): ', ((measuredtime(5)*3600+measuredtime(6)*60)+measuredtime(7))*1000+& & measuredtime(8)-starttime, 'ms' print *, '' end program