After the release of EPD 6.0, which now links numpy against the Intel MKL library (10.2), I wanted to get some insight into the performance impact of using the MKL.
What impact does the MKL have on numpy performance?
I have started a very rough benchmark comparing EPD 5.1 with EPD 6.0. The former uses numpy 1.3 with BLAS and the latter numpy 1.4 with the MKL. I am using a Thinkpad T60 with an Intel dual-core 2 GHz CPU running 32-bit Windows.
Note: the benchmarking methodology is very crude and could be made much more realistic, but it gives a first insight.
Contrary to what I said at the last LFPUG meeting on Wednesday, you can control the maximum number of threads used by the system through the OMP_NUM_THREADS environment variable. I have updated the benchmark script to show its value when it runs.
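As a rough illustration of how a script can report that setting, here is a minimal sketch using only the standard library. The helper names `omp_num_threads` and `describe_threads` are made up for this example and are not taken from the benchmark script. If you want to change the value from Python rather than from the shell, my understanding is that it has to be set before numpy is imported, since the threading runtime reads it when it initialises.

```python
import os


def omp_num_threads(environ=os.environ):
    """Return OMP_NUM_THREADS as an int, or None if unset or not a plain integer."""
    raw = environ.get("OMP_NUM_THREADS")
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        # The variable holds something other than a single integer.
        return None


def describe_threads(environ=os.environ):
    """Build a one-line, human-readable report of the thread setting."""
    value = omp_num_threads(environ)
    if value is None:
        return "OMP_NUM_THREADS is not set (the library picks the thread count)"
    return "OMP_NUM_THREADS = %d" % value


print(describe_threads())
```

Running the benchmark once with the variable set to 1 and once with it unset is a simple way to see how much of the gain comes from multi-threading.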
Here are some results:
- Testing linear algebra functions
I took some of the most commonly used functions and simply compared their CPU time using the ipython timeit command.
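If you are not working inside ipython, a similar measurement can be made with the standard `timeit` module. The sketch below is only an approximation of what the ipython command does: `best_time_ms` and `dummy_workload` are hypothetical names introduced for this example, and by default `timeit` measures wall-clock time rather than CPU time.

```python
import timeit


def best_time_ms(func, repeat=3, number=1):
    """Return the best per-call time of func() in milliseconds.

    func is called `number` times per measurement and the measurement is
    repeated `repeat` times. The minimum is kept because it is the run
    least disturbed by other processes.
    """
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number * 1000.0


def dummy_workload():
    # Stand-in workload (hypothetical); replace it with one of the test functions.
    return sum(x * x for x in range(100000))


print("dummy_workload: %.2f ms" % best_time_ms(dummy_workload))
```

Any function that takes no arguments can be passed in, so each of the test functions in this post can be timed the same way.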
Example 1: eigenvalues
import numpy
from numpy.random import random  # assumed source of random() in all examples

def test_eigenvalue():
    i = 500
    data = random((i, i))
    result = numpy.linalg.eig(data)

The results are interesting: 752ms for the MKL version versus 3376ms for the ATLAS one, which is about 4.5x faster. Running the very same code on Matlab 7.4 (R2007a) gives a timing of 790ms.
Example 2: singular value decomposition
def test_svd():
    i = 1000
    data = random((i, i))
    # full SVD, then the reduced form; both calls are part of the timing
    result = numpy.linalg.svd(data)
    result = numpy.linalg.svd(data, full_matrices=False)

Results are 4608ms with the MKL versus 15990ms without. This is nearly 3.5x faster.
Example 3: matrix inversion
def test_inv():
    i = 1000
    data = random((i, i))
    result = numpy.linalg.inv(data)

Results are 418ms with the MKL versus 1457ms without. This is about 3.5x faster.
Example 4: det()
def test_det():
    i = 1000
    data = random((i, i))
    result = numpy.linalg.det(data)

Results are 186ms with the MKL versus 400ms without. This is roughly 2x faster.
Example 5: dot()
def test_dot():
    i = 1000
    a = random((i, i))
    b = numpy.linalg.inv(a)
    # a times its inverse minus the identity should be close to zero
    result = numpy.dot(a, b) - numpy.eye(i)

Results are 666ms with the MKL versus 2444ms without. This is roughly 3.5x faster.
Conclusion:
Linear algebra functions show a clear performance improvement. I am happy to collect more data on this if you have some home-made benchmarks. If enough information comes in, we should consider publishing the results as an official benchmark somewhere.
| Function | Without MKL | With MKL | Speed up |
|---|---|---|---|
| test_eigenvalue | 3376ms | 752ms | 4.5x |
| test_svd | 15990ms | 4608ms | 3.5x |
| test_inv | 1457ms | 418ms | 3.5x |
| test_det | 400ms | 186ms | 2x |
| test_dot | 2444ms | 666ms | 3.5x |
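The speed-up figures are simply the timing without the MKL divided by the timing with it, rounded fairly loosely. A small sketch to recompute the unrounded ratios from the timings reported in this post (the dictionary layout and the `speedups` helper are my own choices for the illustration):

```python
# Timings in milliseconds, copied from the results in this post:
# (without MKL, with MKL).
timings_ms = {
    "test_eigenvalue": (3376, 752),
    "test_svd": (15990, 4608),
    "test_inv": (1457, 418),
    "test_det": (400, 186),
    "test_dot": (2444, 666),
}


def speedups(timings):
    """Map each test name to the ratio (time without MKL) / (time with MKL)."""
    return {
        name: without / float(with_mkl)
        for name, (without, with_mkl) in timings.items()
    }


for name, ratio in sorted(speedups(timings_ms).items()):
    print("%-16s %.2fx" % (name, ratio))
```

The same helper can be fed your own measurements to compare them against these numbers.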
For those of you wanting to test your own environment, feel free to use the script below.