|
7 | 7 |  |
8 | 8 |  |
9 | 9 |
|
10 | | -This package provides bindings to the Intel MKL [Vector Mathematics Functions](https://software.intel.com/en-us/node/521751). |
| 10 | +This package provides bindings to the Intel MKL [Vector Mathematics Functions](https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/vector-mathematical-functions.html). |
11 | 11 | This is often substantially faster than broadcasting Julia's built-in functions, especially when applying a transcendental function over a large array. |
12 | 12 | Until Julia 0.6 the package was registered as `VML.jl`. |
13 | 13 |
|
@@ -65,7 +65,19 @@ implementation, although the exact results may be different. To specify |
65 | 65 | low accuracy, use `vml_set_accuracy(VML_LA)`. To specify enhanced |
66 | 66 | performance, use `vml_set_accuracy(VML_EP)`. More documentation |
67 | 67 | regarding these options is available on |
68 | | -[Intel's website](http://software.intel.com/sites/products/documentation/hpc/mkl/IntelVectorMath/vmldata.htm). |
| 68 | +[Intel's website](https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/vector-mathematical-functions.html). |
| 69 | + |
| 70 | +### Denormalized numbers |
| 71 | + |
| 72 | +On some CPUs, operations on denormalized numbers are extremely slow. You can use `vml_set_denormalmode(VML_DENORMAL_FAST)` |
| 73 | +to treat denormalized numbers as zero. See `?VML_DENORMAL_FAST` for more information. You can get the |
| 74 | +current mode with `vml_get_denormalmode()`. The default is `VML_DENORMAL_ACCURATE`. |
| 75 | + |
| 76 | +### Threads |
| 77 | + |
| 78 | +By default, IntelVectorMath uses multithreading. The maximum number of threads that a call may use |
| 79 | +is given by `vml_get_max_threads()`. In most environments this defaults to the number of physical |
| 80 | +cores available to IntelVectorMath. This behavior can be changed using `vml_set_num_threads(numthreads)`. |
69 | 81 |
|
70 | 82 | ## Performance |
71 | 83 | Summary of Results: |
@@ -229,5 +241,12 @@ Next steps for this package |
229 | 241 |
|
230 | 242 |
|
231 | 243 | ## Advanced |
| 244 | + |
| 245 | +<!-- This does not seem to be true anymore? No reference to CpuId.jl in the Manifest?
| 246 | +
|
232 | 247 | IntelVectorMath.jl uses [CpuId.jl](https://github.com/m-j-w/CpuId.jl) to detect if your processor supports the newer `avx2` instructions, and if not defaults to `libmkl_vml_avx`. If your system does not have AVX this package will currently not work for you. |
233 | | -If the CPU feature detection does not work for you, please open an issue. |
| 248 | +If the CPU feature detection does not work for you, please open an issue. --> |
| 249 | + |
| 250 | +To help convert benchmark timings into operations per cycle, IntelVectorMath.jl |
| 251 | +provides `vml_get_cpu_frequency()`, which returns the *actual* current frequency of the |
| 252 | +CPU in GHz. |