Commit 81bbab9

added doc in README. Update links to Intel VML.
1 parent 6340a0f commit 81bbab9

1 file changed (+22, -3 lines changed)

README.md

Lines changed: 22 additions & 3 deletions
@@ -7,7 +7,7 @@
 ![](https://github.com/JuliaMath/VML.jl/workflows/julia%201.6/badge.svg)
 ![](https://github.com/JuliaMath/VML.jl/workflows/julia%20nightly/badge.svg)
 
-This package provides bindings to the Intel MKL [Vector Mathematics Functions](https://software.intel.com/en-us/node/521751).
+This package provides bindings to the Intel MKL [Vector Mathematics Functions](https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/vector-mathematical-functions.html).
 This is often substantially faster than broadcasting Julia's built-in functions, especially when applying a transcendental function over a large array.
 Until Julia 0.6 the package was registered as `VML.jl`.
 
@@ -65,7 +65,19 @@ implementation, although the exact results may be different. To specify
 low accuracy, use `vml_set_accuracy(VML_LA)`. To specify enhanced
 performance, use `vml_set_accuracy(VML_EP)`. More documentation
 regarding these options is available on
-[Intel's website](http://software.intel.com/sites/products/documentation/hpc/mkl/IntelVectorMath/vmldata.htm).
+[Intel's website](https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/vector-mathematical-functions.html).
+
+### Denormalized numbers
+
+On some CPUs, operations on denormalized numbers are extremely slow. You can use `vml_set_denormalmode(VML_DENORMAL_FAST)`
+to handle denormalized numbers as zero. See `?VML_DENORMAL_FAST` for more information. You can get the
+current mode with `vml_get_denormalmode()`. The default is `VML_DENORMAL_ACCURATE`.
+
+### Threads
+
+By default, IntelVectorMath uses multithreading. The maximum number of threads that a call may use
+is given by `vml_get_max_threads()`. In most environments this will default to the number of physical
+cores available to IntelVectorMath. This behavior can be changed using `vml_set_num_threads(numthreads)`.
 
 ## Performance
 Summary of Results:
@@ -229,5 +241,12 @@ Next steps for this package
 
 
 ## Advanced
+
+<!-- This does not seem to be true anymore? No reference to CpuId.jl in the Manifest?
+
 IntelVectorMath.jl uses [CpuId.jl](https://github.com/m-j-w/CpuId.jl) to detect if your processor supports the newer `avx2` instructions, and if not defaults to `libmkl_vml_avx`. If your system does not have AVX this package will currently not work for you.
-If the CPU feature detection does not work for you, please open an issue.
+If the CPU feature detection does not work for you, please open an issue. -->
+
+As a quick aid for converting benchmark timings into operations per cycle, IntelVectorMath.jl
+provides `vml_get_cpu_frequency()`, which returns the *actual* current frequency of the
+CPU in GHz.
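
The settings documented in this commit can be combined as in the following minimal sketch. It is not taken from the README. It assumes that the functions and constants named in the diff are exported by `IntelVectorMath`, that `IVM.exp` is available as the vectorized exponential, and that `vml_get_max_threads()` returns an integer that `vml_set_num_threads` accepts. The array size and the single-thread setting are arbitrary illustrative choices.

```julia
using IntelVectorMath  # assumption: the package and its MKL binaries are installed

# Treat denormalized numbers as zero while benchmarking.
vml_set_denormalmode(VML_DENORMAL_FAST)

# Remember the thread limit, then restrict calls to one thread so the
# per-cycle figure below refers to a single core (illustrative choice).
max_threads = vml_get_max_threads()
vml_set_num_threads(1)

x = rand(10^6)
IVM.exp(x)                       # warm-up call so compilation is not timed
seconds = @elapsed IVM.exp(x)    # assumption: IVM.exp is the vectorized exp

# vml_get_cpu_frequency() reports GHz, so multiply by 1e9 to get cycles per second.
cycles = seconds * vml_get_cpu_frequency() * 1e9
elements_per_cycle = length(x) / cycles
println("exp: ", elements_per_cycle, " elements per cycle")

# Restore the defaults described in the README text above.
vml_set_num_threads(max_threads)
vml_set_denormalmode(VML_DENORMAL_ACCURATE)
```

Restoring `VML_DENORMAL_ACCURATE` explicitly, rather than passing back the value returned by `vml_get_denormalmode()`, avoids relying on the return type of the getter, which the diff does not specify.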
