Thursday, January 22, 2009

32- vs. 64-bit IDL on Mac OS X

IDL 7.0.4 was recently released, providing full 64-bit support for Mac OS X. Some users who have upgraded have asked how this affects GPULib, since the library doesn't work when IDL is running in 64-bit mode.

Currently, in order to build 64-bit code using GCC on OS X, one needs to set an additional flag, depending on the architecture targeted. We don't do this at the moment for GPULib, so it defaults to building the more compatible 32-bit libraries. Full details at

Apple goes so far as to say, "You should transition your application to a 64-bit executable format only when the 64-bit environment offers a compelling advantage for your specific application." We haven't switched to using 64-bit libraries with GPULib on Mac OS X, since thus far there hasn't been a need. If you do need the extra address space using GPULib, please let us know! Otherwise, you can add the "-32" flag to IDL 7.0.4 and run IDL in 32-bit mode, which will be compatible with the current release of GPULib.
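If you aren't sure which mode a given session is running in, IDL reports it through the standard !version system variable. The lines below are a minimal sketch of that check; the wording of the printed message is ours and is not something GPULib emits.

```idl
; !version.memory_bits is 32 or 64, depending on how IDL was started
print, !version.memory_bits

; illustrative reminder only: the current GPULib release needs 32-bit mode
if (!version.memory_bits ne 32) then print, 'Restart IDL with the -32 flag to use GPULib'
```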

Tuesday, January 13, 2009

Performing a shift in IDL with GPULib

A couple of users recently asked about doing a shift of a matrix using GPULib. While there is no interface that mimics 'shift' at the moment, all the functionality for shifting arrays is there. For example, you can use the following to shift an array along the x-direction:

IDL> x = findgen(5, 5)
IDL> print, x
0.00000 1.00000 2.00000 3.00000 4.00000
5.00000 6.00000 7.00000 8.00000 9.00000
10.0000 11.0000 12.0000 13.0000 14.0000
15.0000 16.0000 17.0000 18.0000 19.0000
20.0000 21.0000 22.0000 23.0000 24.0000

IDL> a = gpuPutArr(x)
IDL> b = gpuFltarr(5, 5)

; here comes the actual shift:
; we first perform b[0:3, *] = a[1:*, *]
; see the documentation for 'gpuSubArr' for an
; explanation of the arguments.
IDL> gpuSubArr, a, [1, -1], -1, b, [0, 3], -1

; and then b[4, *] = a[0, *]
IDL> gpuSubArr, a, 0, -1, b, 4, -1

IDL> res = gpuGetArr(b)
IDL> print, res

1.00000 2.00000 3.00000 4.00000 0.00000
6.00000 7.00000 8.00000 9.00000 5.00000
11.0000 12.0000 13.0000 14.0000 10.0000
16.0000 17.0000 18.0000 19.0000 15.0000
21.0000 22.0000 23.0000 24.0000 20.0000

Shifts in other directions can be implemented similarly. This is not the fastest possible shift, but it does let you perform the shift on the GPU, rather than transferring the array back to the CPU, shifting it there, and sending it to the GPU again.
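As a sketch of the y-direction case, the same two-step copy can be applied to the second index. The argument order for 'gpuSubArr' here is inferred from the x-direction example above (source array, x range, y range, destination array, x range, y range, with -1 meaning "everything"), so please check it against the 'gpuSubArr' documentation. The comparison against IDL's built-in 'shift' on the CPU is only there as a sanity check.

```idl
; assumes GPULib is already set up, as in the session above
x = findgen(5, 5)
a = gpuPutArr(x)
b = gpuFltarr(5, 5)

; first b[*, 0:3] = a[*, 1:*]
gpuSubArr, a, -1, [1, -1], b, -1, [0, 3]

; and then b[*, 4] = a[*, 0]
gpuSubArr, a, -1, 0, b, -1, 4

res = gpuGetArr(b)

; the CPU equivalent is shift(x, 0, -1), so this should print 1
print, array_equal(res, shift(x, 0, -1))
```

The x-direction shift shown earlier corresponds to shift(x, -1, 0) in the same way.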

As always, if you have questions about this or any other use of GPULib, feel free to leave comments here, or email us at

Double precision and GPULib

A few users have asked about double precision calculations using GPULib. I thought I'd try to clear up some common questions.

You may notice when running the GPULib unit tests in IDL (using 'make check') that the double precision tests fail. There are two likely reasons for this. First, not all CUDA-enabled hardware is capable of double precision calculations. Currently, the GTX 200 series cards are the only ones that can do this; check the NVIDIA Web site for details on your specific card.

Secondly, you may need to update your video card drivers. We've had some issues with the 177.x series of CUDA drivers, but the 180.x series is doing much better. I'd recommend upgrading to the most recent drivers available from NVIDIA in any case, as they are constantly improving both features and performance. The downloads are available at