Miracle¶
<username>@miracle.tcs.uj.edu.pl
CUDA compiler: nvcc (if that does not work, use /usr/local/cuda/bin/nvcc)
NVIDIA sample programs: /usr/local/cuda/samples
Literature¶
An open, online course from Udacity: Intro to Parallel Programming
Examples¶
Compilation: nvcc helloWorldRuntime.cu
Compilation: Makefile
Matrix addition (runtime API)
Matrix multiplication (runtime API), without shared memory
Matrix multiplication (runtime API), with shared memory
Slides from the first lab session¶
CUresult¶
cuMemAlloc¶
Allocates device memory.
Allocates bytesize bytes of linear memory on the device and returns in *dptr a pointer to the allocated memory. The allocated memory is suitably aligned for any kind of variable. The memory is not cleared. If bytesize is 0, cuMemAlloc() returns CUDA_ERROR_INVALID_VALUE.
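The behaviour described above can be tried with a short driver API program. This is a minimal sketch, not the course's reference code: the helper names `ok`, `initDevice` and `roundTrip`, the file name in the build comment, and the choice of device 0 with its primary context are all assumptions made for illustration.

```cuda
// Build (assumed file name): nvcc driver_alloc.cu -o driver_alloc -lcuda
#include <cuda.h>
#include <cstdio>
#include <vector>

// Hypothetical helper: report a failed driver call by its CUresult name.
static bool ok(CUresult res, const char *what) {
    if (res == CUDA_SUCCESS) return true;
    const char *name = nullptr;
    cuGetErrorName(res, &name);
    std::fprintf(stderr, "%s failed: %s\n", what, name ? name : "unknown error");
    return false;
}

// Hypothetical helper: initialise the driver and make the primary context
// of device 0 (an assumption) current on this thread.
static bool initDevice(CUdevice *dev) {
    CUcontext ctx;
    return ok(cuInit(0), "cuInit") &&
           ok(cuDeviceGet(dev, 0), "cuDeviceGet") &&
           ok(cuDevicePrimaryCtxRetain(&ctx, *dev), "cuDevicePrimaryCtxRetain") &&
           ok(cuCtxSetCurrent(ctx), "cuCtxSetCurrent");
}

// Hypothetical helper: copy n ints to the device and back; true if unchanged.
static bool roundTrip(size_t n) {
    std::vector<int> src(n), dst(n, 0);
    for (size_t i = 0; i < n; ++i) src[i] = static_cast<int>(i);

    CUdeviceptr dptr = 0;
    if (!ok(cuMemAlloc(&dptr, n * sizeof(int)), "cuMemAlloc")) return false;
    // The allocation is not cleared, so it is written before it is read.
    bool good = ok(cuMemcpyHtoD(dptr, src.data(), n * sizeof(int)), "cuMemcpyHtoD") &&
                ok(cuMemcpyDtoH(dst.data(), dptr, n * sizeof(int)), "cuMemcpyDtoH");
    cuMemFree(dptr);
    return good && src == dst;
}

int main() {
    CUdevice dev;
    if (!initDevice(&dev)) return 1;

    // A zero-byte request should be rejected with CUDA_ERROR_INVALID_VALUE.
    CUdeviceptr empty = 0;
    bool zeroRejected = cuMemAlloc(&empty, 0) == CUDA_ERROR_INVALID_VALUE;
    bool copied = roundTrip(1 << 20);
    std::printf("zero-size rejected: %d, round trip ok: %d\n", zeroRejected, copied);

    cuDevicePrimaryCtxRelease(dev);
    return (zeroRejected && copied) ? 0 : 1;
}
```

Every driver call returns a CUresult, so the sketch funnels each one through a single checking helper instead of ignoring the return value.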
cuMemAllocHost¶
Allocates page-locked host memory.
Allocates bytesize bytes of host memory that is page-locked and accessible to the device. The driver tracks the virtual memory ranges allocated with this function and automatically accelerates calls to functions such as cuMemcpy(). Since the memory can be accessed directly by the device, it can be read or written with much higher bandwidth than pageable memory obtained with functions such as malloc(). Allocating excessive amounts of memory with cuMemAllocHost() may degrade system performance, since it reduces the amount of memory available to the system for paging. As a result, this function is best used sparingly to allocate staging areas for data exchange between host and device.
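The staging-area use case mentioned above might look as follows. This is only a sketch under stated assumptions: `stageThroughPinned` is a hypothetical helper, device 0 and its primary context are chosen arbitrarily, and the buffer size is illustrative.

```cuda
// Build (assumed file name): nvcc pinned_staging.cu -o pinned_staging -lcuda
#include <cuda.h>
#include <cstdio>
#include <cstring>

// Hypothetical helper: push n floats through a page-locked staging buffer
// to the device and back; returns true if the data survives the round trip.
static bool stageThroughPinned(size_t n) {
    const size_t bytes = n * sizeof(float);
    float *staging = nullptr;
    CUdeviceptr dptr = 0;

    if (cuMemAllocHost(reinterpret_cast<void **>(&staging), bytes) != CUDA_SUCCESS)
        return false;
    if (cuMemAlloc(&dptr, bytes) != CUDA_SUCCESS) {
        cuMemFreeHost(staging);
        return false;
    }

    for (size_t i = 0; i < n; ++i) staging[i] = 0.5f * static_cast<float>(i);
    bool good = cuMemcpyHtoD(dptr, staging, bytes) == CUDA_SUCCESS;

    // Wipe the host copy so the check below really reads device data.
    std::memset(staging, 0, bytes);
    good = good && cuMemcpyDtoH(staging, dptr, bytes) == CUDA_SUCCESS;
    for (size_t i = 0; good && i < n; ++i)
        good = (staging[i] == 0.5f * static_cast<float>(i));

    cuMemFree(dptr);
    cuMemFreeHost(staging);  // page-locked memory is released with cuMemFreeHost, not free()
    return good;
}

int main() {
    CUdevice dev;
    CUcontext ctx;
    // Device 0 and its primary context are assumptions made for this sketch.
    if (cuInit(0) != CUDA_SUCCESS || cuDeviceGet(&dev, 0) != CUDA_SUCCESS ||
        cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS ||
        cuCtxSetCurrent(ctx) != CUDA_SUCCESS) {
        std::fprintf(stderr, "could not set up a CUDA context\n");
        return 1;
    }

    bool good = stageThroughPinned(1 << 20);
    std::printf("pinned staging round trip ok: %d\n", good);

    cuDevicePrimaryCtxRelease(dev);
    return good ? 0 : 1;
}
```

The buffer is kept small and freed as soon as the transfer is done, in line with the advice to use page-locked memory sparingly.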
Profiler¶
/usr/local/cuda/bin/nvprof
/usr/local/cuda/bin/nvprof --events all --metrics all ./solution.x
/usr/local/cuda/bin/nvprof --events elapsed_cycles_sm ./solution.x
/usr/local/cuda/bin/nvprof --query-events ./solution.x
Bank conflicts:
Loads: /usr/local/cuda/bin/nvprof --events shared_ld_bank_conflict ./solution.x
Stores: /usr/local/cuda/bin/nvprof --events shared_st_bank_conflict ./solution.x
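To have something to measure with these events, a small kernel that deliberately provokes load bank conflicts can help. The following is a sketch, not part of the course material: it is the well-known tiled transpose, and the names `transposeTile`, `transposeMatches`, `TILE` and `PAD` are made up for illustration. It assumes a GPU with 32 shared memory banks of 4-byte words, so that with `PAD = 0` a warp reading a tile column hits a single bank, while `PAD = 1` spreads the column over all banks.

```cuda
// Build (assumed file name): nvcc bank_conflicts.cu -o bank_conflicts
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int TILE = 32;  // one warp per tile row; assumes 32 banks

// Transpose an n x n matrix through a shared memory tile.
// PAD = 0: column reads of the tile fall into one bank (load conflicts).
// PAD = 1: each tile row is shifted by one bank, so column reads spread out.
template <int PAD>
__global__ void transposeTile(const float *in, float *out, int n) {
    __shared__ float tile[TILE][TILE + PAD];

    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < n && y < n) tile[threadIdx.y][threadIdx.x] = in[y * n + x];
    __syncthreads();

    // Swap the block indices so that global writes stay row-contiguous.
    x = blockIdx.y * TILE + threadIdx.x;
    y = blockIdx.x * TILE + threadIdx.y;
    if (x < n && y < n) out[y * n + x] = tile[threadIdx.x][threadIdx.y];
}

// Hypothetical helper: run one variant and compare against a host transpose.
template <int PAD>
static bool transposeMatches(int n) {
    std::vector<float> hIn(static_cast<size_t>(n) * n), hOut(hIn.size(), -1.0f);
    for (size_t i = 0; i < hIn.size(); ++i) hIn[i] = static_cast<float>(i % 1000);

    const size_t bytes = hIn.size() * sizeof(float);
    float *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, bytes) != cudaSuccess) { cudaFree(dIn); return false; }
    cudaMemcpy(dIn, hIn.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    transposeTile<PAD><<<grid, block>>>(dIn, dOut, n);
    bool good = cudaGetLastError() == cudaSuccess &&
                cudaMemcpy(hOut.data(), dOut, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(dIn);
    cudaFree(dOut);

    for (int r = 0; good && r < n; ++r)
        for (int c = 0; good && c < n; ++c)
            good = hOut[static_cast<size_t>(r) * n + c] == hIn[static_cast<size_t>(c) * n + r];
    return good;
}

int main() {
    bool plain = transposeMatches<0>(1024);   // expected to show load conflicts
    bool padded = transposeMatches<1>(1024);  // expected to show far fewer
    std::printf("unpadded ok: %d, padded ok: %d\n", plain, padded);
    return (plain && padded) ? 0 : 1;
}
```

Running the resulting binary under `nvprof --events shared_ld_bank_conflict` should report the event separately for each kernel instantiation, which makes the effect of the padding visible, provided the event is available on the installed GPU.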
MPI - Message Passing Interface¶
TensorFlow¶
On Miracle:
<user>@miracle:~$ cd /mnt/storage/users/<user>
<user>@miracle:~$ virtualenv --system-site-packages -p python3 tensorflow
<user>@miracle:~$ source tensorflow/bin/activate
<user>@miracle:~$ pip3 install --upgrade tensorflow-gpu
Slides: