Calling global function inside a CUDA Kernel

May 15, 2014

I'm trying to write a CUDA kernel function that contains a matrix multiplication, like this:

    __device__ void matrix_multi(matrix a, matrix b, matrix c);

    __global__ void foo(type para) {
        ....
        matrix_multi(a, b, c);
        ....
    }

I want to accelerate the matrix multiplication operation. I have two choices:

First, use the cuBLAS library. Second, write a kernel for the matrix multiplication and call it inside foo().

I failed in both cases.

Can anyone help?

I suggest you not write your own mat-mul kernel at this time. Try the cuBLAS way.
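As a rough illustration of the cuBLAS route when it is called from the host, here is a minimal sketch. The function name `multiplyWithCublas`, the 2x2 matrix size, and the test values are my own assumptions for the example, not something from the question. It uses `cublasSgemm` from the `cublas_v2.h` API with column-major storage, and would be built with something like `nvcc example.cu -lcublas`.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical helper: computes C = A * B for n x n single-precision
// matrices stored in column-major order. Returns false on any failure.
bool multiplyWithCublas(const std::vector<float>& A,
                        const std::vector<float>& B,
                        std::vector<float>& C, int n) {
    const size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cublasHandle_t handle = nullptr;
    bool ok = false;

    if (cudaMalloc(&dA, bytes) == cudaSuccess &&
        cudaMalloc(&dB, bytes) == cudaSuccess &&
        cudaMalloc(&dC, bytes) == cudaSuccess &&
        cublasCreate(&handle) == CUBLAS_STATUS_SUCCESS) {
        cudaMemcpy(dA, A.data(), bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, B.data(), bytes, cudaMemcpyHostToDevice);

        const float alpha = 1.0f;  // C = alpha * A * B + beta * C
        const float beta = 0.0f;
        cublasStatus_t st = cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                                        n, n, n, &alpha,
                                        dA, n, dB, n, &beta, dC, n);
        if (st == CUBLAS_STATUS_SUCCESS) {
            C.resize(static_cast<size_t>(n) * n);
            ok = cudaMemcpy(C.data(), dC, bytes,
                            cudaMemcpyDeviceToHost) == cudaSuccess;
        }
    }

    // Release whatever was acquired; cudaFree(nullptr) is a no-op.
    if (handle) cublasDestroy(handle);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return ok;
}

int main() {
    const int n = 2;                          // assumed size for the example
    std::vector<float> A = {1, 2, 3, 4};      // column-major: [1 3; 2 4]
    std::vector<float> B = {1, 0, 0, 1};      // identity, so C should equal A
    std::vector<float> C;

    if (!multiplyWithCublas(A, B, C, n)) {
        fprintf(stderr, "cuBLAS multiplication failed\n");
        return 1;
    }
    for (float v : C) printf("%g ", v);
    printf("\n");
    return 0;
}
```

Here the multiplication is launched from host code rather than from inside foo(), so the surrounding kernel would need to be split into the part before and the part after the multiplication.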

The cuBLAS library can be called from inside a kernel only on devices with a compute capability of at least 3.5. Otherwise it can only be called from the host side. Check your device's CC version before using the cuBLAS library this way.
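One way to check the compute capability is to query the device properties at runtime. This is only a sketch: the helper name `meetsComputeCapability` and the choice of device 0 are assumptions for illustration. Also, as far as I recall, the device-side cuBLAS library was dropped from later CUDA toolkits (10.0 onward), so the in-kernel option applies to older toolkits.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true when (major, minor) is at least
// (reqMajor, reqMinor).
bool meetsComputeCapability(int major, int minor, int reqMajor, int reqMinor) {
    return major > reqMajor || (major == reqMajor && minor >= reqMinor);
}

int main() {
    const int device = 0;  // assumption: inspect the first CUDA device
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);
    if (meetsComputeCapability(prop.major, prop.minor, 3, 5)) {
        printf("CC >= 3.5: calls from inside a kernel may be possible\n");
    } else {
        printf("CC < 3.5: call cuBLAS from the host side only\n");
    }
    return 0;
}
```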
