Palabos
Version 1.1
#include <coProcessor3D.h>

Public Member Functions | |
| virtual | ~CoProcessor3D () |
| virtual int | addDomain (plint nx, plint ny, plint nz, T omega, int &domainHandle)=0 |
| Add a domain for which the co-processor will perform computations. | |
| virtual int | send (int domainHandle, Box3D const &subDomain, std::vector< char > const &data)=0 |
| Copy data from Palabos' CPU memory to the co-processors' device memory. | |
| virtual int | receive (int domainHandle, Box3D const &subDomain, std::vector< char > &data) const =0 |
| Copy data from the co-processors' device memory to Palabos' CPU memory. | |
| virtual int | collideAndStream (int domainHandle)=0 |
| Execute a collision step on each cell, and then a streaming on the full domain. | |
A co-processor provides access to a computational hardware unit, such as a GPU or an FPGA. An instance of the CoProcessor3D class represents a single hardware unit. On a multi-GPU machine, for instance, a new CoProcessor3D is instantiated for each GPU.
A co-processor acts exclusively on rectangular domains, and can be responsible for more than one domain. The method addDomain() is used to add new domains for which the co-processor is responsible.
The memory is duplicated: it is allocated once on the CPU by Palabos and once on the device by the co-processor. The send() and receive() methods are responsible for communication between the two memory spaces, while the collideAndStream() method works on device memory only.
At this stage, co-processors implement only BGK dynamics on a D3Q19 lattice. Also, only the collide-and-stream operation is performed by the device at this point. Both these aspects will be generalized in the future.
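The duplicated-memory model described above can be illustrated with a small host-only mock. This is a sketch under stated assumptions, not the Palabos implementation: `plint` and `Box3D` are simplified stand-ins for the Palabos types, `HostMockCoProcessor3D` is a hypothetical class, the "device memory" is simply a second host-side buffer, the sub-domain argument is ignored, and collideAndStream() is a placeholder that performs no physics.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for the Palabos types (assumptions for this sketch).
typedef long plint;
struct Box3D { plint x0, x1, y0, y1, z0, z1; };

// The interface, following the member table of this page.
template <typename T>
class CoProcessor3D {
public:
    virtual ~CoProcessor3D() { }
    virtual int addDomain(plint nx, plint ny, plint nz, T omega, int &domainHandle) = 0;
    virtual int send(int domainHandle, Box3D const &subDomain,
                     std::vector<char> const &data) = 0;
    virtual int receive(int domainHandle, Box3D const &subDomain,
                        std::vector<char> &data) const = 0;
    virtual int collideAndStream(int domainHandle) = 0;
};

// Hypothetical mock: one byte buffer per domain plays the role of device memory.
template <typename T>
class HostMockCoProcessor3D : public CoProcessor3D<T> {
public:
    virtual int addDomain(plint nx, plint ny, plint nz, T omega, int &domainHandle) {
        if (nx <= 0 || ny <= 0 || nz <= 0) return 0;  // 0 = failure
        Domain d;
        d.nx = nx; d.ny = ny; d.nz = nz; d.omega = omega;
        domains.push_back(d);
        domainHandle = static_cast<int>(domains.size()) - 1;
        return 1;  // 1 = success
    }
    // CPU -> "device": copy the bytes (sub-domain ignored in this mock).
    virtual int send(int h, Box3D const &, std::vector<char> const &data) {
        if (!valid(h)) return 0;
        domains[static_cast<std::size_t>(h)].deviceCopy = data;
        return 1;
    }
    // "device" -> CPU: the assignment also resizes the caller's vector.
    virtual int receive(int h, Box3D const &, std::vector<char> &data) const {
        if (!valid(h)) return 0;
        data = domains[static_cast<std::size_t>(h)].deviceCopy;
        return 1;
    }
    // Placeholder: a real co-processor would run BGK collision and streaming here.
    virtual int collideAndStream(int h) { return valid(h) ? 1 : 0; }

private:
    struct Domain {
        plint nx, ny, nz;
        T omega;
        std::vector<char> deviceCopy;
    };
    bool valid(int h) const {
        return h >= 0 && h < static_cast<int>(domains.size());
    }
    std::vector<Domain> domains;
};
```

A caller would typically register a domain once, then alternate send(), collideAndStream(), and receive(), checking each return value against 1.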
| virtual plb::CoProcessor3D< T >::~CoProcessor3D () | [inline, virtual] |
| virtual int plb::CoProcessor3D< T >::addDomain (plint nx, plint ny, plint nz, T omega, int &domainHandle) | [pure virtual] |
Add a domain for which the co-processor will perform computations.
In the current interface, every domain is indexed from 0 to nx-1, from 0 to ny-1, and from 0 to nz-1, regardless of where it is actually located in physical space.
The relaxation parameter omega is used to implement the BGK collision rule on the device.
The co-processor returns a handle through the reference parameter domainHandle. This handle is subsequently used to identify the domain in calls to send(), receive(), and collideAndStream().
The method returns an error code: 1=success, 0=failure.
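Because a domain is always indexed from 0 on the co-processor side, a caller that works in global lattice coordinates has to shift its boxes before passing them on. The following is a minimal sketch of that shift; `plint` and `Box3D` are simplified stand-ins for the Palabos types, and `toLocal` and `extent` are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>

// Simplified stand-ins for the Palabos types (assumptions for this sketch).
typedef long plint;
struct Box3D { plint x0, x1, y0, y1, z0, z1; };

// Hypothetical helper: number of cells along one axis of an inclusive range.
// These are the values one would pass as nx, ny, nz to addDomain().
inline plint extent(plint lo, plint hi) { return hi - lo + 1; }

// Hypothetical helper: express a box given in global coordinates relative to
// the lower corner of the domain, so that the domain itself starts at 0.
inline Box3D toLocal(Box3D const &globalBox, Box3D const &domainInGlobal) {
    Box3D local;
    local.x0 = globalBox.x0 - domainInGlobal.x0;
    local.x1 = globalBox.x1 - domainInGlobal.x0;
    local.y0 = globalBox.y0 - domainInGlobal.y0;
    local.y1 = globalBox.y1 - domainInGlobal.y0;
    local.z0 = globalBox.z0 - domainInGlobal.z0;
    local.z1 = globalBox.z1 - domainInGlobal.z0;
    return local;
}
```

For example, a domain occupying x = 10..19 globally would be registered with nx = 10, and a sub-box covering x = 12..15 would be addressed as x = 2..5.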
Implemented in plb::D3Q19ExampleCoProcessor3D< T >, and plb::D3Q19CudaCoProcessor3D< T >.
| virtual int plb::CoProcessor3D< T >::collideAndStream (int domainHandle) | [pure virtual] |
Execute a collision step on each cell, and then a streaming on the full domain.
Note that the result of the streaming step is undefined in a one-cell layer at the outer border of the domain. The method collideAndStream() is free to produce whatever result it wishes inside this layer.
Note also that the collideAndStream() operation is blocking: it does not return until the operation has fully completed. To overlap computations, you must use the MPI-based multi-thread mechanism in Palabos.
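Since the one-cell border layer carries no defined result after a call to collideAndStream(), a caller may want to restrict itself to the interior of the domain. A minimal sketch of that region, in the local coordinates of a domain of size nx by ny by nz, could look as follows; `plint` and `Box3D` are simplified stand-ins for the Palabos types, and `definedInterior` and `numCells` are hypothetical helpers.

```cpp
#include <cassert>

// Simplified stand-ins for the Palabos types (assumptions for this sketch).
typedef long plint;
struct Box3D { plint x0, x1, y0, y1, z0, z1; };

// Hypothetical helper: compute the box that excludes the one-cell outer layer.
// Returns false when the domain is too thin to have any interior cell.
inline bool definedInterior(plint nx, plint ny, plint nz, Box3D &interior) {
    if (nx < 3 || ny < 3 || nz < 3) return false;
    interior.x0 = 1; interior.x1 = nx - 2;
    interior.y0 = 1; interior.y1 = ny - 2;
    interior.z0 = 1; interior.z1 = nz - 2;
    return true;
}

// Hypothetical helper: number of cells in a box with inclusive bounds.
inline plint numCells(Box3D const &b) {
    return (b.x1 - b.x0 + 1) * (b.y1 - b.y0 + 1) * (b.z1 - b.z0 + 1);
}
```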
Implemented in plb::D3Q19ExampleCoProcessor3D< T >, and plb::D3Q19CudaCoProcessor3D< T >.
| virtual int plb::CoProcessor3D< T >::receive (int domainHandle, Box3D const &subDomain, std::vector< char > &data) const | [pure virtual] |
Copy data from the co-processors' device memory to Palabos' CPU memory.
The method returns an error code: 1=success, 0=failure. Further information on the memory layout is available in the documentation of the method send().
Attention: the receive() method is responsible for resizing the data vector so that it is big enough.
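The resizing step can be sketched as below. The byte count per cell is an assumption made for this illustration only: it supposes that each cell of the D3Q19 lattice contributes exactly 19 values of type T and nothing else, which this page does not confirm. `plint` and `Box3D` are simplified stand-ins for the Palabos types, and `bytesForSubDomain` and `resizeForReceive` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for the Palabos types (assumptions for this sketch).
typedef long plint;
struct Box3D { plint x0, x1, y0, y1, z0, z1; };

// Assumption: 19 populations of type T per cell and no additional per-cell data.
template <typename T>
std::size_t bytesForSubDomain(Box3D const &b) {
    std::size_t nCells = static_cast<std::size_t>(b.x1 - b.x0 + 1)
                       * static_cast<std::size_t>(b.y1 - b.y0 + 1)
                       * static_cast<std::size_t>(b.z1 - b.z0 + 1);
    return nCells * 19 * sizeof(T);
}

// Hypothetical helper: what an implementation of receive() might do before
// copying device data into the caller's vector.
template <typename T>
void resizeForReceive(Box3D const &subDomain, std::vector<char> &data) {
    data.resize(bytesForSubDomain<T>(subDomain));
}
```

With this design the caller can pass an empty vector and rely on the implementation to size it.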
Implemented in plb::D3Q19ExampleCoProcessor3D< T >, and plb::D3Q19CudaCoProcessor3D< T >.
| virtual int plb::CoProcessor3D< T >::send (int domainHandle, Box3D const &subDomain, std::vector< char > const &data) | [pure virtual] |
Copy data from Palabos' CPU memory to the co-processors' device memory.
The method returns an error code: 1=success, 0=failure. Please note that the memory of a std::vector is always contiguous, which means that you can get a C-array representation of the data through the syntax char const* carray = &data[0]; the bytes can then be interpreted as values of type T.
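The contiguity remark can be made concrete with a small sketch that moves values of type T in and out of a std::vector<char>. It only shows how raw bytes are accessed; it makes no statement about the ordering of the data expected by send(). `packValues` and `unpackValue` are hypothetical helpers, and std::memcpy is used rather than a pointer cast so that alignment is not a concern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: serialize a sequence of T values into a byte vector.
template <typename T>
std::vector<char> packValues(std::vector<T> const &values) {
    std::vector<char> data(values.size() * sizeof(T));
    if (!values.empty()) {
        std::memcpy(&data[0], &values[0], data.size());
    }
    return data;
}

// Hypothetical helper: read the index-th T value back out of the byte vector.
// The caller must make sure that data holds at least index+1 values.
template <typename T>
T unpackValue(std::vector<char> const &data, std::size_t index) {
    char const *carray = &data[0];  // contiguous storage of the vector
    T value;
    std::memcpy(&value, carray + index * sizeof(T), sizeof(T));
    return value;
}
```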
The memory layout must respect the following ordering:
Implemented in plb::D3Q19ExampleCoProcessor3D< T >, and plb::D3Q19CudaCoProcessor3D< T >.