Palabos  Version 1.1

plb::D3Q19ExampleCoProcessor3D< T > Class Template Reference

#include <coProcessor3D.h>



Public Member Functions

virtual int addDomain (plint nx, plint ny, plint nz, T omega, int &domainHandle)
 Add a domain for which the co-processor will perform computations.
virtual int send (int domainHandle, Box3D const &subDomain, std::vector< char > const &data)
 Copy data from Palabos' CPU memory to the co-processors' device memory.
virtual int receive (int domainHandle, Box3D const &subDomain, std::vector< char > &data) const
 Copy data from the co-processors' device memory to Palabos' CPU memory.
virtual int collideAndStream (int domainHandle)
 Execute a collision step on each cell, and then a streaming on the full domain.

Detailed Description

template<typename T>
class plb::D3Q19ExampleCoProcessor3D< T >

This placeholder co-processor simply implements all the functionality of a co-processor on the CPU, using Palabos library calls. It serves no practical computational purpose, but can be used to debug or illustrate the mechanism for calling co-processors in Palabos.


Member Function Documentation

template<typename T >
int plb::D3Q19ExampleCoProcessor3D< T >::addDomain ( plint  nx,
plint  ny,
plint  nz,
T  omega,
int &  domainHandle 
) [inline, virtual]

Add a domain for which the co-processor will perform computations.

In the present interface, every domain is indexed from 0 to nx-1, from 0 to ny-1, and from 0 to nz-1, regardless of where it is actually located in physical space.

The relaxation parameter omega is used to implement the BGK collision rule on the device.

A handle "domainHandle" is returned by the co-processor, and is subsequently used to identify the various domains during the calls to send(), receive(), and collideAndStream().

The method returns an error code: 1=success, 0=failure.
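The handle mechanism described above can be sketched with a toy stand-in. The class ToyCoProcessor below is hypothetical and is not the Palabos class: it only records the domain parameters, and it assumes that handles are assigned as consecutive indices, which the documentation does not specify. The return convention (1=success, 0=failure) follows the text above.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a co-processor (hypothetical; NOT the Palabos class).
// It only records domain parameters to illustrate the handle mechanism.
template <typename T>
class ToyCoProcessor {
public:
    // Returns 1 on success and 0 on failure, like the documented interface.
    // On success, domainHandle identifies the newly registered domain.
    int addDomain(long nx, long ny, long nz, T omega, int& domainHandle)
    {
        if (nx <= 0 || ny <= 0 || nz <= 0) return 0;
        Domain d = { nx, ny, nz, omega };
        domains.push_back(d);
        // Assumption: handles are simply consecutive indices.
        domainHandle = static_cast<int>(domains.size()) - 1;
        return 1;
    }

    // Number of cells of a registered domain, or 0 for an unknown handle.
    long numCells(int domainHandle) const
    {
        if (domainHandle < 0 ||
            static_cast<std::size_t>(domainHandle) >= domains.size()) {
            return 0;
        }
        Domain const& d = domains[domainHandle];
        return d.nx * d.ny * d.nz;
    }

private:
    struct Domain { long nx, ny, nz; T omega; };
    std::vector<Domain> domains;
};
```

In a real setting, the caller would check the returned code and then pass the stored handle to send(), receive(), and collideAndStream().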

Implements plb::CoProcessor3D< T >.

References PLB_ASSERT.

template<typename T >
int plb::D3Q19ExampleCoProcessor3D< T >::collideAndStream ( int  domainHandle  )  [inline, virtual]

Execute a collision step on each cell, and then a streaming on the full domain.

Note that the result of the streaming step is undefined in a one-cell layer at the outer border of the domain. The method collideAndStream() is free to produce whatever result it wishes inside this layer.

Note also that the collideAndStream() operation is blocking: it does not return until the operation has fully completed. To overlap computations, you must use the MPI-based multi-thread mechanism in Palabos.
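As a small illustration of the one-cell border layer mentioned above, the following sketch tests whether a cell lies in the part of the domain where the streaming result is well defined. The helper name isStreamingResultDefined is hypothetical; it assumes the domain is indexed from 0 to nx-1, 0 to ny-1, and 0 to nz-1, as stated for addDomain().

```cpp
// Returns true if cell (ix, iy, iz) lies strictly inside the domain, that is,
// outside the one-cell layer at the outer border where the result of
// collideAndStream() is undefined. (Hypothetical helper, not part of Palabos.)
inline bool isStreamingResultDefined(long ix, long iy, long iz,
                                     long nx, long ny, long nz)
{
    return ix >= 1 && ix <= nx - 2 &&
           iy >= 1 && iy <= ny - 2 &&
           iz >= 1 && iz <= nz - 2;
}
```

For a domain of size 4 x 4 x 4, only the inner 2 x 2 x 2 cells pass this test.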

Implements plb::CoProcessor3D< T >.

References PLB_ASSERT.

template<typename T >
int plb::D3Q19ExampleCoProcessor3D< T >::receive ( int  domainHandle,
Box3D const &  subDomain,
std::vector< char > &  data 
) const [inline, virtual]

Copy data from the co-processors' device memory to Palabos' CPU memory.

The method returns an error code: 1=success, 0=failure. Further information on the memory layout is available in the documentation of the method send().

Attention: the receive() method is responsible for resizing the data vector so that it is large enough to hold the result.
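A minimal sketch of how an implementation of receive() might size the buffer is given below. BoxSketch is a hypothetical stand-in for plb::Box3D with inclusive bounds, and the sketch assumes that the buffer holds exactly the 19 populations of type T per cell and nothing else; an actual implementation may transfer additional data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for plb::Box3D, with inclusive integer bounds.
struct BoxSketch {
    long x0, x1, y0, y1, z0, z1;
    long nx() const { return x1 - x0 + 1; }
    long ny() const { return y1 - y0 + 1; }
    long nz() const { return z1 - z0 + 1; }
};

// Bytes needed to store 19 populations of type T for every cell of the box.
// Assumption: no data other than the populations is transferred.
template <typename T>
std::size_t requiredBytes(BoxSketch const& box)
{
    return static_cast<std::size_t>(box.nx()) *
           static_cast<std::size_t>(box.ny()) *
           static_cast<std::size_t>(box.nz()) * 19 * sizeof(T);
}

// Resize the output buffer before copying data into it.
template <typename T>
void prepareBuffer(BoxSketch const& box, std::vector<char>& data)
{
    data.resize(requiredBytes<T>(box));
}
```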

Implements plb::CoProcessor3D< T >.

References PLB_ASSERT, and plb::modif::staticVariables.

template<typename T >
int plb::D3Q19ExampleCoProcessor3D< T >::send ( int  domainHandle,
Box3D const &  subDomain,
std::vector< char > const &  data 
) [inline, virtual]

Copy data from Palabos' CPU memory to the co-processors' device memory.

The method returns an error code: 1=success, 0=failure. Note that the memory of a std::vector is always contiguous, so you can obtain a C-array view of the data through the syntax char const* carray = &data[0]. Because data holds raw bytes, this pointer must be converted to T const* before the populations are read as values of type T.

The memory layout must respect the following ordering:

  • The fastest running index is for the 19 populations, with an ordering specified in the structure "D3Q19Constants" in the file "latticeBoltzmann/nearestNeighborLattices3D.hh".
  • The space indices are ordered according to the C convention, meaning that, if you take the space matrix to be declared as matrix[nx][ny][nz], then the z-index is fastest running.
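The ordering above can be sketched as an offset computation. The names populationOffset and readPopulation are hypothetical helpers, not Palabos functions, and the sketch assumes that ny and nz are the extents of the block being transferred. The order of the 19 populations themselves is the one defined in "D3Q19Constants" and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Number of populations per cell in the D3Q19 lattice.
const std::size_t kNumPop = 19;

// Offset, counted in values of type T, of population iPop at cell
// (ix, iy, iz): populations run fastest, then z, then y, then x.
// The extent nx is not needed to compute the offset.
inline std::size_t populationOffset(std::size_t ix, std::size_t iy,
                                    std::size_t iz, std::size_t iPop,
                                    std::size_t ny, std::size_t nz)
{
    assert(iy < ny && iz < nz && iPop < kNumPop);
    return ((ix * ny + iy) * nz + iz) * kNumPop + iPop;
}

// Read one value of type T from the raw byte buffer at the given offset.
// memcpy avoids alignment and aliasing problems with the char storage.
template <typename T>
T readPopulation(std::vector<char> const& data, std::size_t offset)
{
    T value;
    std::memcpy(&value, &data[offset * sizeof(T)], sizeof(T));
    return value;
}
```

With these helpers, stepping iz by one moves the offset by 19 values, and stepping iy by one moves it by 19*nz values.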

Implements plb::CoProcessor3D< T >.

References PLB_ASSERT, and plb::modif::staticVariables.


The documentation for this class was generated from the following files: