Kernel interface¶
Finite Element Method Kernel Interface¶
The finite element method kernel interface (FEMKI) specifies an API for the launching of computational kernels for solving physics discretized using the finite element method. Using this approach, a set of generic element looping pattens and kernel launching functions may be implemented, and reused by various physics solvers that contain kernels conforming to the FEMKI.
There are several main components of the FEMKI:
- A collection of element looping functions that provide various looping
patterns, and call the
launch
function. - The kernel interface, which is specified by the
finiteElement::KernelBase class.
Each physics solver will define a class that contains its kernels functions,
most likely deriving, or conforming to the API specified by the KernelBase
class. Also part of this class will typically be a nested
StackVariables
class that defines a collection of stack variables for use in the various kernel interface functions. - A
launch
function, which launches the kernel, and calls the kernel interface functions conforming to the interface defined byKernelBase
. This function is actaully a member function of theKernel
class, so it may be overridden by a specific physics kernel, allowing complete customizationAn of the interface, while maintaining the usage of the looping patterns.
A Generic Element Looping Pattern¶
One example of a looping pattern is the regionBasedKernelApplication function.
The contents of the looping function are displayed here:
/**
* @brief Performs a loop over specific regions (by type and name) and calls
* a kernel launch on the subregions with compile time knowledge of
* sub-loop bounds such as number of nodes and quadrature points per
* element.
* @tparam POLICY The RAJA launch policy to pass to the kernel launch.
* @tparam CONSTITUTIVE_BASE The common base class for constitutive
* pass-thru/dispatch which gives the kernel launch
* compile time knowledge of the constitutive model.
* This is achieved through a call to the
* ConstitutivePassThru function which
* should have a specialization for CONSTITUTIVE_BASE
* implemented in order to perform the compile time
* dispatch.
* @tparam REGION_TYPE The type of region to loop over. TODO make this a
* parameter pack?
* @tparam KERNEL_TEMPLATE The type of template for the physics kernel, which
* conforms to the interface specified by KernelBase.
* @tparam KERNEL_CONSTRUCTOR_PARAMS The template parameter pack to hold the
* parameter pack (i.e. custom) parameters
* that are sent to the constructor for the
* @p KERNEL_TEMPLATE.
* @param mesh The MeshLevel object.
* @param targetRegions The names of the target regions(of type @p REGION_TYPE)
* to apply the @p KERNEL_TEMPLATE.
* @param finiteElementName The name of the finite element.
* @param constitutiveNames The names of the constitutive models present in the
* Region.
* @param kernelConstructorParams The parameter list for corresponding to the
* parameter @p KERNEL_CONSTRUCTOR_PARAMS that
* are passed to the @p KERNEL_TEMPLATE
* constructor.
* @return The maximum contribution to the residual, which may be used to scale
* the residual.
*
* Loops over all regions Applies/Launches a kernel specified by the @p KERNEL_TEMPLATE through
* #::geosx::finiteElement::KernelBase::kernelLaunch().
*/
template< typename POLICY,
typename CONSTITUTIVE_BASE,
typename REGION_TYPE,
template< typename SUBREGION_TYPE,
typename CONSTITUTIVE_TYPE,
typename FE_TYPE > class KERNEL_TEMPLATE,
typename ... KERNEL_CONSTRUCTOR_PARAMS >
static
real64 regionBasedKernelApplication( MeshLevel & mesh,
arrayView1d< string const > const & targetRegions,
string const & finiteElementName,
arrayView1d< string const > const & constitutiveNames,
KERNEL_CONSTRUCTOR_PARAMS && ... kernelConstructorParams )
{
GEOSX_MARK_FUNCTION;
// save the maximum residual contribution for scaling residuals for convergence criteria.
real64 maxResidualContribution = 0;
NodeManager & nodeManager = *(mesh.getNodeManager());
EdgeManager & edgeManager = *(mesh.getEdgeManager());
FaceManager & faceManager = *(mesh.getFaceManager());
ElementRegionManager & elementRegionManager = *(mesh.getElemManager());
// Create a tuple that contains the kernelConstructorParams, as the lambda does not properly catch the parameter pack
// until c++20
#if CONSTRUCTOR_PARAM_OPTION==1
std::tuple< KERNEL_CONSTRUCTOR_PARAMS &... > kernelConstructorParamsTuple = std::forward_as_tuple( kernelConstructorParams ... );
#elif CONSTRUCTOR_PARAM_OPTION==2
camp::tuple< KERNEL_CONSTRUCTOR_PARAMS &... > kernelConstructorParamsTuple = camp::forward_as_tuple( kernelConstructorParams ... );
#endif
// Loop over all sub-regions in regiongs of type REGION_TYPE, that are listed in the targetRegions array.
elementRegionManager.forElementSubRegions< REGION_TYPE >( targetRegions,
[&constitutiveNames,
&maxResidualContribution,
&nodeManager,
&edgeManager,
&faceManager,
&kernelConstructorParamsTuple,
&finiteElementName]
( localIndex const targetRegionIndex, auto & elementSubRegion )
{
localIndex const numElems = elementSubRegion.size();
// Create an alias for the type of subregion we are in, which is now known at compile time.
typedef TYPEOFREF( elementSubRegion ) SUBREGIONTYPE;
// Get the constitutive model...and allocate a null constitutive model if required.
constitutive::ConstitutiveBase * constitutiveRelation = nullptr;
constitutive::NullModel * nullConstitutiveModel = nullptr;
if( targetRegionIndex <= constitutiveNames.size()-1 )
{
constitutiveRelation = elementSubRegion.template getConstitutiveModel( constitutiveNames[targetRegionIndex] );
}
else
{
nullConstitutiveModel = elementSubRegion.template RegisterGroup< constitutive::NullModel >( "nullModelGroup" );
constitutiveRelation = nullConstitutiveModel;
}
// Call the constitutive dispatch which converts the type of constitutive model into a compile time constant.
constitutive::ConstitutivePassThru< CONSTITUTIVE_BASE >::Execute( constitutiveRelation,
[&maxResidualContribution,
&nodeManager,
&edgeManager,
&faceManager,
&kernelConstructorParamsTuple,
&elementSubRegion,
&finiteElementName,
numElems]
( auto * const castedConstitutiveRelation )
{
// Create an alias for the type of constitutive model.
using CONSTITUTIVE_TYPE = TYPEOFPTR( castedConstitutiveRelation );
string const elementTypeString = elementSubRegion.GetElementTypeString();
FiniteElementBase &
subRegionFE = elementSubRegion.template getReference< FiniteElementBase >( finiteElementName );
finiteElement::dispatch3D( subRegionFE,
[&maxResidualContribution,
&nodeManager,
&edgeManager,
&faceManager,
&kernelConstructorParamsTuple,
&elementSubRegion,
&numElems,
&castedConstitutiveRelation] ( auto const finiteElement )
{
using FE_TYPE = TYPEOFREF( finiteElement );
// Define an alias for the kernel type for easy use.
using KERNEL_TYPE = KERNEL_TEMPLATE< SUBREGIONTYPE,
CONSTITUTIVE_TYPE,
FE_TYPE >;
// 1) Combine the tuple containing the physics kernel specific constructor parameters with
// the parameters common to all phsyics kernels that use this interface,
// 2) Instantiate the kernel.
// note: have two options, using std::tuple and camp::tuple. Due to a bug in the OSX
// implementation of std::tuple_cat, we must use camp on OSX. In the future, we should
// only use one option...most likely camp since we can easily fix bugs.
#if CONSTRUCTOR_PARAM_OPTION==1
auto temp = std::forward_as_tuple( nodeManager,
edgeManager,
faceManager,
elementSubRegion,
finiteElement,
castedConstitutiveRelation );
auto fullKernelComponentConstructorArgs = std::tuple_cat( temp,
kernelConstructorParamsTuple );
KERNEL_TYPE kernelComponent = std::make_from_tuple< KERNEL_TYPE >( fullKernelComponentConstructorArgs );
#elif CONSTRUCTOR_PARAM_OPTION==2
auto temp = camp::forward_as_tuple( nodeManager,
edgeManager,
faceManager,
elementSubRegion,
finiteElement,
castedConstitutiveRelation );
auto fullKernelComponentConstructorArgs = camp::tuple_cat_pair( temp,
kernelConstructorParamsTuple );
KERNEL_TYPE kernelComponent = camp::make_from_tuple< KERNEL_TYPE >( fullKernelComponentConstructorArgs );
#endif
// Call the kernelLaunch function, and store the maximum contribution to the residual.
maxResidualContribution =
std::max( maxResidualContribution,
KERNEL_TYPE::template kernelLaunch< POLICY,
KERNEL_TYPE >( numElems,
kernelComponent ) );
} );
} );
// Remove the null constitutive model (not required, but cleaner)
if( nullConstitutiveModel )
{
elementSubRegion.deregisterGroup( "nullModelGroup" );
}
} );
return maxResidualContribution;
}
This pattern may be used with any kernel class that either:
- Conforms to the
KernelBase
interface by defining each of the kernel functions inKernelBase
. - Defines its own
kernelLaunch
function that conforms the the signature ofKernelBase::kernelLaunch
. This option essentially allows for a custom kernel that does not conform to the interface defined byKernelBase
andKernelBase::kernelLaunch
.
The KernelBase::kernelLaunch Interface¶
The kernelLaunch
function is a member of the kernel class itself.
As mentioned above, a physics implementation may use the existing KernelBase
interface, or define its own.
The KernelBase::kernelLaunch
function defines a launching policy, and an
internal looping pattern over the quadrautre points, and calls the functions
defined by the KernelBase
as shown here:
template< typename POLICY,
typename KERNEL_TYPE >
static
real64
kernelLaunch( localIndex const numElems,
KERNEL_TYPE const & kernelComponent )
{
GEOSX_MARK_FUNCTION;
// Define a RAJA reduction variable to get the maximum residual contribution.
RAJA::ReduceMax< ReducePolicy< POLICY >, real64 > maxResidual( 0 );
forAll< POLICY >( numElems,
[=] GEOSX_HOST_DEVICE ( localIndex const k )
{
typename KERNEL_TYPE::StackVariables stack;
kernelComponent.setup( k, stack );
for( integer q=0; q<numQuadraturePointsPerElem; ++q )
{
kernelComponent.quadraturePointKernel( k, q, stack );
}
maxResidual.max( kernelComponent.complete( k, stack ) );
} );
return maxResidual.get();
}
Each of the KernelBase
functions called in the KernelBase::kernelLaunch
function are intended to provide a certain amount of modularity and flexibility
for the physics implementations.
The general purpose of each function is described by the function name, but may
be further descibed by the function documentation found
here.