Kernel interface

Finite Element Method Kernel Interface

The finite element method kernel interface (FEMKI) specifies an API for the launching of computational kernels for solving physics discretized using the finite element method. Using this approach, a set of generic element looping pattens and kernel launching functions may be implemented, and reused by various physics solvers that contain kernels conforming to the FEMKI.

There are several main components of the FEMKI:

  1. A collection of element looping functions that provide various looping patterns, and call the launch function.
  2. The kernel interface, which is specified by the finiteElement::KernelBase class. Each physics solver will define a class that contains its kernels functions, most likely deriving, or conforming to the API specified by the KernelBase class. Also part of this class will typically be a nested StackVariables class that defines a collection of stack variables for use in the various kernel interface functions.
  3. A launch function, which launches the kernel, and calls the kernel interface functions conforming to the interface defined by KernelBase. This function is actaully a member function of the Kernel class, so it may be overridden by a specific physics kernel, allowing complete customizationAn of the interface, while maintaining the usage of the looping patterns.

A Generic Element Looping Pattern

One example of a looping pattern is the regionBasedKernelApplication function.

The contents of the looping function are displayed here:

/**
 * @brief Performs a loop over specific regions (by type and name) and calls
 *        a kernel launch on the subregions with compile time knowledge of
 *        sub-loop bounds such as number of nodes and quadrature points per
 *        element.
 * @tparam POLICY The RAJA launch policy to pass to the kernel launch.
 * @tparam CONSTITUTIVE_BASE The common base class for constitutive
 *                           pass-thru/dispatch which gives the kernel launch
 *                           compile time knowledge of the constitutive model.
 *                           This is achieved through a call to the
 *                           ConstitutivePassThru function which
 *                           should have a specialization for CONSTITUTIVE_BASE
 *                           implemented in order to perform the compile time
 *                           dispatch.
 * @tparam REGION_TYPE The type of region to loop over. TODO make this a
 *                     parameter pack?
 * @tparam KERNEL_TEMPLATE The type of template for the physics kernel, which
 *                         conforms to the interface specified by KernelBase.
 * @tparam KERNEL_CONSTRUCTOR_PARAMS The template parameter pack to hold the
 *                                   parameter pack (i.e. custom) parameters
 *                                   that are sent to the constructor for the
 *                                   @p KERNEL_TEMPLATE.
 * @param mesh The MeshLevel object.
 * @param targetRegions The names of the target regions(of type @p REGION_TYPE)
 *                      to apply the @p KERNEL_TEMPLATE.
 * @param finiteElementName The name of the finite element.
 * @param constitutiveNames The names of the constitutive models present in the
 *                          Region.
 * @param kernelConstructorParams The parameter list for corresponding to the
 *                                parameter @p KERNEL_CONSTRUCTOR_PARAMS that
 *                                are passed to the @p KERNEL_TEMPLATE
 *                                constructor.
 * @return The maximum contribution to the residual, which may be used to scale
 *         the residual.
 *
 * Loops over all regions Applies/Launches a kernel specified by the @p KERNEL_TEMPLATE through
 * #::geosx::finiteElement::KernelBase::kernelLaunch().
 */
template< typename POLICY,
          typename CONSTITUTIVE_BASE,
          typename REGION_TYPE,
          template< typename SUBREGION_TYPE,
                    typename CONSTITUTIVE_TYPE,
                    typename FE_TYPE > class KERNEL_TEMPLATE,
          typename ... KERNEL_CONSTRUCTOR_PARAMS >
static
real64 regionBasedKernelApplication( MeshLevel & mesh,
                                     arrayView1d< string const > const & targetRegions,
                                     string const & finiteElementName,
                                     arrayView1d< string const > const & constitutiveNames,
                                     KERNEL_CONSTRUCTOR_PARAMS && ... kernelConstructorParams )
{
  GEOSX_MARK_FUNCTION;
  // save the maximum residual contribution for scaling residuals for convergence criteria.
  real64 maxResidualContribution = 0;

  NodeManager & nodeManager = *(mesh.getNodeManager());
  EdgeManager & edgeManager = *(mesh.getEdgeManager());
  FaceManager & faceManager = *(mesh.getFaceManager());
  ElementRegionManager & elementRegionManager = *(mesh.getElemManager());


  // Create a tuple that contains the kernelConstructorParams, as the lambda does not properly catch the parameter pack
  // until c++20
#if CONSTRUCTOR_PARAM_OPTION==1
  std::tuple< KERNEL_CONSTRUCTOR_PARAMS &... > kernelConstructorParamsTuple = std::forward_as_tuple( kernelConstructorParams ... );
#elif CONSTRUCTOR_PARAM_OPTION==2
  camp::tuple< KERNEL_CONSTRUCTOR_PARAMS &... > kernelConstructorParamsTuple = camp::forward_as_tuple( kernelConstructorParams ... );
#endif


  // Loop over all sub-regions in regiongs of type REGION_TYPE, that are listed in the targetRegions array.
  elementRegionManager.forElementSubRegions< REGION_TYPE >( targetRegions,
                                                            [&constitutiveNames,
                                                             &maxResidualContribution,
                                                             &nodeManager,
                                                             &edgeManager,
                                                             &faceManager,
                                                             &kernelConstructorParamsTuple,
                                                             &finiteElementName]
                                                              ( localIndex const targetRegionIndex, auto & elementSubRegion )
  {
    localIndex const numElems = elementSubRegion.size();

    // Create an alias for the type of subregion we are in, which is now known at compile time.
    typedef TYPEOFREF( elementSubRegion ) SUBREGIONTYPE;

    // Get the constitutive model...and allocate a null constitutive model if required.
    constitutive::ConstitutiveBase * constitutiveRelation = nullptr;
    constitutive::NullModel * nullConstitutiveModel = nullptr;
    if( targetRegionIndex <= constitutiveNames.size()-1 )
    {
      constitutiveRelation = elementSubRegion.template getConstitutiveModel( constitutiveNames[targetRegionIndex] );
    }
    else
    {
      nullConstitutiveModel = elementSubRegion.template RegisterGroup< constitutive::NullModel >( "nullModelGroup" );
      constitutiveRelation = nullConstitutiveModel;
    }

    // Call the constitutive dispatch which converts the type of constitutive model into a compile time constant.
    constitutive::ConstitutivePassThru< CONSTITUTIVE_BASE >::Execute( constitutiveRelation,
                                                                      [&maxResidualContribution,
                                                                       &nodeManager,
                                                                       &edgeManager,
                                                                       &faceManager,
                                                                       &kernelConstructorParamsTuple,
                                                                       &elementSubRegion,
                                                                       &finiteElementName,
                                                                       numElems]
                                                                        ( auto * const castedConstitutiveRelation )
    {
      // Create an alias for the type of constitutive model.
      using CONSTITUTIVE_TYPE = TYPEOFPTR( castedConstitutiveRelation );


      string const elementTypeString = elementSubRegion.GetElementTypeString();

      FiniteElementBase &
      subRegionFE = elementSubRegion.template getReference< FiniteElementBase >( finiteElementName );

      finiteElement::dispatch3D( subRegionFE,
                                 [&maxResidualContribution,
                                  &nodeManager,
                                  &edgeManager,
                                  &faceManager,
                                  &kernelConstructorParamsTuple,
                                  &elementSubRegion,
                                  &numElems,
                                  &castedConstitutiveRelation] ( auto const finiteElement )
      {
        using FE_TYPE = TYPEOFREF( finiteElement );

        // Define an alias for the kernel type for easy use.
        using KERNEL_TYPE = KERNEL_TEMPLATE< SUBREGIONTYPE,
                                             CONSTITUTIVE_TYPE,
                                             FE_TYPE >;

        // 1) Combine the tuple containing the physics kernel specific constructor parameters with
        // the parameters common to all phsyics kernels that use this interface,
        // 2) Instantiate the kernel.
        // note: have two options, using std::tuple and camp::tuple. Due to a bug in the OSX
        // implementation of std::tuple_cat, we must use camp on OSX. In the future, we should
        // only use one option...most likely camp since we can easily fix bugs.
#if CONSTRUCTOR_PARAM_OPTION==1
        auto temp = std::forward_as_tuple( nodeManager,
                                           edgeManager,
                                           faceManager,
                                           elementSubRegion,
                                           finiteElement,
                                           castedConstitutiveRelation );

        auto fullKernelComponentConstructorArgs = std::tuple_cat( temp,
                                                                  kernelConstructorParamsTuple );

        KERNEL_TYPE kernelComponent = std::make_from_tuple< KERNEL_TYPE >( fullKernelComponentConstructorArgs );

#elif CONSTRUCTOR_PARAM_OPTION==2
        auto temp = camp::forward_as_tuple( nodeManager,
                                            edgeManager,
                                            faceManager,
                                            elementSubRegion,
                                            finiteElement,
                                            castedConstitutiveRelation );
        auto fullKernelComponentConstructorArgs = camp::tuple_cat_pair( temp,
                                                                        kernelConstructorParamsTuple );
        KERNEL_TYPE kernelComponent = camp::make_from_tuple< KERNEL_TYPE >( fullKernelComponentConstructorArgs );

#endif

        // Call the kernelLaunch function, and store the maximum contribution to the residual.
        maxResidualContribution =
          std::max( maxResidualContribution,
                    KERNEL_TYPE::template kernelLaunch< POLICY,
                                                        KERNEL_TYPE >( numElems,
                                                                       kernelComponent ) );
      } );
    } );

    // Remove the null constitutive model (not required, but cleaner)
    if( nullConstitutiveModel )
    {
      elementSubRegion.deregisterGroup( "nullModelGroup" );
    }

  } );

  return maxResidualContribution;
}

This pattern may be used with any kernel class that either:

  1. Conforms to the KernelBase interface by defining each of the kernel functions in KernelBase.
  2. Defines its own kernelLaunch function that conforms the the signature of KernelBase::kernelLaunch. This option essentially allows for a custom kernel that does not conform to the interface defined by KernelBase and KernelBase::kernelLaunch.

The KernelBase::kernelLaunch Interface

The kernelLaunch function is a member of the kernel class itself. As mentioned above, a physics implementation may use the existing KernelBase interface, or define its own. The KernelBase::kernelLaunch function defines a launching policy, and an internal looping pattern over the quadrautre points, and calls the functions defined by the KernelBase as shown here:

  template< typename POLICY,
            typename KERNEL_TYPE >
  static
  real64
  kernelLaunch( localIndex const numElems,
                KERNEL_TYPE const & kernelComponent )
  {
    GEOSX_MARK_FUNCTION;

    // Define a RAJA reduction variable to get the maximum residual contribution.
    RAJA::ReduceMax< ReducePolicy< POLICY >, real64 > maxResidual( 0 );

    forAll< POLICY >( numElems,
                      [=] GEOSX_HOST_DEVICE ( localIndex const k )
    {
      typename KERNEL_TYPE::StackVariables stack;

      kernelComponent.setup( k, stack );
      for( integer q=0; q<numQuadraturePointsPerElem; ++q )
      {
        kernelComponent.quadraturePointKernel( k, q, stack );
      }
      maxResidual.max( kernelComponent.complete( k, stack ) );
    } );
    return maxResidual.get();
  }

Each of the KernelBase functions called in the KernelBase::kernelLaunch function are intended to provide a certain amount of modularity and flexibility for the physics implementations. The general purpose of each function is described by the function name, but may be further descibed by the function documentation found here.