XML Guidelines

XML Schema

An XML schema definition (XSD) file lays out the expected structure of an input XML file. During the build process, GEOSX automatically constructs a comprehensive schema from the code’s data structure, and updates the version in the source (GEOSX/src/coreComponents/fileIO/schema/schema.xsd).

Schema Components

The first entry in the schema is a set of headers that specify the file type and version. Following this, the set of available simple types for attributes is laid out. Each of these includes a variable type name, which mirrors those used in the main code, and a regular expression, which is designed to match valid inputs. These patterns are defined and documented in DataTypes::typeRegex. The final part of the schema is the file layout, beginning with the root Problem. Each complex type defines an element, its children, and its attributes. Each attribute defines the input name, type, default value, and/or usage. Comments preceding each attribute are used to relay additional information to the user.
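As a rough illustration of how a simple type's pattern constrains an attribute value, the sketch below checks strings against two regular expressions. The type names and patterns here are illustrative stand-ins, not the ones defined in DataTypes::typeRegex, and matches_type is a hypothetical helper.

```python
import re

# Hypothetical stand-ins for schema simple types (type name -> pattern).
# The real patterns are defined in DataTypes::typeRegex and may differ.
SIMPLE_TYPES = {
    "integer": r"[+-]?\d+",
    "real64": r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?",
}


def matches_type(type_name: str, value: str) -> bool:
    """Return True if the entire value matches the pattern for type_name.

    XSD pattern facets must match the whole value, so fullmatch is used
    rather than search.
    """
    return re.fullmatch(SIMPLE_TYPES[type_name], value) is not None


if __name__ == "__main__":
    for type_name, value in [("real64", "1.5e-3"), ("real64", "1.5e"), ("integer", "4.2")]:
        print(type_name, repr(value), matches_type(type_name, value))
```

A validator applies the same whole-string rule, which is why a value with a trailing character that the pattern does not allow is rejected even if most of it looks valid.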

Automatic Schema Generation

A schema may be generated by calling the main code with the -s argument, e.g.: geosx -s schema.xsd (Note: this is done automatically during the build process). To do this, GEOSX does the following:

  1. Initialize the GEOSX data structure.
  2. Initialize objects that are registered to catalogs via ManagedGroup::ExpandObjectCatalogs().
  3. Recursively write element and attribute definitions to the schema using information stored in GEOSX groups and wrappers.
  4. Define any expected deviations from the schema via ManagedGroup::SetSchemaDeviations().
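The recursive write in step 3 can be sketched in a few lines. This is not the GEOSX implementation: the nested dictionary below is a hypothetical stand-in for the groups and wrappers in the data structure, the attribute shown is invented for illustration, and the output omits the simple type definitions that a complete schema would need.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xsd", XS)

# Hypothetical data structure: each group holds attributes and child groups.
problem = {
    "name": "Problem",
    "attributes": [],
    "children": [
        {
            "name": "Solvers",
            "attributes": [{"name": "exampleValue", "type": "real64", "default": "0"}],
            "children": [],
        }
    ],
}


def write_group(schema, group):
    """Emit a complexType for this group, then recurse into its children."""
    ctype = ET.SubElement(schema, f"{{{XS}}}complexType", name=group["name"] + "Type")
    # Child elements must precede attribute declarations inside a complexType.
    if group["children"]:
        choice = ET.SubElement(ctype, f"{{{XS}}}choice", minOccurs="0", maxOccurs="unbounded")
        for child in group["children"]:
            ET.SubElement(choice, f"{{{XS}}}element", name=child["name"], type=child["name"] + "Type")
    for attr in group["attributes"]:
        ET.SubElement(ctype, f"{{{XS}}}attribute", name=attr["name"], type=attr["type"], default=attr["default"])
    for child in group["children"]:
        write_group(schema, child)


schema = ET.Element(f"{{{XS}}}schema")
ET.SubElement(schema, f"{{{XS}}}element", name="Problem", type="ProblemType")
write_group(schema, problem)
xsd_text = ET.tostring(schema, encoding="unicode")

if __name__ == "__main__":
    print(xsd_text)
```

Each group contributes one complex type listing its children and attributes, which mirrors the file layout described under Schema Components.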

Input File Validation

The optional attributes xmlns:xsi and xsi:noNamespaceSchemaLocation of the root Problem element are used to define the file format and schema location. While these are ignored by the main code, they may be used to configure various XML validation tools.

<Problem>
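To show how a validation tool might pick up the schema location from these attributes, here is a minimal sketch using the Python standard library. The schema path in the sample is a placeholder rather than a real location, and schema_location is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Standard namespace for XML Schema instance attributes.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Sample root element; /path/to/schema.xsd is a placeholder path.
sample = (
    '<Problem xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="/path/to/schema.xsd"/>'
)


def schema_location(xml_text):
    """Return the schema location declared on the root element, or None."""
    root = ET.fromstring(xml_text)
    return root.get(f"{{{XSI}}}noNamespaceSchemaLocation")


if __name__ == "__main__":
    print(schema_location(sample))
```

If the attribute is absent, the helper returns None, which matches the fact that both attributes are optional.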

The following sections discuss how to set up Sublime Text, Eclipse, and geosx_xml_tools to validate input XML files before submitting jobs to GEOSX.

Sublime Text

We recommend using the Exalt plug-in to validate XML files within Sublime Text. If you have not done so already, install Sublime Package Control. To install the plug-in, press ctrl + shift + p, type and select Package Control: Install Package, and search for exalt. Finally, set the location where the plug-in will cache schema files (so that they don’t need to be fetched every time a file is validated). To do this, go to Preferences -> Package Settings -> Exalt -> Settings - User, and define the location of the XML cache:

{
  "xml_catalog_files": ["~/.schemas/catalog.xml"]
}

Sublime Text will automatically fetch the schema defined in the header and validate the current file. If errors are present, it will highlight the lines that contain them and display messages at the bottom of the editor. Note: the Exalt plug-in will only highlight the first error present in a file. Once that error is resolved, Exalt will move to and display the next one.

Eclipse

The Eclipse Web Developer Tools include features for validating XML files. To install them, go to Help -> Eclipse Marketplace, search for the Eclipse Web Developer Tools, install the package, and restart Eclipse. Finally, configure the XML validation preferences under Window -> Preferences -> XML -> XML Files -> Validation. Eclipse will automatically fetch the schema and validate the active XML file. The editor will highlight any lines with errors and underline the specific errors.

GEOSX XML Tools

The geosx_xml_tools package, which enables advanced features such as parameters and symbolic math, also contains command-line tools for validating input XML files. To validate a file, call the command-line script with the -s argument, e.g.: preprocess_xml input_file.xml -s /path/to/schema.xsd. After compiling the final XML file, pygeos will fetch the designated schema, validate the file, and print any errors to the screen.

Note: Attributes that use advanced XML features will likely contain characters that are not allowed by their corresponding type pattern. As such, file editors that are configured to use other validation methods will likely identify errors in the raw input file.
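The note above can be illustrated with a small sketch. Both the numeric pattern and the $name$ placeholder syntax are assumptions made for this example, and substitute is a hypothetical helper; consult the geosx_xml_tools documentation for the syntax the package actually supports.

```python
import re

# Illustrative numeric pattern; not the one used in the generated schema.
REAL_PATTERN = r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?"


def is_valid_real(value):
    """Whole-string match, as a schema pattern would require."""
    return re.fullmatch(REAL_PATTERN, value) is not None


def substitute(value, parameters):
    """Replace $name$ placeholders with parameter values (assumed syntax)."""
    return re.sub(r"\$(\w+)\$", lambda m: parameters[m.group(1)], value)


raw = "$density$"
processed = substitute(raw, {"density": "1000.0"})

if __name__ == "__main__":
    print(raw, is_valid_real(raw))
    print(processed, is_valid_real(processed))
```

The raw value fails the pattern because of the placeholder characters, while the substituted value passes, which is why validation is best run on the compiled file rather than the raw input.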