Data Type Formatting Options

You can include the formats property in GDI queries to control which data types strings are coerced to. The property can also describe the formats of date and time values in the source so that they are recognized and parsed to the appropriate date, time, or dateTime values. Finally, you can use the formats property to suppress conversion so that the generated values are typed the same way as in the source.

The GDI takes locale into account when formatting the generated date and time values.

For sources that do not include data type specifications and natively treat values as strings, the GDI Generator automatically converts the values to the appropriate type. For example, if a CSV file includes the value "Feb-18-2022", the GDI parses the string to an xsd:date value, "2022-02-18". A column of numbers is converted to xsd:int and a column of decimal values is converted to xsd:float. The formats property usage is described below.

Formats Syntax

s:formats [
   s:strict boolean ; [
       xsd:data_type "format" 
     | xsd:data_type boolean ;
       [ ... ; ]
   ]
] ;

Option: strict
Data Type: boolean
Description: This property enables or disables the automatic type conversion feature. By default, strict is set to false (s:strict false), meaning the GDI's automatic type conversion feature is enabled. When strict is false, any formats specified in s:formats [ ] augment the GDI's built-in date and time formats. You can still disable individual type conversions by including xsd:data_type false. For example, xsd:dateTime false disables the parsing of strings to dateTime.

When strict is true (s:strict true), the automatic conversion logic is essentially disabled and the generated data is represented the same way as it is in the source. When strict is true, you can selectively enable certain conversions by including xsd:data_type true or by defining xsd:data_type "format". In this case, values that do not match any of the formats provided are typed as xsd:string.

Option: xsd:data_type "format"
Data Type: N/A
Description: Include xsd:data_type "format" to describe the formats of date and time values in the source. The GDI supports Java date and time format notation. For example, if dates in the source are formatted like "yyyy-MM-dd", include the statement xsd:date "yyyy-MM-dd". If the source uses multiple formats for dates, e.g., 18-MAR-1978 and 03/18/1978, you can list multiple formats for xsd:date, such as xsd:date "dd-MMM-yyyy", "MM/dd/yyyy".

The GDI's default base year is 2000. If the source data has years with only two digits, such as 02-04-99, the GDI prepends 20 to the digits, so 02-04-99 is parsed to 02-04-2099. To specify an alternate base year for two-digit values, append the notation ^nnnn (e.g., ^1900) to the format value. For example, to resolve two-digit years against the 1900s instead of the 2000s, use a format value such as xsd:date "dd-MMM-yy^1900" or xsd:date "dd-MMM-yy^1990". With either value, a two-digit year of 99 is parsed to 1999.

Option: xsd:data_type boolean
Data Type: N/A
Description: When strict is false or not set, you can disable specific type conversions by listing data types and setting their values to false. For example, if you want the GDI to convert strings to integers or floats when possible but want the dates in the source to be preserved as strings, include xsd:date false to disable the conversion of strings to dates.

When strict is true, you can enable specific type conversions by listing data types and setting their values to true. For example, if you want the GDI to preserve the strings in the source except when the string is a number, include xsd:int true to enable the conversion of strings to integers.

Formats Examples

The example below sets strict to true and forces the GDI to parse values only to the data types that are enabled with true. It also defines the format to look for when converting strings to dateTime:

s:formats [
   s:strict true ;
   xsd:int true ;
   xsd:dateTime true ;
   xsd:dateTime "yyyy-MM-dd-HH-mm-ss" ;
] ;
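
As a variation on the example above, strict mode can also be combined with format definitions alone. The fragment below is an illustrative sketch assembled from the option descriptions, not taken from a real query. It reuses the two date formats mentioned in the option descriptions and enables no other conversions:

s:formats [
   s:strict true ;
   xsd:date "dd-MMM-yyyy", "MM/dd/yyyy" ;
] ;

Under these settings, source values such as 18-MAR-1978 and 03/18/1978 would be expected to be typed as xsd:date. Values of every other kind would keep the representation they have in the source (plain strings, for a source such as CSV), and strings that match neither format would be typed as xsd:string.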

The example below does not set strict, so the default value of false is used. The data type definitions specify the formats of the values to parse as date, time, and dateTime values. The example also disables the conversion from string to long:

s:formats [
   xsd:date "MM/dd/yyyy", "MMM dd", "MMM dd yyyy" ;
   xsd:time "HH[:mm][:ss][ ]a" ;
   xsd:dateTime "M/d/yyyy HH:mm:ss a", "yyyy-MM-dd-HH-mm-ss" ;
   xsd:long false ;
] ;
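
Two-digit years can be handled with the base-year notation (^nnnn). The following is a minimal sketch, assuming a source whose dates look like 04-Feb-99 (a hypothetical sample value) and belong in the 1900s:

s:formats [
   xsd:date "dd-MMM-yy^1900" ;
] ;

Because strict is not set, the format augments the GDI's built-in date formats, and a two-digit year such as 99 would be expected to resolve to 1999 rather than 2099.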
