Attribute data values are standardized (normalized) both to remove invalid values and to ensure a consistent format for use in matching algorithm decisions. Provider Identity standardizes incoming data values on both web service requests and batch files, and persists those values in their standardized form.
When the values are retrieved later (by retrieving the identity to which they belong), Provider Identity returns the standardized values rather than the original values. The standardization applied varies with the nature of each data element.
Data Standardization in Provider Identity
|Attribute or attribute cluster
|Standardization performed by Provider Identity
Extended ASCII characters with a logical ASCII equivalent are converted to their ASCII equivalent. For example, the extended ASCII character Ñ is converted to the ASCII character N.
|Alphabetic characters are converted to uppercase. A subset of the Extended Attribute fields can optionally be kept in their original case, but by default all alphabetic characters are converted to uppercase.
Address components are standardized in several ways consistent with US Postal Service conventions. Examples include:
When the full string is provided as input, gender values are standardized to the single-character gender codes M, F, U, O, T, A, and N. For example, an input gender of FEMALE is standardized to F. An input gender value that does not correspond to M, F, U, O, T, A, or N is not standardized; it is stored as-is.
M = Male, F = Female, U = Unknown, O = Other, T = Transgender, A = Ambiguous.
Alphabetic characters and punctuation symbols are removed.
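
The rules in the table above can be illustrated with a short Python sketch. This is not Provider Identity's implementation: the function names, the `keep_case` flag, the case-insensitive gender lookup, and the mapping of full gender strings to codes are assumptions made for illustration, and Unicode decomposition is only one way to approximate the "logical ASCII equivalent" conversion. The source does not state what N stands for or which attributes the character-removal rule applies to, so the sketch treats both generically.

```python
import string
import unicodedata

# Single-character codes listed in the table. N is accepted as a code, but the
# source gives no full-string meaning for it, so it has no entry in GENDER_NAMES.
GENDER_CODES = {"M", "F", "U", "O", "T", "A", "N"}

# Hypothetical full-string spellings, derived from the code legend.
GENDER_NAMES = {
    "MALE": "M",
    "FEMALE": "F",
    "UNKNOWN": "U",
    "OTHER": "O",
    "TRANSGENDER": "T",
    "AMBIGUOUS": "A",
}


def to_ascii(value: str) -> str:
    """Replace accented characters with their base letter, e.g. Ñ becomes N."""
    decomposed = unicodedata.normalize("NFKD", value)
    # Drop the combining marks (tilde, accent, ...) left by the decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def standardize_text(value: str, keep_case: bool = False) -> str:
    """Convert to ASCII and, unless keep_case is set, to uppercase."""
    ascii_value = to_ascii(value)
    return ascii_value if keep_case else ascii_value.upper()


def standardize_gender(value: str) -> str:
    """Map a full gender string to its code; return unmatched input as-is."""
    candidate = value.strip().upper()  # assumption: lookup ignores case
    if candidate in GENDER_CODES:
        return candidate
    return GENDER_NAMES.get(candidate, value)


def strip_alpha_and_punctuation(value: str) -> str:
    """Remove alphabetic characters and ASCII punctuation symbols."""
    return "".join(
        ch for ch in value
        if not ch.isalpha() and ch not in string.punctuation
    )


if __name__ == "__main__":
    print(standardize_text("Muñoz"))
    print(standardize_gender("FEMALE"))
    print(strip_alpha_and_punctuation("123-45-6789"))
```

Characters that have no Unicode decomposition (for example ß) pass through `to_ascii` unchanged in this sketch; covering them would require an explicit mapping table.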