HL7 Anonymizer Device
The HL7 Anonymizer device de-identifies an HL7 message stream while maintaining the integrity/intent of each message. Replacement values are persisted, meaning each time the device encounters a specific field value, the same replacement is always used.
The device includes a number of automatic features, which can be supplemented by user-defined replacements. Additionally, the underlying logic can be tweaked via a Custom Code Control.
The output from this device is not guaranteed to be PHI-free. You must verify the output to ensure no PHI is leaked.
This device requires an instance of SQL Server to store replacement values.
The Anonymizer device extends the HL7 Transform device and uses much of that device's functionality. It has an additional Settings tab, which must be configured before the device is used.
Database
The first step is to configure the database connection and create the database. Navigate to the Anonymizer Database tab in the Settings section. Input the connection information, test the connection, and if successful, click the Create or Update Database button.
If elevated credentials are required to create the database, you will be prompted to enter them.
Click the Submit button to attempt database creation. If successful, you will be notified and the database status and location will be updated.
Auto-Anonymization Configuration
The Settings tab contains most of the settings for automatic anonymization, a feature that replaces field values based on their HL7 data type.
Namespace: If you have multiple anonymizer devices backed by the same database, you can provide each device with a unique namespace. Replacement values are distinct for each namespace.
Replacement Cache Size: The number of original-to-replacement pairs to keep cached in memory. A larger cache uses more memory but improves performance.
Custom Data Source: By default, the anonymizer uses US census data. You can provide your own data source to use for replacement values. A sample Excel file is provided (usa.xlsx) so you can see the expected format for custom data sources.
Auto-Anonymize: Automatically anonymize fields based on the data type.
Auto-FreeText: Automatically replace strings in free text that have already been replaced elsewhere in the message.
Fail if RTF/Base64 detected: Move messages which appear to contain RTF or Base64 to the error queue, as these cannot be automatically anonymized.
Log to Processing History: Log replacement information into the processing history. This is mainly intended for debugging purposes.
Override HL7 Version: By default, the HL7 version is read from the current message, and the corresponding HL7 schema is used. You can force a different HL7 schema, if desired.
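To make the Namespace and Replacement Cache Size settings more concrete, the following is a minimal sketch of how a namespace-scoped replacement lookup with a bounded cache could behave. It is not the device's implementation: the dictionary standing in for the SQL Server table, the FIFO eviction, the generated value format, and the names GetReplacement, store, and cacheSize are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the replacement table in SQL Server:
// (namespace, original) -> replacement.
var store = new Dictionary<(string Ns, string Original), string>();

// Hypothetical in-memory cache in front of the store, capped at cacheSize entries.
const int cacheSize = 2;
var cache = new Dictionary<(string Ns, string Original), string>();
var cacheOrder = new Queue<(string Ns, string Original)>();

string GetReplacement(string ns, string original)
{
    var key = (ns, original);
    if (cache.TryGetValue(key, out var cached))
        return cached; // cache hit: the store is not consulted

    if (!store.TryGetValue(key, out var replacement))
    {
        // First time this value is seen in this namespace: create and persist it.
        // The value format is invented for this sketch.
        replacement = $"{ns}-{store.Count + 1:D4}";
        store[key] = replacement;
    }

    // Evict the oldest cached entry once the cap is reached (simple FIFO).
    if (cache.Count >= cacheSize)
        cache.Remove(cacheOrder.Dequeue());
    cache[key] = replacement;
    cacheOrder.Enqueue(key);
    return replacement;
}

Console.WriteLine(GetReplacement("siteA", "MRN12345"));
Console.WriteLine(GetReplacement("siteA", "MRN12345")); // same input, same namespace
Console.WriteLine(GetReplacement("siteB", "MRN12345")); // same input, other namespace
```

In this sketch the first two calls return the same value and the third returns a different one, which mirrors the documented behavior: replacements are stable within a namespace and distinct across namespaces. An entry evicted from the cache is still found in the store, so the cache size affects only speed and memory, not the replacement values.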
The Date Formats tab lets you specify the date/time formats you expect in your messages. As part of automatic anonymization, the message is scanned for text matching the given formats, and each match is parsed and replaced.
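The scan-parse-replace step described for date formats can be sketched as follows. This is an illustration under stated assumptions, not the device's actual logic: the format list, the candidate-token regular expression, the fixed day shift used as the replacement rule, and the name ReplaceDates are all hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Assumed formats; in practice these would mirror the Date Formats tab.
string[] formats = { "yyyyMMddHHmmss", "yyyyMMdd", "MM/dd/yyyy" };

// Hypothetical replacement rule: shift every date by a fixed number of days.
const int shiftDays = -17;

string ReplaceDates(string text)
{
    // Candidate tokens: runs of digits, optionally containing '/'.
    return Regex.Replace(text, @"\d[\d/]*\d", match =>
    {
        foreach (var format in formats)
        {
            if (DateTime.TryParseExact(match.Value, format, CultureInfo.InvariantCulture,
                                       DateTimeStyles.None, out var parsed))
            {
                // Write the shifted date back in the format it was found in.
                return parsed.AddDays(shiftDays).ToString(format, CultureInfo.InvariantCulture);
            }
        }
        return match.Value; // not a date in any configured format: leave as-is
    });
}

Console.WriteLine(ReplaceDates("Seen on 03/15/2021, DOB 19800229, room 12"));
```

Only tokens that parse exactly against one of the listed formats are touched, so a date written in a format that is not listed passes through unchanged. That is why the list of formats needs to cover every format that occurs in your messages.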
Custom Code
The custom code control has two purposes in this device:
Let you define your own anonymization functions.
Let you override the default behavior of the auto-anonymization.
Like the HL7 Transform device, you can create your own functions and then utilize them within the Transform grid. The following is a contrived example of supplying some altered output based on the current message.
Then add a new transform and select this function.
There are a number of pre-built functions available to you, allowing you to hook into the replacement database.
As mentioned, the automatic anonymization is done based on the HL7 data type. The default behavior can be overridden by altering the code for the specified data type. For example, the following function for altering XAD fields is defined in the custom code:
[FunctionCategory(DefaultTypeCategory)]
[FunctionDescription("Anonymizes a field of type XAD.")]
public override string XAD(HL7Path path)
{
    string pathVal = HL7Message[path];
    var xpnVals = pathVal.Split(HL7Message.EncodingCharacters.Component);
    var outVal = new StringBuilder(pathVal.Length * 2);
    int component = 1;
    foreach(string inVal in xpnVals)
    {
        path.Component = component;
        if(!string.IsNullOrEmpty(inVal))
        {
            switch(component)
            {
                case 1:
                    //street address
                    outVal.Append(MailingAddress(path));
                    break;
                case 2:
                    //other
                    outVal.Append(OtherGeographicDestination(path));
                    break;
                case 3:
                    //city
                    outVal.Append(City(path));
                    break;
                case 4:
                    //state
                    outVal.Append(State(path));
                    break;
                case 5:
                    //zip
                    outVal.Append(Zip(path));
                    break;
                case 6:
                    //country
                    outVal.Append("UNK");
                    break;
                case 8:
                    // other geographic
                    outVal.Append(City(path));
                    break;
                default:
                    outVal.Append(AlphaNumeric(inVal));
                    break;
            }
        }
        outVal.Append(HL7Message.EncodingCharacters.Component);
        component++;
    }
    return outVal.RTrim().RemoveFromEnd(1).ToString();
}
Defaults
You can preview the changes made by the anonymizer by selecting a message in an upstream queue. By default, the anonymizer defines four transforms: PatientId, Birthdate, Account, and SSN. Review the anonymizer's changes and verify that all PHI fields are handled, then add transform items for any PHI that isn’t properly handled.
Lookup
You can look up the original value behind a replacement by clicking the Lookup button and entering the anonymized value.
The lookup returns the original value.