Healthcare data de-identification could bring several potential benefits to healthcare organizations. As more providers begin to connect to HIEs, implement BYOD strategies, or even opt for cloud computing options, it is essential that patient data remain secure the entire time.
However, healthcare data breaches are a constant concern. Can data truly be securely transferred from one location to another? What are the options for covered entities and their business associates? Will the de-identification of data guarantee that a breach will never take place?
HealthITSecurity.com will break down the finer points of healthcare data de-identification, and discuss why providers would want to consider this approach. Moreover, we will review what current federal regulations stipulate in terms of the de-identification of data, and whether organizations are required to implement such options.
What is healthcare data de-identification?
Data de-identification is the process of removing certain identifiers from information. In healthcare data de-identification, identifiers are removed from PHI, which can then allow healthcare organizations to use that information in areas such as research, policy assessment, or comparative effectiveness studies.
The HIPAA Privacy Rule provides two methods of healthcare data de-identification. In the first, a qualified expert makes a formal determination. In the second, specified individual identifiers are removed and there is an absence of “actual knowledge by the covered entity that the remaining information could be used alone or in combination with other information to identify the individual.” The latter method is also referred to as the “Safe Harbor” method.
“The increasing adoption of health information technologies in the United States accelerates their potential to facilitate beneficial studies that combine large, complex data sets from multiple sources,” HHS states on its website. “The process of de-identification, by which identifiers are removed from the health information, mitigates privacy risks to individuals and thereby supports the secondary use of data for comparative effectiveness studies, policy assessment, life sciences research, and other endeavors.”
It is also important to note that regardless of which method is chosen to de-identify information, “the Privacy Rule does not restrict the use or disclosure of de-identified health information,” according to HHS. This is because the data is no longer considered PHI.
For the expert determination option, a person “with appropriate knowledge of and experience” in rendering data unidentifiable will apply the necessary methods to determine that the risk of re-identification is very small. That individual will then document the methods and results to show how he or she determined that the data had been de-identified.
With the safe harbor approach, healthcare organizations must remove 18 types of identifiers. These include names, telephone numbers, Social Security numbers, and medical record numbers, among others.
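To make the idea concrete, here is a minimal Python sketch of stripping identifier fields from a single record. It is an illustration only: the field names and the sample record are hypothetical, and the sketch covers just the four identifiers named above, not the full list of 18 that the safe harbor method requires.

```python
# Hypothetical field names; only the four identifiers named above are covered.
# A real Safe Harbor process must address all 18 identifier categories.
DIRECT_IDENTIFIER_FIELDS = {"name", "telephone", "ssn", "medical_record_number"}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record without the listed identifier fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIER_FIELDS
    }


# Made-up sample record for illustration.
record = {
    "name": "Jane Doe",
    "telephone": "555-0100",
    "ssn": "000-00-0000",
    "medical_record_number": "MRN-12345",
    "diagnosis": "hypertension",
    "lab_result": 7.2,
}

deidentified = strip_identifiers(record)
print(deidentified)
```

The function returns a new dictionary and leaves the original record untouched, so the identified source data stays intact wherever it is already stored.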
Implementing either method will demonstrate that a covered entity has met federal standards of data de-identification, HHS states:
“De-identified health information created following these methods is no longer protected by the Privacy Rule because it does not fall within the definition of PHI. Of course, de-identification leads to information loss which may limit the usefulness of the resulting health information in certain circumstances. As described in the forthcoming sections, covered entities may wish to select de-identification strategies that minimize such loss.”
To re-identify the information, “a covered entity may assign a code or other means of record identification to allow information de-identified under this section to be re-identified by the covered entity,” according to HHS. However, the following two provisions must be met:
(1) Derivation. The code or other means of record identification is not derived from or related to information about the individual and is not otherwise capable of being translated so as to identify the individual; and
(2) Security. The covered entity does not use or disclose the code or other means of record identification for any other purpose, and does not disclose the mechanism for re-identification.
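One way to picture the two provisions is a randomly generated record code whose mapping back to the patient never leaves the covered entity. The Python sketch below is an assumption-laden illustration, not a compliance recipe: the `ReidentificationVault` class and the sample identifier are hypothetical, and a real system would also need access controls and durable, protected storage.

```python
import secrets


class ReidentificationVault:
    """Hypothetical helper: keeps the code-to-patient mapping private."""

    def __init__(self):
        self._code_to_patient = {}

    def assign_code(self, patient_id: str) -> str:
        # Random code: not derived from the patient ID or any other patient data.
        code = secrets.token_hex(16)
        self._code_to_patient[code] = patient_id
        return code

    def reidentify(self, code: str) -> str:
        # Only the holder of the vault can map a code back to a patient.
        return self._code_to_patient[code]


vault = ReidentificationVault()
code = vault.assign_code("MRN-12345")  # made-up identifier

# The released record carries the code, never the mapping.
released_record = {"record_code": code, "diagnosis": "hypertension"}
```

Because the code comes from a random generator rather than a hash or transformation of patient details, it cannot be translated back to the individual without the vault, which loosely mirrors the derivation provision. Keeping the vault internal and never sharing it loosely mirrors the security provision.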
As previously mentioned, a covered entity may want to de-identify data for several reasons. Whether organizations want to conduct genetic research or simply compile comparative data, it is important to take the time to keep sensitive data secure.
Recent examples of healthcare data de-identification
The anonymization of data could prove very beneficial to healthcare organizations conducting research. This was discussed in a recent TransCelerate Biopharma Inc. paper that breaks down a mechanism for providing anonymized or de-identified data for further research.
“This is data that has been collected through clinical trials – interventional clinical trials usually – and is being made available to additional researchers, oftentimes physicians or academic individuals in the community,” Ben Rotz, Director of the Office of Medical Transparency at Eli Lilly, and Co-lead of the Clinical Data Transparency Initiative at TransCelerate, explained in an interview with HealthITSecurity.com. “That way they can test hypotheses that they may have and can hopefully advance science and find new information that wasn’t discovered or found when looking at the data originally.”
The Journal of the American Medical Informatics Association (JAMIA) also published a study earlier this year that delved into the future of data de-identification, explaining that a broader range of options could be available.
Rule-based policies can be mapped to a utility (U) and re-identification risk (R) space, according to the study, which can then be searched for a collection, or frontier, of policies that systematically trade off between these goals. The R-U frontiers of de-identification policies can be discovered efficiently, the study’s authors explained, which can let healthcare organizations better tailor protections.
“This paper shows that an efficient and effective mechanism can be applied to discover rule-based de-identification policy alternatives for patient-level datasets,” wrote the authors. “To do so, we extend an algorithm designed to search a collection of de-identification policies that compose a frontier that optimally balances risk (R) and utility (U).”
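The notion of a frontier can be illustrated with a short Python sketch. This is not the study’s algorithm, which the authors describe as an efficient search; it is a brute-force stand-in that simply keeps every policy that no other policy beats on both risk and utility. The policy names and the risk and utility scores are invented for illustration.

```python
from typing import List, NamedTuple


class Policy(NamedTuple):
    name: str
    risk: float     # re-identification risk (R), lower is better
    utility: float  # data utility (U), higher is better


def dominates(a: Policy, b: Policy) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    return (
        a.risk <= b.risk
        and a.utility >= b.utility
        and (a.risk < b.risk or a.utility > b.utility)
    )


def frontier(policies: List[Policy]) -> List[Policy]:
    """Brute-force search: keep every policy that no other policy dominates."""
    return [p for p in policies if not any(dominates(q, p) for q in policies)]


# Invented scores, for illustration only.
policies = [
    Policy("A", risk=0.10, utility=0.40),
    Policy("B", risk=0.20, utility=0.70),
    Policy("C", risk=0.25, utility=0.60),  # worse than B on both axes
    Policy("D", risk=0.50, utility=0.90),
    Policy("E", risk=0.60, utility=0.85),  # worse than D on both axes
]

frontier_names = [p.name for p in frontier(policies)]
print(frontier_names)  # ['A', 'B', 'D']
```

Each policy left on the frontier represents a different trade-off: an organization that accepts more risk gains utility, and no remaining option is strictly worse than another.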
Finally, the Health Information Trust Alliance (HITRUST) released the HITRUST De-Identification Framework earlier this year, which was designed to “enhance innovation and streamline the appropriate use of healthcare data.”
The framework was also meant to help healthcare organizations enhance the understanding of de-identification, clarify what qualifies as de-identified data, and better promote de-identified data usage. Moreover, HITRUST explained the framework will take use cases and define “the multiple levels of anonymization” and then recommend specific use cases for each variant.
While healthcare data de-identification cannot guarantee that a data breach will never occur, it can be a key tool for healthcare organizations. This is especially true when transporting data and using sensitive information to further research. Overall, healthcare organizations could greatly improve their approach to data security by considering data de-identification options.