China Health Data De-identification Framework

Personal Information Protection Law (PIPL) and Related Regulations

Overview

China has rapidly developed a comprehensive data protection framework in recent years, with specific provisions addressing health data. The framework is characterized by stringent requirements, particularly for sensitive personal information like health data, and places a strong emphasis on data localization and security.

Recent Implementation Timeline

  • September 2021: Data Security Law took effect
  • November 2021: Personal Information Protection Law (PIPL) took effect
  • February 2022: Revised Measures for Cybersecurity Review took effect
  • September 2022: Measures for Security Assessment of Cross-border Data Transfer took effect
  • February 2023: Measures for Standard Contracts for Cross-border Transfer of Personal Information issued (effective June 2023)

Legal Framework

China's health data protection and de-identification framework is built upon several key laws and regulations:

Primary Legislation

Health-Specific Regulations

Example: Scope of Health Data Regulation

Under China's framework, regulated health data includes:

  • Medical records and treatment histories
  • Disease diagnostic information
  • Genetic testing results
  • Biometric data related to health
  • Health monitoring data from wearable devices
  • Prescription and medication information
  • Health insurance claims data

Key Concepts and Classifications

Classification of Health Data

Under the PIPL, health data is explicitly classified as "Sensitive Personal Information" (SPI), which is defined as:

"Personal information that, if leaked or illegally used, may cause discrimination against individuals or seriously endanger personal or property security, including information on biometric characteristics, religious beliefs, specially-designated status, medical health, financial accounts, individual location tracking, etc."

Article 28 of the PIPL permits the processing of sensitive personal information only where there is a specific purpose and sufficient necessity, and only under strict protective measures.

De-identification Concepts

Chinese law distinguishes between two levels of de-identification:

| Concept | Definition | Legal Status | Examples |
| --- | --- | --- | --- |
| De-identification (去标识化) | Processing of personal information so that it cannot identify a specific individual without additional information | Still considered personal information and subject to the PIPL | Replacing patient names with codes; removing ID numbers but keeping a separate key |
| Anonymization (匿名化) | Processing of personal information so that identifying a specific individual is impossible and the processed information cannot be restored | Not considered personal information and falls outside the PIPL's scope | Aggregating patient data into statistical reports; irreversibly hashing identifiers (only if the hashes cannot be recomputed from guessable inputs such as ID numbers) |

Reference:

The two concepts are defined in Article 73(3) (de-identification) and Article 73(4) (anonymization) of the PIPL: http://www.npc.gov.cn/npc/c30834/202108/a8c4e3672c74491a80b53a172bb753fe.shtml
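The difference between the two concepts can be illustrated with a minimal Python sketch. The record layout, the `pseudonymize` and `aggregate` helpers, and the "Patient-" code format are assumptions made for illustration and are not drawn from the PIPL or any national standard.

```python
import secrets
from collections import Counter


def pseudonymize(records):
    """De-identification: replace names with random codes.

    Returns the coded records and a key table. The key table is the
    "additional information": whoever holds it can re-identify the
    records, so the coded output is still personal information.
    """
    key_table = {}
    coded = []
    for rec in records:
        code = "Patient-" + secrets.token_hex(3).upper()
        while code in key_table:  # regenerate on the rare collision
            code = "Patient-" + secrets.token_hex(3).upper()
        key_table[code] = rec["name"]
        coded.append({"patient_code": code, "diagnosis": rec["diagnosis"]})
    return coded, key_table


def aggregate(records):
    """Toward anonymization: keep only per-diagnosis counts.

    No row-level data and no key survive in the output.
    """
    return dict(Counter(rec["diagnosis"] for rec in records))


records = [
    {"name": "Zhang Wei", "diagnosis": "type 2 diabetes"},
    {"name": "Li Na", "diagnosis": "hypertension"},
    {"name": "Wang Fang", "diagnosis": "type 2 diabetes"},
]
coded, key_table = pseudonymize(records)
summary = aggregate(records)
```

Whether an aggregate actually meets the legal standard for anonymization depends on context; very small counts, for example, may still point to an individual.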

Special Requirements for Health Data

As Sensitive Personal Information, health data is subject to heightened protection requirements:

1. Enhanced Consent Requirements

Example: Consent Form Requirements

A health data consent form designed to meet these requirements in China would typically include:

  • Specific description of health data to be collected
  • Explicit purpose of collection (e.g., "for diabetes treatment monitoring")
  • Retention period (e.g., "data will be retained for 5 years after your last visit")
  • Security measures implemented to protect the data
  • Individual rights to access, correct, and delete their data
  • Separate checkbox specifically for health data consent
  • Clear language avoiding technical or legal jargon
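As a rough sketch, the checklist above could be modeled as a record whose empty fields are flagged before a form is accepted. The `HealthDataConsent` class and its field names are hypothetical, and the plain-language item is left out because it cannot be checked mechanically.

```python
from dataclasses import dataclass


@dataclass
class HealthDataConsent:
    """Hypothetical consent record; one field per checklist item."""
    data_description: str
    purpose: str
    retention_period: str
    security_measures: str
    rights_notice: str
    separate_health_consent: bool  # the dedicated health data checkbox


def missing_items(consent):
    """Return the names of checklist items that are empty or not ticked."""
    return [name for name, value in vars(consent).items() if not value]


form = HealthDataConsent(
    data_description="Blood glucose readings and HbA1c results",
    purpose="for diabetes treatment monitoring",
    retention_period="data will be retained for 5 years after your last visit",
    security_measures="",
    rights_notice="You may access, correct, and delete your data",
    separate_health_consent=False,
)
gaps = missing_items(form)
```

Here `gaps` lists `security_measures` and `separate_health_consent`, the two items this example form leaves incomplete.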

2. Impact Assessments

Case Study: Hospital PIPIA Process

A major Shanghai hospital implemented a standardized Personal Information Protection Impact Assessment (PIPIA) process for its new patient management system. The process includes:

  1. Cataloging all health data collected and processed
  2. Identifying potential risks of data breaches or misuse
  3. Evaluating necessity of each data element collected
  4. Implementing technical safeguards including encryption and access controls
  5. Establishing a data breach response plan
  6. Documenting the entire assessment process for regulatory compliance
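Steps 1 and 3 of this process lend themselves to a simple data inventory. The sketch below is one possible shape for it; the `DataElement` class, its fields, and the sample entries are assumptions for illustration, not the hospital's actual system.

```python
from dataclasses import dataclass


@dataclass
class DataElement:
    """One catalogued data element (step 1)."""
    name: str
    purpose: str      # documented reason for collecting it; empty if none
    sensitive: bool   # True for health data and other sensitive information


def unnecessary(inventory):
    """Step 3: elements with no documented purpose are removal candidates."""
    return [e.name for e in inventory if not e.purpose.strip()]


inventory = [
    DataElement("diagnosis", "treatment planning", True),
    DataElement("employer", "", False),
]
flagged = unnecessary(inventory)
```

An empty purpose is only a first filter; a documented purpose still has to be judged for necessity by a reviewer.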

3. Localization Requirements

Reference:

Measures for Security Assessment of Cross-border Data Transfer (effective September 1, 2022): http://www.cac.gov.cn/2022-07/07/c_1658811536396503.htm

Technical Standards for De-identification

Several national standards provide technical guidance for de-identification in China:

Information Security Technology Standards

Reference:

National Standards can be accessed through the Standardization Administration of China: http://www.sac.gov.cn/

The technical standards recommend several de-identification methods:

| Method | Description | Example in Health Context |
| --- | --- | --- |
| Data Masking | Replacing identifiers with special characters or placeholders | Replacing "Zhang Wei, ID: 310104198012121234" with "Zhang **, ID: 310104********1234" |
| Pseudonymization | Replacing direct identifiers with artificial identifiers or pseudonyms | Replacing patient names with randomly generated codes (e.g., "Patient-A12B34") |
| Generalization | Reducing the precision of data (e.g., converting exact ages to age ranges) | Changing "43 years old" to "40-44 years old"; changing "Shanghai Pudong District" to "Shanghai" |
| Data Aggregation | Combining data to prevent individual identification | Reporting "15% of patients experienced side effect X" rather than individual patient reactions |
| Data Perturbation | Adding statistical noise to data | Slightly adjusting laboratory values within clinically insignificant ranges |
| K-anonymity | Ensuring each record is indistinguishable from at least k-1 other records | Ensuring any combination of quasi-identifiers applies to at least 5 patients in the dataset (k = 5) |
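Several of these methods are short enough to sketch in Python. The function names, default parameters, and sample rows below are illustrative assumptions rather than anything prescribed by the national standards; the age generalization uses non-overlapping five-year bands, so 43 maps to "40-44".

```python
import random
from collections import Counter


def mask_id(id_number, keep_start=6, keep_end=4):
    """Data masking: keep the first and last characters, star out the middle."""
    hidden = len(id_number) - keep_start - keep_end
    if hidden <= 0:
        raise ValueError("identifier too short to mask")
    return id_number[:keep_start] + "*" * hidden + id_number[-keep_end:]


def generalize_age(age, width=5):
    """Generalization: map an exact age to a non-overlapping range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def perturb(value, max_shift, rng):
    """Data perturbation: add bounded uniform noise to a numeric value."""
    return value + rng.uniform(-max_shift, max_shift)


def is_k_anonymous(rows, quasi_identifiers, k):
    """K-anonymity: every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


masked = mask_id("310104198012121234")
noisy = perturb(5.6, 0.1, random.Random(0))  # e.g. a lab value shifted by at most 0.1
rows = [
    {"age": generalize_age(43), "city": "Shanghai"},
    {"age": generalize_age(41), "city": "Shanghai"},
    {"age": generalize_age(67), "city": "Shanghai"},
]
ok_for_k2 = is_k_anonymous(rows, ["age", "city"], k=2)
```

In this sample the single patient in the "65-69" band forms a group of one, so the rows are not 2-anonymous; widening the age bands or suppressing that row would be the usual next step.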

Example: De-identification Process for Clinical Trial Data

A pharmaceutical company conducting clinical trials in China implemented this de-identification process:

  1. Removal of direct identifiers (names, ID numbers, contact information)
  2. Replacement of patient IDs with study-specific codes
  3. Generalization of demographic data (age ranges instead of exact ages)
  4. Removal of rare disease information that could enable identification
  5. Generalization of location data to city level only
  6. Removal of exact dates, replacing with study day numbers
  7. Implementation of access controls for the pseudonymization key
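Steps 1 through 6 can be sketched as a single transformation over one participant record. The field names, the ten-year age bands, the `deidentify` helper, and the convention that the enrollment date counts as study day 1 are all assumptions for illustration; step 7 (access control for the pseudonymization key) is an operational control and is not shown.

```python
from datetime import date

# Hypothetical names for the direct-identifier fields in a source record.
DIRECT_IDENTIFIERS = {"name", "id_number", "phone"}


def deidentify(record, study_code, enrollment, rare_conditions):
    """Return a de-identified copy of one participant record."""
    # Step 1: drop direct identifiers.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Step 2: attach the study-specific code in place of the patient ID.
    out["subject"] = study_code
    # Step 3: replace the exact age with a ten-year range.
    low = (out.pop("age") // 10) * 10
    out["age_range"] = f"{low}-{low + 9}"
    # Step 4: remove rare conditions that could single someone out.
    out["conditions"] = [c for c in out["conditions"] if c not in rare_conditions]
    # Step 5: keep location at city level only.
    out.pop("district", None)
    # Step 6: replace the exact date with a study day (enrollment = day 1).
    out["study_day"] = (out.pop("visit_date") - enrollment).days + 1
    return out


record = {
    "name": "Zhang Wei", "id_number": "310104198012121234", "phone": "13800000000",
    "age": 43, "conditions": ["hypertension", "Fabry disease"],
    "city": "Shanghai", "district": "Pudong",
    "visit_date": date(2023, 3, 15),
}
result = deidentify(record, "SUBJ-0001", date(2023, 3, 1), {"Fabry disease"})
```

The mapping from `SUBJ-0001` back to the patient would live in a separate, access-controlled key store, which is why the output remains de-identified rather than anonymized data.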

Enforcement and Penalties

China's framework includes strict enforcement mechanisms for violations:

Enforcement Example: Health App Penalties

In April 2022, the Cyberspace Administration of China (CAC) announced penalties against multiple mobile health applications for excessive collection of personal information and failure to properly de-identify health data. Violations included:

  • Collection of health data beyond the scope necessary for service provision
  • Failure to obtain explicit consent for health data processing
  • Insufficient de-identification measures for shared health data
  • Inadequate security measures for sensitive personal information

Penalties included fines, mandatory rectification within specified timeframes, and temporary removal from app stores.

Reference:

CAC Enforcement Announcements: http://www.cac.gov.cn/2022-04/25/c_1651927474338504.htm

Recent Developments

China continues to evolve its approach to health data protection:

Reference:

14th Five-Year Plan for National Health: http://www.gov.cn/zhengce/zhengceku/2022-05/20/content_5691905.htm

National Healthcare Security Administration: http://www.nhsa.gov.cn/

How It Compares to HIPAA Safe Harbor

China's approach differs from HIPAA Safe Harbor in several key ways:

Key Practical Difference Example

A multinational pharmaceutical company conducting clinical trials in both the US and China would need to:

  • In the US (HIPAA): Remove 18 specific identifiers to qualify for Safe Harbor, with limited restrictions on data transfer
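  • In China (PIPL): Implement more comprehensive de-identification, store data locally where localization rules apply, conduct impact assessments, complete a cross-border transfer mechanism before any export (a CAC security assessment, certification, or the standard contract, depending on the handler and the data involved), and implement stricter security measures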