PII Redaction

Project information

  • Category: Machine Learning
  • Project date: March 2024
  • Skills: NLP, Python, ML, HTML, CSS

Problem Statement

Personally identifiable information (PII) must be handled carefully to protect privacy and to comply with regulations such as GDPR and CCPA. Organizations often need to share data, but sharing PII creates significant privacy risks and potential legal exposure. The challenge is to identify and redact PII efficiently so that the data can be handled and distributed safely.

Solution Overview

This project addresses that need with a system for detecting and redacting PII in a series of virtual conversations (vCons). The solution has two main components:

  • Redaction: Automatically redacting all PII from the provided vCons to produce anonymized data, ensuring privacy protection and regulatory compliance.
  • Detection: Identifying and listing all instances of PII within the vCons, providing detailed reports to facilitate data auditing and compliance checks.
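
The two components above can be illustrated with a minimal Python sketch. It is not the project's actual implementation, which is not described here and which, given the NLP and ML skills listed, presumably relies on a trained model. The sketch swaps in two simple regular expressions (email addresses and phone numbers) as a stand-in detector, and it works on plain text, so extracting the text from a vCon is assumed to have happened already. The names PII_PATTERNS, detect_pii, and redact_pii are hypothetical.

```python
import re

# Hypothetical patterns, for illustration only. Regexes like these miss
# names, addresses, and many other PII types that a model would target.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def detect_pii(text):
    """Detection: return (label, matched_text, start, end) tuples in text order."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(), match.start(), match.end()))
    return sorted(findings, key=lambda finding: finding[2])


def redact_pii(text):
    """Redaction: replace each detected span with a [LABEL] placeholder."""
    # Replace from the end of the string so earlier offsets stay valid.
    # Assumes the detected spans do not overlap.
    for label, _, start, end in reversed(detect_pii(text)):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


if __name__ == "__main__":
    sample = "Reach me at jane.doe@example.com or 555-123-4567."
    for finding in detect_pii(sample):
        print(finding)
    print(redact_pii(sample))
```

Building redaction on top of detection keeps the two outputs consistent: every span that appears in the detection report is exactly a span that gets replaced in the anonymized text.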