The CoreScan product from CoreView has helped organisations of all sizes to discover and report on sensitive data, such as Personally Identifiable Information (PII), Payment Card Information (PCI) and medical health data. It has helped many customers respond to regulatory requirements such as GDPR and HIPAA.

Today we are announcing a powerful new feature available to all CoreScan customers – automated redaction!

Automated redaction

The new redaction feature allows you to remove sensitive information from within a document while leaving the remaining information intact. In many governance situations, deletion of an entire document just to deal with sensitive data is not a viable solution as that data may only be a small part of the information contained in the document. Only sensitive data needs to be redacted.

Manually redacting documents is time-consuming and error-prone. It is also often difficult to identify what information needs to be removed. For example, if you need to remove all credit card numbers from a set of documents, you cannot simply search for a specific value, because every card number is different. Instead, you would need to read every document to find them all.
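To illustrate why finding card numbers calls for pattern matching rather than a plain text search, here is a minimal Python sketch. It is not CoreScan's implementation. The names `CARD_CANDIDATE`, `luhn_valid` and `find_card_numbers` are hypothetical, and the sketch assumes card numbers are 13 to 19 digits long, optionally grouped with spaces or hyphens, and pass the standard Luhn checksum.

```python
import re

# Assumption: a candidate is 13-19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return the substrings of text that look like valid card numbers."""
    found = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group(0))
        if luhn_valid(digits):
            found.append(match.group(0))
    return found


sample = "Card 4111 1111 1111 1111 was charged; ref 1234 5678 9012 3456."
print(find_card_numbers(sample))
```

The checksum step filters out digit sequences that merely look like card numbers, such as the reference number in the sample. A production tool would need to handle many more formats and edge cases than this sketch does.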

Another challenge of manual redaction is working with PDF files. PDF is designed as a fixed-layout format for finished documents, so its content is difficult to edit without specialist tools.

CoreScan redaction

The CoreScan redaction feature enables you to identify any type of sensitive data and to automatically redact it from documents.  CoreScan can redact from most common file types, including all versions of Microsoft Office files. And, critically, it can redact information from within PDF files. 

Retaining Context

Traditional redaction simply places a black box over the sensitive data.  For example, if we wished to redact the names from this text:

John Smith paid Jane Smith $1000.  Jane Smith gave John Smith a receipt for the transaction.

Then the redacted text might look like the following:

[Image: Traditional redaction]

Although this has successfully removed the names from the text, we have lost the context of the information. We no longer know whether the person who received the payment is the same person who issued the receipt. In a complex document, this loss of context is a major disadvantage of traditional redaction.

CoreScan retains the context of the redacted information by assigning a unique value to each item that is redacted.  Using our previous example, CoreScan would redact the text like this:

PERSON1 paid PERSON2 $1000.  PERSON2 gave PERSON1 a receipt for the transaction.
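The idea of assigning a unique value to each redacted item can be sketched in a few lines of Python. This is an illustration only, not CoreScan's actual method. The function name `redact_with_context` is hypothetical, and the sketch assumes the sensitive names have already been identified and are passed in as a list.

```python
import re


def redact_with_context(text, names, label="PERSON"):
    """Replace each distinct name with a stable placeholder such as PERSON1.

    Returns the redacted text and the name-to-placeholder mapping.
    Assumption: `names` has already been produced by some detection step.
    """
    if not names:
        return text, {}

    placeholders = {}
    # Try longer names first so a full name wins over a shorter name it contains.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))

    def substitute(match):
        name = match.group(0)
        if name not in placeholders:
            # Number placeholders in order of first appearance in the text.
            placeholders[name] = f"{label}{len(placeholders) + 1}"
        return placeholders[name]

    return pattern.sub(substitute, text), placeholders


original = (
    "John Smith paid Jane Smith $1000.  "
    "Jane Smith gave John Smith a receipt for the transaction."
)
redacted, mapping = redact_with_context(original, ["John Smith", "Jane Smith"])
print(redacted)
```

Because the same name always maps to the same placeholder, a reader can still follow who did what without learning who the people are.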

File Management

When you use CoreScan’s redaction feature, a new version of each redacted file is created. Its name carries a ‘redacted’ suffix together with the date and time of the redaction. You can then choose whether to delete (or perhaps archive) the original files in line with your organisation’s policies. And because CoreScan can operate at scale, you can redact sensitive information from thousands of files in a single run.
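As a rough illustration of this naming scheme, the Python sketch below builds a file name with a ‘redacted’ suffix and a timestamp. The exact format CoreScan uses is not specified here, so the `_redacted_YYYYMMDD-HHMMSS` pattern and the function name `redacted_path` are assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path


def redacted_path(original, when=None):
    """Return a sibling path carrying a 'redacted' suffix and a timestamp.

    Assumption: the name format is <stem>_redacted_<YYYYMMDD-HHMMSS><extension>.
    """
    original = Path(original)
    when = when or datetime.now()
    stamp = when.strftime("%Y%m%d-%H%M%S")
    # Keep the original folder and extension; only the file stem changes.
    return original.with_name(f"{original.stem}_redacted_{stamp}{original.suffix}")


print(redacted_path("reports/invoice.pdf", datetime(2024, 1, 2, 3, 4, 5)))
```

Writing the redacted copy to a new path leaves the original untouched, so the decision to delete or archive it can be made separately.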

Summary

The new redaction feature in CoreScan uses our natural language processing and pattern-matching techniques to identify, redact and manage sensitive information, and can save you considerable time and money compared with traditional manual approaches.