Suresh Mohan for Syncfusion, Inc.

Originally published at syncfusion.com

Double-Check Content Using Redaction Annotation in PDFs with C#

PDF documents are used for exchanging business data with confidential information such as financial account numbers, social security numbers, email addresses, phone numbers, and credit card information.

At times, we need to share part or all of a document without exposing its sensitive or private content. In this case, we can use the redaction feature to safely and permanently remove confidential information from a PDF document.

Redaction annotations split the work into two steps:

  • Content identification: A user applies redaction annotations that indicate the content areas to be deleted. Before proceeding to the next step, the user can review, move, and redefine the annotations.
  • Content removal: The marked content is removed permanently, and a mark is applied in the redacted area defined by the redaction annotation.

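In this blog, we are going to cover the following topics:

  • Mark text and images for redaction
  • Find text and mark with the redaction annotation
  • Mark page or consecutive pages for redaction
  • Change the look of redaction marks
  • Remove the marked content from PDF document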

Let’s get started!

Mark text and images for redaction

In this step, we only mark the content that needs to be redacted. We can mark the text and image area in the PDF document using rectangle bounds.

The following code example shows the procedure to mark the content for redaction using the PdfRedactionAnnotation class.

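using System.Drawing;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Interactive;
using Syncfusion.Pdf.Parsing;

//Load the PDF document.
PdfLoadedDocument loadedDocument = new PdfLoadedDocument("invoice_merged.pdf");
//Get the first page of the document.
PdfLoadedPage loadedPage = loadedDocument.Pages[0] as PdfLoadedPage;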

//Create redaction annotation.
PdfRedactionAnnotation redactionAnnotation1 = new PdfRedactionAnnotation();
//Assign the rectangle bounds.
redactionAnnotation1.Bounds = new System.Drawing.RectangleF(35, 393, 100, 20);
//Set inner color of the annotation.
redactionAnnotation1.InnerColor = Color.Black;
//Set border color of the annotation.
redactionAnnotation1.BorderColor = Color.Red;
//Add annotation to the page.
loadedPage.Annotations.Add(redactionAnnotation1);

//Create a second redaction annotation for another area of the page.
PdfRedactionAnnotation redactionAnnotation2 = new PdfRedactionAnnotation();
redactionAnnotation2.Bounds = new System.Drawing.RectangleF(95, 435, 80, 100);
redactionAnnotation2.InnerColor = Color.Black;
redactionAnnotation2.BorderColor = Color.Red;

loadedPage.Annotations.Add(redactionAnnotation2);

//Save and close the PDF document.
loadedDocument.Save("output.pdf");
loadedDocument.Close(true);

By executing this code example, you will get a PDF document like the following screenshot.

Marking text and image for redaction
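
The bounds in the example above are plain numbers. The following is a minimal sketch of a helper for deriving them from physical measurements. It assumes the library measures bounds in PDF points (72 per inch) from the top-left corner of the page. `RedactionUnits`, `ToPoints`, and `FromMillimetres` are hypothetical names, not part of the Syncfusion API.

```csharp
using System.Drawing;

// Hypothetical helper (not part of the Syncfusion API).
// Assumption: bounds are expressed in points, 72 per inch.
public static class RedactionUnits
{
    private const float PointsPerInch = 72f;
    private const float MillimetresPerInch = 25.4f;

    // Converts a length in millimetres to points.
    public static float ToPoints(float millimetres) =>
        millimetres * PointsPerInch / MillimetresPerInch;

    // Builds rectangle bounds in points from millimetre measurements.
    public static RectangleF FromMillimetres(float x, float y, float width, float height) =>
        new RectangleF(ToPoints(x), ToPoints(y), ToPoints(width), ToPoints(height));
}
```

For instance, the result of `RedactionUnits.FromMillimetres(10, 20, 50, 8)` could be assigned to `Bounds` to mark a strip 50 mm wide and 8 mm tall.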

Find text and mark with the redaction annotation

Using the find-text feature, you can search a PDF document for specific text and mark each match using PdfRedactionAnnotation.

The following code example shows the procedure to find and mark text for redaction using PdfRedactionAnnotation.

// Load an existing PDF.
PdfLoadedDocument loadedDocument = new PdfLoadedDocument("Invoice.pdf");

TextSearchResultCollection searchCollection;

TextSearchItem text = new TextSearchItem("Invoice Number", TextSearchOptions.None);

//Search the text in PDF document.
loadedDocument.FindText(new List<TextSearchItem> { text }, out searchCollection);

//Iterate search collection to get search results.
foreach (KeyValuePair<int, MatchedItemCollection> textCollection in searchCollection)
{
    //Get matched text collection.
    foreach (MatchedItem textItem in textCollection.Value)
    {
        //Create redaction annotation.
        PdfRedactionAnnotation redactionAnnotation1 = new PdfRedactionAnnotation();
        //Assign the rectangle bounds to cover full invoice number.
        redactionAnnotation1.Bounds = new RectangleF(textItem.Bounds.X, textItem.Bounds.Y, textItem.Bounds.Width + 55, textItem.Bounds.Height);
        //Set inner color of the annotation.
        redactionAnnotation1.InnerColor = Color.Black;
        //Set border color of the annotation.
        redactionAnnotation1.BorderColor = Color.Red;
        //Add annotation to the page.
        loadedDocument.Pages[textCollection.Key].Annotations.Add(redactionAnnotation1);
    }
}

loadedDocument.Save("Redact.pdf");
//Close the document.
loadedDocument.Close(true);

In the previous code example, we used the FindText method to find the "Invoice Number" label in the PDF document. Then, for each match, we created a PdfRedactionAnnotation from the bounds returned by FindText, widened by 55 so that the annotation also covers the number itself.

By executing this code example, you will get a PDF document like the following screenshot.

Find text and mark with the redaction annotation
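
The `+55` in the example above is a fixed amount chosen for this invoice. A minimal sketch of the same widening step as a reusable helper, which also keeps the rectangle from running past the right edge of the page, might look like this. `MatchBounds` and `WidenRight` are hypothetical names, not part of the Syncfusion API, and the sketch assumes the match bounds are a System.Drawing.RectangleF, as in the code above.

```csharp
using System;
using System.Drawing;

// Hypothetical helper (not part of the Syncfusion API).
public static class MatchBounds
{
    // Extends a matched rectangle to the right by `extra` units so it also covers
    // the value that follows a label, without passing the page's right edge.
    public static RectangleF WidenRight(RectangleF match, float extra, float pageWidth)
    {
        float right = Math.Min(match.Right + extra, pageWidth);
        float width = Math.Max(0f, right - match.X);
        return new RectangleF(match.X, match.Y, width, match.Height);
    }
}
```

In the loop above, the bounds could then be set with `MatchBounds.WidenRight(textItem.Bounds, 55, pageWidth)`, where `pageWidth` would be read from the `Size.Width` of the page that holds the match.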

Mark page or consecutive pages for redaction

You can mark a whole PDF page for redaction using PdfRedactionAnnotation. You can also iterate each page of a PDF document and add the PdfRedactionAnnotation with the bounds of the page.

The following code example shows the procedure to mark every page of a document for redaction using PdfRedactionAnnotation.

//Load the PDF document.
PdfLoadedDocument loadedDocument = new PdfLoadedDocument("Invoice.pdf");

//Iterate each page and add redaction annotation.
foreach (PdfLoadedPage page in loadedDocument.Pages)
{
    //Create redaction annotation.
    PdfRedactionAnnotation redactionAnnotation = new PdfRedactionAnnotation();
    //Assign the page bounds to redaction annotation.
    redactionAnnotation.Bounds = new System.Drawing.RectangleF(0, 0, page.Size.Width, page.Size.Height);
    //Set inner color of the annotation.
    redactionAnnotation.InnerColor = Color.Black;
    //Set border color of the annotation.
    redactionAnnotation.BorderColor = Color.Red;
    //Add annotation to the page.
    page.Annotations.Add(redactionAnnotation);
}

//Save and close the PDF document.
loadedDocument.Save("WholePage.pdf");
loadedDocument.Close(true);

By executing this code example, you will get a PDF document like the following screenshot.

Marking consecutive pages for redaction
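
The loop above marks every page. To mark only a run of consecutive pages, the same loop body can be applied to a range of page indices. The following is a minimal sketch, assuming page numbers are given 1-based (as a reader would count them) and converted to the zero-based index used by `Pages[...]`. `PageRange` and `Indices` are hypothetical names, not part of the Syncfusion API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the Syncfusion API).
public static class PageRange
{
    // Yields zero-based page indices for the 1-based, inclusive range
    // [firstPage, lastPage], clipped to the pages the document actually has.
    public static IEnumerable<int> Indices(int firstPage, int lastPage, int pageCount)
    {
        int start = Math.Max(firstPage, 1);
        int end = Math.Min(lastPage, pageCount);
        for (int page = start; page <= end; page++)
        {
            yield return page - 1;
        }
    }
}

// Possible use with the loop above (pages 2 to 4 of the loaded document):
// foreach (int i in PageRange.Indices(2, 4, loadedDocument.Pages.Count))
// {
//     PdfLoadedPage page = loadedDocument.Pages[i] as PdfLoadedPage;
//     // ...create the annotation and add it to page.Annotations as before.
// }
```

Clipping the range means a request that extends past the last page marks only the pages that exist, and a reversed range marks nothing.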

Change the look of redaction marks

You can set custom text and colors on a redaction annotation. The overlay text can state the reason the content was marked, which makes the marks easier to review before the content is removed.

The following code example shows the procedure to customize the appearance of redaction marks using the PdfRedactionAnnotation.

//Load the PDF document.
PdfLoadedDocument loadedDocument = new PdfLoadedDocument("Invoice.pdf");

//Create redaction annotation.
PdfRedactionAnnotation redactionAnnotation = new PdfRedactionAnnotation();
//Assign the rectangle bounds.
redactionAnnotation.Bounds = new System.Drawing.RectangleF(60, 150, 170, 120);
//Set inner color of the annotation.
redactionAnnotation.InnerColor = Color.Black;
//Set border color of the annotation.
redactionAnnotation.BorderColor = Color.Red;

//Font for the overlay text.
redactionAnnotation.Font = new PdfStandardFont(PdfFontFamily.Courier, 10);
//Text color.
redactionAnnotation.TextColor = Color.White;
//Text alignment.
redactionAnnotation.TextAlignment = PdfTextAlignment.Center;
//Set overlay text.
redactionAnnotation.OverlayText = "Confidential";
//Set repeat text option.
redactionAnnotation.RepeatText = true;
//Enable appearance of the annotation.
redactionAnnotation.SetAppearance(true);

//Add annotation to the page.
loadedDocument.Pages[0].Annotations.Add(redactionAnnotation);

//Save and close the PDF document.
loadedDocument.Save("CustomAppearance.pdf");
loadedDocument.Close(true);

By executing this code example, you will get a PDF document like the following screenshot.

Customizing the redaction marks

Remove the marked content from PDF document

Finally, you can remove the sensitive content from a PDF document. It’s very easy; we just need to load each redaction annotation and then flatten it.

The following code example shows the procedure to remove the marked content permanently by flattening the redaction annotations.

//Load the PDF document.
PdfLoadedDocument loadedDocument = new PdfLoadedDocument("InvoiceMarked.pdf");

//Iterate the annotations on the first page and flatten each redaction annotation to redact the marked content.
foreach (PdfAnnotation annotation in loadedDocument.Pages[0].Annotations)
{
    if (annotation is PdfLoadedRedactionAnnotation redaction)
    {
        redaction.Flatten = true;
    }
}

//Save and close the PDF document.
loadedDocument.Save("Redacted.pdf");
loadedDocument.Close(true);

By executing this code example, you will get a PDF document like the following screenshot.

Removing the marked content from PDF
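
The example above only visits the first page. When marks may appear anywhere in the document, the same check can be run on every page. This is a minimal sketch that combines the page loop and the flatten step already shown in this post. `RedactionRunner` and `FlattenAllRedactions` are hypothetical names, and the `using` directives are an assumption about where these classes live in the Syncfusion PDF library.

```csharp
using Syncfusion.Pdf;
using Syncfusion.Pdf.Interactive;
using Syncfusion.Pdf.Parsing;

// Hypothetical helper built only from calls used earlier in this post.
public static class RedactionRunner
{
    // Sets Flatten on every redaction annotation in the document and
    // returns how many redaction annotations were found.
    public static int FlattenAllRedactions(PdfLoadedDocument document)
    {
        int count = 0;
        foreach (PdfLoadedPage page in document.Pages)
        {
            foreach (PdfAnnotation annotation in page.Annotations)
            {
                if (annotation is PdfLoadedRedactionAnnotation redaction)
                {
                    redaction.Flatten = true;
                    count++;
                }
            }
        }
        return count;
    }
}
```

Calling `RedactionRunner.FlattenAllRedactions(loadedDocument)` before `Save` would replace the single-page loop, and a return value of zero is a useful signal that the input file contained no marks to apply.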

GitHub sample

You can download all these samples of PDF redaction annotation from the following GitHub repository: https://github.com/SyncfusionExamples/pdf-redaction-annotation-csharp.

Conclusion

In this blog post, we have learned how to use the PdfRedactionAnnotation class to mark content for redaction and then remove the content permanently. This will help you easily review sensitive data in a PDF document before redacting it.

Take a moment to peruse our documentation, where you’ll find other options and features, all with accompanying code examples.

If you have any questions about these features, please let us know in the comments below. You can also contact us through our support forum, Direct-Trac, or feedback portal. As always, we are happy to assist you!
