Chelsea Devereaux for MESCIUS inc.

Originally published at developer.mescius.com

How to Add Programmatic Collaboration Features to PDFs in .NET C#

The Setup: Medical research involves a tangled web of sources and findings that can easily be crossed, incorrectly referenced, or even plagiarized (purposefully or inadvertently). However, there are ways to streamline this process, especially when working with multiple authors or institutions in geographically dispersed regions. One such way is a tool that lets researchers search for specific words or phrases within a document or set of documents and then automatically highlights those words or phrases so the documents can be reviewed and updated.

Why is this important? Most large research projects involve many PhDs, PhD candidates, and research associates working on various stages of the project, and many of them may not know how many people are involved. Often, several groups deliberately perform similar or even identical research to create baselines or to discover abnormalities in established processes or research. With all these moving parts, it can be difficult for researchers and their teams to ensure that their outcomes are relevant and to report them in a way that avoids plagiarizing other research in their niche.

To do this, they need to review both the ongoing project itself and similar existing projects and publications. As those who run these labs begin to think about publishing, the latter becomes particularly important, because they will need to reference any existing research or publications they have drawn on, whether through research protocols or direct quotes.

Word processing tools such as Microsoft Word or OpenOffice can certainly find language in their own documents. However, many of the current publications stored at the NIH (National Institutes of Health) and other locations are kept in PDF format because of the secure nature of the documentation. This presents a bit of an issue, especially for the technology team tasked with creating this tool.

The Use Case: Your team has been selected to help a group of researchers streamline their processes. They have asked you to create a tool that can load multiple PDF files and find words or phrases that researchers type in ad hoc. These words or phrases must then be highlighted in all the documents, in a color chosen by the team, to draw attention to these areas for further review.

The Proposed Solution: After reviewing their needs, you have suggested a two-pronged approach. First, using C# .NET and the GrapeCity GcPdf API, you will create a .NET 7 application that lets the researchers search any PDF document, highlight the search terms, and save the newly highlighted document. Second, you will provide a web-based JavaScript PDF viewer application that allows users to highlight words in PDF documents.

Highlight words in PDF documents using PDF API

The GrapeCity Documents for PDF (GcPdf) API supports finding occurrences of a word in a PDF document and highlighting them using TextMarkupAnnotation. Depending on the TextMarkupType enum value, the annotation adds highlights, underlines, strikeouts, or jagged (“squiggly”) underlines to words.

Use the following code to highlight a word with a specific color using the GcPdf API:

    // Assumes `doc` is a loaded GcPdfDocument and `stream` is a writable output stream.
    // Find all occurrences of the word "childbirths":
    var found = doc.FindText(new FindTextParams("childbirths", true, false), null);

    // Add a text markup annotation to highlight each occurrence:
    foreach (var f in found)
    {
        var markup = new TextMarkupAnnotation()
        {
            Page = doc.Pages[f.PageIndex],
            MarkupType = TextMarkupType.Highlight,
            Color = Color.Yellow
        };
        // Add every bounding area of the match to the annotation:
        foreach (var b in f.Bounds)
            markup.Area.Add(b);
    }
    // Done:
    doc.Save(stream);

Screenshot: multiple occurrences of the word ‘childbirths’ highlighted using the GcPdf API.
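The use case calls for running the same search across several PDF files and several ad-hoc terms. The following is a minimal sketch of that loop, not a definitive implementation. It reuses only the GcPdf calls from the snippet above. The class name BatchHighlighter, the helper OutputPathFor, and the ".highlighted.pdf" suffix are hypothetical names introduced for illustration, and the using directives assume the namespaces of the GrapeCity.Documents.Pdf NuGet package.

```csharp
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Pdf.Annotations;
using GrapeCity.Documents.Pdf.TextMap; // assumed home of FindTextParams

public static class BatchHighlighter
{
    // Hypothetical helper: maps "dir/name.pdf" to "<outputDir>/name.highlighted.pdf".
    public static string OutputPathFor(string inputPath, string outputDir) =>
        Path.Combine(outputDir, Path.GetFileNameWithoutExtension(inputPath) + ".highlighted.pdf");

    // Highlights every match of each term in one PDF, saves a copy,
    // and returns the number of matches that were highlighted.
    public static int HighlightTerms(string inputPath, string outputPath,
        IEnumerable<string> terms, Color color)
    {
        // The input stream stays open until the document has been saved.
        using var input = File.OpenRead(inputPath);
        var doc = new GcPdfDocument();
        doc.Load(input);

        int count = 0;
        foreach (var term in terms)
        {
            // Skip blank entries that a researcher may have typed by accident.
            if (string.IsNullOrWhiteSpace(term))
                continue;

            // Same flags as the article's snippet (true, false).
            var found = doc.FindText(new FindTextParams(term, true, false), null);
            foreach (var f in found)
            {
                var markup = new TextMarkupAnnotation()
                {
                    Page = doc.Pages[f.PageIndex],
                    MarkupType = TextMarkupType.Highlight,
                    Color = color // the color chosen by the team
                };
                foreach (var b in f.Bounds)
                    markup.Area.Add(b);
                count++;
            }
        }

        // Write the highlighted copy to a separate file; the original is untouched.
        using var output = File.Create(outputPath);
        doc.Save(output);
        return count;
    }
}
```

To process a folder, you could call HighlightTerms once per file, for example with Directory.GetFiles(inputDir, "*.pdf") and OutputPathFor(file, outputDir) as the target path. The output directory must already exist, since File.Create does not create it.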

Highlight words in PDF documents using PDF Viewer

You can highlight, strike out, or underline text, or add a squiggly line, with a new set of annotations in the JavaScript-based GrapeCity Documents PDF Viewer (GcPdfViewer). The following options are available:

  • Highlight, Underline, Squiggly, and StrikeOut annotations in the Quick edit tools toolbar and Annotation Editor toolbar

  • Text markup context menu with new options visible when the text is selected. The options are also available in the default context menu

  • Toolbar button keys for the new annotations: 'edit-highlight', 'edit-underline', 'edit-squiggly', 'edit-strike-out'
  • Enable or disable the Text markup context menu
  • Change the list of colors available in the context menu through code

To highlight words in the PDF Viewer:

1. Configure GcPdfViewer with its Edit PDF options so that the Annotation Editor is shown in the toolbar.

2. Open the desired PDF in the viewer from the Open button in the toolbar.

3. From the main toolbar, choose ‘Text tools’.

4. Begin editing by adding the Highlight, Underline, Squiggly, and StrikeOut annotations on the PDF at the desired locations.

5. Alternatively, click ‘Annotation editor’ on the left sidebar.

6. The Highlight, Underline, Squiggly, and StrikeOut annotation options are then available in the main toolbar.

7. After adding the annotations, save the PDF. The saved PDF reflects the newly added annotations.

What do you think about the text markup features of GcPdf and GcPdfViewer? Please leave a comment below. Thanks!
