Reading a PDF File in C# Using IronPdf: A Step-by-Step Guide

Mehr Muhammad Hamza

If you're a developer, you have probably run into situations where you need to read text from a PDF file. For example:

  1. You are developing an application that takes two PDF documents as input and measures the similarity between them.
  2. You are developing an application that needs to read PDF files and return their word counts.
  3. You are developing an application that extracts data from a PDF file and stores it in a structured database.
  4. You are developing an application that needs to extract the text of a PDF and convert it into a string.

Extracting data from a PDF file in C# used to be a difficult and complex task. IronPdf makes it much simpler.

IronPdf is a library that makes a developer's life easier when it comes to reading PDF files.

You can explore more about IronPdf from this link.

You can read a PDF file and display its text in a C# TextBox using just two lines of code. Yes, just two lines! You can also extract all the images from your PDF files and either save them into another document or display them in your application, depending on your requirements.

Wondering how?

I will show you, step by step, by building an application that lets you select any PDF file and then displays its content.

Prerequisite Knowledge:

  1. Basic knowledge of C# programming
  2. Basic knowledge of C# GUI controls

I have designed this tutorial in such a way that even a person with no programming background can follow it.

Who should read this?

  1. Anyone new to C#, because reading PDF files is something you are very likely to need during your career.
  2. Professional developers who want to understand the IronPdf library, which helps us read, generate, and manipulate PDF files.

Now let's see how we can use this library in our project to read PDF files.

I am building a Windows Forms app for this demonstration. You can develop a console application, a WPF application, or an ASP.NET web application instead, according to your preference or needs.

Conveniently, this library can be used with both C# and VB.NET.

Let's begin the demonstration without further delay.

Step # 1: Create a Visual Studio Project:

  1. Open Visual Studio. I am using Visual Studio 2019.
  2. Click "Create a new project".
  3. Select the Windows Forms App template and press Next. Enter a project name; I have named mine "Read Pdf using IronPdf".
  4. Click Next, then select .NET Core 3.1 from the drop-down menu.
  5. Click the "Create" button. Visual Studio creates the project and opens it.

Step # 2: Install the IronPdf NuGet Package:

Next, we need to install the IronPdf NuGet package into the project.

  1. Click the Project menu in the menu bar and select Manage NuGet Packages from the drop-down list.
  2. Click the "Browse" tab.
  3. Type IronPdf in the search box and press Enter.
  4. Select IronPdf in the search results.
  5. Press the Install button and wait for the installation to complete.
  6. Press the OK button, and you are good to go.

After installation, a readme.txt file opens. I suggest you go through all the links in it and explore more about this library.

Step # 3: Design the Windows Form:

The project is created and the NuGet package is installed. The next step is to design a Windows form that asks the user to browse for a file and displays its content.

  1. Open Form1 in the designer.
  2. Open the Toolbox on the left side of the window.
  3. Search for Label, then drag and drop it onto the form. Set the label's text; I have used "C# Read Pdf using IronPdf".
  4. Drag and drop one TextBox (to show the file path), three Buttons (one to browse for the file, one to read the PDF file using IronPdf, and one to clear the text fields), and one RichTextBox (to display the file content).
  5. Set the ReadOnly property of the TextBox and the RichTextBox to True, so that the user can read the file path and content but cannot edit them.

The code in the following steps assumes that the TextBox is named FilePath, the RichTextBox is named FileContent, and the buttons are named Browse, Read, and Clear.

Step # 4: Add the Back-End Code for Browsing the File:

Double-click the Browse button. The following empty event handler will be generated.

private void Browse_Click(object sender, EventArgs e)
{

}

Now write the following code inside the Browse_Click function.

private void Browse_Click(object sender, EventArgs e)
{
    // The using block disposes the dialog once the handler is done with it.
    using (OpenFileDialog BrowseFile = new OpenFileDialog
    {
        InitialDirectory = @"D:\",
        Title = "Browse Pdf Files",

        CheckFileExists = true,
        CheckPathExists = true,

        DefaultExt = "pdf",
        Filter = "pdf files (*.pdf)|*.pdf",
        // FilterIndex is 1-based; there is only one filter entry, so use 1.
        FilterIndex = 1,
        RestoreDirectory = true,

        ReadOnlyChecked = true,
        ShowReadOnly = true
    })
    {
        // Copy the chosen path into the text box only if the user confirmed.
        if (BrowseFile.ShowDialog() == DialogResult.OK)
        {
            FilePath.Text = BrowseFile.FileName;
        }
    }
}

OpenFileDialog creates an instance of the Windows Forms file dialog control.

I have set the initial directory to the D: drive; you can set it to any folder.

I have set DefaultExt = "pdf" because we only need to read PDF files.

I have used a filter so that the file dialog lists only PDF files.

When the user clicks Open, the selected file's path is shown in the file path field.
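The filter only narrows what the dialog lists; a user can still type a different file name into the dialog. A small check before reading can catch that case. The following is a minimal sketch and not part of the original project: `PdfPathCheck` and `LooksLikePdfPath` are hypothetical names, and the check only looks at the file extension and whether the file exists, not at the file's contents.

```csharp
using System;
using System.IO;

// Hypothetical helper (an assumption for illustration; not part of
// the original project and not an IronPdf API).
public static class PdfPathCheck
{
    // Returns true when the path is non-empty, ends in ".pdf"
    // (case-insensitive), and points to an existing file.
    public static bool LooksLikePdfPath(string path)
    {
        if (string.IsNullOrWhiteSpace(path))
        {
            return false;
        }

        string extension = Path.GetExtension(path);
        bool hasPdfExtension =
            string.Equals(extension, ".pdf", StringComparison.OrdinalIgnoreCase);

        return hasPdfExtension && File.Exists(path);
    }
}
```

In Browse_Click, the result of `PdfPathCheck.LooksLikePdfPath(BrowseFile.FileName)` could decide whether FilePath.Text is updated.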

Let's run the project and test the Browse button.

  1. Press the Browse button. The file dialog opens.
  2. Select a file and press Open. I am selecting IronPdfTest.pdf. The file path now appears in the text box.

Now let's write the code behind the Read button to read the file.

Step # 5: Add the Back-End Code for Reading the PDF File Using IronPdf:

You might be thinking that the code for reading PDF files will be complex and difficult to write or understand.

Don't worry. IronPdf makes it easier and simpler: we can read a PDF file using just two lines of code.

Go to the Form1 designer and double-click the Read button. The following empty event handler will be generated.

private void Read_Click(object sender, EventArgs e)
{

}

Add the following using directives at the top of the .cs file to import the IronPdf library.

using IronPdf;
using System;
using System.Windows.Forms;

Write the following code inside the Read_Click function.

private void Read_Click(object sender, EventArgs e)
{
    // Load the PDF from the path chosen with the Browse button.
    PdfDocument PDF = PdfDocument.FromFile(FilePath.Text);
    // Extract the text of the whole document and show it in the rich text box.
    FileContent.Text = PDF.ExtractAllText();
}
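The Read_Click handler assumes that a file has already been selected and that it can be opened. As a hedged sketch of how the handler could guard against an empty path or a failed read, the helper below wraps the extraction step in a try/catch. `SafePdfReader`, `TryExtract`, and the `extractText` delegate are hypothetical names introduced for illustration only; in this project the delegate would wrap the IronPdf calls from Read_Click, for example `path => PdfDocument.FromFile(path).ExtractAllText()`.

```csharp
using System;

// Hypothetical helper (an assumption for illustration; not part of IronPdf).
public static class SafePdfReader
{
    // Runs extractText on the given path and reports success or failure
    // instead of letting an exception escape into the click handler.
    public static bool TryExtract(
        string path,
        Func<string, string> extractText,
        out string text,
        out string error)
    {
        text = string.Empty;
        error = string.Empty;

        // Nothing selected yet: report it without calling the extractor.
        if (string.IsNullOrWhiteSpace(path))
        {
            error = "Please browse for a PDF file first.";
            return false;
        }

        try
        {
            text = extractText(path) ?? string.Empty;
            return true;
        }
        catch (Exception ex)
        {
            // Any failure while opening or parsing the file ends up here.
            error = "Could not read the file: " + ex.Message;
            return false;
        }
    }
}
```

In Read_Click, a successful call could assign `text` to FileContent.Text, while a failed call could show `error` to the user, for instance with MessageBox.Show.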

Now let's write the code behind the Clear button.

Double-click the Clear button. The following empty event handler will be generated.

private void Clear_Click(object sender, EventArgs e)
{
}

Write the following code inside the Clear_Click function.

private void Clear_Click(object sender, EventArgs e)
{
    // Reset both fields so that another file can be selected.
    FileContent.Text = "";
    FilePath.Text = "";
}

Run the project.

  1. Click the Browse button and select the file that you want to read. In my case, I am reading the IronPdf.pdf file.
  2. Press the Open button. The file path appears in the text box.
  3. Press the Read button. The application reads the file and displays its content in the rich text box.
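Once the text is available as a string, use cases such as the word count mentioned in the introduction become ordinary string processing. The following is a minimal sketch under stated assumptions: `TextStats` and `CountWords` are hypothetical names, and a "word" is assumed to be any run of characters separated by spaces, tabs, or line breaks.

```csharp
using System;

// Hypothetical helper (an assumption for illustration; not part of IronPdf).
public static class TextStats
{
    // Characters treated as word separators in this sketch.
    private static readonly char[] Separators = { ' ', '\t', '\r', '\n' };

    // Counts whitespace-separated tokens in the extracted text.
    public static int CountWords(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
        {
            return 0;
        }

        // RemoveEmptyEntries keeps repeated separators from producing empty "words".
        string[] words = text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
        return words.Length;
    }
}
```

In this application, the count for the loaded document could be obtained with `TextStats.CountWords(FileContent.Text)` after the Read button has been pressed.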

This completes the guide. I hope it was easy to follow and understand. If you have any questions, feel free to ask in the comments.

You can download this solution from here.