Drazan-Jarak

How to delete empty rows from tables in a Word document

I recently received a request from a customer who asked: "How do I delete empty rows from tables in a Word document?" At first, this seemed like an easy task that could be solved with a simple code snippet. For this task we will use the Aspose.Words API. Here is the solution:

       // Requires: using Aspose.Words; using Aspose.Words.Tables;
       // Load the document (MyDir is the folder that holds the files)
       Document doc = new Document(MyDir + "DocumentWithEmpty Rows.docx");
       // Copy all rows into an array so they can be removed safely while iterating
       Node[] rows = doc.GetChildNodes(NodeType.Row, true).ToArray();
       // Iterate through the rows and check whether all the cells are empty
       foreach (Row row in rows)
       {
           bool removeRow = true;
           foreach (Cell cell in row.Cells)
           {
               // A single non-empty cell is enough to keep the row
               if (cell.FirstParagraph != null && cell.FirstParagraph.HasChildNodes)
               {
                   removeRow = false;
                   break;
               }
           }
           // If all the cells in the row are empty, remove the row
           if (removeRow)
               row.Remove();
       }
       doc.Save(MyDir + "DocumentWithoutEmptyRows.docx");

This snippet works well for simple DOC(X) documents and covers basic needs.
However, if the document contains bookmark nodes placed at block level, the snippet throws an exception. To avoid this, create a LoadOptions object with AnnotationsAtBlockLevel set to false and pass it to the Document constructor:

       LoadOptions options = new LoadOptions();
       options.AnnotationsAtBlockLevel = false;
       // Load the document with the options applied
       Document doc = new Document(MyDir + "DocumentWithEmpty Rows.docx", options);

Other potential problems remain: what if some cells contain only spaces or newline/carriage-return characters, or what if some rows are nested within a cell? The following code snippet removes the trailing paragraphs that contain nothing but whitespace (spaces and newline characters) from each cell and, if all the cells within a row are empty, deletes the row:

       // Requires: using Aspose.Words; using Aspose.Words.Tables;
       LoadOptions options = new LoadOptions();
       options.AnnotationsAtBlockLevel = false;
       // Load the document
       Document doc = new Document(MyDir + "DocumentWithEmpty Rows.docx", options);
       foreach (Section section in doc.Sections)
           foreach (Table t in section.Body.Tables)
               // Iterate over a copy of the rows so that removing one does not disturb the loop
               foreach (Row r in t.Rows.ToArray())
               {
                   bool removeRow = true;
                   foreach (Cell c in r.Cells)
                   {
                       Node[] paragraphs = c.GetChildNodes(NodeType.Paragraph, true).ToArray();
                       // Walk backwards and drop trailing paragraphs that hold only whitespace
                       for (int i = paragraphs.Length - 1; i >= 0; i--)
                       {
                           Paragraph p = (Paragraph)paragraphs[i];
                           if (p.ToString(SaveFormat.Text).Trim() == "")
                           {
                               p.Remove();
                           }
                           else
                           {
                               // The cell has content, so the row must be kept
                               removeRow = false;
                               break;
                           }
                       }
                       // A cell in a kept row still needs at least one paragraph
                       c.EnsureMinimum();
                   }
                   // If all the cells in the row are empty, remove the row
                   if (removeRow)
                       r.Remove();
               }
       doc.Save(MyDir + "output20.3v1.docx");
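
The emptiness test itself can be tried without a Word document. The following is only a minimal sketch, assuming that the text of each cell has already been exported to a plain string; EmptyRowCheck, IsCellEmpty and IsRowEmpty are hypothetical names of mine and are not part of Aspose.Words:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Aspose.Words: it works on plain strings
// that stand in for the text exported from each cell.
public static class EmptyRowCheck
{
    // A cell counts as empty when nothing is left after ignoring whitespace
    // (spaces, tabs, carriage returns, line feeds).
    public static bool IsCellEmpty(string cellText) =>
        string.IsNullOrWhiteSpace(cellText);

    // A row may be removed only when every one of its cells is empty.
    public static bool IsRowEmpty(IEnumerable<string> cellTexts) =>
        cellTexts.All(IsCellEmpty);
}
```

With this sketch, EmptyRowCheck.IsRowEmpty(new[] { " ", "\r\n", "" }) returns true, while EmptyRowCheck.IsRowEmpty(new[] { "", "Total" }) returns false because one cell still holds text.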

I hope this short article can help someone.
