Recently a customer asked me: "How do I delete empty rows from tables in a Word document?" At first, this seemed like an easy task that could be solved with a simple code snippet. For this task we will use the Aspose.Words API. Here is the solution:
// Requires: using Aspose.Words; using Aspose.Words.Tables;
// Load the document (MyDir is the folder that holds the sample files)
Document doc = new Document(MyDir + "DocumentWithEmpty Rows.docx");
// Get all rows
Node[] rows = doc.GetChildNodes(NodeType.Row, true).ToArray();
// Iterate through the rows and check whether all the cells are empty
foreach (Row row in rows)
{
    bool removeRow = true;
    foreach (Cell cell in row.Cells)
    {
        // A single cell with content is enough to keep the row
        if (cell.FirstParagraph != null && cell.FirstParagraph.HasChildNodes)
        {
            removeRow = false;
            break;
        }
    }
    // If all the cells in the row are empty, remove the row
    if (removeRow)
        row.Remove();
}
doc.Save(MyDir + "DocumentWithoutEmptyRows.docx");
This snippet works well for simple DOC(X) documents and covers basic needs.
However, if the document contains bookmark nodes placed at block level, the snippet throws an exception. To avoid this, we add two lines of code before loading the document and pass the options to the Document constructor:
LoadOptions options = new LoadOptions();
options.AnnotationsAtBlockLevel = false;
// Load the document with the options
Document doc = new Document(MyDir + "DocumentWithEmpty Rows.docx", options);
Other potential problems remain: what if some cells contain only spaces or newline/carriage-return characters, or what if some rows are nested within a cell? The following code snippet removes the trailing paragraphs that contain nothing but whitespace (spaces and newline characters) from each cell and, if all the cells within a row are empty, deletes the row:
// Requires: using Aspose.Words; using Aspose.Words.Tables;
LoadOptions options = new LoadOptions();
options.AnnotationsAtBlockLevel = false;
// Load the document
Document doc = new Document(MyDir + "DocumentWithEmpty Rows.docx", options);
foreach (Section section in doc.Sections)
{
    foreach (Table t in section.Body.Tables)
    {
        // Iterate over a copy of the rows, because rows are removed inside the loop
        foreach (Row r in t.Rows.ToArray())
        {
            bool removeRow = true;
            foreach (Cell c in r.Cells)
            {
                Node[] paragraphs = c.GetChildNodes(NodeType.Paragraph, true).ToArray();
                // Walk backwards and drop trailing paragraphs that contain only whitespace
                for (int i = paragraphs.Length - 1; i >= 0; i--)
                {
                    Paragraph p = (Paragraph)paragraphs[i];
                    if (p.ToString(SaveFormat.Text).Trim() == "")
                    {
                        p.Remove();
                    }
                    else
                    {
                        // The cell has content, so the row must be kept
                        removeRow = false;
                        break;
                    }
                }
            }
            // If all the cells in the row are empty, remove the row
            if (removeRow)
                r.Remove();
        }
    }
}
doc.Save(MyDir + "output20.3v1.docx");
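The emptiness test used above (convert to text, trim, compare with an empty string) can also be separated from the document traversal, which makes it easy to try on plain strings. The following is only a minimal sketch under that assumption: the RowTextChecks class and its IsBlankRow method are hypothetical names that are not part of Aspose.Words, and the cell texts are assumed to have been extracted beforehand, for example with cell.ToString(SaveFormat.Text).

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Aspose.Words: it works on plain strings only.
public static class RowTextChecks
{
    // Returns true when every cell text is null, empty, or made up of
    // whitespace only (spaces, tabs, carriage returns, line feeds).
    // An empty sequence (a row without cells) also counts as blank.
    public static bool IsBlankRow(IEnumerable<string> cellTexts)
    {
        return cellTexts.All(text => string.IsNullOrWhiteSpace(text));
    }
}
```

With the texts of a row's cells collected into a list, RowTextChecks.IsBlankRow(texts) could stand in for the removeRow flag. A row whose cells hold only " " and "\r\n" counts as blank, while a row with an empty cell next to a cell containing "Total" does not.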
I hope this short article can help someone.