How to Extract Metadata from Presentation Templates using .NET Parsing API

#net #parsing #api #metadata

Organizations in every industry build document processing into their applications to improve productivity, and extracting information from documents in many different formats is a common requirement. The primary concern with such extraction is accuracy: not every software library gives developers data that faithfully matches the source document.

If you need precise extraction of raw and formatted text, as well as metadata, from many well-known file formats on the .NET platform, consider GroupDocs.Parser for .NET. Beyond its basic data extraction features, the latest version of this text extraction API lets developers extract text and metadata from various text and presentation templates. Another important feature is the ability to fetch tables from PDF documents programmatically within your .NET apps. When using it, you can define the table bounds manually or let the API detect the layout automatically.

In addition, the API can detect the media type of password-protected Office Open XML documents and process documents in batches. More details: http://bit.ly/2QuFPsr
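
Batch processing generally means running the same extraction over many files. This post does not show the API's own batch feature, so the following is only a minimal sketch of the idea in plain C#. `ExtractBatch` is a hypothetical helper, not part of GroupDocs.Parser, and the stand-in delegate is an assumption that would be replaced by a real call such as `Extractor.Default.ExtractText(fileName)`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of GroupDocs.Parser): runs one extraction
// delegate over many files and keeps going when a single file fails.
static Dictionary<string, string> ExtractBatch(
    IEnumerable<string> fileNames,
    Func<string, string> extract)
{
    var results = new Dictionary<string, string>();
    foreach (var fileName in fileNames)
    {
        try
        {
            // Store the extracted text under the file name
            results[fileName] = extract(fileName);
        }
        catch (Exception ex)
        {
            // Report the failure and continue with the next file
            Console.Error.WriteLine($"{fileName}: {ex.Message}");
        }
    }
    return results;
}

// Usage with a stand-in extractor; a real app would pass a delegate that
// wraps the parser call instead of this placeholder lambda.
var texts = ExtractBatch(
    new[] { "a.pptx", "b.potx" },
    name => name.ToUpperInvariant());
foreach (var pair in texts)
{
    Console.WriteLine($"{pair.Key}: {pair.Value}");
}
```

Catching exceptions per file is a design choice for this sketch: one corrupt or locked document does not stop the rest of the batch, and failed files are simply absent from the returned dictionary.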

The following code samples show how to extract text and metadata from templates:
// Extracting Text
void ExtractText(string fileName)
{
    // Extract a text from the file
    var text = Extractor.Default.ExtractText(fileName);
    // Print an extracted text
    Console.WriteLine(text);
}

// Extracting Metadata
void ExtractMetadata(string fileName)
{
    // Extract metadata from the file
    var metadata = Extractor.Default.ExtractMetadata(fileName);
    // Print extracted metadata
    foreach (var m in metadata)
    {
        // Print a metadata key
        Console.Write(m.Key);
        Console.Write(": ");
        // Print a metadata value
        Console.WriteLine(m.Value);
    }
}

The code sample below shows how to detect the media type of password-protected Office Open XML documents:

// Create load options
LoadOptions loadOptions = new LoadOptions();
// Set a password
loadOptions.Password = "password";
// Get a default composite media type detector
var detector = CompositeMediaTypeDetector.Default;
// Create a stream to detect media type by content (not file extension)
using (var stream = File.OpenRead(Common.GetFilePath(fileName)))
{
    // Detect a media type
    var mediaType = detector.Detect(stream, loadOptions);
    // Print a detected media type
    Console.WriteLine(mediaType);
}
