PDFtoolkit VCL
Edit, enhance, secure, merge, split, view, print PDF and AcroForms documents
Compatibility: Delphi, C++Builder

How To Extract XMP Metadata of a PDF Document

Extract metadata that others might miss.
By V. Subhash

Metadata is data about data. In a PDF document, properties such as title, subject, and keywords are metadata. Applications can also embed further metadata in the document, in the form defined by Adobe's XMP (Extensible Metadata Platform) specification.

PDFtoolkit provides the method TgtPDFDocument.GetXMLMetadata to retrieve this metadata. The XMP specification requires that the metadata be stored as XML (Extensible Markup Language), so the method returns it as an XML string.

The following console program extracts the metadata, saves it to an XML file, and opens that file.

{
 This program shows how to retrieve the XML (XMP) metadata
 of a PDF document and save it to a file.
}
program TgtCustomPDFDocument_GetXMLMetadata;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, ShellApi,
  gtPDFDoc, gtCstPDFDoc;
var
  gtPDFDocument1: TgtPDFDocument;
  strXMLData: String;
  strXMLDataList: TStringList;
begin

  // Create a document object
  gtPDFDocument1 := TgtPDFDocument.Create(Nil);

  try
    // Load a document
    gtPDFDocument1.LoadFromFile('sample_doc.pdf');

    // Check if the document was loaded successfully
    if gtPDFDocument1.IsLoaded then
      begin
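        // Obtain the XML metadata as a string
        strXMLData := gtPDFDocument1.GetXMLMetadata;

        // Write the XML metadata to a text file; the try..finally
        // ensures the list is freed even if saving fails
        strXMLDataList := TStringList.Create;
        try
          strXMLDataList.Text := strXMLData;
          strXMLDataList.SaveToFile('sample_doc_pdf.xml');
        finally
          strXMLDataList.Free;
        end;

        // Open the XML file in its associated application
        // (the last argument, 1, is SW_SHOWNORMAL)
        ShellExecute(0, 'open', 'sample_doc_pdf.xml', nil, nil, 1);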
      end
    else
      Writeln('Sorry, I could not load sample_doc.pdf.');
  except
    on Err:Exception do
      begin
        Writeln('Sorry, an exception was raised. ');
        Writeln(Err.Classname + ': ' + Err.Message);
      end;
  end;

  // Free resources
  gtPDFDocument1.Reset;
  // Destroy PDF document object
  FreeAndNil(gtPDFDocument1);

end.

Screenshot: XML metadata extracted from a PDF document
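
Once the metadata is available as an XML string, you may want to read a single value out of it. The program below is a minimal sketch of one way to do that, not a PDFtoolkit feature: ExtractElementText is a hypothetical helper written for this example, and SampleXmp is a made-up fragment standing in for the string that GetXMLMetadata would return. The helper does a plain text search rather than real XML parsing, so it assumes the property is written as a simple element without attributes. XMP also allows properties to be written as attributes or nested structures, which this sketch will not find.

```pascal
program XmpElementSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

const
  // Hypothetical XMP fragment, for illustration only. In a real program,
  // use the string returned by TgtPDFDocument.GetXMLMetadata instead.
  SampleXmp =
    '<rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/">' +
    '<xmp:CreatorTool>Sample Writer</xmp:CreatorTool>' +
    '</rdf:Description>';

// Hypothetical helper (not part of PDFtoolkit): returns the text between
// <TagName> and </TagName>, or an empty string if either tag is missing.
function ExtractElementText(const Xml, TagName: string): string;
var
  OpenTag, CloseTag: string;
  StartPos, EndPos: Integer;
begin
  Result := '';
  OpenTag := '<' + TagName + '>';
  CloseTag := '</' + TagName + '>';

  // Locate the opening tag
  StartPos := Pos(OpenTag, Xml);
  if StartPos = 0 then
    Exit;
  StartPos := StartPos + Length(OpenTag);

  // Locate the closing tag, searching from the end of the opening tag
  EndPos := PosEx(CloseTag, Xml, StartPos);
  if EndPos = 0 then
    Exit;

  Result := Trim(Copy(Xml, StartPos, EndPos - StartPos));
end;

begin
  // Prints: Sample Writer
  Writeln(ExtractElementText(SampleXmp, 'xmp:CreatorTool'));
end.
```

For anything beyond a quick check, a proper XML parser is the safer choice, because it handles namespaces, attributes, and nested elements that a text search cannot.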

© 2002-2023 Gnostice Information Technologies Private Limited. All rights reserved.