ReadPDFFileV2

This script is part of the Common Scripts Pack.

Loads a PDF file's content and metadata into context. Supports extraction of hashes, URLs, and email addresses when available.

Script Data

| Name | Description |
| --- | --- |
| Script Type | python3 |
| Tags | Utility, ingestion |
| Cortex XSOAR Version | 4.1.0+ |

Inputs

| Argument Name | Description |
| --- | --- |
| entryID | The War Room entryID of the file to read. |
| userPassword | The password for the file, if encrypted. |
| maxImages | The maximum number of images to extract from the PDF file. |
| unescape_url | Whether to unescape URLs that were escaped during URL extraction. Invalid characters are ignored. Default is true. |
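The arguments above can be assembled into a plain dictionary before the script is called. The following is a minimal sketch, not the script's own code: the helper name `build_read_pdf_args`, the sample entry ID `123@456`, the choice to treat `entryID` as required, and the choice to pass every value as a string are all assumptions made for illustration.

```python
from typing import Dict, Optional


def build_read_pdf_args(
    entry_id: str,
    user_password: Optional[str] = None,
    max_images: Optional[int] = None,
    unescape_url: bool = True,
) -> Dict[str, str]:
    """Hypothetical helper: build the argument dictionary for ReadPDFFileV2."""
    # Assumption: entryID is treated as required in this sketch.
    if not entry_id:
        raise ValueError("entryID is required")

    # Assumption: values are passed as strings, with booleans as "true"/"false".
    args = {
        "entryID": entry_id,
        "unescape_url": "true" if unescape_url else "false",
    }

    # Optional arguments are included only when the caller supplies them.
    if user_password is not None:
        args["userPassword"] = user_password
    if max_images is not None:
        if max_images < 0:
            raise ValueError("maxImages must not be negative")
        args["maxImages"] = str(max_images)
    return args


# "123@456" is a placeholder entry ID, not a real War Room entry.
example_args = build_read_pdf_args("123@456", max_images=5)
print(example_args)
```

Inside a Cortex XSOAR automation, a dictionary like this would typically be handed to `demisto.executeCommand("ReadPDFFileV2", example_args)`. That call depends on the platform runtime, so it is left out of the runnable sketch.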

Outputs

| Path | Description | Type |
| --- | --- | --- |
| URL.Data | The list of URLs that were extracted from the PDF file. | String |
| File.Text | The text that was extracted from the PDF file. | String |
| File.Producer | The producer of the PDF file. | String |
| File.Title | The title of the PDF file. | String |
| File.Author | The author of the PDF file. | String |
| File.ModDate | The ModDate of the PDF file. | Date |
| File.CreationDate | The CreationDate of the PDF file. | Date |
| File.Pages | The number of pages in the PDF file. | String |
| File.Size | The file size in bytes. | Number |
| File.Form | The PDF form type. | String |
| File.Encrypted | Whether the file is encrypted. | String |
| File.FileSize | The file size in bytes. | String |
| File.SHA1 | The SHA1 hash of the file. | String |
| File.PageRot | The page rotation of the PDF file. | String |
| File.Optimized | Whether the file has been optimized. | String |
| File.SHA256 | The SHA256 hash of the file. | String |
| File.PDFVersion | The PDF version. | String |
| File.Name | The name of the PDF file. | String |
| File.Creator | The creator of the PDF file. | String |
| File.Tagged | Whether the file has tagged meta-information. | String |
| File.SSDeep | The SSDeep hash of the file. | String |
| File.EntryID | The entry ID of the file. | String |
| File.JavaScript | Whether the file contains JavaScript. | String |
| File.Info | Additional information about the file. | String |
| File.PageSize | The page size of the PDF file. | String |
| File.Type | The file type. | String |
| File.Suspects | Indicates the presence of tag suspects. | String |
| File.MD5 | The MD5 hash of the file. | String |
| File.UserProperties | Indicates the presence of structure elements that contain user properties attributes. | String |
| File.Extension | The file's extension. | String |
| Account.Email | The email address of the account. | String |
| Hashes.type | The hash type extracted from the PDF file. | String |
| Hashes.value | The hash value extracted from the PDF file. | String |
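A downstream automation usually reads only a few of these paths. The sketch below shows one way to pull them from a context-like dictionary. The function names, the shape of `sample_context`, and every value in it are hypothetical; the only things taken from the table above are the path names and the fact that `File.Pages` is typed as a string.

```python
from typing import Any, Dict, List


def as_list(value: Any) -> List[Any]:
    """Normalize a context value that may be missing, a single item, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def summarize_pdf_context(context: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: collect a few ReadPDFFileV2 outputs into one summary."""
    files = as_list(context.get("File"))
    file_entry = files[0] if files else {}

    # File.Pages is documented as a String, so convert it before doing arithmetic.
    pages = file_entry.get("Pages")

    return {
        "name": file_entry.get("Name"),
        "pages": int(pages) if pages not in (None, "") else None,
        "sha256": file_entry.get("SHA256"),
        "urls": [u["Data"] for u in as_list(context.get("URL")) if "Data" in u],
        "emails": [a["Email"] for a in as_list(context.get("Account")) if "Email" in a],
        "hashes": [(h.get("type"), h.get("value")) for h in as_list(context.get("Hashes"))],
    }


# Placeholder data for illustration only; not real script output.
sample_context = {
    "File": {"Name": "report.pdf", "Pages": "3", "SHA256": "placeholder-sha256"},
    "URL": [{"Data": "https://example.com/a"}, {"Data": "https://example.com/b"}],
    "Account": {"Email": "user@example.com"},
    "Hashes": [{"type": "MD5", "value": "d41d8cd98f00b204e9800998ecf8427e"}],
}

summary = summarize_pdf_context(sample_context)
print(summary)
```

Normalizing each entry with `as_list` keeps the helper tolerant of a context key that holds either one object or a list of objects.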