Image OCR

This Integration is part of the Image OCR Pack.

Use the Image OCR integration to extract text from images. The integration uses the open-source Tesseract OCR engine.

Use Cases

- Extract text from images included in emails during a phishing investigation.
- Extract text from images included in an HTML page.

Configure Image OCR in Cortex

Parameter | Description | Required |
---|---|---|
A CSV of language codes of the language to use for OCR (leave empty to use defaults). | The default language used for OCR is English. Use this parameter to specify a list of additional languages. For example, eng,fra. To see all supported language codes, use the image-ocr-list-languages command. | False |
Skip on corrupted images | If true, the integration does not raise an error when an image is corrupted and cannot be processed. | False |
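
The two parameters above can be pictured with a small Python sketch. It is not the integration's code: `parse_langs`, `ocr_all`, and the `ocr_fn` callback are hypothetical names used only for illustration. Whether configured codes are added to English or replace it is not spelled out on this page, so the sketch simply returns the codes as given and falls back to English when the value is empty.

```python
from typing import Callable, Dict, Iterable, List, Optional

DEFAULT_LANGS = ["eng"]  # English is the documented default OCR language


def parse_langs(csv_value: Optional[str]) -> List[str]:
    """Split a CSV such as "eng, fra" into ["eng", "fra"].

    An empty or missing value falls back to the default language.
    """
    if not csv_value or not csv_value.strip():
        return list(DEFAULT_LANGS)
    return [code.strip() for code in csv_value.split(",") if code.strip()]


def ocr_all(
    paths: Iterable[str],
    ocr_fn: Callable[[str], str],
    skip_corrupted: bool = False,
) -> Dict[str, str]:
    """Run ocr_fn (a hypothetical OCR callback) on every path.

    With skip_corrupted=True, an image that fails to process is left out
    of the result instead of aborting the whole run.
    """
    results: Dict[str, str] = {}
    for path in paths:
        try:
            results[path] = ocr_fn(path)
        except Exception:
            if not skip_corrupted:
                raise
    return results
```

With `skip_corrupted=False` the first failing image stops processing, which mirrors the behavior described for the "Skip on corrupted images" parameter when it is left unset.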

Commands

You can execute these commands from the CLI, as part of an automation, or in a playbook. After you successfully execute a command, a DBot message appears in the War Room with the command details.

image-ocr-list-languages

Lists supported languages for which the integration can extract text.

Base Command

image-ocr-list-languages

Input

There are no input arguments for this command.

Command Example

!image-ocr-list-languages

Human Readable Output

Image OCR Supported Languages
- ara
- chi_sim
- chi_sim_vert
- chi_tra
- chi_tra_vert
- deu
- eng
- fra
- heb
- ita
- jpn
- jpn_vert
- pol
- por
- rus
- spa
- swe
- tur
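
A list like the one above can also be obtained from a local Tesseract installation. The following is a rough sketch, assuming the `tesseract` binary is on `PATH`; it relies on the `--list-langs` flag of the Tesseract CLI, and the helper names are hypothetical. How the integration itself builds its list is not described on this page.

```python
import subprocess
from typing import List


def parse_list_langs(output: str) -> List[str]:
    """Turn the text printed by `tesseract --list-langs` into sorted codes."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first non-empty line is a header such as
    # "List of available languages (3):", so drop it if present.
    if lines and lines[0].lower().startswith("list of available languages"):
        lines = lines[1:]
    return sorted(lines)


def installed_languages() -> List[str]:
    """Query the local tesseract binary (must be installed and on PATH)."""
    completed = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Older tesseract releases write the list to stderr instead of stdout.
    return parse_list_langs(completed.stdout or completed.stderr)
```

Keeping the parsing separate from the subprocess call makes it possible to check the parser against a captured sample without having Tesseract installed.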

image-ocr-extract-text

Extracts text from an image.

Base Command

image-ocr-extract-text

Input

Argument Name | Description | Required |
---|---|---|
entryid | A comma-separated list of Entry IDs of image files to process. | Required |
langs | A comma-separated list of language codes to use for OCR. Overrides the default configured language list. | Optional |
verbose | Turn on the verbose flag to display the versions of Tesseract and the other libraries used. | Optional |
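
The override rule for `langs` can be sketched as follows. This is an illustration under stated assumptions, not the integration's implementation: the function names are hypothetical, and the command line shown is a plain Tesseract CLI call, where `stdout` as the output base prints the text to standard output and `-l` takes language codes joined with `+`.

```python
from typing import List, Optional


def effective_langs(langs_arg: Optional[str], configured: Optional[str]) -> List[str]:
    """Pick the language list for one run.

    The per-command `langs` argument wins over the configured list;
    English is used when neither is set.
    """
    if langs_arg and langs_arg.strip():
        source = langs_arg
    elif configured and configured.strip():
        source = configured
    else:
        source = "eng"
    return [code.strip() for code in source.split(",") if code.strip()]


def build_tesseract_command(
    image_path: str,
    langs_arg: Optional[str] = None,
    configured: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the tesseract CLI (hypothetical helper)."""
    langs = effective_langs(langs_arg, configured)
    # Tesseract expects multiple languages as "eng+fra", not "eng,fra".
    return ["tesseract", image_path, "stdout", "-l", "+".join(langs)]
```

For example, `build_tesseract_command("img.png", "deu,fra", "eng")` selects `deu+fra`, because the argument takes precedence over the configured `eng`.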

Context Output

Path | Type | Description |
---|---|---|
File.Text | String | Extracted text from the passed image file. |

Command Example

!image-ocr-extract-text entryid="922@e84104f7-b235-4d82-860a-ea09f5dc0559"

Context Example

Human Readable Output

Image OCR Extracted Text for Entry ID 1613@1e6b4a55-33e7-433b-8f6f-2c0751c8c444
The quick brown fox jumped over the 5 lazy dogs!