Optical Character Recognition (OCR) for Data Loss Prevention (DLP) uses machine learning to extract text from images. This extends DLP coverage to image files: text in standalone image files, and in images embedded in documents, is scanned to help prevent the exfiltration of sensitive data.
How can I enable OCR?
OCR is enabled for all DLP customers and applies to all DLP rules by default.
Whenever DLP scans an image file, or a file containing embedded images, it uses OCR to automatically extract text from those images and checks that text for violations of the DLP rules.
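The flow described above (extract text from each image, then check it against the rules) can be pictured with a minimal Python sketch. This is an illustration only, not the product's implementation: the RULES dictionary, the scan_images function, and the extract_text callback are all hypothetical names, and the single regex rule is an assumed example of a pattern a DLP rule might look for.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical rule set: one pattern resembling a US Social Security number.
RULES = {"ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b")}


def scan_images(images: Iterable[bytes],
                extract_text: Callable[[bytes], str]) -> List[str]:
    """Extract text from each image and return the names of matching rules."""
    hits: List[str] = []
    for image in images:
        # extract_text stands in for the OCR step (assumed, not a real API).
        text = extract_text(image)
        for name, pattern in RULES.items():
            if pattern.search(text) and name not in hits:
                hits.append(name)
    return hits
```

Passing the OCR step in as a callback keeps the sketch independent of any particular OCR engine.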
Is OCR fully supported by multimode DLP?
Yes. OCR is applied to data-in-motion using Realtime DLP and data-at-rest using SaaS API DLP.
Which file types are supported by OCR?
OCR supports the following file types: JPEG, JPG, PNG, GIF, TIFF, and EMF.
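As a rough local pre-check, the list above can be expressed as an extension test in Python. This is a sketch under assumptions: OCR_EXTENSIONS and is_ocr_candidate are hypothetical names, the extensions are mapped directly from the file types named above, and the product may well identify files by content rather than by extension.

```python
from pathlib import PurePath

# Extensions taken from the file types listed in this article.
OCR_EXTENSIONS = {".jpeg", ".jpg", ".png", ".gif", ".tiff", ".emf"}


def is_ocr_candidate(filename: str) -> bool:
    """Return True if the file name ends in one of the listed extensions."""
    # Compare case-insensitively so "SCAN.PNG" and "scan.png" behave alike.
    return PurePath(filename).suffix.lower() in OCR_EXTENSIONS
```

For example, is_ocr_candidate("scan.PNG") returns True, while is_ocr_candidate("report.pdf") returns False.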
Where can I find more information?
Refer to the Secure Access and Umbrella documentation for guidance on supported file types.
Secure Access: Supported File and Form Types
Umbrella: Supported File and Form Types