Settings for the image recognition. Contains elements that allow customizing the recognition process.
#include <C:/Users/figor/programming/project/aspose.ocr-for-cpp/doc_gen/aspose_ocr.h>
Public Attributes | |
bool | all_image = false |
Disabled (false) by default. When enabled, the whole image is recognized as a single area. Has no effect in asposeocr_recognize_receipt and aspose::ocr::recognize_receipt. | |
bool | correct_skew = true |
Enabled (true) by default. Detects the image orientation and rotates the image automatically if needed. | |
bool | upscale_small_font = false |
Allows you to use additional algorithms specifically for small font recognition. Useful for images with small-size characters. | |
bool | lines_filtration = false |
Disabled (false) by default. Enables recognition of text in tables (regions surrounded by lines). | |
const wchar_t * | alphabet = NULL |
Empty by default (all characters allowed). Set of characters allowed in the recognition result. | |
const wchar_t * | ignoredCharacters = L"" |
L"" by default (no characters ignored). Set of characters to exclude from the recognition result. | |
export_format | format = export_format::text |
Result format: plain text or JSON-formatted text, saved in a wchar_t* buffer. Plain text by default. Supported formats: text, json. | |
rect * | rectangles = NULL |
Areas of the image to recognize. Example: rect rectangles[2] = { { 3, 50, 100, 70 }, { 3, 160, 100, 75 } }; | |
size_t | rectangles_size = 0 |
Number of rectangles in the rectangles array. | |
rect * | preprocess_area = NULL |
User-defined area to be pre-processed. Example: rect area = { 3, 50, 100, 100 }; | |
double | skew = 0 |
Rotates the image by the specified angle. Has no effect if rectangles are specified. | |
language | language_alphabet = language::none |
Language used for OCR. Multi-language by default. Supported languages: English (en), German (de), Portuguese (pt), Spanish (es), French (fr), Italian (it), Czech (cze), Danish (dan), Dutch (dum), Estonian (est), Finnish (fin), Latvian (lav), Lithuanian (lit), Norwegian (nor), Polish (pol), Romanian (rum), Serbo-Croatian (srp_hrv), Slovak (slk), Slovene (slv), Swedish (swe), Chinese (chi). | |
file_format | save_format = file_format::txt |
Format in which the "page_save" method saves the result. Default: txt. Supported formats: file_format::docx, file_format::txt, file_format::pdf, file_format::xlsx, file_format::json, file_format::xml, file_format::rtf. Has no effect in other methods. | |
int | threshold_value = 0 |
Sets custom threshold value for image binarization. Range from 1 to 255. | |
custom_preprocessing_filters | filters |
Prepares the image for OCR by applying custom pre-processing filters. Up to 12 filters can be set. Example: RecognitionSettings settings; settings.filters.filter_1 = OCR_IMG_PREPROCESS_GRAYSCALE; settings.filters.filter_2 = OCR_IMG_PREPROCESS_SCALE(2); settings.filters.filter_3 = OCR_IMG_PREPROCESS_THRESHOLD(200); | |
characters_allowed_type | allowed_characters = characters_allowed_type::ALL |
Determines the type of characters allowed in the recognition result. Takes a characters_allowed_type enum value. | |
bool | auto_contrast = false |
Allows using an additional contrast correction algorithm for the image before recognition. | |
bool | auto_denoising = false |
Enables the use of an additional neural network on the image before recognition. Useful for images with noise, spots, flares, gradients, or foreign elements. | |
detect_areas_mode_enum | detect_areas_mode = detect_areas_mode_enum::DOCUMENT |
Selects the area detection mode best suited to the content type: document, photo, plain text, column, image. Has no effect in asposeocr_recognize_receipt and aspose::ocr::recognize_receipt. | |
unsigned int | defects = defect_type::ASPOSE_OCR_NONE |
Determines which types of image defects to detect. Use case 1: defects = defect_type::ASPOSE_OCR_DETECT_DARK_IMAGES | defect_type::ASPOSE_OCR_DETECT_SALT_PEPPER_NOISE; Use case 2: defects = defect_type::ASPOSE_OCR_DETECT_ALL; | |
Settings for the image recognition. Contains elements that allow customizing the recognition process.
bool RecognitionSettings::all_image = false |
Disabled (false) by default. When enabled, the whole image is recognized as a single area. Has no effect in asposeocr_recognize_receipt and aspose::ocr::recognize_receipt.
characters_allowed_type RecognitionSettings::allowed_characters = characters_allowed_type::ALL |
Determines the type of characters allowed in the recognition result. Takes a characters_allowed_type enum value.
const wchar_t* RecognitionSettings::alphabet = NULL |
Empty by default (all characters allowed). Set of characters allowed in the recognition result.
bool RecognitionSettings::auto_contrast = false |
Allows using an additional contrast correction algorithm for the image before recognition.
bool RecognitionSettings::auto_denoising = false |
Enables the use of an additional neural network on the image before recognition. Useful for images with noise, spots, flares, gradients, or foreign elements.
bool RecognitionSettings::correct_skew = true |
Enabled (true) by default. Detects the image orientation and rotates the image automatically if needed.
unsigned int RecognitionSettings::defects = defect_type::ASPOSE_OCR_NONE |
Determines which types of image defects to detect. Use case 1: defects = defect_type::ASPOSE_OCR_DETECT_DARK_IMAGES | defect_type::ASPOSE_OCR_DETECT_SALT_PEPPER_NOISE; Use case 2: defects = defect_type::ASPOSE_OCR_DETECT_ALL;
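Because defects is an unsigned int that combines defect_type values with the | operator, it behaves like a bit mask. The following is a minimal sketch of composing and testing such a mask. It does not include aspose_ocr.h: the enum below is a stand-in, its numeric values are assumptions made for illustration, and has_defect_flag and dark_and_noise_mask are hypothetical helpers, not library functions.

```cpp
// Stand-in for the library's defect_type. The enumerator names come from the
// documentation above; the numeric values are ASSUMED and may differ from
// those in aspose_ocr.h.
enum defect_type : unsigned int {
    ASPOSE_OCR_NONE                     = 0u,
    ASPOSE_OCR_DETECT_SALT_PEPPER_NOISE = 1u << 0,  // assumed value
    ASPOSE_OCR_DETECT_DARK_IMAGES       = 1u << 1   // assumed value
};

// Hypothetical helper: true if every bit of `flag` is set in `defects`.
bool has_defect_flag(unsigned int defects, unsigned int flag) {
    return (defects & flag) == flag;
}

// Hypothetical helper: builds the mask shown in use case 1.
unsigned int dark_and_noise_mask() {
    return defect_type::ASPOSE_OCR_DETECT_DARK_IMAGES
         | defect_type::ASPOSE_OCR_DETECT_SALT_PEPPER_NOISE;
}
```

With the real header, the value returned by a function like dark_and_noise_mask would be assigned to the defects member of a RecognitionSettings instance.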
detect_areas_mode_enum RecognitionSettings::detect_areas_mode = detect_areas_mode_enum::DOCUMENT |
Selects the area detection mode best suited to the content type: document, photo, plain text, column, image. Has no effect in asposeocr_recognize_receipt and aspose::ocr::recognize_receipt.
custom_preprocessing_filters RecognitionSettings::filters |
Prepares the image for OCR by applying custom pre-processing filters. Up to 12 filters can be set. Example: RecognitionSettings settings; settings.filters.filter_1 = OCR_IMG_PREPROCESS_GRAYSCALE; settings.filters.filter_2 = OCR_IMG_PREPROCESS_SCALE(2); settings.filters.filter_3 = OCR_IMG_PREPROCESS_THRESHOLD(200);
export_format RecognitionSettings::format = export_format::text |
Result format: plain text or JSON-formatted text, saved in a wchar_t* buffer. Plain text by default. Supported formats: text, json.
const wchar_t* RecognitionSettings::ignoredCharacters = L"" |
L"" by default (no characters ignored). Set of characters to exclude from the recognition result.
language RecognitionSettings::language_alphabet = language::none |
Language used for OCR. Multi-language by default. Supported languages: English (en), German (de), Portuguese (pt), Spanish (es), French (fr), Italian (it), Czech (cze), Danish (dan), Dutch (dum), Estonian (est), Finnish (fin), Latvian (lav), Lithuanian (lit), Norwegian (nor), Polish (pol), Romanian (rum), Serbo-Croatian (srp_hrv), Slovak (slk), Slovene (slv), Swedish (swe), Chinese (chi).
bool RecognitionSettings::lines_filtration = false |
Disabled (false) by default. Enables recognition of text in tables (regions surrounded by lines). When false, tables are not detected and lines are not removed, which improves performance; otherwise true.
rect* RecognitionSettings::preprocess_area = NULL |
User-defined area to be pre-processed. Example: rect area = { 3, 50, 100, 100 };
rect* RecognitionSettings::rectangles = NULL |
Areas of the image to recognize. Example: rect rectangles[2] = { { 3, 50, 100, 70 }, { 3, 160, 100, 75 } };
size_t RecognitionSettings::rectangles_size = 0 |
Number of rectangles in the rectangles array.
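rectangles and rectangles_size are set together: one points at an array of rect, the other presumably holds its element count. The sketch below shows one way to keep the two in step. It does not include aspose_ocr.h: rect and AreaSettings are stand-ins, the rect field names are assumptions (the documentation shows only four integers per rectangle), and set_areas is a hypothetical helper.

```cpp
#include <cstddef>

// Stand-in for the library's rect. The field names are ASSUMED; only the
// four-integer layout is taken from the documented example.
struct rect {
    int x;
    int y;
    int width;
    int height;
};

// Stand-in that mirrors only the two RecognitionSettings members discussed here.
struct AreaSettings {
    rect*       rectangles      = nullptr;
    std::size_t rectangles_size = 0;
};

// Hypothetical helper: stores a pointer to a fixed-size array and takes the
// element count from the array type, so the pointer and the count cannot disagree.
template <std::size_t N>
void set_areas(AreaSettings& settings, rect (&areas)[N]) {
    settings.rectangles      = areas;
    settings.rectangles_size = N;
}
```

Only a pointer is stored, so the array passed to a helper like set_areas has to stay alive for as long as the settings are in use.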
file_format RecognitionSettings::save_format = file_format::txt |
Format in which the "page_save" method saves the result. Default: txt. Supported formats: file_format::docx, file_format::txt, file_format::pdf, file_format::xlsx, file_format::json, file_format::xml, file_format::rtf. Has no effect in other methods.
double RecognitionSettings::skew = 0 |
Rotates the image by the specified angle. Has no effect if rectangles are specified.
int RecognitionSettings::threshold_value = 0 |
Sets custom threshold value for image binarization. Range from 1 to 255.
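The documented range is 1 to 255, while the default value 0 lies outside it; this page does not say how 0 is interpreted. A small range check before assigning a custom value could look like the sketch below. is_valid_threshold is a hypothetical helper, not part of the library.

```cpp
// Hypothetical helper: true if `value` lies in the documented range 1..255.
constexpr bool is_valid_threshold(int value) {
    return value >= 1 && value <= 255;
}
```

A caller would assign to threshold_value only when the check passes and leave the default untouched otherwise.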
bool RecognitionSettings::upscale_small_font = false |
Allows you to use additional algorithms specifically for small font recognition. Useful for images with small-size characters.