DocumentRecognitionSettings
Inheritance: java.lang.Object
public class DocumentRecognitionSettings
Settings for PDF recognition. Contains properties that customize the recognition process.
Constructors
| Constructor | Description |
|---|---|
| DocumentRecognitionSettings(int pagesNumber) | Initializes a new instance of the DocumentRecognitionSettings class with default properties. |
| DocumentRecognitionSettings(int startPage, int pagesNumber) | Initializes a new instance of the DocumentRecognitionSettings class with a short set of properties. |
| DocumentRecognitionSettings(int startPage, int pagesNumber, Language language, boolean detectAreas, boolean autoSkew, int threshold) | Initializes a new instance of the DocumentRecognitionSettings class with the full set of properties. |
Methods
| Method | Description |
|---|---|
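| setDetectAreas(boolean detectAreas) | Enables or disables automatic text areas detection. |
| setAutoSkew(boolean autoSkew) | Enables or disables automatic image skew correction. |
| setLanguage(Language language) | Sets the language used for OCR. |
| setThresholdValue(int thresholdValue) | Sets a custom image binarization threshold. |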
| setIgnoredCharacters(String ignoredCharacters) | |
| setLinesFiltration(boolean linesFiltration) | |
| setStartPage(int startPage) | Sets the first page for recognition. |
| setPagesNumber(int pagesNumber) | Sets the number of pages to recognize in a multipage PDF file. |
| setThreadsCount(int threadsCount) | Sets the number of threads used for processing. |
| setAutoContrast(boolean autoContrast) | Allows using an additional contrast correction algorithm for the image before recognition. |
| setAllowedCharacters(CharactersAllowedType allowedCharacters) | Sets the type of characters allowed in the recognition result. |
| setDetectAreasMode(DetectAreasMode detectAreasMode) | Determines the type of neural network used for areas detection. |
| getStartPage() | Returns the first page of the PDF file from which images are extracted. |
| getPagesNumber() | Returns the total number of pages of the PDF file from which images are extracted (counted from startPage). |
DocumentRecognitionSettings(int pagesNumber)
public DocumentRecognitionSettings(int pagesNumber)
Initializes a new instance of the DocumentRecognitionSettings class with default properties. Requires pagesNumber to be set. Set it to 0 to recognize all pages in the document.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| pagesNumber | int | The number of pages to recognize in a multipage PDF file. |
DocumentRecognitionSettings(int startPage, int pagesNumber)
public DocumentRecognitionSettings(int startPage, int pagesNumber)
Initializes a new instance of the DocumentRecognitionSettings class with a short set of properties.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| startPage | int | The first page for recognition. |
| pagesNumber | int | The number of pages to recognize in a multipage PDF file. |
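
To illustrate how startPage and pagesNumber could combine, here is a minimal sketch of page selection. PageRangeSketch and selectPages are hypothetical names that do not exist in the library. The sketch assumes that startPage is zero-based, that pagesNumber equal to 0 means "all remaining pages", and that out-of-range requests are clamped to the document length. The library's actual behavior may differ.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical helper, not part of the library: it only illustrates how
// startPage and pagesNumber could select pages from a document.
final class PageRangeSketch {

    // Returns the zero-based page indices selected by the two settings.
    // Assumptions: startPage is zero-based, pagesNumber == 0 means "all
    // remaining pages", and requests past the end are clamped.
    static int[] selectPages(int totalPages, int startPage, int pagesNumber) {
        if (totalPages < 0 || startPage < 0 || pagesNumber < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        int first = Math.min(startPage, totalPages);
        int remaining = totalPages - first;
        int count = (pagesNumber == 0) ? remaining : Math.min(pagesNumber, remaining);
        return IntStream.range(first, first + count).toArray();
    }

    public static void main(String[] args) {
        // A 10-page document, starting at page index 2, three pages requested.
        System.out.println(Arrays.toString(selectPages(10, 2, 3))); // prints [2, 3, 4]
    }
}
```

Under these assumptions, selectPages(10, 0, 0) selects all ten pages, which mirrors the "set 0 to recognize all pages" rule of the single-argument constructor.
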
DocumentRecognitionSettings(int startPage, int pagesNumber, Language language, boolean detectAreas, boolean autoSkew, int threshold)
public DocumentRecognitionSettings(int startPage, int pagesNumber, Language language, boolean detectAreas, boolean autoSkew, int threshold)
Initializes a new instance of the DocumentRecognitionSettings class with the full set of properties.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| startPage | int | The first page for recognition. 0 by default. |
| pagesNumber | int | The number of pages to recognize in a multipage PDF file. |
| language | Language | The language used for OCR. |
| detectAreas | boolean | Enables automatic text areas detection. |
| autoSkew | boolean | Enables automatic image skew correction. |
| threshold | int | Custom image binarization threshold. |
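
For readers unfamiliar with binarization thresholds, the following sketch shows the general technique on a grayscale pixel array. It does not show the library's internal algorithm. BinarizationSketch and binarize are hypothetical names, and the 8-bit value range (0 to 255) and the "greater than or equal means white" convention are assumptions made for illustration.

```java
// Hypothetical illustration of fixed-threshold binarization; the library's
// own preprocessing may work differently.
final class BinarizationSketch {

    // Converts grayscale values (assumed 0..255) to black/white.
    // true = white (value >= threshold), false = black (value < threshold).
    static boolean[] binarize(int[] gray, int threshold) {
        boolean[] white = new boolean[gray.length];
        for (int i = 0; i < gray.length; i++) {
            white[i] = gray[i] >= threshold;
        }
        return white;
    }

    public static void main(String[] args) {
        int[] pixels = {12, 130, 200, 127};
        boolean[] result = binarize(pixels, 128);
        // With threshold 128: 12 -> black, 130 -> white, 200 -> white, 127 -> black.
        System.out.println(java.util.Arrays.toString(result)); // prints [false, true, true, false]
    }
}
```

In this sketch a lower threshold turns more pixels white and a higher one turns more pixels black, which is why a custom value can help with faint or dark scans.
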
setDetectAreas(boolean detectAreas)
public void setDetectAreas(boolean detectAreas)
Parameters:
| Parameter | Type | Description |
|---|---|---|
| detectAreas | boolean | Enables automatic text areas detection. |
setAutoSkew(boolean autoSkew)
public void setAutoSkew(boolean autoSkew)
Parameters:
| Parameter | Type | Description |
|---|---|---|
| autoSkew | boolean | Enables automatic image skew correction. |
setLanguage(Language language)
public void setLanguage(Language language)
Parameters:
| Parameter | Type | Description |
|---|---|---|
| language | Language | The language used for OCR. |
setThresholdValue(int thresholdValue)
public void setThresholdValue(int thresholdValue)
Parameters:
| Parameter | Type | Description |
|---|---|---|
| thresholdValue | int | Custom image binarization threshold. |
setIgnoredCharacters(String ignoredCharacters)
public void setIgnoredCharacters(String ignoredCharacters)
Parameters:
| Parameter | Type | Description |
|---|---|---|
| ignoredCharacters | java.lang.String | |
setLinesFiltration(boolean linesFiltration)
public void setLinesFiltration(boolean linesFiltration)
Parameters:
| Parameter | Type | Description |
|---|---|---|
| linesFiltration | boolean | |
setStartPage(int startPage)
public void setStartPage(int startPage)
Parameters:
| Parameter | Type | Description |
|---|---|---|
| startPage | int | The first page for recognition. |
setPagesNumber(int pagesNumber)
public void setPagesNumber(int pagesNumber)
Parameters:
| Parameter | Type | Description |
|---|---|---|
| pagesNumber | int | The number of pages to recognize in a multipage PDF file. |
setThreadsCount(int threadsCount)
public void setThreadsCount(int threadsCount)
Sets the number of threads used for processing. The default value, 0, means that the image is processed with as many threads as there are processors. A value of 1 means that the image is processed in the main thread.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| threadsCount | int | The number of threads created for parallel recognition of image fragments. |
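
The rule described above (0 maps to the processor count, any other value is used as given) can be sketched as follows. ThreadsCountSketch and effectiveThreads are hypothetical names used only for illustration. Rejecting negative values is an assumption, since the source does not say how they are handled.

```java
// Hypothetical illustration of how a threadsCount setting could be resolved
// to an actual number of worker threads.
final class ThreadsCountSketch {

    // 0 -> one thread per available processor; any positive value is used as is.
    // Rejecting negative values is an assumption of this sketch.
    static int effectiveThreads(int threadsCount) {
        if (threadsCount < 0) {
            throw new IllegalArgumentException("threadsCount must be non-negative");
        }
        return threadsCount == 0
                ? Runtime.getRuntime().availableProcessors()
                : threadsCount;
    }

    public static void main(String[] args) {
        System.out.println("default (0) resolves to " + effectiveThreads(0) + " threads");
        System.out.println("explicit 1 resolves to " + effectiveThreads(1) + " thread");
    }
}
```
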
setAutoContrast(boolean autoContrast)
public void setAutoContrast(boolean autoContrast)
Allows using an additional contrast correction algorithm for the image before recognition.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| autoContrast | boolean | Whether the contrast correction filter is applied. |
setAllowedCharacters(CharactersAllowedType allowedCharacters)
public void setAllowedCharacters(CharactersAllowedType allowedCharacters)
Sets the allowed character set. Determines the type of characters allowed in the recognition result.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| allowedCharacters | CharactersAllowedType | A CharactersAllowedType enum value. |
setDetectAreasMode(DetectAreasMode detectAreasMode)
public void setDetectAreasMode(DetectAreasMode detectAreasMode)
Determines the type of neural network used for areas detection.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| detectAreasMode | DetectAreasMode | A DetectAreasMode enum value. |
getStartPage()
public int getStartPage()
Returns the first page of the PDF file from which images are extracted.
Returns: int - start page
getPagesNumber()
public int getPagesNumber()
Returns the total number of pages of the PDF file from which images are extracted (counted from startPage).
Returns: int - number of pages for recognition