recognize_utf8_text property

RtfLoadOptions.recognize_utf8_text property

When set to True, the loader tries to detect UTF-8 characters and preserves them during import.

@property
def recognize_utf8_text(self) -> bool:
    ...

@recognize_utf8_text.setter
def recognize_utf8_text(self, value: bool):
    ...

Remarks

Default value is False.

Examples

Shows how to detect UTF-8 characters while loading an RTF document.

import aspose.words as aw

# "recognize_utf8_text" (a bool) and "MY_DIR" (the folder that holds the sample file) come from the surrounding test.
# Create an "RtfLoadOptions" object to modify how we load an RTF document.
load_options = aw.loading.RtfLoadOptions()
# Set the "recognize_utf8_text" property to "False" to assume that the document uses the ISO 8859-1 charset
# and load every character in the document.
# Set the "recognize_utf8_text" property to "True" to parse any variable-length (multi-byte UTF-8) characters that may occur in the text.
load_options.recognize_utf8_text = recognize_utf8_text
doc = aw.Document(MY_DIR + 'UTF-8 characters.rtf', load_options)
if recognize_utf8_text:
    self.assertEqual('“John Doe´s list of currency symbols”™\r' + '€, ¢, £, ¥, ¤', doc.first_section.body.get_text().strip())
else:
    self.assertEqual('“John Doe´s list of currency symbolsâ€\x9dâ„¢\r' + '€, ¢, £, Â¥, ¤', doc.first_section.body.get_text().strip())
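
The garbled sequences in the False branch (for example 'Â¥' and 'â„¢') look like UTF-8 bytes that were decoded one byte at a time. The following is a minimal sketch of that effect in plain Python, independent of Aspose.Words. The helper name `misread_utf8` is hypothetical, and the choice of Windows-1252 as the single-byte code page is an assumption made because it reproduces the strings shown above; it is not a statement about how the library decodes text internally.

```python
def misread_utf8(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes with a single-byte code page.

    Hypothetical helper for illustration only; not part of Aspose.Words.
    """
    return text.encode("utf-8").decode(wrong_encoding)


# "¥" is two bytes in UTF-8 (0xC2 0xA5), so it becomes two characters.
garbled_yen = misread_utf8("¥")

# "™" is three bytes in UTF-8 (0xE2 0x84 0xA2), so it becomes three characters.
garbled_tm = misread_utf8("™")

print(garbled_yen, garbled_tm)

# Reversing the mistake recovers the original character.
restored_yen = garbled_yen.encode("cp1252").decode("utf-8")
print(restored_yen)
```

Under these assumptions, characters that the RTF file stores as single bytes are unaffected by the setting, which would explain why '€', '¢', '£' and '¤' are identical in both branches of the example.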

See Also