recognize_utf8_text property

RtfLoadOptions.recognize_utf8_text property

When set to True, the loader tries to detect UTF-8 characters and preserves them during import.

@property
def recognize_utf8_text(self) -> bool:
    ...

@recognize_utf8_text.setter
def recognize_utf8_text(self, value: bool):
    ...

Remarks

Default value is False.

Examples

Shows how to detect UTF-8 characters while loading an RTF document.

import aspose.words as aw

# Placeholder values for this snippet: point MY_DIR at the folder that holds the sample file,
# and flip the flag to compare both loading modes.
MY_DIR = "./"
recognize_utf8_text = True

# Create an "RtfLoadOptions" object to modify how we load an RTF document.
load_options = aw.loading.RtfLoadOptions()

# Set the "recognize_utf8_text" property to "False" to assume that the document uses the ISO 8859-1 charset
# and load every character in the document.
# Set the "recognize_utf8_text" property to "True" to parse any variable-length characters that may occur in the text.
load_options.recognize_utf8_text = recognize_utf8_text

doc = aw.Document(MY_DIR + "UTF-8 characters.rtf", load_options)

# Compare the loaded text against the expected result for the chosen mode.
text = doc.first_section.body.get_text().strip()
if recognize_utf8_text:
    assert text == "“John Doe´s list of currency symbols”™\r" + "€, ¢, £, ¥, ¤"
else:
    assert text == "“John Doe´s list of currency symbolsâ€\u009dâ„¢\r" + "€, ¢, £, Â¥, ¤"
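
The garbled fragments in the second expected string are what multi-byte UTF-8 sequences look like when each byte is read as a separate character. The following is a minimal standalone sketch of that effect in plain Python. It does not use Aspose.Words and is not the library's implementation. The helper name decode_single_byte is hypothetical, and the choice of Windows-1252 with a fallback for its undefined bytes is an assumption made because it reproduces the fragments shown above.

```python
def decode_single_byte(data: bytes) -> str:
    """Decode every byte on its own, as a single-byte charset reader would."""
    chars = []
    for byte in data:
        try:
            # Assumption: Windows-1252 is used as the single-byte charset.
            chars.append(bytes([byte]).decode("cp1252"))
        except UnicodeDecodeError:
            # Windows-1252 leaves a few bytes (such as 0x9D) undefined;
            # map those to the code point with the same numeric value.
            chars.append(chr(byte))
    return "".join(chars)


# Show the UTF-8 bytes of each symbol and how they read byte by byte.
for symbol in ("”", "™", "¥"):
    raw = symbol.encode("utf-8")
    print(symbol, raw.hex(" "), repr(decode_single_byte(raw)))
```

Under these assumptions, the results for ”™ and ¥ match the fragments in the expected string for recognize_utf8_text = False, which suggests those characters are stored in the sample file as raw UTF-8 bytes.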

See Also