HtmlSaveOptions

HtmlSaveOptions class

Save options for export to HTML format.

The HtmlSaveOptions type exposes the following members:

Constructors

| Name | Description |
| --- | --- |
| HtmlSaveOptions() | Initializes a new instance of the HtmlSaveOptions class. |
| HtmlSaveOptions(document_type) | Initializes a new instance of the HtmlSaveOptions class. |
| HtmlSaveOptions(fixed_layout) | Initializes a new instance of the HtmlSaveOptions class. |
| HtmlSaveOptions(document_type, fixed_layout) | Initializes a new instance of the HtmlSaveOptions class. |
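
The four constructor overloads differ only in which of `document_type` and `fixed_layout` are supplied. The following is a minimal sketch of choosing between them, not the library's own code. The helper `make_html_save_options` and the stand-in class `RecordingOptions` are hypothetical names introduced for illustration, and the stand-in exists only so the example runs without the library installed.

```python
class RecordingOptions:
    """Hypothetical stand-in that records constructor arguments.

    It is not part of any library; it only lets this sketch run on its own.
    """

    def __init__(self, *args):
        self.args = args


def make_html_save_options(options_cls, document_type=None, fixed_layout=None):
    """Call the constructor overload that matches the supplied arguments."""
    if document_type is None and fixed_layout is None:
        return options_cls()
    if fixed_layout is None:
        return options_cls(document_type)
    if document_type is None:
        return options_cls(fixed_layout)
    return options_cls(document_type, fixed_layout)


# "DOC_TYPE" is a placeholder, not a real HtmlDocumentType value.
print(make_html_save_options(RecordingOptions).args)                     # ()
print(make_html_save_options(RecordingOptions, fixed_layout=True).args)  # (True,)
print(make_html_save_options(RecordingOptions, "DOC_TYPE", False).args)  # ('DOC_TYPE', False)
```

With the library installed, the class to pass as `options_cls` would presumably be `aspose.pdf.HtmlSaveOptions`; that import path is an assumption, since this page does not state it.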

Properties

| Name | Description |
| --- | --- |
| warning_handler | Callback to handle any warnings generated. The WarningHandler returns a ReturnAction enum item specifying either Continue or Abort. Continue is the default action, in which case the save operation continues; the user may instead return Abort, in which case the save operation should cease. |
| save_format | Format in which the data is saved. |
| close_response | Gets or sets a Boolean value that indicates whether the Response object is closed after the document is saved into the response. |
| extract_ocr_sublayer_only | None |
| try_merge_adjacent_same_background_images | None |
| document_type | Gets or sets the HtmlDocumentType. |
| compress_svg_graphics_if_any | Gets or sets the flag that indicates whether SVG graphics found in the document (if any) are compressed (zipped) into SVGZ format during saving. |
| split_css_into_pages | When multipage mode is selected (i.e. split_into_pages is true), this attribute defines whether a separate CSS file is created for each resulting HTML page. By default this attribute is false, so one large common CSS file is created for all pages. The total size of the CSS files generated in per-page mode is usually much larger than the size of one common CSS file, because CSS classes are duplicated across the per-page files. This setting is therefore worth using only when you intend to process each HTML page independently, so that the CSS size of each individual page is the most critical issue. |
| split_into_pages | Gets or sets the flag that indicates whether each page of the source document is converted into its own target HTML document, i.e. whether the resulting HTML is split into several HTML pages. |
| explicit_list_of_saved_pages | With this property you can explicitly define which pages of the document are converted. Pages in this list must have 1-based numbers, i.e. valid page numbers lie in the range 1…[NumberOfPagesInConvertedDocument]. The order of pages in this list does not affect their order in the resulting HTML page(s); pages always appear in the order in which they occur in the source PDF. If this list is null (the default), all pages are converted. If any page number in this list is outside the range of existing pages, an exception is thrown. |
| fixed_layout | Gets or sets a value indicating whether the HTML is created as fixed layout. |
| image_resolution | Gets or sets the resolution for image rendering. |
| default_font_name | Specifies the name of an installed font used to substitute any document font that is not embedded and not installed in the system. If null, the default substitution font is used. |
| batch_size | Defines the batch size if batched conversion is applicable to the source and destination format pair. |
| font_sources | Font sources of pre-saved fonts. |
| additional_margin_width_in_points | When split_into_pages is false, the HTML representing all input PDF pages is not split into separate HTML pages but is written to one large HTML file. Each source PDF page is still represented by its own rectangular area in the HTML (if necessary, these areas can be given a border to show the paper edges, using the page_border_if_any attribute). This parameter defines the width of the margin that is always left around the HTML areas that represent pages of the source PDF document. In essence, it defines the guaranteed gap between the HTML representations of PDF "paper" pages in this conversion mode. |
| use_z_order | If use_z_order is set to true, graphics and text are added to the resulting HTML document according to their Z-order in the original PDF document. If this attribute is false, all graphics are placed in a single layer, which may cause unwanted effects for overlapping objects. |
| convert_marked_content_to_layers | If convert_marked_content_to_layers is set to true, all elements inside PDF marked content (a layer) are placed into an HTML div with a "data-pdflayer" attribute specifying the layer name. The layer name is extracted from the optional properties of the PDF marked content. If this attribute is false (the default), no layers are created from PDF marked content. |
| minimal_line_width | This attribute sets the minimal width of a graphic path line. If the thickness of a line is less than 1px, Adobe Acrobat rounds it to this value, so this attribute can be used to emulate that behavior in HTML browsers. |
| prevent_glyphs_grouping | This attribute switches on the mode in which text glyphs are not grouped into words and strings. This mode keeps maximum precision when positioning glyphs on the page and can be used to convert documents with music notes or glyphs that must be placed separately from each other. This parameter is applied only when fixed_layout is true. |
| simple_textbox_mode_grouping | This attribute specifies sequential grouping of glyphs and words into strings. It is useful, for example, when tags and words appear in a different order in the converted HTML and you want them to match. This parameter is applied only when fixed_layout is true. |
| flow_layout_paragraph_full_width | This attribute specifies full-width paragraph text for Flow mode (fixed_layout = false). |
| render_text_as_image | If render_text_as_image is set to true, text from the source becomes an image in the HTML. This may be useful to make text unselectable, or when HTML text is not rendered properly. |
| save_full_font | Indicates that the full font is saved; only TrueType fonts are supported. By default save_full_font is false and the converter saves only the subset of the original font needed to display the text of the document. |
| antialiasing_processing | This parameter defines the antialiasing measures required during conversion of compound background images from PDF to HTML. |
| save_transparent_texts | A PDF can contain transparent text that can be selected to the clipboard (this usually happens when the document contains images and OCR text extracted from them). This setting tells the converter whether to save such text as transparent selectable text in the resulting HTML. |
| save_shadowed_texts_as_transparent_texts | A PDF can contain text that is shadowed by other elements (e.g. by images) but can still be selected to the clipboard in Acrobat Reader (this usually happens when the document contains images and OCR text extracted from them). This setting tells the converter whether to save such text as transparent selectable text in the resulting HTML to mimic the behaviour of Acrobat Reader (otherwise such text is usually saved as hidden and is not available for copying to the clipboard). |
| font_saving_mode | Defines the font saving mode used when saving the PDF to the desired format. |
| page_border_if_any | This attribute represents the set of settings used for drawing a border (if any) in the resulting HTML document around the area that represents a source PDF page. In essence, it concerns showing the page's paper edges, not a page border defined in the PDF page itself. |
| page_margin_if_any | This attribute represents the set of extra page margins (if any) in the resulting HTML document around the area that represents a source PDF page. |
| letters_positioning_method | Sets the mode of positioning letters within words in the resulting HTML. |
| exclude_font_name_list | List of PDF embedded font names that are not embedded in the HTML. |
| special_folder_for_svg_images | Gets or sets the path to the directory where only SVG images are saved if they are encountered while saving the document as HTML. If the parameter is empty or null, SVG files (if any) are saved together with other image files (next to the output file) or in the special folder for images (if it is specified in the SpecialImagesFolderIfAny option). It has no effect if the CustomImageSavingStrategy property was successfully used to process the relevant image file. |
| special_folder_for_all_images | Gets or sets the path to the directory where any images are saved if they are encountered while saving the document as HTML. If the parameter is empty or null, image files (if any) are saved together with the other files linked to the HTML. It has no effect if the CustomImageSavingStrategy property was successfully used to process the relevant image file. |
| css_class_names_prefix | When the PDF-to-HTML converter generates the resulting CSS, class names (something like ".stl_01 {}" … ".stl_NN {}") are generated and used in it. This property lets you force a class name prefix. For example, if you want all class names to start with 'my_prefix_' (i.e. something like 'my_prefix_1' … 'my_prefix_NNN'), assign 'my_prefix_' to this property before conversion. If this property is left untouched (i.e. its value stays null), the converter generates the class names itself (something like ".stl_01 {}" … ".stl_NN {}"). |
| parts_embedding_mode | Defines whether referenced files (HTML, fonts, images, CSS) are embedded into the main HTML file or generated as separate binary entities. |
| html_markup_generation_mode | Sometimes there are specific requirements for the generation of HTML markup. This parameter defines the HTML preparation modes that can be used during conversion of PDF to HTML to meet such requirements. |
| raster_images_saving_mode | The converted PDF can contain raster images. This parameter defines how they are handled during conversion of PDF to HTML. |
| remove_empty_areas_on_top_and_bottom | Defines whether empty areas without any content at the top and bottom (if any) are removed from the created HTML. |
| font_encoding_strategy | Defines a special encoding rule to tune PDF decoding for the current document. |
| pages_flow_type_depends_on_viewers_screen_size | When split_into_pages is false, the HTML representing all input PDF pages is put into one large HTML file. This flag defines whether the resulting HTML is generated so that the flow of the areas representing PDF pages depends on the viewer's screen resolution. Suppose the viewer's screen is wide enough to fit two or more pages side by side. If this flag is set to true, that opportunity is used (as many pages as possible are shown side by side, and the next horizontal group of pages is shown below the first one). Otherwise, each page always goes below the previous one. |
| try_save_text_underlining_and_strikeouting_in_css | PDF itself does not contain underlining markers for text; underlining is emulated with a line placed under the text. This option lets the converter try to guess that a given line is a text underline and put this information into CSS instead of drawing the underline graphically. |
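
Several of the descriptions above say that a setting only applies when another property has a particular value (for example, only in fixed-layout mode or only in multipage mode). The sketch below collects a few of those documented dependencies and reports enabled settings whose prerequisite is not met. It is illustrative only: `DEPENDENCIES` and `ineffective_settings` are hypothetical names, the mapping of the names used in the descriptions (FixedLayout, SplitIntoPages, SplitOnPages) to the snake_case properties is an assumption, and a `SimpleNamespace` stands in for a real options object so the example runs without the library.

```python
from types import SimpleNamespace

# (setting, prerequisite property, value the prerequisite must have),
# taken from the property descriptions; not an exhaustive list.
DEPENDENCIES = [
    ("split_css_into_pages", "split_into_pages", True),
    ("prevent_glyphs_grouping", "fixed_layout", True),
    ("simple_textbox_mode_grouping", "fixed_layout", True),
    ("flow_layout_paragraph_full_width", "fixed_layout", False),
    ("pages_flow_type_depends_on_viewers_screen_size", "split_into_pages", False),
]


def ineffective_settings(options):
    """Return enabled settings whose documented prerequisite is not met."""
    ignored = []
    for setting, prerequisite, required in DEPENDENCIES:
        enabled = bool(getattr(options, setting, False))
        if enabled and bool(getattr(options, prerequisite, False)) != required:
            ignored.append(setting)
    return ignored


# Stand-in for an options object (assumption: real options expose the same names).
options = SimpleNamespace(
    split_into_pages=False,
    fixed_layout=False,
    split_css_into_pages=True,
    prevent_glyphs_grouping=True,
)
print(ineffective_settings(options))
# ['split_css_into_pages', 'prevent_glyphs_grouping']
```

A check like this can be run before conversion to catch settings that would otherwise have no visible effect.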

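The rules given for `explicit_list_of_saved_pages` (1-based numbers, list order ignored, null meaning all pages, an exception for out-of-range numbers) can be modelled in a few lines. This is a sketch of the documented behaviour, not the converter's implementation: `resolve_saved_pages` is a hypothetical helper, `ValueError` is an assumed exception type because the description does not name one, and collapsing duplicate page numbers is likewise an assumption.

```python
def resolve_saved_pages(explicit_pages, page_count):
    """Return the 1-based page numbers to convert, in source-document order."""
    if explicit_pages is None:
        # Default: every page of the document is converted.
        return list(range(1, page_count + 1))
    for number in explicit_pages:
        if not 1 <= number <= page_count:
            # Assumed exception type; the description only says "exception".
            raise ValueError(f"page {number} is outside 1..{page_count}")
    # The order of the list is ignored; pages follow the order of the source PDF.
    # Duplicates are collapsed here, which is an assumption.
    return sorted(set(explicit_pages))


print(resolve_saved_pages(None, 3))       # [1, 2, 3]
print(resolve_saved_pages([5, 2, 4], 6))  # [2, 4, 5]
```

Validating the list this way before assigning it to the property gives an early, readable error instead of a failure during conversion.
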
See Also