Re: Need paginated HTML output from DITA

Mark Giffin

Good point by Marijan. The area tree is an intermediate format that some PDF processors create before producing the PDF from it. I'm not sure that all PDF processors make the area tree available to you, so you will have to check. I think the free Apache FOP processor makes an area tree available.

So you may not need HTML5 Paged Media at all. You would figure out the structure of the area tree you are using and find where the page breaks are. Then generate HTML from the area tree and put your
at the page breaks.
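
To make the idea concrete, here is a rough Python sketch of walking an area tree and emitting HTML with a marker between pages. The sample XML is hypothetical and heavily simplified, and the element names (`pageViewport`, `block`, `text`) are assumptions about what the area tree looks like, so check them against what your processor actually writes. The `PAGE_BREAK` markup is only a placeholder for whatever element you want at the breaks.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified area tree. A real one has deeper nesting and many
# more attributes; the element names used here are assumptions.
SAMPLE_AREA_TREE = """\
<areaTree>
  <pageSequence>
    <pageViewport nr="1">
      <block><lineArea><text>First paragraph on page one.</text></lineArea></block>
      <block><lineArea><text>Second paragraph on page one.</text></lineArea></block>
    </pageViewport>
    <pageViewport nr="2">
      <block><lineArea><text>Paragraph that flowed to page two.</text></lineArea></block>
    </pageViewport>
  </pageSequence>
</areaTree>
"""

# Placeholder marker; substitute whatever markup you want at each page break.
PAGE_BREAK = '<div class="page-break" style="page-break-after: always"></div>'


def area_tree_to_html(xml_text: str) -> str:
    """Emit one <p> per block and a PAGE_BREAK between consecutive pages."""
    root = ET.fromstring(xml_text)
    pages = list(root.iter("pageViewport"))
    parts = []
    for index, page in enumerate(pages):
        # Assumes blocks are not nested; nested blocks would repeat their text.
        for block in page.iter("block"):
            text = " ".join("".join(block.itertext()).split())
            if text:
                parts.append(f"<p>{escape(text)}</p>")
        if index < len(pages) - 1:  # no marker after the last page
            parts.append(PAGE_BREAK)
    return "\n".join(parts)


if __name__ == "__main__":
    print(area_tree_to_html(SAMPLE_AREA_TREE))
```

This throws away all styling and only keeps text plus page boundaries. In practice you would probably want to carry identifiers from the DITA source through to the area tree so you can map each page break back onto richer HTML.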

Mark Giffin
Hi, Marijan and Mark...

Thanks for your replies. Marijan, what’s the area tree?

As far as browsers supporting Paged Media, really all we need is for the transform to generate a page # with a style attribute = "page-break-after", followed by an
, so it’s not really paged media in the HTML output. We can generate that code manually when we know we want a page break, but it’s the automatic pagination when a page is full that’s the tricky part.  
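
To show what I mean by the tricky part, here is a crude Python sketch of "break when the page is full". It assumes each block already has an estimated height in points, which is exactly the information a real formatter computes and we don't have. The `Block` class, the `paginate` function, and the `PAGE_BREAK` markup are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List

# Placeholder marker for a page break in the generated HTML (an assumption).
PAGE_BREAK = '<div class="page-break" style="page-break-after: always"></div>'


@dataclass
class Block:
    html: str
    height_pt: float  # estimated rendered height; must be supplied by the caller


def paginate(blocks: List[Block], page_height_pt: float,
             margin_top_pt: float, margin_bottom_pt: float) -> List[str]:
    """Greedily fill pages, inserting PAGE_BREAK before a block that won't fit."""
    usable = page_height_pt - margin_top_pt - margin_bottom_pt
    if usable <= 0:
        raise ValueError("margins leave no usable page height")
    out: List[str] = []
    used = 0.0
    for block in blocks:
        # Break only if something is already on the page; a block taller than
        # a whole page is placed alone rather than split.
        if used > 0 and used + block.height_pt > usable:
            out.append(PAGE_BREAK)
            used = 0.0
        out.append(block.html)
        used += block.height_pt
    return out


if __name__ == "__main__":
    # US Letter height (792pt) with 72pt top and bottom margins.
    blocks = [Block(f"<p>Paragraph {n}</p>", 300.0) for n in range(1, 4)]
    print("\n".join(paginate(blocks, 792.0, 72.0, 72.0)))
```

The hard part is getting trustworthy heights. That depends on fonts, line wrapping, and keep/orphan rules, which is why we are hoping an existing engine can do it for us.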



Output the area tree instead of the PDF and transform that to HTML. The area tree should be a close representation of the PDF output, if I remember correctly. Another way would be to transform the final PDF to XML and use that to create the HTML.

Marijan (Mario) Madunic

Prince PDF formatter and Antenna House Formatter both support HTML5 Paged Media. I've used Antenna House to create PDF with HTML/CSS instead of XML/FO. I'm sure there is a way to create HTML output instead of PDF. I think there is another HTML/CSS-to-PDF formatter from Europe, but I can't recall the name.

But if you are going to create HTML5 that uses Paged Media, what are you going to display it in? Last I checked, I could not find a browser that supported Paged Media.

Hi, all…

We need to be able to transform DITA to HTML, but with pagination. The pagination feature must be aware of page size and margins, and if there is really long narrative, must insert a page break when the page is full, like FOP would do with PDF. But we don’t want PDF output, we want HTML with pagination.

What we need, for example, might be an engine that uses CSS to transform XML to HTML, and supports the CSS paged media module (the “@page” specs). This would be like Oxygen’s Chemistry engine, but one that outputs HTML.

Does anyone know of such an engine? Thanks

