Re: “Generate template” in Dita PDF plugin generator #DITA-OT

Nicolas Delobel
 

Hi,

Thanks a lot for this information. It's very interesting!

Cheers,
Nicolas


Re: “Generate template” in Dita PDF plugin generator #DITA-OT

Jarno Elovirta
 

Hi,

The "Generate template" button is a new feature that hasn't been publicized because it's not really ready yet. It's part of https://www.dita-ot.org/plugins#!com.elovirta.pdf, which adds support for theming in the PDF2 plugin. So instead of passing various arguments to DITA-OT and customizing XSLT with a custom plugin, you can just pass a theme file with the `--theme` property. https://github.com/jelovirt/pdf-generator/wiki/Theme has some very preliminary docs about how to use it, but the whole tool is still under development and pre-1.0. Once it reaches a stable stage, 1.0 will be released, and hopefully this will become part of the built-in plugins in DITA-OT.
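
As a rough illustration of the invocation described above, here is a minimal Python sketch that assembles such a command line. It assumes the JSON produced by the button has been saved as theme.json, that the theming plugin is installed, and that it hooks into the stock `pdf` format; the file names are placeholders, and only `--theme` comes from this message.

```python
import shlex
from typing import List


def build_theme_command(input_map: str, theme_file: str, output_dir: str) -> List[str]:
    """Build a dita command line that passes a theme file via --theme."""
    return [
        "dita",
        f"--input={input_map}",
        "--format=pdf",            # assumption: the theme plugin extends the stock pdf format
        f"--theme={theme_file}",   # the property named in the message above
        f"--output={output_dir}",
    ]


# Placeholder file names, for illustration only.
command = build_theme_command("guide.ditamap", "theme.json", "out")
print(shlex.join(command))
```

Since the tool is pre-1.0, the exact invocation may change; the wiki page linked above is the place to check.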

Cheers,

Jarno

On Thu, 22 Jul 2021 at 11:46, Nicolas Delobel <nicolas.delobel@...> wrote:
Hello all,

I saw in the DITA PDF plugin generator (https://dita-generator.elovirta.com/) that a new “Generate template” button is now available.

Do you have any idea what this button is for, and how the JSON it produces can be used afterwards?

Thanks a lot.


“Generate template” in Dita PDF plugin generator #DITA-OT

Nicolas Delobel
 

Hello all,

I saw in the DITA PDF plugin generator (https://dita-generator.elovirta.com/) that a new “Generate template” button is now available.

Do you have any idea what this button is for, and how the JSON it produces can be used afterwards?

Thanks a lot.


Re: DITA and Security

Dan Caprioara
 

Since DITA-OT is a Java process, you can use the Java SecurityManager and policy file infrastructure to limit its access to the system resources (for example you could give it read access only to the folder that contains the topics and maps, or to connect to a specific host to get binary resources). 

Or, perhaps simpler, run DITA-OT in a Docker container and mount only the DITA source folders and the output folder in it. You can also impose network restrictions on the container, so the process inside it cannot connect to other hosts.
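
The container approach could be sketched roughly as follows, here as a small Python helper that assembles the `docker run` arguments. The image name is hypothetical, and the sketch assumes the image has `dita` on its PATH with no entrypoint set; the host paths and map name are placeholders.

```python
from typing import List


def docker_dita_command(source_dir: str, output_dir: str, map_name: str,
                        image: str = "example/dita-ot:3.6.1") -> List[str]:
    """Build a `docker run` command that confines DITA-OT to two folders."""
    return [
        "docker", "run", "--rm",
        "--network", "none",            # no network access from inside the container
        "-v", f"{source_dir}:/src:ro",  # DITA sources, mounted read-only
        "-v", f"{output_dir}:/out",     # the only writable host folder
        image,                          # hypothetical image name
        "dita", "-i", f"/src/{map_name}", "-f", "html5", "-o", "/out",
    ]


# Placeholder paths, for illustration only.
command = docker_dita_command("/data/docs", "/data/out", "guide.ditamap")
print(" ".join(command))
```

With `--network none` the transform cannot fetch remote resources at all; if binary resources must come from one specific host, a custom Docker network with firewall rules would be needed instead.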

Many regards,
Dan Caprioara


On 16 Jul 2021, at 17:14, despopoulos_chriss via groups.io <despopoulos_chriss@...> wrote:

So, my interest in this is whether one can identify the threat loopholes in a DITA document, and then include an automated scan for any of them in a file before committing it in a Docs-As-Code workflow. For example, if a given construct opens such a loophole, maybe you can specialize your topics to disallow it, and then simply validate against your specialization.

I understand that our situation is a bit unique in that we transform in the browser. If you transform before delivery, then the danger is in the transformed artifact, and you need to check it before publishing. In our case, we need to pre-empt that artifact by controlling the DITA, and controlling the transform. We can limit the possibilities since the transform can effectively white-list the constructs it will pass. We also use parameters when calling the transform, but again I believe the transform simply ignores unrecognized params, so you can't directly inject mischief that way. OTOH, we can pass variable values as params, and inject those values into a templated topic file. But the topic file must declare the placeholders for those values, the transform must recognize the placeholders, and it must have the params already declared in the transform file. So you have to know all these things. If you get past that, you might be able to pass malicious code in a param... But the transform can limit what it produces when it expands the param in a waiting placeholder. So I need to get my head around this...

The billion laughs attack is pretty funny!  Interestingly enough, MS Edge does not allow a doc declaration in the XML it receives...  Calls that a security violation.  We had to abandon entities a long time ago, in favor of references.


Re: DITA and Security

despopoulos_chriss
 

So, my interest in this is whether one can identify the threat loopholes in a DITA document, and then include an automated scan for any of them in a file before committing it in a Docs-As-Code workflow. For example, if a given construct opens such a loophole, maybe you can specialize your topics to disallow it, and then simply validate against your specialization.

I understand that our situation is a bit unique in that we transform in the browser. If you transform before delivery, then the danger is in the transformed artifact, and you need to check it before publishing. In our case, we need to pre-empt that artifact by controlling the DITA, and controlling the transform. We can limit the possibilities since the transform can effectively white-list the constructs it will pass. We also use parameters when calling the transform, but again I believe the transform simply ignores unrecognized params, so you can't directly inject mischief that way. OTOH, we can pass variable values as params, and inject those values into a templated topic file. But the topic file must declare the placeholders for those values, the transform must recognize the placeholders, and it must have the params already declared in the transform file. So you have to know all these things. If you get past that, you might be able to pass malicious code in a param... But the transform can limit what it produces when it expands the param in a waiting placeholder. So I need to get my head around this...

The billion laughs attack is pretty funny!  Interestingly enough, MS Edge does not allow a doc declaration in the XML it receives...  Calls that a security violation.  We had to abandon entities a long time ago, in favor of references.
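
A pre-commit scan of the kind discussed in this message might start out like the sketch below. It is only a text-level check in Python; the pattern list is an illustrative assumption, not a complete catalogue of risky DITA constructs, and a real check would more likely validate against a constrained document type as suggested above.

```python
import re
from typing import List

# Illustrative patterns only; each maps a label to a regex searched in the raw file text.
RISKY_PATTERNS = {
    "entity declaration": re.compile(r"<!ENTITY\b"),
    "CDATA section": re.compile(r"<!\[CDATA\["),
    "script-like markup": re.compile(r"<\s*script\b", re.IGNORECASE),
    "javascript: URL": re.compile(r"javascript\s*:", re.IGNORECASE),
}


def scan_dita_text(text: str) -> List[str]:
    """Return the labels of all risky patterns found in the given DITA source text."""
    return [label for label, pattern in RISKY_PATTERNS.items() if pattern.search(text)]


suspicious = '<topic id="t"><body><p><![CDATA[<script>alert(1)</script>]]></p></body></topic>'
clean = '<topic id="t"><title>Safe</title></topic>'

print(scan_dita_text(suspicious))  # ['CDATA section', 'script-like markup']
print(scan_dita_text(clean))       # []
```

Scanning the raw text rather than parsing it means the check itself never expands entities, so it is safe to run on untrusted files.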


Oxygen XML - Refactoring action to convert DITA 1.3 topics and maps to DITA 2.0

Radu Coravu
 

Hi everyone,

For those who want to explore the DITA 2.0 standard, we created a free Oxygen XML Editor add-on containing a refactoring action that converts DITA 1.3 topics and maps to the DITA 2.0 standard. The project, along with the installation and usage instructions, can be found here:

https://github.com/oxygenxml/dita_1_3_to_2_x_convert

Any contributions to the project (pull requests, issues) are, as always, welcome.

Regards,

Radu

Radu Coravu
Oxygen XML Editor


Re: DITA and Security

Don Day
 

Hi, Chris. Good question!

Because DITA as XML supports general entity resolution in the document instance, it is vulnerable to a denial-of-service class of exploit called Billion Laughs (see the Wikipedia link below). But since DITA uses transclusion instead of entities for content references, this type of exploit seems unlikely, although it might be introduced in manipulated DTD mod files, which DO depend on entity definitions and resolutions. I would think that other schema formats would be much harder to manipulate to become memory bombs like Billion Laughs. Check the references in the article linked below for possibly related threat vectors in XML processing systems. Your web server application is more likely to be compromised by cross-site scripting attacks sent through form data, as in a DITA transform that generates an HTML input field tied to an unprotected POST or GET handler.

https://en.wikipedia.org/wiki/Billion_laughs_attack
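
To see why nested entities act as a memory bomb, the expanded size can be worked out without expanding anything. The Python sketch below uses the figures from the example in the linked article: a base entity containing "lol" and nine further levels, each referencing the previous level ten times.

```python
def expanded_size(levels: int, fanout: int, base_text: str) -> int:
    """Characters produced when the top-level entity is fully expanded.

    Each level references the level below it `fanout` times, so the
    total grows as fanout ** levels times the length of the base text.
    """
    return len(base_text) * fanout ** levels


size = expanded_size(levels=9, fanout=10, base_text="lol")
print(size)  # 3000000000 characters from a document of under a kilobyte
```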
--
Don Day

On 7/14/2021 9:36 AM, despopoulos_chriss via groups.io wrote:
I don't know if this is even a reasonable question, but...  Has anybody looked at ways to sneak security violations into DITA?  For example, sneaking in JavaScript as CDATA into a topic?  Does DITA just allow CDATA?  Are there other ways to sneak in bad things that might make it through a transform to HTML?  I'm looking for ways to prove that has not happened to my topics.  I think that is becoming important for Docs-As-Code workflows.


Re: DITA and Security

Nicholas Mucks
 

Metadata in the prolog does pass through to HTML, but the HTML plugin would probably need additional templates to transform that into a script tag for JavaScript. You could certainly unintentionally pass internal information out to customers if your writers apply metadata that will appear in the output (like author) or your plugins include processing instructions in the actual output (like some sort of track changes).

Take care,
- Nick

Sent from mobile

On Jul 14, 2021, at 11:36 AM, Mica Semrick <mica@...> wrote:

Certainly links to external websites/documents/etc could make it through the publish and into your production content.

On July 14, 2021 7:36:06 AM PDT, "despopoulos_chriss via groups.io" <despopoulos_chriss@...> wrote:
I don't know if this is even a reasonable question, but...  Has anybody looked at ways to sneak security violations into DITA?  For example, sneaking in JavaScript as CDATA into a topic?  Does DITA just allow CDATA?  Are there other ways to sneak in bad things that might make it through a transform to HTML?  I'm looking for ways to prove that has not happened to my topics.  I think that is becoming important for Docs-As-Code workflows.


Re: DITA and Security

Mica Semrick
 

Certainly links to external websites/documents/etc could make it through the publish and into your production content.


On July 14, 2021 7:36:06 AM PDT, "despopoulos_chriss via groups.io" <despopoulos_chriss@...> wrote:
I don't know if this is even a reasonable question, but...  Has anybody looked at ways to sneak security violations into DITA?  For example, sneaking in JavaScript as CDATA into a topic?  Does DITA just allow CDATA?  Are there other ways to sneak in bad things that might make it through a transform to HTML?  I'm looking for ways to prove that has not happened to my topics.  I think that is becoming important for Docs-As-Code workflows.


Re: DITA and Security

ekimber@contrext.com
 

I’m not sure that literal JavaScript in HTML content can pose a security problem without something to run it, which means some JavaScript loaded by the browser, which means something referenced from the generated HTML or otherwise loaded separately from the DITA itself.

 

If you’re processing the DITA XML directly in the browser and your XML processor does DTD resolution and entity expansion, then external entity expansion is always a potential issue, although I’m not sure it should be a danger in a browser (because browsers don’t normally have access to a user’s file system). External entity expansion is usually a vulnerability on servers (i.e., a processor expands an entity pointing to “/etc/passwd” or similar, putting the content of that file into the generated result).

 

Otherwise, I would think any DITA-carried vulnerability would be a function of the processing applied to it in the browser, which is ultimately the responsibility of the implementor of that processing (i.e., not blindly processing DITA content, just as you would not blindly process user-supplied values, as in the typical SQL injection attack).

 

If you are generating your HTML then you have complete control over what gets generated and can avoid any potential data issues in the HTML. If you’re generating the HTML in the browser from the base DITA then it’s a function of the in-browser processing. But even in that case I would expect the DITA provided to the browser to have gone through some preprocess, for example to normalize the XML (to remove the need for DTD resolution), resolve conrefs, apply filtering, and otherwise scrub the data for public delivery.
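
A scrubbing preprocess of the kind described above could look roughly like this Python sketch. It assumes the XML has already been normalized (no DOCTYPE, no entities to resolve); the whitelist of element names is purely illustrative, and the root element is taken on trust.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist; a real one would come from your document type.
ALLOWED_TAGS = {"topic", "title", "body", "p", "ul", "li", "xref"}


def scrub(element: ET.Element) -> ET.Element:
    """Remove non-whitelisted descendants and javascript: hrefs, in place."""
    for child in list(element):            # copy, because we mutate while iterating
        if child.tag not in ALLOWED_TAGS:
            element.remove(child)          # also drops any text trailing the removed element
            continue
        href = child.get("href", "")
        if href.strip().lower().startswith("javascript:"):
            del child.attrib["href"]
        scrub(child)
    return element


doc = ET.fromstring(
    '<topic><title>T</title><body><p>ok</p>'
    '<foreign><script>x</script></foreign>'
    '<xref href="javascript:alert(1)">link</xref></body></topic>'
)
result = ET.tostring(scrub(doc), encoding="unicode")
print(result)
# <topic><title>T</title><body><p>ok</p><xref>link</xref></body></topic>
```

Working from a whitelist rather than a blacklist means that constructs nobody thought about are dropped by default.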

 

Cheers,

Eliot

--
Eliot Kimber
http://contrext.com

From: <main@dita-users.groups.io> on behalf of "despopoulos_chriss via groups.io" <despopoulos_chriss@...>
Reply-To: <main@dita-users.groups.io>
Date: Wednesday, July 14, 2021 at 9:36 AM
To: <main@dita-users.groups.io>
Subject: [dita-users] DITA and Security

 

I don't know if this is even a reasonable question, but...  Has anybody looked at ways to sneak security violations into DITA?  For example, sneaking in JavaScript as CDATA into a topic?  Does DITA just allow CDATA?  Are there other ways to sneak in bad things that might make it through a transform to HTML?  I'm looking for ways to prove that has not happened to my topics.  I think that is becoming important for Docs-As-Code workflows.


DITA and Security

despopoulos_chriss
 

I don't know if this is even a reasonable question, but...  Has anybody looked at ways to sneak security violations into DITA?  For example, sneaking in JavaScript as CDATA into a topic?  Does DITA just allow CDATA?  Are there other ways to sneak in bad things that might make it through a transform to HTML?  I'm looking for ways to prove that has not happened to my topics.  I think that is becoming important for Docs-As-Code workflows.


Is it legal to post a job here? #jobhunting

despopoulos_chriss
 

I hope so...  We have an opening at Turbonomic Inc.  The position is for a senior tech writer, but we're a DITA shop, so it might spark your interest...

The doc team is small and DIY. We use oXygen, Git, and our own publishing chain, which uses transforms in the browser. AFAIK, we're compliant with the DITA spec, but we do unusual things. I recently gave a talk at CDIM about how we single-source our topics into the product GUI. I've also given talks about how we work with Jira to generate DITA release notes, and about the overall architecture of our approach. We need somebody who can think things through and constantly improve them.

The technology that we document is automatic management of VM and cloud infrastructure to guarantee application performance.  You will learn about VMs, storage, cloud, networks, Kubernetes...  Lots of stuff.  This is a challenging position for any writer.

If any of this interests you, please take a look at the posting:
https://www.turbonomic.com/company/careers/open-roles/?gh_jid=5375421002


Re: The current open repository for tools for generating DTD and XSD

Kristen James Eberlein
 

Sorry, folks; finger fumble on my part. The e-mail was meant for the DITA Technical Committee.

Best,
Kris

Kristen James Eberlein
Chair, OASIS DITA Technical Committee
OASIS Distinguished Contributor
Principal consultant, Eberlein Consulting LLC
www.eberleinconsulting.com
+1 919 622-1501; kriseberlein (skype)


On 7/13/2021 8:05 AM, Kristen James Eberlein via groups.io wrote:

https://github.com/oasis-open/dita-rng-converter

Here is the current description:

OASIS TC Open Repository: The DITA RNG Converter provides cross-platform tools for generating DITA-conforming DTD- and XSD-format versions of RELAX NG DITA grammars: document type shells, vocabulary modules, and constraint modules. It makes it as easy as possible to develop and maintain DITA grammars by allowing use of RELAX NG syntax.

We had talked about repurposing? broadening? the purpose of this repo to include generating (monolithic, rather than modular) XSDs from RNG.

Chet, can we simply change the description of the repo to meet new needs?

--
Best,
Kris

Kristen James Eberlein
Chair, OASIS DITA Technical Committee
OASIS Distinguished Contributor
Principal consultant, Eberlein Consulting LLC
www.eberleinconsulting.com
+1 919 622-1501; kriseberlein (skype)




The current open repository for tools for generating DTD and XSD

Kristen James Eberlein
 

https://github.com/oasis-open/dita-rng-converter

Here is the current description:

OASIS TC Open Repository: The DITA RNG Converter provides cross-platform tools for generating DITA-conforming DTD- and XSD-format versions of RELAX NG DITA grammars: document type shells, vocabulary modules, and constraint modules. It makes it as easy as possible to develop and maintain DITA grammars by allowing use of RELAX NG syntax.

We had talked about repurposing? broadening? the purpose of this repo to include generating (monolithic, rather than modular) XSDs from RNG.

Chet, can we simply change the description of the repo to meet new needs?

--
Best,
Kris

Kristen James Eberlein
Chair, OASIS DITA Technical Committee
OASIS Distinguished Contributor
Principal consultant, Eberlein Consulting LLC
www.eberleinconsulting.com
+1 919 622-1501; kriseberlein (skype)



Re: Font substitution in PDF SVGs?

Nicholas Mucks
 

Hi Glenn,
Thanks! This is also good information.

Take care,
- Nick

Sent from mobile

On Jul 9, 2021, at 6:59 AM, glenn emerson via groups.io <gemerso1=icloud.com@groups.io> wrote:

Depending on what editor you use to create your SVGs you may be able to solve your font problems during SVG creation.

Windows comes with a few fonts that are licensed for embedding. When you create a PDF and preserve the exact appearance of your original document, this is done by embedding the font or a subset in the output PDF.

The same is happening with those SVGs in your PDF. Antenna House and RenderX respect font licenses. They look at the license bit on the font file to see if it is enabled for embedding.

With Windows, I believe that TrueType (now OpenType) fonts Arial, Calibri, Courier New, and Times New Roman can all be embedded without a license purchase.

So if you use one of those fonts in your SVGs at creation, you will not have substitution problems on the FO rendering to PDF.

A further note: Adobe Illustrator used to default to PostScript fonts (Arial.ps1 instead of Arial.ttf), so it was imperative to change the default settings for new files in Illustrator to use a licensed TTF font.

If you want to purchase a font for embedding (they are software, after all), you can go to several online type foundries. If you start with the Microsoft Typography site, you can get a more detailed explanation of how font licensing works and there are many fonts there to choose from.

__________________________
Glenn Emerson
gemerso1@icloud.com
584-732-6984
On Jul 9, 2021, at 8:28 AM, Nicholas Mucks via groups.io <urbanrobots=yahoo.com@groups.io> wrote:

Hi Radu,
Thank you! We’ll experiment with these approaches for handling SVG edits in the PDF output.

Take care,
- Nick

Sent from mobile

On Jul 8, 2021, at 9:42 PM, Radu Coravu <radu_coravu@sync.ro> wrote:
Hi Nick,

Besides fixing up the problem in the original SVGs, probably you need an Ant task to copy the original SVGs to some temporary files folder, modify them there and with XSLT replace the references to the SVGs in the topic.fo so that they refer to the temporary modified files.

You can also embed SVG directly inside the topic.fo file:

https://xmlgraphics.apache.org/fop/dev/fo/embedding.fo.pdf

so with an XSLT stage, for each image reference you could replace it with the embedded SVG content, content which could also be modified.

Regards,
Radu

Radu Coravu
Oxygen XML Editor

On 7/9/21 02:01, Nicholas Mucks via groups.io wrote:
Hi!
For HTML we run an ANT target against the output directory to replace certain strings in all svgs with another string. It’s just one older font name to another.

What’s the best way to accomplish this task in PDF? We notice that the topic.fo file references the source file locations, not temp or output locations. In testing we modified our ANT target to use the source folder instead of the output folder and although it works we don’t want to edit the source files.



Take care,
- Nick

Sent from mobile


Re: Migrating from DITA OT 2.5.4 to 3.6.1

Lief Erickson
 

Take a look at this page from the DITA-OT documentation:


When migrating customizations, identify the version of the toolkit you're currently using (base version) and the version of the toolkit you want to migrate to (target version). Then, review all of the migration changes described in all of the versions from the base through the target. For instance, if you're currently on 2.2 and want to move to 3.3, you should review all of the changes in 2.3 through 3.3. You may want to start at the oldest version and read forward so you can chronologically follow the changes, since it is possible that files or topics have had multiple changes.

-Lief

On Fri, Jul 9, 2021 at 9:45 AM <stefan.theisen@...> wrote:
Are there short instructions on how to migrate from DITA OT 2.5.4 to 3.6.1? The folder structure has changed and I'm not sure which files can be kept and where to copy the new files (especially the Java library). When I run the Oxygen XML Webhelp (Responsive) Publication I get errors that several Java files cannot be properly executed.


Migrating from DITA OT 2.5.4 to 3.6.1

stefan.theisen@...
 

Are there short instructions on how to migrate from DITA OT 2.5.4 to 3.6.1? The folder structure has changed and I'm not sure which files can be kept and where to copy the new files (especially the Java library). When I run the Oxygen XML Webhelp (Responsive) Publication I get errors that several Java files cannot be properly executed.


Re: Font substitution in PDF SVGs?

glenn emerson
 

Depending on what editor you use to create your SVGs you may be able to solve your font problems during SVG creation.

Windows comes with a few fonts that are licensed for embedding. When you create a PDF and preserve the exact appearance of your original document, this is done by embedding the font or a subset in the output PDF.

The same is happening with those SVGs in your PDF. Antenna House and RenderX respect font licenses. They look at the license bit on the font file to see if it is enabled for embedding.

With Windows, I believe that TrueType (now OpenType) fonts Arial, Calibri, Courier New, and Times New Roman can all be embedded without a license purchase.

So if you use one of those fonts in your SVGs at creation, you will not have substitution problems on the FO rendering to PDF.

A further note: Adobe Illustrator used to default to PostScript fonts (Arial.ps1 instead of Arial.ttf), so it was imperative to change the default settings for new files in Illustrator to use a licensed TTF font.

If you want to purchase a font for embedding (they are software, after all), you can go to several online type foundries. If you start with the Microsoft Typography site, you can get a more detailed explanation of how font licensing works and there are many fonts there to choose from.

__________________________
Glenn Emerson
gemerso1@icloud.com
584-732-6984

On Jul 9, 2021, at 8:28 AM, Nicholas Mucks via groups.io <urbanrobots=yahoo.com@groups.io> wrote:

Hi Radu,
Thank you! We’ll experiment with these approaches for handling SVG edits in the PDF output.

Take care,
- Nick

Sent from mobile

On Jul 8, 2021, at 9:42 PM, Radu Coravu <radu_coravu@sync.ro> wrote:

Hi Nick,

Besides fixing up the problem in the original SVGs, probably you need an Ant task to copy the original SVGs to some temporary files folder, modify them there and with XSLT replace the references to the SVGs in the topic.fo so that they refer to the temporary modified files.

You can also embed SVG directly inside the topic.fo file:

https://xmlgraphics.apache.org/fop/dev/fo/embedding.fo.pdf

so with an XSLT stage, for each image reference you could replace it with the embedded SVG content, content which could also be modified.

Regards,
Radu

Radu Coravu
Oxygen XML Editor

On 7/9/21 02:01, Nicholas Mucks via groups.io wrote:
Hi!
For HTML we run an ANT target against the output directory to replace certain strings in all svgs with another string. It’s just one older font name to another.

What’s the best way to accomplish this task in PDF? We notice that the topic.fo file references the source file locations, not temp or output locations. In testing we modified our ANT target to use the source folder instead of the output folder and although it works we don’t want to edit the source files.



Take care,
- Nick

Sent from mobile


Re: Font substitution in PDF SVGs?

Nicholas Mucks
 

Hi Radu,
Thank you! We’ll experiment with these approaches for handling SVG edits in the PDF output.

Take care,
- Nick

Sent from mobile

On Jul 8, 2021, at 9:42 PM, Radu Coravu <radu_coravu@sync.ro> wrote:

Hi Nick,

Besides fixing up the problem in the original SVGs, probably you need an Ant task to copy the original SVGs to some temporary files folder, modify them there and with XSLT replace the references to the SVGs in the topic.fo so that they refer to the temporary modified files.

You can also embed SVG directly inside the topic.fo file:

https://xmlgraphics.apache.org/fop/dev/fo/embedding.fo.pdf

so with an XSLT stage, for each image reference you could replace it with the embedded SVG content, content which could also be modified.

Regards,
Radu

Radu Coravu
Oxygen XML Editor

On 7/9/21 02:01, Nicholas Mucks via groups.io wrote:
Hi!
For HTML we run an ANT target against the output directory to replace certain strings in all svgs with another string. It’s just one older font name to another.

What’s the best way to accomplish this task in PDF? We notice that the topic.fo file references the source file locations, not temp or output locations. In testing we modified our ANT target to use the source folder instead of the output folder and although it works we don’t want to edit the source files.



Take care,
- Nick

Sent from mobile


Re: Font substitution in PDF SVGs?

Radu Coravu
 

Hi Nick,

Besides fixing up the problem in the original SVGs, probably you need an Ant task to copy the original SVGs to some temporary files folder, modify them there and with XSLT replace the references to the SVGs in the topic.fo so that they refer to the temporary modified files.
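
The copy-and-modify step could be sketched as below, in Python rather than Ant purely for illustration. The function name and the font names in the usage note are hypothetical, and the sketch assumes the temporary folder lies outside the source folder; rewriting the references in topic.fo would still be a separate XSLT step.

```python
from pathlib import Path
from typing import Dict


def copy_svgs_with_font_fix(source_dir, temp_dir, old_font: str, new_font: str) -> Dict[Path, Path]:
    """Copy every SVG under source_dir into temp_dir, replacing one font name.

    The originals are never modified. Returns a mapping from each original
    path to its modified copy, which a later step could use to rewrite the
    image references in topic.fo.
    """
    source = Path(source_dir)
    target = Path(temp_dir)
    mapping: Dict[Path, Path] = {}
    for svg in source.rglob("*.svg"):
        dest = target / svg.relative_to(source)      # keep the folder layout
        dest.parent.mkdir(parents=True, exist_ok=True)
        text = svg.read_text(encoding="utf-8")
        dest.write_text(text.replace(old_font, new_font), encoding="utf-8")
        mapping[svg] = dest
    return mapping
```

A call such as `copy_svgs_with_font_fix("images", "temp/svg", "OldFont", "NewFont")` would leave the files in `images` untouched and write the patched copies under `temp/svg`.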

You can also embed SVG directly inside the topic.fo file:

https://xmlgraphics.apache.org/fop/dev/fo/embedding.fo.pdf

so with an XSLT stage, for each image reference you could replace it with the embedded SVG content, content which could also be modified.

Regards,
Radu

Radu Coravu
Oxygen XML Editor

On 7/9/21 02:01, Nicholas Mucks via groups.io wrote:
Hi!
For HTML we run an Ant target against the output directory to replace certain strings in all SVGs with another string. It’s just one older font name to another.

What’s the best way to accomplish this task in PDF? We notice that the topic.fo file references the source file locations, not temp or output locations. In testing we modified our Ant target to use the source folder instead of the output folder, and although it works, we don’t want to edit the source files.



Take care,
- Nick

Sent from mobile


