Allow variables to be referenced from @scope="peer" maps


Chris Papademetrious
 

Hi everyone,

This is an FYI for an enhancement request I filed for the DITA-OT, in case anyone else would find it useful too:

Allow variables to be referenced from @scope="peer" maps #3685

The text is as follows:

Description

I have two books, bookA and bookB. bookA references bookB as a @scope="peer" map:

<map>
  <title>Book A Map</title>
  <mapref href="bookB.ditamap" scope="peer" keyscope="bookB" processing-role="resource-only"/>
  <!--          ^^^^^^^^^^^^^         ^^^^            ^^^^^ -->
  ...
</map>


In a topic in bookA, I would like to reference a "Product" variable from bookB:

<p>BookB provides more information on <ph keyref="bookB.Product"/>.</p>
<!--                                              ^^^^^^^^^^^^^ -->

Unfortunately, @scope="peer" suppresses all bookB content from the bookA transformation - topic content and variable definitions.

This enhancement requests that keyscoped @scope="peer" maps pull their keys (but not topics) from the referenced map into the current map.

To test this, unarchive the following testcase:

#### ditaot_keys_from_peer_books.zip ####

then run

dita -i bookA.ditamap -f html5 -o out

Possible Solution

I tried implementing a proof-of-concept solution, which you can see in the testcase directory as follows:

diff maprefImpl.xsl.ORIG maprefImpl.xsl.NEW

The idea was to track when we're in a @scope="peer-with-keys" context and, while in that context, keep all mapref contents except topic references. Unfortunately, even this naive attempt did not work: the variables still did not resolve, and bookB topics still showed up in the bookA output directory. (You will need to change @scope="peer" to @scope="peer-with-keys" to try this proof-of-concept on the provided testcase.)

Potential Alternatives

I am aware of warehouse topics, shared key definition files, etc. However, those are not a good fit for this need. This enhancement is for the case where we're already referencing a peer book and that book already defines the variables we want (for its own use), and we simply want to access them as well.


ekimber@contrext.com
 

To implement this you have to do all the preprocessing on all the peer maps referenced so that you know what the key space in each referenced peer map is.

That would be very inefficient in a naïve implementation because every time you ran map A that references peer B you'd also have to process peer B.

An obvious solution is to implement some kind of "key service" that maintains a persistent resolved key space for each map such that a processor can just ask for the resolved value of a given key in a given map. Of course, there are a number of potential issues here, including: how do you know when to update your persistent store? How do you handle runtime conditions, in particular, active filtering settings (DITAVAL)?

You start to see that this is the domain of content management where you have systems that have general knowledge of bodies of DITA content, track their status, versions, etc.

A relatively simplistic key service probably wouldn't be that difficult to implement (and my now-fallow DITA for Small Teams link manager application was working towards being a key service, among other things), but it is probably way outside the scope of the Open Toolkit project.

I've thought for a while that it would be useful to have a general key access API that defines the minimum features for DITA key stores. Given such an API, a tool like Open Toolkit could make general key store requests that are then resolved against whatever key store you happen to have made available, if any. A REST API could be implemented in any number of common technologies (Java, JavaScript, XQuery, etc.).
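
To make the idea concrete, here is a rough sketch of what a minimum key access interface might look like, written in Python purely for illustration. Everything in it is hypothetical: the names KeyStore, InMemoryKeyStore, publish, and resolve are invented here, and no such API exists in Open Toolkit. It assumes a key space is identified by a map plus the DITAVAL file in effect, since filtering can change which key definition wins.

```python
from typing import Optional, Protocol


class KeyStore(Protocol):
    """Hypothetical minimum interface for a DITA key store."""

    def resolve(self, map_uri: str, key: str,
                ditaval: Optional[str] = None) -> Optional[str]:
        """Return the resolved value of `key` in the key space of `map_uri`,
        or None if the key is unknown under the given filtering conditions."""
        ...


class InMemoryKeyStore:
    """Toy implementation backed by a dict; a real store might sit behind REST."""

    def __init__(self):
        # One key space per (map, DITAVAL) pair, because active filtering
        # can change which definition of a key is the effective one.
        self._spaces = {}

    def publish(self, map_uri, keys, ditaval=None):
        """Record the resolved key space produced by processing a map."""
        self._spaces[(map_uri, ditaval)] = dict(keys)

    def resolve(self, map_uri, key, ditaval=None):
        """Look up a key; unknown maps, conditions, or keys yield None."""
        return self._spaces.get((map_uri, ditaval), {}).get(key)
```

A processor would call something like store.resolve("bookB.ditamap", "Product") when it meets a keyref into a peer scope. The questions of when to republish a key space and how many DITAVAL combinations to keep are exactly the management issues described above, and this sketch does not answer them.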

Another brute force solution would be to have Open Toolkit write out an easy-to-access representation of a given publication's key space whenever it's processed (or on request as a runtime option). When resolving peer references, it could then try to find the key space data file and use it if found. This would leave it up to users to make sure that the key spaces were up to date: basically, processing every publication once and then processing them a second time to ensure they've used up-to-date key spaces from any peer publications.
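
The data-file approach could be sketched roughly as follows, again in Python and again hypothetical. The file name pattern (bookB.keyspace.json next to bookB.ditamap), the JSON layout, and the function names are all assumptions made for this example; Open Toolkit does not write such a file today. The staleness check compares modification times only, which is a crude stand-in for real dependency tracking.

```python
import json
from pathlib import Path


def key_space_file(map_path):
    """Hypothetical convention: bookB.ditamap -> bookB.keyspace.json."""
    p = Path(map_path)
    return p.with_name(p.stem + ".keyspace.json")


def write_key_space(map_path, keys):
    """Write the resolved keys of a publication next to its map."""
    out = key_space_file(map_path)
    payload = {"map": Path(map_path).name, "keys": keys}
    out.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return out


def lookup_peer_key(peer_map_path, key):
    """Resolve `key` from a peer's key-space file, if one is found.

    Returns (value, is_stale). is_stale is True when the peer map was
    modified after the key-space file was written.
    """
    data_file = key_space_file(peer_map_path)
    if not data_file.exists():
        return None, False
    stale = Path(peer_map_path).stat().st_mtime > data_file.stat().st_mtime
    keys = json.loads(data_file.read_text(encoding="utf-8"))["keys"]
    return keys.get(key), stale
```

A stale result is still returned rather than discarded, which leaves the "process everything twice" decision with the user, as described above.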

Essentially, using peer maps takes you from having a collection of otherwise-independent publications to having a system of interdependent publications. As soon as you have that, you have unavoidable management problems inherent in trying to produce deliverables from these interdependent publications. These problems are solvable, but they require some kind of content management system that maintains knowledge about all the publications, their components, their versions, and their relationships to each other, and that provides processing optimizations (indexes, caches, etc.).

Or you have to create ad-hoc solutions that take advantage of simplifying assumptions you can make about your content or constraints you can impose on your content (and its authors). For example, for your purposes, it might be sufficient to have some kind of script that reads peer maps and generates sets of key definitions to be used in the referencing publications as local-scope key definitions. That's relatively easy to do but you then have to keep these generated key definition sets up to date as the publications are modified.
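
A minimal sketch of such a script, in Python, under loudly stated simplifying assumptions: no DITAVAL filtering, no nested key scopes, no submaps inside the peer map, and peer references written literally as mapref elements (elements are matched by name because DTD-defaulted @class values are not available without loading the DTDs). The function name peer_keydefs and the choice of a topicgroup wrapper to recreate the key scope are illustrative choices for this example, not anything the Open Toolkit provides.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def peer_keydefs(map_path):
    """Build a <map> of key definitions copied from each keyscoped peer map."""
    map_path = Path(map_path)
    out = ET.Element("map")
    for ref in ET.parse(map_path).getroot().iter("mapref"):
        if ref.get("scope") != "peer" or not ref.get("keyscope"):
            continue
        # Recreate the key scope so that bookB's "Product" key is
        # addressable as "bookB.Product" from the referencing map.
        group = ET.SubElement(out, "topicgroup", keyscope=ref.get("keyscope"))
        peer_root = ET.parse(map_path.parent / ref.get("href")).getroot()
        for el in peer_root.iter():
            if el.get("keys") is None:
                continue
            keydef = ET.SubElement(group, "keydef", keys=el.get("keys"))
            # Copy only the metadata (the variable text); the href is
            # deliberately dropped so no peer topics are pulled in.
            for meta in el.findall("topicmeta"):
                keydef.append(meta)
    return out
```

The result can be written out with ET.ElementTree(peer_keydefs("bookA.ditamap")).write("bookA-peer-keys.ditamap", encoding="utf-8", xml_declaration=True), where the output file name is made up, and then referenced from bookA as a local map. Keeping that generated file current is still your problem.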

Cheers,

E.

--
Eliot Kimber
http://contrext.com


On 1/21/21, 5:39 PM, "Chris Papademetrious" <main@dita-users.groups.io on behalf of chrispitude@gmail.com> wrote:

Allow variables to be referenced from @scope="peer" maps #3685 <https://github.com/dita-ot/dita-ot/issues/3685>


David Hollis
 

Hi Chris, Eliot,

It's obvious that the two products and their documentation share an environment because there is a mapref from one to the other. With that in mind, I'm struggling to understand why you are not keen to use one of the alternatives that you mention.

It might seem like a good idea for each product to manage its own keys and content, but surely that leads to product silos? One of the strengths of DITA is that it breaks silos down.

I'm also a little concerned that the solution Eliot suggests could lead to quite complicated key processing?

Maybe I'm missing something?

David


Chris Papademetrious
 

Hi Eliot, David,

Thanks to both of you for sharing your thoughts!

Some background information... We use cross-book links in our content. On the authoring side, we have a compiled Perl script (accessible as an external tool in Oxygen) that populates cross-book link target text in the DITA source files. On the production side, we use a variant of the following plugin to late-resolve links in the final HTML output:

https://github.com/chrispy-snps/DITA-fix-xbook-html-links

Thus, many of our books already reference other books as peer maps.

A writer asked me how to get the title of a peer-referenced book, something like this:

<p>See <ph keyref="bookB.BookTitle"/> for more information.</p>

I thought it would be neat to define a BookTitle variable in every book, even using its value as the <bookmap> title so the title is single-sourced everywhere from the variable. Clever me! But this is when I found out that variables in peer maps are not accessible.

Could I solve this by defining every book title in a shared "booktitles" map, then referencing that map from every book? Well sure I guess, but yuck. Every book has a title, I just want to get to it.

Right now DITA offers:

  • @scope="peer" maprefs (ignores map+topic content)
  • @scope="local" maprefs (processes map+topic content)

Functionally, I want something between these. I want to process a peer book's map structure but not its topic content, with the exception of preserving topic <title> elements to enable processing to populate cross-book link target text.

This would elegantly solve some challenges we have with cross-book links to topics with conditional-content titles, as then the DITA-OT profiling behaviors simply play out in the peer maps and resolve the content for us. Currently our Perl script flags these for human intervention. It would also avoid stale target text, which happens when a target topic title changes but the writer of the referencing topic doesn't know to rerun the Perl script.

Would this increase processing time? Sure, but I'd gladly exchange that for the flow simplification. And if I can figure out how to ignore <body> elements during peer-map topic read-in, the vast bulk of the peer map content would never even make it into memory; we'd basically just have skeleton peer maps available to resolve references, and the processing overhead should be minimal.
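
As a standalone illustration of the "skip the <body>" idea (outside the DITA-OT, which has its own Java-based topic readers), here is a small Python sketch that pulls a topic's title with an incremental parser and stops as soon as the title is complete. The function name topic_title is made up for this example, and it ignores profiling entirely, so conditional-content titles would come back unfiltered.

```python
import xml.etree.ElementTree as ET


def topic_title(topic_path):
    """Return the text of the first <title> in a topic file, or None."""
    with open(topic_path, "rb") as f:
        for _event, elem in ET.iterparse(f, events=("end",)):
            if elem.tag == "title":
                # Returning here abandons the incremental parse. The file
                # is read in chunks, so for a large topic everything after
                # the chunk containing the title is never read.
                return "".join(elem.itertext()).strip()
    return None
```

For small topics the whole file fits in the first chunk, so the saving only matters for large topics; the point is that title extraction does not require building the full topic tree.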

I'm going to try hacking my way through this to see how far I get. I will definitely follow up here with my progress, and I will likely follow up with questions. :)

 - Chris


ekimber@contrext.com
 

This is an example of where a local, targeted solution is relatively easy (just a bit of preprocessing or relying on knowledge of cases that won't occur in your content) but the general solution would be hard (because it has to handle all cases, perform well, be extensible, report exception conditions clearly, etc.).

It's not hard to implement simple processes that operate on maps and do specific things, either using the output of the OT preprocessing stage or just operating on the maps directly if you know you don't need to worry about stuff like filtering, metadata propagation, etc.

The DITA Community utilities area on GitHub has general XSLT scripts for operating on maps, including resolving key references and resolving topicrefs to topics. With that code as a base, extracting topic titles or book titles should be about as easy as it can be. https://github.com/dita-community/dita-utilities/tree/master/src/xslt

Then, as you say, it's a matter of optimizing performance: do you do the processing every time or cache results? If you cache results, how do you keep the cache up to date?

Cheers,

E.

--
Eliot Kimber
http://contrext.com


On 1/25/21, 8:06 AM, "Chris Papademetrious" <main@dita-users.groups.io on behalf of chrispitude@gmail.com> wrote:

Hi Eliot, David,

Thanks to both of you for sharing your thoughts!
