
Con(key)reffing terms: uppercase - lowercase


Yves Barbion
 

Hi group

Suppose you wish to reuse a term, for example "color definitions". The term can, of course, occur anywhere in a sentence, including at the beginning, where it needs to start with a capital letter.

What's the best practice in this case? Having the term twice in the "warehouse topic": once with uppercase "C" and once with lowercase?

Or are there any better solutions?

Thanks

Yves


Tom Magliery
 

As a rule of thumb, when something feels like a hack on the markup side, I think to myself "Could/should this be handled on the processing side?" So how about having logic in the transformation that fixes uncapitalized letters in terms at beginnings of sentences? This sounds like kinda no-fun logic to write in a transformation, and that's with me not even knowing if it's even less fun in any languages besides English. So it's just something to wonder.
 
<sentence>Option 2: What if every sentence were its own element?</sentence> <sentence>Then it would be much easier to write the necessary logic in the transformation.</sentence> <sentence><term>color definitions</term> are so cool.</sentence> <sentence>Ok, I'm totally just kidding about Option 2.</sentence> <sentence>Please don't really try this one at home.</sentence>
 
mag
 

On 2020-08-19 05:46, Yves Barbion wrote:




Jarno Elovirta
 

Hi,

I've written a DITA-OT implementation for a customer where if you had markup such as

<p><keyword keyref="color-def"/> blaa blaa <keyword keyref="color-def"/>. <keyword keyref="color-def"/> blaa.</p>

When the keyrefs were processed, this got transformed into something like

<p><keyword><?keyref?>color definitions</keyword> blaa blaa <keyword><?keyref?>color definitions</keyword>. <keyword><?keyref?>color definitions</keyword> blaa.</p>

So the processing instruction <?keyref?> is used to mark that something comes from a keyref. Then another step goes through all those processing instructions and tries to guess whether the first character of the keyref content should be uppercased. The rules were simple:
  • Is this the first character in a block element?
  • Is this character preceded by a period, exclamation mark, or question mark followed by one or more whitespace characters?
This would result in

<p><keyword><?keyref?>Color definitions</keyword> blaa blaa <keyword><?keyref?>color definitions</keyword>. <keyword><?keyref?>Color definitions</keyword> blaa.</p>

This worked well enough, but you could add more complex rules using this same approach.

So you can solve this problem on the processing side.
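The two rules above can be sketched in a few lines of Python. This is only an illustration, not the DITA-OT step described in this message: it assumes the block element has already been flattened into (text, from_keyref) pairs, and the names capitalize_keyref_text, SENTENCE_END and segments are made up for the example.

```python
import re

# Assumption: a sentence ends with ".", "!" or "?" followed by whitespace.
SENTENCE_END = re.compile(r"[.!?]\s+$")


def capitalize_keyref_text(segments):
    """Return the text of one block element, uppercasing the first
    character of keyref-resolved segments where the rules say so.

    `segments` is a hypothetical flattened view of the block: a list of
    (text, from_keyref) pairs in document order.
    """
    out = []
    preceding = ""  # all text seen so far in this block
    for text, from_keyref in segments:
        if from_keyref and text:
            # Rule 1: nothing but whitespace precedes it in the block.
            at_block_start = preceding.strip() == ""
            # Rule 2: preceded by sentence-ending punctuation and whitespace.
            after_sentence_end = SENTENCE_END.search(preceding) is not None
            if at_block_start or after_sentence_end:
                text = text[0].upper() + text[1:]
        out.append(text)
        preceding += text
    return "".join(out)


paragraph = [
    ("color definitions", True),
    (" blaa blaa ", False),
    ("color definitions", True),
    (". ", False),
    ("color definitions", True),
    (" blaa.", False),
]
print(capitalize_keyref_text(paragraph))
# Color definitions blaa blaa color definitions. Color definitions blaa.
```

Abbreviations such as "e.g. " would also trigger the second rule, which is the kind of edge case that would need the more complex rules mentioned above.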

Jarno



Nicholas Mucks
 

Hi Yves,
We added custom processing on keyword, term, and abbreviated-form that transforms the first letter to uppercase if the writer sets outputclass = cap_first_letter.

We decided to let the writer decide instead of doing this automatically, because all the edge cases would make the processing more complex. It also means writers can use the same keyreffed element but control the output depending on where the element appears in context.
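For illustration only, here is a rough Python sketch of the writer-controlled approach. It is not our actual processing code: it assumes the keyrefs have already been resolved to text, and the function name apply_cap_first_letter and the sample paragraph are invented for the example.

```python
import xml.etree.ElementTree as ET

CAP_CLASS = "cap_first_letter"
TARGETS = {"keyword", "term", "abbreviated-form"}


def apply_cap_first_letter(root):
    """Uppercase the first letter of targeted elements that carry the
    outputclass value; leave every other element untouched."""
    for elem in root.iter():
        classes = elem.get("outputclass", "").split()
        if elem.tag in TARGETS and CAP_CLASS in classes and elem.text:
            elem.text = elem.text[0].upper() + elem.text[1:]
    return root


# Assumed input: key references already resolved to their text.
source = (
    '<p><keyword outputclass="cap_first_letter">color definitions</keyword> '
    'are reusable. Edit the <keyword>color definitions</keyword> here.</p>'
)
result = ET.tostring(apply_cap_first_letter(ET.fromstring(source)), encoding="unicode")
print(result)
```

Only the first keyword is capitalized, because only that one carries the outputclass value. The second keyword comes from the same key and stays lowercase.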

Take care,
- Nick


On Oct 17, 2020, at 8:21 AM, Jarno Elovirta <jelovirt@...> wrote:




Joe Russo
 

From my experience working with content that is translated, use conref for whole sentences, proper names that aren't translated, or snippets like button names. Don't use conrefs for text that is part of the grammar of the sentence.
"A color definition", "the color definition", "the color definitions" could all be translated and capitalized differently in different contexts.

As an example, in Dutch, "a system" is "en system" but "the system" is "systemet". This means I can't use conrefs for text like "Thanks for buying an <ph>XYZ system</ph>. The <ph>XYZ system</ph> is the best in the industry!".

Even if your content stays in only one language, be aware of what happens if you replace product names for an OEM. "Thanks for buying an <ph>XYZ system</ph>." breaks if you change the name to PDQ system, because the article "an" no longer fits.
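A tiny Python sketch makes the OEM problem concrete. The template string and the render function are hypothetical stand-ins for a sentence with a conreffed product name.

```python
# Hypothetical stand-in for a sentence that pulls in a reused product name.
TEMPLATE = "Thanks for buying an {product}. The {product} is the best in the industry!"


def render(product):
    """Substitute the product name without touching the surrounding grammar."""
    return TEMPLATE.format(product=product)


print(render("XYZ system"))
print(render("PDQ system"))  # the article "an" is now wrong
```

The article lives outside the reused phrase, so swapping the name cannot correct it.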


jang
 

Sorry, that is not Dutch, but Danish. I believe that is a common confusion. The country is just as flat and has as many bicycles. But the Dutch phrases would be ‘een systeem’ and ‘het systeem’.

Just sayin'

Jang 

Smart Information Design
Amsterdam, Netherlands
Cell: +31 646 854 996

On 19 Oct 2020, 19:21 +0200, Joe Russo <dita.dude@...>, wrote:


Joe Russo
 

Thanks Jang! Silly mistake on my part. As an avid cyclist, who hasn't been to either, I hope to make it to both some day!