#DITA-OT


matthias.menn@...
 

Hi there,

We are currently adding Indonesian, Korean, and Vietnamese as new languages to our CMS. Does anybody have experience with index sorting for these languages? In particular, do the default configuration files in the i18n plugin of the DITA-OT generally work fine for them, or is customization required?

Thanks,
Matthias


Mica Semrick
 

Hi, you'll want to make sure that you're correctly using the <index-sort-as> element in your topics, as this mechanism will sort all your index terms correctly.
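
For illustration only, here is a minimal Python sketch of what <index-sort-as> does: each <indexterm> keeps its display text, while the nested <index-sort-as> text, when present, is what the entry is sorted by. The topic fragment, the Japanese terms and readings, and the index_entries helper are all hypothetical, and the plain code-point sort merely stands in for whatever collation the index processor really applies.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic fragment; terms and readings are illustrative only.
TOPIC = """
<topic id="sample">
  <title>Sample</title>
  <prolog><metadata><keywords>
    <indexterm>東京<index-sort-as>とうきょう</index-sort-as></indexterm>
    <indexterm>大阪<index-sort-as>おおさか</index-sort-as></indexterm>
    <indexterm>京都<index-sort-as>きょうと</index-sort-as></indexterm>
  </keywords></metadata></prolog>
</topic>
"""

def index_entries(topic_xml):
    """Return (display_text, sort_text) pairs for every indexterm found."""
    root = ET.fromstring(topic_xml)
    entries = []
    for term in root.iter("indexterm"):
        display = (term.text or "").strip()
        sort_as = term.find("index-sort-as")
        # Fall back to the display text when no sort-as text is supplied.
        if sort_as is not None and sort_as.text:
            sort_text = sort_as.text.strip()
        else:
            sort_text = display
        entries.append((display, sort_text))
    return entries

# Sort by the sort-as text (plain code-point order, an assumption of this sketch).
entries = sorted(index_entries(TOPIC), key=lambda e: e[1])
for display, sort_text in entries:
    print(display, sort_text)
```

In this sketch the entries come out in the order of their readings rather than in the order of the written forms, which is the effect the element is meant to provide.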

-m


On August 26, 2021 8:34:02 AM PDT, "matthias.menn via groups.io" <matthias.menn@...> wrote:


Toshihiko Makita
 

Hi Matthias,

Using <index-sort-as> for these languages is not a good idea, because it is mainly used for entering the readings of Japanese <indexterm> elements and, apart from a few exceptions, it is not applicable to other languages in real-world publishing.
Instead, our product, I18n Index Library, supports Indonesian ("id"), Korean ("ko"), and Vietnamese ("vi") and offers an out-of-the-box index sorting solution.
Refer to the following URL for details:
https://www.antennahouse.com/i18n-index-library

If you have any questions regarding this product, please let me know.

Regards,
-- 
/*----------------------------------------------------------------------
 Toshihiko Makita
 Development Group. Antenna House, Inc. Ina Branch
 Web site:
 http://www.antenna.co.jp/
 http://www.antennahouse.com/
 ----------------------------------------------------------------------*/ 


ekimber@contrext.com
 

The DITA Community project https://github.com/dita-community/org.dita-community.i18n is (was) my attempt to implement general I18N support for Open Toolkit, including both locale-aware sorting and grouping as well as other locale-specific features (such as line breaking and word detection).
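
To make "locale-aware sorting and grouping" concrete, here is a small Python sketch for Vietnamese. It is not how the plugin works (that relies on ICU4J); it is a hand-rolled illustration under stated assumptions: the alphabet order and tone order below are assumed, letters outside that alphabet simply sort first, and the names vi_sort_key and group_index are hypothetical.

```python
import unicodedata
from itertools import groupby

# Assumed Vietnamese alphabet order; letters with breve, circumflex, or horn
# count as separate letters, and so does "đ".
VI_ALPHABET = unicodedata.normalize("NFC", "aăâbcdđeêghiklmnoôơpqrstuưvxy")
# Assumed tone order after the unmarked tone:
# grave, hook above, tilde, acute, dot below.
TONE_MARKS = "\u0300\u0309\u0303\u0301\u0323"

def letters_with_tones(term):
    """Yield (letter, tone_rank) pairs; tone marks are split off the letter."""
    letter, tone = None, 0
    for ch in unicodedata.normalize("NFD", term.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS.index(ch) + 1
        elif unicodedata.combining(ch) and letter is not None:
            # Breve, circumflex, and horn are part of the letter itself.
            letter = unicodedata.normalize("NFC", letter + ch)
        else:
            if letter is not None:
                yield letter, tone
            letter, tone = ch, 0
    if letter is not None:
        yield letter, tone

def vi_sort_key(term):
    """Letters decide the order first; tones only break ties."""
    primary, secondary = [], []
    for letter, tone in letters_with_tones(term):
        pos = VI_ALPHABET.find(letter)
        # Characters outside the assumed alphabet sort before it, by code point.
        primary.append((1, pos) if pos >= 0 else (0, ord(letter[0])))
        secondary.append(tone)
    return primary, secondary

def group_index(terms):
    """Sort non-empty terms and group them under their first letter."""
    def first_letter(term):
        return next(letters_with_tones(term))[0].upper()
    ordered = sorted(terms, key=vi_sort_key)
    return [(label, list(items)) for label, items in groupby(ordered, key=first_letter)]

terms = ["đá", "dạ", "ông", "ăn", "an"]
for label, items in group_index(terms):
    print(label, items)
```

A plain code-point sort would place "ông" before "ăn" and "đá"; the key above keeps each accented letter in its assumed alphabet position and gives it its own group heading.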

 

It also includes an open-source Simplified Chinese dictionary-based collator, which offers an open-source alternative to Antenna House’s licensed Simplified Chinese collator (definitely buy theirs if you have the budget).

 

Unfortunately, life took some turns and I haven’t been able to maintain this project for the last several years, so it might need a little attention. The main issue I was running into (and that *should* be solved) is the automatic registration of Java extension functions with Saxon through Open Toolkit. It worked in my local development environment but not in other environments, and then I had to put it down and never came back to it.

 

If it’s not working, it shouldn’t take much to make it work: it just needs somebody who can attend to the Java-and-OT details. Otherwise, the processing should all be solid as far as the grouping, sorting, and other ICU4J-based stuff goes.

 

Cheers,

E.

--
Eliot Kimber
http://contrext.com

 

 

 



teamwis
 

In real life, I doubt how passionate people are about sorting or
indexing in PDF. To be honest, at Sony Ericsson, the Antenna House
I18N utility was implemented in the DITA-OT toolchain from within SDL
Trisoft or Tridion Docs, but nobody seemed to know what value indexing
brings. By the way, at that time Sony Ericsson translated into up to
56 different languages, including Japanese, Korean, Thai, and both
Traditional and Simplified Chinese.

In other words, even with AHF, you will probably end up doing
something that adds no value. And in a regulated industry such as the
one Varian is in, it is really uncommon for a regulation, if there is
any, to require PDF indexing for a specific language.

my2cents.

On 8/28/21, ekimber@contrext.com <ekimber@contrext.com> wrote:








--
Keep an Exacting Eye for Detail


 

The way most (automated) indexing tools work, you are right: indexing done that way just repeats words from titles or body text and adds little value over a well-organised TOC. Done right, indexing requires an indexer who understands the subject matter and the possible target audiences and who adds index words that are NOT in the running text, such as synonyms in particular contexts (which are almost impossible to inject using automated tools). This also means that indexes cannot simply be translated, as each language group will have different synonyms and idioms that the indexing needs to handle.

Sorting a translated index is the least of your problems with indexing.

Smart Information Design
Amsterdam, Netherlands
Cell: +31 646 854 996

On 30 Aug 2021, 08:45 +0200, teamwis <dfanster@...>, wrote: