<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">

<chapter id="ch-summit">
<title>Summitting Translation Branches</title>

<para>Computer programs (though not only them) are sometimes concurrently developed and released from several <emphasis>branches</emphasis>. For example, there may be one "stable" branch, which sees only small fixes and from which periodic releases are made, and another, "development" branch, which undergoes larger changes and may or may not be periodically released as well; at some point, the development branch will become the new stable branch, and the old stable branch will be abandoned. There may also be more than two branches which see active work, such as "development", "stable", and "old stable".</para>

<para>From the programmers' point of view, working by branches can be very convenient. They can freely experiment with new features in the development branch, without having to worry that they will mess something up in the stable branch, from which periodic releases are made. In the stable branch they may fix some bugs discovered between the releases, or carry over some important and well-tested features from the development branch. For users who want to be on the cutting edge, they may provide experimental releases from the development branch.</para>

<para>For translators, however, having to deal with different branches of the same collection of PO files is rarely a convenience. It is text to be translated just like any other, only duplicated across two or more file hierarchies. This means that translators additionally have to think about how to make sure that new and modified translations made in one branch appear in other branches too. It gets particularly ugly if there are mismatches in PO file collections in different branches, like when a PO file is renamed, split into two or more PO files, or merged into another PO file.<footnote>
<para>One may think of relying upon the translation memory: translate only PO files from one branch, and batch-apply translation memory to PO files in other branches, accepting only exact matches. This is dangerous, because short messages may need different translations in different PO files, resulting in hilarious mistranslations.</para>
</footnote> Sometimes this branch juggling is not necessary; in a strict two-branch setting, translators may choose to work only on the stable branch, and switch to the next stable branch when it gets created (or switch to the development branch shortly before it becomes stable). Even so, branch switching may not go very smoothly in the presence of mismatches in PO file collections.</para>

<para>Instead, it would be most convenient for translators to work on a single "supercollection" of PO files, from which new and modified translations would be periodically and automatically sent to the appropriate PO files in branches. Such a supercollection can be created and maintained by Pology's <command>posummit</command> script. In terms of this script, the supercollection is called the <emphasis>summit</emphasis>, the operation of creating and updating it is called <emphasis>gathering</emphasis>, and the operation of filling out branch PO files is called <emphasis>scattering</emphasis>.</para>

<para>What do summit PO files look like? When all branches contain the same PO file, the counterpart summit PO file is simply the union of all messages from the branch PO files. A message in the summit PO file differs from branch messages only by having the special <literal>#. +> ...</literal> comment, which lists the branches that contain this message. If there were two branches, named with the <literal>devel</literal> and <literal>stable</literal> keywords, an excerpt from a summit PO file could be:
<programlisting language="po">
#. +> devel
#: kdeui/jobs/kwidgetjobtracker.cpp:469
msgctxt "the destination URL of a job"
msgid "Destination:"
msgstr ""

#. +> stable
#: kdeui/jobs/kwidgetjobtracker.cpp:469
msgid "Destination:"
msgstr ""

#. +> devel stable
#: kdeui/jobs/kwidgetjobtracker.cpp:517
msgid "Keep this window open after transfer is complete"
msgstr ""
</programlisting>
The first message above exists only in the development branch, the second only in the stable branch, and the third in both branches. The source reference always refers to the source file in the first listed branch. Any other extracted comments (<literal>#.</literal>) are also taken from the first listed branch.</para>
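
<para>The branch-listing comment is simple enough to read programmatically. The following is a minimal sketch and not part of Pology: <function>summit_branches</function> is a hypothetical helper that scans the raw lines of one message for the <literal>#. +></literal> comment and returns the listed branch keywords, assuming they are separated by whitespace as in the excerpt above.

```python
def summit_branches(message_lines):
    """Return the branch keywords from the '#. +> ...' comment, if any."""
    prefix = "#. +>"
    for line in message_lines:
        if line.startswith(prefix):
            # Everything after the marker is a whitespace-separated list.
            return line[len(prefix):].split()
    return []  # assumption of this sketch: no marker means no branches


message = [
    "#. +> devel stable",
    "#: kdeui/jobs/kwidgetjobtracker.cpp:517",
    'msgid "Keep this window open after transfer is complete"',
    'msgstr ""',
]
print(summit_branches(message))  # ['devel', 'stable']
```

The first returned keyword is the branch from which the source reference and the other extracted comments were taken.</para>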

<para>Note that the first two messages differ only by context. The context was added in the development branch, but not in the stable branch, probably in order not to break the message freeze. However, due to the special ordering of messages in summit PO files, these two messages appear together, allowing the translator to immediately make the correction in the stable branch too, if the new context in the development branch shows it to be necessary.</para>

<para>When a PO file from one branch has a different name in another branch, or several PO files from one branch are represented by a single PO file in another branch, the summit can still handle it gracefully, by manually <emphasis>mapping</emphasis> branch PO files to summit PO files. One branch PO file can be mapped to one or more summit PO files, and several branch PO files can be mapped to one summit PO file. Usually, but not necessarily, one branch (e.g. the development branch) is taken as the reference for the summit file organization, and stray PO files from other branches are mapped accordingly.</para>

<para>If a team of translators works in the summit, it is sufficient that one team member (and possibly another one as backup) manages the summit. After the initial setup, this team member should periodically run <command>posummit</command> to update summit and branch PO files. All other team members can simply translate the summit PO files, oblivious to any summit operations behind the scenes. It is also possible for team members to perform summit operations on their own, on a subset of PO files that they are about to work on. It is up to the team to agree upon the most convenient workflow.</para>

<!-- ======================================== -->
<sect1 id="sec-susetup">
<title>Setting Up The Summit with <command>posummit</command></title>

<para>There are two major parts in setting up the summit: linking locations and organization of PO files in the branches to that of the summit, and deciding what summit <emphasis>mode</emphasis> will be used.</para>

<para>Great flexibility is possible in linking branches to the summit, but at the expense of possibly heavy configuring. To make it simpler, currently there are two types of branch organization which can be handled automatically, just by specifying a few paths and options. In the <emphasis>by-language</emphasis> branch organization, PO files in branches are grouped by language and their file names reflect their domain names:
<screen>
devel/                  # development branch
    aa/                 # language A
        alpha.po
        bravo.po
        charlie.po
        ...
    bb/                 # language B
        alpha.po
        bravo.po
        charlie.po
        ...
    ...
    templates/          # templates
        alpha.pot
        bravo.pot
        charlie.pot
        ...
stable/                 # stable branch
    aa/
        ...
    bb/
        ...
    templates/
        ...
...
</screen>
The other organization that can be automatically handled is <emphasis>by-domain</emphasis>:
<screen>
devel/                  # development branch
    alpha/              # domain alpha
        aa.po           # language A
        bb.po           # language B
        ...
        alpha.pot       # template
    bravo/
        aa.po
        bb.po
        ...
        bravo.pot
    charlie/
        aa.po
        bb.po
        ...
        charlie.pot
    ...
stable/                 # stable branch
    alpha/
        ...
    bravo/
        ...
    charlie/
        ...
...
</screen>
In both organizations, there can be any number of subdirectories in the middle, between the branch top directory and the directory where the PO files are. For example, in by-language organization there could be some categorization:
<screen>
path/to/devel/
    aa/
        utilities/
            alpha.po
            bravo.po
            ...
        games/
            charlie.po
            ...
    bb/
        ...
</screen>
while in by-domain organization the domain directories could be within their respective sources<footnote>
<para>Unfortunately, the following common organization cannot be automatically supported:
<screen>
path/to/devel/
    appfoo/
        src/
        doc/
        po/
            aa.po
            bb.po
            ...
            # no template!
        ...
    appbar/
        ...
</screen>
The problem is that there is no way to determine domain names from the file tree alone, and that different handling would be required for sources which actually have multiple PO domains.</para>
</footnote>:
<screen>
devel/
    appfoo/
        src/
        doc/
        po/
            foo/
                aa.po
                bb.po
                ...
                foo.pot
            libfoo/
                aa.po
                bb.po
                ...
                libfoo.pot
        ...
    appbar/
        ...
</screen>
</para>
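
<para>The difference between the two organizations comes down to where the domain name is read from. The sketch below illustrates this with a hypothetical helper, <function>branch_domain</function>, which is not part of Pology: without a language code it treats the path as by-language (domain taken from the file name), and with a language code it treats the path as by-domain (domain taken from the parent directory name). The function name and its <literal>lang</literal> parameter are assumptions made for this illustration only.

```python
import posixpath


def branch_domain(po_path, lang=None):
    """Guess the PO domain of a branch file from its path alone.

    lang=None: by-language tree, the domain is the file name.
    lang="aa": by-domain tree, the domain is the parent directory name,
               and files of other languages yield None.
    """
    name, ext = posixpath.splitext(posixpath.basename(po_path))
    if ext != ".po":
        return None  # templates and other files are ignored in this sketch
    if lang is None:
        return name
    if name != lang:
        return None
    return posixpath.basename(posixpath.dirname(po_path))


print(branch_domain("devel/aa/utilities/alpha.po"))              # alpha
print(branch_domain("devel/appfoo/po/libfoo/aa.po", lang="aa"))  # libfoo
```

Applied to the unsupported layout from the footnote, <literal>branch_domain("path/to/devel/appfoo/po/aa.po", lang="aa")</literal> returns <literal>po</literal>, which is not a domain name; this is the ambiguity the footnote describes.</para>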

<para>There are three possible summit modes: direct summit, summit over dynamic templates, and summit over static templates. In the <emphasis>direct summit</emphasis>, only branch PO files are processed: new and modified messages are gathered from them, and summit translations are scattered to them. In <emphasis>summit over dynamic templates</emphasis>, messages from branch PO files are gathered only once, at creation of the summit; after that, it is branch templates (POT files) that are gathered into summit templates, and then summit PO files are merged with them. Summit templates are not actually seen; they are gathered internally when the merge command is issued and removed after merging is done. <emphasis>Summit over static templates</emphasis> is quite the same, except that summit templates are explicitly gathered and kept around, and merging is done separately.</para>

<para>What is the reason for having three summit modes to choose from? Direct summit mode is there because it is the easiest to explain and understand, and does not require that branches contain templates. It is however not recommended, for two reasons. Firstly, someone may mistakenly translate directly in a branch<footnote>
<para>New translations do not have to appear in branches only by mistake. For example, some external sources, which have been translated elsewhere, may be integrated into the project.</para>
</footnote>, and those translations may be silently gathered into the summit. This is bad for quality control (review, etc.), as it is expected that the summit is the sole source of translations. Secondly, you may want to perform some automatic modifications on translations when scattering, but not to get those modifications back into the summit on gathering, which would happen with direct summit. These issues are avoided by using summit over dynamic templates, though now branches must provide templates. Finally, summit over static templates makes sense when several language teams share the summit setup: since gathering is the most complicated operation and sometimes requires manual intervention, it can be done once (by one person) on summit templates, while language teams can then merge and scatter their summits in a fully automatic fashion.</para>

<para>There is one important design decision which holds for all summit modes: all summit PO files must be <emphasis role="strong">unique by domain name</emphasis> (i.e. base file name without extension), even if they are in different subdirectories within the summit top directory. This in turn means that in automatically supported branch organizations (by-domain and by-language) PO domains should be unique as well.<footnote>
<para>More precisely, if there are two same-name PO domains inside one branch, they will both be gathered into the same summit PO file. The assumption is that PO files with same domain names have mostly common messages.</para>
</footnote> This was done for two reasons. Less importantly, it is convenient to be able to identify a summit PO file simply by its domain name rather than the full path (especially in some <command>posummit</command> invocations). More importantly, uniqueness of domain names allows PO files to be located in different subdirectories in different branches. This happens, for example, in large projects in which code moves between modules. If branches do not satisfy this property, i.e. they contain same-name PO domains with totally different content, it is necessary to define a <emphasis>path transformation</emphasis> (see <xref linkend="sec-sustpptransf"/>) which will produce unique domain names with respect to the summit.</para>
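
<para>Before setting up a summit, it may help to check whether a branch already satisfies the uniqueness requirement. The following is a minimal sketch for a by-language branch tree, written for Python 3; <function>duplicate_domains</function> is a hypothetical helper, not a Pology function, and the small directory tree is created only for the demonstration.

```python
import os
import tempfile
from collections import defaultdict


def duplicate_domains(topdir):
    """Map each domain name found more than once under topdir to its paths."""
    found = defaultdict(list)
    for root, dirs, files in os.walk(topdir):
        for name in files:
            if name.endswith(".po"):
                # Domain name is the base file name without the extension.
                found[name[:-3]].append(os.path.join(root, name))
    return {dom: sorted(paths) for dom, paths in found.items()
            if len(paths) > 1}


# Demonstration on a throwaway tree with one clashing domain name.
with tempfile.TemporaryDirectory() as top:
    for sub in ("chapter1", "chapter2"):
        os.makedirs(os.path.join(top, sub))
        open(os.path.join(top, sub, "intro.po"), "w").close()
    open(os.path.join(top, "alpha.po"), "w").close()
    clashes = duplicate_domains(top)

print(sorted(clashes))  # ['intro']
```

Any domain reported here would either be gathered into one summit PO file, as the footnote explains, or needs a path transformation.</para>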

<para>The following sections describe how to set up each of the modes, in each of the outlined branch organizations. They should be read in turn up to the mode that you want to use, because they build upon each other.</para>

<sect2 id="sec-sustpdirect">
<title>Setting Up Direct Summit</title>

<para>Let us assume that branches are organized by-language, that branch top directories are in the same parent directory, and that you want the summit top directory to be on the level of branch parent directory. That is:
<screen>
branches/
    devel-aa/
        alpha.po
        bravo.po
        ...
    stable-aa/
        alpha.po
        bravo.po
        ...
summit-aa/
    alpha.po
    bravo.po
    ...
    summit-config
</screen>
<literal>aa</literal> is the language code, which can be added for clarity, but is not necessary. It could also be a subdirectory, as in <filename>branches/devel/aa</filename> and <filename>summit/aa</filename>. At the start you have the <filename>branches/</filename> directory ready; now you create the <filename>summit-aa/</filename> directory, and within it the summit configuration file <filename>summit-config</filename> with the following content:
<programlisting language="python">
S.lang = "aa"

S.summit = dict(
    topdir=S.relpath("."),
)

S.branches = [
    dict(id="devel",
         topdir=S.relpath("../branches/devel-aa")),
    dict(id="stable",
         topdir=S.relpath("../branches/stable-aa")),
    # ...and any other branches.
]

S.mappings = [
]
</programlisting>
This is all that is necessary to set up a direct summit. The configuration file must be named exactly <filename>summit-config</filename>, because <command>posummit</command> will look for a file with that name through parent directories and automatically pick it up. As you may have recognized, <filename>summit-config</filename> is actually a Python source file; <command>posummit</command> will insert the special <literal>S</literal> object when evaluating <filename>summit-config</filename>, and it is through this object that summit options are set. <literal>S.lang</literal> states the language code of the summit. <literal>S.summit</literal> is a Python dictionary that holds options for the summit PO files (here only its location, through the <literal>topdir=</literal> key), while <literal>S.branches</literal> is a list of dictionaries, each specifying options per branch (here the branch identifier by the <literal>id=</literal> key and the top directory). The <function>S.relpath</function> function is used to make file and directory paths relative to <filename>summit-config</filename> itself. <literal>S.mappings</literal> is a list of PO file mappings, for cases of splits, merges, and renames between branches. In this example <literal>S.mappings</literal> is set to empty only to point out its importance, but it does not need to be present if there are no mappings.</para>
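
<para>The lookup through parent directories can be pictured as follows. This is only a sketch of the idea, written for Python 3; the actual lookup performed by <command>posummit</command> may differ in details, and <function>find_summit_config</function> is a hypothetical name used for illustration.

```python
import os
import tempfile


def find_summit_config(startdir, name="summit-config"):
    """Return the nearest file called `name` at or above startdir, or None."""
    cdir = os.path.abspath(startdir)
    while True:
        candidate = os.path.join(cdir, name)
        if os.path.isfile(candidate):
            return candidate
        parent = os.path.dirname(cdir)
        if parent == cdir:
            return None  # reached the filesystem root without a match
        cdir = parent


# Demonstration: start the search two levels below the configuration file.
with tempfile.TemporaryDirectory() as top:
    workdir = os.path.join(top, "summit-aa", "utilities")
    os.makedirs(workdir)
    config = os.path.join(top, "summit-aa", "summit-config")
    open(config, "w").close()
    found_ok = find_summit_config(workdir) == os.path.abspath(config)

print(found_ok)  # True
```

A lookup of this kind is what lets a command be run from any subdirectory of the summit without naming the configuration file.</para>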

<para id="p-bydomorg">If branches are organized by-domain, the summit tree will still look the same, with PO files named by domain rather than by language:
<screen>
branches/
    devel/
        alpha/
            aa.po
            bb.po
            ...
        bravo/
            aa.po
            bb.po
            ...
        ...
    stable/
        alpha/
            aa.po
            bb.po
            ...
        bravo/
            aa.po
            bb.po
            ...
        ...
summit-aa/
    alpha.po
    bravo.po
    ...
    summit-config
</screen>
The only difference in the summit configuration is the addition of <literal>by_lang=</literal> keys into the branch dictionaries:
<programlisting language="python">
S.branches = [
    dict(id="devel",
         topdir=S.relpath("../branches/devel"),
         by_lang=S.lang),
    dict(id="stable",
         topdir=S.relpath("../branches/stable"),
         by_lang=S.lang),
]
</programlisting>
Presence of the <literal>by_lang=</literal> key signals that the branch is organized by-domain (i.e. PO files named by language), and the value is the language code within the branch. Normally it is set to the previously defined <literal>S.lang</literal>, but it can also be something else in case different codes are used between branches, or between the branches and the summit.</para>

<para>When the configuration file has been written, the summit can be gathered for the first time (i.e. summit PO files created):
<programlisting language="bash">
$ cd .../summit-aa/
$ posummit gather --create
</programlisting>
The path of each created summit PO file will be written out, along with paths of branch PO files from which messages were gathered into the summit file. After the run is finished, the summit is ready for use.</para>

<para>While this was sufficient to set up a summit, there is a myriad of options available for specialized purposes, which will be presented throughout this chapter. Also, given that the summit configuration file is Python code, you can add into it any scripting that you wish. Some summit options (defined through the <literal>S</literal> object) even take Python functions as values.</para>
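
<para>Because the configuration is plain Python, its effect can be explored outside of <command>posummit</command> with a stand-in for the <literal>S</literal> object. The sketch below, for Python 3, is an assumption-laden illustration and not how <command>posummit</command> itself loads the file: <function>load_config</function> and <literal>ConfigStub</literal> are hypothetical names, and the stand-in provides only <literal>S.lang</literal> and a <function>S.relpath</function> that resolves paths against the directory of the configuration file, as described above.

```python
import os
import tempfile


class ConfigStub(object):
    """Bare attribute holder standing in for posummit's S object."""


def load_config(path, lang=None):
    confdir = os.path.dirname(os.path.abspath(path))
    S = ConfigStub()
    S.lang = lang
    # Resolve paths against the directory of the configuration file.
    S.relpath = lambda p: os.path.normpath(os.path.join(confdir, p))
    with open(path) as f:
        exec(compile(f.read(), path, "exec"), {"S": S})
    return S


CONFIG = '''
S.lang = "aa"
S.summit = dict(topdir=S.relpath("."))
S.branches = [
    dict(id="devel", topdir=S.relpath("../branches/devel-aa")),
    dict(id="stable", topdir=S.relpath("../branches/stable-aa")),
]
'''

with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "summit-aa"))
    confpath = os.path.join(top, "summit-aa", "summit-config")
    with open(confpath, "w") as f:
        f.write(CONFIG)
    S = load_config(confpath)
    devel_rel = os.path.relpath(S.branches[0]["topdir"], top)

print(devel_rel)  # the devel branch directory, relative to the common parent
```

Since the file is executed as ordinary code, anything computed in it (loops over branch names, helper functions, conditionals) ends up as attributes of <literal>S</literal> in the same way.</para>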

</sect2>

<sect2 id="sec-sustpdyntpl">
<title>Setting Up Summit over Dynamic Templates</title>

<para>Again consider by-language organization of branches, similar to the direct summit example above, except that now template directories too must be present in branches:
<screen>
branches/
    devel/
        aa/
            alpha.po
            bravo.po
            ...
        templates/
            alpha.pot
            bravo.pot
            ...
    stable/
        aa/
            alpha.po
            bravo.po
            ...
        templates/
            alpha.pot
            bravo.pot
            ...
summit-aa/
    alpha.po
    bravo.po
    ...
    summit-config
</screen>
Here the language PO files and templates are put in subdirectories within the branch directory only for convenience, but this is not mandatory. For example, language files could reside in <filename>branches/devel-aa</filename> and templates in <filename>branches/devel-templates</filename>; no path connection is required between the two. This is because the template path per branch is explicitly given in <filename>summit-config</filename>, which would look like this:
<programlisting language="python">
S.lang = "aa"
S.over_templates = True

S.summit = dict(
    topdir=S.relpath("."),
)

S.branches = [
    dict(id="devel",
         topdir=S.relpath("../branches/devel/aa"),
         topdir_templates=S.relpath("../branches/devel/templates")),
    dict(id="stable",
         topdir=S.relpath("../branches/stable/aa"),
         topdir_templates=S.relpath("../branches/stable/templates")),
]

S.mappings = [
]
</programlisting>
Compared to the configuration of a direct summit, two things are added here. The <literal>S.over_templates</literal> option is set to <literal>True</literal> to indicate that summit over templates is used. The path to templates is set with the <literal>topdir_templates=</literal> key for each branch.</para>

<para>In by-domain branch organization, the directory tree looks just the same as for direct summit, except that each domain directory also contains the templates:
<screen>
branches/
    devel/
        alpha/
            aa.po
            bb.po
            ...
            alpha.pot
        bravo/
            aa.po
            bb.po
            ...
            bravo.pot
        ...
    stable/
        alpha/
            aa.po
            bb.po
            ...
            alpha.pot
        bravo/
            aa.po
            bb.po
            ...
            bravo.pot
        ...
summit-aa/
    alpha.po
    bravo.po
    ...
    summit-config
</screen>
Here only the branch specifications need to differ compared to the by-language configuration, by having the <literal>by_lang=</literal> key and omitting the path to templates:
<programlisting language="python">
S.branches = [
    dict(id="devel",
         topdir=S.relpath("../branches/devel"),
         by_lang=S.lang),
    dict(id="stable",
         topdir=S.relpath("../branches/stable"),
         by_lang=S.lang),
]
</programlisting>
</para>

<para>Initial gathering of the summit is done slightly differently compared to the direct summit:
<programlisting language="bash">
$ cd .../summit-aa/
$ posummit gather --create --force
</programlisting>
The <option>--force</option> option must be used here because, unlike in direct summit, explicit gathering is not regularly done in summit over dynamic templates.</para>

</sect2>

<sect2 id="sec-sustpstattpl">
<title>Setting Up Summit over Static Templates</title>

<para>As mentioned earlier, summit over static templates can be used when several language teams want to share the summit setup, for greater efficiency. The branch directory tree looks exactly the same as in summit over dynamic templates (with several languages being present), but the summit tree is somewhat different:
<screen>
branches/
    # as before, either by-language or by-domain
summit/
    summit-config-shared
    aa/
        alpha.po
        bravo.po
        ...
    bb/
        alpha.po
        bravo.po
        ...
    templates/
        alpha.pot
        bravo.pot
        ...
</screen>
First of all, there is now the <filename>summit/</filename> directory which contains subdirectories by language (the language summits) and one subdirectory for summit templates (the template summit). Then, the <filename>summit-config</filename> file is replaced by <filename>summit-config-shared</filename>; the name can actually be anything, so long as it is not exactly <filename>summit-config</filename>. This is in order to prevent <command>posummit</command> from automatically picking it up, as now the configuration is not tied to a single language summit. Instead, the path to the configuration file and the language code are explicitly given as arguments to <command>posummit</command>.</para>

<para>The configuration file for by-language branches looks like this:
<programlisting language="python">
S.over_templates = True

S.summit = dict(
    topdir=S.relpath("%s" % S.lang),
    topdir_templates=S.relpath("templates"),
)

S.branches = [
    dict(id="devel",
         topdir=S.relpath("../branches/devel/%s" % S.lang),
         topdir_templates=S.relpath("../branches/devel/templates")),
    dict(id="stable",
         topdir=S.relpath("../branches/stable/%s" % S.lang),
         topdir_templates=S.relpath("../branches/stable/templates")),
]

S.mappings = [
]
</programlisting>
Compared to summit over dynamic templates, here <literal>S.lang</literal> is no longer hardcoded in the configuration file, but set at each run of <command>posummit</command> through the command line. This means that paths of language directories too have to be dynamically adapted based on <literal>S.lang</literal>, hence the string interpolations <literal>"...%s..." % S.lang</literal>.</para>

<para>For by-domain branches, branch specifications differ by having the <literal>by_lang=</literal> key, and not having the path to templates:
<programlisting language="python">
S.branches = [
    dict(id="devel",
         topdir=S.relpath("../branches/devel"),
         by_lang=S.lang),
    dict(id="stable",
         topdir=S.relpath("../branches/stable"),
         by_lang=S.lang),
]
</programlisting>
</para>

<para>In summit over static templates mode, initial gathering is first done for summit templates, like this:
<programlisting language="bash">
$ cd .../summit/
$ posummit summit-config-shared templates gather --create
</programlisting>
The first two arguments are now the path to the configuration file and the language code, where <literal>templates</literal> is the dummy language code for templates<footnote>
<para>It can be changed by assigning another string to <literal>S.templates_lang</literal>.</para>
</footnote>. After this is finished, language summits can be gathered:
<programlisting language="bash">
$ posummit summit-config-shared aa gather --create --force
$ posummit summit-config-shared bb gather --create --force
$ ...
</programlisting>
Note that <option>--force</option> was not needed when gathering templates, because in this mode the template summit is periodically gathered, while language summits are not.</para>
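
<para>With many languages, the initial gathering is easier to drive from a small script. The following sketch only builds the argument lists, using exactly the invocations shown above; <function>gather_commands</function> is a hypothetical helper, and the language list is an example.

```python
def gather_commands(config, langs, templates_lang="templates"):
    """Build the initial gathering commands, template summit first."""
    # Templates are gathered without --force, language summits with it.
    commands = [["posummit", config, templates_lang, "gather", "--create"]]
    for lang in langs:
        commands.append(
            ["posummit", config, lang, "gather", "--create", "--force"])
    return commands


for argv in gather_commands("summit-config-shared", ["aa", "bb"]):
    print(" ".join(argv))
```

Each list can then be handed to <literal>subprocess.run(argv, check=True)</literal> from within the <filename>summit/</filename> directory, so that a failure in gathering templates stops the run before any language summit is touched.</para>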

</sect2>

<sect2 id="sec-sustpptransf">
<title>Transforming Branch Paths</title>

<para>When branches contain only PO files which are used natively, by programs fetching translations at runtime, then all branch PO files will be unique by their domain name (as mandated by the Gettext runtime system). It will not happen that two branch subdirectories contain a PO file with the same name. This fits perfectly with the summit requirement that all summit PO files be unique by domain names.</para>

<para>However, if PO files are used as <link linkend="p-dynstattr">an intermediate</link> to other formats, branches may contain same-name PO files which have otherwise nothing in common, in different subdirectories. For example, each subdirectory may contain a PO file named <filename>index.po</filename>, <filename>help.po</filename>, etc. If this were left unattended, all the same-name PO files would be collapsed into a single summit PO file, which makes no sense given that they have (almost) no common messages. For this reason, it is possible to define transformations which modify absolute branch paths during processing, such that branch PO files are seen with unique names.</para>

<para>Consider the following example of two branches for language <literal>aa</literal> (i.e. by-language organization) with PO files non-unique by domain name:
<screen>
branches/
    devel-aa/
        chapter1/
            intro.po
            glossary.po
            ...
        chapter2/
            intro.po
            glossary.po
            ...
        ...
    stable-aa/
        chapter1/
            intro.po
            glossary.po
            ...
        chapter2/
            intro.po
            glossary.po
            ...
        ...
</screen>
These branches cover some sort of a book, where each chapter has some standard elements, and thus some same-name PO files with totally different content in each chapter's subdirectory. To have unique domain names in the summit, you might decide upon a flat file tree with the chapter as a prefix:
<screen>
summit-aa/
    chapter1-intro.po
    chapter1-glossary.po
    ...
    chapter2-intro.po
    chapter2-glossary.po
    ...
    summit-config
</screen>
To achieve this, you must first write two Python functions (remember that the summit configuration file is a normal Python source file), one to split branch paths and another to join them, and add them to branch specifications in <literal>S.branches</literal>.</para>

<para>The function to split branch paths takes a single argument, the branch PO file path relative to the branch top directory, and returns the summit PO domain name and the summit subdirectory. For the example above, the splitting function would look like this:
<programlisting language="python">
def split_branch_path (subpath):
    import os
    filename = os.path.basename(subpath)      # get PO file name
    domain0 = filename[:filename.rfind(".")]  # strip .po extension
    subdir0 = os.path.dirname(subpath)        # get branch subdirectory
    domain = subdir0 + "-" + domain0          # set final domain name
    subdir = ""                               # set summit subdirectory
    return domain, subdir
</programlisting>
Note that the branch subdirectory was used only to construct the summit domain name, while the summit subdirectory is an empty string because the summit file tree should be flat.</para>
0507 
0508 <para>The function to join branch paths takes three arguments. The first two are the summit PO domain name and the summit subdirectory. The third argument is the the value of <link linkend="p-bydomorg"><literal>by_lang=</literal> key</link> for the given branch. The return value is the branch PO file path relative to the branch top directory. It would look like this:
0509 <programlisting language="python">
0510 def join_branch_path (domain, subdir, bylang):
0511     import os
0512     subdir0, domain0 = domain.split("-", 1)    # get branch domain name
0513                                                # and branch subdirectory
0514                                                # from summit domain name
0515     filename = domain0 + ".po"                 # branch PO file name
0516     subpath = os.path.join(subdir0, filename)  # branch relative path
0517     return subpath
0518 </programlisting>
Here the <varname>subdir</varname> argument (summit subdirectory) is not used because it is always empty due to the flat summit file tree, and <varname>bylang</varname> is not used because it is <literal>None</literal> due to by-language branch organization.</para>
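
<para>Since scattering relies on the joining function to undo what the splitting function did, it may be worth checking that the two really are inverses for the paths in your branches. The following is only a minimal sketch of such a check, not part of <command>posummit</command>: it restates the two functions above in compact form, and the <literal>roundtrips</literal> helper and both sample paths are assumptions made for illustration.
<programlisting language="python">
import os

def split_branch_path(subpath):
    filename = os.path.basename(subpath)
    domain0 = os.path.splitext(filename)[0]    # strip .po extension
    subdir0 = os.path.dirname(subpath)
    return subdir0 + "-" + domain0, ""         # flat summit: empty subdir

def join_branch_path(domain, subdir, bylang):
    subdir0, domain0 = domain.split("-", 1)    # split at the first hyphen
    return os.path.join(subdir0, domain0 + ".po")

def roundtrips(subpath):
    """True if joining the split result gives back the same branch path."""
    domain, subdir = split_branch_path(subpath)
    return join_branch_path(domain, subdir, None) == subpath

# Assumed branch path matching the summit file names in the example tree.
ok_path = os.path.join("chapter2", "intro.po")
# Hypothetical path with a hyphen in the subdirectory name: the first
# hyphen no longer marks the boundary, so joining yields another path.
bad_path = os.path.join("user-guide", "intro.po")

for path in (ok_path, bad_path):
    print(path, roundtrips(path))
</programlisting>
If any branch subdirectory may itself contain a hyphen, a different separator or a different joining rule would be needed.</para>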
0520 
<para>The definitions of splitting and joining functions are written into the <filename>summit-config</filename> file somewhere before the <literal>S.branches</literal> branch specification, and added to each branch through the <literal>transform_path=</literal> key:
0522 <programlisting language="python">
0523 S.branches = [
0524     dict(id="devel",
0525          topdir=S.relpath("../branches/devel-aa"),
0526          transform_path=(split_branch_path, join_branch_path)),
0527     dict(id="stable",
0528          topdir=S.relpath("../branches/stable-aa"),
0529          transform_path=(split_branch_path, join_branch_path)),
0530 ]
0531 </programlisting>
0532 This means that it is possible, if necessary, to define different splitting and joining functions per branch.</para>
0533 
0534 </sect2>
0535 
0536 </sect1>
0537 
0538 <!-- ======================================== -->
0539 <sect1 id="sec-sumaintain">
0540 <title>Maintaining the Summit</title>
0541 
<para>From time to time, summit PO files need to be updated to reflect changes in branch PO files, and scattered so that branch PO files get new translations from the summit. How summit PO files are updated, by whom, and to what extent depends on the summit mode and the organization of the translation team. The same holds for when and by whom the scattering is done.</para>
0543 
0544 <sect2 id="sec-sumntbasic">
0545 <title>Centralized Summit Maintenance</title>
0546 
0547 <para>The usual maintenance procedure would be for one designated person (e.g. the team coordinator) to update all summit PO files and to scatter new translations to branch PO files, at certain periods of time agreed upon in the translation team.</para>
0548 
0549 <para>If there are no mismatches between the branch and summit PO files, the summit update procedure is fully automatic. How the summit is updated depends on the summit mode. In direct summit, the update is performed by gathering:
0550 <programlisting language="bash">
0551 $ cd $SUMMITDIR
0552 $ posummit gather
0553 </programlisting>
0554 In summit over dynamic templates, merging is performed instead:
0555 <programlisting language="bash">
0556 $ cd $SUMMITDIR
0557 $ posummit merge
0558 </programlisting>
0559 Finally, in summit over static templates, first the template summit is gathered, and then language summits are merged:
0560 <programlisting language="bash">
0561 $ posummit $SOMEWHERE/summit-config-shared templates gather
0562 $ posummit $SOMEWHERE/summit-config-shared aa merge
0563 $ posummit $SOMEWHERE/summit-config-shared bb merge
0564 ...
0565 </programlisting>
0566 Note that unlike when setting up the summit, no <option>--create</option> or <option>--force</option> options are used. Without them, <command>posummit</command> will warn about any new mismatches between branches and the summit and abort the operation, leaving the user to examine the situation and take corrective measures. <xref linkend="sec-sumntmism"/> discusses this in detail.</para>
0567 
0568 <para>Scattering to branches is always fully automatic. For direct summit and summit over dynamic templates it is performed with:
0569 <programlisting language="bash">
0570 $ cd $SUMMITDIR
0571 $ posummit scatter
0572 </programlisting>
0573 For summit over static templates, scattering is done for each language summit:
0574 <programlisting language="bash">
0575 $ posummit $SOMEWHERE/summit-config-shared aa scatter
0576 $ posummit $SOMEWHERE/summit-config-shared bb scatter
0577 ...
0578 </programlisting>
0579 </para>
0580 
0581 <para>If summit update (merge, gather, or both, depending on the summit mode) is scheduled to run automatically, the maintainer should make sure to be notified when <command>posummit</command> aborts, so that mismatches can be promptly handled.</para>
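
<para>One way to arrange such a notification is a small wrapper script run by the scheduler. The sketch below is only an illustration: it assumes that <command>posummit</command> exits with a non-zero status when it aborts, and both the <literal>run_summit_update</literal> function and the <literal>notify</literal> callback are hypothetical names, not part of Pology.
<programlisting language="python">
import subprocess
import sys

def run_summit_update(command, summit_dir, notify):
    """Run one summit update command inside summit_dir.

    Calls notify(message) and returns False if the command fails,
    returns True otherwise.
    """
    result = subprocess.run(command, cwd=summit_dir,
                            capture_output=True, text=True)
    if result.returncode != 0:
        report = result.stderr or result.stdout
        notify("summit update failed: %s\n%s" % (" ".join(command), report))
        return False
    return True

# Hypothetical usage for a summit over dynamic templates, writing the
# report to standard error; replace the callback with anything that
# reaches the maintainer (e-mail, chat message):
#
#   run_summit_update(["posummit", "merge"], "/path/to/summit",
#                     notify=sys.stderr.write)
</programlisting>
The callback receives the captured output of the failed run, so the list of mismatches reaches the maintainer together with the notification.</para>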
0582 
<para>The obvious advantage of this maintenance method is that other team members do not need to know anything about the workings of the summit. They only fetch updated summit PO files, translate them, and submit them back. The disadvantage is that a summit update may interfere with a translator who happens to be working on a PO file which has just been updated in the repository, causing merge conflicts when he attempts to submit that PO file.</para>
0584 
0585 </sect2>
0586 
0587 <sect2 id="sec-sumntdistrib">
0588 <title>Distributed Summit Maintenance</title>
0589 
<para>In this maintenance mode, each team member performs summit operations on exactly the PO files that he wants to work on. This has the advantage over centralized maintenance in that translators do not interfere with each other's work, as summit PO files get updated only at the request of the translator working on them. Additionally, it may provide faster gather(-merge)-scatter turnaround time. Unfortunately, the disadvantage is that now all team members have to know how the summit is maintained, so this method is likely applicable only to strongly technical teams.</para>
0591 
<para>Distributed maintenance is in general the same as centralized, except that now all <command>posummit</command> command lines take extra arguments, namely the selection of PO files to operate on -- so-called <emphasis>operation targets</emphasis>. Operation targets can be given in two ways. One is directly by file or directory paths. For example, in summit over dynamic templates mode, when working on the <filename>foobaz.po</filename> file, the translator would use the following summit commands to merge it and scatter to the branches:
0593 <programlisting language="bash">
0594 $ cd $SUMMITDIR
0595 $ posummit merge foosuite/foobaz.po
0596 $ # ...update the translation...
0597 $ posummit scatter foosuite/foobaz.po
0598 </programlisting>
To update all files in the <filename>foosuite/</filename> subdirectory at once, the translator can instead execute:
0600 <programlisting language="bash">
0601 $ cd $SUMMITDIR
0602 $ posummit merge foosuite/
0603 $ posummit scatter foosuite/
0604 </programlisting>
It is also possible to single out a particular branch for scattering, by giving the path to the PO file in that branch instead of the summit. To scatter <filename>foobaz.po</filename> only to the <literal>devel</literal> branch, in by-language branch organization the translator would use:
0606 <programlisting language="bash">
0607 $ posummit scatter $SOMEWHERE/devel/aa/foosuite/foobaz.po
0608 </programlisting>
0609 and in by-domain branch organization:
0610 <programlisting language="bash">
0611 $ posummit scatter $SOMEWHERE/devel/foosuite/foobaz/po/foobaz/aa.po
0612 </programlisting>
Note that the current working directory still has to be within the summit directory, so that <command>posummit</command> can find the summit configuration file. (This requirement is not present for summit over static templates, as there the path to the configuration file is given on the command line.)</para>
0614 
0615 <para>The other kind of operation targets are PO domain names and subdirectory names alone. In this formulation, the first example above could be replaced with:
0616 <programlisting language="bash">
0617 $ posummit merge foobaz
0618 $ posummit scatter foobaz
0619 </programlisting>
Since all summit PO file names are unique, this is sufficient information for <command>posummit</command> to know what it should operate on. To limit operation to a certain branch, the branch name is added in front of the domain name, separated by a colon. To scatter <filename>foobaz.po</filename> to the <literal>devel</literal> branch:
0621 <programlisting language="bash">
0622 $ posummit scatter devel:foobaz
0623 </programlisting>
0624 and to scatter the complete <filename>foosuite/</filename> subdirectory to the same branch:
0625 <programlisting language="bash">
0626 $ posummit scatter devel:foosuite/
0627 </programlisting>
Note that the trailing slash is significant here, since otherwise the argument would be interpreted as a single PO file (<command>posummit</command> would exit with an error, reporting that such a file does not exist). The summit itself also has a "branch name" assigned for use in operation targets of this kind, and that is <literal>+</literal>.</para>
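
<para>The name-style target syntax described above can be summarized in a few lines of Python. This is only an illustrative sketch of the syntax, not how <command>posummit</command> actually parses its arguments; the <literal>parse_target</literal> function is a hypothetical name.
<programlisting language="python">
def parse_target(target):
    """Split a name-style operation target into (branch, name, is_subdir).

    branch is None when the target carries no "branch:" prefix.
    """
    branch, sep, name = target.rpartition(":")
    if not sep:
        branch = None
    is_subdir = name.endswith("/")   # the trailing slash marks a subdirectory
    return branch, name.rstrip("/"), is_subdir

for target in ("foobaz", "devel:foobaz", "devel:foosuite/", "+:foobaz"):
    print(target, parse_target(target))
</programlisting>
The last target uses <literal>+</literal>, the name reserved for the summit itself.</para>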
0629 
0630 <para>When merging (or gathering in direct summit mode) is attempted, <command>posummit</command> may abort with the report of mismatches between branches and the summit. The translator must then make the adjustments (<xref linkend="sec-sumntmism"/> describes how, case by case), or report it to someone else to handle.</para>
0631 
0632 <para>After selected summit and branch PO files have been updated, the translator can commit them. Alternatively, a half-distributed workflow could be used, where translators only update and commit summit PO files, while scattering to branches is centralized, and automatically performed at a given period. This makes sense because the scattering in no way interferes with translators' workflow and never needs any manual intervention.</para>
0633 
0634 </sect2>
0635 
0636 <sect2 id="sec-sumntmism">
0637 <title>Handling Mismatches Between Branches and Summit</title>
0638 
<para>When something changes in the PO file tree in one of the branches, <command>posummit</command> will by default abort gathering (or merging in summit over dynamic templates), and present a list of its findings. At this point <command>posummit</command> could be made to continue by issuing the <option>--create</option> option, but then it will resolve mismatches in a simplistic way, which will be wrong in many cases. Instead, you should examine what has happened in the branches, possibly manually perform some operations on summit PO files and possibly add some branch-to-summit mappings, and rerun <command>posummit</command> after the necessary adjustments have been made.</para>
0640 
0641 <para>Typical mismatches and their remedies are as follows:
0642 <variablelist>
0643 
0644 <varlistentry>
0645 <term>A branch PO file has been moved to another subdirectory (<emphasis>moving</emphasis>).</term>
0646 <listitem>
<para>In a translation project with modules represented by subdirectories, it may happen that a program or a library is moved from one module to another, with its PO files following the move. If this happened in all branches, <command>posummit</command> will report that the summit PO file should be moved as well; it can be rerun with <option>--create</option> to do the move itself, or you can make the move manually. If the move happened in only one of the branches, <command>posummit</command> will not complain at all; more precisely, if at least one branch PO file is in the same relative subdirectory as the summit PO file, it is not considered a mismatch.</para>
0648 
<para>Another, less obvious case of moving may arise when two same-named branch PO files appear in different subdirectories of the same branch. <command>posummit</command> will by default simply gather them into a single summit PO file, without reporting anything. However, it may be that one of the two subdirectories is of higher priority for translation. Then it would be better for the summit PO file to be located in that subdirectory, and for <command>posummit</command> to report if that is not the case, or to make the move itself under <option>--create</option>. Subdirectory precedence can be specified through the <literal>S.subdir_precedence</literal> field, which is simply a list of subdirectories:
0650 <programlisting language="python">
0651 S.subdir_precedence = [
0652     "library",
0653     "application",
0654     "plugins/base",
0655     ...
0656 ]
0657 </programlisting>
Earlier subdirectories in the list have higher precedence. If a subdirectory is below one of the listed subdirectories, it will have the same precedence as its listed top directory. If a subdirectory is neither listed nor below any of the listed subdirectories, its precedence will be lower than that of all listed subdirectories.</para>
0659 </listitem>
0660 </varlistentry>
0661 
0662 <varlistentry>
0663 <term>A totally new branch PO file has been added (<emphasis>addition</emphasis>).</term>
0664 <listitem>
0665 <para>When a piece of software appears (created or imported) in the project, its PO files will appear with it. These PO files are "totally" new, in the sense that they are not derived from any existing PO file. In this case, <command>posummit</command> will report that new branch PO files have no corresponding summit PO files, and expected paths of the missing summit PO files. After having checked that the branch PO files are indeed totally new, you can rerun <command>posummit</command> with <option>--create</option>, or manually copy branch PO files to expected summit paths (they will be equipped with summit-specific information when <command>posummit</command> rolls over them).</para>
0666 </listitem>
0667 </varlistentry>
0668 
0669 <varlistentry>
0670 <term>A branch PO file has been removed (<emphasis>removal</emphasis>).</term>
0671 <listitem>
0672 <para>A piece of software may be removed from the project (not maintained any more, moved to another project), which will cause its PO files to disappear. <command>posummit</command> will then report that some summit PO files have no corresponding branch PO files. You should check that branch PO files have indeed been simply removed, and then rerun <command>posummit</command> with <option>--create</option>, or manually remove summit PO files.</para>
0673 </listitem>
0674 </varlistentry>
0675 
0676 <varlistentry>
0677 <term>A branch PO file has been renamed (<emphasis>renaming</emphasis>).</term>
0678 <listitem>
0679 <para>When, for example, a program changes its name, normally its PO file will be renamed as well. What will happen in this case is that <command>posummit</command> will report two problems: a branch PO file without corresponding summit PO file (new name), and a summit PO file without any corresponding branch PO files (old name). When you realize that the cause of these paired reports is indeed renaming (they could also be an unrelated addition and removal), you must rename the summit PO file manually. Note that if you had not done this and issued <option>--create</option> option instead, the existing summit PO file would have been removed, and an empty one with the new name created -- definitely not what was desired.</para>
0680 
<para>A more complicated case of renaming is when the name is changed in only one branch. <command>posummit</command> then reports only the branch PO file with the new name as having no summit PO file, since the existing summit PO file matches non-renamed branch PO files. In this case, the usual correction is to rename the summit PO file to the new name and map old names from other branches to the new name. If <filename>foobaz.po</filename> was renamed to <filename>fooqwyx.po</filename> in the <literal>devel</literal> branch, but kept its name in <literal>stable</literal>, then the mapping in the summit configuration file would be:
0682 <programlisting language="python">
0683 S.mappings = [
0684     ...
0685     ("stable", "foobaz", "fooqwyx"),
0686     ...
0687 ]
0688 </programlisting>
Each mapping entry is a sequence of strings in parentheses. The first string is the branch name, the second string is the domain name of the PO file in that branch, and the third string is the domain name of the PO file in the summit. When you add this mapping (and rename summit <filename>foobaz.po</filename> to <filename>fooqwyx.po</filename>), you can rerun <command>posummit</command>.</para>
0690 
0691 <para>If the summit is over static templates, i.e. there are separate template and language summits, then renamings should be done in all of them.</para>
0692 </listitem>
0693 </varlistentry>
0694 
0695 <varlistentry>
0696 <term>A branch PO file has been split into several files (<emphasis>splitting</emphasis>).</term>
0697 <listitem>
<para>If a single PO file becomes very big, it may be split into several smaller files by categories of messages (e.g. UI and help texts). A program may also be modularized, so that the factored-out modules take over part of the messages from the main PO file into their own PO files. Either way, <command>posummit</command> will again simply report that some new branch PO files have appeared and possibly that some have disappeared, and it is up to you to recognize that the cause of this is a splitting. Splitting typically happens in the newest branch, but not in older branches. You should then make the same split in summit PO files and map the monolithic PO file from older branches to the newly split summit files. For example, if <filename>foobaz.po</filename> in <literal>devel</literal> branch got split into <filename>foobaz.po</filename> (of reduced size), <filename>libfoobaz.po</filename>, and <filename>foobaz_help.po</filename>, the mapping for the old monolithic PO file in the <literal>stable</literal> branch would be:
0699 <programlisting language="python" id="l-splmap">
0700 S.mappings = [
0701     ...
0702     ("stable", "foobaz", "foobaz", "libfoobaz", "foobaz_help"),
0703     ...
0704 ]
0705 </programlisting>
The first string in the mapping is the branch name, the second string is the PO domain name in that branch, and all following strings are the new summit PO domain names which contain part of the original messages. The order of summit PO domains is somewhat important: if a message exists only in the monolithic PO file in the <literal>stable</literal> branch and not in split PO files in the <literal>devel</literal> branch, and the summit heuristics detect no appropriate insertion point into one of the summit PO files, that message will be added to the end of the first summit PO file listed.</para>
0707 
0708 <para>"Making the same split in summit" deserves some special attention. For the templates summit (which exists in summit over static templates), this simply means adding any new files and removing old ones (<command>posummit</command> will do that itself if run with <option>--create</option>). But for language summits, you should manually copy the original summit PO file to each new name in turn, and then perform gather (direct summit) or merge (summit over templates). In this way no translated messages will be lost in the split.<footnote>
0709 <para>One could also skip this and allow immediate loss of translations, and rely on the translation memory when later translating new PO files. But, especially in centralized summit maintenance, it is better to make things right early. Also, translation memory matches may not be as reliable, since they come not only from the original PO file, but from all PO files in the project.</para>
0710 </footnote></para>
0711 </listitem>
0712 </varlistentry>
0713 
0714 <varlistentry>
0715 <term>Several branch PO files have been merged into one (<emphasis>merging</emphasis>).</term>
0716 <listitem>
0717 <para>Sometimes formerly independent pieces of software are joined into a single package, for more effective maintenance and release. This can happen, for example, when selected plugins are taken into the host program distribution as "core plugins". Their separate PO files may then be merged into a single new PO file, or into an existing PO file. Like in the opposite case of splitting, <command>posummit</command> will simply report that some summit PO files no longer have branch counterparts, and possibly that a new branch PO file has appeared. This usually happens in the newest branch first, while older branches retain the separation. Then the same merging should be done in summit too, and mappings added for each of the old separate PO files in other branches. If <filename>foobaz_info.po</filename>, <filename>foobaz_backup.po</filename>, and <filename>foobaz_filters.po</filename> have been merged into existing <filename>foobaz.po</filename> in <literal>devel</literal> branch, the following mappings for the <literal>stable</literal> branch should be added:
0718 <programlisting language="python" id="l-mrgmap">
0719 S.mappings = [
0720     ...
0721     ("stable", "foobaz_info", "foobaz"),
0722     ("stable", "foobaz_backup", "foobaz"),
0723     ("stable", "foobaz_filters", "foobaz"),
0724     ...
0725 ]
0726 </programlisting></para>
0727 
0728 <para>As for making the same merge in the summit, for templates summit (in summit over static templates) you should manually remove old separate files and possibly add the new monolithic one, or run <command>posummit</command> with <option>--create</option>. In language summits, in order to retain all existing translations, you should manually concatenate separate files into one (using Gettext's <command>msgcat</command>) and then perform gather (direct summit) or merge (summit over templates).</para>
0729 </listitem>
0730 </varlistentry>
0731 
0732 <varlistentry>
0733 <term>A language branch PO file has appeared in summit over templates (<emphasis>injection</emphasis>).</term>
0734 <listitem>
0735 <para>In summit over templates modes (dynamic or static), the normal way for a language summit PO file to appear is by starting from a clean template, and the corresponding branch PO file is then created on scatter. However, when a program previously developed elsewhere is imported into the project, its PO files are imported too. This will lead to the situation where there are translated branch PO files with no corresponding language summit PO files. This is corrected by forced gathering of the "injected" branch PO file. If the injected file is <filename>alien.po</filename>, in summit over dynamic templates you would execute:
0736 <programlisting language="bash">
0737 $ cd $SUMMITDIR
0738 $ posummit gather --create --force alien
0739 </programlisting>
0740 and in summit over static templates:
0741 <programlisting language="bash">
0742 $ posummit $SOMEWHERE/summit-config-shared aa gather --create --force alien
0743 $ posummit $SOMEWHERE/summit-config-shared bb gather --create --force alien
...
0745 </programlisting>
The <option>--force</option> option is necessary because, in summit over templates modes, language summit PO files are normally gathered just once when the summit is created, and later only merged.</para>
0747 </listitem>
0748 </varlistentry>
0749 
0750 </variablelist>
0751 </para>
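
<para>The subdirectory precedence rule described under <emphasis>moving</emphasis> above can be made concrete with a short sketch. It is an illustration of the stated rule only, under the assumption that subdirectories are given as slash-separated relative paths; <literal>precedence_rank</literal> and <literal>preferred_subdir</literal> are hypothetical helper names, not part of Pology.
<programlisting language="python">
def precedence_rank(subdir, precedence):
    """Return a rank for subdir; a lower rank means higher precedence."""
    for rank, top in enumerate(precedence):
        # A subdirectory below a listed one inherits its precedence.
        if subdir == top or subdir.startswith(top + "/"):
            return rank
    return len(precedence)   # unlisted: lower than everything listed

def preferred_subdir(subdirs, precedence):
    """Pick the subdirectory in which the summit PO file should reside."""
    return min(subdirs, key=lambda s: precedence_rank(s, precedence))

precedence = ["library", "application", "plugins/base"]
print(preferred_subdir(["application/tools", "library"], precedence))
print(precedence_rank("plugins/base/extra", precedence))
print(precedence_rank("misc", precedence))
</programlisting>
When several candidate subdirectories share the same rank, this sketch simply keeps the first one given.</para>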
0752 
<para>An important thing to note about mismatches is that reports produced by <command>posummit</command> may be misleading, especially in more complicated situations (splitting, merging). This means that you must carefully examine what has actually happened, not based only on the branch file trees themselves, but also by keeping an eye on channels (e.g. mailing lists) where information for translators is most likely to appear.</para>
0754 <!-- FIXME: Mention podescprob once documented. -->
0755 
0756 <para>There is also the possibility to map a whole branch subdirectory to another directory in the summit. Since summit PO files are unique by domain name, the only effect of subdirectory mapping is to prevent <command>posummit</command> from reporting that files should be moved to another subdirectory, and to have it report proper expected summit paths when new branch PO files are discovered. For example, if the PO files from subdirectory <filename>foosuite/</filename> in <literal>devel</literal> branch and from subdirectory <filename>foopack/</filename> in <literal>stable</literal> branch should both be collected in summit subdirectory <filename>foo/</filename>, the subdirectory mapping would be:
0757 <programlisting language="python">
0758 S.subdir_mappings = [
0759     ...
0760     ("devel", "foosuite", "foo"),
0761     ("stable", "foopack", "foo"),
0762     ...
0763 ]
0764 </programlisting>
0765 Subdirectory mappings should be needed rarely compared to file mappings. A tentative example could be when two closely related software forks are translated within the same project, and they have many PO files in their own subdirectories.</para>
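
<para>The effect of a subdirectory mapping on the expected summit location can be sketched as a simple lookup. This is a hypothetical illustration which assumes that a mapping applies to exactly the named branch subdirectory; whether nested subdirectories are covered as well is not specified here. The function name <literal>summit_subdir</literal> is made up for the example.
<programlisting language="python">
def summit_subdir(branch, subdir, subdir_mappings):
    """Return the summit subdirectory expected for a branch subdirectory."""
    for map_branch, branch_subdir, mapped_subdir in subdir_mappings:
        if map_branch == branch and branch_subdir == subdir:
            return mapped_subdir
    return subdir   # unmapped: same relative subdirectory in the summit

subdir_mappings = [
    ("devel", "foosuite", "foo"),
    ("stable", "foopack", "foo"),
]
print(summit_subdir("devel", "foosuite", subdir_mappings))
print(summit_subdir("stable", "foopack", subdir_mappings))
print(summit_subdir("stable", "foosuite", subdir_mappings))
</programlisting>
</para>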
0766 
0767 <para>At some moment translation branches will be "shifted", for example <literal>devel</literal> will become the new <literal>stable</literal>, <literal>stable</literal> may become <literal>oldstable</literal> (if three branches are maintained), etc. When that happens, mappings should be shifted too. A typical case would be two branches, <literal>devel</literal> and <literal>stable</literal>, and some mappings only for <literal>stable</literal>; then, when the shift comes, all existing mappings would be simply removed.</para>
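
<para>Shifting of mappings is mechanical enough to be sketched in code. The helper below is a hypothetical illustration rather than a Pology facility: it renames the branch in each mapping entry according to a table of old-to-new branch names, and drops the entries of branches that cease to exist after the shift.
<programlisting language="python">
def shift_mappings(mappings, renames):
    """Rename branches in mapping entries; drop entries of retired branches.

    renames maps an old branch name to its new name, or to None when
    the branch is retired by the shift.
    """
    shifted = []
    for entry in mappings:
        new_branch = renames.get(entry[0], entry[0])
        if new_branch is not None:
            shifted.append((new_branch,) + tuple(entry[1:]))
    return shifted

mappings = [("stable", "foobaz", "fooqwyx")]

# Two branches: the old stable branch is abandoned, its mappings vanish.
print(shift_mappings(mappings, {"devel": "stable", "stable": None}))
# Three branches: the old stable branch lives on as oldstable.
print(shift_mappings(mappings, {"devel": "stable", "stable": "oldstable"}))
</programlisting>
The shifted list would then be written back into <literal>S.mappings</literal> in the summit configuration file by hand.</para>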
0768 
0769 </sect2>
0770 
0771 <sect2 id="sec-sumntdeps">
0772 <title>Checking Summit Dependencies</title>
0773 
<para>As the number of mappings grows, or if <link linkend="sec-sustpptransf">branch path transformation</link> is employed, it may not be readily clear which summit PO files are related to which branch PO files. A translator may need this information to know exactly which summit PO files to work on in order to have some set of branch files fully translated. For this reason, <command>posummit</command> provides the operation mode <literal>deps</literal>, in which any number of operation targets are given on the command line, and the dependency chains are reported for those targets.</para>
0775 
0776 <para>If you recall the <link linkend="l-mrgmap">example mapping due to merging</link>, you can check the dependency chain for the file <filename>foobaz_info.po</filename> in <literal>stable</literal> branch by executing one of:
0777 <programlisting language="bash">
0778 $ cd $SUMMITDIR
0779 $ posummit deps $STABLEDIR/foobaz_info.po
0780 $ posummit deps stable:foobaz_info
0781 </programlisting>
0782 in direct summit or summit over dynamic templates, or
0783 <programlisting language="bash">
0784 $ posummit $SOMEWHERE/summit-config-shared aa deps $STABLEDIR/foobaz_info.po
0785 $ posummit $SOMEWHERE/summit-config-shared aa deps stable:foobaz_info
0786 </programlisting>
0787 in summit over static templates. The output would look like this:
0788 <programlisting>
0789 :    <replaceable>summit-dir</replaceable>/foobaz.po  <replaceable>devel-dir</replaceable>/foobaz.po <replaceable>stable-dir</replaceable>/foobaz_info.po \
0790      <replaceable>stable-dir</replaceable>/foobaz_backup.po <replaceable>stable-dir</replaceable>/foobaz_filters.po
0791 </programlisting>
You can see that the complete dependency chain to which <filename>foobaz_info.po</filename> from <literal>stable</literal> belongs has been written out. The first path in the chain is always the summit PO file, followed by all mapped PO files from each branch in turn.</para>
0793 
<para>If the file for which the dependencies are requested is mapped to more than one summit PO file, then the dependency chain for each of them is displayed. In the <link linkend="l-splmap">example of mapping due to splitting</link>, if you request the dependencies for the monolithic <filename>foobaz.po</filename> from the <literal>stable</literal> branch, you would get three dependency chains:
0795 <programlisting>
0796 :    <replaceable>summit-dir</replaceable>/foobaz.po  <replaceable>devel-dir</replaceable>/foobaz.po  <replaceable>stable-dir</replaceable>/foobaz.po
0797 :    <replaceable>summit-dir</replaceable>/libfoobaz.po  <replaceable>devel-dir</replaceable>/libfoobaz.po  <replaceable>stable-dir</replaceable>/foobaz.po
0798 :    <replaceable>summit-dir</replaceable>/foobaz_help.po  <replaceable>devel-dir</replaceable>/foobaz_help.po  <replaceable>stable-dir</replaceable>/foobaz.po
0799 </programlisting>
0800 </para>
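
<para>To make the relation between mappings and dependency chains more tangible, here is a minimal sketch that derives such chains from a mapping list. It is not the algorithm used by <command>posummit</command>; it assumes that an unmapped branch domain corresponds to the summit domain of the same name, works on domain names rather than file paths, and takes the branch contents as plain lists. The function names and the assumed branch contents are hypothetical.
<programlisting language="python">
def summit_domains(branch, domain, mappings):
    """Summit domains to which a branch domain contributes."""
    for entry in mappings:
        if entry[0] == branch and entry[1] == domain:
            return list(entry[2:])
    return [domain]   # unmapped: same name in the summit

def dependency_chains(branch, domain, branch_domains, mappings):
    """One chain per summit domain: the summit domain first (under the
    summit branch name "+"), then every branch domain mapped to it."""
    chains = []
    for sdomain in summit_domains(branch, domain, mappings):
        chain = [("+", sdomain)]
        for bname, domains in branch_domains:
            for bdomain in domains:
                if sdomain in summit_domains(bname, bdomain, mappings):
                    chain.append((bname, bdomain))
        chains.append(chain)
    return chains

# Assumed branch contents for the example of mapping due to splitting.
branch_domains = [
    ("devel", ["foobaz", "libfoobaz", "foobaz_help"]),
    ("stable", ["foobaz"]),
]
mappings = [("stable", "foobaz", "foobaz", "libfoobaz", "foobaz_help")]
for chain in dependency_chains("stable", "foobaz", branch_domains, mappings):
    print(chain)
</programlisting>
With these inputs the sketch yields three chains, each ending in the monolithic <literal>foobaz</literal> from <literal>stable</literal>, in line with the output shown above.</para>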
0801 
0802 </sect2>
0803 
0804 </sect1>
0805 
0806 <!-- ======================================== -->
0807 <sect1 id="sec-suconfig">
0808 <title>Elements of Summit Configuration</title>
0809 
<para>Other than the main configuration fields for setting the summit type, summit and branch locations, and mappings, there are many other optional configuration fields. They can be used to make the translation workflow yet more efficient, by relieving translators from taking care of various details.</para>
0811 
0812 <sect2 id="sec-sucfghooks">
0813 <title>Summit Hooks</title>
0814 
<para>Summit operations (gather, merge, scatter) are characterized by having PO files and messages flowing between the summit and branches. It is then natural to think of adding some <emphasis>filtering</emphasis> into these flows. For example, on scatter, one could do small orthographic adjustments in translation, or automatically insert translated UI references.<footnote>
<para>Another possibility is validation filters, which do not modify the text but report possible problems, though <link linkend="sec-lgrules">validation rules</link> and <link linkend="sv-check-rules">the <command>check-rules</command> sieve</link> are likely a better solution.</para>
0817 </footnote></para>
0818 
0819 <para>Filtering is implemented by being able to insert Pology hooks (see <xref linkend="sec-cmhooks"/>) into various stages of summit operations; a particular stage will admit only certain types of hooks. To <link linkend="sec-lguirefs">fetch and insert translated UI references</link> on scatter, the <literal>resolve-ui</literal> hook can be added like this:
0820 <programlisting language="python">
0821 from pology.uiref import resolve_ui
0822 S.hook_on_scatter_msgstr = [
0823     (resolve_ui(uicpathenv="UI_PO_DIR"),),
0824 ]
0825 </programlisting>
0826 <literal>S.hook_on_scatter_msgstr</literal> is a list of hooks which are applied on translation (<varname>msgstr</varname> fields) before it is written into branch PO files on scatter. Each element of this list is a tuple of one to three elements. The first element in the tuple is the hook function, here <link linkend="hk-uiref-resolve-ui"><literal>resolve_ui</literal></link><footnote>
<para><literal>resolve_ui</literal> is not the hook function itself, but a hook factory. It is called with the argument <literal>uicpathenv="UI_PO_DIR"</literal> to produce the actual hook function. See its documentation for details.</para>
0828 </footnote>. <literal>resolve_ui</literal> is an F3C hook, which is the type of hooks expected in <literal>S.hook_on_scatter_msgstr</literal> list.</para>
0829 
<para>The second and third elements in the hook tuple are, respectively, selectors by branch and file. These are used when the hook is not meant to be applied on all branches and all PO files. The selector can be either a regular expression string, which is matched against the branch name or PO domain name (positive match means to apply the hook), or a function (return value evaluating as true means to apply the hook). If it is a function, the branch selector gets the branch name as input argument, and the file selector gets the summit PO domain name and summit subdirectory. For example, to add the specialized <literal>resolve_ui_docbook4</literal> hook only to the <literal>foobaz-manual.po</literal> file, and plain <literal>resolve_ui</literal> to all other files, the hook list would be:
0831 <programlisting language="python">
0832 from pology.uiref import resolve_ui, resolve_ui_docbook4
0833 
0834 S.hook_on_scatter_msgstr = [
0835     (resolve_ui_docbook4(uicpathenv="UI_PO_DIR"), "", "-manual$"),
0836     (resolve_ui(uicpathenv="UI_PO_DIR"), "", "(?&lt;!-manual)$"),
0837 ]
0838 </programlisting>
The branch selector here is an empty string, which means that both hooks apply to all branches (since an empty regular expression matches any string). The <literal>resolve_ui_docbook4</literal> hook has the <literal>"-manual$"</literal> regular expression as the file selector, which means that it should be applied to all PO domain names ending in <literal>-manual</literal>. The <literal>resolve_ui</literal> hook has been given the opposite regular expression, <literal>"(?&lt;!-manual)$"</literal>, which matches any PO domain name <emphasis>not</emphasis> ending in <literal>-manual</literal>.<footnote>
0840 <para>This pattern makes use of a <emphasis>negative lookbehind</emphasis> token, a fairly advanced bit of regular expression syntax.</para>
0841 </footnote> Regular expressions can quickly become unreadable, so here is how the same selection could be achieved with selector functions:
0842 <programlisting language="python">
0843 from pology.uiref import resolve_ui, resolve_ui_docbook4
0844 
0845 def is_manual (domain, subdir):
0846     return domain.endswith("-manual")
0847 def is_not_manual (domain, subdir):
0848     return not is_manual(domain, subdir)
0849 
0850 S.hook_on_scatter_msgstr = [
0851     (resolve_ui_docbook4(uicpathenv="UI_PO_DIR"), "", is_manual),
0852     (resolve_ui(uicpathenv="UI_PO_DIR"), "", is_not_manual),
0853 ]
0854 </programlisting>
When there is more than one hook in the list, they are applied in the order in which they are listed.</para>
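
<para>The selector semantics described above can be sketched in plain Python. This is only an illustration under stated assumptions, not Pology's actual implementation: the <literal>selector_applies</literal> helper is hypothetical, the subdirectory name <literal>docs</literal> is made up, and regular expression selectors are assumed to be searched for anywhere in the name (which is what the <literal>"-manual$"</literal> example requires):
<programlisting language="python">
import re

def selector_applies (selector, *args):
    # Hypothetical helper, not part of Pology.
    # A string selector is taken as a regular expression, searched for
    # in the first argument (the branch name or the PO domain name).
    if isinstance(selector, str):
        return re.search(selector, args[0]) is not None
    # Anything else is assumed to be a selector function;
    # a return value evaluating as true means to apply the hook.
    return bool(selector(*args))

def is_manual (domain, subdir):
    return domain.endswith("-manual")

# The empty regular expression matches any branch name.
print(selector_applies("", "devel"))
# Regular expression and function selectors on the same PO domain.
print(selector_applies("-manual$", "foobaz-manual", "docs"))
print(selector_applies(is_manual, "foobaz-manual", "docs"))
</programlisting>
All three calls report that the hook would be applied.</para>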
0856 
<para>This is all there is to say about hook application in general. What follows is a list of all presently defined hook insertion lists, with admissible hook types given in parentheses. Usually paired F* and S* hook types are possible, such that F* hooks are primarily used for modification, while S* hooks could be employed for validation (e.g. writing out warnings).
0858 <variablelist>
0859 
0860 <varlistentry>
0861 <term><literal>S.hook_on_scatter_msgstr</literal> (F3A, F3C, S3A, S3C)</term>
0862 <listitem>
0863 <para>Applied to the branch translation (<varname>msgstr</varname> fields) on scatter, before it is written into the branch PO file.</para>
0864 </listitem>
0865 </varlistentry>
0866 
0867 <varlistentry>
0868 <term><literal>S.hook_on_scatter_msg</literal> (F4A, S4A)</term>
0869 <listitem>
0870 <para>Applied to branch message on scatter, before it is written into the branch PO file. These hooks can modify any part of the message, like comments, or even the <varname>msgid</varname> field.</para>
0871 </listitem>
0872 </varlistentry>
0873 
0874 <varlistentry>
0875 <term><literal>S.hook_on_scatter_cat</literal> (F5A, S5A)</term>
0876 <listitem>
0877 <para>Applied to the branch PO file while still in internal parsed state on scatter, after <literal>S.hook_on_scatter_msgstr</literal> had been applied to all messages.</para>
0878 </listitem>
0879 </varlistentry>
0880 
0881 <varlistentry>
0882 <term><literal>S.hook_on_scatter_file</literal> (F6A, S6A)</term>
0883 <listitem>
0884 <para>Applied to the branch PO file as raw file on disk on scatter, after <literal>S.hook_on_scatter_cat</literal> had been applied. If one of the hooks reports non-zero value, the rest of the hooks in the list are not applied to that file.</para>
0885 </listitem>
0886 </varlistentry>
0887 
0888 <varlistentry>
0889 <term><literal>S.hook_on_scatter_branch</literal></term>
0890 <listitem>
0891 <para>Applied to the complete branch on scatter, after all other hooks on scatter had been applied. Functions used here are not part of the formal hook system. They take a single argument, the branch name, and return a number. If the return value is not zero, rest of the hooks are skipped on that branch.</para>
0892 </listitem>
0893 </varlistentry>
0894 
0895 <varlistentry>
0896 <term><literal>S.hook_on_gather_file_branch</literal> (F6A, S6A)</term>
0897 <listitem>
0898 <para>Applied to the branch PO file as raw file on disk on gather, before <literal>S.hook_on_gather_cat_branch</literal> is applied. The branch PO file will not be modified for real, but only its temporary copy.</para>
0899 </listitem>
0900 </varlistentry>
0901 
0902 <varlistentry>
0903 <term><literal>S.hook_on_gather_cat_branch</literal> (F5A, S5A)</term>
0904 <listitem>
0905 <para>Applied to the branch PO file while still in internal parsed state on gather, before <literal>S.hook_on_gather_msg_branch</literal> is applied to all messages.</para>
0906 </listitem>
0907 </varlistentry>
0908 
0909 <varlistentry>
0910 <term><literal>S.hook_on_gather_msg_branch</literal> (F4A, S4A)</term>
0911 <listitem>
0912 <para>Applied to the branch message on gather, before it is used to gather the corresponding summit message.</para>
0913 </listitem>
0914 </varlistentry>
0915 
0916 <varlistentry>
0917 <term><literal>S.hook_on_gather_msg</literal> (F4A, S4A)</term>
0918 <listitem>
0919 <para>Applied to the summit message on gather, after it had been gathered from the corresponding branch messages, but before it is written into the summit PO file.</para>
0920 </listitem>
0921 </varlistentry>
0922 
0923 <varlistentry>
0924 <term><literal>S.hook_on_gather_cat</literal> (F5A, S5A)</term>
0925 <listitem>
<para>Applied to the summit PO file while still in internal parsed state on gather, after <literal>S.hook_on_gather_msg</literal> had been applied to all messages.</para>
0927 </listitem>
0928 </varlistentry>
0929 
0930 <varlistentry>
0931 <term><literal>S.hook_on_gather_file</literal> (F6A, S6A)</term>
0932 <listitem>
0933 <para>Applied to the summit PO file as raw file on disk on gather, after <literal>S.hook_on_gather_cat</literal> had been applied.</para>
0934 </listitem>
0935 </varlistentry>
0936 
0937 <varlistentry>
0938 <term><literal>S.hook_on_merge_head</literal> (F4B, S4B)</term>
0939 <listitem>
0940 <para>Applied to summit PO header on merge, after the summit PO file has been merged.</para>
0941 </listitem>
0942 </varlistentry>
0943 
0944 <varlistentry>
0945 <term><literal>S.hook_on_merge_msg</literal> (F4A, S4A)</term>
0946 <listitem>
0947 <para>Applied to summit message on merge, after <literal>S.hook_on_merge_head</literal> had been applied.</para>
0948 </listitem>
0949 </varlistentry>
0950 
0951 <varlistentry>
<term><literal>S.hook_on_merge_cat</literal> (F5A, S5A)</term>
0953 <listitem>
<para>Applied to the summit PO file while still in internal parsed state on merge, after <literal>S.hook_on_merge_msg</literal> had been applied to all messages.</para>
0955 </listitem>
0956 </varlistentry>
0957 
0958 <varlistentry>
0959 <term><literal>S.hook_on_merge_file</literal> (F6A, S6A)</term>
0960 <listitem>
0961 <para>Applied to the summit PO file as raw file on disk on merge, after <literal>S.hook_on_merge_cat</literal> had been applied.</para>
0962 </listitem>
0963 </varlistentry>
0964 
0965 </variablelist>
0966 You may notice that some logically possible hook insertion lists are missing (e.g. <literal>S.hook_on_merge_msgstr</literal>). This is because they are implemented on demand, as the need is observed in practice, and not before the fact.</para>
0967 
0968 <para>Here is another example of hook interplay. Branch PO files may still rely on <link linkend="p-embctxt">embedding the context</link> into the <varname>msgid</varname> field:
0969 <programlisting language="po">
0970 msgid "create new document|New"
0971 msgstr ""
0972 </programlisting>
0973 but you would nevertheless like to have proper <varname>msgctxt</varname> contexts in the summit:
0974 <programlisting language="po">
0975 msgctxt "create new document"
0976 msgid "New"
0977 msgstr ""
0978 </programlisting>
0979 You can achieve this by writing two small F4A hooks, and inserting them at proper points:
0980 <programlisting language="python">
0981 def context_from_embedded (msg, cat):
0982     if "|" in msg.msgid:
0983         msg.msgctxt, msg.msgid = msg.msgid.split("|", 1)
0984 
0985 def context_to_embedded (msg, cat):
0986     if msg.msgctxt is not None:
0987         msg.msgid = "%s|%s" % (msg.msgctxt, msg.msgid)
0988         msg.msgctxt = None
0989 
0990 S.hook_on_gather_msg_branch = [
0991     (context_from_embedded,),
0992 ]
0993 
0994 S.hook_on_scatter_msg = [
0995     (context_to_embedded,),
0996 ]
0997 </programlisting>
In this way, branch messages will be converted to proper context just before they are gathered into the summit, and the proper context will be converted back into the embedded form when the messages are scattered to branches.</para>
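
<para>The round trip performed by these two hooks can be tried outside of <command>posummit</command> with a stand-in message object. In this sketch, <literal>SimpleNamespace</literal> only imitates the two message attributes that the hooks touch; it is not a Pology message class, and the catalog argument is passed as <literal>None</literal> because neither hook uses it:
<programlisting language="python">
from types import SimpleNamespace

def context_from_embedded (msg, cat):
    if "|" in msg.msgid:
        msg.msgctxt, msg.msgid = msg.msgid.split("|", 1)

def context_to_embedded (msg, cat):
    if msg.msgctxt is not None:
        msg.msgid = "%s|%s" % (msg.msgctxt, msg.msgid)
        msg.msgctxt = None

# Stand-in for a branch message with embedded context (not a Pology object).
msg = SimpleNamespace(msgctxt=None, msgid="create new document|New")

# What would happen on gather: embedded context becomes msgctxt.
context_from_embedded(msg, None)
summit_form = (msg.msgctxt, msg.msgid)

# What would happen on scatter: msgctxt is embedded again.
context_to_embedded(msg, None)
branch_form = (msg.msgctxt, msg.msgid)

print(summit_form)
print(branch_form)
</programlisting>
The second printed pair is the same as the starting state of the message, so applying both hooks in sequence leaves the branch message unchanged.</para>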
0999 
1000 </sect2>
1001 
1002 <sect2 id="sec-sucfgvcs">
1003 <title>Integration with Version Control Systems</title>
1004 
1005 <para>Most likely, branch and summit directories will be kept under some sort of <link linkend="sec-cmsuppvcs">version control</link>. This means that when <command>posummit</command> has finished running, any files that it had added, moved or removed, would have to be manually "reported" to the version control system (VCS). To avoid this, you can set in the summit configuration which VCS is used, among those supported by Pology, and <command>posummit</command> will issue proper VCS commands when changing the file tree. Then, after the <command>posummit</command> run, you can simply issue the VCS commit command on appropriate paths.</para>
1006 
1007 <para>Since different VCS may be used for the summit and the branches, it is possible to set them separately. For example, if branches are in a Subversion repository and the summit in a Git repository, the summit configuration would contain:
1008 <programlisting language="python">
1009 S.summit_version_control = "git"
1010 S.branches_version_control = "svn"
1011 </programlisting>
If the same VCS is used for branches and the summit (whether or not they are in the same repository), only one configuration field needs to be set:
1013 <programlisting language="python">
1014 S.version_control = "git"
1015 </programlisting>
1016 If you would like <command>posummit</command> to execute VCS commands only in the summit and not in branches, then you would set only the <literal>S.summit_version_control</literal> field.</para>
1017 
1018 </sect2>
1019 
1020 <sect2 id="sec-sucfgwrap">
1021 <title>Text Wrapping in PO Files</title>
1022 
<para>While wrapping of text fields in PO files (<varname>msgid</varname>, <varname>msgstr</varname>, etc.) makes no technical difference, it may be <link linkend="sec-cmwrap">convenient for editing</link> for them to be wrapped in a particular way. Since <command>posummit</command> is modifying PO files both in the summit and branches anyway, it might as well be told what kind of wrapping to use.</para>
1024 
1025 <para>For example, a reasonable wrapping setup could be:
1026 <programlisting language="python">
1027 S.summit_wrap = False
1028 S.summit_fine_wrap = True
1029 S.branches_wrap = True
1030 S.branches_fine_wrap = False
1031 </programlisting>
1032 <literal>S.*_wrap</literal> fields activate or deactivate basic (column-based) wrapping, while <literal>S.*_fine_wrap</literal> fields do the same for logical breaks. So in this example, summit messages are wrapped only on logical breaks (may be good for editing), while branch messages are wrapped only on columns (may reduce size of VCS deltas).</para>
1033 
1034 <para>If not set, the default is basic wrapping without fine wrapping, for both branches and the summit.</para>
1035 
1036 </sect2>
1037 
1038 <sect2 id="sec-sucfgviv">
1039 <title>Vivification of Summit PO Files</title>
1040 
<para>In direct summit, summit PO files spring into existence by gathering branch PO files. However, in summit over static templates, by default translators have to start a new PO file by copying over the summit template and initializing it. While dedicated PO editors can do this automatically, all translators in the team have to configure their PO editor correctly (language code, plural forms...), and they have to have templates at hand. Furthermore, any statistic processing on the translation project as a whole has to specifically consider templates as empty PO files.</para>
1042 
1043 <para>Instead of this, it is possible to tell <command>posummit</command> to automatically initialize summit PO files from summit templates -- to "vivify" them -- when the language summit is merged. There is a summit configuration switch to enable vivification, as well as several fields to specify the information needed to initialize a PO file. Here is an example:
1044 <programlisting language="python">
1045 S.vivify_on_merge = True
1046 S.vivify_w_translator = "Simulacrum"
1047 S.vivify_w_langteam = "Nevernissian &lt;l10n@neverwhere.org&gt;"
1048 S.vivify_w_language = "nn"
1049 S.vivify_w_plurals = "nplurals=7; plural=(n==1 ? ...)"
1050 </programlisting>
Setting <literal>S.vivify_on_merge</literal> to <literal>True</literal> engages vivification. The <literal>S.vivify_w_translator</literal> field specifies the value of the <literal>Last-Translator:</literal> header field in the vivified PO file; it can be something catchy rather than a real translator's name, to be able to see later which summit PO files were not yet worked on. <literal>S.vivify_w_langteam</literal> is the contents of the <literal>Language-Team:</literal> header field (team's name and email address), <literal>S.vivify_w_language</literal> of <literal>Language:</literal> (language code), and <literal>S.vivify_w_plurals</literal> of <literal>Plural-Forms:</literal>.</para>
1052 
1053 <para>In summit over dynamic templates, vivification is unconditionally active, whether <literal>S.vivify_on_merge</literal> is set or not. This is because synchronization of branches and the summit is checked by comparing template trees, and summit PO files are the only indicator of "virtual" presence of summit templates (while in summit over static templates, the summit template tree is physically present). Without vivification, it would also be very hard for project-wide statistics to take templates into account as empty summit PO files.</para>
1054 
1055 </sect2>
1056 
1057 <sect2 id="sec-sucfgbmrg">
1058 <title>Merging in Branches</title>
1059 
1060 <para>By default it is assumed that branch PO files are merged with branch templates using a separate mechanism, which was already in place when the summit was introduced into the workflow. In summit over templates modes, if branch merging is performed asynchronously to summit merging, on scatter it may happen that some messages recently added to branch PO file are not yet present in corresponding summit PO file. In that case, <command>posummit</command> will issue warnings about missing messages in the summit. This is normally not a problem, because merging asynchronicity will stop causing such differences as the pre-release message freeze in the source sets in.</para>
1061 
<para>However, on the one hand, warnings about messages missing in the summit may be somewhat disconcerting, or aesthetically offending in the otherwise clean scatter output. On the other hand, perhaps the existing mechanism of merging in branches is not too clean, and it would be nice to replace it with something more thorough. Therefore, in summit over templates modes, it is possible to configure the summit such that on merge, <command>posummit</command> merges not only the summit PO files, but also all branch PO files. This is achieved simply by adding the <literal>merge=</literal> key to each branch that should be merged:
1063 <programlisting language="python">
1064 S.branches = [
1065     dict(id="devel", ..., merge=True),
1066     dict(id="stable", ..., merge=True),
1067 ]
1068 </programlisting>
1069 </para>
1070 
<para>When merging in branches is activated, it is still possible to merge only the summit, or any single branch. This is done by giving an operation target on merge, either the path to the branch top directory or the branch name. For example, in summit over dynamic templates:
1072 <programlisting language="bash">
1073 $ cd $SUMMITDIR
1074 $ posummit merge $DEVELDIR/  # merge only the devel branch
1075 $ posummit merge devel:      # same
1076 $ posummit merge .           # merge only the summit
1077 $ posummit merge +:          # same
1078 </programlisting>
1079 </para>
1080 
1081 </sect2>
1082 
1083 <sect2 id="sec-sucfghead">
1084 <title>Propagation of Header Parts</title>
1085 
1086 <para>PO headers are treated somewhat differently from PO messages in summit operations:
1087 <itemizedlist>
1088 
1089 <listitem>
<para>On gather, almost all of the standard header fields of the <emphasis>primary branch PO file</emphasis> are copied into the summit PO file. The primary branch PO file is defined as the first branch PO file (in case of several branch files being mapped onto the same summit PO file) from the first branch (as listed in the branch specification in summit configuration). The only exception is the <literal>POT-Creation-Date:</literal>, which is set to the time of gathering, if there were any other modifications to the summit PO file. Header comments are not copied over, except when the summit PO file is being automatically created for the first time.</para>
1091 </listitem>
1092 
1093 <listitem>
1094 <para>On merge, the summit PO file is merged with the summit PO template using <command>msgmerge</command>, so its header propagation rules apply. For example, no header comments will be touched, <literal>POT-Creation-Date:</literal> will be copied over from templates but <literal>Last-Translator:</literal> will not be touched, etc. This also means that, by default, any non-standard fields in the template (e.g. those starting with <literal>X-*</literal>) will be silently dropped.</para>
1095 </listitem>
1096 
1097 <listitem>
<para>On scatter, almost the complete header is copied over from the <emphasis>primary summit PO file</emphasis> into the branch PO file. The primary summit PO file is defined as the first mapped summit PO file, in cases when a single branch PO file has been mapped to several summit PO files. The exceptions are <literal>Report-Msgid-Bugs-To:</literal> and <literal>POT-Creation-Date:</literal>, which are preserved as they are in the branch PO file. Also, <literal>PO-Revision-Date:</literal> is set to that of the primary summit PO file only if there were any other modifications to the branch PO file (because it may happen that all updates to the summit PO file since the last scatter were for messages from other branches).</para>
1099 </listitem>
1100 
1101 </itemizedlist>
1102 </para>
1103 
<para>It is possible to influence this default header propagation. In particular, non-standard header fields may be added into branch and summit PO files and templates by different tools, and it may be significant to preserve and propagate these fields in some contexts. The following summit configuration fields can be used for that purpose:
1105 <itemizedlist>
1106 
1107 <listitem>
1108 <para><literal>S.header_propagate_fields</literal> field can be set to a list of non-standard header field names which should be propagated in gather and merge operations, from branch into summit PO files. For example, to propagate fields named <literal>X-Accelerator-Marker:</literal> and <literal>X-Sources-Root-URL:</literal>, the following can be added to summit configuration:
1109 <programlisting language="python">
1110 S.header_propagate_fields = [
1111     "X-Accelerator-Marker",
1112     "X-Sources-Root-URL",
1113 ]
1114 </programlisting>
1115 Only the primary branch PO file is considered for determining the presence and getting the values of these header fields.</para>
1116 </listitem>
1117 
1118 <listitem>
1119 <para>Instead of simply overwriting on scatter most of the branch PO header fields with summit PO header fields, some additional branch fields may be preserved by setting <literal>S.header_skip_fields_on_scatter</literal> to the list of header field names to preserve. For example, to preserve <literal>X-Scheduled-Release-Date:</literal> field in branch PO files:</para>
1120 <programlisting language="python">
1121 S.header_skip_fields_on_scatter = [
1122     "X-Scheduled-Release-Date",
1123 ]
1124 </programlisting>
1125 </listitem>
1126 
1127 </itemizedlist>
1128 </para>
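
<para>The default scatter rules, together with <literal>S.header_skip_fields_on_scatter</literal>, can be summarized by the following sketch. It is a simplification that treats headers as plain dictionaries of field names and values; the function and constant names are hypothetical, the field values are made up, and <command>posummit</command> itself does not necessarily work this way:
<programlisting language="python">
# Fields which are preserved in the branch PO file by default.
PRESERVED_ON_SCATTER = ["Report-Msgid-Bugs-To", "POT-Creation-Date"]

def scatter_header (summit, branch, skip_fields=(), modified=True):
    # Start from the header of the primary summit PO file.
    result = dict(summit)
    preserved = PRESERVED_ON_SCATTER + list(skip_fields)
    # The revision date is taken from the summit only if the branch
    # PO file had other modifications.
    if not modified:
        preserved.append("PO-Revision-Date")
    # Preserved fields keep their branch state (present or absent).
    for name in preserved:
        if name in branch:
            result[name] = branch[name]
        else:
            result.pop(name, None)
    return result

summit = {
    "Last-Translator": "Alice",
    "PO-Revision-Date": "2011-05-01",
    "POT-Creation-Date": "2011-04-20",
}
branch = {
    "Last-Translator": "Bob",
    "PO-Revision-Date": "2011-03-01",
    "POT-Creation-Date": "2011-02-10",
    "X-Scheduled-Release-Date": "2011-06-01",
}
header = scatter_header(summit, branch,
                        skip_fields=["X-Scheduled-Release-Date"])
for name in sorted(header):
    print("%s: %s" % (name, header[name]))
</programlisting>
In the resulting header, the translator and the revision date come from the summit, while the template creation date and the scheduled release date stay as they were in the branch.</para>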
1129 
1130 </sect2>
1131 
1132 <sect2 id="sec-sucfgascf">
1133 <title>Filtering by Ascription on Scatter</title>
1134 
1135 <para><xref linkend="ch-ascript"/> describes a translation review system available in Pology, in which every PO message has its modification and review history kept up to some depth in the past. Based on that history, it is possible to select which messages from working PO files (those under ascription) can be passed into release PO files, provided that these two file trees exist. Summit and branches can be viewed exactly as an instance of such separation, where the summit is the working tree, and each branch a release tree.</para>
1136 
1137 <para>In this context, only the summit tree should be kept under ascription. Filtering for release is then, naturally, performed on scatter: to each summit PO message a sequence of one or more ascription selectors is applied, and if the message is selected by the selector sequence, it is passed into the branch PO file. Several selector sequences may be defined, for use in various release situations, through <literal>S.ascription_filters</literal> configuration field.</para>
1138 
<para>For example, to have a single filtering sequence which simply lets through all messages not modified after last review, the following should be added to summit configuration:
1140 <programlisting language="python">
1141 S.ascription_filters = [
1142     ("regular", ["nmodar"]),
1143 ]
1144 </programlisting>
Each filtering sequence is represented by a two-element tuple. The first element is the name of the filtering sequence, here <literal>regular</literal>. You can set the name to anything you like; when there is only one filtering sequence, the name is not used anywhere later. The second element of the tuple is a list of ascription selectors, which are given just as the values to <option>-s</option> options in <command>poascribe</command> command line. Here only one selector is issued, <literal>nmodar</literal>, which is the negation of the modified-after-review selector. This yields the desired filter to pass all messages "not modified after last review".</para>
1146 
<para>A more involved example would be that of having one filter for regular scatter, and another "emergency" filter, which relaxes the strictness a bit in case there was no time to properly review all new translations. This emergency filter may let through unreviewed messages if modified by a select few persons, who are known to produce translations of sufficient quality on the first attempt. If these persons are, for example, <literal>alice</literal> and <literal>bob</literal> (by their ascription user names), then the two-filter setup could look like this:
1148 <programlisting language="python">
1149 S.ascription_filters = [
1150     ("regular", ["nmodar"]),
1151     ("emergency", ["nmodar:~alice,bob"]),
1152 ]
1153 </programlisting>
The <literal>regular</literal> filter is the same as in the previous example. The <literal>emergency</literal> filter also uses just one <literal>nmodar</literal> selector, but with an additional argument to consider all users except for <literal>alice</literal> and <literal>bob</literal>. Because it is listed first, the <literal>regular</literal> filter is applied on scatter by default. Application of the <literal>emergency</literal> filter is requested by issuing the <option>-a</option>/<option>--asc-filter</option> option with the filter name as value:
1155 <programlisting language="bash">
1156 $ cd $SUMMITDIR
1157 $ posummit scatter -a emergency
1158 </programlisting>
1159 </para>
1160 
1161 <para>When scattering is performed under the ascription filter, messages stopped by the filter will be counted and their number (if non-zero) reported per branch PO file.</para>
1162 
1163 </sect2>
1164 
1165 <sect2 id="sec-sucfgbranch">
1166 <title>Other Branch Options</title>
1167 
1168 <para>Each branch entry in branch specification (<literal>S.branches</literal> configuration field) can have some keys in addition to those described earlier.</para>
1169 
1170 <para>It is possible to exclude some branch PO files from summit operations, or to include only certain branch PO files into summit operations. This is done by setting <literal>excludes=</literal> and <literal>includes=</literal> keys. The value is a list of tests on branch PO file absolute path: if any test matches, the file is matched on the whole (logical OR). Each test can be either a regular expression string, or a function taking the file path as argument and returning a truth value. If only <literal>excludes=</literal> is set, then all files not matched are operated on, and if <literal>includes=</literal> is set, only matched files are operated on. If both keys are set, then only files matched by <literal>includes=</literal> and not matched by <literal>excludes=</literal> are operated on.</para>
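
<para>The combination rules for <literal>includes=</literal> and <literal>excludes=</literal> can be sketched as follows. The function names and file paths are hypothetical, and it is an assumption of this sketch that regular expressions are searched for anywhere in the path rather than anchored at its start:
<programlisting language="python">
import re

def matches_any (path, tests):
    # Logical OR over all tests in the list.
    for test in tests:
        if isinstance(test, str):
            if re.search(test, path):
                return True
        elif test(path):
            return True
    return False

def is_operated_on (path, includes=None, excludes=None):
    # With includes set, only matched files are operated on.
    if includes is not None and not matches_any(path, includes):
        return False
    # With excludes set, matched files are not operated on.
    if excludes is not None and matches_any(path, excludes):
        return False
    return True

def is_draft (path):
    return path.endswith(".draft.po")

includes = [r"/docs/"]
excludes = [r"/docs/obsolete/", is_draft]
paths = [
    "/home/l10n/devel/docs/foobaz-manual.po",
    "/home/l10n/devel/docs/obsolete/old-manual.po",
    "/home/l10n/devel/docs/foobaz-manual.draft.po",
    "/home/l10n/devel/apps/foobaz.po",
]
for path in paths:
    print(path, is_operated_on(path, includes, excludes))
</programlisting>
Only the first path passes both keys: the second and third are stopped by <literal>excludes=</literal>, and the last one is not matched by <literal>includes=</literal>.</para>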
1171 
1172 <para>If branches are under version control and <command>posummit</command> is told to <link linkend="sec-sucfgvcs">issue version control commands</link> as appropriate (i.e. <literal>S.branches_version_control</literal> configuration field is set), it is possible to exclude a specific branch from this, by setting its <literal>skip_version_control=</literal> key to <literal>True</literal>.</para>
1173 
1174 <para>When a message from a non-primary branch PO file does not exist in the primary branch PO file, on gathering it is inserted into the summit PO file by evaluating its similarity to other messages (among other criteria). When the branch PO file has many and long messages, these similarity checks can cause excessive gathering times (in hours), to the point of making the summit operation infeasible. As a compromise, the <literal>insert_nosim=</literal> key can be set to <literal>True</literal> to skip insertion by similarity on that branch. If only some PO files are exhibiting this problem, <literal>insert_nosim=</literal> can be instead set to a list of regular expression strings to match those particular files, by their branch domain name. Finally, the value can also be a function which takes the branch domain and subdirectory path as arguments (e.g. <literal>test_insert_nosim(domain, subdir)</literal>) and returns <literal>True</literal> to skip similarity check.</para>
1175 
1176 </sect2>
1177 
1178 <sect2 id="sec-sucfgbonmrg">
1179 <title>Other Merge Options</title>
1180 
<para>As is usual, merging performed by <command>posummit</command> by default produces <link linkend="sec-pofuzzy">fuzzy messages</link>, in summit PO files as well as in branch PO files if <link linkend="sec-sucfgbmrg">merging in branches</link> is enabled. It is possible to prevent fuzzy matching, by setting <literal>S.summit_fuzzy_merging</literal> and <literal>S.branches_fuzzy_merging</literal> configuration fields to <literal>False</literal>. There should be little reason to disable fuzzy matching in summit PO files, but it may be convenient to do so in branch PO files, which are not directly translated. For example, lack of fuzzy messages will lead to smaller version control deltas.</para>
1182 
1183 <para>Fuzzy messages are by default produced by <command>msgmerge</command> alone. This can be more finely tuned by processing the PO file before and after it has been merged, as done by <link linkend="sec-miselfmerge">the <command>poselfmerge</command> command</link>. The <literal>S.merge_min_adjsim_fuzzy</literal> configuration field can be set to a number in range from 0 to 1, having the same effect on fuzzy matching as the <option>-A</option>/<option>--min-adjsim-fuzzy</option> option of <command>poselfmerge</command>. The <literal>S.merge_rebase_fuzzy</literal> field can be set to <literal>True</literal>, with the same meaning as the <option>-b</option>/<option>--rebase-fuzzies</option> option of <command>poselfmerge</command>.</para>
1184 
1185 <para>Summit PO files may be merged by consulting a compendium, to produce additional exact and fuzzy matches. This possibility also draws on the functionality provided by <command>poselfmerge</command>. The <literal>S.compendium_on_merge</literal> configuration field is used to set the path to a compendium<footnote>
1186 <para>Here you can also use the <literal>S.relpath()</literal> function, to have the compendium path be relative to the directory of the summit configuration file.</para>
1187 </footnote>, equivalently to the <option>-C</option>/<option>--compendium</option> option of <command>poselfmerge</command>. Since compendium matches are less likely to be appropriate than own matches, you may set the <literal>S.compendium_fuzzy_exact</literal> field to <literal>True</literal>, or the <literal>S.compendium_min_words_exact</literal> field to a positive integer number, with the same effect as <option>-x</option>/<option>--fuzzy-exact</option> and <option>-W</option>/<option>--min-words-exact</option> options of <command>poselfmerge</command>, respectively.</para>
1188 
1189 </sect2>
1190 
1191 <sect2 id="sec-sucfgbonsct">
1192 <title>Other Scatter Options</title>
1193 
1194 <para>Sometimes a summit PO file may be "pristine", meaning that all messages in it are clear, neither translated nor fuzzy. Pristine summit PO files may appear, for example, when <link linkend="sec-sucfgviv">vivification</link> is active. A pristine summit PO file will by default cause a likewise empty branch PO file to appear on scatter. This may or may not be a problem in a given project. If it is a problem, it is possible to set the minimal translation completeness of a summit PO file at which the branch PO file will be created on scatter. For example:
1195 <programlisting language="python">
1196 S.scatter_min_completeness = 0.8
1197 </programlisting>
1198 sets the minimum completeness to scatter at 80%. Completeness is taken to be the ratio of the number of translated to all messages in the file (excluding obsolete).</para>
1199 
1200 <para>Translation completeness of a summit PO file may deteriorate over time, as it is periodically gathered or merged, and no one comes around to update the translation. At some point, the completeness may become too low to be considered useful, so that it is better to stop releasing remaining translations in that file until it is updated. The completeness at which this happens, at which the branch PO file is automatically cleared of all translations on scatter, can be set through <literal>S.scatter_acc_completeness</literal> configuration field. The meaning of the value of this field is the same as for <literal>S.scatter_min_completeness</literal>; in fact, one might ask why not simply use <literal>S.scatter_min_completeness</literal> for this purpose as well. The reason is that sometimes a higher bar is put for the initial release, and having two separate configuration fields enables you to make this difference.</para>
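
<para>The interplay of the two thresholds can be sketched as follows. The function names and returned action names are hypothetical, and it is an assumption of this sketch that a summit PO file exactly at a threshold still passes it:
<programlisting language="python">
def completeness (n_translated, n_fuzzy, n_untranslated):
    # Ratio of translated to all messages; obsolete messages not counted.
    n_all = n_translated + n_fuzzy + n_untranslated
    if n_all == 0:
        return 0.0  # assumption: a file without messages is not complete
    return float(n_translated) / n_all

def scatter_action (ratio, branch_file_exists,
                    min_completeness, acc_completeness):
    if not branch_file_exists:
        # Initial release, governed by S.scatter_min_completeness.
        if ratio >= min_completeness:
            return "create"
        return "skip"
    # Later releases, governed by S.scatter_acc_completeness.
    if ratio >= acc_completeness:
        return "update"
    return "clear"

# New file: 850 translated, 50 fuzzy, 100 untranslated messages.
ratio = completeness(850, 50, 100)
print(scatter_action(ratio, False, 0.8, 0.6))
# Existing file whose completeness has dropped to one half.
print(scatter_action(0.5, True, 0.8, 0.6))
</programlisting>
With a higher bar of 0.8 for the initial release and a lower bar of 0.6 afterwards, the first file is created in the branch, while the second is cleared of translations until the summit PO file is updated.</para>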
1201 
1202 </sect2>
1203 
1204 </sect1>
1205 
1206 <!-- ======================================== -->
1207 <sect1 id="sec-suproblems">
1208 <title>Disadvantages to Summit Workflow and Remedies</title>
1209 
1210 <para>Although hopefully shadowed by the advantages, working in summit is not without some disadvantages. These should be weighed in when deciding on whether to try out the summit workflow.</para>
1211 
1212 <para>In summit over template modes, any changes made manually in branch PO files will not propagate into summit, and will be soon lost to scattering. This means that the whole translation team must work in the summit. It is not possible for some members to use the summit, and some not. In direct summit mode, modifying branches directly would be even messier, as some changes would find their way into the summit and some not, depending on which branch contains the change and the order of gather and scatter operations.</para>
1213 
<para>A summit PO file will necessarily have more messages than any of the branch files. For example, in two successive development-stable branch cyclings within the KDE translation project (at the time about 1100 PO files with 750,000 words), summit PO files were on average 5% bigger (by number of words) than their branch counterparts. This percentage can be taken as the upper limit of possibly wasted translation effort due to messages in the development branch coming and going, given that as the next branch cycling approaches more and more messages become fixed and make it into the next stable branch.</para>
1215 
<para>A more pressing issue with the increased size of summit PO files is the following scenario: the next stable release is around the corner, and the translation team has no time to update summit PO files fully, but could update only the stable messages in them. For example, there are 1000 untranslated and fuzzy messages in the summit, of which only 50 come from the stable branch. A clever dedicated PO editor could allow jumping only through those untranslated and fuzzy messages which additionally satisfy a general search criterion, in this case that a comment matches the <literal>\+>.*stable</literal> regular expression (assuming the stable branch is named <literal>stable</literal> in the summit configuration). Lacking such a feature, it is enough, with some external help, if the editor can merely search through comments. First, Pology's <link linkend="ch-sieve"><command>posieve</command> command</link> can be used to equip all untranslated and fuzzy stable messages in summit PO files with an <literal>untranslated</literal> flag (producing a <literal>#, ..., untranslated</literal> comment):
<programlisting language="bash">
$ posieve tag-untranslated -sbranch:stable -swfuzzy <replaceable>paths...</replaceable>
</programlisting>
Then, in the PO editor, you can jump through incomplete stable messages by simply searching for this flag. While doing that, you are not obliged to remove the flag manually: it will either disappear automatically on the next merge, or you can remove all the flags afterwards by running:
<programlisting language="bash">
$ posieve tag-untranslated -sstrip <replaceable>paths...</replaceable>
</programlisting>
</para>
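<para>To make the comment search criterion mentioned above more concrete, here is a rough sketch of what the <literal>\+>.*stable</literal> expression does and does not match. The <literal>#. +> ...</literal> comment layout, the branch names <literal>devel</literal> and <literal>old-stable</literal>, and the helper <literal>lists_stable</literal> are assumptions made for illustration; none of them is part of Pology:</para>
<programlisting language="python">
import re

# The expression quoted in the text.
STABLE_RE = re.compile(r"\+>.*stable")
# A stricter, illustrative variant: "stable" must stand as a
# whitespace-delimited name of its own after the "+>" marker.
STRICT_RE = re.compile(r"\+>(.*\s)?stable(\s|$)")

def lists_stable(comment, pattern=STABLE_RE):
    """Return True if the comment line matches the given pattern."""
    return pattern.search(comment) is not None

# Hypothetical branch comments as they might appear in a summit PO file.
samples = [
    "#. +> devel stable",
    "#. +> devel",
    "#. +> devel old-stable",
]

loose = [lists_stable(s) for s in samples]
strict = [lists_stable(s, STRICT_RE) for s in samples]
print(loose)
print(strict)
</programlisting>
<para>The expression from the text also matches a branch name that merely contains <literal>stable</literal>, such as the hypothetical <literal>old-stable</literal>. If the summit configuration has such a branch, a stricter expression along the lines of the second one may be preferable in the editor's search.</para>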

<para>There are some organizational issues with starting to use the summit and, if it turns out to be counterproductive, with ceasing to use it. Team members first have to be reminded not to send in or commit branch PO files, and then, if the summit is disbanded, to be reminded to go back to branch PO files. On the plus side, disbanding the summit is technically simple: removing its top directory and the summit configuration file will do the job.</para>

</sect1>

</chapter>