Extended Reach:
An Efficient Content Management Technique
for Sharing and Localizing Content



            IBM Technical Report TR-40.0032
            December, 2003


Sheila Monheit                   David Leip
IBM Corporate Webmasters         IBM Corporate Webmasters
San Jose, CA, United States      Hawthorne, NY, United States
Monheit@us.ibm.com               Leip@us.ibm.com

Sara Elo Dean                    Hidekazu Shirayama
IBM Corporate Webmasters         IBM Corporate Webmasters
Helsinki, Finland                Tokyo, Japan
EloDean@fi.ibm.com               Flyhard@jp.ibm.com

Table of Contents

1    Introduction
2    Objectives
3    IBM URI Taxonomy
4    Approach
  4.1     ibm.com Content Model
  4.2     Multi-Page Publish Scheme
  4.3     Enabling Localized Content
  4.4     Automating Country Code References
  4.5     Shared vs. Localized Text Blocks
  4.6     Leadspace Rotation
  4.7     Hybrid Approach
5    Evaluating Extended Reach in Pilots
  5.1     Pilot 1: Basic extended reach with identical content
  5.2     Pilot 2: Enhanced extended reach with localized content
  5.3     Pilot 2 Evaluation
6    Future Work

1 Introduction
For global companies such as IBM, it is important from a marketing and brand perspective that
they represent themselves as being “in touch” with the many local national markets in which
they do business. This applies to all aspects and representations of the corporation, including
their web presence. In some cases these markets can be quite small, and it can be difficult to
justify the investment to create and maintain separate web content for each of these markets
individually. The alternative, simply grouping countries together and creating a single web site
for a region, is not particularly attractive. It leaves that set of end users feeling not on par with
the corporation’s larger markets.

A large corporate web site such as ibm.com is faced with the challenge to serve as wide a set of
customers as efficiently as possible. Two strategies exist for achieving this goal. The first is to
leverage the same content across different formats. For example, the ibm.com corporate news
content is shared across XHTML for the standard web browsers, WML, HDML, cHTML for
pervasive devices, and RSS for content syndication. The second approach is to share the same
content across different sites. This paper discusses the second approach, named Extended
Reach. Specifically, the paper explains how IBM has set up multiple country portals that can
be managed, from a content maintenance perspective, as a single portal.

2 Objectives
The Extended Reach project has three main business goals:

   1. To make ibm.com available on a wider basis worldwide
   2. To reduce the workload of maintaining country portals, especially for smaller countries
   3. To flaunt the “I” (International) in IBM

IBM took the early lead in establishing a web presence for quite a few countries, more than its
competitors. In recent years some of its larger competitors (Dell, HP & Microsoft) surpassed
IBM, creating a web presence in more countries. With the rollout of the Extended Reach
project, IBM has regained the leadership position. Today IBM presents a country portal in 83
countries, while Dell, HP, Microsoft and other competitors cover fewer countries.

3 IBM URI Taxonomy
The IBM URI taxonomy centers on subject matter keywords in English and the ISO standard for
two-letter country and language codes [1]. These elements allow presenting a web site visitor
with consistent naming conventions across applications and web sites worldwide.

Examples:

   •   http://www.ibm.com/ibm/au (About IBM in Australia)
   •   http://www.ibm.com/news/ve (News in Venezuela)
   •   http://www.ibm.com/servers/de (Servers in Germany)
If more than one language is used for a country, URIs follow the /<cc>/<lc> format, where <cc>
is the two-letter country code specified in ISO 3166-1 (the ISO 3166-1 alpha-2 code elements)
and <lc> is the two-letter language code.

Examples:

   •   http://www.ibm.com/e-business/ch/fr (e-business in Switzerland, French version)
   •   http://www.ibm.com/products/ca/fr (Products & Services in Canada, French version)
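
The taxonomy above can be illustrated with a minimal Python sketch. It is not part of the ibm.com tooling; the function name build_uri and the BASE constant are assumptions made for illustration, and only the /<keyword>/<cc>/<lc> pattern itself comes from the text.

```python
from typing import Optional

BASE = "http://www.ibm.com"  # host used in the examples above


def build_uri(keyword: str, cc: str, lc: Optional[str] = None) -> str:
    """Compose /<keyword>/<cc> or /<keyword>/<cc>/<lc> under the base host.

    keyword is the English subject keyword, cc the two-letter country code,
    and lc the optional two-letter language code.
    """
    parts = [BASE, keyword, cc.lower()]
    if lc:
        parts.append(lc.lower())
    return "/".join(parts)


if __name__ == "__main__":
    print(build_uri("news", "ve"))            # News in Venezuela
    print(build_uri("products", "ca", "fr"))  # Products in Canada, French
```

Because only the country and language segments vary, the same keyword can be reused for every country in a group.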


Top-level, or root level, directories are restricted to IBM registered trademarks and service
marks, and major, global, cross divisional content areas such as /e-business, /thinkpad, /services
and /products. These keywords must be in English only. For worldwide consistency, URIs are
not translated to the local language. Use of regional web sites and regional URIs is strongly
discouraged.

Furthermore, if consistent URIs are not or cannot be implemented for strategic pages due to
application constraints, the ibm.com web servers are configured with redirects so that the
advertised URI still abides by the URI taxonomy.

Examples:

   •   http://www.ibm.com/shop/it/customerservice (Online customer support Italy) redirects to
       http://www-134.ibm.com/webapp/wcs/stores/servlet/HelpDisplay?subject=2294556&storeId=380&catalogId=-380&langId=-4
   •   http://www.ibm.com/shop/uk/help (Online shopping support UK) redirects to
       http://www-134.ibm.com/webapp/wcs/stores/servlet/HelpDisplay?storeId=826&catalogId=-826&dualCurrId=20&langId=826&subject=2294556


The Extended Reach technique builds on the fact that the IBM web URI taxonomy is country
code centric. URIs between corresponding pages for countries vary in general only by country
code. This enables URIs to be programmatically localized for countries within a group.



4 Approach
The Extended Reach technique is applicable for a group of country web sites with the following
criteria:
    • Maximum content sharing across multiple countries. The goal is to share most of the
         content that makes up the web site, with only a small amount of unique information
         maintained separately for each country. Allow for variation in content where a country
         has a local business need.
    • Group similar small market countries together based on common language and region.
         For example:
             o 20 Caribbean English language countries
             o 7 ASEAN English language countries
         Due to translation issues, it is not possible to share content between different languages.
    • Enforce a standard layout.
    • Support rotation of content to give a greater sense of freshness and even uniqueness
         across countries.
    • Comply with a standard URI taxonomy to enable the automated localization of standard
         URIs.
    • Cater for automated country name substitution, but with care.



4.1 ibm.com Content Model
Today, a content management system based on the Extensible Markup Language (XML) is used
to create and maintain ibm.com country portals. By encoding content in XML and layout logic
in the Extensible Stylesheet Language (XSL), the system enforces the separation of content and
presentation. The system also supports reusable XML fragments and manages the dependencies
between such fragments. Using a Java-based user interface, a content editor can upload XSL
stylesheets and multimedia objects, create and edit XML content fragments, compose pages out
of fragments, preview pages, review final published pages, and reject them or promote them to
the final stage in the publishing flow [2].

Every ibm.com web page consists of several fragments: a masthead, footer, left and right
navigation bars, and the main white space. Each of these is built as a separate XML fragment
included in one or more XML documents, or servables. The XML fragments and servables
conform to Document Type Definitions (DTDs). Fragments correspond to reusable components
such as a navigation bar, an image, or a link, and servables to specific page types, such as an
index page, a homepage, or a news article. An XML servable may contain fragments that are
unique to the white space of the page type or reusable fragments. An XML servable is
transformed to output pages in various formats by dedicated XSL stylesheets that control the
presentation of a page. Thus content input and output presentation are tightly controlled by the
appropriate servable and fragment DTDs and the XSL stylesheets.


4.2 Multi-Page Publish Scheme
For countries not within Extended Reach, ibm.com corporate portal country pages are generated
on a 1-1 basis. One input XML servable transformed with one XSL generates one output page
(in HTML, WML, HDML, or RSS format) for one country in one language. Thus, ten XML
servables tagged for ten different countries are transformed by one XSL stylesheet, generating
ten resulting pages. In this way the IBM standard layout, along with the tight DTD control over
the page content, are ensured across every country portal page.

Extended Reach presented the challenge of creating more than one output page from one XSL
transformation of one input XML. The input XML was now a fully reusable XML servable,
made up of already reused fragments.

The existing content model and content were analyzed to identify how content could be
efficiently shared across a group of countries. Countries that share a common language and
common content could be grouped together.
The first design introduced no changes to the DTDs in order to avoid the maintenance of two sets
 of DTDs, one set for countries with unique content and one for the Extended Reach countries
 with identical content. The Extended Reach technique was implemented as a multi-page
 publishing scheme in the XSL stylesheets. The existing XSL logic was enhanced to include a
 looping mechanism. The new logic could generate multiple outputs from a single XML and
 result in a distinct ibm.com corporate portal page for each specified target country. The output
 pages were identical in content, apart from the automated localization of the masthead, footer
 and URIs.

 Once the groupings of countries had been identified, rendering the countries to IBM standard
 layouts became very straightforward. Within every XML servable is a COUNTRY element tag,
 which specifies the target country page being generated. By adding this tag multiple times, the
 stylesheet can process any number of countries.

 Single country tagging:
 <COMMON>
        <LANGUAGE>en</LANGUAGE>
        <COUNTRY>bd</COUNTRY>
 </COMMON>

 Multiple country tagging:
 <COMMON>
        <LANGUAGE>en</LANGUAGE>
        <COUNTRY>bd</COUNTRY>
        <COUNTRY>lk</COUNTRY>
        <COUNTRY>vn</COUNTRY>
        <COUNTRY>ph</COUNTRY>
        <COUNTRY>my</COUNTRY>
        <COUNTRY>th</COUNTRY>
        <COUNTRY>id</COUNTRY>
        <AUDIENCE>all</AUDIENCE>
 </COMMON>
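
 The looping mechanism can be sketched outside XSL as well. The Python example below is only an illustration of the idea, not the production stylesheet: the SERVABLE root element, the TITLE element and the generated HTML shell are assumptions, while the COMMON, LANGUAGE, COUNTRY and AUDIENCE elements follow the tagging shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical servable: only the COMMON block mirrors the paper's tagging.
SERVABLE = """
<SERVABLE>
  <TITLE>News</TITLE>
  <COMMON>
    <LANGUAGE>en</LANGUAGE>
    <COUNTRY>bd</COUNTRY>
    <COUNTRY>lk</COUNTRY>
    <COUNTRY>vn</COUNTRY>
    <AUDIENCE>all</AUDIENCE>
  </COMMON>
</SERVABLE>
"""


def publish(servable_xml: str) -> dict:
    """Return one rendered page per COUNTRY element, keyed by country code."""
    root = ET.fromstring(servable_xml)
    language = root.findtext("COMMON/LANGUAGE", default="").strip()
    title = root.findtext("TITLE", default="")
    pages = {}
    for country in root.findall("COMMON/COUNTRY"):
        cc = (country.text or "").strip()  # tolerate stray whitespace
        # Stand-in for the real transformation: one output page per country.
        pages[cc] = (
            f'<html lang="{language}"><head>'
            f"<title>{title} ({cc})</title></head></html>"
        )
    return pages


if __name__ == "__main__":
    for cc, page in publish(SERVABLE).items():
        print(cc, page)
```

 Adding a further COUNTRY element to the input yields one more output page without any change to the loop.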


 An XML servable also contains the STYLESHEET tag, which identifies the XSL stylesheet to
 transform with:

 <STYLESHEET>regional_newsindex_xml_html.xsl</STYLESHEET>

 4.3    Enabling Localized Content

The first Extended Reach implementation successfully created multiple near-identical,
automatically localized output pages and enforced the IBM layout standard. However,
the approach was too rigorous: identical pages left no room for unique country
distinctions. Some ASEAN Extended Reach candidate countries were unable to adopt
the technique because the design did not allow for any localization on the pages. An
enhanced design needed to allow for some custom content identification within the
existing page structures defined in the DTDs.

A content analysis of countries in the same region provided insight into the localization
requirements. Fig 1 and Fig 2 show the www.ibm.com homepages for Malaysia and
Indonesia:
Fig 1: www.ibm.com/my
[Screenshot of the Malaysian homepage. Callouts mark links to www.ibm.com/planetwide/select,
www.ibm.com/my/offers/thinkpad/ and www.ibm.com/services/my/. Every link on the page refers
either to a country-specific page or a www.ibm.com general page. The country code occurs
anywhere within the URI, or not at all.]

Fig 2: www.ibm.com/id
[Screenshot of the Indonesian homepage. Callouts mark links to www.ibm.com/planetwide/select,
www.ibm.com/services/bcs/id/ and www.ibm.com/services/id/. Leadspace views rotate per hit, for
every country.]

Fig 3: Services links
[Side-by-side screenshots of the Services section. The Malaysian homepage includes an optional
link to www.ibm.com/financing/my; the Indonesian homepage has no optional link.]

    Further investigation of content, such as the lists of links in Fig 3, reveals the following:

     •   The URI taxonomy is consistent within defined sections on a page, so country
         references in URIs can be generated automatically.
     •   Some links appear only for a subset of countries in a group, so country tagging
         of individual links must be supported.
     •   Some text blocks are identical across all countries with the exception of the
         local country name, so country names within text could be substituted
         automatically.



   4.4 Automating Country Code References
   Before Extended Reach, links, such as the ones in Fig 3, were defined in XML as ITEM_TITLE
   and ITEM_URL element pairs. The sample below defines the left navigation bar on the
   www.ibm.com/us homepage:

   <PRIMARY_LINKS>
           <ITEM>
                 <ITEM_TITLE>Home / home office</ITEM_TITLE>
                 <ITEM_URL>http://www.ibm.com/homeoffice/</ITEM_URL>
           </ITEM>
   </PRIMARY_LINKS>
   <PRIMARY_LINKS>
           <ITEM>
                 <ITEM_TITLE>Small &amp; medium business</ITEM_TITLE>
                 <ITEM_URL>http://www.ibm.com/businesscenter/us/</ITEM_URL>
           </ITEM>
   </PRIMARY_LINKS>
   <PRIMARY_LINKS>
           <ITEM>
                 <ITEM_TITLE>Large enterprise</ITEM_TITLE>
                 <ITEM_URL>http://www.ibm.com/largeenterprise/us/</ITEM_URL>
           </ITEM>
   </PRIMARY_LINKS>
The transforming XSL loops over all the PRIMARY_LINKS elements and generates the
following output HTML:

                            http://www.ibm.com/hom
                            http://www.ibm.com/businesscenter/us/
                            http://www.ibm.com/largeenterprise/us/



Based on the definition in the XML, all the links point to US URIs and every title and URI pair
is included in the output. No mechanism, or need, exists to specify conditions of links, such as
their presence or absence in the output, because the navigation bar is dedicated to the US.

The following patterns were defined to enable flexible localization of links. Content and XSL
stylesheets were enhanced to respectively include and process the new logic.


%%CC                                 substitute every country (cc) listed under
                                     <COMMON/COUNTRY> in the URI string
%%INCLIST_cc_cc_%%                   substitute ONLY countries included in the INCLIST
                                     string
[[%%INCLIST_cc_cc_%%]]               include this link (which contains no CC references
                                     at all, e.g. www.ibm.com) for countries in the
                                     INCLIST (note: this string is added at the end of
                                     the URI string)
%%EXCLIST_cc_cc_%%                   substitute ONLY countries NOT included in the
                                     EXCLIST string
[[%%EXCLIST_cc_cc_%%]]               include this link (which contains no CC references
                                     at all, e.g. www.ibm.com) for countries NOT
                                     included in the EXCLIST (note: this string is added
                                     at the end of the URI string)

Going back to the sample services section in Fig 3, the XML for that section in the new syntax
becomes:
<SERVICES_BOX>
       <SERVICES_GRAY_TITLE>Services</SERVICES_GRAY_TITLE>
       <SERVICES_LINKS>
              <LINK_TEXT>Business and IT services</LINK_TEXT>
              <LINK_URL>http://www.ibm.com/services/%%CC/</LINK_URL>
       </SERVICES_LINKS>
       <SERVICES_LINKS>
              <LINK_TEXT>Business consulting services</LINK_TEXT>
              <LINK_URL>http://www.ibm.com/bcs/%%CC/</LINK_URL>
       </SERVICES_LINKS>
       <SERVICES_LINKS>
              <LINK_TEXT>Infrastructure services</LINK_TEXT>
              <LINK_URL>http://www.ibm.com/services/%%CC/strategy/capability/fullinfra.html</LINK_URL>
       </SERVICES_LINKS>
       <SERVICES_LINKS>
              <LINK_TEXT>On demand services</LINK_TEXT>
              <LINK_URL>http://www.ibm.com/services/%%CC/ondemand/</LINK_URL>
       </SERVICES_LINKS>
       <SERVICES_LINKS>
              <LINK_TEXT>Financing</LINK_TEXT>
              <LINK_URL>http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/</LINK_URL>
       </SERVICES_LINKS>
</SERVICES_BOX>


Further down in the same XML servable the country definitions are:

<COMMON>
       <LANGUAGE>en</LANGUAGE>
       <COUNTRY>my</COUNTRY>
       <COUNTRY>ph</COUNTRY>
       <COUNTRY>th</COUNTRY>
       <COUNTRY>id</COUNTRY>
</COMMON>

The output seen in Fig 3 for the Malaysian and Indonesian homepages is generated by the
Extended Reach XSL below:

<xsl:template name="regionalLinks">
   <xsl:param name="cc"/>
   <xsl:param name="link"/>
   <xsl:choose>
      <xsl:when test="contains($link,'%%CC')">
         <xsl:value-of select="concat(substring-before($link,'%%CC'),$cc,substring-after($link,'%%CC'))"/>
      </xsl:when>
      <xsl:when test="contains($link,'%%INCLIST_')">
         <xsl:variable name="IncList" select="substring-before(substring-after($link,'%%INCLIST_'),'%%')"/>
         <!--xsl:value-of select="concat('this is dolist variable:', $doList)"/-->
         <xsl:choose>
            <xsl:when test="contains($IncList,$cc)">
               <xsl:choose>
                  <xsl:when test="contains($link,'[[%%INCLIST_')">
                     <xsl:value-of select="substring-before($link,'[[%%INCLIST')"/>
                  </xsl:when>
                  <xsl:otherwise>
                     <xsl:value-of select="concat(substring-before($link,'%%INCLIST_'),$cc,substring-after($link,'_%%'))"/>
                  </xsl:otherwise>
               </xsl:choose>
            </xsl:when>
            <xsl:otherwise>
               <xsl:value-of select="''"/>
            </xsl:otherwise>
         </xsl:choose>
      </xsl:when>
      <xsl:when test="contains($link,'%%EXCLIST_')">
         <xsl:variable name="ExcList" select="substring-before(substring-after($link,'%%EXCLIST_'),'%%')"/>
         <!--xsl:value-of select="concat('this is dolist variable:', $doList)"/-->
         <xsl:choose>
            <xsl:when test="contains($ExcList,$cc)">
               <xsl:value-of select="''"/>
            </xsl:when>
            <xsl:otherwise>
               <xsl:choose>
                  <xsl:when test="contains($link,'[[%%EXCLIST_')">
                     <xsl:value-of select="substring-before($link,'[[%%EXCLIST')"/>
                  </xsl:when>
                  <xsl:otherwise>
                     <xsl:value-of select="concat(substring-before($link,'%%EXCLIST_'),$cc,substring-after($link,'_%%'))"/>
                  </xsl:otherwise>
               </xsl:choose>
            </xsl:otherwise>
         </xsl:choose>
      </xsl:when>
      <xsl:otherwise>
         <xsl:value-of select="$link"/>
      </xsl:otherwise>
   </xsl:choose>
</xsl:template>
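
For readers less familiar with XSL, the same decision logic can be followed in a short Python port. This is an illustrative sketch rather than code used on ibm.com: the names localize_link, substring_before and substring_after are assumptions, with the two helpers written to behave like the XPath functions of the same name.

```python
def substring_before(text: str, sep: str) -> str:
    """Like XPath substring-before: empty string when sep is absent."""
    head, found, _ = text.partition(sep)
    return head if found else ""


def substring_after(text: str, sep: str) -> str:
    """Like XPath substring-after: empty string when sep is absent."""
    return text.partition(sep)[2]


def localize_link(link: str, cc: str) -> str:
    """Resolve %%CC, %%INCLIST_..._%% and %%EXCLIST_..._%% for one country.

    Returns an empty string when the link does not apply to the country.
    """
    if "%%CC" in link:
        # As in the XSL, only the first %%CC occurrence is replaced.
        return substring_before(link, "%%CC") + cc + substring_after(link, "%%CC")

    for marker, keep_if_listed in (("%%INCLIST_", True), ("%%EXCLIST_", False)):
        if marker not in link:
            continue
        # The country list sits between the marker and the next "%%".
        listed = substring_before(substring_after(link, marker), "%%")
        if (cc in listed) != keep_if_listed:
            return ""  # link is not applicable to this country
        if "[[" + marker in link:
            # Bracketed form: the URI has no country code, drop the suffix.
            return substring_before(link, "[[" + marker)
        return substring_before(link, marker) + cc + substring_after(link, "_%%")

    return link  # no pattern present: pass the link through unchanged


if __name__ == "__main__":
    financing = "http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/"
    for code in ("my", "ph", "th", "id"):
        print(code, repr(localize_link(financing, code)))
```

As in the template, membership in the list is a plain substring test, which is sufficient for two-letter codes separated by underscores.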

A detailed explanation of the XSL follows:

The XSL template is passed two parameters from the parent routine:
         1. cc, which is the country code of the current pass of the FOR-EACH
            loop over COMMON/COUNTRY:
                      <COMMON>
                            <LANGUAGE>en</LANGUAGE>
                            <COUNTRY>my</COUNTRY>
                            <COUNTRY>ph</COUNTRY>
                            <COUNTRY>th</COUNTRY>
                            <COUNTRY>id</COUNTRY>
                      </COMMON>
            In the first pass cc=my (Malaysia), then ph (Philippines) and so on.

         2. link, which is the string containing the URI information, the contents
            of the <LINK_URL> element:
            <LINK_URL>http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/</LINK_URL>

The template above is executed within the COMMON/COUNTRY for-each loop N times, once for
each URI that requires processing. In this example, the cc variable does not change value
until all the LINK_URL elements are processed. At that point the cc variable is assigned the
value of the next COUNTRY element and the processing of each LINK_URL is repeated.


The links being processed are in order:
           1. http://www.ibm.com/services/%%CC/
           2. http://www.ibm.com/bcs/%%CC/
           3. http://www.ibm.com/services/%%CC/strategy/capability/fullinfra.html
           4.   http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/
The following XSL logic occurs for each pass through these links:

For the first three links (1, 2 and 3) the value of cc, the country being processed, is substituted
directly into the link string at the exact location of the %%CC notation. Thus, when processing
the first COMMON/COUNTRY element my, the first three links print as

                     http://www.ibm.com/services/my/
                     http://www.ibm.com/bcs/my/
                     http://www.ibm.com/services/my/strategy/capability/fullinfra.html

and when processing the second COMMON/COUNTRY element ph, the same links print as
                     http://www.ibm.com/services/ph/
                     http://www.ibm.com/bcs/ph/
                     http://www.ibm.com/services/ph/strategy/capability/fullinfra.html

The processing of the fourth link is more complicated.
               http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/

When the XSL encounters the %%INCLIST or %%EXCLIST pattern, it triggers two
conditional tests:

1. First, it parses the string up to the closing _%% to see whether the current cc variable is
relevant for this string. In this case, the Malaysia (my), Thailand (th), and Philippines (ph)
homepages should all include this URI. The Indonesia (id) homepage should not include it.

This could also have been represented as:
               http://www.ibm.com/financing/%%EXCLIST_id_%%/

and would have produced the same results. For the EXCLIST pattern, the conditional test parses
the string to see if the current cc is NOT in the list, and if so, the link is included.

2. Second, if the URI string is applicable to the current cc variable, the next conditional test
determines whether the URI string contains a country reference in its syntax, or whether it is
a general ibm.com URI that has no country reference at all. This test performs a second parse
on the INCLIST or EXCLIST pattern to determine whether the pattern is at the end of the URI
string and is surrounded by the [[ opening and ]] closing brackets. If so, the URI string does
not include a country reference and is output without a country code.

An example is the pattern:

                http://www.lotus.com/[[%%INCLIST_my_%%]]

which prints out the link without any country code http://www.lotus.com/ on the Malaysian page
only.
Last, if the string being processed with the INCLIST or EXCLIST pattern is not applicable to the
 current cc variable, the XSL returns a blank string. This is necessary for later processing when
 the URI and TITLE are both processed for the final output. The TITLE is always included in the
 input XML, regardless of country tagging, so to ensure that no TITLE without a corresponding
 URI is inserted into output HTML, a blank string is required for a last test before the HTML
 output is created. If the returned URI string is blank, no TITLE/URI combination is included in
 the HTML; if it isn’t blank, the returned string, now containing the correct country tags, along
 with the corresponding TITLE, is included in the HTML.
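
 The final test described above can be sketched as follows. This Python fragment is an assumption-laden illustration: render_links is a hypothetical name, the anchor markup is not the ibm.com output format, and the URIs are taken to be already resolved for one country, with a blank string standing for "not applicable".

```python
from html import escape


def render_links(pairs):
    """Render (title, resolved_uri) pairs as anchors, skipping blank URIs.

    A blank URI means the link does not apply to the current country, so
    neither the title nor the link is written to the output.
    """
    lines = []
    for title, uri in pairs:
        if not uri:
            continue  # drop the TITLE together with its missing URI
        lines.append(f'<a href="{escape(uri, quote=True)}">{escape(title)}</a>')
    return "\n".join(lines)


if __name__ == "__main__":
    # Indonesian pass: the Financing link resolved to a blank string.
    indonesia = [
        ("Business and IT services", "http://www.ibm.com/services/id/"),
        ("Financing", ""),
    ]
    print(render_links(indonesia))
```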


 4.5 Shared vs. Localized Text Blocks
 A comparison of About IBM pages provides a good example of the types of text blocks shared
 among and localized by countries.

 Fig 4: www.ibm.com/ibm/my
 [Screenshot with callouts: a text block that all countries share, which may include the
 country name in the text; an optional localized photo; shared financial info, with an
 additional section for localized financial info allowed; a text block with localized
 information.]

 Fig 5: www.ibm.com/ibm/id
 [Screenshot with the same callouts as Fig 4: a shared text block that may include the country
 name; an optional localized photo; shared financial info, with an additional section for
 localized financial info allowed; a text block with localized information.]




The examples in Figs 4 and 5 illustrate different types of text blocks, namely shared and
localized. A shared text block is reusable, but requires some processing to allow for minor
localization in order to give the text a country specific feel. For example, in the first section, it
would be ideal if a country could use the general text, and insert one or more localized sentences.

A localized text block is specific to a country only. For example: the history of IBM in the
country, the picture of the local general manager, or the contact information for the country
shown in Fig 6.

Fig 6: www.ibm.com/ibm/my continued

For the shared text block, the text processing XSL template is modified to accept a tag,
%%COUNTRYNAME, that serves as a placeholder for the country name within a text block.
Using standard XSL, this text processing template is invoked with the country name (for
example, Malaysia) as a parameter. The XSL processing is a standard text substitution
template, one that recursively parses a string and substitutes every instance of the tag with
the passed-in parameter value.

A less obvious, but equally beneficial, outcome of this first type of text substitution is its
application to the HTML Meta tags:

Malaysian Meta tags:
<meta name="IBM.Country" content="my"/>
<meta name="Description" content="The IBM Malaysia home page, entry point to
information about IBM products and services."/>
<meta name="Abstract" content="The IBM Malaysia home page, entry point to
information about IBM products and services."/>

The corresponding Indonesian Meta tags:
<meta name="IBM.Country" content="id"/>
<meta name="Description" content="The IBM Indonesia home page, entry point
to information about IBM products and services."/>
<meta name="Abstract" content="The IBM Indonesia home page, entry point to
information about IBM products and services."/>
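
The recursive substitution can be sketched in Python as below. The sketch assumes the placeholder is written as %%COUNTRYNAME in the shared text; the COUNTRY_NAMES table, the TEMPLATE string and the function names are illustrative and do not come from the ibm.com stylesheets.

```python
# Illustrative lookup from country code to display name (subset only).
COUNTRY_NAMES = {"my": "Malaysia", "id": "Indonesia"}

TEMPLATE = (
    "The IBM %%COUNTRYNAME home page, entry point to "
    "information about IBM products and services."
)


def replace_tag(text: str, tag: str, value: str) -> str:
    """Recursively replace every occurrence of tag in text with value."""
    head, found, rest = text.partition(tag)
    if not found:
        return text  # base case: no tag left in the remaining string
    return head + value + replace_tag(rest, tag, value)


def meta_description(cc: str) -> str:
    """Build the Description meta tag for one country of the group."""
    content = replace_tag(TEMPLATE, "%%COUNTRYNAME", COUNTRY_NAMES[cc])
    return f'<meta name="Description" content="{content}"/>'


if __name__ == "__main__":
    for code in ("my", "id"):
        print(meta_description(code))
```

The recursion mirrors the way an XSL 1.0 named template has to walk a string, since that language has no built-in replace function.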


The second type, the localized text block, requires a change beyond the Extended Reach
approach described so far where only XSL processing and content are enhanced. Minor DTD
changes need to be introduced to accommodate the inclusion of localized blocks of text in an
XML servable.

The DTD for the About IBM page, along with all the other portal pages, already accommodates
the inclusion of reusable XML fragments.

The root element for About IBM DTD:

<!ELEMENT ABOUT_IBM      (SYSTEM, TITLE, TITLE_GRAPHIC?, LONG_DESCRIPTION?,
SITE_SECTION, LEFT_NAVBAR, PHOTO?, PHOTO_URL?, CAPTION?, BLUE_TITLE?,
COMPANY_INFO?, COUNTRY_COMPANY_INFO?, CONTACT_INFO?, FINANCIAL?,
ADDITIONAL_INFO*, INLINE_ELEMENTS?, PUBLISHINFO+, COMMON, META_INFORMATION)>


In this example, the underlined elements are subfragments, reusable pieces of XML that can be
included in the full About IBM XML servable. To accommodate the requirements for localized
text blocks, the About IBM DTD was modified to create the Regional About IBM DTD:

<!ELEMENT REGIONAL_ABOUTIBM      (SYSTEM, TITLE, TITLE_GRAPHIC?,
LONG_DESCRIPTION?, SITE_SECTION, LEFT_NAVBAR, PHOTO_SECTION*, BLUE_TITLE?,
COMPANY_INFO?, COUNTRY_COMPANY_INPUT*, CONTACT_INFO*,
FINANCIAL*, ADDITIONAL_INFO*, INLINE_ELEMENTS?, PUBLISHINFO+, COMMON,
META_INFORMATION)>

The differences between the two versions of the DTD are the additional fragment elements in
the regional version: PHOTO_SECTION and COUNTRY_COMPANY_INPUT. In addition, some
of the fragments formerly defined as ‘cardinality zero or one’ (?) were modified to ‘cardinality
zero or more’ (*).

These changes provide the ability to include separate XML fragments for the localized text
blocks. For example, in the regional About IBM servable created for Bangladesh (bd), Sri Lanka
(lk), Vietnam (vn), Philippines (ph), Malaysia (my), Thailand (th) and Indonesia (id), the
following XML fragments are included:

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
            .     .     .
            <COMMON DATATYPE="NOLABEL">
                  <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
                  <COUNTRY DATATYPE="ASSOCLIST">my</COUNTRY>
                  <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
            </COMMON>
      </COUNTRY_PHOTO>
</PHOTO_SECTION>

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
            .     .     .
            <COMMON DATATYPE="NOLABEL">
                  <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
                  <COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
                  <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
            </COMMON>
      </COUNTRY_PHOTO>
</PHOTO_SECTION>

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
            .     .     .
            <COMMON DATATYPE="NOLABEL">
                  <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
                  <COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
                  <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
            </COMMON>
      </COUNTRY_PHOTO>
</PHOTO_SECTION>

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
            .     .     .
            <COMMON DATATYPE="NOLABEL">
                  <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
                  <COUNTRY DATATYPE="ASSOCLIST">th</COUNTRY>
                  <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
            </COMMON>
      </COUNTRY_PHOTO>
</PHOTO_SECTION>

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
            .     .     .
            <COMMON DATATYPE="NOLABEL">
                  <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
                  <COUNTRY DATATYPE="ASSOCLIST">vn</COUNTRY>
                  <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
            </COMMON>
      </COUNTRY_PHOTO>
</PHOTO_SECTION>

Similar sections of XML exist for other selected subfragment types
such as COUNTRY_COMPANY_INPUT and CONTACT_INFO.


This is a collapsed view of the XML containing only the ID of each included subfragment. Note
that the number of fragments differs from type to type, because localized fragments have
cardinality zero or more.

Expanding any of the subfragments reveals the XML elements that identify the applicable
country.

<COMMON DATATYPE="NOLABEL">
       <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
       <COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
       <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
</COMMON>


During XSL processing of this servable, within the for-each loop over the servable's
COMMON/COUNTRY elements, a test is performed to verify that a localized fragment exists and
that it applies to the current value of the cc variable. If the current cc value matches the
COUNTRY value of the fragment, the contents of the XML fragment are included in the generation
of the output.

This test within the XSL is shown below:

<xsl:if test="boolean(../../COUNTRY_COMPANY_INPUT[COUNTRY_COMPANY_INFO/COMMON/COUNTRY=$cc])">
      <xsl:apply-templates select="../../COUNTRY_COMPANY_INPUT[COUNTRY_COMPANY_INFO/COMMON/COUNTRY=$cc]">
         <xsl:with-param name="directoryPrefix" select="$directoryPrefix"/>
         <xsl:with-param name="countryName" select="$countryName"/>
         <xsl:with-param name="cc" select="$cc"/>
      </xsl:apply-templates>
</xsl:if>
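
For readers less familiar with XPath predicates, the selection performed by this test can be
approximated outside XSL. The following is a minimal Python sketch, not the production
stylesheet: it assumes a heavily trimmed servable, and the BODY element and its text are
invented purely for illustration. It keeps only the COUNTRY_COMPANY_INPUT fragments whose
COMMON/COUNTRY value equals the current country code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed servable. BODY and its text are invented
# for illustration; the other element names follow the Regional About IBM DTD.
SERVABLE = """
<REGIONAL_ABOUTIBM>
  <COUNTRY_COMPANY_INPUT>
    <COUNTRY_COMPANY_INFO>
      <BODY>IBM Philippines text</BODY>
      <COMMON><COUNTRY>ph</COUNTRY></COMMON>
    </COUNTRY_COMPANY_INFO>
  </COUNTRY_COMPANY_INPUT>
  <COUNTRY_COMPANY_INPUT>
    <COUNTRY_COMPANY_INFO>
      <BODY>IBM Malaysia text</BODY>
      <COMMON><COUNTRY>my</COUNTRY></COMMON>
    </COUNTRY_COMPANY_INFO>
  </COUNTRY_COMPANY_INPUT>
</REGIONAL_ABOUTIBM>
"""


def localized_fragments(servable, cc):
    """Return the COUNTRY_COMPANY_INPUT fragments tagged for country code cc."""
    matches = []
    for fragment in servable.findall("COUNTRY_COMPANY_INPUT"):
        countries = fragment.findall("COUNTRY_COMPANY_INFO/COMMON/COUNTRY")
        # Same condition as the XPath predicate [.../COMMON/COUNTRY=$cc]:
        # true if any COUNTRY child carries the requested code.
        if any(country.text == cc for country in countries):
            matches.append(fragment)
    return matches


root = ET.fromstring(SERVABLE)
for cc in ("ph", "my", "vn"):
    bodies = [f.findtext("COUNTRY_COMPANY_INFO/BODY") for f in localized_fragments(root, cc)]
    print(cc, bodies)
```

A country with no matching fragment, such as vn in this sample, simply receives an empty
selection, which corresponds to the xsl:if test evaluating to false.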



4.6   Leadspace Rotation

The www.ibm.com homepages have a unique requirement: the ability to display rotating
leadspace fragments at the top of the white space of each homepage. This feature is shown in
Figures 1 and 2, where the leadspaces differ between the Malaysian and Indonesian homepages.
It is enabled by the homepage engine, which is run for every www.ibm.com homepage,
regardless of the manner in which the page was created. No modifications to the homepage
engine were required to use it with the Extended Reach model. However, using the engine with
the Extended Reach model adds another level of uniqueness to each country page generated
from a single XML source.

4.7   Hybrid Approach

During the first phase of the Extended Reach project, the design was restricted to an
implementation that would not require modification of the existing DTDs. Not having to rebuild
existing content was a major consideration. For the most part, existing DTDs could
accommodate content for the multi-publish output model.

In the second phase of Extended Reach, an opportunity arose to add a new set of pages, with no
existing DTDs, to the www.ibm.com Corporate Portal: the Software pages for the ASEAN
countries. Since there were no existing DTDs for this set of pages, a completely new design
could be implemented, limited only by the restrictions set by the content management system.

The design team decided to combine the earlier approach with some modifications. The %%CC
notation within an XML tag is still used as a placeholder for country code substitutions.
However, rather than using the inclusion/exclusion notation within the XML tag, e.g.

      http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/

editors add discrete country tags in the XML to identify the applicable countries. This approach
makes content preparation simpler and less error-prone for editors, who can choose a country
from a dropdown list rather than typing out a string in the defined syntax for each URI.
Furthermore, this approach is more consistent with standard XML tagging, as it separates the
URI from the country restrictions set upon it.

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
       <COUNTRY_PHOTO>
              <TITLE DATATYPE="STRING" LINKABLE="TITLE">asean Software Home #Lead Image - IBM Lotus Workplace</TITLE>
              <PHOTO DATATYPE="STRING" SUBFRAGMENTTYPE="IMAGE">
                     <IMAGE>
                            …
                     </IMAGE>
              </PHOTO>
              <PHOTO_URL>
                     <ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>
              </PHOTO_URL>
              <COMMON DATATYPE="NOLABEL">
                     <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
                     <COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
                     <COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
                     <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
              </COMMON>
       </COUNTRY_PHOTO>
</PHOTO_SECTION>

In the example above, the PHOTO_SECTION element is a fragment tagged to work for id
(Indonesia) and ph (Philippines) only. It is not applied for the other countries listed in the
page's top-level country tags, which denote the overall applicability of the page.

Note the element
<ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>

and the following elements

<COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
<COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>


The XSL processes the ITEM_URL tag and includes it only for id and ph.
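
The combined effect of the country tags and the %%CC placeholder can be illustrated with a
short Python sketch. This is an assumption-laden approximation rather than the actual XSL: the
function name resolve_item_url is hypothetical, and the fragment is reduced to the elements
that matter here. The URL is emitted, with %%CC replaced by the country code, only when the
fragment carries a COUNTRY tag for that code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the fragment above: only the URL and the country tags are kept.
FRAGMENT = """
<COUNTRY_PHOTO>
  <PHOTO_URL>
    <ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>
  </PHOTO_URL>
  <COMMON>
    <COUNTRY>id</COUNTRY>
    <COUNTRY>ph</COUNTRY>
  </COMMON>
</COUNTRY_PHOTO>
"""


def resolve_item_url(fragment, cc):
    """Return ITEM_URL with %%CC replaced by cc, or None if cc is not tagged.

    Hypothetical helper for illustration; the production logic lives in XSL.
    """
    tagged = {country.text for country in fragment.findall("COMMON/COUNTRY")}
    if cc not in tagged:
        return None  # fragment does not apply to this country's page
    template = fragment.findtext("PHOTO_URL/ITEM_URL")
    return template.replace("%%CC", cc)


photo = ET.fromstring(FRAGMENT)
for cc in ("id", "ph", "my"):
    print(cc, resolve_item_url(photo, cc))
```

Keeping the country restriction in separate COUNTRY elements means the URL template itself
never changes when a country is added to or removed from the fragment.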

5 Evaluating Extended Reach in Pilots
The first Extended Reach pilot supported the output of identical pages with minimal automated
localization. The second Enhanced Extended Reach pilot supported localized content within
otherwise identical pages.

5.1 Pilot 1: Basic extended reach with identical content
The Extended Reach functionality was first rolled out in the fall of 2002 for two groups: twenty
English-speaking Caribbean countries and three English-speaking ASEAN countries. At the time,
the definition of the Extended Reach technique was strict: the countries in an Extended Reach
group had to share identical content. The only localization the model allowed was the automatic
replacement of the ISO country code in URIs. Each portal page was otherwise identical across
the countries, apart from the automatically localized masthead and footer.

This model proved to fit the countries with the fewest resources, where little or no localized
content existed. Such country portals consisted of little other than the minimum 9 required
top-level pages and a sufficient flow of news articles to keep the news section up to date. The
three ASEAN countries that adopted this technique, namely Bangladesh, Sri Lanka, and Vietnam,
as well as the twenty Caribbean countries, did benefit from the feature to an extent. A single
update in the content management system was published out to three web pages, thus reducing
the time and money required to keep the sites fresh. An additional benefit was the reduced time
required to launch new sites. The twenty Caribbean country portals did not exist before Extended
Reach. Their parallel launch took less than one hour, instead of roughly twenty times as long
had each one been managed and launched as a separate portal.

However, when evaluating whether the pilot had improved site quality, it became obvious that
the Extended Reach approach did not solve the problem of content creation. The countries had
so few resources that they could not even upload news articles of regional relevance that had
already been written and published for the larger ASEAN or Americas markets.

This result led to a re-examination of the Extended Reach model itself. A second round of
requirements was gathered from the ASEAN web management. Each Extended Reach group of
countries clearly needs at least one country with sufficient funds to create fresh content on an
ongoing basis. As the content is uploaded into the content management system, the other
countries in the same group immediately benefit from the content updates. The question to the
ASEAN team was: How does the restriction on identical content need to be relaxed in order to
accommodate countries with localized content in the same Extended Reach group?

5.2 Pilot 2: Enhanced extended reach with localized content
During the summer of 2003, the ASEAN web management articulated the requirements for
including Indonesia, Malaysia, the Philippines, and Thailand in the existing Extended Reach
group. The web manager analyzed each page and stated its need for localization, i.e. which
areas of the page needed to be optional and filled in with content for a subset of the countries in
the group.

For example, one requirement was:

“The right hand navigation modules on the Products & Services page need the ability to be
localized as they are used to link to features that do not exist for all countries.”

To keep the approach general and allow for future modification, each requirement was
generalized into a rule. For example, the specific requirement above turned into the following
rule in the content model:

“The element called related_info in the portal DTDs must be able to be tagged for one, some, or
all of the countries in the Extended Reach group, and should only appear on the output pages for
the tagged countries.”

The technique described in Section 4.5 enables this functionality today for all pages, not only the
Products & Services page.
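
The effect of this rule on the published output can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: the RelatedInfo class, the pages_for_group
function, and the module titles are all hypothetical, and the real system works on XML
fragments rather than Python objects. Each module is tagged for one, some, or all countries of
the group, and each country's page receives only the modules tagged for it.

```python
from dataclasses import dataclass

# Country codes of the ASEAN Extended Reach group named in this report.
GROUP = ("bd", "id", "lk", "my", "ph", "th", "vn")


@dataclass(frozen=True)
class RelatedInfo:
    """One related_info module plus the countries it is tagged for (hypothetical model)."""
    title: str
    countries: frozenset


def pages_for_group(modules, group=GROUP):
    """Map each country code to the titles of the modules that appear on its page."""
    return {cc: [m.title for m in modules if cc in m.countries] for cc in group}


# Invented example modules tagged for one, some, and all countries of the group.
MODULES = [
    RelatedInfo("Local reseller locator", frozenset({"my"})),
    RelatedInfo("Regional financing offer", frozenset({"my", "th", "ph"})),
    RelatedInfo("Worldwide support", frozenset(GROUP)),
]

pages = pages_for_group(MODULES)
for cc in GROUP:
    print(cc, pages[cc])
```

A single source list therefore yields a different set of modules per country without any
country-specific copy of the page.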


5.3 Pilot 2 Evaluation
Enhanced Extended Reach for ASEAN Country Portals was successfully deployed on September
24th, 2003 for the following 7 countries: Malaysia, Indonesia, Philippines, Thailand, Sri Lanka,
Bangladesh, and Vietnam. An improvement of 85.8% in time, and thus in web site maintenance
cost, was achieved for news articles.

Enhanced Extended Reach for the ASEAN Software Portal was successfully deployed on
October 8th, 2003 for the following 5 countries: India, Singapore, Malaysia, Thailand, and
Philippines.

Quotes from ASEAN team on reduced workload:

From Yee Nam Sng, ASEAN Site manager:
        “The News section provides the most savings and efficiency. This is because most
        news articles are replicated without any change (except for local URIs) across all
        countries.”
        “Homepage marketing modules provide savings. We are able to achieve faster
        turnaround and some savings by planning our updates and marketing modules across
        ASEAN carefully.”

From AP Creative Services editors:
        “Out of the three Enhanced Extended Reach implementations, the news fragments gain
        the most benefits. Although it might only save around 15 minutes per news/country,
        it has saved us from tedious job to replicate the content and manually reposition the
        tiers. It also has limited the chance of errors. Publishing is now a bliss. Less
        fragments to load, review and publish.”

        “The psychological efficiency is what we feel most. It's really tedious to duplicate
        the same thing over and over again. This Enhanced Extended Reach approach has
        increased the "Morale" of the editor by taking off these duplicate tasks.”

6 Future Work
Given the success of the Enhanced Extended Reach model and the cost savings it has
demonstrated, new country groupings will certainly be created. Some candidates include
regions where ibm.com does not yet have country portals:
       • Americas Spanish
       • Middle East Arabic
       • Africa English
       • Africa French

Another direction is to apply the same technique to new sets of pages, much as was done for
the Software pages for the ASEAN countries. In addition, the results of the Software page pilot
will certainly provide lessons for the ibm.com Software group on how best to include the
software portals worldwide in the same framework.

Yet another approach is to multi-publish output pages from one XML source regardless of
predetermined country groupings. For example, the legal statements for many IBM countries are
the same, regardless of the region or size of market. XML pages for wireless.ibm.com are
generated using this approach.


7 Acknowledgements
The authors wish to thank Dikran Meliksetian, Rosa Bolger, Lisa Intravio, Chris Wang and
Marie Shafi, who helped put the methods described in this paper into practice, and who have
consistently supported and contributed to their further development.




8 References
[1] ISO web site at http://www.iso.ch/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html

[2] "XML Content Management: Challenges and Solutions", XML Europe 2001. Nianjun Zhou,
Dikran Meliksetian, Louis Weitzman, Sara Elo Dean, Jeff Milton, Peter Davis, Jessica Wu. May
2001.

Extended Reach: An Efficient Content Management Technique for Sharing and Localizing Content

  • 1. Extended Reach: An Efficient Content Management Technique for Sharing and Localizing Content IBM Technical Report TR-40.0032 December, 2003 Sheila Monheit David Leip IBM Corporate Webmasters IBM Corporate Webmasters San Jose, CA, United States Hawthorne, NY, United States Monheit@us.ibm.com Leip@us.ibm.com Sara Elo Dean Hidekazu Shirayama IBM Corporate Webmasters IBM Corporate Webmasters Helsinki, Finland Tokyo, Japan EloDean@fi.ibm.com Flyhard@jp.ibm.com
  • 2. Table of Contents 1 Introduction............................................................................................................................. 3 2 Objectives ............................................................................................................................... 3 3 IBM URI Taxonomy............................................................................................................... 3 4 Approach................................................................................................................................. 4 4.1 ibm.com Content Model ................................................................................................. 5 4.2 Multi-Page Publish Scheme............................................................................................ 5 4.3 Enabling Localized Content............................................................................................ 6 4.4 Automating Country Code References ........................................................................... 8 4.5 Shared vs. Localized Text Blocks................................................................................. 13 4.6 Leadspace Rotation....................................................................................................... 17 4.7 Hybrid Approach .......................................................................................................... 18 5 Evaluating Extended Reach in Pilots.................................................................................... 19 5.1 Pilot 1: Basic extended reach with identical content .................................................... 19 5.2 Pilot 2: Enhanced extended reach with localized content............................................. 20 5.3 Pilot 2 Evaluation.......................................................................................................... 
20 6 Future Work .......................................................................................................................... 21
  • 3. 1 Introduction
For global companies such as IBM, it is important from a marketing and brand perspective to be seen as “in touch” with the many local national markets in which they do business. This applies to all aspects and representations of the corporation, including its web presence. Some of these markets are quite small, and it can be difficult to justify the investment needed to create and maintain separate web content for each one. The alternative, grouping countries together and creating a single web site for a region, is not particularly attractive: it leaves those end users feeling that they are not on par with the corporation’s larger markets.

A large corporate web site such as ibm.com faces the challenge of serving as wide a set of customers as efficiently as possible. Two strategies exist for achieving this goal. The first is to leverage the same content across different formats. For example, the ibm.com corporate news content is shared across XHTML for standard web browsers; WML, HDML, and cHTML for pervasive devices; and RSS for content syndication. The second is to share the same content across different sites. This paper discusses the second approach, named Extended Reach. Specifically, it explains how IBM has set up multiple country portals that can be managed, from a content maintenance perspective, as a single portal.

2 Objectives
The Extended Reach project has three main business goals:
1. To make ibm.com available on a wider basis worldwide
2. To reduce the workload of maintaining country portals, especially for smaller countries
3. To flaunt the “I” (International) in IBM

IBM took the early lead in establishing a web presence in many countries, more than its competitors. In recent years some of its larger competitors (Dell, HP, and Microsoft) surpassed IBM, creating a web presence in more countries.
With the rollout of the Extended Reach project, IBM has regained the leadership position. Today IBM presents a country portal in 83 countries, while Dell, HP, Microsoft, and other competitors cover fewer countries.

3 IBM URI Taxonomy
The IBM URI taxonomy centers on subject matter keywords in English and the ISO standard two-letter country and language codes [1]. These elements present web site visitors with consistent naming conventions across applications and web sites worldwide. Examples:
• http://www.ibm.com/ibm/au (About IBM in Australia)
• http://www.ibm.com/news/ve (News in Venezuela)
• http://www.ibm.com/servers/de (Servers in Germany)
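The naming convention in the examples above can be illustrated with a minimal Python sketch. The function name country_uri and the validation rules are assumptions made for this illustration, not part of the ibm.com tooling; the optional third argument corresponds to the language segment of the /<cc>/<lc> form used for multilingual countries.

```python
from typing import Optional

BASE = "http://www.ibm.com"


def country_uri(keyword: str, cc: str, lc: Optional[str] = None) -> str:
    """Build a taxonomy-style URI: /<keyword>/<cc> or /<keyword>/<cc>/<lc>.

    Hypothetical helper: keyword is the English subject keyword, cc a
    two-letter country code, lc an optional two-letter language code.
    """
    if len(cc) != 2 or not cc.isalpha():
        raise ValueError(f"expected a two-letter country code, got {cc!r}")
    parts = [BASE, keyword.strip("/"), cc.lower()]
    if lc is not None:
        if len(lc) != 2 or not lc.isalpha():
            raise ValueError(f"expected a two-letter language code, got {lc!r}")
        parts.append(lc.lower())
    return "/".join(parts)


print(country_uri("news", "ve"))              # http://www.ibm.com/news/ve
print(country_uri("e-business", "ch", "fr"))  # http://www.ibm.com/e-business/ch/fr
```

Because only the country (and language) segment changes between corresponding pages, a URI for another country in the same group is obtained by changing one argument.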
  • 4. If more than one language is used for a country, URIs follow the /<cc>/<lc> format, where <cc> is the two-letter country code specified in ISO 3166-1 (the alpha-2 code elements) and <lc> is the two-character language code. Examples:
• http://www.ibm.com/e-business/ch/fr (e-business in Switzerland, French version)
• http://www.ibm.com/products/ca/fr (Products & Services in Canada, French version)

Top-level, or root-level, directories are restricted to IBM registered trademarks and service marks, and to major, global, cross-divisional content areas such as /e-business, /thinkpad, /services and /products. These keywords must be in English only. For worldwide consistency, URIs are not translated into the local language. Use of regional web sites and regional URIs is strongly discouraged. Furthermore, if consistent URIs are not or cannot be implemented for strategic pages because of application constraints, the ibm.com web servers are configured with redirects so that the advertised URI still abides by the URI taxonomy. Examples:
• http://www.ibm.com/shop/it/customerservice (Online customer support Italy) redirects to http://www-134.ibm.com/webapp/wcs/stores/servlet/HelpDisplay?subject=2294556&storeId=380&catalogId=-380&langId=-4
• http://www.ibm.com/shop/uk/help (Online shopping support UK) redirects to http://www-134.ibm.com/webapp/wcs/stores/servlet/HelpDisplay?storeId=826&catalogId=-826&dualCurrId=20&langId=826&subject=2294556

The Extended Reach technique builds on the fact that the IBM web URI taxonomy is country code centric. URIs for corresponding pages in different countries generally vary only by country code. This enables URIs to be programmatically localized for countries within a group.

4 Approach
The Extended Reach technique is applicable to a group of country web sites that meets the following criteria:
• Maximum content sharing across multiple countries.
  The goal is to share most of the content that makes up the web site, with only a small amount of unique information maintained separately for each country. Allow for variation in content where a country has a local business need.
• Group similar small-market countries together based on common language and region. For example:
  o 20 Caribbean English language countries
  o 7 ASEAN English language countries
  Due to translation issues, it is not possible to share content between different languages.
• Enforce a standard layout.
  • 5.
• Support rotation of content to give a greater sense of freshness and even uniqueness across countries.
• Comply with a standard URI taxonomy to enable the automated localization of standard URIs.
• Cater for automated country name substitution, but with care.

4.1 ibm.com Content Model
Today, a content management system based on the Extensible Markup Language (XML) is used to create and maintain ibm.com country portals. By encoding content in XML and layout logic in the Extensible Stylesheet Language (XSL), the system enforces the separation of content and presentation. The system also supports reusable XML fragments and manages the dependencies between such fragments. Using a Java-based user interface, a content editor can upload XSL stylesheets and multimedia objects, create and edit XML content fragments, compose pages out of fragments, preview pages, review final published pages, and reject them or promote them to the final stage in the publishing flow [2].

Every ibm.com web page consists of several fragments: a masthead, a footer, left and right navigation bars, and the main white space. Each of these is built as a separate XML fragment that is included in one or more XML documents, or servables. The XML fragments and servables conform to Document Type Definitions (DTDs). Fragments correspond to reusable components such as a navigation bar, an image, or a link; servables correspond to specific page types, such as an index page, a homepage, or a news article. An XML servable may contain fragments that are unique to the white space of the page type as well as reusable fragments. An XML servable is transformed into output pages in various formats by dedicated XSL stylesheets that control the presentation of a page. Thus content input and output presentation are tightly controlled by the appropriate servable and fragment DTDs and the XSL stylesheets.
4.2 Multi-Page Publish Scheme
For countries not within Extended Reach, ibm.com corporate portal country pages are generated on a 1-1 basis. One input XML servable transformed with one XSL stylesheet generates one output page (in HTML, WML, HDML, or RSS format) for one country in one language. Thus, ten XML servables tagged for ten different countries are transformed by one XSL stylesheet, generating ten resulting pages. In this way the IBM standard layout, along with tight DTD control over the page content, is ensured across every country portal page.

Extended Reach presented the challenge of creating more than one output page from one XSL transformation of one input XML. The input XML was now a fully reusable XML servable, made up of already reused fragments. The existing content model and content were analyzed to identify how content could be efficiently shared across a group of countries. Countries that share a common language and common content could be grouped together.
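The move from one output per input to many outputs per input can be sketched in a few lines of Python. This is an illustration only, not the production XSL: it assumes the target countries are listed as COUNTRY elements under COMMON (as in the servable samples in this report), and the NEWS_INDEX root element, the TITLE element, and the render_page stand-in are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed servable tagged for three countries.
SERVABLE = """
<NEWS_INDEX>
  <TITLE>News</TITLE>
  <COMMON>
    <LANGUAGE>en</LANGUAGE>
    <COUNTRY>bd</COUNTRY>
    <COUNTRY>lk</COUNTRY>
    <COUNTRY>vn</COUNTRY>
  </COMMON>
</NEWS_INDEX>
"""


def render_page(title: str, cc: str, lang: str) -> str:
    """Stand-in for the stylesheet: produce one page for one country."""
    return f'<html lang="{lang}"><head><title>{title} ({cc})</title></head></html>'


def publish(servable_xml: str) -> dict:
    """Return {country code: page}, one output page per COUNTRY element."""
    root = ET.fromstring(servable_xml)
    title = root.findtext("TITLE", default="")
    lang = root.findtext("COMMON/LANGUAGE", default="en")
    pages = {}
    for country in root.findall("COMMON/COUNTRY"):
        cc = (country.text or "").strip()
        pages[cc] = render_page(title, cc, lang)
    return pages


pages = publish(SERVABLE)
print(sorted(pages))  # ['bd', 'lk', 'vn']
```

A servable tagged with a single COUNTRY element degenerates to the 1-1 case, so the same loop serves both kinds of country.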
  • 6. The first design introduced no changes to the DTDs, in order to avoid the maintenance of two sets of DTDs: one set for countries with unique content and one for the Extended Reach countries with identical content. The Extended Reach technique was implemented as a multi-page publishing scheme in the XSL stylesheets. The existing XSL logic was enhanced to include a looping mechanism. The new logic could generate multiple outputs from a single XML and result in a distinct ibm.com corporate portal page for each specified target country. The output pages were identical in content, apart from the automated localization of the masthead, footer and URIs.

Once the groupings of countries had been identified, rendering the countries to IBM standard layouts became very straightforward. Within every XML servable is a COUNTRY element, which specifies the target country page being generated. By adding this element multiple times, the stylesheet can process any number of countries.

Single country tagging:
    <COMMON>
      <LANGUAGE>en</LANGUAGE>
      <COUNTRY>bd</COUNTRY>
    </COMMON>

Multiple country tagging:
    <COMMON>
      <LANGUAGE>en</LANGUAGE>
      <COUNTRY>bd</COUNTRY>
      <COUNTRY>lk</COUNTRY>
      <COUNTRY>vn</COUNTRY>
      <COUNTRY>ph</COUNTRY>
      <COUNTRY>my</COUNTRY>
      <COUNTRY>th</COUNTRY>
      <COUNTRY>id</COUNTRY>
      <AUDIENCE>all</AUDIENCE>
    </COMMON>

An XML servable also contains the STYLESHEET tag, which identifies the XSL stylesheet to transform with:
    <STYLESHEET>regional_newsindex_xml_html.xsl</STYLESHEET>

4.3 Enabling Localized Content
The first Extended Reach implementation successfully created multiple near-identical, automatically localized output pages and enforced the IBM layout standard. However, the approach was too rigorous: identical pages left no room for unique country distinctions. Some ASEAN Extended Reach candidate countries were unable to adopt the technique because the design did not allow for any localization on the pages.
An enhanced design needed to allow for some custom content identification within the existing page structures defined in the DTDs. A content analysis of countries in the same region provided insight into the localization requirements. Fig 1 and Fig 2 show the www.ibm.com homepages for Malaysia and Indonesia:
  • 7. Fig 1: www.ibm.com/my (Malaysian homepage). Callouts: every link on the page refers either to a country-specific page or to a general www.ibm.com page; the country code occurs anywhere within the URI, or not at all. Links shown: www.ibm.com/planetwide/select, www.ibm.com/my/offers/thinkpad/, www.ibm.com/services/my/

Fig 2: www.ibm.com/id (Indonesian homepage). Callout: leadspace views rotate per hit, for every country. Links shown: www.ibm.com/planetwide/select, www.ibm.com/services/bcs/id/, www.ibm.com/services/id/
  • 8. Fig 3: Services links. Malaysian homepage Services section: optional link to www.ibm.com/financing/my. Indonesian homepage Services section: no optional link.

Further investigation of content, such as the lists of links in Fig 3, reveals the following:
• The URI taxonomy is consistent within defined sections on a page, so country references can be automated.
• Some links appear only for a subset of countries in a group, so country tagging of a link must be enabled.
• Some text blocks are identical across all countries with the exception of the local country name, so automatic country references within text could be enabled.

4.4 Automating Country Code References
Before Extended Reach, links such as the ones in Fig 3 were defined in XML as ITEM_TITLE and ITEM_URL element pairs. The sample below defines the left navigation bar on the www.ibm.com/us homepage:
    <PRIMARY_LINKS>
      <ITEM>
        <ITEM_TITLE>Home / home office</ITEM_TITLE>
        <ITEM_URL>http://www.ibm.com/homeoffice/</ITEM_URL>
      </ITEM>
    </PRIMARY_LINKS>
    <PRIMARY_LINKS>
      <ITEM>
        <ITEM_TITLE>Small &amp; medium business</ITEM_TITLE>
        <ITEM_URL>http://www.ibm.com/businesscenter/us/</ITEM_URL>
      </ITEM>
    </PRIMARY_LINKS>
    <PRIMARY_LINKS>
      <ITEM>
        <ITEM_TITLE>Large enterprise</ITEM_TITLE>
        <ITEM_URL>http://www.ibm.com/largeenterprise/us/</ITEM_URL>
      </ITEM>
    </PRIMARY_LINKS>
  • 9. The transforming XSL loops over all the PRIMARY_LINKS elements and generates the following output HTML:
    http://www.ibm.com/hom
    http://www.ibm.com/businesscenter/us/
    http://www.ibm.com/largeenterprise/us/
Based on the definition in the XML, all the links point to US URIs and every title and URI pair is included in the output. No mechanism, or need, exists to specify conditions on links, such as their presence or absence in the output, because the navigation bar is dedicated to the US.

The following patterns were defined to enable flexible localization of links. Content and XSL stylesheets were enhanced to include and process the new logic, respectively.
• %%CC: substitute every country (cc) listed under <COMMON/COUNTRY> in the URI string
• %%INCLIST_cc_cc_%%: substitute ONLY countries included in the INCLIST string
• [[%%INCLIST_cc_cc_%%]]: include this link (which contains no CC references at all, e.g. www.ibm.com) for countries in the INCLIST (note: this string is added at the end of the URI string)
• %%EXCLIST_cc_cc_%%: substitute ONLY countries NOT included in the EXCLIST string
• [[%%EXCLIST_cc_cc_%%]]: include this link (which contains no CC references at all, e.g. www.ibm.com) for countries NOT included in the EXCLIST (note: this string is added at the end of the URI string)

Going back to the sample services section in Fig 3, the XML for that section in the new syntax becomes:
    <SERVICES_BOX>
      <SERVICES_GRAY_TITLE>Services</SERVICES_GRAY_TITLE>
      <SERVICES_LINKS>
        <LINK_TEXT>Business and IT services</LINK_TEXT>
        <LINK_URL>http://www.ibm.com/services/%%CC/</LINK_URL>
      </SERVICES_LINKS>
      <SERVICES_LINKS>
        <LINK_TEXT>Business consulting services</LINK_TEXT>
        <LINK_URL>http://www.ibm.com/bcs/%%CC/</LINK_URL>
      </SERVICES_LINKS>
      <SERVICES_LINKS>
        <LINK_TEXT>Infrastructure services</LINK_TEXT>
        <LINK_URL>http://www.ibm.com/services/%%CC/strategy/capability/fullinfra.html</LINK_URL>
      </SERVICES_LINKS>
      <SERVICES_LINKS>
  • 10.
        <LINK_TEXT>On demand services</LINK_TEXT>
        <LINK_URL>http://www.ibm.com/services/%%CC/ondemand/</LINK_URL>
      </SERVICES_LINKS>
      <SERVICES_LINKS>
        <LINK_TEXT>Financing</LINK_TEXT>
        <LINK_URL>http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/</LINK_URL>
      </SERVICES_LINKS>
    </SERVICES_BOX>

Further down in the same XML servable the country definitions are:
    <COMMON>
      <LANGUAGE>en</LANGUAGE>
      <COUNTRY>my</COUNTRY>
      <COUNTRY>ph</COUNTRY>
      <COUNTRY>th</COUNTRY>
      <COUNTRY>id</COUNTRY>
    </COMMON>

The output seen in Fig 3 for the Malaysian and Indonesian homepages is generated by the Extended Reach XSL below:
    <xsl:template name="regionalLinks">
      <xsl:param name="cc"/>
      <xsl:param name="link"/>
      <xsl:choose>
        <xsl:when test="contains($link,'%%CC')">
          <xsl:value-of select="concat(substring-before($link,'%%CC'),$cc,substring-after($link,'%%CC'))"/>
        </xsl:when>
        <xsl:when test="contains($link,'%%INCLIST_')">
          <xsl:variable name="IncList" select="substring-before(substring-after($link, '%%INCLIST_'), '%%')"/>
          <!--xsl:value-of select="concat('this is dolist variable:', $doList)"/-->
          <xsl:choose>
            <xsl:when test="contains($IncList, $cc)">
              <xsl:choose>
                <xsl:when test="contains($link, '[[%%INCLIST_')">
                  <xsl:value-of select="substring-before($link, '[[%%INCLIST')"/>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:value-of select="concat(substring-before($link, '%%INCLIST_'),$cc,substring-after($link, '_%%'))"/>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="''"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:when>
        <xsl:when test="contains($link,'%%EXCLIST_')">
          <xsl:variable name="ExcList" select="substring-before(substring-after($link, '%%EXCLIST_'), '%%')"/>
          <!--xsl:value-of select="concat('this is dolist variable:', $doList)"/-->
          <xsl:choose>
            <xsl:when test="contains($ExcList, $cc)">
              <xsl:value-of select="''"/>
  • 11.
            </xsl:when>
            <xsl:otherwise>
              <xsl:choose>
                <xsl:when test="contains($link, '[[%%EXCLIST_')">
                  <xsl:value-of select="substring-before($link, '[[%%EXCLIST')"/>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:value-of select="concat(substring-before($link, '%%EXCLIST_'),$cc,substring-after($link, '_%%'))"/>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$link"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

A detailed explanation of the XSL follows. The XSL template is passed two parameters from the parent routine:
1. cc, the country code of the current pass of the for-each loop over COMMON/COUNTRY:
    <COMMON>
      <LANGUAGE>en</LANGUAGE>
      <COUNTRY>my</COUNTRY>
      <COUNTRY>ph</COUNTRY>
      <COUNTRY>th</COUNTRY>
      <COUNTRY>id</COUNTRY>
    </COMMON>
   In the first pass cc=my (Malaysia), then ph (Philippines), and so on.
2. link, the string containing the URI information, that is, the contents of the <LINK_URL> element:
    <LINK_URL>http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/</LINK_URL>

The template above is executed within the COMMON/COUNTRY for-each loop N times, once for each URI that requires processing. In this example, the cc variable does not change value until all the LINK_URL elements are processed. At that point the cc variable is assigned the value of the next COUNTRY element and the processing of each LINK_URL is repeated. The links being processed are, in order:
1. http://www.ibm.com/services/%%CC/
2. http://www.ibm.com/bcs/%%CC/
3. http://www.ibm.com/services/%%CC/strategy/capability/fullinfra.html
4. http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/
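The pass order described above, an outer loop over countries and an inner loop over links, can be sketched in Python as follows. This is a simplified illustration rather than the XSL itself: the names localize and passes are hypothetical, and only the three %%CC links are handled (the INCLIST link is left out here).

```python
COUNTRIES = ["my", "ph", "th", "id"]
LINKS = [
    "http://www.ibm.com/services/%%CC/",
    "http://www.ibm.com/bcs/%%CC/",
    "http://www.ibm.com/services/%%CC/strategy/capability/fullinfra.html",
]


def localize(link: str, cc: str) -> str:
    """Replace the first %%CC placeholder with the current country code."""
    return link.replace("%%CC", cc, 1)


def passes(countries, links):
    """Yield (cc, localized link); cc stays fixed until every link is done."""
    for cc in countries:
        for link in links:
            yield cc, localize(link, cc)


result = list(passes(COUNTRIES, LINKS))
for cc, uri in result[:4]:
    print(cc, uri)
# my http://www.ibm.com/services/my/
# my http://www.ibm.com/bcs/my/
# my http://www.ibm.com/services/my/strategy/capability/fullinfra.html
# ph http://www.ibm.com/services/ph/
```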
  • 12. The following XSL logic occurs for each pass through these links:

For the first three links (1, 2 and 3) the value of cc, the country being processed, is substituted directly into the link string at the exact location of the %%CC notation. Thus, when processing the first COMMON/COUNTRY element my, the first three links print as
    http://www.ibm.com/services/my/
    http://www.ibm.com/bcs/my/
    http://www.ibm.com/services/my/strategy/capability/fullinfra.html
and when processing the second COMMON/COUNTRY element ph, the same links print as
    http://www.ibm.com/services/ph/
    http://www.ibm.com/bcs/ph/
    http://www.ibm.com/services/ph/strategy/capability/fullinfra.html

The processing of the fourth link is more complicated:
    http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/
When the XSL encounters the %%INCLIST or %%EXCLIST pattern, it triggers two conditional tests:
1. First, it parses the string up to the closing _%% to see whether the current cc variable is relevant for this string. In this case, the Malaysia (my), Thailand (th), and Philippines (ph) homepages should all include this URI. The Indonesia (id) homepage should not include it. This could also have been represented as
    http://www.ibm.com/financing/%%EXCLIST_id_%%/
   and would have produced the same results. For the EXCLIST pattern, the test parses the string to see if the current cc is NOT in the list, and if so, the link is included.
2. Second, if the URI string is applicable to the current cc variable, the next conditional test determines whether the URI string contains a country reference in its syntax, or whether it is a general ibm.com URI that has no country reference in it at all. This test performs a second parse on the INCLIST or EXCLIST pattern to determine whether the pattern is at the end of the URI string and, if so, whether the [[ opening and ]] closing brackets surround it. The brackets indicate that the URI string does not include a country reference.
An example is the pattern:
    http://www.lotus.com/[[%%INCLIST_my_%%]]
which prints out the link without any country code,
    http://www.lotus.com/
on the Malaysian page only.
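The rules walked through above can be restated as a short Python function. This is a re-expression of the regionalLinks template for illustration, not the code that runs on ibm.com; the function name regional_link is hypothetical. One deliberate difference: the country list is split on underscores and compared token by token, whereas the XSL uses a substring contains() test. For two-letter codes the two give the same result.

```python
def regional_link(link: str, cc: str) -> str:
    """Resolve one URI pattern for country code cc; '' means 'omit this link'."""
    # Case 1: plain %%CC placeholder, replaced at its first occurrence.
    if "%%CC" in link:
        before, _, after = link.partition("%%CC")
        return before + cc + after

    # Cases 2 and 3: inclusion and exclusion lists.
    for kind in ("INCLIST", "EXCLIST"):
        marker = f"%%{kind}_"
        if marker not in link:
            continue
        # Text between the marker and the next %%, e.g. "my_th_ph_".
        listed = link.partition(marker)[2].partition("%%")[0]
        codes = [code for code in listed.split("_") if code]
        applies = (cc in codes) if kind == "INCLIST" else (cc not in codes)
        if not applies:
            return ""
        if "[[" + marker in link:
            # Bracketed form: the URI carries no country code at all.
            return link.partition("[[" + marker)[0]
        # Unbracketed form: the pattern stands where the country code goes.
        return link.partition(marker)[0] + cc + link.partition("_%%")[2]

    # No pattern: the link is shared unchanged by every country.
    return link


financing = "http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/"
print(regional_link(financing, "my"))        # http://www.ibm.com/financing/my/
print(repr(regional_link(financing, "id")))  # ''
print(regional_link("http://www.lotus.com/[[%%INCLIST_my_%%]]", "my"))  # http://www.lotus.com/
```

Returning an empty string for a non-applicable country mirrors the XSL, which emits a blank value in that branch.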
  • 13. Last, if the string being processed with the INCLIST or EXCLIST pattern is not applicable to the current cc variable, the XSL returns a blank string. This is necessary for later processing, when the URI and TITLE are both processed for the final output. The TITLE is always included in the input XML, regardless of country tagging, so a blank string is required for a last test before the HTML output is created, to ensure that no TITLE without a corresponding URI is inserted into the output HTML. If the returned URI string is blank, no TITLE/URI combination is included in the HTML; if it is not blank, the returned string, now containing the correct country code, is included in the HTML along with the corresponding TITLE.

4.5 Shared vs. Localized Text Blocks
A comparison of About IBM pages provides a good example of the types of text blocks shared among and localized by countries.

Fig 4: www.ibm.com/ibm/my. Callouts:
• Text block that all countries share. May include country name in the text.
• Localized photo (optional)
• Shared financial info, additional section for localized financial info allowed
• Text block with localized information

Fig 5: www.ibm.com/ibm/id
  • 14. Callouts for Fig 5:
• Text block that all countries share. May include country name in the text.
• Localized photo (optional)
• Shared financial info, additional section for localized financial info allowed
• Text block with localized information

The examples in Figs 4 and 5 illustrate two types of text blocks, namely shared and localized. A shared text block is reusable, but requires some processing to allow for minor localization that gives the text a country-specific feel. For example, in the first section, it would be ideal if a country could use the general text and insert one or more localized sentences. A localized text block is specific to one country only. Examples are the history of IBM in the country, the picture of the local general manager, or the contact information for the country shown in Fig 6.

Fig 6: www.ibm.com/ibm/my continued
  • 15. For the shared text block, the text processing XSL template is modified to accept a TAG (%%COUNTRYNAME) that serves as a placeholder and country identifier within a text block. Using standard XSL, this text processing template can be invoked with the country name (for example, Malaysia). The XSL processing is a standard text substitution/replacement template, one that recursively parses a string and substitutes every instance of TAG with the passed-in parameter value.

A less obvious, but equally beneficial, outcome of this first type of text substitution is its application to the HTML Meta tags.

Malaysian Meta tags:
    <meta name="IBM.Country" content="my"/>
    <meta name="Description" content="The IBM Malaysia home page, entry point to information about IBM products and services."/>
    <meta name="Abstract" content="The IBM Malaysia home page, entry point to information about IBM products and services."/>

The corresponding Indonesian Meta tags:
    <meta name="IBM.Country" content="id"/>
    <meta name="Description" content="The IBM Indonesia home page, entry point to information about IBM products and services."/>
    <meta name="Abstract" content="The IBM Indonesia home page, entry point to information about IBM products and services."/>

The second type, the localized text block, requires a change beyond the Extended Reach approach described so far, in which only XSL processing and content are enhanced. Minor DTD changes need to be introduced to accommodate the inclusion of localized blocks of text in an XML servable. The DTD for the About IBM page, along with all the other portal pages, already accommodates the inclusion of reusable XML fragments. The root element of the About IBM DTD:
    <!ELEMENT ABOUT_IBM (SYSTEM, TITLE, TITLE_GRAPHIC?, LONG_DESCRIPTION?, SITE_SECTION, LEFT_NAVBAR, PHOTO?, PHOTO_URL?, CAPTION?, BLUE_TITLE?, COMPANY_INFO?, COUNTRY_COMPANY_INFO?,
    CONTACT_INFO?, FINANCIAL?, ADDITIONAL_INFO*, INLINE_ELEMENTS?, PUBLISHINFO+, COMMON, META_INFORMATION)>
In this example, the underlined elements are subfragments, reusable pieces of XML that can be included in the full About IBM XML servable. To accommodate the requirements for localized text blocks, the About IBM DTD was modified to create the Regional About IBM DTD:
    <!ELEMENT REGIONAL_ABOUTIBM (SYSTEM, TITLE, TITLE_GRAPHIC?, LONG_DESCRIPTION?, SITE_SECTION, LEFT_NAVBAR, PHOTO_SECTION*, BLUE_TITLE?, COMPANY_INFO?, COUNTRY_COMPANY_INPUT*, CONTACT_INFO*, FINANCIAL*, ADDITIONAL_INFO*, INLINE_ELEMENTS?, PUBLISHINFO+, COMMON, META_INFORMATION)>
The two versions of the DTD differ in the additional fragment elements in the regional version: PHOTO_SECTION and COUNTRY_COMPANY_INPUT. In addition, some
  • 16. of the fragments formerly defined as ‘cardinality zero or one’ (?) were modified to ‘cardinality zero or more’ (*). These changes provide the ability to include separate XML fragments for the localized text blocks. For example, in the regional About IBM servable created for Bangladesh (bd), Sri Lanka (lk), Vietnam (vn), Philippines (ph), Malaysia (my), Thailand (th) and Indonesia (id), the following XML fragments are included:
    <PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
        . . .
        <COMMON DATATYPE="NOLABEL">
          <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
          <COUNTRY DATATYPE="ASSOCLIST">my</COUNTRY>
          <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
        </COMMON>
      </COUNTRY_PHOTO>
    </PHOTO_SECTION>
    <PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
        . . .
        <COMMON DATATYPE="NOLABEL">
          <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
          <COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
          <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
        </COMMON>
      </COUNTRY_PHOTO>
    </PHOTO_SECTION>
    <PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
        . . .
        <COMMON DATATYPE="NOLABEL">
          <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
          <COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
          <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
        </COMMON>
      </COUNTRY_PHOTO>
    </PHOTO_SECTION>
    <PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
        . . .
        <COMMON DATATYPE="NOLABEL">
          <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
          <COUNTRY DATATYPE="ASSOCLIST">th</COUNTRY>
          <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
        </COMMON>
      </COUNTRY_PHOTO>
    </PHOTO_SECTION>
    <PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
        . . .
        <COMMON DATATYPE="NOLABEL">
          <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
          <COUNTRY DATATYPE="ASSOCLIST">vn</COUNTRY>
          <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
        </COMMON>
  • 17.
      </COUNTRY_PHOTO>
    </PHOTO_SECTION>

Similar sections of XML exist for other selected subfragment types such as COUNTRY_COMPANY_INPUT and CONTACT_INFO. This is a collapsed view of the XML containing only the ID of each included subfragment. Note the different number of fragments of each type, due to the fact that localized fragments are of cardinality zero or more. Expanding any of the subfragments reveals the XML elements that identify the applicable country.
    <COMMON DATATYPE="NOLABEL">
      <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
      <COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
      <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
    </COMMON>

During XSL processing of this servable, within the for-each loop for the servable COMMON/COUNTRY, a test is performed to verify the existence of a localized fragment and its applicability to the current cc variable. If the cc variable of the servable matches the country code of the fragment, the contents of the XML fragment are included in the generation of the output. This test within the XSL is shown below:
    <xsl:if test="boolean(../../COUNTRY_COMPANY_INPUT[COUNTRY_COMPANY_INFO/COMMON/COUNTRY=$cc])">
      <xsl:apply-templates select="../../COUNTRY_COMPANY_INPUT[COUNTRY_COMPANY_INFO/COMMON/COUNTRY=$cc]">
        <xsl:with-param name="directoryPrefix" select="$directoryPrefix"/>
        <xsl:with-param name="countryName" select="$countryName"/>
        <xsl:with-param name="cc" select="$cc"/>
      </xsl:apply-templates>
    </xsl:if>

4.6 Leadspace Rotation
The www.ibm.com homepages have a unique requirement: the ability to display rotating leadspace fragments at the top of the white space for each homepage. This feature is shown in Figures 1 and 2, where the leadspaces differ between the Malaysian and Indonesian homepages.
This feature is enabled by the homepage engine, which is run for every www.ibm.com homepage, regardless of the manner in which it was created. No modifications were required of the homepage engine to enable it to be used with the extended reach model. However, the use of the engine with the extended reach model adds another level of uniqueness to each country page generated from only one XML source.
  • 18. 4.7 Hybrid Approach
During the first phase of the Extended Reach project, the design was restricted to an implementation that would not require modification of the existing DTDs. Not having to rebuild existing content was a major consideration. For the most part, existing DTDs could accommodate content for the multi-publish output model.

In the second phase of Extended Reach, an opportunity arose to add a new set of pages, with no existing DTDs, to the www.ibm.com Corporate Portal: the Software pages for the ASEAN countries. Since there were no existing DTDs for this set of pages, a completely new design could be implemented, limited only by the restrictions set by the content management system.

The design team decided that combining the earlier approach with some modifications would work best. The %%CC notation within an XML tag is still used as a placeholder for country code substitutions. However, rather than using the inclusion/exclusion notation within the XML tag, e.g.
    http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/
editors add discrete country tags in the XML to identify the applicable countries. This approach makes content preparation simpler and less error prone for editors. They can choose a country from a dropdown list rather than typing out a string in the defined syntax for each URI. Furthermore, this approach is more consistent with standard XML tagging, as it separates the URI from the country restrictions set upon it.
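The hybrid tagging described above can be sketched in Python as follows. The fragment is a trimmed version of the report's PHOTO_SECTION sample (DATATYPE attributes, title, and image omitted), and the function name photo_url_for is hypothetical; the actual selection is performed by the XSL stylesheets.

```python
import xml.etree.ElementTree as ET

# Trimmed fragment: a %%CC placeholder in the URL, plus discrete COUNTRY tags
# that restrict the fragment to Indonesia (id) and the Philippines (ph).
FRAGMENT = """
<PHOTO_SECTION>
  <COUNTRY_PHOTO>
    <PHOTO_URL>
      <ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>
    </PHOTO_URL>
    <COMMON>
      <LANGUAGE>en</LANGUAGE>
      <COUNTRY>id</COUNTRY>
      <COUNTRY>ph</COUNTRY>
    </COMMON>
  </COUNTRY_PHOTO>
</PHOTO_SECTION>
"""


def photo_url_for(fragment_xml: str, cc: str) -> str:
    """Return the localized ITEM_URL if the fragment is tagged for cc, else ''."""
    photo = ET.fromstring(fragment_xml).find("COUNTRY_PHOTO")
    tagged = {(c.text or "").strip() for c in photo.findall("COMMON/COUNTRY")}
    if cc not in tagged:
        return ""  # fragment does not apply to this country's page
    url = photo.findtext("PHOTO_URL/ITEM_URL", default="")
    return url.replace("%%CC", cc, 1)


print(photo_url_for(FRAGMENT, "id"))        # http://www.ibm.com/software/id/lotusworkplace/
print(repr(photo_url_for(FRAGMENT, "my")))  # ''
```

Keeping the country restriction in COUNTRY elements rather than inside the URI string means the URI needs only the single %%CC rule, and the applicability check becomes a simple membership test.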
For example:
    <PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
      <COUNTRY_PHOTO>
        <TITLE DATATYPE="STRING" LINKABLE="TITLE">asean Software Home #Lead Image - IBM Lotus Workplace</TITLE>
        <PHOTO DATATYPE="STRING" SUBFRAGMENTTYPE="IMAGE">
          <IMAGE> … </IMAGE>
        </PHOTO>
        <PHOTO_URL>
          <ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>
        </PHOTO_URL>
        <COMMON DATATYPE="NOLABEL">
          <LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
          <COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
          <COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
          <AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
        </COMMON>
      </COUNTRY_PHOTO>
    </PHOTO_SECTION>
In the example above the PHOTO_SECTION element is a fragment tagged to work for id (Indonesia) and ph (Philippines) only. It is not applied for the other top-level country tags that denote the overall applicability of the page. Note the element
  • 19.
    <ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>
and the following elements
    <COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
    <COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
The XSL processes the ITEM_URL tag and includes it only for id and ph.

5 Evaluating Extended Reach in Pilots
The first Extended Reach pilot supported the output of identical pages with minimal automated localization. The second, Enhanced Extended Reach, pilot supported localized content within otherwise identical pages.

5.1 Pilot 1: Basic extended reach with identical content
The Extended Reach functionality was first rolled out in the fall of 2002 for two groups: twenty Caribbean English-speaking countries and three ASEAN English-speaking countries. At the time, the definition of the Extended Reach technique was strict: the countries in an Extended Reach group had to share identical content. The only localization the model allowed was the automatic replacement of the ISO country code in URIs. Each portal page was otherwise identical across the countries, with an automatically localized masthead and footer.

This model proved to fit the lowest-resource countries, where little or no localized content existed. Such country portals consisted of little other than the minimum 9 required top-level pages and a sufficient flow of news articles to keep the news section up to date. The three ASEAN countries that adopted this technique, namely Bangladesh, Sri Lanka, and Vietnam, as well as the twenty Caribbean countries, did benefit from the feature to an extent. A single update in the content management system published out to three web pages, thus reducing the time and money required to keep the sites fresh. An additional benefit was the reduced time required to launch new sites. The twenty Caribbean country portals did not exist before Extended Reach.
Their parallel launch took less than one hour, instead of roughly twenty times that if each one had been managed and launched as a separate portal.

However, when evaluating whether the pilot had improved site quality, it became obvious that the Extended Reach approach did not solve the problem of content creation. The countries have so few resources that even uploading news articles of regional relevance, written and published for the larger ASEAN or Americas markets, could not be done.

This result led to a re-examination of the Extended Reach model itself. A second round of requirements was gathered from the ASEAN web management. Each Extended Reach group of countries clearly needs at least one country with sufficient funds to create fresh content on an ongoing basis. As the content is uploaded into the content management system, the other countries in the same group immediately benefit from the content updates. The question to the ASEAN team was: How does the restriction on identical content need to be relaxed in order to accommodate countries with localized content into the same Extended Reach group?
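
The two mechanisms discussed so far, replacing the country code in URIs and restricting a tagged fragment to a subset of the countries in a group, can be sketched in a few lines. The following is a minimal Python illustration, not the XSL used for ibm.com. It assumes that %%CC is the country-code placeholder, as in the ITEM_URL example; the PAGE wrapper, the reduced attribute set, and the function names photo_urls_for and publish are hypothetical and exist only for this sketch.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Simplified stand-in for a page source. Element names follow the
# PHOTO_SECTION example; the PAGE wrapper and the omitted attributes
# are assumptions made for this sketch.
SOURCE = """
<PAGE>
  <PHOTO_SECTION>
    <COUNTRY_PHOTO>
      <PHOTO_URL>
        <ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>
      </PHOTO_URL>
      <COMMON>
        <COUNTRY>id</COUNTRY>
        <COUNTRY>ph</COUNTRY>
      </COMMON>
    </COUNTRY_PHOTO>
  </PHOTO_SECTION>
</PAGE>
"""


def photo_urls_for(country: str, source: str = SOURCE) -> list[str]:
    """Return the ITEM_URLs that apply to `country`, with %%CC replaced."""
    root = ET.fromstring(source)
    cc = country.lower()
    urls = []
    for fragment in root.iter("COUNTRY_PHOTO"):
        # Collect the country codes this fragment is tagged for.
        tagged = {c.text.strip().lower() for c in fragment.iter("COUNTRY") if c.text}
        if cc not in tagged:
            continue  # fragment is not tagged for this country: leave it out
        for item in fragment.iter("ITEM_URL"):
            urls.append(item.text.strip().replace("%%CC", cc))
    return urls


def publish(group: list[str]) -> dict[str, list[str]]:
    """Multi-publish: one source, one output per country in the group."""
    return {cc: photo_urls_for(cc) for cc in group}
```

Calling publish(["id", "ph", "my"]) on this source yields the Lotus Workplace URL with /id/ and /ph/ in place of %%CC for Indonesia and the Philippines, and an empty list for Malaysia, because the fragment is not tagged for my.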

5.2 Pilot 2: Enhanced extended reach with localized content

During the summer of 2003, the ASEAN web management articulated the requirements for including Indonesia, Malaysia, Philippines, and Thailand in the existing Extended Reach group. The web manager analyzed each page and stated their need for localization, i.e. which areas of the page needed to be optional and filled in with content for a subset of the countries in the group. For example, one requirement was:

“The right hand navigation modules on the Products & Services page need the ability to be localized as they are used to link to features that do not exist for all countries.”

To allow for future modification, each requirement was generalized into a rule. For example, the specific requirement above turned into the following rule in the content model:

“The element called related_info in the portal DTDs must be able to be tagged for one, some, or all of the countries in the Extended Reach group, and should only appear on the output pages for the tagged countries.”

The technique described in Section 4.5 enables this functionality today for all pages, not only the Products & Services page.

5.3 Pilot 2 Evaluation

Enhanced Extended Reach for ASEAN Country Portals was successfully deployed on September 24, 2003 for the following seven countries: Malaysia, Indonesia, Philippines, Thailand, Sri Lanka, Bangladesh, Vietnam. An improvement of 85.8% in time, and thus in web site maintenance cost, was achieved for news articles.

Enhanced Extended Reach for the ASEAN Software Portal was successfully deployed on October 8, 2003 for the following five countries: India, Singapore, Malaysia, Thailand, Philippines.

Quotes from the ASEAN team on reduced workload:

From Yee Nam Sng, ASEAN Site manager:

“The News section provides the most savings and efficiency. This is because most news articles are replicated without any change (except for local URIs) across all countries.”

“Homepage marketing modules provide savings. We are able to achieve faster turnaround and some savings by planning our updates and marketing modules across ASEAN carefully.”

From AP Creative Services editors:

“Out of the three Enhanced Extended Reach implementations, the news fragments gain the most benefits. Although it might only save around 15 minutes per news/country, it has saved us from tedious job to replicate the content and manually reposition the tiers. It also has limited the chance of errors. Publishing is now a bliss. Less fragments to load, review and publish.”

“The psychological efficiency is what we feel most. It's really tedious to duplicate the same thing over and over again. This Enhanced Extended Reach approach has increased the "Morale" of the editor by taking off these duplicate tasks.”

6 Future Work

Given the success of the Enhanced Extended Reach model, and the demonstrated cost savings it has produced, new country groupings will certainly be created. Some candidates include regions where ibm.com does not yet have country portals:

• Americas Spanish
• Middle East Arabic
• Africa English
• Africa French

Another direction is to apply the same technique to new sets of pages, much as was done for the Software pages for the ASEAN countries. In addition, the results of the Software page pilot will certainly provide lessons to the ibm.com Software group on how best to include the software portals worldwide in the same framework.

Yet another approach is to multi-publish output pages from one XML source regardless of predetermined country groupings. For example, the legal statements for many IBM countries are the same, regardless of the region or size of market. XML pages for wireless.ibm.com are generated using this approach.

7 Acknowledgements

The authors wish to thank Dikran Meliksetian, Rosa Bolger, Lisa Intravio, Chris Wang, and Marie Shafi, who helped put the methods described in this paper into practice, and who have consistently supported and contributed to its further development.

8 References

[1] ISO web site at http://www.iso.ch/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html

[2] Nianjun Zhou, Dikran Meliksetian, Louis Weitzman, Sara Elo Dean, Jeff Milton, Peter Davis, Jessica Wu. "XML Content Management: Challenges and Solutions." XML Europe 2001, May 2001.