instantiationLanguage

The descriptor instantiationLanguage identifies the primary language of a media item’s audio or text. Alternative audio or text tracks and their associated languages should be identified using the descriptor instantiationAlternativeModes.

XPATH LOCATION:

/ pbcoreDescriptionDocument / pbcoreInstantiation / instantiationLanguage

USAGE RULES:

Occurs:

At most 1 time

May Contain:

4 or fewer optional attributes, specific:

Contained by:

Contained with:

[Any elements used MUST appear in this relative order]

EXAMPLES:

  • <instantiationLanguage source="ISO 639.2" ref="http://www.loc.gov/standards/iso639-2/php/code_list.php">eng;fre</instantiationLanguage>
  • <instantiationLanguage source="ISO 639.2" ref="http://www.loc.gov/standards/iso639-2/php/code_list.php" annotation="Algonquian languages">alg</instantiationLanguage>

VOCABULARIES:

  • PBCore recommends use of the ISO 639.2 or ISO 639.3 3-letter language codes.
  • If the media item has more than one language in the same primary audio or text, the codes can be combined into a single statement, e.g., eng;fre when both English and French are present in the primary audio. The preferred form separates the three-letter language codes with a semicolon and no additional spaces.
  • Alternative audio or text tracks and their associated languages should be identified using the descriptor instantiationAlternativeModes.
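
The combination statement described above has to be split apart by any tool that reads the element. The following is a minimal Python sketch of one way to do that, not an official PBCore utility. The helper names split_language_codes and join_language_codes are hypothetical, the sample document omits the XML namespace that a real PBCore record would declare, and the check only confirms that each code is three lowercase letters, not that it is a registered ISO 639 code.

```python
import re
import xml.etree.ElementTree as ET

# Three lowercase letters, the shape shared by ISO 639-2 and ISO 639-3 codes.
# This checks the shape only; it does not look the code up in either list.
CODE_PATTERN = re.compile(r"[a-z]{3}")


def split_language_codes(value):
    """Split a semicolon-delimited instantiationLanguage value into codes."""
    codes = [part.strip() for part in value.split(";") if part.strip()]
    for code in codes:
        if not CODE_PATTERN.fullmatch(code):
            raise ValueError(f"not a three-letter language code: {code!r}")
    return codes


def join_language_codes(codes):
    """Build the preferred form: codes joined by ';' with no spaces."""
    return ";".join(codes)


# Hypothetical sample record, written without a namespace for brevity.
SAMPLE = """
<pbcoreDescriptionDocument>
  <pbcoreInstantiation>
    <instantiationLanguage source="ISO 639.2">eng;fre</instantiationLanguage>
  </pbcoreInstantiation>
</pbcoreDescriptionDocument>
""".strip()

root = ET.fromstring(SAMPLE)
element = root.find("pbcoreInstantiation/instantiationLanguage")
languages = split_language_codes(element.text)
print(languages)  # ['eng', 'fre']
```

Tolerating stray spaces on input while always writing the compact form on output keeps the reader lenient and the writer consistent with the preferred form.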

3 responses to “instantiationLanguage”

  1. Dave Rice

    The strange formatting of this field makes handling these values really challenging. Since PBCore is expressed in XML, I recommend that instantiationLanguage simply be maxOccurs = unbounded. The odd manner of handling language expressions as semicolon-delimited strings is inconsistent with the rest of the standard and a hassle to manage when building or parsing a PBCore document.

    For example, having:

    <instantiationLanguage>eng</instantiationLanguage>
    <instantiationLanguage>spa</instantiationLanguage>

    would be a lot more convenient than having to deal with the string parsing inside:

    <instantiationLanguage>eng;spa</instantiationLanguage>

    Dave

  2. Jack Brighton

    I second Dave’s motion.

    Martina McGinn added this on the PBCore Talk listserv:
    “I agree and suggest that instantiationAlternativeModes could also warrant multiple occurrences (e.g. a film dubbed in English and subtitled in Spanish). And it seems logical that if instantiationLanguage allows multiple occurrences then so should essenceTrackLanguage.”

    Seems like a minor schema tweak would iron out these rough spots. But of course there are others, and we need to follow through on identifying them, like this one.

    Jack

  3. Henri Cook

    I found ISO 639-3 largely unsuitable for this. I wanted to be able to represent different variations of English (e.g. en-US and en-GB), which led me to stumble upon http://www.rfc-editor.org/pdfrfc/rfc5646.txt.pdf and its Language-Region format, which in my opinion should be the standard here. Of course, I welcome opinions.

    This discussion was also started on Stack Overflow: http://stackoverflow.com/questions/15965003/what-language-standard-should-i-use-if-i-want-to-at-least-differentiate-between
