Meta Tag ISO Language and Country Codes

This page lists and discusses Language Codes and Attributes. Page two of this series lists and discusses Country Codes and ccTLD issues. Page three discusses gTLD (generic Top Level Domain) issues.

The lang attribute defines the base language of the text on a Web page, which allows HTML to be internationalized for a very large number of languages. If the metatag generator does not offer the language you want, feel free to type one of the codes listed on this page into the code yourself (see below).

Languages are designated by a two-letter code, such as “en” for English or “es” for Spanish. One or more hyphenated values can be tacked on to the initial two-letter code to specify regional or ethnic variations, such as “en-us” for U.S. English. A list of Country Codes is on page two of this series.
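To make the structure of these codes concrete, here is a minimal Python sketch that splits a code into its two-letter language part and any hyphenated additions. The function name split_language_tag is invented for this illustration and is not part of any metatag generator.

```python
def split_language_tag(tag):
    """Split a code such as "en-us" into its language part and any additions."""
    parts = tag.strip().split("-")
    primary = parts[0].lower()
    subtags = [part.lower() for part in parts[1:]]
    return primary, subtags


print(split_language_tag("en"))     # ('en', [])
print(split_language_tag("en-us"))  # ('en', ['us'])
```

The first element is always the base language; anything after a hyphen narrows it down to a region or variant.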

Note that HTML files containing languages whose characters fall outside the ANSI character set have to be saved in a Unicode encoding (such as UTF-8), rather than as an ANSI file, in order for the characters to be displayed properly. Please be aware that some older browsers may not be capable of correctly displaying a Unicode file.
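As a rough illustration of why the file encoding matters, the following Python sketch tests whether a piece of text can be stored in a given encoding. It assumes that “ANSI” means the Windows code page cp1252, which is the usual meaning on Western systems; the helper name can_encode and the sample strings are made up for the example.

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


french = "Buvons une boisson carbonatée"
chinese = "中文"

# Assumption: "ANSI" here means the Windows code page cp1252.
print(can_encode(french, "cp1252"))   # True
print(can_encode(chinese, "cp1252"))  # False
print(can_encode(chinese, "utf-8"))   # True
```

Accented Western European text fits in the ANSI code page, but Chinese does not, whereas UTF-8 can hold both.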

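The DOCTYPE is a separate matter. The EN in a DocType declaration names the language of the DTD (the document type definition itself), which is written in English; it says nothing about the language of the page. Therefore, the DocType declaration at the top of web pages will always say EN, even though the rest of the page is in a different language: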

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
-or-
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

Don’t get carried away and change this. It has to be EN for the DOCTYPE.

Got that? Good. Now, on to the more interesting part of the story…

Content-Language Meta Tag

This:

<meta http-equiv="Content-Language" content="en">

Declares the language of the whole page, in this case English. It is document-level metadata about the intended audience, not a statement about any particular block of text, and it does not set the character encoding (that is the job of the charset declaration). According to the W3C note referenced below, the value may also be a comma-separated list of languages, such as “en, fr”.

Other Tags

The lang attribute applies to all HTML tags except applet, base, basefont, br, frame, frameset, iframe, param and script.

I find it most useful in the HTML, DIV and SPAN tags.

This:

HTML:

<html lang="en">

-or-

XHTML:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">

Tells the browser, search engine or screen reader what language the text itself is written in.

It’s perfectly possible for the meta http-equiv declaration and the html lang attribute to differ. For example, I declare the document-level language of my Canadian websites to be English, but at times have French and English on the same page, each with a properly declared lang attribute (the language can be declared on spans, divs and other tags, not just the HTML one).

From a search engine standpoint, the most useful declaration of a language is the lang attribute on the HTML, DIV or SPAN tag, which marks the language of the text itself, rather than the meta http-equiv, which only describes the document as a whole.

Naturally, both should exist and be correct, but as far as returning results to a searcher is concerned, a search engine would look at the language declared on the content the visitor reads, not at the document-level declaration.

Reference: http://www.w3.org/International/questions/qa-http-and-lang
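To check whether the two declarations on a page agree, both values can be pulled out of the markup. The sketch below uses Python's standard html.parser module; the class name LanguageDeclarations, the helper declared_languages and the sample page are invented for illustration, and this is not how any particular search engine works.

```python
from html.parser import HTMLParser


class LanguageDeclarations(HTMLParser):
    """Collect the meta Content-Language value and the html lang attribute."""

    def __init__(self):
        super().__init__()
        self.meta_language = None
        self.html_lang = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lower-cased
        if tag == "html":
            self.html_lang = attrs.get("lang") or attrs.get("xml:lang")
        elif tag == "meta":
            if (attrs.get("http-equiv") or "").lower() == "content-language":
                self.meta_language = attrs.get("content")


def declared_languages(markup):
    """Return (meta Content-Language value, html lang value) for a page."""
    parser = LanguageDeclarations()
    parser.feed(markup)
    return parser.meta_language, parser.html_lang


page = (
    '<html lang="fr"><head>'
    '<meta http-equiv="Content-Language" content="en">'
    '</head></html>'
)
print(declared_languages(page))  # ('en', 'fr')
```

A mismatch such as the one printed here is not necessarily a mistake, but it is worth confirming that it is intentional.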

Multiple Languages

A document can name more than one language or language group as its intended audience, but each individual element can carry only one language.

This means that the meta http-equiv="Content-Language" tag can list several languages separated by commas, e.g. “en, zh, pt, fr”.

The lang= or xml:lang= attribute takes exactly one language code, e.g. “en”, or a local variant such as “en-us” (US English). To mark up several languages, put the attribute on each block of text separately.

One method of having more than one language on a page is by using the DIV or SPAN language attributes:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="en">
<DIV Lang="en">
Let us drink a carbonated beverage and sit on this chesterfield, ok?
</DIV>
<DIV Lang="en-us">
Let’s drink a soda on the sofa, hey?
</DIV>
<DIV Lang="en-ca">
Let’s down a pop on the couch, eh?
</DIV>
<DIV Lang="fr">
Buvons une boisson carbonatée et reposons-nous sur ce Chesterfield.
</DIV>
</html>

This tells the browser what to expect for the page as a whole, and lets a search engine or screen reader know how to deal with individual blocks of text.
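To see how software might attribute a language to each block of text, here is a rough sketch using Python's standard html.parser module. It pairs every run of text with the nearest enclosing lang attribute, falling back to the one on the HTML tag. The class name BlockLanguages and the sample markup are invented for this example; real browsers and screen readers are far more thorough.

```python
from html.parser import HTMLParser


class BlockLanguages(HTMLParser):
    """Pair each run of text with the nearest enclosing lang attribute."""

    def __init__(self):
        super().__init__()
        self.stack = []   # (tag, language in effect inside that tag)
        self.blocks = []  # (language, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        inherited = self.stack[-1][1] if self.stack else None
        self.stack.append((tag, dict(attrs).get("lang") or inherited))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray close tags are ignored.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            language = self.stack[-1][1] if self.stack else None
            self.blocks.append((language, text))


sample = (
    '<html lang="en">'
    '<div lang="en-ca">Let\'s down a pop on the couch, eh?</div>'
    '<div lang="fr">Buvons une boisson.</div>'
    '<p>Plain paragraph.</p>'
    '</html>'
)
parser = BlockLanguages()
parser.feed(sample)
for language, text in parser.blocks:
    print(language, "->", text)
```

The last paragraph has no lang attribute of its own, so it picks up “en” from the HTML tag, which is why a sensible page-wide default is worth declaring.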

Case

Language and country codes are NOT case sensitive.
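Because case does not matter, software that compares codes should ignore it. The Python sketch below shows one way to do that, plus a helper that writes a code with a lower-case language and an upper-case two-letter country. That spelling is only a common convention, not something this page requires, and both function names are invented for the example.

```python
def same_language_tag(a, b):
    """Compare two language codes without regard to case."""
    return a.casefold() == b.casefold()


def conventional_case(tag):
    """Write a language-country code the customary way, e.g. "en-US"."""
    parts = tag.split("-")
    language = parts[0].lower()
    # Only two-letter additions are treated as country codes and upper-cased;
    # anything else is left exactly as it was written.
    rest = [part.upper() if len(part) == 2 else part for part in parts[1:]]
    return "-".join([language] + rest)


print(same_language_tag("EN-us", "en-US"))  # True
print(conventional_case("EN-us"))           # en-US
```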

 

ISO 639: Two-letter Language Codes

Language (sorted by language) | Code | Code (sorted by code) | Language
Abkhazian | AB | AA | Afar
Afar | AA | AB | Abkhazian
Afrikaans | AF | AF | Afrikaans
Albanian | SQ | AM | Amharic
Amharic | AM | AR | Arabic
Arabic | AR | AS | Assamese
Armenian | HY | AY | Aymara
Assamese | AS | AZ | Azerbaijani
Aymara | AY | BA | Bashkir
Azerbaijani | AZ | BE | Byelorussian
Bashkir | BA | BG | Bulgarian
Basque | EU | BH | Bihari
Bengali, Bangla | BN | BI | Bislama
Bhutani | DZ | BN | Bengali, Bangla
Bihari | BH | BO | Tibetan
Bislama | BI | BR | Breton
Breton | BR | CA | Catalan
Bulgarian | BG | CO | Corsican
Burmese | MY | CS | Czech
Byelorussian | BE | CY | Welsh
Cambodian | KM | DA | Danish
Catalan | CA | DE | German
Chinese | ZH | DZ | Bhutani
Corsican | CO | EL | Greek
Croatian | HR | EN | English
Czech | CS | EO | Esperanto
Danish | DA | ES | Spanish
Dutch | NL | ET | Estonian
English | EN | EU | Basque
Esperanto | EO | FA | Persian
Estonian | ET | FI | Finnish
Faeroese | FO | FJ | Fiji
Fiji | FJ | FO | Faeroese
Finnish | FI | FR | French
French | FR | FY | Frisian
Frisian | FY | GA | Irish
Gaelic (Scots Gaelic) | GD | GD | Gaelic (Scots Gaelic)
Galician | GL | GL | Galician
Georgian | KA | GN | Guarani
German | DE | GU | Gujarati
Greek | EL | HA | Hausa
Greenlandic | KL | HE | Hebrew (formerly IW)
Guarani | GN | HI | Hindi
Gujarati | GU | HR | Croatian
Hausa | HA | HU | Hungarian
Hebrew | HE (formerly IW) | HY | Armenian
Hindi | HI | IA | Interlingua
Hungarian | HU | ID | Indonesian (formerly IN)
Icelandic | IS | IE | Interlingue
Indonesian | ID (formerly IN) | IK | Inupiak
Interlingua | IA | IS | Icelandic
Interlingue | IE | IT | Italian
Inupiak | IK | JA | Japanese
Irish | GA | JV | Javanese (formerly JW)
Italian | IT | KA | Georgian
Japanese | JA | KK | Kazakh
Javanese | JV (formerly JW) | KL | Greenlandic
Kannada | KN | KM | Cambodian
Kashmiri | KS | KN | Kannada
Kazakh | KK | KO | Korean
Kinyarwanda | RW | KS | Kashmiri
Kirghiz | KY | KU | Kurdish
Kirundi | RN | KY | Kirghiz
Korean | KO | LA | Latin
Kurdish | KU | LN | Lingala
Laothian | LO | LO | Laothian
Latin | LA | LT | Lithuanian
Latvian, Lettish | LV | LV | Latvian, Lettish
Lingala | LN | MG | Malagasy
Lithuanian | LT | MI | Maori
Macedonian | MK | MK | Macedonian
Malagasy | MG | ML | Malayalam
Malay | MS | MN | Mongolian
Malayalam | ML | MO | Moldavian
Maltese | MT | MR | Marathi
Maori | MI | MS | Malay
Marathi | MR | MT | Maltese
Moldavian | MO | MY | Burmese
Mongolian | MN | NA | Nauru
Nauru | NA | NE | Nepali
Nepali | NE | NL | Dutch
Norwegian | NO | NO | Norwegian
Occitan | OC | OC | Occitan
Oriya | OR | OM | Oromo, Afan
Oromo, Afan | OM | OR | Oriya
Pashto, Pushto | PS | PA | Punjabi
Persian | FA | PL | Polish
Polish | PL | PS | Pashto, Pushto
Portuguese | PT | PT | Portuguese
Punjabi | PA | QU | Quechua
Quechua | QU | RM | Rhaeto-Romance
Rhaeto-Romance | RM | RN | Kirundi
Romanian | RO | RO | Romanian
Russian | RU | RU | Russian
Samoan | SM | RW | Kinyarwanda
Sango | SG | SA | Sanskrit
Sanskrit | SA | SD | Sindhi
Serbian | SR | SG | Sango
Serbo-Croatian | SH | SH | Serbo-Croatian
Sesotho | ST | SI | Singhalese
Setswana | TN | SK | Slovak
Shona | SN | SL | Slovenian
Sindhi | SD | SM | Samoan
Singhalese | SI | SN | Shona
Siswati | SS | SO | Somali
Slovak | SK | SQ | Albanian
Slovenian | SL | SR | Serbian
Somali | SO | SS | Siswati
Spanish | ES | ST | Sesotho
Sundanese | SU | SU | Sundanese
Swahili | SW | SV | Swedish
Swedish | SV | SW | Swahili
Tagalog | TL | TA | Tamil
Tajik | TG | TE | Telugu
Tamil | TA | TG | Tajik
Tatar | TT | TH | Thai
Telugu | TE | TI | Tigrinya
Thai | TH | TK | Turkmen
Tibetan | BO | TL | Tagalog
Tigrinya | TI | TN | Setswana
Tonga | TO | TO | Tonga
Tsonga | TS | TR | Turkish
Turkish | TR | TS | Tsonga
Turkmen | TK | TT | Tatar
Twi | TW | TW | Twi
Ukrainian | UK | UK | Ukrainian
Urdu | UR | UR | Urdu
Uzbek | UZ | UZ | Uzbek
Vietnamese | VI | VI | Vietnamese
Volapuk | VO | VO | Volapuk
Welsh | CY | WO | Wolof
Wolof | WO | XH | Xhosa
Xhosa | XH | YI | Yiddish (formerly JI)
Yiddish | YI (formerly JI) | YO | Yoruba
Yoruba | YO | ZH | Chinese
Zulu | ZU | ZU | Zulu
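Four codes that circulate in older copies of this list, IW (Hebrew), IN (Indonesian), JI (Yiddish) and JW (Javanese), were later replaced by HE, ID, YI and JV. Some software still accepts the old forms, so a small lookup can help when cleaning up existing pages. The Python sketch below is only an illustration: the names REPLACED_CODES and current_code are invented here, and the mapping covers only those four codes.

```python
# Older two-letter codes and the codes that replaced them.
REPLACED_CODES = {"iw": "he", "in": "id", "ji": "yi", "jw": "jv"}


def current_code(code):
    """Return the current form of a two-letter language code, lower-cased."""
    code = code.lower()
    return REPLACED_CODES.get(code, code)


print(current_code("IW"))  # he
print(current_code("fr"))  # fr
```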

 


Unless otherwise noted, all articles written by Ian McAnerin, BASc, LLB. Copyright © 2002-2006 All Rights Reserved. Permission must be specifically granted in writing for use or reprinting anywhere but on this site, but we do allow it and don’t charge for it, other than a backlink. Contact Us for more information.