Multilingual support (Indic)
Editor-In-Chief: C. Michael Gibson, M.S., M.D. [1]
Overview
Several pages on Wikipedia use Indic scripts to illustrate the native representation of names, places, quotes and literature. Unicode is the encoding used on Wikipedia and it contains support for a number of Indic scripts. However, before Indic scripts can be viewed or edited, support for Complex Text Layout must be enabled on your operating system. Some older operating systems do not support complex text rendering and you should not use such systems to edit Indic scripts.
This page lists the methods for enabling complex text rendering based on the operating environment or browser you are using. Many of the methods highlighted can be used for non-Indic complex scripts such as Arabic.
Check for existing support
The table below shows how a correctly configured computer renders each script alongside how your computer renders the same text:
Script | Correct rendering | Your computer |
Bengali | File:Examples.of.complex.text.rendering.Bengali.png | ক + ি → কি |
Devanagari | File:Examples.of.complex.text.rendering.Devanagari.png | क + ि → कि |
Gujarati | File:Examples.of.complex.text.rendering.Gujarati.png | ક + િ → કિ |
Gurmukhi | File:Examples.of.complex.text.rendering.Gurmukhi.png | ਕ + ਿ → ਕਿ |
Kannada | File:Examples.of.complex.text.rendering.Kannada.png | ಕ + ಿ → ಕಿ |
Malayalam | File:Examples.of.complex.text.rendering.Malayalam.png | ക + െ → കെ |
Oriya | File:Examples.of.complex.text.rendering.Oriya.png | କ + େ → କେ |
Tibetan | File:Examples of complex text rendering Tibetan.png | ར + ྐ + ྱ → རྐྱ |
Tamil | File:Examples.of.complex.text.rendering.Tamil.png | க + ே → கே |
Telugu | File:Examples.of.complex.text.rendering.Telugu.png | య + ీ → యీ |
If the rendering on your computer matches the rendering in the images for the scripts, then you have already enabled complex text support! You should be able to view text correctly in that script. However, this does not mean you will be able to edit text in that script. To edit such text you need to have the appropriate text entry software on your operating system.
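To see why Complex Text Layout matters, it can help to look at what is actually stored. The sketch below is only an illustration and is not part of any tool mentioned on this page: the helper name print_ki is made up, and it assumes a POSIX shell with od available and a UTF-8 terminal. It writes the Devanagari pair from the table (क, U+0915, followed by the vowel sign ि, U+093F) and then dumps the bytes, showing that the consonant is stored first even though a correct renderer draws the vowel sign to its left.

```shell
# print_ki (hypothetical helper): emit Devanagari KA (U+0915) followed by
# VOWEL SIGN I (U+093F) as raw UTF-8 bytes. The bytes are written as octal
# escapes so the function works in any POSIX shell.
print_ki() {
    printf '\340\244\225\340\244\277\n'
}

print_ki                 # glyphs: how your terminal shapes the pair
print_ki | od -An -tx1   # bytes: e0 a4 95 e0 a4 bf 0a (consonant first)
```

Terminals often shape complex scripts less well than browsers do, so a wrong result here does not necessarily mean your browser is misconfigured; the byte dump is the same either way.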
Platform Independent support on Mozilla Firefox
Indic IME, a plugin for Firefox 1.0+, can help you write in many Indian languages on web pages. It is easy to install and works on all platforms where Firefox or other Mozilla-based browsers run.
The Indic IME toolbar project was started to address the need to type in Indian languages in web forms, emails, blogs, search boxes and so on.
Padma, a plugin for Firefox 2.0+, converts several Indic fonts to Unicode. This lets several popular Indian vernacular websites render correctly without any additional font installation.
Windows 95, 98, ME and NT
These operating systems contain no built-in support for Indic scripts. Indic scripts can only be seen properly in Internet Explorer. You also need to have an appropriate Unicode font installed on your system for that script. Installing Internet Explorer 6.0 is suggested because it has better support for Indic scripts.
Mozilla Firefox does not support Indic scripts properly on these operating systems unless a modified version of the program is used, such as the one found here. This is due to a bug in Firefox [2], [3]. The bug was fixed in Firefox 3 Alpha, but Firefox 3 does not support Windows 98/ME.
No Unicode keyboard driver engines (such as Indic IME or BarahaIME) are available for these older systems. You can use either online typing tools or offline text editors made specially for this purpose. A list of such tools is given here.
Windows 2000
Supports: Devanagari, Tamil
Complex text support needs to be manually enabled.
Viewing Indic text
- Go to Start > Settings > Control Panel > Regional Options > General [Tab].
- In the "Language settings for this system" frame, check the box next to "Indic".
- Copy the appropriate files from the Windows 2000 CD when prompted.
- If prompted, reboot your computer once the files have been installed.
If you don't have the Windows CD, or don't want to fetch it right now, you can download this zip file and extract its contents to a folder. When prompted for the Windows CD, point to this folder using the 'Browse' option of the prompt window.
Inputting Indic text
You must follow the steps above before you perform the remaining steps.
- Select "Input Locale" [Tab].
- Click the "Add" button in the "Installed input locales" frame.
- Select the desired language in the "Input Locale" drop-down box on the "Add Input Locale" dialogue box.
- Now select the appropriate keyboard you wish to use.
- People who cannot use the InScript keyboard above can use the phonetic keyboards from Baraha. Baraha Direct, included in the Baraha package, supports both ANSI and Unicode, while BarahaIME supports only Unicode.
- For people who cannot download the above software, or for people on the move, dboard is an Indian language sandbox that provides an online virtual (visual) keyboard. Type the text there, copy it to the clipboard, and then paste it into the Wikipedia editing box.
Windows XP and Server 2003
Supports: Bengali (XP SP2), Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam (XP SP2), Tamil, Telugu
Complex text support needs to be manually enabled.
Viewing Indic text
- Go to Start > Control Panel.
- If you are in "Category View", select the icon that says "Date, Time, Language and Regional Options" and then select "Regional and Language Options".
- If you are in Classic View select the icon that says "Regional and Language Options".
- Select the "Languages" tab and make sure you select the option saying "Install files for complex script and right-to-left languages (including Thai)". A confirmation message should now appear - press "OK" on this confirmation message.
- Allow the OS to install necessary files from the Windows XP CD and then reboot if prompted.
Inputting Indic text
Windows XP has built-in InScript keyboards for nearly all Indian languages. You can add them via Control Panel. You must follow the steps above before you perform the remaining steps.
- In the "Regional and Language Options", click the "Languages" tab.
- Click the "Details..." button.
- Click the "Add" button to add a keyboard for your particular language.
- In the drop-down box, select your required Indian language.
- Make sure the check box labelled "Keyboard layout/IME" is selected and ensure you select an appropriate keyboard.
- Now select "OK" to save changes.
You can use the combination ALT + SHIFT to switch between different keyboard layouts (e.g. from a UK Keyboard to Gurmukhi and vice-versa). If you want a language bar, you can select it by pressing the "Language Bar..." button on the "Text Services and Input Languages" dialog and then selecting "Show the language bar on my desktop". The language bar enables you to visually select the keyboard layout you are using.
- For people who cannot use the InScript keyboard above, other keyboard drivers are available. BarahaIME is suggested for phonetic typing and IndicIME for Remington typing.
Baraha is phonetic-based software and covers nearly all Indic languages. Baraha Direct, included in the Baraha package, supports both ANSI and Unicode, while BarahaIME supports only Unicode.
- Indic IME 1 (v5.0) is available from Microsoft Bhasha India. It supports Hindi scripts, Gujarati, Kannada and Tamil. Indic IME 1 gives the user a choice between a number of keyboards including Phonetic, InScript and Remington.
If you do not have the Windows CD, there is a modified version of the installer for Hindi, named Hindi Toolkit, which automatically installs Indic support as well as the Hindi Indic IME.
- For people who cannot download the above software, or for people on the move, dboard is an Indian language sandbox that provides an online virtual (visual) keyboard. Type the text there, copy it to the clipboard, and then paste it into the Wikipedia editing box.
- MyMyanmar Projects provides the MyMyanmar Unicode System to input Myanmar (Burmese) text.[1]
Windows Vista
Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Sinhala, Tamil, Telugu, Tibetan
Complex text support is automatically enabled.
Viewing Indic text
You do not need to do anything to enable viewing of Indic text.
Inputting Indic text
Windows Vista, like Windows XP, has built-in InScript keyboards for nearly all Indian languages. You can add them via Control Panel.
For Phonetic typing BarahaIME is suggested and for Remington typing IndicIME is suggested.
Mac OS 9 and earlier
The Indian Language Kit, available from Apple at additional cost,[4] provides support for Devanagari, Gujarati and Gurmukhi. No third-party Unicode solutions are known, though numerous custom-encoded fonts exist.
Mac OS X
- Mac OS X 10.3 and earlier support Devanagari, Gujarati, Gurmukhi
- Mac OS X 10.4 adds support for Tamil
- Mac OS X 10.5 adds support for Tibetan
- Free Bangla fonts & keyboard available from ekushey.org
- Free Kannada fonts and keyboard available from nickshanks.com
- Free Telugu font (no keyboard) available from nickshanks.com
- Non-free fonts and keyboards for all Indic scripts available from xenotypetech.com
Viewing Indic text
You do not need to do anything to enable viewing of Indic text as long as you use Safari or most other Cocoa applications, which fully support rearrangement and substitution for AAT-based fonts. Firefox up to version 2.0 does not support Indic script rendering at all because it does not use ATSUI (Firefox renders little rectangles instead). Opera also provides some support, although considerable bugs remain as of version 9.2 (though Opera at least renders the glyphs).
Carbon software such as Microsoft Word, Adobe Photoshop and their siblings do not generally support Indic scripts, due to broken or non-existent ATSUI implementations.
Inputting Indic text
Specific keyboard layouts can be enabled in System Preferences, in the International pane. Switching among enabled keyboard layouts is done through the input menu in the upper right corner of the screen. The input menu appears as an icon indicating the current input method or keyboard layout — often a flag identified with the country, language, or script. Specific instructions are available from the "Help" menu (search for "Writing text in other languages").
Mac OS 10.4 system software comes with two installable Keyboard input options for Tamil: Murasu Anjal and Tamilnet 99. One needs to do the following steps to activate them:
i) Open "International" located within System Preferences and select "Language". Select "Edit List", select "Tamil" from the list of languages shown and click OK.
ii) Select "Input Menu" to see a list of the keyboard options available. Select the "Anjal" and "Tamilnet99" keyboards under Murasu Anjal Tamil and click OK.
iii) The Anjal and Tamilnet99 keyboard icons appear immediately in the list of keyboards available under the country flag in the top menu bar.
An alternate way to activate the keyboard(s) for Devanagari (Hindi etc.):
i) Open "International" located within System Preferences and select the "Input Menu" tab.
ii) Check the option for "Devanagari" and/or "Devanagari - QWERTY".
iii) Check the "Show input menu in menu bar" option at the bottom of the "International" panel. Close the panel, and the new keyboard(s) should be available for selection when you click on the menu bar icon (upper right corner).
SIL distributes Ukelele, a freeware tool that allows anyone to design their own input keyboard for Mac OS X.
GNOME
Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Tibetan
Viewing Indic text
You do not need to do anything to enable viewing of Indic text in GNOME 2.8 or later. Older versions may have support for some, but not all Indic scripts. Ensure you have appropriate Unicode fonts for each script you wish to view or edit.
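One way to check which installed fonts cover a script is to ask fontconfig. The following is a sketch that assumes the fc-list utility from fontconfig is installed (common on GNOME desktops, but an assumption here). The helper name fonts_for_lang is hypothetical, and the language codes hi, ta and bn are only examples.

```shell
# fonts_for_lang LANG (hypothetical helper): print the font families that
# fontconfig reports as covering the given language code, e.g. hi, ta, bn.
fonts_for_lang() {
    if ! command -v fc-list >/dev/null 2>&1; then
        echo "fc-list not found; is fontconfig installed?" >&2
        return 2
    fi
    # ":lang=xx" is a fontconfig pattern; "family" limits output to names.
    fc-list ":lang=$1" family | sort -u
}

# Example: report coverage for a few Indic languages.
for code in hi ta bn; do
    echo "== $code =="
    fonts_for_lang "$code" || break
done
```

An empty list for a language suggests that no installed font covers it, in which case installing a Unicode font for that script is the next step.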
Some web browsers may require you to enable Pango rendering to view Indic text properly.
- For Epiphany, Pango rendering can be enabled in GConf. Press Alt+F2 to bring up the Run Application dialog, then enter
gconf-editor
and click Run. The Configuration Editor window will appear. In the left pane, unfold apps → epiphany and click the web section. In the right pane, check the box next to the enable_pango option, then restart Epiphany.
- When using Mozilla or Firefox, you can enable Pango rendering by opening xterm and typing
MOZ_ENABLE_PANGO=1 mozilla
or
MOZ_ENABLE_PANGO=1 firefox
A variable set on the command line this way applies only to the browser session it starts. To make the setting permanent, export it from your shell profile as described for SUSE below.
- This will work only on Firefox compiled with --enable-pango. Only the Firefox binaries supplied by Fedora Core 4 and 5, Ubuntu Linux, and Kate OS are compiled with this build option.
- For Ubuntu 6.06, this support has been turned off due to speed issues. To enable support, you must type
MOZ_DISABLE_PANGO=0 firefox
Future sessions do not remember this setting, so it must be repeated.
- For SUSE 10.1 you have to add "MOZ_ENABLE_PANGO=1" to your .profile to make the effect permanent.
- Go to your home directory, then edit the .profile file (it is a hidden file).
- Scroll down to the last line of the file and add: export MOZ_ENABLE_PANGO=1
- Save the .profile file. Restart for the change to take effect.
- The easiest way to check whether --enable-pango was used in your copy of Firefox is to type about:buildconfig in the address bar and look for the string --enable-pango.
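The .profile edit described for SUSE can also be scripted. The following is a minimal sketch, not an official procedure: the function name enable_pango_in_profile is made up for illustration, and it assumes a POSIX shell with grep and mktemp. It appends the export line only if the file does not already contain it, so running it twice is harmless.

```shell
# enable_pango_in_profile [FILE] (hypothetical helper): append the export
# line to FILE unless it is already present. FILE defaults to ~/.profile.
enable_pango_in_profile() {
    profile="${1:-$HOME/.profile}"
    line='export MOZ_ENABLE_PANGO=1'
    # -q: quiet, -x: match the whole line, -F: fixed string, not a regex
    if grep -qxF "$line" "$profile" 2>/dev/null; then
        echo "already set in $profile"
    else
        printf '%s\n' "$line" >> "$profile"
        echo "added to $profile"
    fi
}

# Try it on a scratch file first; call it with no argument for ~/.profile.
scratch=$(mktemp)
enable_pango_in_profile "$scratch"   # adds the line
enable_pango_in_profile "$scratch"   # second call changes nothing
rm -f "$scratch"
```

The sketch assumes the existing file ends with a newline; if it does not, add one before appending so the export lands on its own line.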
Inputting Indic text
- Go to Applications > Preferences > Keyboard.
- Select the "Layouts" tab.
- Select the keyboard for the language or script you wish to use from the "Available Layouts" frame and then press "Add".
- Press "Close" to close the dialogue box.
- Right click on the main menu on your desktop and select "Add to Panel...".
- Select "Keyboard Indicator" and click "Add".
- Position the keyboard indicator on your menu bar and click it to switch between keyboard layouts.
KDE
Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu.
Viewing Indic text
You do not need to do anything to enable viewing of Indic text. Ensure you have appropriate Unicode fonts for each script you wish to view or edit.
Inputting Indic text
- In the Control Center, go to Regional & Accessibility, Keyboard Layout
- In the tab Layout, click on Enable keyboard layouts
- Choose the layout you want in Available layouts
- Click on Apply
- Now, you will have an icon for the KDE Keyboard Tool in your panel, in which you can choose the layout you want
Debian Based GNU/Linux Distributions
Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Tibetan, Punjabi.
Viewing Indic text
Enter as root:
apt-get install ttf-indic-fonts
and when the installation is complete restart the X server.
For Tibetan script:
apt-get install ttf-tmuni
For Mozilla and Firefox, see the comments above under "GNOME". Rendering should work correctly "out of the box" as of Debian 4.0 (etch).
Fedora Core 6 and Fedora 7 Linux Distribution
Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Punjabi among others.
Installing Indic Fonts
For example, to install Kannada fonts, enter the following command as root on the console:
yum install fonts-kannada
This downloads the Kannada fonts from the repositories and installs them.
Similarly, for Hindi, enter the following command as root on the console:
yum install fonts-hindi
Keyboard Support for Indic texts
Start the Add/Remove Software applet. In KDE, for example, navigate to System and then Add/Remove Software. In the applet window, select Languages in the list box on the left. In the list box on the right, select the Indian languages of interest to you.
For example, to get Kannada keyboard support, check the box for Kannada Support. Similarly, for Hindi support, check the box for Hindi Support.
It has been observed that for Kannada, Fedora provides not only Kannada keyboard support but also transliteration support and support for KGP (Kannada Ganaka Parishad) keyboards. With this feature, users can type Kannada words in Roman script and have them transliterated into Kannada text in the application of their choice, for example a browser, text editor, document editor or email client. Users can also use native Kannada keyboards, KGP-based or otherwise, to type Kannada text directly.
Gentoo Linux
Supports: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.
Installing Indic fonts
emerge fonts-indic
(The mozilla-*-bin products shipped by Gentoo are taken directly from Mozilla's FTP servers and aren't built with Pango support. If you notice a problem with this, you need to compile your own version with USE="-moznopango". Firefox 3 will ship with Pango enabled by default.)
Inputting Indic text
emerge -av scim-tables scim-m17n
Study the USE flags and the LINGUAS flags and set them according to your desktop environment and the language support you need. The following variables need to be set whenever you log in (append them to your .xinitrc or .xsession).
export XMODIFIERS=@im=SCIM   # case matters for this variable!
export GTK_IM_MODULE=scim
export QT_IM_MODULE=scim
Mozilla apps and precompiled software such as acroread might not play well with scim (C++). In such cases, make use of scim-bridge (C - avoiding C++ ABI issues) [5].
emerge scim-bridge
and start Firefox as:
% GTK_IM_MODULE=scim-bridge firefox
You might have to start the SCIM daemon manually (add it to your session's startup):
scim -d
SCIM is a unified frontend for currently available input method libraries.
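If input still does not work, a quick first check is whether the three variables reached your session. The function below is a hedged sketch and not part of SCIM: the name check_scim_env is hypothetical. It compares the variables against the values given in this section, accepting scim-bridge for GTK_IM_MODULE because that value is used for Firefox above.

```shell
# check_scim_env (hypothetical helper): report any input-method variable
# that differs from the values given in this section. Returns 0 if all
# three look right, 1 otherwise.
check_scim_env() {
    rc=0
    # XMODIFIERS is case-sensitive: it must be exactly @im=SCIM.
    if [ "${XMODIFIERS:-}" != "@im=SCIM" ]; then
        echo "XMODIFIERS is '${XMODIFIERS:-}', expected '@im=SCIM'"
        rc=1
    fi
    # GTK applications may use scim directly or go through scim-bridge.
    case "${GTK_IM_MODULE:-}" in
        scim|scim-bridge) ;;
        *) echo "GTK_IM_MODULE is '${GTK_IM_MODULE:-}', expected 'scim' or 'scim-bridge'"
           rc=1 ;;
    esac
    if [ "${QT_IM_MODULE:-}" != "scim" ]; then
        echo "QT_IM_MODULE is '${QT_IM_MODULE:-}', expected 'scim'"
        rc=1
    fi
    return "$rc"
}

if check_scim_env; then
    echo "SCIM environment looks fine"
fi
```

Run it from a terminal started inside the desktop session, since variables set in .xinitrc or .xsession are only visible to programs launched from that session.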
Unicode OpenType fonts
- This section lists OpenType fonts, supported by Microsoft Windows and most Linux distributions. For AAT fonts (required for the Apple Macintosh), see the Mac OS X section above.
If you have followed the instructions for your computer system as mentioned above and you still cannot view Indic text properly, you may need to install a Unicode font:
- Bengali: SolaimanLipi
- Burmese: MyMyanmar Unicode System Padauk Uni 4.1.1
- Devanagari: Mangal
- Gujarati: Padma (Currently Unavailable)
- Gurmukhi: AnmolUni, Saab
- Khmer: KhmerOS
- Lao: Laoplanet.net
- Malayalam: Anjali or Rachana
- Sinhala: LKLUG, Malathi
- Tamil: Akshar Unicode
- Telugu: Akshar Unicode
- Tibetan: Tibetan Machine Uni
Department of Information Technology, India has provided Unicode Indic fonts for most of the Indian languages.
WAZU JAPAN's Gallery of Unicode Fonts is an excellent resource for all Indic scripts.
References
External links
- Microsoft BhashaIndia Article - How to enable Indic Language Support at OS level?
- QuillPad, a tool for transliterating into native scripts
- Enabling Kannada at the Kannada Wikipedia
- Punjabi Computing Resource Centre - Resources
- Bangla Unicode fonts and typing system project
- Fedora Core 3 release notes, with instructions for enabling Pango rendering in Mozilla.
- Homepage of Indlinux
- Information provided on Marathi Wikipedia about Enabling Devanagari Fonts
- Information at THDL about Tibetan Fonts & Unicode
- Online writing of Devanagari using an English keyboard. The website also supports Tamil, Telugu, Malayalam, Kannada, Gujarati, Oriya, Bengali and Punjabi. See Devanagari transliteration for help as to which letters correspond with which Devanagari characters.
- paahijen - Applications in Indian Languages.
- entrans: GPL Online, collaborative translation tool package
- Online Indic Keyboard Input: Uses indic_web_input package from entrans