This is the first of two entries I plan to write on language and content negotiation. Many non-techies do not even know what content negotiation means. That is not really a problem in itself, but they miss out on an opportunity to improve their user experience on the Internet. In this post I will limit myself to language negotiation rather than content negotiation in general, which I believe makes it particularly relevant to language professionals.

The basics of HTTP headers and content negotiation

When your browser requests a web page, it does not merely go and fetch it. It communicates and “negotiates” with the web server, so to speak. The language it uses to do so is called HTTP (Hypertext Transfer Protocol), which, along with HTML (Hypertext Markup Language), is one of the two main pillars of the World Wide Web.

For example, your browser may typically tell the server something like this: I would like to display the page referenced by the URL http://christianflury.com/blog/. By the way, I am a Mozilla-based browser that runs on Windows XP, and I have a copy of that page that was last updated on the 15th of December at 11:04, so if it has not been modified in the meantime, I do not need to download a fresh copy. For text, I like HTML or XHTML best, but plain text is okay as well, and when it comes to images, I prefer PNG to other formats such as GIF or JPEG.

The server could then reply: Nice to see you, don't worry, your copy of the page is okay, no need to transmit it once again. Or: Here you are, this is the version of the page that best respects your preferences.
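The server's half of that little dialogue can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular web server does it: the function name choose_status is made up for this example, and the year 2006 in the dates is an assumption, since the text above gives only the day and the time.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def choose_status(last_modified: str, if_modified_since: Optional[str]) -> int:
    """Return 304 if the browser's cached copy is still current, else 200.

    Both arguments are HTTP dates such as 'Fri, 15 Dec 2006 11:04:00 GMT'.
    (Hypothetical helper for illustration only.)
    """
    if if_modified_since is None:
        return 200  # the browser has no cached copy, so send the page
    try:
        cached = parsedate_to_datetime(if_modified_since)
        current = parsedate_to_datetime(last_modified)
        unchanged = current <= cached
    except (TypeError, ValueError):
        return 200  # unreadable date: play it safe and send the page
    return 304 if unchanged else 200


# "Your copy of the page is okay, no need to transmit it once again."
print(choose_status("Fri, 15 Dec 2006 11:04:00 GMT",
                    "Fri, 15 Dec 2006 11:04:00 GMT"))  # 304

# The page changed after the browser cached it: "Here you are."
print(choose_status("Wed, 21 Mar 2007 23:24:13 GMT",
                    "Fri, 15 Dec 2006 11:04:00 GMT"))  # 200
```

In HTTP terms, 304 is the status code for "Not Modified" and 200 the one for "OK".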

Obviously, they don't speak plain English but HTTP, which looks a bit like this:

http://christianflury.com/blog/

GET /blog/ HTTP/1.1

Host: christianflury.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; de; rv:1.8.1.2) Gecko/20060601 Firefox/2.0.0.2 (Ubuntu-edgy)
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: de,en;q=0.8,fr;q=0.5,it;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://christianflury.com/
Cookie: language=last_changed&1174519453&language&de

HTTP/1.x 200 OK

Date: Wed, 21 Mar 2007 23:24:43 GMT
Server: Apache/2.0.58 (Unix) mod_ssl/2.0.58 OpenSSL/0.9.7a
Content-Language: de, en
Last-Modified: Wed, 21 Mar 2007 23:24:13 GMT
Content-Type: application/xhtml+xml

One note to security-paranoid techies reading this: Yes, now you can copy my cookie and log on to my site pretending that you have set the interface to German and last changed your preferences yesterday evening. Big deal.

The Accept-Language header

In the funny HTTP-ish conversation above, one line is of particular interest to us:
Accept-Language: de,en;q=0.8,fr;q=0.5,it;q=0.3
Translated into plain English, this means: I prefer to read in German. If you don't have German, give me English; if you don't have English, give me French; and if you don't speak French, you might as well serve me your content in Italian. The q values are weights between 0 and 1; a language listed without one, like de here, counts as 1.
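For the technically inclined, here is a minimal sketch of how a program might read that line. The function name parse_accept_language is made up for this example, and treating a malformed weight as 0 is an assumption of this sketch rather than anything the header format prescribes.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Split an Accept-Language value into (language, weight) pairs, best first."""
    prefs = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        lang = parts[0].lower()
        if not lang:
            continue  # skip empty entries such as a trailing comma
        q = 1.0  # a language without an explicit q counts as 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: a malformed weight means "not wanted"
        prefs.append((lang, q))
    # sorted() is stable, so equal weights keep the order the browser sent
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


print(parse_accept_language("de,en;q=0.8,fr;q=0.5,it;q=0.3"))
# [('de', 1.0), ('en', 0.8), ('fr', 0.5), ('it', 0.3)]
```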

Believe it or not: servers do listen to you (sometimes). For example, when you visit my homepage for the very first time, the interface and all the content (except this blog) will appear in German, French or English according to the Accept-Language header your browser sends. The same goes for a lot of other websites.

How to tell your browser

Great, you may object, but how do I tell my browser? After all, I don't speak HTTP.

Luckily, you can tell most browsers quite easily which Accept-Language header you want them to send. To learn how to change your language preferences in your browser of choice, you may want to check out this nice W3C document.

In Firefox 2.0, it looks as follows:

Image of my language settings in Firefox 2.0
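Behind the scenes, the browser has to turn that ordered list into an Accept-Language header. The sketch below shows one simple way to do so: it spreads the weights evenly and rounds them to one decimal. Both the function name build_accept_language and the weighting scheme are assumptions for the sake of illustration; a real browser may compute its weights differently, although this scheme does reproduce the header shown earlier in this post.

```python
def build_accept_language(languages: list[str]) -> str:
    """Turn an ordered list of language codes into an Accept-Language value.

    Assumed scheme: the i-th of n languages gets the weight (n - i) / n,
    rounded half up to one decimal and kept between 0.1 and 0.9 for every
    language after the first.
    """
    n = len(languages)
    items = []
    for i, lang in enumerate(languages):
        if i == 0:
            items.append(lang)  # the favourite carries the implicit weight 1
            continue
        # Integer arithmetic for "10 * (n - i) / n, rounded half up"
        tenths = (20 * (n - i) + n) // (2 * n)
        tenths = max(1, min(9, tenths))
        items.append(f"{lang};q=0.{tenths}")
    return ",".join(items)


print(build_accept_language(["de", "en", "fr", "it"]))
# de,en;q=0.8,fr;q=0.5,it;q=0.3
```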

How will this improve your user experience?

Suppose your mother tongue is Icelandic (a lovely language, by the way), you are fluent in English, and you speak a bit of Flemish. If you visit a Belgian website that's available in French and Flemish, the website can determine that the Flemish version will be most relevant to you and spare you the hassle of looking for the language navigation and changing the interface language to Flemish manually. For people with a multilingual background (and thanks to increased mobility there are more and more of them) who regularly search for information on the Internet, this can prove quite useful.
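On the server side, the decision in this example could look roughly like the sketch below. Everything in it is hypothetical: the function names, the header I made up for our Icelandic visitor (is,en;q=0.7,nl;q=0.3, with nl being the language code for Dutch), and the assumption that the Belgian site labels its versions fr and nl-BE. The matching rule is deliberately simple: a requested language matches a version with the same code or with that code followed by a hyphen.

```python
def weighted_languages(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language value into (language, weight) pairs, best first."""
    prefs = []
    for item in header.split(","):
        lang, _, params = item.strip().partition(";")
        lang = lang.strip().lower()
        if not lang:
            continue
        q = 1.0  # no explicit weight means 1.0
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0  # assumption: a malformed weight means "not wanted"
        prefs.append((lang, q))
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def pick_language(header: str, available: list[str], default: str) -> str:
    """Return the available version that best fits the visitor's header."""
    for wanted, q in weighted_languages(header):
        if q <= 0:
            continue  # a weight of 0 means "do not send me this language"
        for version in available:
            code = version.lower()
            # "nl" matches "nl" and "nl-BE"; "*" matches anything
            if wanted == "*" or code == wanted or code.startswith(wanted + "-"):
                return version
    return default  # nothing matched: fall back to the site's default


# The Icelandic visitor on the Belgian site: no Icelandic or English version
# exists, so the Flemish one wins over French.
print(pick_language("is,en;q=0.7,nl;q=0.3", ["fr", "nl-BE"], default="fr"))
# nl-BE
```

A visitor whose header matches none of the versions simply gets the default, which is why a visible language navigation remains necessary.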

What the future might bring

For the time being, search engines and online glossaries do not yet make use of the browser's language settings. As far as I know, Google, for example, weights search results only according to the language the user chose for the interface. One reason might be that most users are not even aware that they can fine-tune their language preferences in this way, so they stay with the default values (mostly their local language and English). However, you never know what the future will bring. Search engines should use every possibility to further increase the relevance of their search results.

There is also some potential for multilingual resources such as glossaries, which could guess a linguist's working languages from these language settings. As always, when a website makes assumptions about its users, it should give them an easy way to change their preferences, but this could still be quite a time-saver in a lot of environments.

Categories: Language and Translation; Web Development and Programming

Christian Flury
