Tag Archive for 'google'

Conflict between Google Translate widget and Firefox Flashblock extension

The Google Translate widget allows webmasters to add on-demand translation of a website page. Very easy to configure and deploy.

The rendered widget appears in the browser as a simple select dropdown with options for all the languages supported by Google Translate. When the user selects his desired target language, the widget is supposed to contact the Google Translate mother ship, translate the text on the page, and then add a fixed iframe panel at the top of the browser viewport, followed by the translation of the page. The translated page even implements onmouseover handlers for text-based elements that display the original source text. Sweet.

It has worked great for me in the past. But I recently did a new deployment and I was unable to get the widget to work. When I (as the user) selected my target language, I got the following:

Error: The server could not complete your request. Try again later.

A bit more poking showed that it was only happening in Firefox. All my other browsers were ok. Eventually, I narrowed it down to a conflict with the Flashblock extension. Disabling the extension solved the problem.

Now the tough choice is whether to run without Flashblock or without Google Translate. But at least I can deploy for the customer.

2010-02-20 Update: The mere presence of the enabled Flashblock extension when visiting a page with the translate widget does not cause the problem. The issue only occurs when the page has Flash content and the extension is configured to block Flash on that page.

The reason: The widget appears to actually use Flash!

In the file

http://translate.googleapis.com/translate_static/js/element/main.js

the function wf() seems to add Flash embed code into the page. I imagine that Flashblock is detecting this attempted insertion and is doing its magic.

I actually find it interesting – and impressive – that the Flashblock extension is smart enough to not only find/block Flash content on initial page load, but also at any time after that. I imagine it monitors the DOM and vigilantly swaps out any Flash embeddings with its own replacement button.

I can see that the widget creates several iframes, at least one of which has Flash content pulled from the domain translate.googleapis.com. It stands to reason – well, at least to me – that enabling Flash content for this domain should do the trick. But so far, no luck. ;-(

No canonical domain on NYTimes.com

I am surprised to find that NYTimes.com does not provide a canonical domain for its homepage or for its articles (at least, for the small sampling I checked). That is, I get the “exact” same behavior in my browser irrespective of entering http://nytimes.com or http://www.nytimes.com.

There is no shortage of opinion about whether to use a www or no www on your domain. Nearly all agree that your site should respond to both versions.
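
A minimal Python sketch of what "responding to both versions" with a single canonical domain could look like: every request to a non-canonical host gets a permanent (301) redirect to the canonical one. The host names, the choice of the www form as canonical, and the function name are all hypothetical, picked only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values: pick whichever form you prefer as canonical.
CANONICAL_HOST = "www.example.com"
KNOWN_HOSTS = {"example.com", "www.example.com"}


def canonical_redirect(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed.

    Only the host is rewritten; scheme, port, path, query string and
    fragment are preserved. Any user:password@ part is dropped.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Leave unknown hosts and already-canonical URLs alone.
    if host not in KNOWN_HOSTS or host == CANONICAL_HOST:
        return None

    netloc = CANONICAL_HOST
    if parts.port is not None:
        netloc += f":{parts.port}"

    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(canonical_redirect("http://example.com/section/page.html?id=7"))
    print(canonical_redirect("http://www.example.com/section/page.html?id=7"))
```

In practice this is usually done with a rewrite rule in the web server configuration rather than in application code, but the logic is the same.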


Google news sitemaps and article url requirements

One customer of mine has been trying to get his content listed in the Google News index (as distinct from Google’s main web index).

Google’s general guidelines for news publishers include a collection of Technical Requirements for Article URLs.

One of those requirements strikes me as just odd:

Display a three-digit number. The URL for each article must contain a unique number consisting of at least three digits. For example, we can’t crawl an article with this URL: http://www.google.com/news/article23.html. We can, however, crawl an article with this URL: http://www.google.com/news/article234.html. Keep in mind that if the only number in the article consists of an isolated four-digit number that resembles a year, such as http://www.google.com/news/article2006.html, we won’t be able to crawl it. Please note, this rule is waived with News sitemaps.
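
For illustration, the rule as quoted can be approximated in a few lines of Python. This is only a sketch of one reading of the published wording, not Google's actual crawler logic. The function name, the decision to look only at the URL path, and the "resembles a year" test (exactly four digits starting with 19 or 20) are all assumptions. It also does not check that the number is unique across articles.

```python
import re
from urllib.parse import urlsplit


def passes_three_digit_rule(url):
    """Rough check of the quoted 'three-digit number' URL requirement.

    Assumptions (not confirmed by Google):
      * only the path portion of the URL is examined;
      * a run of exactly four digits starting with 19 or 20 is treated
        as "resembling a year" and does not count.
    """
    path = urlsplit(url).path
    for digits in re.findall(r"\d+", path):
        if len(digits) < 3:
            continue  # too short to satisfy the rule
        if len(digits) == 4 and digits[:2] in ("19", "20"):
            continue  # looks like a year, so it is ignored
        return True
    return False


if __name__ == "__main__":
    # The three example URLs from the quoted requirement.
    for example in (
        "http://www.google.com/news/article23.html",
        "http://www.google.com/news/article234.html",
        "http://www.google.com/news/article2006.html",
    ):
        print(example, passes_three_digit_rule(example))
```

Under these assumptions, only the article234.html example passes, which matches the three cases in the quoted text.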

What the heck is that for? As long as each article has its own unique URL, why must it have a unique number as part of the URL? It sounds like a pointless hoop through which news sites are simply forced to jump in order to demonstrate that they are “big” enough to jump through it. I would be grateful to anyone who can provide a definitive answer to what purpose is served by this requirement.

Anyway, rather than change the entire CMS, I opted to develop a Google News sitemap, a creature that uses a different format than the standard sitemap protocol. Google Webmaster Tools confirms that the sitemap is being spidered and is error free.
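
To give a feel for how a news sitemap differs from a standard one, here is a hedged Python sketch that builds one with the standard library. It is not the code used for the customer's CMS. The element names and the news namespace follow the news sitemap schema as I understand it from Google's later documentation, so treat them as an assumption and check them against the current documentation; the format in use when this was written may have differed. The function name, the article dictionary keys, and the example.com URL are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumed namespace for the news extension; verify against Google's docs.
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"


def build_news_sitemap(articles, publication_name, language="en"):
    """Return a news sitemap as an XML string.

    `articles` is a list of dicts with the (hypothetical) keys
    'loc', 'published' (a W3C-format date string) and 'title'.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("news", NEWS_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for article in articles:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = article["loc"]

        # The news-specific block is what a standard sitemap lacks.
        news = ET.SubElement(url, f"{{{NEWS_NS}}}news")
        publication = ET.SubElement(news, f"{{{NEWS_NS}}}publication")
        ET.SubElement(publication, f"{{{NEWS_NS}}}name").text = publication_name
        ET.SubElement(publication, f"{{{NEWS_NS}}}language").text = language
        ET.SubElement(news, f"{{{NEWS_NS}}}publication_date").text = article["published"]
        ET.SubElement(news, f"{{{NEWS_NS}}}title").text = article["title"]

    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_news_sitemap(
        [{
            "loc": "http://www.example.com/news/some-article.html",
            "published": "2009-04-13",
            "title": "Some article",
        }],
        publication_name="Example Times",
    ))
```

Note that the article URL in the example deliberately contains no three-digit number, since the sitemap is supposed to waive that requirement.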

Although Google representatives confirmed to me via email on several occasions that the 3-digit-url requirement is waived for sites that submit a news sitemap and the spider is still successfully pulling/parsing the news sitemap file, the customer has yet to see any content show up in Google News.

It actually occurred to me later that since Google will probably only spider/add new content to the news index (as opposed to the standard index), we really could get away with simply changing the scheme for new URLs only. So, we bit the bullet and changed the CMS to add the digits.

It will be interesting to see if it makes a difference. If so, it undermines the Google claim that the url rule is waived for sites implementing a news sitemap.

Stay tuned.

2009-04-13 Update: Content is now appearing in Google News. Apparently, the three-digit rule waiver is unreliable.