A previous post here described problems caused by PayPal's inability to receive UTF-8 encoded data. Although I still consider it a major PayPal fault, it may be possible to work around this bug by setting your site's encoding in your PayPal account. When logged in, follow the 'Edit Profile' link under the 'Account' tab; there should be a link to change your language encoding. It's not very noticeable, but it's there. I haven't tested it, and I still prefer not to use non-English characters in the value part of the input.
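If you would rather keep non-English characters out of the values altogether, a small check before building the form can flag the offending fields. This is only a rough sketch: the function name and the sample field names and values are made up for illustration and are not taken from Colnect's code.

```python
from typing import Dict, List


def non_ascii_fields(fields: Dict[str, str]) -> List[str]:
    """Return the names of fields whose values contain non-ASCII characters."""
    # str.isascii() is available from Python 3.7 onwards.
    return [name for name, value in fields.items() if not value.isascii()]


# Hypothetical sample data, for illustration only.
sample = {"item_name": "Phonecard", "custom": "Télécarte"}
flagged = non_ascii_fields(sample)
```

Any field that ends up in `flagged` would need to be transliterated or replaced before the form is sent.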
Good luck :)
Colnect, Connecting Collectors. Colnect offers revolutionary services to collectors the world over. Colnect is available in 63 languages and offers extensive collectible catalogs, the easiest personal collection management, and Auto-Matching for deals. Join us today :)
Wednesday, April 15, 2009
Invalid URL Requests From Legitimate Bots
In a former post I mentioned that I had no idea why legitimate bots such as GoogleBot try invalid URLs for which no link exists on the site or in the sitemap.
Now I have a partial answer for the non-existent URLs presented in that post. Some time ago, a Twitter account for Colnect editors was opened: @ColnectEdits. It automatically tweets about edits made to Colnect's catalogs so that other collectors can track them.
An interesting thing you can see in the attached picture is that the links generated by the tweets are displayed as http://colnect.com/en/phone... but actually link to the correct full URLs, such as http://colnect.com/en/phonecards/item/id/9212. So it seems that web crawlers read both as legitimate URLs and try to fetch them. Since GoogleBot does not seem to learn that /en/phone returns 404 on Colnect, I am now forced to add these as legitimate URLs on my site to avoid seeing more 404s in my logs. Oh well...
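One way to treat such truncated paths as legitimate is to map them back to the section they were cut from and redirect there. The sketch below is only an assumption about how that could be done, not how Colnect actually handles it: the function name, the section list and the minimum length are all hypothetical, and only /en/phonecards comes from the example above.

```python
from typing import Optional, Sequence

# Hypothetical list of real section roots; only /en/phonecards appears in the post.
KNOWN_SECTIONS: Sequence[str] = ("/en/phonecards",)

# Assumed threshold: ignore very short paths such as "/en" that could match anything.
MIN_PREFIX_LEN = 6


def resolve_truncated(path: str,
                      sections: Sequence[str] = KNOWN_SECTIONS) -> Optional[str]:
    """Return the section a truncated path most likely meant, or None."""
    cleaned = path.rstrip(".")  # drop a trailing "..." if the crawler kept it
    if len(cleaned) < MIN_PREFIX_LEN:
        return None
    # A truncated path is a strict prefix of a real section root.
    matches = [s for s in sections if s.startswith(cleaned) and s != cleaned]
    # Only act when exactly one section matches, to avoid guessing.
    return matches[0] if len(matches) == 1 else None
```

A request handler could call `resolve_truncated(request_path)` before returning a 404 and send a redirect to the returned section when the result is not `None`.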
Labels:
googlebot,
twitter,
web crawlers
Phone cards catalog: biggest, most extensive, free
Happy to announce that Colnect's phone cards catalog, the world's most extensive phone card catalog, now has over 150,000 phone cards listed.
Colnect's catalog is an endeavor of many collectors from around the world who constantly improve it.
Using Colnect's catalog, collectors from around the world can easily manage their personal collections on Colnect and find swap buddies.
Special thanks go to all the contributors, editors and translators of Colnect.
Happy collecting :)