Remove Joomla PDFs from Google and Yahoo search results
Tuesday, 24 June 2008
Joomla has a built-in PDF generator. The problem is that Google sometimes lists the generated PDF in its search results instead of the original Joomla HTML article. The PDFs seem to rank better than the HTML, probably because their keyword density is higher: they don't include the navigation and modules usually found on a Joomla HTML page. When visitors search Google and land on the PDF instead of the article, you may lose them, because they get no navigation menu, no site search, and so on. They just get annoyed waiting for Adobe's Reader browser plugin to load.

The solution is simple: edit your robots.txt (found in the site root) and add these two lines to prevent the PDFs from being crawled and included in Google's index:

User-agent: Googlebot
Disallow: /index2.php?option=com_content&do_pdf=1*

Here are another two lines to block Yahoo's Slurp crawler from indexing Joomla-generated PDFs:

User-agent: Slurp
Disallow: /index2.php?option=com_content&do_pdf=1*

Google and Yahoo allow wildcard matches in robots.txt, while other search engine robots may not. This technique will yield results once Google reindexes your site.

Resources: Google Webmaster Help Center, "I don't want to list every file that I want to block. Can I use pattern matching?"
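Before waiting for a reindex, it can help to check which URLs a Disallow pattern would actually catch. The following is a minimal Python sketch of wildcard-style matching ('*' matches any run of characters, a trailing '$' anchors the end of the URL). It is a simplification, not Googlebot's real matcher: it ignores Allow rules, rule precedence, and URL encoding. The function names and the article id 42 in the sample URLs are hypothetical, chosen only for illustration.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes "any run of characters", a trailing '$' anchors the end,
    and every other character is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(url_path, disallow_patterns):
    """Return True if any Disallow pattern matches from the start of url_path."""
    return any(
        robots_pattern_to_regex(p).match(url_path) for p in disallow_patterns
    )


# The Disallow rule from the article.
DISALLOW = ["/index2.php?option=com_content&do_pdf=1*"]

# Hypothetical sample URLs (the id value is made up).
pdf_url = "/index2.php?option=com_content&do_pdf=1&id=42"
html_url = "/index.php?option=com_content&task=view&id=42"

print("PDF blocked: ", is_blocked(pdf_url, DISALLOW))
print("HTML blocked:", is_blocked(html_url, DISALLOW))
```

Because robots.txt rules match from the start of the path, the rule should catch the PDF URL but leave the regular article URL crawlable. Under this matching model the trailing * does not change what the rule matches.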
Last Updated (Wednesday, 25 June 2008)