How to make PDF files SEO Google friendly
The other day I received one of the best search engine optimization (SEO) questions ever! It was no surprise that it came from a client who publishes traditional print magazines, so I asked him if he'd mind my using the question and answer for an article on my website.
Kevin Ireland, publisher of http://www.InsiteGainesville.com and http://www.GainesvilleBizReport.com (both sites run on the Joomla content management system and represent his Gainesville, Florida print media), asked:
Hey Joe, I'm trying to figure a way that I can make the online PDF's of my
magazines searchable (SEO) by Google. By that I mean if someone plugs "inventor
John Smith" into Google, the PDF of my magazine that includes an article
with inventor John Smith will come up high in the returned results. We've
already saved all PDF's in searchable form, so someone who opens the
magazine can search for specific words but we can't figure a way to get
Google to drill down into the pages to identify specific key words. Do you
know of a method?
My e-mailed PDF SEO response, which I reserve the right to edit for the benefit of everyone down the road:
PDFs are searchable by default these days, and Google has gotten really good at it. However, if the PDF is created in Photoshop as opposed to Adobe PageMaker or MS Word, it'll be one big image, and Google can't reliably index text inside images. So Adobe PageMaker, MS Word, or any word processor or text editor that converts to PDF is the way to go. FYI, all the articles within Joomla can be converted to PDF, assuming your development company didn't turn the feature off.
The example you gave, John Smith, is, well, not the best example, because the last name Smith is one of the most common ;) However, if someone typed John Smith in Gainesville, I'd suggest you'd have a shot.
The fastest way to get Google to index your PDFs is to first have a sitemap; in Joomla I'd recommend Xmap. Then go to Google Webmaster Tools, using your Google account or one you've established for all the Google goodies, and make sure your site(s) and sitemap are registered there.
Another thing to be aware of is that Google doesn't actually index every single word! Some are what's known as stop words, and here's a URL listing the most common:
http://www.link-assistant.com/seo-stop-words.html
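To get a feel for which words in a search phrase are likely to carry weight, here is a minimal Python sketch. The `STOP_WORDS` set is a short, hypothetical sample made up for illustration; it is not Google's list, and the function name `significant_words` is likewise an assumption.

```python
# Hypothetical, abbreviated stop word list for illustration only.
# It is NOT Google's actual list.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to", "with"}


def significant_words(phrase):
    """Return the lowercased words of a phrase that are not stop words."""
    return [word for word in phrase.lower().split() if word not in STOP_WORDS]


print(significant_words("The inventor John Smith in Gainesville"))
# → ['inventor', 'john', 'smith', 'gainesville']
```

The words that survive are the ones worth working into the file name, title and first paragraph.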
Also, the same principles apply to PDFs as to HTML pages, i.e.:
1. The name of the file in the actual URL or address bar. If you are trying to be at the top for a name like Smith, you'd better have it everywhere ;)
2. The title and/or header, usually at the top, is actually more important than the majority of the content.
3. The 1st paragraph following the title or header.
4. Of course, the remaining content on the page.
5. If you are using photos, consider adding a text caption underneath.
6. Take advantage of the document metadata (title, author, subject, keywords) that PDF-creation software or Adobe Acrobat offers.
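As a rough illustration of point 1, the Python sketch below turns an article title into a keyword-bearing PDF file name. The function name `pdf_file_name` and the `max_words` limit are assumptions made for this example, not part of any Joomla or Google tool.

```python
import re
import unicodedata


def pdf_file_name(title, max_words=8):
    """Build a lowercase, hyphen-separated PDF file name from a title."""
    # Strip accents so the name stays plain ASCII in the URL.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Keep runs of letters and digits as words; anything else is a separator.
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    return "-".join(words[:max_words]) + ".pdf"


print(pdf_file_name("Inventor John Smith of Gainesville, Florida"))
# → inventor-john-smith-of-gainesville-florida.pdf
```

Hyphens are used between words because they read cleanly in the address bar; the word cap simply keeps the URL from getting unwieldy.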
The big thing you need to do is make sure the PDFs are in a sitemap for better SEO effectiveness. If they aren't picked up by the Joomla Xmap XML sitemap, you'll need to spend a few bucks on a generator that will index all the files on your web site that Joomla doesn't manage. The one I recommend and have set up several times is:
http://www.xml-sitemaps.com/ and they also have a free version that indexes up to 500 pages, which can be submitted to Google or copied to your web site :)
This was a great question! I might even have to edit this response into an article for my web site's SEO FAQs, since SEO for PDF files is often overlooked!
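If you'd rather see what such a sitemap looks like, or build a small one by hand, here is a minimal Python sketch that writes one entry per PDF in the standard sitemaps.org XML format. The function name `build_pdf_sitemap` and the example.com URLs are placeholders assumed for illustration; this is not how Xmap or xml-sitemaps.com work internally.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_pdf_sitemap(pdf_urls):
    """Return sitemap XML (as a string) with one <url> entry per PDF."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for pdf_url in pdf_urls:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us.
        ET.SubElement(url, "loc").text = pdf_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs; substitute the real addresses of your PDFs.
example_urls = [
    "http://www.example.com/magazine/inventor-john-smith-gainesville.pdf",
    "http://www.example.com/magazine/spring-issue.pdf",
]
print(build_pdf_sitemap(example_urls))
```

Save the output as something like sitemap.xml in the root of your web site, then submit its address in Google Webmaster Tools.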
If you have any further questions about PDF SEO, please contact me.
Best 'net regards,
Joe
