Google Now Indexes Scanned Documents

Google has announced that it will now begin including scanned documents in its search results – a feat that requires an immense amount of processing power and advanced image recognition technology. Unlike standard text documents, scanned files don’t contain any text data that Google’s spiders can index. Instead, Google has employed Optical Character Recognition (OCR) technology, converting photos of words into digital text files. – TechCrunch – Google Now Indexes Scanned Documents

The implications for search engine optimization (SEO) are significant. Until now, a PDF had to be saved in text format (rather than as an image) for Google to index it, which ruled out millions of older documents, as well as documents from sources unaware of this caveat. A wealth of information online, in the form of scientific papers and technical documents, could not previously be included in search results.

If you are a business owner, you can stop worrying about whether the documents on your website will be included in search results. Instead, shift your attention to more important issues such as content, usability and increasing sales.
