Reprinted from TidBITS by permission; reuse governed by Creative Commons license BY-NC-ND 3.0. TidBITS has offered years of thoughtful commentary on Apple and Internet topics. For free email subscriptions and access to the entire TidBITS archive, visit http://www.tidbits.com/

PDFpen Scan+ Puts a Scanner in Your Pocket

Michael E. Cohen

Hot on the heels of the iOS 7 release (see '[1]iOS 7 Pre-flight Checklist,' 18 September 2013), [2]Smile Software has released a new iOS app, [3]PDFpen Scan+, that turns your iPad or iPhone into a document scanner. You can make use of the camera in your device to scan a document directly (the app offers its own camera interface for that), or you can use images from your device's photo collection. Yes, I said 'images': with the app you can combine multiple images into a single multi-page document. The result is a PDF that you can share with Smile's PDFpen for [4]iPad or [5]iPhone, or with a number of cloud services, including iCloud, Dropbox, Evernote, Google Docs, Alfresco, and Box. The app also features page-edge detection and OCR (optical character recognition) capability that supports sixteen languages.

That, of course, all sounds great, in theory. But how does it work in practice? I bought a copy of it (it costs $4.99) so I could test it out, and I discovered that it works well enough as a pure image scanner, but the OCR capability has lots of room for refinement and improvement.

Using It -- When you first launch the app on an iPhone, for example, you see the Documents screen along with a set of simple buttons along the bottom: a camera button, for shooting a document and converting it to PDF; an images button, for creating a PDF from images stored in your Photos collections; and an add button, for importing copies of PDFs or images from other sources, such as Evernote or Dropbox.

[6][tn_scanplus-inputs.jpg]

Tap the camera button to take a picture of the document you want to make into a PDF.
The interface is straightforward, although, inexplicably, it uses the old iOS 6-style shutter control button, which already looks out of place in iOS 7. But that's just a cosmetic issue: the real issue is that it is just darn hard to hold the camera still enough to get a clear, clean picture of a text document. Smile recommends you exhale and shoot when your body is at its most still. You can accept or reject each picture, and you can specify whether your document will consist of a single image or multiple images by tapping the number button at the top left of the screen.

[7][tn_scanplus-camera-interface.PNG.jpg]

After you have taken your best shot(s), you then see the edge-setting screen. When you shoot a paper document against a dark background, the auto-edge detection works quite well; against lighter, less contrasty backgrounds, not so much. Fortunately, the draggable corner-control points make it easy to specify the image area you want to include if auto-edge detection messes up or if you only want to capture a portion of the image. You can also specify the entire image on this screen, set the paper size of the final document (the default is US Letter), or just discard the image and try again. Note that if the selected area created when you set the edges is more trapezoid than rectangle (a common problem when shooting a document at a slight angle), the cropped area is scaled appropriately to fit into a rectangle, eliminating the perspective effect.

[8][tn_scanplus-autoedge.PNG.jpg] [9][tn_scanplus-manual-edge.PNG.jpg]

With the edges squared away, you next adjust the image quality itself, using the controls along the bottom of the screen to fix the orientation and to adjust the brightness and contrast of the image. You want to fiddle with these last controls to make the text as legible as you can against the page background, eliminating shadows as much as possible.
[10][tn_scanplus-imagecontrol.PNG.jpg]

Finally, with the image looking as good as you can make it, you arrive at a screen where you can choose to perform OCR or to add additional pages. Performing OCR is a simple matter of tapping the OCR button and then tapping OCR Document from the menu that appears. The actual OCR process is not fast: yellow bars march down the page image as PDFpen Scan+ identifies text to interpret, and once that stage is done, it then spends some time (as much as a minute or two) digesting and analyzing the results before it adds an invisible text layer to the PDF.

[11][tn_scanplus-ocr-in-action.PNG.jpg]

How Well Does It Work? -- Unfortunately, the OCR capability is not particularly smart or forgiving. For example, it understands nothing about a two-column page, so while it might include all the text on the page, when you choose the Copy Document Text option from the OCR menu, you end up with a mashup of text that can read more like a poem by Ezra Pound on a bender than normal prose. And it can easily be baffled by text that, to human eyes, looks quite readable. For example, here is the beginning of text derived from the two-column page I scanned for this article:

The real-estate to and respond with. lt distracts the user from his task, nml it the chances for the device being dropped. how much ml estate is esme . devices. In the iPhOne has the least the iPzd the moa ThlS not account display which may be another costly. it takes tesoumes to create. display. and inletfau- views; time to design and test the code. On the other hand. sticking to a single simplifies design and cuts down lllllt'

Even if a text image is of exceedingly high quality, the OCR can do some peculiar things. For example, I took a screenshot on my iPad with Retina display of a page from a novel in iBooks and used that for OCR testing.
While PDFpen Scan+ managed to recognize almost every word correctly, it would sometimes, inexplicably, swap lines of text so that they appeared out of order when I copied them to the Notes app.

But it can also do a creditable job deciphering other texty images. For example, I shot an exhibit sign at Santa Monica Airport's Museum of Flying over the weekend, and gave a portion of it to PDFpen Scan+ to munch on.

[12][tn_scanplus-f86-image.jpg]

The resulting interpretation was actually quite good, containing only a few blunders and dropping an occasional word:

North Amencan The North American F-86 Sabre (sometimes called the Sabrejet) was a transonic jet fighter aircraft. The is best known as America's first swept wing fighter which could counter the similarly-winged Soviet MiG-15 in high speed dogfights. The was produced as both a fighter- intereeptor and fighter-bomber. Although developed in the late 1940s and outdated by the end of the 1950s, the Sabre proved versatile and adapt- able, and continued as a front-line fighter in numerous air forces. The F-86 Sabre was the first American aircraft to take advantage of flig ht re- search data seized from the German aerodynamicists at the end of the war. Use of a swept wing would solve their speed a slat on the wing's leading edge which extended at low speeds would enhance low- speed stability. Its success led to an extended production run of more than 7,800 aircraft between 1949 and i956, by far the most-produced Western jet fighter. About this Aircraft This is an F-86H model that is currently undergoing restoration and will be finished with a California Air National Guard livery andis onloan from the National Museum of Naval Aviation.

Bottom Line -- On the one hand, the product is a remarkably capable app considering its price and the conditions under which it must obtain images: shaky hands and available light as opposed to a stable flatbed scanner with a cover and its own light-source.
In fact, I'm surprised that the results are as good as they are considering that they are being produced by a device that can fit in my pocket! Frankly, if you need to make a quick PDF of some paper documents while on the go, I can't think of an easier or cheaper way to do it. Plus, its close integration with Smile's PDFpen apps for iPhone and iPad makes it simple to annotate and mark up the PDFs you make with PDFpen Scan+.

At the same time, the OCR results are quite variable, and, all too often, disappointing (though sometimes unintentionally hilarious). Before I could recommend it to a serious road warrior who needs high-quality OCR on the go, I would want to see some significant improvement in this feature. However, for casual use I have no qualms: I fully intend to use it on my next trip to the museum!

References

1. http://tidbits.com/article/14117
2. http://www.smile.com/
3. http://www.smilesoftware.com/PDFpen/Scan/index.html
4. http://www.smilesoftware.com/PDFpen/iPad/index.html
5. http://www.smilesoftware.com/PDFpen/iPhone/index.html
6. http://tidbits.com/resources/2013-09/scanplus-inputs.png
7. http://tidbits.com/resources/2013-09/scanplus-camera-interface.PNG
8. http://tidbits.com/resources/2013-09/scanplus-autoedge.PNG
9. http://tidbits.com/resources/2013-09/scanplus-manual-edge.PNG
10. http://tidbits.com/resources/2013-09/scanplus-imagecontrol.PNG
11. http://tidbits.com/resources/2013-09/scanplus-ocr-in-action.PNG
12. http://tidbits.com/resources/2013-09/scanplus-f86-image.jpg