Reprinted from TidBITS by permission; reuse governed by Creative Commons
license BY-NC-ND 3.0. TidBITS has offered years of thoughtful commentary
on Apple and Internet topics. For free email subscriptions and access to the
entire TidBITS archive, visit http://www.tidbits.com/


   PDFpen Scan+ Puts a Scanner in Your Pocket

   Michael E. Cohen

   Hot on the heels of the iOS 7 release (see '[1]iOS 7 Pre-flight
   Checklist,' 18 September 2013), [2]Smile Software has released a new
   iOS app, [3]PDFpen Scan+, that turns your iPad or iPhone into a
   document scanner. You can use the camera in your device to scan a
   document directly (the app offers its own camera interface for that),
   or you can use images from your device's photo collection.

   Yes, I said 'images': with the app you can combine multiple images
   into a single multi-page document. The result is a PDF that you can
   share with Smile's PDFpen for [4]iPad or [5]iPhone, or with a number of
   cloud services, including iCloud, Dropbox, Evernote, Google Docs,
   Alfresco, and Box. The app also features page-edge detection and OCR
   (optical character recognition) capability that supports sixteen
   languages.

   That, of course, all sounds great, in theory. But how does it work in
   practice? I bought a copy of it (it costs $4.99) so I could test it
   out, and I discovered that it works well enough as a pure image
   scanner, but the OCR capability has lots of room for refinement and
   improvement.

   Using It -- When you first launch the app on an iPhone, for example,
   you see the Documents screen and a set of simple buttons along
   the bottom: a camera button, for shooting a document and converting it
   to PDF; an images button, for creating a PDF from images stored in your
   Photos collections; and an add button, for importing copies of PDFs or
   images from other sources, such as Evernote or Dropbox.

   [6][tn_scanplus-inputs.jpg]

   Tap the camera button to take a picture of the document you want to
   make into a PDF. The interface is straightforward, although,
   inexplicably, it uses the old iOS 6-style shutter button, which
   already looks out of place in iOS 7. But that's just a cosmetic issue;
   the real problem is that it is just darn hard to hold the camera still
   enough to get a clear, clean picture of a text document. Smile
   recommends you exhale and shoot when your body is at its most still.
   You can accept or reject each picture, and you can specify whether your
   document will consist of single images or multiple images by tapping
   the number button at the top left of the screen.

   [7][tn_scanplus-camera-interface.PNG.jpg]

   After you have taken your best shot(s), you then see the edge-setting
   screen. When you shoot a paper document against a dark background, the
   auto-edge detection works quite well; against lighter, less contrasty
   backgrounds, not so much. Fortunately, the draggable corner-control
   points make it easy to specify the image area you want to include if
   auto-edge detection messes up or if you only want to capture a portion
   of the image. You can also specify the entire image on this screen, set
   the paper size of the final document (the default is US Letter), or
   just discard the image and try again. Note that if the selected area
   created when you set the edges is more trapezoid than rectangle (a
   common problem when shooting a document at a slight angle), the cropped
   area is scaled appropriately to fit into a rectangle, eliminating the
   perspective effect.
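
   For the curious, the trapezoid-to-rectangle step can be sketched as a
   standard four-corner perspective transform (a homography). This is
   only an illustration of the general technique, not Smile's actual
   code: the function names, the corner coordinates, and the 100
   pixels-per-inch US Letter target are all assumptions made up for the
   example.

```python
# Illustrative sketch only: NOT PDFpen Scan+'s implementation.
# All names and sample coordinates below are hypothetical.

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner layout")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return 8 coefficients mapping four src corners onto four dst corners."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def warp_point(h, x, y):
    """Apply the perspective transform to a single point."""
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical dragged corners (top-left, top-right, bottom-right,
# bottom-left) of a page shot at a slight angle, in photo pixels.
corners = [(120, 80), (520, 100), (600, 700), (60, 680)]
# Assumed target: US Letter at 100 pixels per inch (850 x 1100).
page = [(0, 0), (850, 0), (850, 1100), (0, 1100)]

h = homography(corners, page)
for x, y in corners:
    print((x, y), "->", warp_point(h, x, y))
```

   The sketch only maps points; a real scanner app would also have to
   resample every pixel through the inverse of this mapping to produce
   the flattened page image.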

   [8][tn_scanplus-autoedge.PNG.jpg]

   [9][tn_scanplus-manual-edge.PNG.jpg]

   With the edges squared away, you next adjust the image quality itself,
   using the controls along the bottom of the screen to fix the
   orientation and to adjust the brightness and contrast of the image. You
   want to fiddle with these last controls to make the text as legible as
   you can against the page background, eliminating shadows as much as
   possible.

   [10][tn_scanplus-imagecontrol.PNG.jpg]

   Finally, with the image looking as good as you can make it, you arrive
   at a screen where you can choose to perform OCR or to add additional
   pages. Performing OCR is a simple matter of tapping the OCR button and
   then tapping OCR Document from the menu that appears. The actual OCR
   process is not fast: yellow bars march down the page image as PDFpen
   Scan+ identifies text to interpret, and once that stage is done, it
   then spends some time (as much as a minute or two) digesting and
   analyzing the results before it adds an invisible text layer to the
   PDF.

   [11][tn_scanplus-ocr-in-action.PNG.jpg]

   How Well Does It Work? -- Unfortunately, the OCR capability is not
   particularly smart or forgiving. For example, it understands nothing
   about a two-column page, so while it might include all the text on the
   page, when you choose the Copy Document Text option from the OCR menu,
   you end up with a mashup of text that can read more like a poem by Ezra
   Pound on a bender than normal prose. And it can easily be baffled by
   text that, to human eyes, looks quite readable. For example, here is
   the beginning of text derived from the two-column page I scanned for
   this article:

   The real-estate to and respond with. lt distracts the user from his
   task, nml it the chances for the device being dropped. how much ml
   estate is esme . devices. In the iPhOne has the least the iPzd the moa
   ThlS not account display which may be another costly. it takes tesoumes
   to create. display. and inletfau- views; time to design and test the
   code. On the other hand. sticking to a single simplifies design and
   cuts down lllllt'

   Even if a text image is of exceedingly high quality, the OCR can do
   some peculiar things. For example, I took a screenshot on my iPad with
   Retina display of a page from a novel in iBooks and used that for OCR
   testing. While PDFpen Scan+ managed to recognize almost every word
   correctly, it would sometimes, inexplicably, swap lines of text so that
   they appeared out of order when I copied them to the Notes app.

   But it can also do a creditable job deciphering other texty images. For
   example, I shot an exhibit sign at Santa Monica Airport's Museum of
   Flying over the weekend, and gave a portion of it to PDFpen Scan+ to
   munch on.

   [12][tn_scanplus-f86-image.jpg]

   The resulting interpretation was actually quite good, containing only a
   few blunders and dropping an occasional word:

   North Amencan The North American F-86 Sabre (sometimes called the
   Sabrejet) was a transonic jet fighter aircraft. The is best known as
   America's first swept wing fighter which could counter the
   similarly-winged Soviet MiG-15 in high speed dogfights. The was
   produced as both a fighter- intereeptor and fighter-bomber. Although
   developed in the late 1940s and outdated by the end of the 1950s, the
   Sabre proved versatile and adapt- able, and continued as a front-line
   fighter in numerous air forces. The F-86 Sabre was the first American
   aircraft to take advantage of flig ht re- search data seized from the
   German aerodynamicists at the end of the war. Use of a swept wing would
   solve their speed a slat on the wing's leading edge which extended at
   low speeds would enhance low- speed stability. Its success led to an
   extended production run of more than 7,800 aircraft between 1949 and
   i956, by far the most-produced Western jet fighter. About this Aircraft
   This is an F-86H model that is currently undergoing restoration and
   will be finished with a California Air National Guard livery andis
   onloan from the National Museum of Naval Aviation.

   Bottom Line -- On the one hand, the product is a remarkably capable app
   considering its price and the conditions under which it must obtain
   images: shaky hands and available light as opposed to a stable flatbed
   scanner with a cover and its own light source. In fact, I'm surprised
   that the results are as good as they are considering that they are
   being produced by a device that can fit in my pocket! Frankly, if you
   need to make a quick PDF of some paper documents while on the go, I
   can't think of an easier or cheaper way to do it. Plus, its close
   integration with Smile's PDFpen apps for iPhone and iPad makes it simple
   to annotate and mark up the PDFs you make with PDFpen Scan+.

   At the same time, the OCR results are quite variable, and, all too
   often, disappointing (though sometimes unintentionally hilarious).
   Before I could recommend it to a serious road warrior who needs
   high-quality OCR on the go, I would want to see some significant improvement
   in this feature. However, for casual use I have no qualms: I fully
   intend to use it on my next trip to the museum!

References

   1. http://tidbits.com/article/14117
   2. http://www.smile.com/
   3. http://www.smilesoftware.com/PDFpen/Scan/index.html
   4. http://www.smilesoftware.com/PDFpen/iPad/index.html
   5. http://www.smilesoftware.com/PDFpen/iPhone/index.html
   6. http://tidbits.com/resources/2013-09/scanplus-inputs.png
   7. http://tidbits.com/resources/2013-09/scanplus-camera-interface.PNG
   8. http://tidbits.com/resources/2013-09/scanplus-autoedge.PNG
   9. http://tidbits.com/resources/2013-09/scanplus-manual-edge.PNG
  10. http://tidbits.com/resources/2013-09/scanplus-imagecontrol.PNG
  11. http://tidbits.com/resources/2013-09/scanplus-ocr-in-action.PNG
  12. http://tidbits.com/resources/2013-09/scanplus-f86-image.jpg
