Referencing pages of a multi-page PDF file during data merge… workaround

At the time of writing, there are three multi-page/artboard file formats that Adobe InDesign can import when placing a file via the File/Place function. These formats are:

  • PDF
  • Adobe Illustrator
  • Adobe InDesign

(While it is possible to create many artboards in Adobe Photoshop, it is not possible, at the time of writing, to import a specific Photoshop artboard into Adobe InDesign, but that is another article!)

When placing one of these three formats, ticking Show Import Options in the Place dialog box makes it possible to control several import settings, such as:

  • Which page (or pages) to import;
  • How the pages should be cropped;
  • Whether or not to place the pages with a transparent background; and
  • Which layers are visible.

However, when these file types are imported as variable images during a data merge, these options are unavailable. Instead:

  • Only the first absolute page of the file is imported (not always the page numbered 1, as the first page can also be in roman numerals, for example, or start at a number other than one); and
  • Page cropping, transparency and layer visibility are determined by the settings used for the last file of that type placed into the artwork.

For now, there is no workaround to control the latter issues during a data merge, other than to be familiar with this behaviour and plan the merge accordingly. There is a workaround for importing pages beyond the first page of a PDF file… but not an Illustrator or InDesign file.

Workaround: Split the PDF

The term “workaround” is used loosely in this context. Unfortunately, the solution is to break the PDFs into single-page files. This can be done within Acrobat using the Split button in the Organize Pages tool.

This feature also allows multiple files to be split at once.

By default, the resulting files keep the original filename with _Partx added to the end of it (before the extension), where x is the absolute page number.

Otherwise, I’ve prepared an action that you can download here that will save the PDFs to the Documents folder of the machine running the action.

(Yes, I’m also aware that there are quite literally hundreds of websites out there that will split multi-page PDFs to single PDFs for free. However, the methods outlined above will do so without involving a third party).

The next part of the workaround involves the data itself, and I’ll be using Microsoft Excel to create formulas to make the numbering for the resulting pages. All variable images being referenced will also be in the same folder as the data file, meaning only the filename is required and not the full path and the filename.

For data where the page number is known

Add a column to the database that references the absolute PDF page number that needs to be imported.

Absolute vs Section numbers abridged:

Absolute numbers refer to a page’s position in the total count of pages in the document, while section numbers refer to the page numbers that were applied using page numbering in the application that made the PDF.

For example, take a PDF that contains 20 pages with the first six pages being in roman numerals, and the remainder being in decimal numbers. These two different styles of numbering are section numbers, while absolute page numbers refer to the total count of pages. To reference page iv of the PDF, the absolute page number to reference is 4. To reference page 5 of the PDF, the absolute page number reference is 11.
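
To make the arithmetic in that example concrete, here is a minimal Python sketch. It is only an illustration: the function names and the list-of-sections layout are my own assumptions, and it assumes every section starts its numbering at 1.

```python
def roman_to_int(numeral):
    """Convert a roman numeral such as 'iv' to an integer."""
    values = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}
    total = 0
    previous = 0
    # Read right to left: a smaller value before a larger one is subtracted.
    for ch in reversed(numeral.lower()):
        value = values[ch]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


def absolute_page(sections, label):
    """Return the absolute page number for a section page label.

    sections: list of (style, page_count) tuples in document order,
              where style is 'roman' or 'decimal' (hypothetical layout).
    label:    the printed page label, e.g. 'iv' or '5'.
    """
    offset = 0
    for style, count in sections:
        number = None
        if style == "roman" and label.isalpha():
            number = roman_to_int(label)
        elif style == "decimal" and label.isdigit():
            number = int(label)
        if number is not None and 1 <= number <= count:
            return offset + number
        offset += count  # skip past this section's pages
    raise ValueError(f"page label {label!r} not found")


# The 20-page PDF from the example: six roman pages, then 14 decimal pages.
sections = [("roman", 6), ("decimal", 14)]
print(absolute_page(sections, "iv"))  # 4
print(absolute_page(sections, "5"))   # 11
```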

In this example, the A column represents the PDF to reference, the B column represents the absolute page number, and C represents the result. To obtain this result, the following formula can be used:

=SUBSTITUTE(A2,".PDF","_Part"&B2&".pdf")

This formula looks at the filename reference and substitutes the .PDF portion of the filename with _Partx.pdf, where x represents the figure in the B column. Only filenames with the PDF extension are affected; filenames in other formats are left unchanged. Note that SUBSTITUTE is case-sensitive, so the extension in the data must be typed exactly as it appears in the formula (.PDF in uppercase here).
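
For anyone preparing the data outside Excel, the same substitution can be sketched in Python. This is an illustration rather than part of the workflow above; the function name part_filename is hypothetical. Like SUBSTITUTE, str.replace is case-sensitive and replaces every occurrence of the text it finds.

```python
def part_filename(filename, page, extension=".PDF"):
    """Mimic =SUBSTITUTE(A2,".PDF","_Part"&B2&".pdf") for one row."""
    # Filenames that do not contain the extension come back unchanged.
    return filename.replace(extension, f"_Part{page}.pdf")


print(part_filename("NS91912.PDF", 3))  # NS91912_Part3.pdf
print(part_filename("logo.ai", 1))      # logo.ai (not a PDF, so untouched)
print(part_filename("NS91912.pdf", 3))  # NS91912.pdf (lowercase extension does not match)
```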

For data where the page reference needs to increment by one more than the row above

The same formula can be used for the naming, but another formula is used to determine if the page reference should increase if the same base file is being referenced in the row directly above.

In this example, the N column represents the PDF to reference, the O column represents the absolute page number, and P represents the result. A 24 page file NS91912 is being merged and needs to have the page reference incremented by one so that the filenames are NS91912_Part1.pdf to NS91912_Part24.pdf. The following formula can be used to change the page reference:

=IF(N2=N1,O1+1,1)

This formula compares the filename with the one in the row above. If the filename is different, it puts the number 1 in the cell; if the filename is the same, it takes the page value from the cell above and adds 1 to it.
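
The same running count can be sketched in Python for a whole column at once. Again, this is only an illustration with a hypothetical function name. Excel's = comparison ignores case when comparing text, so the sketch compares case-folded names to behave the same way.

```python
def page_references(filenames):
    """Mimic =IF(N2=N1,O1+1,1) filled down a column of filenames."""
    pages = []
    previous = None
    for name in filenames:
        key = name.casefold()  # Excel's = ignores case for text
        if key == previous:
            pages.append(pages[-1] + 1)  # same file as the row above
        else:
            pages.append(1)              # new file, restart at page 1
        previous = key
    return pages


print(page_references(["A.PDF", "A.PDF", "A.PDF", "B.PDF", "B.PDF"]))
# [1, 2, 3, 1, 2]
```

As with the formula, the count restarts whenever the filename changes, so rows for the same PDF need to be grouped together in the data.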

In a perfect world

Again, this is a workaround – it will only work for PDFs and requires some upfront work to prepare. Ideally, if I had my way and could implement some improvements, I’d like to see:

  • Not just the ability to choose a specific page, but choose the correct trim box and layers as well. For example, a file reference such as myFile.pdf;1,trim;Layer1,Layer2 where 1 represents the absolute page number, trim represents what trim box to use, and Layer1,Layer2 represent the layers I would like to appear (or leave the layer bit blank if all layers should be visible).
  • The ability to perform a similar task for incoming INDD, AI or PSD files.
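
To show that the wished-for reference in the first point would be easy to read programmatically, here is a minimal Python sketch of a parser. To be clear, this syntax does not exist in InDesign; it is only my proposal from above, and the function name and returned keys are hypothetical.

```python
def parse_reference(reference):
    """Parse a wished-for reference such as 'myFile.pdf;1,trim;Layer1,Layer2'."""
    parts = reference.split(";")
    filename = parts[0]
    page = 1     # assume the first absolute page if none is given
    crop = None  # None means no trim box was specified
    if len(parts) > 1 and parts[1]:
        page_text, _, crop_text = parts[1].partition(",")
        page = int(page_text)
        crop = crop_text or None
    layers = []  # an empty list means all layers are visible
    if len(parts) > 2 and parts[2]:
        layers = parts[2].split(",")
    return {"filename": filename, "page": page, "crop": crop, "layers": layers}


print(parse_reference("myFile.pdf;1,trim;Layer1,Layer2"))
# {'filename': 'myFile.pdf', 'page': 1, 'crop': 'trim', 'layers': ['Layer1', 'Layer2']}
```

A filename that itself contains a semicolon would break this sketch, which is one reason a real implementation would need a more careful design.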

Add date selectors to date fields in interactive PDF

A feature of Acrobat DC that can be quite handy is the prepare form feature. It allows a scan (or a document with no form-field elements) to have form-field elements applied to it, so long as the formatting of the artwork follows the practices listed in this document.

However, there is an improvement that I feel could be made to this feature, but may have been missed by the Acrobat team, and that is date fields. Take the following example:

Now run the Prepare Form feature of Adobe Acrobat DC Professional:

The signature is picked up OK, but the date field is just a text field.

After doing a little digging online, I found that renaming the Date field to something like Date_af_date (the important part being the _af_date text) will change it to a date field.

But it doesn’t truly act like a date field. If I close out of preview mode and tab to the text field, it behaves like a regular text field.

It isn’t until the format category is changed to date that the field behaves like a date field with a date picker.

So that’s fine for editing one field, but if there are lots of date fields to edit, or this is a regular task, it can be time-consuming. Ultimately, I’d like Acrobat’s Prepare Form feature to detect date fields automatically, just as it detects text inputs and signature fields.

Until that happens, I’ve created an Acrobat action that will run not just the prepare form feature, but also a javascript that will find any of the resulting fields that have the word Date (case-sensitive) in them and make them selectable date fields. That action can be downloaded here.
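
The matching rule is worth spelling out, because it is case-sensitive. The Python sketch below only illustrates that rule; it is not the JavaScript inside the action, it assumes the match is made against the field names, and the field names shown are made up.

```python
def date_field_names(field_names):
    """Return the field names that contain 'Date' with a capital D."""
    return [name for name in field_names if "Date" in name]


# Hypothetical field names as Prepare Form might produce them.
fields = ["Name", "Date", "Signature", "StartDate", "date of birth", "Updated"]
print(date_field_names(fields))  # ['Date', 'StartDate']
```

Note that "date of birth" and "Updated" are skipped because neither contains Date with a capital D, so it pays to check the field names after running the action.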

To change the date format, open up the Acrobat action and change the following line in the script:

The number in brackets can be changed from 5 to any value from 0 to 13, each of which represents a format as shown below:

0: m/d
1: m/d/yy
2: mm/dd/yy
3: mm/yy
4: d-mmm
5: d-mmm-yy
6: dd-mmm-yy
7: yy-mm-dd
8: mmm-yy
9: mmmm-yy
10: mmm d, yyyy
11: mmmm d, yyyy
12: m/d/yy h:MM tt
13: m/d/yy HH:MM
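
As a quick reference, the list above can be kept as a lookup table. This Python sketch is only a convenience for checking which pattern a number stands for; the names are hypothetical and it does not talk to Acrobat.

```python
# Index-to-pattern table copied from the list above.
DATE_FORMATS = (
    "m/d", "m/d/yy", "mm/dd/yy", "mm/yy", "d-mmm", "d-mmm-yy", "dd-mmm-yy",
    "yy-mm-dd", "mmm-yy", "mmmm-yy", "mmm d, yyyy", "mmmm d, yyyy",
    "m/d/yy h:MM tt", "m/d/yy HH:MM",
)


def date_format(index):
    """Return the date pattern for a format number from 0 to 13."""
    if not 0 <= index < len(DATE_FORMATS):
        raise ValueError(f"format number must be 0-13, got {index}")
    return DATE_FORMATS[index]


print(date_format(5))   # d-mmm-yy (the value used in the action)
print(date_format(10))  # mmm d, yyyy
```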

In the meantime, if you would like the Acrobat team to update the prepare form feature so that date fields are automatically detected, I’ve added it to the Acrobat Uservoice wishlist.

Droplet like it’s hot

As a prepress operator, I spend a great deal of my time making sure that artwork supplied by clients will print without any prepress issues. Given that most client-supplied files are PDFs, much of that time is spent in Adobe Acrobat, checking the files using the print production tools and an invaluable plug-in called Enfocus Pitstop Professional.

While I’ve given the Adobe Acrobat team plenty of grief over my last few blog posts, I do have to sing their praises over a rather massive feature that – for me at least – has gone unnoticed since its inception in Acrobat 7 – preflight droplets.

What is a droplet?

A droplet acts as a “hot folder” that, once a PDF is dragged onto it, will run a preflight profile on that PDF.

This works for one or many PDFs. I first learned of this feature from Jean-Claude Tremblay’s post on an InDesignSecrets article about using the preflight feature to convert a file to outlines, rather than using InDesign-based methods. That said, the droplets feature has been available since at least 2007!

Making a droplet is simple. While in the print production panel of Adobe Acrobat, click the preflight button, and in the new dialog box, select Create Droplet… from the Options button.

The next dialog box asks which preflight profile to use, where PDFs that pass or fail should be saved, and whether a summary PDF should be created for each file.

Many of the built-in preflight profiles either force compliance to one of the PDF/X standards, or analyse a PDF and report the errors that were encountered. However, it is the custom fixup portion that may interest readers in a production role. To see where this can be found, click the Edit Profiles… selection from the Options button of the preflight dialog box.

Underneath the warnings and standards compliance, there is a section titled custom fixups.

In this panel is a plethora of changes that Acrobat can make to an entire document to fix common preflight issues such as:

  • Faux blacks
  • White overprint, or other colours that should knockout instead of overprint
  • Black instead of Registration
  • Remove trim marks and take back to 3mm bleed
  • Make Pantone spot colour names consistent

In addition, it is possible to make your own custom fixups rather than use the built-in ones. Click the add button to add your own fixup.

It is also possible to drill down even further in the editing by clicking additional edit buttons.

This allows further variables to be set.

Usually, many of these changes would be done using Enfocus Pitstop Professional’s action lists or global changes, but with the creation of an appropriate preflight droplet, not only can they be done without the Enfocus Pitstop Professional plug-in, they can also be done without opening the PDF.

Wouldn’t use it as a catch-all

It would be great to have one preflight that will catch all scenarios and fix the PDFs so that all that needs to be done is make sure the content is right and that the art is fit for its purpose… but because there are so many edge-cases that I deal with, it is more appropriate to make a “catch-most” preflight for common errors such as the ones mentioned earlier.

It can be confusing

With so many options to choose from, it can also be very confusing and, at times, frustrating, especially when some custom fixups contradict each other with no way to control which one runs first.

Some of the commands are also not so intuitive. One instruction that I wanted to use, to make any object that wasn’t 100% black knock out, wasn’t where I thought it would be.

It took hours of trial and error to realise that the color range to select was Gray Object (black below 96%) is set to overprint… but who would know with the other options that appear to make more sense?

It’s not a magic bullet

That’s not to imply that the Enfocus Pitstop Professional plug-in isn’t necessary – it is an absolute must for prepress operators. Preflight droplets complement the Enfocus plug-in, saving hours of time manually scanning a PDF looking for “the usual suspects” and allow PDFs in a workflow to be “normalised” for colour profile, trim/bleed size, appropriate overprints and knockouts as required, etc.

There are some fixups that work better using the Enfocus Pitstop plug-in, such as the generate bleed action. When run as a custom fixup via Acrobat preflight, it only adds bleed to rendered art, and usually by scaling it. The Enfocus Pitstop plug-in is more versatile in that it will apply to both vector and raster images, and bleed off appropriate edges only.

Importantly, the preflight fixups won’t be able to make content-related changes, such as fixing typographical errors or moving artwork away from a trim-edge… these changes have to be made with manual intervention using the Enfocus tools.

Lastly, preflight droplets are not a substitute for a skilled prepress operator examining a file, given that droplets cannot:

  • Ensure that artwork will fold correctly or be suitable for its intended purpose;
  • Confirm that the artwork is the correct version supplied by the client;
  • Understand the context of the content such as spelling, grammar or “design features”.

Extract an Image from an image field in an Acrobat Form

In January 2017, Adobe added two new buttons to the Prepare Form panel in Acrobat DC: Add Image and Add Date.

The Add Image button creates a rectangle that – when clicked in Adobe Acrobat Pro or Reader DC – launches Finder (Mac) or Explorer (Windows) to navigate to an image to be inserted into that field.

To demonstrate this, I have created a business card order form in Adobe InDesign for a Travel Agency.

Note that I have not made the image field in Adobe InDesign. There is a good reason for this: it isn’t possible at the time of writing the article as the option doesn’t exist in the buttons and forms panel in Adobe InDesign.

While this is frustrating, it can be added in Adobe Acrobat. I’ll leave a link to the InDesign UserVoice feature request to hopefully have this (and the Add Date button) added in future (ignore that the Adobe staff say it’s fixed at the time of writing; I disagree).

For now, I’ll export this file as an interactive PDF and add the add image button to the artwork.

I can then close out of preview and look at the form. This should be fine for testing purposes.

For the purposes of prototyping this form, I’ll type some dummy data and use a stock photo from Adobe Stock.

The fields all look fine. The text can be extracted either by cutting and pasting into my InDesign card template, or by using the export option from the Prepare Form tools. While the image isn’t positioned correctly, I can fix that once I extract the image from the PDF… or at least I thought.

The image won’t extract

If I go to the Edit PDF tools of Acrobat, the image (and its field) cannot be selected.

The image isn’t shown as an attachment in the attachments tab.

If I use the Export all as images from the Export PDF tab, will that work?

No, it only exports the images of the beer bottles and the Eiffel Tower shown in the original card.

How about if I use the Edit Object tools, right click on the image and select “edit image”? Unfortunately, this is unavailable too.

Using the Enfocus Pitstop Professional Plug-in, can I extract the image this way? No!

Yes, I could zoom in and take a screen capture, or render the PDF in Adobe Photoshop, but neither will retrieve the image at the exact resolution at which the original was supplied. Looking at this particular image zoomed in to 3200%, it is quite a high-resolution image.

At this point, I turned to the internet for help, only to find the following thread on the Adobe Forums that contained a response from an Adobe Staff Member that read as follows:

[Screenshot of the Adobe staff member’s response]

To me, this is bizarre… the whole purpose of adding an image would be to remove it later for another purpose, especially since the form field doesn’t have any cropping, scaling or rotating options. The whole point of me making this form was so that:

  • the client didn’t need the full version of Acrobat to add the image as an attachment to the PDF;
  • the client didn’t need to send the PDF and the image separately;
  • I could receive one file to prepare the content of the business cards, rather than bits and pieces from various emails or downloads.

However, all is not lost!

There is a way

Create a new InDesign file and place the filled-in interactive PDF as an image.

Export the file as a print PDF using the [High Quality Print] setting with the following change to the compression panel:

[Screenshot of the compression panel settings]

Now, when the PDF opens in Adobe Acrobat Professional DC, I’m able to use the Print Production Tools to click on the image and then select Edit Image.

Once the image opens into Photoshop, I can see it is the same size as the original.

So yes, it is possible to extract an image from the image field of a PDF, but it takes a little work. I’m just frustrated that the Acrobat team made it difficult “by design”.

Lastly, if anyone from the Acrobat Team is reading this going “he’s having a go at us again”, rest assured, I will be praising the team in an upcoming post.