Referencing pages of a multi-page PDF file during data merge… workaround

At the time of writing, there are three multi-page/artboard file formats that Adobe InDesign can import when placing a file via the File/Place function. These formats are:

  • PDF
  • Adobe Illustrator
  • Adobe InDesign

(While it is possible to create many artboards in Adobe Photoshop, it is not possible, at the time of writing, to import a specific Photoshop artboard into Adobe InDesign… but that is another article!)

When placing one of these three formats, it is possible to control several import settings using the Show Import Options dialog box, such as:

  • Which page (or pages) to import;
  • How the pages should be cropped;
  • Whether or not to place the pages with a transparent background; and
  • Which layers are visible.

However, when importing these file types as variable images during a data merge, these options are unavailable and the following behaviour applies instead:

  • Only the first absolute page of the file is imported (not always the page numbered 1 as the first page can also be – for example – in roman numerals or start at a page other than one); and
  • Page cropping, transparency and layer visibility are determined by the settings used for the last file of that type to be placed into the artwork.

For now, there is no workaround to control the latter issues during a data merge, other than to be familiar with this behaviour and plan the merge accordingly. There is a workaround for importing pages beyond the first page of a PDF file… but not an Illustrator or InDesign file.

Workaround: Split the PDF

The term “workaround” is used loosely in this context. Unfortunately, the solution is to break the PDFs into single-page files, one per page. This can be done within Acrobat using the Split button in the Organise Pages panel.

This feature also allows multiple files to be split at once.

By default, the resulting files will keep the same filename with _Partx added to the end of the name (before the .pdf extension), with x representing the absolute page number. For example, page 3 of NS91912.pdf becomes NS91912_Part3.pdf.

Otherwise, I’ve prepared an action that you can download here that will save the PDFs to the Documents folder of the machine running the action.

(Yes, I’m also aware that there are quite literally hundreds of websites out there that will split multi-page PDFs to single PDFs for free. However, the methods outlined above will do so without involving a third party).

The next part of the workaround involves the data itself, and I’ll be using Microsoft Excel to create formulas to make the numbering for the resulting pages. All variable images being referenced will also be in the same folder as the data file, meaning only the filename is required and not the full path and the filename.

For data where the page number is known

Add a column to the database that references the absolute PDF page number that needs to be imported.

Absolute vs Section numbers abridged:

Absolute numbers refer to a page's position in the total count of pages in the document, while section numbers refer to the page number that was applied using page numbering in the application that made the PDF.

For example, take a PDF that contains 20 pages with the first six pages being in roman numerals, and the remainder being in decimal numbers. These two different styles of numbering are section numbers, while absolute page numbers refer to the total count of pages. To reference page iv of the PDF, the absolute page number to reference is 4. To reference page 5 of the PDF, the absolute page number reference is 11.
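The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration: the SECTIONS list is my own hypothetical description of the 20-page PDF above (six roman pages followed by fourteen decimal pages), and the function names are made up for this sketch. It also assumes each numbering style is used by only one section.

```python
# Hypothetical description of the 20-page PDF from the example:
# each section is (numbering style, number of pages), in document order.
SECTIONS = [("roman", 6), ("decimal", 14)]

ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100}


def roman_to_int(numeral: str) -> int:
    """Convert a roman numeral such as 'iv' to 4."""
    total = 0
    previous = 0
    # Read right to left: a smaller value before a larger one is subtracted.
    for char in reversed(numeral.lower()):
        value = ROMAN_VALUES[char]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


def absolute_page(label: str, sections=SECTIONS) -> int:
    """Return the absolute page number for a section page label."""
    style = "decimal" if label.isdigit() else "roman"
    position = int(label) if style == "decimal" else roman_to_int(label)
    offset = 0  # pages in all sections before the matching one
    for section_style, page_count in sections:
        if section_style == style:
            if not 1 <= position <= page_count:
                raise ValueError(f"page {label!r} is outside its section")
            return offset + position
        offset += page_count
    raise ValueError(f"no {style} section in this document")


print(absolute_page("iv"))  # 4
print(absolute_page("5"))   # 11
```

The absolute number is simply the count of pages in every earlier section plus the page's position within its own section.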

In this example, the A column represents the PDF to reference, the B column represents the absolute page number, and C represents the result. To obtain this result, the following formula can be used:

=SUBSTITUTE(A2,".PDF","_Part"&B2&".pdf")

This formula looks at the filename reference and substitutes the .PDF portion of the filename with _Partx.pdf, where x represents the figure in the B column. Note that SUBSTITUTE is case-sensitive, so only filenames containing .PDF in uppercase will be affected. Filenames in other formats (and filenames with a lowercase .pdf extension) will be left unchanged.
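For anyone preparing the data outside of Excel, here is a rough Python equivalent of that formula. The function names are my own, and the second function is an assumption on my part rather than anything the formula does: it matches the extension in any letter case, and only at the end of the filename.

```python
import re


def part_filename(filename: str, page: int) -> str:
    """Mirror =SUBSTITUTE(A2,".PDF","_Part"&B2&".pdf"), which is case-sensitive."""
    return filename.replace(".PDF", f"_Part{page}.pdf")


def part_filename_any_case(filename: str, page: int) -> str:
    """Variant (not in the formula): match .pdf in any case, at the end only."""
    return re.sub(r"\.pdf$", f"_Part{page}.pdf", filename, flags=re.IGNORECASE)


print(part_filename("NS91912.PDF", 4))           # NS91912_Part4.pdf
print(part_filename("NS91912.pdf", 4))           # NS91912.pdf (unchanged)
print(part_filename("photo.jpg", 4))             # photo.jpg (unchanged)
print(part_filename_any_case("NS91912.pdf", 4))  # NS91912_Part4.pdf
```

The second call shows the catch with the case-sensitive version: a lowercase .pdf extension passes through untouched, so it pays to check how the filenames in the data are actually written.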

For data where the page reference needs to increment by one more than the row above

The same formula can be used for the naming, but another formula is used to determine if the page reference should increase if the same base file is being referenced in the row directly above.

In this example, the N column represents the PDF to reference, the O column represents the absolute page number, and P represents the result. A 24-page file, NS91912, is being merged and needs to have the page reference incremented by one for each row so that the filenames are NS91912_Part1.pdf to NS91912_Part24.pdf. The following formula, entered in O2 and filled down, can be used to change the page reference:

=IF(N2=N1,O1+1,1)

This formula compares the filename with the one in the row above. If the filename is different, it puts the number 1 in the cell. If the filename is the same, it takes the page value from the cell above and adds 1 to it.
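The same running count can be sketched in Python, which may help if the data is being prepared in a script rather than a spreadsheet. The function name is my own. Because Excel's = comparison ignores letter case when comparing text, the sketch lowercases the names before comparing them.

```python
def page_references(filenames):
    """Mirror =IF(N2=N1,O1+1,1) filled down a column of filenames."""
    pages = []
    previous_name = None
    for name in filenames:
        key = name.lower()  # Excel's = ignores case when comparing text
        if key == previous_name:
            pages.append(pages[-1] + 1)  # same file as the row above: add 1
        else:
            pages.append(1)              # a different file: start again at 1
        previous_name = key
    return pages


rows = ["NS91912.PDF", "NS91912.PDF", "NS91912.PDF", "AB100.PDF", "AB100.PDF"]
print(page_references(rows))  # [1, 2, 3, 1, 2]
```

As with the formula, this only works if all rows for the same file sit together in the data. If the same filename reappears further down after a different file, the count starts again at 1.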

In a perfect world

Again, this is a workaround – it will only work for PDFs and requires some upfront work to prepare. Ideally, if I had my way and could implement some improvements, I’d like to see:

  • Not just the ability to choose a specific page, but choose the correct trim box and layers as well. For example, a file reference such as myFile.pdf;1,trim;Layer1,Layer2 where 1 represents the absolute page number, trim represents what trim box to use, and Layer1,Layer2 represent the layers I would like to appear (or leave the layer bit blank if all layers should be visible).
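To show that the file reference idea above is at least easy to read by machine, here is a minimal Python sketch of a parser for it. To be clear, this syntax is purely my wish-list and is not something InDesign understands. The class and function names are hypothetical too, and the sketch assumes filenames never contain a semicolon.

```python
from dataclasses import dataclass, field


@dataclass
class PlacedFileReference:
    filename: str
    page: int = 1                               # absolute page number
    crop: str = ""                              # empty: no box requested
    layers: list = field(default_factory=list)  # empty: all layers visible


def parse_reference(reference: str) -> PlacedFileReference:
    """Parse the wished-for 'file;page,box;layers' syntax (hypothetical)."""
    parts = reference.split(";")
    filename = parts[0].strip()
    page, crop = 1, ""
    if len(parts) > 1 and parts[1].strip():
        page_text, _, crop = parts[1].partition(",")
        page = int(page_text)
        crop = crop.strip()
    layers = []
    if len(parts) > 2:
        # A blank layer part means every layer stays visible.
        layers = [name.strip() for name in parts[2].split(",") if name.strip()]
    return PlacedFileReference(filename, page, crop, layers)


print(parse_reference("myFile.pdf;1,trim;Layer1,Layer2"))
```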
  • The ability to perform a similar task for incoming INDD, AI or PSD files.

Script: export an InDesign file to split PDF ranges

For the last month, I’ve been feverishly working away on some Data Merge JavaScripts that will ultimately answer the question that is commonly asked on the Adobe Forums: is there any way to Data Merge to uniquely named PDFs directly from InDesign? I can tell you now that the answer is yes… but developing a one-size-fits-all solution that will keep everybody happy is another matter!

Even though these scripts aren’t being released just yet, the research did yield some information that could be applied in another script that is equally sought after: the ability to export an InDesign file directly to split PDFs. There are many scripts that can export directly to single pages, but not many (if any at all) that can export from InDesign directly to PDFs while letting the user choose how many pages long each PDF should be. Well, now there is!

It’s simple to use. Open the InDesign file, run the script and a dialog will appear. Just choose where you want the PDFs, what preset to use and how many pages each PDF should be, and click OK!

Better still, it’s FREE!

Download the script from this link.
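For the curious, the page arithmetic behind splitting a document into fixed-length PDFs is straightforward. The Python sketch below is not the downloadable script (that one runs inside InDesign). It only illustrates how the page ranges fall out for a given page count, and the function name is my own.

```python
def page_ranges(total_pages: int, pages_per_pdf: int):
    """Return (first, last) absolute page pairs, one pair per exported PDF."""
    if total_pages < 1 or pages_per_pdf < 1:
        raise ValueError("both values must be at least 1")
    ranges = []
    for first in range(1, total_pages + 1, pages_per_pdf):
        # The final PDF may be shorter if the pages don't divide evenly.
        last = min(first + pages_per_pdf - 1, total_pages)
        ranges.append((first, last))
    return ranges


print(page_ranges(10, 4))  # [(1, 4), (5, 8), (9, 10)]
```

A 10-page document split into 4-page PDFs gives three files, with the last one holding only the two leftover pages.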

Any feedback concerning this script is greatly appreciated. If you would like more information about the Data Merge scripts that are in development, contact me on Twitter: #colecandoo.

Fixing readers spreads: Third time lucky

For print providers, finished art PDFs supplied as readers spreads can be a nuisance. For the imposition software to impose the pages in the correct order for press, the pages have to be presented as individual pages rather than readers spreads. Rather than inconvenience a customer by asking for the file to be prepared again, it is easier to split the PDF into individual pages. Until recently, though, this meant the tedious procedure of copying and pasting a PDF into an InDesign file set as spreads and then preparing an output PDF as individual pages.

However, this blog has provided two solutions so far to do this:

  1. Via two JavaScripts and a procedure within InDesign (read the full story here);
  2. Via a JavaScript within Acrobat (read the full story here).

Now a third method exists. This method uses both Acrobat and an InDesign script.

1)   With the “spreads” PDF open in Adobe Acrobat, save the document as single pages. This can be done (in Acrobat X) by using the split document feature and splitting into 1pp documents, or (in Acrobat 9 or above) by extracting pages as single pages.

Put these PDFs into a folder of their own.

2)   Create a new InDesign file at the correct finished trim size, set up as readers spreads, and with the same number of pages as the intended finished artwork.

3)   Once the file is created, run this script. A prompt will ask for the folder of PDFs to import. Navigate to that folder and click OK.

4)   A prompt will appear once the file is finished. All that is then left to do is tidy up the first and last pages, which are centered within the spreads: they only need to be centered within their appropriate pages, and any unnecessary pages deleted.
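To work out how many pages the new InDesign file needs in step 2, it can help to list which finished pages each split PDF carries. The Python sketch below is my own illustration and has nothing to do with the InDesign script itself. It assumes a conventional readers spread layout, where the first and last PDFs each hold a single page and every PDF in between holds a left and right pair.

```python
def pages_per_spread_pdf(spread_pdf_count: int):
    """List the finished page numbers carried by each split 'spread' PDF."""
    if spread_pdf_count < 2:
        raise ValueError("expected at least a first and a last page")
    # Assumption: first and last PDFs are single pages, the rest are pairs.
    total_pages = 2 * (spread_pdf_count - 1)
    layout = [(1,)]
    for left in range(2, total_pages, 2):
        layout.append((left, left + 1))
    layout.append((total_pages,))
    return layout


print(pages_per_spread_pdf(5))
# [(1,), (2, 3), (4, 5), (6, 7), (8,)]
```

Under that assumption, five split PDFs means an eight-page InDesign file. If the supplied PDF doesn't follow this pattern (for example, if the covers were supplied as a spread of their own), the count will differ.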

I’m not sure who was the original author of the script but can only credit those who contributed to the forum where the script was adapted for this purpose.
