Data Merge “Did You Know” Part Two

This is the second article in the Data Merge “Did You Know” series. If you’ve not read the first article, be sure to do so here. Carrying on from the first article, let’s dive into more lesser-known Data Merge behaviours of Adobe InDesign.

Data is there… even if the link is missing

If you ever have to share a Data Merge file with someone but want to give them only the base InDesign file and not the data, note that simply removing the link doesn’t remove the data.

In Part One, I wrote that InDesign doesn’t package the source file of a Data Merge… but that doesn’t mean the data isn’t there. Take this example of a packaged InDesign file used for a Data Merge. While opening the document, there is a missing link warning.

Once the document opens, I can then see in the links panel that there is a warning next to this text file that was used as the source.

If I right click on the link, I can’t unembed the text file as it simply isn’t linked.

However, if I go to the Data Merge panel and check the preview box, I can see the records that were in the unlinked text file.

If the file didn’t have fields added to the page, I can also add those fields to the page, check the preview box, and the data will appear for these records.

It also merges correctly to PDF and InDesign files.

Be aware of this if you ever have to package a Data Merge file for others to whom you do not wish to provide unredacted data.

Merge fields can be removed via the hyperlinks panel

A Creative Pro article referred to this as “ghost hyperlinks” but it is a great way of solving issues where a newly provided data merge source file can’t be previewed because of a mismatch of source names.

By opening the Hyperlinks panel, it is possible to see the fields that InDesign is using for the data merge as they are within this panel, though they aren’t obvious at first glance.

If one of the hyperlinks is double clicked, it will reveal the field that is being referred to.

From here, a hyperlink can be removed, thereby converting the field codes in the document back to regular text.

Shift clicking during the import does not show options

If the show import options checkbox is toggled off when placing an image, it is possible to perform a “one-time” request to show the import options without clicking on the checkbox. This is done by holding shift and then clicking Open.

But this doesn’t work with Data Merge. A similar option is available when selecting a data source, though holding shift and clicking Open will simply open the document – the Show Import Options checkbox has to be checked if it needs to appear. Hopefully this is a bug that is eventually fixed.

Put linked images in the same folder as the source text file…

While it is possible to add images to a data merge project by supplying their links in the source file, I often see users put the complete file path of each image in the field.

If the images are filed in the same location as the text file, the only item that needs to be added here is the name of the file.

However, this means the links need to be in the same folder as the source text file.
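If a source file already contains full paths, they can be trimmed back to bare filenames before the merge. The following is only a rough Python sketch and not anything built into InDesign: the helper name `bare_filename` and the sample paths are made up for illustration, and it simply keeps whatever follows the last slash, backslash or colon.

```python
import re

def bare_filename(path: str) -> str:
    """Return the text after the last path separator (/, \\ or :)."""
    return re.split(r"[\\/:]", path.strip())[-1]

# Hypothetical values from an image column in the source file
rows = [
    "C:\\Jobs\\Merge\\Images\\photo1.jpg",
    "/Users/me/Merge/Images/photo2.jpg",
    "photo3.jpg",  # already a bare filename, left as-is
]
cleaned = [bare_filename(p) for p in rows]
print(cleaned)
```

The images still have to be moved or copied into the same folder as the source text file for the bare filenames to resolve.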

…or use relative syntax as well

It is also possible to use syntax that is relative to the folder where the database is. Take the following folder structure.

To link to these images, the database needs to use syntax for the previous folder and then folders above. That will look like this

..: is syntax for go back a folder, whereas / plus the folder’s name is syntax to look into that particular folder.

I hope you found this short series useful, and if you have any Data Merge “did you know” tips, please feel free to submit them either in the comments, or contact me via my contact page.

Data Merge “Did you know” Part One

Regulars to the site will know that many of my articles relate to InDesign’s Data Merge feature. Given the number of basic Data Merge tutorials already available elsewhere online, the Colecandoo site focuses more on articles about Data Merge in relation to scripts, GREP styles, or advanced techniques.

But there is a middle-ground that hasn’t been covered in many Data Merge tutorials, nor here on Colecandoo, so over the next two articles, I will attempt to bridge that gap and highlight some lesser known issues that can become a problem if users aren’t initially aware of them.

Can’t package the data or links used in the data

When InDesign packages an INDD file, it will save a copy of the file and copy any links used in the document into a Links folder, and any fonts used (within licensing restrictions) to a Document fonts folder.

However, this does not extend to the source data of a Data Merge file, nor any links that the source data may refer to.

PDF made from merge is different to regular PDF

I have written about this before but ultimately when exporting a PDF directly from Data Merge, it makes a variety of PDF that is similar but not the same as a usual PDF, as the following options cannot be chosen.

  • The ability to merge to an interactive PDF
  • The page range (not the record range)
  • Spreads
  • Create Tagged PDF
  • Create Acrobat Layers
  • Hyperlinks

I’ve speculated why this might be the case in this article but until this is addressed, it is a consideration to be aware of.

Headers with the same name

If the headers in a database are exactly the same name, InDesign’s Data Merge will add a sequential number after the first instance of the field name to make a distinction between the field names.
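To catch this before InDesign renames anything, the header row can be checked for repeats first. This is a minimal Python sketch under my own assumptions: the function name `duplicate_headers` and the sample headers are hypothetical, and it only reports exact matches rather than trying to reproduce InDesign’s numbering.

```python
from collections import Counter

def duplicate_headers(headers):
    """Return header names that occur more than once, in first-seen order."""
    counts = Counter(headers)
    repeated = []
    for name in headers:
        if counts[name] > 1 and name not in repeated:
            repeated.append(name)
    return repeated

# Hypothetical header row from a data source
headers = ["Name", "Phone", "Address", "Phone"]
print(duplicate_headers(headers))
```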

Colons can cause weird issues in the header

This featured briefly in my Creative Pro article “Troubleshooting data merge errors” but in short, colons used in field names can trigger one of two dialog boxes in particular circumstances. Thankfully, if a colon appears at the start or end (or both) of a field name, the data will import without any issues, but if a colon is within the field name, then a dialog box with the words “Generic extended parser error” appears.

If there are two or more colons in the field name (at neither the start nor the end of the field name), a dialog box that says “not well formed” appears.
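The colon behaviour described above can be turned into a quick pre-flight check on field names. This Python sketch is based purely on the description in this article, not on any documented InDesign rule: the function name `colon_warning` is hypothetical, and mixed cases (such as a leading colon plus a colon in the middle) are treated as a guess.

```python
def colon_warning(field_name: str) -> str:
    """Predict which dialog (if any) a field name is likely to trigger."""
    # Colons at the very start or end are described as harmless, so ignore them
    interior = field_name.strip(":")
    count = interior.count(":")
    if count == 0:
        return "ok"
    if count == 1:
        return "Generic extended parser error"
    return "not well formed"

for name in [":Name", "First:Name", "A:B:C"]:
    print(name, "->", colon_warning(name))
```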

UTF-16 is the format built for Data Merge

InDesign’s Data Merge is designed with UTF-16 text in mind. However, CSV and TXT files exported from programs such as Microsoft Excel usually export to UTF-8.

This is usually fine for most circumstances in English, but can cause problems when:

  • using an alphabet other than the Roman alphabet;
  • the data contains punctuation or characters that may not be available via UTF-8

Excel does have an option to export to UTF-16 and it is worth using. The option is here when exporting via Excel:
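If re-exporting from Excel isn’t practical, an existing UTF-8 file can also be re-encoded outside Excel. The following Python sketch assumes the source really is UTF-8; the function name `to_utf16` and the file names are placeholders. Python’s `utf-16` codec writes a byte order mark followed by the machine’s native byte order, and I haven’t confirmed here whether InDesign prefers one byte order over another.

```python
import tempfile
from pathlib import Path

def to_utf16(src: Path, dst: Path) -> None:
    """Re-encode a UTF-8 text file as UTF-16, leaving line endings untouched."""
    # "utf-8-sig" accepts UTF-8 with or without a byte order mark
    text = src.read_bytes().decode("utf-8-sig")
    # "utf-16" prepends a byte order mark when encoding
    dst.write_bytes(text.encode("utf-16"))

# Demonstration using a temporary folder and made-up data
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "data_utf8.csv"
    dst = Path(tmp) / "data_utf16.csv"
    src.write_bytes("Name,City\nZoë,Kraków\n".encode("utf-8"))
    to_utf16(src, dst)
    round_trip = dst.read_bytes().decode("utf-16")

print(round_trip)
```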

In part two of this Did You Know series, we will look at other lesser-known phenomena, such as:

  • Data is there even if link is missing
  • Merge fields can be removed via the hyperlinks panel
  • Shift clicking during the import does not show options
  • The benefits of linked images in the same folder as the source text file

If you have any lesser-known Data Merge behaviours that you think would easily make this list, please feel free to mention them in the comments.

Data Merge to Single Records Pro: Now Available

Since 2016, Colecandoo has provided the free version of the Data Merge to Single Records script for Adobe InDesign – a script that allows single records to be exported from Data Merge with unique filenames available from the Data Merge database itself. This improves on Adobe InDesign’s default behaviour, which names each file Untitled-N and is only available for InDesign files, not PDFs.

On that note, the PRO version of this script is now available!

This script improves upon the free original by:

  • Exporting to various additional file formats, such as interactive PDF, EPS, PNG, JPG, direct to print, or PDF via InDesign first;
  • Adding a primary key to either the start or the end of a filename;
  • Running a user-selected additional script before the export, when exporting to certain file formats.

The script can be purchased for A$15 from the Buy Now button below.


The original Data Merge to Single Records script offered by Colecandoo remains free and can be downloaded from the scripts page.

Referencing pages of a multi-page PDF file during data merge… workaround

At the time of writing, there are three multi-page/artboard file formats that Adobe InDesign can import when placing a file via the File/Place function. These formats are:

  • PDF
  • Adobe Illustrator
  • Adobe InDesign

(While it is possible to create many artboards in Adobe Photoshop, it is not possible to import a specific Photoshop artboard into Adobe InDesign… – at the time of writing that is – but that is another article!)

When placing one of these three formats, it is possible to control several import functions using the show import dialog box, such as:

  • Which page (or pages) to import;
  • How the pages should be cropped;
  • Whether or not to place the pages with a transparent background; and
  • What layers to show and their visibility.

However, when importing these file types as variable images during a data merge, these options are unavailable and replaced with the following:

  • Only the first absolute page of the file is imported (not always the page numbered 1 as the first page can also be – for example – in roman numerals or start at a page other than one); and
  • Page cropping, transparency and layer visibility are determined by the settings used for the last file of that type placed into the artwork.

For now, there is no workaround to control the latter issues during a data merge, other than to be familiar with this behaviour and plan the merge accordingly. There is a workaround for importing pages beyond the first page of a PDF file… but not an Illustrator or InDesign file.

Workaround: Split the PDF

The term “workaround” is used loosely in this context. Unfortunately, the solution is to break the PDFs into single page records. This can be done within Acrobat using the split button from the organise pages panel.

This feature also allows multiple files to be split at once.

By default, the resulting files keep the original filename with _Partx added before the file extension, where x is the absolute page number.

Otherwise, I’ve prepared an action that you can download here that will save the PDFs to the Documents folder of the machine running the action.

(Yes, I’m also aware that there are quite literally hundreds of websites out there that will split multi-page PDFs to single PDFs for free. However, the methods outlined above will do so without involving a third party).

The next part of the workaround involves the data itself, and I’ll be using Microsoft Excel to create formulas to make the numbering for the resulting pages. All variable images being referenced will also be in the same folder as the data file, meaning only the filename is required and not the full path and the filename.

For data where the page number is known

Add a column to the database that references the absolute PDF page number that needs to be imported.

Absolute vs Section numbers abridged:

Absolute numbers refer to a page number based on the total count of pages in the document, while section numbers refer to the page number that was applied using page numbering in the application that made the PDF.

For example, take a PDF that contains 20 pages with the first six pages being in roman numerals, and the remainder being in decimal numbers. These two different styles of numbering are section numbers, while absolute page numbers refer to the total count of pages. To reference page iv of the PDF, the absolute page number to reference is 4. To reference page 5 of the PDF, the absolute page number reference is 11.
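The arithmetic behind that example can be sketched in a few lines of Python. The function name `absolute_page` and the way sections are described (a list of page counts per section) are my own assumptions for illustration.

```python
def absolute_page(section_lengths, section_index, page_in_section):
    """Absolute page (1-based) from a 0-based section index and a
    1-based position within that section."""
    if not 1 <= page_in_section <= section_lengths[section_index]:
        raise ValueError("page is outside the section")
    # Count every page in the earlier sections, then add the position
    return sum(section_lengths[:section_index]) + page_in_section

# 20-page PDF: six roman-numeral pages, then fourteen decimal pages
sections = [6, 14]
page_iv = absolute_page(sections, 0, 4)  # page iv in the roman section
page_5 = absolute_page(sections, 1, 5)   # page 5 in the decimal section
print(page_iv, page_5)  # 4 11
```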

In this spreadsheet example, the A column represents the PDF to reference, the B column represents the absolute page number, and C represents the result. To obtain this result, the following formula can be used:

=SUBSTITUTE(A2,".PDF","_Part"&B2&".pdf")

This formula looks at the filename reference and substitutes the .PDF portion of the filename with _Partx.pdf, where x represents the figure in the B column. Using this formula, only filenames with the PDF extension will be affected, while filenames in other formats will be unaffected. Note that SUBSTITUTE is case-sensitive, so the extension must be written as .PDF in the data for the substitution to occur.
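For anyone preparing the data outside Excel, the same renaming can be sketched in Python. Both function names are hypothetical. The first mirrors the formula as written, including its exact match on ".PDF"; the second is my own variation that accepts the extension in any case.

```python
def part_filename(filename: str, page: int) -> str:
    """Mirror of SUBSTITUTE(A2,".PDF","_Part"&B2&".pdf"): exact match on .PDF."""
    return filename.replace(".PDF", f"_Part{page}.pdf")

def part_filename_any_case(filename: str, page: int) -> str:
    """Variation: treat .pdf, .PDF, .Pdf and so on as a PDF extension."""
    stem, dot, ext = filename.rpartition(".")
    if dot and ext.lower() == "pdf":
        return f"{stem}_Part{page}.pdf"
    return filename  # other formats are left untouched

print(part_filename("NS91912.PDF", 3))
print(part_filename("photo.jpg", 3))
print(part_filename_any_case("NS91912.pdf", 3))
```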

For data where the page reference needs to increment by one more than the row above

The same formula can be used for the naming, but another formula is used to determine if the page reference should increase if the same base file is being referenced in the row directly above.

In this example, the N column represents the PDF to reference, the O column represents the absolute page number, and P represents the result. A 24 page file NS91912 is being merged and needs to have the page reference incremented by one so that the filenames are NS91912_Part1.pdf to NS91912_Part24.pdf. The following formula can be used to change the page reference:

=IF(N2=N1,O1+1,1)

This formula compares the filename with the one in the row above. If the two differ, it puts the number 1 in the cell, BUT if they are the same, it takes the page value from the cell above and adds 1 to it.
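The same running count can be sketched in Python for data prepared outside Excel. The function name `page_references` and the sample filenames are assumptions; also note that Python compares text case-sensitively here, whereas Excel’s = comparison ignores case.

```python
def page_references(filenames):
    """Restart at 1 whenever the filename changes; otherwise add 1
    to the value from the row above."""
    pages = []
    previous = None
    for name in filenames:
        if name == previous:
            pages.append(pages[-1] + 1)
        else:
            pages.append(1)
        previous = name
    return pages

# Three rows for one file, then two rows for another (made-up second name)
names = ["NS91912.PDF"] * 3 + ["NS91913.PDF"] * 2
print(page_references(names))  # [1, 2, 3, 1, 2]
```

Like the formula, this relies on rows for the same file sitting directly beneath one another in the data.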

In a perfect world

Again, this is a workaround – it will only work for PDFs and requires some upfront work to prepare. Ideally, if I had my way and could implement some improvements, I’d like to see:

  • Not just the ability to choose a specific page, but choose the correct trim box and layers as well. For example, a file reference such as myFile.pdf;1,trim;Layer1,Layer2 where 1 represents the absolute page number, trim represents what trim box to use, and Layer1,Layer2 represent the layers I would like to appear (or leave the layer bit blank if all layers should be visible).
  • The ability to perform a similar task for incoming INDD, AI or PSD files.