Data Merge “Did you know” Part One

Regulars to the site will know that many of my articles relate to InDesign’s Data Merge feature. Given the number of basic Data Merge tutorials already available elsewhere online, the Colecandoo site focuses more on articles about Data Merge in relation to scripts, GREP styles, or advanced techniques.

But there is a middle ground that hasn’t been covered in many Data Merge tutorials, nor here on Colecandoo, so over the next two articles I will attempt to bridge that gap and highlight some lesser-known issues that can become a problem if users aren’t aware of them at the outset.

Can’t package the data or links used in the data

When InDesign packages an INDD file, it will save a copy of the file and copy any links used in the document into a Links folder, and any fonts used (within licensing restrictions) to a Document fonts folder.

However, this does not extend to the source data of a Data Merge file, nor any links that the source data may refer to.

PDF made from merge is different to regular PDF

I have written about this before, but ultimately a PDF exported directly from Data Merge is similar to, but not the same as, a PDF exported in the usual way, as the following options cannot be chosen:

  • The ability to merge to an interactive PDF
  • The page range (not the record range)
  • Spreads
  • Create Tagged PDF
  • Create Acrobat Layers
  • Hyperlinks

I’ve speculated why this might be the case in this article, but until this is addressed, it is a consideration to be aware of.

Headers with the same name

If the headers in a database are exactly the same name, InDesign’s Data Merge will add a sequential number after the first instance of the field name to make a distinction between the field names.
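Because renamed fields are easy to miss, it may help to check the header row before importing. The following is a minimal sketch in plain JavaScript, not part of Data Merge or of any script mentioned here. The function name findDuplicateHeaders is hypothetical, and the simple split assumes a comma-delimited header row in which no header contains a quoted comma.

```javascript
// Hypothetical helper: list header names that occur more than once
// in the first row of a comma-delimited data file.
// Assumption: no header contains a quoted comma.
function findDuplicateHeaders(headerRow) {
    var seen = {};
    var duplicates = [];
    var names = headerRow.split(",");
    for (var i = 0; i < names.length; i++) {
        // Trim surrounding whitespace from each header name
        var name = names[i].replace(/^\s+|\s+$/g, "");
        if (Object.prototype.hasOwnProperty.call(seen, name)) {
            // Report each duplicated name once, on its second occurrence
            if (seen[name] === 1) {
                duplicates.push(name);
            }
            seen[name]++;
        } else {
            seen[name] = 1;
        }
    }
    return duplicates;
}
```

An empty result suggests that no field names would need to be renamed during the import.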

Colons can cause weird issues in the header

This featured briefly in my CreativePro article “Troubleshooting data merge errors” but in short, colons in field names can trigger one of two error dialog boxes, depending on where they appear. Thankfully, if a colon appears at the start or end (or both) of a field name, the data will import without any issues, but if a colon is within the field name, then a dialog box with the words “Generic extended parser error” appears.

If there are two or more colons within the field name (neither at the start nor at the end of the field name), a dialog box that says “not well formed” appears.
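The colon rules described above can be sketched as a quick pre-flight check in plain JavaScript. This is only an illustration: the function name checkHeaderColons is hypothetical, and how InDesign treats mixed cases (for example, a colon at the start plus one in the middle) is an assumption here, as only the simple cases are described above.

```javascript
// Hypothetical helper: predict which Data Merge message a field name
// might trigger, based only on the colon behaviour described above.
function checkHeaderColons(name) {
    // A colon at the very start and/or very end is treated as harmless.
    var inner = name.replace(/^:/, "").replace(/:$/, "");
    var matches = inner.match(/:/g);
    var count = matches ? matches.length : 0;
    if (count === 0) {
        return "ok";
    }
    if (count === 1) {
        return "Generic extended parser error";
    }
    // Two or more colons inside the name
    return "not well formed";
}
```

Running each header through a check like this before importing may save a round trip through the error dialogs.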

UTF-16 is the format built for Data Merge

InDesign’s Data Merge is designed with UTF-16 text in mind. However, CSV and TXT files exported from programs such as Microsoft Excel usually export to an 8-bit encoding, such as the system’s ANSI code page or UTF-8.

This is usually fine for most circumstances in English, but can cause problems when:

  • using an alphabet other than the Roman alphabet;
  • the data contains punctuation or characters that do not survive that export or are misread on import.

Excel does have an option to export to UTF-16, and it is worth using. When saving from Excel, it is the Unicode Text (*.txt) file format, which produces a tab-delimited file.
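If the data comes from somewhere other than Excel, an existing UTF-8 file can also be re-encoded. The following is a minimal Node.js sketch, not an InDesign script. The function name toUtf16LeWithBom and the file names in the usage comment are hypothetical, and adding a byte order mark is an assumption about what suits the import best.

```javascript
// Node.js sketch: encode a string as UTF-16 (little endian),
// prefixed with the FF FE byte order mark.
function toUtf16LeWithBom(text) {
    // Drop a leading BOM character if the source text already has one
    var clean = text.replace(/^\uFEFF/, "");
    var bom = Buffer.from([0xFF, 0xFE]);
    var body = Buffer.from(clean, "utf16le");
    return Buffer.concat([bom, body]);
}

// Example usage (file names are placeholders):
// var fs = require("fs");
// var text = fs.readFileSync("data-utf8.csv", "utf8");
// fs.writeFileSync("data-utf16.txt", toUtf16LeWithBom(text));
```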

In part two of this Did You Know series, we will look at other lesser-known phenomena, such as:

  • Data is there even if link is missing
  • Merge fields can be removed via the hyperlinks panel
  • Shift clicking during the import does not show options
  • The benefits of linked images in the same folder as the source text file

If you have any lesser-known Data Merge behaviours that you think would easily make this list, please feel free to mention them in the comments.

Export many PDFs at once… plus security

A recent question on Reddit’s InDesign subreddit was whether two PDFs could be exported at the same time from the same document, but have two different properties – one with trims and one without. The answer is yes, but via a custom script written for the task.

I use such a script on a daily basis so that I can prepare one PDF for client proofing via email, and a separate PDF with trim and crop marks that is sent directly to a hot-folder that prints it for me.

I’d submitted my script as a solution (that can be downloaded from the scripts page), but then realised that this concept was not a new idea. Ariel Walden over at ID-Extras had already written a similar script within a blog post of his own.

Similarly, Peter Kahrel’s Batch Convert script can perform the same task, with the added advantage that it can also do this for all open InDesign documents or, if no documents are open, for a specified folder (and subfolders if desired) of InDesign files.

Can’t make these secure

One feature that all three scripts have in common is that the exports are based on the PDF presets available on the user’s machine. One feature that can’t be added to a PDF preset is security – this can only be done when a request to export the document is made, as security settings aren’t saved into PDF presets.

This is a problem if there are lots of documents that need to be exported with security settings as it requires the user to enter the security details each time a PDF is exported.

I’ve made an additional script

For this purpose, I thought I would make a script that not only makes several PDFs, but can also add password security to one version. The script can be downloaded from the scripts page.

When the script is run, it will generate two PDFs using different PDF export settings, but one will have the suffix “_secure” added to the filename, and a dialog box will appear once the export is finished:

Adjustability

The script can also be adjusted by opening it in any text editing application and making the necessary changes, such as the following.

Use the same password for every document

Look for the line

    openDocumentPassword = myPassOpen; // requires a password to open the document

and change the myPassOpen to the desired password in quotations. For example:

    openDocumentPassword = "OpenSesame"; // requires a password to open the document

Similarly, do the same thing for the line underneath, making sure that the open password and edit password are not the same.

    changeSecurityPassword = myPassWrite; // requires a password to change the document

change to

    changeSecurityPassword = "EditSesame"; // requires a password to change the document

Then search for the lines

    dialog.show();
    //alert("Done");

and swap the forward slashes in the lines around so that the lines now read like this:

    //dialog.show();
    alert("Done");

Only require a password to edit the document

Look for the following line:

    openDocumentPassword = myPassOpen; // requires a password to open the document

and add two forward slashes to the start of the line.

    //openDocumentPassword = myPassOpen; // requires a password to open the document

Adding two forward slashes to a line of JavaScript turns the rest of that line into a comment, so the script ignores it and moves on to the next line of code.

Don’t show the “done” message

By default, the script ends with a dialog showing the opening and editing passwords. If you instead want a PDF that has security applied to editing but does not reveal the password (e.g. when handing PDFs over to parties who may seek to deconstruct them in other applications), make the adjustment described above to restrict passwording to editing only, and then search for the lines

    dialog.show();
    //alert("Done");

and swap the forward slashes in the lines around so that the lines now read like this:

    //dialog.show();
    alert("Done");

Add more PDF exports

Look for the line

    app.activeDocument.exportFile(ExportFormat.pdfType, File(resultsFolder + "/" + app.activeDocument.name.split(".indd")[0] + ".pdf"), false, "[High Quality Print]");

Make a copy of the line and make the appropriate changes:

  • Replace “[High Quality Print]” with the desired PDF preset, written exactly as it appears in the PDF export dialog box and enclosed in quotes. For example, if your PDF preset is called My Export then type “My Export”.
  • Replace “.pdf” with a suffix that denotes that this is an additional PDF. For example, if the PDF is a high-res print, perhaps replace this with “_hi-res.pdf” so that the resulting file has _hi-res.pdf at the end of its filename.
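Rather than copying the export line several times, the preset and suffix pairs could be kept in a list. The following is a rough sketch only and is not how the downloadable script is written. The helper name buildExportPaths, the jobs array and the example folder are all hypothetical, and the InDesign call is left as a comment because it only runs inside InDesign.

```javascript
// Hypothetical helper: build one output path per PDF preset.
// jobs is an array of { preset: "...", suffix: "..." } objects.
function buildExportPaths(folder, docName, jobs) {
    // Same base-name logic as the export line in the script
    var base = docName.split(".indd")[0];
    var result = [];
    for (var i = 0; i < jobs.length; i++) {
        result.push({
            preset: jobs[i].preset,
            path: folder + "/" + base + jobs[i].suffix
        });
    }
    return result;
}

var jobs = [
    { preset: "[High Quality Print]", suffix: ".pdf" },
    { preset: "My Export", suffix: "_hi-res.pdf" }
];

// Inside InDesign, each entry could then feed the same call the script uses:
// var paths = buildExportPaths(resultsFolder, app.activeDocument.name, jobs);
// for (var i = 0; i < paths.length; i++) {
//     app.activeDocument.exportFile(ExportFormat.pdfType, File(paths[i].path), false, paths[i].preset);
// }
```

Adding another PDF then means adding one more entry to the list instead of another copied line.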

Otherwise if you are after specific changes to the script to suit your needs, contact me via the contact page.

Things to know about the script

Opening and editing passwords must be different

One condition of preparing a secure PDF from Adobe InDesign is that the password required to open the PDF must be different to the password to edit the PDF, so if editing the script to replace the randomly generated passwords with known ones, the opening and editing passwords must be different. If the passwords are the same, the PDF will be made without security.
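To illustrate that condition, here is a minimal sketch of generating two random passwords that are guaranteed to differ. How the downloadable script generates its passwords is not shown here, so the function names randomPassword and makePasswordPair and the character set are assumptions. Math.random is also not cryptographically secure, so treat this as a convenience rather than strong security.

```javascript
// Hypothetical helper: build a random password of the given length.
// Ambiguous characters (0/O, 1/l/I) are left out of the character set.
function randomPassword(length) {
    var chars = "ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnpqrstuvwxyz23456789";
    var out = "";
    for (var i = 0; i < length; i++) {
        out += chars.charAt(Math.floor(Math.random() * chars.length));
    }
    return out;
}

// Hypothetical helper: return an open/edit pair that is never identical.
function makePasswordPair(length) {
    if (length < 1) {
        throw new Error("Password length must be at least 1");
    }
    var open = randomPassword(length);
    var edit = randomPassword(length);
    // Regenerate the edit password in the unlikely event of a match
    while (edit === open) {
        edit = randomPassword(length);
    }
    return { open: open, edit: edit };
}
```

The two values could then stand in for myPassOpen and myPassWrite in the lines shown earlier.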

PDF Standard in the preset must be set to “None”

PDFs that use a PDF/X standard can’t have security applied to them, as the security panel of the PDF export dialog box is greyed out, preventing security from being applied. The Standard dropdown in the desired PDF preset must be set to None.

Only password security is applied

When exporting a PDF from InDesign, only password security can be applied, unlike Adobe Acrobat’s choices of security that it can offer (as shown below).

While password security may deter or prevent a layperson from editing the PDF, the security can be broken with some effort. Several websites offer services where users can drag and drop a PDF to the site, and within moments the password will be removed from the PDF.

Similarly, there are desktop applications that can also be purchased to remove the security (as one of their many features), such as PDFsam Visual.

Applying character styles over character styles

There may be occasions where more than one character style has to be applied to the same words, such as a highlight, italic, etc. I recently saw this request over at the InDesign requests page.

In the request, the requestor does hint at a way that this can already be achieved in InDesign, though it can be time consuming. Let’s start from the beginning and look at some text that has an italic character style applied to it.

But if I apply a separate highlight character style that I’ve also made…

The highlight appears but the italic is removed. Reapplying the italic character style to the word only changes the word back to italic and doesn’t preserve the underline.

One solution is to do a local override – that is, to apply the appearance manually without using a character style.

Note the plus that appears to the right of the Paragraph Style 1 – this indicates a local override is present.

That works, but let’s say that the client asks for all italics to now be a tint of the colour initially used. That’s fine if character styles were applied, as the italic style only needs to be changed once in the properties of the character style. However, all the italics applied using local overrides will need to have their fills reapplied with the new settings.

Yes, the eyedropper tool and find/change can assist, but if character styles were applied, these additional steps would not be necessary.

In this circumstance, making a third style that has both the underline and italic would make sense.

In this case, it adds one more character style – not a big deal, but in a large document, the quantity of character styles can grow fast.

GREP Styles to the rescue

Take this chemical equation in a science textbook. It currently looks like this:

The subscripts in this equation have been applied with a character style that I’ve named sub. However, the author wants the reaction only in bold. If the equation is highlighted and then has a bold character style applied, this happens:

All of the subscript formatting of the numbers is lost.

I can then create a second style called “bold sub” that has bold and subscript properties and base the style on the bold formatting, but I then have to make sure I correctly apply the newly created style to the appropriate numbers… this now introduces a level of human error.

But what if I could apply the bold style and keep the subscripts? It is possible using GREP styles. Take the GREP code from this CreativePro post (look for Laurent Tournier’s post dated Oct 9 2010 in the comments) and add it to the paragraph style as a GREP style.

[editor’s note – I’ve adjusted mine to account for the naming of elements 113-118 as of 2018, so if you want that amended code, contact me via my contact page]
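To give a feel for what such a pattern matches, here is a much simpler illustrative expression tested in plain JavaScript. It is not the code from the CreativePro comment and it knows nothing about element names. It simply assumes that any digits directly following a letter or a closing bracket are subscripts, and it needs a JavaScript engine that supports lookbehind. The function name findSubscripts is hypothetical.

```javascript
// Illustrative pattern only (not the pattern from the CreativePro comment):
// one or more digits that directly follow a letter or a closing bracket.
var subscriptPattern = /(?<=[A-Za-z)\]])\d+/g;

// Hypothetical helper: return the digit runs that would be subscripted.
function findSubscripts(equation) {
    return equation.match(subscriptPattern) || [];
}

// Leading coefficients such as the 6 in "6O2" follow a space, so they are skipped.
var found = findSubscripts("C6H12O6 + 6O2 → 6CO2 + 6H2O");
// found is ["6", "12", "6", "2", "2", "2"]
```

A similar lookbehind expression may work in a GREP style, but it is worth testing in InDesign’s Find/Change dialog first.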

Now apply the paragraph style to the recently bolded text.

Brilliant! Note how the I-beam cursor is between two subscript numbers, yet the character style shows that this is bold only.

This technique can also be applied to other formatting where subscripts or superscripts need to be preserved, such as:

  • Ordinal Numbers
  • Numbers written with scientific notation
  • Squared or cubed measurements

It just requires the right GREP syntax. All of the above examples used GREP styles to format the subscripts and superscripts only. To learn this technique and others, apply to join the Treasures of GREP Facebook page.
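As an example of the kind of syntax involved, ordinal suffixes can be matched with a short lookbehind. The pattern below is an illustrative assumption tested in plain JavaScript, not the expression used for the examples in this article, and the function name findOrdinalSuffixes is hypothetical.

```javascript
// Illustrative pattern: st, nd, rd or th, but only directly after a digit
// and only at the end of a word.
var ordinalPattern = /(?<=\d)(?:st|nd|rd|th)\b/g;

// Hypothetical helper: return the suffixes that would be superscripted.
function findOrdinalSuffixes(text) {
    return text.match(ordinalPattern) || [];
}

// The "th" in "the" and "month" is ignored because no digit comes before it.
var suffixes = findOrdinalSuffixes("1st, 2nd, 3rd and 4th of the month");
// suffixes is ["st", "nd", "rd", "th"]
```

Because only the suffix is matched, a character style that changes nothing but the position to superscript can be attached to it.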

Once again to illustrate the point, the author wants these six lines in bold. By highlighting the lines and applying the bold character style, the subscripts and superscripts stay intact.

Nested styles

Similarly, this can also be achieved with Nested styles. Take the last two lines in the last example prior to applying the bold – if I want the ordinal number at the start of the line to be bold, I don’t have to write a GREP style but I can use a nested style such as the one below.

That will give me this result without applying any manual character styles to the text:

There are catches to this technique

The first catch is that the character styles must contain only the minimum of style changes. That is, the sub character style only changes the position of the character to subscript, so that is the only attribute the style will apply, while maintaining the rest of the paragraph style’s formatting.

The second catch is to be aware of the style hierarchy. The following list is in order of what style overrules another (from most to least dominant):

  • Local override
  • Local character style
  • Nested style
  • GREP style lowest in list in the paragraph style settings
  • GREP style highest in list in the paragraph style settings
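The order in the list above can be modelled as a small lookup, which may make it easier to reason about which formatting wins when several sources touch the same text. This is purely a thinking aid in plain JavaScript and does not query InDesign. The names RANK and dominantFormatting, and the shape of the candidate objects, are hypothetical.

```javascript
// Rank of each formatting source, most dominant first (from the list above).
var RANK = {
    "local override": 0,
    "character style": 1,
    "nested style": 2,
    "grep style": 3
};

// Each candidate looks like { source: <a RANK key>, position: <number> }.
// position is only used for GREP styles: a larger number means the style
// sits lower in the paragraph style's GREP list, so it wins.
function dominantFormatting(candidates) {
    var best = null;
    for (var i = 0; i < candidates.length; i++) {
        var c = candidates[i];
        if (best === null) {
            best = c;
            continue;
        }
        var rankC = RANK[c.source];
        var rankBest = RANK[best.source];
        var lowerGrep = rankC === rankBest &&
            c.source === "grep style" &&
            c.position > best.position;
        if (rankC < rankBest || lowerGrep) {
            best = c;
        }
    }
    return best;
}
```

For example, given a nested style and two GREP styles on the same word, the nested style is returned; given only the two GREP styles, the one lower in the list is returned.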

There can be several advantages to layering character styles by using GREP styles:

  • Fewer character styles.
  • Time saving for commonly formatted items such as ordinal numbers.
  • Consistency based on GREP patterns for words.

Similarly, there can be drawbacks with this technique:

  • Looks for particular words or phrases, so not appropriate for instances where dozens of words or phrases may make more GREP styles than are manageable.
  • Applies to paragraph styles, so if it is needed in many paragraph styles, the GREP style has to be added to each one. Scripts can help with this, such as one I wrote on my scripts page, or GREP Editor from Peter Kahrel.
  • Can’t take a bold style and italic style and combine them – it can only apply additional attributes that weren’t there previously.
  • GREP styles (along with live preflight, page thumbnails, dynamic spellcheck and any other service that has to run while the document is being composed) can slow the processing speed of the machine, particularly on larger documents.

Outlining the problem… text outlining

From time to time, I will prepare PDF artwork for third party providers and then note that their specifications indicate “Convert all text to outlines” (also known as converting to curves or paths). But why do some third parties recommend this practice?

The PDF is opened in software other than Acrobat

For commercial printers, PDFs are usually imported into Raster Image Processing (RIP) software that will impose and trap the artwork for their printing methods. However, not all providers work this way and may need to open the PDF in applications other than Adobe Acrobat. For example, a third party that prepares cutting formes may open the file in Corel Draw or a CAD application that supports its CNC software.

This means that as the file opens, the application may ask for fonts not available to the third party.

This can be exacerbated if the PDF is opened not only in a different application than Adobe Acrobat, but also on a system set up for a different alphabet and writing system. Converting the type to outlines maintains the appearance of the type without requiring the font to be present.

Other reasons that text is converted to outlines

So special effects can be applied

InDesign, Illustrator and Photoshop can apply interesting special effects to vector objects, but not all of those effects can be applied to live type. The solution is to convert the type to outlines, thus converting the type to vector shapes that can have the desired effect applied.

To prevent editing by third parties

Limited editing is possible within PDFs using either Acrobat’s own editing tools or using plugins such as Enfocus Pitstop Professional. These tools can allow last minute alterations to text so long as the text is type and not converted to outlines.

Locking the PDF with password protection isn’t an option, as this can prevent the file from being placed into layout software or RIP software for output unless the password is supplied to unlock it. PDF password protection is also somewhat breakable, with many websites offering services where PDFs can be uploaded, unlocked, and then downloaded without the password protection. There are also PDF editing and viewing applications such as PDFsam that allow for decryption of PDFs.

Even without the Enfocus Pitstop plug-in, it is possible to open PDFs in Adobe Illustrator or Affinity Publisher and then – if the fonts are available – make the necessary alterations… though converting type to outlines will prevent this.

To circumvent the font EULA

A client may have acquired a font that has allowed for screen use only and prohibits embedding in a PDF, preventing the font from appearing correctly in the PDF. A way around this is to convert the type to outlines in the native application prior to PDF export, though it is worth noting that the End User Licence Agreement (EULA) of the font may forbid this workaround, so it is worth reading the font EULA.

That doesn’t mean it should be done!

There are issues that arise from converting type to outlines. Dov Isaacs – Principal Scientist for Adobe Systems – has a brilliant PDF that details this (and much more) but the basic takeaways concerning type to outlines are:

  • Increased filesize that takes forever to download or view onscreen
  • Smaller typefaces do not render as well
  • May potentially breach the font’s EULA

In addition, there are other issues such as:

  • Potential issues with fonts where type overlaps itself (it can knock out holes in the joins)
  • If the conversion from type to outlines has been done in the native application and then accidentally saved and closed, this means the type will no longer be live in the native application.
  • It can prevent or hinder minor type alterations being made in a PDF submitted for print.
  • Text (as outlines) that has special effects applied (as described earlier) may not always be able to have the same effect applied to live type. This can create issues with variable data campaigns where the effect needs to be applied to a text variable.
  • It can make it difficult to identify the font used, as the font’s information is no longer in the PDF and the only other way to identify the font is visually or with apps such as WhatTheFont, Adobe Capture, or Identifont.
  • The conversion is usually a one-way conversion. There is a fantastic Adobe Illustrator plug-in from Astute Graphics called Vector First Aid 2 that – in some circumstances – can convert outlines back to type, but it isn’t a magic bullet (though definitely worth a look).

If your hand is forced…

In a perfect world, I’d only deal with providers that fully supported PDF/X-4 files. Unfortunately, not all providers do, and occasionally our hands will be forced into providing PDFs specifically as the provider has requested, which may mean converting text to outlines. Rather than doing this in the native application (e.g. InDesign or Illustrator) there is a great way to quickly convert all type to outlines using an Adobe Acrobat Preflight that is detailed over at CreativePro.
