Wrangle up InDesign index entries… without InDesign.

A recent project involved creating an enormous index… in fact there were over 100,000 index entries to create.

Creating index entries is normally a chore. To create just one index entry, the normal procedure is to:

  • Highlight the text to be indexed
  • Select “New Page Reference” from the index palette (or command + 7)
  • Enter the details and click Add (or Add All) then OK

In a normal book, indexing is something that is done carefully by the author or staff dedicated to the task – entries in the index often refer to certain instances of a word rather than every instance of its use. This project however used the index as a lookup table instead, so the more advanced features of the index palette (e.g. see also references, index levels) were not necessary.

For this project, the items to be indexed were restaurant names. Each name appeared on the same line as its description, so a Paragraph Style could not be used to identify the text for an index entry. However, the restaurant names DID have a Character Style applied to them.

Because there were 100,000 index entries in this book and every entry was marked with a character style, there were easier ways to perform this task than adding each entry by hand. There are several scripts online that can create index entries from character styles.

For this project, because there were two character styles used to identify the restaurant names, I used Peter Kahrel’s script. While testing the script on a sample chapter, everything appeared to work correctly… it took time but all the names in character styles were added to the index.

However, when the time came to apply this script to a document 1,628 pages long, the script would run, and then the spinning beach-ball of death would appear. Assuming I was not allowing the script enough time to finish its tasks, an attempt was made to let the script run over a weekend on the fastest machine in the office. Sadly, this did not work. Put simply, there were just too many entries for the machine to handle.

Enter TextWrangler…

Luckily, the text for this project, while many pages long, was all in one text frame. This made it possible to add the index entries while the text was in a raw text format. To do this, the text was exported as an Adobe InDesign Tagged Text file by placing the cursor anywhere in the text and selecting File/Export (command + E).


The newly saved text file was then opened in TextWrangler. A find/change using TextWrangler’s GREP was then made for the following:

[Screenshot: the TextWrangler find/change dialog]

If the code is hard to see in the picture, here it is as text:

Find:

(<cstyle:placename>)(.+?)(<cstyle:>)

* “placename” above stands for the name of the character style to index.

Replace:

<Idx:=<IdxEnType:IdxPgEn><IdxEnRngType:kCurrentPage><IdxEnDispStr:\2>>\1\2\3

(make sure the Grep checkbox is ticked)
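
For anyone who would rather script the substitution than use a text editor, the same find/change can be sketched in Python. This is only an illustration of the pattern above, not part of the original workflow: the function name add_index_tags, the second style name and the sample listing text are all made up, and reading and writing the exported file (whose encoding depends on the export settings) is left out.

```python
import re

# Index marker from the Replace field above; \g<2> is the styled text,
# which becomes the index entry's display string.
INDEX_TAG = (
    r"<Idx:=<IdxEnType:IdxPgEn><IdxEnRngType:kCurrentPage>"
    r"<IdxEnDispStr:\g<2>>>"
)


def add_index_tags(tagged_text, styles):
    """Put a page-reference index marker in front of every styled run.

    `styles` is a list of character style names (hypothetical here).
    Returns a tuple: (rewritten text, number of runs that were tagged).
    """
    names = "|".join(re.escape(s) for s in styles)
    # Same three groups as the Find field: opening tag, text, closing tag.
    pattern = re.compile(r"(<cstyle:(?:" + names + r")>)(.+?)(<cstyle:>)")
    return pattern.subn(INDEX_TAG + r"\g<1>\g<2>\g<3>", tagged_text)


# Made-up sample in the shape of exported tagged text.
sample = (
    "<cstyle:placename>Cafe Roma<cstyle:> Great pasta.\n"
    "<cstyle:placename2>Blue Door<cstyle:> Fresh seafood.\n"
)
new_text, tagged = add_index_tags(sample, ["placename", "placename2"])
print(tagged, "entries tagged")
print(new_text)
```

Passing a list of names covers the case of two character styles in one pass, and the returned count gives a quick check that the number of tagged runs is in the expected range before the file goes back into InDesign.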

Once the changes to the text file were saved, the text was imported in place of the old text in the InDesign file, and within moments the document was completely indexed as required.

The only other part that took time was running the “Generate Index” function in InDesign itself which, considering the number of index entries in the document, took an hour.

7 comments

  1. I’m about to build a 1,200-page book from spreadsheet data from the client, loading it into my dbms to categorize and sort it into sections (chapters), and using Applescript between the dbms and ID CC to build chapter files for the book. Lots of names to index, which I’ll build with Applescript by creating separate, non-printing text frames which contain text with the index character style. This creates ID CC index entries which sort by lastName.

    Thanks so much for this simple technique!

    Now all I have to do is figure out the Applescript. I haven’t been able to find any indexing commands in the AS dictionary, so this completes the puzzle.

  2. This works great if the character styles are already in the text, but what do you do if you need to add the character styles to raw text? My example needs a character style applied to the text at the beginning of the paragraph, up to the first tab character. Naturally, I tried building a nested style to apply this, but surprise! When exporting the text as ID Tagged Text, the nested character style does not show up in the markup. Does anyone know of a way to force the nested style definition to get included in the IDTT?

      • Hmm. I tried the tip above, using Tags to map the character style to it, but it didn’t produce the results I expected. However, the first comment had EXACTLY the right solution, and is SO SIMPLE it’s not surprising that it didn’t occur to me. I didn’t realize that a nested character style could be searched for using the Search by Format dialog, then Replace with Format, using the Character style. Then export to IDTT, and boom. The appropriate format to do the GREP search and replace.

  3. My case is completely consistent; what I really need is index tags added to all text from the beginning of the paragraph to the first tab character. My GREP skills are pretty rudimentary, but I know this should be possible. Should I just use ^.+?(?=\t) then apply the replace to get everything up to the first tab indexed?
