How do I generate an index in Word?
Article contributed by John McGhie
The Microsoft Word Help suggests that you can automatically generate an
index. Sorry, but you can't (the "result" looks like an index, but the
reader can't use it). You can automatically mark index entries:
however, the amount of work required to edit the result into a useable index is
usually double the effort required to manually mark the index entries
one-by-one.
Instead of automatically generating something unusable, publish the document
electronically and provide a free-text search: the reader would far prefer it.
A free-text search serves the reader's needs far better than a badly constructed
index, and the search engines available these days are smart enough to look for
what the reader wanted rather than what he or she asked for.
Making an Index
I wrote this article as an experienced technical writer: I produce long documents
running to thousands of pages of technical material, and indexes are part of my
game. I can't tell you how to produce one automatically, but I can tell you
how to produce one easily!
Before 1990-ish, Indexing was a profession of its own; in addition to an
Author and an Editor, a large book had an Indexer. Even today, if you are making a book such as a medical
encyclopedia that is going to remain in
print for many years, it is simply stupid not to use a professional indexer. Really good indexes are
an even mix of science and art form, and the
quality improvement a professional makes is well worth paying for. Of course, few of us these days work on publications that are going to last
long enough to justify this effort. And even fewer of us have the time to produce such an index. If you do have the time, obtain a copy of
"Indexing, The Art of" by G. Norman Knight (Allen & Unwin, ISBN 0-04-029002-6).
Norman Knight is a former President of The Society of Indexers, and his book is simple and charming. Reading it, you will soon realize that indexing is not difficult; it simply takes attention to detail and patience.
Planning the Job
Word has one of the nicest and most powerful index generators around built right in, so you have all the tools you are going to need.
You need to allow a week per 500 pages to index a technical book. Technical
publications are fairly information-dense. Scholarly monographs and the like are
usually quicker to index.
Types of Index
In the old days (say, 1995 or thereabouts!) indexes were all produced by the
shoebox method. Indexers literally used a shoebox into which they inserted index
cards: three-inch by five-inch cards upon which they wrote the index term and its
page number. The Indexer would sit with a large pile of galley proofs (single-page
images as they were returned from the typesetter) and go through each one line by
line, seeking and recording the index terms. At the finish, the Indexer typed the
index out with its page numbers and sent it off to the typesetter for publication.
There is a software tool specially built for indexing that emulates this process
exactly. I tell you this simply because, in certain circumstances, this method is
still the best today. If your document is going to be published from a different
computer to the one it is being created on, and that machine cannot interpret
Microsoft Word XE tags, and you do not know what the page numbers are yet because
the other machine is going to do the pagination, then use the shoebox method!
Word will do two forms of index: the Concordance Index and the Mark-up Index. It
will also do something half-way in between, using its Mark All command.
Mark-up Indexes
A Mark-up index is the method I recommend. It's quick,
accurate, easy to understand, and easy to correct. With a little care in
the planning, it normally results in a very useable index.
As the term implies, you produce a mark-up index by embedding mark-up tags
in the Word document. Word automatically looks up the page numbers at Print time and generates and formats the index for you.
Study the help topic "Create an index" and all its sub-topics. It's the way that
all good writers create an index these days: mark by mark, page by page! It is
explained in detail below.
Concordance Indexes
I implore you not to waste your time with a Concordance Index for most
publications. It results in a huge pile of rubbish that is of very little use to
the reader, and it takes nearly as long to make as it does to generate an index
properly. The Concordance Index is a hangover from the past, when people were
desperately hoping to produce an automatic index to reduce the labor. Every major
word-processor will do them, and no professional writer or editor would, these
days, permit one.
To make a Concordance index you make up a two-column table: the terms you want
Word to find go in one column, and the index entry you want to see for each term
goes in the other. For more information, see "Create a concordance file" in the
Word help file. But the end result is that you have every term indexed at EVERY
place it occurs. Most of the mentions of a term in a book are simply passing
references: what the reader wants to see in the index is only one page number,
the one that contains the main topic for the term. If you send readers on a wild
goose chase to 20 other places first, they will think most unkindly of you.
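To see why this happens, here is a minimal Python sketch of what a concordance pass amounts to. It is not Word's implementation: the function name, the sample pages, and the whole-word, case-sensitive matching are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def concordance_index(pages, concordance):
    """Index every page on which a search term occurs.

    pages: dict mapping page number -> page text (hypothetical sample data).
    concordance: dict mapping search term -> index entry (the two-column table).
    """
    index = defaultdict(set)
    for page_number, text in pages.items():
        for term, entry in concordance.items():
            # Whole-word, case-sensitive match: an assumption for this sketch.
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                index[entry].add(page_number)
    return {entry: sorted(numbers) for entry, numbers in index.items()}


pages = {
    3: "Follow the procedure below to install the product.",
    41: "If you completed the install step, skip ahead.",
    77: "Before you install updates, back up your data.",
}
concordance = {"install": "Installation procedure"}

index = concordance_index(pages, concordance)
print(index)  # {'Installation procedure': [3, 41, 77]}
```

Only page 3 actually explains the procedure; pages 41 and 77 are the passing references that a human indexer would have left out.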
The concordance mechanism does have its place: It can often be used to
good effect in Reference Books such as Programming Reference Manuals, where each
command or function is referred to only in a small section of the text, then
rarely mentioned anywhere else in the book.
For the truly adventurous...
Technical writers and other folk who publish seriously-huge documents in HTML
may want to spend a little time learning about Concordance Indexes. In
conjunction with VBA, a concordance index is a great way to automatically
generate hyperlinks in your document. You tag every mention of each term
with the concordance indexing mechanism, then use VBA to change the tags into
hyperlink tags.
Indexing Made Easy
Here are some worthwhile hints I can give you so you do not go mad during the process:
1. Print a copy of the book and go through it with a highlighter,
marking the items you would like to see in the index. If you are not the
subject-matter expert, get someone who is to do this for you (the process
is massively easier if you understand the subject well). Mark only places where
the reader will get information about each item. For example, if you want to
include "installation procedure", you would mark "Follow the procedure below to
install..." in Chapter 1; you would not mark "if you completed the installation
procedure..." in Chapter 5. The first is what the reader would expect to see when
he looks up 'Installation Procedure'. The second might cause the reader to come
and look you up {grin}.
2. Make some design decisions before you start putting codes in the
file. The most important are:
- How many levels of entry are you going to allow? If it is more
than three, I will personally come and shoot you! Such an Index is both unusable and unmaintainable {grin}.
- Are you going to reverse the terms? "Indexing, the art of" or "The art of
Indexing"? Normally do the former, but whichever you select, you must do it for every entry.
- How will you treat numbers? All as if they were spelled out, or all
up the front above the As? In technical books, do the second, but whichever you do, you must do it for every number.
- Will you use "see" references to condense the index? My vote in
modern times is: No, don't bother. "See" references mean the reader finds
the index entry, then has to go find another index entry before they can find the page. It
annoys your reader, it doesn't save much paper, and these
days paper is not very expensive.
- Will you put the Table of Contents in the Index?
Debate rages in the more pedantic Indexing circles about this one.
The pedants (sorry, purists) say you should not include in the index terms that
are contained in headings in the table of contents. I say: Of
course you should. Research shows that some people (about 35 percent) look in Tables of
Contents, some people (about 60 percent) look in Indexes. Few readers these days have a clear picture of the conceptual difference between them, and
each reader will secretly thank you if he can find what he wants in both places. I always include an index entry for every heading in the book. So
shoot me!
- Sort order: word-by-word or letter-by-letter? By default, Word
does the former. Purists like the latter: I don't; I can never find anything in such an index, and most readers hate it. So shoot me again! To
produce a letter-by-letter sort, you have to place the generated index in a
two-column table (page numbers in one column, text in the other). Then
copy the text column, remove the spaces from the copy with Find/Replace, shift
the copy to all upper-case, and sort by it. Then remove the upper-case column and turn the table back into text.
- Avoid the classic hilarity of putting the book in the Index. If
you are writing a book called "All About Word" you may get sued for a
laughter-based injury if you include "Word" as a term in the index. But for
your own amusement, have a look in the indexes (not indices!) of a few cheap-and-nasty technical manuals such as are often produced in-house as
training manuals. You will be surprised how often you see this classic faux pas. And you may immediately become suspicious that you are looking at an
automatically generated index!
3. Now run through and tag the entries you have highlighted, according to the
instructions in the help topic "Mark index entries". Unfortunately,
if you have made a few indexes, you will know how to do this, and if you haven't, your first attempt will contain errors. Sorry: I had to go
through this too {grin}.
I will give you a hint that will save you a bit of time (quite a lot, actually...)
Do not put in the subentries at this stage. By that I mean tag each
item as a main term. If the entry does belong as a subentry, you will find
that you can add the main term to the tag more simply on your second pass.
A Word About Tagging:
Word's index tags are both case-sensitive and "space-sensitive".
"Installing" and "installing" are not the same thing: each will appear under its
own heading. "Administration" and " Administration" are not the same
thing: one will sort right at the top of the index. See? When you
are debugging "entries out of sequence" you sometimes have to look extremely
closely to ensure that the tags really do match exactly.
To enter an index tag in a heading, ensure that your headings are formatted
by styles, and do not apply any formatting overrides to the heading. If you
apply direct formatting to the headings that contain index tags, the direct
formatting will be copied through to your Index.
A colon : and a semicolon ; are not the same thing! You use colons to
divide the levels of sub-entry in your index tags. When you are in a
hurry, it is too easy to type the un-shifted character (the semi-colon) instead
of the shifted character (the full colon) in the tag. If you do, you will
get some very weird errors in your generated index. There's no easy way to
find these, but the semi-colon will appear in the index. If you have
strange things happening (items that do not appear under their correct entries
or sub-entries) try searching your generated index for semi-colons. If you
find any, at least you know "what" is wrong: finding the tag that produced the
problem is a real chore (it will not be on the page in the index...). Try
this: Reveal your hidden text (so you can see your XE tags) then search
for a semi-colon with the font format hidden text. If you find any,
chances are they are in your bad index tags.
4. Now generate the index. Ignore the formatting at this stage; just print it. Leave it as a single column for ease of reference. If you have a
big screen, you can open a second window into the document and look at the index that way (see the Window menu) but for most, it's easier to print the
first result.
5. Now sit down with a colored pen or pencil (you can't see blue or black against black type...) and edit the index.
- Mark all the terms that should become sub-entries, and show the term they should be sub-entries of.
- Now run down it, and for each term, ask yourself "What else could the reader
possibly call this?" Add an entry for each.
- Run down it again, and for each term, ask yourself "Is there anything else the
reader would need to know about when looking this up?" Add a "See also" for each
one you find.
6. Go through and edit the tags in the file to implement the changes you have identified.
You can find index tags easily by using the Browse buttons on your vertical
scroll bar (see "Browse to the next or previous page, table, or other item" in
the help).
In later versions of Word (2002 and above) you can use Ctrl + G to
bring up the "Go To" dialog. Set "Go to what?" to "Field".
Set the Enter field name box to "XE". Click Next,
then Close. Your "Previous" and "Next" browse buttons (at the
extreme bottom right corner of the Word window, under the vertical scroll bar)
will now go to the next or previous index entry fields on each click, until you
change to something else.
If you use Find, or Browse by Find, you
can specify ^d XE as your Find string to find only index tags.
If you know exactly what the text of the tag is, you can use ^d XE "tag
text string" to find exactly that tag. However, this requires you
to work out exactly what the tag content will be, and that's not easy three
levels down in an Index.
So I prefer to use Ctrl + G, Page Number (from the index), then
Ctrl + F, ^d (to find the next XE tag). Then keep hitting Browse
Next to find the tag you want.
7. Now regenerate your index (click in it and press F9). You can now change it to double-column if you wish. You format an index by using
Format>Style to change the styles Index 1 through Index 9. Each style controls the formatting of one level of entry.
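The tagging pitfalls described in step 3 (case, stray spaces, and semicolons typed for colons) can also be checked mechanically once you have the text of your index tags in a list. The sketch below assumes you have already got that list out of the document somehow; the extraction is not shown, and the function name and sample entries are hypothetical.

```python
from collections import defaultdict


def lint_index_entries(entries):
    """Return (entry, problem) pairs for suspicious index tag text."""
    problems = []
    variants = defaultdict(set)
    for entry in entries:
        if entry != entry.strip():
            problems.append((entry, "leading or trailing space"))
        if ";" in entry:
            problems.append((entry, "semicolon (did you mean a colon?)"))
        # Group entries that are identical apart from case and outer spaces.
        variants[entry.strip().lower()].add(entry.strip())
    for spellings in variants.values():
        if len(spellings) > 1:
            for spelling in sorted(spellings):
                problems.append((spelling, "differs only in case from another entry"))
    return problems


entries = [
    "Installing",
    "installing",
    " Administration",
    "Administration",
    "Printers;Drivers",
    "Printers:Paper trays",
]

problems = lint_index_entries(entries)
for entry, problem in problems:
    print(repr(entry), "->", problem)
```

On this sample it flags the leading space in " Administration", the semicolon in "Printers;Drivers", and the "Installing"/"installing" pair, which are exactly the mismatches that are so hard to spot by eye.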
Page Number Conflation
Page number conflation is where only the first and last page numbers appear
for a topic. In the index you see 88 - 95 instead of 88, 89, 90...
I am very tempted to say "don't bother"! Tag the first instance of each
term. If your reader does not have the brains to see that the information
on a topic continues for several pages, they should be kept away from your book
in case they hurt themselves... However, if you absolutely must conflate,
there are two ways of doing it:
- If you place the same index tag on each page of the topic, Word will
automatically conflate the page numbers.
- If you bookmark the whole section, then place the name of the bookmark in
the XE tag (using the \r switch), Word will generate a conflated page reference for you.
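If the term is new to you, conflation is nothing more than collapsing runs of consecutive page numbers into ranges. Here is a minimal Python sketch of the idea; it is not how Word does it internally, and the function name is my own.

```python
def conflate(pages):
    """Collapse runs of consecutive page numbers into 'first-last' ranges."""
    ranges = []
    for page in sorted(set(pages)):
        if ranges and page == ranges[-1][1] + 1:
            # The page continues the current run, so extend its end.
            ranges[-1][1] = page
        else:
            # Start a new run of one page.
            ranges.append([page, page])
    return ", ".join(
        str(first) if first == last else f"{first}-{last}"
        for first, last in ranges
    )


print(conflate([88, 89, 90, 91, 92, 93, 94, 95]))  # 88-95
print(conflate([12, 88, 89, 90, 102]))             # 12, 88-90, 102
```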
See! It isn't that hard
There! That's the way I do it. If you trust me and do it that way, you will find out why I do
it that way. If you don't trust me and do it
another way, you will find out why much sooner {grin}.