Annotation Guidelines
In the previous section, we described how to use the tool and how to assign tags. In the following, we give you guidelines regarding which tag should be assigned to a particular kind of text.
- Everything that is boilerplate is tagged red. Boilerplate is ...
  - all navigation information,
  - copyright information,
  - hyperlinks that are not part of the text,
  - all kinds of headers and footers.
Generally speaking, boilerplate is everything that could appear unchanged on any other Web page, or that could be left out without changing the general content of the page.
- The following types of text are also tagged red:
  - incomplete sentences, or text in telegraphic style,
  - text containing 'non-words' such as file names,
  - obvious off-site advertisements (i.e. advertisements from an external site),
  - text in any language other than English (or the language to be tagged),
  - lists and enumerations of any kind (unless they are made up of complete sentences, see below).
- All captions (labels, legends, titles, headings, sub-headings, etc.) are tagged yellow. Everything that does not belong in the red or green category is also tagged yellow.
- All text that is left is tagged green, i.e. ...
  - text made up of complete sentences, even if it is in a list or enumeration,
  - text that makes use of 'normal' words,
  - text that is written in English (or the language to be tagged).
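The rules above can be condensed into a small decision function. This is only a sketch to make the order of the checks explicit; it is not part of the tool. The function name `choose_tag` and all of its parameters are made up for illustration, every judgement (is this boilerplate? is this a caption?) still has to be made by you, and the precedence of captions over the 'incomplete sentence' rule is an assumption that the guidelines do not state explicitly.

```python
def choose_tag(*, boilerplate=False, offsite_ad=False, caption=False,
               complete_sentences=True, normal_words=True,
               target_language=True, unsure=False):
    """Return 'red', 'yellow' or 'green' for one piece of text.

    Hypothetical helper: each flag is a judgement made by the annotator.
    """
    # Boilerplate and off-site advertisements are always red.
    if boilerplate or offsite_ad:
        return "red"
    # Assumption: captions are yellow even though they are rarely
    # complete sentences, so this check comes before the red rules below.
    if caption:
        return "yellow"
    # Anything that fits neither red nor green is yellow ('uncertain').
    if unsure:
        return "yellow"
    # Green needs all three: complete sentences, 'normal' words and the
    # language to be tagged. List items that meet them are green too.
    if complete_sentences and normal_words and target_language:
        return "green"
    # Telegraphic text, non-words, other languages and bare lists are red.
    return "red"


# A few example judgements.
nav_menu = choose_tag(boilerplate=True)
heading = choose_tag(caption=True, complete_sentences=False)
paragraph = choose_tag()
file_listing = choose_tag(complete_sentences=False, normal_words=False)
```

Here `nav_menu` and `file_listing` come out red, `heading` yellow and `paragraph` green.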
Simple, isn't it? You will notice that on some pages you can only highlight very large areas, while on others the choices are less restricted. When you tag an element, the assigned tag is propagated to all elements contained in that area. However, if you are not sure whether a specific element is included, just tag it too to be on the safe side (remember the sidebar option mentioned in the previous section!).
In a previous section, we said that, as a rule of thumb, it often makes sense to first tag everything red ('bad'), from top to bottom, and only then to start tagging smaller pieces yellow or green ('uncertain' or 'good', respectively). The easiest way to tag a whole page red is to select the outermost rim of the page and tag it as 'bad'. Because of tag propagation, the whole page is then tagged as 'bad'. If you want to make sure that this is the case, check the sidebar (see above).
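To make tag propagation more concrete, here is a minimal sketch of the idea. It assumes that a page can be treated as a tree of nested elements; the `Element` class and the `assign_tag` function are hypothetical and only illustrate the behaviour described above, not how the tool is actually implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for a page element (not the tool's data model)."""
    name: str
    children: List["Element"] = field(default_factory=list)
    tag: Optional[str] = None  # 'bad', 'uncertain', 'good', or None if untagged


def assign_tag(element: Element, tag: str) -> None:
    """Tag an element and propagate the tag to everything it contains."""
    element.tag = tag
    for child in element.children:
        assign_tag(child, tag)


# A tiny example page: the outermost element contains everything else.
heading = Element("heading")
paragraph = Element("paragraph")
article = Element("article", [heading, paragraph])
navigation = Element("navigation")
page = Element("page", [navigation, article])

# Rule of thumb: first tag the whole page 'bad' ...
assign_tag(page, "bad")
# ... then re-tag smaller pieces; a later tag replaces the propagated one.
assign_tag(article, "good")
assign_tag(heading, "uncertain")
```

After these three steps the page and its navigation are still 'bad', the article and its paragraph are 'good', and the heading is 'uncertain'. This is why tagging from the outside in saves work: only the parts that differ from their surroundings need a tag of their own.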
This may all be a bit confusing now. But fear not: in the next sections you will have the opportunity to check whether you have understood everything.
egon w. stemle
2010-08-31