The Epicurean Cheese Shop Thesaurus

Structure of the Indexing Language

Suitability for the Cheese Shop Environment

Terms have been kept fairly simple to make the thesaurus accessible to new employees and a broad range of the general public, both at the store and online. Literary warrant has been the main criterion in the determination of preferred terms. In addition to forming the integrated index to the online database, the thesaurus will be available in a printed format. The number of cheese varieties to be indexed is not expected to be excessive. Initially, the indexing will be restricted to the nineteen varieties of cheese that the store currently carries in its inventory. This range may expand to several dozen, or perhaps slightly over one hundred, if the store's expansion plans go ahead. Many of the terms will also be used in the store on posters, in graphic displays and as the basis for the arrangement of the cheeses on display shelves. Whilst TELL has been hired to develop the initial thesaurus, the store manager or assistant manager will likely do the actual indexing of the cheese varieties. Furthermore, TELL will likely be retained on a contract to maintain and update the thesaurus as the store expands its inventory of cheese varieties and new terms are warranted.

Type of Indexing Language

The thesaurus is a controlled vocabulary of cheese terms based on the ANSI/NISO Z39.19-1993 standard, Guidelines for the Construction, Format, and Management of Monolingual Thesauri. There is a mixture of single-word and multi-word terms representing several aspects of the cheeses sold in The Epicurean Cheese Shop. Scope notes are used to clarify the meaning of some of the terms, especially terms whose meaning is not obvious, such as barnyardy, and common terms used in a specific way, such as low fat, medium fat, and high fat [3.2.2]. Nearly all the terms are from English, although some French and Italian words may later be added as non-preferred terms, to meet the needs of cheese connoisseurs with a working knowledge of those languages.

Pre-coordinate or Post-coordinate Retrieval

The thesaurus terms will eventually be integrated into a highly structured online database, which will enable users to search for cheese varieties based on a variety of characteristics. Users will be able to select characteristics from several different pop-up boxes that reflect the main classes of terms in the thesaurus used to describe cheese varieties, such as fat content, flavour, flavour intensity, texture, milk type, or national origin. The thesaurus terms will, therefore, be post-coordinated at the retrieval stage, with the possibility of using Boolean operators to combine or restrict specific cheese characteristics. Indexers will apply all applicable terms to the cheese varieties.
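Post-coordination of this kind can be illustrated with a minimal sketch. It is not the store's database: the `INDEX` contents and the `search` function are hypothetical, and only the Gorgonzola terms are taken from this document. Each cheese carries all the terms an indexer applied to it, and the terms are combined only when a search is run.

```python
# Hypothetical index: cheese variety -> set of thesaurus terms applied by the indexer.
# The Gorgonzola terms follow this document; the other entries are illustrative assumptions.
INDEX = {
    "Gorgonzola": {"blue cheeses", "cow's milk", "Italian cheeses", "semihard cheeses"},
    "Roquefort": {"blue cheeses", "sheep's milk", "French cheeses"},
    "Brie": {"cow's milk", "French cheeses", "soft cheeses"},
}


def search(index, all_of=(), any_of=(), none_of=()):
    """Return the sorted names of cheeses matching a Boolean combination of terms."""
    results = []
    for cheese, terms in index.items():
        if not set(all_of) <= terms:              # AND: every term must be present
            continue
        if any_of and not (set(any_of) & terms):  # OR: at least one term must be present
            continue
        if set(none_of) & terms:                  # NOT: none of these terms may be present
            continue
        results.append(cheese)
    return sorted(results)


# Terms are combined at search time, not at indexing time.
print(search(INDEX, all_of=["blue cheeses", "cow's milk"]))
print(search(INDEX, all_of=["French cheeses"], none_of=["blue cheeses"]))
```

Each pop-up box in the planned interface would correspond to one class of terms, and the selections from the boxes would be passed to a function of this kind.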

Form of Terms

a). Single and Multi-word Descriptors

Both single words and multi-word terms are used in the thesaurus. The criteria for the use of multi-word terms [4.2 - 4.3] have been followed in most cases. The general test "part of-type of" was employed to determine whether compound terms should stay together or be split. For example, "soft" is a type of cheese, so the noun phrase soft cheeses is used as the preferred term [4.3]. The terms to show the animal of origin for milk posed certain problems. Using the terms cows, goats and sheep, for example, did not seem specific enough, and although goat milk is a term in common use, the terms sheep milk and cow milk do not have any literary warrant. We selected the more commonly used terms as preferred terms: cow’s milk, goat’s milk and sheep’s milk. These terms do not, however, properly meet the guidelines for the use of terms showing the possessive case [3.7.2.3.1]. The apostrophe poses a potential retrieval problem, but the structured searching employed in the store’s online database, by means of pop-up boxes of terms to be selected by the user, should obviate any problem in searching the database.

b). Singular and Plural Forms

The use of singular or plural forms of terms has followed the usage recommended in the standard [3.5]. For example, count nouns, such as "cheeses", are given in the plural, and uncountable nouns, such as "milk", are given in the singular.

c). Hyphens

The use of hyphens has been avoided throughout [3.7.2.2]. The terms semihard and semisoft are more commonly seen as hyphenated words, but we adhered to the guidelines in this case to maintain a consistent use of non-hyphenated terms.

d). Grammatical Form of Terms

Since many of the terms in the thesaurus are intended to describe the various characteristics of cheeses, several decisions had to be made with respect to the use of adjectives [3.4.2] or adjectival noun phrases [3.4.1.2]. For the several terms for <cheese flavours>, for example, the word "flavour" could have been added to each of the adjectives, but the adjectives alone were deemed sufficient. Barnyardy, for example, rather than barnyardy flavour, is the preferred form selected for the thesaurus. Similarly, the terms mild, medium and sharp were selected as terms to describe variations in flavour intensity. The scope of the thesaurus is sufficiently narrow so that terms such as medium, sharp and mild, which might be considered ambiguous in a broader context, have not been further qualified by adding the noun flavour. On the other hand, the noun cheeses is added to the terms listed under <cheese texture types> to avoid ambiguity and because of literary warrant. Thus, hard cheeses, semihard cheeses, and soft cheeses are the preferred terms.

e). Spelling

Spelling usage is based on the Canadian Oxford Dictionary. In the case of terms with an optional British or American spelling, the British spelling has been preferred. This is justified on the basis of the store specializing in British and other European cheeses. The primary example is the spelling of "flavour."

f). Capitalization

Lower case letters are employed throughout with the exception of the names for specific cheeses, which are treated as proper nouns named after the geographic areas in which they originated [3.7.1 and 5.3.3].

Relationship Structures

a). Equivalence

The equivalence relationship is one of the primary means by which a thesaurus can be said to control the indexing vocabulary. A preferred term is selected to which various non-preferred terms, such as synonyms and lexical variants, can be referred [5.2]. There are numerous terms in common use with which to describe the various sensory characteristics of cheeses, so there are a number of USE and UF relationships in the thesaurus. The term sharp, for example, is the preferred term for various words with close or similar meanings, such as old, piquant, pungent, and strong. Establishing USE and UF relationships for the variant spellings of flavor and flavour was deemed unnecessary, as these terms would be listed next to each other in the alphabetical display.
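The USE and UF references are two views of the same mapping, which a short sketch can make concrete. The names `USED_FOR`, `USE` and `preferred_term` are hypothetical and chosen for illustration; only the entry for sharp comes from this document.

```python
# UF (used for) lists keyed by preferred term; the entry for "sharp" is taken from this document.
USED_FOR = {
    "sharp": ["old", "piquant", "pungent", "strong"],
}

# Invert the UF lists to obtain the USE references: non-preferred term -> preferred term.
USE = {
    variant: preferred
    for preferred, variants in USED_FOR.items()
    for variant in variants
}


def preferred_term(term):
    """Resolve a term to its preferred form; other terms are returned unchanged."""
    cleaned = term.strip()
    return USE.get(cleaned.lower(), cleaned)


print(preferred_term("pungent"))
print(preferred_term("Brie"))
```

Looking up the lower-cased form while returning the original spelling when no USE reference exists keeps the capitalized names of specific cheeses intact.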

b). Hierarchical

The Cheese Thesaurus makes use of a number of hierarchical relationships, using the conventional indicators BT for broader terms and NT for narrower terms [5.3]. There are several instances of the generic relationship [5.3.1] that can be demonstrated by the "is a" test or the "all-and-some" test: [narrower term] is a [broader term]; some [broader term] are [narrower term] or all [narrower term] are [broader term]. Examples include:

goat’s milk is a <milk type>
nutty is a <cheese flavours>
some <cheese types> are French cheeses
all Italian cheeses are <cheese types>

All of the specific cheese types listed under the <national origin> terms are examples of the instance relationship [5.3.3]. In most cases the specific cheeses are named after the geographic area in which they originated, so they are capitalized as proper nouns and can be considered individual instances of the more general category. Brie, Crottin de Chavignol, Munster, Picodon, Port Salut, and Roquefort, for example, are individual instances of the broader term French cheeses.

A number of node labels have been used throughout the thesaurus to indicate hierarchical relationships. Their function is similar to that of "broader terms", but they are not preferred descriptors. They are distinguished from descriptors by the use of angle <> brackets [5.3.3]. Examples include <cheese types>, <cheese flavours> and <national origin>. Scope notes, equivalence relationships and associative relationships are used with some of the node labels, as shown in the following example: 

<cheese flavour intensity>
  SN: A relative measure of the strength of the flavour and aroma characteristic of a cheese; it is usually closely related to the age or maturity of the cheese.
  UF: cheese flavour strength
  NT: medium
      mild
      sharp
  RT: <cheese flavours>

Because the thesaurus is designed to bring out the multiple characteristics of a limited number of cheese types, one might expect that many of the terms will belong to more than one broader category. In this thesaurus, as developed so far, all of the specific named cheese types belong to three or four broader classes: one of the <milk types>, one of the <cheese texture types>, one of the <national origin> terms, and, in some cases, blue cheeses. These are examples of polyhierarchical relationships [5.3.4].

Gorgonzola, for example, is linked to the broader classes blue cheeses, cow’s milk, Italian cheeses, and semihard cheeses. The broader classes previously mentioned are also narrower terms for the broader class <cheese types>. The thesaurus, then, demonstrates a complex array of nested and polyhierarchical relationships. The Alphabetical Display of the thesaurus is a flat format [5.3] and does not show the multiple levels of the hierarchy. The Hierarchical Display does show the levels of the hierarchy by the use of indentations and the indicators BT1, BT2, BT3, NT1, NT2, and NT3. The Top Term Display also shows the levels of the hierarchies by employing stepped indentations.
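The way level indicators such as BT1, BT2 and BT3 can be derived from polyhierarchical links is sketched below. This is an illustration under stated assumptions, not the method used to build the displays: the `BROADER` table is a partial, hypothetical sample, the Gorgonzola links follow this document, and the placement of the node labels directly under <cheese types> is assumed.

```python
# Broader-term links; a term with several parents is polyhierarchical.
# Gorgonzola's parents follow this document; the upper levels are an assumed, partial sample.
BROADER = {
    "Gorgonzola": ["blue cheeses", "cow's milk", "Italian cheeses", "semihard cheeses"],
    "cow's milk": ["<milk types>"],
    "Italian cheeses": ["<national origin>"],
    "semihard cheeses": ["<cheese texture types>"],
    "<milk types>": ["<cheese types>"],
    "<national origin>": ["<cheese types>"],
    "<cheese texture types>": ["<cheese types>"],
}


def broader_levels(term, broader):
    """Walk up every parent chain and label each ancestor with its level (BT1, BT2, ...)."""
    levels = []
    seen = set()
    frontier = [term]
    level = 0
    while frontier:
        level += 1
        next_frontier = []
        for current in frontier:
            for parent in broader.get(current, []):
                if parent not in seen:  # report an ancestor once, at its nearest level
                    seen.add(parent)
                    levels.append((f"BT{level}", parent))
                    next_frontier.append(parent)
        frontier = next_frontier
    return levels


for indicator, ancestor in broader_levels("Gorgonzola", BROADER):
    print(indicator, ancestor)
```

A flat alphabetical display would list only the BT1 entries for a term, whereas a hierarchical display needs the full walk.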

c). Associative

The associative relationship is used to suggest ideas for further retrieval by users or to provide assistance to the indexer in applying terms. The RT [Related Terms] indicator is used to show terms that are related in various ways. Several sets of terms in the thesaurus represent concepts of degree. The terms mild, medium and sharp, for example, are sibling terms under the parent node <cheese flavour intensity> and are further related to each other using the RT indicator. These terms are not mutually exclusive, but, because they shade one into another by degree, are slightly overlapping terms [5.4.1.1]. The terms low fat, medium fat, and high fat are similarly related. An associative relationship between sibling terms under a broader term need not be shown if these sibling terms are mutually exclusive [5.4.1.2]. The sibling terms British cheeses, French cheeses, Italian cheeses and Swiss cheeses, for example, are mutually exclusive so the RT associative indicator is not used between these terms. Also mutually exclusive are the specific cheese types (Cheddar, Roquefort, Asiago, etc.) listed under several broader terms. The narrower terms listed under <milk types> (cow’s milk, goat’s milk and sheep’s milk) are also mutually exclusive, so the RT indicator is not used between these terms.

Precision/Recall

Recall is the proportion of the relevant items in a collection that a search actually retrieves. Precision is the proportion of the items in a retrieved set that are relevant. Several factors in the design of the thesaurus will enhance search precision and recall. The specificity of the controlled vocabulary will enhance precision. The use of multi-word descriptors provides a greater degree of precision for the meaning of terms. Directing users and indexers to preferred terms from a variety of non-preferred terms also increases precision. Scope notes help clarify the meaning of terms, thus enhancing precision. The multi-leveled hierarchical structure will enable users and indexers to more readily find the term specificity required. The extensive use of associative relationships enhances recall by directing users and indexers to related aspects of cheese.
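Both measures are conventionally expressed as ratios, as the following minimal sketch shows. The function name and the sample search are hypothetical; the relevance judgments are invented for illustration and do not describe the store's inventory.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against the known relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that the search actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: four cheeses retrieved, three cheeses judged relevant.
retrieved = {"Brie", "Roquefort", "Gorgonzola", "Cheddar"}
relevant = {"Roquefort", "Gorgonzola", "Asiago"}
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sample two of the four retrieved cheeses are relevant, and two of the three relevant cheeses were retrieved, so a narrower search could raise precision only at the risk of lowering recall.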

Specificity

A high level of specificity in the terms for specific cheeses and flavour characteristics is reflected in the thesaurus. The specificity required is determined by the anticipated needs of the store’s employees, customers and online users. Some general guidelines are suggested in the scope notes for the terms descriptive of <cheese fat content>. Total precision is not possible as guidelines for labelling and the designation of fat content vary from one country to another. Most of the names for specific cheeses, such as Brie and Cheddar, can almost be considered class terms, since varieties of each are produced in several countries. Further refinement of these terms might be required in the future. It is difficult to be very specific with regard to flavour terms. There is no standard set of terms in widespread use, so terms were chosen primarily because they are fairly easy to distinguish from one another. The many-layered hierarchical structure of the thesaurus is an important aid to users and indexers in finding the correct level of specificity for terms desired in a search or in indexing cheese types.

Exhaustivity

The depth of coverage of the thesaurus is definitely limited at present. The current thesaurus represents the first stage in the development of what will surely become a more expansive work. The current version reflects the rather limited range of inventory presently carried at the store. Nineteen specific types of cheese from four European countries are contained in the current inventory. The store plans to gradually add further varieties from these four countries, and expand to include some varieties from other European countries, such as Denmark and the Netherlands. The thesaurus, it is hoped, will expand to include terms involved in the making of cheese, as manufacturing processes are very significant in determining some of the desired characteristics of cheese.
