Indexing Guidelines and Analysis
INTRODUCTION
The 2003 revision of the Guidelines for the Construction, Format, and Management of Monolingual Thesauri, created by the National Information Standards Organization (hereafter simply "the Guidelines"), was consulted during the development and construction of the ABC Thesaurus. Various points that required careful consideration are discussed below and noted, where applicable, with the appropriate section number.
SUITABILITY AND TYPE OF INDEXING LANGUAGE
The indexing language chosen by GourmandeX to fulfill the needs of the Canadian Association of British Artisanal Cheeses Importers (CABACHI) was developed through extensive research into the practices, laws, and literature of cheese making, importation, and sale. All terms included in the ABC Thesaurus were derived from authoritative, professional sources used within both the CABACHI community and the larger Western cheese community, and through interviews with highly knowledgeable members of CABACHI. The indexing language chosen by GourmandeX is a hybrid of controlled and natural language, derived from the above sources.
Natural language is human language; in this case, it is the language used by makers, importers, and sellers of British cheeses. CABACHI members are fluent in the language and terminology associated with cheese, and the creation of the ABC Thesaurus v.1 facilitates CABACHI's interaction with Agriculture and Agri-Food Canada when importing cheeses. The natural language used by CABACHI members and cheese makers would not necessarily be understood by officials of Agriculture and Agri-Food Canada; the ABC Thesaurus therefore offers users a defined controlled vocabulary from which CABACHI members can derive official terminology for completing importation forms. The ABC Thesaurus offers access through entry terms, which are often natural language, and clearly links these natural language terms (the non-preferred terms) to their preferred terms. Through the use of both natural and controlled language, GourmandeX has optimized both the precision and recall of the ABC Thesaurus.
CABACHI has contracted GourmandeX for further work on the ABC Thesaurus. The ABC Thesaurus v.2, to be created in the fall of 2005, will include the cheeses of Wales, Scotland, Northern Ireland, Ireland, and the Orkney Islands. It will also include artisanal cheeses made from the milk of ewes and goats, both in Britain and in the above locales.
As a result, CABACHI can rest assured that the ABC Thesaurus v.2 will remain authoritative and of the highest quality, thereby guaranteeing continued usefulness and efficacy in the practice of cheese importation from abroad.
While it is possible that CABACHI could encounter difficulties in procuring funding for future versions of the ABC Thesaurus, the artisanal cheese market in Canada is currently thriving. Agriculture and Agri-Food Canada continues to support the members of CABACHI in business ventures, and CABACHI continues to support cheese makers both abroad and at home.
GUIDELINES
Domain
The ABC Thesaurus serves a narrow community (CABACHI) within the specific language domain related to cow's milk artisanal cheeses of England for the purpose of importation. The highly specialized nature of this language environment, together with the encouragement from the Guidelines (3.6.1) to select terms that reflect the usage of individuals familiar with that language, indicated that we should follow the best practice of using terms as recognized by the user group wherever possible.
Compound Terms
The compound terms in the thesaurus are managed in accordance with the Guidelines. These terms express a single concept and their elements cannot be broken down and still retain their meaning within the context of the thesaurus. Overall, literary warrant is an essential and regular factor in our inclusion of compound terms (4.1.3.a). In addition, many of these qualified multi-word terms are proper names, such as Beenleigh Blue cheeses (4.2.f).
Proper Names
Inclusion of proper names in a thesaurus can be tricky according to the Guidelines. Levels of control, such as a separate name authority file, are suggested to ensure the smooth structure of the thesaurus and to avoid confusing users (3.6.8). In our case, the cheeses are largely named for the regions or dairies from which they originated, and these names are well established. We decided that the suggested controls were unnecessary given the scope of this thesaurus and the high level of user familiarity with the topic. Names of cheeses were established by examining authoritative, professional sources, and descriptors were selected for their common usage.
Character Case
The Guidelines suggest using lowercase characters in common noun descriptors to help clients distinguish them from capitalized proper names (3.7.1 and 6.3.4). However, a very high percentage of terms in the ABC Thesaurus are proper names. As there are so few common nouns, presenting them in their lowercase form alongside the proper nouns creates an awkward display. Indeed, the visual disconnect gives the impression that common nouns are somehow erroneously positioned nonpreferred terms, when in fact they are properly aligned descriptors. To eliminate this confusion all terms are capitalized to maintain consistency in the look of the thesaurus and facilitate its ease of use. The narrow focus of the thesaurus and the expertise of the user-group will allow this decision to function outside recommended practice.
Plurals
The question of singular or plural form as it relates to cheese was troublesome during thesaurus construction. Cheese can be considered both a count noun (3.5.1), as in "How many cheeses may I choose from?", and a noncount noun (3.5.2), as in "How much cheese do you want?". Both forms are used in the reviewed indexing languages, and user warrant could be cited in either case as the reason for selecting one form over the other. Consistency is very important in the thesaurus, so all terms must follow one form. Our decision to use the plural form stemmed from its usage in the resource Neal's Yard Dairy, a well-respected dairy in the UK, which was also mentioned frequently in the interview with the cheese experts at Les Amis du Fromage (personal communication, 14 February 2005). Due to its high standing in the British cheesemonger community and among CABACHI members, Neal's Yard Dairy was deemed a reliable authority for this decision.
Lexical Variants
Lexical variants are fairly rare in the thesaurus. Spelling variations were resolved by literary warrant (3.6.2.1), as in the case of 'Bleu' versus 'Blue'. This particular instance also raises the question of loanword usage (3.6.7.1). As French terminology is thoroughly a part of the cheese lexicon no matter what the dominant language may be, its inclusion among the terms is permissible.
Hyphenation
In some instances during the indexing language review, hyphenated forms of terms, such as semi-hard, were encountered. The Guidelines suggest that hyphens should be avoided as frequently as possible (3.7.2.2). In keeping with this, hyphens were rejected in favour of consolidated terms, for example: semihard. Common usage of this consolidated form in a large portion of reviewed indexing languages and sources further supported this decision.
RELATIONSHIP STRUCTURES
There are three relationship structures at work in the ABC Thesaurus: equivalence relationships, generic hierarchical relationships, and associative relationships.
Equivalence Relationships
The equivalence relationship is seen when two or more terms, one preferred and the other(s) nonpreferred, specify the same concept (5.2.1). Cross-referencing connects the terms to one another. The nonpreferred terms are linked to the preferred term with the designator USE, and the preferred term is linked to its nonpreferred terms with the designator UF (used for).
Preferred terms will be in bold and nonpreferred terms will be in italics.
Example:
Bath cheeses
    UF: Bath Soft cheeses
This indicates that the term Bath cheeses should be used in place of Bath Soft cheeses.
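As a minimal sketch of how the equivalence relationship might be held in software, the Python below stores USE references in a dictionary and derives the reciprocal UF references from it. The names `USE`, `preferred_term`, and `used_for` are illustrative assumptions and not part of the ABC Thesaurus; only the Bath cheeses pair is taken from the example above.

```python
# Hypothetical USE table: nonpreferred entry term -> preferred term.
# Only the Bath cheeses pair comes from the thesaurus example.
USE = {
    "Bath Soft cheeses": "Bath cheeses",
}


def preferred_term(term):
    """Return the preferred term for an entry term (itself if already preferred)."""
    return USE.get(term, term)


def used_for(preferred):
    """Return the sorted nonpreferred terms that point at a preferred term (UF)."""
    return sorted(entry for entry, target in USE.items() if target == preferred)
```

Storing only the USE direction and computing UF on demand is one way to keep the two halves of each cross-reference from drifting apart.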
Hierarchical Relationships
The hierarchical relationship identifies the levels (one up, and one down) of superordinate and subordinate descriptors as they relate to a given term. In this thesaurus the hierarchical relationship is of the generic type (5.3.1). Here, the superordinate, or broader term (BT), identifies the larger body of which the given term is a part. The subordinate, or narrower term (NT), is distinguished as an element of the given term.
Example:
Cheddar cheeses
    BT: Hard cheeses
    NT: Cheshire cheeses
This indicates that Cheddar cheeses are a type of Hard cheeses, and that Cheshire cheeses are a narrower type of Cheddar cheeses.
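The reciprocity of BT and NT can be sketched in Python as follows. This is an illustration under assumptions, not the way the ABC Thesaurus is actually stored: the `BROADER` table and the `narrower_index` helper are hypothetical names, and the links are limited to those in the example above.

```python
from collections import defaultdict

# Hypothetical BT table: term -> list of broader terms, from the example above.
BROADER = {
    "Cheddar cheeses": ["Hard cheeses"],
    "Cheshire cheeses": ["Cheddar cheeses"],
}


def narrower_index(broader):
    """Invert BT links so that every BT has a reciprocal NT."""
    narrower = defaultdict(list)
    for term, parents in broader.items():
        for parent in parents:
            narrower[parent].append(term)
    return {term: sorted(children) for term, children in narrower.items()}


# NT links derived from the BT table rather than entered by hand.
NARROWER = narrower_index(BROADER)
```

A term may list more than one broader term, which is why the table maps each term to a list.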
Associative Relationships
The associative relationship is identified in the Guidelines as the most difficult to define of all the relationships (5.4). It addresses terms that are related conceptually but not hierarchically under the same broad concept.
Example:
Cheddar cheeses
    BT: Hard cheeses
    BT: Pasteurized cheeses
    RT: Wellington cheeses

Wellington cheeses
    BT: Hard cheeses
    BT: Unpasteurized cheeses
    RT: Cheddar cheeses
In this example, Wellington cheeses and Cheddar cheeses are related because both are made using a similar technique and both are types of Hard cheeses. They differ in milk treatment: Cheddar cheeses are Pasteurized cheeses, while Wellington cheeses are Unpasteurized cheeses.
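Because RT links are expected to be reciprocal, a simple consistency check can be sketched in Python. The record layout (`ENTRIES`) and the helper names are assumptions made for illustration; the data is limited to the two entries in the example above.

```python
# Hypothetical records built from the example above.
ENTRIES = {
    "Cheddar cheeses": {
        "BT": {"Hard cheeses", "Pasteurized cheeses"},
        "RT": {"Wellington cheeses"},
    },
    "Wellington cheeses": {
        "BT": {"Hard cheeses", "Unpasteurized cheeses"},
        "RT": {"Cheddar cheeses"},
    },
}


def missing_reciprocals(entries):
    """List (a, b) pairs where a names b as RT but b does not name a."""
    return sorted(
        (a, b)
        for a, record in entries.items()
        for b in record["RT"]
        if a not in entries.get(b, {}).get("RT", set())
    )


def shared_broader_terms(entries, a, b):
    """Broader terms that two entries have in common."""
    return entries[a]["BT"] & entries[b]["BT"]
```

An empty result from `missing_reciprocals` means every RT link has its counterpart, and `shared_broader_terms` shows what the two related entries have in common (here, Hard cheeses).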
Precision
Precision retrieval devices operate best when excluding all terms that do not fit within the specialized scope of a search query. The presence of scope notes enhances precision, allowing CABACHI members to confirm appropriate term usage and relevant meaning (3.2.2). Normally, precision is sacrificed in a thesaurus where precoordinate terms have been developed as they have here. However, the scope of the ABC Thesaurus in its current state is narrow and the terms are optimized for the user group. Users will be searching for the preferred spelling and type designation of known cheeses, so precision will be high.
Recall
The various types of structural relationships enhance recall by providing cross references, as well as broader, narrower, and related terms. Each of these can be utilized to open up an inquiry and expand potential relevant results. That being said, the specialized nature of the ABC Thesaurus and the specific goals of its users already impose limitations on recall ability.
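For readers who want the two measures stated concretely, the short Python sketch below uses the standard definitions: precision is the share of retrieved terms that are relevant, and recall is the share of relevant terms that were retrieved. The function name and the sample search are hypothetical and do not describe any measurement made on the ABC Thesaurus.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, using set overlap."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: four terms came back, and two of the three
# relevant terms were among them.
retrieved = ["Cheddar cheeses", "Cheshire cheeses", "Bath cheeses", "Beenleigh Blue cheeses"]
relevant = ["Cheddar cheeses", "Cheshire cheeses", "Wellington cheeses"]
precision, recall = precision_recall(retrieved, relevant)
```

Broadening a query through BT, NT, or RT links tends to add both relevant and irrelevant terms to the retrieved set, which is why recall usually rises as precision falls.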
Specificity
The demands of the user group require specificity in the thesaurus with regard to the naming of the types of cheese, and this requirement has been satisfied. At the same time, specificity in describing the cheeses is not of particular importance to CABACHI. The purpose of the thesaurus is to ensure that CABACHI members complete importation forms consistently, thus ensuring a smooth transaction through customs. For this purpose, we have limited the descriptive terms to milk treatment (pasteurized or unpasteurized) and the texture or firmness of the cheeses. As the ABC Thesaurus expands, this specificity may need to include area of origin (Wales, Scotland, etc.) as well as type of milk (cow, sheep, or goat).
Exhaustivity
Exhaustivity refers to the number of different terms that can be applied to an item being indexed. In creating this thesaurus, GourmandeX has attempted to include as many different English cow's milk artisanal cheeses as possible. The inclusion of types of cheeses was based on the recipe and method of production used in making the cheese, and not on the name of the manufacturer. While a Blue Stilton can also be classified as a Semihard Blue cheese and a Pasteurized cheese, it cannot be a Cheddar. The thesaurus presents a good listing of terms for distinct cheeses as they fall under particular cheese types (e.g., hard) and milk treatments (e.g., pasteurized). All of the cheeses included are commonly imported by CABACHI users.
Pre-coordinate Headings
Most of the terms, both preferred and non-preferred, chosen for the ABC Thesaurus are pre-coordinate due to the nature of the items being indexed (cheeses). The term Beenleigh Blue cheeses cannot be separated into Beenleigh, Blue, and cheeses and still make sense, as the cheese type is Beenleigh Blue. It is hoped that a search function will be included in future versions of the ABC Thesaurus, in which case the user will have the option of conducting post-coordinate searching. Until then, access to the contents of the thesaurus has been enhanced by the use of pre-coordinate terms.
Computerized Retrieval Systems - Searching
The introduction of a search function to the thesaurus will provide users with retrieval capabilities that include truncation and key word searching, both of which have the potential to increase recall. Yet the usage of post-coordinated searching with pre-coordinated terms will enhance precision and lessen recall levels through a targeted search string. The ABC Thesaurus v.2, with its expansions into other milk types and additional regions in the United Kingdom, will benefit from these new search capabilities.
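The two search behaviours mentioned here can be sketched in Python. This is an assumption about how such a function might work, not a description of a planned implementation: `TERMS`, `truncation_search`, and `keyword_search` are hypothetical names, and the term list is a small sample of terms mentioned in this document.

```python
# Hypothetical term list drawn from terms mentioned in this document.
TERMS = [
    "Bath cheeses",
    "Beenleigh Blue cheeses",
    "Cheddar cheeses",
    "Cheshire cheeses",
    "Wellington cheeses",
]


def truncation_search(stem, terms=TERMS):
    """Right truncation: match any term containing a word that starts with the stem."""
    stem = stem.lower()
    return [t for t in terms if any(w.startswith(stem) for w in t.lower().split())]


def keyword_search(keywords, terms=TERMS):
    """Post-coordinate AND search: every keyword must appear as a word in the term."""
    wanted = {k.lower() for k in keywords}
    return [t for t in terms if wanted <= set(t.lower().split())]
```

A short stem such as "b" matches several terms and so raises recall, while combining keywords such as "blue" and "cheeses" narrows the result to a single pre-coordinate term and so raises precision.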