Computer-produced mapping of dialectal variation

Authors :: Gerald M. Rubin
Source :: Computers and the Humanities. 4:241-246
Publication Year :: 1970
Publisher :: Springer Science and Business Media LLC, 1970.
Abstract: It has become apparent to some dialectologists that dialectology, especially in its interpretive phase, is a branch of linguistics particularly adapted to the use of computers. The dialectologist typically deals with large bodies of data, usually in the form of single words and short phrases, and he is interested in sorting and comparing individual items on many bases: phonological, morphological, lexical, and geographical. The major obstacle that has prevented widespread use of computers in dialect study is that the data for most of the great dialect surveys have been collected, recorded, and in most cases edited prior to the computer age. Thus the problem of preparing large bodies of data, much of it in narrow phonetic transcription, for computer use has been formidable.

One of the aims of our research was to determine whether results can be obtained relatively easily by computerized sorting and mapping that would take endless hours by traditional methods. Accordingly we sought a problem complex enough to reveal the advantages of computerized dialectology while at the same time involving a body of data small enough to be quickly prepared.

We turned to the published volumes of the Survey of English Dialects, which embody carefully controlled data, collected with professional skill and presented in convenient tabular form. And since one of the two areas covered by the volumes in print at the time the study was undertaken (May 1969) was the south of England, the problem of the voicing of initial fricatives in the southwest naturally suggested itself. This problem had the further advantage, for our purposes, of dealing with consonants (simpler than vowels in most varieties of English) in initial position, hence easily sorted and examined. The selection of this problem proved to be a happy one.

A total of 75 locations in the ten southernmost counties of England were reported on by the survey. These were identified by a four-digit number, the first two digits indicating the county and the remaining two the locality. Thus 3906 stands for Burley, the sixth locality listed in Hampshire, the county numbered 39. Our data consisted of 68 words for each of the 75 localities. Thus our corpus had 5100 (68 x 75) items. A coding system was devised which preserved all significant features of the phonetic transcription. The words were transcribed in this code directly onto cards for the guidance of the keypuncher, who then punched the
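
The four-digit locality numbering and the corpus arithmetic described above can be illustrated with a short modern sketch. It is not the authors' program (the 1970 study used punched cards), and the names `LocalityId` and `parse_locality_id` are hypothetical, introduced only for this example.

```python
from typing import NamedTuple


class LocalityId(NamedTuple):
    """Hypothetical container for a decoded survey location number."""
    county: int    # first two digits, e.g. 39 for Hampshire
    locality: int  # last two digits, position of the locality within the county


def parse_locality_id(code: str) -> LocalityId:
    """Split a four-digit location number into county and locality parts."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected a four-digit number, got {code!r}")
    return LocalityId(county=int(code[:2]), locality=int(code[2:]))


# Example from the abstract: 3906 is the sixth locality in county 39.
burley = parse_locality_id("3906")

# Corpus size stated in the abstract: 68 words at each of 75 localities.
N_WORDS = 68
N_LOCALITIES = 75
corpus_size = N_WORDS * N_LOCALITIES
```

Here `burley` decodes to county 39 and locality 6, and `corpus_size` reproduces the 5100 items the abstract reports.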