Many attempts at the mechanical recognition of speech sound have been made, and yet many problems remain unsolved. Though the analysis method to be used is common to all languages, the difficulty of recognition depends largely on the particular language. The Japanese language is considered easier to process than English; for example, it has rather fewer cardinal vowels, a simple monosyllable structure, and a one-to-one correspondence between these monosyllables and the Japanese alphabet. In Japanese speech sound, the elemental units of articulation are some one hundred monosyllables, each of which consists of one consonant and one following vowel (some include a semivowel, but no diphthong). Conversational speech sound is principally regarded as the successive utterance of these monosyllables, with some modification by influences from the preceding and following sounds. From this point of view, we first tried to make a mechanical recognizer of Japanese monosyllables, and then to extend it to handle conversational speech sound by adding some functions to it. In this paper, the outline of the research model of the phonetic typewriter, which operates on Japanese monosyllables, is described.

Speech sound has to be processed not only from the acoustical standpoint but also from the linguistic one. According to the relations between the two, many systems can be conceived: for instance, one which makes linguistic recognition directly from the acoustically analyzed data, and another which, after recognizing phonemes or monosyllables from the acoustical data, applies to them linguistic information on syntax and redundancy. The machine presented here takes the method of monosyllable discrimination at the acoustical level. The outline is as follows.

(1) This machine accepts as input Japanese monosyllables articulated separately from each other.
(2) Analysis is made by distinctive feature extraction and by zero-crossing interval analysis, which correspond to the manner of articulation and the place of articulation in utterance, respectively. From the result of the distinctive feature extractor, a rough classification of input monosyllables is carried out, such as pure vowel, unvoiced, voiced, nasal, stop, and contracted. From the zero-crossing interval analyzer, channel outputs are obtained, classified by the zero-crossing time interval of each rectangular wave.

(3) All the control signals are derived from the input speech sound wave itself.

(4) Discrimination of the vowel part and the consonant part is made by separate circuits, using digital techniques such as order pulses, AND gates, OR gates, and binary circuits. For the consonant part, each channel output of the zero-crossing wave analyzer is two-valued against a threshold level decided beforehand from statistical data.

(5) All the obtained results are retained in register memories and gathered in the main matrix, where the final decision on the input monosyllable is performed.

(6) The results are indicated by the lamp indicator and also sent to a high-speed puncher or printer through a code converter.

These operations are carried out in real time, and the final output is generated upon detection of the end point of the speech sound. Fig. 1 shows the over-all block diagram of the phonetic typewriter, which is subdivided into the following parts:

(1) INPUT PART amplifies the speech sound to a proper level and feeds it into each circuit of the analyzing part.

(2) ANALYZING PART performs phoneme classification and analyzes the vowel part and the consonant part.

(3) JUDGING PART decides the phoneme of the vowel part and that of the consonant part, and then obtains the final result for the whole monosyllable.

(4) OUTPUT PART generates coded signals to control the puncher and printer, and also resets the whole circuit in preparation for the next input.
[Figure omitted] Next, on the basis of the results of the above-mentioned machine, an effort to translate conversational speech into machine codes is presented in the latter part of this paper. By adapting the machine to continuous analysis and using a display device, a three-dimensional pattern of the distribution of the zero-crossing wave analysis in the f1-f2 domain versus the time axis was visualized on an oscilloscope. Fig. 8, Fig. 9, and Fig. 10 show examples of such patterns for monosyllables, for words which involve typical cases to be taken into account in conversational speech, and for spoken digits in Japanese and English, respectively.
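The continuous analysis behind such a display amounts to repeating the zero-crossing channel analysis frame by frame along the time axis. The following sketch, again assuming a digitized waveform rather than the original analogue circuits, builds such a channels-versus-time pattern; the frame length (20 ms) and the single channel edge (0.5 ms) are assumed values chosen only to make the example visible, not parameters of the machine.

```python
import math

def interval_channels(frame, sample_rate, edges):
    """Channel counts of zero-crossing intervals within one frame."""
    counts = [0] * (len(edges) + 1)
    last = None
    for i in range(1, len(frame)):
        if (frame[i - 1] < 0) != (frame[i] < 0):
            if last is not None:
                t = (i - last) / sample_rate
                for k, edge in enumerate(edges):
                    if t < edge:
                        counts[k] += 1
                        break
                else:
                    counts[-1] += 1
            last = i
    return counts

def time_pattern(samples, sample_rate, frame_len, edges):
    """Analyze consecutive frames, giving a channels-versus-time
    pattern analogous to the displayed distribution."""
    return [interval_channels(samples[p:p + frame_len], sample_rate, edges)
            for p in range(0, len(samples) - frame_len + 1, frame_len)]

# Illustrative signal: 0.1 s of 500 Hz followed by 0.1 s of 2000 Hz.
# With one channel edge at 0.5 ms, the long intervals of the 500 Hz
# half land in the upper channel and the short intervals of the
# 2000 Hz half in the lower one, so the pattern shifts over time.
sr = 8000
signal = ([math.sin(2 * math.pi * 500 * i / sr) for i in range(800)]
          + [math.sin(2 * math.pi * 2000 * i / sr) for i in range(800)])
pattern = time_pattern(signal, sr, 160, [0.0005])
```

Plotting each row of `pattern` as one vertical slice reproduces, in miniature, the kind of distribution-versus-time picture that was visualized on the oscilloscope.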