The general goal of this thesis is to increase the comparability, accuracy, and diagnostic benefit of sentence intelligibility tests within one language and to give suggestions for realizing comparable sentence intelligibility tests across languages. Most sentence intelligibility tests are composed either of meaningful (everyday) sentences or of syntactically fixed but semantically unpredictable sentences. Tests of the second type, which are based on the original Hagerman sentences (Hagerman, 1982) and the Oldenburg sentence test (Wagener et al., 1999c; Wagener et al., 1999a; Wagener et al., 1999b), were investigated in detail in this thesis. The development, optimization, and evaluation of the Danish DANTALE II test (Wagener et al., 2003), which closely resembles the Oldenburg sentence test, are presented. In order to test the comparability of this type of sentence intelligibility test across languages, the average speech reception threshold (SRT), the slope of the intelligibility function, and the spread across test lists were obtained and compared for these three tests. The only notable deviation is that the Danish test yields a lower intelligibility function slope than the other two tests; high comparability is maintained across test languages for most of the other parameters considered. To study further the influence of various test parameters on the expected outcome of the sentence test and on its comparability even within one language (i.e., German), a large number of tests were conducted with normal-hearing listeners using systematic parameter variation. This includes the use in quiet of test lists from sentence tests that were originally introduced and optimized for speech tests with interfering noise. In addition, measurement parameters such as noise presentation level, type of interfering noise, and type of presentation were varied with both normal-hearing and hearing-impaired subjects. Fluctuating interfering noises were found to differentiate best between different degrees of hearing loss.
Therefore, these noises were investigated in more detail. In order to better understand the mechanisms of speech perception in fluctuating noise, speech intelligibility in such noises was predicted with different approaches. The most successful approach models speech perception in fluctuating noise by first computing the expected intelligibilities of sub-word units at the respective signal-to-noise ratio, taking into account the context effects of the sub-word units. Then the word intelligibilities (or error probabilities, respectively) are computed by multiplying the respective error probabilities of the sub-word units. Finally, the sentence intelligibility is computed by averaging across the words. Taken together, the sentence tests and measurement procedures considered here, both experimentally and by means of theoretical models, appear to yield the highest practically achievable accuracy and comparability within and across languages. They might therefore help to harmonize speech audiometry across laboratories, clinics, and languages.
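
The three steps of the prediction approach summarized above (sub-word unit intelligibility, word intelligibility, sentence average) can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis. The logistic shape of the unit intelligibility function, the parameter values `SRT_DB` and `SLOPE_PER_DB`, and the treatment of context through an effective number of independent units `j` are all assumptions made for illustration. In particular, the sketch assumes that a word counts as correct only if all of its sub-word units are correct, so the per-unit recognition probabilities are multiplied and the word error probability is one minus that product; the abstract does not spell out the exact form of the product.

```python
import math

# Hypothetical placeholder parameters (not taken from the thesis).
SRT_DB = -7.0        # SNR at which a sub-word unit is 50 % intelligible
SLOPE_PER_DB = 0.15  # slope of the unit intelligibility function at the SRT


def unit_intelligibility(snr_db, srt_db=SRT_DB, slope=SLOPE_PER_DB):
    """Assumed logistic intelligibility function of one sub-word unit."""
    return 1.0 / (1.0 + math.exp(4.0 * slope * (srt_db - snr_db)))


def word_intelligibility(unit_snrs_db, j=None):
    """Probability that a whole word is understood.

    Assumption: a word is correct only if all of its units are correct,
    so the per-unit probabilities are multiplied. Context is modelled by
    an effective number of independent units j <= n, which reduces to
    the plain product when j equals the number of units n.
    """
    n = len(unit_snrs_db)
    j = n if j is None else j
    product = math.prod(unit_intelligibility(s) for s in unit_snrs_db)
    return product ** (j / n)


def sentence_intelligibility(words_unit_snrs_db, j=None):
    """Average of the word intelligibilities across the sentence."""
    scores = [word_intelligibility(w, j) for w in words_unit_snrs_db]
    return sum(scores) / len(scores)


# Fluctuating noise: the local SNR (in dB) differs from unit to unit.
sentence = [[-12.0, -3.0, 2.0], [-7.0, -7.0, -7.0]]
print(sentence_intelligibility(sentence))         # no context benefit
print(sentence_intelligibility(sentence, j=2.0))  # with context benefit
```

Evaluating the intelligibility per sub-word unit rather than at one long-term average signal-to-noise ratio is what lets such a model account for units that fall into the gaps of a fluctuating noise.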