Language is a semiotic system. Its units fall into classes, so they can combine into different recognizable patterns. The task of the analyst, as the American structuralists of the middle third of the 20th century had it, is “segmentation and classification”. This means:

  1. Segment the spoken chain into sequential units down to the smallest units.
  2. Classify these into classes of linguistic units.

In the second step, the analyst's task is to find those classes which actually underlie the patterns according to which messages are formed. By its double articulation, language consists of distinctive and significative units. Before the advent of structural linguistics, roughly with Saussure 1916, it was thought that distinctive units are defined by phonetic properties and significative units by semantic properties. For instance, a class of sounds was defined as labiodental by virtue of their point of articulation, and a class of words was defined as adjectives on the basis of their designating a property.

This does not work, since neither the phonetic properties of a sound nor the semantic properties of a word determine their regular patterning in the formation of messages. Several structuralist schools opted for the opposite approach to linguistic classes. They distinguished between linguistic form and substance and postulated an analysis of linguistic form which completely disregards linguistic substance and, with it, any positive properties of the units to be classified. As the sole criterion for constituting a linguistic class, they took the distribution of units in the chain formed by a message. The distribution of a linguistic element is the class of contexts in which it occurs. For instance, a particular class of phonemes was defined as occurring in the first position of a syllable, and a particular class of words was defined as being preceded by a definite article. This does not work either, because the contexts constituting the distribution must be given beforehand. There are, however, too few fixed points given a priori (principally the start and end of an utterance and possible pauses in between) that can constitute a defining context. All other contexts are constituted by linguistic units, whose distributional class must then itself already be given. This leads to circularity. In the second half of the twentieth century, this was seen clearly by functional structuralists:

However, each venture to reduce language to its ultimate invariants, by means of a mere analysis of their distribution in the text and with no reference to their empirical correlates, is condemned to failure. (Jakobson & Halle 1971, 2nd ed.: 26f)

The point here is that significative and distinctive units alike have more than a formal side which can be reduced to their distribution. They also have a functional side interfacing with linguistic substance as they function in human cognition and communication. Significative units have meaning, and distinctive units have physical properties.
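To make the notion of distribution and the circularity it runs into more concrete, the following is a minimal sketch in Python. It assumes a tiny, already segmented toy corpus; the function name `distributions`, the boundary symbol `#` and the corpus itself are illustrative assumptions, not part of any established procedure.

```python
from collections import defaultdict

def distributions(utterances):
    """Map each unit to the set of (left, right) contexts in which it occurs."""
    contexts = defaultdict(set)
    for utterance in utterances:
        # '#' marks the utterance boundary, one of the few contexts given a priori.
        padded = ["#"] + list(utterance) + ["#"]
        for i in range(1, len(padded) - 1):
            contexts[padded[i]].add((padded[i - 1], padded[i + 1]))
    return dict(contexts)

# Hypothetical toy corpus of already segmented utterances.
corpus = [
    ["the", "dog", "sleeps"],
    ["the", "cat", "sleeps"],
    ["a", "dog", "barks"],
]

dist = distributions(corpus)
print(dist["dog"])  # contexts of "dog"
print(dist["cat"])  # contexts of "cat"
```

Apart from `#`, every context in the result is itself made up of units such as "the" or "sleeps". To turn these contexts into a class definition ("preceded by a definite article"), the class of "the" would already have to be known, which is the circularity described above.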

As a result, the categories constituting linguistic classes are “hybrid”: they are neither purely formal (or structural) nor purely semantic or phonetic categories, but a combination of both. Thus, an adjective is a word which modifies a nominal expression and whose prototypical members designate a property; and a vowel is a sound which can constitute a syllable and which has at least a certain minimum degree of buccal aperture. The same goes, importantly, for grammatical categories. For instance, a case is a morphological property of a nominal expression which indicates its semantosyntactic relation to its dependency head. Such a definition combines formal and semantic properties of the class of significative units in question.

A genuine linguistic class is a class of elements which function in a linguistic rule or process by constituting its input, output or conditioning context. Obstruents in German phonology form such a class since they are the sounds which undergo syllable-final devoicing. On the other hand, the phonemes of the set /s a p/ share no phonetic or phonological property and consequently do not constitute the input, output or conditioning context of any phonological process. In phonology, genuine linguistic classes are called natural classes. A non-natural class is simply a set of linguistic units.
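The contrast between a natural class and a mere set can be illustrated with a small sketch in Python. The feature table below is a hypothetical, heavily simplified inventory (it ignores vowel length, among other things), and the function names are illustrative assumptions. A set counts as natural here if the feature values shared by its members pick out exactly that set within the inventory.

```python
# Hypothetical, simplified feature table; not a full analysis of German.
FEATURES = {
    "p": {"sonorant": False, "voiced": False, "continuant": False, "place": "labial"},
    "b": {"sonorant": False, "voiced": True,  "continuant": False, "place": "labial"},
    "t": {"sonorant": False, "voiced": False, "continuant": False, "place": "coronal"},
    "d": {"sonorant": False, "voiced": True,  "continuant": False, "place": "coronal"},
    "k": {"sonorant": False, "voiced": False, "continuant": False, "place": "dorsal"},
    "g": {"sonorant": False, "voiced": True,  "continuant": False, "place": "dorsal"},
    "s": {"sonorant": False, "voiced": False, "continuant": True,  "place": "coronal"},
    "z": {"sonorant": False, "voiced": True,  "continuant": True,  "place": "coronal"},
    "n": {"sonorant": True,  "voiced": True,  "continuant": False, "place": "coronal"},
    "a": {"sonorant": True,  "voiced": True,  "continuant": True,  "place": None},
}

def is_natural_class(segments):
    """True if the features shared by all segments pick out exactly this set."""
    segments = set(segments)
    first = FEATURES[next(iter(segments))]
    shared = {k: v for k, v in first.items()
              if all(FEATURES[s][k] == v for s in segments)}
    matching = {s for s, f in FEATURES.items()
                if all(f[k] == v for k, v in shared.items())}
    return matching == segments

def devoice(segment):
    """Replace a voiced obstruent by its voiceless counterpart."""
    feats = FEATURES[segment]
    if feats["sonorant"] or not feats["voiced"]:
        return segment  # not in the input class of the rule
    target = {**feats, "voiced": False}
    return next(s for s, f in FEATURES.items() if f == target)

def final_devoicing(syllables):
    """Apply devoicing to the last segment of each (non-empty) syllable."""
    return [syl[:-1] + [devoice(syl[-1])] for syl in syllables]

obstruents = {s for s, f in FEATURES.items() if not f["sonorant"]}
print(is_natural_class(obstruents))         # True
print(is_natural_class({"s", "a", "p"}))    # False
print(final_devoicing([["b", "a", "d"]]))   # [['b', 'a', 't']]
```

In this sketch the obstruents are defined by a single shared feature and serve as the input of the devoicing rule, whereas /s a p/ share no feature value and so cannot be referred to by any rule stated over features.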