word


Given a set Σ, a word (or a string) over Σ is a juxtaposition (variously called concatenation or multiplication) of a finite number of elements of Σ. Juxtaposition is taken to be an associative binary operation on words over Σ. A word with no elements is called the empty word, typically denoted by λ or ϵ. The set of words over Σ is denoted Σ*.

Examples.

  1. If Σ={a,b,c,…,x,y,z}, the English alphabet written in lower case, then “good”, “mathematics”, “fasluiwh” are all words (without the double quotes) over Σ, whereas “PlanetMath” is not, because it contains upper case letters, which are not in Σ.

  2. Let Σ={0,1,2,3,4,5,6,7,8,9,+,=}. Then “12”, “0345”, “9+3”, “87=123”, “++231++”, “6+7=13”, “7=” are also words over Σ.

  3. The notion of words is used extensively in group theory. The juxtaposition here is the group multiplication, which is associative. That is, if $g_1, g_2, \ldots, g_m$ are elements of a group $G$, then we can form the word $w = g_1 g_2 \cdots g_m \in G$. For example, in the free group $\langle a, b \mid\ \rangle$, a word could be the commutator $[a,b] = aba^{-1}b^{-1}$.
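The first two examples can be checked mechanically. The following is a minimal sketch, assuming each character of a Python string is treated as one symbol of Σ; the function name `is_word_over` is illustrative and not part of the entry.

```python
import string


def is_word_over(w, alphabet):
    """Return True if every symbol of the string w belongs to alphabet."""
    return all(symbol in alphabet for symbol in w)


# Assumed encodings of the two alphabets from the examples.
LOWERCASE = set(string.ascii_lowercase)
ARITHMETIC = set("0123456789+=")

print(is_word_over("fasluiwh", LOWERCASE))    # a word over the lower case alphabet
print(is_word_over("PlanetMath", LOWERCASE))  # not a word: contains 'P' and 'M'
print(is_word_over("++231++", ARITHMETIC))    # a word, even though it is not meaningful
```

Note that the empty string passes the check for any alphabet, which matches the convention that the empty word belongs to Σ*.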

Remarks

  • Σ* is a monoid with juxtaposition as the monoid multiplication, and λ, the empty word, as the multiplicative identity.

  • For any element a in Σ, define $a^0 = \lambda$, and $a^{n+1} = a^n a$ for non-negative integers n. Then $a^{n+m} = a^n a^m$, since juxtaposition is associative.

  • Words, by definition, are finite in length. This notion can be generalized: an infinite word, or more precisely, an ω-word, over an alphabet Σ is just a function from $\mathbb{N}$ to Σ. The set of all words over Σ, finite or infinite, is $\Sigma^* \cup \Sigma^{\mathbb{N}}$, and is denoted by $\Sigma^\infty$ or $\Sigma^\omega$.
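The monoid structure and the power notation for finite words can be illustrated with Python strings, where `+` plays the role of juxtaposition and `""` the role of λ. This is only a sketch under that assumed encoding; `power` is a hypothetical helper that follows the recursive definition of the powers of a.

```python
EMPTY = ""  # stands in for the empty word (lambda)


def power(a, n):
    """Return a^n, built from a^0 = EMPTY and a^(n+1) = a^n a."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    result = EMPTY
    for _ in range(n):
        result = result + a  # one step of a^(k+1) = a^k a
    return result


u, v, w = "ab", "c", "ba"
print((u + v) + w == u + (v + w))                     # associativity
print(EMPTY + u == u == u + EMPTY)                    # the empty word is the identity
print(power("a", 2 + 3) == power("a", 2) + power("a", 3))  # a^(n+m) = a^n a^m
```

Infinite words are not covered by this sketch, since a Python string is always finite.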

Subwords

A word u is called a subword of v if v=xuy for some words x and y (either of which may be empty). If u is a subword of v, we also say that u occurs in v, or that v contains u. For example, “math” is a subword of “mathematics”.

Given the equation v=xuy, we call the triple (x,u,y) an occurrence of u in v. The collection of occurrences of u in v is denoted O(u,v). The number of occurrences of u in v is defined as the cardinality of O(u,v), and is written $|u|_v$. The position of an occurrence (x,u,y) of u in v is the length of x plus 1.

For example, the number of occurrences of the subword $a^3$ in $a^3ba^5c$ is 4, since

$$O(a^3, a^3ba^5c) = \{(\lambda, a^3, ba^5c),\ (a^3b, a^3, a^2c),\ (a^3ba, a^3, ac),\ (a^3ba^2, a^3, c)\}.$$

The positions of these occurrences are 1,5,6, and 7, respectively.
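The definitions of occurrence and position translate directly into a short search over all ways of splitting v. The sketch below assumes words are encoded as Python strings; the names `occurrences` and `positions` are illustrative only.

```python
def occurrences(u, v):
    """Return every triple (x, u, y) with v == x + u + y, in order of position."""
    return [
        (v[:i], u, v[i + len(u):])
        for i in range(len(v) - len(u) + 1)
        if v[i:i + len(u)] == u
    ]


def positions(u, v):
    """Return the position (length of x plus 1) of each occurrence of u in v."""
    return [len(x) + 1 for (x, _, _) in occurrences(u, v)]


v = "aaa" + "b" + "aaaaa" + "c"   # encodes a^3 b a^5 c
print(occurrences("aaa", v))
print(len(occurrences("aaa", v)))  # 4
print(positions("aaa", v))         # [1, 5, 6, 7]
```

With this encoding, u is a subword of v exactly when `occurrences(u, v)` is non-empty, which agrees with Python's built-in test `u in v`.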

Generating Words using Rules

Some of the words in the second example above, such as “++231++” and “7=”, do not make any mathematical sense. The way to define words that make sense is through a process called definition by recursion. First, we declare that certain words over Σ are sensible. Then, we give a set of rules, or a grammar, that dictates how new sensible words can be formed from the old ones. Any word that can be formed from the old words by these rules in a finite number of steps is called sensible.

In the second example, we could declare that all symbols 0,1,…,9 are sensible words. To form new sensible words, we have the rules:

  1. if a,b are sensible words that contain neither + nor =, then ab is a sensible word;

  2. if two sensible words a,b do not contain the symbol =, then a+b and a=b are sensible words;

  3. the only sensible words are the initially declared sensible words and those that can be formed by the previous two rules.
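The recursive definition can be made concrete by repeatedly applying the first two rules until nothing new appears. Because the full set of sensible words is infinite, the sketch below assumes a length bound `max_len` and, for the demonstration, a reduced set of initial words; both the bound and the function name `sensible_words` are assumptions made for illustration.

```python
def sensible_words(initial, max_len):
    """Close `initial` under rules 1 and 2, keeping only words of length <= max_len."""
    words = set(initial)
    while True:
        new = set()
        for a in words:
            for b in words:
                if "=" in a or "=" in b:
                    continue  # neither rule applies to words containing '='
                candidates = [a + "+" + b, a + "=" + b]  # rule 2
                if "+" not in a and "+" not in b:
                    candidates.append(a + b)             # rule 1
                for w in candidates:
                    if len(w) <= max_len and w not in words:
                        new.add(w)
        if not new:
            return words  # rule 3: nothing else is sensible
        words |= new


small = sensible_words({"1", "2"}, 5)
print("1+1=2" in small)   # formed by rule 2 applied twice
print("12+21" in small)   # rule 1, then rule 2
print("1=" in small)      # never produced by the rules
```

The loop terminates because only finitely many words of bounded length exist over a finite set of symbols.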

It is not hard to see that every sensible word obtained from the initially declared sensible words by these rules has one of the forms

  • a

  • $a_1 + a_2 + \cdots + a_n$

  • $a_1 + a_2 + \cdots + a_n = b_1 + b_2 + \cdots + b_m$,

where $a, a_i, b_j$ are nonempty words over Σ without any occurrence of + or =. As a result, we see that all words in the second example are sensible (whether they are right or wrong), except “++231++” and “7=”, since they are not in any one of the forms specified above. Note that the third rule above ensures that “++231++” and “7=” are not sensible. Without it, we would be unable to say for sure if these words are sensible or not.
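The three forms above amount to a simple pattern: one or more digit strings joined by +, optionally followed by = and another such sum. As a sketch, this can be tested with a regular expression; the pattern and the name `is_sensible` are one possible encoding, assuming words are Python strings over the symbols 0 to 9, + and =.

```python
import re

# A sum is a non-empty digit string followed by zero or more "+digits" parts.
SUM = r"[0-9]+(?:\+[0-9]+)*"
SENSIBLE = re.compile(rf"{SUM}(?:={SUM})?")


def is_sensible(word):
    """Return True if the whole word matches one of the three forms."""
    return SENSIBLE.fullmatch(word) is not None


for w in ["12", "0345", "9+3", "87=123", "++231++", "6+7=13", "7="]:
    print(w, is_sensible(w))
```

As in the text, the check is purely about form: “87=123” is accepted even though the equation it states is false.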

Generally, any collection of words is called a language. The collection of all sensible words described above is called the language generated by 0,…,9 under the rules above. In logic, one calls these sensible words well-formed formulas, or formulas or wffs for short.

Title word
Canonical name Word
Date of creation 2013-03-22 16:04:44
Last modified on 2013-03-22 16:04:44
Owner juanman (12619)
Last modified by juanman (12619)
Numerical id 31
Author juanman (12619)
Entry type Definition
Classification msc 03B10
Classification msc 03B05
Classification msc 03B65
Classification msc 03B99
Classification msc 03D40
Classification msc 08A99
Classification msc 20A05
Classification msc 20-00
Synonym well-formed formula
Synonym wff
Synonym infinite word
Related topic Group
Related topic StraightLineProgram
Related topic Alphabet
Related topic Language
Related topic Concatenation
Related topic AlternativeTreatmentOfConcatenation
Related topic FreeSemigroup
Defines empty word
Defines well formed formula
Defines subword
Defines occur in
Defines occurrence
Defines ω-word