Varieties of Middle Persian I: The Manichaean, Book Pahlavi and Inscriptional Scripts

Comparing Manichaean & Zoroastrian scripts

Thomas Benfey

The first of a series on the various types of Middle Persian, and the scripts in which Middle Persian was written.

Most people who study Middle Persian begin with Book Pahlavi. This is the Middle Persian of the “Pahlavi books,” works that Zoroastrian priests have composed, transmitted, and preserved in handwritten copies, and (more or less) continuously housed in libraries. Although most of these works were composed in the early Islamic (ca. 650-1000 CE) or even Sasanian (ca. 220-650 CE) periods, and retain many features characteristic of these earlier times, the earliest extant manuscripts date from many centuries later. The oldest extant manuscripts of the Book Pahlavi Bundahišn, for example, which is arguably the principal basis for modern scholars’ understanding of ancient and medieval Zoroastrian cosmology, were written in the late 14th century CE at the earliest.



Book Pahlavi script in a manuscript copy of the Middle Persian Ardā-Wīrāz Nāmag (Bayerische Staatsbibliothek Cod.Zend 51a; image from the Avestan Digital Archive)



There are other kinds of Middle Persian, though, for which much earlier hard evidence is available. Two of these varieties are named after the non-Zoroastrian religious communities by whom they were used: Manichaean Middle Persian and Christian Middle Persian. We have manuscripts in both Manichaean and Christian Middle Persian from no later than the tenth century, and in all likelihood dating from a substantially earlier period. Although the original texts were composed in Sasanian Iran, these manuscripts were all excavated further east: in the vicinity of Turfan, in what is now Xinjiang. In the Middle Ages this region was home to substantial communities of Christians and Manichaeans with roots in the Sasanian Empire, whose texts were preserved until modern times thanks to the favorable environmental conditions of the Taklamakan desert.

Aside from their specifically Christian and Manichaean content, and some more subtle grammatical, phonological, and stylistic differences, the most obvious feature that distinguishes these texts from the Book Pahlavi corpus is their scripts. Following a pattern familiar from the history of other languages like Arabic, New Persian, and Sogdian, in the Middle Persian case, too, different religious communities used different scripts to write (more or less) the same language. While the Book Pahlavi script is notoriously ambiguous, with identical characters potentially representing as many as five phonemes, the Christian and Manichaean scripts have more letters, and are accordingly much easier to read. All three originally stem from the Aramaic script.

The Manichaean writing system also differs substantially from that of Book Pahlavi in its orthography, here too tending toward greater readability. Like modern English, Book Pahlavi is very conservative and “historical” in its spelling, which therefore often does not correspond closely or even regularly to the way a given word would have been pronounced. Book Pahlavi, like the writing systems of other Middle Iranian languages such as Sogdian and Parthian, which likewise have roots in the Aramaic scribal traditions of the Achaemenid Empire (ca. 550-330 BCE), also has many “Arameograms”: words written as Aramaic, but pronounced as Persian.

An Arameogram’s spelling is completely disconnected from the pronunciation of the word it designates. The Book Pahlavi script has far more ambiguous letters than its early relative, the monumental script of the Sasanian inscriptions of the third and fourth centuries CE. The Arameograms and historical orthography characteristic of Book Pahlavi, however, are already present in these early inscriptions. Both of these features, Arameograms and historical orthography, are absent from the Manichaean script; in Manichaean Middle Persian words are largely spelled as they would have been pronounced.

Some real and hypothetical examples from English will help to illustrate each of these discrepancies between the Manichaean and Book Pahlavi scripts, and how they came about. First, we will look at historical spellings. To this day English has a word knight, fully half of whose written form consists of “silent” letters. These are vestiges of the word’s earlier pronunciation and spelling. Old English (spoken ca. 449-1150, with written records from the early eighth century CE) had a word cniht, meaning “boy,” which was pronounced as it was spelled: c and h, like all consonants in Old English, were both pronounced, the h as a voiceless palatal fricative, like the ch in German nicht. Later developments led to the orthographic shifts from c to k, and h to gh (as well as the semantic shift from “boy” to “military follower”), but the gh and k in what would become modern English knight only continued to be pronounced as consonants until roughly the fifteenth and eighteenth centuries, respectively. The decline of the voiceless palatal fricative, in particular (the sound represented by the gh in English knight), is illustrated by the variant spellings of the word in Middle English (spoken ca. 1150-1450 CE): alongside knyht, knygt, knyght, knight we also have knyth, knytt, knit. But the dialect of English adopted by Henry V’s (r. 1413-22 CE) chancery staff, so influential for the development of “standard English,” happened to be that of the East Midlands, where this voiceless palatal fricative, represented by gh, was still pronounced. These fifteenth-century administrators opted for knight, and this eventually fossilized as the correct spelling, which persisted even as all English speakers stopped pronouncing the word’s k and gh.[1] Examples of this kind could be multiplied, of course: modern English is full of such historical spellings, which have not corresponded to pronunciation for centuries.[2]

We can see a phenomenon much like this in the spelling of the word shahr, “realm, land” in Book Pahlavi: <štr'>.[3] Book Pahlavi, like most of the scripts of the premodern Middle East, was a consonantal one, so we have nothing directly marking the a of shahr, or the fact that the word contained no other vowels; this much had to be known by the reader. The <'>, meanwhile, simply represents the “word-final stroke,” which marks the end of many words in Book Pahlavi. But the really interesting thing here is the <t>, which does not correspond, in any straightforward way, to how the word was pronounced.



Manichaean Middle Persian fragment from Turfan, with portion of Mani’s Šabuhragān where he outlines Manichaeism’s superiority to the world’s other religions (M 5794r; from Turfanforschung database)


So why is shahr spelled <štr'> and not <šhr'> in Book Pahlavi? As with English knight, we need to go several centuries back in time to explain this. Although the antecedent of Middle Persian shahr in the Old Persian language (largely attested in the monumental Achaemenid inscriptions dating from the sixth to the fourth centuries BCE) is xšaçam, “realm” (with the x representing the voiceless velar fricative, pronounced like the ch in German doch; and ç representing some kind of sibilant, close to English s), Middle Persian is not the direct descendant of Old Persian in every respect. As a result of the influence of other Iranian languages, early Middle Persian, as it was developing between the fall of the Achaemenids (ca. 330 BCE) and the rise of the Sasanian Empire (ca. 226 CE), would have had a word something like *shathr, “realm, land.”[4] The earliest known Middle Persian spelling system, which would be the basis for the Book Pahlavi script, was developed when this word was still pronounced *shathr, and hence has a <t> representing the th sound. Although *shathr came to be pronounced shahr by the third century CE, the spelling system was never updated to reflect this phonological change—just as English knight continues to reflect a pronunciation that was already outdated five hundred years ago.

As for Arameograms, the analogies with English are less straightforward, although some hypothesizing will help to illustrate the situation. English has incorporated many foreign loanwords in the course of its history, with arguably the most important collection of foreign vocabulary coming from French, in the wake of the Norman conquest, in the eleventh century CE. Imagine that all of these French loanwords did not simply supplant, or persist alongside, their English synonyms, but rather that the new French vocabulary was entirely restricted to the written realm; words would often be spelled as if they were French, but pronounced as English. For example, the verb close, based on a French loanword, would not have entered the language as a synonym of the native English shut; shut would simply be frequently written <close>, but this new spelling would not correspond in any way to its pronunciation, which would remain shut. Similarly, French air would not have replaced Old English lyft; rather, people would have continued to say lyft, but it would now typically be written <air>. This is much like the situation in Book Pahlavi: many words, even the most common verbs and pronouns, are typically written as Aramaic, but they would have been pronounced as Persian. The word nām, “name,” for instance, is typically written not as <n'm> but as <ŠM>[5], corresponding to the Aramaic word for “name,” shem.

That the Aramaic and French influences on (respectively) Persian and English diverged so much has to do with the broader historical circumstances: French speakers successfully invaded England, and had a significant presence there for generations afterward, which had a decisive influence on the development of spoken and written English, whereas there was no Aramaic invasion of the Achaemenid Empire. Rather, the empire’s administrative staff initially wrote largely in Aramaic, but gradually shifted to writing the various Iranian languages in the Aramaic script, with certain words continuing to be spelled in Aramaic. Hence the Aramaic influence on Middle Persian, while very significant, was largely confined to the written realm.

In Manichaean Middle Persian, the words shahr, “land, realm,” and nām, “name,” are exclusively spelled <šhr> and <n'm>: as they were pronounced. The letters <n> and <r> look identical in Book Pahlavi, as do <h> and <'>, but in the Manichaean script these letters are distinct. This is the general pattern when we compare Manichaean Middle Persian to the varieties of Zoroastrian Middle Persian: not only Book Pahlavi, but, as we have discussed above, also the script used for the monumental Middle Persian inscriptions of the third and fourth centuries CE. Book Pahlavi has more ambiguous letters than the monumental script does, but its Arameograms and historical spellings continue an older writing tradition, which is also reflected in the monumental script. Manichaean texts, at least on the orthographic level, were far more straightforward to read, due to their far more phonetic spelling, and lack of Arameograms and ambiguous letters.
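
For readers who think in code, the contrast can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not a transliteration tool: the table and function names are hypothetical, and the data are limited to the two words and the two pairs of look-alike letters mentioned in this post.

```python
# Toy model of the two writing systems, using only examples from this post.
# All names are illustrative assumptions, not a real transliteration scheme.

# Book Pahlavi: the written form does not spell out the spoken word.
# Uppercase marks an Arameogram, following the scholarly convention.
BOOK_PAHLAVI = {
    "štr'": "shahr",  # historical spelling: <t> preserves older *shathr
    "ŠM": "nām",      # Arameogram: Aramaic shem, read as Persian nām
}

# Manichaean script: the same words, spelled as they were pronounced.
MANICHAEAN = {
    "šhr": "shahr",
    "n'm": "nām",
}

# Letter pairs that share a single shape in Book Pahlavi (per the post).
MERGED_SHAPES = [{"n", "r"}, {"h", "'"}]


def possible_letters(letter: str) -> set[str]:
    """Return every letter that a Book Pahlavi shape could stand for."""
    for group in MERGED_SHAPES:
        if letter in group:
            return set(group)
    return {letter}


def candidate_readings(spelling: str) -> list[str]:
    """List all letter sequences that would look the same on the page."""
    readings = [""]
    for letter in spelling:
        options = sorted(possible_letters(letter))
        readings = [prefix + option for prefix in readings for option in options]
    return readings


def is_arameogram(spelling: str) -> bool:
    """Arameograms are transliterated in capitals; others in lowercase."""
    return spelling.isupper()
```

In this sketch, a phonetic spelling such as <n'm>, if it were written with Book Pahlavi letter shapes, would be indistinguishable from three other letter sequences (`candidate_readings("n'm")` lists four in total), while the usual Book Pahlavi form <ŠM> has to be looked up as a whole because nothing in it points to the sound of nām. In the Manichaean table the spelling and the pronunciation line up directly.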


Why did the Manichaeans, and perhaps Manichaeism’s founder Mani in particular, adopt a new script in the third century CE? This new script was used not only for Manichaean texts in Middle Persian, but also for Manichaean texts in the other Middle Iranian languages Sogdian and Parthian. These languages’ existing writing systems were rooted in the Aramaic scribal traditions of the Achaemenid Empire too, and employed historical spellings and Arameograms analogous to those in the Middle Persian system which was the basis for both the early monumental inscriptions and Book Pahlavi. And just like Manichaean Middle Persian, Manichaean Parthian and Manichaean Sogdian also largely dispense with these historical spellings and Arameograms.


As argued by Zsuzsanna Gulácsi, possibly Mani’s personal crystal seal, with Aramaic inscription reading “Mani, apostle of Jesus Christ” running counter-clockwise around it (image from Gulácsi 2013)

Here, too, the history of the English language may furnish a revealing parallel. In the 1850s, the Mormons of Utah, under the leadership of Brigham Young, adopted a new writing system, which would be known as the Deseret Alphabet.[6] In this system, as in that of the Manichaeans, words were spelled entirely according to how they were pronounced; historical spellings and their associated “silent” letters were dispensed with.

Historians differ on why exactly the Mormons tried to introduce this system, but two aspects of their motivations have likely resonances with the Manichaean case. First, proselytization was a major concern for the Mormons, and they especially wished to reach new immigrants to the United States and Native Americans, who would not necessarily have had a firm command of the difficult English writing system to begin with. It was thought that teaching these people new to the English language the Deseret Alphabet, and providing them with Mormon texts written in this script, would greatly simplify their passage into the Mormon community. Mani had similar ambitions: to circulate his writings as widely as possible, and bring as many new converts into the fold as he could. This goal likely manifested itself in his religion’s new script, which was far more readable than the existing alternatives.

Second, the Mormons’ ambitions were broader than “just the creation of another Christian church,” and expanding the number of its followers. Rather, as their leader Brigham Young put it, “we will continue to improve the whole science of truth; for that is our business; our religion circumscribes all things, and we should be prepared to take hold of whatever will be a benefit and blessing to us.” The Deseret Alphabet, accordingly, was not only a means to the narrow end of growing the Mormon community; it was also meant to be a contribution to all of humanity. Corresponding with the all-encompassing, all-superseding nature of the Mormon faith, which “circumscribes all things,” the Deseret Alphabet, it was thought, would eventually be universally adopted. Here too, we have a resonance with Manichaean history.[7]



The Sermon on the Mount written in the Deseret alphabet, along with a table giving the letters’ Roman script equivalents, as published in the February 16, 1859 edition of the Deseret News (from Wikipedia)


Mani spoke of Manichaeism in similar terms to those used by Brigham Young for Mormonism. Mani lists ten respects in which his religion is superior to those which preceded it, including, first and foremost, its universal character: while the previous religions “were in one country and one language... my religion is such that it will be manifest in every country and in every language, and it will be taught in distant countries.” Hence, aside from the immediate practical aim of making as many new Manichaean converts as possible, Mani’s invention of a new and improved alphabet, more broadly applicable than what had preceded it, may well have had an ideological basis, in his aspirations to found a truly universal religion.



BeDuhn, Jason. 2015. “Mani and the Crystallization of the Concept of ‘Religion’ in Third Century Iran.” In Mani at the Court of the Persian Kings: Studies on the Chester Beatty Kephalaia Codex, edited by Iain Gardner, Jason BeDuhn, and Paul Dilley, 247–75. Leiden: Brill.

Denison, David, and Richard M. Hogg. 2010. A History of the English Language. Cambridge: Cambridge University Press.

Gulácsi, Zsuzsanna. 2013. “The Crystal Seal of ‘Mani, the Apostle of Jesus Christ’ in the Bibliothèque Nationale de France.” In Manichaean Texts in Syriac, 245–67, + Plates 16-23. Turnhout: Brepols.

Henning, W. B. 1958. Mitteliranisch. Leiden: Brill.

Herzfeld, Ernst. 1924. Paikuli: Monument and Inscription of the Early History of the Sasanian Empire. Berlin: Dietrich Reimer.

Moore, Richard G. 2006. “The Deseret Alphabet Experiment.” The Religious Educator 7 (3): 62–77.

Skjærvø, Prods Oktor. 1995. “Aramaic in Iran.” Aram 7: 283–318.

———. 1996. “Aramaic Scripts for Iranian Languages.” In The World’s Writing Systems, edited by Peter Daniels and William Bright, 515–35. New York; Oxford: Oxford University Press.

Sundermann, Werner. 1985. “Schriftsysteme und Alphabete im alten Iran.” Altorientalische Forschungen 12 (1–2): 101–13.


[1] Not to mention the Great Vowel Shift of the 15th-17th centuries CE, which led to the vowel i shifting from representing the sound [i:] (“ee” as in “reed”), to [aɪ] (as in the pronoun “I”).

[2] There are also pseudo-historical spellings, such as sleigh, whose purely cosmetic gh corresponds to nothing that was ever pronounced; for pseudo-historical spellings, too, there are analogues in Book Pahlavi orthography.

[3] Generally I will put spellings inside angled brackets < >, to distinguish spelling from pronunciation. <š> as a letter corresponds to the consonant written “sh” in English.

[4] Rather than the *shas one would expect, if the Middle Persian word were the direct descendant of Old Persian xšaçam. Perhaps *shas existed alongside *shathr for a time, like the various forms of Middle English knight with and without the voiceless palatal fricative we saw earlier, before ultimately being replaced by *shathr.

[5] According to modern scholarly conventions, Arameograms are written in capital letters.

[6] Named after the University of Deseret, where it was developed, which would eventually become the University of Utah. Deseret was the provisional state proposed by the Mormon community, which encompassed a large swath of what would become the western United States: not only most of the territory of modern-day Utah, but also substantial portions of what are now Nevada, California, Arizona, New Mexico, Colorado, Idaho, Wyoming, and Oregon.

[7] Quotations from Richard G. Moore, “The Deseret Alphabet Experiment” (Religious Educator 7.3, 2006).