The Naibbe cipher: a substitution cipher that encrypts Latin and Italian as Voynich Manuscript-like ciphertext

Abstract

In this article, I investigate the hypothesis that the Voynich Manuscript (MS 408, Yale University Beinecke Library) is compatible with being a ciphertext by attempting to develop a historically plausible cipher that can replicate the manuscript’s unusual properties. The resulting cipher—a verbose homophonic substitution cipher I call the Naibbe cipher—can be done entirely by hand with 15th-century materials, and when it encrypts a wide range of Latin and Italian plaintexts, the resulting ciphertexts remain fully decipherable and also reliably reproduce many key statistical properties of the Voynich Manuscript at once. My results suggest that the so-called “ciphertext hypothesis” for the Voynich Manuscript remains viable, while also placing constraints on plausible substitution cipher structures.
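To make the term concrete, here is a minimal sketch of a generic verbose homophonic substitution cipher: "verbose" because each plaintext letter becomes several ciphertext glyphs, "homophonic" because each letter has more than one possible encoding. It is not the Naibbe cipher itself, whose tables and procedure are not reproduced in this abstract. The glyph inventory, the alphabet, the two-glyph token length, and the function names `build_table`, `encrypt`, and `decrypt` are all assumptions made for illustration.

```python
import random

# Hypothetical parameters, chosen only for illustration (not from the article).
GLYPHS = "odaiylkc"                      # assumed ciphertext glyph inventory
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed plaintext alphabet
TOKEN_LEN = 2                            # verbose: one letter -> two glyphs
HOMOPHONES = 2                           # homophonic: two options per letter


def build_table():
    """Assign each plaintext letter its own disjoint set of glyph tokens."""
    tokens = [a + b for a in GLYPHS for b in GLYPHS]  # 8 * 8 = 64 tokens
    if len(ALPHABET) * HOMOPHONES > len(tokens):
        raise ValueError("not enough tokens for the requested homophones")
    return {
        letter: tokens[i * HOMOPHONES:(i + 1) * HOMOPHONES]
        for i, letter in enumerate(ALPHABET)
    }


def encrypt(plaintext, table, rng):
    """Replace each letter with one of its homophones, picked at random."""
    return "".join(rng.choice(table[ch]) for ch in plaintext)


def decrypt(ciphertext, table):
    """Invert the table; fixed-length tokens make the parse unambiguous."""
    inverse = {tok: letter for letter, toks in table.items() for tok in toks}
    return "".join(
        inverse[ciphertext[i:i + TOKEN_LEN]]
        for i in range(0, len(ciphertext), TOKEN_LEN)
    )


table = build_table()
rng = random.Random(0)  # seeded only so the demo is repeatable
plain = "herbasalvia"
cipher = encrypt(plain, table, rng)
print(cipher)
print(decrypt(cipher, table))
```

Because every token has the same length and belongs to exactly one letter, the ciphertext stays fully decipherable even though the same plaintext can encrypt to many different ciphertexts. A hand cipher would replace the seeded random generator with some physical source of randomness.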

Similar Papers
  • Research Article
  • 10.6084/m9.figshare.903756.v1
VM408 folio86v ‘The Rosette Map’: Elements of a mappa mundi and a map of the Elements
  • Feb 7, 2014
  • Wastl Juergen + 1 more

The Rare Manuscript Library at Yale keeps MS408, better known as the Voynich Manuscript, named after Wilfrid M. Voynich, who rediscovered it in an Italian monastery in 1912. This manuscript comprises a collection of folios with as yet undeciphered text and multifaceted images. Among these are depictions of plants suggesting a botanical or pharmaceutical background, nude women, and astrological imagery. Historians and laymen alike have tried to understand its images and texts over the hundred years since its rediscovery. Many attempts using different techniques to unravel its coding were made, but to no avail. The continuing unsuccessful attempts to unravel the contents even led to a hypothesis that the Voynich manuscript ‘must’ be fake, based on textual and statistical analysis (1). Rugg provided a technique for producing a ‘senseless’ manuscript in the style of the Voynich manuscript seemingly effortlessly and in a very short time (2). The question of when this manuscript was produced has also been widely discussed and has remained unresolved so far. A recent chemical analysis based on radiocarbon dating sets the date of production of the parchment in the early 15th century (1408-1434) (3). This date was previously predicted by N. Pelling’s independent approaches and evidence based on details in the images in the manuscript. N. Pelling’s book provides an exhaustive amount of detail on the Voynich manuscript (4). Of the multitude of sections with botanical, astrological or pharmaceutical imagery, one section has not received much attention so far: the Rosette Map (f86v), named by Mary d’Imperio according to its appearance (5), is one of the most intriguing but also most neglected areas of the manuscript. Plenty of speculation exists about the depiction of individual geographic locations in the Rosette Map (Venice, Naples, Pompeii, Tuscan Renaissance gardens); however, no cohesive analysis of the entire map has been published to date.
Cartographic depiction of geographic locations in medieval maps was achieved with so-called mappae mundi. These vary in many details (e.g. size, shape, orientation, captions) depending on their use. Huge maps for visualisation and for purposes of presentation e.g. the Hereford map or the

  • Research Article
  • 10.1080/01611194.2024.2414128
The Voynich Manuscript was written in a single, natural language
  • Oct 6, 2024
  • Cryptologia
  • Raj V Ponnaluri

The Voynich Manuscript (VM), or MS 408, preserved in Yale University’s Beinecke Library, was acquired by Wilfrid Voynich in 1912. The unreadable, bound vellum folios contain illustrations of plants, astronomical and astrological diagrams, female images, and a large volume of text. Linguists, paleographers, historians, and general enthusiasts have attempted to decipher the text. This work examines the morphology of VM, with a focus on the two-language and five-scribe theories of Currier and Davis, respectively. There are two main objectives of this work. One is to examine Currier’s claim of two languages drawn from two hands and Davis’ paleographic evidence that there were five scribes. The other is to evaluate whether VM contains writing in a natural language or just gibberish. Following a comprehensive analysis of word types, tokens, ranks, characters, placement, and frequencies, this work first demonstrates that there is no evidence for two ‘languages’ or dialects. Second, with the aid of Zipf’s, brevity, and Heaps’ laws, this work shows that VM contains natural language, as noted by Landini, Reddy and Knight, and Bowern and Lindemann. Word token analysis suggests that Scribes 1–4 must have distributed the workload equitably, and that Scribe 5 may have been a late entrant.

  • Research Article
  • Cite Count: 19
  • 10.1162/tacl_a_00084
Decoding Anagrammed Texts Written in an Unknown Language and Script
  • Dec 1, 2016
  • Transactions of the Association for Computational Linguistics
  • Bradley Hauer + 1 more

Algorithmic decipherment is a prime example of a truly unsupervised problem. The first step in the decipherment process is the identification of the encrypted language. We propose three methods for determining the source language of a document enciphered with a monoalphabetic substitution cipher. The best method achieves 97% accuracy on 380 languages. We then present an approach to decoding anagrammed substitution ciphers, in which the letters within words have been arbitrarily transposed. It obtains the average decryption word accuracy of 93% on a set of 50 ciphertexts in 5 languages. Finally, we report the results on the Voynich manuscript, an unsolved fifteenth century cipher, which suggest Hebrew as the language of the document.

  • Research Article
  • Cite Count: 56
  • 10.1371/journal.pone.0067310
Probing the Statistical Properties of Unknown Texts: Application to the Voynich Manuscript
  • Jul 2, 2013
  • PLoS ONE
  • Diego R Amancio + 4 more

While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements, i.e. obtained from first-order statistics of word properties in a text, from the topology of complex networks representing texts, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications.

  • Research Article
  • Cite Count: 17
  • 10.1080/01611194.2013.797041
Efficient Cryptanalysis of Homophonic Substitution Ciphers
  • Jan 1, 2013
  • Cryptologia
  • Amrapali Dhavare + 2 more

Substitution ciphers are among the earliest methods of encryption. Examples of classic substitution ciphers include the well-known simple substitution and the less well-known homophonic substitution. Simple substitution ciphers are indeed simple, both in terms of their use and their cryptanalysis. Homophonic substitutions—in which a plaintext symbol can map to more than one ciphertext symbol—are also easy to use, but far more challenging to break. Even with modern computing technology, homophonic substitutions can present a significant cryptanalytic challenge. This article focuses on the design and implementation of an efficient algorithm to break homophonic substitution ciphers. The authors employ a nested hill climb approach that generalizes the fastest known attack on simple substitution ciphers. They test their algorithm on a wide variety of homophonic substitutions and provide success rates as a function of both the ciphertext alphabet size and ciphertext length. Finally, they apply their technique to the “Zodiac 340” cipher, which is an unsolved message created by the infamous Zodiac killer.

  • Dissertation
  • 10.31979/etd.u44a-fh82
EFFICIENT ATTACKS ON HOMOPHONIC SUBSTITUTION CIPHERS
  • Apr 18, 2019
  • Amrapali Dhavare

Substitution ciphers are one of the earliest types of ciphers. Examples of classic substitution ciphers include the well-known simple substitution and the less well-known homophonic substitution. Although simple substitution ciphers are indeed simple, both to use and to attack, homophonic substitution ciphers are far more challenging to break. Even with modern computing technology, homophonic substitution ciphers remain a significant challenge. This project focuses on designing, implementing, and testing an efficient attack on homophonic substitution ciphers. We use an iterative approach that generalizes the fastest known attack on simple substitution ciphers and also employs a heuristic search technique for improved efficiency. We test our algorithm on a wide variety of homophonic substitution ciphers. Finally, we apply our technique to the “Zodiac 340” cipher, which is an unsolved ciphertext created in the 1970s by the infamous Zodiac killer.

  • Research Article
  • Cite Count: 4
  • 10.1016/j.physa.2015.12.031
Does network complexity help organize Babel’s library?
  • Dec 22, 2015
  • Physica A: Statistical Mechanics and its Applications
  • Juan Pablo Cárdenas + 3 more

  • Research Article
  • Cite Count: 6
  • 10.21123/bsj.2020.17.4.1320
Combining Several Substitution Cipher Algorithms using Circular Queue Data Structure
  • Dec 1, 2020
  • Baghdad Science Journal
  • Noor A Iibraheem + 1 more

With the rapid expansion of the Internet, worldwide information exchange increases the use of communication technology, and the rapid growth in data volume increases the need for secure, robust, and reliable techniques based on effective algorithms. Many algorithms and techniques are available for data security. This paper presents a cryptosystem that combines several substitution cipher algorithms with a circular queue data structure. The two substitution techniques are the homophonic substitution cipher and the polyalphabetic substitution cipher, which are merged in a single circular queue with four different keys each, producing eight different outputs for every incoming letter. The present work can be applied efficiently to personal information security and network communication security, and the time required to encipher and decipher a message is less than 0.1 s.

  • Research Article
  • Cite Count: 3
  • 10.1515/cllt-2015-0001
Predicting the gender of Welsh nouns
  • Jan 1, 2016
  • Corpus Linguistics and Linguistic Theory
  • Michael Hammond

Welsh grammatical gender exhibits several unusual properties. This paper argues that these properties are necessarily connected. The argument is based on a series of corpus investigations using techniques from statistical natural language processing, specifically distinguishing properties that exhibit significant statistical patterns from those which can be used to make useable predictions. Specifically, it’s shown that the grammatical properties of Welsh gender are such that its unusual statistical properties follow.

  • Conference Article
  • Cite Count: 18
  • 10.18653/v1/d18-1102
Decipherment of Substitution Ciphers with Neural Language Models
  • Jan 1, 2018
  • Nishant Kambhatla + 2 more

Decipherment of homophonic substitution ciphers using language models is a well-studied task in NLP. Previous work in this topic scores short local spans of possible plaintext decipherments using n-gram language models. The most widely used technique is the use of beam search with n-gram language models proposed by Nuhn et al. (2013). We propose a beam search algorithm that scores the entire candidate plaintext at each step of the decipherment using a neural language model. We augment beam search with a novel rest cost estimation that exploits the prediction power of a neural language model. We compare against the state-of-the-art n-gram-based methods on many different decipherment tasks. On challenging ciphers such as the Beale cipher we provide significantly better error rates with much smaller beam sizes.

  • Conference Article
  • 10.1063/1.2356423
Regions of Unusual Statistical Properties as Tools in the Search for Horizontally Transferred Genes in Escherichia coli
  • Jan 1, 2006
  • Catherine Putonti + 6 more

The observed diversity of statistical characteristics along genomic sequences is the result of the influences of a variety of ongoing processes including horizontal gene transfer, gene loss, genome rearrangements, and evolution. The rate at which various processes affect the genome typically varies between different genomic regions. Thus, variations in statistical properties seen in different regions of a genome are often associated with its evolution and functional organization. Analysis of such properties is therefore relevant to many ongoing biomedical research efforts. Similarity Plot or S‐plot is a Windows‐based application for large‐scale comparisons and 2D visualization of similarities between genomic sequences. This application combines two approaches widely used in genomics: window analysis of statistical characteristics along genomes and dot‐plot visual representation. S‐plot is effective in detecting highly similar regions between two genomes. Within a single genome, S‐plot has the ability to identify highly dissimilar regions displaying unusual compositional properties. The application was used to perform a comparative analysis of 50+ microbial genomes as well as many eukaryote genomes including human, rat, mouse, and drosophila. We illustrate the uses of S‐Plot in a comparison involving Escherichia coli K12 and E. coli O157:H7.

  • Research Article
  • Cite Count: 2
  • 10.32351/rca.v5.128
Manuscrito de Voynich - Análisis del algoritmo de codificación con los métodos de cifrado conocidos en la época medieval y resultados de las marginalias que no fueron encriptadas
  • Jan 6, 2020
  • Revista Científica Arbitrada de la Fundación MenteClara
  • Alisa Gladyševa

In this study I describe and analyze two objectives. The first concerns how the encoding algorithm of the Voynich manuscript contrasts with and corresponds to the known cipher methods of the medieval period. According to the results of my innovative research on the Voynich manuscript, it was written in medieval Galician (Galician-Portuguese). Its encoding algorithm was influenced by substitution encryption using a polyalphabetic cipher for most of its text, and it was also definitely influenced by the transposition cipher for doubly enciphered alchemical text. However, it should be mentioned that there are significant differences between the codes used in the medieval period and the encoding algorithm of the Voynich manuscript. What has made decipherment more complicated over the centuries is that polyalphabetic substitution encryption was used only in part, simultaneously with monoalphabetic encryption and alongside unencoded text. Therefore, the main point to note is that the second objective, and the one of greatest interest in this particular article, is the parts of the Voynich manuscript's text (the marginalia) that were not encrypted at all, and their reading.

  • Conference Article
  • Cite Count: 2
  • 10.1063/1.2356390
Statistical Properties of Short Subsequences in Microbial Genomes and Their Link to Pathogen Identification and Evolution
  • Jan 1, 2006
  • Meizhuo Zhang + 6 more

Citation: Meizhuo Zhang, Catherine Putonti, Sergei Chumakov, Adhish Gupta, George E. Fox, Dan Graur, Yuriy Fofanov; Statistical Properties of Short Subsequences in Microbial Genomes and Their Link to Pathogen Identification and Evolution. AIP Conf. Proc. 8 September 2006; 854 (1): 13–18. https://doi.org/10.1063/1.2356390

  • Book Chapter
  • 10.1007/978-1-4757-9337-6_5
Rydberg Atoms Radiating in Free-Space or in Cavities : New Systems to Test Electrodynamics and Quantum Optics at an Unusual Scale
  • Jan 1, 1986
  • Serge Haroche

Rydberg atoms, characterized by a very strong coupling to the microwave and millimeter wave part of the electromagnetic spectrum, exhibit unusual radiative properties in free space or in resonant cavities. The rates of spontaneous emission between Rydberg levels, although small in absolute value, are very large when compared to those of ordinary atoms or molecules radiating in the same frequency range. The rates of transitions induced by external fields impinging on Rydberg atoms are also very large. In particular, blackbody radiation, even at low temperature, has dramatic effects on the Rydberg state lifetimes and thermal radiation dependent energy shifts of these states are observable. Amplification of radiation by Rydberg systems in free space or in cavities leads to the realization of new types of maser devices which operate with very low thresholds. Collective systems of Rydberg atoms radiating in cavities are in fact quasi ideal examples of superradiant sources and their statistical properties (atomic and field fluctuations) are very interesting to study as examples of macroscopic quantum systems or sources of non classical “squeezed” states of radiation.

  • Research Article
  • Cite Count: 74
  • 10.1051/0004-6361:20040337
A hint of Poincaré dodecahedral topology in the WMAP first year sky map
  • Aug 12, 2004
  • Astronomy & Astrophysics
  • B F Roukema + 4 more

Luminet et al. (2003) suggested that WMAP data are better matched by a Poincaré dodecahedral FLRW model of global geometry, rather than by an infinite flat model. The analysis by Cornish et al. (2003) for angular radii 25-90 degrees failed to support this. Here, a matched circles analysis specifically designed to detect dodecahedral patterns of matched circles is performed over angular radii in the range 1-40 degrees on the one-year WMAP ILC map, using a correlation statistic and an rms difference statistic. Extreme value distributions of these statistics are calculated for left-handed and right-handed 36 degree 'screw motions' (Clifford translations) when matching circles and for a zero (unphysical) rotation. The most correlated circles appear for circle radii of 11±1 degrees, for the left-handed screw motion, but not for the right-handed one, nor for the zero rotation. The favoured six dodecahedral face centres in galactic coordinates are (l, b) = (252, +65), (51, +51), (144, +38), (207, +10), (271, +3), (332, +25) and their opposites. The six pairs of circles independently each favour a circle angular radius of 11±1 degrees. Whether these six circle pairs centred on dodecahedral faces match via a 36 degree rotation only due to unexpected statistical properties of the WMAP ILC map, or whether they match due to global geometry, it is clear that the WMAP ILC map has some unusual statistical properties which mimic a potentially interesting cosmological signal. The software for reading the WMAP data and for carrying out this analysis is released under the GNU General Public License.
