Heralding a new era for the Arabic language lexicon, Sharjah has embarked on a landmark project to chronicle 17 centuries of development in the Arabic language spanning five distinct time periods. The Historical Corpus of the Arabic Language is a monumental undertaking that will offer unparalleled insight into the world’s fifth most widely spoken language and serve as a linguistic resource for researchers, academics, linguists and students worldwide.
Hundreds of senior researchers, linguists, editors, and experts from 10 Arabic language academies across the Arab world are documenting and researching the history and evolution of all Arabic words.
Expected to take six years to complete, this will be the most comprehensive historical corpus of the Arabic language undertaken to date, and the first to cover its evolution from the pre-Islamic period through its growth during the Islamic era and several dynasties to its modern form.
Documenting 17 centuries of the Arabic language
With roots that lie in classical and modern Semitic, African and Asian languages, Arabic is a rich and sophisticated language that has had an enduring legacy in shaping civilisations across the Middle East and Africa. Spoken by more than 400 million people in these regions, it was also the medium through which philosophers, mathematicians, and astronomers pursued knowledge during the Golden Age of Islam.
Since the dawn of the last century, efforts have been underway to document the ancient Arabic language in an all-encompassing corpus. However, the massive scale of such a project, coupled with sub-par planning and financial constraints, brought these initiatives to a halt until the effort was resumed following the directives of His Highness the Ruler of Sharjah.
With the digitisation of nearly 20,000 Arabic books, manuscripts, sources, and historical documents, the Historical Corpus of the Arabic Language will be a portal into 17 centuries of the Arabic language, including Arabic engravings and antiquities dating back to the third century before Islam.
The corpus will answer several questions about language use as researchers go to the original root of each word and methodically trace the usage of Arabic vocabulary in five distinct time periods: the pre-Islamic period, the Islamic era from 1 AH to 132 AH, the Abbasid Caliphate, the development of nation states, and the modern era.
Digital repository of more than 40,000 titles, extracts and documents
On completion, in addition to a physical book, the project will include a massive digital library hosting more than 40,000 titles, extracts and documents in fields as diverse as literature, poetry, philosophy, history and the sciences. Many of these will be available in a digital format for the first time.
The Historical Corpus will provide detailed documentation of word roots, derivatives, and phonetic variations. The history of every derivative word will be traced back to its first known usage, covering the span from the pre-Islamic era to the modern day.
It will outline the development and evolution of terms used throughout the centuries and document the entry of new words into the language at different periods of time. It will also detail semantic changes – be it semantic shift, progression, development, or drift, whether through Arabic speakers or speakers of other Semitic languages that have influenced the Arabic language.
The project will reference the broad history of science and arts, and delve into the scientific study of the Arabic language, including syntax, morphology, fiqh (Islamic jurisprudence), phonetics, arūd (Arabic prosody, or poetic metres), rhetoric, and more, highlighting the words born out of these sciences.
Word comparisons with other Semitic languages, such as Hebrew, Akkadian, Syriac, Abyssinian, and others, will be emphasised. To accomplish this task, a specialised committee of Semitic language scholars has been formed to trace similarities and differences between Arabic words and their equivalents in other Semitic languages, citing examples and documenting the original source.
Commenting on the project, Dr. Mohammed Safi Al Mostaghanemi, Secretary-General of the Arabic Language Academy in Sharjah, said: “The Historical Corpus of the Arabic Language fulfils a dream of the entire Arabic speaking world; it is an exemplary feat that will shine light on the richness of the Arabic language. No other project in recent times has drawn as much attention from linguists and Arabic language enthusiasts as this corpus, especially since several world languages have developed their own corpora – most notably Romance and Germanic languages such as French and English.”
Dr. Al Mostaghanemi added: “The Arabic language corpus had long been delayed due to arbitrary planning, financial constraints and the sheer scale of the endeavour. It is therefore a matter of great pride that Sharjah – an emirate recognised globally for its role in disseminating culture and knowledge – has reignited this project under the leadership of His Highness Sheikh Dr. Sultan Al Qasimi. It is solely due to His Highness’s unyielding dedication to supporting scientific research as well as preserving and promoting the Arabic language and Arab cultural initiatives that the project has finally managed to take off in the right direction.”
Created using a fully digitised platform with state-of-the-art technology, the Historical Corpus of the Arabic Language will be easy to access and navigate for both linguists and members of the public. The use of optical character recognition (OCR) technology for all documents will further enable researchers to find the information they require quickly within a broad historical context.