Project Madurai

Electronic versions of printed texts (abbreviated as etexts) of ancient literary works are important pedagogic and scholarly resources. Stored in easily accessible archives, they permit preservation and wider distribution of ancient literary works around the globe via the internet. Etexts also allow quick searches for phrases, words, and combinations of words in any literary work. Many projects currently active worldwide attempt to put ancient literary works into electronic form.
"Project Madurai" will be an open and voluntary initiative to collect and publish free electronic editions of ancient Tamil literary classics. This means either typing in or scanning old books and archiving the text in one of the most readily accessible formats ("etexts") for use on all popular computer platforms. Some of the text files will also be converted to HTML files and put up on World Wide Web servers, so that anyone located anywhere may download a copy for personal use, or read what we publish on the internet, free of charge. While the major emphasis will be on producing etexts of the Tamil literary classics in native Tamil script, popular Tamil works will also be made available in the 'transliterated/romanized' text format widely used by western indologists. The project will be coordinated by a handful of volunteers, most of whom pursue Project Madurai as a hobby, just like you!
Please consult the webpage of the Univ. of Virginia for a collection of projects in languages of the world. Alex's catalog provides on-line search of available etexts. There are already a handful of etext archiving projects for Tamil works, some by major universities:
Univ. of Cologne, Germany, Univ. of California, Berkeley and the Inst. of Asian Studies, Chennai,
National Univ. of Singapore and International Inst. of Tamil Studies
and also by a handful of individuals.
One of the major goals of Project Madurai will be to coordinate such scattered activities.

"Project Madurai"- Frequently Asked Questions (FAQ)

How did this start?
Several etext archiving projects have been under way worldwide, particularly in the last five years. Through postings on soc.culture.tamil (Usenet) and tamil.net (email discussion list), many expressed the desire to start similar projects targeting ancient Tamil literary classics. Following recent calls on tamil.net, the project officially takes off on Pongal/Tamil new year day, Jan. 14, 1998!

Why was Madurai chosen as the name?

Madurai is one of the oldest cities of southern India. It has been a centre of learning and pilgrimage for centuries. Legend has it that the divine nectar falling from Lord Shiva's locks gave the city its name - Madhurapuri, now known as Madurai. Madurai's history dates back over 2,000 years, to when it was the capital of the Pandyan kings. In the 10th century AD, Madurai was captured by the Chola emperors. It remained in their hands until the Pandyans regained their independence in the 12th century, only to lose it to the Muslim invaders under Malik Kafur, a general in the service of the Delhi Sultanate. Malik Kafur's dynasty was overthrown by the Hindu Vijayanagar kings of Hampi. After the fall of Vijayanagar in 1565, the Nayaks ruled Madurai until 1781 AD. During the rule of the Nayaks, the bulk of the Meenakshi temple, today the main attraction for visitors, was built. Madurai also became the cultural centre of the Tamil people. Madurai passed to the East India Company in 1781, and in 1840 the Company razed the fort which had previously surrounded the city and filled in the moat. Four streets, the Veli streets, were constructed on top of the fill and to this day define the limits of the old city.
It was the Pandyan kings who, during their long reign, set up Sangams (academies) for the encouragement and criticism of Tamil studies. The Sangam period, lasting several centuries, is universally considered the Golden Age of Tamil literature. Great anthologies such as Ettuthohai and Pattupattu were compiled, and many immortals like Iraiyanar, Valluvar, Kapilar, Nakkirar, Paranar, Auvaiyar and Ilango Adigal gave their best to the Tamil Muse. The proposed electronic text archive project devoted to Tamil literature is named after this great historic city, Madurai.

Who will do the work?
You do! Project Madurai is based on voluntary cooperation between many people in several countries. You can use scanners or keyboards to enter text from (old) books, and send the text by e-mail to the project leader or one of the regional coordinators. The coordinators are experts in etext archiving, HTML and World Wide Web technology, so you don't have to know these details to be a contributor. You can also contribute to the project in several other ways: writing OCR and translation software for cross-platform transport of files, proof-reading, publicising the project locally, or collecting source materials (photocopies or donated books) for future archiving. The header of each etext file will explicitly name the person(s) who keyed in the text and the person(s) who proof-read it. If possible, the header will also give the details of the hard copy (publisher, year etc.) used as the reference for proof-reading/editing.
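As a rough illustration of the header described above, the following Python sketch records the credits and reference-copy details and renders them as plain text. The field names, labels and layout are assumptions made for this example only; the project does not prescribe a header format here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EtextHeader:
    """Credits and source details carried at the top of an etext file.

    Field names are hypothetical, chosen only for this sketch.
    """
    title: str
    keyed_in_by: List[str]
    proofread_by: List[str]
    source_publisher: Optional[str] = None  # hard-copy details, if known
    source_year: Optional[int] = None


def render_header(header: EtextHeader) -> str:
    """Return the header as plain-text lines, one item per line."""
    lines = [
        f"Title: {header.title}",
        f"Keyed in by: {', '.join(header.keyed_in_by)}",
        f"Proof-read by: {', '.join(header.proofread_by)}",
    ]
    # The reference-copy line is optional: it appears only when at least
    # one hard-copy detail is available.
    source = [str(part) for part in (header.source_publisher, header.source_year) if part]
    if source:
        lines.append(f"Reference copy: {', '.join(source)}")
    return "\n".join(lines)


# Placeholder names only; no real volunteer or publisher is implied.
example = EtextHeader(
    title="Sample work",
    keyed_in_by=["Volunteer A"],
    proofread_by=["Volunteer B", "Volunteer C"],
    source_publisher="Example Press",
    source_year=1925,
)
print(render_header(example))
```

Keeping the hard-copy fields optional mirrors the "if possible" wording: a file can still carry its keying-in and proof-reading credits when the source edition is unknown.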

Who owns the Project?
As a grass-roots internet-based effort, it belongs to each and every one of you. We largely govern ourselves. As a volunteer, you have a say! The project is neither commercial nor governmental.

Which works are chosen for archiving?
On the choice of works, the main criterion is respect for the copyright protection given to authors. Even though copyright rules vary from country to country, most etext archiving projects consider the elapse of at least 50-70 years after the death of the author a safe criterion. So, as a rule of thumb, we can consider works of authors who died before 1929. Hence the tilt towards archiving ancient Tamil literary classics.
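The rule of thumb above is simple arithmetic: with the stricter 70-year margin counted back from 1998, the last qualifying year of death is 1928, i.e. "before 1929". A minimal Python sketch, in which the function name and default values are assumptions chosen for illustration:

```python
def author_in_safe_window(death_year: int,
                          current_year: int = 1998,
                          margin_years: int = 70) -> bool:
    """True if at least `margin_years` have elapsed since the author's death.

    This encodes only the rule of thumb from the text. It is not legal
    advice, and actual copyright terms vary from country to country.
    """
    return current_year - death_year >= margin_years


# 1998 - 1928 = 70, so 1928 is the last qualifying year of death.
print(author_in_safe_window(1928))
print(author_in_safe_window(1929))
```

Passing `margin_years=50` gives the more permissive end of the 50-70 year range mentioned above.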
The second main reason for choosing ancient literature is that these works are out of print and hence at risk of being lost to the world. 20th century Tamil literature is largely dominated by novels, with an associated decrease in the sale of printed copies of ancient literary works covering other domains. Key publishing houses such as Saiva Siddhantha Trust have dropped several of their projected reprints of classics for lack of a market. Few university libraries have copies that were printed in the twenties or thirties. In the absence of adequate storage facilities, these works are being eaten away by insects. The Tamil language has a rich heritage dating back several thousand years. We all have a moral obligation to ensure that future generations have access to this rich treasure, possibly via better means of archival and worldwide distribution. Having said this, any ancient literary work for which we can find a hard/printed copy and, importantly, volunteers to key in the text can be considered for inclusion in the archives.
As regards modern literary works, yes, they can be included provided the author concerned (or their legal heirs) is prepared to give explicit written consent for the work to be put up in electronic form and for unrestricted, free distribution on the internet. (Some etext archives do accept shareware works, where the user has to pay a fixed amount to download a copy. Since this would involve administrative bureaucracy, we will not go for the shareware option, at least in the initial stages.) Through special arrangements with the next of kin, the Tamilnadu Govt. has placed "in public domain" the works of select Tamil authors of the 20th century - Bharathiyar, Bharathidaasan, Annadurai and a few others (can someone get this official list from the Govt. of Tamilnadu and let us know?). Hence these works will be included in the present project.
A tentative time-table of the work being undertaken is available. Anyone who would like to coordinate or volunteer for this project, please contact the project leader.

Do you plan to use both direct typing and scanning of texts to generate etexts?
Yes. As stated in the preamble, Project Madurai will make the fullest possible use of text input, both by direct keying-in of texts and by scanning/OCR techniques. Even though scanning is included as a means of archiving, without appropriate OCR packages it is rather premature to engage in archiving of scanned images. Major efforts such as those of the Univ. of Chicago and the Roja Muthiah Tamil Library do microfilming as the first step of archiving. For the above reasons, in the early stages the emphasis will be on direct typing of texts. Keying in Tamil texts does not require any major resources. It can be done by anyone at home, at their own pace, using the simple word-processing software found on all computers.

In what format will the etext files be archived and distributed?
We are committed to using the forthcoming Tamil Standard Code for Information Interchange (TSCII) as the encoding format for the archives. But it may be the end of this year (1998) before a TSCII standard is accepted by the Tamilnadu Govt. and international standardisation bodies. In order not to lose time and momentum, we will go ahead and try to build etexts using some of the most popular Tamil encoding formats.
First, to simplify our tasks, at least in the early stages, the number of formats in which we will handle/release etexts will be limited to the following four:

tamil typewriter
mylai or mylai-sri (please use the old mylai version only!)
anjal/inaimathi
romanized/transliterated txt
(the forthcoming world standard TSCII will be added later!)
The reasons for limiting the choice to these four are:
a) over 90% of Tamils worldwide use one of these;
b) fonts are available free for use on all three major computer platforms - Windows, Macintosh and Unix. Anyone who uses these computers can work in all of these formats; and
c) converters are available that work reliably to go between these formats.
For any other format, converters would first have to be written and tested. When we are not too far from adopting a TSCII standard, it would be a sheer waste of time to add more font formats to the above list. Instead of forcing every volunteer to be familiar with the features of each of these format options, it is enough if we have a couple of knowledgeable coordinators who can generate the equivalent files using Adhawin and Anjal. The goal is to let volunteers work in the environment (font and computer system) they feel comfortable with. Very minimal constraints, if any, will be imposed on the volunteers who will do the major task of keying in the texts.

Who will pay the bill?
At least in the initial stages, Project Madurai will be a purely voluntary effort, involving cooperation between many people in several countries. The volunteers and the coordinators will use their private computer resources to generate the Tamil etext collections. The tamil.net server, freely hosted by the Asia Pacific Internet Company (APIC), Sydney, Australia, will be the depository of the etext archives. Most probably the lead URL "http://www.tamil.net/projectmadurai/pminfo.html" will be the door to the website for this project.

Where can I find information on this project?
Progress in the archiving of Tamil etexts will be posted periodically as webpages, and updates will be announced periodically on the tamil.net mailing list and the soc.culture.tamil newsgroup.

Who are the coordinators of this project?
The following have agreed to be coordinators/volunteers of the project, covering the geographical zones indicated in parentheses:

PROJECT MADURAI

Overall coordination:
Logistics Regional Coordinators /Working Groups

Interested in contributing?
If you are interested in participating in the project, send an email to the Project Leader or to one of the regional coordinators describing your plans and/or interests. Be specific about how you would like to contribute, and name the work(s), if any, that you are interested in. We can go from there.


This file was last updated on Jan 20, 1998.