Web service for 19th century Irish personal name matching


Wangrungarun, Phattara (2015) Web service for 19th century Irish personal name matching. Masters thesis, National University of Ireland Maynooth.

Abstract

Before the first Irish civil registration in 1864, census materials were mostly lost or incomplete. Genealogical research therefore relies on parish records and on ‘census substitute’ documents, such as land ownership and tenancy records. However, some of these documents may not contain enough information to identify individuals: some contain a name and address, whereas others contain only a name. Record linkage is one method of gathering information scattered across many documents. It uses a person's name as a reference to link that person's information between documents. With patience, more complete information about that person can be obtained, so linking or matching a person's name is an important part of the process. Unfortunately, in 19th century Ireland there was no standard spelling of names, handwriting could be difficult to read, and contractions or abbreviations were often used. Names with the same pronunciation, referring to the same individual, could be written in many different ways. Moreover, Irish-language names equivalent to English names were also used; for example, the Irish version of ‘Smith’ could be ‘Gowan’. A further complication is that historical and genealogical research often requires large quantities of names to be matched. To handle these name variations, various solutions have been created to match different names that refer to the same person. However, to the best of our knowledge, there is not yet a public system that combines those solutions and provides a bulk name matching service. We therefore developed a web service using the Ruby on Rails framework. The system initially implements four matching algorithms: Levenshtein distance, Soundex, Irish Soundex, and a lookup table. We also present a web interface that lets a client use the system from a web browser. The system is designed to be simple, and extensible through the use of inheritance.
The system performs matching on large quantities of names in a reasonable time. We tested our system with 12,944 name matchings, which completed in under half a minute (28,786 milliseconds, to be precise). However, the system consumes a large amount of memory (around 373 megabytes). We believe that, with proper optimisation, we could reduce memory usage and shorten the processing time. Further matching algorithms could also be implemented for names in other languages, so that the system can handle a broader domain of names.
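
As an illustration of the approach the abstract describes, the following is a minimal sketch of how several matching algorithms might sit behind one common interface and be applied in bulk. It is not the thesis's Ruby on Rails code: it is written in Python, and the class names, the `bulk_match` helper, and the distance threshold are all hypothetical. The Soundex shown is the standard American variant; the Irish Soundex rules are not given in the abstract and are therefore omitted. The only lookup table entry is the Smith/Gowan example from the abstract.

```python
from abc import ABC, abstractmethod


class NameMatcher(ABC):
    """Hypothetical base class; each algorithm overrides `matches`."""

    @abstractmethod
    def matches(self, a: str, b: str) -> bool:
        ...


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Standard American Soundex digit groups.
_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES = {c: d for d, letters in _GROUPS.items() for c in letters}


def soundex(name: str) -> str:
    """Four-character Soundex code, or '' if the name has no letters."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate letters with equal codes
        digit = _CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code is kept
    return (code + "000")[:4]


class LevenshteinMatcher(NameMatcher):
    def __init__(self, max_distance: int = 1):
        self.max_distance = max_distance  # assumed threshold, not from the thesis

    def matches(self, a: str, b: str) -> bool:
        return levenshtein(a.lower(), b.lower()) <= self.max_distance


class SoundexMatcher(NameMatcher):
    def matches(self, a: str, b: str) -> bool:
        code = soundex(a)
        return code != "" and code == soundex(b)


class LookupMatcher(NameMatcher):
    """Matches names listed in the same group of known equivalents."""

    def __init__(self, groups):
        self.group_of = {n.lower(): i for i, g in enumerate(groups) for n in g}

    def matches(self, a: str, b: str) -> bool:
        ga = self.group_of.get(a.lower())
        return ga is not None and ga == self.group_of.get(b.lower())


def bulk_match(queries, candidates, matchers):
    """For each query, list the candidates accepted by at least one matcher."""
    return {q: [c for c in candidates if any(m.matches(q, c) for m in matchers)]
            for q in queries}


matchers = [LevenshteinMatcher(1), SoundexMatcher(), LookupMatcher([{"Smith", "Gowan"}])]
result = bulk_match(["Smith"], ["Smyth", "Gowan", "Kelly"], matchers)
```

In this sketch, adding a further algorithm only requires a new `NameMatcher` subclass, which is one way the extensibility through inheritance mentioned in the abstract could look. Each query is compared with every candidate, so the cost grows with the product of the two list sizes.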

Item Type: Thesis (Masters)
Additional Information: Taught Masters Thesis for the Erasmus Mundus MSc in Dependable Software Systems
Keywords: Web service; 19th century; Irish personal name matching
Academic Unit: Faculty of Science and Engineering > Computer Science
Item ID: 7092
Depositing User: IR eTheses
Date Deposited: 04 May 2016 11:18
URI:
