It has always surprised me that there is no “standard” for transcribing English parish registers. We see all manner of variations, from a literal line-by-line transcript to a simple index of surnames and dates, with every kind of format in between. I suppose that the variations have evolved from the handwritten and typewritten transcripts of the days before computers, and in many ways these methods and styles have been carried forwards, often reluctantly, into the computer age.
There is the school of thought that there is nothing like good old paper for archiving information, and with this I would certainly have no argument, for computer data, stored on whatever media, can hardly be considered to be permanent, or indeed readable by future generations of computers and software. However, data stored in a computer database has one outstanding advantage: it can be searched and sorted in the blink of an eye.
Imagine, even with a well indexed paper transcript, never mind an original parish baptism register, how long it would take to find every baptism of a SMITH in a large parish with some 200,000 baptisms. Life is just too short. A computer database can do it in one or two seconds, ready to print. Furthermore, it can sort out all of the families for you too.
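To give a feel for that kind of search, here is a minimal sketch in Python. It uses the SQLite database built into Python purely as a stand-in for whatever database program you prefer. The table name `baptisms`, the column names and the sample rows are all invented for illustration, and the dates are written year-first only so that they sort correctly as plain text.

```python
import sqlite3

# Hypothetical sample rows; the names and dates are invented for illustration.
SAMPLE_BAPTISMS = [
    ("1875-03-26", "John", "SMITH", "James", "Mary"),
    ("1876-07-02", "Ann", "JONES", "William", "Sarah"),
    ("1874-11-15", "Eliza", "SMITH", "James", "Mary"),
    ("1875-01-10", "Thomas", "SMITH", "Henry", "Jane"),
]


def find_surname(conn, surname):
    """Return every baptism for one surname, grouped by parents, then by date."""
    cursor = conn.execute(
        "SELECT baptism_date, first_name, surname, father, mother "
        "FROM baptisms WHERE surname = ? "
        "ORDER BY father, mother, baptism_date",
        (surname,),
    )
    return cursor.fetchall()


# Build a small in-memory database (nothing is written to disk).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE baptisms ("
    "baptism_date TEXT, first_name TEXT, surname TEXT, father TEXT, mother TEXT)"
)
conn.executemany("INSERT INTO baptisms VALUES (?, ?, ?, ?, ?)", SAMPLE_BAPTISMS)

smiths = find_surname(conn, "SMITH")
for row in smiths:
    print(row)
```

Sorting by father and mother before the baptism date is what brings the children of each couple together, which is a rough equivalent of “sorting out the families”.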
A database is the ideal way to store information such as a parish register. It can be sorted and searched in seconds (or less), even with a file containing hundreds of thousands of records. We can still print out a meaningful “transcription” onto paper, and the database can also be used to create an instant index for the paper print. On the computer itself an index is no longer needed, as the information can be searched for and found incredibly quickly. But we need to consider the way that the data is entered in the first place, and hence the point of this article.
There is one point which I should make very clear right from the start. Some people like the idea of entering the data into a spreadsheet program such as Microsoft Excel. Quite simply, it isn’t up to the task. A spreadsheet is not intended for this type of data; it is intended for performing calculations. Sure, it has a “table” format similar to a database, but a database program is so much more powerful when it comes to doing searches, queries and filters. Furthermore, a spreadsheet runs out of space (there are only so many rows that you can enter before it is full up!), whereas a database’s size is, for practical purposes, limited mainly by the size of your hard disk. Forget the idea of using spreadsheets for this type of data!
I would thoroughly recommend the exercise of transcribing parish registers to anyone who has a computer with database software. There is nothing really difficult about learning to use a database, and in many ways it is easier than using a modern word processor. The exercise is boring, but you get an incredible amount of satisfaction from the knowledge that you are helping fellow researchers. You also gain a great deal of experience in reading old handwriting. A good hint, if you haven’t done it before, is to start with the post-1813 registers for a parish. The handwriting is usually a little clearer, and you get to know the surnames and places referred to. That makes it a whole lot easier when you come to do the older registers.
There is a downside though. Someone has to enter all of the information into the computer database, typing it in line by line. It is a painfully slow process, and mind-blowingly boring. It could never, however, be considered a thankless task, and I would recommend the exercise to anyone, no matter how slowly they type. What we should consider though, is the format in which the information is transcribed, so that it can be easily searched.
There are basically three types of computer software which can be used: a word processor, a spreadsheet, or a database. Only one of these will do the task that we want really successfully. A database was invented for this type of information, so that it can be sorted and searched with great ease. A database stores the information in what are known as “fields”, with one field for each item of data. A collection of fields makes up one “record”, that is, all of the information about one person or event (e.g. a baptism). A collection of records makes up the whole database file.
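The relationship between fields, records and the database file can be sketched in a few lines of Python. This is only an illustration: the class name `BaptismRecord` and the field names are my own invention, loosely following the column headings used in the CSV example later in this article, and a real database program would define its fields through its own table designer.

```python
from dataclasses import dataclass, fields


@dataclass
class BaptismRecord:
    """One record: everything known about a single baptism."""
    number: str        # each attribute below is one field
    birth_date: str
    baptism_date: str
    first_name: str
    sex: str
    father: str
    mother: str


# One record, made up of seven fields.
record = BaptismRecord(
    "0001", "20 Jan 1875", "26 Mar 1875", "John", "son", "James", "Mary"
)

# The "database file" is simply the whole collection of records.
database_file = [record]

field_names = [f.name for f in fields(BaptismRecord)]
print(field_names)
print(len(database_file), "record(s)")
```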
The use of commas and quotation marks in a database file
It is very tempting to enter data such as: Margin note: “Illegitimate”, or 13, High Street, Newent. Commas and quotation marks should never be used in a database! They present a big problem if the data ever needs to be exported into different database software. I prefer to use Microsoft Access as my database software, but not everyone does, and others may have different types of computers.
There is a long-established de facto “standard” for transferring data between different databases, and it has been in use since the early days of computing. Virtually every database program can “import” files of this type. The file type is known as a “Comma Separated Values” file (sometimes called “Comma Separated Variable”), or “CSV” file for short. It is a simple text-only computer file. Each “field” of data is enclosed in speech marks and separated from the next by a comma. It looks something like this:
"No","Birth date","Baptism date","First name","sex","Father","Mother" … etc.
"0001","20 Jan 1875","26 Mar 1875","John","son","James","Mary" … etc.
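A file in this layout can be produced with Python's standard `csv` module, as in the rough sketch below. The helper name `to_csv_text` is my own, and only the seven columns shown above are included, since the rest of the layout is not given here. The `QUOTE_ALL` setting is what places every field in speech marks.

```python
import csv
import io

# Column headings and one row, copied from the example above.
header = ["No", "Birth date", "Baptism date", "First name", "sex", "Father", "Mother"]
rows = [["0001", "20 Jan 1875", "26 Mar 1875", "John", "son", "James", "Mary"]]


def to_csv_text(header, rows):
    """Return CSV text with every field enclosed in speech marks."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(header, rows)
print(csv_text)
```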
So if our data contains commas or speech marks, then it is impossible for it to be transferred sensibly to a different database or a database in a different type of computer. Well, not quite impossible, but it entails hours of work editing the CSV file to get rid of all of the extra commas and speech marks. Believe me, it is a pain!
Never use speech marks or commas in database files!
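If you would like the computer to police this rule for you, a check along the following lines could be run over each record before it is saved or exported. It is only a sketch: the function name `find_problem_fields` and the field names are hypothetical, and treating the curly speech marks produced by word processors as forbidden too is my own assumption.

```python
# Characters to keep out of the data: the comma, the straight speech mark,
# and (an assumption on my part) the curly speech marks used by word processors.
FORBIDDEN = (",", '"', "\u201c", "\u201d")


def find_problem_fields(record):
    """Return the names of fields whose text contains a forbidden character."""
    return [
        name
        for name, value in record.items()
        if any(mark in value for mark in FORBIDDEN)
    ]


# Hypothetical records: the same entry typed without and with the problem characters.
clean = {"first_name": "John", "abode": "13 High Street Newent", "note": "Illegitimate"}
risky = {
    "first_name": "John",
    "abode": "13, High Street, Newent",
    "note": 'Margin note: "Illegitimate"',
}

print(find_problem_fields(clean))   # no fields flagged
print(find_problem_fields(risky))   # the abode and note fields are flagged
```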