If you are a new reader, welcome! In the first entry of this series, I discussed the background to the new naval history database project that I’m working on. In this post, I will look in more detail at the two document series that I’ll be using, with examples, and at how I plan to approach them.
In the past, when historians have looked at the quantitative data for the Royal Navy in the Restoration period, they have done so with specific goals in mind, and have therefore pulled specific data. The benefit of using a database is that the data can be queried, organized, and thought about in many more ways. Of course, the great difficulty is that there are so many ways the data could be presented.
There are really two distinct problems that I’m considering at this phase. The first is the organization of the tables and the actual storage of the data in the database: how will the data be associated? The second is the creation of the queries that will let users look at the data as flexibly as possible. The two series of documents I’m looking at are particularly good test cases for both problems because they present very different challenges.
The first series is ADM 8, specifically the first two volumes. These are ‘List Books’, and they contain lists, surprisingly enough: lists of officers, lists of ships in service, and so on. In particular, these volumes include monthly lists of how many ships were in service and where they were deployed. Let’s have a look.
With ADM 8, organizing the database will be quite complicated. The basic unit for these documents is effectively a monthly report, but not every report is identical. Even if they were, the ‘monthly report’ isn’t exactly the kind of format I want to reproduce directly in the database.
My approach to how ADM 8 will be represented in the database is driven by how I expect the data to be explored at first. From the beginning, I wanted a geographical representation: actually showing, with dots or some other icon, where ships were in every report. Originally, I wanted to take the ships of the ‘30 Ship Program’ of 1677 and represent where they traveled during their service lives (as long as the data held up). This has changed now that I see the wealth of data in these volumes. However, the attraction of a visual, geographic representation remains. The first idea is that a user could choose a month and then see where all the ships were at that time. The second idea is that a user could choose a specific ship (or multiple ships) and a date range, and then have those movements represented.
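To make those two ideas concrete, here is a minimal Python sketch of both lookups over a handful of records. Everything in it is an assumption for illustration: the `Sighting` record, the function names, and the placeholder ships, stations and dates are invented and are not transcribed from ADM 8.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sighting:
    """One ship listed at one station in one monthly report (hypothetical shape)."""
    ship: str
    station: str
    report_month: date  # first day of the month the report covers


# Placeholder records, invented purely for illustration.
SIGHTINGS = [
    Sighting("Ship A", "Station X", date(1677, 1, 1)),
    Sighting("Ship B", "Station Y", date(1677, 1, 1)),
    Sighting("Ship A", "Station Y", date(1677, 2, 1)),
    Sighting("Ship B", "Station Y", date(1677, 2, 1)),
]


def ships_in_month(records, month):
    """Idea one: where was every ship in the chosen monthly report?"""
    return {r.ship: r.station for r in records if r.report_month == month}


def track_ships(records, ships, start, end):
    """Idea two: the stations of the chosen ships over an inclusive date range."""
    wanted = set(ships)
    hits = [
        r for r in records
        if r.ship in wanted and start <= r.report_month <= end
    ]
    # Sort by ship, then by date, so each ship's movements read in order.
    return sorted(hits, key=lambda r: (r.ship, r.report_month))
```

The first function would feed a single map for one month; the second returns an ordered track per ship that a front end could draw as a path.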
But there’s far more information here: commanders, guns, crew numbers. The problem is that I want to organize the storage so that it is good for retrieving data but also efficient for data entry. I want whoever is doing the entry (likely me) to face a minimum of page-flipping (well, photo-changing, but anyway). Clearly, a major problem to be solved is deciding exactly how much information from the volumes the database should contain. The more information I want to harvest from the documents, the more complicated the database design will get. I would rather do more work up front and have a more complex design than build successively more complex ones with more capability. But frankly, I’m going to leave this part of the database for a while and work on the slightly easier data (for this step, anyway).
What I’m leaning towards doing is actually having multiple tables. For example, one table would be for the ships, and have the data pertinent to those ships. A second table would be stations/deployments, and have that data. Certainly, I don’t think it’ll be possible to have everything in a single table and have it work well.
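As a rough sketch of what that multi-table layout might look like, the example below builds three tables and runs one query across them. It uses SQLite through Python's standard library only so that it is self-contained; the real project may well look different. All table names, column names and sample values here are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per ship, one row per station, and one
# deployment row each time a monthly report lists a ship at a station.
SCHEMA = """
CREATE TABLE ships (
    ship_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE stations (
    station_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE deployments (
    deployment_id INTEGER PRIMARY KEY,
    ship_id       INTEGER NOT NULL REFERENCES ships(ship_id),
    station_id    INTEGER NOT NULL REFERENCES stations(station_id),
    report_month  TEXT NOT NULL,  -- stored as 'YYYY-MM'
    commander     TEXT,           -- details that can change between reports
    guns          INTEGER,
    crew          INTEGER
);
"""


def build_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def ships_at(conn, report_month):
    """Return (ship, station) pairs listed in one monthly report."""
    rows = conn.execute(
        """
        SELECT s.name, st.name
        FROM deployments d
        JOIN ships s     ON s.ship_id = d.ship_id
        JOIN stations st ON st.station_id = d.station_id
        WHERE d.report_month = ?
        ORDER BY s.name
        """,
        (report_month,),
    )
    return rows.fetchall()


conn = build_db()
# Placeholder values, invented for illustration only.
conn.execute("INSERT INTO ships (ship_id, name) VALUES (1, 'Ship A')")
conn.execute("INSERT INTO stations (station_id, name) VALUES (1, 'Station X')")
conn.execute(
    "INSERT INTO deployments (ship_id, station_id, report_month, commander, guns, crew) "
    "VALUES (1, 1, '1677-01', 'Commander P', 40, 200)"
)
```

Keeping commander, guns and crew on the deployment row rather than the ship row is one possible choice: it lets those details vary from report to report, at the cost of re-entering them for each month.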
The second series I’m looking at is ADM 107: Lieutenant’s Exam Results. These resulted from the 1677 creation of Royal Navy-specific professional qualifications for the rank of Lieutenant. Those who were to be examined had to have served a certain number of years in the Royal Navy, including a number of years as a midshipman (although these numbers changed several times). The captain of the ship was required to provide a certificate, and there were certain skills that had to be mastered. This is reflected in the text. Let’s have a look. (Click to view at full size; it is a very large photo.)
Here we see the rough layout for these volumes (I have volumes 1 and 2). They are handwritten, but there is certainly a good deal of boilerplate language. However, from observation it’s clear that this language was *not* identical in each report. It would be interesting to note which reports deviated from the ‘standard’ text, if indeed there was a standard text. The slightly more complicated issue is that each report involves a different number of ships. This is, however, much simpler to deal with than the complexity of ADM 8. On the other hand, the questions that can be asked of these volumes are probably more complex than those that would be asked of ADM 8.
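One common way to handle a varying number of ships per report is a one-to-many pair of tables: one row per exam, plus one row for each ship named in it. The sketch below assumes that layout, again using SQLite from Python's standard library for self-containment. The table names, the column names (including the `text_deviates` flag for non-standard wording) and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE exams (
        exam_id       INTEGER PRIMARY KEY,
        candidate     TEXT NOT NULL,
        text_deviates INTEGER NOT NULL DEFAULT 0  -- 1 if wording departs from the usual text
    );
    CREATE TABLE exam_service (
        exam_id   INTEGER NOT NULL REFERENCES exams(exam_id),
        ship_name TEXT NOT NULL                   -- one row per ship named in the exam
    );
    """
)

# Placeholder rows, invented for illustration only.
conn.executemany(
    "INSERT INTO exams (exam_id, candidate, text_deviates) VALUES (?, ?, ?)",
    [(1, "Candidate P", 0), (2, "Candidate Q", 1)],
)
conn.executemany(
    "INSERT INTO exam_service (exam_id, ship_name) VALUES (?, ?)",
    [(1, "Ship A"), (1, "Ship B"), (1, "Ship C"), (2, "Ship A")],
)


def ships_per_exam(conn):
    """Return (candidate, number of ships named) for every exam."""
    return conn.execute(
        """
        SELECT e.candidate, COUNT(s.ship_name)
        FROM exams e
        LEFT JOIN exam_service s ON s.exam_id = e.exam_id
        GROUP BY e.exam_id
        ORDER BY e.exam_id
        """
    ).fetchall()


def candidates_who_served_in(conn, ship_name):
    """Return the candidates whose exam names the given ship."""
    rows = conn.execute(
        """
        SELECT e.candidate
        FROM exams e
        JOIN exam_service s ON s.exam_id = e.exam_id
        WHERE s.ship_name = ?
        ORDER BY e.candidate
        """,
        (ship_name,),
    ).fetchall()
    return [name for (name,) in rows]
```

With this shape, an exam naming one ship and an exam naming ten are stored the same way, and a flag like `text_deviates` would make the non-standard reports easy to pull out later.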
The final thought is that it’s all going to be part of the same database. The years covered differ (the volumes of ADM 8 that I have start in 1673, and the volumes of ADM 107 start in the 1690s), but wherever they overlap there could be some really interesting things to find by looking at Lieutenant’s exam results and ship deployments together. That can be investigated further as the data is more fully examined. Certainly a table of ships featured in both would be a bridge between the two sets of data.
In the next segment of this series, I will look at database design in detail, talk about MySQL, PHP and a number of other interesting things.
If you have ideas for the types of queries or analyses you’d like to run, based on the kinds of information you can see in these photos, please leave a comment below or email me. I’d love to be able to incorporate these ideas as I go further along with this project. Also, if you have any questions, feel free to message me on Twitter or email me.
I guess the real challenge is that if the information is not machine readable, then human intervention is needed: supplying subjects, tags etc. so that the material can at least be found. Maybe by recording how you found the material in the first place.
Unless of course you are planning to transcribe the material so that it is machine readable.
The British Newspaper Archive is a good example of machine readable material that can be searched for and then downloaded as a PDF. I assume, until told otherwise, that they are using a text-based database. In the case of the BNA they are working from printed material, a distinct advantage.
Human intervention is definitely needed: I’m not planning to simply reproduce the text, since that wouldn’t create the kind of research tool that I’m envisioning (though that would be another project, I think).
I think the thing that will take the longest is actually the data entry. What I’m planning to do is get the database design done, get a chunk of records from both series in, and then move from there to work on the front end. Once I have it functioning, I’ll do the rest of the data entry. I think it may be possible to get volunteers to help once I prove the project is viable.