Sunday 5 February 2012

A RootsTech Perspective

Introduction


For people of 'a certain age', blogging is not a natural way to disseminate or promote a personal view. But given the encouragement to participate in Web 2.0, and the way most of the rest of the planet seems to want to 'tell all', I figured I'd take advantage of the medium.

The snow is falling as I write, which I guess is fairly appropriate, as what I want to talk about is the RootsTech Conference in Salt Lake City, just down the road from Park City, which is arguably the centre of the winter sports industry in the US. Firstly, let me award a big bouquet to the conference organisers for providing the facilities to stream a selection of presentations 'live' to the Internet. Mostly it worked just fine once I had worked out that the environment was optimised for Internet Explorer on a Windows PC, not Safari on an iMac or the standard browser on an Android tablet.

The experience of participating virtually in a conference event was novel. Being the target of received wisdom, without being able either to ask questions or to network with other attendees and gauge their perceptions of the concepts presented, left a feeling of opportunities missed – hence this blog, to air some ideas and reactions. I think the keynote on the first morning was thought-provoking in the extreme. The stark realisation of the impermanence of media and formats, illustrated by the demise of the floppy disk, should be a wake-up call to all of us.

Succession planning


A view of 2060 was the focus of Jay Verkler's keynote presentation on the first day. He proposed that genealogists of that era will be eternally grateful for our rigorous efforts to portray the century that preceded ours.

I disagree with him, and think that genealogists in 2060 will be disappointed that we failed to take advantage of the options with which we were presented. I think we should concentrate our attention on succession planning and the persistence of research and data. I fear that too often records are thrown out, or a password-protected computer is permanently switched off, after a person dies, abandoning possibly unique insights into the past.

Publication and collaboration


Whether to publish research results on the Web is a personal decision for each genealogist. For the keeper of 'records in a shoe box' or the compiler of Excel spreadsheets there is probably no alternative – their records will remain forever private, and presumably that's what they feel comfortable with. The genealogist who consigns the totality of their records to 'The Cloud' and embraces the collaborative approach, without a desktop 'master copy', represents the other end of the spectrum.


In the middle are the current majority of genealogists, who insist on maintaining a 'master' version of their records on their PC but seek the kudos and collaborative possibilities of Internet publishing and online updating, always provided that any changes can be synchronised back to the 'master' copy. In my opinion these people are going to have to 'wake up and smell the coffee'. Automatic synchronisation in this sense is never going to happen – it's not technically possible or, I would argue, desirable. Synchronisation is, and will remain, a manual exercise fraught with problems. With a rigorous unattended backup regime in place, one which regularly emails a copy of the database backup and not just a GEDCOM, genealogists should trust 'The Cloud' – it has a much greater up-time than a local PC.
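To make the 'unattended backup' idea a little more concrete, here is a minimal sketch in Python of the emailing step. It assumes the database backup already exists as a file; the function names, addresses and mail server details are all placeholders of my own, not features of any particular genealogy package.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_backup_email(backup_path, sender, recipient):
    """Wrap an existing backup file in an email message (hypothetical helper)."""
    backup_path = Path(backup_path)
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"Family tree backup: {backup_path.name}"
    message.set_content("Automated database backup attached.")
    # Attach the raw backup file as a generic binary attachment.
    message.add_attachment(
        backup_path.read_bytes(),
        maintype="application",
        subtype="octet-stream",
        filename=backup_path.name,
    )
    return message


def send_backup(message, host, port, username, password):
    """Send the message over SMTP with SSL; the server details are assumptions."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(username, password)
        server.send_message(message)
```

To be truly unattended, a script like this would need to be run on a timer by the operating system's scheduler. Mail providers also limit attachment sizes, so a large database may need compressing or a different destination.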

There are issues with collaboration which require some thought. How disputes are resolved, who owns data and media, and who is responsible for learning how to use the chosen software all need to be agreed and spelled out – responsibilities as well as rights.

Image recognition


As part of the Ancestry keynote panel discussion on Saturday, the participants were invited to speculate about the most wide-ranging technological advance they expected to see in the next ten years. Search technology was mentioned, with advances predicted in the use of semantics based around the new HTML5 microdata standard, which will allow the extraction of genealogically significant names and events from appropriately marked-up text streams. DNA developments featured in several speculative assessments, but nobody mentioned developments in image / face recognition. I believe it will be possible quite soon, if it isn't already, for software to 'learn' the specifics of a facial image and identify that individual in hitherto unattributed images.
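For the curious, the kind of extraction from marked-up text that the panel had in mind might look something like the sketch below. It assumes HTML5 microdata attributes (`itemprop`) and the schema.org `Person` vocabulary, which is my choice for illustration and was not specified in the discussion. The sample record, including the birth date, is invented, and the parser only copes with simple, un-nested properties.

```python
from html.parser import HTMLParser

# Invented sample: a person marked up with microdata attributes.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Thomas Lawn</span> was born on
  <time itemprop="birthDate" datetime="1850-03-14">14 March 1850</time>.
</div>
"""


class ItemPropExtractor(HTMLParser):
    """Collect (property, value) pairs from flat itemprop markup."""

    def __init__(self):
        super().__init__()
        self.properties = []
        self._current = None  # name of the property whose text is being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("itemprop")
        if name is None:
            return
        if tag == "time" and "datetime" in attributes:
            # For <time>, prefer the machine-readable datetime attribute.
            self.properties.append((name, attributes["datetime"]))
        else:
            self._current = name
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the property being read.
        if self._current is not None:
            self.properties.append((self._current, "".join(self._text).strip()))
            self._current = None


parser = ItemPropExtractor()
parser.feed(SAMPLE)
print(parser.properties)
```

The point is simply that once names and dates are labelled in the page itself, software no longer has to guess which words are the person and which are the event.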

Search technology


Robert Gardner from Google introduced the topic of search engine optimisation as it relates to genealogy. Whilst his talk may have served as an introduction to the subject, he presented little of substance for existing web site owners already familiar with the topic.


Have you ever wondered why search engines don't return a large number of hits from Ancestry, Find My Past or Family Search when you query a name? It's because the 'pay wall' of the professional sites stops the crawlers dead in their tracks and prevents them from harvesting data to index. Login forms, in use on so many amateur sites, have the same effect, and the only way around this is to generate XML site-maps which can be dynamically updated to accommodate changes of content.
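As a rough illustration of what generating an XML site-map involves, the following Python sketch builds one in the sitemaps.org format from a list of page addresses and last-modified dates. The function name and the example.org addresses are hypothetical; a real site would pull the list from its own database each time the content changes.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the ISO date form, e.g. 2012-02-05.
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on an amateur family history site.
pages = [
    ("https://example.org/tree/person/101", date(2012, 2, 5)),
    ("https://example.org/tree/person/102", date(2012, 1, 20)),
]
print(build_sitemap(pages))
```

Regenerating the file whenever a record is added or edited is what makes it 'dynamic'. Bear in mind that a site-map only tells a crawler which addresses exist, so the pages it lists still need to be readable without logging in.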


Almost as a 'throw away' he offered a search tip for genealogists generally, which for me was one of the highlights of RootsTech. When using a search engine to look for records related to an individual, one should preface the search argument with '~genealogy', which aims to limit the result set to results that are genealogically significant. It's not perfect, but it helps to eliminate grass seed resellers if you're looking for Thomas LAWN. Indeed, I've already discovered a transcribed Polish marriage reference buried in a 2007 news group archive which would normally have been lost on page 78 of the search results.