A Day at the BYU Family History Technology Workshop

After an uneventful drive from Salt Lake City to Provo this morning, my BYU Family History Technology Workshop registration began at about 8 am, with a light continental breakfast of fruit and bagels. Most of us carried our food to our seats, and settled in, laptops on full alert…

The workshop is now in its 17th year, and has been held both at BYU and at RootsTech in Salt Lake City. This year it was held in the Hinckley Center in Provo. Attendees included researchers, software developers, and professionals.

The workshop is made up of research talks, developer presentations, and lightning talks. Areas of particular interest include handwriting recognition, automated record transcription, data modeling, machine learning, natural language processing, visualization, human interaction, and user experience.

9:00 a.m. The morning kicked off with a welcome by Mark Clement, followed by the Keynote address, which was titled 25 Years of Research in Family History Technologies at BYU: Where we have been, where we are going, presented by Bill Barrett. Bill has been at BYU, and involved in technology for a quarter century. In many cases he’s been the guy with the vision of doing the impossible – and then helping to make it possible. Many of Bill’s students now work for tech-related genealogy companies.

Bill pointed out that the Archive tab at the BYU Family History Technology Workshop website contains the papers of the past programs starting with 2001. The 2018 papers will be posted shortly. Bill noted that his keynote lecture – giving a fascinating timeline of tech progress at BYU – will be included. Accurate automated handwriting recognition is still a major issue. Great progress in the area is being made, but there’s still not a perfect system. There are many factors involved – machine print, handwriting, stamps, overlapping lines, and such – and all have to be dealt with.

10:05 a.m. The Lightning Talks were what many of us came to hear. Each presenter has about 10 minutes to give their pitch. The following were given, one right after the other. For each, the title is followed by the presenter and a brief summary (mine). Keep in mind that I’m deaf, so I wasn’t able to summarize all the info as I would have liked.

Studio, Benson Giang
Studio is an iPhone-based system that specializes in capturing photo albums and scrapbooks from wherever you are… Studio captures from the ‘top down’ right through existing plastic covers and sheets, so there is no need to remove photos or pages from our precious albums. This technology creates a digital replica without the glare from the protective covers. See: http://support.legacyrepublic.com/customer/en/portal/articles/2821426-studio-experience—2017-

QromaTag, Tony Knight
QromaTag is an iOS application that makes it easy to put the most important parts of a story into any photo in a way that will survive for generations. Using two voice recognition systems that work in tandem, QromaTag creates industry standard photo metadata based on what you tell it about your photos. Using natural language, speak the date, location and people that are in the photo and QromaTag takes care of all the technical details and embeds that information into the photo. See: https://devpost.com/software/qromatag-r70ufq

Sparse Data Matching, Christine Marchesci (BYU Economics)
Christine works with the BYU Record Linking Lab. Her example showed how automated matching could be used to match names found in a college annual with U.S. census records. Many libraries are digitizing high school yearbooks. Automated linking of the records found within them can be very useful for genealogists.

Using Deep Learning to Link Census Records, Chris Cook (BYU Economics)
Chris Cook is also with the Record Linking Lab. The talk dealt with people’s names being run through a machine learning algorithm, using data found on the trees in FamilyTree.

Extracting Genealogical Data from Books, Nick Grasley (BYU Economics)
Nick also works in the Record Linking Lab. Archive.org has 130,000 books with a genealogy tag. The books range from parish records to family trees. Thus far, 363,947 individual records have been extracted from 35 books. One book contained almost 150,000 records. An example from New Jersey Colonial Records was used. About 2/3 of the people in the book were not found on FamilySearch! This technology has great potential.

Using Geo-coordinates to Better Match Records, Tanner Eastmond (BYU Economics)
Tanner also works with the BYU Record Linking Lab. The example used was Soldiers in the Great War, 70,000 US Soldiers Who Died in WWI. The soldiers’ names and home towns (as well as a picture) are given in the book. The lab wanted to match these people to the U.S. Census, and used geo-coordinates to make the matches.
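The talk did not spell out how the geo-coordinates were used, so here is only a rough sketch of one way it could work: compute the great-circle (haversine) distance between a soldier's home town and each census record's location, and keep same-surname records that fall within a chosen radius. The record layout (`surname`, `lat`, `lon`), the function names, and the 25 km default are all my own assumptions for illustration, not the lab's actual method.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def candidates_within(soldier, census_rows, max_km=25.0):
    """Return (distance_km, row) pairs for same-surname census rows within
    max_km of the soldier's home town, nearest first.

    Hypothetical record layout: dicts with 'surname', 'lat', 'lon'.
    """
    matches = []
    for row in census_rows:
        if row["surname"].lower() != soldier["surname"].lower():
            continue
        dist = haversine_km(soldier["lat"], soldier["lon"],
                            row["lat"], row["lon"])
        if dist <= max_km:
            matches.append((dist, row))
    matches.sort(key=lambda pair: pair[0])
    return matches


# Made-up example: a soldier from Provo, and three census rows.
soldier = {"surname": "Smith", "lat": 40.2338, "lon": -111.6585}
census_rows = [
    {"surname": "Smith", "lat": 40.2969, "lon": -111.6946},  # Orem
    {"surname": "Smith", "lat": 40.7608, "lon": -111.8910},  # Salt Lake City
    {"surname": "Jones", "lat": 40.2338, "lon": -111.6585},  # Provo
]
nearby = candidates_within(soldier, census_rows)
```

A real system would surely weigh given names, ages, and other fields as well; distance just narrows the pool of candidates.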

TreeSweeper, Sam Litster (BYU Computer Science)
Sam explained how you can search your family tree (at the FamilySearch website) for possible errors – a clue might be that a person was dead when their child was born. The program catches the possible error, and then gives suggestions on what may be the issue. See: https://beta.treesweeper.fhtl.byu.edu/#!/
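To give a feel for the kind of check described, here is a minimal sketch of a "child born after parent died" rule. It is not TreeSweeper's code; the `Person` class, its fields, and the one-year allowance are assumptions I made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    # Hypothetical, simplified tree record; years only.
    name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    children: List["Person"] = field(default_factory=list)


def find_birth_after_death(person: Person) -> List[str]:
    """Return a warning for each child born after this parent's death."""
    warnings = []
    for child in person.children:
        # Skip comparisons where a date is missing.
        if person.death_year is None or child.birth_year is None:
            continue
        # Allow one year of slack, since a father can die
        # some months before his child is born.
        if child.birth_year > person.death_year + 1:
            warnings.append(
                f"{child.name} was born in {child.birth_year}, but parent "
                f"{person.name} died in {person.death_year}. Check for a "
                f"wrong date or a wrongly attached child."
            )
    return warnings


# Made-up example data.
parent = Person("John", birth_year=1800, death_year=1850)
parent.children.append(Person("Ann", birth_year=1855))   # suspicious
parent.children.append(Person("Mary", birth_year=1851))  # within the slack
problems = find_birth_after_death(parent)
```

Running the same check over every person in a tree would produce the sort of error list the talk described.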

Brownie, Ben Jones (BYU Computer Science)
Brownie breaks down your research into a to-do list, using methodology that professionals might use. According to Sam, the program is still pretty primitive, and a work in progress. Tips and resources are given to the genealogist by this new program, which has great potential.

Ancestor Games, Jeremy Hodges (BYU Computer Science)
Jeremy talked about various games that are being produced for genealogy.
Included are:

  • Matching games
  • A coloring book using sketches of your ancestor.
  • Crossword puzzles
  • A word search
  • A word scramble

Record Quest, Calvin McMurray (BYU Computer Science)
Found at the Family History Technology Lab website – Record Quest is an online game. See: https://fhtl.byu.edu/apps/record-quest.html

Check out the Family History Technology Labs website at:
https://fhtl.byu.edu/index.html It’s got some pretty interesting stuff on there.

At 11:00, we took a break. A devotional was led by Elder Gifford Nielson, and some of the rest of us spent the time on our computers and networking.

At noon we enjoyed a terrific lunch. Now, I don’t know about the majority of the folks, but I ordered a vegetarian meal, and it was wonderful! It must have cost half my registration fee alone!

1:00 p.m. Research and Developer Talks (15 min + 5 min Q&A)

The early afternoon started off with a 15-minute talk titled Building a National Longitudinal Research Infrastructure, Steve Ruggles & Catherine Fitch (University of Minnesota)
Historical data from the census is used to link across five generations, 1850-2020.
Life histories for each person can be built, using the census and many other life records.
See: https://usa.ipums.org/usa/

GenCo – Machine Learning Entity Resolution, Tyler Folkman (Ancestry)
GenCo – Genealogically Inspired Compare
GenCo automatically compares people from family trees to determine whether they are the same person. Ancestry professionals actually looked at many thousands of records and helped to work up the data for the automated system.

Extraction Rule Creation by Text Snippet Examples, Dave Embley (FamilySearch/BYU)
Created rules should allow automated text extraction with:

  • Usability by non-experts
  • Rapid development
  • High-quality results

Green QQ Current implementation – Quick and Quality
This is a work in progress. The interface is still to be written.

The impact of European General Data Protection Regulation (GDPR) on genealogy software, Sophie Tardivel (CEO of Doptim, France)
From Brittany, France
See: Geneafinder.com

The European Union’s General Data Protection Regulation takes effect on 25 May 2018. It affects all products and services delivered in the European Union.
Any information relating, directly or indirectly, to a natural person is covered by the regulation.
Penalties are imposed on data companies that break the rules… Consent from individuals is important and must be obtained to process their data.
See: http://doptim.eu/

At 2:20, the program broke for a 20-minute break. I had to break away and head back to Salt Lake, as RootsTech registration, as well as the Media Dinner were yet on the day’s menu.

The following developer talks were given later in the afternoon. I missed them, as I was already on my way back to Salt Lake. Bummer…

The papers for all these talks will be posted at the website https://fhtw.byu.edu/ within a few days. Check them all out!

2:40 p.m. Research and Developer Talks (15 min + 5 min Q&A)

Improved Blur Detection of Historical Document Images with a Neural Network, Ben Baker (FamilySearch)

Page Segmentation using Fully Convolutional Networks, Seth Stewart (BYU)

An Open Source PyTorch Library for Handwriting Recognition, Oliver Nina (University of Ohio)

Applications of Subword Spotting, Brian Davis (BYU)

Text Baseline Detection with Convolutional Neural Networks, Chris Tensmeyer (BYU)

Use of Deep Learning for Open Format Line Detection and Handwriting Recognition: An End-to-End System, Curtis Wigington (BYU)

4:40 p.m. Concluding Remarks by Joe Price

About Leland Meitzler

Leland K. Meitzler founded Heritage Quest in 1985, and has worked as Managing Editor of both Heritage Quest Magazine and The Genealogical Helper. He currently operates Family Roots Publishing Company (www.FamilyRootsPublishing.com), writes daily at GenealogyBlog.com, writes the weekly Genealogy Newsline, conducts the annual Salt Lake Christmas Tour to the Family History Library, and speaks nationally, having given over 2000 lectures since 1983.
