The Pajek file is for testing and visualizing alternatives in the automated procedure with the Alberti data. Here, all the possible identifications with parents who fit the current parameters are shown (the same can be done for the entire dataset). I have not yet adjusted all the parameters optimally nor ranked the choices. Eventually you will have the following ranked relations:
1 best choice for parents of husband
2 ditto for wife
3 2nd best choice for parents of husband, if different
4 ditto for wife
5 3rd best choice for parents of husband, if different
6 ditto for wife
When entered back in the datafile, these will be six separate variables plus one or more variables for the probable reliability of the estimate.
The identification of parental couples will correspond to their unique line number in the master file, which will also become a variable so the file can subsequently be changed.
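As a rough illustration of how the six ranked variables might be laid out for one marriage record, here is a minimal Python sketch. It is not the program described here: the function name, the variable names (h_parents_1, w_parents_1, and so on), and the idea of a numeric score where higher means a better fit are all assumptions made for the example. Parental couples are identified by their master-file line number, as described above.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical candidate: (master-file line number of the parental couple, score).
# Assumption: a higher score means a better fit.
Candidate = Tuple[int, float]


def ranked_parent_variables(
    husband: List[Candidate],
    wife: List[Candidate],
    top_n: int = 3,
) -> Dict[str, Optional[int]]:
    """Return the top_n distinct parental couples for husband and wife."""

    def top_distinct(cands: List[Candidate]) -> List[Optional[int]]:
        chosen: List[Optional[int]] = []
        # Walk the candidates from best to worst, skipping repeated couples
        # ("if different" in the list above).
        for line_no, _score in sorted(cands, key=lambda c: c[1], reverse=True):
            if line_no not in chosen:
                chosen.append(line_no)
        chosen = chosen[:top_n]
        # Pad with None when fewer than top_n distinct candidates exist.
        return chosen + [None] * (top_n - len(chosen))

    h = top_distinct(husband)
    w = top_distinct(wife)
    row: Dict[str, Optional[int]] = {}
    for rank in range(top_n):
        row[f"h_parents_{rank + 1}"] = h[rank]
        row[f"w_parents_{rank + 1}"] = w[rank]
    return row


example = ranked_parent_variables(
    husband=[(120, 0.9), (87, 0.4), (120, 0.2)],
    wife=[(45, 0.7)],
)
```

A reliability variable could be added to the same row later; how it would be computed is left open here.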

Within Pajek, using the Options/ReadWrite Threshold (setting it to 3, 5, etc.), you can strip off all but the top choices to get the best estimate of the parents, or keep the multiple relations to see what the alternatives are (be sure to set the threshold back to 0 when done!). There may still be a few cases where the links shown on the genealogies are not among the alternatives because of huge discrepancies between the marriage dates of parents and children. In some cases, the genealogies themselves may be wrong.
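The effect of such a threshold can be sketched outside Pajek as a simple filter on link values. This is only an illustration, not Pajek's own code: the tuple layout, the function name, and the assumption that a larger link value marks a stronger candidate (so that links below the threshold are the ones dropped) are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical link: (child marriage id, parental couple id, link value).
Edge = Tuple[int, int, float]


def apply_threshold(edges: List[Edge], threshold: float) -> List[Edge]:
    """Keep only links whose value is at least the threshold (assumed semantics)."""
    return [edge for edge in edges if edge[2] >= threshold]


links = [(1, 120, 5.0), (1, 87, 3.0), (1, 64, 1.0)]
top_only = apply_threshold(links, 5)    # only the strongest candidate remains
all_links = apply_threshold(links, 0)   # threshold 0 keeps every alternative
```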

There is an incredibly good fit at this point between the automated procedure and the Alberti genealogy. Part of this is because you have done an amazing job with the accuracy and uniqueness of the spelling of the names. 90% or more of my best estimates by the automated procedure will correspond with what is in the genealogies. What is amazing to me is that nearly all of the ancestors who are branching points in the tree (i.e., who leave descendants) are recovered by the automated procedure, at least within the timeframe covered. Many of those who do not leave descendants, of course, died early, did not marry, or migrated out, so they do not enter your records.

This job took three days of programming, but I am quite happy with the results. I now need to tweak the parameters a bit to get optimal predictions, and then to program the ranking and reliability estimation procedure.

I have an option to feed the program a family name or names, as I have done here with Alberti, so that only one or more selected segments of the genealogy are reconstructed.
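A family-name option of this kind could look roughly like the Python sketch below. The record layout (a dict with a "surname" field), the function name, and the case-insensitive matching are assumptions for illustration only, and the sample rows are made up.

```python
from typing import Dict, Iterable, List


def select_by_family(
    records: List[Dict[str, str]],
    family_names: Iterable[str],
) -> List[Dict[str, str]]:
    """Keep only records whose surname matches one of the given family names."""
    # Normalize case and surrounding whitespace before comparing (assumed behavior).
    wanted = {name.strip().lower() for name in family_names}
    return [r for r in records if r["surname"].strip().lower() in wanted]


# Made-up sample rows; "line" stands in for the master-file line number.
sample = [
    {"line": "1", "surname": "Alberti"},
    {"line": "2", "surname": "Rossi"},
    {"line": "3", "surname": "alberti "},
]
alberti_only = select_by_family(sample, ["Alberti"])
```

Passing several names, for example ["Alberti", "Rossi"], would select more than one segment of the genealogy at once.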