CUCC Expedition Handbook - Peoples' names design options

What, How and Why : Peoples' names

Why
Maintenance constraints
What we have now

This was basically fixed in 2023 by a root-and-branch replacement of people's names with a 'slug' derived from each person's name. However, there are still things we could do to improve 'folk'.

Further options for folk

Names: Why it is a problem

The former system completely failed with names which are in any way "non standard". Troggle couldn't cope with a name not structured as "Forename Surname": exactly two words, each beginning with a capital letter, with no other punctuation, capital letters, additional names or initials.
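To illustrate how restrictive a "Forename Surname" rule is, here is a minimal sketch. The pattern STRICT_NAME and the function is_standard_name are hypothetical and are not taken from troggle; they simply encode "two words, each a capital letter followed by lower-case letters".

```python
import re

# Hypothetical pattern, NOT troggle's own: exactly two words, each one
# capital letter followed by one or more lower-case letters.
STRICT_NAME = re.compile(r"^[A-Z][a-z]+ [A-Z][a-z]+$")


def is_standard_name(name: str) -> bool:
    """Return True only for names shaped exactly like 'Forename Surname'."""
    return STRICT_NAME.match(name) is not None


# Names mentioned elsewhere on this page.
for name in [
    "Becka Lawson",
    "Wookey",
    "Mike the Animal",
    "Lydia-Clare Leather",
    "Ruairidh MacLeod",
]:
    print(f"{name!r}: {is_standard_name(name)}")
```

Of these five names only "Becka Lawson" passes: a single-word name, a three-word name, a hyphenated forename and an internal capital letter are all rejected.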

There were 19 people for whom the troggle name parsing and the separate folklist script parsing gave different results.

Names: Maintenance constraints

We have special code scattered across troggle to cope with "Wookey", "Wiggy" and "Mike the Animal". This is a pain to maintain.

Names: How it works

Fundamentally we have regexes detecting whether something is a name or not - in several places in the different types of raw data. However we do now use unique 'slugs' for the references between pages (since Sept. 2023).
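As a rough sketch of what deriving a slug from a name can look like, the hypothetical function name_to_slug below lower-cases the name, strips accents and joins the remaining words with hyphens. This is an assumption for illustration only: the rule troggle actually uses is not described here, and a real system also needs a way to keep slugs unique when two people share a name.

```python
import re
import unicodedata


def name_to_slug(full_name: str) -> str:
    """Hypothetical slug rule (not troggle's): ASCII, lower case, hyphens."""
    # Split accented characters into base letter + accent, then drop
    # anything that is not plain ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", full_name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of characters other than a-z and 0-9 with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_name.lower())
    # Remove any hyphen left at either end.
    return slug.strip("-")


for name in ["Ruairidh MacLeod", "Lydia-Clare Leather", "Mike the Animal", "Wookey"]:
    print(name, "->", name_to_slug(name))
```

The point of a slug is that it never needs to be split back into forename and surname, so names of any shape can be referenced in the same way.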

Four different bits

We have the folklist script, with "Forename Surname (nickname)" and "Surname" as the first two columns of its CSV file. These are used by the standalone script, which is run manually, to produce /folk/index.html. The CSV file is also parsed by troggle (by a regex in parsers/people.py), but only when a full data import is done. This is a problem for people like Lydia-Clare Leather, for the various 'von' and 'de' middle 'names', and for McLean, MacLeod and McAdam.
We have the *team notes Becka Lawson lines in all our survex files which are parsed (by regexes in parsers/survex.py) when a full data import is done (or when a survex file is edited online).
We have the <div class="trippeople"><u>Luke</u>, Hannah</div> trip people line in each logbook entry. These are recognised by a regex in parsers/logbooks.py when a full data import is done (or when a logbook entry is edited online).
We have the list of people's names on a wallet, which is necessary when the wallet has no attached survex file. But even when there are one or more attached survex files, there is still a place to input a list of people's names. This is parsed by parsers/scans.py.
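To give a flavour of the kind of regex work involved in these four places, here is a minimal sketch that pulls the names out of a logbook "trippeople" line. It is not the regex used in parsers/logbooks.py; the names TRIPPEOPLE, TAG and trip_people are hypothetical, and the sketch simply discards the underline markup instead of interpreting it.

```python
import re

# Hypothetical patterns, not copied from parsers/logbooks.py.
TRIPPEOPLE = re.compile(r'<div\s+class="trippeople">(.*?)</div>', re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def trip_people(entry_html: str) -> list[str]:
    """Return the comma-separated names in the first trippeople div."""
    match = TRIPPEOPLE.search(entry_html)
    if match is None:
        return []
    # Drop inline tags such as <u>...</u>, keeping only the text.
    text = TAG.sub("", match.group(1))
    # Split on commas and discard empty fragments.
    return [name.strip() for name in text.split(",") if name.strip()]


print(trip_people('<div class="trippeople"><u>Luke</u>, Hannah</div>'))
```

Note that this only yields the strings "Luke" and "Hannah". Working out which person each string refers to is the hard part, and it is the same problem in all four sources.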

Frankly it's amazing it even appears to work at all.

In urls.py we used to have

    re_path(r'^person/(?P<first_name>[A-Z]*[a-z\-\'&;]*)[^a-zA-Z]*(?P<last_name>[a-z\-\']*[^a-zA-Z]*[\-]*[A-Z]*[a-zA-Z\-&;]*)/?', person, name="person"),
    re_path(r'^personexpedition/(?P<first_name>[A-Z]*[a-z&;]*)[^a-zA-Z]*(?P<last_name>[A-Z]*[a-zA-Z&;]*)/(?P<year>\d+)/?$', personexpedition, name="personexpedition"),
    re_path('wallets/person/(?P<first_name>[A-Z]*[a-z\-\'&;]*)[^a-zA-Z]*(?P<last_name>[a-z\-\']*[^a-zA-Z]*[\-]*[A-Z]*[a-zA-Z\-&;]*)/?', walletslistperson, name="walletslistperson"),

where the 'transmission noise' is attempting to recognise a name and split it into <first_name> and <last_name>. Naturally this failed horribly even for relatively straightforward names such as Ruairidh MacLeod.

We now [October 2023] have

    path('person/<slug:slug>', person, name="person"),
    path('personexpedition/<slug:slug>/<int:year>', personexpedition, name="personexpedition"),
    path('wallets/person/<slug:slug>', walletslistperson, name="walletslistperson"),

which is a lot easier to maintain.
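For readers unfamiliar with Django path converters: the slug converter accepts ASCII letters, digits, hyphens and underscores, and the int converter accepts digits. The sketch below imitates the person and personexpedition routes with plain regexes so that it runs without Django. The function names and the example slug "ruairidh-macleod" are hypothetical.

```python
import re

# Character class accepted by Django's built-in slug path converter.
SLUG = r"[-a-zA-Z0-9_]+"

# Stand-ins for two of the path() routes above (illustrative only).
PERSON = re.compile(rf"^person/(?P<slug>{SLUG})$")
PERSONEXPEDITION = re.compile(
    rf"^personexpedition/(?P<slug>{SLUG})/(?P<year>[0-9]+)$"
)


def match_person(url_path: str):
    """Return the slug if the path looks like person/<slug>, else None."""
    m = PERSON.match(url_path)
    return m.group("slug") if m else None


def match_personexpedition(url_path: str):
    """Return (slug, year) for personexpedition/<slug>/<year>, else None."""
    m = PERSONEXPEDITION.match(url_path)
    return (m.group("slug"), int(m.group("year"))) if m else None


print(match_person("person/ruairidh-macleod"))
print(match_personexpedition("personexpedition/ruairidh-macleod/2023"))
print(match_person("person/Ruairidh MacLeod"))  # a space is not a slug character
```

The URL no longer has to guess where a forename ends and a surname begins. It carries one opaque identifier, and the view looks the person up by it.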
