CUCC Expedition Handbook - People's names design options
What, How and Why: People's names
This was largely fixed in 2023 by a root-and-branch replacement of people's names with a 'slug' derived from each name. However, there are still things we could do to improve 'folk'.
Names: Why it is a problem
The former system completely failed with names that are in any way "non-standard".
Troggle couldn't cope with a name not structured as
"Forename Surname": exactly two words, each beginning with a capital letter, with no other punctuation,
capital letters, additional names or initials.
There were 19 people for whom the troggle name parsing and the separate folklist script parsing
gave different results.
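As an illustration of that constraint, here is a minimal sketch of a strict "Forename Surname" check. The pattern STRICT_NAME and the helper is_standard_name are hypothetical stand-ins written for this page, not the regex troggle actually used; the sample names are ones mentioned elsewhere in this document.

```python
import re

# Hypothetical pattern (NOT troggle's actual regex): exactly two words,
# each one capital letter followed by lowercase letters only.
STRICT_NAME = re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+")


def is_standard_name(name: str) -> bool:
    """Return True only for names shaped exactly like 'Forename Surname'."""
    return STRICT_NAME.fullmatch(name) is not None


samples = [
    "Becka Lawson",
    "Ruairidh MacLeod",
    "Lydia-Clare Leather",
    "Wookey",
    "Mike the Animal",
]
for name in samples:
    print(f"{name!r}: {is_standard_name(name)}")
```

Under this assumed pattern only "Becka Lawson" passes: an internal capital, a hyphen, a single-word name and a three-word name are each enough to be rejected.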
Names: Maintenance constraints
We have special code scattered across troggle to cope with "Wookey", "Wiggy" and "Mike the Animal". This is a pain to maintain.
Names: How it works
Fundamentally, we have regexes that detect whether something is a name, in several places, one for each type of raw data. However, since September 2023 we use unique 'slugs' for the references between pages.
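A slug derived from a name might be produced along the following lines. This is only a sketch: the helper name_to_slug and its rules (ASCII only, lower case, hyphen-separated) are assumptions made for illustration, not troggle's actual derivation, and the sketch does nothing to guarantee uniqueness between two people with the same name.

```python
import re
import unicodedata


def name_to_slug(name: str) -> str:
    """Hypothetical helper: lower-case ASCII words joined by single hyphens."""
    # Decompose accented characters, then drop anything that is not ASCII.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse each run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


for name in ["Ruairidh MacLeod", "Lydia-Clare Leather", "Mike the Animal", "Wookey"]:
    print(name, "->", name_to_slug(name))
```

The point of the design is that the slug, unlike the display name, never needs to be split back into a forename and a surname.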
Four different bits
- For the folklist script, we have "Forename Surname (nickname)" and "Surname" as the first two columns in the CSV file.
These are used by the standalone script, which is run manually, to produce /folk/index.html. They are also parsed by troggle (by a regex in
parsers/people.py), but only when a full data import is done. This is a problem for people like Lydia-Clare Leather, for the various 'von' and 'de' middle
'names', and for McLean, MacLeod and McAdam.
- We have the "*team notes Becka Lawson" lines in all our survex files, which are parsed (by regexes in parsers/survex.py) when a full data import is done (or when a survex file is edited online).
- We have the <div class="trippeople"><u>Luke</u>, Hannah</div> trip people line in each logbook entry.
These are recognised by a regex in parsers/logbooks.py when a full data import is done (or when a logbook entry is edited online).
- We have the names of people in a list on a wallet, which is necessary when the wallet has no attached survex file. Even when there are one or more attached survex files, there is still a place to input a list of people's names. This is parsed by parsers/scans.py.
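To make one of these four cases concrete, the sketch below pulls the names out of a trip people line like the one quoted above. The patterns TRIPPEOPLE and TAG and the function trip_people are hypothetical, written for this page; they are not the regex in parsers/logbooks.py.

```python
import re

# Hypothetical patterns, NOT the ones used in parsers/logbooks.py.
TRIPPEOPLE = re.compile(r'<div class="trippeople">(.*?)</div>', re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def trip_people(entry_html: str) -> list[str]:
    """Return the names on the first trippeople line, in order, with tags removed."""
    match = TRIPPEOPLE.search(entry_html)
    if match is None:
        return []
    # Names are comma-separated; strip any markup such as <u>...</u> around them.
    names = (TAG.sub("", part).strip() for part in match.group(1).split(","))
    return [name for name in names if name]


print(trip_people('<div class="trippeople"><u>Luke</u>, Hannah</div>'))
```

Note that the example line gives first names only, so turning each extracted string into one specific person needs further lookups that are not shown here.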
Frankly it's amazing it even appears to work at all.
In urls.py we used to have
re_path(r'^person/(?P<first_name>[A-Z]*[a-z\-\'&;]*)[^a-zA-Z]*(?P<last_name>[a-z\-\']*[^a-zA-Z]*[\-]*[A-Z]*[a-zA-Z\-&;]*)/?', person, name="person"),
re_path(r'^personexpedition/(?P<first_name>[A-Z]*[a-z&;]*)[^a-zA-Z]*(?P<last_name>[A-Z]*[a-zA-Z&;]*)/(?P<year>\d+)/?$', personexpedition, name="personexpedition"),
re_path('wallets/person/(?P<first_name>[A-Z]*[a-z\-\'&;]*)[^a-zA-Z]*(?P<last_name>[a-z\-\']*[^a-zA-Z]*[\-]*[A-Z]*[a-zA-Z\-&;]*)/?', walletslistperson, name="walletslistperson"),
where the 'transmission noise' is attempting to recognise a name and split it into <first_name> and <last_name>.
Naturally this failed horribly even for relatively straightforward names such as Ruairidh MacLeod.
We now [October 2023] have
path('person/<slug:slug>', person, name="person"),
path('personexpedition/<slug:slug>/<int:year>', personexpedition, name="personexpedition"),
path('wallets/person/<slug:slug>', walletslistperson, name="walletslistperson"),
which is a lot easier to maintain.
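The reason the slug routes are easier to maintain can be sketched with the standard library alone. Django's built-in slug converter accepts ASCII letters, digits, hyphens and underscores, and its int converter accepts digits; the code below emulates those two converters with plain regexes. It is a stand-in for illustration, not Django's resolver and not troggle code, and the slug "ruairidh-macleod" is an assumed example value.

```python
import re

# Character classes equivalent to Django's built-in 'slug' and 'int' converters.
SLUG = r"[-a-zA-Z0-9_]+"
INT = r"[0-9]+"

# Stand-ins for two of the path() routes above (not Django's own resolver).
PERSON = re.compile(rf"person/(?P<slug>{SLUG})")
PERSONEXPEDITION = re.compile(rf"personexpedition/(?P<slug>{SLUG})/(?P<year>{INT})")


def resolve_person(url: str):
    """Return the slug for a person URL, or None if the URL does not match."""
    match = PERSON.fullmatch(url)
    return match.group("slug") if match else None


def resolve_personexpedition(url: str):
    """Return (slug, year) for a personexpedition URL, or None if no match."""
    match = PERSONEXPEDITION.fullmatch(url)
    return (match.group("slug"), int(match.group("year"))) if match else None


# "ruairidh-macleod" is an assumed slug, used only for illustration.
print(resolve_person("person/ruairidh-macleod"))
print(resolve_personexpedition("personexpedition/ruairidh-macleod/2023"))
```

The URL pattern no longer tries to understand the structure of a name; it only has to recognise one opaque token, and the lookup from slug to person happens in the view.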