<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>CUCC Expedition Handbook: People's names design options</title>
<link rel="stylesheet" type="text/css" href="/css/main2.css" />
</head>
<body>
<style>body { background: #fff url(/images/style/bg-system.png) repeat-x 0 0 }</style>
<h2 id="tophead">CUCC Expedition Handbook - People's names design options</h2>
<h1>What, How and Why : People's names</h1>
<ul>
<li><a href="#why">Why</a>
<li><a href="#maint">Maintenance constraints</a>
<li><a href="#whatold">What we have now</a>
</ul>
<span style="color:red">This was largely fixed in 2023 by a root-and-branch replacement of people's names with a 'slug' derived from each person's name. However, there are still things we could do to improve 'folk'.</span>
<ul>
<li><a href="folkredesign.html#otherfolk">Further options for folk</a>
</ul>
<h2 id="why">Names: Why it is a problem</h2>
<p>The <a href="#whatold">former system</a> completely failed with names which are in any way "non standard".
Troggle couldn't cope with any name not structured as
"Forename Surname": exactly two words, each beginning with a capital letter, with no other punctuation,
capital letters, additional names or initials.
<p>There were 19 people for whom the troggle name parsing and the separate <a href="scriptscurrent.html#folk">folklist script</a> parsing
gave different results.
<h2 id="maint">Names: Maintenance constraints</h2>
<p>We have special code scattered across troggle to cope with "Wookey", "Wiggy" and "Mike the Animal". This is a pain to maintain.
<h2 id="whatold">Names: How it works</h2>
<p>Fundamentally we have regexes detecting whether something is a name or not - in several places in the different types of raw data. However we do now use unique 'slugs' for the references between pages (since Sept. 2023).
<h4>Four different bits</h4>
<ul>
<li>We have the <a href="scriptscurrent.html#folk">folklist script</a> holding "Forename Surname (nickname)" and "Surname" as the first two columns in the CSV file.
These are used by the standalone script, which is run manually, to produce <var>/folk/index.html</var>, and are also parsed by troggle (by a regex in
<var>parsers/people.py</var>), but only when a full data import is done. This is a problem for people like <var>Lydia-Clare Leather</var>, for various 'von' and 'de' middle
'names', and for surnames such as McLean, MacLeod and McAdam.
<li>We have the <var>*team notes Becka Lawson</var> lines in all our survex files which are parsed (by regexes in <var> parsers/survex.py</var>) when a full data import is done (or when a survex file is edited online).
<li>We have the <var>&lt;div class="trippeople"&gt;&lt;u&gt;Luke&lt;/u&gt;, Hannah&lt;/div&gt;</var> trip people line in each logbook entry.
These are recognised by a regex in <var>parsers/logbooks.py</var> when a full data import is done (or when a logbook entry is edited online).
<li>We have the names of people in a list on a wallet, which is necessary when the wallet has no attached survex file. But even when there are (one or more) attached survex files, there is a place to input a list of people's names as well. This is parsed by <var>parsers/scans.py</var>.
</ul>
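<p>As an illustration of the kind of regex-based recognition described in the list above, here is a minimal sketch that pulls the names out of a trip people line. The pattern and the function name <var>parse_trippeople</var> are hypothetical: this is not the code in <var>parsers/logbooks.py</var>, only an indication of the approach.

```python
import re

# Hypothetical pattern for illustration only; NOT the regex in parsers/logbooks.py.
TRIPPEOPLE_RE = re.compile(r'<div class="trippeople">(.*?)</div>', re.DOTALL)


def parse_trippeople(entry_html):
    """Return (names, underlined_names) from the first trippeople div.

    Returns two empty lists if the entry has no trippeople div.
    """
    m = TRIPPEOPLE_RE.search(entry_html)
    if m is None:
        return [], []
    raw = m.group(1)
    # Names wrapped in <u>...</u> are collected separately.
    underlined = [n.strip() for n in re.findall(r"<u>(.*?)</u>", raw)]
    # Drop the <u> markup, then split the remaining text on commas.
    plain = re.sub(r"</?u>", "", raw)
    names = [n.strip() for n in plain.split(",") if n.strip()]
    return names, underlined


names, underlined = parse_trippeople(
    '<div class="trippeople"><u>Luke</u>, Hannah</div>'
)
```

<p>Even this small sketch only yields bare strings such as "Luke": matching them to a specific person is a separate step, and that is where the name-structure assumptions caused trouble.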
<p>Frankly it's amazing it even appears to work at all.
<p>
In <var>urls.py</var> we used to have
<code>
re_path(r'^person/(?P&lt;first_name&gt;[A-Z]*[a-z\-\'&;]*)[^a-zA-Z]*(?P&lt;last_name&gt;[a-z\-\']*[^a-zA-Z]*[\-]*[A-Z]*[a-zA-Z\-&;]*)/?', person, name="person"),
<br /><br />
re_path(r'^personexpedition/(?P&lt;first_name&gt;[A-Z]*[a-z&;]*)[^a-zA-Z]*(?P&lt;last_name&gt;[A-Z]*[a-zA-Z&;]*)/(?P&lt;year&gt;\d+)/?$', personexpedition, name="personexpedition"),
<br /><br />
re_path('wallets/person/(?P&lt;first_name&gt;[A-Z]*[a-z\-\'&;]*)[^a-zA-Z]*(?P&lt;last_name&gt;[a-z\-\']*[^a-zA-Z]*[\-]*[A-Z]*[a-zA-Z\-&;]*)/?', walletslistperson, name="walletslistperson"),
</code>
where the 'transmission noise' is attempting to recognise a name and split it into &lt;first_name&gt; and &lt;last_name&gt;.
Naturally this failed horribly even for relatively straightforward names such as <em>Ruairidh MacLeod</em>.
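<p>To see how fragile this splitting is, the old <var>person</var> pattern can be tried directly with Python's <var>re</var> module, without Django. This is only a sketch: the exact form of the URL paths that troggle generated is an assumption here, and <var>split_old_style</var> is a hypothetical helper written for this illustration.

```python
import re

# The old 'person' pattern from urls.py, copied verbatim.
OLD_PERSON_RE = re.compile(
    r"^person/(?P<first_name>[A-Z]*[a-z\-\'&;]*)[^a-zA-Z]*"
    r"(?P<last_name>[a-z\-\']*[^a-zA-Z]*[\-]*[A-Z]*[a-zA-Z\-&;]*)/?"
)


def split_old_style(path):
    """Hypothetical helper: return (first_name, last_name) as the old pattern splits them."""
    m = OLD_PERSON_RE.search(path)
    if m is None:
        return None
    return m.group("first_name"), m.group("last_name")


# Assumed path formats, for illustration only.
simple = split_old_style("person/Becka Lawson")           # ("Becka", "Lawson")
hyphenated = split_old_style("person/Lydia-Clare Leather")  # ("Lydia-", "Clare")
```

<p>The plain two-word name splits as intended, but the hyphenated forename is cut at the second capital letter and the real surname is never captured.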
<p>
<span style="color:red">We now [October 2023] have</span>
<code>
path('person/&lt;slug:slug&gt;', person, name="person"),<br />
path('personexpedition/&lt;slug:slug&gt;/&lt;int:year&gt;', personexpedition, name="personexpedition"),<br />
path('wallets/person/&lt;slug:slug&gt;', walletslistperson, name="walletslistperson"),
</code>
which is a lot easier to maintain.
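<p>A slug sidesteps the splitting problem because the URL no longer has to encode the structure of the name. The sketch below shows one way a slug could be derived from a name using only the standard library. It is an assumption for illustration, not troggle's actual slug code: <var>name_to_slug</var> is a hypothetical helper and it makes no attempt to guarantee uniqueness. The character set checked at the end is the one accepted by Django's built-in <var>slug</var> path converter.

```python
import re
import unicodedata

# Characters accepted by Django's built-in "slug" path converter.
SLUG_RE = re.compile(r"[-a-zA-Z0-9_]+")


def name_to_slug(full_name):
    """Hypothetical helper: derive a URL-safe slug from a person's name.

    Uniqueness is not handled; two people with the same name would collide.
    """
    # Strip accents by decomposing and discarding non-ASCII characters.
    ascii_name = (
        unicodedata.normalize("NFKD", full_name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, then drop anything except letters, digits, whitespace and hyphens.
    cleaned = re.sub(r"[^a-z0-9\s-]", "", ascii_name.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[\s-]+", "-", cleaned).strip("-")


slugs = [
    name_to_slug(n)
    for n in ("Ruairidh MacLeod", "Lydia-Clare Leather", "Mike the Animal", "Wookey")
]
# Every slug fits the converter's character set, whatever the shape of the name.
all_valid = all(SLUG_RE.fullmatch(s) for s in slugs)
```

<p>Names with one word, three words or a hyphen all produce a single opaque token, so the URL patterns need no special cases.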
<hr />
Return to: <a href="trogdesign.html">Troggle design and future implementations</a><br />
Return to: <a href="trogintro.html">Troggle intro</a><br />
Troggle index:
<a href="trogindex.html">Index of all troggle documents</a><br /><hr /></body>
</html>