folk names plan

2026-03-12 16:26:55 +00:00 · 2022-08-31 18:02:16 +03:00
parent f8f3328486
commit b6d7057765
4 changed files with 141 additions and 0 deletions
--- a/handbook/computing/folkupdate.html
+++ b/handbook/computing/folkupdate.html
@@ -17,6 +17,8 @@ The folk.csv file is stored on the server under version control in the <var>:exp
 href="../computing/repos.html">repository</a> in 
 <code>expoweb/folk/folk.csv</code>

+<p>Note that this area is subject to a <a href="../troggle/namesredesign.html">redesign proposal</a>.
+
 <p>Before expo starts the folk.csv file is updated. 

 <p>Edit folk/folk.csv, adding the new year to the end of the header line, a new column, with just a comma (blank cell) for people 
--- a/handbook/troggle/namesredesign.html
+++ b/handbook/troggle/namesredesign.html
@@ -0,0 +1,126 @@
+<!DOCTYPE html>
+<html>
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<title>CUCC Expedition Handbook: People's names design options</title>
+<link rel="stylesheet" type="text/css" href="../../css/main2.css" />
+</head>
+<body><style>body { background: #fff url(/images/style/bg-system.png) repeat-x 0 0 }</style>
+<h2 id="tophead">CUCC Expedition Handbook - People's names design options</h2>
+
+<h1>What, How and Why : People's names</h1>
+
+<ul>
+<li><a href="#why">Why</a>
+<li><a href="#maint">Maintenance constraints</a>
+<li><a href="#whatold">What we have now</a>
+<li><a href="#otherfolk">Further options for folk</a>
+
+</ul>
+
+<h2 id="why">Names: Why we need a change</h2>
+
+
+<p>The <a href="#whatold">current system</a> completely fails with names which are in any way "non-standard".
+Troggle can't cope with a name that is not structured as
+"Forename Surname": exactly two words, each beginning with a capital letter, with no other punctuation,
+capital letters, extra names or initials.
+<p>There are 19 people for whom the troggle name parsing and the separate <a href="scriptscurrent.html#folk">folklist script</a> parsing
+give different results. Reconciling these is a job that needs to be done; they are easy to find by running a link checker on the
+folk/index.htm file. Every name in the generated
+index.htm now has a hyperlink which goes to the troggle page about that person, except
+for those 19 people.
+
+<p>This has to be fixed as it affects ~5% of our expoers.
+<p><em>[This document originally written 31 August 2022]</em>
+
+<h2 id="maint">Names: Maintenance constraints</h2>
+<p>We have special code scattered across troggle to cope with "Wookey", "Wiggy" and "Mike the Animal". This is a pain to maintain.
+
+<h2 id="whatold">Names: How it works now</h2>
+<p>Fundamentally we have regexes detecting whether something is a name or not - in several places. These should all be replaced by properly delimited strings.
+<h4>Four different bits</h4>
+<ul>
+<li>In <var>urls.py</var> we have
+<code>
+    re_path(r'^person/(?P&lt;first_name&gt;[A-Z]*[a-z\-\'&amp;;]*)[^a-zA-Z]*(?P&lt;last_name&gt;[a-z\-\']*[^a-zA-Z]*[\-]*[A-Z]*[a-zA-Z\-&amp;;]*)/?', person, name="person"),
+    re_path(r'^personexpedition/(?P&lt;first_name&gt;[A-Z]*[a-z&amp;;]*)[^a-zA-Z]*(?P&lt;last_name&gt;[A-Z]*[a-zA-Z&amp;;]*)/(?P&lt;year&gt;\d+)/?$', personexpedition, name="personexpedition"),
+</code>
+where the transmission noise is attempting to recognise a name and split it into &lt;first_name&gt; and &lt;last_name&gt;. 
+Naturally this fails horribly even for relatively straightforward names such as <em>Ruairidh MacLeod</em>.
+
+<li>We have the <a href="scriptscurrent.html#folk">folklist script</a>, which reads "Forename Surname (nickname)" and "Surname" from the first two columns of the CSV file. 
+The standalone script, which is run manually, uses these to produce <var>/folk/index.html</var>. The same CSV file is also parsed by troggle
+(by a regex in <var>parsers/people.py</var>), but only when a full data import is done. Troggle gets this wrong for people like <var>Lydia-Clare Leather</var>,
+for various 'von' and 'de' middle 'names', and for McLean, MacLeod and McAdam.
+
+<li>We have the <var>*team notes Becka Lawson</var> lines in all our survex files which are parsed (by regexes in <var> parsers/survex.py</var>) only when a full data 
+import is done. 
+
+    
+
+<li>We have the <var>&lt;div class="trippeople"&gt;&lt;u&gt;Luke&lt;/u&gt;, Hannah&lt;/div&gt;</var> trip people line in each logbook entry.
+These are recognised by a regex in <var>parsers/logbooks.py</var> only when a full data import is done.
+</ul>
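+<p>To see what the first <var>urls.py</var> pattern actually captures, here is a minimal sketch that uses Python's <var>re</var> module directly instead of Django. The path <var>person/Lydia-ClareLeather</var> is an assumption about how such a URL might be formed (the name with the space removed); it is not taken from troggle itself.

```python
import re

# The "person" pattern copied from urls.py above.
PERSON_RE = re.compile(
    r'^person/(?P<first_name>[A-Z]*[a-z\-\'&;]*)[^a-zA-Z]*'
    r'(?P<last_name>[a-z\-\']*[^a-zA-Z]*[\-]*[A-Z]*[a-zA-Z\-&;]*)/?'
)

# Assumed URL form: the person's name with the space removed.
match = PERSON_RE.match("person/Lydia-ClareLeather")
first_name = match.group("first_name")
last_name = match.group("last_name")

# Show where the pattern decided to split the name.
print(repr(first_name), repr(last_name))
```

+<p>For this input the pattern splits the name just after the hyphen, giving a first name of <var>Lydia-</var> and a last name of <var>ClareLeather</var>, which matches nobody.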
+<p>Frankly it's amazing it even appears to work at all.
+
+<h4>Troggle folk data importing</h4>
+<p>
+Troggle reads the mugshot and blurb about each person.
+It reads it direct from folk.csv which has fields of URL links to those files.
+It does this when troggle is run with 
+<code>python databaseReset.py people</code>
+<p>
+Troggle generates its own blurb about each person, including past expeditions and trips
+taken from the logbooks (and from parsing svx files).
+A link to this troggle page has been added to folk/index.htm;
+this is done in make-folklist.py.
+<p>
+Troggle scans the blurb file and takes everything between &lt;body&gt; and &lt;hr&gt;
+as the text of the blurb
+(see <var>parsers/people.py</var>).
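+<p>The "everything between &lt;body&gt; and &lt;hr&gt;" rule can be sketched as below. This is an illustration only: <var>extract_blurb</var> and the sample HTML are hypothetical, and the real code in <var>parsers/people.py</var> may well differ.

```python
import re

# Hypothetical blurb file content, for illustration only.
SAMPLE = (
    "<html><body>\n<h1>A. Caver</h1>\n<p>Blurb text.</p>\n"
    "<hr />\nfooter</body></html>"
)

def extract_blurb(html):
    """Return the text between <body> and the first <hr>, or None if absent."""
    found = re.search(r"<body[^>]*>(.*?)<hr", html, re.DOTALL | re.IGNORECASE)
    return found.group(1).strip() if found else None

print(extract_blurb(SAMPLE))
```

+<p>A blurb file with no &lt;hr&gt; yields nothing at all under this rule, which is one reason the blurb files need a consistent layout.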
+<p>
+All the blurb files have to be .htm: .html is not recognised by people.py,
+and trying to fix this breaks something else (weirdly; not fully investigated).
+<p>
+There seems to be a problem with importing blurbs with more than one image file, even though the code
+in people.py only looks for the first image file, which it then fails to use.
+
+<h4>Proposal</h4>
+<p>I would start by replacing the recognisers in <var>urls.py</var> with a slug for an arbitrary text string, and interpreting it in the Python code handling the page. 
+This would entail changing all the parsing bits that populate the database so that they produce the same slug in the same way.
+<p>At that point we should get the 19 people into the system even if all the other crumdph is still there.
+Then we take a deep breath and look at it all again.
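+<p>As a rough illustration of the slug idea, the sketch below turns any display name into one delimited string, with no attempt to split it into forename and surname. The helper <var>name_slug</var> is hypothetical and is not existing troggle code; the point is only that every parser would call the same function.

```python
import re
import unicodedata

def name_slug(full_name):
    """Hypothetical helper: reduce any display name to a lowercase ASCII slug."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", full_name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")

for name in ["Wookey", "Mike the Animal", "Lydia-Clare Leather", "Ruairidh MacLeod"]:
    print(name, "->", name_slug(name))
```

+<p>Django's built-in <var>slug</var> path converter accepts ASCII letters, digits, hyphens and underscores, so a string like this could be matched in <var>urls.py</var> without a hand-written regex. Two people with the same name would get the same slug, so some tie-break would still be needed.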
+
+<h2 id="otherfolk">Folk: pending possible improvements</h2>
+
+<p>Read about the <a href="../computing/folkupdate.html">folklist script</a> before reading the rest of this.
+<p>This does some basic validation: it checks that the mugshot
+images and blurb HTML files exist.
+
+<p> The folk.csv file could be split:
+<br>
+folk-1.csv will be for old cavers who will not come again, so this file need never be touched.
+<br>
+folk-2.csv will be for recent cavers and the current expo, this needs editing every year
+
+<p>
+The year headings of folk-1 and folk-2 need to be accurate, but they do not need to be
+the same columns. So folk-2 can start in a much later year.
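+<p>A minimal sketch of how files with different year columns could be read back into one table. The column name <var>Name</var>, the sample rows and the function <var>merge_folk</var> are all assumptions for illustration; the real folk.csv has more columns.

```python
import csv
import io

# Hypothetical miniature files; real column names and rows will differ.
FOLK_1 = "Name,1976,1977\nOld Caver,1,1\n"
FOLK_2 = "Name,2019,2022\nRecent Caver,,1\n"

def merge_folk(*csv_texts):
    """Merge folk files whose year columns differ; a missing year is left blank."""
    rows, years = [], set()
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        # Any all-digit header is treated as a year column.
        years.update(field for field in reader.fieldnames if field.isdigit())
        rows.extend(reader)
    ordered_years = sorted(years)
    return [
        {"Name": row["Name"], **{year: row.get(year) or "" for year in ordered_years}}
        for row in rows
    ]

merged = merge_folk(FOLK_1, FOLK_2)
for row in merged:
    print(row)
```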
+
+<p>
+folk-0 will be for awkward buggers whose attendance spans decades. This needs updating whenever
+one of these lags attends:
+AERW, Becka, Mark Dougherty, Philip Sargent, Chris Densham, Mike Richardson
+
+<p>
+Currently (August 2022) the software ignores folk-0, -1, -2 and we have used the old folk.csv for 
+the 2022 expo. But we hope to have this fixed next year...
+
+<hr />
+Return to: <a href="trogdesign.html">Troggle design and future implementations</a><br />
+Return to: <a href="trogintro.html">Troggle intro</a><br />
+Troggle index: 
+<a href="trogindex.html">Index of all troggle documents</a><br /><hr />
+</body>
+</html>
--- a/handbook/troggle/trogdesign.html
+++ b/handbook/troggle/trogdesign.html
@@ -43,6 +43,13 @@ so that the entrance description pops up.
 <h3>Using Question Marks in active exploration</h3>
 <p>See <a href="scriptsqms.html">the current ugly situation</a>.

+<h3>Proper archive/restore of Tunnel and Therion files</h3>
+<p>Strangely, we have no process at all to allow anyone to download the archived Tunnel or Therion XML files and also 
+download the referenced source scan files at the same time so that the references within the XML files
+actually work.
+<p>The XML files contain cross-reference links to the scan files <em>on the computer where the tunnelling/therioning was done</em>,
+and these paths are different for every machine as we have no recommended standard setup.
+
 <h3>Supporting Final Survey Preparation</h3>
 <p>We have no procedure for this. And also no proper procedures (or even agreed single final location) for rigging topos either. We have a bucket folder for final drawn-up surveys on expofiles.

@@ -57,6 +64,10 @@ Element, which we can archive ourseleves, and maybe we can use Kanboard (ditto)

 <h2 id="badly">Things Troggle Does Badly</h2>

+<h3>Managing people's names</h3>
+<p>As of 2022, there are 15 people troggle can't cope with at all because their name is not structured as
+"Forename Surname": where it is only two words and each begins with a capital letter (with no other punctuation,
+capital letters or other names or initials). See the design document <a href="namesredesign.html">handling people's names properly</a>.

 <h3>Writing Cave Descriptions</h3>
 <p>In 2022 we have a working online form to create or edit the cave and entrance descriptions. But the URL 
@@ -150,6 +161,7 @@ complete copy, but if universal internet access is coming anyway, any such work

 <h2 id="specific">Specific, Immediate problems</h2>
 <ul>
+    <li>New systems for <a href="namesredesign.html">handling people's names properly</a>
    <li>New systems for <a href="menudesign.html">website menus</a>
    <li>New  <a href="lbredesign.html">logbook coding system</a> - not at all urgent
    <li><s>Short-term note on "logon" <a href="trogregistr.html">django-registration</a></s>
--- a/handbook/troggle/trogindex.html
+++ b/handbook/troggle/trogindex.html
@@ -24,6 +24,7 @@
 <li><a href="trogregistr.html">Troggle Login and user registration</a> - proposal to remove registration (DONE)<br>
 <li><a href="lbredesign.html">Troggle Logbook Format Redesign</a> - options for revising the logbook HTML format<br>
 <li><a href="menudesign.html">Troggle Menu Design</a> - options for replacing the menuing system<br><br>
+<li><a href="namesredesign.html">Troggle people's names redesign</a>
 <li><a href="trogsimpler.html">Troggle - a kinder simpler troggle?</a> - Radost's proposals (critiqued)<br>
 <li><a href="trogspeculate.html">Troggle Architecture Speculations</a> - as in April 2020<br>
 <li><a href="trog2030.html">Troggle in 2025-2030</a> - architectural evolution proposal<br>