diff --git a/handbook/troggle/namesredesign.html b/handbook/troggle/namesredesign.html index 4a69cdd7a..09cdbeabe 100644 --- a/handbook/troggle/namesredesign.html +++ b/handbook/troggle/namesredesign.html @@ -1,126 +1,127 @@ - - - - -CUCC Expedition Handbook: Peoples' names design options - - - -

CUCC Expedition Handbook - Peoples' names design options

- -

What, How and Why : Peoples' names

- - - -

Names: Why we need a change

- - -

The current system completely fails with names which are in any way "non standard". -Troggle can't cope with a name not structured as -"Forename Surname": where it is only two words and each begins with a capital letter (with no other punctuation, -capital letters or other names or initials). -

There are 19 people for which the troggle name parsing and the separate folklist script parsing -are different. Reconciling these (find easily using a link checker scanner on the -folk/.index.htm file) is a job that needs to be done. Every name in the generated -index.htm now has a hyperlink which goes to the troggle page about that person. Except -for those 19 people. - -This has to be fixed as it affects ~5% of our expoers. -

[This document originally written 31 August 2022] - -

Names: Maintenance constraints

-

We have special code scattered across troggle to cope with "Wookey", "Wiggy" and "Mike the Animal". This is a pain to maintain. - -

Names: How it works now

-

Fundamentally we have regexes detecting whether something is a name or not - in several places. These should all be replaced by properly delimited strings. -

Four different bits

- -

Frankly it's amazing it even appears to work at all. - -

Troggle folk data importing

-

-Troggle reads the mugshot and blurb about each person. -It reads it direct from folk.csv which has fields of URL links to those files. -It does this when troggle is run with -python databaseReset.py people -

-Troggle generates its own blurb about each person, including past expeditions and trips -taken from the logbooks (and from parsing svx files) -A link to this troggle page has been added to folk/index.htm -by making it happen in make-folklist.py -

-Troggle scans the blurb and looks for everything between <body> and <hr> -to find the text of the blurb -(see parsers/people.py) -

-All the blurb files have to be .htm - .html is not recognised by people.py -and trying to fix this breaks something else (weirdly, not fully investigated). -

-There seems to be a problem with importing blurbs with more than one image file, even those the code -in people.py only looks for the first image file but then fails to use it. - -

Proposal

-

I would start by replacing the recognisers in urls.py with a slug for an arbitrary text string, and interpreting it in the python code handling the page. -This would entail replacing all the database parsing bits to produce the same slug in the same way. -

At that point we should get the 19 people into the system even if all the other crumdph is still there. -Then we take a deep breath and look at it all again. - -

Folk: pending possible improvements

- -

Read about the folklist script before reading the rest of this. -

This does some basic validation: it checks that the mugshot -images and blurb HTML files exist. - -

The folk.csv file could be split: -
-folk-1.csv will be for old cavers who will not come again, so this file need never be touched. -
-folk-2.csv will be for recent cavers and the current expo, this needs editing every year - -

-The year headings of folk-1 and folk-2 need to be accurate , but they do not need to be -the same columns. So folk-2 can start in a much later year. - -

-folk-0 will be for awkward buggers whose attendance spans decades. This needs updating whenever -one of these lags attends: -AERW, Becka, Mark Dougherty, Philip Sargent, Chris Densham, Mike Richardson - -

-Currently (August 2022) the software ignores folk-0, -1, -2 and we have used the old folk.csv for -the 2022 expo. But we hope to have this fixed next year... - -


-Return to: Troggle design and future implementations
-Return to: Troggle intro
-Troggle index: -Index of all troggle documents

- - + + + + +CUCC Expedition Handbook: Peoples' names design options + + + + +

CUCC Expedition Handbook - Peoples' names design options

+ +

What, How and Why : Peoples' names

+ + + +

Names: Why we need a change

+ + +

The current system completely fails with names which are in any way "non standard". +Troggle can't cope with a name not structured as +"Forename Surname": where it is only two words and each begins with a capital letter (with no other punctuation, +capital letters or other names or initials). +
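To illustrate the failure mode described above, here is a minimal sketch of a "Forename Surname" recogniser. The pattern `SIMPLE_NAME` and the helper `is_simple_name` are hypothetical and written only for this example; they are not troggle's actual regexes, which live in several places and may differ in detail.

```python
import re

# Hypothetical pattern of the kind described above: exactly two words,
# each a capital letter followed by lowercase ASCII letters.
# This is an illustration, NOT the regex troggle really uses.
SIMPLE_NAME = re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+")


def is_simple_name(name: str) -> bool:
    """Return True only if the whole string looks like 'Forename Surname'."""
    return SIMPLE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # Names taken from this document; only the first fits the pattern.
    for name in ["Mark Dougherty", "Wookey", "Mike the Animal", "AERW"]:
        print(f"{name!r}: {is_simple_name(name)}")
```

Any single-word nickname, three-word name, or name with extra capitals or punctuation is rejected by a recogniser of this shape, which is why special-case code accumulates.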

There are 19 people for whom the troggle name parsing and the separate folklist script parsing +give different results. Reconciling these (easily found by running a link checker scanner on the +folk/.index.htm file) is a job that needs to be done. Every name in the generated +index.htm now has a hyperlink which goes to the troggle page about that person, except +for those 19 people. + +This has to be fixed as it affects ~5% of our expoers. +

[This document originally written 31 August 2022] + +

Names: Maintenance constraints

+

We have special code scattered across troggle to cope with "Wookey", "Wiggy" and "Mike the Animal". This is a pain to maintain. + +

Names: How it works now

+

Fundamentally we have regexes detecting whether something is a name or not - in several places. These should all be replaced by properly delimited strings. +

Four different bits

+ +

Frankly it's amazing it even appears to work at all. + +

Troggle folk data importing

+

+Troggle reads the mugshot and blurb about each person. +It reads it direct from folk.csv which has fields of URL links to those files. +It does this when troggle is run with +python databaseReset.py people +

+Troggle generates its own blurb about each person, including past expeditions and trips +taken from the logbooks (and from parsing svx files). +A link to this troggle page has been added to folk/index.htm +by making it happen in make-folklist.py +

+Troggle scans the blurb and looks for everything between <body> and <hr> +to find the text of the blurb +(see parsers/people.py) +
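The blurb extraction described above can be sketched as follows. This is an assumption about the general approach, not a copy of the code in parsers/people.py; the names `BLURB` and `extract_blurb` are invented for the example.

```python
import re
from typing import Optional

# Capture everything after the opening <body ...> tag up to the first <hr.
# DOTALL lets '.' span newlines; IGNORECASE accepts <BODY> and <HR> too.
BLURB = re.compile(r"<body[^>]*>(.*?)<hr", re.IGNORECASE | re.DOTALL)


def extract_blurb(html: str) -> Optional[str]:
    """Return the text between <body> and the first <hr>, or None if not found."""
    match = BLURB.search(html)
    return match.group(1).strip() if match else None


if __name__ == "__main__":
    page = "<html><body>\n<h1>Wookey</h1><p>Caver.</p>\n<hr>\nfooter</body></html>"
    print(extract_blurb(page))
```

A blurb file with no `<hr>` yields `None` in this sketch, so a caller would need to decide whether to skip the person or fall back to the whole body.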

+ [This now seems to have been fixed (July 2023):

+ +

Proposal

+

I would start by replacing the recognisers in urls.py with a slug for an arbitrary text string, and interpreting it in the python code handling the page. +This would entail replacing all the database parsing bits to produce the same slug in the same way. +

At that point we should get the 19 people into the system even if all the other crumdph is still there. +Then we take a deep breath and look at it all again. + +
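The slug idea in the proposal above could look something like the sketch below. The function `name_slug` is hypothetical; the point is only that every parser and every URL handler would call the same function, so that "Wookey" and "Mike the Animal" need no special cases.

```python
import re
import unicodedata


def name_slug(name: str) -> str:
    """Turn an arbitrary display name into a lowercase, hyphen-separated slug."""
    # Decompose accented letters, then drop anything that is not plain ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, and collapse each run of non-alphanumerics into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


if __name__ == "__main__":
    for name in ["Mark Dougherty", "Wookey", "Mike the Animal"]:
        print(f"{name!r} -> {name_slug(name)!r}")
```

Two different people could still produce the same slug (for example, two cavers with identical names), so a real implementation would need a uniqueness check or a disambiguating suffix; that is not handled here.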

Folk: pending possible improvements

+ +

Read about the folklist script before reading the rest of this. +

This does some basic validation: it checks that the mugshot +images and blurb HTML files exist. + +
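A check of this kind can be sketched as below. The column names `Name`, `Mugshot` and `Blurb` and the function `missing_files` are assumptions made for the example; the real folk.csv headings and the folklist script's own logic may differ.

```python
import csv
from pathlib import Path


def missing_files(csv_path, base_dir, columns=("Mugshot", "Blurb")):
    """Return (name, column, path) for each referenced file that does not exist.

    The column names are hypothetical; adjust them to the real CSV headings.
    """
    base = Path(base_dir)
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            for col in columns:
                rel = (row.get(col) or "").strip()
                # Empty cells are allowed: not everyone has a mugshot or blurb.
                if rel and not (base / rel).is_file():
                    missing.append((row.get("Name", ""), col, rel))
    return missing
```

Calling `missing_files("folk.csv", ".")` from the folk directory would then list every broken mugshot or blurb reference in one pass.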

The folk.csv file could be split: +
+folk-1.csv will be for old cavers who will not come again, so this file need never be touched. +
+folk-2.csv will be for recent cavers and the current expo; this needs editing every year. + +

+The year headings of folk-1 and folk-2 need to be accurate, but they do not need to be +the same columns. So folk-2 can start in a much later year. + +

+folk-0 will be for awkward buggers whose attendance spans decades. This needs updating whenever +one of these lags attends: +AERW, Becka, Mark Dougherty, Philip Sargent, Chris Densham, Mike Richardson + +

+Currently (July 2023) the software ignores folk-0, -1, -2 and we have used the old folk.csv for +the 2023 expo. But we hope to have this fixed next year... + +


+Return to: Troggle design and future implementations
+Return to: Troggle intro
+Troggle index: +Index of all troggle documents

+