diff --git a/handbook/computing/folkupdate.html b/handbook/computing/folkupdate.html index f43f20a6a..194c5cc7a 100644 --- a/handbook/computing/folkupdate.html +++ b/handbook/computing/folkupdate.html @@ -5,7 +5,7 @@ CUCC Expedition Handbook: People Update - +

CUCC Expedition Handbook

The list of people on expo

diff --git a/handbook/computing/logbooks-parsing.html b/handbook/computing/logbooks-parsing.html index 01ec43191..82b3b70d3 100644 --- a/handbook/computing/logbooks-parsing.html +++ b/handbook/computing/logbooks-parsing.html @@ -5,7 +5,7 @@ CUCC Expedition Handbook: Logbook import - +>

CUCC Expedition Handbook

Logbooks Import

@@ -59,15 +59,7 @@ Calculating GetPersonExpeditionNameLookup for 2017

Errors are usually misplaced or duplicated <hr /> tags; names which are not specific enough to be recognised by the parser (though it tries hard), such as "everyone" or "et al.", or which are simply missing; or a bit of description which has been put into the names section, such as "Goulash Regurgitation".

The logbooks format

-

This is documented on the logbook user-documentation page as even expoers who can do nothing else technical can at least write up their logbook entries. - -

[ Yes this format needs to be re-done using a proper structure:
-

-<div class="logentry">
-     -</div">
-it's on the to-do list...] - +

This is documented on the logbook user-documentation page as even expoers who can do nothing else technical can at least write up their logbook entries.

Historical logbooks format

Older logbooks (prior to 2007) were stored as logbook.txt with just a bit of consistent markup to allow troggle parsing.

diff --git a/handbook/logbooks.html b/handbook/logbooks.html index 4b3c1ef18..44b0a0382 100644 --- a/handbook/logbooks.html +++ b/handbook/logbooks.html @@ -144,10 +144,18 @@ idea to type up just your trip(s) in a separate file, e.g. "logbook-mynew <div class="timeug">T/U 10 mins</div>

Note: the IDs must be unique, so they are generated from 't' plus the trip date plus a,b,c etc. when there is more than one trip on a day.
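The ID scheme described above can be sketched in Python. This is a hypothetical helper for illustration only, not troggle's actual code; the exact date-string format is whatever appears in the logbook.

```python
from collections import Counter
from string import ascii_lowercase

def trip_ids(dates):
    """Generate unique entry IDs: 't' + trip date, with a, b, c... appended
    only when more than one trip shares the same date (illustrative sketch)."""
    total = Counter(dates)   # how many trips fall on each date
    seen = Counter()
    ids = []
    for d in dates:
        suffix = ascii_lowercase[seen[d]] if total[d] > 1 else ""
        seen[d] += 1
        ids.append("t" + d + suffix)
    return ids
```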

-

Note: T/U stands for "Time Underground" in hours (6 minutes would be "0.1 hours"). -

Note: the <hr /> is significant and used in parsing, it is not just prettiness. +

Note: T/U stands for "Time Underground" in hours (6 minutes would be "0.1 hours"). +

Note: the <hr /> is significant and used in parsing, it is not just prettiness. +

Note this special format "Top Camp - " in the triptitle line: +

<div class="triptitle">Top Camp - Setting up 76 bivi</div>
+It denotes the cave or area the trip or activity happened in. It is a word or two separated from the rest of the triptitle with " - " (space-dash-space). Usual values +for this are "Plateau", "Base camp", "264", "Balkon", "Tunnocks", "Travel" etc. +
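The space-dash-space convention above can be applied mechanically; a minimal sketch (hypothetical helper, assuming a title without " - " simply has no place prefix):

```python
def split_triptitle(triptitle):
    """Split the cave/area prefix off a triptitle on the first ' - '
    (space-dash-space). Returns (place, title); place is None if absent."""
    if " - " in triptitle:
        place, title = triptitle.split(" - ", 1)
        return place, title
    return None, triptitle
```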

Note this special format "<u>Jenny Black</u>" in the trippeople line: +

<div class="trippeople"><u>Jenny Black</u>, Olly Betts</div>
+
+It is necessary that one (and only one) of the people on the trip is set in <u></u> underline format. This is interpreted to mean that this is the author of the logbook entry. If there is no author set, then this is an error and the entry is ignored.
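Extracting the underlined author can be sketched like this (an illustrative regex, not troggle's actual parser; a None result corresponds to the "no author set" error described above):

```python
import re

def find_author(trippeople_html):
    """Return the first name wrapped in <u></u> in a trippeople line,
    or None if no author is marked (which would make the entry invalid)."""
    m = re.search(r"<u>(.*?)</u>", trippeople_html)
    return m.group(1) if m else None
```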
diff --git a/handbook/survey/qmentry.html b/handbook/survey/qmentry.html index 4ead8f5ac..1776bde71 100644 --- a/handbook/survey/qmentry.html +++ b/handbook/survey/qmentry.html @@ -12,7 +12,7 @@

QM data and cave descriptions

-This document describes how to include Qustion Marks (QMs) and cave descriptions in .svx files. +This document describes how to include Question Marks (QMs) and cave descriptions in .svx files.

There are dedicated fields in the template.svx file for this purpose, but there has been laxness recently on filling them in. @@ -68,6 +68,17 @@ Here is an example from the last bit of bipedalpassage.svx in 264. Note that eac ;QM6 C bipedalpassage.31 - Very good location where main phreatic passages and enlarges - but far side of chamber choked. One part of choke was not accessed as needs 2m climb up to poke nose in it. A good free climber could do this or needs one bolt to be sure no way on. Very strong draft in choke! Interesting southerly trend at margin of known system +

+The format for question mark lists is
+

+

The QM numbers themselves are in the format
+

+

This format is documented in the QM conventions page. +
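Based only on the bipedalpassage.svx example above, a ;QM comment line can be picked apart like this. This is a sketch under assumptions: the set of grade letters (A-D plus X) and the fixed " - " separator are inferred from the example, and troggle's real parser is more tolerant than this regex.

```python
import re

# number, grade letter, nearest station, free-text description
QM_RE = re.compile(r";\s*QM(\d+)\s+([A-DX])\s+(\S+)\s+-\s+(.*)")

def parse_qm(line):
    """Parse one ';QM...' survex comment line of the form shown above,
    returning a dict, or None if the line does not match."""
    m = QM_RE.match(line)
    if not m:
        return None
    number, grade, station, desc = m.groups()
    return {"number": int(number), "grade": grade,
            "station": station, "description": desc}
```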

The example below demonstrates correct and effective use of the QM list referring back to earlier elements in the svx file:

@@ -80,6 +91,10 @@ is a very inefficient use of time.

Also if the person reading it hasn’t been to the bit of cave (which is, like, the whole point), then the data has a higher chance of being incorrect. It is not always easy to interpret Tunnel or Therion drawings correctly with this sort of thing. +

Programming note

+

Better handling of historic QMs is a current, occasionally active, area of +development in our online systems. The current status is documented here. +

Conclusion

Survey data recorded in .svx files is incomplete if there is no QM List data and cave description data! diff --git a/handbook/troggle/scriptsqms.html b/handbook/troggle/scriptsqms.html index bbb87e2f2..fddfea09b 100644 --- a/handbook/troggle/scriptsqms.html +++ b/handbook/troggle/scriptsqms.html @@ -34,7 +34,7 @@ tl;dr - use svx2qm.py. Look at the output at:

There are four ways we have used to manage QMs:

    -
  1. Perl script - Historically QMs were not in the survex file but typed up in a separate list qms.csv for each cave system. A perl script turned that into an HTML file for the website. +
  2. Perl script - Historically QMs were not in the survex file but typed up in a separate list qms.csv for each cave system. A perl script turned that into an HTML file for the website. But there appear to be 3 different formats for this.
  3. Perl + troggle - One of troggle's input parsers "QM parser" is specifically designed to import the three HTML files produced from qms.csv but doesn't do anything with that data (yet).
  4. Python script - Phil Withnall's 2019 script svx2qm.py scans all the QMs in a single survex file. See below for how to run it on all survex files.
  5. New troggle - Sam's recent addition to troggle's "survex parser" makes it recognise and store QMs when it parses the survex files. @@ -52,6 +52,24 @@ tl;dr - use svx2qm.py. Look at the output at:
    /1623/204/qm.html

    Note that the qms.csv file used as input by this script is an entirely different format and table structure from the qms.csv file produced by svx2qm.py. +

    And in fact the formats of these 3 qm.csv files are not the same (these are the +"older or artisanal QM formats" referred to by Phil Withnall at the bottom of this page): + +Fields in 204/qm.csv are: +

    Number, grade, area, description, page reference, nearest station, completion description, Comment
    +e.g.
    +C1999-204-09    C    Wolp    Hole in floor through dangerous boulders        veined.10    Filled with rocks
    +
    +Fields in 258/qm.csv are: +
    Cave, year, number, Grade, nearest station, description, completion description, found by, completed by
    +e.g.
    +258  2006  27        C      258.gknodel.4    Small passage to E in Germkn”del          Sandeep Mavadia and Dave Loeffler
    +
    +Fields in 264/qm.csv are: +
    Year, number, Grade, Survey folder ref#, Surveyname, Nearest Station number, Area of the cave, Description, Y if marked on drawn-up survey,
    +2014  7          C        2014#11      roomwithaview    4        Room With a View      Room With a View: "Probably chokes"  opposite stations 4 and 5      ALREADY EXPLORED PROBABLY
    +
    +
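Reading just the 204-style file can be sketched with the csv module. Assumptions are flagged in the comments: the field names are taken from the list above, and the file is assumed to be tab-separated (as the example row suggests); the 258 and 264 files would each need their own field list, which is part of why unifying the formats is attractive.

```python
import csv
import io

# Field names taken from the 204/qm.csv description above; the snake_case
# spellings are this sketch's own choice, not anything troggle uses.
FIELDS_204 = ["number", "grade", "area", "description",
              "page_ref", "nearest_station", "completion", "comment"]

def read_qms_204(text):
    """Parse 204-style qm.csv content (assumed tab-separated) into dicts."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [dict(zip(FIELDS_204, row)) for row in reader]
```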

    There are also three versions of the QM list for cave 161 (Kaninchenhohle) apparently produced by this method but hand-edited:
    /1623/161/qmaven.html 1996 version
    /1623/161/qmtodo.html 1998 version
    @@ -60,6 +78,25 @@ tl;dr - use svx2qm.py. Look at the output at:

    In the /1623/204/ folder there is a script qmreader.pl which apparently does the inverse of tablize-qms.pl: it transforms a QMs' HTML file into a CSV file. +

    As Wookey says (Slack, 7 Jan. 2020): +"I'm not quite sure what the best format is. Some combination of the +258 and 264 formats might be best. Including the cave number seems +pointless. Including 'conclusion' info seems like a good idea. I'm not +sure there what the benefit of separating the 'surveyname' and +'nearest station' fields is. Having an 'area of cave' field is somewhat useful +for grouping, even though it is sort-of repeating the 'survey-station' info. + +If I was making a QM list I'd enter these fields: +year, number, Grade, nearest station, folder reference, description, found by, completed (Year), completion description/cave description link, completed by + +with these details: +

    then a short description here is OK." +

    troggle/parsers/qms.py

    The parser troggle/parsers/qms.py currently imports those same qm.csv files from the perl script into troggle using a mixture of csv and html parsers: @@ -88,6 +125,11 @@ The 2019 copies are online in /expofiles/: This will work on all survex *.svx files even those which have not yet been run through the troggle import process.

    Phil says (13 April 2020): "The generated files are not meant to be served by the webserver, it’s a tool for people to run locally. Someone could modify it to create HTML output (or post-process the CSV output to do the same), but that is work still to be done." +
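Running svx2qm.py over a whole survex checkout could be wrapped like this. This is a sketch only: the directory name "loser" and the way svx2qm.py is invoked locally are assumptions; the text above says only that the script scans a single survex file per run.

```python
import pathlib

def svx2qm_commands(root):
    """Build one svx2qm.py command per .svx file under root (hypothetical
    wrapper; adjust the invocation to however the script is installed)."""
    return [["python", "svx2qm.py", str(p)]
            for p in sorted(pathlib.Path(root).rglob("*.svx"))]
```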

    troggle/parsers/survex.py

    +

    The QMs inside the survex files are parsed by troggle along with all the other information +inside survex files and stored in the database. But the webpages which display this data are rudimentary, e.g. /getQMs/1623-204 or /cave/qms/1623-204. +Looking through urls.py and core/view_caves.py we see a lot of code for providing new QM numbers, producing lists of QMs for a given cave and for downloading QM.csv files generated by the database. But none of it appears to be working today (14 May 2020), see below. +

    Sam's parser additions

    Troggle troggle/parsers/survex.py currently parses and stores all the QMs it finds in survex files. The tables where the data is put are listed in the current data model including structure for ticking them off. @@ -108,7 +150,7 @@ So someone was busy at one time.

    QMs - monitoring progress

    find-dead-qms.py

    -

    This finds references to completed qms in the qm.csv files in the cave folders (/1623/ etc.) in the :expoweb: repository. It looks to see which QMs have been completed but where there is not yet a matching text in the cave description. +

    This stand-alone script finds references to completed qms in the qm.csv files in the cave folders (/1623/ etc.) in the :expoweb: repository. It looks to see which QMs have been completed but where there is not yet a matching text in the cave description.
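The core idea of the script can be sketched as follows. This is a hypothetical re-implementation for illustration, not the real find-dead-qms.py: it assumes QM rows carry a completion-description field, and flags those whose completion text does not appear in the cave description.

```python
def dead_qms(qm_rows, description_text):
    """Return the numbers of QMs marked complete whose completion text does
    not yet appear anywhere in the cave description (illustrative sketch)."""
    return [q["number"] for q in qm_rows
            if q.get("completion") and q["completion"] not in description_text]
```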

    Quick and dirty Python script to find references to completed qms in the cave description pages. Run this to find which bits of description need updating. @@ -153,7 +195,7 @@ I guess it all depends on what questions people are trying to answer using the Q as to how (and where) best to present it. I’m afraid I don’t have any suggestions there. :Rob Watson wrote some documentation about QMs -:http://expo.survex.com/handbook/survey/qmentry.html +:http://expo.survex.com/handbook/survey/qmentry.html :is there anything subtle missing as to how they are used ? Nope, I think Rob’s page covers it all. That page also documents the correct QM format diff --git a/handbook/troggle/trogdesignx.html b/handbook/troggle/trogdesignx.html index 0bbb527fe..fac629b71 100644 --- a/handbook/troggle/trogdesignx.html +++ b/handbook/troggle/trogdesignx.html @@ -50,6 +50,15 @@ Which is fun, but not useful. And not just because it is immature. None of this addresses our biggest problem: devising something that can be maintained by fewer, less-expert people who can only devote short snippets of time and not long-duration immersion. +

    Our biggest problem

    +We need: +
      +
    • something that can be maintained by fewer, less-expert people +
    • who can only devote short snippets of time +
    • without requiring weeks of long-duration deep immersion +
    + +

    Federation of independent scripts

    I know Wookey has been thinking of a loose federation of independent scripts working on the same data, but the more I look at troggle and the tasks it @@ -63,22 +72,38 @@ wallets.py does (originally by Martin Green) is in troggle already - but better. [There is a many:many relationship between svx files and wallet directories in reality, not 1:1]

    +

    troggle now

    Troggle is very nearly fully working (not with as many functions as -originally envisaged admittedly) but very nearly. There are several -import/parsers which are aborting without producing error messages, so most -of the survey blocks don't get loaded where they actually get displayed, and -the surveyscan images only appear as filename strings which are not checked -for referential integrity, so we are missing a consistency check there, and -the QM data display needs writing; but other than that it's in pretty good +originally envisaged admittedly) but very nearly. +The QM data display needs writing; but other than that it's in pretty good shape. [Ah, yes, we should really add "drawings" as a core concept as well as "surveyscans". That will be a bit of work.]

    +

    Need for separate data-import checking scripts

    The one thing external scripts would be really useful for is syntax checking and reference checking prior to import. I have found some weird and wonderful filename paths inside the tunnel and therion drawings, and in survex *ref paths. -

    -

    Addendum

    + +

    Non-django troggle

    +

    Another possibility is ripping django out of troggle and leaving bare python +plus a SQL database. This means that programmers would need to understand more SQL +but would not need to understand "django". Arguably this +would be a net gain. +

    Writing our own multi-user code would not be sensible, hence the database. +But we could move to a read-only system where the only writing happens on data-import. +Then we could use python 'pickle()' or 'json()' read-only data structures, but we +would need to create all our own indexing and cross-referencing code. +

    There would be more lower-level code, but the +different segments of the system could be in caving-sensible modules not +django-meaningful modules. And we would not have all the extra +language-like constructs that django introduces e.g. X.objects.set_all(), which +modern editors complain about because it is a django idiom and +not a function within the python codebase. + +(We could retain an HTML templating engine though.) + +

    Addendum

    There is a templating engine Nunjucks which is a port to JavaScript of the Django templating system we use (via Jinja - these are the same people who do Flask). This would be an obvious thing to use if we needed to go in that direction.