mirror of
https://expo.survex.com/repositories/expoweb/.git/
synced 2024-11-22 07:11:55 +00:00
Docs on QM code status, troggle redesign
parent
2cb287ca81
commit
9d75a09cf5
@ -5,7 +5,7 @@
|
||||
<title>CUCC Expedition Handbook: People Update</title>
|
||||
<link rel="stylesheet" type="text/css" href="../../css/main2.css" />
|
||||
</head>
|
||||
<body>
|
||||
<body><style>body { background: #fff url(/images/style/bg-system.png) repeat-x 0 0 }</style>
|
||||
<h2 id="tophead">CUCC Expedition Handbook</h2>
|
||||
<h1>The list of people on expo</h1>
|
||||
|
||||
|
@ -5,7 +5,7 @@
|
||||
<title>CUCC Expedition Handbook: Logbook import</title>
|
||||
<link rel="stylesheet" type="text/css" href="../../css/main2.css" />
|
||||
</head>
|
||||
<body>
|
||||
<body><style>body { background: #fff url(/images/style/bg-system.png) repeat-x 0 0 }</style>
|
||||
<h2 id="tophead">CUCC Expedition Handbook</h2>
|
||||
<h1>Logbooks Import</h1>
|
||||
|
||||
@ -59,15 +59,7 @@ Calculating GetPersonExpeditionNameLookup for 2017
|
||||
<p>Errors are usually misplaced or duplicated <hr /> tags, names which are not specific enough to be recognised by the parser (though it tries hard) such as "everyone" or "et al." or are simply missing, or a bit of description which has been put into the names section such as "Goulash Regurgitation".
|
||||
|
||||
<h3 id="history">The logbooks format</h3>
|
||||
<p>This is documented on the <a href="..logbooks.html#format">logbook user-documentation page</a> as even expoers who can do nothing else technical can at least write up their logbook entries.
|
||||
|
||||
<p>[ Yes this format needs to be re-done using a proper structure:<br />
|
||||
<code><pre>
|
||||
<div class="logentry"><br />
|
||||
<span style="text-decoration: line-through wavy red;"> </span>
|
||||
</div"></pre></code>
|
||||
it's on the to-do list...]
|
||||
|
||||
<p>This is documented on the <a href="../logbooks.html#format">logbook user-documentation page</a> as even expoers who can do nothing else technical can at least write up their logbook entries.
|
||||
|
||||
<h3 id="history">Historical logbooks format</h3>
|
||||
<p>Older logbooks (prior to 2007) were stored as logbook.txt with just a bit of consistent markup to allow troggle parsing.</p>
|
||||
|
@ -144,10 +144,18 @@ idea to type up <i>just your trip(s)</i> in a separate file, e.g. "logbook-mynew
|
||||
<div class="timeug">T/U 10 mins</div></pre></code>
|
||||
<p>Note: the IDs must be unique, so they are generated from 't' plus the trip date plus a, b, c etc.
|
||||
when there is more than one trip on a day.</p>
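<p>A minimal sketch of that ID scheme, assuming the date is written in ISO form (YYYY-MM-DD) and that every trip gets a letter suffix, including the first one on a day. The function name <var>make_trip_ids</var> is hypothetical and is not part of troggle.

```python
from collections import defaultdict
from datetime import date
import string


def make_trip_ids(trip_dates):
    """Return one unique id per trip, in the order the trips are given.

    Assumption: ids look like 't2019-07-05a'; the real logbooks may differ.
    """
    trips_seen = defaultdict(int)  # how many trips we have met on each date
    ids = []
    for trip_date in trip_dates:
        # Sketch only: this raises IndexError after 26 trips on one day.
        suffix = string.ascii_lowercase[trips_seen[trip_date]]
        trips_seen[trip_date] += 1
        ids.append(f"t{trip_date.isoformat()}{suffix}")
    return ids
```

<p>Two trips on the same day get the suffixes a and b, so their IDs never collide.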
|
||||
<p>Note: T/U stands for "Time Underground" in hours (6 minutes would be "0.1 hours").
|
||||
<p>Note: the <hr /> is significant and used in parsing, it is not just prettiness.
|
||||
<p>Note: <var><span style="color:red">T/U</span></var> stands for "Time Underground" in hours (6 minutes would be "0.1 hours").
|
||||
<p>Note: the <var><span style="color:red"><hr /></span></var> is significant and used in parsing, it is not just prettiness.
|
||||
|
||||
<p>Note this special format <var>"<span style="color:red">Top Camp - </span>"</var> in the triptitle line:
|
||||
<code><pre><div class="triptitle"><span style="color:red">Top Camp - </span>Setting up 76 bivi</div></pre></code>
|
||||
It denotes the <var>cave</var> or <var>area</var> the trip or activity happened in. It is a word or two separated from the rest of the triptitle with "<var> - </var>" (space-dash-space). Usual values
|
||||
for this are "Plateau", "Base camp", "264", "Balkon", "Tunnocks", "Travel" etc.
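<p>The space-dash-space rule can be sketched as below. This is an illustration only: <var>split_triptitle</var> is a hypothetical name, and the real troggle parser may handle edge cases differently.

```python
def split_triptitle(triptitle):
    """Split 'Top Camp - Setting up 76 bivi' into (place, title).

    Only the first ' - ' (space-dash-space) is treated as the separator,
    so a dash inside a word, or a later ' - ' in the title, is left alone.
    Returns (None, title) when no separator is present.
    """
    place, separator, title = triptitle.partition(" - ")
    if not separator:
        return None, triptitle.strip()
    return place.strip(), title.strip()
```

<p>For the example above this gives the place "Top Camp" and the title "Setting up 76 bivi".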
|
||||
|
||||
<p>Note this special format <var>"<span style="color:red"><u>Jenny Black</u></span>"</var> in the trippeople line:
|
||||
<code><pre><div class="trippeople"><span style="color:red"><u>Jenny Black</u></span>, Olly Betts</div>
|
||||
</pre></code>
|
||||
It is necessary that one (and only one) of the people on the trip is set in <span style="color:red"><u></u></span> underline format. This is interpreted to mean that this is the author of the logbook entry. If there is no author set, then this is an error and the entry is ignored.
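<p>A hedged sketch of the author rule, working on the text inside the trippeople div. The names <var>parse_trippeople</var>, <var>UNDERLINED</var> and <var>TAGS</var> are hypothetical. This sketch raises an error when there is not exactly one underlined name, whereas the real import ignores the entry.

```python
import re

# Matches the contents of each <u>...</u> pair.
UNDERLINED = re.compile(r"<u>(.*?)</u>", re.IGNORECASE | re.DOTALL)
# Matches any remaining HTML tag so it can be stripped from the names.
TAGS = re.compile(r"<[^>]+>")


def parse_trippeople(inner_html):
    """Return (author, people) from the contents of a trippeople div."""
    authors = [name.strip() for name in UNDERLINED.findall(inner_html)]
    if len(authors) != 1:
        raise ValueError(f"expected exactly one underlined author, found {len(authors)}")
    plain_text = TAGS.sub("", inner_html)
    people = [name.strip() for name in plain_text.split(",") if name.strip()]
    return authors[0], people
```

<p>The author also appears in the list of people, since they were on the trip too.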
|
||||
<hr />
|
||||
|
||||
|
||||
|
@ -12,7 +12,7 @@
|
||||
<h2>QM data and cave descriptions</h2>
|
||||
|
||||
<p>
|
||||
This document describes how to include Qustion Marks (QMs) and cave descriptions in .svx files.
|
||||
This document describes how to include Question Marks (QMs) and cave descriptions in .svx files.
|
||||
|
||||
<p>There
|
||||
are dedicated fields in the template.svx file for this purpose, but there has been laxness recently on filling them in.
|
||||
@ -68,6 +68,17 @@ Here is an example from the last bit of bipedalpassage.svx in 264. Note that eac
|
||||
;QM6 C bipedalpassage.31 - Very good location where main phreatic passages and enlarges - but far side of chamber choked. One part of choke was not accessed as needs 2m climb up to poke nose in it. A good free climber could do this or needs one bolt to be sure no way on. Very strong draft in choke! Interesting southerly trend at margin of known system
|
||||
</code></pre>
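<p>As an illustration of how regular these lines are, here is a minimal sketch that splits one ;QM comment line into its parts. The layout is read off the example above, not taken from svx2qm.py or troggle, and the names <var>QM_LINE</var> and <var>parse_qm_comment</var> are hypothetical. The fourth token ("-" in the example) is kept as-is because its meaning is not stated here.

```python
import re

# ;QM<number> <grade> <location> <fourth token> <description...>
QM_LINE = re.compile(r"^\s*;\s*QM(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.+)$")


def parse_qm_comment(line):
    """Return a dict for a ';QM...' comment line, or None if it is not one."""
    match = QM_LINE.match(line)
    if match is None:
        return None
    number, grade, location, fourth, description = match.groups()
    return {
        "number": int(number),
        "grade": grade,
        "location": location,      # e.g. bipedalpassage.31 in the example
        "fourth_field": fourth,    # "-" in the example; meaning not assumed
        "description": description.strip(),
    }
```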
|
||||
|
||||
<p>
|
||||
The format for question mark lists is <br>
|
||||
<ul>
|
||||
<li>QM identifier, <li><a href="../../qm.html">Quality Grade</a>, <li>Area indicator, <li>description of QM.
|
||||
</ul>
|
||||
<p>The QM numbers themselves are in the format <br>
|
||||
<ul>
|
||||
<li><a href="../../qm.html">Discoverer identifier</a>, <li>Year of discovery, <li>Cave identifier, <li>serial number.
|
||||
</ul>
|
||||
<p>This format is <a href="../../qm.html">documented in the QM conventions</a> page.
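<p>A rough sketch of taking a QM number apart, assuming it is written like "C1999-204-09" (an identifier of that shape appears in the 204 QM list). The exact punctuation is an assumption and the QM conventions page is authoritative. <var>parse_qm_number</var> is a hypothetical helper.

```python
import re

# Assumed layout: <discoverer letters><4-digit year>-<cave>-<serial>
QM_NUMBER = re.compile(r"^([A-Za-z]+)(\d{4})-([^-]+)-(\d+)$")


def parse_qm_number(text):
    """Return (discoverer, year, cave, serial), or None if the text does not fit."""
    match = QM_NUMBER.match(text.strip())
    if match is None:
        return None
    discoverer, year, cave, serial = match.groups()
    return discoverer, int(year), cave, int(serial)
```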
|
||||
|
||||
<p>
|
||||
The example below demonstrates correct and effective use of the QM list referring back to earlier elements in the svx file:
|
||||
<p>
|
||||
@ -80,6 +91,10 @@ is a very inefficient use of time.
|
||||
<p>Also if the person reading it hasn’t been to the bit of cave (which is, like, <em>the whole point</em>), then the data has a higher chance of being incorrect. It is not always easy to interpret Tunnel or Therion
|
||||
drawings correctly with this sort of thing.
|
||||
|
||||
<h3>Programming note</h3>
|
||||
<p>Better handling of historic QMs is a current, occasionally active, area of
|
||||
development in our online systems. The current status is <a href="../troggle/scriptsqms.html">documented here</a>.
|
||||
|
||||
<h3>Conclusion</h3>
|
||||
<p>Survey data recorded in .svx files is incomplete if there is no QM List data and
|
||||
cave description data!
|
||||
|
@ -34,7 +34,7 @@ tl;dr - use <em>svx2qm.py</em>. Look at the output at:<br>
|
||||
|
||||
<p>There are four ways we have used to manage QMs:
|
||||
<ol>
|
||||
<li><strong>Perl script</strong> - Historically QMs were not in the survex file but typed up in a separate list <var>qms.csv</var> for each cave system. A perl script turned that into an HTML file for the website.
|
||||
<li><strong>Perl script</strong> - Historically QMs were not in the survex file but typed up in a separate list <var>qms.csv</var> for each cave system. A perl script turned that into an HTML file for the website. But there appear to be 3 different formats for this.
|
||||
<li><strong>Perl + troggle</strong> - One of troggle's input parsers "QM parser" is specifically designed to import the three HTML files produced from <var>qms.csv</var> but doesn't do anything with that data (yet).
|
||||
<li><strong>Python script</strong> - Phil Withnall's 2019 script <em>svx2qm.py</em> scans all the QMs in a single survex file. See below for how to run it on all survex files.
|
||||
<li><strong>New troggle</strong> - Sam's recent addition to troggle's "survex parser" makes it recognise and store QMs when it parses the survex files.
|
||||
@ -52,6 +52,24 @@ tl;dr - use <em>svx2qm.py</em>. Look at the output at:<br>
|
||||
<a href="../../1623/204/qm.html">/1623/204/qm.html</a><br />
|
||||
<p>Note that the <var>qms.csv</var> file used as input by this script is an <em>entirely different format and table structure</em> from the <var>qms.csv</var> file produced by <a href="#svx2qm">svx2qm.py</a>.
|
||||
|
||||
<p>And in fact the formats of these 3 qm.csv files are <em>not the same</em> (these are the
"older or artisanal QM formats" referred to by Phil Withnall at the bottom of this page):
|
||||
|
||||
Fields in 204/qm.csv are:
|
||||
<code><pre><span style="font-size:small">Number, grade, area, description, page reference, nearest station, completion description, Comment
|
||||
e.g.
|
||||
C1999-204-09 C Wolp Hole in floor through dangerous boulders veined.10 Filled with rocks
|
||||
</span></pre></code>
|
||||
Fields in 258/qm.csv are:
|
||||
<code><pre><span style="font-size:small">Cave, year, number, Grade, nearest station, description, completion description, found by, completed by
|
||||
e.g.
|
||||
258 2006 27 C 258.gknodel.4 Small passage to E in Germkn”del Sandeep Mavadia and Dave Loeffler
|
||||
</span></pre></code>
|
||||
Fields in 264/qm.csv are:
|
||||
<code><pre><span style="font-size:small">Year, number, Grade, Survey folder ref#, Surveyname, Nearest Station number, Area of the cave, Description, Y if marked on drawn-up survey,
|
||||
2014 7 C 2014#11 roomwithaview 4 Room With a View Room With a View: "Probably chokes" opposite stations 4 and 5 ALREADY EXPLORED PROBABLY
|
||||
</span></pre></code>
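<p>One way to cope with three layouts is to keep a column list per cave and read every file through it. The sketch below assumes comma-separated files with no header row, which is not confirmed here. The column names are simply the field lists above turned into identifiers, and <var>QM_CSV_FIELDS</var> and <var>read_qm_csv</var> are hypothetical names, not troggle code.

```python
import csv
import io

# Column order per cave, copied from the field lists above.
QM_CSV_FIELDS = {
    "204": ["number", "grade", "area", "description", "page_reference",
            "nearest_station", "completion_description", "comment"],
    "258": ["cave", "year", "number", "grade", "nearest_station",
            "description", "completion_description", "found_by", "completed_by"],
    "264": ["year", "number", "grade", "survey_folder_ref", "surveyname",
            "nearest_station_number", "area", "description", "marked_on_survey"],
}


def read_qm_csv(cave, text):
    """Return a list of dicts, one per non-empty row, keyed by that cave's columns."""
    names = QM_CSV_FIELDS[cave]
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        # zip() stops at the shorter sequence, so extra trailing cells are dropped.
        rows.append(dict(zip(names, row)))
    return rows
```

<p>With the rows in dict form, code further down the line can ask for "grade" or "description" without caring which cave's file the row came from.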
|
||||
|
||||
<p>There are also three versions of the QM list for cave 161 (Kaninchenhohle) apparently produced by this method but hand-edited:<br />
|
||||
<a href="../../1623/161/qmaven.html">/1623/161/qmaven.html</a> 1996 version<br />
|
||||
<a href="../../1623/161/qmtodo.html">/1623/161/qmtodo.html</a> 1998 version<br />
|
||||
@ -60,6 +78,25 @@ tl;dr - use <em>svx2qm.py</em>. Look at the output at:<br>
|
||||
<p>In the /1623/204/ folder there is a script <em>qmreader.pl</em> which apparently does the inverse of
|
||||
<em>tablize-qms.pl</em>: it transforms a QMs' HTML file into a CSV file.
|
||||
|
||||
<p>As Wookey says (Slack, 7 Jan. 2020):
|
||||
"I'm not quite sure what the best format is. Some combination of the
|
||||
258 and 264 formats might be best. Including the cave number seems
|
||||
pointless. Including 'conclusion' info seems like a good idea. I'm not
|
||||
sure there what the benefit of separating the 'surveyname' and
|
||||
'nearest station' fields is. Having an 'area of cave' field is somewhat useful
|
||||
for grouping, even though it is sort-of repeating the 'survey-station' info.
|
||||
|
||||
If I was making a QM list I'd enter these fields:
|
||||
year, number, Grade, nearest station, folder reference, description, found by, completed (Year), completion description/cave description link, completed by
|
||||
|
||||
with these details:
|
||||
<ul>
|
||||
<li>number is just the serial number, not the whole year-serial-grade
|
||||
<li>'nearest station' does not include the cave number
|
||||
<li>completed is blank (for not completed) or a year for when it was done
|
||||
<li>completeion description should be a link to the relevant bit of cave description, but if that doesn't exist
|
||||
</ul> then a short description here is OK."
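<p>Wookey's suggested field list could be written down as a record type. This is only a sketch of the proposal, not an agreed format: the class name <var>ProposedQM</var>, the Python types and the defaults are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProposedQM:
    year: int                        # year of discovery
    number: int                      # serial number only, not year-serial-grade
    grade: str
    nearest_station: str             # without the cave number
    folder_reference: str
    description: str
    found_by: str = ""
    completed: Optional[int] = None  # None while open, else the year it was done
    completion_description: str = "" # ideally a link into the cave description
    completed_by: str = ""

    @property
    def is_open(self):
        """True while the QM has not been completed."""
        return self.completed is None
```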
|
||||
|
||||
|
||||
<h4 id="qms.py">troggle/parsers/qms.py</h4>
|
||||
<p>The parser <em>troggle/parsers/qms.py</em> currently imports those same <var>qm.csv</var> files from the perl script into troggle using a mixture of csv and html parsers:
|
||||
@ -88,6 +125,11 @@ The 2019 copies are online in /expofiles/:
|
||||
This will work on all survex *.svx files even those which have not yet been run through the troggle import process.
|
||||
<p>Phil says (13 April 2020): <em>"The generated files are not meant to be served by the webserver, it’s a tool for people to run locally. Someone could modify it to create HTML output (or post-process the CSV output to do the same), but that is work still to be done."</em>
|
||||
|
||||
<h4>troggle/parsers/survex.py</h4>
|
||||
<p>The QMs inside the survex files are parsed by troggle along with all the other information
|
||||
inside survex files and stored in the database. But the webpages which display this data are rudimentary, e.g. <a href="/getQMs/1623-204">/getQMs/1623-204</a> or <a href="/cave/qms/1623-204">/cave/qms/1623-204</a>.
|
||||
Looking through urls.py and core/view_caves.py we see a lot of code for providing new QM numbers, producing lists of QMs for a given cave and for downloading QM.csv files generated by the database. But none of it appears to be working today (14 May 2020), see below.
|
||||
|
||||
<h4 id="samqms">Sam's parser additions</h4>
|
||||
<p>Troggle <em>troggle/parsers/survex.py</em> currently parses and stores all the QMs it finds in survex files. The tables where the data is put are listed in <a href="datamodel.html">the current data model</a> including structure for ticking them off.
|
||||
|
||||
@ -108,7 +150,7 @@ So someone was busy at one time.
|
||||
<h2>QMs - monitoring progress</h2>
|
||||
|
||||
<h4 id="find-dead-qms">find-dead-qms.py</h4>
|
||||
<p>This finds references to <em>completed</em> qms in the qm.csv files in the cave folders (/1623/ etc.) in the :expoweb: <a href="../computing/repos.html">repository</a>. It looks to see which QMs have been completed but where there is not yet a matching text in the cave description.
|
||||
<p>This stand-alone script finds references to <em>completed</em> qms in the qm.csv files in the cave folders (/1623/ etc.) in the :expoweb: <a href="../computing/repos.html">repository</a>. It looks to see which QMs have been completed but where there is not yet a matching text in the cave description.
|
||||
<blockquote><em>Quick and dirty Python script to find references to completed qms in the
|
||||
cave description pages. Run this to find which bits of description
|
||||
need updating.
|
||||
@ -153,7 +195,7 @@ I guess it all depends on what questions people are trying to answer using the Q
|
||||
as to how (and where) best to present it. I’m afraid I don’t have any suggestions there.
|
||||
|
||||
:Rob Watson wrote some documentation about QMs
|
||||
:http://expo.survex.com/handbook/survey/qmentry.html
|
||||
:<a href="../survey/qmentry.html">http://expo.survex.com/handbook/survey/qmentry.html</a>
|
||||
:is there anything subtle missing as to how they are used ?
|
||||
|
||||
Nope, I think Rob’s page covers it all. That page also documents the correct QM format
|
||||
|
@ -50,6 +50,15 @@ Which is fun, but not useful. And not just because it is immature. None of
|
||||
this addresses <strong>our biggest problem: devising something that can be
|
||||
maintained by fewer, less-expert people who can only devote short snippets
|
||||
of time and not long-duration immersion</strong>.
|
||||
<h3>Our biggest problem</h3>
|
||||
We need:
|
||||
<ul>
|
||||
<li>something that can be maintained by fewer, less-expert people
|
||||
<li>who can only devote short snippets of time
|
||||
<li>without requiring weeks of long-duration deep immersion
|
||||
</ul>
|
||||
|
||||
<h3>Federation of independent scripts</h3>
|
||||
<p>
|
||||
I know Wookey has been thinking of a loose federation of independent scripts
|
||||
working on the same data, but the more I look at troggle and the tasks it
|
||||
@ -63,22 +72,38 @@ wallets.py does (originally by Martin Green) is in troggle already - but
|
||||
better. [There is a many:many relationship between svx files and wallet
|
||||
directories in reality, not 1:1]
|
||||
<p>
|
||||
<h3>troggle now</h3>
|
||||
Troggle is very nearly fully working (not with as many functions as
|
||||
originally envisaged admittedly) but very nearly. There are several
|
||||
import/parsers which are aborting without producing error messages, so most
|
||||
of the survey blocks don't get loaded where they actually get displayed, and
|
||||
the surveyscan images only appear as filename strings which are not checked
|
||||
for referential integrity, so we are missing a consistency check there, and
|
||||
the QM data display needs writing; but other than that it's in pretty good
|
||||
originally envisaged admittedly) but very nearly.
|
||||
The QM data display needs writing; but other than that it's in pretty good
|
||||
shape. [Ah, yes, we should really add "drawings" as a core concept as well
|
||||
as "surveyscans". That will be a bit of work.]
|
||||
<p>
|
||||
<h3>Need for separate data-import checking scripts</h3>
|
||||
The one thing external scripts would be really useful for is syntax checking
|
||||
and reference checking prior to import. I have found some weird and
|
||||
wonderful filename paths inside the tunnel and therion drawings, and in
|
||||
survex *ref paths.
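<p>A sketch of what such a pre-import check might look like for survex *ref lines. It assumes each *ref value is a single token naming a wallet (something like "2014#11") and that the list of known wallets is supplied by the caller. Real files contain stranger values, which is the point of checking. <var>find_refs</var> and <var>missing_refs</var> are hypothetical names.

```python
import re

# A *ref command at the start of a line; lines commented out with ';' do not match.
REF_LINE = re.compile(r"^\s*\*ref\s+(\S+)", re.IGNORECASE | re.MULTILINE)


def find_refs(svx_text):
    """Return every *ref value in the text of one .svx file, in order."""
    return REF_LINE.findall(svx_text)


def missing_refs(svx_text, known_wallets):
    """Return the *ref values which do not name a known wallet."""
    return [ref for ref in find_refs(svx_text) if ref not in known_wallets]
```

<p>Run over every .svx file before an import, this would list the dangling references without touching the database.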
|
||||
<p>
|
||||
<h3>Addendum</h3>
|
||||
|
||||
<h3>Non-django troggle</h3>
|
||||
<p>Another possibility is ripping django out of troggle and leaving bare python
|
||||
plus a SQL database. This means that programmers would need to understand more SQL
|
||||
but would not need to understand "django". Arguably this
|
||||
could mean that we could gain.
|
||||
<p>Writing our own multi-user code would not be sensible, hence the database.
|
||||
But we could move to a read-only system where the only writing happens on data-import.
|
||||
Then we could use Python <var>pickle</var> or <var>json</var> read-only data structures, but we
|
||||
would need to create all our own indexing and cross-referencing code.
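<p>To give a feel for the "own indexing" work this implies, here is a minimal sketch using the standard <var>json</var> module. The record layout (a list of dicts with a "cave" key) and the function names are made up for illustration.

```python
import json
from collections import defaultdict


def load_records(json_text):
    """Parse the read-only data written at import time (assumed: a JSON list of dicts)."""
    return json.loads(json_text)


def build_index(records, key):
    """Group records by one field, replacing what a database index would give us."""
    index = defaultdict(list)
    for record in records:
        index[record[key]].append(record)
    return dict(index)
```

<p>Every cross-reference we currently get from the database (QMs by cave, trips by person and so on) would need an index like this, rebuilt on each data import.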
|
||||
<p>There would be more lower-level code, but the
|
||||
different segments of the system could be in caving-sensible modules not
|
||||
django-meaningful modules. And we would not have all the extra
|
||||
language-like constructs that django introduces e.g. <var>X.objects.set_all()</var>, which
|
||||
modern editors complain about because it is a django idiom and
|
||||
not a function within the python codebase.
|
||||
|
||||
(We could retain an HTML templating engine though.)
|
||||
|
||||
<h3><em>Addendum</em></h3>
|
||||
<p>There is a templating engine <a href="https://mozilla.github.io/nunjucks/">Nunjucks</a>
|
||||
which is a port to JavaScript of the Django templating system we use
|
||||
(via <a href="https://palletsprojects.com/p/jinja/">Jinja</a> - these are the same people who do Flask). This would be an obvious thing to use if we needed to go in that direction.
|
||||
|