<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Expo documentation - QMs scripts</title>
<link rel="stylesheet" type="text/css" href="../../css/main2.css" />
</head>
<body><style>body { background: #fff url(/images/style/bg-system.png) repeat-x 0 0 }</style>
<style>
h4 {
    margin-top: 1.5em;
    margin-bottom: 0;
}
p {
    margin-top: 0;
    margin-bottom: 0.5em;
}
var { /* to match <code> but inline */
    font-family: monospace;
    font-size: 0.9em;
    /* font-style: normal; */
    background-color: #eee;
}
</style>
<h2 id="tophead">CUCC Expedition Handbook</h2>
<h1>QMs and leads</h1>
tl;dr - use <em>svx2qm.py</em>. Look at the output at:<br>
<a href="/expofiles/writeups/2019/qms2019.txt">qms2019.txt</a><br>
<a href="/expofiles/writeups/2019/qms2019.csv">qms2019.csv</a><br>

<h2>QMs - the fourfold path</h2>
<img class="onright" src ="../i/qm-image.jpg" />
<p>You will be familiar with <a href="../survey/qmentry.html">documenting newly found QMs</a> in the survex file when you type it in. But QMs are only useful if they can be easily scanned by people planning the next pushing trip. That's what we are discussing here.

<p>There are four (and a half) ways we have used to manage QMs:
<ol>
<li><strong>Perl script</strong> - Historically QMs were not in the survex file but typed up in a separate list <var>qms.csv</var> for each cave system. A perl script turned that into an HTML file for the website. But there appear to be 3 different formats for this.
<li><strong>Perl + troggle</strong> - One of troggle's input parsers "QM parser" is specifically designed to import the three HTML files produced from <var>qms.csv</var> but doesn't do anything with that data (yet).
<li><strong>Python script</strong> - Phil Withnall's 2019 script <em>svx2qm.py</em> scans all the QMs in a single survex file. See below for how to run it on all survex files.
<li><strong>New troggle</strong> - Sam's recent addition to troggle's "survex parser" makes it recognise and store QMs when it parses the survex files.
<li><strong>The elderly Prospecting Guide</strong> - covers some of the same sorts of information needed by someone wanting to
chase QMs. It is a troggle-generated document at <a href="/prospecting_guide/">expo.survex.com/prospecting_guide/</a>. It is so old that "top camp" in the guide refers to the col camp and not the Stonebridge bivvy. Some updates were done in 2007.
</ol>

<p>QMs all use <a href="../survey/qm.html">the same QM description conventions</a>.

<h4 id="QM_helper">js/QM_helper.js</h4>
<p>A relic.
<p>This is referred to in core/admin.py and appears to help with the user interface within the
Django Admin control panel for manipulating QMs. It is not live, as media/js/ is not plumbed in.
(Live javascript lives in media/jslib/ which is routed to the URL /javascript/.)


<h4 id="tabqmsqms">tablize-qms.pl</h4>
<p>This is a perl script dating from November 2004.

<p>It takes a <em>hand-edited</em> CSV file name as the program's argument and generates an HTML page listing all the QMs.
<p><a href="../../1623/258/tablize-qms.pl" download>Variant copies of it</a> (they are all slightly different) live in the three cave file folders in <em>:expoweb:/1623/</em>, in <em>258/, 234/</em>, and <em>204/</em>. These generated HTML files are live pages in the cave descriptions: <br />
<a href="../../1623/258/qm.html">/1623/258/qm.html</a><br />
<a href="../../1623/234/qm.html">/1623/234/qm.html</a><br />
<a href="../../1623/204/qm.html">/1623/204/qm.html</a><br />
<p>Note that the <var>qms.csv</var> file used as input by this script has an <em>entirely different format and table structure</em> from the <var>qms.csv</var> file produced by <a href="#svx2qm">svx2qm.py</a>.

<p>And in fact the formats of these 3 qm.csv files are <em>not the same</em>. (These are the
"older or artisanal QM formats" referred to by Phil Withnall at the bottom of this page.)

Fields in 204/qm.csv are:
<code><pre><span style="font-size:small">Number, grade, area, description, page reference, nearest station, completion description, Comment
e.g.
C1999-204-09    C    Wolp    Hole in floor through dangerous boulders        veined.10    Filled with rocks
</span></pre></code>
Fields in 258/qm.csv are:
<code><pre><span style="font-size:small">Cave, year, number, Grade, nearest station, description, completion description, found by, completed by
e.g.
258  2006  27        C      258.gknodel.4    Small passage to E in Germknödel          Sandeep Mavadia and Dave Loeffler
</span></pre></code>
Fields in 264/qm.csv are:
<code><pre><span style="font-size:small">Year, number, Grade, Survey folder ref#, Surveyname, Nearest Station number, Area of the cave, Description, Y if marked on drawn-up survey,
e.g.
2014  7          C        2014#11      roomwithaview    4        Room With a View      Room With a View: "Probably chokes"  opposite stations 4 and 5      ALREADY EXPLORED PROBABLY
</span></pre></code>
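<p>As a minimal sketch of how one of these hand-edited files might be read in Python, the following assumes the 258 layout above is stored comma-separated with no header row. Both are assumptions, so check the real file first. The field names and the sample row are invented here for illustration and are not taken from troggle or the perl script.
<pre><code>import csv
import io

# Hypothetical field names, following the 258/qm.csv column order listed above.
FIELDS_258 = [
    "cave", "year", "number", "grade", "nearest_station",
    "description", "completion_description", "found_by", "completed_by",
]

def read_qms_258(stream):
    """Return one dict per QM row, assuming comma-separated text with no header."""
    reader = csv.DictReader(stream, fieldnames=FIELDS_258,
                            restkey="extra", restval="")
    # Rows too short to carry a QM number are treated as junk and skipped.
    return [row for row in reader if row["number"].strip()]

# Made-up sample row in the assumed layout, based on the example above.
sample = io.StringIO(
    "258,2006,27,C,258.gknodel.4,Small passage to E,,"
    "Sandeep Mavadia and Dave Loeffler,\n"
)
for qm in read_qms_258(sample):
    status = "done" if qm["completion_description"].strip() else "open"
    print(qm["year"], qm["number"], qm["grade"], status)
</code></pre>
<p>The 204 and 264 layouts would each need their own field list, since the column order differs between the three files.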

<p>There are also three versions of the QM list for cave 161 (Kaninchenhohle) apparently produced by this method but hand-edited:<br />
<a href="../../1623/161/qmaven.html">/1623/161/qmaven.html</a> 1996 version<br />
<a href="../../1623/161/qmtodo.html">/1623/161/qmtodo.html</a> 1998 version<br />
<a href="../../1623/161/qmdone.html">/1623/161/qmdone.html</a> 1999 (incomplete) version
</p>
<p>In the /1623/204/ folder there is a script <em>qmreader.pl</em> which apparently does the inverse of
<em>tablize-qms.pl</em>: it transforms a QM HTML file back into a CSV file.

<p>As Wookey says (Slack, 7 Jan. 2020):
"I'm not quite sure what the best format is. Some combination of the
258 and 264 formats might be best. Including the cave number seems
pointless. Including 'conclusion' info seems like a good idea. I'm not
sure there what the benefit of separating the 'surveyname' and
'nearest station' fields is. Having an 'area of cave' field is somewhat useful
for grouping, even though it is sort-of repeating the 'survey-station' info.

If I was making a QM list I'd enter these fields:
year, number, Grade, nearest station, folder reference, description, found by, completed (Year), completion description/cave description link, completed by

with these details:
<ul>
<li>number is just the serial number, not the whole year-serial-grade
<li>'nearest station' does not include the cave number
<li>completed is blank (for not completed) or a year for when it was done
<li>completion description should be a link to the relevant bit of cave description, but if that doesn't exist
then a short description here is OK."
</ul>
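<p>The field list proposed in the quote could be sketched as a Python record like this. It is only an illustration of the proposal, not an agreed format or anything that exists in troggle. The class name, attribute names and example values are made up here.
<pre><code>from dataclasses import dataclass
from typing import Optional

@dataclass
class ProposedQM:
    """One QM using the fields proposed above (all names are illustrative)."""
    year: int
    number: int                       # serial number only, not year-serial-grade
    grade: str
    nearest_station: str              # without the cave number
    folder_reference: str
    description: str
    found_by: str = ""
    completed: Optional[int] = None   # None while still open, else the year done
    completion_description: str = ""  # ideally a link into the cave description
    completed_by: str = ""

    def is_open(self):
        """A QM counts as open until a completion year has been recorded."""
        return self.completed is None

# Hypothetical example values, loosely based on the 264 row shown earlier.
qm = ProposedQM(2014, 7, "C", "roomwithaview.4", "2014#11", "Probably chokes")
print(qm.is_open())
</code></pre>
<p>Keeping "completed" as an optional year means a blank cell maps naturally to "not yet done", which is the convention the quote asks for.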


<h4 id="qms.py">troggle/parsers/qms.py</h4>
<p>The parser <em>troggle/parsers/qms.py</em> currently imports those same <var>qm.csv</var> files from the perl script into troggle using a mixture of csv and html parsers:
<code><pre>parseCaveQMs(cave='stein',inputFile=r"1623/204/qm.csv")
parseCaveQMs(cave='hauch',inputFile=r"1623/234/qm.csv")
parseCaveQMs(cave='kh', inputFile="1623/161/qmtodo.htm")
#parseCaveQMs(cave='balkonhoehle',inputFile=r"1623/264/qm.csv")</pre></code>
but there is apparently no webpage that displays them (yet).
</p>
<p>Note that the hand-edited <var>qm.csv</var> for Balkonhohle was apparently abandoned unfinished as we transitioned to putting the QMs in the survex files instead. It contains QMs from 2014 and 2016:<br />
<a href="../../1623/264/qm.csv" download>/1623/264/qm.csv</a> - unused <br/>

<h4 id="svx2qm">svx2qm.py</h4>
<p>Philip Withnall's 2019 QM extractor <em>svx2qm.py</em> (in :loser:/qms/) can be used to generate a list of all the QMs in all the svx files in either text or CSV format. When run together with <em>find</em> and <em>xargs</em> it will produce an output listing all the QMs:
<pre><code>cd loser
find . -name '*.svx' | xargs qms/svx2qm.py --format csv
</code></pre>
and <var>--format human</var> produces a simple text format.

<p>The 2019 copies are online in /expofiles/:
<a href="/expofiles/writeups/2019/qms2019.txt">qms2019.txt</a> and
<a href="/expofiles/writeups/2019/qms2019.csv">qms2019.csv</a>.

<p>
This will work on all survex *.svx files even those which have not yet been run through the troggle import process.
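<p>For anyone who would rather drive the script from Python than from <em>find</em> and <em>xargs</em>, a rough equivalent might look like the sketch below. It assumes that svx2qm.py runs under the same Python interpreter and lives at qms/svx2qm.py below the repository root. The helper names <var>find_svx_files</var> and <var>run_svx2qm</var> are invented here. Only the <var>--format</var> option shown above is used.
<pre><code>import subprocess
import sys
from pathlib import Path

def find_svx_files(root):
    """Return every *.svx file below root, sorted for a stable order."""
    return sorted(Path(root).rglob("*.svx"))

def run_svx2qm(root, script="qms/svx2qm.py", fmt="csv"):
    """Run svx2qm.py once over all survex files and return its standard output."""
    files = [str(path) for path in find_svx_files(root)]
    if not files:
        return ""  # nothing to scan, so do not start the script at all
    result = subprocess.run(
        [sys.executable, str(Path(root) / script), "--format", fmt, *files],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
</code></pre>
<p>Calling <var>run_svx2qm(".")</var> from the top of the :loser: repository should then return the same text as the shell pipeline. Unlike <em>xargs</em>, this passes every path in a single call, so on a very large tree it could hit the operating system's limit on command-line length.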
<p>Phil says (13 April 2020): <em>"The generated files are not meant to be served by the webserver, it’s a tool for people to run locally. Someone could modify it to create HTML output (or post-process the CSV output to do the same), but that is work still to be done."</em>

<h4>troggle/parsers/survex.py</h4>
<p>The QMs inside the survex files are parsed by troggle along with all the other information
inside survex files and stored in the database. But the webpages which display this data are very rudimentary and currently useless, e.g. <a href="/getQMs/1623-204">/getQMs/1623-204</a> or <a href="/cave/qms/1623-204">/cave/qms/1623-204</a>.
Looking through urls.py and core/view_caves.py we see a lot of code for providing new QM numbers, producing lists of QMs for a given cave and for downloading QM.csv files generated by the database. But none of it appears to be working today (14 May 2020), see below.

<h4 id="samqms">Sam's parser additions</h4>
<p>Troggle's parser <em>troggle/parsers/survex.py</em> currently parses and stores all the QMs it finds in survex files. The tables where the data is put are listed in <a href="datamodel.html">the current data model</a>, including the structure for ticking them off.

<p>There is a troggle template file :troggle:/templates/qm.html which is intended to become a useful report of outstanding QMs in future. Since it was last edited in 2009, though, this does not seem to be on anyone's urgent task list.
<p>Troggle has archaic URL recognisers in <var>:troggle:/urls.py</var> for:
<ul>
<li>/newqmnumber/ - crashes troggle
<li>/getQMs/&lt;caveslug&gt; - crashes troggle
<li>/cave/qms/ such that <a href="/cave/qms/1623-161/">/cave/qms/1623-161/</a> doesn't actually crash
<li>/cave/&lt;caveslug&gt;-&lt;year&gt;&lt;qm_id&gt; - crashes troggle
<li>/cave/&lt;cave-id&gt;/qm.csv - to download a <var>qm.csv</var> file (NB not qms.csv) - crashes troggle
<li>/downloadqms - crashes troggle
</ul>
So someone was busy at one time.

<p>There is not yet a troggle report listing the QMs which works.

<h2>QMs - monitoring progress</h2>

<h4 id="find-dead-qms">find-dead-qms.py</h4>
<p>This stand-alone script finds references to <em>completed</em> QMs in the qm.csv files in the cave folders (/1623/ etc.) in the :expoweb: <a href="../computing/repos.html">repository</a>. It looks to see which QMs have been completed but where there is not yet matching text in the cave description.
<blockquote><em>Quick and dirty Python script to find references to completed qms in the
cave description pages. Run this to find which bits of description
need updating.
<br>
The list of qms is read from the qm.csv file and any with an entry in the
"Completion description" column (column 7) are searched for in all the html
files.
<br>
The script prints a list of the completed qms that it found references to
and in which file.
<br>
Nial Peters - 2011
</em></blockquote>
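<p>The approach described in that docstring can be sketched roughly as follows. This is not Nial's script, just an illustration of the idea. It assumes a comma-separated qm.csv with no header row, and that the identifier in the first column (as in the 204 format) is the text to look for in the HTML pages. The function names are invented here.
<pre><code>import csv
from pathlib import Path

COMPLETION_COLUMN = 6  # column 7 of qm.csv, counting from zero

def completed_qm_ids(qm_csv_path):
    """Return first-column QM ids of rows that have a completion description."""
    ids = []
    with open(qm_csv_path, newline="", encoding="utf-8", errors="replace") as handle:
        for row in csv.reader(handle):
            padded = row + [""] * (COMPLETION_COLUMN + 1)  # tolerate short rows
            if padded[0].strip() and padded[COMPLETION_COLUMN].strip():
                ids.append(padded[0].strip())
    return ids

def find_references(qm_ids, html_root):
    """Map each QM id to the list of HTML files whose text mentions it."""
    found = {}
    for page in sorted(Path(html_root).rglob("*.html")):
        text = page.read_text(encoding="utf-8", errors="replace")
        for qm_id in qm_ids:
            if qm_id in text:
                found.setdefault(qm_id, []).append(str(page))
    return found
</code></pre>
<p>Any page returned by <var>find_references</var> still mentions a QM that has been finished, so it is a candidate for a description update. If the real file has a header row, that row would need skipping first.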

<hr>
<pre>
From: Philip Withnall [tecnocode]
Sent: 13 April 2020 23:41
To: Philip Sargent (Gmail)
Subject: Re: svx2qm

Hi Philip,

Hope you’re well, thanks for getting in touch about this.

The generated files are not meant to be served by the webserver, it’s a tool for people to run locally.
Someone could modify it to create HTML output (or post-process the CSV output to do the same),
but that is work still to be done.

I can't see any problem with moving it all to expoweb/scripts/ - so long as it is
run with the loser top level directory specified - but I might be mistaken:
find  /home/expo/loser -name '*.svx' | xargs ./svx2qm.py --format human
and it should go into the Makefile too at some point.

Feel free to move it wherever; I am not planning on doing any further work on it.
The script itself just expects to be passed some (relative or absolute) paths to SVX files,
so can be placed wherever, as long as it’s passed appropriate relative paths.

I haven’t written any other scripts which post-process the data or otherwise format it.

I guess it all depends on what questions people are trying to answer using the QM data,
as to how (and where) best to present it. I’m afraid I don’t have any suggestions there.

:Rob Watson wrote some documentation about QMs
:<a href="../survey/qmentry.html">http://expo.survex.com/handbook/survey/qmentry.html</a>
:is there anything subtle missing  as to how they are used ?

Nope, I think Rob’s page covers it all. That page also documents the correct QM format
which is what svx2qm.py understands. (There were some older or artisanal QM formats
floating around at one point, although I think I reformatted them all so the tool
would understand them, and so people would hopefully standardise on what Rob’s
documented from then on.)

Philip</pre>

<hr>
Return to: <a href="scriptsother.html">Other scripts</a><br />
Return to: <a href="trogintro.html">Troggle intro</a><br />
Troggle index:
<a href="trogindex.html">Index of all troggle documents</a><br /><hr />
</body>
</html>