
<html>
<head>
<title>Survey Handbook - online wallet</title>
<link rel="stylesheet" type="text/css" href="../../css/main2.css" />
</head>
<body>
<h2 id="tophead">CUCC Expedition Handbook</h2>
<h1>Maintaining the online wallets</h1>
<p>If you are a newcomer to the system, read the <a href="newcave.html#onlinew">beginner's introduction to online wallets</a> first.

<h3>Why we have online wallets</h3>
<p>There are three quite different reasons:
<ol>
<li>The scans of the survey notebook pages are the ultimate original raw survey data and completely irreplaceable.
<li>The other files in the wallet are part of the process of producing a survey of the cave as a whole.
<li>Individual to-do lists are produced automatically for each caver listing what survey processing tasks they haven't finished yet.
</ol>

<h3>The scanned pages</h3>
<p>These are simply the scanned images (or digital photographs) of each page of the original survey notes. 
They should be named <em><span style="font-family:monospace">notesXXX.jpg</span></em> where "XXX" can be 
anything you like. Typically we have the scanned pages called notes1.jpg, notes2.jpg, notes3.jpg.
<p>It is important that you use the .jpg (JPEG) file format, and definitely not PNG (very voluminous) 
or PDF (very hard to re-use elsewhere). Set the scanner at 300 dpi and adjust the contrast of the image after scanning
by using photo-editing software to enhance the writing. Also please crop each image to just the area containing 
the survey data.
<p>As soon as the notes have been scanned you should (a) copy them to a USB stick or email them to someone, and (b) upload the entire online wallet to the expo server in Cambridge 
<span style="font-family:monospace">expo.survex.com</span>. This is so that these precious files are backed up as soon as possible.

<h3>The other files and online index <span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span></h3>
<p>All the other files are part of the multi-step process of producing the cave survey - see <a href="newcave.html">
Creating a new cave...</a> for the full list of steps. The <em><span style="font-family:monospace">notesXXX.jpg</span></em> files need to be at moderately high resolution but the plan and elevation files are usually fine at 200 dpi. So if the caver has scanned these at high resolution you can reduce the size of these files without damage.

<p>We keep an index of how many of those steps have been completed in two places:
<ul>
<li>On paper, in the tick boxes in the contents list of this year's survey trips lever-arch file in the potato hut.
<li>Online in the <em><span style="font-family:monospace">contents.json</span></em> file which exists in each trip online wallet.
</ul>

<p>But the <em><span style="font-family:monospace">contents.json</span></em> file has another, completely different function:
it may be the <b>only online record</b> that connects the wallet number to the cave identifier. So if a future cave surveyor desperately needs
to consult the original cave survey, the right wallets can still be found. For example, <br>
<span style="font-family:monospace">
grep -rl "2018-dm-07" expofiles/surveyscans
</span><br>
will find and list all the wallets which contain survey data for cave 2018-dm-07 (which is also known as "Homecoming Cave" and which will
have a different Austrian Kataster number issued for it in due course).
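<p>The same lookup can be sketched in Python for machines without grep. This is only an illustration and is not part of <span style="font-family:monospace">wallets.py</span>: the function name <span style="font-family:monospace">wallets_for_cave</span> is made up here, and it assumes the layout surveyscans/year/wallet/contents.json with the cave identifier held in the "cave" field. Unlike grep, it matches only the "cave" field, so a wallet that mentions the cave only in its "survex file" path would not be listed.

```python
import json
from pathlib import Path


def wallets_for_cave(scans_root, cave_id):
    """Return the wallet folders whose contents.json has "cave" equal to cave_id.

    Assumes the layout scans_root/<year>/<wallet>/contents.json.
    """
    matches = []
    for contents in sorted(Path(scans_root).glob("*/*/contents.json")):
        try:
            data = json.loads(contents.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Unreadable or malformed file: skip it rather than stop the search.
            continue
        if isinstance(data, dict) and data.get("cave") == cave_id:
            matches.append(contents.parent)
    return matches
```

<p>For example, <span style="font-family:monospace">wallets_for_cave("expofiles/surveyscans", "2018-dm-07")</span> would return the matching wallet folders as a list of paths.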

<p>The link between a .svx file and the wallet should also be recorded in the .svx file itself using the "*ref:" field, e.g. 
<pre>
*ref 2018#06
; the #number is on the clear pocket containing the original notes
</pre>
But sometime in mid-Expo 2015 everyone stopped using the survex template file, so this information has not been recorded since then. This will be fixed by hand-editing in due course.
(Note that many old .svx files were processed with an older version of survex which did not support this feature and so a comment was used instead.)

<p>Troggle produces very useful auto-generated reports of the status of the wallets and the survex files:
<ul>
<li><a href="http://expo.survex.com/survey_scans/">List of all wallets</a> and the survex files produced from them (incomplete due to poor data entry especially since 2015)
<li><a href="http://expo.survex.com/expedition/2018">List of all trips and survex files</a> (scroll down to the bottom of the page) in any one year - includes a link to the logbook fragment for the relevant day
<li><a href="http://expo.survex.com/personexpedition/MichaelSargent/2018">List of trips and surveys</a> by a particular person in a particular year (nothing to do with wallets but added here for completeness).
</ul>
These troggle reports are invaluable for finding data entry errors or other mistakes.

<p> The paper tick-list tracks the following steps for each online wallet:
<ul>
<li><u>Survex</u>
<ul>
<li>Data
<li>LRUD
<li>Description + QMs
</ul>
<li><u>Drawn</u>
<ul>
<li>Plan + Xsections
<li>Elevation
</ul>
<li><u>Scanned</u>
<ul>
<li>In cave notes
<li>Plan + Xsections
<li>Elevation
</ul>

<li>Tunnel
<li>Online guidebook updated
<li>json file edited
</ul>
<p>(where the "json file edited" step only refers to the initial editing of the json file to ensure 
that it has the right people, date and cave identifier and name).

<p>A fully-populated and complete <em><span style="font-family:monospace">contents.json</span></em> file
looks like this:
<pre>
			{
			 "description written": false, 
			 "website updated": false, 
			 "people": [
					"Dickon Morris",
					"Jon Arne Toft",
					"Becka Lawson"], 
			 "elev not required": false, 
			 "cave": "2018-dm-07", 
			 "survex not required": false, 
			 "qms written": true, 
			 "plan not required": false, 
			 "electronic survey": false, 
			 "plan drawn": true, 
			 "date": "2018-07-13", 
			 "elev drawn": true, 
			 "description url": "", 
			 "survex file": "caves-1626/2018-dm-07/2018-dm-07.svx", 
			 "name": "Homecoming cave"
			}
</pre>
Yes, this is <a href="https://en.wikipedia.org/wiki/JSON">a programming format</a>
  (standardised in 2013) and every comma is critical.
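<p>Because a single misplaced comma makes the whole file unreadable, it can help to test a file after editing it by hand. The following is a minimal sketch using Python's standard <span style="font-family:monospace">json</span> module; the helper name <span style="font-family:monospace">check_contents_json</span> is hypothetical and this is not taken from <span style="font-family:monospace">wallets.py</span>.

```python
import json


def check_contents_json(text):
    """Return None if text is a valid JSON object, otherwise a short error message."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing failed,
        # which is usually at or just after the missing or extra comma.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(data, dict):
        return "top level is not a JSON object"
    return None
```

<p>Pass it the text of a <span style="font-family:monospace">contents.json</span> file; a result of <span style="font-family:monospace">None</span> means the file parsed cleanly, anything else says roughly where to look.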

<p>When entering people's names it is important not to use any funny characters (such as "?") because
people's names here are used by the software to construct filenames for the surveying to-do lists. And "?" (for instance) is illegal
in filenames on Windows computers.
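<p>A check along the following lines could catch such names before they cause trouble. It is a sketch only: <span style="font-family:monospace">bad_filename_chars</span> is a hypothetical helper, not something <span style="font-family:monospace">wallets.py</span> is known to contain, and the character list is the set of printable characters that Windows reserves in filenames.

```python
# Printable characters that Windows does not allow in file names.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')


def bad_filename_chars(name):
    """Return the characters in name that cannot be used in a Windows filename."""
    return sorted(set(name) & WINDOWS_FORBIDDEN)
```

<p>An empty list means the name is safe to use as it stands; otherwise the list shows which characters to remove from the "people" entry.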

<h3>"To do" lists for every caver</h3>
<p>The folder containing all the wallets for the year, e.g.
<pre>
/home/expo/expofiles/surveyscans/2018/
</pre>
will, after the appropriate magic has happened, contain a file 
<p>
<span style="font-family:monospace; size=x-small; background-color: lightgray">index.html</span>
<p>
which lists all the wallets which have uncompleted tasks, and lists all the people responsible for completing them. 
You can see <a href="http://expo.survex.com/expofiles/surveyscans/2018/">the index.html for 2018</a>. Also there will be a linked file for each individual 
for their personal to-do list, and each online wallet contains its own 
<span style="font-family:monospace">index.html</span> file which describes the survey production status for that wallet.

<p>The magic creates index.html files in each folder /2018/2018#nn/
and creates or updates a webpage for each person listed in any of the contents.json files
in the folder /2018/, e.g. <a href="http://expo.survex.com/expofiles/surveyscans/2018/Becka%20Lawson.html">Becka Lawson.html</a>.

<p>All this magic is created by a script <span style="font-family:monospace; size=x-small; background-color: lightgray">wallets.py</span>. 

<h3>Setting up the online wallets</h3>
<p>When, at the beginning of expo, you create the folder in
<span style="font-family:monospace; size=x-small; background-color: lightgray">expofiles/surveyscans/</span> for the current year, e.g. 
<span style="font-family:monospace; size=x-small; background-color: lightgray">/2019/</span>, you will copy <span style="font-family:monospace">wallets.py</span> 
from the previous year's folder. You will do this on your own laptop or on the expo laptop.
<p>You will also manually create a number of subfolders, e.g. 2019#01, 2019#02 etc. to be ready for the influx of
new trip surveys.
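<p>Creating the empty subfolders by hand is tedious, so here is a small sketch of how it might be automated. The function <span style="font-family:monospace">make_wallet_folders</span> is hypothetical; it assumes only the year#nn naming pattern shown above, and it never touches a folder that already exists.

```python
from pathlib import Path


def make_wallet_folders(year_dir, year, count):
    """Create wallet folders <year>#01 .. <year>#<count> inside year_dir.

    Existing folders are left alone. Returns the folders actually created.
    """
    created = []
    for n in range(1, count + 1):
        wallet = Path(year_dir) / f"{year}#{n:02d}"
        if not wallet.exists():
            wallet.mkdir(parents=True)
            created.append(wallet)
    return created
```

<p>For example, <span style="font-family:monospace">make_wallet_folders("expofiles/surveyscans/2019", 2019, 30)</span> would create 2019#01 to 2019#30.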
<p>Next you will test that the magic works: open a terminal in 
<span style="font-family:monospace; size=x-small; background-color: lightgray">expofiles/surveyscans/2018/</span> and run
<p>
<span style="font-family:monospace; size=x-small; background-color: lightgray">python wallets.py</span>
<br>
<p>This will create a default <span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span> and
<span style="font-family:monospace; size=x-small; background-color: lightgray">index.html</span> in each online wallet subfolder and also an 
<span style="font-family:monospace; size=x-small; background-color: lightgray">index.html</span> in the <span style="font-family:monospace; size=x-small; background-color: lightgray">/2018/</span> folder.

<p>This script works fine on Linux (Debian, Xubuntu, etc.) and also now works fine in the <a href="https://www.howtogeek.com/249966/how-to-install-and-use-the-linux-bash-shell-on-windows-10/">Windows 10 bash system</a>. 

<h3>Maintaining the online wallets</h3>
<p>Ideally the cavers who are scanning their notes and typing in the survey data will also be updating the 
<span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span> file in their wallet. In your dreams.

<p>The first difficulty when editing a blank <span style="font-family:monospace">contents.json</span> 
for a newly-created wallet is finding out which cave the wallet describes. 
The label on the plastic wallet may say "radaghost to blitzkriek"
(or whatever) but without the name of the cave you can't find the .svx files 
as you don't know that you need to look in e.g. loser/caves-1626/2018-dm-07/.
Usually the cave number is written by hand on the label of the wallet. Sometimes it will just give the 
informal name of the cave, e.g. "Homecoming", instead of the identifier "2018-dm-07" you want.
<p>A regular task during expo is for a nerd to review the <span style="font-family:monospace">contents.json</span> files for
recently created wallets and to check that names, dates and cave numbers are correct.

You will run 
<p>
<span style="font-family:monospace; size=x-small; background-color: lightgray">python wallets.py</span>
<p>
regularly, after every batch of survey data is entered or scanned. 
<p>This will always overwrite all the <span style="font-family:monospace">index.html</span> files but it will never touch
the <span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span> files.
<p>You will also regularly synchronise your laptop
and the expo laptop with <br />
<span style="font-family:monospace">expo.survex.com/expofiles/surveyscans/2018/</span>
<br />and this is where it gets tricky.
<p><span style="font-family:monospace">expo.survex.com/expofiles/</span> is <font color=red><b>not under version control</b></font>, 
so the most recent person
to upload the contents of <span style="font-family:monospace">/2018/</span> <font color=red><b>will overwrite everyone else's work</b></font>. 
This does not matter for the autogenerated files, but it is vital that it does not overwrite all the painfully manually edited
<span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span> files. Which is very easy to do.
This does mean that this is one of the cases where 
it may be better to use <span style="font-family:monospace; size=x-small; background-color: lightgray">rsync</span> rather than an FTP client such as Filezilla.

<h4>Naming the included files</h4>
<p>The script detects whether notesX.jpg, planX.jpg and elevX.jpg files are present, and 
produces a reminder/warning if they are not, even if these have all been scanned 
and given different names.
<p>
The job of the checker (perhaps on a second pass) is to rename files so that these
warnings disappear. But if tunnel or therion files have already been produced, don't rename anything.
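<p>To see in advance which warnings a wallet will produce, a check like the following can be run on a wallet folder. It is a sketch under stated assumptions, not the logic of <span style="font-family:monospace">wallets.py</span>: the name <span style="font-family:monospace">missing_scans</span> is made up, and it assumes a file counts if its name starts with notes, plan or elev and ends in .jpg, ignoring upper and lower case.

```python
from pathlib import Path

# The three kinds of scan expected in each wallet.
PREFIXES = ("notes", "plan", "elev")


def missing_scans(wallet_dir):
    """Return the prefixes for which no <prefix>*.jpg file exists in wallet_dir."""
    names = [p.name.lower() for p in Path(wallet_dir).iterdir() if p.is_file()]
    return [
        prefix
        for prefix in PREFIXES
        if not any(n.startswith(prefix) and n.endswith(".jpg") for n in names)
    ]
```

<p>An empty list means all three kinds of scan are present under the expected names.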


<h4>Not under version control</h4>
<p>
As all this is not under version control the timestamps of the files are really quite important in figuring things out when someone makes an update mistake.
<p>
So the script <span style="font-family:monospace; size=x-small; background-color: lightgray">wallets.py</span> has been fixed so that:
<ul>
<li>the generated <span style="font-family:monospace; size=x-small; background-color: lightgray">index.html</span> file in each wallet folder is given the same timestamp as the <span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span> file there, <em>not the time when the script is run</em>. This is unusual but intentional and in practice very helpful.
<li>the script no longer overwrites the <span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span> files every time it runs. It only changes that file's timestamp if it actually changes anything in the contents.json file.
</ul>
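<p>For anyone maintaining the script, the timestamp behaviour in the first point can be reproduced with <span style="font-family:monospace">os.utime</span>. This is a minimal sketch that assumes nothing about how <span style="font-family:monospace">wallets.py</span> actually does it; <span style="font-family:monospace">copy_mtime</span> is a hypothetical helper name.

```python
import os


def copy_mtime(source, target):
    """Give target the same access and modification times as source."""
    st = os.stat(source)
    # Use the nanosecond values so that no precision is lost in the copy.
    os.utime(target, ns=(st.st_atime_ns, st.st_mtime_ns))
```

<p>It would be called after writing the generated file, e.g. <span style="font-family:monospace">copy_mtime("contents.json", "index.html")</span> inside a wallet folder.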

<h4>Useful rsync scripts</h4>
<p>A copy of useful rsync scripts is kept in a file such as 
<span style="font-family:monospace; size=x-small; background-color: lightgray">expo.survex.com/expofiles/rsync2018toserver</span>. Always run it with the -n option first,
to see what overwriting you will do.

<h3>More <em><span style="font-family:monospace; size=x-small; background-color: lightgray">wallets.py</span></em> magic</h3>
<p>The python script does more than just re-format the <span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span> data into
different formats. It also 
<ul>
<li>checks whether the .svx files listed are actually present <br>in the <a href="http://expo.survex.com/repositories/home/expo">::loser::</a> repository
<li>checks for the presence of notesXXX.jpg, planXXX.jpg and elevXXX.jpg files
<li>creates a template <span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span> in any wallet which does not have one.
<li>creates helpful URL links to the existing online survey documentation for the cave being surveyed
<li>creates helpful URL links to the working files you are managing on your own laptop
</ul>

<p>Things it might do in future (if someone gets around to it) include:
<ul>
<li>checking the cave number specified matches the folder for the .svx file,
<li>checking that the *ref: field in the survex file is the same as the wallet name,
<li>detecting whether there is a description or a list of QMs in the survex file,
<li>accepting a list of .svx files and not just one (a very common thing),
<li>checking the name of the cave against the cave number,
<li>checking whether the website page even exists for this cave,
<li>being more intelligent about .topo files and thus the lack of scan files,
<li>checking the date is in the recent past, etc.
</ul>

<h3>How <em><span style="font-family:monospace; size=x-small; background-color: lightgray">contents.json</span></em> fields match 
<em><span style="font-family:monospace; size=x-small; background-color: lightgray">index.html</span></em> reports</h3>
<p>
<em>to be written...</em>
<hr>
Old notes, being turned into real documentation...
<pre>
# Instructions
# 2018-08-14
# Philip Sargent


Wookey told me to sort out the contents.json files in expofiles/surveyscans/2018/
and these are my notes to remind myself what this entails.

The job is to populate the contents.json file in each folder, e.g.

expofiles/surveyscans/2018/2018#03/contents.json

using the following input materials:
- the wallet 2018#03 and the papers inside it. This is in the 2018 lever-arch file.
- the folder in repo 'loser' holding the appropriate .svx files e.g.
  "caves-1623/2017-cucc-24/gshclimb.svx"
- the script expofiles/surveyscans/2018/wallets.py (run by "python wallets.py")

the "wallets.py" script creates index.html files in each folder /2018/2018#nn/
and creates or updates a webpage for each person listed in any of the contents.json files
in the folder /2018/.

The script wallets.py requires that the //loser// repo is populated on the machine that you
run the script on so that it can find the .svx files. 

If your machine has the ::loser:: repo in a different place from that expected by the script, you can just 
put the path on the command line:

python wallets.py "/mnt/d/CUCC-Expo/loser/"

Before doing anything else, run wallets.py. This will create empty template contents.json
files in each folder.

You may need to create missing folders, e.g. I just had to create /2018/2018#30 to #32.

Every time you finish entering the data in contents.json in a folder, 
run wallets.py to update the "person" html files and to 
re-generate the index.html file for the 2018 folder as 
a whole (surveyscans/2018/index.html).

There are ambiguities about how the entries in the contents.json actually lead to
reminder instructions in the html files produced, and this is particularly
difficult for electronic caves where the topo files are missing
and for surface prospecting where it is not clear which of the actions
should be done and thus which products should be produced.

This needs to be documented.

For prospecting and surface surveying it is not clear whether the default folder
for the url link should be repo ::loser:: surface/1623/allplateau.svx

When there is more than one .svx file there seems to be no way of recording the list 
in contents.json so it is impossible to tell what was done on that trip or whether
there is anything missing. This is especially true if it was electronic and the 
.topo files are missing. Wookey confirms that this is the case.


HINT
When there are a lot of wallets all with the same cave, make your own template
with the cave name and the right folder prefix for the svx folder 
(in the loser repo) and copy it in to all those wallet folders - overwriting 
the blank template produced by the wallets.py

# Update March 2019

a consolidated to-do list of the last 3 years on the server:
http://expo.survex.com/expofiles/surveyscans/2016-18/index.html
This is a hand-done kludge and only the first level of links works - which is to the individual person's page.

the lists for the last 3 years individually and all the links are working for each wallet page: 
both local links to your PC and to the right location of the .svx files on the troggle server.
http://expo.survex.com/expofiles/surveyscans/2016/index.html
http://expo.survex.com/expofiles/surveyscans/2017/index.html
http://expo.survex.com/expofiles/surveyscans/2018/index.html
and all the names of people have been hand-edited in the .json files to be consistent and identical.

2015 has now been done stand-alone but there is no consolidated report for 2015-18 yet. 
The big task was editing everyone's names to be exactly the same version of the name as used in other years.


For 2014 and earlier one needs to do a lot more data entry. The contents.json files for 2014 and earlier do not say who the
people were on the trip. So we would need to work from the svx files (where they contain the *ref: wallet ID), 
original plastic wallets (and the scanned drawings and notes – which are incomplete) to enter that data. 
This is made much easier by the troggle reports
http://expo.survex.com/survey_scans/
http://expo.survex.com/expedition/2014
http://expo.survex.com/survey/2018%2330

This is probably not worth doing except maybe for specific critical connections.

The script runs without errors on each of the years 1999-2014, but the results are less useful, e.g. see
http://expo.survex.com/expofiles/surveyscans/1999/
or 
http://expo.survex.com/expofiles/surveyscans/2014/

</pre>


<hr />

</body>
</html>