
<html>
<head>
<title>CUCC Expedition Handbook: Programmers manual</title>
<link rel="stylesheet" type="text/css" href="../css/main2.css" />
</head>
<body>
<h2 id="tophead">CUCC Expedition Handbook - Online systems</h2>
<h1>Expo Data Maintenance Manual</h1>

<h2><a id="manual">Expo data management programmers' manual</a></h2>

<ul>
<li>This page is <i>not</i> for cavers wanting to know how to record their cave survey data.
<li>This page is <i>not</i> for cavers wanting to know how to type in logbooks or upload photographs.

<li>This page <i>is for programmers</i> who are modifying the software or helping cavers do their thing.
</ul>

<p>Editing the expo data management system is an adventure. Until 2007, there was no
guide which explained the whole thing as a functioning system. Learning
it by trial and error is non-trivial. There are lots of things we
could improve about the system, and anyone with some computer nous is
very welcome to muck in. It is slowly getting better organised.</p>

<p>This manual is organized in a how-to sort of style. The categories,
rather than referring to specific elements of the data management system, refer to
processes that a maintainer would want to do.</p>
<p>Note that to display the survey data you will need a copy of the <a href="getsurvex.html">survex</a> software.

<p>Go elsewhere if this is what you want to know:
<ul>
<li><a href="uploading.html">How to upload photos</a></li>
<li><a href="logbooks.html">Typing in logbook entries</a></li>
</ul>

<h3>Contents of this manual</h3>

<ol>
<li><a href="#usernamepassword">Getting a username and password</a></li>
<li><a href="#repositories">The repositories</a></li>
<li><a href="#howitworks">How the data management system works</a></li>
<li><a href="#quickstart">Quick start</a></li>
<li><a href="#editingthedata management system">Modifying the data management system</a></li>
<li><a href="#expowebupdate">The expoweb-update script</a></li>
<li><a href="#cavepages">Updating cave pages</a></li>
<li><a href="#updatingyears">Updating expo year pages</a></li>
<li><a href="#tickingoff">Ticking off QMs</a></li>
<li><a href="#surveystatus">Maintaining the survey status table</a></li>
<li><a href="#automation">Automation</a></li>

</ol>
Appendices:
<ul>
<li><a href="website-history.html">Website history</a> - a history of the data management system up to 2019</li>
<li><a href="c21bs.html">Taking Expo Bullshit into the 21st Century</a> - initial report from 1996</li>
</ul>

<h3><a id="usernamepassword">Getting a username and password</a></h3>

<p>Use these credentials for access to the site. The user is 'expo',
  with a cavey:beery password. Ask someone if this isn't enough clue for you.
  <b>This password is important for security</b>. The whole site <strong>will</strong> get hacked by spammers or worse if you are not careful with it. Use a secure method for passing it on to others who need to know (i.e. not unencrypted email), don't publish it anywhere, and don't check it in to the data management system by accident. A lot of people use it and changing it is a pain for everyone, so do take a bit of care.
</p>
<p>Note that you don't need a password to view most things, but you will need one to change them.</p>

<p>This password is all you need to log in to troggle and to use the control panel. But if you want to edit the software itself, or update webpages, then
you will also need to get a cryptographic key and register it with the server. See <a href="computing/keyexchange.html">key exchange</a> for details.

<p>Unfortunately, pushing cave data to ::loser:: and ::drawings:: also needs a key. So cavers entering their cave survey data
currently have to use a machine on which this is already set up. These machines are
the <i>expo laptop</i> and the laptop '<i>aziaphale</i>', which live in the potato hut during expo.


<h3><a id="repositories">The repositories</a></h3>

<p>All the expo data is contained in 4 "repositories" at
expo.survex.com. This is currently hosted on a free virtual server we have blagged on a server farm.
We use a distributed version control system (DVCS) to manage these repositories because this allows simultaneous collaborative
editing and keeps track of all changes so we can roll back and have branches if needed.</p>

<p>The site has been split into four parts:</p>

<ul>
 <li><a href="http://expo.survex.com/repositories/home/expo/loser/graph/">loser</a> - the survex cave survey data</li>
 <li><a href="http://expo.survex.com/cgit/drawings/.git/log">drawings</a> - the tunnel and therion cave data and drawings</li>
 <li><a href="http://expo.survex.com/repositories/home/expo/expoweb/graph">expoweb</a> - the website pages, handbook, generation scripts</li>
 <li><a href="http://expo.survex.com/cgit/troggle/.git/log">troggle</a> - the database/software part of the survey data management system - see <a href="troggle-ish.html">notes on troggle</a> for further explanation</li>
</ul>

<p>In 2019 we are in the process of migrating from mercurial to git for the repos.

<p>All the scans, photos, presentations, fat documents and videos are
stored just as files (not in version control) in 'expofiles'. See
below for details on that.</p>

<h3><a id="howitworks">How the data management system works</a></h3>

<p>Part of the data management system is static HTML, but quite a lot is generated by scripts.
So anything you check in which affects cave data or descriptions won't appear on the site until
the data management system update scripts are run.
This happens automatically every 15 minutes, but you can also kick off a manual update.
See 'The expoweb-update script' below for details.</p>

<p>Also note that the ::expoweb:: web pages and cave data reports you see on the visible website
are not the same as the version-controlled "master" expoweb repo.
So for your committed and pushed changes to become visible on the website,
they have to be 'pulled' from the repo onto the webserver.</p>

<h3><a id="editthispage">Using 'Edit This Page'</a></h3>
<p>This edits the file served by the webserver on
the expo server, but it does not update the copy of the file in the
repository. To properly finish the job you need to:
<ul>
<li>
ssh into expo@expo.survex.com (use putty on a Windows machine)
<li>cd to the directory containing the repo you want, i.e. "cd loser" for
cave data or "cd expoweb" for the handbook and visible data management system, which takes you to /home/expo/expoweb
<li>Then run "hg status" (to check what
changes are pending),
<li>then "hg diff" to see the changes in detail
(or "hg diff|less" if you know how to use "less" or "more") and
<li>then DO NOT just run 'hg commit' unless you know how emacs works, as it will dump
you into an emacs editing window (C-x C-c is the way to exit emacs). Instead, do
'hg commit -m "found files left over - myName" '
which supplies the obligatory comment with the commit operation.
</ul>
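<p>The hg steps above could be scripted along these lines. This is only a sketch in Python: the function names are made up for illustration, it assumes <tt>hg</tt> is on the PATH of the machine you have ssh'd into, and it simply stops with an error if any step fails (which includes <tt>hg commit</tt> finding nothing to commit).</p>

```python
import subprocess


def hg_commands(message):
    """Return the hg command lines from the steps above, in order."""
    return [
        ["hg", "status"],                 # list files with pending changes
        ["hg", "diff"],                   # show the changes in detail
        ["hg", "commit", "-m", message],  # -m avoids being dumped into an editor
    ]


def commit_pending(repo_dir, message, run=subprocess.run):
    """Run status, diff and commit inside repo_dir, stopping at the first failure.

    The run parameter exists only so the function can be tried out
    without touching a real repository.
    """
    for command in hg_commands(message):
        run(command, cwd=repo_dir, check=True)
```

<p>For example, <tt>commit_pending("/home/expo/expoweb", "found files left over - myName")</tt> would tidy up after an 'Edit This Page' session on the handbook.</p>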


<h3><a id="quickstart">Quick start</a></h3>

<p>If you know what you are doing, here is the basic info on what's where:<br>
(if you don't know what you're doing, skip to <a href="#editingthedata management system">Editing the data management system</a> below.)

<p>This section was all about how to use mercurial. Since we are changing to git it has been
moved to <a href="computing/qstart-hg.html">a separate place</a>.


<dl>
    <dt>expofiles (all the big files and documents)</dt>

<p>Photos, scans (logbooks, drawn-up cave segments) (This was about
40GB of stuff in 2017 which you probably don't actually need locally).
<p>If you don't need an entire copy of all of it, then it is probably best to use Filezilla/ftp to
copy just a small part of the filesystem to your own machine and to upload the bits you add to or edit.
Instructions for installing and using Filezilla are found in the expo user instructions for
uploading photographs: <a href="uploading.html">uploading.html</a>.

<p> To sync all
the files from the server to local expofiles directory:</p>

<p><tt>rsync -av expo@expo.survex.com:expofiles /home/expo</tt></p>

<p>To sync the local expofiles directory back to the server (but only if your machine runs Linux):</p>
<p><tt>rsync --dry-run --delete-after -av /home/expo/expofiles expo@expo.survex.com:</tt></p>
then CHECK that the list of files it produces matches the ones you absolutely intend to delete forever! ONLY THEN do:
<p><tt>rsync -av /home/expo/expofiles expo@expo.survex.com:</tt></p>

<p>(Do be <b>incredibly</b> careful not to delete piles of stuff then rsync back, or to get the directory level of the command wrong, as it'll all get deleted on the server too, and we may not have backups! It's <b>absolutely vital</b> to use <tt>rsync --dry-run --delete-after -av</tt> first to check what would be deleted.)
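<p>One way to make the "dry run first" habit harder to skip is to build both command lines together. The following is a minimal sketch in Python, not something that exists on the server: the function name is hypothetical, and the <tt>-v</tt> on the dry run is there so that rsync actually prints the list of files to check.</p>

```python
def rsync_upload_commands(local_dir, remote="expo@expo.survex.com:"):
    """Return (dry_run, real) rsync command lines for pushing local_dir."""
    # Without a colon rsync treats the destination as a local directory
    # name rather than a remote host, so refuse to build the command.
    if ":" not in remote:
        raise ValueError("remote needs a colon, e.g. expo@expo.survex.com:")
    # A trailing slash on the source changes which directory level is
    # copied, so strip it to match the commands shown above.
    source = local_dir.rstrip("/")
    dry_run = ["rsync", "--dry-run", "--delete-after", "-av", source, remote]
    real = ["rsync", "-av", source, remote]
    return dry_run, real
```

<p>Run the first command, read its output, and only then run the second, for example with <tt>subprocess.run(dry_run, check=True)</tt>.</p>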

<p>If you are using rsync from a Windows machine you will <em>not</em> get all the files, as some filenames are incompatible with Windows. What will happen is that rsync will invisibly change the names as it downloads them from the Linux expo server to your Windows machine, but then it forgets what it has done and tries to re-upload all the renamed files to the server even if you have touched none of them. Simple filenames using all lowercase letters and no funny characters cause no problems, but we have nothing in place to stop anyone creating an incompatible filename somewhere in that 60GB, or to detect the problem at the time. So don't do it. If you have a Windows machine use Filezilla, not rsync.
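<p>A check for such filenames could look something like this. It is a sketch in Python under stated assumptions: the function names are invented here, and the rules used are the general Windows naming restrictions (forbidden characters, trailing space or dot, reserved device names, and names differing only by case), not anything specific to expofiles.</p>

```python
import os

# Characters that Windows does not allow in file names.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')
# Device names that Windows reserves, with or without an extension.
WINDOWS_RESERVED = {"CON", "PRN", "AUX", "NUL"} | {
    f"{dev}{n}" for dev in ("COM", "LPT") for n in range(1, 10)
}


def windows_problem(name):
    """Return a short reason why name is unsafe on Windows, or None."""
    if any(ch in WINDOWS_FORBIDDEN or ord(ch) in range(32) for ch in name):
        return "forbidden character"
    if name.endswith((" ", ".")):
        return "trailing space or dot"
    if name.split(".")[0].upper() in WINDOWS_RESERVED:
        return "reserved device name"
    return None


def find_problem_files(root):
    """Walk root and yield (path, reason) for names Windows cannot store."""
    for dirpath, dirnames, filenames in os.walk(root):
        seen = {}
        for name in sorted(dirnames + filenames):
            reason = windows_problem(name)
            # Names differing only by case collide on a case-insensitive disk.
            if reason is None and name.lower() in seen:
                reason = "differs only by case from " + seen[name.lower()]
            seen.setdefault(name.lower(), name)
            if reason:
                yield os.path.join(dirpath, name), reason
```

<p>Looping over <tt>find_problem_files("/home/expo/expofiles")</tt> on a Linux copy would list the offenders before anyone with a Windows machine trips over them.</p>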

<p>(We may also have an issue with rsync not using the appropriate user:group  attributes for files pushed back to the server. This may not cause any problems, but watch out for it.)</p>
</dl>
<h3><a id="editingthedata management system">Editing the data management system</a></h3>

<p>To edit the data management system fully, you need to use the distributed version control system
(DVCS) software, which is currently mercurial/TortoiseHg.
Some (static text) pages can be edited directly on-line using the 'Edit This Page' link which you'll
see if you are logged into troggle. In general the dynamically-generated pages, such as those describing
caves which are generated from the cave survey data, cannot be edited in this way, but forms are provided
for some types of these, such as 'caves'.</p>

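<p>Mercurial records a username with every commit. Set yours in your Mercurial configuration file (<tt>.hgrc</tt>, or <tt>Mercurial.ini</tt> on Windows):</p>

<p><tt>
[ui]<br/>username = Firstname Lastname &lt;myemail@example.com&gt;
</tt></p>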

<p>The commit has stored the changes in your local Mercurial DVCS, but it has not sent anything back to the server. To do that you need to:</p>

<p><tt>hg push</tt></p>

<p>Before pushing, you should do an <tt>hg pull</tt> to sync with upstream first. If someone else has edited the same files you may also need to do:</p>

<p><tt>hg merge</tt></p>

<p>(and sort out any conflicts if you've both edited the same file) before pushing again.</p>

<p>Simple changes to static files will take effect immediately, but changes to dynamically-generated files (cave descriptions, QM lists etc.) will not take effect until the server runs the expoweb-update script.</p>


<h3><a id="expowebupdate">The expoweb-update script</a></h3>

<p>The script at the heart of the data management system update mechanism is a makefile that runs the various generation scripts. It is run every 15 minutes as a cron job (at 0,15,30 and 45 past the hour), but if you want to force an update more quickly you can run it he</p>

<p>The scripts are generally under the 'noinfo' section of the site, just because that has (had) some access control. This will get changed to something more sensible at some point.</p>


<h3><a id="cavepages">Updating cave pages</a></h3>

<p>Cave description pages are automatically generated from a set of
cave files in noinfo/cave_data/ and noinfo/entrance_data/. These files
are named <tt>&lt;area&gt;-&lt;cavenumber&gt;.html</tt> (where area is 1623 or 1626). These
files are processed by troggle. Use <tt>python databaseReset.py
caves</tt> in /expofiles/troggle/ to update the site/database after
editing these files.</p>
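<p>If you want to sanity-check a filename before running the reset, something like the following would do. It is a sketch in Python only: the function name is made up, the example filename is hypothetical, and the pattern assumes a cave number consists of letters, digits and hyphens, which may not cover every cave.</p>

```python
import re

# Assumption: the cave number part contains only letters, digits and hyphens.
CAVE_FILE = re.compile(r"^(1623|1626)-([A-Za-z0-9-]+)\.html$")


def parse_cave_filename(filename):
    """Return (area, cavenumber) for a cave data filename, or None if it does not fit."""
    match = CAVE_FILE.match(filename)
    return match.groups() if match else None
```

<p>Here <tt>parse_cave_filename("1623-204.html")</tt> gives the area and cave number as a pair, while a file with any other area prefix gives <tt>None</tt>.</p>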

<p>Clicking on 'New cave' (at the bottom of the cave index) lets you enter a new cave. <a href="caveentry.html">Info on how to enter new caves has been split into its own page</a>.</p>

<p>(If you remember something about CAVETAB2.CSV for editing caves, that was
superseded in 2012).</p>
<p>This may be a useful reminder of what is in a survex file: <a href="survey/how_to_make_a_survex_file.pdf">how to create a survex file</a>.</p>

<h3><a id="updatingyears">Updating expo year pages</a></h3>

<p>Each year's expo has a documentation index which is in the folder</p>

<p><tt>/expoweb/years</tt></p>

<p>so to check out the 2011 page, for example, you would use</p>

<p><tt>hg clone ssh://expo@expo.survex.com/expoweb/years/2011</tt></p>

<p>Once you have pushed your changes to the repository you need to update the server's local copies, by logging into the server with ssh and running <tt>hg update</tt> in the expoweb folder.</p>

<h3>Adding a new year</h3>
<p>Edit folk/folk.csv, adding the new year to the end of the header
line as a new column. On each person's line, add just a comma (blank
cell) for people who weren't there, a 1 for people who were there, and
a -1 for people who were there but didn't go caving. Add new lines for
new people, with the right number of columns.</p>
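<p>Adding the column by hand is easy to get wrong, so here is a minimal sketch in Python of doing it mechanically. It is not an existing expo script: the function name is made up, and it assumes the person's name is in the first column of each line, which you should check against the real folk.csv. It does not add lines for new people.</p>

```python
import csv
import io


def add_year_column(csv_text, year, attendance):
    """Return csv_text with a new last column for year.

    attendance maps a person's name (assumed to be the first column) to
    "1" (was there) or "-1" (was there but did not go caving); anyone
    not listed gets a blank cell.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, people = rows[0], rows[1:]
    header.append(str(year))
    for row in people:
        row.append(attendance.get(row[0], ""))
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows([header] + people)
    return out.getvalue()
```

<p>Every existing line gains exactly one cell, so the number of columns stays consistent across the file.</p>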

<p>This process is tedious and error-prone and ripe for improvement.
Adding a list of people, from the bier book, and their aliases would be
a lot better, but some way to make sure that names match with previous
years would be
good.</p>

<h3><a id="tickingoff">Ticking off QMs</a></h3>

<p>To be written.</p>


<h3><a id="surveystatus">Maintaining the survey status table</a></h3>

<p>There is a table in the survey book which has a list of all the surveys and whether or not they have been drawn up, and some other info.</p>

<p>This is generated by the script tablizebyname-csv.pl from the input file Surveys.csv.</p>


<hr />
</body>
</html>