

<html>
<head>
<title>CUCC Expedition Handbook: The Website</title>
<link rel="stylesheet" type="text/css" href="../css/main2.css" />
</head>
<body>
<h2 id="tophead">CUCC Expedition Handbook</h2>
<h1>Expo Website</h1>
<p>The website is now large and complicated with a lot of (too many!) moving parts. This handbook section contains info at various levels: simple 'Howto add stuff' information for the typical expoer, more detailed info for cloning it onto your own machine for more significant edits, and structural info on how it's all put together for people who want/need to change things.</p>

<ul>
<li><a href="#update">Updating the website</a></li>
<li><a href="#checkout">Expo Website manual</a></li>
<li><a href="expodata.html">Expo website developer info</a></li>
</ul>

<h2><a id="update">Updating the website - HOWTO</a></h2>

<p>Simple <a href="checkin.htm">instructions</a> for updating the website
(on the expo machine).</p>

<p>You can update the site via the troggle pages, by editing pages online via a browser, by editing them locally on disk, or by checking out the relevant part to your computer and editing it there. Which is best depends on your knowledge and what you want to do. For simple addition of cave or survey data, troggle is recommended. For other edits it's best to edit the files directly rather than using the 'edit this page' button, but that means you either need to be on expo with the expo computer, or be able to check out a local copy. If neither of these applies, then using the 'edit this page' button is fine.</p>

<p>It's important to understand that everything on the site is stored in a distributed version control system (DVCS) (called 'mercurial'), which means that every edited file needs to be 'checked in' at some point. The Expo website manual goes into more detail about this, below. This stops us losing data and makes it very hard for you to screw anything up permanently, so don't worry about making changes - they can always be reverted if there is a problem. It also means that several people can work on the site on different computers at once and normally merge their changes easily.</p>

<p>Increasing amounts of the site are autogenerated, not just files, so you have to edit the base data, not the generated file. All autogenerated files say 'This file is autogenerated - do not edit' at the top, so check for that before wasting time on changes that will just be overwritten.</p>

<h2><a id="checkout">Expo website manual</a></h2>

<p>Editing the expo website is an adventure. Until now, there was no guide which explains the whole thing as a functioning system. Learning it by trial and error is non-trivial. There are lots of things we could improve about the system, and anyone with some computer nous is very welcome to muck in. It is slowly getting better organised.</p>

<p>This manual is organized in a how-to sort of style. The categories, rather than referring to specific elements of the website, refer to processes that a maintainer would want to do.</p>

<h3>Contents</h3>

<ol>
<li><a href="#usernamepassword">Getting a username and password</a></li>
<li><a href="#repositories">The repositories</a></li>
<li><a href="#howitworks">How the website works</a></li>
<li><a href="#quickstart">Quick start</a></li>
<li><a href="#editingthewebsite">Editing the website</a></li>
<li><a href="#mercurialinwindows">Using Mercurial/TortoiseHg in Windows</a></li>
<li><a href="#expowebupdate">The expoweb-update script</a></li>
<li><a href="#cavepages">Updating cave pages</a></li>
<li><a href="#updatingyears">Updating expo year pages</a></li>
<li><a href="#logbooks">Adding typed logbooks</a></li>
<li><a href="#photos">Uploading photos</a></li>
<li><a href="#tickingoff">Ticking off QMs</a></li>
<li><a href="#surveystatus">Maintaining the survey status table</a></li>
<li><a href="#history">History</a></li>
<li><a href="#automation">Automation</a></li>
</ol>

<h3><a id="usernamepassword">Getting a username and password</a></h3>

<p>Use these credentials for access to the site. The user is 'expo',
  with a cavey:beery password. Ask someone if this isn't enough clue for you.
  <b>This password is important for security</b>. The whole site <strong>will</strong> get hacked by spammers or worse if you are not careful with it. Use a secure method for passing it on to others who need to know (i.e. not unencrypted email), don't publish it anywhere, and don't check it in to the website by accident. A lot of people use it and changing it is a pain for everyone, so do take a bit of care.
</p>

<p>Note that you don't need a password to view most things, but you will need one to change them.</p>

<h3><a id="repositories">The repositories</a></h3>

<p>All the expo data is contained in four 'mercurial' repositories at
expo.survex.com. This is currently hosted on a server at the university. Mercurial is a distributed version control system which allows collaborative editing and keeps track of all changes so we can roll back and have branches if needed.</p>

<p>The site has been split into four parts:</p>

<ul>
 <li>expoweb - the website itself, including generation scripts</li>
 <li>troggle - the database-driven part of the website</li>
 <li>loser - the survex survey data</li>
 <li>tunneldata - the tunnel data and drawings</li>
</ul>


<p>All the scans, photos, presentations, fat documents and videos are
stored just as files (not in version control). See below for details on that.</p>

<h3><a id="howitworks">How the website works</a></h3>

<p>Part of the website is static HTML, but quite a lot is generated by scripts. So anything you check in which affects cave data or descriptions won't appear on the site until the website update scripts are run. This happens automatically every 15 minutes, but you can also kick off a manual update. See 'The expoweb-update script' below for details.</p>

<p>Also note that the website you see is its own mercurial checkout (just like your local one) so that has to be 'pulled' from the server before your changes are reflected.</p>

<h3><a id="quickstart">Quick start</a></h3>

<p>If you know what you are doing here is the basic info on what's where:</p>

<dl>
    <dt>expoweb (The Website)</dt>
    <dd>
      <tt>hg [clone|pull|push] ssh://expo@expo.survex.com/expoweb</tt> (read/write)<br />
      <tt>hg [clone|pull|push] http://expo.survex.com/repositories/home/expo/expoweb/</tt> (read-only checkout)
    </dd>

    <dt>troggle (The Website backend)</dt>
    <dd>
      <tt>hg [clone|pull|push] ssh://expo@expo.survex.com/troggle</tt> (read/write)<br />
      <tt>hg [clone|pull|push] http://expo.survex.com/repositories/home/expo/troggle/</tt> (read-only checkout)
    </dd>

    <dt>loser (The survey data)</dt>
    <dd>
      <tt>hg [clone|pull|push] ssh://expo@expo.survex.com/loser</tt> (read/write)<br />
      <tt>hg [clone|pull|push] http://expo.survex.com/repositories/home/expo/loser/</tt> (read-only)
    </dd>

    <dt>tunneldata (The Tunnel drawings)</dt>
    <dd>
      <tt>hg [clone|pull|push] ssh://expo@expo.survex.com/tunneldata</tt> (read/write)<br />
      <tt>hg [clone|pull|push] http://expo.survex.com/repositories/home/expo/tunneldata/</tt> (read-only)
    </dd>
</dl>

<p>Photos and scans (logbooks, drawn-up cave segments) are kept as plain files. This is about
16GB of stuff, which you probably don't actually need locally. To sync
the files from the server to a local expoimages directory:</p>

<p><tt>rsync -av expo@expo.survex.com:expoimages /home/expo/fromserver</tt></p>

<p>To sync the local expoimage directory back to the server:</p>

<p><tt>rsync -av /home/expo/fromserver/expoimages expo@expo.survex.com:</tt></p>

<p>(do be careful not to delete piles of stuff then rsync back - as it'll all get deleted on the server too, and we may not have backups!)</p>
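
<p>To preview what a sync would do before running it for real, rsync's <tt>-n</tt> (<tt>--dry-run</tt>) option lists the files it would transfer without copying anything. A minimal sketch, using the same paths as the commands above:</p>

<pre>
# dry run: show what would be sent to the server, but change nothing
rsync -avn /home/expo/fromserver/expoimages expo@expo.survex.com:
</pre>

<p>If the list looks right, run the same command again without the <tt>n</tt>.</p>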

<h3><a id="editingthewebsite">Editing the website</a></h3>

<p>To edit the website fully, you need a mercurial client. Some (static text) pages can be edited directly on-line using the 'edit this page' link which you'll see if you are logged into troggle. Dynamically-generated pages cannot be edited in this way.</p>

<p>Mercurial can be used from the command line, but if you prefer a GUI, TortoiseHg is highly recommended on all OSes (available on Linux from Debian 6 and Ubuntu 11.04 onwards).</p>

<p>Linux: Install mercurial and tortoisehg-nautilus from synaptic,
then restart Nautilus with <tt>nautilus -q</tt>. If it works, you'll be able to see the TortoiseHg menus within your Nautilus windows.</p>

<p>Once you've downloaded and installed a client, the first step is to create what is called a checkout of the website. This creates a copy on your machine which you can edit to your heart's content. The command to initially check out ('clone') the entire expo website is:</p>

<p><tt>hg clone ssh://expo@expo.survex.com/expoweb</tt></p>

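<p>For subsequent updates, pull the latest changes from the server and update your working copy in one go:</p>

<p><tt>hg pull -u</tt></p>

<p>(<tt>hg update</tt> on its own only updates your working copy to changes you have already pulled.)</p>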

<p>In TortoiseHg, merely right-click on a folder you want to check out to, choose "Mercurial checkout," and enter</p>

<p><tt>ssh://expo@expo.survex.com/expoweb</tt></p>

<p>After you've made a change, commit it to your local copy with:</p>

<p><tt>hg commit</tt>   (you can specify filenames to be specific)</p>

<p>or by right-clicking on the folder and choosing commit in TortoiseHg. Mercurial can't always work out who you are. If you see a message like "abort: no username supplied", it was probably not set up to deduce that from your environment. It's easiest to give it the info in a config file at ~/.hgrc (create it if it doesn't exist, or add these lines if it already does) containing something like</p>

<p><tt>
[ui]<br/>username = Firstname Lastname &lt;myemail@example.com&gt;
</tt></p>
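
<p>To check that Mercurial has picked up the setting, you can ask it to print the value. This is just a quick sanity check; it should echo back the name and email you entered:</p>

<pre>
hg showconfig ui.username
</pre>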

<p>The commit has stored the changes in your local mercurial DVCS, but it has not sent anything back to the server. To do that you need to:</p>

<p><tt>hg push</tt></p>

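<p>If someone else has been editing at the same time, the push may be refused. In that case pull their changes with <tt>hg pull</tt>, then combine them with yours:</p>

<p><tt>hg merge</tt></p>

<p>Commit the result of the merge, then push again.</p>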

<p>Simple changes to static files will take effect immediately, but changes to dynamically-generated files (cave descriptions, QM lists etc.) will not take effect until the server runs the expoweb-update script.</p>
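
<p>Putting the steps above together, a typical editing session from the command line might look like the following sketch. The commit messages are only placeholders, and the last group of commands is only needed if someone else has pushed in the meantime:</p>

<pre>
# get the latest changes before you start
hg pull -u

# ... edit some files ...

hg status                                   # list the files you have changed
hg commit -m "Describe your change here"    # record the change locally
hg push                                     # send it to the server

# only if the push is refused because someone else pushed first:
hg pull
hg merge
hg commit -m "Merge"
hg push
</pre>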

<h3><a id="mercurialinwindows">Using Mercurial/TortoiseHg in Windows</a></h3>

<p>In Windows: install Mercurial and TortoiseHg of the relevant flavour from http://mercurial.selenic.com/downloads/ (ignoring antivirus/Windows warnings).</p>

<p>To start cloning a repository, start TortoiseHg Workbench and click File -&gt; Clone repository. A dialogue box will appear. In the Source box type</p>

<p><tt>ssh://expo@expo.survex.com/expoweb</tt></p>

<p>or similar for the other repositories. In the Destination box type wherever you want your local copies to live. Hit Clone, and it should prompt you for the usual beery password.</p>

<p>The first time you do this it will probably not work as it does not recognise the server. Fix this by running PuTTY, and connecting to the server 'expo@expo.survex.com' (on port 22). Confirm that this is the right server. If you succeed in getting a shell prompt then ssh connections are working and TortoiseHg should be able to clone the repo and send changes back.</p>


<h3><a id="expowebupdate">The expoweb-update script</a></h3>

<p>The script at the heart of the website update mechanism is a makefile that runs the various generation scripts. It is run every 15 minutes as a cron job (at 0,15,30 and 45 past the hour), but if you want to force an update more quickly you can run it he</p>

<p>The scripts are generally under the 'noinfo' section of the site just because that has (had) some access control. This will get changed to something more sensible at some point.</p>


<h3><a id="cavepages">Updating cave pages</a></h3>

<p>Cave description pages are automatically generated from a set of
cave files in noinfo/cave_data/ and noinfo/entrance_data/. These files
are named &lt;area&gt;-&lt;cavenumber&gt;.html (where area is 1623 or 1626). These
files are processed by troggle. Use <tt>python databaseReset.py
cavesnew</tt> in /expofiles/troggle/ to update the site/database after
editing these files.</p>

<p>(If you remember something about CAVETAB2.CSV for editing caves, that was
superseded in 2012).</p>

<h3><a id="updatingyears">Updating expo year pages</a></h3>

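<p>Each year's expo has a documentation index which is in the folder</p>

<p><tt>/expoweb/years</tt></p>

<p>Mercurial clones whole repositories rather than single subdirectories, so to get the 2011 page, for example, clone the whole of expoweb with</p>

<p><tt>hg clone ssh://expo@expo.survex.com/expoweb</tt></p>

<p>and look in <tt>expoweb/years/2011</tt>.</p>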

<h3><a id="logbooks">Adding typed logbooks</a></h3>

<p>Logbooks are typed up and put under the years/nnnn/ directory as 'logbook.html'.</p>

<p>Do whatever you like to try and represent the logbook in html. The only rigid structure is the markup to allow troggle to parse the files into 'trips':</p>
<pre>
&lt;div class="tripdate" id="t2007-07-12B"&gt;2007-07-12&lt;/div&gt;
&lt;div class="trippeople"&gt;&lt;u&gt;Jenny Black&lt;/u&gt;, Olly Betts&lt;/div&gt;
&lt;div class="triptitle"&gt;Top Camp - Setting up 76 bivi&lt;/div&gt;
&lt;div class="timeug"&gt;T/U 10 mins&lt;/div&gt;
</pre>
<p>Note that the IDs must be unique, so they are generated from 't' plus the trip date plus a, b, c etc. when there is more than one trip on a day.</p>

<hr>
<p>Older logbooks (prior to 2007) were stored as logbook.txt with just a bit of consistent markup to allow troggle parsing.</p>

<p>The formatting was largely freeform, with a bit of markup ('===' around header, bars separating date, &lt;place&gt; - &lt;description&gt;, and who) which allows the troggle import script to read it correctly. The underlines show who wrote the entry. There is also a format for time-underground info so it can be automagically tabulated.</p>

<p>So the format should be:</p>

<pre>
===2009-07-21|204 - Rigging entrance series| Becka Lawson, Emma Wilson, Jess Stirrups, Tony Rooke===

&lt;Text of logbook entry&gt;

T/U: Jess 1 hr, Emma 0.5 hr
</pre>
<hr>

<h3><a id="photos">Uploading photos</a></h3>

<p>Photos are stored in the general file area of the site under <a
href="http://expo.survex.com/expoimages/photos/">http://expo.survex.com/expoimages/photos/</a>.
They are organised by year, and by photographer. Please use directory
names like 2014/YourName (i.e. no spaces, CamelCase for names).</p>

<p>They are viewed at <a
href="http://expo.survex.com/photos/">http://expo.survex.com/photos/</a></p>

<p>Photos can be uploaded in 2 basic ways:</p>
<ol>
<li>rsync, scp or sftp as user 'expo' to expo.survex.com, into the directory expoimages/photos/&lt;year&gt;/&lt;PhotographerName&gt;</li>
<li>WebDAV upload to the special directory http://expo.survex.com/expoimages/uploads/&lt;year&gt;/&lt;PhotographerName&gt;</li>
</ol>
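
<p>As a sketch of the first method, the following rsync command copies a local folder of photos into the right place on the server. The local path <tt>~/Pictures/expo2014/</tt> and the names <tt>2014</tt> and <tt>YourName</tt> are placeholders, and the command assumes the year directory already exists on the server:</p>

<pre>
# the trailing slash on the source copies the folder's contents, not the folder itself
rsync -av ~/Pictures/expo2014/ expo@expo.survex.com:expoimages/photos/2014/YourName/
</pre>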

<p>See <a href="uploading.html">Photo/File Upload Instructions</a> for
using webdav/webfolders or winscp from your browser or with other
tools, on various OSes.</p>

<h3><a id="tickingoff">Ticking off QMs</a></h3>

<p>To be written.</p>


<h3><a id="surveystatus">Maintaining the survey status table</a></h3>

<p>There is a table in the survey book which has a list of all the surveys and whether or not they have been drawn up, and some other info.</p>

<p>This is generated by the script tablizebyname-csv.pl from the input file Surveys.csv.</p>

<h3><a id="history">History</a></h3>

<p>The CUCC Website was originally created by Andy Waddington in the early 1990s and was hosted by Wookey. The VCS was CVS. The whole site was just static HTML, carefully designed to be RISCOS-compatible (hence the short 10-character filenames) as both Wadders and Wookey were RISCOS people then. Wadders wrote a huge amount of info collecting expo history, photos, cave data etc.</p>

<p>Martin Green added the SURVTAB.CSV file to contain tabulated data for many caves around 1999, and a script to generate the index pages from it. Dave Loeffler added scripts and programs to generate the prospecting maps in 2004. The server moved to Mark Shinwell's machine in the early 2000s, and the VCS was updated to subversion.</p>

<p>In 2006 Aaron Curtis decided that a more modern set of generated, database-based pages made sense, and so wrote Troggle. This uses Django to generate pages. This reads in all the logbooks and surveys and provides a nice way to access them, and enter new data. It was separate for a while until Martin Green added code to merge the old static pages and new troggle dynamic pages into the same site. Work on Troggle still continues sporadically.</p>

<p>After expo 2009 the VCS was updated to hg, because a DVCS makes a great deal of sense for expo (where it goes offline for a month or two and nearly all the year's edits happen).</p>

<p>The site was moved to Julian Todd's seagrass server (in 2010), but the change from a 32-bit to 64-bit machine broke the website autogeneration code,
which was only fixed in early 2011, allowing the move to complete. The
data has been split into four separate repositories: the website,
troggle, the survey data, and the tunnel data. Seagrass was turned off at
the end of 2013, and the site has been hosted by Sam Wenham at the
university since Feb 2014.</p>


<h3 id="automation">Automation on cucc.survex.com/expo</h3>

<p>This section is entirely out of date (June 2014), and awaiting deletion.</p>

<p>The way things normally work, python or perl scripts turn CSV input into HTML for the website. Note that:</p>
<p>The CSV files are actually tab-separated, not comma-separated despite the extension.</p>
<p>The scripts can be very picky, and editing the CSVs with Microsoft Excel has broken them in the past; not sure if this is still the case.</p>
<table>
<caption>Overview of the automagical scripts on the expo website</caption>
<tr><th>Script location</th><th>Input file</th><th>Output file</th><th>Purpose</th></tr>
<tr>
  <td>/svn/trunk/expoweb/noinfo/make-indxal4.pl</td>
  <td>/svn/trunk/expoweb/noinfo/CAVETAB2.CSV</td>
  <td>many</td>
  <td>produces all cave description pages</td>
</tr>
<tr>
  <td>/svn/trunk/expoweb/noinfo/make-folklist.py</td>
  <td>/svn/trunk/expoweb/noinfo/folk.csv</td>
  <td>http://cucc.survex.com/expo/folk/index.htm</td>
  <td>Table of all expo members</td>
</tr>
<tr>
  <td>/svn/trunk/surveys/tablize-csv.pl<br />/svn/trunk/surveys/tablizebyname-csv.pl</td>
  <td>/svn/trunk/surveys/Surveys.csv</td>
  <td>http://expo.survex.com/expo/surveys/surveytable.html<br />http://expo.survex.com/surveys/surtabnam.html</td>
  <td>Survey status page: "wall of shame" to keep track of who still needs to draw which surveys</td>
</tr>
</table>


<p>Mercurial is a distributed revision control system. On expo this means that many people can edit independently and merge their changes with each other when they can access the internet. Mercurial is inefficient for scanned survey notes, which are large files that do not get modified, so they are kept as a plain directory of files.</p>

<h2>The website conventions bit</h2>
 <p>This is likely to change with structural change to the site, with style changes which we expect to implement and with the method by which the info is actually stored and served up.</p>
<p>... and it's not written yet, either :-)</p>
<ul>

<li>Structure</li>
<li>Info for each cave &ndash; automatically generated by <tt>make-indxal4.pl</tt></li>
<li>Contents lists &amp; relative links for multi-article publications like  journals. Complicated by expo articles being in a separate hierarchy from journals.</li>
<li>Translations</li>
<li>Other people's work - the noinfo hierarchy.</li>
<li>Style guide for writing cave descriptions: correct use of boldface (<em>once</em> for each passage name, at the primary definition thereof; other uses of the name should be links to this, and certainly should not be bold.) </li>
</ul>
<hr />
</body>
</html>