<a href="c21bs.html">Taking Expo Bullshit into the 21st Century</a> - a story of the data management system up to Spring 1996. [This was less than five years after Tim Berners-Lee published the world's very first web page on 6th August 1991. So the expo website is nearly as old as the web itself.]
<p>Before Survex, <a href="../years/1990/svy2.htm">in 1990 we used Sean Kelly's Surveyor '88</a>, written for the Queen Mary
College Belize Expedition, and in the 1980s we used Andy Waddington's <a href="/expofiles/documents/fortran-su-programs">SU
5.13C</a>, a Fortran program on the IBM mainframe at the university (and, probably, at the UK Atomic Energy Establishment
at Windscale). The routines to produce <a href="/expofiles/documents/fortran-su-programs/SURVEY.SUPLOT.f">graphical output on
a pen-plotter</a> were ported to the University Computing Service by Philip Sargent in 1984. (Over Christmas 1983, the university mainframe had its <a href="http://www.computinghistory.org.uk/det/5622/University-of-Cambridge-Computing-Service-November-1983-Newsletter-107/">RAM doubled: from 16 to 32 MB</a>.)
<p>From having a set of HTML files, it was a small step to publish a website. The CUCC Expo Website, which publishes the cave data, was originally created by
Andy Waddington in the early 1990s and was hosted by Wookey.
<details>
<summary>1999 scripts and spreadsheets</summary>
The version control system was <a href="https://www.nongnu.org/cvs/">CVS</a>. The whole site was just static HTML, carefully
designed to be RISCOS-compatible (hence the short 10-character filenames)
as both Wadders and Wookey were <a href="https://en.wikipedia.org/wiki/RISC_OS">RISCOS</a> people then (in the early 1990s).
Wadders wrote a huge amount of info collecting expo history, photos, cave data etc.</p>
<p>Another important element of this system was <a href="computing/repos.html">version control</a>. The entire data structure was
stored initially in a <a href="https://en.wikipedia.org/wiki/Concurrent_Versions_System">Concurrent Versions System</a> repository, and later migrated to
<a href="https://en.wikipedia.org/wiki/Apache_Subversion">Subversion</a> [<em>now using <a href="computing/repos.html">git</a> in 2020</em>].
Any edits to the spreadsheets which caused the scripts to fail, breaking the
<p>From the <a href="/years/2009/report.html">2009 expo report</a>:<br/>
<ul>
This year's expedition also had a non-caving goal (not just drinking Gösser). Recently [since 2006] members of CUCC have started to develop a piece of software called Troggle, which aims to facilitate keeping track of logbook entries, typing up surveys, caves etc, and save time in a lot of the work that goes on behind the scenes when expo is over. This year was the first time Troggle would be tested "in the field" (well, spud hut).
which was only fixed in early 2011, allowing the move to complete.
<p>By 2011 Troggle was under development with a wiki hosted on the CUCC server and we have a
<a href="troggle/2011-archive.html">snapshot of the status in April 2011</a>. As you can see, Troggle was still very incomplete in 2011.
<p>The handbook was separate for a while until Martin Green added code to merge the old static pages and
new troggle dynamic pages into the same site. This is now the live system running everything (in 2022). Work on developing Troggle further still continues (see <a href="troggle/trogintro.html">Troggle intro</a>).</p>
<li><a href="/hgrepositories/home/expo/loser/graph/">loser</a> - the survex cave survey data (hg)</li>
<li><a href="/repositories/drawings/.git/log">drawings</a> - the tunnel and therion cave data and drawings (git)</li>
<li><a href="/hgrepositories/home/expo/expoweb/graph">expoweb</a> - the website pages, handbook, generation scripts (hg)</li>
<li><a href="/repositories/troggle/.git/log">troggle</a> - the database/software part of the survey data management system - see <a href="troggle/trogintro.html">notes on troggle</a> for further explanation (git)</li>
<p>In early 2019 the university computing service upgraded its firewall rules, which took the
server offline completely.
<p>
Wookey eventually managed to find us free space (a virtual machine)
on a debian mirror server somewhere in Leicestershire (we think).
This move to a different secure server means that all ssh to the server now needs to use cryptographic keys tied to individual machines. There is an expo-nerds email list (all mailing lists are now hosted on wookware.org as the university list system restricted what non-Raven-users could do) to coordinate server fettling.
<p>At the beginning of the 2019 expo two repos had been moved from mercurial to git: troggle and drawings (formerly called tunneldata).
<p>The troggle software has been migrated to git, and the old erebus and cvs branches (pre 2010) removed. Some decrufting was done to get rid of log files, old copies of embedded javascript (codemirror, jquery etc) and some fat images no longer used.
<p>
The tunneldata repo has also been migrated to git, and renamed 'drawings' as it includes therion data too these days.
<p>
The loser repo and expoweb repo need more care in the hg-to-git migration (expoweb is the website content, which is published by troggle). Loser should have the old 1999-2004 CVS history restored, and maybe Tom's annual snapshots from before that, so that ancient history can be researched (sometimes useful). It's also a good idea to add the 2015, 2016 and 2017 ARGE data we got (in 2017) in the correct years so that it's possible to go back to an 'end of this year' checkout and get an accurate view of what was found (for making plots and length stats). All of that requires some history rewriting, which is best done at the time of conversion.
<p>
Similarly, expoweb is full of bloat from fat images and surveys and one 82MB thesis that got checked in and then removed. Clearing that out is a good idea. I have a set of 'unused fat blob' lists which can be stripped out with git-filter. It's not hard to make a 'do the conversion' script, ready for sometime after expo 2019 has calmed down.
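<p>
As a rough illustration of how an 'unused fat blob' list could be drawn up, here is a minimal sketch. It assumes you already have (path, size) pairs for every file that ever appeared in the repository history and the set of paths still present in the current checkout. The function name, the 1 MB threshold and the example paths are all hypothetical and are not taken from the actual migration scripts.

```python
def unused_fat_blobs(history, current_paths, min_bytes=1_000_000):
    """List large files that exist in history but not in the current checkout.

    history       -- iterable of (path, size_in_bytes) pairs seen anywhere in
                     the repository history (assumed to be gathered elsewhere)
    current_paths -- set of paths present in the current checkout
    min_bytes     -- hypothetical size threshold for calling a blob "fat"
    """
    largest = {}
    for path, size in history:
        # Only files that are both big and no longer used are candidates.
        if path not in current_paths and size >= min_bytes:
            largest[path] = max(size, largest.get(path, 0))
    # Biggest first, so the worst offenders head the list.
    return sorted(largest.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical example data, not real repository contents.
example_history = [
    ("thesis.pdf", 82_000_000),
    ("index.htm", 4_000),
    ("img/big.jpg", 5_000_000),
    ("img/kept.jpg", 6_000_000),
]
example_current = {"index.htm", "img/kept.jpg"}
candidates = unused_fat_blobs(example_history, example_current)
```

The resulting list would still need checking by hand before any history rewriting, since a file missing from the current checkout is not necessarily unwanted in older checkouts.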
<summary id='#may2020'>May 2020 and django</summary>
<p>
Wookey has now moved 'expoweb' from mercurial to git largely "as-is". Mark Shinwell has said that he will help on the loser (survex files) migration to git.
<p>In May we were on django 1.7 and python 2.7.17. Sam continued to work on upgrading django from v1.7. We wanted to upgrade django as quickly as possible because old versions of django had unpatched security issues.
[Upgrading to later django versions <a href="troggle/trogdjangup.html">is a real pig</a> - not helped by the fact that all the tools to help do it are now out of date for these very old django releases.]
<ul>
<li>"Django 1.11 is the last version to support Python 2.7. Support for Django 1.11 ends in 2020." see: <a href="https://docs.djangoproject.com/en/3.0/faq/install/">django versions</a>. You will notice that we are really outstaying our welcome here, especially as python2.7 was <a href="https://python-release-cycle.glitch.me/">declared dead in January</a> this year.
<li>For a table displaying the various versions of django and support expiry dates
see <a href="https://www.djangoproject.com/download/">the django download</a> page.
<li>Ubuntu 20.04 came out on 23rd April but it does not support python2 at all. So we cannot use it for software maintenance (well, we can, but only using non-recommended software, which is what we are trying to get away from).
</ul>
<p>We planned to upgrade from django 1.7 to django 1.11, then port from python2 to python3 on
the same version of django, then upgrade to as recent a version of django as we could. But we have
discovered that django 1.7 works just fine with <a href="https://docs.djangoproject.com/en/1.10/topics/python3/">python3</a>, so we will move the development version to python3 during June and
then upgrade the server operating system from Debian <var>stretch</var> to <var>buster</var> before
tackling the next step: thinking deeply about when we migrate from django
<p>Sam was a bit overworked in trying to get an entire university to work remotely during Covid lockdown, so Philip [Sargent] started on the python2/3 conversion and got troggle on django 1.7 to work on python 3.5 and then 3.8. He then did the slog of migrating it through the django versions up to 1.11.29, the last version before django 2.0. Version 1.11.29 is an LTS (long term support) version of django. In doing this we had to retreat to python3.7 due to a django plugin incompatibility.
<p>
In the course of these migrations several unused or partly-used django plugins were dropped as they caused migration problems (notably staticfiles) and the plugins pillow, django-registration, six and sqlparse were brought up to recent versions. This was all done with pip in a python venv (virtual environment) on a Windows 10 machine running ubuntu 20.04 under WSL (Windows Subsystem for Linux) v1.
<p>Missing troggle functions were repaired and partly-implemented pages, such as the list of all cavers and their surveyed passages, were finished and made to work. The logbook parsing acquired a caching system to re-load pre-parsed files. The survex file parsing was completely rebuilt to reduce the excessive memory footprint. While doing so the parser was extended to cover nearly the full range of survex syntax and modified to parse, but not store, all the survey station locations. A great many unused classes and some partly written code ideas were deleted.
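<p>
The idea of re-loading pre-parsed files can be sketched roughly as follows. This is only an illustration of the general technique and not troggle's actual code: the function names, the cache file naming and the choice of <var>pickle</var> keyed on file modification time and size are all assumptions made for the example.

```python
import os
import pickle


def load_parsed(path, parse, cache_dir):
    """Return parse(path), reusing a pickled result if the source is unchanged.

    All names here are hypothetical. The cache entry is considered valid
    only while the source file's modification time and size both match.
    """
    os.makedirs(cache_dir, exist_ok=True)
    cache_path = os.path.join(cache_dir, os.path.basename(path) + ".cache")
    stat = os.stat(path)
    stamp = (stat.st_mtime_ns, stat.st_size)

    if os.path.exists(cache_path):
        try:
            with open(cache_path, "rb") as f:
                cached_stamp, result = pickle.load(f)
            if cached_stamp == stamp:
                return result  # source unchanged: skip the slow parse
        except (pickle.UnpicklingError, EOFError, ValueError, TypeError):
            pass  # unreadable cache: fall through and re-parse

    result = parse(path)
    with open(cache_path, "wb") as f:
        pickle.dump((stamp, result), f)
    return result


def count_lines(path):
    """Stand-in for a slow parser (hypothetical): counts non-blank lines."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())
```

Calling <var>load_parsed("logbook.html", count_lines, "cache")</var> twice in a row would run the parser only the first time, provided the file has not changed in between.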
<h4>July 2020</h4>
<p>Wookey upgraded debian on the server from 9 <var>stretch</var> to 10 <var>buster</var> and we got the python3 development of troggle running as the public version (with some http:// and https:// glitches) by 23rd July. <var>Buster</var> will be in-support definitely until June 2024 so we are rather pleased to be on a "not ancient" version of the operating system at last. This coincided with a last tweak at improving the full cave data file import, so it now runs on the development system in ~80 seconds, which is considerably more useful than the ~5 hours it was taking earlier this year.