expoweb/handbook/troggle/trogspeculate.html
2021-12-01 21:12:07 +00:00


<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Handbook Troggle Architecture Speculations</title>
<link rel="stylesheet" type="text/css" href="../../css/main2.css" />
</head>
<body>
<style>body { background: #fff url(/images/style/bg-system.png) repeat-x 0 0 }</style>
<h2 id="tophead">CUCC Expedition Handbook</h2>
<h1>Troggle Architecture Speculations</h1>
<pre>
From: Philip Sargent (Gmail) [mailto:philip.sargent@gmail.com]
Sent: 19 April 2020 01:28 [original - since edited with extra refs.]
To: expo-tech@lists.wookware.org
Subject: vague thoughts about future troggle architecture
</pre>
<p>
At our last virtual pub Sam confirmed that re-partitioning troggle so that all
the user interface runs in the user's browser would be utterly horrible using
current tools (javascript frameworks: react, angular etc.).
<p>
These front-end frameworks get out of date in a couple of years or so. So they don't
give us the decade-long stability we need to match available maintenance
effort. [ See <a href="https://en.wikipedia.org/wiki/Comparison_of_JavaScript_frameworks">
Wikipedia list of javascript frameworks</a>.] With our deep historical perspective ("cough"),
we can expect <a href="https://www.circleid.com/posts/20201031-the-javascript-ecosystem/">this
menagerie to sort itself</a> out into a stable, standardised foundation
within the next couple of decades but probably not within the next 10 years.
(ECMAscript 12 is definitely on the way to making these frameworks redundant.)
<p>
A web API to expose the troggle database (read-only) would allow some keen
person to write a special-purpose app on a phone, e.g. an entrance-locator
app, talking directly to the database. But replacing the whole user
interface does not seem feasible yet. In 10 years: probably.
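<p>As an illustration only, here is a minimal sketch of what the client side of such an
entrance-locator app might do with a read-only JSON payload. The payload shape, the field
names (<var>name</var>, <var>lat</var>, <var>lon</var>) and the coordinates are all
made up for this example; troggle does not necessarily expose anything like this today.</p>
<pre>
```python
import json
import math

# Hypothetical payload: field names and coordinates are invented for illustration.
PAYLOAD = json.dumps([
    {"name": "entrance-a", "lat": 47.6900, "lon": 13.8200},
    {"name": "entrance-b", "lat": 47.6950, "lon": 13.8100},
    {"name": "entrance-c", "lat": 47.7000, "lon": 13.8300},
])

EARTH_RADIUS_M = 6371000.0


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_entrance(payload, lat, lon):
    """Return the entrance record closest to the given position."""
    entrances = json.loads(payload)
    return min(entrances, key=lambda e: distance_m(lat, lon, e["lat"], e["lon"]))
```
</pre>
<p>The point of the sketch is that the app only ever reads: it needs no write access and no
knowledge of django, just an agreed data format.</p>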
<p>
It did occur to me that we are missing a trick: 99+% of the database doesn't
change except for survey data updates which, apart from during expo, happen
only every week or so. And the database is only 10 MB so it is entirely
feasible to copy absolutely everything into the browser except for scanned
images and photos.
<p>
So we could partition troggle so that all the user display bits run in the
browser (or a progressive web app) using a python interpreter running in
javascript. [yeah, expofiles would need some subset labelled as needing to
be forcibly downloaded, but the rest coming only on demand.] Some django
enthusiast must have done this already surely ? Ah yes, Brython.<br>
<a href="https://github.com/brython-dev/brython">github.com/brython-dev/brython</a> &nbsp; &nbsp;
<a href="https://www.brython.info/">www.brython.info</a>,<br>
<a href="https://pyodide.org/en/stable/">Pyodide</a> - full browser using webassembly (2021) and <br>
<a href="https://skulpt.org/gallery.html">Skulpt</a> (which has, since 2017, a full-blown
<a href="https://anvil.works/features">commercial system</a> built on top of it - by a Cambridge CL spinout)<br>
<a href="https://www.theregister.com/2021/11/30/python_web_wasm/">WASM</a> - CPython in webassembly (2021)<br>
<p>
Which is fun, but not useful. And not just because it may be immature. None of
this addresses <strong>our biggest problem: devising something that can be
maintained by fewer, less-expert people who can only devote short snippets
of time and not long-duration immersion</strong>.
<h3>Our biggest problem</h3>
We need:
<ul>
<li>something that can be maintained by fewer, less-expert people
<li>who can only devote short snippets of time
<li>without requiring weeks of long-duration deep immersion
</ul>
<h3>Federation of independent scripts</h3>
<p>
I know Wookey has been thinking of a loose federation of independent scripts
working on the same data, but the more I look at troggle and the tasks it
does the less I feel that would work. <strong>At the core there is a common data
model that everything must understand</strong> - and the only unambiguous way of
presenting that data model is working code, e.g. see
<a href="http://expo.survex.com/handbook/troggle/trogarch.html">Troggle architecture</a> and click on the image
to see a bigger copy. [It is out of date - if someone can quickly generate
an update that would be nice. It's on my <a href="../computing/todo.html">to-do list..</a>] Much of what
wallets.py does (originally by Martin Green) is in troggle already - but
better. [There is a many:many relationship between svx files and wallet
directories in reality, not 1:1]
<p>
<h3>troggle now</h3>
Troggle is very nearly fully working (not with as many functions as
originally envisaged, admittedly) [it is now: 8 July 2020].
The QM data display needs writing; but other than that it's in pretty good
shape. [Ah, yes, we should really add "drawings" as a core concept as well
as "surveyscans". That will be a bit of work.]
<p>
<h3>Need for separate data-import checking scripts</h3>
The one thing external scripts would be really useful for is syntax checking
and reference checking prior to import. I have found some weird and
wonderful filename paths inside the tunnel and therion drawings, and in
survex *ref paths.
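<p>A minimal sketch of such a pre-import checker, assuming (and this is only an assumption)
that a well-formed wallet reference on a survex <var>*ref</var> line looks like
<var>2019#23</var>. The function name and both patterns are hypothetical and would need
adjusting to the real conventions.</p>
<pre>
```python
import re

# Matches a survex "*ref" line and captures its value, ignoring a trailing ";" comment.
REF_LINE = re.compile(r"^\s*\*ref\s+(.*?)\s*(?:;.*)?$", re.IGNORECASE)

# Assumption: a valid wallet reference is a four-digit year, '#', then a number.
WALLET_REF = re.compile(r"^\d{4}#\d{1,3}$")


def check_refs(svx_text):
    """Return (line number, value) for every *ref that fails the assumed pattern."""
    problems = []
    for lineno, line in enumerate(svx_text.splitlines(), start=1):
        match = REF_LINE.match(line)
        if match and not WALLET_REF.match(match.group(1)):
            problems.append((lineno, match.group(1)))
    return problems
```
</pre>
<p>A script like this needs no database and no django, so it could be run by anyone on their
own copy of the survey data before it is imported.</p>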
<h3>Non-django troggle</h3>
<p>Another possibility is ripping django out of troggle and leaving bare python
plus a SQL database [see <a href="trog2030.html">Trog2030 proposal</a>]. This means that programmers would need to understand more SQL but would not need to understand "django". Arguably this
could be a net gain.
<p>Writing our own multi-user code would not be sensible, hence the database.
But we could move to a read-only system where the only writing happens on data-import.
Then we could use python 'pickle' or 'json' read-only data structures, but we
would need to create all our own indexing and cross-referencing code (which is <a href="#mud">a much bigger job</a><sup>*</sup> than you might think).
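<p>To give a feel for what "our own indexing and cross-referencing code" means, here is a
minimal sketch that loads a JSON snapshot and builds the lookups that a database would
otherwise provide. The snapshot layout and all the names in it are hypothetical.</p>
<pre>
```python
import json
from collections import defaultdict

# Hypothetical snapshot written once at data-import time; field names are assumptions.
SNAPSHOT = json.dumps({
    "wallets": [
        {"id": "w1", "scans": ["notes1.jpg", "plan1.jpg"]},
        {"id": "w2", "scans": ["notes2.jpg"]},
    ],
    "svxfiles": [
        {"path": "caves/a/first.svx", "wallets": ["w1"]},
        {"path": "caves/a/second.svx", "wallets": ["w1", "w2"]},
    ],
})


def load_indexes(text):
    """Build forward and reverse lookups, and list references that point nowhere."""
    data = json.loads(text)
    wallets_by_id = {w["id"]: w for w in data["wallets"]}
    svx_by_wallet = defaultdict(list)   # reverse index: wallet id to svx paths
    dangling = []                       # (svx path, unknown wallet id)
    for svx in data["svxfiles"]:
        for wallet_id in svx["wallets"]:
            if wallet_id in wallets_by_id:
                svx_by_wallet[wallet_id].append(svx["path"])
            else:
                dangling.append((svx["path"], wallet_id))
    return wallets_by_id, dict(svx_by_wallet), dangling
```
</pre>
<p>Every relationship that we want to traverse in both directions needs a reverse index like
<var>svx_by_wallet</var>, and every one of them needs its own dangling-reference check.</p>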
<p>There would be more lower-level code, but the
different segments of the system could be in caving-sensible modules not
django-meaningful modules. And we would not have all the extra
language-like constructs that django introduces e.g. <var>X.objects.all()</var>, which
modern editors complain about because it is a django idiom and
not a function within the python codebase.
(We could retain an HTML templating engine though.)
<h3><em>Addendum 1</em></h3>
<p>The above discussion is extremely ignorant in a couple of respects. Now (April 2021) we can properly appreciate that the part of Django that interacts with a database is actually a small part of the system. The http request/response engine is not easily replaced. And the 90 or so HTML templates do not just reformat the data given to them in python dictionaries: they directly query and traverse the database to produce tabular output. So if we 'took out' the database, most of our templates would fail utterly and need completely rewriting. It could be done, but the manpower requirement is not trivial.
<h3><em>Addendum 2</em></h3>
<p>There is a templating engine <a href="https://mozilla.github.io/nunjucks/">Nunjucks</a>
which is a port to JavaScript of the Django templating system we use
(via <a href="https://palletsprojects.com/p/jinja/">Jinja</a> - these are the same people who do Flask). This would be an obvious thing to use if we needed to go in that direction.
<p>We need a templating engine because so much of the troggle coordination output is in tables of data from different sources, e.g. see <a href="/survexfile/264">all survey data for 264</a>.
<p>Several organisations have moved their user-interface layer to the browser using
Nunjucks including <a href="https://service-manual.nhs.uk/design-system/prototyping">
the NHS digital service</a> and Firefox.
<h3 id="mud">* Later Note on object dependencies</h3>
<!-- Philip Sargent 29 July 2020 -->
<a href="http://picocontainer.com/inversion-of-control-history.html#timelines">
<img class="onright" src ="../computing/ioc-timeline.png" width="200px"></a>
<p>Currently every troggle code operation uses the django ORM <var>search</var> and <var>filter</var> operations on the central database to find any object it needs. If we don't have a central database then we have to use direct object references and we need to think about the design of <a href="https://medium.com/@geoffreykoh/implementing-the-factory-pattern-via-dynamic-registry-and-python-decorators-479fc1537bbe">a central registry object</a> to hold these. There is a well-studied design pattern that describes this design, "<a href="http://www.laputan.org/mud/">Big Ball of Mud</a>", along with the contributing actions "Piecemeal growth" and "Sweeping it under the rug".
<p>We are always using one object, e.g. a wallet, just to get at another object, e.g. a scan of some original notes, in order to check the data we are checking, e.g. a survex file. Maintaining two-way dependencies among all the objects is what "foreign keys" do in a database, but the problem doesn't go away when we don't have a database: it gets slightly harder.
<p>One thing that is easier with troggle is that we don't have many object lifecycle issues. Everything is created once and lasts forever. There are only a few ephemeral objects during the initial data import from files.
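<p>Without a database, the two-way dependencies have to be maintained by hand. A minimal
sketch, with hypothetical class and function names, of a many:many link between wallets and
survex files:</p>
<pre>
```python
class Wallet:
    def __init__(self, name):
        self.name = name
        self.svxfiles = []   # forward references to SvxFile objects


class SvxFile:
    def __init__(self, path):
        self.path = path
        self.wallets = []    # back references to Wallet objects


def link(wallet, svxfile):
    """Record the relationship on both objects, at most once on each side."""
    if svxfile not in wallet.svxfiles:
        wallet.svxfiles.append(svxfile)
    if wallet not in svxfile.wallets:
        svxfile.wallets.append(wallet)
```
</pre>
<p>Because objects are created once and last forever, there is no need for a matching
<var>unlink</var> in this sketch; a system with deletions would need one, and would need to
keep both sides consistent there too.</p>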
<h4>Wiring-up components</h4>
<p>Troggle today doesn't need anything complex: a single <a href="https://hub.packtpub.com/python-design-patterns-depth-singleton-pattern/">registry
singleton</a> would probably be fine (though hard to test), but if it evolves towards being a set of interacting services then a more sophisticated architecture would be needed.
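<p>A minimal sketch of what a registry singleton could look like, assuming objects are looked
up by a kind and a key. The class and method names are hypothetical, not existing troggle
code.</p>
<pre>
```python
class Registry:
    """Central lookup of objects by (kind, key), e.g. ("wallet", "w1")."""

    def __init__(self):
        self._objects = {}

    def register(self, kind, key, obj):
        """Store an object; refuse to overwrite an existing entry."""
        if (kind, key) in self._objects:
            raise KeyError(f"duplicate {kind}: {key}")
        self._objects[(kind, key)] = obj
        return obj

    def get(self, kind, key):
        """Return the object, raising KeyError if it was never registered."""
        return self._objects[(kind, key)]

    def all_of(self, kind):
        """Return every object of one kind, in registration order."""
        return [obj for (k, _), obj in self._objects.items() if k == kind]


# A module-level instance is the simplest form of singleton in python.
REGISTRY = Registry()
```
</pre>
<p>The "hard to test" problem is visible here: any code that imports <var>REGISTRY</var>
directly shares state with every other test, unless each test builds its own
<var>Registry()</var> instead.</p>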
<p>The Java community found "dependency resolution" very helpful for wiring-up loosely coupled objects/components in the late 1990s with the "<a href="http://picocontainer.com/inversion-of-control.html">Inversion of Control</a>" technique which can be implemented in several ways, most commonly using "<a href="https://martinfowler.com/articles/injection.html">Dependency Injection</a>". But for troggle we must be careful that doing this the "right" way may make the code even more inaccessible to novice caver-programmers than django is. Accessibility is the whole point of moving away from django. Fortunately python programmers have produced some recent guidance: <a href="https://blog.benpri.me/blog/2020/05/13/python-dependency-injection-made-simple/">Python Dependency Injection Made Simple</a> and <a href="https://python-dependency-injector.ets-labs.org/introduction/di_in_python.html">Dependency injection and inversion of control in Python</a>. We should probably use the simpler "<a href="http://picocontainer.com/constructor-injection.html">Constructor Injection</a>" variation as we need to make all our code <a href="http://picocontainer.com/mock-objects.html">more easily testable</a>. Flask uses that.
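<p>A minimal sketch of constructor injection in plain python, with no framework. All class
names and the directory layout are hypothetical. The checker is handed its dependency when it
is constructed, so a test can pass in a fake instead of touching the disk.</p>
<pre>
```python
import os


class DiskScanStore:
    """Real dependency: lists scan files under root/wallet_name (assumed layout)."""

    def __init__(self, root):
        self.root = root

    def scans_for(self, wallet_name):
        folder = os.path.join(self.root, wallet_name)
        return sorted(os.listdir(folder)) if os.path.isdir(folder) else []


class FakeScanStore:
    """Test double with the same method, backed by a dictionary."""

    def __init__(self, scans):
        self._scans = scans

    def scans_for(self, wallet_name):
        return list(self._scans.get(wallet_name, []))


class WalletChecker:
    def __init__(self, scan_store):
        # The dependency is injected here rather than looked up globally.
        self.scan_store = scan_store

    def missing_scans(self, wallet_names):
        """Return the wallets that have no scans at all."""
        return [n for n in wallet_names if not self.scan_store.scans_for(n)]
```
</pre>
<p>In production the caller would write <var>WalletChecker(DiskScanStore(root))</var>; in a
test, <var>WalletChecker(FakeScanStore({...}))</var>. Nothing in this needs more than
ordinary python classes, which matters for novice caver-programmers.</p>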
<hr />
Return to: <a href="trogdesign.html">Troggle design and future implementations</a><br />
Return to: <a href="trogintro.html">Troggle intro</a><br />
Troggle index:
<a href="trogindex.html">Index of all troggle documents</a><br />
<hr /></body>
</html>