
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Handbook Troggle Architecture Speculations</title>
<link rel="stylesheet" type="text/css" href="../../css/main2.css" />
</head>
<body>
<style>body { background: #fff url(/images/style/bg-system.png) repeat-x 0 0 }</style>
<h2 id="tophead">CUCC Expedition Handbook</h2>
<h1>Troggle Architecture Speculations</h1>


<pre>
From: Philip Sargent (Gmail) [mailto:philip.sargent@gmail.com] 
Sent: 19 April 2020 01:28 [original - since edited with extra refs.]
To: expo-tech@lists.wookware.org
Subject: vague thoughts about future troggle architecture 
</pre>

<p>
At our last virtual pub Sam confirmed that re-partitioning troggle so that
all the user interface runs in the user's browser would be utterly horrible
using current tools (javascript frameworks: react, angular etc.).
<p>
These front-end frameworks get out of date in a couple of years or so. So they don't
give us the decade-long stability we need to match available maintenance
effort. [ See <a href="https://en.wikipedia.org/wiki/Comparison_of_JavaScript_frameworks">
Wikipedia list of javascript frameworks</a>.] With our deep historical perspective ("cough"),
we can expect <a href="https://www.circleid.com/posts/20201031-the-javascript-ecosystem/">this 
menagerie to sort itself</a> out into a stable, standardised foundation 
within the next couple of decades but probably not within the next 10 years.
(ECMAscript 12 is definitely on the way to making these frameworks redundant.)
<p>
A web API to expose the troggle database (read-only) would allow some keen
person to write a special-purpose app on a phone, e.g. an entrance-locator
app, talking directly to the database. But replacing the whole user
interface does not seem feasible yet. In 10 years: probably.
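<p>[A minimal sketch of what such a read-only API could hand to a phone app, using only the python standard library. The <var>entrance</var> table, its columns and the demo rows are invented for illustration: they are not troggle's real schema, and a real version would sit behind a URL rather than a print statement.]

```python
import json
import sqlite3


def entrances_as_json(conn: sqlite3.Connection) -> str:
    """Serialise every row of the (hypothetical) entrance table as a JSON list."""
    conn.row_factory = sqlite3.Row  # rows then behave like read-only mappings
    rows = conn.execute(
        "SELECT name, latitude, longitude FROM entrance ORDER BY name"
    ).fetchall()
    return json.dumps([dict(row) for row in rows])


# Demo data only: an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entrance (name TEXT, latitude REAL, longitude REAL)")
conn.executemany(
    "INSERT INTO entrance VALUES (?, ?, ?)",
    [("demo-b", 1.5, 2.5), ("demo-a", 3.5, 4.5)],
)
conn.commit()

# Ask SQLite to refuse any further writes on this connection.
conn.execute("PRAGMA query_only = ON")

print(entrances_as_json(conn))
```

<p>[The point of the sketch is that the API side is small: the hard part is agreeing what the tables mean, not serving them.]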
<p>
It did occur to me that we are missing a trick: 99+% of the database doesn't
change except for survey data updates which, apart from during expo, happen
only every week or so. And the database is only 10 MB so is entirely
feasible to copy absolutely everything into the browser except for scanned
images and photos.
<p>
So we could partition troggle so that all the user display bits run in the
browser (or a progressive web app) using a python interpreter running in
javascript. [yeah, expofiles would need some subset labelled as needing to
be forcibly downloaded, but the rest coming only on demand.] Some django
enthusiast must have done this already, surely? Ah yes, Brython.<br>
<a href="https://github.com/brython-dev/brython">github.com/brython-dev/brython</a> &nbsp; &nbsp;
<a href="https://www.brython.info/">www.brython.info</a>,<br>
<a href="https://pyodide.org/en/stable/">Pyodide</a> - full CPython in the browser using webassembly (2021) and <br>
<a href="https://skulpt.org/gallery.html">Skulpt</a> (which has, since 2017, a full-blown 
<a href="https://anvil.works/features">commercial system</a> built on top of it - by a Cambridge CL spinout)<br>
<a href="https://www.theregister.com/2021/11/30/python_web_wasm/">WASM</a> - CPython in webassembly (2021)<br>

<p>
Which is fun, but not useful. And not just because it may be immature. None of
this addresses <strong>our biggest problem: devising  something that can be
maintained by fewer, less-expert people who can only devote short snippets
of time and not long-duration immersion</strong>.
<h3>Our biggest problem</h3>
We need:
<ul>
<li>something that can be maintained by fewer, less-expert people
<li>who can only devote short snippets of time
<li>without requiring weeks of long-duration deep immersion
</ul>

<h3>Federation of independent scripts</h3>
<p>
I know Wookey has been thinking of a loose federation of independent scripts
working on the same data, but the more I look at troggle and the tasks it
does the less I feel that would work. <strong>At the core there is a common data
model that everything must understand</strong> - and the only unambiguous way of
presenting that data model is working code, e.g. see 
<a href="http://expo.survex.com/handbook/troggle/trogarch.html">Troggle architecture</a> and click on the image
to see a bigger copy. [It is out of date - if someone can quickly generate
an update that would be nice. It's on my <a href="../computing/todo.html">to-do list..</a>] Much of what
wallets.py does (originally by Martin Green) is in troggle already - but
better. [There is a many:many relationship between svx files and wallet
directories in reality, not 1:1]
<p>
<h3>troggle now</h3>
Troggle is very nearly fully working, though admittedly not with as many
functions as originally envisaged [it is now: 8 July 2020]. 
The QM data display needs writing; but other than that it's in pretty good
shape. [Ah, yes, we should really add "drawings" as a core concept as well
as "surveyscans". That will be a bit of work.]
<p>
<h3>Need for separate data-import checking scripts</h3>
The one thing external scripts would be really useful for is syntax checking
and reference checking prior to import.  I have found some weird and
wonderful filename paths inside the tunnel and therion drawings, and in
survex *ref paths.
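<p>[A sketch of the sort of stand-alone checker meant here. It assumes that a well-formed survex <var>*ref</var> target is a wallet name of the form "2019#23" and reports everything else. That pattern and the sample text are assumptions for illustration, and the real rules would need agreeing first.]

```python
import re

# Assumption: a good reference is a wallet name such as "2019#23".
WALLET_REF = re.compile(r"^\d{4}#\d+$")

# A *ref line, with optional leading whitespace and a trailing ';' comment.
REF_LINE = re.compile(r"^\s*\*ref\s+(?P<target>.*?)\s*(?:;.*)?$", re.IGNORECASE)


def check_refs(svx_text: str) -> list[tuple[int, str]]:
    """Return (line number, target) for every *ref whose target looks suspicious."""
    problems = []
    for lineno, line in enumerate(svx_text.splitlines(), start=1):
        match = REF_LINE.match(line)
        if match and not WALLET_REF.match(match.group("target")):
            problems.append((lineno, match.group("target")))
    return problems


sample = """*begin demo
*ref 2019#23
*ref ../../expofiles/surveyscans/2019/notes.jpg ; odd path
*end demo
"""

for lineno, target in check_refs(sample):
    print(f"line {lineno}: unexpected *ref target {target!r}")
```

<p>[A script like this needs no knowledge of django or the database, so it is the kind of thing a caver-programmer could maintain in short snippets of time.]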

<h3>Non-django troggle</h3>
<p>Another possibility is ripping django out of troggle and leaving bare python
plus a SQL database [see <a href="trog2030.html">Trog2030 proposal</a>]. This means that programmers would need to understand more SQL but would not need to understand "django". Arguably this 
could be a net gain. 
<p>Writing our own multi-user code would not be sensible, hence the database. 
But we could move to a read-only system where the only writing happens on data-import.
Then we could use python <var>pickle</var> or <var>json</var> read-only data structures, but we 
would need to create all our own indexing and cross-referencing code (which is <a href="#mud">a much bigger job</a><sup>*</sup> than you might think).
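<p>[To give a feel for the indexing work, here is a sketch that loads a read-only JSON snapshot and builds two lookup tables by hand. The snapshot layout, the paths and the wallet names are made up for the example. Every extra way of finding an object needs another index like these, which the database currently provides for free.]

```python
import json
from collections import defaultdict

# Hypothetical snapshot, as it might be written once at data-import time.
snapshot = json.loads("""
{"survexfiles": [
    {"path": "caves/demo/a.svx", "wallets": ["2019#01", "2019#02"]},
    {"path": "caves/demo/b.svx", "wallets": ["2019#02"]}
]}
""")


def build_indexes(data):
    """Build a path -> record index and a wallet -> paths cross-reference."""
    by_path = {}
    by_wallet = defaultdict(list)
    for svx in data["survexfiles"]:
        by_path[svx["path"]] = svx
        for wallet in svx["wallets"]:
            by_wallet[wallet].append(svx["path"])
    return by_path, dict(by_wallet)


by_path, by_wallet = build_indexes(snapshot)
print(by_wallet["2019#02"])
```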
<p>There would be more lower-level code, but the
different segments of the system could be in caving-sensible modules not
django-meaningful modules. And we would not have all the extra 
language-like constructs that django introduces e.g. <var>X.objects.all()</var>, which
modern editors complain about because it is a django idiom and 
not a function within the python codebase.

(We could retain an HTML templating engine though.)

<h3><em>Addendum 1</em></h3>
<p>The above discussion is extremely ignorant in a couple of respects. Now (April 2021) we can properly appreciate that the part of Django that interacts with a database is actually a small part of the system. The http request/response engine is not easily replaced. And the 90 or so HTML templates do not just reformat the data given to them in python dictionaries: they directly query and traverse the database to produce tabular output. So if we 'took out' the database, most of our templates would fail utterly and need completely rewriting. It could be done, but the manpower requirement is not trivial.

<h3><em>Addendum 2</em></h3>
<p>There is a templating engine <a href="https://mozilla.github.io/nunjucks/">Nunjucks</a> 
which is a port to JavaScript of the Django templating system we use 
(via <a href="https://palletsprojects.com/p/jinja/">Jinja</a> - these are the same people who do Flask). This would be an obvious thing to use if we needed to go in that direction.
<p>We need a templating engine because so much of the troggle coordination output is in tables of data from different sources, e.g. see <a href="/survexfile/264">all survey data for 264</a>.
<p>Several organisations have moved their user-interface layer to the browser using
Nunjucks including <a href="https://service-manual.nhs.uk/design-system/prototyping">
the NHS digital service</a> and Firefox.

<h3 id="mud">* Later Note on object dependencies</h3>
<!-- Philip Sargent 29 July 2020 -->
<a href="http://picocontainer.com/inversion-of-control-history.html#timelines">
<img class="onright" src ="../computing/ioc-timeline.png" width="200px"></a>
<p>Currently every troggle code operation uses the django ORM <var>search</var> and <var>filter</var> operations on the central database to find any object it needs. If we don't have a central database then we have to use direct object references and we need to think about the design of <a href="https://medium.com/@geoffreykoh/implementing-the-factory-pattern-via-dynamic-registry-and-python-decorators-479fc1537bbe">a central registry object</a> to hold these. There is a well-studied design pattern that describes this design, "<a href="http://www.laputan.org/mud/">Big Ball of Mud</a>", along with the contributing actions "Piecemeal growth" and "Sweeping it under the rug".
<p>We are always using one object, e.g. a wallet, just to get at another object, e.g. a scan of some original notes, in order to check the data we are checking, e.g. a survex file. Maintaining two-way dependencies among all the objects is what "foreign keys" do in a database, but the problem doesn't go away when we don't have a database: it gets slightly harder.
<p>One thing that is easier with troggle is that we don't have many object lifecycle issues. Everything is created once and lasts forever. There are only a few ephemeral objects during the initial data import from files.
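<p>[A sketch of maintaining a two-way, many:many link by hand, which is the job foreign keys do for us today. The <var>Wallet</var> and <var>SurvexFile</var> classes and the <var>link()</var> helper are hypothetical stand-ins and not troggle's real models.]

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity, so cyclic references are harmless
class Wallet:
    name: str
    survexfiles: list = field(default_factory=list)


@dataclass(eq=False)
class SurvexFile:
    path: str
    wallets: list = field(default_factory=list)


def link(wallet: Wallet, svx: SurvexFile) -> None:
    """Record the relationship on both sides, once only."""
    if svx not in wallet.survexfiles:
        wallet.survexfiles.append(svx)
    if wallet not in svx.wallets:
        svx.wallets.append(wallet)


w1, w2 = Wallet("2019#01"), Wallet("2019#02")
s1, s2 = SurvexFile("caves/demo/a.svx"), SurvexFile("caves/demo/b.svx")
link(w1, s1)
link(w2, s1)
link(w2, s2)
link(w2, s2)  # repeating a link is a no-op

print([w.name for w in s1.wallets])
print([s.path for s in w2.survexfiles])
```

<p>[Everything has to go through <var>link()</var>: any code that appends to one list directly leaves the two sides inconsistent, which is exactly the discipline the database enforces for us at present.]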
<h4>Wiring-up components</h4>
<p>Troggle today doesn't need anything complex: a single <a href="https://hub.packtpub.com/python-design-patterns-depth-singleton-pattern/">registry 
singleton</a> would probably be fine (though hard to test). But if it evolves towards being a set of interacting services then a more sophisticated architecture would be needed.
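<p>[For concreteness, a registry singleton can be as small as the sketch below. The class name, the <var>(kind, key)</var> lookup scheme and the demo entry are assumptions for illustration.]

```python
class Registry:
    """One shared lookup table from (kind, key) to object."""

    _instance = None

    def __new__(cls):
        # Create the single instance on first use, then always return it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._objects = {}
        return cls._instance

    def register(self, kind, key, obj):
        self._objects[(kind, key)] = obj

    def find(self, kind, key):
        return self._objects[(kind, key)]


Registry().register("wallet", "2019#01", {"scans": ["notes1.jpg"]})

# Any other part of the code gets the same registry and so the same object.
print(Registry().find("wallet", "2019#01"))
```

<p>[The "hard to test" caveat shows up straight away: the contents survive from one test to the next unless each test remembers to clear them.]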
<p>The Java community found "dependency resolution" very helpful for wiring-up loosely coupled objects/components in the late 1990s with the "<a href="http://picocontainer.com/inversion-of-control.html">Inversion of Control</a>" technique which can be implemented in several ways, most commonly using "<a href="https://martinfowler.com/articles/injection.html">Dependency Injection</a>". But for troggle we must be careful that doing this the "right" way may make the code even more inaccessible to novice caver-programmers than django is. Avoiding that inaccessibility is the whole point of moving away from django. Fortunately python programmers have produced some recent guidance: <a href="https://blog.benpri.me/blog/2020/05/13/python-dependency-injection-made-simple/">Python Dependency Injection Made Simple</a> and <a href="https://python-dependency-injector.ets-labs.org/introduction/di_in_python.html">Dependency injection and inversion of control in Python</a>. We should probably use the simpler "<a href="http://picocontainer.com/constructor-injection.html">Constructor Injection</a>" variation as we need to make all our code <a href="http://picocontainer.com/mock-objects.html">more easily testable</a>. Flask uses that.
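<p>[Constructor injection needs no framework at all in python. In this sketch the checker is handed whatever store it should use when it is created, so a test can pass in a small fake. Both class names and the demo data are hypothetical.]

```python
class InMemoryWalletStore:
    """Stand-in for whatever really knows which scans are in which wallet."""

    def __init__(self, wallets):
        self._wallets = dict(wallets)

    def scans_for(self, wallet_name):
        return self._wallets.get(wallet_name, [])


class ScanChecker:
    def __init__(self, wallet_store):
        # The dependency is passed in, not looked up in a global registry.
        self._wallet_store = wallet_store

    def has_scans(self, wallet_name):
        return bool(self._wallet_store.scans_for(wallet_name))


store = InMemoryWalletStore({"2019#01": ["notes1.jpg"], "2019#02": []})
checker = ScanChecker(store)
print(checker.has_scans("2019#01"), checker.has_scans("2019#02"))
```

<p>[Compared with a registry singleton, nothing here is shared behind the scenes, which is what makes the checker easy to test on its own.]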

<hr />
Return to: <a href="trogdesign.html">Troggle design and future implementations</a><br />
Return to: <a href="trogintro.html">Troggle intro</a><br />
Troggle index: 
<a href="trogindex.html">Index of all troggle documents</a><br />
<hr /></body>
</html>