
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>CUCC Expedition Handbook: Logbook HTML formats</title>
<link rel="stylesheet" type="text/css" href="/css/main2.css" />
</head>
<body><style>body { background: #fff url(/images/style/bg-system.png) repeat-x 0 0 }</style>
<h2 id="tophead">CUCC Expedition Handbook - Logbook HTML formats</h2>

<h1>Logbooks - multiple formats</h1>

<ul>
<li><a href="#why">Why</a>
<li><a href="#prop1">Proposal #1</a>
</ul>
<h2>Update - January 2023</h2>
<p>This has now been done. All logbooks now use the same format, and we have only one parser.

<p>There is an advantage in using a "separator format" rather than an "encapsulated entry format". When parsing the logbook.html file, everything will be in one of the entries if we use a separator (e.g. &lt;hr&gt;, as opposed to an &lt;article&gt; ... &lt;/article&gt; encapsulation). With encapsulation, stuff that ends up between two encapsulated entries was probably meant to be in an adjacent entry, but the parser cannot tell which one. So we are continuing to use the &lt;hr&gt; separator format style.
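
<p>A minimal sketch of the separator idea, in Python. This is not troggle's actual parser: the sample text and the names <var>SAMPLE</var>, <var>HR_SEPARATOR</var> and <var>split_on_separator</var> are assumptions made for illustration. It shows that splitting on &lt;hr&gt; leaves no content outside a chunk, so a stray paragraph stays with the entry it follows.

```python
import re

# Hypothetical sample: a title block, then two entries separated by <hr>.
SAMPLE = """<h1>Expo logbook</h1>
<hr />
<p>Day one: rigged the entrance pitch.</p>
<p>Stray note added later.</p>
<hr>
<p>Day two: surveying.</p>
"""

# Matches <hr>, <hr/>, <hr /> and <HR>, with optional attributes.
HR_SEPARATOR = re.compile(r"<hr\b[^>]*>", re.IGNORECASE)


def split_on_separator(html):
    """Split logbook HTML on <hr>; every non-blank chunk is kept."""
    chunks = (chunk.strip() for chunk in HR_SEPARATOR.split(html))
    return [chunk for chunk in chunks if chunk]


pieces = split_on_separator(SAMPLE)
for number, piece in enumerate(pieces):
    print(number, repr(piece))
```

<p>In this sketch the first chunk holds the title block; a real parser would still need to decide how to treat that leading chunk.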

<hr />

<h2 id="why">HTML formats - Why we needed changes</h2>
<h4>Maintenance workload</h4>

<p>We <s>have</s> had 4 different markdown and HTML formats for logbooks of different vintages. This meant 4x as much maintenance as we needed.

<code><pre>LOGBOOK_PARSER_SETTINGS = {
                "2010": ("logbook.html", "Parseloghtmltxt"),
                "2009": ("2009logbook.txt", "Parselogwikitxt"),
                "2008": ("2008logbook.txt", "Parselogwikitxt"),
                "2007": ("logbook.html", "Parseloghtmltxt"),
                "2006": ("logbook.html", "Parseloghtmltxt"),
#               "2006": ("logbook/logbook_06.txt", "Parselogwikitxt"),
                "2005": ("logbook.html", "Parseloghtmltxt"),
                "2004": ("logbook.html", "Parseloghtmltxt"),
                "2003": ("logbook.html", "Parseloghtml03"),
                "2002": ("logbook.html", "Parseloghtmltxt"),
                "2001": ("log.htm", "Parseloghtml01"),
                "2000": ("log.htm", "Parseloghtml01"),
                "1999": ("log.htm", "Parseloghtml01"),
                "1998": ("log.htm", "Parseloghtml01"),
                "1997": ("log.htm", "Parseloghtml01"),
                "1996": ("log.htm", "Parseloghtml01"),
                "1995": ("log.htm", "Parseloghtml01"),
                "1994": ("log.htm", "Parseloghtml01"),
                "1993": ("log.htm", "Parseloghtml01"),
                "1992": ("log.htm", "Parseloghtml01"),
                "1991": ("log.htm", "Parseloghtml01"),
                "1990": ("log.htm", "Parseloghtml01"),
                "1989": ("log.htm", "Parseloghtml01"), #crashes MySQL
                "1988": ("log.htm", "Parseloghtml01"), #crashes MySQL
                "1987": ("log.htm", "Parseloghtml01"), #crashes MySQL
                "1985": ("log.htm", "Parseloghtml01"),
                "1984": ("log.htm", "Parseloghtml01"),
                "1983": ("log.htm", "Parseloghtml01"),
                "1982": ("log.htm", "Parseloghtml01"),
            }

</pre></code>
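<p>To make the maintenance burden concrete, the years can be grouped by parser name. The sketch below uses a shortened excerpt of the table above; the helper <var>years_by_parser</var> is hypothetical and not part of troggle.

```python
from collections import defaultdict

# Shortened excerpt of the settings table above (not the full list).
LOGBOOK_PARSER_SETTINGS = {
    "2010": ("logbook.html", "Parseloghtmltxt"),
    "2009": ("2009logbook.txt", "Parselogwikitxt"),
    "2003": ("logbook.html", "Parseloghtml03"),
    "2001": ("log.htm", "Parseloghtml01"),
    "2000": ("log.htm", "Parseloghtml01"),
}


def years_by_parser(settings):
    """Map each parser name to the sorted list of years that use it."""
    grouped = defaultdict(list)
    for year, (_filename, parser_name) in settings.items():
        grouped[parser_name].append(year)
    return {name: sorted(years) for name, years in grouped.items()}


grouped = years_by_parser(LOGBOOK_PARSER_SETTINGS)
for parser_name, years in grouped.items():
    print(parser_name, years)
```

<p>Each distinct key in the result is a parser that had to be kept working.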
<h4>Complexity - missing entries</h4>
<p>
Secondly, it is highly likely that most of the different parsers have errors, so some logbook entries do not get imported. A single parser,
to which we could devote more effort, would mean data does not get mislaid.

<p>Thirdly, the current format is error-prone and nonsensical, so it imposes an unnecessary learning curve on all expoers.


<h2 id="prop1">Logbooks: Proposal #1 - One Single Format</h2>
<h4>Architecture</h4>
<ul>
<li>Ideally, use new HTML5 tags, e.g. &lt;article&gt; --stuff-- &lt;/article&gt;, or another tag that does not allow nesting.
<li>Use closing tag at end of entry - no implicit merging of entries
<li>Explicitly handle content not in a logbook entry, e.g. title, frontispiece.
</ul>
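
<p>The three points above can be sketched with Python's standard <var>html.parser</var> module. This is an illustration under assumptions, not the parser that was eventually adopted: the class name <var>ArticleSplitter</var> and the sample text are hypothetical, and only text (not markup) is collected. Content outside any &lt;article&gt; is gathered separately instead of being merged silently into an entry.

```python
from html.parser import HTMLParser


class ArticleSplitter(HTMLParser):
    """Collect text inside <article> elements and, separately, text outside them."""

    def __init__(self):
        super().__init__()
        self.in_entry = False   # True while inside an <article>
        self.entries = []       # text of each completed entry
        self.outside = []       # text not in any entry (title, frontispiece)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            if self.in_entry:
                raise ValueError("nested <article> is not allowed")
            self.in_entry = True
            self._current = []

    def handle_endtag(self, tag):
        # An entry ends only at its explicit closing tag.
        if tag == "article" and self.in_entry:
            self.entries.append(" ".join(self._current))
            self.in_entry = False

    def handle_data(self, data):
        text = data.strip()
        if text:
            target = self._current if self.in_entry else self.outside
            target.append(text)


# Hypothetical sample with a title and an orphan note between entries.
SAMPLE = (
    "<h1>Expo logbook</h1>"
    "<article><p>Day one.</p></article>"
    "<p>Orphan note</p>"
    "<article><p>Day two.</p></article>"
)

splitter = ArticleSplitter()
splitter.feed(SAMPLE)
splitter.close()
print(splitter.entries)
print(splitter.outside)
```

<p>The orphan note lands in <var>outside</var>, where a human has to decide which entry it belongs to.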

<p>There are several HTML structural tags we could choose,
see <a href="https://webplatform.github.io/docs/guides/html_structural_elements/">HTML5 structural elements</a>.
<br>
DIV, SECTION, ARTICLE, ASIDE

<h4>Implementation</h4>
<ul>
<li>Start by exporting each logbook in this format from the existing import parsers
<li>Check each exported logbook extensively by hand
<li>Start with 2003, which has a unique parser
<li>Trial the new import parser and check that it gives the same results as the old parser did on the old format
<li>Repeat for each format type
<li>Retire the old format parsers and archive the old formats of the logbooks
</ul>
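
<p>The "same results" check in the steps above could be automated along these lines. This is a sketch only: it assumes each parser's output can be reduced to a list of (date, title) tuples, and both the function <var>compare_entries</var> and the sample data are hypothetical.

```python
def compare_entries(old_entries, new_entries):
    """Return (entries only the old parser found, entries only the new parser found)."""
    old_set, new_set = set(old_entries), set(new_entries)
    return sorted(old_set - new_set), sorted(new_set - old_set)


# Hypothetical parser outputs as (date, title) tuples.
old = [("2003-07-10", "Rigging trip"), ("2003-07-11", "Surveying trip")]
new = [("2003-07-10", "Rigging trip")]

missing, extra = compare_entries(old, new)
print("missing from new parser:", missing)
print("unexpected in new parser:", extra)
```

<p>Two empty lists would mean the new parser agrees with the old one for that logbook; it would not show that either parser is complete.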

<h4>Advantages</h4>
<ul>
<li>Reduced maintenance load in future
<li>More expoers will write up their logbook entries! Win!
<li>Clearly distinct programming task: would suit newcomer
</ul>

<h4>Disadvantages</h4>
<ul>
<li>Non-urgent work
</ul>

<hr />
Return to: <a href="trogdesign.html">Troggle design and future implementations</a><br />
Return to: <a href="trogintro.html">Troggle intro</a><br />
Troggle index:
<a href="trogindex.html">Index of all troggle documents</a><br /><hr />
</body>
</html>