Version history of the Aneamal Translator

30

2023-10-08

support markup for inline notes

An inline note can be used for information that is legally required, for example an attribution, license information or a note on included or excluded tax. The marks at the start and end of an inline note are called left fork and right fork respectively and consist of an equal sign and a hyphen. Example:

=-inline note-=

support markup for mutually exclusive options

Mutually exclusive options are options of which only one can be selected at a time within one block. Exclusive options may not occur together with non-exclusive options in the same block. The syntax is the same as for regular options except that an apostrophe is added after the right curly bracket. Example:

question
{key 1}' answer 1
{key 2}' answer 2

Exclusive options are implemented as radio buttons in HTML.

support markup to preselect options

Options can be preselected by adding a hyphen after the right curly bracket.

Two regular options, the first preselected:
{key 1}- this is preselected
{key 2} this is not preselected

Two mutually exclusive options, the first preselected:
{key 1}- this is preselected
{key 2}' this is not preselected

In a block of mutually exclusive options, only one option can be preselected, of course.

support markup for binary challenges

A binary challenge is an option with an expected choice. A binary challenge that should be selected is marked with {…}1 at the start of a line and a binary challenge that should not be selected is marked with {…}0 at the start of the line. Which choice is expected is not shown to readers.

How a failed binary challenge is dealt with depends on the module handling the form submission.

support markup for textboxes

Single-line textboxes are marked with [_] at the start of a line and multi-line textboxes are marked with [=] at the start of a line. A placeholder can be added like this: [_:placeholder]. Textboxes can be linked to a file: [_]->file.

Multi-line textboxes always load linked-file content as the default value. Single-line textboxes load linked-file content as the default value, if the linked file contains a single line. If it contains multiple lines, the textbox becomes a combobox and the file is interpreted as a single-column TSV file in which each record is a suggestion that is offered to readers when it matches their input.

support markup to require user input in form fields

Options can be marked as required by adding an exclamation mark after the right curly bracket: {…}!. Textboxes can be made to require input by adding an exclamation mark after the right square bracket, for example: [_]!

Browsers typically do not allow a form to be submitted until all required form fields are filled. If a submitted form lacks required input nevertheless, the Aneamal Translator aborts the submission.

prepare API for form submission by a module

This version of the Aneamal Translator adds form fields and all form fields in the same Aneamal file belong to the same form. The Aneamal core syntax provides no means to submit a form though. One reason for this omission is that the Aneamal core is not updated frequently enough to compete in an arms race with spammers. But forms can be processed by JavaScript, CSS or Aneamal modules.

Modules opt in to use the form API by adding a $post argument with a default value of type string to the module’s main function:

return function (array $_, string $post = "") {
	return "…";
};

The Aneamal Translator supplies the module with data about the form then: the form parameter (that is $_['form']) contains the form’s HTML ID which is used to associate form fields or a submit button with the form; the post parameter contains all the user input and associated labels when the form is submitted; the cron parameter documents whether the file that contains the form was changed during the time that a user filled the form.

A module can add its own form fields to a form and feed their user input into the form API. To do the latter, the module’s $post argument’s default value must be the name of a function, the post handler. If the form is submitted, the Aneamal Translator calls the post handler right before the module’s main function. The post handler must return a PHP array that contains the user input from the module’s own form fields such as

[
	[
		'input' => 'foo',
		'label' => 'foolish field',
	],
	[
		'input' => 'bar',
		'label' => 'barely interesting',
	],
]

which is registered as

[
	'addon' => 'name of the module or sub-module',
	'block' => [
		[
			'input' => 'foo',
			'label' => 'foolish field',
		],
		[
			'input' => 'bar',
			'label' => 'barely interesting',
		],
	],
	'topic' => 'caption from the Aneamal file', // optional
]

in the form API then.
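
As a rough illustration of the post handler mechanism described above, a module could be structured like the following sketch. The function name example_post_handler and the field name x_example_comment are made up for this example, and the sketch assumes that the post handler is called without arguments, which this version history does not specify.

```php
<?php
// Hypothetical post handler: collects the input of the module's own field.
// Assumption: the Aneamal Translator calls it without arguments.
function example_post_handler(): array
{
	// 'x_example_comment' is a made-up field name for this sketch.
	$input = $_POST['x_example_comment'] ?? '';
	return [
		[
			'input' => is_string($input) ? $input : '',
			'label' => 'comment',
		],
	];
}

// Main function of the module. In a real index.php this closure would be
// returned directly, like in the opt-in example further above.
$main = function (array $_, string $post = 'example_post_handler') {
	// $_['form'] holds the form's HTML ID, so the field can join the form.
	$form = htmlspecialchars($_['form'] ?? '', ENT_QUOTES);
	return "<textarea name='x_example_comment' form='{$form}'></textarea>";
};
```

The textarea is tied to the form through the HTML form attribute, and the handler never trusts the type of the submitted value.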

Modules that make form submission possible should include some form of spam protection. Modules that identify spam or malicious code can abort a submission by clearing PHP’s $_POST array.

Developers should keep in mind that user input can not be trusted. You may be dealing with an evil attacker!

For what it’s worth, you can trust that $_POST['_time'] is a UNIX timestamp in seconds with microsecond precision and that the HTML form code was generated by the Aneamal Translator at that time.
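
One simple kind of spam protection that can be built on $_POST['_time'] is to reject forms that are submitted implausibly fast. The following is only a sketch: the function name and the threshold of three seconds are assumptions for illustration, and where exactly a module should run such a check is not covered here.

```php
<?php
// Returns true if the form was submitted suspiciously soon after it was
// generated. $minSeconds is an arbitrary threshold chosen for this sketch.
function submitted_too_fast(float $generated, float $now, float $minSeconds = 3.0): bool
{
	return ($now - $generated) < $minSeconds;
}

// Possible use inside a module, before the input is processed further:
if (isset($_POST['_time'])
	&& submitted_too_fast((float) $_POST['_time'], microtime(true))) {
	$_POST = []; // clearing $_POST aborts the submission
}
```

A time check alone will not stop determined spammers, so it is best combined with other measures.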

print empty quotation blocks with citation

An empty quotation block with a citation is added to the HTML output now. For example

>
Silent Bob

is translated to HTML as:

<blockquote>
<cite>Silent Bob</cite>
</blockquote>

Previously the whole block would have been ignored. An empty quotation block without citation is still ignored.

add a timestamp as a query string to @look.css

The Aneamal Translator adds a link to the @look.css file automatically, if it exists. Now the time when the file was last changed is added to that link as a query string. For example:

<link rel='stylesheet' href='/@look.css?1692977144'>

This results in browsers downloading the CSS file anew after a change instead of using an old version from the browser cache. Note that this only works, if the HTML is also generated anew on the server and not loaded from Aneamal’s own cache.
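
The technique can be sketched in a few lines of PHP. This is not the Aneamal Translator's actual code; the function name stylesheet_link and its parameters are assumptions made for the example.

```php
<?php
// Builds a stylesheet link whose query string is the file's modification
// time. $href is the public address, $path the location on disk.
function stylesheet_link(string $href, string $path): string
{
	$time = is_file($path) ? filemtime($path) : false;
	if ($time === false) {
		return ''; // no link, if the file does not exist
	}
	return "<link rel='stylesheet' href='{$href}?{$time}'>";
}
```

Since the query string only changes when the file changes, browsers can keep using their cached copy in between.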

add a class to templates automatically

Templates get an automatic class identical to the template’s name now, for example a-info. Class names are added only once to any instance where a template is used, even if a class is both added automatically and declared manually in the template as well.

support @translator metadata

The translator of an Aneamal file can be declared in metadata now. This is inspired by https://doc.ohreally.nl/metatag-translator. You are discouraged from adding an email address to the metadata value though. It could attract spam mail.

support @x-…, @t-…, @math declarations with files

@x-…, @t-… and @math metadata can also be declared with a linked file or an embedded file as alternatives to a simple textual value now. This way modules can be configured with a JSON file for example.

enforce a single value for each metadata name irrespective of its type

With a few exceptions each metadata name can only be declared once inside a file – either with a textual value, a link or embedded file. Confusingly, a metadata name could still hold two different values in the body at the same time though, a link and a textual value, if one was declared locally and the other inherited from a parent file.

Now a new declaration without ? always replaces inherited metadata, i.e. not just when they are of the same type, and a new fallback declaration marked by ? is only respected, if no metadata is inherited for the name, no matter its type.

apply fallback metadata declarations consistently

? after a metadata name in a metadata declaration means that the declaration is just a fallback for the case that the same metadata name has not been declared in a parent document. Previously this only had an effect on the metadata used inside the HTML body. Fallback metadata declarations were still handled like regular ones and took precedence for output to the HTML head. This led to confusing inconsistencies. Now fallback metadata is consistently ignored, if its name has been declared earlier.

Consider the example:

@ author?: anonymous
Made by @author.

The old behaviour would have caused

<meta name='author' content='anonymous'>

to be added to the HTML head, but another author would be printed in the made-by line on the webpage, if @meta.nml had declared @author differently. Now both instances will have the same author, that is the one declared in @meta.nml, since the local declaration is only marked as fallback for an otherwise undeclared case.
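
The resolution rule for a single metadata name can be summarized in a small function. This is a simplified sketch and not the Translator's implementation: the function name is hypothetical, values are reduced to plain strings, and the author name Jane in the usage note is invented.

```php
<?php
// Resolves one metadata name. $inherited is the value from a parent file
// (null if none), $local the value declared in the current file (null if
// none), $fallback tells whether the local declaration was marked with ?.
function resolve_metadata(?string $inherited, ?string $local, bool $fallback): ?string
{
	if ($local === null) {
		return $inherited; // nothing declared locally
	}
	if ($fallback && $inherited !== null) {
		return $inherited; // a fallback is ignored if a parent declared the name
	}
	return $local; // regular declarations always replace inherited metadata
}
```

With an inherited author Jane, resolve_metadata('Jane', 'anonymous', true) yields Jane for both the HTML head and the body. Without an inherited author, the fallback anonymous is used.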

do not inherit writing direction from @meta.nml

This is in accordance with linked files not inheriting the writing direction from the main file.

allow empty data URIs

Empty data URIs are supported now, since they are valid according to RFC 2397.

allow x-modules with zero links

An example use case is a drawing module where a linked image would be loaded into a canvas while the absence of a link would imply a blank canvas.

Note that the default minimum number of links is still one. An x-module needs to set its minimum to 0 in order to work with zero links.

pass uniq parameter to modules

An individual parameter uniq is passed to each module instance. Its value can be used as an HTML ID for example.
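
A module might use the value to connect a label with its input field, for instance. The sketch below assumes that the parameter is available as $_['uniq'], by analogy with other module parameters; the markup is just an example.

```php
<?php
// Sketch of a module main function that uses the uniq parameter as an ID.
// Assumption: the value is available as $_['uniq'].
$main = function (array $_): string {
	$id = htmlspecialchars($_['uniq'] ?? '', ENT_QUOTES);
	return "<label for='{$id}'>Name</label> <input id='{$id}'>";
};
```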

change the meta parameter’s default value that is passed to modules

The default value of the meta parameter that is passed to modules is an empty PHP string now. Previously the default was NULL, but modules were already supposed to treat the empty string like NULL.

Using an empty string makes sense, for instance because a declaration in @meta.nml can not be undeclared in another file. It can only be overwritten with a declaration that has an empty value.

improve error reporting

⚠️ replace <div role=group> for headings with <hgroup>

Headings that are immediately followed by a subline such as a byline or tagline are now grouped together with these lines in an HTML hgroup element instead of a div element with a role attribute whose value is group. For example,

=== Dracula ===
by Bram Stoker

is now translated to:

<hgroup>
<h1>Dracula</h1>
<p>by Bram Stoker</p>
</hgroup>

While this change does not introduce any backwards incompatibility with older Aneamal files, it is a breaking change as far as CSS is concerned.

Rationale: Years ago HTML5 introduced a controversial alternative heading syntax and a document outline algorithm that never fully worked in browsers. It also introduced a subheading syntax against the recommendation of accessibility experts which never really worked correctly with assistive technology. The Aneamal Translator avoided these pitfalls, but it had to use a kludge to group headings with bylines, datelines, taglines etc. that are attached to a heading, because HTML lacked a native syntax for that purpose which did not cause any problems. In 2022, ill-fated additions to the HTML heading syntax were rolled back. The HTML hgroup element that had been introduced for subheadings in a way that did not work was altered in that context so that it would work and indeed work for use cases such as grouping bylines with a heading. Now that there is a working native syntax in HTML, the Aneamal Translator should make use of it, especially as our main developer was involved in bringing this change to the HTML syntax. Switching to the specific element hgroup avoids collisions with HTML code that uses the more generic group role on a div element in other contexts.

Updating: Replace div[role=group] with hgroup in your stylesheets. Then replace remaining instances of [role=group] with hgroup.

⚠️ make UTF-8 the only valid encoding in Aneamal files

UTF-8 is the only supported encoding for Aneamal files now. Invalid UTF-8 in Aneamal files is no longer reported as an error. Invalid byte sequences are simply replaced by the replacement character �.

This is a backwards incompatible change.

Rationale: UTF-8 has always been the default encoding of Aneamal files. It has been the only supported encoding for most linked files, for example HTML, TSV and plain text files as well as styles and scripts. The Aneamal Translator has always been restricted to encodings compatible with US-ASCII such as UTF-8, which ruled out UTF-16 and UTF-32 for example. The use of an encoding other than UTF-8 is slower and complicates the processing, since it needs to be converted to UTF-8. UTF-8 is the only valid encoding in HTML today. We do not know any Aneamal users who use an encoding other than UTF-8.

Updating: Convert Aneamal files with an encoding other than UTF-8 to UTF-8 and either remove the @charset metadata declarations from the files or assign UTF-8 in their @charset metadata declarations. There is no technical reason to keep the declarations.
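
Since invalid UTF-8 is no longer reported, it can help to check files before updating. The following sketch shows one common way to test a string for valid UTF-8 in PHP. The function name is an assumption, and the check is a general PHP technique, not a part of the Aneamal Translator.

```php
<?php
// Returns true if $bytes is valid UTF-8. An empty pattern with the u
// modifier makes PCRE validate the subject as UTF-8; preg_match returns
// false instead of a match count for invalid input.
function is_valid_utf8(string $bytes): bool
{
	return preg_match('//u', $bytes) === 1;
}
```

Applied to the result of file_get_contents for each Aneamal file, this reveals which files still need to be converted.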

⚠️ remove support for modules that lack an own folder

Modules are supposed to have their own folder in the Aneamal folder which contains a file index.php, e.g. /aneamal/x-module/index.php. Until now, there was an undocumented alternative: modules could also exist as a single PHP file directly inside the Aneamal folder, e.g. /aneamal/x-module.php. The latter, undocumented alternative is removed.

This is a backwards incompatible change.

Rationale: Having two different possible locations for a module is a source of confusion. It would be tricky in particular when files exist in both possible places for the same module name. Only one would be called, the other one ignored, and users may not be aware why their preferred module does not do what they expect it to do. While having a module as a single PHP file inside the Aneamal folder may seem like a simple solution, it becomes a painful restriction when further development of the module makes additional files necessary. Switching to a folder only at that point would make updating the module complicated for users. It is better to have a consistent and future-proof structure from the beginning.

Updating: If you have any module PHP files located directly in your Aneamal folder instead of in a subfolder, contact the module’s developer and ask them to provide a new version that conforms to the documented module structure before you update your Aneamal installation.
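
To find out whether an installation is affected, one could list the PHP files in the Aneamal folder that look like single-file modules. The sketch below assumes that such files start with x- or t-; the function name is hypothetical and other module prefixes would need to be added to the pattern.

```php
<?php
// Lists PHP files that sit directly in the Aneamal folder and look like
// single-file modules. Assumption: module names start with "x-" or "t-".
function legacy_module_files(string $aneamalDir): array
{
	$files = glob(rtrim($aneamalDir, '/') . '/[xt]-*.php');
	return $files === false ? [] : array_map('basename', $files);
}
```

An empty result means that no files matching the assumed pattern were found.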

use advantages of PHP 8.0

Aneamal 30 is published in 2023, when PHP 8.0 is the oldest PHP version that is still supported with security updates. Hence webmasters should update their PHP installation anyway.

29

2022-05-26

lazy loading of images

Browsers are told to load all except the first two images in a webpage lazily by default now. This means that images near the bottom of a page are only loaded when the reader scrolls down and the images are about to come into view. This is done by an HTML loading attribute for img elements.

This results in less bandwidth use for some clients and the server, quicker initial page load times and a better distribution of server load over time.

The behavior can be changed in Aneamal with a @load metadata declaration.
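
The default rule, everything except the first two images is loaded lazily, can be sketched as follows. The function name image_tags and its parameters are assumptions for illustration; the real output of the Aneamal Translator contains further attributes.

```php
<?php
// Creates img elements for a list of image addresses. All except the first
// $eager images get the attribute loading='lazy'.
function image_tags(array $sources, int $eager = 2): array
{
	$tags = [];
	foreach (array_values($sources) as $index => $src) {
		$src = htmlspecialchars($src, ENT_QUOTES);
		$lazy = $index >= $eager ? " loading='lazy'" : '';
		$tags[] = "<img src='{$src}'{$lazy}>";
	}
	return $tags;
}
```

The first images are kept eager, because they are likely to be visible without scrolling.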

set HTML height and width attributes for images

This is done automatically for images linked with [j] and local images linked with [i]. The image dimensions are cached, so that they do not have to be determined repeatedly.

Setting the attributes in the HTML output prevents page reflows while images are loaded especially in the context of lazy loading.

encode previews progressively

JPEGs are encoded in multiple passes with progressively more detail now. This means that a low quality impression of the whole image becomes available very quickly while loading and becomes more detailed then. Previously the image loaded line by line in full detail, but did not show the bottom part of an image until loading was finished. The visual end result is identical.

The total loading time is usually slightly shorter for progressively encoded images, but the difference is small.

support previews of images with a mirrored EXIF orientation

This is in addition to the existing support for rotations. Now all settings for the EXIF Orientation Tag – that is 1 to 8 as of EXIF 2.32 – in images are supported.

generate a preview from a video’s still image

A preview image with the dimensions of the video is generated from the still image, if both video and still image are available locally and the video comes in a WebM or MP4 container format. Example:

[v]->still.jpg->video.mp4

HTML width and height attributes are also set for the video in the HTML output then.

Automatically sized previews are a convenience gain for authors and can reduce bandwidth use for clients and the server.

You can use an absolute URL for the image to force the display of the original image instead of an optimized preview:

[v]->https://example.org/foo/still.jpg->video.mp4

no preloading of videos with still image

If a still image is provided for a video or audio file, the video or audio does not get preloaded in browsers anymore. Loading only begins once a reader tells it to play. This is communicated to browsers via an HTML preload attribute with the value none for video elements.

This results in less bandwidth use for some clients and the server and quicker initial page load times.

Mind that still images should have the same dimensions as a video, unless you let the Aneamal Translator generate a matching preview. Otherwise browsers will only learn about the correct size of the video display area once the video starts playing.

Videos and audio files without a still image get an HTML preload attribute with the value metadata so that browsers can determine the space needed to display the video and extract a still frame.

/aneamal/public and /aneamal/private for generated files

The new directory /aneamal/public is for automatically generated files that shall be publicly accessible. Files are not stored directly in that folder though, but organized in subfolders. Each module can have its own subfolder.

Preview images which are generated by the Aneamal Translator are stored in up to 256 subfolders of /aneamal/public/jpeg now. Previously all of them had been in /aneamal/pix.

The new directory /aneamal/private is for automatically generated files that shall not be directly accessible from the outside. Files are organized in subfolders here as well and each module can have its own folder.

The main Aneamal cache for the generated HTML code of webpages consists of up to 256 subfolders inside /aneamal/private/cache now. Previously, all cached files had been directly in /aneamal/cache, sometimes even mixed with files from modules (such as the Mouse search engine).

Having distinct folders makes these generated files easier to manage. For example, the main Aneamal cache can be cleared by deleting the directory /aneamal/private/cache now without accidentally deleting the databases of the Mouse search engine, which are stored in /aneamal/private/mouse from its version 2 onwards.

Dividing the main Aneamal cache and the folder for preview images into up to 256 subfolders also makes them easier to handle with software such as some FTP programs which start to struggle when they have to deal with thousands of files in a single folder.
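
How files are assigned to the subfolders is not described here. One common approach that yields up to 256 subfolders is to name the subfolder after the first two hexadecimal digits of a hash. The sketch below shows that approach only as an assumption; the function name and the use of MD5 are not taken from the Aneamal Translator.

```php
<?php
// Maps a cache key to a path with one of up to 256 subfolders. Assumption:
// the subfolder is named after the first two hex digits of an MD5 hash.
function sharded_path(string $base, string $key): string
{
	$hash = md5($key);
	return $base . '/' . substr($hash, 0, 2) . '/' . substr($hash, 2);
}
```

Two hexadecimal digits give 16 × 16 = 256 possible subfolder names, which matches the stated upper limit.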

The old directory /aneamal/cache can be removed. The old directory /aneamal/pix can be removed unless an author directly linked to images in it.

overhaul file links and metadata links

Metadata links can be used with linked files now, e.g.

@1: ->file

[i]->@1

Additionally, links to targets can be used in both metadata declarations and with linked files now:

@license: ->#copyright

[x-foo]->#bar

An additional link for a linked image can not be marked as unendorsed anymore.

overhaul handling of data URIs

The Aneamal Translator also accepts data URIs that are not base64 encoded now. Note that many bytes in data URIs that are not base64 encoded must be percent encoded instead, for example the space as %20.

The Aneamal Translator does not check the validity of data URIs which are just passed to browsers and not otherwise used by Aneamal anymore.

Mind that modern browsers usually do not fully support data URIs. So while you can link to a data URI like this …

`Click me.`->data:,Hello%20World%21

… a reader may not be able to use the corresponding link on your webpage, because their browser blocks access to the data URI even though it is valid and safe. However, the same data URI used as an address for a linked file …

[t]->data:,Hello%20World%21

… works, because it is processed by the Aneamal Translator which does not block it.
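
A data URI like the one above can be built with PHP's rawurlencode function, which percent-encodes the space as %20 and the exclamation mark as %21. The function name text_data_uri is just an example.

```php
<?php
// Builds a data URI without base64: the payload is percent-encoded instead.
// The media type may be left empty, as in the example data:,Hello%20World%21.
function text_data_uri(string $text, string $mediaType = ''): string
{
	return 'data:' . $mediaType . ',' . rawurlencode($text);
}
```

text_data_uri('Hello World!') returns data:,Hello%20World%21 and an empty text results in the empty data URI data:, accordingly.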

allow links without address

Links without an address such as Linktext-> are turned into <a>Linktext</a> in HTML now. This is rarely useful, but it can be in navigational menus in which the current page shall not link to itself, but shall be styled as an inactive link.

This is also for consistency, since meta links were already allowed without address. Meta links where the arrow is followed by something which results in an empty address like @Bar: ->`` are also allowed now.

do not encode the letter x in URLs for targets

Upper- and lowercase letters x were the only letters from the ASCII range that were encoded in URL fragments and HTML id attributes for targets. They were encoded as x78. For example, ->##fox created a target that linked to itself with a URL #fox78. Now x is not encoded anymore. The URL is #fox. This looks nicer and is more straightforward.

Furthermore, bytes outside the printable ASCII range are now simply encoded as a two-digit hexadecimal number, that is without a leading letter x. This results in shorter encoded strings.

linked and implied files inherit language, iff not declared

If the language is not declared in a linked or implied Aneamal file, then it inherits the language of the main document now. This is the same behavior already used for embedded files.

Previously the language of linked and implied files without an own @lang metadata declaration was handled as unknown.

enable clues for [a] and [t]

Authors can now add a clue for the file content within Aneamal and text file tokens. Example:

[a:The poem “trees” by Max Mustermann]->trees.nml

This has already been possible for the other file types. It is important for images in particular, because the clue is offered as an alternative for those who can not see an image there.

AneamalHome in .htaccess

The Aneamal Translator uses the $_SERVER['SCRIPT_NAME'] variable in PHP to determine the home address of the Aneamal installation for the purpose of creating host-relative links in the HTML output. Sometimes servers are configured in a way that the $_SERVER['SCRIPT_NAME'] variable does not give the correct path though. In that case authors can now set the correct path manually in .htaccess, for example:

SetEnv AneamalHome /

file type and module name are passed to modules

Additional data is passed to modules and can be used by module developers. The new parameters name and type differ only in case of submodules. Example:

file token    name    type
[t-foo]       t-foo   t-foo
[x-bar/baz]   x-bar   x-bar/baz
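
A module developer could derive the submodule from these two parameters as in the following sketch. The function name submodule is hypothetical, and how a module obtains the two values (presumably from the array passed to its main function) is an assumption.

```php
<?php
// Extracts the submodule part from the type parameter, e.g. "baz" for the
// name "x-bar" and the type "x-bar/baz". Returns '' if there is none.
function submodule(string $name, string $type): string
{
	$prefix = $name . '/';
	if (strncmp($type, $prefix, strlen($prefix)) === 0) {
		return substr($type, strlen($prefix));
	}
	return '';
}
```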

data-nosnippet for error messages

In the HTML output, error messages get a data-nosnippet attribute. This causes supporting search engines such as Mouse 2 and Google not to include content from error messages in their snippets within search results. See info at google.com.

improve readability of HTML output source code

Linebreaks in the HTML output make it easier to read and analyze it now. Previously almost all output came in a single, very long line.

make use of PHP 7.3 and PHP 7.4 advantages

Aneamal 29 is published in 2022. PHP 7.4 is the oldest PHP version supported with security updates in 2022. So authors should update their PHP version anyway.

Prior history

The version history for the Aneamal Translator up to version 28 can be found in German at https://prlbr.de/projekt/aneamal/versionen/.