Version history of the Aneamal Translator

31

2024-09-07

support markup for hooks

An ampersand before a word or string turns it into a hook, e.g. &muskox or &`Ovibos moschatus`, which is translated to HTML

<span class='_hook muskox'>muskox</span> or <span class='_hook ovibos-moschatus'>Ovibos moschatus</span>

Hooks can be leveraged with CSS or JavaScript for styling or functionality that native Aneamal phrase markup does not provide. They are going to replace the now deprecated custom marks. Unlike custom marks, hooks do not require metadata declarations and are better integrated in Aneamal. You can have double hooks like &&furlong or combine them with targets like &#`long fur` and metadata like &@version.

support closing regular sections seamlessly

Like expandable sections and subsections, regular sections and subsections can be closed seamlessly with +++ and + + respectively now, that is without adding a visible separator to the webpage.

As a consequence the two marks are renamed now:

mark   old name                      new name
+++    expandable-section break      seamless section break
+ +    expandable-subsection break   seamless subsection break

⚠️ encode sections explicitly in HTML

Regular sections and subsections get wrapped in HTML section elements now. This is analogous to how expandable sections and subsections are wrapped in HTML details elements.

The HTML section element makes regular sections stylable with CSS. Consistent section processing also helps with possible future features such as syntax highlighting, code folding or an automatic document outline.

Updating: Check whether the addition of HTML section elements interferes with your individual CSS settings. In particular, CSS child selectors as in main > h2 may need to be adapted, since headings do not occur as direct children of HTML elements like main anymore. Additionally, CSS pseudo-classes like :first-child, :nth-child, :last-child or :first-of-type, :nth-of-type, :last-of-type may need to be changed or replaced.

add an individual class to each HTML section and details element

Expandable sections and subsections are wrapped in HTML details elements, regular ones in HTML section elements. Both details and section elements get a class attribute now. Its value is derived from the corresponding heading with the same algorithm that turns a target into a HTML id attribute and a hook into a class attribute.

This enables selecting individual sections in CSS and JavaScript easily.

add alignment class to the HTML hgroup element, if available

The HTML class for alignment is added to the wrapping hgroup element in the generated webpage in case of regular headings with sublines now. This is analogous to expandable-section headings where the class is (and must be) added to the HTML summary element that wraps the heading. In other cases alignment is still added by wrapping a block into a HTML div element with a class attribute.

use Imagick to generate previews, if available

The main reason to use Imagick instead of GD right now is that Imagick 7.0.10-54+ supports JXL, while neither GD itself nor PHP’s GD extension supports JXL yet. But we also create JPG previews with Imagick now, if it is available. Image quality can be a bit better and the results are more consistent, because we use the same filter for upscaling as for downscaling, which GD’s imagecopyresampled does not.

Imagick supports a wider range of input formats than GD. But that is just a side effect for Aneamal users. It is generally preferable that users convert niche image formats to formats that are widely supported by browsers on their own first, because website visitors have a better chance to be able to see the full image and not just the preview then.

GD is a more integral part of PHP though and hence available on more servers, so the GD code is kept as a fallback.
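
The fallback order described above can be sketched as follows. This is only an illustration, not the Translator's actual code: the function names preview_backend and imagick_lists_jxl are made up, while extension_loaded() and Imagick::queryFormats() are regular PHP and Imagick calls. Note that a format being listed does not by itself prove that it can be written.

```php
<?php
// Hypothetical helper names; only the PHP/Imagick calls are real.

/** Pick the library for generating previews: Imagick first, GD as fallback. */
function preview_backend(): string
{
    if (extension_loaded('imagick')) {
        return 'imagick';
    }
    return extension_loaded('gd') ? 'gd' : 'none';
}

/** Report whether the installed ImageMagick build lists the JXL format. */
function imagick_lists_jxl(): bool
{
    return extension_loaded('imagick')
        && in_array('JXL', Imagick::queryFormats('JXL'), true);
}
```

A caller could generate a JXL preview only if imagick_lists_jxl() returns true and fall back to JPEG otherwise.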

resize images in a linear color space

Literally use a linear RGB color space while resizing in ImageMagick. Do it approximately by applying gamma correction before and after resizing in GD. See http://www.ericbrasseur.org/gamma.html for why a linear color space should be used.
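
A minimal sketch of the GD approximation, assuming a display gamma of 2.2 (the value is an assumption, the text above does not state it). The function name resize_in_linear_light is hypothetical; imagegammacorrect and imagecopyresampled are regular GD functions. Because GD works with 8 bits per channel, the round trip loses some precision, which is why this is only approximate.

```php
<?php
/**
 * Resize a GD image while approximating a linear color space.
 * Note: the source image is gamma-corrected in place.
 */
function resize_in_linear_light(GdImage $source, int $width, int $height): GdImage
{
    // Convert gamma-encoded values to roughly linear light (assumed gamma 2.2).
    imagegammacorrect($source, 2.2, 1.0);

    $target = imagecreatetruecolor($width, $height);
    imagecopyresampled(
        $target, $source,
        0, 0, 0, 0,
        $width, $height,
        imagesx($source), imagesy($source)
    );

    // Convert the resized image back to gamma-encoded values.
    imagegammacorrect($target, 1.0, 2.2);
    return $target;
}
```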

recognize a previews fix for enhanced preview support

This fix wraps the HTML img tag in an HTML picture element. The img tag integrates a JPEG preview that has been generated from an image linked with the [j] file token. The fix also generates a JXL preview for supporting browsers, if the server can do it, and integrates it via an HTML source element. The generated HTML code may look like this:

<figure>
<a href='/path/to/example.png'>
<picture>
<source srcset='/aneamal/public/jpeg/12/34567.jxl' type='image/jxl'>
<img src='/aneamal/public/jpeg/12/34567.jpg' alt='a bee' width='640' height='480'>
</picture>
</a>
</figure>

The benefit of the fix right now is that it offers JXL image previews to supporting browsers. The JXL alternative has a smaller file size, better image quality (fewer encoding artifacts, smoother gradients) and supports transparency.

It is planned to switch to JXL as default preview format later when the major browsers support it. The fix will still work then. Its benefit will then be that it still also serves JPEG previews for legacy browsers that do not support JXL.

JXL creation can't be done with GD as of January 2024, but is possible with Imagick 7.0.10-54+. Imagick does not support progressive encoding of JXL yet, which is a temporary downside compared to JPEGs.

recognize an inherit fix to inherit previous fixes

A fix metadata declaration normally overwrites inherited fixes. That behavior is changed by the inherit fix that can be assigned alongside other values. Inherited and new values are combined then.

prevent implicit form submissions when no submit button is present

The HTML standard encourages browsers to submit forms with a single textbox implicitly, that is without activation of a submit button, for example when a user hits the enter key. But this behavior is both irritating and useless for forms without submit button in Aneamal. A hidden disabled submit button in the generated webpage prevents it now.

ignore a leading line break in bulleted/numbered list items

1.
foo

and

1. foo

are equivalent now: the translated list item does not contain an initial HTML <br> in the first case anymore.

This is consistent with tagged list items and the list-like form field labels, but also with file captions and labels of math blocks, with alignment, with content of headings, quotation blocks and Aneamal files.

Ignoring an initial break also makes list items work better with sandwich markup that needs to be placed at the start of a line, not after 1. or <>.

remove HTML title attribute for invisible targets

A target #{foo} gets translated to <span id='foo'></span> in HTML without title attribute now. By default, an empty HTML span does not cover any area that could be hovered with a mouse pointer to trigger the display of the title attribute as tooltip like in the case of actual hints.

relax error reporting on empty headings

Only headings that have no content at all or that have only whitespace and escaped line breaks get an error message for being empty now. While a heading such as --- `` --- may not be useful to readers, authors who write this probably do so deliberately.

improve handling of invalid UTF-8

Previously, only US-ASCII characters were preserved in text that is not correctly UTF-8 encoded while each non-ASCII byte sequence, no matter how long, was replaced by a Unicode replacement character. Now only erroneous byte sequences are replaced by a Unicode replacement character and multiple such replacement characters can occur in a row. This makes finding the actual errors in the encoding easier.

The change does not apply to data URIs, which are still handled as US-ASCII, if wrongly encoded. US-ASCII is the default encoding of data URIs.
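
The new behavior for regular text can be illustrated with a small sketch. It is not the Translator's implementation: the function name is hypothetical, and it replaces every stray byte individually, whereas the text above does not say how many replacement characters a longer erroneous sequence produces.

```php
<?php
/** Keep well-formed UTF-8 characters, replace every other byte by U+FFFD. */
function replace_invalid_utf8(string $text): string
{
    // Group 1 matches one well-formed UTF-8 character; otherwise the final
    // dot matches a single stray byte (the pattern works on bytes, no /u).
    $pattern = '/(
          [\x00-\x7F]
        | [\xC2-\xDF][\x80-\xBF]
        | \xE0[\xA0-\xBF][\x80-\xBF]
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
        | \xED[\x80-\x9F][\x80-\xBF]
        | \xF0[\x90-\xBF][\x80-\xBF]{2}
        | [\xF1-\xF3][\x80-\xBF]{3}
        | \xF4[\x80-\x8F][\x80-\xBF]{2}
      ) | ./xs';

    $result = preg_replace_callback(
        $pattern,
        static fn (array $m): string => ($m[1] ?? '') !== '' ? $m[1] : "\u{FFFD}",
        $text
    );
    return $result ?? $text;
}
```

With this sketch, valid characters next to a broken byte survive, and two broken bytes in a row yield two replacement characters.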

make the start of an inline note end an implicit group

For example, the target in #foo=-bar-= becomes just foo, not foo-bar. This is consistent with other string markup. It was an oversight not to do it when inline notes were implemented. It is fixed now.

deprecate declared custom marks

  • Custom marks were introduced before some other features such as crossed-out strings and inline notes. Custom markup is less frequently needed due to such dedicated native markup.
  • Declared custom marks are rather complicated to use as each one requires a metadata declaration with a linked or embedded HTML file.
  • Custom marks behave unusually in conjunction with other markup such as links or hints. A custom mark was neither handled as a word like targets nor like string markup, but like a normal letter. This can be unexpected and confusing.
  • The newly introduced hooks offer a more Aneamal-like alternative and do not need to be declared.

Custom marks are planned to become obsolete in Aneamal 32.

deprecate data URI shorthand ->, for linked files

Embedded files are preferred over data URIs for text files. This is also true for metadata declarations. So the ->, shorthand for base64-encoded text data URIs is deprecated, but still supported.

⚠️ define notes as their own section

Notes are typically used for footnotes referenced from various sections in a page. So it makes sense not to interpret the notes as part of the last section started before them, but as their own notes region. Hence notes automatically end previous subsections and sections now.

Updating: If you have notes that are supposed to be inside expandable sections or subsections, you can put them into an embedded Aneamal file to prevent them from ending the section automatically. Alternatively, consider whether you can convert a short note into an inline note which can be used directly within expandable sections and subsections.

⚠️ simplify the translation of targets to HTML ids

The translation of a target to a HTML id works as follows now:

  1. normalization
    1. replace U+00A0 no-break space by U+0020 space
    2. remove U+00AD soft hyphen (new)
    3. turn Unicode letters to lowercase
  2. main conversion
    1. leave ASCII letters and digits untouched
    2. collapse all other ASCII bytes to single hyphens (simplified)
    3. encode non-ASCII bytes as lowercase hexadecimal numbers
  3. trimming
    1. remove leading and trailing hyphens

Not differentiating between different kinds of non-alphanumeric ASCII bytes in step 2.2 (collapsing them to single hyphens) is the main change.
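
The steps above can be sketched in PHP as follows. This is an illustration under assumptions, not the Translator's source: the function name is made up, a run of several non-alphanumeric ASCII bytes is assumed to become one hyphen, and each non-ASCII byte is assumed to become two hexadecimal digits.

```php
<?php
/** Translate a target to an HTML id following the three listed steps. */
function target_to_html_id(string $target): string
{
    // 1. normalization
    $text = str_replace("\u{00A0}", ' ', $target); // no-break space to space
    $text = str_replace("\u{00AD}", '', $text);    // remove soft hyphens
    $text = mb_strtolower($text, 'UTF-8');         // lowercase Unicode letters

    // 2. main conversion, working on bytes (no /u modifier);
    // ASCII letters and digits are left untouched.
    // Runs of other ASCII bytes become a single hyphen (assumed reading).
    $text = preg_replace('/[\x00-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]+/', '-', $text);
    // Each non-ASCII byte becomes two lowercase hexadecimal digits.
    $text = preg_replace_callback(
        '/[\x80-\xFF]/',
        static fn (array $m): string => bin2hex($m[0]),
        $text
    );

    // 3. trimming
    return trim($text, '-');
}
```

Under these assumptions the string Ovibos moschatus gives ovibos-moschatus, matching the hook example at the top of this version, and a lone U+002A asterisk gives an empty string, which is why it no longer works as a target.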

Rationale: Only a small number of real-world targets get encoded differently now, but explaining the encoding is significantly simplified so that it is easier to understand and implement in other tools. Targets that will change are usually no problem, since links to them within the same page change accordingly. The most important concern is that #\*, which may have been used as target of a footnote, will not work anymore. However, the U+002A asterisk is a bad choice for a footnote reference: many fonts display it small and super-positioned by default which makes the symbol extremely tiny when it is further reduced in size and super-positioned with ^\*. Besides, the U+002A asterisk is normally the heavy emphasis mark. So it is preferable to use other Unicode asterisks as footnote target and reference.

Updating: Replace the U+002A asterisk, if and only where you have used it as target or reference. The teardrop-spoked asterisk U+273B ✻ is a suitable alternative:

                  obsolete   alternative
target            #\*        #✻
reference         ^\*        ^✻
link to target    ->#\*      ->#✻

The Dingbats Unicode block contains many more asterisk variants.

⚠️ do not interpret backslashes in initial sandwich line

Rationale: The backslash \ was only interpreted as mark for a subsequent literal character inside the prefix in the initial line of sandwich markup so that the prefix could contain a slash / that is otherwise interpreted as ending the prefix.

This is a rare use case. It is also confusing, since it is not intuitively clear what a slash at the first position of the prefix should achieve when sandwich markup is interpreted, i.e. when its prefix is added to every subsequent line. Does it start new sandwich markup there or is it printed literally then? The latter was implemented, but this is inconsistent with how any other mark inside the prefix was treated.

As a consequence of interpreting the backslash inside the initial line of sandwich markup, the backslash itself needed to be protected with a backslash there, if you wanted it to be added as actual mark for a literal character to the start of every subsequent line. This meant that you needed four backslashes in the initial sandwich markup line to print a single backslash literally at the start of subsequent lines. Complicated!

In general, the simple explanation that whatever is in the sandwich markup prefix gets added to the start of every subsequent line was false for backslashes. Thanks to the change it is true for every character and sandwich markup is straight-forward to understand now.

The change also removes the inconsistency that the backslash was interpreted in the prefix, but not in the delimiter of the initial sandwich markup line.

The downside – that you can not have a slash inside the sandwich markup prefix anymore – is acceptable, because sandwich markup is syntactic sugar that is not necessary to achieve any result. Finally, we are not aware of any users who used sandwich markup to add slashes to subsequent lines.

Updating: Remove single backslashes \ from the prefixes of initial lines of sandwich markup. Then replace double backslashes in the prefixes of initial lines of sandwich markup with single backslashes. If you used sandwich markup to add forward slashes / at the start of a range of lines, you need to remove the sandwich markup and add the prefix manually to each line. Mind that a slash at the very start of the line must be marked as literal in order not to be mistaken for sandwich markup.

use advantages of PHP 8.1

PHP 8.1 is the oldest PHP branch which still receives security support in 2024. Hence you should update older PHP installations anyway.

release under Mozilla Public License 2.0

This is a free and open-source software license. Popular programs released under MPL 2.0 include Firefox, Thunderbird and LibreOffice. Unlike the permissive license used before, it is a weak copyleft license.

The license is compatible with some other widespread and stronger copyleft licenses such as LGPL, GPL, AGPL and EUPL. The Aneamal Translator continues to be interoperable even with proprietary and closed-source software. The license does not impose any restrictions on the licenses that Aneamal modules can have.

All contributors of the Aneamal Translator source code agreed to the relicensing under MPL 2.0.

30

2023-10-08

support markup for inline notes

An inline note can be used for information that is legally required, for example an attribution, license information or a note on included or excluded tax. The marks at the start and end of an inline note are called left fork and right fork respectively and consist of an equal sign and a hyphen. Example:

=-inline note-=

support markup for mutually exclusive options

Mutually exclusive options are options of which only one can be selected at a time within one block. Exclusive options may not occur together with non-exclusive options in the same block. The syntax is the same as for regular options except that an apostrophe is added after the right curly bracket. Example:

question
{key 1}' answer 1
{key 2}' answer 2

Exclusive options are implemented as radio buttons in HTML.

support markup to preselect options

Options can be preselected by adding a hyphen after the right curly bracket.

Two regular options, the first preselected:
{key 1}- this is preselected
{key 2} this is not preselected
Two mutually exclusive options, the first preselected:
{key 1}- this is preselected
{key 2}' this is not preselected

In a block of mutually exclusive options only one option can be preselected of course.

support markup for binary challenges

A binary challenge is an option with an expected choice. A binary challenge that should be selected is marked with {…}1 at the start of a line and a binary challenge that should not be selected is marked with {…}0 at the start of the line. Which choice is expected is not shown to readers.

How a failed binary challenge is dealt with depends on the module handling the form submission.

support markup for textboxes

Single-line textboxes are marked with [_] at the start of a line and multi-line textboxes are marked with [=] at the start of a line. A placeholder can be added like this: [_:placeholder]. Textboxes can be linked to a file: [_]->file.

Multi-line text boxes always load linked-file content as default value. Single-line text boxes load linked-file content as default value, if the linked file contains a single line. If it contains multiple lines, the textbox becomes a combobox and the file is interpreted as single-column TSV file wherein each record is a suggestion that is offered to readers in case of matching user input.

support markup to require user input in form fields

Options can be marked as required by adding an exclamation mark after the right curly bracket: {…}!. Textboxes can be made to require input by adding an exclamation mark after the right square bracket, for example: [_]!

Browsers typically do not allow a form to be submitted until all required form fields are filled. If a submitted form lacks required input nevertheless, the Aneamal Translator aborts the submission.

prepare API for form submission by a module

This version of the Aneamal Translator adds form fields and all form fields in the same Aneamal file belong to the same form. The Aneamal core syntax provides no means to submit a form though. One reason for this omission is that the Aneamal core is not updated frequently enough to compete in an arms race with spammers. But forms can be processed by JavaScript, CSS or Aneamal modules.

Modules opt in to use the form API by adding a $post argument with a default value of type string to the module’s main function:

return function (array $_, string $post = "") {
	return "…";
};

The Aneamal Translator supplies the module with data about the form then: the form parameter (that is $_['form']) contains the form’s HTML ID which is used to associate form fields or a submit button with the form; the post parameter contains all the user input and associated labels when the form is submitted; the cron parameter documents whether the file that contains the form was changed during the time that a user filled the form.

A module can add its own form fields to a form and feed their user input into the form API. To do the latter, the module’s $post argument’s default value must be the name of a function, the post handler. If the form is submitted, the Aneamal Translator calls the post handler right before the module’s main function. The post handler must return a PHP array that contains the user input from the module’s own form fields such as

[
	[
		'input' => 'foo',
		'label' => 'foolish field',
	],
	[
		'input' => 'bar',
		'label' => 'barely interesting',
	],
]

which is registered as

[
	'addon' => 'name of the module or sub-module',
	'block' => [
		[
			'input' => 'foo',
			'label' => 'foolish field',
		],
		[
			'input' => 'bar',
			'label' => 'barely interesting',
		],
	],
	'topic' => 'caption from the Aneamal file', // optional
]

in the form API then.
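
A minimal sketch of a module that feeds its own form field into the form API could look like this. The handler name, the field name x_example_comment and the label are made up for illustration. Whether the Aneamal Translator passes arguments to the post handler is not stated above, so the handler declares none.

```php
<?php
// Sketch of a module's index.php; all names are hypothetical.

/** Post handler: returns the user input of the module's own form fields. */
function x_example_post_handler(): array
{
    $raw = $_POST['x_example_comment'] ?? '';
    return [
        [
            // User input can not be trusted, so accept strings only.
            'input' => is_string($raw) ? trim($raw) : '',
            'label' => 'comment',
        ],
    ];
}

/** Main function: prints a textbox associated with the Aneamal form. */
$main = function (array $_, string $post = 'x_example_post_handler') {
    $form = htmlspecialchars((string) ($_['form'] ?? ''), ENT_QUOTES);
    return "<input type='text' name='x_example_comment' form='$form'>";
};

// A real module file would end with: return $main;
```

The HTML form attribute ties the textbox to the form by its ID, so the field can be printed anywhere in the page.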

Modules that make form submission possible should include some form of spam protection. Modules that identify spam or malicious code can abort a submission by clearing PHP's $_POST array.

Developers should keep in mind that user input can not be trusted. You may be dealing with an evil attacker!

For what it’s worth, you can trust $_POST['_time'] to be a UNIX timestamp in seconds with microsecond precision though and that the HTML form code has been generated by the Aneamal Translator at that time.

add a timestamp as a query string to @look.css

The Aneamal Translator adds a link to the @look.css file automatically, if it exists. Now the time when the file was last changed is added to that link as a query string. For example:

<link rel='stylesheet' href='/@look.css?1692977144'>

This results in browsers downloading the CSS file anew after a change instead of using an old version from the browser cache. Note that this only works, if the HTML is also generated anew on the server and not loaded from Aneamal’s own cache.
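
The technique can be sketched with PHP's filemtime function. The function name look_css_link and its parameters are assumptions for illustration, not taken from the Translator's source.

```php
<?php
/**
 * Build the stylesheet link with the file's modification time as query
 * string, or return an empty string if the stylesheet does not exist.
 */
function look_css_link(string $root, string $path = '/@look.css'): string
{
    $file = $root . $path;
    if (!is_file($file)) {
        return '';
    }
    return "<link rel='stylesheet' href='"
        . htmlspecialchars($path, ENT_QUOTES)
        . '?' . filemtime($file) . "'>";
}
```

Here $root stands for the directory on the server that corresponds to the website's root.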

add a class to templates automatically

Templates get an automatic class identical to the template’s name now, for example a-info. Class names are added only once to any instance where a template is used, even if a class is both added automatically and declared manually in the template as well.

support @x-…, @t-…, @math declarations with files

@x-…, @t-… and @math metadata can also be declared with a linked file or an embedded file as alternatives to a simple textual value now. This way modules can be configured with a JSON file for example.

do not inherit writing direction from @meta.nml

This is in accordance with linked files not inheriting the writing direction from the main file.

allow empty data URIs

Empty data URIs are supported now, since they are valid according to RFC2397.

pass uniq parameter to modules

An individual parameter uniq is passed to each module instance. Its value can be used as a HTML ID for example.

change the meta parameter’s default value that is passed to modules

The default value of the meta parameter that is passed to modules is an empty PHP string now. Previously the default was NULL, but modules were already supposed to treat the empty string like NULL.

Using an empty string makes sense, for instance because a declaration in @meta.nml can not be undeclared in another file. It can only be overwritten with a declaration that has an empty value.

improve error reporting

  • Servers can exit a PHP process after some time with a Fatal Error, if it reaches a maximum execution time, e.g. 30 seconds for shared webhosting. This limit is likely to be reached on a page where many preview images have to be generated anew. Now a message explaining the reason for the timeout is printed, if at least one preview image has been successfully generated before.
  • More error messages identify the specific line number in which an error occurred now, not just the range of lines of its block or that it occurred in ghost markup.
  • Errors in template metadata declarations are reported.
  • Errors in @meta.nml or a manually declared meta file are reported.
  • An alignment mark without content to align does not cause an error anymore. It is simply translated to an empty HTML <div>.
  • A metadata declaration with a metadata name such as @\style, i.e. a metadata name that would be recognized if not written with \, that gets an embedded file assigned, causes an error now instead of being treated as if the \ wasn't there.
  • A few errors get their own specific error messages and explanations instead of being covered by a more general error message.
  • Catch the PHP ValueError when PHP’s mb_preferred_mime_name is given an unknown encoding, which was a warning that could be suppressed with the PHP @ operator prior to PHP 8.
  • Unexpected PHP exceptions when handling Aneamal blocks are caught.

⚠️ replace <div role=group> for headings with <hgroup>

Headings that are immediately followed by a subline such as a byline or tagline are now grouped together with these lines in an HTML hgroup element instead of a div element with a role attribute whose value is group. For example,

=== Dracula ===
by Bram Stoker

is now translated to:

<hgroup>
<h1>Dracula</h1>
<p>by Bram Stoker</p>
</hgroup>

While this change does not introduce any backwards incompatibility with older Aneamal files, it is a breaking change as far as CSS is concerned.

Rationale: Years ago HTML5 introduced a controversial alternative heading syntax and a document outline algorithm that never fully worked in browsers. It also introduced a subheading syntax against the recommendation of accessibility experts which never really worked correctly with assistive technology. The Aneamal Translator avoided these pitfalls, but it had to use a kludge to group headings with bylines, datelines, taglines etc. that are attached to a heading, because HTML lacked a native syntax for that purpose which did not cause any problems. In 2022, ill-fated additions to the HTML heading syntax were rolled back. The HTML hgroup element that had been introduced for subheadings in a way that did not work was altered in that context so that it does work, including for use cases such as grouping bylines with a heading. Now that there is a working native syntax in HTML, the Aneamal Translator should make use of it, especially as our main developer was involved in bringing this change to the HTML syntax. Switching to the specific element hgroup avoids collisions with HTML code that uses the more generic group role on a div element in other contexts.

Updating: Replace div[role=group] with hgroup in your stylesheets. Then replace remaining instances of [role=group] with hgroup.

⚠️ make UTF-8 the only valid encoding in Aneamal files

UTF-8 is the only supported encoding for Aneamal files now. Invalid UTF-8 of Aneamal files is not reported as error anymore. Invalid byte sequences will simply be replaced by the replacement character �.

This is a backwards incompatible change.

Rationale: UTF-8 has always been the default encoding of Aneamal files. It has been the only supported encoding for most linked files, for example HTML, TSV and plain text files as well as styles and scripts. The Aneamal Translator has always been restricted to encodings compatible with US-ASCII such as UTF-8, which ruled out UTF-16 and UTF-32 for example. The use of an encoding other than UTF-8 is slower and complicates the processing, since it needs to be converted to UTF-8. UTF-8 is the only valid encoding in HTML today. We do not know any Aneamal users who use an encoding other than UTF-8.

Updating: Convert Aneamal files with an encoding other than UTF-8 to UTF-8 and either remove the @charset metadata declarations from the files or assign UTF-8 in their @charset metadata declarations. There is no technical reason to keep the declarations.
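
If many files need converting, a small script can help. The following is a sketch only: the function name is hypothetical and the source encoding must be known and supplied by you. Text that is already valid UTF-8 is returned unchanged to avoid converting it twice.

```php
<?php
/** Convert text from a known legacy encoding to UTF-8. */
function convert_to_utf8(string $bytes, string $from): string
{
    // Leave text alone that already is well-formed UTF-8.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    return mb_convert_encoding($bytes, 'UTF-8', $from);
}
```

For example, convert_to_utf8(file_get_contents($file), 'ISO-8859-1') could be written back to the same file after making a backup.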

⚠️ remove support for modules that lack an own folder

Modules are supposed to have their own folder in the Aneamal folder which contains a file index.php, e.g. /aneamal/x-module/index.php. Until now, there was an undocumented alternative: modules could also exist as a single PHP file directly inside the Aneamal folder, e.g. /aneamal/x-module.php. The latter, undocumented alternative is removed.

This is a backwards incompatible change.

Rationale: Having two different possible locations for a module is a source of confusion. It would be tricky in particular when files exist in both possible places for the same module name. Only one would be called, the other one ignored, and users may not be aware why their preferred module does not do what they expect it to do. While having a module as a single PHP file inside the Aneamal folder may seem like a simple solution, it becomes a painful restriction when further development of the module makes additional files necessary. Only switching to a folder then would make updating the module for users complicated. It is better to have a consistent and future-proof structure from the beginning.

Updating: If you have any module PHP files located directly in your Aneamal folder instead of in a subfolder, contact the module’s developer and ask them to provide a new version that conforms to the documented module structure before you update your Aneamal installation.

use advantages of PHP 8.0

Aneamal 30 is published in 2023, when PHP 8.0 is the oldest PHP version that is still supported with security updates. Hence webmasters should update older PHP installations anyway.

29

2022-05-26

lazy loading of images

Browsers are told to load all except the first two images in a webpage lazily by default now. This means that images near the bottom of a page are only loaded when the reader scrolls down and the images are about to come into view. This is done by an HTML loading attribute for img elements.

This results in less bandwidth use for some clients and the server, quicker initial page load times and a better distribution of server load over time.

The behavior can be changed in Aneamal with a @load metadata declaration.
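
The default rule can be sketched like this. The function name and the $eager parameter are assumptions for illustration; how the @load metadata declaration maps to such a setting is not described here.

```php
<?php
/**
 * Build an img tag. Images after the first $eager ones in the page get
 * the HTML loading attribute with the value lazy.
 */
function img_tag(string $src, string $alt, int $position, int $eager = 2): string
{
    $html = "<img src='" . htmlspecialchars($src, ENT_QUOTES)
        . "' alt='" . htmlspecialchars($alt, ENT_QUOTES) . "'";
    if ($position > $eager) {
        $html .= " loading='lazy'";
    }
    return $html . '>';
}
```

The $position argument counts images from 1 in document order.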

set HTML height and width attributes for images

This is done automatically for images linked with [j] and local images linked with [i]. The image dimensions are cached, so that they do not have to be determined repeatedly.

Setting the attributes in the HTML output prevents page reflows while images are loaded especially in the context of lazy loading.
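
Determining the dimensions of a local image can be sketched with PHP's getimagesize function. The function name is hypothetical, and the caching mentioned above is left out to keep the example short.

```php
<?php
/** Return HTML width and height attributes for a local image file. */
function dimension_attributes(string $file): string
{
    if (!is_readable($file)) {
        return '';
    }
    $size = getimagesize($file);
    if ($size === false) {
        return ''; // not a recognized image format
    }
    return "width='{$size[0]}' height='{$size[1]}'";
}
```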

encode previews progressively

JPEGs are encoded in multiple passes with progressively more detail now. This means that a low quality impression of the whole image becomes available very quickly while loading and becomes more detailed then. Previously the image loaded line by line in full detail, but did not show the bottom part of an image until loading was finished. The visual end result is identical.

Progressive encoding usually reduces the total loading time slightly, but not by much.
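
With GD, progressive encoding is switched on via the interlace bit before the image is saved as JPEG. The following is a minimal sketch with a hypothetical function name and an assumed default quality, not the Translator's code.

```php
<?php
/** Save a GD image as progressive JPEG. */
function save_progressive_jpeg(GdImage $image, string $file, int $quality = 80): bool
{
    // For JPEG output, the interlace bit makes GD write a progressive file.
    imageinterlace($image, true);
    return imagejpeg($image, $file, $quality);
}
```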

support previews of images with a mirrored EXIF orientation

This is in addition to the existing support for rotations. Now all settings for the EXIF Orientation Tag – that is 1 to 8 as of EXIF 2.32 – in images are supported.
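
For reference, the eight orientation values can be expressed as a lookup table of the operations that restore the upright image. The function name and the array keys are made up for this sketch; the mapping itself follows the common reading of the EXIF Orientation Tag, with horizontal mirroring applied before the clockwise rotation.

```php
<?php
/**
 * Map an EXIF Orientation Tag value (1 to 8) to the operations that
 * restore the upright image: mirror horizontally, then rotate clockwise.
 */
function exif_orientation_fix(int $orientation): array
{
    $table = [
        1 => ['mirror' => false, 'rotate_cw' => 0],
        2 => ['mirror' => true,  'rotate_cw' => 0],
        3 => ['mirror' => false, 'rotate_cw' => 180],
        4 => ['mirror' => true,  'rotate_cw' => 180],
        5 => ['mirror' => true,  'rotate_cw' => 270],
        6 => ['mirror' => false, 'rotate_cw' => 90],
        7 => ['mirror' => true,  'rotate_cw' => 90],
        8 => ['mirror' => false, 'rotate_cw' => 270],
    ];
    // Unknown values are treated like the normal orientation 1.
    return $table[$orientation] ?? $table[1];
}
```

Values 2, 4, 5 and 7 are the mirrored orientations; the others are plain rotations.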

generate a preview from a video’s still image

A preview image with the dimensions of the video is generated from the still image, if both video and still image are available locally and the video comes in a WebM or MP4 container format. Example:

[v]->still.jpg->video.mp4

HTML width and height attributes are also set for the video in the HTML output then.

Automatically sized previews are a convenience gain for authors and can reduce bandwidth use for clients and the server.

You can use an absolute URL for the image to force the display of the original image instead of an optimized preview:

[v]->https://example.org/foo/still.jpg->video.mp4

no preloading of videos with still image

If a still image is provided for a video or audio file, the video or audio does not get preloaded in browsers anymore. Loading only begins once a reader tells it to play. This is communicated to browsers via an HTML preload attribute with the value none for video elements.

This results in less bandwidth use for some clients and the server and quicker initial page load times.

Mind that still images should have the same dimensions as a video, unless you let the Aneamal Translator generate a matching preview. Otherwise browsers will only learn about the correct size of the video display area once the video starts playing.

Videos and audio files without still image get a HTML preload attribute with the value metadata so that browsers can determine the space needed to display the video and extract a still frame.

/aneamal/public and /aneamal/private for generated files

The new directory /aneamal/public is for automatically generated files that shall be publicly accessible. Files are not stored directly in that folder though, but organized in subfolders. Each module can have its own subfolder.

Preview images which are generated by the Aneamal Translator are stored in up to 256 subfolders of /aneamal/public/jpeg now. Previously all of them had been in /aneamal/pix.

The new directory /aneamal/private is for automatically generated files that shall not be directly accessible from the outside. Files are organized in subfolders here as well and each module can have its own folder.

The main Aneamal cache for the generated HTML code of webpages consists of up to 256 subfolders inside /aneamal/private/cache now. Previously, all cached files had been directly in /aneamal/cache, sometimes even mixed with files from modules (such as the Mouse search engine).

Having distinct folders makes these generated files easier to manage. For example, the main Aneamal cache can be cleared by deleting the directory /aneamal/private/cache now without accidentally deleting the databases of the Mouse search engine, which are stored in /aneamal/private/mouse from its version 2 onwards.

Dividing the main Aneamal cache and the folder for preview images into up to 256 subfolders also makes them easier to handle with software such as some FTP programs which start to struggle when they have to deal with thousands of files in a single folder.
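One common way to end up with at most 256 subfolders is to name each subfolder after the first two hexadecimal digits of a hash. The following Python sketch shows that idea only. How the Aneamal Translator actually picks the subfolder is not stated here, so the hash function, the key and the function name sharded_path are assumptions.

```python
import hashlib
import posixpath


def sharded_path(base_dir, key, extension):
    """Place a generated file in one of up to 256 subfolders of base_dir.

    Assumption: the subfolder is the first two hex digits of a SHA-256
    hash of the key, which gives 16 * 16 = 256 possible folder names.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return posixpath.join(base_dir, digest[:2], digest[2:] + extension)


# Hypothetical keys; real cache keys may look entirely different.
print(sharded_path("/aneamal/public/jpeg", "foo/still.jpg", ".jpg"))
print(sharded_path("/aneamal/private/cache", "/blog/index.nml", ".html"))
```

Because a good hash spreads keys evenly, thousands of files end up as a few dozen files per folder instead of one huge directory listing.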

The old directory /aneamal/cache can be removed. The old directory /aneamal/pix can be removed unless an author directly linked to images in it.

overhaul handling of data URIs

The Aneamal Translator also accepts data URIs that are not base64 encoded now. Note that many bytes in data URIs that are not base64 encoded must be percent encoded instead, for example the space as %20.

The Aneamal Translator no longer checks the validity of data URIs that are merely passed on to browsers and not otherwise used by Aneamal.

Mind that modern browsers usually do not fully support data URIs. So while you can link to a data URI like this …

`Click me.`->data:,Hello%20World%21

… a reader may not be able to use the corresponding link on your webpage, because their browser blocks access to the data URI even though it is valid and safe. However, the same data URI used as an address for a linked file …

[t]->data:,Hello%20World%21

… works, because it is processed by the Aneamal Translator which does not block it.
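For authors who want to generate such addresses, here is a small Python sketch that builds both kinds of data URI for the text used in the examples above. It relies only on the Python standard library. The function names are made up for this illustration.

```python
import base64
from urllib.parse import quote


def percent_data_uri(text, media_type=""):
    """Build a data URI whose payload is percent encoded, not base64."""
    # safe="" makes quote() encode every reserved character, e.g. space as %20.
    return "data:" + media_type + "," + quote(text, safe="")


def base64_data_uri(data, media_type=""):
    """Build a base64 encoded data URI from raw bytes."""
    payload = base64.b64encode(data).decode("ascii")
    return "data:" + media_type + ";base64," + payload


print(percent_data_uri("Hello World!"))    # data:,Hello%20World%21
print(base64_data_uri(b"Hello World!"))    # data:;base64,SGVsbG8gV29ybGQh
```

Percent encoding keeps short texts readable, while base64 is usually the more compact choice for binary data such as images.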

do not encode the letter x in URLs for targets

Upper- and lowercase letters x were the only letters from the ASCII range that were encoded in URL fragments and HTML id attributes for targets. They were encoded as x78. For example, ->##fox created a target that linked to itself with a URL #fox78. Now x is not encoded anymore. The URL is #fox. This looks nicer and is more straightforward.

Furthermore, bytes outside the printable ASCII range are now simply encoded as two-digit hexadecimal numbers, that is without a leading letter x. This results in shorter encoded strings.
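The difference between the old and the new encoding can be illustrated with a rough Python sketch. It covers only the two rules described above. Lowercase hexadecimal digits and leaving all other printable ASCII characters untouched are assumptions of this sketch, not documented behavior, and both function names are invented.

```python
def encode_target_old(name):
    """Old scheme (sketch): x becomes x78, other bytes become x + two hex digits."""
    out = []
    for byte in name.encode("utf-8"):
        if chr(byte) in "xX":
            out.append("x78")
        elif 0x20 <= byte <= 0x7E:
            out.append(chr(byte))       # assumption: kept as-is
        else:
            out.append("x%02x" % byte)  # assumption: lowercase hex digits
    return "".join(out)


def encode_target_new(name):
    """New scheme (sketch): x is kept, other bytes become two bare hex digits."""
    out = []
    for byte in name.encode("utf-8"):
        if 0x20 <= byte <= 0x7E:
            out.append(chr(byte))
        else:
            out.append("%02x" % byte)
    return "".join(out)


print(encode_target_old("fox"), encode_target_new("fox"))  # fox78 fox
print(encode_target_old("é"), encode_target_new("é"))      # xc3xa9 c3a9
```

Each encoded byte shrinks from three characters to two, which is where the shorter strings come from.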

linked and implied files inherit language, iff not declared

If the language is not declared in a linked or implied Aneamal file, then it inherits the language of the main document now. This is the same behavior already used for embedded files.

Previously, the language of linked and implied files without their own @lang metadata declaration was handled as unknown.
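The inheritance rule amounts to a simple fallback, sketched here in Python. The function name and the use of None for "not declared" are assumptions made for the illustration.

```python
def effective_language(declared, main_document_language):
    """Return the language that applies to a linked, implied or embedded file.

    declared is the file's own @lang value, or None if it has none.
    """
    if declared:
        return declared              # an explicit declaration always wins
    return main_document_language    # otherwise inherit from the main document


print(effective_language(None, "en"))   # en
print(effective_language("de", "en"))   # de
```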

enable clues for [a] and [t]

Authors can now add a clue for the file content within Aneamal and text file tokens. Example:

[a:The poem “trees” by Max Mustermann]->trees.nml

This has already been possible for the other file types. It is important for images in particular, because the clue is offered as an alternative to those who cannot see the image.

AneamalHome in .htaccess

The Aneamal Translator uses the $_SERVER['SCRIPT_NAME'] variable in PHP to determine the home address of the Aneamal installation for the purpose of creating host-relative links in the HTML output. Sometimes servers are configured in a way that the $_SERVER['SCRIPT_NAME'] variable does not give the correct path though. In that case authors can now set the correct path manually in .htaccess, for example:

SetEnv AneamalHome /
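The order of precedence can be pictured with the following Python sketch, which mimics the server variables with a dictionary. The Translator itself is written in PHP, and how exactly it derives the home address from the script path is not described here. The rule "two directory levels above the script" and the script path /aneamal/main.php are assumptions for illustration only.

```python
import posixpath


def home_address(server):
    """Return the home address used for host-relative links (sketch)."""
    override = server.get("AneamalHome")
    if override:
        return override  # a manual setting from .htaccess takes precedence
    # Assumption: the script lives in <home>/aneamal/, so go up two levels.
    script_dir = posixpath.dirname(server["SCRIPT_NAME"])
    home = posixpath.dirname(script_dir)
    return home if home.endswith("/") else home + "/"


print(home_address({"SCRIPT_NAME": "/aneamal/main.php"}))       # /
print(home_address({"SCRIPT_NAME": "/blog/aneamal/main.php"}))  # /blog/
print(home_address({"SCRIPT_NAME": "/cgi/x/aneamal/main.php",
                    "AneamalHome": "/"}))                       # /
```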

file type and module name are passed to modules

Additional data is passed to modules and can be used by module developers. The new parameters name and type differ only in case of submodules. Example:

file token    name     type
[t-foo]       t-foo    t-foo
[x-bar/baz]   x-bar    x-bar/baz
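The relation between the two parameters in the table can be expressed as a short Python sketch. It only reproduces the two rows shown: clues and other token features are ignored, and the function name is invented.

```python
def module_parameters(file_token):
    """Split a file token like [x-bar/baz] into the name and type values."""
    file_type = file_token.strip()[1:-1]   # drop the surrounding brackets
    name = file_type.split("/", 1)[0]      # the module name ends at the slash
    return {"name": name, "type": file_type}


print(module_parameters("[t-foo]"))      # {'name': 't-foo', 'type': 't-foo'}
print(module_parameters("[x-bar/baz]"))  # {'name': 'x-bar', 'type': 'x-bar/baz'}
```

A module that serves several submodules can thus use type to tell them apart, while name always identifies the module itself.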

data-nosnippet for error messages

In the HTML output, error messages get a data-nosnippet attribute. This causes supporting search engines such as Mouse 2 and Google not to include content from error messages in their snippets within search results. See info at google.com.

improve readability of HTML output source code

Line breaks in the HTML output now make it easier to read and analyze. Previously almost all output came in a single, very long line.

make use of PHP 7.3 and PHP 7.4 advantages

Aneamal 29 is published in 2022. PHP 7.4 is the oldest PHP version supported with security updates in 2022. So authors should update their PHP version anyway.

Prior history

The version history for the Aneamal Translator up to version 28 can be found in German at https://prlbr.de/projekt/aneamal/versionen/.