[asciidoc-lang-dev] Avoiding Implementation Specifics Thoughts

One thing that has been mentioned several times is the concept of "extension points".  This needs careful definition and consideration if it's to be included in any specification, so that the specification avoids implementation specifics.

Asciidoctor makes good use of the dynamism of its development environment, Ruby, to allow customisation.  No markup is likely to cover every eventuality, so some extensibility seems important (although, as devil's advocate, I point out that using extensions likely makes a document _not_ Asciidoc Specification compliant).

But C and other fully compiled languages are not so amenable to being changed dynamically.  It would not be a good thing to specify the capability in a way that implementations in those languages cannot reasonably comply with.

Another implementation specific is the handling of the simplest markup, constrained and unconstrained quotes.  The current implementations (AFAIK all of them) perform the recognition and replacement in a fixed order, rather than in the order that the markup occurs in the document.  This is the source of the problem, regardless of whether you use regexes or lexers to recognise the tokens.
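
To make the fixed-order point concrete, here is a minimal sketch in Python.  The two rules and their order are toy assumptions of mine, not the patterns or the pass order of any real processor, and the function name `fixed_order` is hypothetical.

```python
import re

# Toy rules only: NOT the real patterns or pass order of any processor.
# Each pass scans the whole text before the next pass starts.
PASSES = [
    (re.compile(r"\*(.+?)\*"), r"<strong>\1</strong>"),  # pass 1: strong
    (re.compile(r"_(.+?)_"), r"<em>\1</em>"),            # pass 2: emphasis
]

def fixed_order(text):
    """Apply every rule in list order, ignoring where marks occur in the text."""
    for pattern, replacement in PASSES:
        text = pattern.sub(replacement, text)
    return text

# The "_" comes first in the text, but the strong rule still runs first,
# and the two passes know nothing about each other's matches.
print(fixed_order("_a *b_ c*"))
# prints: <em>a <strong>b</em> c</strong>
```

The output has interleaved tags because neither pass knows what the other matched, whatever technique is used to find the tokens.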

Replacing that with a recursive descent parser giving an in-order AST is simple, but the results are different to existing processors.  (I have an experimental one already that I hope to publish to github in a few weeks, depending on how my "real" world goes; it has lots of other experiments too :-)  So for Asciidoc 1.0 this might need to stay as is for compatibility, and documents that depend on it could be deprecated, ready for an Asciidoc 2.0 that changes the processing order.
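
As an illustration of the idea only (this is not the experimental parser mentioned above), a recursive descent parser for two toy marks might look like the sketch below.  It assumes a mark is markup only if a matching closer is found, ignores the constrained/unconstrained distinction entirely, and the names `MARKS` and `parse_inline` are hypothetical.

```python
# Toy marks only; real constrained/unconstrained rules are ignored.
MARKS = {"*": "strong", "_": "em"}

def parse_inline(text):
    """Return an in-order AST: a list of strings and (tag, children) tuples."""
    def parse(pos, closer):
        nodes = []
        while pos < len(text):
            ch = text[pos]
            if ch == closer:
                return nodes, pos + 1, True      # found our closing mark
            if ch in MARKS:
                children, nxt, closed = parse(pos + 1, ch)
                if closed:
                    nodes.append((MARKS[ch], children))
                    pos = nxt
                    continue
            # Plain character, or a mark that was never closed: keep as text.
            if nodes and isinstance(nodes[-1], str):
                nodes[-1] += ch
            else:
                nodes.append(ch)
            pos += 1
        return nodes, pos, False                 # ran out of text
    return parse(0, None)[0]

print(parse_inline("*a _b_ c*"))
# prints: [('strong', ['a ', ('em', ['b']), ' c'])]
print(parse_inline("_a *b_ c*"))
# prints: ['_a ', ('strong', ['b_ c'])]
```

The second input always produces a properly nested tree, whereas a fixed sequence of replacement passes over the same text can emit overlapping markup.  That difference is exactly the compatibility concern.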

The same issue occurs at the next higher level as well: the ordered recognition of inline markup (special characters, quotes, replacements, etc.) and its sidekick, the infamous "subs=" attribute.  That makes the Asciidoc language not only context dependent (in structures like sections and lists), but, through "subs=", _content_ dependent.  Not many (actually not any AFAIK) programming languages allow the source code to specify which language constructs are allowed in parts of the program, so the normal formal computer language methods are difficult to apply.  Imagine if a programming language source could say "no, the `while` construct isn't to be parsed in this part of the program"; that is what "subs=" does.
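
A toy sketch of why "subs=" is awkward for a grammar: the set of rules applied to a block is chosen by a value written in the document itself.  The three substitutions here are heavily simplified stand-ins, and `SUBS`, `DEFAULT_SUBS` and `apply_subs` are hypothetical names, not any processor's API.

```python
import re

# Simplified stand-ins for three substitution groups (assumptions, not real rules).
SUBS = {
    "specialcharacters": lambda s: s.replace("&", "&amp;")
                                    .replace("<", "&lt;")
                                    .replace(">", "&gt;"),
    "quotes": lambda s: re.sub(r"\*(.+?)\*", r"<strong>\1</strong>", s),
    "replacements": lambda s: s.replace("(C)", "&#169;"),
}
DEFAULT_SUBS = ["specialcharacters", "quotes", "replacements"]

def apply_subs(text, subs=None):
    """Run only the passes named in the block's own subs value."""
    if subs is None:
        names = DEFAULT_SUBS
    else:
        names = [name.strip() for name in subs.split(",") if name.strip()]
    for name in names:
        text = SUBS[name](text)
    return text

print(apply_subs("*a* (C)"))
# prints: <strong>a</strong> &#169;
print(apply_subs("*a* (C)", subs="replacements"))
# prints: *a* &#169;
```

The same text `*a*` is markup in one block and literal text in the other, and nothing but the document's own attribute decides which.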

It will be interesting to see how this is formalised.

Cheers
Lex
