Nleyten@1200: all of the above (almost)
Question: what sort of markup should we use for stories and comments? Here are some popular choices:
- HTML: It has the advantage of being so common that most people are at least familiar with the idea of adorning text with <b>, <i>, and such tags. It has the disadvantage that HTML tends to be awfully verbose (damn you XML!) and ugly.
- BBCode: Similar to HTML, but using a different notation for tags and a much saner (though more limited) way to delimit paragraphs. It has the additional advantage of having become fairly popular across Internet forums, and being therefore very familiar.
- LaTeX: A lot less verbose than HTML and BBCode, and a lot more readable even for large documents. It has the disadvantage that "kids these days" are not quite so familiar with it.
- Wiki: It makes basic formatting simple, but anything slightly more complex becomes a nightmare to read and to parse. I can easily build examples that are ambiguous for a human to parse, let alone a computer!
The answer: Lambdium will accept all of the above, with the possible exception of Wiki markup. The reason is that the first three are very similar parsing-wise, and I have in fact already built working parsers for them. As for the Wiki markup, the situation is more complex. The main difficulty is that there's no simple way to express in Wiki markup the complex structure that Lambdium accepts for documents. I was therefore faced with a choice of a) extending the Wiki markup in a non-standard way so it becomes more expressive, b) making Wiki-formatted documents less expressive than their siblings, or c) dropping it for now, and considering options a) or b) in the future if there's enough demand for this markup. I have chosen c).
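To give a feel for why the first three notations are so close, here is a minimal sketch that looks at the problem from the rendering side rather than the parsing side. It is not Lambdium's actual code: the cut-down `node` type, the `notation` record, and the tag spellings are assumptions made for illustration, and no escaping of special characters is attempted. The point is only that the three notations differ in how a tag opens and closes, not in the tree underneath.

```ocaml
(* Hypothetical, cut-down inline tree; not Lambdium's actual definition. *)
type node =
  | Text of string
  | Bold of node list
  | Emph of node list

(* A notation is described solely by its opening and closing delimiters. *)
type notation = {
  bold : string * string;
  emph : string * string;
}

let html   = { bold = ("<b>", "</b>");    emph = ("<i>", "</i>") }
let bbcode = { bold = ("[b]", "[/b]");    emph = ("[i]", "[/i]") }
let latex  = { bold = ("\\textbf{", "}"); emph = ("\\emph{", "}") }

(* Render the same tree in whichever notation is passed in. *)
let rec render n nodes =
  String.concat "" (List.map (render_node n) nodes)
and render_node n = function
  | Text s -> s
  | Bold xs -> wrap n.bold (render n xs)
  | Emph xs -> wrap n.emph (render n xs)
and wrap (op, cl) body = op ^ body ^ cl

let sample = [Text "a "; Bold [Text "bold "; Emph [Text "claim"]]]

let () =
  List.iter (fun n -> print_endline (render n sample)) [html; bbcode; latex]
```

Running it prints the same sentence three times, once per notation, with only the delimiters changing. Wiki markup does not fit this open/close pattern nearly as neatly, which is part of the difficulty described above.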
As for the Subversion log for revision 1200, it reads "BBcode parser now integrated into Lambdium-light". Yep, I've integrated one of the parsers into Lambdium-light (it previously accepted only text without formatting). I've picked only one because Lambdium-light is supposed to remain "light", obviously! In addition, I left some of the more advanced constructs (such as tables, figures, citations, and cross-references) out of the grammar, because they would only make it more complex and Lambdium-light is also supposed to be easy to learn.
If you're curious, here's the OCaml definition of the document structure for Lambdium-light's comments and stories (note: Document.t = fragment_t). As you can see, a document is just a list of blocks; each block is either a simple paragraph, or one of the compound blocks. Some of the latter are complex enough to include fragment_t in their own definition (yep, fragment_t is recursively defined!). This simple definition therefore allows for extremely complex document trees, hopefully as complex as required to express all conceivable stories and comments.
type text_t = string
type link_t = string

type inline_t = node_t list
and node_t =
  | Text of text_t
  | Bold of inline_t
  | Emph of inline_t
  | Sup of inline_t
  | Sub of inline_t
  | Url of link_t * inline_t

type fragment_t = block_t list
and block_t =
  | Paragraph of inline_t
  | Section of inline_t
  | Quote of fragment_t
  | Code of text_t
  | Listing of fragment_t list
  | Enum of fragment_t list

type t = fragment_t
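To make the recursion a bit more tangible, here is a small sketch that builds a document value by hand and walks it. The type definitions are repeated from above so the snippet compiles on its own; the `sample` document and the `plain` and `depth` helpers are hypothetical and written only for illustration, they are not part of Lambdium-light.

```ocaml
type text_t = string
type link_t = string

type inline_t = node_t list
and node_t =
  | Text of text_t
  | Bold of inline_t
  | Emph of inline_t
  | Sup of inline_t
  | Sub of inline_t
  | Url of link_t * inline_t

type fragment_t = block_t list
and block_t =
  | Paragraph of inline_t
  | Section of inline_t
  | Quote of fragment_t
  | Code of text_t
  | Listing of fragment_t list
  | Enum of fragment_t list

(* Hypothetical sample: a comment that quotes a bulleted list. *)
let sample : fragment_t = [
  Section [Text "Reply"];
  Quote [
    Paragraph [Text "Options:"];
    Listing [
      [Paragraph [Bold [Text "HTML"]]];
      [Paragraph [Url ("http://example.org", [Text "BBCode"])]];
    ];
  ];
  Paragraph [Text "I agree."];
]

(* Strip all formatting from a run of inline text. *)
let rec plain (inl : inline_t) : string =
  String.concat "" (List.map plain_node inl)
and plain_node = function
  | Text s -> s
  | Bold xs | Emph xs | Sup xs | Sub xs -> plain xs
  | Url (_, xs) -> plain xs

(* Deepest level of block nesting; a non-empty flat document has depth 1. *)
let rec depth (frag : fragment_t) : int =
  List.fold_left (fun acc b -> max acc (block_depth b)) 0 frag
and block_depth = function
  | Paragraph _ | Section _ | Code _ -> 1
  | Quote f -> 1 + depth f
  | Listing items | Enum items ->
    1 + List.fold_left (fun acc f -> max acc (depth f)) 0 items
```

For the sample above, `depth sample` evaluates to 3: the quote is one level, the list inside it a second, and the paragraphs inside each list item a third. Both helpers follow the shape of the types directly, with one mutually recursive function per mutually recursive type.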
