I am interested in your GSoD'21 project proposal. I found out about it on the season-of-docs repo. It is a well-defined project and I am sure it will be accepted.
Most of the documentation projects have an implicit requirement that you get familiar with the software itself, since you are expected to improve the content of the documentation. In my opinion this is the most difficult part of the project. But this project only seeks to improve the infrastructure of the documentation, and that's why I find it easier than the others.
Of course it needs a lot of work and it may have its own difficulties, but I am familiar with documentation systems, conversion tools, etc. I am sure that I will find some way to automate the conversion of the existing Docbook XML docs to AsciiDoc. Even if there are no existing tools that are appropriate for this task, I can develop my own conversion tools; Docbook XML is easy to parse and process by a program and I have some experience with doing this (many years ago). Before being a technical writer, I am a computer engineer, software developer and hacker, so I am sure that I can find a way.
You can find some links to my previous documentation work on this page: https://gitlab.com/-/snippets/1861138
Please let me know if you have any questions, or would like to know more details.
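To make the "Docbook XML is easy to parse" point concrete, here is a minimal sketch of a custom converter using only Python's standard library. It handles a handful of elements (`title`, `para`, `section`, `itemizedlist`, `command`, `literal`, `emphasis`) and ignores everything else, including XML namespaces, which DocBook 5 files would need. The sample input and the function names are made up for illustration and are not taken from the BRL-CAD sources.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; real DocBook files use many more elements.
SAMPLE = """<article>
  <title>Example</title>
  <section>
    <title>Overview</title>
    <para>Run <command>mged</command> to start the editor.</para>
    <itemizedlist>
      <listitem><para>first item</para></listitem>
      <listitem><para>second item</para></listitem>
    </itemizedlist>
  </section>
</article>"""


def inline(elem):
    """Flatten an element's mixed content, mapping a few inline tags."""
    parts = [elem.text or ""]
    for child in elem:
        text = inline(child)
        if child.tag in ("command", "literal"):
            text = f"`{text}`"
        elif child.tag == "emphasis":
            text = f"_{text}_"
        parts.append(text)
        parts.append(child.tail or "")
    # Collapse the whitespace that XML indentation leaves behind.
    return " ".join("".join(parts).split())


def convert(elem, depth=0):
    """Return AsciiDoc lines for a small subset of DocBook block elements."""
    lines = []
    for child in elem:
        if child.tag == "title":
            lines += ["=" * (depth + 1) + " " + inline(child), ""]
        elif child.tag == "para":
            lines += [inline(child), ""]
        elif child.tag == "itemizedlist":
            for item in child.findall("listitem"):
                lines.append("* " + inline(item))
            lines.append("")
        elif child.tag in ("section", "sect1", "sect2"):
            lines += convert(child, depth + 1)
        # Any other element is silently skipped in this sketch.
    return lines


def docbook_to_asciidoc(xml_text):
    """Convert a DocBook string to AsciiDoc text (subset only)."""
    root = ET.fromstring(xml_text)
    return "\n".join(convert(root)).rstrip() + "\n"


if __name__ == "__main__":
    print(docbook_to_asciidoc(SAMPLE))
```

A real migration would also need tables, cross-references, images and entity handling, which is where most of the manual effort would probably go.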
Hi @Dashamir Hoxha and thank you for your optimism. Hope it gets accepted too!
You are quite correct -- if you were working on improving or even re-arranging BRL-CAD's documentation, it would help to have some basic familiarity and we have tutorials to help (literally anyone at any skill level) with that. You still might want to check them out for a better understanding, but it's not strictly required for GSoD.
Automated conversion is almost certainly going to be essential, but we know it won't cover all existing docs, so some manual conversion will likely still be required.
Relevant FOSDEM presentations:
https://fosdem.org/2021/schedule/event/ttdpostgresdocbook/
https://fosdem.org/2021/schedule/event/ttdasciidocantora/
https://fosdem.org/2021/schedule/event/ttddoctoolchain/
So... is the intent then to replace the DocBook->html and DocBook->man page tool chains in our current build?
To duplicate the current capability to bootstrap the doc building toolchain from only the main compiler, it looks like we'd have to bootstrap ruby and then build asciidoctor (plus whatever deps it may have...)
Thanks @Sean , I am going to have a look at these presentations.
This project is the first in my preference list. If you accept me, then I don't have to apply to any other projects. Instead, I can start working on this one (getting familiar, etc.)
@starseeker multiple intents... one is to seriously give something new a trial for docs, and another is to move towards docs as a submodule, and yet another is to simply invigorate doc infrastructure activity.
the hope, of course, is to unify with our other docs (e.g., presentations, wiki), find something more easily editable/readable for non-dev contributors, still produce web+pdf+man outputs.
That was part of the strategic interest in Antora/Asciidoc as it is supported by github, somewhat compatible with markdown, also a simple readable/editable format, and converts to docbook very well. That keeps some processing options open for us if we find, for example, we want or need to keep the existing manpage tooling.
I'm not strongly committed yet to Antora, but after weeks of research, it checked the most requirement boxes. It's a bit of a bikeshed considering md/rst/adoc doc formats and their associated presentation systems (antora, sphinx, rtd, mkdocs, hugo, docusaurus, etc), but adoc is a really close fit with our current docbook so I'm hopeful that it strikes a balance for representing the things we do in docbook while still providing trivial authoring. And it's got a lot of development momentum currently, which doesn't hurt.
The postgresql folks talking about their 20 years with docbook and how it's been a love+hate relationship riddled with issues hits a little close to home... :)
From the presentation, it seems that Antora+AsciiDoc is the right tool for the job.
/me looks at ruby and asciidoc, winces - if we go that route, it's probably going to mean requiring an installed Ruby/asciidoc setup. I doubt I'll have time to hammer that into a bootstrapped toolchain build cross platform...
@Sean Would it be possible to consider using OpenBSD's mandoc for just the man pages? In some ways they're the most crucial to have bundled with the CAD system proper, and mandoc is almost certainly a lot easier to bootstrap...
If we need asciidoc to feed Antora, we could see if mandoc's markdown output and https://github.com/asciidoctor/kramdown-asciidoc could act as an adapter.
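A rough sketch of what that adapter chain could look like when driven from Python. It assumes `mandoc -T markdown` writes Markdown to stdout and that the `kramdoc` command from kramdown-asciidoc accepts `--output FILE INPUT`; both should be checked against the installed versions. The function names and file names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(man_src, md_out, adoc_out):
    """Return the two command lines: mdoc -> Markdown, Markdown -> AsciiDoc."""
    to_markdown = ["mandoc", "-T", "markdown", str(man_src)]
    # Assumed kramdoc invocation; verify the flags with `kramdoc -h`.
    to_asciidoc = ["kramdoc", "--output", str(adoc_out), str(md_out)]
    return to_markdown, to_asciidoc


def convert(man_src, workdir):
    """Run the pipeline if both tools are installed; return the .adoc path or None."""
    if not (shutil.which("mandoc") and shutil.which("kramdoc")):
        return None
    workdir = Path(workdir)
    md_out = workdir / (Path(man_src).stem + ".md")
    adoc_out = md_out.with_suffix(".adoc")
    to_markdown, to_asciidoc = build_commands(man_src, md_out, adoc_out)
    # mandoc writes Markdown to stdout, so capture it into a file first.
    result = subprocess.run(to_markdown, check=True, capture_output=True, text=True)
    md_out.write_text(result.stdout)
    subprocess.run(to_asciidoc, check=True)
    return adoc_out
```

Whether the Markdown that mandoc emits survives the second hop cleanly is exactly the kind of thing a trial on a few real man pages would show.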
starseeker said:
/me looks at ruby and asciidoc, winces - if we go that route, it's probably going to mean requiring an installed Ruby/asciidoc setup. I doubt I'll have time to hammer that into a bootstrapped toolchain build cross platform...
I wouldn't worry about integration issues until after decent progress is made proving the end-result is worthwhile.
Looks like there's a javascript port that should work cross-platform without much fuss (e.g., via cscript.exe or wscript.exe on windows), but I think it shouldn't be too hard to detect whatever is required and/or provide it from a devtooling submodule.
I agree that manpages are critical but I think all the docs need to be reliably generated and integrated at least as well as they are now. Towards that, I'm more inclined to leave the docbook infrastructure for manpages until it's sorted reliably. I don't think we should revert back to man/mdoc/troff files... that'd be quite a regression, atrocious editability.
Looks like the asciidoc format was recently updated to make it even more concise, and coincidentally aligns more closely with markdown syntax. Previous processor is implemented in python. That's another option, though obviously only viable if we limit ourselves to the previous syntax/features.
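As an illustration of "detect whatever is required", here is a minimal sketch that looks for an AsciiDoc processor on the PATH in order of preference. The executable names are assumptions about how the Ruby, JavaScript and Python processors are commonly installed and may differ per platform or package version. The `which` parameter exists only so the lookup can be swapped out for testing.

```python
import shutil

# Candidate executables in order of preference. The names are assumptions
# about typical installs, not something confirmed for every platform.
CANDIDATES = [
    ("asciidoctor", "Ruby Asciidoctor"),
    ("asciidoctorjs", "Asciidoctor.js CLI"),
    ("asciidoc", "Python AsciiDoc (older syntax)"),
]


def find_processor(candidates=CANDIDATES, which=shutil.which):
    """Return (path, description) for the first processor found, else None."""
    for exe, description in candidates:
        path = which(exe)
        if path:
            return path, description
    return None


if __name__ == "__main__":
    found = find_processor()
    print(found if found else "no AsciiDoc processor found on PATH")
```

A build system could run the equivalent check at configure time and fall back to a bundled copy when nothing is found.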
mdoc is probably less mainstream than docbook, but I wouldn't have thought it would be an edit-ability regression (at least compared to docbook) for that specific purpose... It should at least be much less verbose. I suppose it might be less editable than asciidoc... I should try to translate some man pages into both and see how steep the respective learning curves are. As you say though, only comes up once decent asciidoc progress has been demonstrated.
I must be missing something... mdoc is the old manpage format with a set of prescribed macros, right? Or did some group name their format same as the old one? You do realize BRL-CAD's manpages used to all be in that format, right?? They were not better than the current setup by any stretch.
Docbook is verbose, but it's declarative and relatively unencoded. You can infer what it represents simply by reading it. That makes editing (and reading source form) straightforward albeit technically tedious. That can't be said of mdoc.
Reading an existing page, there's no way to even guess what ".Nm This is something" or ".Qq yay arcane syntax" is going to do without reading a manual (hah) and/or seeing rendered output. It's not friendly to non-developers; heck I'd argue it's not developer-friendly either. It was fairly broadly abandoned as a source format some 20 years ago for that and many other reasons.
For what it's worth, the direction I have had in mind for our docs is prioritizing ease of non-technical authorship, web+pdf output, and feasibility for online editing. Note I intentionally prioritize those three things over tooling as the toolchain challenges can be managed. While it does still need to be proven, I do believe Antora can work out well for BRL-CAD with asciidoc as a source format. Asciidoc is less markup than Markdown while retaining the expressiveness of Docbook; so docs remain unencoded and become even simpler text files. Markdown would be a fallback option that Docusaurus appears to excel at, but markdown does not have as compatible or consistent a representation and offline/pdf output becomes much harder. Both are known for being about as good as it gets right now in terms of ease of non-technical editing, both have very good online editing options, and both are supported by Github too, but Asciidoc reportedly handles technical documents much better and its structure lends itself to much easier pdf generation too. That's some of the motivation for trailblazing in this direction.
Automated conversion is almost certainly going to be essential, but we know it won't cover all existing docs, so some manual conversion will likely still be required.
From the project plan: 2.2 What is not in scope?:
There are extensive docs not in Docbook XML format in our repository, on the website wiki, in other formats, and in other files that are desirable and may also be migrated (e.g., for testing purposes), but they are secondary priority and need not be considered in scope.
Wiki docs are written in markdown format, right? So, I think that it should not be difficult to convert them automatically to AsciiDoc. If there are no better tools, even a bash or sed script might do the job.
In general, if the docs have an editable format, which is later converted to a presentation format (like PDF, HTML, etc.), chances are that they can be converted automatically to AsciiDoc. If they exist only in a presentation format (PDF, HTML), most probably a fully automatic conversion is not possible and some manual intervention and editing would be required.
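For the simple-script approach to Markdown sources, a sketch along these lines (Python here instead of bash/sed) shows the idea. It covers only ATX headings, inline links, `**bold**`, `-` bullets and fenced code blocks. Images, nested lists, tables and single-asterisk emphasis are not handled, and the function name and sample text are illustrative only.

```python
import re


def markdown_to_asciidoc(text):
    """Convert a small subset of Markdown to AsciiDoc, line by line."""
    out = []
    in_code = False
    for line in text.splitlines():
        if line.startswith("```"):
            # Fenced code blocks become AsciiDoc listing blocks.
            out.append("----")
            in_code = not in_code
            continue
        if in_code:
            # Leave code untouched.
            out.append(line)
            continue
        heading = re.match(r"^(#{1,6})\s+(.*)$", line)
        if heading:
            out.append("=" * len(heading.group(1)) + " " + heading.group(2))
            continue
        # [text](url) -> url[text]
        line = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r"\2[\1]", line)
        # **bold** -> *bold*
        line = re.sub(r"\*\*(.+?)\*\*", r"*\1*", line)
        # "- item" -> "* item"
        line = re.sub(r"^-\s+", "* ", line)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "# Title\n\nSee [the wiki](https://brlcad.org/wiki) for **more**.\n- one\n- two\n"
    print(markdown_to_asciidoc(sample))
```

For anything beyond this subset, an existing converter such as kramdown-asciidoc (linked earlier in the thread) is likely the safer choice.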
@Sean That's fair. (I think OpenBSD's mdoc is a variation on the original man page syntax.) I never directly edited much in the original format for the man pages or OpenBSD's mdoc variation, so I don't have any real sense of how hard it is to work with for that purpose (I do know it's not well suited for general typesetting.) When we made the docbook conversion OpenBSD's mdoc/mandoc didn't exist, so it wouldn't have been a factor in the original decision. My initial thought was the potential simplicity of that toolchain has a lot to recommend it, but if the format it's based on is specifically on the to-avoid list then it won't change the calculus.
The phrase "toolchain challenges can be managed" makes me flinch a little, but if we get significantly better results from the contribution and display side we can give it a go.
With the old troff format (which is essentially what man pages are), you can describe macros that essentially do prescribed typesetting. There used to be a bunch of competing sets that all did similar things to create what we recognize as the general standard sections of a man page. Looking at the mdoc manual page, it's simply one of those macro sets that they settled on.
Which is to say that it's nothing "new" per se, they just slapped a name on a convention. It's still troff/man format. Which isn't to say it's good or bad -- there are some nice things about man pages. That was just a ship that literally sailed 2 decades ago... :)
starseeker said:
The phrase "toolchain challenges can be managed" makes me flinch a little, but if we get significantly better results from the contribution and display side we can give it a go.
That's totally fair. Maybe one of the things for the GSoD project can be to also proof the integration challenges to make sure there's not a show-stopper.
/me nods. The potential game changer for me was the mandoc program, which (if I'm understanding correctly) can produce html, man, PDF and Postscript from a single source document while being completely self contained. Assuming it really can, that's even better than our docbook toolchain (we need Apache FOP for PDF output), although it's probably a question how sophisticated its formatting is for things like tables.
Oh well. Maybe I can teach mandoc to read asciidoc man page inputs.
We really need doc unification too. I mean that's arguably the biggest obstacle we've had with docbook. Setup is such a chore. It's a bit of a bear to convert something that's been formatted in InDesign or even Word.
To me, our current setup is a perfectly sufficient solution for man pages (there's not much value/point to PDF manual pages other than completeness) because it generates good manual pages flawlessly. It's a solution for the tutorials too because they're converted. The pain is new non-man docs, online editing, importing existing docs, etc.
Well now that's an interesting option. I just successfully compiled asciidoctor.js to a standalone binary executable. There's a javascript compiler that worked surprisingly well.
So it's at least feasible that we could precompile for all concerned environments and pull them from a repo in lieu of requiring the ruby toolchain. The asciidoctor.js tool can also simply be run directly on some platforms (e.g., on Windows).
Push comes to shove, looking at the size and complexity of the code, I think we could transcode asciidoctor from ruby or javascript to C++ in a week too, though that'd be a burden to keep in sync with upstream changes.
Dashamir Hoxha said:
I am interested in your GSoD'21 project proposal.
I am familiar with documentation systems, conversion tools, etc. I am sure that I will find some way to automate the conversion of the existing Docbook XML docs to AsciiDoc. Even if there are no existing tools that are appropriate for this task, I can develop my own conversion tools; Docbook XML is easy to parse and process by a program and I have some experience with doing this (many years ago). Before being a technical writer, I am a computer engineer, software developer and hacker, so I am sure that I can find a way.
You can find some links to my previous documentation work on this page: https://gitlab.com/-/snippets/1861138
Please let me know if you have any questions, or would like to know more details.
Is it useful/desirable to have an automatic conversion from AsciiDoc to Docbook as well (after the migration to AsciiDoc is done)? Or do you just want to get rid of Docbook and its toolchain?
@Dashamir Hoxha I don't think we will know that for a while.
I mean, I do believe less complexity is better and -- in general -- eliminating complexity and/or dependencies can be a good thing for maintainability. BUT... the key consideration here is ease of authorship / editing and unification of BRL-CAD's documentation both online and offline.
I am interested in applying for BRLCAD.
I am Simos Xenitellis, https://blog.simos.info/
Last updated: Jan 10 2025 at 00:48 UTC