Learning Groff

Wed, 19 Apr 2023 10:02:55 -0500

One of the first things people complain about when moving over to Linux is the absence of Microsoft Word. Their first instinct is either to figure out how to install Word on Linux (which is a bad idea) or to choose the most common open source alternative: LibreOffice Writer. Now, as far as I'm concerned, LibreOffice Writer is a fine 'What You See Is What You Get' (WYSIWYG) alternative to Word, but I am here to argue that WYSIWYG editors are a problem. In particular, I believe that WYSIWYG editors like Word have generally decreased computer literacy.

WYSIWYG vs. WYSIWYM

In the most basic sense, a WYSIWYG editor is exactly what it sounds like. What you see on the page as you're writing is how it will look when you print the final product. For example, bold text looks bold and italics appear italicized as you're editing. WYSIWYG editors have been the default for the average user for a long time now, but that was not always the case. Plain text formatting, or 'What You See Is What You Mean' (WYSIWYM) formatting, has always been alive and active, and is, in my opinion, the superior mode of document writing.

You may ask, what is WYSIWYM formatting? As the name implies, you write in plain text, but you use tags or macros to tell the computer what to do with that text. Those with experience with HTML will know what this looks like. For example, instead of clicking a button to make text bold, you use a predefined tag to indicate where the bold text should be.

<p><strong>Bold Text in HTML</strong> followed by normal text</p>

Then, you pass this through a compiler or interpreter, which converts your plain text document into a more visual format (e.g. PDF) or displays it directly. These tag or macro sets are sometimes called markup languages. Common markup languages include HTML, LaTeX, and Markdown.

Despite the modest learning curve, using markup languages has many benefits. Writing documents in plain text allows for greater portability; often uses less memory; admits programmatic manipulation; makes formatting math equations, chemical formulas, and bibliographies a breeze; and, perhaps most importantly, improves computer literacy.

Learning a markup language is an opportunity to learn more about how a computer works. When we teach people how to use software like Word, PowerPoint, etc., a lot of the mechanics under the hood are hidden from view. This obfuscation of the underlying logic leads non-technical people to think software is "magic." As someone who works in IT, I am often amazed by the lack of fundamental knowledge people have about their own computers. Many people do not understand basic concepts like directory structures, file types, and resource management. I don't think everyone needs to become a computer scientist, but people should have a better education when it comes to a technology as ubiquitous as computers, and learning a markup language is a great place to begin such an education.

Groff

Many new Linux users are not aware that they all have a small program called Groff installed on their system. In fact, many experienced users are not aware of this either. Groff is the GNU implementation of an old typesetting program called Troff, which is itself based on an even older program called Roff. It allows the user to write plain text documents with macros to specify formatting. Then, the user can compile this document into their preferred format: PDF, PostScript, and HTML, among other common formats. Simply put, Groff is a markup language with many nice macro packages for different use cases.

simple documents

Today, we will focus on a macro package called MS which is short for manuscript. This package is mostly used for writing simple documents, scientific papers, and reports. We begin by creating a document. Let's call it document.ms. Open this file in your favorite plain text editor: vim, nano, emacs, gedit, etc.

title pages

Every Groff macro begins with a '.' and must be the first thing on the line. It is best to show this by example. At the top of this document, we will put a title and the author's name.

.TL
Title Goes Here
.AU
Mat L.

As you can see, the text that the macro is acting on goes on the line below. By default, the title will be bold and the author's name italic, and both will be centered.
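The same pattern extends to the other cover page macros listed in the macro guide at the end of this post. As a rough sketch (the file name cover.ms, the institution, and the abstract text are all made up for illustration), the following writes a small document with an institution and an abstract, then compiles it if Groff can produce a PDF on your system:

```shell
# Write a small MS document with a full cover section.
# cover.ms and all of its text are placeholders for illustration.
cat > cover.ms <<'EOF'
.TL
Title Goes Here
.AU
Mat L.
.AI
Example Institution
.AB
This is a short abstract. It sits between the .AB and .AE macros.
.AE
.LP
The body of the document starts here.
EOF

# Compile to PDF only when groff is available on this system.
if command -v groff >/dev/null 2>&1; then
    groff -ms -Tpdf cover.ms > cover.pdf || echo "groff could not produce a PDF" >&2
fi
```

The abstract text goes between .AB and .AE, just as the title goes below .TL.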

headings

Following our title and author, we might want to make some section headings. We can either make numbered or unnumbered headings.

.NH 
Level 1 Heading
.NH 2
Level 2 Heading
.SH
Unnumbered Level 1 Heading
.SH 2
Unnumbered Level 2 Heading

Notice that the .NH macro for numbered headings defaults to a first level heading. We can add integer arguments to specify multilevel headings. Groff will automatically keep track of the numbering. For example, the first subsection of section 2 will be numbered 2.1. If you give a numeric argument to .SH, the unnumbered heading macro, Groff sets the heading at the same size as a numbered heading of that level. Note that heading sizes only vary by level when the GROWPS register is set; with the default settings, all headings come out the same size. Additionally, headings are automatically bold and left justified.

paragraphs

Next, let's talk about paragraphs. Suppose you write a bunch of text, and then hit enter once. This will not create a new paragraph. All of the text will still be part of the same paragraph. You can create new paragraphs with a few different macros. The .PP macro creates a new paragraph with an indent and the .LP macro creates a paragraph with no indent.

.LP
This is my first paragraph. I don't want it indented because it's the 
first! Also, even though this text is on a new line, it won't be in a 
new paragraph.
.PP
This is my second paragraph. It's got an indent because I used the .PP
macro. Notice that even though I wrote .PP, it won't treat it as a 
macro because it's not the first word on the line.

We also have macros for bold and italics: .B for bold and .I for italics. They are used a bit differently.

.PP
This is a paragraph again! The next sentence is important. 
.B "I'm putting this in bold because it's important." 
Now, I want something to be in italics.
.I "This is in italics because I want it to be."

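Notice that the text that I want to be bold or italicized is on the same line as the .B and .I macros. You may also notice that the text is in quotes, which makes the whole sentence a single argument. If you want a double quote inside your bold or italicized sentence, write it as \(dq; a backslash followed by a double quote will not work because \" starts a comment in Groff.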
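Here is a small sketch of a few more ways to call these macros; the file name emphasis.ms and the sentences in it are invented for illustration. It relies on two behaviors of the MS font macros: a single word needs no quotes, and an optional second argument is set in the surrounding font immediately after the styled text, which is handy for punctuation.

```shell
# emphasis.ms is a made-up file name; the text is placeholder content.
cat > emphasis.ms <<'EOF'
.LP
Quoting is optional for a single word:
.B bold
and
.I italic
both work.
.LP
A second argument follows the styled text in the normal font. This is
.B important .
.LP
The next sentence contains literal double quotes.
.I "She said \(dqhello\(dq and left."
EOF

# Compile to PDF only when groff is available on this system.
if command -v groff >/dev/null 2>&1; then
    groff -ms -Tpdf emphasis.ms > emphasis.pdf || echo "groff could not produce a PDF" >&2
fi
```

In the output, the period after "important" should come out in the regular font, with no space before it.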

lists

Next on the list are lists. These are a bit tricky in Groff, but once you get the logic down, they're not a problem. Again, learn by looking at an example.

.RS
.IP \(bu 2
List item 1
.IP \(bu 
List item 2
.IP \(bu 
List item 3
.RS
.IP \(bu 2
Nested Item 1
.IP \(bu 
Nested Item 2
.RE
.RE

This will make a simple bulleted list. Let's break it down. Notice that the entire block is contained within the .RS/.RE macros. These macros indent everything in between them by one level. It's not clear what the R stands for. You can nest as many .RS/.RE blocks as you want. Inside the block, each list item starts with the .IP macro. This macro takes two arguments: a bullet marker and a width. The standard bullet marker is \(bu, but there are others to choose from if you desire. The width value indicates how far the bullet is from the list item. I find that 2 is a good width, but adjust as you see fit. You will notice that we only needed to specify the width once within the .RS/.RE environment. It stays set for all other bullets until you leave the environment.
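Since the marker is just the first argument to .IP, the same structure can produce a numbered list by typing the numbers by hand. A minimal sketch, assuming a width of 4 leaves enough room for markers like "1." (the file name steps.ms and the list text are made up):

```shell
# steps.ms is a made-up file name; the list text is placeholder content.
cat > steps.ms <<'EOF'
.LP
To build the document:
.RS
.IP 1. 4
Write the source file.
.IP 2.
Compile it with groff.
.IP 3.
Open the result in a PDF viewer.
.RE
EOF

# Compile to PDF only when groff is available on this system.
if command -v groff >/dev/null 2>&1; then
    groff -ms -Tpdf steps.ms > steps.pdf || echo "groff could not produce a PDF" >&2
fi
```

The numbers are plain text, so you have to renumber by hand if you reorder the items.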

compiling your document

Groff allows you to compile your document into many different formats. By default, it will create a PostScript document, but for most simple documents PDF is sufficient. To compile your document, open a terminal and navigate to the directory your file is in. Run the following command on your file.

$ groff -ms document.ms -Tpdf > document.pdf 

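Let's break down this command to understand what's going on:

  1. groff runs the Groff program.
  2. -ms loads the MS macro package, which defines macros like .TL and .PP.
  3. document.ms is the input file.
  4. -Tpdf selects PDF as the output format.
  5. > document.pdf redirects the output into a file. Groff writes the compiled document to standard output, so without the redirect the raw PDF data would be printed to your terminal.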

If this command runs with no output, then you've successfully created a PDF! Now you can open it in your favorite PDF viewer. That's all there is to it!
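The -T option accepts other output devices as well. A minimal sketch, assuming your Groff installation ships the ps, html, and utf8 devices (smaller installs may not); the output file names are arbitrary, and a tiny stand-in document.ms is created if you don't have one yet:

```shell
# Create a tiny stand-in document if document.ms does not exist yet.
[ -f document.ms ] || printf '%s\n' .TL 'Title Goes Here' .LP 'Hello, Groff.' > document.ms

if command -v groff >/dev/null 2>&1; then
    # PostScript is the default device, so -Tps could be left out here.
    groff -ms -Tps   document.ms > document.ps   || echo "PostScript output failed" >&2
    # HTML output for viewing in a browser.
    groff -ms -Thtml document.ms > document.html || echo "HTML output failed" >&2
    # Plain text output, useful for a quick look in the terminal.
    groff -ms -Tutf8 document.ms > document.txt  || echo "text output failed" >&2
fi
```

The plain text version is a quick way to check your markup without opening a viewer.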

At this point, you should have the skills necessary to start experimenting and expanding your Groff abilities. We've only scratched the surface. Groff supports many additional features such as math equations, chemical formulas, bibliographies, multi-column formatting, fonts, images/figures, and even sophisticated graphs. I hope to write more tutorials on how to use these features, but in the meantime, play around with Groff. It's a fun and versatile tool. Below I've included a short macro guide, an example MS document and resulting PDF, and further resources.

Macro guide

This is not comprehensive. For a full guide, visit the man page for the Groff MS macros.

$ man groff_ms

cover page

Macro     Function
.TL       Document Title
.AU       Document Author
.AI       Author Institution
.AB/.AE   Abstract Environment

headings

Macro    Function                     Argument(s)
.SH xx   Unnumbered Section Heading   Integer 1-99
.NH xx   Numbered Section Heading     Integer 1-99

paragraphs

Macro     Function
.PP       Indented Paragraph
.LP       Non-Indented Paragraph
.RS/.RE   Indented Environment

lists

Macro       Function       Argument(s)
.IP xx xx   Bullet Point   Bullet marker, width

highlighting

Macro        Function             Argument(s)
.I           Italicized Text      "String of text"
.B           Bold Text            "String of text"
.BI          Bold Italic Text     "String of text"
.UL          Underlined Text      "String of text"
.BX          Boxed Text           "String of text"
\*{text\*}   Superscripted Text   "String of text"
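As a quick illustration of the highlighting entries above that the tutorial did not cover, here is a sketch that uses .BI, .UL, .BX, and the superscript strings in one paragraph. The file name highlight.ms and its text are made up for the example.

```shell
# highlight.ms is a made-up file name; the text is placeholder content.
cat > highlight.ms <<'EOF'
.LP
Some text can be
.BI "bold and italic" ,
some can be
.UL underlined ,
and some can be
.BX boxed
for emphasis.
.LP
Superscripts wrap the raised text in a pair of strings: E = mc\*{2\*}.
EOF

# Compile to PDF only when groff is available on this system.
if command -v groff >/dev/null 2>&1; then
    groff -ms -Tpdf highlight.ms > highlight.pdf || echo "groff could not produce a PDF" >&2
fi
```

Unlike the macros, the superscript strings are written inline, in the middle of a line of text.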

Resources


  1. Man Pages: man groff and man groff_ms
  2. Example MS Document and Resulting PDF: document.ms document.pdf