Maintaining accurate and up-to-date documentation for complex software projects can be a daunting challenge. This is especially true for internal applications with evolving task pipelines, like the Go-based Kubernetes application we developed. In such environments, where multiple teams contribute and operate intricate tasks comprising several sequential steps, comprehensive and easily accessible documentation is paramount. Engineers need to quickly grasp how to configure, debug, and manage these tasks effectively.

The fundamental issue lies in the distance between documentation and source code. Traditional tools like Javadoc, Python docstrings, or GoDoc excel at generating API-level documentation directly from code comments. While invaluable for developers exploring a library’s interface, these tools often fall short in providing the broader, application-specific context crucial for operational understanding. They typically generate language-ecosystem-specific HTML pages with limited customizability. For example, the Javadoc for a GraphIndex class might show inheritance and method details but say nothing about how the class is used in a running application: its purpose is to document the API for other developers, not the behavior of an application.

This article addresses this gap by presenting a robust solution for generating rich, application-centric documentation directly from Go source code. Our approach combines the benefits of in-code documentation with the power of a static site generator, Antora, to create a cohesive, versioned, and easily navigable documentation portal. We’ll explore how to leverage Go’s Abstract Syntax Tree (AST) to extract meaningful information from comments and code structure, transform it into Markdown, convert it to AsciiDoc, and finally integrate it into a comprehensive Antora site.

Bridging the Gap: From Go Code to Actionable Documentation

To illustrate the challenge and our solution, consider a Go application managing diverse task pipelines. Each task type is backed by a TaskLogic implementation whose GetSteps method returns one or more TaskSteps, which are executed sequentially. For instance, we might have a DataProcessingTask and a ReminderTask.

type TaskLogic interface {
    GetSteps() []TaskStep
}

var TaskLogics = map[api.TaskType]TaskLogic{
    api.DataProcessingTask:  tasks.DataProcessingTaskLogic{},
    api.ReminderTask:        tasks.ReminderTaskLogic{},
}

/*
DataProcessingTaskLogic processes the provided data
in a secure and performant manner. It uses sophisticated
algorithms for maximum efficiency, while ensuring
the highest level of security through quantum-resistant encryption.
*/
type DataProcessingTaskLogic struct{}

/*
StartupStep performs the necessary startup actions,
such as initializing the database connection
and loading the configuration.
*/
type StartupStep struct{}

/*
RunStep performs the actual data processing.
*/
type RunStep struct{}

/*
CleanupStep performs the necessary cleanup actions,
such as closing the database connection and releasing resources.
*/
type CleanupStep struct{}

func (t DataProcessingTaskLogic) GetSteps() []ot.TaskStep {
    return []ot.TaskStep{
        StartupStep{},
        RunStep{},
        CleanupStep{},
    }
}

Now, imagine an engineer receives an alert about a failing DataProcessingTask, with the error pointing to the first step. Without direct insight into the task’s internal workings and step definitions, debugging becomes a time-consuming ordeal. Manual runbooks, while helpful initially, quickly become obsolete as the underlying code evolves. The ideal solution is dynamic documentation, generated directly from the source, reflecting the current task logic and its constituent steps.

Our goal is one web page per task that shows the task’s description followed by a list of its steps, each with its own description. To achieve this, we leverage Antora, a static site generator that supports AsciiDoc and excels at aggregating documentation from various repositories into a unified, versioned portal.

The documentation generation process unfolds in three main stages:

  1. Markdown Generation: We generate Markdown files for each task type, detailing its logic and steps. Markdown was chosen for its widespread adoption and ease of writing, though direct AsciiDoc generation is also feasible.
  2. Markdown to AsciiDoc Conversion: For seamless integration with Antora, these Markdown files are then converted into AsciiDoc format using a tool like pandoc. (This step can be skipped if AsciiDoc is generated directly).
  3. Antora Integration: Finally, the AsciiDoc files are integrated into Antora’s build process to generate the complete documentation site. (This step is optional if a static site isn’t required).

Let’s delve into the technical implementation of generating this Markdown documentation. We employ Mage as our build tool, organizing documentation-related files within a mage/docs directory. The generation process is initiated via a simple mage docs <output-directory> command.

// magefile.go
func Docs(outputDir string) error {
    return docs.GenerateDocs(outputDir)
}

Our Go-based documentation generator defines key types to represent the extracted information:

// docs/task_docs.go
type TypeIdent struct {
    Name    string
    PkgPath string
}

type TaskLogicType TypeIdent
type TaskStepType TypeIdent

type TaskDocs struct {
    TaskLogicDoc *string
    FileName     string
    StructName   string
    TaskSteps    []TaskStepType
}

type StepDoc struct {
    StructName       string
    FileName         string
    ShortDescription *string
    LongDescription  *string
}

The core GenerateDocs function orchestrates the entire process:

// docs/lib.go
func GenerateDocs(outputDir string) error {
    taskDocs, stepDocs := ExtractTaskAndStepDocs("./")
    markdownDocs := GenerateMarkdownDocs(taskDocs, stepDocs)
    if len(markdownDocs) == 0 {
        return fmt.Errorf("no task documentation generated; maybe the generator did not find the source files")
    }

    err := WriteMarkdownDocsToFiles(markdownDocs, outputDir)
    if err != nil {
        return fmt.Errorf("error writing task documentation to files: %w", err)
    }
    fmt.Println("Task documentation generated successfully to", outputDir)
    return nil
}

We utilize Go’s Abstract Syntax Tree (AST) through the go/ast and go/parser packages to programmatically analyze the source code. The filepath.WalkDir function iterates through all .go files, and for each file, ast.Inspect traverses its AST to identify type declarations. The buildTaskTypeLookups helper function, detailed at the end of the original article, creates the lookups from TaskLogic and TaskStep types to their task types, along with the ordered list of steps per task.

func ExtractTaskAndStepDocs(rootPrefix string) (map[api.TaskType]TaskDocs, map[TaskStepType]StepDoc) {
    taskLogicLookup, taskStepLookup, taskSteps := buildTaskTypeLookups(controllers.TaskLogics)
    taskDocs := make(map[api.TaskType]TaskDocs)
    stepDocs := make(map[TaskStepType]StepDoc)

    err := filepath.WalkDir(rootPrefix, func(path string, d os.DirEntry, err error) error {
        if err != nil {
            return err
        }

        if !d.IsDir() && strings.HasSuffix(d.Name(), ".go") {
            processFile(path, taskLogicLookup, taskStepLookup, taskDocs, taskSteps, stepDocs, rootPrefix)
        }

        return nil
    })

    if err != nil {
        log.Fatalf("Error walking through files: %v", err)
    }

    return taskDocs, stepDocs
}

func processFile(
    filePath string,
    taskLogicLookup map[TaskLogicType]api.TaskType,
    taskStepLookup map[TaskStepType]api.TaskType,
    taskDocs map[api.TaskType]TaskDocs,
    taskSteps map[api.TaskType][]TaskStepType,
    stepDocs map[TaskStepType]StepDoc,
    rootPrefix string,
) {
    fset := token.NewFileSet()

    node, err := parser.ParseFile(fset, filePath, nil, parser.ParseComments)
    if err != nil {
        log.Printf("Failed to parse file %s: %v", filePath, err)
        return
    }

    ast.Inspect(node, func(n ast.Node) bool {
        if genDecl, ok := n.(*ast.GenDecl); ok && genDecl.Tok == token.TYPE {
            processDeclaration(genDecl, filePath, taskLogicLookup, taskStepLookup, taskDocs, taskSteps, stepDocs, rootPrefix)
        }
        return true
    })
}
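The buildTaskTypeLookups helper is not reproduced in this section. As a rough illustration only, lookups of this shape could be built with reflection, deriving a name and package path from each registered value. Everything in the sketch below (identOf, buildLookups, plain strings instead of api.TaskType, the empty TaskStep interface) is a hypothetical stand-in and not the original helper.

```go
package main

import (
	"fmt"
	"reflect"
)

// TypeIdent mirrors the article's type: a struct name plus its package path.
type TypeIdent struct {
	Name    string
	PkgPath string
}

// TaskStep and TaskLogic are simplified stand-ins for the application's interfaces.
type TaskStep interface{}

type TaskLogic interface {
	GetSteps() []TaskStep
}

type StartupStep struct{}
type RunStep struct{}
type DataProcessingTaskLogic struct{}

func (DataProcessingTaskLogic) GetSteps() []TaskStep {
	return []TaskStep{StartupStep{}, RunStep{}}
}

// identOf derives a TypeIdent from a value's dynamic type.
func identOf(v any) TypeIdent {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Pointer {
		t = t.Elem() // registered as a pointer: describe the struct it points to
	}
	return TypeIdent{Name: t.Name(), PkgPath: t.PkgPath()}
}

// buildLookups is a hypothetical counterpart to buildTaskTypeLookups.
// Task types are plain strings here instead of api.TaskType.
func buildLookups(logics map[string]TaskLogic) (
	logicLookup map[TypeIdent]string,
	stepLookup map[TypeIdent]string,
	steps map[string][]TypeIdent,
) {
	logicLookup = make(map[TypeIdent]string)
	stepLookup = make(map[TypeIdent]string)
	steps = make(map[string][]TypeIdent)
	for taskType, logic := range logics {
		logicLookup[identOf(logic)] = taskType
		for _, step := range logic.GetSteps() {
			id := identOf(step)
			// A step shared by several tasks keeps only the last task seen.
			stepLookup[id] = taskType
			steps[taskType] = append(steps[taskType], id)
		}
	}
	return logicLookup, stepLookup, steps
}

func main() {
	_, _, steps := buildLookups(map[string]TaskLogic{
		"DataProcessingTask": DataProcessingTaskLogic{},
	})
	for i, id := range steps["DataProcessingTask"] {
		fmt.Printf("%d. %s (%s)\n", i+1, id.Name, id.PkgPath)
	}
}
```

The package path reported by reflection has to match whatever the AST side computes from file paths, otherwise the map lookups will silently miss.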

The processDeclaration function examines GenDecl nodes, distinguishing between TaskLogic and TaskStep structs.
* processTaskLogic: Extracts the entire comment associated with a TaskLogic struct and stores it in TaskDocs.
* processTaskStep: Extracts both a short description (the first paragraph of the comment) and a long description (the full comment) from a TaskStep struct’s comment, storing them in StepDoc. The filePathToPackagePath helper function derives the Go package path from the file path, so that each struct can be matched against the lookups by name and package.

func processDeclaration(
    genDecl *ast.GenDecl,
    filePath string,
    taskLogicLookup map[TaskLogicType]api.TaskType,
    taskStepLookup map[TaskStepType]api.TaskType,
    allTaskDocs map[api.TaskType]TaskDocs,
    allTaskSteps map[api.TaskType][]TaskStepType,
    stepDocs map[TaskStepType]StepDoc,
    rootPrefix string,
) {
    for _, spec := range genDecl.Specs {
        if typeSpec, ok := spec.(*ast.TypeSpec); ok {
            if _, ok := typeSpec.Type.(*ast.StructType); ok {
                structName := typeSpec.Name.Name
                typeIdent := TypeIdent{structName, filePathToPackagePath(filePath, rootPrefix, packagePrefix)}
                if taskName, exists := taskLogicLookup[TaskLogicType(typeIdent)]; exists {
                    processTaskLogic(genDecl, filePath, structName, taskName, allTaskDocs, allTaskSteps[taskName])
                }
                if _, exists := taskStepLookup[TaskStepType(typeIdent)]; exists {
                    processTaskStep(genDecl, filePath, structName, stepDocs, rootPrefix)
                }
            }
        }
    }
}

func processTaskLogic(genDecl *ast.GenDecl, filePath, structName string, taskName api.TaskType, taskDocs map[api.TaskType]TaskDocs, taskSteps []TaskStepType) {
    taskDocs[taskName] = TaskDocs{
        TaskLogicDoc: extractComment(genDecl),
        FileName:     filePath,
        StructName:   structName,
        TaskSteps:    taskSteps,
    }
}

func processTaskStep(
    genDecl *ast.GenDecl,
    filePath, structName string,
    stepDocs map[TaskStepType]StepDoc,
    rootPrefix string,
) {
    comment := extractComment(genDecl)
    stepDocs[TaskStepType{structName, filePathToPackagePath(filePath, rootPrefix, packagePrefix)}] = StepDoc{
        StructName:       structName,
        FileName:         filePath,
        ShortDescription: extractShortDescription(comment),
        LongDescription:  comment,
    }
}

func extractComment(genDecl *ast.GenDecl) *string {
    if genDecl.Doc != nil {
        trimmedDoc := strings.TrimSpace(genDecl.Doc.Text())
        return &trimmedDoc
    }
    return nil
}

func extractShortDescription(comment *string) *string {
    if comment == nil {
        return nil
    }
    paragraphs := strings.SplitN(*comment, "\n\n", 2)
    replacedFirstParagraph := strings.ReplaceAll(paragraphs[0], "\n", " ")
    shortDescription := strings.TrimSpace(replacedFirstParagraph)
    return &shortDescription
}

func filePathToPackagePath(filePath, rootPrefix, packagePrefix string) string {
    relativePath := strings.TrimPrefix(filePath, rootPrefix)
    dirPath := filepath.Dir(relativePath)
    packagePath := filepath.Join(packagePrefix, dirPath)
    packagePath = filepath.ToSlash(packagePath)
    return packagePath
}
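One detail of go/ast is worth keeping in mind when reading extractComment: for a standalone declaration such as `type StartupStep struct{}` the doc comment is attached to the GenDecl, whereas inside a parenthesized `type ( ... )` group it is attached to the individual TypeSpec. The following self-contained sketch parses source from a string and checks both places. The function names structDocs and firstParagraph and the sample source are made up for illustration; they are not part of the generator above.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// src is a hypothetical input file with one standalone and one grouped declaration.
const src = `package tasks

/*
StartupStep performs the necessary startup actions.

It initializes the database connection.
*/
type StartupStep struct{}

type (
	// RunStep performs the actual data processing.
	RunStep struct{}
)
`

// structDocs returns the doc comment of every struct type, keyed by type name.
func structDocs(source string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", source, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	docs := make(map[string]string)
	for _, decl := range file.Decls {
		genDecl, ok := decl.(*ast.GenDecl)
		if !ok || genDecl.Tok != token.TYPE {
			continue
		}
		for _, spec := range genDecl.Specs {
			typeSpec, ok := spec.(*ast.TypeSpec)
			if !ok {
				continue
			}
			if _, isStruct := typeSpec.Type.(*ast.StructType); !isStruct {
				continue
			}
			doc := typeSpec.Doc // set for specs inside a type ( ... ) group
			if doc == nil {
				doc = genDecl.Doc // set for standalone declarations
			}
			// Text() is safe to call on a nil comment group and returns "".
			docs[typeSpec.Name.Name] = strings.TrimSpace(doc.Text())
		}
	}
	return docs, nil
}

// firstParagraph returns the text before the first blank line, joined into one line.
func firstParagraph(comment string) string {
	paragraphs := strings.SplitN(comment, "\n\n", 2)
	return strings.TrimSpace(strings.ReplaceAll(paragraphs[0], "\n", " "))
}

func main() {
	docs, err := structDocs(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("short:", firstParagraph(docs["StartupStep"]))
	fmt.Println("grouped:", docs["RunStep"])
}
```

If the code base never uses grouped type declarations, reading only GenDecl.Doc is sufficient; the fallback simply makes the extraction robust to both styles.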

Once all task and step documentation is extracted, GenerateMarkdownDocs constructs a dedicated Markdown file for each task. Each file includes a description of the task logic, a link to its source code on GitHub, and a detailed list of its steps, incorporating their long descriptions directly into the task’s documentation for simplicity.

func GenerateMarkdownDocs(taskDocs map[api.TaskType]TaskDocs, stepDocs map[TaskStepType]StepDoc) map[api.TaskType]string {
    markdownDocs := make(map[api.TaskType]string)

    for taskType, docs := range taskDocs {
        var sb strings.Builder

        sb.WriteString(fmt.Sprintf("# %s\n\n", taskType))

        sb.WriteString("## Description\n\n")
        if docs.TaskLogicDoc != nil {
            sb.WriteString(fmt.Sprintf("%s\n", *docs.TaskLogicDoc))
        } else {
            sb.WriteString("No description provided.\n")
        }

        sb.WriteString(fmt.Sprintf("\nYou can find the [source code](%s/%s) on GitHub.\n\n", githubPrefix, docs.FileName))

        sb.WriteString("## Steps\n\n")
        if len(docs.TaskSteps) > 0 {
            for i, step := range docs.TaskSteps {
                sb.WriteString(fmt.Sprintf("### %d. %s\n\n", i+1, step.Name))
                stepDoc, exists := stepDocs[step]
                if exists && stepDoc.LongDescription != nil {
                    sb.WriteString(fmt.Sprintf("%s\n\n", *stepDoc.LongDescription))
                }
            }
        } else {
            sb.WriteString("No steps defined.\n")
        }

        markdownDocs[taskType] = sb.String()
    }

    return markdownDocs
}
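To make the shape of the generated page concrete, here is a condensed, self-contained variant of the rendering logic applied to made-up input. The step type, the renderTask function, and the source URL are hypothetical simplifications; the real generator works on TaskDocs and StepDoc as shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// step is a simplified stand-in for the article's StepDoc.
type step struct {
	Name        string
	Description string
}

// renderTask builds the Markdown page for a single task:
// a title, a description, a source link, and a numbered list of steps.
func renderTask(taskType, description, sourceURL string, steps []step) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "# %s\n\n## Description\n\n%s\n\n", taskType, description)
	fmt.Fprintf(&sb, "You can find the [source code](%s) on GitHub.\n\n## Steps\n\n", sourceURL)
	for i, s := range steps {
		fmt.Fprintf(&sb, "### %d. %s\n\n%s\n\n", i+1, s.Name, s.Description)
	}
	return sb.String()
}

func main() {
	page := renderTask(
		"DataProcessingTask",
		"DataProcessingTaskLogic processes the provided data.",
		// Placeholder URL, not a real repository.
		"https://github.com/yourorg/yourrepo/blob/main/tasks/data_processing.go",
		[]step{
			{"StartupStep", "StartupStep performs the necessary startup actions."},
			{"RunStep", "RunStep performs the actual data processing."},
		},
	)
	fmt.Print(page)
}
```

Writing into the strings.Builder with fmt.Fprintf avoids the intermediate string that sb.WriteString(fmt.Sprintf(...)) allocates; the output is the same.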

Finally, WriteMarkdownDocsToFiles saves these generated Markdown files to the specified output directory, naming each file after its respective task type (e.g., DataProcessingTask.md).

func WriteMarkdownDocsToFiles(markdownDocs map[api.TaskType]string, outputDir string) error {
    // Create the output directory if it does not exist yet (e.g. on a fresh CI checkout).
    if err := os.MkdirAll(outputDir, 0755); err != nil {
        return fmt.Errorf("failed to create output directory %s: %w", outputDir, err)
    }

    for taskType, content := range markdownDocs {
        fileName := fmt.Sprintf("%s.md", taskType)
        filePath := filepath.Join(outputDir, fileName)

        if err := os.WriteFile(filePath, []byte(content), 0644); err != nil {
            return fmt.Errorf("failed to write file %s: %w", filePath, err)
        }
        fmt.Printf("Wrote file %s\n", filePath)
    }

    return nil
}

Seamless Integration: Markdown to Antora with Pandoc

To bridge the gap between our generated Markdown and Antora’s AsciiDoc-centric environment, we employ Pandoc, a universal document converter. A bash script, integrate_go_docs.sh, automates this conversion process and prepares the necessary navigation structure for Antora.

This script performs the following actions:

  1. Iterates Markdown Files: It traverses the directory containing the generated Markdown task documentation.
  2. Converts to AsciiDoc: For each Markdown file, Pandoc is invoked with specific options (-s -f markdown --shift-heading-level-by -1 --wrap=none -t asciidoc) to convert it into a standalone AsciiDoc document. The --shift-heading-level-by -1 option is crucial for correctly mapping Markdown’s H1 (#) to AsciiDoc’s title (=).
  3. Generates Navigation Snippet: Simultaneously, the script appends an Antora xref entry to a navigation partial file (e.g., partials/task-operator-nav.adoc). This ensures that each newly generated AsciiDoc task page is automatically linked within the Antora site’s navigation menu.

#!/bin/bash
# integrate_go_docs.sh

if [ "$#" -ne 2 ]; then
  echo "Usage: $0 <md_dir> <pages_dir>"
  exit 1
fi

if ! command -v pandoc &> /dev/null
then
  echo "Error: pandoc is not installed. Please install pandoc to continue."
  exit 1
fi

md_dir=$1
pages_dir=$2

tasks_dir="components/operator/tasks"
nav_file="$pages_dir/../partials/task-operator-nav.adoc"

echo "// This file has been generated on $(date)" > "$nav_file"

# Convert task docs
for md_file in "$md_dir"/*.md
do
  adoc_dir="$pages_dir/$tasks_dir"
  adoc_file=$(basename "${md_file%.md}.adoc")
  adoc_path="$adoc_dir/$adoc_file"
  echo "Integrating $md_file"
  # -s to create a standalone document, including the title (=)
  # --shift-heading-level-by -1 to convert the markdown h1 (#) to asciidoc title (=)
  # See https://github.com/jgm/pandoc/issues/5615
  #
  # --wrap=none to avoid wrapping lines, causing long headlines to be broken
  # See https://github.com/jgm/pandoc/issues/3277#issuecomment-264706794
  pandoc -s -f markdown --shift-heading-level-by "-1" --wrap=none -t asciidoc -o "$adoc_path" "$md_file"
  echo "* xref:$tasks_dir/$adoc_file[]" >> "$nav_file"
done
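If a Bash dependency is undesirable, the two pieces of logic in this script (the pandoc argument list and the xref line) are small enough to express in Go, for instance from within the Mage target. The sketch below only builds the arguments and the navigation entry; it does not invoke pandoc. The function names pandocArgs, adocName, and navEntry are hypothetical, and the flags are copied from the script above.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// pandocArgs returns the same arguments the shell script passes to pandoc for one file.
func pandocArgs(mdFile, adocPath string) []string {
	return []string{
		"-s", // standalone document, including the title
		"-f", "markdown",
		"--shift-heading-level-by", "-1", // Markdown h1 becomes the AsciiDoc title
		"--wrap=none", // keep long headlines on one line
		"-t", "asciidoc",
		"-o", adocPath,
		mdFile,
	}
}

// adocName maps "dir/DataProcessingTask.md" to "DataProcessingTask.adoc".
func adocName(mdFile string) string {
	return strings.TrimSuffix(filepath.Base(mdFile), ".md") + ".adoc"
}

// navEntry builds the Antora xref line that the script appends to the nav partial.
func navEntry(tasksDir, adocFile string) string {
	return fmt.Sprintf("* xref:%s/%s[]", tasksDir, adocFile)
}

func main() {
	md := "docs/generated/DataProcessingTask.md"
	adoc := adocName(md)
	args := pandocArgs(md, filepath.Join("modules/ROOT/pages/components/operator/tasks", adoc))
	// With pandoc installed, exec.Command("pandoc", args...).Run() would perform the conversion.
	fmt.Println("pandoc", strings.Join(args, " "))
	fmt.Println(navEntry("components/operator/tasks", adoc))
}
```

Keeping the conversion in Go would also remove the need for the separate npm script, at the cost of the generator job requiring pandoc.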

Automated Deployment: Building the Antora Documentation Site

Our final step involves integrating these AsciiDoc files into the Antora documentation site and automating the entire pipeline using GitHub Actions.

The Antora site itself is built using the npm ecosystem. Our package.json defines two key scripts:
* generate: Executes the Antora site generation process using a specified playbook (playbooks/generate.yml).
* go:pandoc: Invokes our integrate_go_docs.sh script, directing it to process the generated Markdown files and place the resulting AsciiDoc content within the Antora module’s pages directory.

{
  "name": "go-antora-docs",
  "scripts": {
    "generate": "antora --stacktrace --fetch --clean playbooks/generate.yml",
    "go:pandoc": "./integrate_go_docs.sh ../docs/generated ./modules/ROOT/pages"
  },
  "repository": {
    "type": "git",
    "url": "git+https://github.com/yourorg/yourrepo.git"
  },
  "dependencies": {
    "@antora/cli": "^3.1.7",
    "@antora/lunr-extension": "^1.0.0-alpha.8",
    "@antora/site-generator-default": "^3.1.8",
    "@redocly/cli": "^1.25.11",
    "antora": "^3.1.8",
    "http-server": "^14.1.1"
  }
}

To ensure a consistent and automated documentation build, we implement a GitHub Actions workflow:

  1. Generate Go Docs (Job generate-go-docs):
    • Checks out the repository.
    • Sets up the Go environment.
    • Installs mage.
    • Runs mage docs docs/generated to generate the Markdown documentation.
    • Uploads the generated Markdown files as an artifact (go-docs).
  2. Compile Docs (Job compile-docs):
    • Depends on the generate-go-docs job to ensure Markdown files are available.
    • Checks out the repository.
    • Sets up Node.js.
    • Installs npm dependencies for the Antora site.
    • Installs pandoc (a prerequisite for Markdown to AsciiDoc conversion).
    • Downloads the go-docs artifact.
    • Executes npm run "go:pandoc" to convert Markdown to AsciiDoc and update the Antora navigation.
    • Runs npm run generate to build the complete Antora site.
    • Copies the index.html file into the build directory.
    • Uploads the final Antora site as an artifact (antora).

jobs:
  generate-go-docs:
    runs-on: ubuntu-24.04
    env:
      GOPATH: /home/runner/go
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.24.2'
      - name: Install Mage
        uses: magefile/mage-action@v3
        with:
          install-only: true
          version: "v1.15.0"
      - name: Generate Docs
        run: mage docs docs/generated
      - name: Upload Docs
        uses: actions/upload-artifact@v4
        with:
          name: go-docs
          path: docs/generated
  compile-docs:
    runs-on: ubuntu-24.04
    needs:
      - generate-go-docs
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
      - name: Install Dependencies
        working-directory: site
        run: npm ci
      - name: Install Pandoc
        working-directory: site
        run: |
          sudo apt-get update
          sudo apt-get install -y pandoc
      - name: Download Operator Docs
        uses: actions/download-artifact@v4
        with:
          name: go-docs
          path: docs/generated
      - name: Generate Operator Pandoc Page
        working-directory: site
        run: npm run "go:pandoc"
      - name: Generate Antora Page
        working-directory: site
        run: npm run generate
      - name: Copy index HTML
        working-directory: site
        run: cp index.html playbooks/build/site
      - name: Upload antora
        uses: actions/upload-artifact@v4
        with:
          name: antora
          path: site/playbooks/build/site

This robust CI/CD pipeline ensures that any changes to the Go application’s task logic or steps are automatically reflected in the comprehensive, versioned Antora documentation site, maintaining accuracy and reducing manual effort.

Conclusion: Empowering Engineers with Dynamic Documentation

This article has demonstrated a powerful and efficient method for generating comprehensive, application-specific documentation directly from Go source code and integrating it seamlessly into an Antora-powered static site. By leveraging Go’s AST, we can extract vital information from code comments and structure, transforming it into easily consumable Markdown and then AsciiDoc.

This approach offers significant advantages:

  • Accuracy and Freshness: Documentation is regenerated from the source on every pipeline run, so the task and step listings track the codebase instead of drifting out of date.
  • Application-Centric View: Beyond mere API descriptions, the generated documentation focuses on the operational behavior and step-by-step execution of tasks, providing invaluable context for engineers.
  • Enhanced Debugging and Operations: Clear, up-to-date documentation helps engineers quickly understand, configure, debug, and operate complex task pipelines, which can shorten the time needed to resolve incidents.
  • Scalability and Consistency: The automated pipeline ensures consistent documentation across a growing codebase and diverse teams.
  • Flexibility: The underlying principles can be readily adapted to other programming languages, documentation formats, and static site generators, making this a versatile solution for various development environments.

By investing in automated documentation generation, organizations can foster a culture of clarity and efficiency, enabling their engineering teams to operate with greater confidence and productivity.
