Migrate from Bazel
commit
016dbd0814
59 changed files with 7044 additions and 0 deletions
|
|
@ -0,0 +1,501 @@
+++
template = "article.html"
title = "Writing a Bazel rule set"
date = 2020-05-16T15:55:00+11:00
description = "Learn how to write custom Bazel rules by integrating PlantUML, including rule implementation and testing strategies."

[taxonomies]
tags = ["bazel", "plantuml"]
+++

This post will cover two things:

- How to run an arbitrary tool with Bazel (in this case,
  [PlantUML](https://plantuml.com/), a tool to generate diagrams), by writing a
  rule set
- How to test this rule set.

It should be mentioned that while I was working on this rule set, it became more
and more apparent that PlantUML is not a great candidate for this kind of
integration, as its output is platform-dependent (because of font rendering).
Despite that, it's still a simple tool and as such its integration is simple,
albeit not perfect (the rendering tests I wrote need to run on the same platform
every time).

## PlantUML usage

PlantUML is a tool that takes a text input looking like this:

```
@startuml
Alice -> Bob: SYN
@enduml
```

And outputs an image looking like this:

{% mermaid(caption="PlantUML sample output") %}
sequenceDiagram
    Alice->>Bob: SYN
{% end %}

PlantUML has multiple ways of being invoked (CLI, GUI, as well as a _lot_ of
integrations with different tools), but we'll go with the easiest: a one-shot
CLI invocation. It takes as inputs:

- A text file, representing a diagram
- An optional configuration file, giving control over the output

It then outputs a single image file, which can be of different formats (we'll
just cover SVG and PNG in this article, but adding support for other formats is
trivial).

PlantUML ships as a JAR file, which needs to be run with Java. An invocation
generating the sample image above would look like this:

```bash
java -jar plantuml.jar -tpng -p < 'mysource.puml' > 'dir/myoutput.png'
```

Pretty straightforward: run the JAR with a single option for the image type,
pipe in the content of the input file and get the output file back. The `-p`
flag is the short form of `-pipe`. We use it because piping is the only way of
properly controlling the output path (without it, PlantUML tries to be smart and
places the output next to the input).

With a configuration file:

```bash
java -jar plantuml.jar -tpng -config config.puml -p < 'mysource.puml' > 'dir/myoutput.png'
```
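The shell redirections above map directly onto process pipes. As a rough illustration outside Bazel, the sketch below drives the same invocation from plain Python with `subprocess`. The `plantuml_argv` and `render` helper names are made up for this example, and it assumes `java` is on the `PATH` and that the JAR path passed in exists. Because the arguments are passed as a list, no shell quoting is involved.

```python
import subprocess
from typing import List, Optional


def plantuml_argv(jar: str, fmt: str, config: Optional[str] = None) -> List[str]:
    """Build the argument list for a one-shot, piped PlantUML run."""
    argv = ["java", "-jar", jar, "-t" + fmt]
    if config:
        argv += ["-config", config]
    # -p (short for -pipe): read the diagram on stdin, write the image to stdout.
    argv.append("-p")
    return argv


def render(jar: str, src: str, out: str, fmt: str = "png",
           config: Optional[str] = None) -> None:
    """Equivalent of `java -jar ... -p < src > out`, without a shell."""
    with open(src, "rb") as stdin, open(out, "wb") as stdout:
        subprocess.run(plantuml_argv(jar, fmt, config),
                       stdin=stdin, stdout=stdout, check=True)
```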

Simple enough, right? Well, not really. PlantUML actually embeds some metadata
in the files it generates. For example, when generating an SVG:

```svg
<!-- The actual SVG image has been omitted, as this part is deterministic and
     pretty long. -->

<svg><g>
<!--MD5=[8d4298e8c40046c92682b92efe1f786e]
@startuml
Alice -> Bob: SYN
@enduml

PlantUML version 1.2020.07(Sun Apr 19 21:42:40 AEST 2020)
(GPL source distribution)
Java Runtime: OpenJDK Runtime Environment
JVM: OpenJDK 64-Bit Server VM
Java Version: 11.0.6+10
Operating System: Linux
Default Encoding: UTF-8
Language: en
Country: AU
--></g></svg>
```

This makes PlantUML non-hermetic by default (in addition to the font issue
mentioned earlier). While PlantUML has a simple way of working around that (in
the form of a `-nometadata` flag), this is something to keep in mind when
integrating a tool with Bazel: is this tool usable in a hermetic way? If not,
how can the impact of this non-hermeticity be minimised?

From there, here is the invocation we'll work with:

```bash
java -jar plantuml.jar -tpng -nometadata -config config.puml \
    -p < 'mysource.puml' > 'dir/myoutput.png'
```
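One quick way to check whether an invocation like this is actually reproducible is to run it several times and compare digests of the output. The sketch below is plain Python rather than anything Bazel provides, and `is_reproducible` is a hypothetical helper: `generate` stands for any callable that runs the tool and returns the output bytes. It only detects differences between runs on one machine, so it says nothing about cross-platform differences such as fonts.

```python
import hashlib
from typing import Callable


def digest(data: bytes) -> str:
    """SHA-256 of the generated file, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_reproducible(generate: Callable[[], bytes], runs: int = 3) -> bool:
    """Return True when every run of `generate` yields byte-identical output."""
    digests = {digest(generate()) for _ in range(runs)}
    # A single distinct digest means all runs produced the same bytes.
    return len(digests) == 1
```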

## Getting PlantUML

PlantUML is a Java application, available as a JAR on Maven. As such, it can be
fetched with the help of
[rules_jvm_external](https://github.com/bazelbuild/rules_jvm_external/), as was
explained in
[a previous article](@/posts/creating-a-blog-with-bazel/02-compiling-a-kotlin-application-with-bazel/index.md#dependencies).
The Maven rules will expose the JAR as a library, but we need a binary to be
able to run it. In e.g. `//third_party/plantuml/BUILD`:

```python
load("@rules_java//java:defs.bzl", "java_binary")

java_binary(
    name = "plantuml",
    main_class = "net.sourceforge.plantuml.Run",
    visibility = ["//visibility:public"],
    runtime_deps = [
        "@maven//:net_sourceforge_plantuml_plantuml",
    ],
)
```
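The label used in `runtime_deps` is derived from the Maven coordinates. To my understanding, rules_jvm_external's default naming drops the version and replaces `.`, `-` and `:` in `group:artifact` with underscores. The plain-Python sketch below only illustrates that convention; `maven_label` is a hypothetical helper, not part of rules_jvm_external.

```python
import re


def maven_label(coordinate: str, repo: str = "maven") -> str:
    """Map `group:artifact[:version]` to the label exposed by the Maven repo."""
    # Only the group and artifact take part in the name; the version is dropped.
    group, artifact = coordinate.split(":")[:2]
    name = re.sub(r"[.\-:]", "_", "%s:%s" % (group, artifact))
    return "@%s//:%s" % (repo, name)


print(maven_label("net.sourceforge.plantuml:plantuml"))
```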

From there, we can use `//third_party/plantuml` like any other Bazel binary
target - we can run it with `bazel run`, and we can pass it as a tool for rule
actions.

This is a pattern that works well for any JVM-based tool. Other kinds of tools
will need a different preparation step to make them available through Bazel -
but as long as you can get a binary, you should be good.

## Rule set structure

This rule set will follow the same structure we previously used for
[Ktlint](@/posts/creating-a-blog-with-bazel/02-compiling-a-kotlin-application-with-bazel/index.md#ktlint):

- Based in `//tools/plantuml`
- A public interface exposed in `//tools/plantuml/defs.bzl`
- Internal actions definition in `//tools/plantuml/internal/actions.bzl`
- Internal rule definition in `//tools/plantuml/internal/rules.bzl`

But in addition:

- Tests for the actions in `//tools/plantuml/internal/actions_test.bzl`
- Integration tests in `//tools/plantuml/tests`

Let's start by defining our actions.

## Actions

### Implementation

We need only one action for our rule: one that takes a source file, an optional
configuration file, the PlantUML binary, and emits the output file by calling
PlantUML. Let's assume for a moment we have a helper function which, given the
proper input, returns the PlantUML command line to call, called
`plantuml_command_line`, and write the action from there:

```python
def plantuml_generate(ctx, src, format, config, out):
    """Generates a single PlantUML graph from a puml file.

    Args:
        ctx: analysis context.
        src: source file to be read.
        format: the output image format.
        config: the configuration file. Optional.
        out: output image file.
    """
    command = plantuml_command_line(
        executable = ctx.executable._plantuml_tool.path,
        config = config.path if config else None,
        src = src.path,
        output = out.path,
        output_format = format,
    )

    inputs = [src]

    if config:
        inputs.append(config)

    ctx.actions.run_shell(
        outputs = [out],
        inputs = inputs,
        tools = [ctx.executable._plantuml_tool],
        command = command,
        mnemonic = "PlantUML",
        progress_message = "Generating %s" % out.basename,
    )
```

This is pretty straightforward: we generate the command line, passing the path
of each file (or `None` for the configuration file if it's not provided, since
it's optional), as well as the requested image format. We declare both our
source file and our configuration file as inputs, and PlantUML as a required
tool.

Now let's implement our helper function. Here again, it's really
straightforward: it gets a bunch of paths as input, and needs to generate a
command line call (in the form of a simple string) from them:

```python
load("@bazel_skylib//lib:shell.bzl", "shell")

def plantuml_command_line(executable, config, src, output, output_format):
    """Formats the command line to call PlantUML with the given arguments.

    Args:
        executable: path to the PlantUML binary.
        config: path to the configuration file. Optional.
        src: path to the source file.
        output: path to the output file.
        output_format: image format of the output file.

    Returns:
        A command to invoke PlantUML
    """

    # Each fragment carries its own leading space, so the result is separated
    # by single spaces whether or not a configuration file is given.
    command = "%s -nometadata -p -t%s" % (
        shell.quote(executable),
        output_format,
    )

    if config:
        command += " -config %s" % shell.quote(config)

    command += " < %s > %s" % (
        shell.quote(src),
        shell.quote(output),
    )

    return command
```

An interesting note is that because PlantUML is already integrated as an
executable Bazel target, we don't care whether it's a JAR, a C++ binary or a
shell script: Bazel knows exactly what this executable is made of, how to
prepare (e.g. compile) it if necessary, its runtime dependencies (in this case,
a JRE) and, more importantly in this context, how to run it. We can treat our
tool target as a single executable file, and run it as such just from its path.
Bazel will automatically make sure to provide us with everything we need. (For
more details: the target actually points to a shell script generated by Bazel,
through the Java rules, which in the case of a `java_binary` target is
responsible for defining the classpath, among other things. The JAR file is
merely a dependency of this shell script, and as such is provided as a runtime
dependency.)

Writing this as a helper function rather than directly in the action definition
serves two purposes: not only does it make the whole thing slightly easier to
read, but this function, which contains the logic (even though in this case it's
really simple), is easily testable: it takes only strings as arguments, and
returns a string. It's also a pure function: it doesn't have any side effects,
and as such it will always return the same output given the same set of inputs.
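Because the helper only manipulates strings, its behaviour is easy to reproduce outside Bazel. The sketch below is a plain-Python analogue, not the Starlark code itself: `quote` is a stand-in written for this example that mimics the single-quote wrapping Skylib's `shell.quote` is assumed to perform. It makes it easy to see what happens with awkward paths.

```python
def quote(s: str) -> str:
    """Single-quote `s` for a POSIX shell, escaping embedded single quotes."""
    return "'" + s.replace("'", "'\\''") + "'"


def plantuml_command_line(executable, config, src, output, output_format):
    """Plain-Python analogue of the Starlark helper: strings in, string out."""
    command = "%s -nometadata -p -t%s" % (quote(executable), output_format)
    if config:
        command += " -config %s" % quote(config)
    command += " < %s > %s" % (quote(src), quote(output))
    return command


# A source path containing a space stays a single shell word once quoted.
print(plantuml_command_line("/bin/plantuml", None, "my diagram.puml", "out.png", "png"))
```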

### Tests

To test Starlark functions like this one, Bazel's
[Skylib](https://github.com/bazelbuild/bazel-skylib) provides a test framework
which, while requiring a bit of boilerplate, is pretty simple to use. In this
specific case, we only have two different cases to test: with and without
configuration file provided. Error cases should be unreachable due to the way
the rule will be defined: Bazel will be responsible for enforcing the presence
of an executable target for PlantUML's binary, a valid image format... Let's see
how that works. In `//tools/plantuml/internal/actions_test.bzl`:

```python
"""Unit tests for PlantUML action"""

load("@bazel_skylib//lib:unittest.bzl", "asserts", "unittest")
load(":actions.bzl", "plantuml_command_line")

def _no_config_impl(ctx):
    env = unittest.begin(ctx)
    asserts.equals(
        env,
        "'/bin/plantuml' -nometadata -p -tpng < 'mysource.puml' > 'dir/myoutput.png'",
        plantuml_command_line(
            executable = "/bin/plantuml",
            config = None,
            src = "mysource.puml",
            output = "dir/myoutput.png",
            output_format = "png",
        ),
    )
    return unittest.end(env)

no_config_test = unittest.make(_no_config_impl)

def _with_config_impl(ctx):
    env = unittest.begin(ctx)
    asserts.equals(
        env,
        "'/bin/plantuml' -nometadata -p -tpng -config 'myskin.skin' < 'mysource.puml' > 'dir/myoutput.png'",
        plantuml_command_line(
            executable = "/bin/plantuml",
            config = "myskin.skin",
            src = "mysource.puml",
            output = "dir/myoutput.png",
            output_format = "png",
        ),
    )
    return unittest.end(env)

with_config_test = unittest.make(_with_config_impl)

def actions_test_suite():
    unittest.suite(
        "actions_tests",
        no_config_test,
        with_config_test,
    )
```

First, we define two functions, which are the actual test logic:
`_no_config_impl` and `_with_config_impl`. Their content is pretty simple: we
start a unit test environment, invoke the function under test and assert that
the result is indeed what we expected, then close the unit test environment. The
return value is needed by the test framework, as it carries which assertions
passed or failed.

Next, we declare those two functions as actual unit tests, wrapping them with a
call to `unittest.make`. We can then add those two tests to a test suite, which
is what actually generates the test targets when invoked. This means that the
`actions_test_suite` macro needs to be invoked in the `BUILD` file:

```python
load(":actions_test.bzl", "actions_test_suite")

actions_test_suite()
```

We can now run our tests, and everything should pass:

```bash
$ bazel test //tools/plantuml/internal:actions_tests
INFO: Invocation ID: 112bd049-7398-4b23-b62b-1398e9731eb7
INFO: Analyzed 2 targets (5 packages loaded, 927 targets configured).
INFO: Found 2 test targets...
INFO: Elapsed time: 0.238s, Critical Path: 0.00s
INFO: 0 processes.
//tools/plantuml/internal:actions_tests_test_0 PASSED in 0.4s
//tools/plantuml/internal:actions_tests_test_1 PASSED in 0.3s

Executed 0 out of 2 tests: 2 tests pass.
INFO: Build completed successfully, 1 total action
```

## Rules definition

As with the actions, we only have one rule to define here. Let's call it
`plantuml_graph()`. It needs our usual set of inputs, and outputs a single file,
whose name will be `{target_name}.{image_format}`. It's also where we define the
set of acceptable image formats, the fact that the input file is mandatory but
the configuration file optional, and the actual executable target to use for
PlantUML. The only thing the implementation actually does is, as expected, call
our `plantuml_generate` action defined above.

```python
load(
    ":actions.bzl",
    "plantuml_generate",
)

def _plantuml_graph_impl(ctx):
    output = ctx.actions.declare_file("{name}.{format}".format(
        name = ctx.label.name,
        format = ctx.attr.format,
    ))
    plantuml_generate(
        ctx,
        src = ctx.file.src,
        format = ctx.attr.format,
        config = ctx.file.config,
        out = output,
    )

    return [DefaultInfo(
        files = depset([output]),
    )]

plantuml_graph = rule(
    _plantuml_graph_impl,
    attrs = {
        "config": attr.label(
            doc = "Configuration file to pass to PlantUML. Useful to tweak the skin",
            allow_single_file = True,
        ),
        "format": attr.string(
            doc = "Output image format",
            default = "png",
            values = ["png", "svg"],
        ),
        "src": attr.label(
            allow_single_file = [".puml"],
            doc = "Source file to generate the graph from",
            mandatory = True,
        ),
        "_plantuml_tool": attr.label(
            default = "//third_party/plantuml",
            executable = True,
            # Build the tool for the execution platform ("host" is the legacy
            # spelling of this setting).
            cfg = "exec",
        ),
    },
    outputs = {
        "graph": "%{name}.%{format}",
    },
    doc = "Generates a PlantUML graph from a puml file",
)
```
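To make the attributes concrete, here is what a consumer `BUILD` file could look like. This is only a sketch: the package and the `handshake.puml` and `theme.puml` file names are hypothetical, and the rule is loaded through the `//tools/plantuml/defs.bzl` entry point listed in the rule set structure. Building the first target would produce `handshake.svg`; the second one relies on the default `png` format.

```python
# BUILD file of a hypothetical package holding diagrams.
load("//tools/plantuml:defs.bzl", "plantuml_graph")

# Produces handshake.svg from handshake.puml.
plantuml_graph(
    name = "handshake",
    src = "handshake.puml",
    format = "svg",
)

# `config` is optional, and `format` falls back to "png" when omitted.
plantuml_graph(
    name = "handshake_themed",
    src = "handshake.puml",
    config = "theme.puml",
)
```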

## Public interface

As we only have a single rule, and nothing else specific to do, the public
interface is dead simple:

```python
load("//tools/plantuml/internal:rules.bzl", _plantuml_graph = "plantuml_graph")

plantuml_graph = _plantuml_graph
```

You might then be wondering: why is this useful, and why shouldn't I just import
the rule definition from `//tools/plantuml/internal:rules.bzl` directly? Having
this kind of public interface allows you to tweak the actual rule definition
without breaking any consumer site, as long as you respect the public interface.
You can also add features to every consumer site in a really simple way. Let's
imagine for example that you have a `view_image` rule which, given an image
file, generates a script to view it. You could then transform your public
interface like this:

```python
load("//tools/plantuml/internal:rules.bzl", _plantuml_graph = "plantuml_graph")
load("//tools/utils:defs.bzl", _view_image = "view_image")

# The defaults mirror the rule's own, so existing call sites that omit
# `config` or `format` keep working.
def plantuml_graph(name, src, config = None, format = "png"):
    _plantuml_graph(
        name = name,
        src = src,
        config = config,
        format = format,
    )

    _view_image(
        name = "%s.view" % name,
        src = ":%s.%s" % (name, format),
    )
```

And suddenly, all your PlantUML graphs have an implicit `.view` target defined
automatically, allowing you to see the output directly without having to dig in
Bazel's output directories.

A set of Bazel rules for LaTeX actually provides such a feature to view the PDF
output: they have a
[`view_pdf.sh` script](https://github.com/ProdriveTechnologies/bazel-latex/blob/master/view_pdf.sh),
used by their main
[`latex_document` macro](https://github.com/ProdriveTechnologies/bazel-latex/blob/master/latex.bzl#L45).

## Further testing

For a rule this simple, I took just one further step: having a few reference
PlantUML graphs, as well as their expected rendered output, which I compare
through Phosphorus, a really simple tool I wrote to help compare two images,
covered in the previous article (I told you it would be useful!). But for more
complex cases, Skylib offers more utilities like an
[analysis test](https://github.com/bazelbuild/bazel-skylib/blob/master/docs/analysis_test_doc.md),
and a
[build test](https://github.com/bazelbuild/bazel-skylib/blob/master/docs/build_test_doc.md).
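The comparison step of such a reference test can be very small. Phosphorus compares images; the sketch below swaps in a plain byte-for-byte comparison against a reference file, which is only meaningful when the rendering is fully deterministic (same platform, `-nometadata`). `assert_matches_golden` is a hypothetical name, and this is plain Python rather than a Bazel rule.

```python
import filecmp
import os


def assert_matches_golden(actual: str, golden: str) -> None:
    """Raise AssertionError unless `actual` is byte-identical to `golden`."""
    # shallow=False forces a content comparison instead of trusting os.stat().
    if not filecmp.cmp(actual, golden, shallow=False):
        raise AssertionError(
            "%s (%d bytes) differs from reference %s (%d bytes)" % (
                actual,
                os.path.getsize(actual),
                golden,
                os.path.getsize(golden),
            )
        )
```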

## Closing thoughts

While writing this kind of tool might look like a lot of work, it's actually
pretty mechanical for a lot of cases. I worked on a few others like
[markdownlint](https://github.com/igorshubovych/markdownlint-cli), which now
runs on all my Markdown files as regular Bazel test targets, or
[pngcrush](https://pmt.sourceforge.io/pngcrush/), which is run on the PNG files
hosted on this blog. In a monorepo, writing such a rule is the kind of task that
you do once, and it just keeps on giving - you can easily compose different
rules with a main use-case, with a bunch of test targets generated for virtually
free.

On another note, I'm aware that having all this in a public repository would
make things much simpler to follow. Sadly, it's part of a larger mono-repository
which makes open-sourcing only the relevant parts tricky. Dumping a snapshot
somewhere would be an option, but I'd rather have an actual living repository.

Now that we have all the tools we need (that was kind of convoluted, I'll give
you that), there are only two steps left to cover:

- Generating the actual blog (ironically enough, this will be a really quick
  step, despite being the only really important one)
- Managing the deployment.

We're getting there!