Jump flooding with bgfx

A practical example of implementing a jump flooding algorithm using the bgfx rendering library.

I undertook this project in order to learn more about working with GPUs, and as such I make no claims about the quality of my solution. For a more in-depth look at the advantages and tradeoffs of this algorithm with respect to graphics development, I highly recommend checking out this blog post by Ben Golus. He put in far more leg-work than I did, and that post (and his references) were my primary resource for my implementation. I also strongly recommend this research paper on the algorithm by Guodong Rong and Tiow-Seng Tan, which concretely defines and explores the jump flooding algorithm’s approximation of a distance field compared to a “pure” approach.

Working with bgfx

Before “jumping” (🤠) into the algorithm, I wanted to make note of a few pitfalls I encountered while working with bgfx, primarily with the shaderc utility.

Using shaderc

Shaderc is the shader compiler utility for bgfx, not to be confused with Google’s library of the same name. On the whole you can write shader code for bgfx as if it were GLSL, with a few differences noted in their documentation. Assuming you’re working in Visual Studio, make sure you pass --with-tools to the project build command to get shaderc in the mix - that param is mentioned as a footnote in bgfx’s build documentation and it caused me some heartburn.

After creating or updating a shader file, your shaders need to be recompiled. If you’re clever/patient/interested you can figure out how to use shaderc directly, but for my project I opted to leverage a makefile script in bgfx.

//makefile
BGFX_DIR=../bgfx
include $(BGFX_DIR)/scripts/shader.mk
SHADERC:="$(BGFX_DIR)/.build/win64_vs2017/bin/shadercDebug.exe"

go:
	@make -s --no-print-directory TARGET=1 rebuild

Note that i’ve hard-coded TARGET=1 as a parameter to my make target, which tells it to only compile d3d11 - I did not write this project with portability in mind. With this makefile I would simply execute make go after I made changes to my shaders. You can also just use the shader.mk commands directly, e.g. make rebuild TARGET=1. Something to keep in mind when using this makefile is that it specifically searches for fs_, vs_, and cs_ prefixes on filenames - if your shader source file doesn’t have one of those prefixes, then it will not be compiled!

varying.def.sc

This is noted in the shaderc documentation I linked above, but I’m not the first to find it mysterious and underdocumented, and it is the most immediate difference between writing bgfx shaders and GLSL. Essentially, it is a separate file from your shaders that is used to declare all of your semantics for all of your shaders, which you can then reference in your shaders using the $input and $output tokens.

//varying.def.sc
vec4 v_color0    : COLOR0;
vec2 v_texcoord0 : TEXCOORD0;

vec3 a_position  : POSITION;
vec4 a_color0    : COLOR0;
vec2 a_texcoord0 : TEXCOORD0;

Views

Another important concept in bgfx is “Views”. Views represent render passes, and can be considered “buckets of draw and compute calls”. In general bgfx handles all of the leg work for managing render passes, and you simply need to keep track of which integer view ID pertains to which step. The most important consequence of this is that for a program like this one that requires multiple iterations of a shader with changing frame buffers, each iteration requires a separate view ID.

The Jump Flood Algorithm

The jump flood algorithm, in a nutshell, is used to efficiently generate a “distance field” for a set of uniquely identifiable points. We call these points “seeds”, and we also define a beginning “jump” distance, call it D. The field is initialized with just the seed values, and 0 or some other value for “empty” values. For each position in the field, we “jump” D units in each of the 8 compass directions (as well as checking the current position), and check what value is in each of those positions. If any of them are non-empty (i.e. they have a seed value in them), we calculate the distance from the current position to the position that the seed was originally from, select the seed with the shortest distance, and set the current position to that seed value. Then, we cut our jump distance in half, D = D/2, and repeat the process. Rinse and repeat until D = 1, i.e. each position has jumped to its immediate neighbors – you can’t get any closer. With this done, every position has been populated with the value of the closest seed, resulting in our distance field for the given set of seeds.

Now, with a basic idea of how the algorithm is supposed to work, it’s time to go over the actual implementation. If you are following along with bgfx, this “hello world” tutorial and the official bloom example were very helpful to me for getting a basic handle of bgfx.

Step 0: View Initialization

int counter = 0; // incremented every frame
int jumpDist = 150; // needs to be sufficiently large to fill the image, could be calculated from the size of the screen
int jumpSteps = int(ceil(log2(jumpDist)));
static const bgfx::ViewId VIEW_GEOMETRY = 0;
static const bgfx::ViewId VIEW_MASK = 1;
static const bgfx::ViewId VIEW_JFA = 2;
static const bgfx::ViewId VIEW_FINAL = VIEW_JFA + jumpSteps;

I’m glossing over most of the boilerplate for bgfx and glfw (or whatever window library you choose to use) because that is fairly well covered in code examples. I primarily wanted to show here how our number of Views (and thus our number of render passes) is dynamically set based on our input parameters.

Step 1: Render Geometry

Our first step is to render the geometry.

//main.cpp

// Set the geometry data buffers. see the "hello world" tutorial linked above
bgfx::setVertexBuffer(0, m_vbh_pos_color);
bgfx::setIndexBuffer(m_ibh[0]);

// Set up the camera for the geometry and make the cube rotate
const bx::Vec3 at = { 0.0f, 0.0f, 0.0f };
const bx::Vec3 eye = { 0.0f, 0.0f, -5.0f };
float view[16];
bx::mtxLookAt(view, eye, at);
float proj[16];
bx::mtxProj(proj, 60.0f, float(WNDW_WIDTH) / float(WNDW_HEIGHT), 0.1f, 100.0f, bgfx::getCaps()->homogeneousDepth);
float mtx[16];
bx::mtxRotateXY(mtx, counter * 0.01f, counter * 0.01f);
bgfx::setViewTransform(VIEW_GEOMETRY, view, proj);
bgfx::setTransform(mtx);

// Initialize the geometry frame buffer
const uint64_t tsFlags = 0 | BGFX_TEXTURE_RT | BGFX_SAMPLER_U_CLAMP | BGFX_SAMPLER_V_CLAMP;
bgfx::TextureHandle gbufferTex[] = {
	bgfx::createTexture2D(uint16_t(WNDW_WIDTH), uint16_t(WNDW_HEIGHT), false, 1, bgfx::TextureFormat::RGBA32F, tsFlags),
	bgfx::createTexture2D(uint16_t(WNDW_WIDTH), uint16_t(WNDW_HEIGHT), false, 1, bgfx::TextureFormat::R8, tsFlags),
};
bgfx::FrameBufferHandle m_gbuffer = bgfx::createFrameBuffer(BX_COUNTOF(gbufferTex), gbufferTex, true);

// Render the geometry
bgfx::setState(BGFX_STATE_WRITE_RGB | BGFX_STATE_WRITE_A | BGFX_STATE_CULL_CW);
bgfx::setViewFrameBuffer(VIEW_GEOMETRY, m_gbuffer);
bgfx::submit(VIEW_GEOMETRY, m_program_geometry); // m_program_geometry = bgfx::createProgram(vs_geometry, fs_geometry);

//vs_geometry.sc
$input a_position, a_color0
$output v_color0

#include <bgfx_shader.sh>

void main() {
	gl_Position = mul(u_modelViewProj, vec4(a_position, 1.0));
	v_color0 = a_color0;
}

//fs_geometry.sc
$input v_color0

#include <bgfx_shader.sh>

void main() {
	gl_FragData[0] = v_color0;
    gl_FragData[1] = vec4(1,1,1,1);
}

This is pretty much just the hello world tutorial, except instead of rendering directly to the default buffer, we’re rendering to m_gbuffer. The most important aspect of this is that the geometry buffer has two textures: one for the geometry as we want it to be viewed in the final scene (with RGBA32F format) which we’ll save for later, and one to create a mask of our geometry (with R8 format, since we don’t need to encode color data). The only special logic in the shaders is simply drawing white (or red, since it’s a red channel texture) to the mask texture.

Step 2: Create a UV mask

//main.cpp

// Set up the camera for the screen-space quad
float proj[16];
bx::mtxProj(proj, 60.0f, float(WNDW_WIDTH) / float(WNDW_HEIGHT), 0.1f, 100.0f, bgfx::getCaps()->homogeneousDepth);
bx::mtxOrtho(proj, 0.0f, 1.0f, 1.0f, 0.0f, 0.0f, 100.0f, 0.0f, bgfx::getCaps()->homogeneousDepth);
bgfx::setViewTransform(VIEW_MASK, NULL, proj);
bgfx::setViewRect(VIEW_MASK, 0, 0, uint16_t(WNDW_WIDTH), uint16_t(WNDW_HEIGHT));

// Initialize another buffer for our uv mask
bgfx::FrameBufferHandle m_mask = bgfx::createFrameBuffer(uint16_t(WNDW_WIDTH), uint16_t(WNDW_HEIGHT), bgfx::TextureFormat::RG16S, tsFlags);

// Render our UV mask, sampling from the R8 texture mask created in the previous step
bgfx::setState(BGFX_STATE_WRITE_RGB | BGFX_STATE_WRITE_A);
bgfx::setTexture(0, s_tex, bgfx::getTexture(m_gbuffer, 1)); // s_tex = bgfx::createUniform("s_tex", bgfx::UniformType::Sampler);
bgfx::setViewFrameBuffer(VIEW_MASK, m_mask);
screenSpaceQuad((float)WNDW_WIDTH, (float)WNDW_HEIGHT, 0.0f, bgfx::getCaps()->originBottomLeft); // https://github.com/bkaradzic/bgfx/blob/6d1791ae2cbef975b42b8979526d79514be83329/examples/38-bloom/bloom.cpp#L126
bgfx::submit(VIEW_MASK, m_program_mask_uvs); // m_program_mask_uvs = bgfx::createProgram(vs_fullscreen, fs_mask_uvs);

//vs_fullscreen.sc
$input a_position, a_texcoord0
$output v_texcoord0

#include <bgfx_shader.sh>

void main() {
    gl_Position = mul(u_modelViewProj, vec4(a_position, 1.0) );
    v_texcoord0 = a_texcoord0;
}

//fs_mask_uvs.sc
$input v_texcoord0

#include <bgfx_shader.sh>

SAMPLER2D(s_tex, 0);

void main() {
    if (texture2D(s_tex, v_texcoord0).r > 0) {
        gl_FragColor = vec4(v_texcoord0.x, v_texcoord0.y, 0, 1);
    } else {
        gl_FragColor = vec4(-1,-1,-1,-1);
    }
}

This step takes the 1 bit mask we created in the first step and overlays the UVs of the screen texture on top of it. We do this by creating a quad the size of the screen and positioning and configuring the camera so it looks directly at it. (This is an extremely common technique for post-processing effects and a lot can be found online about it. I stole the logic for setting it up from one of the bgfx examples.) In this way, not only are we able to run shaders on the “same” image that we captured of the geometry, but we’re able to use the UV data of the screen-space quad as an input to our fragment shader.

And this is one of the genius things about this algorithm: those UVs are the unique seed values we discussed above! Every pixel in our mask now uniquely encodes information about where it is in the texture.

Step 3: The Actual Jump Flooding Step

//main.cpp

// Set up our camera for each of the jump flood render passes. This is the same setup as the previous step
bx::mtxProj(proj, 60.0f, float(WNDW_WIDTH) / float(WNDW_HEIGHT), 0.1f, 100.0f, bgfx::getCaps()->homogeneousDepth);
bx::mtxOrtho(proj, 0.0f, 1.0f, 1.0f, 0.0f, 0.0f, 100.0f, 0.0f, bgfx::getCaps()->homogeneousDepth);
for (int i = 0; i < jumpSteps; i++) {
	bgfx::setViewTransform(VIEW_JFA + i, NULL, proj);
	bgfx::setViewRect(VIEW_JFA + i, 0, 0, uint16_t(WNDW_WIDTH), uint16_t(WNDW_HEIGHT));
}

// Initialize another buffer for the jump flood algorithm, same format as the mask buffer
m_jfabuffer = bgfx::createFrameBuffer((uint16_t)(WNDW_WIDTH), (uint16_t)(WNDW_HEIGHT), bgfx::TextureFormat::RG16S, tsFlags);

bgfx::FrameBufferHandle m_jfa_a = m_mask;
bgfx::FrameBufferHandle m_jfa_b = m_jfabuffer;
bgfx::FrameBufferHandle m_jfa_t;
int jump_i = 0;
while (jump_i < jumpSteps) {
    const float input[4] = { jumpDist,0,0,0 };
    bgfx::setUniform(u_jumpDistance, input);
    bgfx::setViewFrameBuffer(VIEW_JFA + jump_i, m_jfa_b);
    bgfx::setState(0 | BGFX_STATE_WRITE_RGB | BGFX_STATE_WRITE_A);
    bgfx::setTexture(0, s_tex, bgfx::getTexture(m_jfa_a, 0));
    screenSpaceQuad((float)WNDW_WIDTH, (float)WNDW_HEIGHT, 0.0f, bgfx::getCaps()->originBottomLeft);
    bgfx::submit(VIEW_JFA + jump_i, m_program_jfa); // // m_program_jfa = bgfx::createProgram(vs_fullscreen, fs_jfa);
    m_jfa_t = m_jfa_a;
    m_jfa_a = m_jfa_b;
    m_jfa_b = m_jfa_t;
    jumpDist /= 2;
    if (jumpDist < 1) {
        jumpDist = 1;
    }
    jump_i++;
} // m_jfa_a is now the most recent one to have been modified

//fs_jfa.sc
$input v_texcoord0

#include <bgfx_shader.sh>

SAMPLER2D(s_tex, 0);

uniform vec4 u_jumpDistance;

vec3 jump(vec3 minSeed, vec2 current, vec2 offset) {
    vec2 samplePos = current + offset;
    if (length(clamp(samplePos,0,1) - samplePos) > .0001f) { // out of bounds check
        return minSeed;
    } 
    vec2 seed = texture2D(s_tex, samplePos).rg;
    vec2 cScaled = floor(current * textureSize(s_tex,0));
    vec2 sScaled = floor(seed * textureSize(s_tex,0));
    float dist = distance(cScaled, sScaled);
    if (dist < minSeed.z) {
        return vec3(seed.x, seed.y, dist);
    }
    return minSeed;
}

void main() {
    vec2 jumpDist = round(u_jumpDistance.x) / vec2(textureSize(s_tex,0));
    
    vec3 curr = vec3(1,1,9999999);
    curr = jump(curr, v_texcoord0, jumpDist * vec2( 0,  0)); // cc
    curr = jump(curr, v_texcoord0, jumpDist * vec2( 0, +1)); // nn
    curr = jump(curr, v_texcoord0, jumpDist * vec2(+1, +1)); // ne
    curr = jump(curr, v_texcoord0, jumpDist * vec2(+1,  0)); // ee
    curr = jump(curr, v_texcoord0, jumpDist * vec2(+1, -1)); // se
    curr = jump(curr, v_texcoord0, jumpDist * vec2( 0, -1)); // ss
    curr = jump(curr, v_texcoord0, jumpDist * vec2(-1, -1)); // sw
    curr = jump(curr, v_texcoord0, jumpDist * vec2(-1,  0)); // ww
    curr = jump(curr, v_texcoord0, jumpDist * vec2(-1, +1)); // nw

    gl_FragColor = vec4(curr.x, curr.y, 0, 1);
}

This is the meat of the program. For the general case, we want a jump distance that will fill all of the empty space in our image, but if we’re using this to create an outline then only up to the max size of the outline is necessary. At each step, we halve that distance and pass it into the shader as a uniform (bgfx uses vec4 for floats). We use two frame buffers: the buffer we generated our UV mask in, and a new one. In each step we read from one, write to the other, then swap them.

The fragment shader should hopefully be straightforward enough, assuming we have an intuition for the algorithm. We sample u_jumpDistance texels in each direction and select the “color” (which is actually a position) that is closest to the current texel. I opted to keep it unrolled rather than writing a loop, and I passed data between jumps by packing a vec2 and a float into a vec3 rather than using a struct.

At the end of this step we have successfully generated an approximate distance field for our original shape! Technically speaking our job (of implementing the JFA) is done, but there’s no fun in it if we don’t actually use it.

Step 3.5: JFA Implementation Troubleshooting

Before we get into the final step, I wanted to make note of a couple of problems that I ran into while implementing the JFA. Both are a result of this more naive implementation of the jump function:

//incorrect_jfa.sc
vec3 jump(vec3 minSeed, vec2 current, vec2 offset) {
    vec2 samplePos = current + offset;
    vec2 seed = texture2D(s_tex, samplePos).rg;
    float dist = distance(current, seed);
    if (dist < minSeed.z) {
        return vec3(seed.x, seed.y, dist);
    }
    return minSeed;
}

Interference Pattern

The most pernicious problem I encountered was this interference pattern. It would only appear inside the corner intersections of the jumps, and only when the window (and thus the frame buffer’s) height/width were not a power of 2. I spent a lot of time chasing stuff about half-texel sampling, but it turns out it’s more fundamentally a floating point comparison issue. When I switched the JFA textures from RG32F to RG16S, the pattern took on a much more clearly regular form:

One solution I found was to use texelFetch() (sampling texture coordinates [0,texSize]) instead of texture2D() (sampling UV coordinates [0,1]), but I didn’t like having to orient all of my shaders around that scale. The solution I went with is the one you see in the correct implementation, where I scale and floor the values purely for the distance comparison.

Banding

The other problem was this banding that would sometimes appear when the shape was near the edge of the window. This is because the UVs are being clamped, and I simply added logic to discard jumps that fell out of bounds.

Step 4: Result

Here are two demonstrations of uses for the distance field that we’ve generated.

// main.cpp

// Set up our camera for the final result. This is the same setup as the previous steps
float proj[16];
bx::mtxProj(proj, 60.0f, float(WNDW_WIDTH) / float(WNDW_HEIGHT), 0.1f, 100.0f, bgfx::getCaps()->homogeneousDepth);
bx::mtxOrtho(proj, 0.0f, 1.0f, 1.0f, 0.0f, 0.0f, 100.0f, 0.0f, bgfx::getCaps()->homogeneousDepth);
bgfx::setViewTransform(VIEW_FINAL, NULL, proj);
bgfx::setViewRect(VIEW_FINAL, 0, 0, uint16_t(WNDW_WIDTH), uint16_t(WNDW_HEIGHT));

const float parameters[4] = { 100, .04f, false, 0 }; // {max outline width or frequency, speed, shader pattern control flag, unused }
bgfx::setUniform(u_parameters, parameters);
const float time[4] = { counter,0,0,0 };
bgfx::setUniform(u_time, time);
bgfx::setState(BGFX_STATE_WRITE_RGB | BGFX_STATE_WRITE_A);
bgfx::setTexture(0, s_tex, bgfx::getTexture(m_gbuffer, 0));
bgfx::setTexture(1, s_mask, bgfx::getTexture(m_jfa_a, 0));
screenSpaceQuad((float)WNDW_WIDTH, (float)WNDW_HEIGHT, 0.0f, bgfx::getCaps()->originBottomLeft);
bgfx::submit(VIEW_FINAL, m_program_result); // m_program_result = bgfx::createProgram(vs_fullscreen, fs_combine);

// fs_combine.sc
$input v_texcoord0

#include <bgfx_shader.sh>

SAMPLER2D(s_tex, 0);
SAMPLER2D(s_mask, 1);

uniform vec4 u_time;
uniform vec4 u_parameters; // {max outline width or frequency, speed, shader pattern control flag, unused }

vec4 outline(vec2 v_texcoord0, float maxOutlineWidth, float speed) {
    vec4 color = texture2D(s_tex, v_texcoord0);
    vec2 closestSeed = texture2D(s_mask, v_texcoord0).xy;
    float dist = distance(closestSeed, v_texcoord0);
    float texelSize = 1.0 / float(textureSize(s_tex,0).x);
    float targetWidth = maxOutlineWidth * (cos(dist + u_time.x * speed) + 1) / 2;
    if (closestSeed.x > 0 && closestSeed.y > 0 && dist > 0 && dist < texelSize * targetWidth) {
        color = vec4(1,.5,.5,1);
    }
    return color;
}

vec4 march(vec2 v_texcoord0, float frequency, float speed) {
    vec4 color = texture2D(s_tex, v_texcoord0);
    vec2 closestSeed = texture2D(s_mask, v_texcoord0).xy;
    float dist = distance(closestSeed, v_texcoord0);
    if (closestSeed.x > 0 && closestSeed.y > 0 && dist > 0) {
        float v = (sin(dist * frequency - (u_time.x) * speed) + 1 ) / 2;
        color = vec4(v,v,v,1);
    }
    return color;
}

void main() {
    if (u_parameters.z > 0) {
        gl_FragColor = march(v_texcoord0, u_parameters.x, u_parameters.y);
    } else {
        gl_FragColor = outline(v_texcoord0, u_parameters.x, u_parameters.y);
    }
}

Conclusion

I hope you were able to find something of value from these examples. The jump flood algorithm is a very useful tool, and a fun example of an algorithm that makes good use of GPU acceleration. If you haven’t already I encourage you once again to check out Ben Golus' post on the subject, toward the end he explores some improvements and optimizations to the algorithm that I didn’t cover here.

Thanks for reading! Follow me on social media: