Ray Tracing Shaders
The ray tracing shader system in Himalaya implements a complete path tracing reference view using Vulkan's ray tracing pipeline. This documentation covers the shader architecture, the five shader stages (raygen, closest-hit, any-hit, miss, and shadow miss), and the shared utilities that enable physically-based rendering with multiple importance sampling.
Shader Architecture Overview
The ray tracing pipeline follows the Mode A architecture where all surface shading computation resides in the closest-hit shader, while the raygen shader focuses on path accumulation and bounce management. This design separates concerns clearly: raygen handles the Monte Carlo path loop, while closest-hit handles material evaluation, next event estimation (NEE), and BRDF sampling.
The pipeline consists of five shader stages organized into four shader binding table (SBT) groups:
| SBT Group | Type | Shader(s) | Purpose |
|---|---|---|---|
| Group 0 | General | reference_view.rgen | Primary ray generation, path loop, accumulation |
| Group 1 | General | miss.rmiss | Environment sampling on geometry miss |
| Group 2 | General | shadow_miss.rmiss | Shadow ray visibility confirmation |
| Group 3 | Triangles Hit Group | closesthit.rchit + anyhit.rahit | Surface shading, alpha testing |
Sources: rt_pipeline.h, rt_pipeline.cpp
Ray Generation Shader (`reference_view.rgen`)
The raygen shader is the entry point for path tracing. It executes once per pixel and implements primary ray generation, the path tracing loop with Russian Roulette termination, and running-average accumulation.
Primary Ray Generation
Primary rays are generated by unprojecting pixel coordinates through the inverse camera matrices. Subpixel jitter drawn from a Sobol quasi-random sequence provides anti-aliasing:
// Subpixel jitter via Sobol dims 0-1
float jitter_x = rand_pt(0, pc.sample_count, pixel, pc.frame_seed, pc.blue_noise_index);
float jitter_y = rand_pt(1, pc.sample_count, pixel, pc.frame_seed, pc.blue_noise_index);
// Pixel corner + jitter in [0, 1) → UV → NDC (Y flipped)
vec2 uv = (vec2(pixel) + vec2(jitter_x, jitter_y)) / vec2(size);
vec2 ndc = vec2(uv.x * 2.0 - 1.0, -(uv.y * 2.0 - 1.0));
// Unproject to world-space ray
vec4 clip_target = vec4(ndc, 1.0, 1.0);
vec4 view_target = global.inv_projection * clip_target;
view_target /= view_target.w;
vec3 ray_origin = (global.inv_view * vec4(0.0, 0.0, 0.0, 1.0)).xyz;
vec3 ray_direction = normalize((global.inv_view * vec4(view_target.xyz, 0.0)).xyz);
Sources: reference_view.rgen
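The pixel-to-NDC step can be checked on the CPU. The following is a minimal C++ sketch of that mapping only. The names `Vec2` and `pixel_to_ndc` are hypothetical and not part of the project, and the jitter is assumed to lie in [0, 1).

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Maps a jittered pixel position to NDC, flipping Y as the raygen excerpt does.
// The jitter components are assumed to lie in [0, 1).
inline Vec2 pixel_to_ndc(int px, int py, Vec2 jitter, int width, int height) {
    assert(width > 0 && height > 0);
    const float u = (static_cast<float>(px) + jitter.x) / static_cast<float>(width);
    const float v = (static_cast<float>(py) + jitter.y) / static_cast<float>(height);
    return Vec2{u * 2.0f - 1.0f, -(v * 2.0f - 1.0f)};
}
```

With a jitter of 0.5 on both axes the sample sits at the pixel center, and the top-left corner of the image maps to NDC (-1, 1).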
Path Tracing Loop
The path loop implements Russian Roulette for unbiased path termination starting from bounce 2. The survival probability is based on the maximum component of current throughput, clamped to [0.05, 0.95] to prevent extreme variance:
for (uint bounce = 0; bounce < pc.max_bounces; ++bounce) {
// Russian Roulette (bounce >= 2)
if (bounce >= 2u) {
float rr_prob = russian_roulette(throughput, bounce, rr_rand, survive);
if (!survive) break;
throughput /= rr_prob;
}
// Trace ray and accumulate contribution
traceRayEXT(tlas, ..., payload);
total_radiance += throughput * payload.color;
// Update throughput and advance ray
throughput *= payload.throughput_update;
origin = payload.next_origin;
direction = payload.next_direction;
}
Sources: reference_view.rgen, pt_common.glsl
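The termination rule described above can be sketched in C++ as follows. This is an illustration under stated assumptions, not the body of `russian_roulette` in pt_common.glsl. The helper names are hypothetical, and only the clamp range [0.05, 0.95] and the max-component rule come from the text.

```cpp
#include <algorithm>
#include <cassert>

struct Vec3 { float x, y, z; };

// Survival probability: largest throughput component, clamped to [0.05, 0.95].
inline float rr_survival_probability(Vec3 throughput) {
    const float m = std::max(throughput.x, std::max(throughput.y, throughput.z));
    return std::min(std::max(m, 0.05f), 0.95f);
}

// Returns true if the path survives. On survival the throughput is divided by
// the survival probability, which keeps the estimator's expected value unchanged.
inline bool russian_roulette_step(Vec3& throughput, float rand01) {
    assert(rand01 >= 0.0f && rand01 < 1.0f);
    const float p = rr_survival_probability(throughput);
    if (rand01 >= p) return false;  // path terminated
    throughput = Vec3{throughput.x / p, throughput.y / p, throughput.z / p};
    return true;
}
```

The lower clamp bounds the compensation factor at 20x. The upper clamp still terminates about 5% of paths, even bright ones.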
Running Average Accumulation
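The accumulation buffer stores a running average of all path samples. On the first frame (sample_count == 0) the shader overwrites the stored value; on later frames it blends the new sample in with weight 1 / (sample_count + 1), which keeps the stored value equal to the mean of all samples so far: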
if (pc.sample_count == 0u) {
imageStore(accumulation_image, pixel, vec4(total_radiance, 1.0));
} else {
vec4 old_value = imageLoad(accumulation_image, pixel);
float weight = 1.0 / float(pc.sample_count + 1u);
vec3 result = mix(old_value.rgb, total_radiance, weight);
imageStore(accumulation_image, pixel, vec4(result, 1.0));
}
Sources: reference_view.rgen
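The incremental update can be checked against a plain arithmetic mean. The C++ sketch below uses a hypothetical scalar helper `accumulate`. Its `sample_count` parameter mirrors `pc.sample_count`, the number of samples already stored.

```cpp
#include <cassert>

// Blends a new sample into the stored mean. sample_count is the number of
// samples already accumulated (0 on the first frame).
inline float accumulate(float old_mean, float new_sample, unsigned sample_count) {
    if (sample_count == 0u) return new_sample;  // first frame: overwrite
    const float weight = 1.0f / static_cast<float>(sample_count + 1u);
    // Algebraically the same as mix(old_mean, new_sample, weight).
    return old_mean + (new_sample - old_mean) * weight;
}
```

Feeding the samples 1, 2, 3, 4 in order leaves the accumulator at their mean, 2.5, up to floating-point rounding.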
Closest-Hit Shader (`closesthit.rchit`)
The closest-hit shader performs all surface shading operations: vertex interpolation, normal mapping, material sampling, next event estimation for direct lighting, and multi-lobe BRDF sampling for indirect bounces.
Vertex Interpolation via Buffer References
Vertex data is accessed through buffer references (device addresses) stored in the GeometryInfo buffer. Because the addresses are plain 64-bit pointers, the shader needs no per-mesh vertex or index buffer bindings and can fetch the hit triangle's data directly:
// Fetch triangle indices
IndexBuffer ib = IndexBuffer(geo.index_buffer_address);
uint i0 = ib.indices[3 * gl_PrimitiveID + 0];
uint i1 = ib.indices[3 * gl_PrimitiveID + 1];
uint i2 = ib.indices[3 * gl_PrimitiveID + 2];
// Fetch vertices with byte offset
VertexBuffer v0 = VertexBuffer(geo.vertex_buffer_address + uint64_t(i0) * VERTEX_STRIDE);
// ... interpolate using barycentric coordinates
Sources: pt_common.glsl, bindings.glsl
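The interpolation step elided in the excerpt can be sketched in C++. It assumes the usual Vulkan hit-attribute convention: the two barycentrics weight vertices 1 and 2, and vertex 0 receives the remainder. `interpolate_barycentric` is a hypothetical name.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Interpolates a per-vertex attribute at the hit point.
// u weights vertex 1, v weights vertex 2, and vertex 0 gets 1 - u - v.
inline Vec3 interpolate_barycentric(Vec3 a0, Vec3 a1, Vec3 a2, float u, float v) {
    assert(u >= 0.0f && v >= 0.0f && u + v <= 1.0f + 1e-5f);
    const float w = 1.0f - u - v;
    return Vec3{w * a0.x + u * a1.x + v * a2.x,
                w * a0.y + u * a1.y + v * a2.y,
                w * a0.z + u * a1.z + v * a2.z};
}
```

Interpolated normals and tangents generally need renormalizing afterwards, since a weighted sum of unit vectors is usually shorter than unit length.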
Normal Mapping with Consistency Correction
The shader applies normal mapping using a TBN basis constructed from the interpolated tangent and normal. A consistency correction ensures the shading normal never points below the geometric surface, preventing light leaks:
vec3 N_shading = get_shading_normal(N_interp, vec4(T_world, hit.tangent.w),
normal_rg, mat.normal_scale);
N_shading = ensure_normal_consistency(N_shading, N_face);
The `ensure_normal_consistency` function reflects the shading normal if it points to the wrong side of the geometric normal.
Sources: closesthit.rchit, pt_common.glsl
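One plausible reading of the reflection is sketched below in C++. The actual body of `ensure_normal_consistency` is not shown here, so this is an assumption: the shading normal is mirrored across the plane perpendicular to the geometric normal, and both inputs are taken to be unit length.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// If n_shading lies below the geometric surface (negative dot product with
// n_geo), mirror it across the surface plane; otherwise return it unchanged.
inline Vec3 ensure_normal_consistency_sketch(Vec3 n_shading, Vec3 n_geo) {
    const float d = dot(n_shading, n_geo);
    if (d >= 0.0f) return n_shading;
    return Vec3{n_shading.x - 2.0f * d * n_geo.x,
                n_shading.y - 2.0f * d * n_geo.y,
                n_shading.z - 2.0f * d * n_geo.z};
}
```

The mirrored normal keeps its length and its tangential component. Only the component along the geometric normal changes sign.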
Next Event Estimation (NEE)
The shader implements NEE for both directional lights and environment lighting, using multiple importance sampling (MIS) to combine light sampling with BRDF sampling strategies.
Directional Lights (delta distribution, no MIS needed):
for (uint i = 0; i < global.directional_light_count; ++i) {
// Shadow ray with terminate-on-first-hit optimization
traceRayEXT(tlas,
gl_RayFlagsTerminateOnFirstHitEXT | gl_RayFlagsSkipClosestHitShaderEXT,
..., shadow_payload);
if (shadow_payload.visible == 1u) {
nee_radiance += evaluate_brdf(...) * light_color * intensity * NdotL;
}
}
Environment Lighting (alias table importance sampling + MIS):
vec3 L = sample_env_alias_table(env_r1, env_r2, env_r3, env_r4);
// Shadow ray to check visibility
if (shadow_payload.visible == 1u) {
float mis_w = mis_power_heuristic(pdf_light, brdf_pdf);
nee_radiance += env_color * brdf_val * NdotL * mis_w / pdf_light;
}
Sources: closesthit.rchit
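The MIS weight can be illustrated with the common power heuristic with exponent 2 (Veach). Whether `mis_power_heuristic` in the project uses exactly this form is an assumption. The C++ sketch below only shows the technique.

```cpp
#include <cassert>

// Power heuristic with exponent 2: weight for the strategy whose PDF is pdf_a
// when the same direction could also have been produced with PDF pdf_b.
inline float mis_power_heuristic(float pdf_a, float pdf_b) {
    assert(pdf_a >= 0.0f && pdf_b >= 0.0f);
    const float a2 = pdf_a * pdf_a;
    const float b2 = pdf_b * pdf_b;
    const float denom = a2 + b2;
    return denom > 0.0f ? a2 / denom : 0.0f;  // guard against 0/0
}
```

For any pair of PDFs the two complementary weights sum to one, so a contribution is not double counted when the light-sampled and BRDF-sampled estimates are added.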
Multi-Lobe BRDF Sampling
The BRDF is split into a diffuse (Lambertian) lobe and a specular (GGX) lobe. Lobe selection uses a Fresnel-weighted probability based on the luminance of F_Schlick at the current view angle:
float p_spec = specular_probability(NdotV, F0); // clamped to [0.01, 0.99]
if (rand_lobe < p_spec) {
// Specular: GGX VNDF importance sampling (Heitz 2018)
vec3 H_ts = sample_ggx_vndf(Ve, roughness, vec2(rand_xi0, rand_xi1));
vec3 L_ts = reflect(-Ve, H_ts);
throughput_update = (D * Vis * F * NdotL) / (pdf * p_spec);
} else {
// Diffuse: cosine-weighted hemisphere sampling
vec3 L_ts = sample_cosine_hemisphere(vec2(rand_xi0, rand_xi1));
throughput_update = diffuse_color / (1.0 - p_spec);
}
The combined multi-lobe PDF is computed for MIS weighting when the BRDF-sampled ray eventually misses geometry and hits the environment.
Sources: closesthit.rchit, pt_common.glsl
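A minimal C++ sketch of a Fresnel-weighted selection probability follows. It assumes the probability is simply the Rec. 709 luminance of Schlick's Fresnel term, clamped to [0.01, 0.99]. The real `specular_probability` may also weigh in the diffuse albedo, and the luminance coefficients here are an assumption.

```cpp
#include <algorithm>
#include <cassert>

struct Vec3 { float x, y, z; };

// Probability of choosing the specular lobe at a given view angle.
inline float specular_probability_sketch(float n_dot_v, Vec3 f0) {
    const float c = std::min(std::max(n_dot_v, 0.0f), 1.0f);
    const float m = 1.0f - c;
    const float m5 = m * m * m * m * m;
    // Schlick approximation: F = F0 + (1 - F0) * (1 - NdotV)^5
    const Vec3 f{f0.x + (1.0f - f0.x) * m5,
                 f0.y + (1.0f - f0.y) * m5,
                 f0.z + (1.0f - f0.z) * m5};
    const float lum = 0.2126f * f.x + 0.7152f * f.y + 0.0722f * f.z;
    return std::min(std::max(lum, 0.01f), 0.99f);
}
```

The clamp keeps both lobes reachable, so the divisions by `p_spec` and `1.0 - p_spec` in the throughput update stay bounded.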
OIDN Auxiliary Output
On bounce 0, the shader writes albedo and normal data to auxiliary images for Intel Open Image Denoise (OIDN):
if (payload.bounce == 0u) {
ivec2 pixel = ivec2(gl_LaunchIDEXT.xy);
imageStore(aux_albedo_image, pixel, vec4(diffuse_color, 1.0));
imageStore(aux_normal_image, pixel, vec4(N_shading, 1.0));
}
Sources: closesthit.rchit
Any-Hit Shader (`anyhit.rahit`)
The any-hit shader handles alpha testing for non-opaque geometry. It supports two alpha modes:
| Mode | Value | Behavior |
|---|---|---|
| Opaque | 0 | Never reaches any-hit (hardware skip via VK_GEOMETRY_OPAQUE_BIT_KHR) |
| Mask | 1 | Hard cutoff: discard if alpha < alpha_cutoff |
| Blend | 2 | Stochastic alpha using PCG hash random |
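For blended materials, stochastic alpha testing gives an unbiased estimate of transparency without sorting: each intersection is accepted with probability equal to the texel alpha, and the result converges as samples accumulate: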
// Blend: stochastic alpha (PCG hash)
uint seed = gl_LaunchIDEXT.x
^ (gl_LaunchIDEXT.y * 1103515245u)
^ (pc.frame_seed * 747796405u)
^ gl_PrimitiveID
^ (gl_GeometryIndexEXT * 2654435761u);
float rand_val = float(pcg_hash(seed)) / 4294967296.0;
if (rand_val >= texel_alpha) {
ignoreIntersectionEXT;
}
Sources: anyhit.rahit
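The accept/reject test can be sketched in C++. It assumes `pcg_hash` is the widely used single-round PCG hash; the project's own body is not shown. One detail is worth checking against the source: in 32-bit float arithmetic the largest hash values round up to 2^32 when converted, so dividing by 4294967296.0 can yield exactly 1.0. The hypothetical helper `hash_to_unit_float` avoids this by using only the top 24 bits.

```cpp
#include <cassert>
#include <cstdint>

// Single-round PCG hash (assumed variant).
inline std::uint32_t pcg_hash(std::uint32_t input) {
    const std::uint32_t state = input * 747796405u + 2891336453u;
    const std::uint32_t word = ((state >> ((state >> 28u) + 4u)) ^ state) * 277803737u;
    return (word >> 22u) ^ word;
}

// Maps a hash to [0, 1) using the top 24 bits, which a float represents exactly.
inline float hash_to_unit_float(std::uint32_t h) {
    return static_cast<float>(h >> 8u) * (1.0f / 16777216.0f);
}

// Returns true if the intersection is kept, false if it would be ignored.
inline bool stochastic_alpha_accept(float alpha, std::uint32_t seed) {
    return hash_to_unit_float(pcg_hash(seed)) < alpha;
}
```

With a random value strictly below 1, a fully opaque texel (alpha of 1) is always accepted and a fully transparent one (alpha of 0) is always ignored.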
Miss Shaders
Environment Miss (`miss.rmiss`)
When a ray misses all geometry, the environment miss shader samples the IBL cubemap with Y-axis rotation applied:
vec3 dir = rotate_y(gl_WorldRayDirectionEXT,
global.ibl_rotation_sin,
global.ibl_rotation_cos);
vec3 env_color = texture(cubemaps[nonuniformEXT(global.skybox_cubemap_index)], dir).rgb
* global.ibl_intensity;
payload.color = env_color;
payload.hit_distance = -1.0; // Signal path termination
Sources: miss.rmiss
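A Y-axis rotation from a precomputed sine and cosine can be sketched in C++ as below. The sign convention (which direction counts as positive) is an assumption and may be mirrored in the project's `rotate_y`.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Rotates d about the Y axis using a precomputed sine and cosine.
// Assumed convention: x' = c*x + s*z, z' = -s*x + c*z; y is unchanged.
inline Vec3 rotate_y_sketch(Vec3 d, float s, float c) {
    return Vec3{c * d.x + s * d.z, d.y, -s * d.x + c * d.z};
}
```

Passing the sine and cosine as uniforms avoids evaluating trigonometric functions for every ray.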
Shadow Miss (`shadow_miss.rmiss`)
The shadow miss shader marks light visibility when a shadow ray reaches tMax without hitting geometry:
layout(location = 1) rayPayloadInEXT ShadowPayload shadow_payload;
void main() {
shadow_payload.visible = 1;
}
Sources: shadow_miss.rmiss
Shared Utilities (`pt_common.glsl`)
Ray Payloads
Two payload structures are defined:
struct PrimaryPayload {
vec3 color; // Radiance contribution from this bounce
vec3 next_origin; // Next ray origin (offset from surface)
vec3 next_direction; // Next ray direction (BRDF sampled)
vec3 throughput_update; // Path throughput multiplier
float hit_distance; // Hit distance (-1 = miss)
uint bounce; // Current bounce index
float env_mis_weight; // MIS weight for env map on miss
};
struct ShadowPayload {
uint visible; // 0 = occluded, 1 = visible
};
Sources: pt_common.glsl
Random Number Generation
The path tracer uses Sobol quasi-random sequences with Cranley-Patterson rotation for low-discrepancy sampling. Blue noise provides per-pixel offsets, and golden-ratio scrambling provides temporal decorrelation:
float rand_pt(uint dim, uint sample_index, ivec2 pixel,
uint frame_seed, uint blue_noise_index) {
float s = sobol_sample(dim, sample_index);
// Per-pixel blue noise offset
ivec2 noise_coord = (pixel + ivec2(dim * 73u, dim * 127u)) & 127;
float offset = texelFetch(textures[blue_noise_index], noise_coord, 0).r;
// Golden-ratio temporal scramble
offset = fract(offset + float(frame_seed) * 0.6180339887);
return fract(s + offset);
}
Sources: pt_common.glsl
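The rotation itself is a wrap-around addition on the unit interval. The C++ sketch below isolates that step. The Sobol value and blue-noise texel are passed in as plain floats, and `cranley_patterson` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

inline float fract(float x) { return x - std::floor(x); }

// Shifts a low-discrepancy sample by a per-pixel offset, wrapping into [0, 1).
// blue_noise is the per-pixel texel value; frame_seed adds a golden-ratio
// shift so that successive frames use different offsets.
inline float cranley_patterson(float sobol, float blue_noise, unsigned frame_seed) {
    assert(sobol >= 0.0f && blue_noise >= 0.0f);
    const float offset = fract(blue_noise + static_cast<float>(frame_seed) * 0.6180339887f);
    return fract(sobol + offset);
}
```

Because the shift is the same for every sample of a pixel within a frame, the stratification of the underlying sequence is kept while neighboring pixels are decorrelated.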
Environment Map Importance Sampling
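The alias table provides O(1) sampling with probability proportional to luminance × sin(theta) per texel. The PDF lookup reads the luminance stored in the table, so it matches the sampling distribution exactly. Assuming total_luminance is the sum of the sin-weighted values, the sin(theta) factor cancels against the equirectangular solid-angle Jacobian, which is why it does not appear in the returned value: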
vec3 sample_env_alias_table(float rand1, float rand2, float rand3, float rand4) {
uint N = entry_count;
uint idx = min(uint(rand1 * float(N)), N - 1u);
EnvAliasEntry e = env_alias_entries[idx];
uint pixel = (rand2 < e.prob) ? idx : e.alias_index;
// ... convert to direction
}
float env_pdf(vec3 world_dir) {
// Look up stored luminance from alias table
float lum = env_alias_entries[pixel].luminance;
return lum * float(w) * float(h) / (total_luminance * TWO_PI * PI);
}
Sources: pt_common.glsl
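How such a table is built is not covered by the excerpt. A common choice is Vose's method, sketched below in C++ as an illustration only. `AliasEntry`, `build_alias_table`, and `sample_alias` are hypothetical names, and the project's `EnvAliasEntry` also stores a luminance field that is omitted here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct AliasEntry {
    float prob;           // probability of keeping this slot's own index
    std::uint32_t alias;  // index returned otherwise
};

// Builds an alias table whose sampling distribution is proportional to weights.
inline std::vector<AliasEntry> build_alias_table(const std::vector<float>& weights) {
    const std::size_t n = weights.size();
    assert(n > 0);
    double total = 0.0;
    for (float w : weights) total += w;
    assert(total > 0.0);

    std::vector<double> scaled(n);
    std::vector<std::uint32_t> small, large;
    std::vector<AliasEntry> table(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint32_t idx = static_cast<std::uint32_t>(i);
        scaled[i] = weights[i] * static_cast<double>(n) / total;  // mean becomes 1
        table[i] = AliasEntry{1.0f, idx};
        if (scaled[i] < 1.0) small.push_back(idx); else large.push_back(idx);
    }
    // Pair each under-full slot with an over-full one that tops it up.
    while (!small.empty() && !large.empty()) {
        const std::uint32_t s = small.back(); small.pop_back();
        const std::uint32_t l = large.back(); large.pop_back();
        table[s] = AliasEntry{static_cast<float>(scaled[s]), l};
        scaled[l] -= 1.0 - scaled[s];
        if (scaled[l] < 1.0) small.push_back(l); else large.push_back(l);
    }
    return table;  // leftover slots keep prob = 1
}

// O(1) sampling: r1 picks a slot, r2 decides between the slot and its alias.
inline std::uint32_t sample_alias(const std::vector<AliasEntry>& table, float r1, float r2) {
    const std::uint32_t n = static_cast<std::uint32_t>(table.size());
    const std::uint32_t idx = std::min(static_cast<std::uint32_t>(r1 * static_cast<float>(n)), n - 1u);
    return (r2 < table[idx].prob) ? idx : table[idx].alias;
}
```

For weights {1, 3}, slot 0 keeps its own index half of the time and otherwise defers to index 1, so index 0 is returned with probability 0.5 × 0.5 = 0.25, matching 1 / (1 + 3).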
Ray Origin Offset
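The Wächter & Binder method from Ray Tracing Gems, Chapter 6, avoids self-intersection robustly without a scene-dependent epsilon. It offsets the hit point along the geometric normal by a fixed number of floating-point steps, so the offset scales with the magnitude of the position: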
vec3 offset_ray_origin(vec3 p, vec3 n_geo) {
ivec3 of_i = ivec3(RT_ORIGIN_INT_SCALE * n_geo);
vec3 p_i = vec3(
intBitsToFloat(floatBitsToInt(p.x) + ((p.x < 0.0) ? -of_i.x : of_i.x)),
// ... y, z
);
return p_i;
}
Sources: pt_common.glsl
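For reference, here is a C++ sketch of the published method, including the near-origin branch that the excerpt does not show. The constants are the ones given in the book chapter. Whether `RT_ORIGIN_INT_SCALE` and the project's handling of small coordinates match them is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

struct Vec3 { float x, y, z; };

constexpr float kOrigin = 1.0f / 32.0f;         // below this, use a float offset
constexpr float kFloatScale = 1.0f / 65536.0f;  // float offset scale near the origin
constexpr float kIntScale = 256.0f;             // integer (ULP-space) offset scale

inline std::int32_t float_to_bits(float f) { std::int32_t i; std::memcpy(&i, &f, sizeof f); return i; }
inline float bits_to_float(std::int32_t i) { float f; std::memcpy(&f, &i, sizeof i); return f; }

// Offsets one coordinate of p along the matching component of the normal.
inline float offset_component(float p, float n) {
    assert(std::isfinite(p) && std::isfinite(n));
    const std::int32_t of_i = static_cast<std::int32_t>(kIntScale * n);
    // Adding to the bit pattern moves p by a number of representable steps;
    // the sign flip keeps the move along +n for negative coordinates.
    const float p_i = bits_to_float(float_to_bits(p) + (p < 0.0f ? -of_i : of_i));
    return std::fabs(p) < kOrigin ? p + kFloatScale * n : p_i;
}

inline Vec3 offset_ray_origin_sketch(Vec3 p, Vec3 n_geo) {
    return Vec3{offset_component(p.x, n_geo.x),
                offset_component(p.y, n_geo.y),
                offset_component(p.z, n_geo.z)};
}
```

Coordinates whose normal component is zero are returned unchanged, and coordinates close to zero fall back to a small absolute offset because integer steps become too fine there.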
C++ Integration
The ReferenceViewPass class manages the RT pipeline creation and per-frame dispatch. It compiles all five shader stages and builds the SBT with proper alignment:
const rhi::RTPipelineDesc desc{
.raygen = rgen_module,
.miss = miss_module,
.shadow_miss = shadow_miss_module,
.closesthit = chit_module,
.anyhit = ahit_module,
.max_recursion_depth = 1,
.descriptor_set_layouts = set_layouts,
.push_constant_ranges = {&push_range, 1},
};
rt_pipeline_ = rhi::create_rt_pipeline(*ctx_, desc);
Per-frame recording uses push descriptors for the accumulation images and Sobol buffer, then dispatches trace_rays with the image dimensions.
Sources: reference_view_pass.cpp
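The alignment arithmetic behind an SBT layout can be sketched in C++ as follows. This is not the project's RHI code, and `SbtRegion`, `SbtLayout`, and `compute_sbt_layout` are hypothetical names. The three size parameters correspond to `shaderGroupHandleSize`, `shaderGroupHandleAlignment`, and `shaderGroupBaseAlignment` from `VkPhysicalDeviceRayTracingPipelinePropertiesKHR`. The sketch assumes the buffer itself starts at a base-aligned address.

```cpp
#include <cassert>
#include <cstdint>

struct SbtRegion { std::uint64_t offset; std::uint64_t stride; std::uint64_t size; };
struct SbtLayout { SbtRegion raygen; SbtRegion miss; SbtRegion hit; std::uint64_t total_size; };

// Rounds value up to the next multiple of alignment.
inline std::uint64_t align_up(std::uint64_t value, std::uint64_t alignment) {
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

// Lays out raygen, miss, and hit regions back to back in one buffer.
inline SbtLayout compute_sbt_layout(std::uint64_t handle_size,
                                    std::uint64_t handle_alignment,
                                    std::uint64_t base_alignment,
                                    std::uint64_t miss_count,
                                    std::uint64_t hit_count) {
    const std::uint64_t record = align_up(handle_size, handle_alignment);
    SbtLayout layout{};
    // The raygen region holds one record, and its size must equal its stride.
    const std::uint64_t raygen_size = align_up(record, base_alignment);
    layout.raygen = SbtRegion{0, raygen_size, raygen_size};
    layout.miss = SbtRegion{layout.raygen.offset + layout.raygen.size, record,
                            align_up(miss_count * record, base_alignment)};
    layout.hit = SbtRegion{layout.miss.offset + layout.miss.size, record,
                           align_up(hit_count * record, base_alignment)};
    layout.total_size = layout.hit.offset + layout.hit.size;
    return layout;
}
```

With illustrative values of 32-byte handles, 32-byte handle alignment, 64-byte base alignment, two miss shaders, and one hit group, the regions start at offsets 0, 64, and 128 and the buffer is 192 bytes long.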
Related Documentation
- Path Tracing Reference View — Render pass integration and OIDN denoising
- Ray Tracing Infrastructure (AS, RT Pipeline) — TLAS/BLAS construction and RHI abstractions
- Common Shader Library (BRDF, Bindings) — Shared BRDF functions and descriptor layouts
- Pipeline and Shader System — Shader compilation and pipeline creation