Interactive Computational Photography: HDR, Burst Denoising, and Super-Resolution


Computational photography is easiest to understand if you stop thinking of a photo as a single exposure and start thinking of it as evidence. A phone camera often knows more than the final image can show in one pass. It can capture different exposure levels, several frames in rapid succession, or slightly shifted samples across a burst, then fuse those measurements into a better result than one frame could provide.

That is the common structure behind the most visible mobile-camera tricks. HDR keeps more of the scene’s brightness range. Burst denoising suppresses grain in low light. Super-resolution recovers detail from subpixel shifts. They are related, but they are not the same problem, and this article keeps that distinction explicit.

The broader field is mapped well in Mobile Computational Photography: A Tour, which is a useful survey if you want the larger landscape around burst photography, noise reduction, and super-resolution. This article narrows the scope to the three techniques already built into the page, so the visualizations can carry the explanation instead of turning the piece into a general survey.

If you want the filter intuition behind frame averaging, the companion article on convolution and filtering in images and signals is a good place to start. And if you want the optics floor that still limits any camera pipeline, the article on diffraction and the Airy disk explains why software can improve capture without making the lens itself disappear.

HDR: Preserve Bright and Dark Detail Before Tone Mapping

High dynamic range is the problem of fitting a scene with very bright and very dark regions into a camera capture and then into a display. A single exposure has to choose where to spend its limited range. If it leans dark, highlights survive but shadows disappear. If it leans bright, shadows open up but highlights clip to white.

The HDR visualizer makes that tradeoff concrete by showing a scene brightness timeline, one exposure window, and a merged HDR result. Drag the blue handle across the timeline and watch the single-exposure panel give up different parts of the scene as it moves from dark values toward bright ones. Near the dark end, the brighter regions are the first to clip. Near the bright end, the darker regions are the first to collapse into black. The meter below the panels counts those losses explicitly so the tradeoff is visible as a concrete change, not just as a general impression.
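The loss meter can be imitated with a few lines of code. This is a minimal sketch rather than the visualizer's actual logic: the scene values, the `noise_floor` and `full_well` thresholds, and the function name `count_losses` are all assumptions chosen for illustration.

```python
# Hypothetical scene: relative luminance from deep shadow to bright light.
SCENE = [0.003, 0.03, 0.3, 3.0, 30.0]

def count_losses(scene, gain, noise_floor=0.01, full_well=1.0):
    """Return (crushed, clipped) sample counts for a single exposure.

    A sample is counted as crushed when its exposed value falls below the
    assumed noise floor, and as clipped when it reaches the assumed
    saturation level of the sensor.
    """
    crushed = sum(1 for v in scene if v * gain < noise_floor)
    clipped = sum(1 for v in scene if v * gain >= full_well)
    return crushed, clipped

# Low gain leans dark, high gain leans bright.
for gain in (0.02, 0.5, 10.0):
    crushed, clipped = count_losses(SCENE, gain)
    print(f"gain={gain}: crushed shadows={crushed}, clipped highlights={clipped}")
```

With these assumed values, no single gain brings both counts to zero, which is the tradeoff the exposure handle exposes.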

[Interactive figure: HDR visualizer, with a Tone map control]

The tone-map buttons matter because they separate capture from display. The same merged HDR capture can be rendered with a calmer balance or with a more contrast-heavy look. That difference does not change what was recorded; it changes how the preserved range is compressed for the screen. Casual discussion often blurs capture and display together, but the distinction matters. HDR is not “making the screen handle more light.” It is preserving more scene information first and then deciding how to map that information into the display’s much smaller range.
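To make the split between capture and display concrete, here is a minimal sketch that renders one set of merged values with two tone curves. The curves are illustrative assumptions, not the ones the visualizer uses: a Reinhard-style x / (1 + x) compression for the calmer look, and the same compression followed by a smoothstep S-curve for the contrast-heavy look. The `hdr` values are hypothetical.

```python
def calm_curve(v):
    """Reinhard-style global curve: compresses [0, inf) into [0, 1)."""
    return v / (1.0 + v)

def contrast_curve(v):
    """Same compression, then a smoothstep S-curve that darkens shadows
    and brightens highlights around the midpoint."""
    x = calm_curve(v)
    return x * x * (3.0 - 2.0 * x)

# Hypothetical merged scene values; they are never modified below.
hdr = [0.01, 0.1, 1.0, 10.0, 100.0]

calm = [calm_curve(v) for v in hdr]
punchy = [contrast_curve(v) for v in hdr]

for v, a, b in zip(hdr, calm, punchy):
    print(f"scene={v:>6}: calm={a:.3f}  contrast={b:.3f}")
```

Both renderings start from the same list. Only the mapping into the 0 to 1 display range differs, which is why switching the tone map does not change what was recorded.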

That framing is consistent with Google’s HDR+ work, which used bursts of constant-exposure raw frames rather than bracketed exposures so alignment stayed more robust while highlight clipping was still controlled. Google’s HDR+ writeup is useful here because it shows the visible symptoms first: blur, noise, and blown highlights in a hard scene, then the cleaner result after burst fusion. Adobe’s Project Indigo makes the same point in plain language: under-expose a bit, capture multiple frames, merge them, and only then tone map the result for a natural look.

The key insight is that HDR is a reconstruction problem before it is a rendering problem. The camera first tries to recover more of the scene. Only after that does it decide how the output should look on a limited display.
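The "under-expose, capture several frames, merge" idea can be sketched in one dimension. This is a toy model under stated assumptions, not the HDR+ or Project Indigo pipeline: the three-value scene, the saturation level `FULL_WELL`, the Gaussian read noise, and the function names are all hypothetical, and the frames are assumed to be perfectly aligned.

```python
import random

FULL_WELL = 1.0  # assumed sensor saturation level

def capture(scene, gain, sigma, rng):
    """One raw frame: scale by exposure gain, add noise, clip at saturation."""
    return [min(v * gain + rng.gauss(0.0, sigma), FULL_WELL) for v in scene]

def merge_burst(scene, gain, n_frames, sigma=0.01, seed=1):
    """Average an aligned constant-exposure burst, then divide by the gain
    to get an estimate in scene units."""
    rng = random.Random(seed)
    frames = [capture(scene, gain, sigma, rng) for _ in range(n_frames)]
    mean = [sum(px) / n_frames for px in zip(*frames)]
    return [m / gain for m in mean]

scene = [0.02, 0.5, 3.0]  # hypothetical shadow, midtone, highlight

normal = merge_burst(scene, gain=1.0, n_frames=1)   # highlight saturates
under = merge_burst(scene, gain=0.25, n_frames=8)   # highlight survives

print("single normal exposure:", [round(v, 2) for v in normal])
print("under-exposed burst:   ", [round(v, 2) for v in under])
```

The single normal exposure can never report a highlight above the saturation level, while the under-exposed burst keeps it and relies on averaging to hold down the noise that under-exposure adds. Tone mapping would then be a separate step applied to the merged estimate.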

Burst Denoising: Reduce Grain Only When the Scene Stays Aligned

Low-light photography runs into a different limit. The scene may fit inside the display range, but each frame is noisy because the sensor collected too little light. One way to fight that noise is to capture several short exposures and average them. That works well if the scene stays still, because random noise tends to cancel while real structure stays in place.

The burst denoising explorer shows that idea from two sides at once: the stack of noisy frames below and the merged preview above. Switch the burst size from 2 to 4 to 8 and keep the scene still. The main image gets progressively cleaner, but the improvement does not scale linearly. That diminishing return is exactly what you expect from averaging aligned noisy measurements. The point is not that more frames are magic. The point is that more aligned frames give the merger more chances to cancel noise without destroying the scene.
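The diminishing return is easy to reproduce. The following sketch averages simulated frames of a flat gray patch and reports the noise that remains. It assumes independent Gaussian noise and perfect alignment; the patch value, `sigma`, and the function names are hypothetical.

```python
import random

def noisy_frame(signal, sigma, rng):
    """Add independent zero-mean Gaussian noise to every pixel."""
    return [v + rng.gauss(0.0, sigma) for v in signal]

def merge(frames):
    """Per-pixel mean of frames that are assumed to be already aligned."""
    return [sum(px) / len(frames) for px in zip(*frames)]

def rms_error(estimate, truth):
    """Root-mean-square difference between an estimate and the clean signal."""
    return (sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)) ** 0.5

def residual_noise(n_frames, sigma=0.2, n_pixels=4000, seed=0):
    """Noise left after averaging n_frames noisy copies of a flat patch."""
    rng = random.Random(seed)
    truth = [0.5] * n_pixels  # hypothetical flat gray patch
    frames = [noisy_frame(truth, sigma, rng) for _ in range(n_frames)]
    return rms_error(merge(frames), truth)

for n in (1, 2, 4, 8):
    print(f"{n} frame(s): residual noise = {residual_noise(n):.4f}")
```

Each doubling of the burst removes a smaller absolute amount of noise than the one before it, which matches what the explorer shows when the scene is still.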

[Interactive figure: burst denoising explorer, with Burst size and Scene motion controls]

Now change the motion setting. With small motion, the preview still cleans up somewhat, but the moving subject starts to leave a faint trail. With large motion, the trail becomes obvious and the merge stops looking like simple denoising. The scene is no longer static enough for the alignment assumption to hold. That is the real boundary of burst denoising: it is strong when the content is stable and unreliable when the content moves.

This is why burst photography is not just “take more pictures.” It is “take more pictures that can still be aligned.” When that alignment is valid, averaging behaves like a practical noise-reduction filter. When it is not, the average keeps inconsistent positions and turns motion into ghosting. If you want the signal-processing intuition behind that step, the convolution and filtering article covers the basic idea of combining repeated evidence into a cleaner result, but burst photography adds the hard part: the evidence must be registered first.
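A one-dimensional sketch shows why registration has to come before averaging. It is deliberately simplified: a bright bar moves by a whole number of pixels between frames, there is no noise, and the whole frame shifts together, so a brute-force search over integer shifts is enough. Real pipelines face local motion and need more robust alignment; all function names here are hypothetical.

```python
def make_frame(width, start, bar=4):
    """A dark row with a bright bar of `bar` pixels beginning at `start`."""
    return [1.0 if start <= i < start + bar else 0.0 for i in range(width)]

def shift(frame, dx):
    """Shift right by dx pixels (left if dx is negative), padding with zeros."""
    n = len(frame)
    return [frame[i - dx] if 0 <= i - dx < n else 0.0 for i in range(n)]

def estimate_shift(frame, reference, max_shift=8):
    """Brute-force search for the integer shift that best matches the reference."""
    def cost(dx):
        return sum((a - b) ** 2 for a, b in zip(shift(frame, dx), reference))
    return min(range(-max_shift, max_shift + 1), key=cost)

def average(frames):
    """Per-pixel mean across frames."""
    return [sum(px) / len(frames) for px in zip(*frames)]

# The bar moves 2 pixels per frame across a 4-frame burst.
frames = [make_frame(32, 10 + 2 * k) for k in range(4)]

naive = average(frames)
aligned = average([shift(f, estimate_shift(f, frames[0])) for f in frames])

print("naive:  ", "".join("#" if v == 1 else "+" if v > 0 else "." for v in naive))
print("aligned:", "".join("#" if v == 1 else "+" if v > 0 else "." for v in aligned))
```

The naive average smears the bar into a wider, dimmer trail, which is the ghosting described above. After registration against the first frame, the average reproduces the bar exactly.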

Google’s burst-HDR paper makes this distinction especially clear by starting from raw frames and using alignment plus merging to improve both low light and highlight handling. Adobe’s Project Indigo article also gives a useful plain-language rule of thumb: combining aligned frames reduces noise roughly with the square root of the number of frames, so gains continue but with diminishing returns. That is why 8 frames look cleaner than 4, but only by a factor of about 1.4 rather than 2, and why 8 frames are roughly 2.8 times cleaner than a single frame, not eight times cleaner.
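The square-root rule follows from a short derivation. It assumes each aligned frame is the true signal plus independent zero-mean noise, written here as $x_i = s + n_i$ with $\operatorname{Var}(n_i) = \sigma^2$; the symbols are introduced for this sketch and do not come from the cited articles.

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = s + \frac{1}{N}\sum_{i=1}^{N} n_i,
\qquad
\operatorname{Var}(\bar{x}) = \frac{1}{N^{2}}\sum_{i=1}^{N}\sigma^{2} = \frac{\sigma^{2}}{N},
\qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}
```

Doubling $N$ therefore always buys the same factor of $\sqrt{2}$, whatever the starting burst size. The independence assumption is what motion and misalignment break.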

Super-Resolution: Recover Detail from Subpixel Shifts

Super-resolution solves a third problem. The scene may be well exposed and the noise may be manageable, but the sensor still samples the world on a limited grid. If a single frame misses fine detail because the samples are too coarse, a second frame can help only if it lands on a slightly different part of that grid. That small offset is the useful part. It is not enough to have more frames. The frames must provide new sampling positions.

The super-resolution visualizer makes that distinction concrete. Set the frame count to 1, 4, or 8, then switch between no shift, subpixel shift, and moving subject. In the no-shift case, extra frames mostly repeat the same samples, so the reconstruction cannot recover much more detail. In the subpixel case, the frames land at slightly different positions, and the reconstructed curve follows the hidden signal more closely. In the moving-subject case, the frames disagree about where the structure belongs, so the reconstruction softens instead of sharpening.

[Interactive figure: super-resolution visualizer, with Frames and Registration controls]

The useful middle case is the one that matters in practice. Subpixel shifts are smaller than one sensor pixel, which means the camera is not “zooming” in the ordinary sense. It is collecting a richer set of measurements and then inferring a finer grid from them. That is why the technique is often called computational zoom: it changes what can be recovered from the burst rather than simply enlarging the existing pixels.
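The difference between repeated samples and new sampling positions can be sketched in one dimension. This toy example assumes ideal point samples, no noise, and offsets that are known exactly, none of which holds for a real handheld burst. The signal, the offsets, and the function names are hypothetical.

```python
import math

def hidden_signal(x):
    """Hypothetical fine detail at 0.75 cycles per coarse pixel, above the
    0.5 cycles-per-pixel limit a single frame can represent."""
    return math.sin(2 * math.pi * 0.75 * x)

def capture(offset, n_pixels=8):
    """One frame: samples at integer pixel positions displaced by `offset`."""
    return [(i + offset, hidden_signal(i + offset)) for i in range(n_pixels)]

def fuse(frames):
    """Place every sample at its registered position; exact repeats collapse."""
    merged = {}
    for frame in frames:
        for x, v in frame:
            merged[round(x, 6)] = v
    return sorted(merged.items())

locked = fuse([capture(0.0) for _ in range(4)])                 # no shift
shifted = fuse([capture(dx) for dx in (0.0, 0.25, 0.5, 0.75)])  # subpixel shifts

print("distinct positions, no shift:      ", len(locked))
print("distinct positions, subpixel shift:", len(shifted))
```

With these assumed offsets, four locked frames still cover only 8 positions, while four shifted frames cover 32 positions spaced a quarter pixel apart. That denser grid is what lets a reconstruction follow detail the coarse grid alone would alias.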

The Google paper on handheld multi-frame super-resolution is a strong reference because it ties the idea to natural hand tremor. The tiny motion you usually try to avoid is exactly what makes the frames useful, provided the camera can align them well enough. That is also why this technique is easy to confuse with sharpening. Sharpening boosts edges that are already there. Super-resolution reconstructs detail that one frame did not sample densely enough to capture cleanly.

This is the most important distinction in the visualization. When the frames are locked, more data does not mean more information. When the frames move too much, the information is there but no longer alignable. Only the subpixel-shift case adds the kind of diversity that can be turned into extra detail.

What the Three Techniques Share

HDR, burst denoising, and super-resolution all follow the same broad pattern: capture more evidence than a single frame can comfortably hold, align that evidence when necessary, fuse it, and then render the result for the screen. The difference is in what each method is trying to preserve.

  • HDR preserves brightness range.
  • Burst denoising preserves texture while reducing grain.
  • Super-resolution preserves or recovers spatial detail.
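The shared pattern can be written as a small skeleton in which each technique supplies its own align, fuse, and render steps. This is only an organizing sketch with hypothetical names, not a description of how any real camera pipeline is structured; the toy steps at the bottom exist just to make it runnable.

```python
from typing import Callable, List, Sequence

Frame = List[float]

def reconstruct(frames: Sequence[Frame],
                align: Callable[[Frame, Frame], Frame],
                fuse: Callable[[Sequence[Frame]], Frame],
                render: Callable[[Frame], Frame]) -> Frame:
    """Capture is assumed done; register each frame to the first one,
    fuse the registered frames, then render the result for display."""
    reference = frames[0]
    registered = [align(frame, reference) for frame in frames]
    return render(fuse(registered))

# Toy steps: frames already aligned, mean fusion, clamp to the display range.
def identity_align(frame: Frame, reference: Frame) -> Frame:
    return frame

def mean_fuse(frames: Sequence[Frame]) -> Frame:
    return [sum(px) / len(frames) for px in zip(*frames)]

def clamp_render(frame: Frame) -> Frame:
    return [min(max(v, 0.0), 1.0) for v in frame]

result = reconstruct([[0.0, 2.0], [1.0, 4.0]],
                     identity_align, mean_fuse, clamp_render)
print(result)
```

Swapping in a different fuse step or a different render step changes which of the three goals the pipeline serves, while the overall shape stays the same.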

They also fail in different but related ways. Bad exposure choices lead to clipping. Bad frame alignment leads to ghosting and blur. Bad sampling diversity leads to a result that looks averaged but not actually sharper. That is why computational photography is better taught as a family of reconstruction problems than as a bag of camera modes.

The field goes further than the three cases in this article. MIT Media Lab’s Coded Computational Photography overview is a good reminder that exposure, aperture, motion, wavelength, and illumination can all be deliberately coded to make later reconstruction easier. But the basic lesson stays the same even when the techniques become more ambitious: the camera is not just recording the world. It is building a measurement that software can turn into a better image.

If you keep that model in mind, the visualizations become much easier to read. The controls are not decorative. They expose the assumptions the algorithm needs in order to work. When those assumptions hold, the image improves. When they break, the failure mode is usually visible immediately.