michafrank/HeifThumbnailExtractor

HEIF (Heic Files) Thumbnail Extractor C#

A lightweight C# library demo for extracting thumbnails from HEIF images (e.g. iPhone .heic files) using the native libheif library.

The library decodes the embedded thumbnail from the image and converts it to a standard RGB format for easy viewing or further processing.


Requirements

  • .NET 8 or later
  • libheif 1.20 or later, with its dependencies

Place the native libheif binaries next to your application. The library loads the native binary under the name libheif, so a binary built by vcpkg under a different file name must be renamed to match (libheif.dll on Windows).

E.g. for Windows, you need:

  • libde265.dll
  • libheif.dll
  • libx265.dll

AI Key insight

You must query width/height using the channel that actually exists in the decoded image.

libheif only returns width/height for channels that actually have a plane.

If the plane does not exist → width = -1, height = -1.

So the correct logic is:

  1. Ask for the Y plane. If it exists, use it.
  2. If not, try the interleaved plane.
  3. If not, try the R/G/B planes.
  4. If none of them exist, the decode failed.

This mirrors the fallback order attributed to libheif's own C++ helpers, and it yields valid dimensions whenever the decode produced at least one plane.
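The fallback order above can be sketched in C# as follows. This is a minimal illustration, not the repository's code: the two delegates are hypothetical stand-ins for P/Invoke wrappers around heif_image_get_width and heif_image_get_height, and the channel numbers are assumed to mirror libheif's heif_channel enum, so check them against your heif.h.

```csharp
using System;

public static class PlaneSize
{
    // ASSUMPTION: these values mirror libheif's heif_channel enum; verify against heif.h.
    public const int ChannelY = 0, ChannelR = 3, ChannelG = 4, ChannelB = 5, ChannelInterleaved = 10;

    // getWidth/getHeight are hypothetical delegates wrapping the native
    // width/height queries for one channel; they return -1 for a missing plane.
    public static (int Width, int Height, int Channel)? Find(
        Func<int, int> getWidth, Func<int, int> getHeight)
    {
        int[] order = { ChannelY, ChannelInterleaved, ChannelR, ChannelG, ChannelB };
        foreach (int channel in order)
        {
            int w = getWidth(channel);
            int h = getHeight(channel);
            if (w > 0 && h > 0)
                return (w, h, channel); // first existing plane wins
        }
        return null; // no plane reported a size: treat the decode as failed
    }
}
```

Passing delegates instead of calling the native functions directly keeps the ordering logic testable without libheif being loaded.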

Why this works

Because:

  • YCbCr images → the Y plane always exists
  • RGB images → the interleaved plane exists
  • Some images → the R/G/B planes exist separately
  • libheif returns -1 only when the plane does not exist

By checking planes in the correct order, you always get a valid size.

Why the channel is heif_channel_R for a YCbCr 4:4:4 image

Even though the decoded colorspace is: YCbCr

and the chroma is: 4:4:4

libheif still exposes the decoded planes using the R/G/B channel enums when the decoder plugin internally converts the YCbCr planes into an RGB-like layout.

This happens in two cases:

Case 1: The decoder plugin internally converts YCbCr → RGB planes

Some HEVC/AVIF decoders (especially on Windows) output:

  • R plane
  • G plane
  • B plane

instead of:

  • Y plane
  • Cb plane
  • Cr plane

even though the colorspace is still reported as YCbCr.

This is a quirk of the plugin API: colorspace describes the encoded format, not necessarily the decoded plane layout.

So you get:

colorspace = YCbCr
chroma = 444
planes = R, G, B

This is valid and expected.

Case 2: The thumbnail is stored in a special format

iPhone HEIC thumbnails (and many Android ones) can be:

  • HEVC-coded
  • reported as YCbCr 4:4:4 (chroma stored at full resolution)
  • decoded by libde265 (the HEVC decoder) and exposed in a planar RGB layout

So the decoder gives you:

  • R plane
  • G plane
  • B plane

instead of Y/Cb/Cr.
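Because the reported colorspace and the actual plane layout can disagree in both cases, it is safer to decide from the planes that exist. The sketch below shows one way to do that. The hasChannel delegate is a hypothetical wrapper around heif_image_has_channel, the enum and class names are made up for illustration, and the channel numbers are assumed to match heif_channel in heif.h.

```csharp
using System;

public enum PlaneLayout { None, YCbCr, PlanarRgb, InterleavedRgb }

public static class PlaneLayoutDetector
{
    // ASSUMPTION: channel numbers mirror libheif's heif_channel enum; verify against heif.h.
    const int Y = 0, Cb = 1, Cr = 2, R = 3, G = 4, B = 5, Interleaved = 10;

    // hasChannel is a hypothetical delegate that reports whether a plane exists.
    public static PlaneLayout Detect(Func<int, bool> hasChannel)
    {
        // Check what is really there rather than trusting the reported colorspace.
        if (hasChannel(R) && hasChannel(G) && hasChannel(B)) return PlaneLayout.PlanarRgb;
        if (hasChannel(Interleaved)) return PlaneLayout.InterleavedRgb;
        if (hasChannel(Y) && hasChannel(Cb) && hasChannel(Cr)) return PlaneLayout.YCbCr;
        // Monochrome (Y-only) images are not handled in this sketch.
        return PlaneLayout.None;
    }
}
```

Only the YCbCr result needs a color conversion; the two RGB layouts only need their rows repacked.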

Why the conversion of the thumbnail is necessary

The Thumbnail's Format: It's Not a JPEG

It is reasonable to wonder whether the thumbnail is already in a standard format. However, inside a HEIF (.heic) file, the thumbnail is not typically a separate JPEG or PNG file. Instead, it's a smaller, lower-resolution version of the main image that is compressed using the same codec: HEVC (High Efficiency Video Coding).

The HEIF file is a container, and both the main image and its thumbnail are stored as HEVC-encoded data streams within that container.

The Color Model: YCbCr vs. RGB

This is the most critical point. The way we see images on a screen and the way they are efficiently compressed are often different.

  • RGB (Red, Green, Blue): This is what our screens use. Each pixel is represented by a combination of red, green, and blue light. This is an "additive" color model, and it's how we typically think of digital color. Formats like PNG and JPEG store their final data in a way that can be easily translated to RGB.

  • YCbCr: This is what HEVC (and many other video/image codecs like JPEG) uses for compression. It separates the image into three components:

    • Y: The luma component, which is essentially the black-and-white (brightness) information of the image.
    • Cb & Cr: The chroma components, which represent the color information (blue-difference and red-difference).

Why Use YCbCr? The Magic of Compression

The reason for this separation is a clever trick based on human biology. Our eyes are much more sensitive to changes in brightness (luma) than to changes in color (chroma).

Codecs like HEVC exploit this by keeping the Y (brightness) information at high resolution but reducing the resolution of the Cb and Cr (color) information. This is called chroma subsampling. It allows the codec to throw away a lot of color data that our eyes wouldn't have noticed anyway, leading to significantly smaller file sizes with very little perceptible loss in quality.
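To put numbers on the saving, the sketch below counts raw samples (Y + Cb + Cr) for a frame under different subsampling schemes. The class and parameter names are illustrative only, and the count covers raw planes before any HEVC compression is applied.

```csharp
public static class ChromaMath
{
    // Total samples for one frame. chromaDivX/chromaDivY say by how much the
    // chroma planes are reduced: (1,1) = 4:4:4, (2,1) = 4:2:2, (2,2) = 4:2:0.
    public static long TotalSamples(int width, int height, int chromaDivX, int chromaDivY)
    {
        long luma = (long)width * height;
        // Round up so that odd dimensions still cover every pixel.
        long chromaW = (width + chromaDivX - 1) / chromaDivX;
        long chromaH = (height + chromaDivY - 1) / chromaDivY;
        return luma + 2 * chromaW * chromaH;
    }
}
```

For a 4000 × 3000 image, 4:4:4 gives 36,000,000 samples and 4:2:0 gives 18,000,000, so subsampling alone halves the raw data while leaving the luma plane untouched.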

The Final Step: Conversion

So, when we use libheif to decode the thumbnail, we get the decoded (uncompressed) YCbCr planes. This data is not something a standard image viewer or library can directly understand or display. It's just a set of numbers representing the Y, Cb, and Cr values for each pixel.

To make it a viewable image, we have to perform a mathematical conversion from the YCbCr color space back to the RGB color space. This is what the ConvertYCbCrToRgb method is doing. It takes the Y, Cb, and Cr values for each pixel and calculates the corresponding R, G, and B values.

In summary:

  1. The thumbnail is a small HEVC-encoded image.
  2. HEVC uses the YCbCr color model for efficient compression.
  3. Our target format (PNG) and our screens use the RGB color model.
  4. Therefore, we must decode the HEVC data and convert the resulting YCbCr pixel data to RGB before we can save it as a PNG.
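The per-pixel math behind step 4 might look like the sketch below. It is not the repository's ConvertYCbCrToRgb. It assumes full-range BT.601 coefficients (the ones JPEG/JFIF uses) and one Cb/Cr sample per pixel (4:4:4). A real file may signal a different matrix or limited range, and subsampled chroma has to be upsampled first.

```csharp
using System;

public static class YCbCrSketch
{
    // Converts one YCbCr sample to 8-bit RGB.
    // ASSUMPTION: full-range BT.601 coefficients; other matrices need other constants.
    public static (byte R, byte G, byte B) ToRgb(byte y, byte cb, byte cr)
    {
        double d = cb - 128.0; // blue-difference, centered on zero
        double e = cr - 128.0; // red-difference, centered on zero
        return (Clamp(y + 1.402 * e),
                Clamp(y - 0.344136 * d - 0.714136 * e),
                Clamp(y + 1.772 * d));
    }

    // Rounds to the nearest integer and limits the result to the 0..255 byte range.
    static byte Clamp(double v) => (byte)Math.Clamp((int)Math.Round(v), 0, 255);
}
```

With neutral chroma (Cb = Cr = 128) the result is a gray whose R, G, and B all equal Y, which makes a quick sanity check for any conversion routine.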

What is stride in image decoding?

Stride is one of those concepts that makes everything about image buffers click into place once you see it clearly. This is the version that is useful when working with libheif and raw pixel planes.

Stride = the number of bytes per row in memory

Even if an image is, say, 200 pixels wide, the number of bytes used for each row in memory is not necessarily 200 × bytes-per-pixel.

That actual number is called the stride (sometimes “row pitch”).

Why stride exists

Two main reasons:

  1. Alignment / padding. Many libraries pad each row to a 4-byte or 8-byte boundary for performance.

     Example:

       • Image width: 201 pixels
       • Format: RGB24 (3 bytes per pixel)
       • Expected row size: 201 × 3 = 603 bytes
       • 603 is not a multiple of 8, so a library that aligns rows to 8 bytes rounds up to 608 bytes

     → Stride = 608, not 603.

  2. Layout requirements. YCbCr planes, interleaved formats, and GPU-friendly layouts often require extra padding per row.
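The rounding in the example above is ordinary align-up arithmetic, sketched below with an illustrative helper name. It only explains where a padded stride can come from. In real code, take the stride the decoder reports when it returns the plane pointer and do not compute your own, because the alignment a library picks is an implementation detail.

```csharp
public static class StrideMath
{
    // Rounds rowBytes up to the next multiple of alignment (alignment must be positive).
    public static int AlignUp(int rowBytes, int alignment) =>
        (rowBytes + alignment - 1) / alignment * alignment;
}
```

AlignUp(603, 8) gives 608 as in the example, while 4-byte alignment would give 604.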

Visual intuition

Imagine each row of pixels is a shelf:

| pixel pixel pixel pixel ... padding padding |

The stride is the full width of the shelf, including padding.

The width × bytes-per-pixel is only the part containing actual pixels.

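Why stride matters in your libheif wrapper

When you copy the decoded thumbnail into a byte array:

```csharp
using System.Runtime.InteropServices; // for Marshal
int size = stride * height;
byte[] buffer = new byte[size];
Marshal.Copy(plane, buffer, 0, size); // plane: IntPtr to the decoded plane
```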

You are copying the entire padded rows, not just the visible pixels.

When you later convert to PNG or a Bitmap, you must respect the stride:

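```csharp
int rowBytes = width * 3;                     // RGB24: 3 bytes per pixel
byte[] packed = new byte[rowBytes * height];  // tightly packed output, no padding
for (int y = 0; y < height; y++)
{
    int srcOffset = y * stride;
    // copy only width * 3 bytes of actual pixel data, skipping the row padding
    Buffer.BlockCopy(buffer, srcOffset, packed, y * rowBytes, rowBytes);
}
```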

If you ignore stride, your image will look skewed, shifted, or corrupted, or it will have colored bars on the right.

Concrete example

Let’s say libheif gives you:

  • width = 200
  • height = 100
  • stride = 640

But RGB24 is 3 bytes per pixel:

expected row size = 200 × 3 = 600 bytes

stride = 640 bytes → 40 bytes of padding per row

So each row in memory looks like:

600 bytes of real pixels + 40 bytes of padding
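The same numbers can be checked with a small sketch. The helper names are illustrative, and the values are the ones from this example, not something libheif guarantees.

```csharp
public static class StrideExample
{
    // Byte offset of pixel (x, y) inside a padded buffer.
    public static int PixelOffset(int x, int y, int stride, int bytesPerPixel) =>
        y * stride + x * bytesPerPixel;

    // Unused bytes at the end of every row.
    public static int PaddingPerRow(int width, int stride, int bytesPerPixel) =>
        stride - width * bytesPerPixel;
}
```

With width = 200, stride = 640 and 3 bytes per pixel, PaddingPerRow returns 40 and the first pixel of row 1 starts at byte 640, not 600. The whole buffer is 640 × 100 = 64,000 bytes, of which 60,000 are pixel data.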
