XML output incorrectly base64-encodes binary data

The function `base64_encode`, which is used by the `--xml` command-line option of zbarcam/zbarimg to encode binary data in a `<![CDATA[` section, has a bug in it that gives incorrect base64-encoded strings for some binary QR code data.

For example, when you try to evaluate the following code:
```c
const char src[3] = { '\x3a', '\xc2', '\x98'};
char dst[5];
base64_encode(dst, src, 3);
dst[4] = '\0';
printf("%s\n", dst);
```
The output you get is "//+Y".
However, the actual output should be "OsKY".

This is because of the use of signed char types when performing the bit-shifting of groups of 3 bytes into an unsigned integer.
See https://github.com/mchehab/zbar/blob/a549566ea11eb03622bd4458a1728ffe3f589163/zbar/symbol.c#L311-L326

If either the second or third byte in the group has the high-bit set then on platforms that have a signed `char` type, the expression `*(src++) << 8` performs signed integer promotion of the `char` value to an `int`, which sign-extends the top-most bit into the upper 24 bits and then shifts this value left by 8, leaving the upper 16 bits still set to 1. Thus when it is bit-wise or'ed into `buf`, it overwrites the bits already shifted into bits 16-23 by the statement `unsigned int buf = *(src++) << 16;`

To fix this issue, you need to first cast the `char` values to `unsigned char` values before shifting them, so that the promotion to `int` when shifting does not sign-extend the most-significant bit.

For example:
```diff
 unsigned base64_encode(char *dst, const char *src, unsigned srclen)
 {
     static const char alphabet[] =
 	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
     char *start = dst;
     int nline	= 19;
     for (; srclen; srclen -= 3) {
-        unsigned int buf = *(src++) << 16;
+        unsigned int buf = ((unsigned char)*(src++)) << 16;
         if (srclen > 1)
-            buf |= *(src++) << 8;
+            buf |= ((unsigned char)*(src++)) << 8;
         if (srclen > 2)
-            buf |= *(src++);
+            buf |= ((unsigned char)*(src++));
         *(dst++) = alphabet[(buf >> 18) & 0x3f];
         *(dst++) = alphabet[(buf >> 12) & 0x3f];
         *(dst++) = (srclen > 1) ? alphabet[(buf >> 6) & 0x3f] : '=';
         *(dst++) = (srclen > 2) ? alphabet[buf & 0x3f] : '=';
         if (srclen < 3)
             break;
         if (!--nline) {
             *(dst++) = '\n';
             nline    = 19;
         }
     }
     *(dst++) = '\n';
     *(dst++) = '\0';
     return (dst - start - 1);
 }
```

An alternative might be to change the type of the `src` parameter to be `const unsigned char*` to represent bytes, rather than using the signed `const char*` array.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

XML output incorrectly base64-encodes binary data #310

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

	unsigned base64_encode(char dst, const char src, unsigned srclen)
	{
	static const char alphabet[] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
	char *start = dst;
	int nline = 19;
	for (; srclen; srclen -= 3) {
	unsigned int buf = *(src++) << 16;
	if (srclen > 1)
	buf \|= *(src++) << 8;
	if (srclen > 2)
	buf \|= *(src++);
	*(dst++) = alphabet[(buf >> 18) & 0x3f];
	*(dst++) = alphabet[(buf >> 12) & 0x3f];
	*(dst++) = (srclen > 1) ? alphabet[(buf >> 6) & 0x3f] : '=';
	*(dst++) = (srclen > 2) ? alphabet[buf & 0x3f] : '=';

XML output incorrectly base64-encodes binary data #310

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions