How to validate Base64 in PHP
PHP has two built-in functions for working with Base64: base64_encode and base64_decode. That might seem like enough to work with this algorithm, but in practice it is not always so simple. For example, some users run into problems when trying to validate Base64 values.
First of all, I want to show you an almost perfect way to check whether a string is Base64 encoded.
<?php
// This is our Base64 string we want to check
$input = 'Z3VydQ==';
// By default PHP will ignore “bad” characters, so we need to enable the “$strict” mode
$str = base64_decode($input, true);
// If $input cannot be decoded the $str will be a Boolean “FALSE”
if ($str === false) {
    echo 'Bad characters';
} else {
    // Even if $str is not FALSE, this does not mean that the input is valid
    // This is why now we should encode the decoded string and check it against input
    $b64 = base64_encode($str);
    // Finally, check if input string and real Base64 are identical
    if ($input === $b64) {
        echo 'Valid';
    } else {
        echo 'Invalid';
    }
}
As you may have noticed, I wrote “almost perfect”, because for an ideal check you may also want to consider the following:
- To support multiline values, fix input using str_replace(["\r", "\n"], '', $input).
- Since “pad char” is optional use rtrim($input, '=') === rtrim($b64, '=') to check whether they are identical.
- If you want to handle multiple standards of Base64, normalize input by replacing 62-63 index characters.
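The three points above can be combined into one helper. This is only a minimal sketch: the function name is_valid_base64 is hypothetical (it is not a PHP built-in), and it assumes that the only alternative standard you care about is the URL-safe alphabet, where "-" and "_" take the place of "+" and "/" at indexes 62 and 63.

```php
<?php
// Hypothetical helper, not part of PHP itself.
function is_valid_base64(string $input): bool
{
    // 1. Support multiline values by dropping CR and LF.
    $input = str_replace(["\r", "\n"], '', $input);

    // 2. Assumption: map the URL-safe alphabet onto the standard one.
    $input = strtr($input, '-_', '+/');

    // 3. Strict decoding rejects characters outside the alphabet.
    $decoded = base64_decode($input, true);
    if ($decoded === false) {
        return false;
    }

    // 4. Round-trip check that ignores the optional "=" padding.
    return rtrim($input, '=') === rtrim(base64_encode($decoded), '=');
}

var_dump(is_valid_base64('Z3VydQ=='));       // bool(true)
var_dump(is_valid_base64('Z3VydQ'));         // bool(true), padding omitted
var_dump(is_valid_base64("Z3Vy\r\ndQ=="));   // bool(true), multiline
var_dump(is_valid_base64('-_8'));            // bool(true), URL-safe alphabet
var_dump(is_valid_base64('What?!'));         // bool(false)
```

Note that this sketch also accepts input that mixes both alphabets in one string. If that matters to you, check which alphabet is in use before normalizing.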
In addition, if you are wondering why such strings as “abcd”, “iujhklsc” or “123412341234” are considered “valid” even though they produce a weird result, please let me remind you that Base64 was designed to encode and decode binary data. So there is nothing wrong. You can see for yourself by downloading the following files and decoding them using the Base64 Decode File tool or even using other programming languages or implementations.
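If you would rather check this directly in PHP, the short sketch below decodes “abcd” and prints the raw bytes as hexadecimal. The variable name $bytes is only illustrative.

```php
<?php
// "abcd" is well-formed Base64; it simply encodes three raw bytes.
$bytes = base64_decode('abcd', true);

echo strlen($bytes), "\n";    // 3
echo bin2hex($bytes), "\n";   // 69b71d

// Encoding those bytes again gives back exactly the same string.
var_dump(base64_encode($bytes) === 'abcd');   // bool(true)
```

The result looks weird as text because these bytes were never meant to be text, but the round trip is lossless, which is all Base64 promises.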
Finally, I would like to point out some common mistakes made during Base64 validation:
1) The biggest mistake is not to check for Boolean FALSE. For example, the snippet below will never show “Valid”, even though the input MA== is a valid Base64 value for encoded “0” (zero). The decoded result is the string "0", which PHP treats as false in a condition.
<?php
if (base64_decode('MA==')) {
    echo 'Valid';
}
2) Perhaps you thought that it’s enough to use a strict comparison operator (=== or !==). However, this is not the case. The following example will always output “Valid”, even though What?! is an invalid Base64 value, because without the $strict mode PHP silently ignores the “bad” characters.
<?php
if (base64_decode('What?!') !== false) {
    echo 'Valid';
}
3) The next “Aha! Effect” would be the idea to enable the $strict mode. However, I do not think that this alone is what you need, if only because any string of digits and Latin letters will be treated as a valid Base64 value. For example, the following can output “Valid”, although this is clearly not the case (how strictly misplaced padding is rejected may depend on your PHP version).
<?php
if (base64_decode('a+b=c', true) !== false) {
    echo 'Valid';
}
By the way, as a joke, I would like to note that the last example is almost identical to checking the input against a regular expression:
<?php
if (!preg_match('%[^a-z0-9+/=\s]%i', $input)) {
    echo 'Valid';
}
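Going back to the first two mistakes, they are easy to reproduce. The sketch below only prints what base64_decode returns in each case; the variable names $zero and $lenient are illustrative.

```php
<?php
// Mistake 1: "MA==" decodes to the string "0", which PHP treats as falsy.
$zero = base64_decode('MA==', true);
var_dump($zero);             // string(1) "0"
var_dump((bool) $zero);      // bool(false)
var_dump($zero !== false);   // bool(true), so compare against FALSE strictly

// Mistake 2: without $strict, characters outside the alphabet are dropped,
// so 'What?!' decodes exactly like 'What'.
$lenient = base64_decode('What?!');
var_dump($lenient === base64_decode('What'));   // bool(true)

// With $strict enabled the same input is rejected.
var_dump(base64_decode('What?!', true));        // bool(false)
```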
Of course, it all depends on what kind of input you want to check and how critical the result is. Nevertheless, if you receive input data from unknown users, I strongly recommend double-checking everything.