What is Base64?

Base64 is an encoding algorithm that transforms any data into a string built from an alphabet of Latin letters, digits, plus, and slash. Thanks to it, you can convert Chinese characters, emoji, and even images into a “readable” string, which can be saved or transferred anywhere.

To get an intuitive sense of why Base64 was invented, imagine that during a phone call Alice wants to send an image to Bob. The first problem is that she cannot simply describe how the image looks, because Bob needs an exact copy. In this case, Alice may convert the image into binary and dictate the binary digits (bits) to Bob, who can then convert them back into the original image. The second problem is that phone calls are expensive, and dictating each byte as 8 binary digits would take too long. To reduce costs, Alice and Bob agree on a more efficient transfer method: a special alphabet that replaces every six binary digits with one “letter”.

To see the difference, check out a 1x1 image converted to binary digits:

010001 110100 100101 000110 001110 000011 011101 100001 000000 010000 000000 000001 000000 001111 000000 000000 000000 001111 111100 000000 000000 000000 000000 000000 000000 000010 110000 000000 000000 000000 000000 000000 000000 010000 000000 000001 000000 000000 000000 000010 000000 100100 010000 000001 000000 000011 101100

By contrast, the same image converted to Base64 looks like this:

R0lGODdhAQABAPAAAP8AAAAAACwAAAAAAQABAAACAkQBADs

I think the difference is obvious. Even if you remove the spaces or padding zeros from the binary digits, the Base64 string will still be shorter. I grouped the bits only to show that each group corresponds to one character of the Base64 string.
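The grouping can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the helper names `six_bit_groups` and `encode_without_padding` are made up for this example, and the sketch assumes the standard alphabet and skips the `=` padding characters.

```python
import base64

# The standard Base64 alphabet: A-Z, a-z, 0-9, plus, and slash.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def six_bit_groups(data: bytes) -> list[str]:
    """Split data into 6-bit groups (hypothetical helper).

    The last group is filled up with zero bits on the right if needed.
    """
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 6)
    return [bits[i:i + 6] for i in range(0, len(bits), 6)]


def encode_without_padding(data: bytes) -> str:
    """Map every 6-bit group to one letter of the alphabet (no '=' added)."""
    return "".join(ALPHABET[int(group, 2)] for group in six_bit_groups(data))


sample = b"GIF"  # three bytes, so exactly four 6-bit groups
print(six_bit_groups(sample))          # ['010001', '110100', '100101', '000110']
print(encode_without_padding(sample))  # R0lG

# Cross-check against the standard library, ignoring its '=' padding.
assert encode_without_padding(sample) == base64.b64encode(sample).decode().rstrip("=")
```

Three bytes are 24 bits, which split evenly into four groups of six, so no filler bits are needed for this sample.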

Of course, the story about Alice and Bob is just a made-up example to show what kind of problem the Base64 algorithm solves. In fact, it is a binary-to-text encoding, whose task is to encode binary data as printable characters when the data transmission channel or the storage medium cannot handle 8-bit character encodings.

History

The history of Base64 started long ago, in the days when engineers still argued about how many bits a byte should have. Today we use eight-bit bytes, but seven-bit, six-bit, and even three-bit bytes were used before that. By the time eight-bit encoding was adopted as a standard, many systems still used the old encodings and did not support the “new standard”. As a result, some data was simply lost in transfer between new and old systems. For example, a mail server might discard the eighth bit when sending emails. Moreover, there was another problem with mail servers: they could only send text, not binary data (such as images, video, or archives). And so, as if by magic, clever minds developed an algorithm to solve these problems. Of course, other binary-to-text encodings were developed over time, but thanks to its simplicity, efficiency, and portability, Base64 became the most popular and is now used almost everywhere.

The algorithm was first described in 1987, in a document specifying the PEM protocol (if you are interested in the details, see RFC 989 § 4.3). Since then, the algorithm has evolved, giving rise to new standards that are actively used throughout the IT world.

Naming

Initially, the algorithm was called “printable encoding”, and only five years later, in June 1992, did RFC 1341 define it as “Base64”. Since the algorithm uses 64 basic characters, naming it was not difficult (especially since Base85 already existed). So I think you will have no trouble guessing what names such as Base16, Base32, Base36, Base58, Base91, or Base122 mean.

Size

During encoding, the Base64 algorithm replaces every three bytes with four characters and, if necessary, adds padding characters, so the length of the result is always a multiple of four. Simply put, the result is about 33% (more exactly, ⅓) larger than the original data, that is, ⁴⁄₃ of its size. The formula for the length of the result string without padding is n * 4 / 3 rounded up, where n is the length of the original data in bytes.
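As a rough check of this arithmetic, the sketch below computes the expected length with and without padding and compares it with what Python's standard `base64` module produces. The function name `encoded_length` is an assumption made for this example.

```python
import base64


def encoded_length(n: int, padded: bool = True) -> int:
    """Expected Base64 length for n input bytes (hypothetical helper).

    With padding, every started group of 3 bytes yields 4 characters.
    Without padding, the result is n * 4 / 3 rounded up.
    """
    if padded:
        return 4 * ((n + 2) // 3)
    return (4 * n + 2) // 3


# Compare the formulas with the real encoder for a range of input sizes.
for n in range(50):
    encoded = base64.b64encode(bytes(n))
    assert len(encoded) == encoded_length(n)
    assert len(encoded.rstrip(b"=")) == encoded_length(n, padded=False)

print(encoded_length(35))                # 48
print(encoded_length(35, padded=False))  # 47
```

Integer arithmetic is used instead of floating-point division so that the rounding is exact for any input size.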

Usage

Base64 is most commonly used to encode binary data (for example, images or sound files) for embedding into HTML, CSS, EML, and other text documents. In addition, Base64 is used to encode data that may be unsupported or damaged during transfer, storage, or output. Here are some of the applications of the algorithm:

Attach files when sending emails
Embed images in HTML or CSS via data URI
Preserve raw bytes of cryptographic functions
Output binary data as XML or JSON in API responses
Save binary files to database when BLOB is unavailable
Hide secrets from prying eyes (really a very bad idea)
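To make the data URI case from the list above more concrete, here is a minimal Python sketch. The helper name `to_data_uri` and the sample payload are assumptions made for illustration; a real page would pass the bytes of an actual image together with its MIME type.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Build a data URI that embeds the bytes as Base64 (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# A short text payload stands in for real image bytes in this sketch.
payload = b"Hello, Base64!"
uri = to_data_uri(payload, "text/plain")
print(uri)

# Anyone who receives the URI can recover the original bytes.
recovered = base64.b64decode(uri.split(",", 1)[1])
assert recovered == payload
```

The same encoded string could be placed in a JSON or XML field when an API needs to return binary data as text.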

Security

Base64 is not an encryption algorithm, and under no circumstances should it be used to “hash” passwords or “encrypt” sensitive data, because it is reversible and the encoded data can be easily decoded. In a security context, Base64 should only be used to encode the raw result of a cryptographic function.

Roughly speaking, in terms of information security, Base64 is just a foreign language that some people do not understand. Nevertheless, even they can understand the meaning of the encoded message simply by using an online translator, which instantly returns the original message.
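The short Python sketch below illustrates both points, using an invented sample secret: decoding needs no key at all, while encoding the raw digest of a hash function is a reasonable use. It is a demonstration only, not advice on how to store passwords.

```python
import base64
import hashlib

secret = "correct horse battery staple"  # made-up sample value

# "Hiding" the secret with Base64 protects nothing:
encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")
assert decoded == secret  # recovered without any key

# A legitimate use: turning the raw bytes of a digest into printable text.
digest = hashlib.sha256(secret.encode("utf-8")).digest()  # 32 raw bytes
printable_digest = base64.b64encode(digest).decode("ascii")
print(len(digest), len(printable_digest))  # 32 44
```

In the second case the protection comes from the hash function; Base64 only makes the bytes printable.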

Comments (33)

I hope you enjoy this discussion. In any case, I ask you to join it.

Alan, 26 december 2019 at 18:46 #

Thanks for a great explanatory article. This is something I've used by feel more than understanding, and it's nice to fill in the blanks in my knowledge. The only thing I'd add is under usage. Your API responses example touches on this at a high level, but I often find it useful for sanitizing string values that can include special characters ({}, <>, ', ;, newline, etc.) without using language specific methods to qualify strings.

- Administrator, 27 december 2019 at 14:16 #↑
  
  Hello Alan,
  Thank you for your comment. I'm glad you like this article.
  
  As for using Base64 to sanitize strings, this is a known practice, but since it has several drawbacks it should be used wisely.
  
- Duratcho, 15 may 2023 at 12:09 #↑
  
  Keep in mind that base64 uses +, / and = as well. These characters cannot exist in URLs or filenames, so do be careful when "sanitising" strings with this!
  
John, 19 march 2020 at 17:53 #

Great site, well done for setting this up.
I realised when reading your site that the idea of base64 encoding has similarities to UUencoding that old people (like me) remember from the early days of email in the 1990s. Then it was considered very poor form to include binary attachments, hence the need to turn them into printable ASCII - which is what UUencoding did. UUencoding gave a predictable increase in file size of one third - each three binary characters transformed into four printable ASCII ones. Quite a good Wikipedia article on Uuencode, explaining its relation to Base64, and why Base64 is better.

Ahmad, 26 april 2020 at 11:01 #

Hi
It was the best meaning of Base64 i see in the worldwide thanks to you, actually i saw the encoded image to Base4
in android IDE but wanted more info about it, and one thing that I did not know is that "encode image to Base64" we will get String or when we decode it!?
and here I got it!
Thanks

Apps, 8 july 2020 at 08:53 #

How to identify the encoding algorithms uses on a string?
If we have got the output of the encoding algorithm "3AqxxqQkWV" how to know what encoding algorithm was used?

john, 22 january 2021 at 15:52 #

Your statement:
Simply put, the size of the result will always be 33% (more exactly, 4⁄3) larger than the original data.

Might be better if you replace 4/3 with 1/3 as you are adding that to the original size (because you used the word larger)
I.e. x+x/3=4x/3
But your statement shows:
x+4x/3=7x/3

Hakan_Ozay, 23 april 2021 at 06:35 #

Excellent! Thanks for your effort.

Bruno, 23 april 2021 at 17:49 #

Thank you for this website dude. Cheers.

Suyash, 22 march 2022 at 15:17 #

Love this site and thanks for sharing your passion for Base64, today I learned that I can encode audio to Base 64, truly genius!

Chandra, 19 july 2022 at 19:44 #

Suppose I have a message- Hii = 24 bit
Required to send this message
4*6 = 24 bit required to send message in Base64

In this way no profit of tarrif of phone that above mention in you post

- Administrator, 21 july 2022 at 14:55 #↑
  
  Hello! The thing is that in this case there is no reason to encode your textual message: you can just dictate it letter by letter. But you will need a binary-to-text encoding algorithm if you need to send something that simply cannot be "described" by words (for example, a picture or a video file).
  
Nick, 30 march 2023 at 06:44 #

So say i have url https://site.domain.com/landingpage.html, and I base64 encode it, how do i parse the base64 encoded url into a browser address line (or clickable link) such that it is decoded into the intended/original url? i am guessing it needs to form part of a query telling the browser to decode it first? eg. how does it look? https://something?=xxxBase64xxx ??

- Administrator, 31 march 2023 at 12:44 #↑
  
  Hello Nick! You can pass it as query string as follows: `https://something?Base64Page=xxxBase64xxx`, then on your page fetch and decode the `Base64Page` parameter from the URL. However, there may be problems with large pages.
  
  - Nick, 31 march 2023 at 13:49 #↑
    
    Does it matter what page url 'something.com' I choose? Ie. Will it redirect/reflect to my Base64 url I pass in query as u explain, regardless of 'something.com'decoy page?
    
    I swear I recall a lexical obfuscation trick where all you need is a https:// prefix (or other protocol prefix instructor) and then the obfuscated hex/b64/ascii text and it would be decoded by browser somehow?
    
Alok, 5 april 2023 at 04:50 #

Thanks for the good explanation. I love to use this site for encoding and decoding of base64. Easy to use and reliable for its accuracy. Thanks.

hassan, 29 december 2024 at 18:59 #

thanks, just won 400$ project by completing keyframe related data conversion to base 64, loved that article

