A virtual teacher who reveals to you the great secrets of Base64

Base64 Characters

The Base64 alphabet contains 64 basic ASCII characters which are used to encode data. Yes, that’s right: 64 characters are enough to encode any data of any length. The only drawback is that the result is about 33% larger than the original data. However, the benefits outweigh this cost, not least because all of these symbols are available in both 7-bit and 8-bit character sets.
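As a rough illustration of the size overhead, the sketch below compares the input length with the padded Base64 length using Python's standard `base64` module. The helper name `encoded_length` is made up for this example; it assumes padded output with no line breaks.

```python
import base64
import math


def encoded_length(n: int) -> int:
    """Length of padded Base64 output for n input bytes (hypothetical helper)."""
    # Every group of up to 3 input bytes becomes exactly 4 output characters.
    return 4 * math.ceil(n / 3)


for n in (1, 2, 3, 300, 1000):
    actual = len(base64.b64encode(bytes(n)))
    # The formula should agree with what the standard library produces.
    assert actual == encoded_length(n)
    print(f"{n} bytes -> {actual} characters ({actual / n:.2f}x)")
```

For inputs whose length is a multiple of 3 the ratio is exactly 4/3; for very short inputs the padding makes the relative overhead larger.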

Characters of the Base64 alphabet can be grouped into four groups:

  • Uppercase letters (indices 0-25): ABCDEFGHIJKLMNOPQRSTUVWXYZ
  • Lowercase letters (indices 26-51): abcdefghijklmnopqrstuvwxyz
  • Digits (indices 52-61): 0123456789
  • Special symbols (indices 62-63): +/
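Because the four groups go in strict order, the whole alphabet can be assembled from them. The following is a minimal Python sketch; the names `BASE64_ALPHABET` and `INDEX_OF` are chosen for this example only.

```python
import string

# Uppercase (0-25), lowercase (26-51), digits (52-61), symbols (62-63).
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

# Reverse lookup: character -> index, as needed when decoding.
INDEX_OF = {ch: i for i, ch in enumerate(BASE64_ALPHABET)}

print(len(BASE64_ALPHABET))          # number of characters in the alphabet
print(INDEX_OF["A"], INDEX_OF["/"])  # first and last index
```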

It is very important to note that the Base64 letters are case sensitive. This means that, for example, when decoding the values “QQ==”, “Qq==”, “qq==”, and “qQ==” four different results are obtained.
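To see why case matters, the sketch below decodes the first byte of each of those four values by hand from the character indices. The function `first_byte` is a hypothetical helper written for this example, not part of any library. Note that only “QQ==” is a canonical encoding; the other three carry non-zero leftover bits, which a strict decoder may reject, so the example avoids relying on any particular library's leniency.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def first_byte(value: str) -> int:
    """Byte carried by the first two Base64 characters (hypothetical helper)."""
    hi = ALPHABET.index(value[0])  # contributes the top 6 bits
    lo = ALPHABET.index(value[1])  # its top 2 bits complete the byte
    return (hi << 2) | (lo >> 4)


results = {v: first_byte(v) for v in ("QQ==", "Qq==", "qq==", "qQ==")}
for value, byte in results.items():
    print(value, hex(byte))
```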

For a better understanding, I grouped all characters into the Base64 table:

Uppercase Letters
Index  Character
0      A
1      B
2      C
3      D
4      E
5      F
6      G
7      H
8      I
9      J
10     K
11     L
12     M
13     N
14     O
15     P
16     Q
17     R
18     S
19     T
20     U
21     V
22     W
23     X
24     Y
25     Z

Lowercase Letters
Index  Character
26     a
27     b
28     c
29     d
30     e
31     f
32     g
33     h
34     i
35     j
36     k
37     l
38     m
39     n
40     o
41     p
42     q
43     r
44     s
45     t
46     u
47     v
48     w
49     x
50     y
51     z

Digits
Index  Character
52     0
53     1
54     2
55     3
56     4
57     5
58     6
59     7
60     8
61     9

Symbols
Index  Character
62     +
63     /

In addition to these characters, the equal sign (=) is used for padding. That is, the equal sign has no index and is not involved in the encoding of data. By and large, the padding character ensures that the length of a Base64 value is a multiple of 4 characters, and it is always appended at the end of the output. Nevertheless, the heart of the algorithm contains only 64 characters, and each of them has a unique index. Only the indices determine which characters will be used to encode the data, and only thanks to them can you “recover” the original data. All indices are listed in the Base64 table above.
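The padding rule can be checked with a short Python sketch using the standard `base64` module. The helper `padding_count` is a name invented for this example; it assumes the usual scheme where 3 input bytes map to 4 output characters.

```python
import base64


def padding_count(n: int) -> int:
    """Number of '=' characters expected for n input bytes (hypothetical helper)."""
    return (3 - n % 3) % 3


for data in (b"A", b"AB", b"ABC", b"ABCD", b"ABCDE", b"ABCDEF"):
    encoded = base64.b64encode(data).decode("ascii")
    # Padding only ever appears at the end, and the total length is a multiple of 4.
    assert encoded.count("=") == padding_count(len(data))
    assert len(encoded) % 4 == 0
    print(data.decode("ascii"), "->", encoded)
```

One leftover byte gives two padding characters, two leftover bytes give one, and a complete group of three bytes gives none.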

Given all of the above, a Base64 value can be defined using the following regular expression:

^[A-Za-z0-9+/]+={0,2}$

However, some standards allow and even require the use of multi-line values. In such cases, we need to extend the list of allowed characters with “Line Feed” and “Carriage Return”.

^[A-Za-z0-9+/\r\n]+={0,2}$
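
As a sketch of how these two expressions might be applied, the Python snippet below wraps them in a small check. The function name `looks_like_base64` and its `multiline` parameter are assumptions made for this example. Keep in mind that the expressions only check which characters appear and where the padding sits; they do not verify that the length is a multiple of 4, so a value such as “QQ=” still passes.

```python
import re

# The two expressions from the text, unchanged.
SINGLE_LINE = re.compile(r"^[A-Za-z0-9+/]+={0,2}$")
MULTI_LINE = re.compile(r"^[A-Za-z0-9+/\r\n]+={0,2}$")


def looks_like_base64(value: str, multiline: bool = False) -> bool:
    """Character-level check only; length is not validated (hypothetical helper)."""
    pattern = MULTI_LINE if multiline else SINGLE_LINE
    return pattern.fullmatch(value) is not None


print(looks_like_base64("QUJDRA=="))                      # plain value
print(looks_like_base64("QUJD REVG"))                     # contains a space
print(looks_like_base64("QUJD\r\nREVG"))                  # line break, single-line mode
print(looks_like_base64("QUJD\r\nREVG", multiline=True))  # line break, multi-line mode
```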
Comments (19)

I hope you enjoy this discussion. In any case, I ask you to join it.

  • Brice,
    Hey thanks for your dedication to the subject and for making this helpful site. Where I currently am, Wikipedia is filtered and I needed to lookup the 'alphabet' since I didn't know it by heart. I learned a few things too while I was reading.

    Btw, I can't put a space in the Name field of this comment form. I was going to leave my first and last name but validation doesn't allow space.
    • Administrator,
      Hello dear Brice,
      Thank you for your comment. I’m glad you found this site useful.

      I apologize for the inconvenience with the “name” field. At the moment, I can’t fix it because it may break some related things, but I’ll look into it as soon as possible.

      By the way, if you want to remember the Base64 Characters you need just to remember the order of these four groups: Uppercase, Lowercase, Digits, and Symbols. That is, remembering this you can easily compute the Base64 alphabet since all indices, as well as Base64 characters, go in strict order.

      If you are looking for further reading, I recommend What is Base64? as well as the explanations of the Encode Algorithm and Decode Algorithm.
      • abc,
        you have done a great job.
  • Omar,
    Just wanted to say thanks for putting together this website and all the useful info it contains.

    Keep it up :)
  • James,
    Hello,

    Could you please, specify, how "the size of the result will increase to 33%" when in the introductory article you wrote that with the Base64 encoding source (binary) code would reduce by size not increase?
    • Administrator,
      Hello dear James,
      I deeply apologize for any misleading information.

      If I understand you correctly, by “introductory article” do you mean What is Base64? If so, please note that there I compared the Base64 length with binary numeral system (where each byte is represented as 8 binary digits).

      Anyway, for example, if you encode the string “ABC” (Length = 3) to Base64, the result is “QUJD” (Length = 4). That is, the result is approximately 33% larger than the original data (more exactly, it is 4/3 of the original size).
  • Betty,
    Hi there!

    1. You have a wonderful, comprehensive tool here. Thank you for providing it!

    2. I hope I can get your help with a code a friend sent me. I have been working on this for weeks and keep coming up short.

    The code is:
    Wm0hWW0uWW0hWW0uWW4iWW4uWW0hWW0hWW0hWm4uWW4iWW4uWm0hWW0uWW0hWW4uWWohWW4uWWohWW0=

    I figured it was in base64 due to the = sign. I was told there were three steps to this code.
    Can you make any sense of this?

    Thank you for whatever guidance you may provide!
    • SingingLakitu,
      Have you figured out the code? :) If it helps, I put that through `base64.b64decode` in Python and I got:

      `Zm!Ym.Ym!Ym.Yn"Yn.Ym!Ym!Ym!Zn.Yn"Yn.Zm!Ym.Ym!Yn.Yj!Yn.Yj!Ym`

      To respect your privacy, I did not proceed further :)
      However, I wonder if each group of 3 characters represents some sort of base-3 number...
  • ShkiperDesna,
    Hello! I often used base64 and I noticed that some files end with = and some end with ==. In the Internet I found = points "fill the ending of the last byte by zero-bits", but when should I use == I couldn't find information...
    • Administrator,
      Hi! By and large, the padding character ensures that the length of the Base64 string is a multiple of four. So, if the output string is too small, it appends a padding character until the length is divisible by 4.

      Some examples:
      - A is encoded to QQ== and there are appended two padding characters because neither QQ (2 chars) nor QQ= (3 chars) is divisible by 4.

      - AB is encoded to QUI= and there is appended only one padding character because QUI (3 chars) is not divisible by 4.

      - ABC is encoded to QUJD and there is no need to add a padding character because Base64 string is 4 characters long (that is, it's divisible by 4).

      - ABCD is encoded to QUJDRA== and there are appended two padding characters because neither QUJDRA (6 chars) nor QUJDRA= (7 chars) is divisible by 4.

      - ABCDE is encoded to QUJDREU= and there is appended only one padding character because QUJDREU (7 chars) is not divisible by 4.

      - ABCDEF is encoded to QUJDREVG and there is no need to add a padding character because Base64 string is 8 characters long (that is, it's divisible by 4).
      • ShkiperDesna,
        Thank you very very much! Maybe, I would never find it if not you. Good luck)
  • Med,
    Hi, I have more than one string base64, every string means a PDF document with one page and I want to make those strings to one PDF document with PAGES, I search all the internet I didn't find a thing about it.
  • Azhar_Desai,
    Can a Base64 encoded string start with plus(+) or slash(/) character?
    • Administrator,
      Yes, Base64 strings can start with any of these characters. For example, JPEG images always start with /9j/.
  • Sarvy,
    Can a Base64 encoded string contain white spaces in between?
    • Administrator,
      No, it should not contain any whitespace. However, most often you can still decode such strings because many decoding functions simply ignore whitespace. Please use the Base64 Validator to run some tests and find out some tips.
  • Natan,
    I need to encode a string to base 64 using the Bcrypt custom alphabet (that alphabet uses "." instead of "+" and begin with ".\"). How can I do this? There are some reference about how to encoding base 64 with an specific alphabet?

    My system specification shows the Bcrypt custom alphabet like this: ./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
    • Administrator,
      Hi! It depends on what programming language you are using. Some of them have a built-in function for this, while for others you have to replace characters by yourself. Check out the following example Base64URL in PHP, where it replaces +/ with -_.
  • centrix,
    Hey, we got an update in form of a weird base64 string! As a hint, I think, there was a picture of scissors and a 10 - i tried decoding the base64, but I just can't manage to get to anything that makes sense, would anybody have a clue what would be to be done? The past riddle was to format hex to binary, and then to text, but this one is impossible, I think. The string is:


    V2pBalhOU2dXam1qV3RTZ1dqSTJYdFNnV2pTaldOU2dXakUyV0RTZ1dqbWpXdFNnV2pJMld0U2dXaklqV3RTZ1dqbWpYalNnV2pJaldqU2dXalNqV05TZ1dqSTJXdFNnV2pJald0U2dXakkyWHRTZ1dqU2pXTlNnV2pJald0U2dXakkyWHRTZ1dqSTJXdFNnV2pJMlhEU2dXaklqWURTZ1dqUzJXalNnV2pTaldOU2dXaldqV0RTZ1dqU2pXTlNnV2pJalhqU2dXakkyWHRTZ1dqSTJYdFNnV2ptalhEU2dXalNqV05TZ1dqSWpYTlNnV2ptaldqU2dXam1qWHRTZ1dqbWpYdFNnV2pTaldOU2dXam1qV2pTZ1dqSWpYTlNnV2pTalhqU2dXaklqV2pTZ1dqU2pXTlNnV2pJald0U2dXakkyWHRTZ1dqbWpYdFNnV2pJMlh0U2dXakkyV3RTZ1dqSWpXalNnV2pJMlh0U2dXalMyV2pTZ1dqU2pXTlNnV2pFald0U2dXakVqWURTZ1dqRWpYdFNnV2pBMlhEU2dXalNqV0RTZ1dqU2pXTlNnV2pFalhqU2dXam1qWURTZ1dqSWpXdFNnV2pJMlh0U2dXalNqV05TZ1dqSWpYTlNnV2ptald0U2dXam1qV2pTZ1dqbWpZTlNnV2ptaldEU2dXaklqV2pTZ1dqU2pXTlNnV2pJMld0U2dXaklqV3RTZ1dqSTJYdFNnV2pTaldOU2dXakkyWE5TZ1dqbWpZRFNnV2ptalhqU2dXam1qV2pTZ1dqbWpZTlNnV2ptaldEU2dXalMyV2pTZ1dqU2pXTlNnV2ptaldEU2dXakkyV3RTZ1dqbWpYalNnV2pJMlh0U2dXaklqV2pTZ1dqUzJXalNnV2pTaldOU2dXakkyV3RTZ1dqbWpZTlNnV2pJMlhEU2dXalNqV05TZ1dqbWpYalNnV2pJalhEU2dXakkyWE5TZ1dqbWpXdFNnV2pTaldOU2dXam1qWGpTZ1dqSWpYRFNnV2pJMlhOU2dXam1qV3RTZ1dqU2pXTlNnV2ptalhqU2dXam1qWURTZ1dqSWpXdFNnV2pJMlh0U2dXalNqV0E9PQ==