A virtual teacher who reveals to you the great secrets of Base64

Base64 Characters

The Base64 alphabet contains 64 basic ASCII characters which are used to encode data. Yeah, that’s right, 64 characters are enough to encode any data of any length. The only drawback is that the result is about 33% larger than the original data. However, the benefits are much more important, not least because all these symbols are available in both 7-bit and 8-bit character sets.

Characters of the Base64 alphabet can be grouped into four groups:

  • Uppercase letters (indices 0-25): ABCDEFGHIJKLMNOPQRSTUVWXYZ
  • Lowercase letters (indices 26-51): abcdefghijklmnopqrstuvwxyz
  • Digits (indices 52-61): 0123456789
  • Special symbols (indices 62-63): +/

It is very important to note that the Base64 letters are case sensitive. This means that, for example, when decoding the values “QQ==”, “Qq==”, “qq==”, and “qQ==” four different results are obtained.
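
The case sensitivity can be checked with a short sketch. It assumes Python and its standard `base64` module, which the article itself does not prescribe; the variable names are illustrative only.

```python
import base64

# The four values from the paragraph above; they differ only in letter case.
values = ["QQ==", "Qq==", "qq==", "qQ=="]

# Decode each value with Python's default (lenient) decoder.
decoded = {value: base64.b64decode(value) for value in values}

for value, raw in decoded.items():
    # Show the decoded byte as hex so non-printable results are visible.
    print(value, "->", raw.hex())
```

Each value decodes to a different byte, which is why the case of every letter has to be preserved when a Base64 string is copied or stored.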

For a better understanding, I grouped all characters into the Base64 table:

Uppercase Letters
Index  Character
0      A
1      B
2      C
3      D
4      E
5      F
6      G
7      H
8      I
9      J
10     K
11     L
12     M
13     N
14     O
15     P
16     Q
17     R
18     S
19     T
20     U
21     V
22     W
23     X
24     Y
25     Z

Lowercase Letters
Index  Character
26     a
27     b
28     c
29     d
30     e
31     f
32     g
33     h
34     i
35     j
36     k
37     l
38     m
39     n
40     o
41     p
42     q
43     r
44     s
45     t
46     u
47     v
48     w
49     x
50     y
51     z

Digits
Index  Character
52     0
53     1
54     2
55     3
56     4
57     5
58     6
59     7
60     8
61     9

Symbols
Index  Character
62     +
63     /
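
Because the four groups follow each other in strict order, the whole table can be rebuilt in a few lines. The sketch below assumes Python; `ALPHABET` and `INDEX` are names chosen here for illustration.

```python
import string

# Concatenate the four groups in table order:
# uppercase (0-25), lowercase (26-51), digits (52-61), symbols (62-63).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Reverse lookup: character -> index, as needed when decoding.
INDEX = {char: index for index, char in enumerate(ALPHABET)}

print(len(ALPHABET))           # number of characters in the alphabet
print(INDEX["Q"], INDEX["q"])  # the same letter in two cases has two indices
```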

In addition to these characters, the equal sign (=) is used for padding. That is, the equal sign has no index and is not involved in encoding the data. By and large, the padding character ensures that the length of a Base64 value is a multiple of 4 characters, and it is always appended at the end of the output. Nevertheless, the heart of the algorithm contains only 64 characters, and each of them has a unique index. Only the indices determine which characters will be used to encode the data, and only thanks to them can you “recover” the original data. All indices are listed in the Base64 table above.
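
The padding rule can be written down as simple arithmetic: every 3 input bytes become 4 output characters, and an incomplete final group is filled up with “=”. The following is a minimal sketch in Python; the helper names `padding_count` and `encoded_length` are made up for this example and are compared against the standard `base64` module.

```python
import base64


def padding_count(n_bytes: int) -> int:
    """Number of '=' characters appended for an input of n_bytes bytes."""
    return (3 - n_bytes % 3) % 3


def encoded_length(n_bytes: int) -> int:
    """Length of the padded Base64 output: 4 characters per started 3-byte group."""
    return 4 * ((n_bytes + 2) // 3)


for data in [b"A", b"AB", b"ABC", b"ABCD"]:
    encoded = base64.b64encode(data).decode("ascii")
    print(data, "->", encoded, padding_count(len(data)), encoded_length(len(data)))
```

The 4-to-3 ratio is also where the roughly 33% size increase comes from.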

Given all of the above, a Base64 value can be defined using the following regular expression:

^[A-Za-z0-9+/]+={0,2}$

However, some standards allow and even require the use of multi-line values. In such cases, the list of allowed characters must be extended with “Line Feed” and “Carriage Return”:

^[A-Za-z0-9+/\r\n]+={0,2}$
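
As a rough illustration, both expressions can be applied in Python as follows. The function name `looks_like_base64` is an assumption of this sketch, and `fullmatch` is used in place of the `^` and `$` anchors because Python's `$` also matches just before a trailing newline.

```python
import re

# The two patterns from the article, without ^ and $ (fullmatch anchors both ends).
SINGLE_LINE = re.compile(r"[A-Za-z0-9+/]+={0,2}")
MULTI_LINE = re.compile(r"[A-Za-z0-9+/\r\n]+={0,2}")


def looks_like_base64(value: str, multiline: bool = False) -> bool:
    """Return True if value consists only of characters the pattern allows."""
    pattern = MULTI_LINE if multiline else SINGLE_LINE
    return pattern.fullmatch(value) is not None


print(looks_like_base64("QUJDRA=="))
print(looks_like_base64("QUJD\r\nREVG"))
print(looks_like_base64("QUJD\r\nREVG", multiline=True))
```

Note that these expressions only check which characters appear and where the padding sits. They do not check that the length is a multiple of four, so a string such as “Q” passes the pattern even though it cannot be decoded.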
Comments (51)

I hope you enjoy this discussion. In any case, I ask you to join it.

  • Brice,
    Hey thanks for your dedication to the subject and for making this helpful site. Where I currently am, Wikipedia is filtered and I needed to lookup the 'alphabet' since I didn't know it by heart. I learned a few things too while I was reading.

    Btw, I can't put a space in the Name field of this comment form. I was going to leave my first and last name but validation doesn't allow space.
    • Administrator,
      Hello dear Brice,
      Thank you for your comment. I’m glad you found this site useful.

      I apologize for the inconvenience with the “name” field. At the moment, I can’t fix it because it may break some related things, but I’ll look into it as soon as possible.

      By the way, if you want to remember the Base64 characters, you just need to remember the order of these four groups: Uppercase, Lowercase, Digits, and Symbols. With that in mind, you can easily reconstruct the Base64 alphabet, since the indices, as well as the characters within each group, go in strict order.

      If you are looking for further reading, I recommend What is Base64? as well as the explanations of the Encode Algorithm and the Decode Algorithm.
      • abc,
        you have done a great job.
        • 123,
          ahah thanks
  • Omar,
    Just wanted to say thanks for putting together this website and all the useful info it contains.

    Keep it up :)
  • James,
    Hello,

    Could you please, specify, how "the size of the result will increase to 33%" when in the introductory article you wrote that with the Base64 encoding source (binary) code would reduce by size not increase?
    • Administrator,
      Hello dear James,
      I deeply apologize for any misleading information.

      If I understand you correctly, by “introductory article” do you mean What is Base64? If so, please note that there I compared the Base64 length with binary numeral system (where each byte is represented as 8 binary digits).

      Anyway, for example, if you encode the string “ABC” (Length = 3) to Base64, the result is “QUJD” (Length = 4). That is, the result is approximately 33% larger than the original data (more exactly, its length is 4/3 of the original).
  • Betty,
    Hi there!

    1. You have a wonderful, comprehensive tool here. Thank you for providing it!

    2. I hope I can get your help with a code a friend sent me. I have been working on this for weeks and keep coming up short.

    The code is:
    Wm0hWW0uWW0hWW0uWW4iWW4uWW0hWW0hWW0hWm4uWW4iWW4uWm0hWW0uWW0hWW4uWWohWW4uWWohWW0=

    I figured it was in base64 due to the = sign. I was told there were three steps to this code.
    Can you make any sense of this?

    Thank you for whatever guidance you may provide!
    • SingingLakitu,
      Have you figured out the code? :) If it helps, I put that through `base64.b64decode` in Python and I got:

      `Zm!Ym.Ym!Ym.Yn"Yn.Ym!Ym!Ym!Zn.Yn"Yn.Zm!Ym.Ym!Yn.Yj!Yn.Yj!Ym`

      To respect your privacy, I did not proceed further :)
      However, I wonder if each group of 3 characters represents some sort of base-3 number...
  • ShkiperDesna,
    Hello! I often used base64 and I noticed that some files end with = and some end with ==. In the Internet I found = points "fill the ending of the last byte by zero-bits", but when should I use == I couldn't find information...
    • Administrator,
      Hi! By and large, the padding character ensures that the length of the Base64 string is a multiple of four. So, if the output string is too short, padding characters are appended until its length is divisible by 4.

      Some examples:
      - A is encoded to QQ== and two padding characters are appended because the length of neither QQ (2 chars) nor QQ= (3 chars) is divisible by 4.

      - AB is encoded to QUI= and only one padding character is appended because the length of QUI (3 chars) is not divisible by 4.

      - ABC is encoded to QUJD and there is no need to add a padding character because the Base64 string is 4 characters long (that is, its length is divisible by 4).

      - ABCD is encoded to QUJDRA== and two padding characters are appended because the length of neither QUJDRA (6 chars) nor QUJDRA= (7 chars) is divisible by 4.

      - ABCDE is encoded to QUJDREU= and only one padding character is appended because the length of QUJDREU (7 chars) is not divisible by 4.

      - ABCDEF is encoded to QUJDREVG and there is no need to add a padding character because the Base64 string is 8 characters long (that is, its length is divisible by 4).
      • ShkiperDesna,
        Thank you very very much! Maybe, I would never find it if not you. Good luck)
  • Med,
    Hi, I have more than one string base64, every string means a PDF document with one page and I want to make those strings to one PDF document with PAGES, I search all the internet I didn't find a thing about it.
  • Azhar_Desai,
    Can a Base64 encoded string start with plus(+) or slash(/) character?
    • Administrator,
      Yes, Base64 strings can start with any of these characters. For example, JPEG images always start with /9j/.
  • Sarvy,
    Can a Base64 encoded string contain white spaces in between?
    • Administrator,
      No, it should not contain any whitespaces. However, most often you can decode such strings because many decoding functions simply ignore whitespaces. Please use the Base64 Validator to run some tests and find out some tips.
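
      As one concrete illustration of a decoder that ignores whitespace, here is a small sketch using Python's standard `base64` module (the choice of language and the sample string are assumptions of this example). By default the decoder discards characters outside the alphabet, while `validate=True` makes it reject them.

```python
import base64
import binascii

# "QUJDREVG" (the encoding of "ABCDEF") with a space and a line break inserted.
spaced = "QU JD\nREVG"

# Default mode: characters outside the Base64 alphabet are silently discarded.
lenient_result = base64.b64decode(spaced)
print(lenient_result)

# Strict mode: the same input is rejected.
try:
    base64.b64decode(spaced, validate=True)
    strict_ok = True
except binascii.Error:
    strict_ok = False
print(strict_ok)
```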
  • Natan,
    I need to encode a string to base 64 using the Bcrypt custom alphabet (that alphabet uses "." instead of "+" and begin with ".\"). How can I do this? There are some reference about how to encoding base 64 with an specific alphabet?

    My system specification shows the Bcrypt custom alphabet like this: ./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
    • Administrator,
      Hi! It depends on what programming language you are using. Some of them have a built-in function for this, while in others you have to replace the characters yourself. Check out the example Base64URL in PHP, which replaces +/ with -_.
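
      For a language without a built-in option, the replacement can be sketched as a character-by-character translation between the two alphabets. The example below assumes Python and uses the alphabet quoted in the question; the function names are hypothetical, and the “=” padding is left untouched here, so whether it has to be stripped depends on the target system.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# Custom alphabet as quoted in the question above.
CUSTOM = "./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

# Translation tables map the character at index i in one alphabet
# to the character at the same index in the other.
TO_CUSTOM = str.maketrans(STANDARD, CUSTOM)
TO_STANDARD = str.maketrans(CUSTOM, STANDARD)


def encode_custom(data: bytes) -> str:
    """Encode with standard Base64, then swap in the custom alphabet."""
    return base64.b64encode(data).decode("ascii").translate(TO_CUSTOM)


def decode_custom(text: str) -> bytes:
    """Swap back to the standard alphabet, then decode as usual."""
    return base64.b64decode(text.translate(TO_STANDARD))


print(encode_custom(b"ABC"))
print(decode_custom(encode_custom(b"ABC")))
```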
  • centrix,
    Hey, we got an update in form of a weird base64 string! As a hint, I think, there was a picture of scissors and a 10 - i tried decoding the base64, but I just can't manage to get to anything that makes sense, would anybody have a clue what would be to be done? The past riddle was to format hex to binary, and then to text, but this one is impossible, I think. The string is:


    V2pBalhOU2dXam1qV3RTZ1dqSTJYdFNnV2pTaldOU2dXakUyV0RTZ1dqbWpXdFNnV2pJMld0U2dXaklqV3RTZ1dqbWpYalNnV2pJaldqU2dXalNqV05TZ1dqSTJXdFNnV2pJald0U2dXakkyWHRTZ1dqU2pXTlNnV2pJald0U2dXakkyWHRTZ1dqSTJXdFNnV2pJMlhEU2dXaklqWURTZ1dqUzJXalNnV2pTaldOU2dXaldqV0RTZ1dqU2pXTlNnV2pJalhqU2dXakkyWHRTZ1dqSTJYdFNnV2ptalhEU2dXalNqV05TZ1dqSWpYTlNnV2ptaldqU2dXam1qWHRTZ1dqbWpYdFNnV2pTaldOU2dXam1qV2pTZ1dqSWpYTlNnV2pTalhqU2dXaklqV2pTZ1dqU2pXTlNnV2pJald0U2dXakkyWHRTZ1dqbWpYdFNnV2pJMlh0U2dXakkyV3RTZ1dqSWpXalNnV2pJMlh0U2dXalMyV2pTZ1dqU2pXTlNnV2pFald0U2dXakVqWURTZ1dqRWpYdFNnV2pBMlhEU2dXalNqV0RTZ1dqU2pXTlNnV2pFalhqU2dXam1qWURTZ1dqSWpXdFNnV2pJMlh0U2dXalNqV05TZ1dqSWpYTlNnV2ptald0U2dXam1qV2pTZ1dqbWpZTlNnV2ptaldEU2dXaklqV2pTZ1dqU2pXTlNnV2pJMld0U2dXaklqV3RTZ1dqSTJYdFNnV2pTaldOU2dXakkyWE5TZ1dqbWpZRFNnV2ptalhqU2dXam1qV2pTZ1dqbWpZTlNnV2ptaldEU2dXalMyV2pTZ1dqU2pXTlNnV2ptaldEU2dXakkyV3RTZ1dqbWpYalNnV2pJMlh0U2dXaklqV2pTZ1dqUzJXalNnV2pTaldOU2dXakkyV3RTZ1dqbWpZTlNnV2pJMlhEU2dXalNqV05TZ1dqbWpYalNnV2pJalhEU2dXakkyWE5TZ1dqbWpXdFNnV2pTaldOU2dXam1qWGpTZ1dqSWpYRFNnV2pJMlhOU2dXam1qV3RTZ1dqU2pXTlNnV2ptalhqU2dXam1qWURTZ1dqSWpXdFNnV2pJMlh0U2dXalNqV0E9PQ==
  • Yashar,
    Heya!
    Thanks for helping us all out!

    Can you please tell how do I extract relevant data of a base64 encoded image?

    Eg:
    Let's say I will always have a defined set of objects in my image (box, circle, triangle) on a white background. Can I create a code that will tell me which object (box etc) is present where in the image and is connected with which other object (let's say two objects are connected via line) - by reading the base64 encoded file?

    A code that can read and convert my image into a defined JSON template is what I want.
    Thanks.
    • Administrator,
      Hey! Sorry, but your question is not related to Base64.
  • Adrienne,
    My program decodes base64 strings with no problem, but recently came across a string with an extra "tail" on it. When everything up to the "." is decoded, it comes out gibberish, so I'd like to know if you know or can surmise how to use the "tail" code to accomplish a proper decode. Thank you! Adrienne

    0XZzxWYmpj.06d1162
    • Adrienne,
      a little additional info; these strings should decode to a json string that starts {"title":"May 15, 2022","subtitle":, also each day the json data is different, but the tail has been the same, ".06d1162"
      Looking at the end of the base64 string I sent, the placement of the "=" would seem to indicate that there is some sort of shifting or shuffling going on.
    • Adrienne,
      I just realized I have the same data, encoded both ways, with and without the tail.

      This is the json string encoded with plain base64:

      
    • Adrienne,
      and this is the same json string encoded with the tail:

      0.06d1162

      and I can upload the plain text json string too if that helps.
    • Adrienne,
      looking at the 2 base64 strings, I realized it's being reversed in small chunks, first 4 chars,then 8,3,3,15,8,2, then the sequence repeats. Now all I have to figure out is how the 06d1162 string at the end translates into the 4-8-3-3-15-8-2 sequence, in case it changes.
      • DJWord,
        Thanks for the start. Assume it's a hex string, invert it (2611d60) and add 2 to each value (0x2+2=4, 0x6+2=8, 0x1+2=3, 0x1+2=3, 0xd(13)+2=15, 0x6+2=8, 0x0+2=2). Worked for the latest b8a308e - 16-10-2-5-12-10-13.
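
        A minimal sketch of that formula, assuming Python. The function names are made up for this example, and the handling of a final chunk that is shorter than its nominal length is a guess rather than something confirmed in this thread.

```python
from itertools import cycle


def chunk_lengths(tail: str) -> list[int]:
    """Read the tail as hex digits, reverse their order, and add 2 to each."""
    return [int(digit, 16) + 2 for digit in reversed(tail)]


def reverse_chunks(text: str, tail: str) -> str:
    """Reverse text chunk by chunk, cycling through the chunk lengths."""
    pieces = []
    position = 0
    for length in cycle(chunk_lengths(tail)):
        if position >= len(text):
            break
        # Assumption: a short final chunk is simply reversed as-is.
        pieces.append(text[position:position + length][::-1])
        position += length
    return "".join(pieces)


print(chunk_lengths("06d1162"))
print(chunk_lengths("b8a308e"))
```

        Since each chunk is only reversed in place, applying `reverse_chunks` a second time with the same tail restores the original string.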
        • Adrienne,
          Yes, I did figure that out, and the formula has been working on different strings, and now I’m starting to see new codes and it’s still working. I didn’t post it since I didn’t know if the page was still monitored. Thank You.
        • Adrienne,
          Oh! just realized you posted the latest code lol so you know the source, very cool
        • Adrienne,
          feel free to email me if you are interested in receiving the puz files
        • Adrienne,
          So I'm curious if you are also "jpd236" author of Crossword Scraper.. I noticed it stopped working for WinnipegFreePress today, as did my crossword downloader, but I discovered the map-code is now encoded in another window.raw string, and no longer reversed.
  • OriginWormy,
    How does a forward slash appear in a Base64 string from plain text?
  • pritesh,
    What regular expression should i use for my string U2FsdGVkX1967SyD064v77zUKCtnEbB1wy2+8Bs5/sM=?
    I have tried both:

    1.                <xs:restriction base="xs:string">
                            <xs:maxLength value="100" />
                            <xs:pattern value="^[A-Za-z0-9+/]+={0,2}$" />
                      </xs:restriction>

    2.                <xs:restriction base="xs:string">
                            <xs:maxLength value="100" />
                            <xs:pattern value="^[A-Za-z0-9+/\r\n]+={0,2}$" />
                      </xs:restriction>

    But both are not able to validate the my base64 string.
    Please help
  • Tzvika,
    Hi there, thanks for this meaningful website!
    after reading your articles and answers and in other forms I understand that,
    base64 can not have whitespaces and the signs + and / can be used depending on the base64 implementation.
    So is it safe to assume that if I encounter a whitespace then it probably override a previous + or / signs ?

    Many thanks!
  • Talha,
    I have a couple of questions.

    1. Why does Base64 not start with 0? any particular reason?
    2. Why the special chars are "+" and "/" specifically? not any other chars.
  • bayu,
    What is the difference

    (base64) = h
    (base64) = q
    vLFfghR5tNV3K9DKhmwArV+SbjWAcgZZzIDTnJ0JgCo=h
    r6wt0ArZSmas0z/zuRK4syYcdBu/2pfLr02IE4OL90U=q


    Please help me
  • Uelsonn,
    Base58 is better than base64 ?
  • Bigblackchoc,
    9-26-21-7-3-8-23-8-13-9-14-17-6 3-14-12-3-4-19-15-12 7-3-19-3-13