How to check for a valid Base64 encoded string

C#, Validation, Base64

C# Problem Overview


Is there a way in C# to check whether a string is Base64 encoded, other than trying to convert it and seeing whether an error occurs? I have code like this:

// Convert base64-encoded hash value into a byte array.
byte[] HashBytes = Convert.FromBase64String(Value);

I want to avoid the "Invalid character in a Base-64 string" exception that happens if the value is not a valid Base64 string. I want to just check and return false instead of handling an exception, because I expect that sometimes this value is not going to be a Base64 string. Is there some way to check before calling Convert.FromBase64String?

Thanks!

Update:
Thanks for all of your answers. Here is an extension method you can all use; so far it seems to make sure your string will pass Convert.FromBase64String without an exception. .NET seems to ignore all leading and trailing spaces when decoding Base64, so "1234" is valid and so is " 1234 ".

public static bool IsBase64String(this string s)
{
    s = s.Trim();
    return (s.Length % 4 == 0) && Regex.IsMatch(s, @"^[a-zA-Z0-9\+/]*={0,3}$", RegexOptions.None);

}

For those wondering about the performance of testing versus catching an exception: in most cases for this Base64 check it is faster to test first than to catch the exception, until you reach a certain length. The shorter the string, the bigger the advantage.

In my very unscientific testing: for 10,000 iterations with string lengths of 100,000 to 110,000 characters, it was 2.7 times faster to test first.

For 1,000 iterations with string lengths of 1 to 16 characters (16,000 tests in total), it was 10.9 times faster.

I am sure there is a point where the exception-based method becomes the better choice. I just don't know where that point is.

C# Solutions


Solution 1 - C#

Use Convert.TryFromBase64String, available from .NET Core 2.1 and .NET Standard 2.1 (the Span<byte> parameter needs C# 7.2 or later):

public static bool IsBase64String(string base64)
{
   Span<byte> buffer = new Span<byte>(new byte[base64.Length]);
   return Convert.TryFromBase64String(base64, buffer , out int bytesParsed);
}
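A minimal sketch along the same lines, for the case where the decoded bytes are needed as well. The names Base64Probe and TryDecode are made up for illustration; only Convert.TryFromBase64String is the real API. As far as I can tell, an empty string decodes successfully to zero bytes, so add an explicit check if that should count as invalid.

```csharp
using System;

public static class Base64Probe
{
    // Hypothetical helper: returns true and the decoded bytes when the input is valid Base64.
    public static bool TryDecode(string input, out byte[] bytes)
    {
        bytes = Array.Empty<byte>();
        if (input == null)
            return false;

        // Every 4 Base64 characters decode to at most 3 bytes,
        // so this buffer is large enough for any valid input.
        byte[] buffer = new byte[(input.Length / 4 + 1) * 3];
        if (!Convert.TryFromBase64String(input, buffer, out int written))
            return false;

        // Copy only the bytes that were actually written.
        bytes = buffer.AsSpan(0, written).ToArray();
        return true;
    }
}
```

Usage: Base64Probe.TryDecode("aGVsbG8=", out var bytes) returns true and bytes holds the UTF-8 encoding of "hello".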

Solution 2 - C#

Update: For newer versions of C#, there's a much better alternative, please refer to the answer by Tomas here: https://stackoverflow.com/a/54143400/125981.


It's pretty easy to recognize a Base64 string, as it will only be composed of the characters 'A'..'Z', 'a'..'z', '0'..'9', '+', '/' and it is often padded at the end with up to two '=' to make the length a multiple of 4. But instead of comparing these, you'd be better off simply catching and ignoring the exception, if it occurs.

Solution 3 - C#

I know you said you didn't want to catch an exception. But, because catching an exception is more reliable, I will go ahead and post this answer.

public static bool IsBase64(this string base64String) {
     // Credit: oybek https://stackoverflow.com/users/794764/oybek
     if (string.IsNullOrEmpty(base64String) || base64String.Length % 4 != 0
        || base64String.Contains(" ") || base64String.Contains("\t") || base64String.Contains("\r") || base64String.Contains("\n"))
        return false;

     try{
         Convert.FromBase64String(base64String);
         return true;
     }
     catch(Exception exception){
     // Handle the exception
     }
     return false;
}

Update: I've updated the condition thanks to oybek to further improve reliability.

Solution 4 - C#

I believe the regex should be:

    Regex.IsMatch(s, @"^[a-zA-Z0-9\+/]*={0,2}$")

Only matching one or two trailing '=' signs, not three.

s should be the string that will be checked. Regex is part of the System.Text.RegularExpressions namespace.
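A small sketch of how this pattern might be combined with the length check from the question. The class and method names (Base64Pattern, LooksLikeBase64) are assumptions for illustration, and the regex is compiled once and reused. Note that an empty string passes both checks.

```csharp
using System.Text.RegularExpressions;

public static class Base64Pattern
{
    // Base64 alphabet followed by at most two '=' padding characters.
    private static readonly Regex Pattern =
        new Regex(@"^[a-zA-Z0-9\+/]*={0,2}$", RegexOptions.Compiled);

    // Hypothetical helper: syntactic check only, nothing is decoded.
    public static bool LooksLikeBase64(string s)
    {
        if (s == null)
            return false;

        // Leading and trailing whitespace is tolerated, as in the question's version.
        s = s.Trim();
        return s.Length % 4 == 0 && Pattern.IsMatch(s);
    }
}
```

With {0,3} in the pattern, a string such as "1===" would be accepted; with {0,2} it is rejected.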

Solution 5 - C#

Just for the sake of completeness I want to provide some implementation. Generally speaking Regex is an expensive approach, especially if the string is large (which happens when transferring large files). The following approach tries the fastest ways of detection first.

public static class HelperExtensions {
    // Characters that are used in base64 strings.
    private static Char[] Base64Chars = new[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/' };
    /// <summary>
    /// Extension method to test whether the value is a base64 string
    /// </summary>
    /// <param name="value">Value to test</param>
    /// <returns>Boolean value, true if the string is base64, otherwise false</returns>
    public static Boolean IsBase64String(this String value) {
    
        // The quickest test. If the value is null or its length is 0, it is not base64.
        // A base64 string's length is always divisible by four, e.g. 8, 12, 16 etc.
        // If it is not, you can return false. Quite effective.
        // Further, if it meets the above criteria, then test for whitespace.
        // If it contains whitespace, it is not treated as base64.
        if (value == null || value.Length == 0 || value.Length % 4 != 0
            || value.Contains(' ') || value.Contains('\t') || value.Contains('\r') || value.Contains('\n'))
            return false;
        
        // 98% of all non base64 values are invalidated by this time.
        var index = value.Length - 1;
        
        // if there is padding step back
        if (value[index] == '=')
            index--;
            
        // if there are two padding chars step back a second time
        if (value[index] == '=')
            index--;
        
        // Now traverse over characters
        // You should note that I'm not creating any copy of the existing strings, 
        // assuming that they may be quite large
        for (var i = 0; i <= index; i++) 
            // If any of the character is not from the allowed list
            if (!Base64Chars.Contains(value[i]))
                // return false
                return false;
                
        // If we got here, then the value is a valid base64 string
        return true;
    }
}

EDIT

As suggested by Sam, you can also change the source code slightly. He provides a better performing approach for the last step of tests. The routine

    private static Boolean IsInvalid(char value) {
        var intValue = (Int32)value;

        // 0 - 9
        if (intValue >= 48 && intValue <= 57) 
            return false;

        // A - Z
        if (intValue >= 65 && intValue <= 90) 
            return false;

        // a - z
        if (intValue >= 97 && intValue <= 122) 
            return false;

        // + or /
        return intValue != 43 && intValue != 47;
    } 

can be used to replace if (!Base64Chars.Contains(value[i])) line with if (IsInvalid(value[i]))

The complete source code with enhancements from Sam will look like this (removed comments for clarity)

public static class HelperExtensions {
    public static Boolean IsBase64String(this String value) {
        if (value == null || value.Length == 0 || value.Length % 4 != 0
            || value.Contains(' ') || value.Contains('\t') || value.Contains('\r') || value.Contains('\n'))
            return false;
        var index = value.Length - 1;
        if (value[index] == '=')
            index--;
        if (value[index] == '=')
            index--;
        for (var i = 0; i <= index; i++)
            if (IsInvalid(value[i]))
                return false;
        return true;
    }
    // Private, because the name makes no sense to an outside caller
    private static Boolean IsInvalid(char value) {
        var intValue = (Int32)value;
        if (intValue >= 48 && intValue <= 57)
            return false;
        if (intValue >= 65 && intValue <= 90)
            return false;
        if (intValue >= 97 && intValue <= 122)
            return false;
        return intValue != 43 && intValue != 47;
    }
}

Solution 6 - C#

Why not just catch the exception, and return False?

This avoids additional overhead in the common case.
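A minimal sketch of that idea, assuming a hypothetical helper named TryFromBase64 that also hands back the decoded bytes so the conversion is not repeated. It catches only FormatException, which is what Convert.FromBase64String throws for malformed input.

```csharp
using System;

public static class Base64Fallback
{
    // Hypothetical helper: reports malformed input as false instead of throwing.
    public static bool TryFromBase64(string input, out byte[] bytes)
    {
        bytes = null;
        if (input == null)
            return false; // FromBase64String would throw ArgumentNullException here

        try
        {
            bytes = Convert.FromBase64String(input);
            return true;
        }
        catch (FormatException)
        {
            // Invalid characters, bad length, or bad padding.
            return false;
        }
    }
}
```

When most inputs are valid, the try block costs almost nothing; the exception is only paid for on the invalid ones.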

Solution 7 - C#

The answer must depend on the usage of the string. There are many strings that may be "valid Base64" according to the syntax suggested by several posters, but that "correctly" decode, without exception, to junk. Example: the 8-character string Portland is valid Base64. What is the point of stating that this is valid Base64? I guess that at some point you'd want to know whether this string should or should not be Base64 decoded.

In my case, I am reading Oracle connection strings from file app.config that may be either in plain text like:

Data source=mydb/DBNAME;User Id=Roland;Password=secret1;

or in base64 like

VXNlciBJZD1sa.....................................==

(my predecessor considered base64 as encryption :-)

In order to decide if base64 decoding is needed, in this particular use case, I should simply check if the string starts with "Data" (case insensitive). This is much easier, faster, and more reliable than just trying to decode and seeing whether an exception occurs:

// StartsWith avoids the exception that Substring(0, 4) throws for strings shorter than 4 characters
if (!ConnectionString.StartsWith("data", StringComparison.OrdinalIgnoreCase))
{
  //..DecodeBase64..
}

I updated this answer; my old conclusion was:

I just have to check for the presence of a semicolon, because that proves that it is NOT base64, which is of course faster than any above method.

Solution 8 - C#

I prefer this usage:

    public static class StringExtensions
    {
        /// <summary>
        /// Check if string is Base64
        /// </summary>
        /// <param name="base64"></param>
        /// <returns></returns>
        public static bool IsBase64String(this string base64)
        {
            //https://stackoverflow.com/questions/6309379/how-to-check-for-a-valid-base64-encoded-string
            Span<byte> buffer = new Span<byte>(new byte[base64.Length]);
            return Convert.TryFromBase64String(base64, buffer, out int _);
        }
    }

Then usage

if(myStr.IsBase64String()){

    ...

}

Solution 9 - C#

I use it like this, so that I don't need to call the convert method again:

   public static bool IsBase64(this string base64String,out byte[] bytes)
    {
        bytes = null;
        // Credit: oybek http://stackoverflow.com/users/794764/oybek
        if (string.IsNullOrEmpty(base64String) || base64String.Length % 4 != 0
           || base64String.Contains(" ") || base64String.Contains("\t") || base64String.Contains("\r") || base64String.Contains("\n"))
            return false;

        try
        {
             bytes=Convert.FromBase64String(base64String);
            return true;
        }
        catch (Exception)
        {
            // Handle the exception
        }

        return false;
    }

Solution 10 - C#

Knibb High football rules!

This should be relatively fast and accurate but I admit I didn't put it through a thorough test, just a few.

It avoids expensive exceptions and regex, and also avoids looping through a character set; instead it uses ASCII ranges for validation.

public static bool IsBase64String(string s)
    {
        s = s.Trim();
        int mod4 = s.Length % 4;
        if(mod4!=0){
            return false;
        }
        int i=0;
        bool checkPadding = false;
        int paddingCount = 1;//only applies when the first is encountered.
        for(i=0;i<s.Length;i++){
            char c = s[i];
            if (checkPadding)
            {
                if (c != '=')
                {
                    return false;
                }
                paddingCount++;
                if (paddingCount > 2) // Base64 allows at most two '=' padding characters
                {
                    return false;
                }
                continue;
            }
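            // 'A'..'z' as a single range would also admit [ \ ] ^ _ and `, so test both letter ranges separately
            if(c>='A' && c<='Z' || c>='a' && c<='z' || c>='0' && c<='9'){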
                continue;
            }
            switch(c){ 
              case '+':
              case '/':
                 continue;
              case '=': 
                 checkPadding = true;
                 continue;
            }
            return false;
        }
        //if here
        //, length was correct
        //, there were no invalid characters
        //, padding was correct
        return true;
    }

Solution 11 - C#

public static bool IsBase64String1(string value)
        {
            if (string.IsNullOrEmpty(value))
            {
                return false;
            }
            try
            {
                Convert.FromBase64String(value);
                if (value.EndsWith("="))
                {
                    value = value.Trim();
                    int mod4 = value.Length % 4;
                    if (mod4 != 0)
                    {
                        return false;
                    }
                    return true;
                }
                else
                {

                    return false;
                }
            }
            catch (FormatException)
            {
                return false;
            }
        }

Solution 12 - C#

Imho this is not really possible. All posted solutions fail for strings like "test" and so on. If the length is divisible by 4, the string is not null or empty, and it contains only valid Base64 characters, it will pass all tests. That can be many strings ...

So there is no real solution other than knowing that this is a base 64 encoded string. What I've come up with is this:

if (base64DecodedString.StartsWith("<xml>"))
{
    // This was really a base64 encoded string I was expecting. Yippie!
}
else
{
    // This is gibberish.
}

I expect that the decoded string begins with a certain structure, so I check for that.
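The full flow might look roughly like the sketch below: decode first, then test the decoded text for the expected structure. The names PayloadCheck and TryDecodeExpected are hypothetical, and the sketch assumes the payload is UTF-8 text.

```csharp
using System;
using System.Text;

public static class PayloadCheck
{
    // Hypothetical helper: succeeds only if the input decodes AND the decoded text
    // starts with the prefix the caller expects (assumes UTF-8 text payloads).
    public static bool TryDecodeExpected(string input, string expectedPrefix, out string decoded)
    {
        decoded = null;
        if (string.IsNullOrEmpty(input))
            return false;

        try
        {
            string text = Encoding.UTF8.GetString(Convert.FromBase64String(input));
            if (!text.StartsWith(expectedPrefix, StringComparison.Ordinal))
                return false; // decodes fine, but to something unexpected

            decoded = text;
            return true;
        }
        catch (FormatException)
        {
            return false; // not Base64 at all
        }
    }
}
```

A word like "test" decodes without error but fails the prefix check, which is exactly the case the syntactic tests cannot catch.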

Solution 13 - C#

Decode, re-encode, and compare the result to the original string:

public static Boolean IsBase64(this String str)
{
    if ((str.Length % 4) != 0)
    {
        return false;
    }

    //decode - encode and compare
    try
    {
        // Round-trip the raw bytes; going through a UTF-8 string would corrupt arbitrary binary data
        byte[] decoded = System.Convert.FromBase64String(str);
        string encoded = System.Convert.ToBase64String(decoded);
        // Base64 is case-sensitive, so compare ordinally
        if (str.Equals(encoded, StringComparison.Ordinal))
        {
            return true;
        }
    }
    catch { }
    return false;
}

Solution 14 - C#

All of the answers have been combined into one function that applies every check discussed above.

1) Use the function as below:

string encoded = "WW91ckJhc2U2NHN0cmluZw==";
Console.WriteLine("Is string base64=" + IsBase64(encoded));

2) Below is the function:

public bool IsBase64(string base64String)
{
    // Cheap checks first: null or empty, length not a multiple of 4, or embedded whitespace.
    if (string.IsNullOrEmpty(base64String) || base64String.Length % 4 != 0
        || base64String.Contains(" ") || base64String.Contains("\t") || base64String.Contains("\r") || base64String.Contains("\n"))
    {
        return false;
    }
    // Only Base64 alphabet characters, followed by at most two '=' padding characters.
    if (!System.Text.RegularExpressions.Regex.IsMatch(base64String, @"^[a-zA-Z0-9\+/]*={0,2}$"))
    {
        return false;
    }
    try
    {
        // Decode, re-encode, and compare with the input (Base64 is case-sensitive).
        return base64String.Equals(Convert.ToBase64String(Convert.FromBase64String(base64String)), StringComparison.Ordinal);
    }
    catch (FormatException)
    {
        return false;
    }
}

Solution 15 - C#

Yes, since Base64 encodes binary data into ASCII strings using a limited set of characters, you can simply check it with this regular expression:

/^[A-Za-z0-9=\+/\s\n]+$/s

which ensures the string contains only A-Z, a-z, 0-9, '+', '/', '=', and whitespace. In C#, drop the surrounding /.../s delimiters and pass the pattern itself to Regex.IsMatch.

Solution 16 - C#

I just wanted to point out that none of the answers to date are very usable (depending on your use case, but bear with me).

All of them will return false positives for strings of a length divisible by 4 that do not contain whitespace. If you adjust for missing padding, all strings within the [a-zA-Z0-9]+ range will register as base64 encoded.

It doesn't matter if you check for valid characters and length, or use the exception or TryFromBase64String approach; all these methods return false positives.

Some simple examples:

  • "test" will register as base64 encoded
  • "test1" will register as base64 encoded if you adjust for missing padding (trailing '=')
  • "test test" will never register as base64 encoded
  • "tést" will never register as base64 encoded

I'm not saying the methods described here are useless, but you should be aware of the limitations before you use any of these in a production environment.
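The first of these is easy to reproduce. The sketch below uses a hypothetical helper, DecodedLength, that reports how many bytes a string decodes to, or -1 when decoding fails.

```csharp
using System;

public static class FalsePositiveDemo
{
    // Hypothetical helper: number of bytes the input decodes to, or -1 if it is not valid Base64.
    public static int DecodedLength(string s)
    {
        try
        {
            return Convert.FromBase64String(s).Length;
        }
        catch (FormatException)
        {
            return -1;
        }
    }
}
```

DecodedLength("test") returns 3: the ordinary word decodes without complaint to three bytes of junk, while "t\u00e9st" (with the accented character) returns -1.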

Solution 17 - C#

I would suggest creating a regex to do the job. You'll have to check for something like this: [a-zA-Z0-9+/=]. You'll also have to check the length of the string. I'm not sure on this one, but I'm pretty sure that if something gets trimmed (other than the padding "=") it would blow up.

Or better yet check out this stackoverflow question

Solution 18 - C#

Sure. Just make sure each character is within a-z, A-Z, 0-9, /, or +, and the string ends with zero, one, or two '=' padding characters. (At least, that's the most common Base64 implementation. You might find some implementations that use characters different from / or + for the last two characters.)

Solution 19 - C#

I have just had a very similar requirement where I am letting the user do some image manipulation in a <canvas> element and then sending the resulting image retrieved with .toDataURL() to the backend. I wanted to do some server validation before saving the image and have implemented a ValidationAttribute using some of the code from other answers:

[AttributeUsage(AttributeTargets.Property, AllowMultiple = false, Inherited = false)]
public class Base64PngImageAttribute : ValidationAttribute
{
	public override bool IsValid(object value)
	{
		if (value == null || string.IsNullOrWhiteSpace(value as string))
			return true; // not concerned with whether or not this field is required
		var base64string = (value as string).Trim();

		// we are expecting a URL type string
		if (!base64string.StartsWith("data:image/png;base64,"))
			return false;

		base64string = base64string.Substring("data:image/png;base64,".Length);

		// match length and regular expression
		if (base64string.Length % 4 != 0 || !Regex.IsMatch(base64string, @"^[a-zA-Z0-9\+/]*={0,2}$", RegexOptions.None))
			return false;

		// finally, try to convert it to a byte array and catch exceptions
		try
		{
			byte[] converted = Convert.FromBase64String(base64string);
			return true;
		}
		catch(Exception)
		{
			return false;
		}
	}
}

As you can see I am expecting an image/png type string, which is the default returned by <canvas> when using .toDataURL().

Solution 20 - C#

Check Base64 or normal string

public bool IsBase64Encoded(String str)
{
    try
    {
        // If no exception is caught, then it is possibly a base64 encoded string
        byte[] data = Convert.FromBase64String(str);
        // The part that checks if the string was properly padded to the
        // correct length was borrowed from d@anish's solution
        return (str.Replace(" ","").Length % 4 == 0);
    }
    catch
    {
        // If exception is caught, then it is not a base64 encoded string
        return false;
    }
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Chris Mullins | View Question on Stackoverflow
Solution 1 - C# | Tomas Kubes | View Answer on Stackoverflow
Solution 2 - C# | Anirudh Ramanathan | View Answer on Stackoverflow
Solution 3 - C# | harsimranb | View Answer on Stackoverflow
Solution 4 - C# | JD Brennan | View Answer on Stackoverflow
Solution 5 - C# | Oybek | View Answer on Stackoverflow
Solution 6 - C# | Tyler Eaves | View Answer on Stackoverflow
Solution 7 - C# | Roland | View Answer on Stackoverflow
Solution 8 - C# | Scholtz | View Answer on Stackoverflow
Solution 9 - C# | Yaseer Arafat | View Answer on Stackoverflow
Solution 10 - C# | Jason K | View Answer on Stackoverflow
Solution 11 - C# | user3181503 | View Answer on Stackoverflow
Solution 12 - C# | testing | View Answer on Stackoverflow
Solution 13 - C# | PKOS | View Answer on Stackoverflow
Solution 14 - C# | Sorry IwontTell | View Answer on Stackoverflow
Solution 15 - C# | Rob Raisch | View Answer on Stackoverflow
Solution 16 - C# | Dimitri Troncquo | View Answer on Stackoverflow
Solution 17 - C# | Jay | View Answer on Stackoverflow
Solution 18 - C# | user684934 | View Answer on Stackoverflow
Solution 19 - C# | germankiwi | View Answer on Stackoverflow
Solution 20 - C# | Navdeep Kapil | View Answer on Stackoverflow