How to hide a string in binary code?

C++, Obfuscation

C++ Problem Overview


Sometimes, it is useful to hide a string from a binary (executable) file. For example, it makes sense to hide encryption keys from binaries.

When I say “hide”, I mean making strings harder to find in the compiled binary.

For example, this code:

const char* encryptionKey = "My strong encryption key";
// Using the key

after compilation produces an executable file with the following in its data section:

4D 79 20 73 74 72 6F 6E-67 20 65 6E 63 72 79 70   |My strong encryp|
74 69 6F 6E 20 6B 65 79                           |tion key        |

You can see that our secret string can be easily found and/or modified.

I could hide the string…

char encryptionKey[30];
int n = 0;
encryptionKey[n++] = 'M';
encryptionKey[n++] = 'y';
encryptionKey[n++] = ' ';
encryptionKey[n++] = 's';
encryptionKey[n++] = 't';
encryptionKey[n++] = 'r';
encryptionKey[n++] = 'o';
encryptionKey[n++] = 'n';
encryptionKey[n++] = 'g';
encryptionKey[n++] = ' ';
encryptionKey[n++] = 'e';
encryptionKey[n++] = 'n';
encryptionKey[n++] = 'c';
encryptionKey[n++] = 'r';
encryptionKey[n++] = 'y';
encryptionKey[n++] = 'p';
encryptionKey[n++] = 't';
encryptionKey[n++] = 'i';
encryptionKey[n++] = 'o';
encryptionKey[n++] = 'n';
encryptionKey[n++] = ' ';
encryptionKey[n++] = 'k';
encryptionKey[n++] = 'e';
encryptionKey[n++] = 'y';

…but it's not a nice method. Any better ideas?

PS: I know that merely hiding secrets doesn't work against a determined attacker, but it's much better than nothing…

Also, I know about asymmetric encryption, but it's not acceptable in this case. I am refactoring an existing application which uses Blowfish encryption and passes encrypted data to the server (the server decrypts the data with the same key).

I can't change the encryption algorithm because I need to provide backward compatibility. I can't even change the encryption key.

C++ Solutions


Solution 1 - C++

I'm sorry for the long answer.

Your answers are absolutely correct, but the question was how to hide a string and do it nicely.

I did it this way:

#include <iostream>
#include "HideString.h"

DEFINE_HIDDEN_STRING(EncryptionKey, 0x7f, ('M')('y')(' ')('s')('t')('r')('o')('n')('g')(' ')('e')('n')('c')('r')('y')('p')('t')('i')('o')('n')(' ')('k')('e')('y'))
DEFINE_HIDDEN_STRING(EncryptionKey2, 0x27, ('T')('e')('s')('t'))

int main()
{
    std::cout << GetEncryptionKey() << std::endl;
    std::cout << GetEncryptionKey2() << std::endl;

    return 0;
}

HideString.h:

#include <boost/preprocessor/cat.hpp>
#include <boost/preprocessor/seq/for_each_i.hpp>
#include <boost/preprocessor/seq/enum.hpp>

#define CRYPT_MACRO(r, d, i, elem) ( elem ^ ( d - i ) )

#define DEFINE_HIDDEN_STRING(NAME, SEED, SEQ)\
static const char* BOOST_PP_CAT(Get, NAME)()\
{\
    static char data[] = {\
        BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ)),\
        '\0'\
    };\
\
    static bool isEncrypted = true;\
    if ( isEncrypted )\
    {\
        for (unsigned i = 0; i < ( sizeof(data) / sizeof(data[0]) ) - 1; ++i)\
        {\
            data[i] = CRYPT_MACRO(_, SEED, i, data[i]);\
        }\
\
        isEncrypted = false;\
    }\
\
    return data;\
}

The trickiest line in HideString.h is:

BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ))

Let me explain the line. For the code:

DEFINE_HIDDEN_STRING(EncryptionKey2, 0x27, ('T')('e')('s')('t'))

BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ)
generates the sequence:
( 'T'  ^ ( 0x27 - 0 ) ) ( 'e'  ^ ( 0x27 - 1 ) ) ( 's'  ^ ( 0x27 - 2 ) ) ( 't'  ^ ( 0x27 - 3 ) )

BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ))
generates the comma-separated list:
'T' ^ ( 0x27 - 0 ), 'e' ^ ( 0x27 - 1 ), 's' ^ ( 0x27 - 2 ), 't' ^ ( 0x27 - 3 )

and finally,

DEFINE_HIDDEN_STRING(EncryptionKey2, 0x27, ('T')('e')('s')('t'))
generates:
static const char* GetEncryptionKey2()
{
    static char data[] = {
        'T' ^ ( 0x27 - 0 ), 'e' ^ ( 0x27 - 1 ), 's' ^ ( 0x27 - 2 ), 't' ^ ( 0x27 - 3 ),
        '\0'
    };
    static bool isEncrypted = true;
    if ( isEncrypted )
    {
        for (unsigned i = 0; i < ( sizeof(data) / sizeof(data[0]) ) - 1; ++i)
        {
            data[i] = ( data[i] ^ ( 0x27 - i ) );
        }
        isEncrypted = false;
    }
    return data;
}

In the compiled binary, the data for "My strong encryption key" looks like this:

0x00B0200C  32 07 5d 0f 0f 08 16 16 10 56 10 1a 10 00 08  2.]......V.....
0x00B0201B  00 1b 07 02 02 4b 01 0c 11 00 00 00 00 00 00  .....K.........
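
The dump can be checked against the scheme by applying the same transformation at run time. The following is only a sketch and not part of HideString.h: `xorWithSeed` is a hypothetical helper that applies `elem ^ (seed - i)` to each character, the same expression that CRYPT_MACRO expands to.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of HideString.h): applies the same
// per-character transformation as CRYPT_MACRO, i.e. elem ^ (seed - i).
// XOR is its own inverse, so the same function hides and reveals.
std::string xorWithSeed(const std::string& text, unsigned seed) {
    std::string out(text);
    for (std::size_t i = 0; i < out.size(); ++i) {
        const unsigned char plain = static_cast<unsigned char>(out[i]);
        out[i] = static_cast<char>(static_cast<unsigned char>(plain ^ (seed - i)));
    }
    return out;
}
```

With the seed 0x7f, the characters 'M', 'y' and ' ' become 0x32, 0x07 and 0x5d, which matches the first three bytes of the dump. Applying the function a second time with the same seed gives back the original text.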

Thank you very much for your answers!

Solution 2 - C++

As noted in the comment to pavium's answer, you have two choices:

  • Secure the key
  • Secure the decryption algorithm

Unfortunately, if you must resort to embedding both the key and the algorithm within the code, neither is truly secret, so you're left with the (far weaker) alternative of security through obscurity. In other words, as you mentioned, you need a clever way to hide either or both of them inside your executable.

Here are some options, though you need to remember that none of these is truly secure according to any cryptographic best practices, and each has its drawbacks:

  1. Disguise your key as a string that would normally appear within the code. One example would be the format string of a printf() statement, which tends to have numbers, letters, and punctuation.
  2. Hash some or all of the code or data segments on startup, and use that as the key. (You'll need to be a bit clever about this to ensure the key doesn't change unexpectedly!) This has a potentially desirable side-effect of verifying the hashed portion of your code each time it runs.
  3. Generate the key at run-time from something that is unique to (and constant within) the system, for example by hashing the MAC address of a network adapter.
  4. Create the key by choosing bytes from other data. If you have static or global data, regardless of type (int, char, etc.), take a byte from somewhere within each variable after it's initialized (to a non-zero value, of course) and before it changes.

Please let us know how you solve the problem!

Edit: You commented that you're refactoring existing code, so I'll assume you can't necessarily choose the key yourself. In that case, follow a 2-step process: Use one of the above methods to encrypt the key itself, then use that key to decrypt the users' data.
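
As a rough sketch of that two-step idea combined with option 4, the fixed key could be stored XORed with a mask whose bytes are picked from unrelated program data. Everything named here (`kUsageText`, `kMagicNumbers`, `maskByte`, `applyMask`) is a made-up illustration, and the way bytes are picked is arbitrary; it is obfuscation, not encryption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical "ordinary" program data. In a real program these would be
// constants that exist for unrelated reasons and never change.
static const char kUsageText[] = "usage: %s [-v] <input> <output>\n";
static const unsigned long kMagicNumbers[] = {0x1f8b0800UL, 0x504b0304UL, 0x89504e47UL};

// Derives one mask byte from the unrelated data above.
unsigned char maskByte(std::size_t i) {
    const std::size_t textLen = sizeof(kUsageText) - 1;  // ignore the '\0'
    const unsigned char a = static_cast<unsigned char>(kUsageText[(i * 7) % textLen]);
    const unsigned char b = static_cast<unsigned char>(kMagicNumbers[i % 3] >> (8 * (i % 4)));
    return static_cast<unsigned char>(a ^ b);
}

// XORs every byte with its mask byte. Run once at build time to produce the
// bytes that are stored, and again at run time to recover the real key.
std::string applyMask(const std::string& in) {
    std::string out(in);
    for (std::size_t i = 0; i < out.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(out[i]);
        out[i] = static_cast<char>(static_cast<unsigned char>(c ^ maskByte(i)));
    }
    return out;
}
```

Only the output of `applyMask` would be embedded in the shipped binary; the plain key would appear in memory just before it is used.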

Solution 3 - C++

  1. Post it as a code golf problem
  2. Wait for a solution written in J
  3. Embed a J interpreter in your app

Solution 4 - C++

For C check this out: https://github.com/mafonya/c_hide_strings

For C++ this:

#include <string>

class Alpha : public std::string
{
public:
    Alpha(const std::string& str) : std::string(str) {}

    // Appends one character and returns a copy, so calls can be chained.
    Alpha c(char ch)
    {
        this->push_back(ch);
        return *this;
    }
};

In order to use this, just include Alpha and:

Alpha str("");
std::string myStr = str.c('T').c('e').c('s').c('t');

So myStr is "Test" now, and the text no longer appears as one contiguous entry in the binary's string table.

Solution 5 - C++

Your example doesn't hide the string at all; the string is still presented as a series of characters in the output.

There are a variety of ways you can obfuscate strings. There's the simple substitution cypher, or you might perform a mathematical operation on each character (an XOR, for instance) where the result feeds into the next character's operation, etc., etc.

The goal would be to end up with data that doesn't look like a string. For example, if you're working in most western languages, most of your character values will be in the range 32-127, so your goal would be for the operation to put most of them out of that range, so they don't draw attention.
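
A minimal sketch of the chained XOR idea, assuming plain ASCII input: each output byte is the input byte XORed with the previous output byte, starting from a seed. The function names and the seed are arbitrary choices made for this illustration.

```cpp
#include <cstddef>
#include <string>

// Each output byte is the input byte XORed with the previous *output* byte,
// so the result of one character feeds into the next character's operation.
std::string chainEncode(const std::string& plain, unsigned char seed) {
    std::string out(plain);
    unsigned char prev = seed;
    for (std::size_t i = 0; i < out.size(); ++i) {
        const unsigned char c =
            static_cast<unsigned char>(static_cast<unsigned char>(out[i]) ^ prev);
        out[i] = static_cast<char>(c);
        prev = c;
    }
    return out;
}

// Reverses chainEncode: XOR each stored byte with the stored byte before it.
std::string chainDecode(const std::string& hidden, unsigned char seed) {
    std::string out(hidden);
    unsigned char prev = seed;
    for (std::size_t i = 0; i < out.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(hidden[i]);
        out[i] = static_cast<char>(static_cast<unsigned char>(c ^ prev));
        prev = c;
    }
    return out;
}
```

If the seed has its top bit set (for example 0xA5) and the input is 7-bit ASCII, every encoded byte also has its top bit set, so none of the stored bytes fall in the printable range 32-127.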

Solution 6 - C++

Hiding passwords in your code is security by obscurity. This is harmful because it makes you think you have some level of protection, when in fact you have very little. If something is worth securing, it is worth securing properly.

> PS: I know that it doesn't work against real hacker, but it's much better than nothing...

Actually, in a lot of situations nothing is better than weak security. At least you know exactly where you stand. You don't need to be a "real hacker" to circumvent an embedded password ...

EDIT: Responding to this comment:

> I know about pairs of keys, but it not acceptable in this case. I refactoring existing application which uses Blowfish encryption. Encrypted data passed to server and server decrypt data. I can't change encryption algorithm because I should provide backward compatibility.

If you care about security at all, maintaining backwards compatibility is a REALLY BAD reason to leave yourself vulnerable with embedded passwords. It is a GOOD THING to break backwards compatibility with an insecure security scheme.

It is like when the street kids discover that you leave your front door key under the mat, but you keep doing it because grandpa expects to find it there.

Solution 7 - C++

This is as secure as leaving your bike unlocked in Amsterdam, the Netherlands near Central Station. (Blink, and it's gone!)

If you're trying to add security to your application then you're doomed to fail from the start since any protection scheme will fail. All you can do is make it more complex for a hacker to find the information he needs. Still, a few tricks:

*) Make sure the string is stored as UTF-16 in your binary.

*) Add numbers and special characters to the string.

*) Use an array of 32-bit integers instead of a string! Convert each to a string and concatenate them all.

*) Use a GUID, store it as binary and convert it to a string to use.

And if you really need some pre-defined text, encrypt it and store the encrypted value in your binary. Decrypt it in runtime where the key to decrypt is one of the options I've mentioned before.

Do realize that hackers will tend to crack your application in other ways than this. Even an expert at cryptography will not be able to keep something safe. In general, the only thing that protects you is the profit a hacker can gain from hacking your code, compared to the cost of hacking it. (These costs would often be just a lot of time, but if it takes a week to hack your application and just 2 days to hack something else, something else is more likely to be attacked.)


Reply to comment: UTF-16 would be two bytes per character, thus harder to recognize for users who look at a dump of the binary, simply because there's an additional byte between every letter. You can still see the words, though. UTF-32 would be even better because it adds more space between letters. Then again, you could also compress the text a bit by changing to a 6-bit-per-character scheme. Every 4 characters would then compact to three bytes. But this would restrict you to 2x26 letters, 10 digits and perhaps the space and dot to get to 64 characters.
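
The 6-bit scheme could look roughly like the sketch below. The alphabet order and the names `kAlphabet`, `pack6` and `unpack6` are assumptions made for this illustration; any fixed 64-symbol table would do.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// 64-symbol alphabet assumed for this sketch: 2x26 letters, 10 digits,
// space and dot. A symbol's position in the table is its 6-bit code.
static const std::string kAlphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 .";

// Packs the text into 6 bits per character (4 characters -> 3 bytes).
std::vector<std::uint8_t> pack6(const std::string& text) {
    std::vector<std::uint8_t> out;
    std::uint32_t acc = 0;  // bit accumulator
    int bits = 0;           // number of bits in acc not yet written out
    for (std::size_t i = 0; i < text.size(); ++i) {
        const std::size_t code = kAlphabet.find(text[i]);
        if (code == std::string::npos) {
            throw std::invalid_argument("character not in alphabet");
        }
        acc = (acc << 6) | static_cast<std::uint32_t>(code);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<std::uint8_t>((acc >> bits) & 0xFF));
        }
    }
    if (bits > 0) {  // left-align the remaining bits in a final byte
        out.push_back(static_cast<std::uint8_t>((acc << (8 - bits)) & 0xFF));
    }
    return out;
}

// Unpacks `count` characters from data produced by pack6.
std::string unpack6(const std::vector<std::uint8_t>& data, std::size_t count) {
    std::string out;
    std::uint32_t acc = 0;
    int bits = 0;
    for (std::size_t i = 0; i < data.size() && out.size() < count; ++i) {
        acc = (acc << 8) | data[i];
        bits += 8;
        while (bits >= 6 && out.size() < count) {
            bits -= 6;
            out.push_back(kAlphabet[(acc >> bits) & 0x3F]);
        }
    }
    return out;
}
```

The character count has to be stored or known separately, because the final byte may contain padding bits.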

The use of a GUID is practical if you store the GUID in its binary format, not its textual format. A GUID is 16 bytes long and can be randomly generated. Thus it's difficult to guess the GUID that's used as password. But if you still need to send plain text over, a GUID could be converted to a string representation to be something like "3F2504E0-4F89-11D3-9A0C-0305E82C3301". (Or Base64-encoded as "7QDBkvCA1+B9K/U0vrQx1A==".) But users won't see any plain text in the code, just some apparently random data. Not all bytes in a GUID are random, though: there's a version number hidden in GUIDs. Using a GUID isn't the best option for cryptographic purposes either. It's either calculated based on your MAC address or by a pseudo-random number, making it reasonably predictable. Still, it's easy to create and easy to store, convert and use. Creating something longer doesn't add more value since a hacker would just try to find other tricks to crack the security. It's just a question of how willing they are to invest more time into analyzing the binaries.

In general, the most important thing that keeps your applications safe is the number of people who are interested in it. If no one cares about your application then no one will bother to hack it either. When you're the top product with 500 million users, then your application is cracked within an hour.

Solution 8 - C++

You can use a C++ library I have developed for that purpose. Another article, which is much simpler to implement, won best C++ article of September 2017. For a simpler way to hide strings, see TinyObfuscate.

Solution 9 - C++

I was once in a similarly awkward position. I had data that needed to be in the binary but not in plain text. My solution was to encrypt the data using a very simple scheme that made it look like the rest of the program. I encrypted it by writing a program that took a string, converted all the characters to the ASCII code (padded with zeros as necessary to get a three digit number) and then added a random digit to the beginning and the end of the 3 digit code. Thus each character of the string was represented by 5 characters (all numbers) in the encrypted string. I pasted that string into the application as a constant and then when I needed to use the string, I decrypted and stored the result in a variable just long enough to do what I needed to.

So to use your example, "My strong encryption key" becomes "207719121310329211541116181145111157110071030703283101101109309926114151216611289116161056811109110470321510787101511213". Then when you need your encryption key, decode it by undoing the process.
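
Undoing the process could be sketched as follows. `decodeDigits` is a hypothetical name; the function assumes the layout described above, where each character is stored as one random digit, the three-digit ASCII code, and one more random digit.

```cpp
#include <cstddef>
#include <string>

// Each character occupies five digits: a random padding digit, the
// three-digit ASCII code, and another random padding digit.
// Only the middle three digits carry information.
std::string decodeDigits(const std::string& digits) {
    std::string out;
    for (std::size_t i = 0; i + 5 <= digits.size(); i += 5) {
        const int code = (digits[i + 1] - '0') * 100 +
                         (digits[i + 2] - '0') * 10 +
                         (digits[i + 3] - '0');
        out.push_back(static_cast<char>(code));
    }
    return out;
}
```

Passing the 120-digit string from the example to `decodeDigits` gives back "My strong encryption key".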

It's certainly not bulletproof but I wasn't aiming for that.

Solution 10 - C++

The technology of encryption is strong enough to secure important data without hiding it in a binary file.

Or is your idea to use a binary file to disguise the fact that something is hidden?

That would be called steganography.

Solution 11 - C++

It's a client-server application! Don't store it in the client itself, that's the place where hackers will obviously look. Instead, add (for your new client only) an extra server function (over HTTPS) to retrieve this password. Thus this password should never hit the client disk.

As a bonus, it becomes a lot easier to fix the server later. Just send a different, per-client time-limited password every time. Don't forget to allow for longer passwords in your new client.

Solution 12 - C++

You can encode the string using some trivial encoding, e.g. XOR with binary 01010101. No real protection of course, but it foils the use of tools like strings.
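
As a small sketch, the bytes below are "Test" XORed with 0x55 (binary 01010101), computed ahead of time. The names `kHidden` and `revealHidden` are made up for this example.

```cpp
#include <cstddef>
#include <string>

// "Test" XORed byte-by-byte with 0x55, computed ahead of time so that
// only these bytes, and not the readable text, are stored in the binary.
static const unsigned char kHidden[] = {0x01, 0x30, 0x26, 0x21};

// Recovers the original text by XORing each stored byte with 0x55 again.
std::string revealHidden() {
    std::string out;
    for (std::size_t i = 0; i < sizeof(kHidden); ++i) {
        out.push_back(static_cast<char>(kHidden[i] ^ 0x55));
    }
    return out;
}
```

Note that some of the encoded bytes are still printable ("0", "&", "!"), but they no longer spell the original word.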

Solution 13 - C++

Here is an example of what they explained, but be aware that this will be broken fairly easily by anyone who is a "hacker"; it will, however, stop kiddies with a hex editor. The example I provided simply adds the value 80 to each character, subtracts the character's index from it, and then builds a string again. If you were planning on storing this in a binary file, there are plenty of ways to convert a string to a byte[] array.

When you have this working in your app, I would make the "math" I used a bit more complex.

To make it clear for those not understanding: you encrypt the string before you save it, so it is NOT saved in clear text. If the encrypted text is never going to change, you don't even include the encrypt function in your release; you just have the decrypt one. So when you want to decrypt the string, you read the file and then decrypt the content, meaning your string is never stored on file in plain text.

You can of course also store the encrypted string as a constant string in your application and decrypt it when you need it. Choose what is right for your problem depending on the size of the string and how often it changes.

string Encrypted = EncryptMystring("AAbbBb");
string Decrypted = DecryptMystring(Encrypted);

string DecryptMystring(string RawStr)
    {
        string DecryptedStr = "";
        for (int i = 0; i < RawStr.Length; i++)
        {
            DecryptedStr += (char)((int)RawStr[i] - 80 + i);
        }

        return DecryptedStr;
    }

    string EncryptMystring(string RawStr)
    {
        string EncryptedStr = "";
        for (int i = 0; i < RawStr.Length; i++)
        {
            EncryptedStr += (char)((int)RawStr[i] + 80 - i);
        }

        return EncryptedStr;
    }
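
The same arithmetic can be sketched in C++. This is an adaptation rather than the code above: the function names are changed, and the arithmetic is done on `unsigned char` so that values wrap around modulo 256 instead of overflowing a signed `char`.

```cpp
#include <cstddef>
#include <string>

// Adds 80 to each character and subtracts its index (modulo 256).
std::string encryptMyString(const std::string& raw) {
    std::string out(raw);
    for (std::size_t i = 0; i < out.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(raw[i]);
        out[i] = static_cast<char>(static_cast<unsigned char>(c + 80 - i));
    }
    return out;
}

// Reverses encryptMyString: subtracts 80 and adds the index back.
std::string decryptMyString(const std::string& raw) {
    std::string out(raw);
    for (std::size_t i = 0; i < out.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(raw[i]);
        out[i] = static_cast<char>(static_cast<unsigned char>(c - 80 + i));
    }
    return out;
}
```

For "AAbbBb" the first two encoded bytes are 145 and 144, so the repeated 'A' no longer produces a repeated byte.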

Solution 14 - C++

You can take a look at the antispy C/C++ Obfuscation Library for all platforms; it offers a range of obfuscation techniques.

Their string encryption will solve your problem.

Solution 15 - C++

If you store the encryption key in reverse ("yek noitpyrcne gnorts yM") and then reverse it in your code (String.Reverse), this would prevent a simple search through the binary for the text of your encryption key.

To reiterate the point made by every other poster here, however, this will accomplish virtually nothing for you in terms of security.
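
In C++, the reversal described above might be sketched like this (`kReversedKey` and `restoreKey` are names invented for the example):

```cpp
#include <string>

// Stored back to front, so the forward text does not appear in the binary.
static const char kReversedKey[] = "yek noitpyrcne gnorts yM";

// Builds the real key by reading the stored text from its end to its start.
std::string restoreKey() {
    const std::string stored(kReversedKey);
    return std::string(stored.rbegin(), stored.rend());
}
```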

Solution 16 - C++

Create a function that assigns your password to a static char array and returns a pointer to this array. Then run this function through an obfuscation program.

If the program does a good job, it should be impossible to read your plain text password using a hex editor to examine the program binary (at least, not without reverse engineering the assembly language). That should stop all the script kiddies armed with "strings" or hex editors, leaving only the criminally insane hacker who has nothing better to waste their time on.

Solution 17 - C++

I think you want to make it look like instructions. Your example of

x[y++]='M'; x[y++]='y'; ...

would do just that, but it has two weaknesses: a long sequence of repeated instructions with only a little variation may stand out, and each byte may be encoded in its instruction as-is. So perhaps use the XOR method, plus some other tricks (dummy function calls, perhaps) so that the long section of code does not stand out. It depends on your processor as well. On ARM, for example, it is really easy to look at the binary, separate the instructions from the data, and from there (if you are looking for a default key) pick out what might be the key, because it is data but not ASCII, and attack that. The same goes for a block of similar instructions in which only the immediate field varies, even if you have the compiler XOR the data with a constant.

Solution 18 - C++

I wonder if after first obscuring it like others have mentioned, you could embed your string in an assembly block to try and make it look like instructions. You could then have an "if 0" or "goto just_past_string_assembly" to jump over the "code" which is really hiding your string. This would probably require a bit more work to retrieve the string in code (a one-time coding cost), but it might prove to be a bit more obscure.

Solution 19 - C++

Encrypt the encryption key with another code. Show an image of the other code to the user. Now the user has to enter the key that he sees (like a captcha, but always the same code). This makes it also impossible for other programs to predict the code. Optionally you can save a (salted) hash of the code to verify the input of the user.

Solution 20 - C++

I suggest m4.

  1. Store your string with macros, like const string sPassword = _ENCRYPT("real password");

  2. Before the build, expand the macros to the encrypted string with m4, so your code looks like const string sPassword = "encrypted string";

  3. Decrypt in runtime environment.

Solution 21 - C++

Here's a Perl script that generates obfuscated C code to hide a plaintext password from the "strings" program.

  obfuscate_password("myPassword123");

  sub obfuscate_password($) {
  
  my $string = shift;
  my @c = split(//, $string);
  push(@c, "skip"); # Skip Null Terminator
                    # using memset to clear this byte
  # Add Decoy Characters
  for($i=0; $i < 100; $i++) {
    $ch = int(rand(255)); # rand() returns a fraction; truncate so the zero check below works
    next if ($ch == 0);
    push(@c, chr($ch));
  }                     
  my $count1 = @c;
  print "  int x1, x2, x3, x4;\n";
  print "  char password[$count1];\n";
  print "  memset(password, 0, $count1);\n";
  my $count2 = 0;
  my %dict  = ();
  while(1) {
    my $x = int(rand($count1));
    $y = obfuscate_expr($count1, $x);
    next if (defined($dict{$x}));
    $dict{$x} = 1;
    last if ($count2+1 == $count1);
    if ($c[$x] ne "skip") {
      #print "  $y\n";
      print "  $y password[x4] = (char)" . ord($c[$x]) . ";\n";
    }
    $count2++;
  }
  }

  sub obfuscate_expr($$) {
    my $count  = shift;
    my $target = shift;
    #return $target;

    while(1) {

       my $a = int(rand($count*2));
       my $b = int(rand($count*2));
       my $c = int(rand($count*2));
       next if (($a == 0) || ($b == 0) || ($c == 0));
       my $y = $a - $b;
       #print "$target: $y : $a - $b\n";
       if ($y == $target) {
          #return "$a - $b + $c";
          # x3 is a decoy: it is added and then subtracted again, leaving x4 == target
          return "x1=$a; x2=$b; x3=$c; x4=x1-x2+x3; x4-=x3;";
       }
    } 
  }

Solution 22 - C++

One can use llvm-obfuscator (or one of its forks) to get transparent string encryption. Setup may be kind of painful, especially if you want to integrate it into Xcode (instructions are available online, but they require adaptation for each new release of LLVM and of Xcode).

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Dmitriy | View Question on Stack Overflow
Solution 1 - C++ | Dmitriy | View Answer on Stack Overflow
Solution 2 - C++ | Adam Liss | View Answer on Stack Overflow
Solution 3 - C++ | Ken | View Answer on Stack Overflow
Solution 4 - C++ | mafonya | View Answer on Stack Overflow
Solution 5 - C++ | T.J. Crowder | View Answer on Stack Overflow
Solution 6 - C++ | Stephen C | View Answer on Stack Overflow
Solution 7 - C++ | Wim ten Brink | View Answer on Stack Overflow
Solution 8 - C++ | Michael Haephrati | View Answer on Stack Overflow
Solution 9 - C++ | Corin | View Answer on Stack Overflow
Solution 10 - C++ | pavium | View Answer on Stack Overflow
Solution 11 - C++ | MSalters | View Answer on Stack Overflow
Solution 12 - C++ | sleske | View Answer on Stack Overflow
Solution 13 - C++ | EKS | View Answer on Stack Overflow
Solution 14 - C++ | superreeen | View Answer on Stack Overflow
Solution 15 - C++ | MusiGenesis | View Answer on Stack Overflow
Solution 16 - C++ | Bill | View Answer on Stack Overflow
Solution 17 - C++ | old_timer | View Answer on Stack Overflow
Solution 18 - C++ | Harvey | View Answer on Stack Overflow
Solution 19 - C++ | Van Uitkon | View Answer on Stack Overflow
Solution 20 - C++ | banyudu | View Answer on Stack Overflow
Solution 21 - C++ | Bill | View Answer on Stack Overflow
Solution 22 - C++ | Alex Cohn | View Answer on Stack Overflow