January 1, 2012

Encryption Basics

NOTE: This is a revised version of my original post.  It reflects readers’ concerns about statements that did not follow best practices for encryption key management.  A big thank you to Andrew Jamieson, who reviewed and commented on this revised posting.

During the last couple of years, I have run into more questions regarding encryption and encryption key management than I thought existed.  As a result, I have come to the realization that, for most people, encryption is some mystical science.  The stories of the Enigma machine and Bletchley Park have only seemed to add to that mysticism.  Over the years, I have collected my thoughts on all of those questions and distilled them into this very simplified guidance for those of you struggling with encryption.

For the security and encryption purists out there, I do not represent this post in any way, shape, or form as the “be all, to end all” on encryption.  Volumes upon volumes of books and Web sites have been dedicated to encryption, which is probably why it gets the bad reputation it does as the vast majority of these discussions are about as esoteric as they can be.

In addition, this post covers only the most common method of encrypting data stored in a database or file: applying an encryption algorithm to a column of data or to an entire file.  It does not cover public key infrastructure (PKI) or other techniques that could be used.  So please do not flame me for missing your favorite algorithm, other forms of encryption or some other piece of encryption minutiae.

There are all sorts of nuances to encryption methods and I do not want to cloud the basic issues so that people can get beyond the mysticism.  This post is for educating people so that they have a modicum of knowledge to identify hyperbole from fact.

The first thing I want to clarify to people is that encryption and hashing are two entirely different methods.  While both methods obscure information, the key thing to remember is that encryption is reversible and hashing is not reversible.  Even security professionals get balled up interchanging hashing and encryption, so I wanted to make sure everyone understands the difference.
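The difference can be seen in a few lines of code.  What follows is only a minimal sketch to illustrate reversibility, not a recommendation for protecting real data: the “encryption” is a one-time pad (XOR with a random pad as long as the message, never reused), chosen here only because it needs nothing beyond the Python standard library, and the helper name `xor_bytes` is my own.

```python
import hashlib
import secrets


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR two equal-length byte strings (one-time pad, illustration only)."""
    if len(data) != len(pad):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"4111111111111111"  # a widely used test card number, not real data

# Encryption is reversible: whoever holds the key (here, the pad)
# can turn the ciphertext back into the original value.
pad = secrets.token_bytes(len(message))
ciphertext = xor_bytes(message, pad)
recovered = xor_bytes(ciphertext, pad)

# Hashing is one-way: there is no key and no inverse operation.
# The same input always produces the same fixed-length digest.
digest = hashlib.sha256(message).hexdigest()

print("recovered == message:", recovered == message)
print("SHA-256 digest:", digest)
```

Note that the hash has no “decrypt” step at all; the only way to learn the input from the digest is to guess candidate inputs and compare.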

The most common questions I get typically revolve around how encryption works.  Non-mathematicians should not need to know how an encryption algorithm works.  That is for the experts who develop the algorithms and prove that they work.  Unless you are a mathematician studying cryptography, I recommend that you trust the research conducted by the experts regarding encryption algorithms.

That is not to say you should not know strong cryptography from weak cryptography.  I am just suggesting that the underlying mathematics that defines a strong algorithm can be beyond even some mathematicians, so why we expect non-mathematicians to understand encryption at this level is beyond me.  My point is that the algorithms work.  How they work is not, and should not be, a prerequisite for management or even security professionals to use encryption.

This leads me to the most important thing people need to know about encryption.  If you only take away one thing from this post, it would be that strong encryption comes down to four basic principles.

  • The algorithm used;
  • The key used;
  • How the key is managed; and
  • How the key is protected.

If you understand these four basic principles you will be miles ahead of everyone else that is getting twisted up in the details and missing these key points.  If you look at PCI requirement 3, the tests are structured around these four basic principles.

On the algorithm side of the equation, the best algorithm currently in use is the Advanced Encryption Standard (AES).  AES was selected by the United States National Institute of Standards and Technology (NIST) in 2001 as the official encryption standard for the US government.  AES replaced the Data Encryption Standard (DES), which was no longer considered secure.  AES was selected through a competition where 15 algorithms were evaluated.  Although they did not win the NIST competition, Twofish, Serpent, RC6 and MARS were finalists and are also considered strong encryption algorithms.  Better yet, for all of you in the software development business, AES, Twofish, Serpent and MARS are open source.  Other algorithms are available, but these are the most tested and reliable of the lot.

One form of DES, Triple DES (3DES) with a 168-bit key, is still considered strong encryption.  However, how long that will remain the case is up for debate.  I have always recommended staying away from 3DES 168-bit unless you have no other choice, which can be the case with older devices and software.  If you are currently using 3DES, I would highly recommend you develop a plan to migrate away from it.

This brings up another key takeaway from this discussion.  No algorithm is perfect.  Over time, encryption algorithms are likely to be shown to have flaws or to be breakable by the latest computing power available.  Some flaws may be annoyances that you can work around, or you may have to accept some minimal risk from the algorithm’s continued use.  However, some flaws may be fatal and require you to stop using the algorithm, as was the case with DES.  The lesson here is that you should always be prepared to change your encryption algorithm.  You are unlikely to be required to make such a change on a moment’s notice.  But as the experience with DES shows, what was considered strong in the past may no longer be strong and should not be relied upon.  Changes in computing power and research could make any algorithm obsolete, thus requiring you to make a change.

Just because you use AES or another strong algorithm does not mean your encryption cannot be broken.  If there is any weak link in the use of encryption, it is the belief by many that the algorithm is the only thing that matters.  As a result, we end up with a strong algorithm using a weak key.  Weak keys, such as a key composed of the same character, a series of consecutive characters, an easily guessed phrase or a key of insufficient length, are the reasons most often cited for why encryption fails.  In order for encryption to be effective, encryption keys need to be strong as well.  Encryption keys should be a minimum of 32 characters in length.  However, in the encryption game, the longer and more random the characters in a key the better, which is why you see organizations using 64 to 256 character long random key strings.  When I use the term ‘character’, I mean printable characters: upper and lower case letters, digits and special characters.  But ‘character’ can also include hexadecimal values if your key entry interface allows them to be entered.  The important thing to remember is that the values you enter for your key should be as hard to guess or brute force as the maximum key size of the algorithm you are using.  For example, using a seven character password to generate a 256-bit AES key does not provide the full strength of that algorithm.
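To put rough numbers on that last point, here is a small sketch of the arithmetic.  It assumes every character is chosen uniformly at random from its alphabet, which is the best case; a password a person made up will be weaker than these figures.  The function name `key_strength_bits` is my own.

```python
import math


def key_strength_bits(length: int, alphabet_size: int) -> float:
    """Best-case brute-force strength, in bits, of a uniformly random string."""
    return length * math.log2(alphabet_size)


# 95 printable ASCII characters; 16 hexadecimal digits (4 bits each).
seven_char_password = key_strength_bits(7, 95)
hex_32 = key_strength_bits(32, 16)
hex_64 = key_strength_bits(64, 16)

print(f"7 random printable characters: about {seven_char_password:.0f} bits")
print(f"32 random hex characters:      {hex_32:.0f} bits")
print(f"64 random hex characters:      {hex_64:.0f} bits")
```

A seven character password comes out at roughly 46 bits even in the best case, nowhere near the 256 bits the AES key is supposed to carry.  It takes 64 random hexadecimal characters to supply a full 256 bits.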

This brings us to the topic of encryption key generation.  There are a number of Web sites that can generate pseudo-random character strings for use as encryption keys.  To be correct, most Web sites claiming to generate a “random” string of characters are only pseudo-random.  This is because the character generator algorithm is a mathematical formula and by its very nature is not truly random.  My favorite Web site for this purpose is operated by Gibson Research Corporation (GRC).  It is my favorite because it runs over SSL and is set up so that it is not cached or processed by search engines, to better guarantee security.  The GRC site generates 63 character long hexadecimal strings, alphanumeric strings and printable ASCII strings, rather than the purely numerical strings provided by other random and pseudo-random number generator sites.  Using such a site, you can generate keys or seed values for key generators.  You can combine multiple results from these Web sites to generate longer key values.
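If you would rather not depend on a Web site, key material can also be generated locally.  The following is a minimal sketch, assuming Python 3.6 or later, whose standard `secrets` module draws from the operating system’s cryptographically secure generator.  The 32-byte length is my assumption, chosen to match a 256-bit AES key.

```python
import secrets

KEY_BYTES = 32  # 32 bytes = 256 bits, assumed here to match AES-256

# Draw the key from the operating system's secure random source.
key = secrets.token_bytes(KEY_BYTES)

# Hexadecimal form, convenient when a key entry screen expects hex input.
key_hex = key.hex()

print(len(key_hex), "hex characters generated")
```

Whatever machine runs this still has to be trusted, so the usual cautions about who can see the screen, the clipboard and the disk apply.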

In addition, you can have multiple people individually go to the Web site, obtain a pseudo-random character string and then have each of them enter their character string into the system.  This is also known as split key knowledge as individuals only know their input to the final value of the key.  Under such an approach, the key generator system asks each key custodian to enter their value (called a ‘component’) separately and the system allows no key custodian to come into contact with any other custodian’s component value.  The key is then generated by combining the entered values in such a way that none of the individual inputs provides any information about the final key.  It is important to note that simply concatenating the input values to form the key does not provide this function, and therefore does not ensure split knowledge of the key value.
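One commonly used way to combine components so that no single input reveals anything about the final key is to XOR full-length components together.  The sketch below assumes three custodians and a 256-bit key; the function name `combine_components` and the component variables are hypothetical, and a real key loading system would collect each component without displaying or storing it.

```python
import secrets
from functools import reduce

KEY_BYTES = 32  # every component is as long as the final key


def combine_components(components):
    """XOR full-length key components together to form the final key."""
    if len(components) < 2:
        raise ValueError("at least two components are required")
    if any(len(c) != KEY_BYTES for c in components):
        raise ValueError("every component must be full length")
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), components)


# Each custodian generates and enters a component independently.
component_a = secrets.token_bytes(KEY_BYTES)
component_b = secrets.token_bytes(KEY_BYTES)
component_c = secrets.token_bytes(KEY_BYTES)

final_key = combine_components([component_a, component_b, component_c])
print(len(final_key) * 8, "bit key formed from 3 components")
```

Because every bit of the final key depends on every component, a custodian holding one component is no closer to the key than someone holding nothing.  With concatenation, by contrast, each custodian would know their share of the key’s characters outright.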

Just because you have encrypted your data does not mean your job is over.  Depending on how your encryption solution is implemented, you may be required to protect your encryption keys as well as periodically change those keys.  Encryption key protection can be as simple as storing the key components on separate pieces of paper in separate, sealed envelopes or as high tech as storing them on separate encrypted USB thumb drives.  Each of these would then be stored in separate safes.

You can also store encryption keys on a server not involved in storing encrypted data.  This should not be just any ordinary server, as it needs to be securely configured and have very limited access.  This approach is where key encryption keys (KEKs) come into play.  The way this works is that each custodian generates a KEK and encrypts their component with that KEK.  Those encrypted components can then be placed in an encrypted folder or zip file for which computer operations holds the encryption key.  This is where you tend to see PGP used, as it allows multiple decryption keys, so that in an emergency operations can decrypt the archive and then the key custodians or their backups can decrypt their key components.

Finally, key changes are where a lot of organizations run into issues.  This is because key changes can require that the information be decrypted using the old key and then encrypted with the new key.  That decrypt/encrypt process can take days, weeks, even years depending on the volume of data involved.  And depending on the time involved and how the decrypt/encrypt process is implemented, cardholder data can sit decrypted, or remain exposed to a compromised key, for a long period of time.
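A back-of-the-envelope estimate shows how quickly the time adds up.  The figures below are made up purely for illustration; your record counts and throughput will differ, and the function name `reencryption_days` is my own.

```python
SECONDS_PER_DAY = 86_400


def reencryption_days(record_count: int, records_per_second: float) -> float:
    """Days needed to decrypt and re-encrypt every record at a steady rate."""
    return record_count / records_per_second / SECONDS_PER_DAY


# Hypothetical workload: 2 billion records at 500 records per second.
days = reencryption_days(2_000_000_000, 500)
print(f"about {days:.0f} days of continuous processing")
```

With those assumed numbers the job runs for about a month and a half without a break, and for that whole time both the old and the new key have to be available.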

The bottom line is that organizations can find that key changes are not really feasible or introduce more risk than they are willing to accept.  When that is the case, protection of the encryption keys takes on even more importance.  This is another reason why sales of key management appliances are on the rise.

That is encryption in a nutshell, a sort of “CliffsNotes” for the non-geeky out there.  In future posts I intend to go into PKI and other nuances of encryption, and how to address the various PCI requirements under Requirements 3 and 4.  For now, I wanted to get a basic educational foundation out there for people to build on and to remove that glassy eyed look that can occur when the topic of encryption comes up.


6 Responses to “Encryption Basics”


  1. January 30, 2012 at 4:01 PM

    I disagree with recommendations that people obtain key values from online sources, and indeed I would probably consider this a non-compliance to the split knowledge requirements of PCI DSS (and certainly for PCI PIN). I am also concerned that there is very little around dual control and split knowledge. Simply having two values is not split knowledge – the way in which those values are managed and combined is the important part. This is where I see most people get things wrong. For example, storing a plaintext encryption key on paper, or a USB stick is a violation of split knowledge. Equally, taking two ‘parts’ of a key and concatenating them to produce a final key is also a violation of split knowledge.

    • January 30, 2012 at 5:38 PM

      Andrew, I agree with your comments regarding the managing of the keys, but I have to take issue with your idea of what constitutes split knowledge.

      Two different people obtaining two different 63 character values is not split knowledge? The GRC site generates independent values for each person that goes to the page which is why I recommend it. So, exactly what, in your opinion, is split knowledge then if it is not independently derived values that are not shared? This is what irritates people is that experts make encryption difficult with “cult like” mystic rites. The Masons have it simpler.

      If the key parts are stored separately they sure as shooting are split knowledge. I went back and reread the post and I was pretty clear on separate safes. However, I suppose I could have been clearer on separate USB drives and other media.

      And arguing about keeping the keys on paper in separate safes is out of compliance, how is that not compliant? It has been done that way in the banking industry since forever and I don’t hear about ATM network encryption keys being compromised every week which, based on your comment, one would think ATMs would be compromised on a regular basis. Not everything in this world needs a high tech solution. In fact, sometimes the more low tech the solution, the more secure. When was the last time you heard of a manual combination safe or safety deposit box being hacked from a computer?

      Key management systems typically take one part of the key, refresh the screen and then the second key is entered and so on. So, I could have been much clearer on that point. The custodians enter each of their values without the other seeing any of the other values and maybe that is what is the burr under your saddle. What is done with those values after the fact by the system, who cares? The order in which they are entered only matters if the system that generates them is tracking them for entry as in some older ATM and POS systems. However, in today’s environment, most key generators don’t care about the order in which the keys get entered because the system uses them as “seeds” to generate the actual key from those values leaving the original values worthless.

      • January 30, 2012 at 7:20 PM

        I certainly don’t mean to make encryption or key management ‘cult like’ in any way, but the devil is in the details, and I want to be sure that people understand where the details provided are either incorrect or insufficient. To address your questions:

        ‘What in your opinion is split knowledge’:
        From the horse’s mouth, so to speak, we have from the PCI glossary document

        “Split Knowledge: Condition in which two or more entities separately have key components that individually convey no knowledge of the resultant cryptographic key.”

        Unfortunately, in your example, there is no way to confirm that the GRC site is not caching the keys. Has this site been compromised? What is the condition of the PC being used to download the key(s)? Does that cache the web pages? There are too many unknowns for anyone to confirm that there is not a single person who could obtain the full key value. It _could_ be OK, but how do you confirm this?

        I would recommend people use a LiveCD version of Linux with OpenSSL, and use this to generate their keys. This can then be vetted, proceduralised, and audited to confirm that the key components have not been compromised. This is not possible when using an online system.

        Unfortunately I do not agree that you were clear on separate safes. I quote:

        “Encryption key protection can be as simple as storing the keys on paper in a sealed envelope or on an encrypted USB thumb drive in a safe to as complex as investing in a key management appliance.”

        Important words here being “keys” (plural, and not components), “a sealed envelope” (singular), and “a safe” (singular). Keeping two full length components in separate lock-boxes or in separate safes is an acceptable way to meet the split knowledge requirements. Keys should only ever exist in one of three forms: Plaintext in an SCD, encrypted with a key of equal or greater strength, or as full length components that are combined using an acceptable key generation method (such as XOR or Shamir secret sharing).

        Additionally, the comment of “what is done with those values after the fact by the system, who cares?” hits to a fundamental point of key management. The way in which the key components are combined is _essential_ to the security of the system. If they are concatenated, then that does not provide split knowledge – referencing back to the PCI glossary definition, the two individuals would both know half of the key.

        One point we do agree on is that encryption and key management does not need to be complicated. However, this does not mean that anything can be done – it just means that simple solutions can work well. Extrapolating from my comments to say that I am disagreeing with this statement is disingenuous at best. I have worked with a number of clients who have very simple key management systems, and we implement full dual control and split knowledge over our GPG encryption systems here as well (another point we disagree on is that PGP/GPG does not necessarily provide compliant key management out of the box).

      • February 1, 2012 at 5:51 AM

        I accept your comments and agree that I need to revise my post. This is what one gets when they proof and write. Unfortunately, that will have to wait until the weekend. But I thank you for the feedback.

        In regards to GRC, those pages are not cached. Steve Gibson is very picky about that and has a whole write up on the page that generates the keys and what he has done to ensure that the site is not cached as well as how the keys are generated. This is why I recommend it over other similar pages that I know nothing about.

        While all of your comments are accurate regarding the key generation via the Internet, at some point you need to get out of the “Stalin’s doctors” mode of paranoia and accept some risk. That is not to imply that every Internet key generator can be trusted for the reasons you identify. This is why I recommend the GRC site because I know that it can be trusted. However, life is risky no matter what and all of the things you point out could happen, but at what frequency? At the end of the day, for a small business owner that needs a cheap or free solution, this is a risk you will need to accept.

        Thanks again.

  2. January 18, 2012 at 10:16 AM

    “To be correct, any Web site claiming to generate a “random” string of characters is only pseudo-random”

    Overall a good primer, but that statement is just flat out incorrect.

    http://www.random.org/

    That one is based on atmospheric noise. I’ve seen other sites that provide random bits based on radioactive decay, etc. If anything in the universe can be called truly random, these can.

    • January 18, 2012 at 6:17 PM

      Thank you for the comment and I totally agree about random.org. However, from a general comment, 95%+ of the “random” number generators on the Web are not truly random.



Announcements

The Encryption Basics (http://pciguru.wordpress.com/2012/01/01/encryption-basics/) posting has been updated to reflect changes recommended by Andrew Jamieson to improve the accuracy of the post.
