Here is something that troubles me, mainly because I think a lot of QSAs are not looking at cold and warm disaster recovery sites, since they are technically out of scope under the PCI DSS if they do not process, store or transmit cardholder data. However, this needs to be looked at because, under the PCI DSS, a disaster recovery site is in-scope once it is processing, storing or transmitting cardholder data.
First, let us get the terminology straight as to what we are talking about. A disaster recovery ‘cold site’ is defined as a site that has physical and environmental controls, but does not contain any equipment. Essentially, it is an empty data center. It may have racks installed with electrical connections, but the racks are empty or only contain enough infrastructure to allow for basic connectivity from the racks to the telecommunications point-of-presence (POP). An organization may have backup circuits installed to this facility, but they too are not in use. As a result, a cold site is not in-scope because it does not process, store or transmit cardholder data.
At the other end of the spectrum, a disaster recovery ‘hot site’ is ready to go at a moment’s notice. Applications and data are replicas of what is running in the production data center. Should a failure occur at the production data center, the hot site will immediately step in and take over, usually without users knowing that a failure has occurred. Obviously, a hot site is always in scope for PCI compliance, as cardholder data is processed, stored or transmitted at the hot site just as it is at the production data center.
And in between cold and hot sits the disaster recovery ‘warm site’. Warm sites have servers, data storage and infrastructure all ready to go, but the equipment is not running and no data is available. It may or may not be a carbon copy of the production data center from an equipment standpoint, but it will have enough servers and infrastructure to be ready to process as quickly as possible. The applications may or may not be already installed on the servers. And data is usually expected to be restored from backup media.
Where things get messy is with just how much preparation is being done at the warm site. Remember, disaster recovery sites are only in-scope if they process, store or transmit cardholder data. And that is where some QSAs get in trouble; they neglect to ask the critical question of whether or not the warm site has cardholder data. A lot of organizations are now replicating data between their production data center and their warm site. It is cheaper, safer and faster to have a replica available than to try to recover data from backup media, particularly when you have invested in storage area network (SAN) farms at both locations. SANs make the replication process easy and seamless, with the only limitation being the time it takes to replicate over the connection between the two sites. A warm site that receives such a replica is storing cardholder data and is therefore in-scope, even though it is not processing any transactions.
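To make the scoping question concrete, here is a minimal sketch of the rule as I have described it: a site is in-scope if it processes, stores or transmits cardholder data. The class name `DRSite`, its fields and the function `in_scope` are hypothetical names made up for illustration; they are not from the PCI DSS or any tool. The sketch also assumes the replicated data includes cardholder data.

```python
from dataclasses import dataclass


@dataclass
class DRSite:
    """Hypothetical record of what a disaster recovery site does with cardholder data (CHD)."""
    name: str
    processes_chd: bool = False
    stores_chd: bool = False
    transmits_chd: bool = False


def in_scope(site: DRSite) -> bool:
    # In-scope if the site processes, stores or transmits cardholder data.
    return site.processes_chd or site.stores_chd or site.transmits_chd


# Illustrative sites; the flags are assumptions, not assessment findings.
cold = DRSite("cold site")
warm_backup = DRSite("warm site, data restored from backup media")
warm_replica = DRSite("warm site, SAN replica of production", stores_chd=True)
hot = DRSite("hot site", processes_chd=True, stores_chd=True, transmits_chd=True)

for site in (cold, warm_backup, warm_replica, hot):
    status = "in scope" if in_scope(site) else "out of scope"
    print(f"{site.name}: {status}")
```

The point of the sketch is the third site: the only difference between the two warm sites is whether a replica is sitting on the storage, and that alone flips the answer.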
Regardless of what the PCI DSS states, I believe that all disaster recovery sites need to be assessed for PCI compliance, and here is why. The PCI DSS states that disaster recovery sites are not in-scope unless they process, store or transmit cardholder data. However, in the same breath, the PCI DSS states that once a disaster recovery site is activated, the site is in-scope and is required to comply with the PCI DSS requirements just as the production data center complied. So, how does one know that their disaster recovery cold or warm site will comply with the PCI DSS if it is never assessed? You do not know it will comply unless you assess it. How is that for a Catch-22? That is why I contend that all disaster recovery sites should be assessed whether they process, store or transmit cardholder data or not.
The bottom line here is that if you are not in compliance when you activate your disaster recovery site, you can be fined for non-compliance with the PCI DSS. And if you think that you can get a ‘bye’ on compliance by saying you were in the midst of recovering from a disaster, think again, particularly if a breach occurs during your recovery. So, I highly recommend that all organizations make sure that their disaster recovery sites are, and will be, PCI compliant once they are active and that those sites are assessed by their QSA.
UPDATE: A number of people pointed out to me that it might be unreasonable to assess the disaster recovery site annually, and I would agree. However, you should assess it at least once, and again whenever changes occur that might affect the control environment at the site.
Well, at least the risk assessment should identify the risk of a cold site not being compliant and apply a relevant control, which may or may not require a full assessment of the site. It could just involve a good, solid recovery plan that assures the rebuild at the cold site will be tested and confirmed to the same standards as production before it’s switched on. It’s for the QSA to decide whether the control is adequate. That will depend on the likelihood of a disaster and the impact it would have in the given scenario. I can’t believe there are QSAs who simply don’t ask if the cold site processes or contains CHD!
When it’s technically out of scope until it’s in use, most QSAs don’t bring it up, as their clients will argue ad nauseam that it is not in scope.
PCI calls for an annual risk assessment of the environment. If we have a DR hot site, should we look to perform a risk assessment of it every year? Just as performing a full PCI assessment of the DR site every year may not be practical, performing a risk assessment of the DR site every year may not be practical either. Your thoughts? Thank you.
As I said in the post, technically, a DR site is not in scope until it is activated. However, that is the rub. How do you know if your DR site will be compliant unless you assess it? While an annual assessment might not be feasible, as long as you assess it whenever the site or the PCI DSS changes, and at least every three years, you should be fine.
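For anyone who wants to track this, the cadence I am suggesting can be sketched as a simple check. This is only my rule of thumb written down, not a PCI DSS requirement; the function names `reassessment_due` and `add_years` and their parameters are hypothetical names chosen for illustration.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def reassessment_due(
    last_assessed: Optional[date],
    site_changed: bool,
    standard_changed: bool,
    today: date,
) -> bool:
    # A site that has never been assessed is always due.
    if last_assessed is None:
        return True
    # Any change to the site or to the PCI DSS since the last assessment triggers one.
    if site_changed or standard_changed:
        return True
    # Otherwise, reassess at least every three years.
    return today >= add_years(last_assessed, 3)


# Example with made-up dates: no changes, two and a half years since the last assessment.
print(reassessment_due(date(2020, 1, 15), False, False, date(2022, 7, 15)))
```

The three-year interval is the one judgment call in here; tighten it if your risk assessment says the site or its surroundings change more often than that.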