Everyone who is going through the PCI compliance process tries to get systems, processes, whatever, out of scope. And while getting things out of scope is a good thing, it does not mean that they do not need to be assessed. This is one of the most contentious points of a PCI compliance assessment.
One of the biggest misconceptions about the PCI compliance assessment process is that when an organization says something is out of scope, it does not have to be examined. It does. The PCI compliance assessment process is all about trust, but verify. So, when an organization says that a particular element is out of scope, it is up to their QSA to confirm that the item is, in fact, out of scope.
Take for example network segmentation that is used to delineate an organization’s cardholder data environment (CDE). A QSA is required to confirm that the network segmentation implemented does in fact keep the CDE logically or physically separated from the rest of an organization. That confirmation process will likely review firewall rules, access control lists and other controls on the network to prove that the CDE is segregated. And going through these items can sometimes result in a lot of QSA effort, particularly as network complexity increases.
Another area where the out of scope effort can be messy is applications and whether they process, store or transmit cardholder data. Proving that an application does not store cardholder data is typically fairly straightforward. The QSA examines the data schemas for files and databases looking for fields named credit card number or any 16 character fields. A QSA will also typically run queries against the database looking for 16 digit numbers that start with known BINs. I have been involved in a number of assessments where our queries found cardholder data being stored in text and comment fields. Determining whether an application is processing or transmitting cardholder data is more complicated and problematic. It can take quite a lot of effort to determine using an organization’s Quality Assurance or Testing facilities, but it can be accomplished.
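The kind of check described above, looking for 16 digit numbers that start with known BINs, can be sketched in a few lines of Python. This is only an illustration and not the query any particular QSA runs: the prefix list is a hypothetical placeholder, and the Luhn checksum is an extra filter I have added to cut down on false positives.

```python
import re
from typing import List

# Hypothetical prefixes for illustration only; substitute the BINs that
# matter for the organization being assessed.
KNOWN_PREFIXES = ("4", "51", "52", "53", "54", "55")

# Exactly 16 contiguous digits, not embedded in a longer run of digits.
CANDIDATE = re.compile(r"(?<!\d)\d{16}(?!\d)")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pans(text: str) -> List[str]:
    """Return likely PANs found in a free-text value such as a comment field."""
    return [
        m for m in CANDIDATE.findall(text)
        if m.startswith(KNOWN_PREFIXES) and luhn_ok(m)
    ]
```

Running `find_pans` over the values pulled from text and comment columns gives a short list of candidates that still needs a human to review it.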
The biggest clarification for v2.0 of the PCI DSS is that it is the responsibility of the organization being assessed to prove that their CDE is in fact accurate. This had always been the implicit case, but with v2.0 of the PCI DSS, the PCI SSC has explicitly stated this fact. Page 11 of the PCI DSS states:
“At least annually and prior to the annual assessment, the assessed entity should confirm the accuracy of their PCI DSS scope by identifying all locations and flows of cardholder data and ensuring they are included in the PCI DSS scope.”
As a result, the organization being assessed should provide proof to their QSA that they have examined all of their processes, automated and manual, and have determined what is in scope and out of scope. The results of this self-examination are used by the QSA to confirm that the CDE definition, as documented by the organization, is accurate.
This clarification has resulted in a lot of questions. The primary one is along the lines of, “How am I supposed to prove that I have assessed my entire environment and made sure the CDE is the only place where cardholder data exists?” While the implications of this question are obvious for the Wal-Marts and Best Buys of the world, even small and midsized merchants can have difficulties meeting this requirement. And I can assure you that even the “big boys” with their data loss prevention and other solutions are not keen on scanning every server and workstation they have for cardholder data (CHD).
For determining whether or not CHD is present in flat files on computers, there are a number of open source (i.e., “free”) solutions. While I discuss a lot of tools and share my experiences, this is not an endorsement of any particular tool.
At the simplest are the following tools.
- ccsrch – (http://ccsrch.sourceforge.net/) – If this is not the original credit card search utility, it should be. ccsrch identifies unencrypted and numerically contiguous primary account numbers (PAN) and credit card track data on Windows or UNIX operating systems. One of the biggest shortcomings of ccsrch is that it will not run over a network, so scanning multiple computers is a chore. The other big shortcoming of ccsrch is that unless the data is in clear text in the file, ccsrch will not identify it. As a result, file formats such as PDF, Word and Excel could contain CHD that is not recognized. It has been my experience that ccsrch returns a high number of false positives because, given its file format limitations, it flags data that is not a PAN as a PAN.
- Find_SSNs – (http://security.vt.edu/resources_and_information/find_ssns.html) – While the file name seems to imply it only searches for social security numbers, it also searches for PANs and will do so for a variety of file formats such as Word, Excel, PDFs, etc. Find_SSNs runs on a variety of Windows and UNIX platforms, but as with ccsrch, it does not run over a network; it must be run machine by machine. Find_SSNs seems to have a very low false positive rate.
- SENF – (https://senf.security.utexas.edu/) – Sensitive Number Finder (SENF) is a Java application developed at the University of Texas. If a computer runs Java, it will run SENF, so it is relatively platform independent, and it supports many file formats similar to Find_SSNs. That said, as with the previous tools, SENF will not run over a network; it must run on each individual machine. I have found SENF to have a much lower false positive rate than ccsrch, but not as low as either Find_SSNs or Spider.
- Spider – (http://www2.cit.cornell.edu/security/tools/) – This used to be my favorite utility for finding PANs. Spider will scan multiple computers over a network, albeit slowly and with a propensity for crashing when run over the network. However, it also seems to have a low false positive rate that is comparable to Find_SSNs.
I still use Spider and Find_SSNs for scanning log and debug files for PANs as I have yet to find anything as simple, fast and reasonably accurate when dealing with flat text files. And yes, I use both as checks against each other for further reducing the false positive rate. It amazes me, as well as my clients, the amount of incidental and occasional CHD that we find in log and debug files due to mis-configuration of applications and vendors who forget to turn off debugging mode after researching problems.
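To give a feel for what these flat file scanners do with log and debug files, here is a minimal Python sketch. It is not how ccsrch, Find_SSNs, SENF or Spider are implemented; the function names, the 16 digit assumption and the masking format are all my own choices for illustration. It walks a directory tree, reads each file as text, and reports the file, line number and a masked value for every 16 digit number that passes the Luhn checksum.

```python
import os
import re
from typing import Iterator, Tuple

# Assumption: only 16 digit PANs are of interest, matching the discussion above.
CANDIDATE = re.compile(r"(?<!\d)\d{16}(?!\d)")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask(pan: str) -> str:
    """Keep the first six and last four digits so the report holds no full PAN."""
    return pan[:6] + "*" * (len(pan) - 10) + pan[-4:]


def scan_tree(root: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, masked PAN) for each hit under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Undecodable bytes are dropped; binary formats will be missed.
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        for match in CANDIDATE.findall(line):
                            if luhn_ok(match):
                                yield path, lineno, mask(match)
            except OSError:
                continue  # unreadable file; skip it rather than abort the scan
```

Like ccsrch, a sketch this simple only sees clear text, so it says nothing about PDF, Word or Excel content.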
But I am sure a lot of you are saying, “Flat files? Who stores anything in flat files these days?” And that is the biggest issue with the aforementioned open source solutions; none of them will scan a database from a table schema perspective. If the database data store does coincidentally store clear text PANs as legible text, the aforementioned utilities will find it, but that is pretty rare due to data compression, indexing and other issues with some database management systems. As such, if you wanted to stay with open source, you had to be willing to use their code as a base and adapt it to scanning a particular database and its table schemas, or else go to a commercial solution. That is, until OpenDLP (http://code.google.com/p/opendlp/).
OpenDLP is my personal open source favorite now for a number of reasons. First, it uses Regular Expressions (RegEx) so you can use it to look not only for PANs, but a whole host of other information as long as it conforms to something that can be described programmatically such as social security numbers, driver’s license numbers, account numbers, etc. Secondly, it will also scan Microsoft SQL Server and MySQL databases. And finally, it will scan reliably over the network without an agent on Windows (over SMB) and UNIX systems (over SSH using sshfs).
At least I have gotten fewer client complaints over OpenDLP than I have for Spider for network scanning. That said, OpenDLP can still tie up a server or workstation while it scans it remotely and it will really tie up a server running SQL Server or MySQL. As such, you really need to plan ahead for scanning so that it is done overnight, after backups, etc. And do not expect to scan everything all at once unless you have only a few systems to scan. It can take a week or more for even small organizations.
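The regular expression approach is easy to picture with a small Python sketch. The patterns below are simplified examples I wrote for illustration; they are not OpenDLP's own expressions, and real patterns would need tuning to keep false positives down.

```python
import re
from typing import Dict, List

# Illustrative patterns only; names and expressions are assumptions.
PATTERNS = {
    "pan_16_digit": re.compile(r"(?<!\d)\d{16}(?!\d)"),
    "us_ssn": re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)"),
}


def classify(text: str) -> Dict[str, List[str]]:
    """Return the matches for every pattern that appears in the text."""
    return {
        name: rx.findall(text)
        for name, rx in PATTERNS.items()
        if rx.search(text)
    }
```

Adding a driver’s license or account number format is then a matter of adding one more entry to the dictionary.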
But what if you have Oracle, DB2, Sybase or some other database management system? Unless you are willing to take the OpenDLP source code and modify it for your particular database management system, I am afraid you are only left with commercial solutions such as Application Security Inc.’s DbProtect, Identity Finder DLP, ControlCase Data Discovery or Symantec Data Loss Prevention. Not that these solutions handle every database management system, but they do handle more than one database vendor and some handle most of them.
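For readers wondering what scanning a database from a table schema perspective looks like, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in, since it needs no server; that choice is my assumption, and for SQL Server, MySQL, Oracle and the rest the catalog queries and driver would differ. The idea is the same either way: enumerate the tables, enumerate each table's columns, then test every value for a 16 digit candidate.

```python
import re
import sqlite3
from typing import Dict, Tuple

CANDIDATE = re.compile(r"(?<!\d)\d{16}(?!\d)")


def columns_with_candidates(conn: sqlite3.Connection) -> Dict[Tuple[str, str], int]:
    """Map (table, column) to the number of rows holding a 16 digit candidate."""
    hits: Dict[Tuple[str, str], int] = {}
    # SQLite keeps its table list in sqlite_master; other systems use other catalogs.
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # Second field of each PRAGMA table_info row is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            count = 0
            for (value,) in conn.execute(f'SELECT "{column}" FROM "{table}"'):
                if value is not None and CANDIDATE.search(str(value)):
                    count += 1
            if count:
                hits[(table, column)] = count
    return hits
```

A full table scan like this is exactly why such tools tie up a database server, so the same advice about scheduling applies.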
You should now have some ideas of how to scope your CDE so that you are prepared for your next PCI assessment.

Question, does the SAQ apply only to the CDE? Specifically, as a level 1 service provider and merchant, do we need to provide security awareness training for the employees in the CDE or every employee? Thanks.
Security awareness training, from a PCI compliance perspective, only applies to those employees that come into contact with cardholder data as part of their job functions.
That said, as a Level 1 service provider, you should not be doing an SAQ; you are required to do a Report on Compliance (ROC).
Yes on the ROC. We’re going thru the SAQ D inhouse as a heads up before we hire the QSA. Thank you and I enjoy your website. Have a nice day!
Keep me in mind as we would love a chance to bid on your ROC work.
Thank you for your post. Unfortunately, you didn’t mention anything about how we can determine whether systems or applications transmit cardholder data.
I didn’t mention it because it is the merchant’s or service provider’s responsibility to know this information based on the applications they use and where those applications execute. This information can be obtained from a software vendor’s documentation. If you do not have the documentation, you may be able to get it over the Internet if the vendor is still in business. If you still cannot obtain the documentation, then it is probably time to get a new application.
If you are running custom developed software, then you will need to discuss this with the developer. Remember, you may have to pay your developer to get this information if you no longer have a contract with them. Also, I have started to encounter developers that hold their customers hostage over such discussions, requesting exorbitant fees or just refusing to discuss this topic. In those instances, I would suggest getting a new developer.
When I was a QSA I used Card Recon, SENF and ccsrch – all of them had their pluses and shortfalls. I had not heard about OpenDLP though.
Card Recon was the only commercial tool available at that time. I hope they have improved since 2009.
Overall I was using different tools and approaches for different platforms (*NIX, Win, etc). Of course, most fun was with Databases.
A fork of ccsrch has been created with many new features… using the original ccsrch as the base.
https://github.com/adamcaudill/ccsrch
Thanks for posting this. All too often I see situations where customers want to be “out of scope, out of mind”. I’ve said this before, and I will say it again – first, keep in mind that PCI DSS is about reducing risk – your risk as a merchant and to your customers and partners – to the heavy cost of a breach. Managing risk is not just about PCI compliance. It’s your business responsibility to you and your shareholders as a merchant. PCI compliance is not the be-all and end-all of reducing risk to your business when it comes to the overall card process – there are peripheral processes that need to be considered where real risk of compromise exists.
Scope reduction is a wonderful goal – but to prove you are out of scope, you need to actually look at the systems that are claimed to be out of scope. It’s like a reverse of the Heisenberg principle – if you measure something, you change its behavior by the act of measuring – in this case, you need to measure something to know whether it is in or out of scope before deciding it’s out of scope – there’s no scope reducing magic wand, but there are lots of technologies with true scope reduction capability to make it easier once implemented.
So while PCI controls may not apply, risk based controls may still apply. This area in particular is a big deal on the e-commerce side. When there is 100% scope reduction, e.g. using various techniques from page-integrated encryption to page redirects and so on, yielding 100% out of scope situations because the merchant is no longer handling card data at all in any form, it’s still important for the merchant to maintain a healthy web infrastructure to keep the overall system from being compromised or becoming a source of an end user compromise leading to downstream breaches, e.g. malware on web servers unrelated to the payment capture, such as a merchant’s home page. The card brands highlight this in guidance, but it’s often brushed over, which is unfortunate, and if it is not taken seriously, merchants will be left answering questions like “Why am I out of PCI scope yet my customers are getting their card data stolen because my web infrastructure has been compromised by viral payloads?”
My only problem with your comment is on page redirects. Look at my post on redirects and reposts (http://pciguru.wordpress.com/2011/11/12/of-redirects-and-reposts/).