Wednesday, February 10, 2010

Don't Search For Your Social Security Number, Ever!

Last night I listened to the February 3 OWASP Podcast, and one of the topics discussed was why you shouldn't search for your SSN on Google. The reason this is bad should be fairly obvious, but having this information in Google's hands is slightly different from having it in the hands of someone with malicious intentions. Unless someone can compromise Google's systems (or wants to risk trying), your search requests simply end up in Google's stockpile of information, hidden from the rest of the world. Or do they?

This is something I've been thinking about for a bit, but their discussion certainly got me thinking harder about the "how" portion. Then it hit me. Google Adwords and Analytics! The big disclaimer here is that I have not tested, and will not test, this theory. While I populated this information in Adwords, I did not let this ad campaign run, nor did I allow it to begin collecting numbers. I am merely sharing this idea with all of you to hopefully get your brains spinning around the possibilities of such an attack. The potential attack has two components: the ability to monitor which social security numbers are searched for through search engines and how often, and the ability to launch targeted phishing attacks based on those searches. My goal is not to enable the criminals of the world to carry out such evil deeds, but rather to educate the community and spread the word that such activities are possible.

The very nature of Google's business model ensures that they wouldn't consider this an issue, and while I don't fully trust their integrity when dealing with our data, in this case I have to agree with them. While you read this post, think back to when Google's CEO Eric Schmidt stated the following:

"If you have something that you don't want anyone to know, maybe you shouldn't be doing it in the first place, but if you really need that kind of privacy, the reality is that search engines including Google do retain this information for some time, and it's important, for example that we are all subject in the United States to the Patriot Act. It is possible that that information could be made available to the authorities."


Now on to the evil...

Step 1- Generate a List of Every Possible Social Security Number

I wrote a quick Perl script to do this, writing each number both in the plain 9-digit format and with dashes inserted. Notice that the ssn array gets printed to an output file and cleared every 100,000 numbers. I ran the script without this and it slowed my box to a crawl, since close to two billion values (900 million numbers in two formats) were being held in memory. Below is the code to do this. If you run it on a Windows machine, simply change /usr/bin/perl at the top to c:\perl\bin or wherever your Perl directory is located.

#!/usr/bin/perl
use strict;
use warnings;

# Output file that receives every generated value, one per line
open(OUTFILE, ">ssnz.txt") or die "Cannot open ssnz.txt: $!";
my @ssnz_array;
my $elementsInArray = 0;
my $ssnWithDashes = "";

print "Now generating SSNz\n";

for (my $count = 100000000; $count < 1000000000; $count++)
{
    # Build the dashed form (123-45-6789) from the 9-digit number
    $ssnWithDashes = $count;
    $ssnWithDashes =~ s/^(\d{3})(\d{2})(\d{4})$/$1-$2-$3/;

    push(@ssnz_array, "$count\n", "$ssnWithDashes\n");
    $elementsInArray++;

    # Flush the buffer to disk every 100,000 numbers to keep memory use low
    if ($elementsInArray == 100000)
    {
        print OUTFILE @ssnz_array;
        @ssnz_array = ();
        $elementsInArray = 0;
        print "Completed $count\n";
    }
}

# Write whatever is left in the buffer and close the file
print OUTFILE @ssnz_array;
close(OUTFILE);
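
As a rough sanity check on the size of this list, the totals can be worked out directly. The sketch below is only arithmetic. It assumes the same range as the loop (100000000 through 999999999), two lines written per number, and Unix-style one-byte newlines. The sub names are mine, purely for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Range visited by the generator loop (assumption: same bounds as the script)
my $first = 100_000_000;
my $last  = 999_999_999;

# How many nine-digit numbers the loop visits
sub value_count { return $last - $first + 1; }

# Two lines are written per number: plain and dashed
sub line_count { return 2 * value_count(); }

# Bytes per number: 9 digits + newline, plus 11 characters + newline
sub output_bytes { return value_count() * ((9 + 1) + (11 + 1)); }

print "values: ", value_count(), "\n";
print "lines:  ", line_count(), "\n";
printf "size:   about %.1f GB\n", output_bytes() / 1e9;
```

Under those assumptions the output file holds 1.8 billion lines and comes to roughly 19.8 GB, which also explains why holding everything in memory was a bad idea.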

Step 2- Create Google Adwords and Analytics Accounts

In this step, several things must be done. The first is to create a new ad campaign and ad group using Google Adwords. I named my new campaign “SSNs Are Awesome”, and named the ad group that displays my ad “All your SSNs Are Belong To Me”.

Within this ad group, I created an ad that would be displayed every time an SSN is searched for on Google. There will obviously be a certain level of false positives, but somewhere within the noise you should find some treasures. The goal of this ad is to lure people searching for their SSNs to visit this website. If someone is searching for their SSN, they are either curious as to whether or not it's floating around on the web, or they may be worried about the possibility that their identity has been stolen. Using fear as our bait here, my example ad says “You may be at risk because your Social Security number was found!”


The next step to this process is uploading the SSNs.


Through a bit of trial and error, I discovered that Google imposes a limit of 2,000 keywords per ad group. Therefore, you would have to create a bunch of ad groups in order to cover the entire range of SSNs. I'm sure that with a bit of tinkering it's possible to automate this process though.
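
To put that limit in perspective, here is a quick back-of-the-envelope calculation. It assumes the 900 million numbers produced by the Step 1 loop, each in two formats, and it takes the 2,000 figure from my own trial and error rather than from any documented limit. groups_needed is just an illustrative helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(ceil);

# Assumptions: values come from the Step 1 loop, and the per-group
# keyword limit is the one observed through trial and error
my $values             = 900_000_000;
my $formats            = 2;        # plain and dashed
my $keywords_per_group = 2_000;

# Number of ad groups required to hold a given number of keywords
sub groups_needed {
    my ($keywords, $limit) = @_;
    return ceil($keywords / $limit);
}

print "ad groups needed: ",
    groups_needed($values * $formats, $keywords_per_group), "\n";
```

That works out to 900,000 ad groups, so "a bunch" is putting it mildly.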


The final step is to create a Google Analytics account. Within Analytics, you want to add a site to monitor (which gets created in Step 3), then take the Analytics JavaScript and embed it within the HTML body of your main page. The reason for adding Analytics to this page is so that you would be able to tie the actual strings used in the search engine to the requests to your page, even outside of what may trigger an Adwords ad to be displayed or clicked. The result- a lot more information to harvest and even more potential personally identifiable information that can be tied to an SSN.


Step 3- Create A Site To Collect Additional Information

Simply having an SSN isn't enough. A real criminal would want additional information to tie the SSN to a real person. Getting personally identifiable information is the primary goal of the site that would be getting linked to through the previously created ad.

This is where things get exciting. Not only can we extract information through the Adwords statistics, we can also use the Referer header from the Google search to tie the SSN to the visitor. Even if they don't enter any information, someone evil would now know their SSN and their source IP address, allowing them to conduct further recon if they chose to.

Below is an example of how your search request will always be sent via the Referer header from Google to any site you traverse to through your search results. The sensitive example SSN information is highlighted in red:
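
To make the leak concrete, here is a minimal sketch of what the destination server can read out of that header. The Referer value is made up for illustration: it assumes the search terms travel in a q parameter, and the number is an obvious placeholder. search_terms is a hypothetical helper, not part of any library.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical Referer value; the number is a placeholder, not real data
my $referer = 'http://www.google.com/search?hl=en&q=123-45-6789+Jim+Smith';

# Return the decoded value of the "q" parameter, or undef if there is none
sub search_terms {
    my ($url) = @_;
    my ($query) = $url =~ /\?([^#]*)/ or return;
    for my $pair (split /[&;]/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && $name eq 'q';
        $value = '' unless defined $value;
        $value =~ tr/+/ /;                              # "+" encodes a space
        $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # undo %XX escapes
        return $value;
    }
    return;
}

my $terms = search_terms($referer);
print defined $terms ? "search terms: $terms\n" : "no search terms\n";
```

The point is simply that nothing clever is required: whatever you typed into the search box arrives at the destination site as plain text.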


What types of evil things could be done from such a site? We could simply use the SSN that was parsed through the Referer header, and create a convincing page displaying their SSN to them and politely asking them to give us more personal information so we can "determine if their SSN is on the web and remove it for them".


As a backup plan in case they are too lazy to enter this information or think the site is suspicious, perhaps dropping a keylogger or something of that nature on the box would be a good idea. Since the SSN is already known at this point, access to their personal accounts such as email and social networking sites would more than likely give an attacker all of the information required to fill in the blanks.

Moral To The Story- Don't Search For Sensitive Stuff!

While Google's business tactics and integrity can and always will be questioned, the bottom line is that exposing this information to the world should be minimized. I think all of us have at one point or another searched for things we shouldn't have. If those of us who should "know better" do this stuff, you can be confident that those who don't understand what really happens with their data are doing it frequently. Using the tactics described above, you could substitute any type of data to achieve similar results. Corporate espionage, personal spying, identity theft, and many other scenarios can easily unfold by using a little trial, error, and Google's massive hoard of personal information.

-Jack



5 comments:

Pavel said...

Seems like the lesson here is search but NEVER click any results!

Raf said...

Well done Jack. ...an ingenious little way to harvest people's info. Makes the point certainly.

Did you do it?

Jack Mannino said...

@Pavel- Certainly clicking on "the bait" makes it much worse. Nevertheless, this technique can certainly be used in scenarios other than for SSNs. Of course, if someone searches for "123-45-6789 Jim Smith", well they've done a lot of the legwork for you haven't they =)

@Raf- I didn't. Come on, I have SOME morals don't I?? (don't answer that) =)

Todd said...

Hopefully prospective employers, landlords, etc., are not doing these searches for us...

cdman83 said...

Interesting blogpost. One somewhat off-topic (Perl) tip would be that you can separate digits in a number with "_" in Perl for easier readability. So your cycle could become:

for (my $count = 100_000_000; $count < 1_000_000_000; $count++)

Which is easier to read.

Regards.