The Harvester

By: The_Eccentric

theHarvester is a great open source intelligence (OSINT) tool for gathering email addresses and user names from public sources such as Google or LinkedIn.
When and how is this valuable to the social engineering and intelligence world?
- When conducting passive reconnaissance on your target and trying to build a valid target profile, which includes a list of user names and email addresses.
- Emails and user names are similar to your real name. They can be used to identify you in the virtual world and/or in your workplace. They can lead to identifying your friends, your family, and your social groups, so you can see how valuable it is to have this information in your target profile. Think of email accounts and user names as almost the equivalent of a social security number in the real world: extremely valuable “if you know what to do with it ;)”
When mining for email accounts, go for the conventional choices first:

Personal

@gmail, @hotmail, @aol, @yahoo, etc. Search the internet for common male and female first and last names and use variations of these (first.last, first_last, first initial+last, last+first initial)

Work

Use the same name approach as above, but also add common role names such as admin, abuse, administrator, etc.

User Created

These come from domains that user groups set up for themselves, for example username@archlinuxpwns.com
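
The name variations listed above can be generated mechanically. The following is only a minimal sketch: the function names, the sample name, and the example.com domain are made up for illustration and are not part of theHarvester.

```python
# Hypothetical helper: builds the four name patterns mentioned in the text.
def username_variants(first, last):
    """Return first.last, first_last, first initial+last, last+first initial."""
    first, last = first.strip().lower(), last.strip().lower()
    return [
        f"{first}.{last}",    # first.last
        f"{first}_{last}",    # first_last
        f"{first[0]}{last}",  # first initial + last
        f"{last}{first[0]}",  # last + first initial
    ]


# Role names the text suggests adding for work domains.
ROLE_ACCOUNTS = ["admin", "abuse", "administrator"]


def candidate_addresses(first, last, domains):
    """Combine every name variant with every domain in the given list."""
    return [
        f"{local}@{domain}"
        for domain in domains
        for local in username_variants(first, last)
    ]


if __name__ == "__main__":
    # Placeholder name and domain, used only to show the output shape.
    for address in candidate_addresses("Jane", "Doe", ["example.com"]):
        print(address)
```

Treat the output as a list of guesses to check against what the public sources actually return, not as confirmed addresses.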
Some good sites to pull user names from in order to build a profile are:
Facebook, Twitter, Blippy, MySpace, LinkedIn, Friendster
OK, let’s get started with theHarvester.
If you’re using BackTrack 4 (http://www.backtrack-linux.org/downloads), the application can be found in /pentest/enumeration/google/theharvester/theharvester.py
To execute it, simply navigate to the /pentest/enumeration/google/theharvester/ directory and enter ./theharvester.py

If you’re not using BackTrack 4, you can download it directly from

http://www.edge-security.com/theHarvester.php

Simply navigate to the /tmp/ directory and execute

wget http://www.edge-security.com/soft/theHarvester-1.5.tar

Then use tar xvf theHarvester-1.5.tar to unpack the package.

This creates the following directory/files:

theHarvester/

theHarvester/COPYING

theHarvester/LICENSES

theHarvester/README

theHarvester/theHarvester.py

Now move these files to wherever you would like them to reside and from where you will be running them going forward.

*Whichever route you take, once you’ve got it set up and opened, it should look like this.*

I’m going to choose a basic lookup to show you how simple but powerful this tool is. Let’s look at bestbuy.com

The results were interesting even with the search limited to 500 queries.

I pulled the six email addresses and went with one that looked like a person’s name: ballard@bestbuy.com

I plugged ballard@bestbuy.com into http://www.pipl.com

Searched around a little and surprisingly I found something :)

Using some simple searches and reading, I was able to determine that ballard@bestbuy.com was Shari.Ballard@bestbuy.com, who is the Executive Vice President, Retail Channel. As you can see, this is a big find from something simple and easy, yet powerful. Going from there, I was able to determine the email addresses of most of the senior executives at BestBuy. I also determined that the email naming convention for bestbuy.com is firstname.lastname.
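
Working out a naming convention from one confirmed name and address, then applying it to other names, can be sketched in a few lines. This is an illustration only: the pattern table, the function names, and the example.com data are assumptions, and real organizations often mix several conventions.

```python
# Hypothetical pattern table: label -> function that builds the local part.
PATTERNS = {
    "first.last": lambda f, l: f"{f}.{l}",
    "first_last": lambda f, l: f"{f}_{l}",
    "flast": lambda f, l: f"{f[0]}{l}",
    "lastf": lambda f, l: f"{l}{f[0]}",
    "last": lambda f, l: l,
}


def infer_convention(first, last, address):
    """Return the label of the first pattern that reproduces the address, or None."""
    local = address.split("@", 1)[0].lower()
    f, l = first.lower(), last.lower()
    for label, build in PATTERNS.items():
        if build(f, l) == local:
            return label
    return None


def apply_convention(label, first, last, domain):
    """Build an address for another person using an inferred pattern label."""
    return f"{PATTERNS[label](first.lower(), last.lower())}@{domain}"


if __name__ == "__main__":
    # Placeholder person and domain, not taken from the article.
    label = infer_convention("Jane", "Doe", "Jane.Doe@example.com")
    print(label)
    print(apply_convention(label, "John", "Smith", "example.com"))
```

An address built this way is still a guess until it turns up in a public source.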

From this you can take many routes: add these email addresses to strengthen your target profile, create a good list for spear phishing attacks aimed at senior executives in the company, and have some valuable background information to use should you ever get inside the corporate building for further reconnaissance or social engineering.

Passive Reconnaissance Flowchart

Getting deeper into the uses of theHarvester, assume we have established a target (Target stage): schools, which will be broken down further later in the process. For the sake of relevance I’m using three (3) schools around my home area, to see how deep down the flow chart we can get. Information will be gathered to build a profile for these schools, and we will try to move from passive reconnaissance to active reconnaissance.

Just like in our example with BestBuy, we plug our schools into theHarvester (Tool stage) and see what results we obtain.

As shown above, we ran the schools’ names through theHarvester (Source Information stage), just using each school’s .edu domain; as you can see, lots of email addresses are listed. This is, of course, typical of the results you can expect from a target like a school. From a social engineering, security, and intelligence perspective, this is a gold mine of information for you to capitalize on.

The next step is to document all three categories. A very good multitasking note-taking application is BasKet Note Pads

http://basket.kde.org/index.php
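
If you save the tool’s output as plain text, sorting the addresses into one bucket per school before pasting them into your notes can be scripted. This is a rough sketch under that assumption; the regular expression is deliberately simple, and the function name and sample domains are invented for illustration.

```python
import re
from collections import defaultdict

# Loose email pattern; good enough for tidy tool output, not a full RFC parser.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def group_by_domain(text):
    """Return {domain: sorted unique lowercase addresses} found in the text."""
    grouped = defaultdict(set)
    for address in EMAIL_RE.findall(text):
        local, domain = address.lower().rsplit("@", 1)
        grouped[domain].add(f"{local}@{domain}")
    return {domain: sorted(found) for domain, found in sorted(grouped.items())}


if __name__ == "__main__":
    # Placeholder text standing in for saved output from three lookups.
    sample = """
    a.user@school-one.example
    b.user@school-two.example
    A.User@school-one.example
    """
    for domain, addresses in group_by_domain(sample).items():
        print(domain, addresses)
```

Lowercasing before deduplication keeps the same mailbox from being counted twice when sources differ only in capitalization.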

Our next step is manual, time-consuming work: we plug all these email addresses into our applications (Plug-in stage), Pipl, Facebook, Twitter, and Blippy. As you can see, we have a large number of targets to choose from, but for the sake of demonstration and brevity we will choose just two of them.

From the Louisville section we will go with the one at the top of the list, allan.tasman@lousville.edu. Following the flow chart, we’re going to plug this into all four (4) of our applications and see what we get. Based on my experience, you will want to focus on the output from pipl.com to determine where to go next.

We did not get anything back from Twitter, Facebook, or Blippy, but with Pipl we got something to work with, and even a picture.

By going to the link seen above, we can determine that he’s part of the World Psychiatric Association (WPA) Executive Committee. This is a very important piece of information.

Scrolling down further, we find our guy. So not only do we get important information on him but, more importantly, we now have information on others in executive positions. This is valuable information for use with the Social Engineering Toolkit, which is also part of the BackTrack 4 distribution.

Trying one more address, gavin.arteel@lousville.edu, and following our flow chart, we put this address into Pipl. Since we have a first and last name, we will try something different on this one: we plug in his first and last name, and since we also have his location, we add that to the search as well.

This is the way I prefer using Pipl. It acts as a hub, and by narrowing the search down this way we get somewhat better information and find a Facebook profile.

By launching the link and entering Facebook, you can see his “Networks” section validates our email and the location of University of Louisville.

We can see this guy has some depth to him: he lists a philosophical, and kind of humorous, quote in Latin, “Bibo ergo sum” (I drink, therefore I am).

From just an email address we have done pretty well building up a profile on someone. From here you can go further into passive reconnaissance, from looking at what friends he has to gathering additional intelligence. This gives us a perspective on what kind of guy he is and what kind of lifestyle he leads, or tries to portray, which would help greatly in building up a fake profile, enabling us to move to the next step of direct reconnaissance, which will be covered later.

Wrapping up, you can see how, by following the passive recon flow chart and using theHarvester, I went from just a school and an email address to tons of intel for building up leverage and a profile to start planning an attack. This is not a written-in-law way of doing it; I’m just trying to simplify it to the point where you don’t have to be a recon ninja to do it.

Tools that were used here

theHarvester

Pipl

Facebook

All were freely available on the internet, another reason for loving Open Source Intelligence (OSINT).

This should show you why the information gathering phase is one of the most important parts of a penetration test, and yet, most of the time, the most overlooked.

I look forward to further breaking down other framework tools in the near future.

Note: theHarvester, Email and usernames finder, Christian Martorella. Copyright © 2003-2008 Edge-Security. Retrieved 4/7/2010.

Come join us on irc: irc.freenode.net #SEunited
