Let’s say we have a packet capture file (.pcap) and we want to get as much information out of it as possible. One option is Wireshark and its command-line counterpart, tshark. With tshark we can manipulate and format the output using tools like sed, grep and awk.
Since we are dealing mostly with HTTP traffic, we may be interested in the sites that have been visited. To obtain this information we can extract the http.host field, then sort and count the values, which gives us the top 10 sites.
tshark -T fields -e http.host -r tor.pcap > dns.txt
sort dns.txt | uniq -c | sort -nr | head
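The counting step can be tried on its own, without a capture file. The sketch below is only an illustration: the function name `top_hosts` and the sample hostnames are made up, and it assumes tshark prints an empty line for every packet that has no http.host value, so those blank lines are dropped before counting.

```shell
#!/bin/sh
# top_hosts (hypothetical helper): read one hostname per line on stdin and
# print the N most frequent ones, N defaulting to 10.
top_hosts() {
    # Drop blank lines, group identical names, count them, sort by count.
    grep -v '^$' | sort | uniq -c | sort -nr | head -n "${1:-10}"
}

# Made-up sample standing in for the contents of dns.txt.
printf 'example.com\n\nexample.org\nexample.com\n\n' | top_hosts 2
```

With a real capture, the same function could be fed from the file written above, for example `top_hosts 10 < dns.txt`.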
In the same way we can list the User-Agent strings seen in the capture, most frequent first:
tshark -Y 'http contains "User-Agent:"' -T fields -e http.user_agent -r tor2b.pcap | sort | uniq -c | sort -nr | less
The -Y option lets us define display filters, just as we would in Wireshark (tshark releases before 1.10 used -R for this). You can find a list of useful display filters here.
Email addresses are another interesting piece of data. We can extract them by running a regular expression over the raw data.
tshark -r tor.pcap -Y "data-text-lines" -T fields -e text > alldata.txt
grep -Eio '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b' alldata.txt | sort | uniq
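The extraction step can be checked against a few lines of sample text before running it on a large dump. This is a sketch with two deliberate variations on the command above, not a replacement for it: the top-level domain is allowed to be longer than four letters, and matches are lowercased so that `Alice@Example.com` and `alice@example.com` count as one address. The function name `extract_emails` and the sample addresses are invented for the example.

```shell
#!/bin/sh
# extract_emails (hypothetical helper): print each distinct email address
# found on stdin, lowercased, one per line.
extract_emails() {
    # -E extended regex, -i ignore case, -o print only the matched text.
    grep -Eio '[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}' | tr 'A-Z' 'a-z' | sort -u
}

# Made-up sample standing in for alldata.txt.
printf 'From: Alice@Example.com\nreply to alice@example.com or bob@example.org\n' \
    | extract_emails
# prints:
#   alice@example.com
#   bob@example.org
```

A pattern like this is intentionally loose. It will pick up some strings that are not valid addresses, which is usually acceptable for a first pass over a capture.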
We can also get a list of all the requested URLs (via the GET method):
tshark -r http-traffic.pcap -T fields -e http.host -e http.request.uri -Y 'http.request.method == "GET"' | sort | uniq | less
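The command above prints the host and the URI as two separate columns. If full URLs are easier to work with, the columns can be joined with awk. The sketch below assumes tshark's default tab separator for `-T fields` output and assumes plain `http://` as the scheme, since the capture holds unencrypted HTTP. The function name `build_urls` and the sample rows are made up.

```shell
#!/bin/sh
# build_urls (hypothetical helper): turn "host<TAB>uri" lines on stdin into
# distinct full URLs, skipping rows where the host column is empty.
build_urls() {
    awk -F '\t' 'NF == 2 && $1 != "" { print "http://" $1 $2 }' | sort -u
}

# Made-up sample standing in for the two-column tshark output.
printf 'example.com\t/index.html\nexample.com\t/index.html\nexample.org\t/a?b=1\n' \
    | build_urls
# prints:
#   http://example.com/index.html
#   http://example.org/a?b=1
```

In practice the tshark command would be piped straight into `build_urls` in place of `sort | uniq`.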
Don’t forget to take a look at the official documentation.