Page 65 - Hands-On Bug Hunting for Penetration Testers
Preparing for an Engagement Chapter 3
Downloading the JavaScript
There's one more step before we can point this at a site: we need to download the actual
JavaScript! Before analyzing the source code using our ScanJS wrapper, we need to pull it
from the target page. Pulling the code once in a single, discrete process (and from a single
URL) means that, even as we develop more tooling around attack-surface reconnaissance,
we can hook this script up to other services: it could pull the JavaScript from a URL
supplied by a crawler, it could feed JavaScript or other assets into other analysis tools, or it
could analyze other page metrics.
The simplest version of this script, then, takes a URL, inspects that page's source code to
find all of the JavaScript libraries it references, and downloads those files to the specified
location.
The first thing we need to do is grab the HTML from the URL of the page we're inspecting.
Let's add some code that accepts the url and directory CLI arguments, and defines our
target and where to store the downloaded JavaScript. Then, let's use the requests library
to pull the data and Beautiful Soup to make the HTML string a searchable object:
#!/usr/bin/env python2.7
import os, sys
import requests
from bs4 import BeautifulSoup

# Target page and output directory, taken from the command line
url = sys.argv[1]
directory = sys.argv[2]

# Fetch the page and parse its HTML into a searchable object
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
Then we need to iterate over each script tag and use the src attribute data to download the
file to a directory within our current root:
for script in soup.find_all('script'):
    if script.get('src'):
        download_script(script.get('src'))
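The download_script function called above hasn't been defined yet. As a rough sketch of what such a helper could look like (the helper name script_filename and the extra base_url and directory parameters here are assumptions, not the book's implementation, and the sketch uses Python 3 idioms for clarity), it would need to resolve the src value against the page URL, since src attributes are often relative paths, and then write the fetched file into the target directory:

```python
import os
from urllib.parse import urljoin, urlparse


def script_filename(src, base_url):
    """Resolve a (possibly relative) src against the page URL and pick a
    local filename to save it under. Hypothetical helper, for illustration."""
    full_url = urljoin(base_url, src)
    name = os.path.basename(urlparse(full_url).path) or 'script.js'
    return full_url, name


def download_script(src, base_url, directory):
    """Fetch one script and write it into the target directory."""
    # Third-party dependency; imported here so the pure helper above
    # remains usable without it installed.
    import requests

    full_url, name = script_filename(src, base_url)
    os.makedirs(directory, exist_ok=True)
    resp = requests.get(full_url)
    resp.raise_for_status()
    with open(os.path.join(directory, name), 'wb') as f:
        f.write(resp.content)
```

Separating the URL resolution into its own small function keeps that logic easy to test without making any network requests.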