Page 65 - Hands-On Bug Hunting for Penetration Testers

Preparing for an Engagement                                                 Chapter 3

            Downloading the JavaScript

            There's one more step before we can point this at a site: we need to download the actual
            JavaScript! Before analyzing the source code using our scanjs wrapper, we need to pull it
            from the target page. Pulling the code once in a single, discrete process (and from a single
            URL) means that, even as we develop more tooling around attack-surface reconnaissance,
            we can hook this script up to other services: it could pull the JavaScript from a URL
            supplied by a crawler, it could feed JavaScript or other assets into other analysis tools, or it
            could analyze other page metrics.
            The simplest version of this script, then, takes a URL, looks at the source code for that
            page to find all JavaScript libraries, and downloads those files to the specified
            location.

            The first thing we need to do is grab the HTML from the URL of the page we're inspecting.
            Let's add some code that accepts the url and directory CLI arguments, and defines our
            target and where to store the downloaded JavaScript. Then, let's use the requests library
            to pull the data and Beautiful Soup to make the HTML string a searchable object:

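                #!/usr/bin/env python2.7
                import os, sys
                import requests
                from bs4 import BeautifulSoup

                # Target page and output directory, taken from the command line
                url = sys.argv[1]
                directory = sys.argv[2]

                # Fetch the page and parse the HTML into a searchable object
                r = requests.get(url)
                soup = BeautifulSoup(r.text, 'html.parser')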
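
            Reading sys.argv directly raises an IndexError when an argument is missing. As a
            possible alternative, and not the approach used in this chapter, the same two arguments
            could be read with the standard library's argparse module, which prints a usage message
            instead. The parse_args helper name and the help strings below are assumptions made for
            illustration:

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional arguments the script expects.

    Passing argv=None makes argparse read sys.argv[1:], as a CLI would.
    """
    parser = argparse.ArgumentParser(
        description="Download the JavaScript files referenced by a page.")
    parser.add_argument("url", help="page to inspect")
    parser.add_argument("directory", help="where to store the downloaded JavaScript")
    return parser.parse_args(argv)


# Example with an explicit argument list (hypothetical values)
args = parse_args(["https://example.com/", "js_out"])
print("target: %s" % args.url)
print("output directory: %s" % args.directory)
```

            The rest of the script would then use args.url and args.directory in place of the url
            and directory variables.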

            Then we need to iterate over each script tag and use the src attribute data to download the
            file to a directory within our current root:

                for script in soup.find_all('script'):
                    src = script.get('src')
                    if src:
                        download_script(src)
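
            A src value is often relative (/static/main.js) or protocol-relative
            (//cdn.example.net/x.js), so it usually has to be resolved against the page URL before
            it can be fetched. The sketch below shows one way a function like download_script might
            do that and pick a local file name. The helper names resolve_script_url and
            local_filename, the script.js fallback, and the example URLs are assumptions for
            illustration, not part of the script above:

```python
import posixpath

try:
    from urllib.parse import urljoin, urlparse  # Python 3
except ImportError:
    from urlparse import urljoin, urlparse      # Python 2.7


def resolve_script_url(page_url, src):
    """Turn a script tag's src value into an absolute URL."""
    return urljoin(page_url, src)


def local_filename(script_url, default="script.js"):
    """Derive a file name from the URL path, ignoring any query string."""
    name = posixpath.basename(urlparse(script_url).path)
    return name or default


# Hypothetical page and src values, resolved without any network access
page = "https://example.com/app/index.html"
for src in ["/static/main.js", "vendor/lib.min.js?v=3", "//cdn.example.net/x.js"]:
    full = resolve_script_url(page, src)
    print("%s -> %s" % (full, local_filename(full)))
```

            Keeping the URL handling separate from the actual HTTP request makes it easy to check
            on its own. Two scripts with the same base name would still collide in the output
            directory, which this sketch does not address.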














