Thursday, March 23, 2023
Reverse Engineering Facebook API: Private Video Downloader


Welcome back! This is the third post in the reverse engineering series. The first post was reverse engineering the Soundcloud API and the second was reverse engineering the Facebook API to download public videos. In this post we will take a look at downloading private videos. We will reverse engineer the API calls made by Facebook and try to figure out how we can download videos in HD format (when available).

Step 1: Recon

The very first step is to open up a private video in an incognito tab just to make sure we cannot access it without logging in. This should be the response from Facebook:

Image

This confirms that we cannot access the video without logging in. Sometimes this is pretty obvious but it doesn't hurt to check.

We now know our first step: figure out a way to log into Facebook using Python. Only after that can we access the video. Let's log in using the browser and check what information is required to log in.

I won't go into much detail for this step. The gist is that the desktop website and the mobile website require roughly the same POST parameters while logging in, but interestingly, if you log in using the mobile website you don't have to supply a lot of the additional information which the desktop website requires. You can get away with doing a POST request to the following URL with your username and password:

https://m.facebook.com/login.php

We will later see that the subsequent API requests require an fb_dtsg parameter. The value of this parameter is embedded in the HTML response and can easily be extracted using regular expressions or a DOM parsing library.
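As a rough sketch of the regular-expression route, assuming the token sits in a hidden input field of the form name="fb_dtsg" value="...". The helper name extract_fb_dtsg and the sample HTML are made up for illustration; the real page will look different.

```python
import re

def extract_fb_dtsg(html):
    """Return the fb_dtsg token found in an HTML page, or None if it is absent."""
    match = re.search(r'name="fb_dtsg" value="(.+?)"', html)
    return match.group(1) if match else None

# Made-up HTML fragment, only to illustrate the shape of the markup.
sample_html = '<form><input type="hidden" name="fb_dtsg" value="AQH-example:token" /></form>'
print(extract_fb_dtsg(sample_html))
```

Returning None instead of raising makes it easier to tell a failed login (no token in the page) apart from a bug in the script.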

Let's continue exploring the website and the video API and see what we can find.

Just like what we did in the last post, open up the video, monitor the XHR requests in the Developer Tools and search for the MP4 request.

Image

The next step is to figure out where the MP4 link is coming from. I tried searching the original HTML page but couldn't find the link. This means that Facebook is using an XHR API request to get the URL from the server. We need to search through all of the XHR API requests and check their responses for the video URL. I did just that and the response of the third API request contained the MP4 link:

Image

The API request was a POST request and the URL was:

https://www.facebook.com/video/tahoe/async/10114393524323267/?chain=true&isvideo=true&originalmediaid=10214393524262467&playerorigin=permalink&playersuborigin=tahoe&ispermalink=true&numcopyrightmatchedvideoplayedconsecutively=0&storyidentifier=DzpfSTE1MzA5MDEwODE6Vks6MTAyMTQzOTMNjE4Njc&dpr=2
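A URL like this is easier to read once it is split into its path and query parameters. Here is a small sketch using only the standard library; the variable names are my own and the URL is the one captured above.

```python
from urllib.parse import urlsplit, parse_qs

api_url = ("https://www.facebook.com/video/tahoe/async/10114393524323267/"
           "?chain=true&isvideo=true&originalmediaid=10214393524262467"
           "&playerorigin=permalink&playersuborigin=tahoe&ispermalink=true"
           "&numcopyrightmatchedvideoplayedconsecutively=0"
           "&storyidentifier=DzpfSTE1MzA5MDEwODE6Vks6MTAyMTQzOTMNjE4Njc&dpr=2")

parts = urlsplit(api_url)
params = parse_qs(parts.query)

# The ID in the path is the last non-empty path segment.
path_video_id = parts.path.rstrip("/").split("/")[-1]

print("path id =", path_video_id)
for key, values in params.items():
    print(key, "=", values[0])
```

Listing the parameters one per line makes it quicker to spot which values change between videos and which stay constant.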

I tried to deconstruct the URL. The main dynamic parts of the URL seem to be originalmediaid and storyidentifier. I searched the original HTML page and found that both of these were present in the original video page. We also need to figure out the POST data sent with this request. These are the parameters which were sent:

__user: <---redacted-->
__a: 1
__dyn: <---redacted-->
__req: 3
__be: 1
__pc: PHASED:DEFAULT
__rev: <---redacted-->
fb_dtsg: <---redacted-->
jazoest: <---redacted-->
__spin_r:  <---redacted-->
__spin_b:  <---redacted-->
__spin_t:  <---redacted-->

I have redacted most of the values so that my personal information is not leaked, but you get the idea. I again searched the HTML page and was able to find most of the information there. Some information, like jazoest, was not in the HTML page, but as we move along you will see that we don't really need it to download the video. We can simply send an empty string in its place.

It seems like we have all the pieces we need to download a video. Here is an outline:

  1. Open the video after logging in
  2. Search for the parameters in the HTML response to craft the API URL
  3. Open the API URL with the required POST parameters
  4. Search for hd_src or sd_src in the response of the API request

Now let's create a script to automate these tasks for us.

Step 2: Automate it

The very first step is to figure out how the login takes place. In the recon section I mentioned that you can easily log in using the mobile website. We will do exactly that. We will log in using the mobile website and then open the homepage using the authenticated cookies so that we can extract the fb_dtsg parameter from the homepage for subsequent requests.

import requests
import re
import urllib.parse

email = ""
password = ""

# A session keeps the cookies between requests
session = requests.session()
session.headers.update({
  'User-Agent': 'Mozilla/5.0 (X11; Linux i686; rv:39.0) Gecko/20100101 Firefox/39.0'
})
# Visit the mobile site first to pick up the initial cookies, then log in
response = session.get('https://m.facebook.com')
response = session.post('https://m.facebook.com/login.php', data={
  'email': email,
  'pass': password
}, allow_redirects=False)

Replace the email and password variables with your email and password and this script should log you in. How do we know whether we have successfully logged in? We can check for the presence of the 'c_user' key in the cookies. If it exists then the login was successful.

Let's check that and extract fb_dtsg from the homepage. While we are at it, let's extract the user_id from the cookies as well because we will need it later.

if 'c_user' in response.cookies:
    # Login was successful
    homepage_resp = session.get('https://m.facebook.com/home.php')
    fb_dtsg = re.search('name="fb_dtsg" value="(.+?)"', homepage_resp.text).group(1)
    user_id = response.cookies['c_user']

So now we need to open up the video page, extract all the required API POST arguments from it and do the POST request.

if 'c_user' in response.cookies:
    # Login was successful
    homepage_resp = session.get('https://m.facebook.com/home.php')
    fb_dtsg = re.search('name="fb_dtsg" value="(.+?)"', homepage_resp.text).group(1)
    user_id = response.cookies['c_user']

    video_url = "https://www.facebook.com/username/videos/101214393524261127/"
    video_id = re.search('videos/(.+?)/', video_url).group(1)

    # The story identifier is embedded in the HTML of the video page
    video_page = session.get(video_url)
    identifier = re.search('ref=tahoe","(.+?)"', video_page.text).group(1)
    final_url = "https://www.facebook.com/video/tahoe/async/{0}/?chain=true&isvideo=true&originalmediaid={0}&playerorigin=permalink&playersuborigin=tahoe&ispermalink=true&numcopyrightmatchedvideoplayedconsecutively=0&storyidentifier={1}&dpr=2".format(video_id,identifier)

    data = {'__user': user_id,
            '__a': '',
            '__dyn': '',
            '__req': '',
            '__be': '',
            '__pc': '',
            '__rev': '',
            'fb_dtsg': fb_dtsg,
            'jazoest': '',
            '__spin_r': '',
            '__spin_b': '',
            '__spin_t': '',
    }
    api_call = session.post(final_url, data=data)
    try:
        # re.search returns None when there is no HD source, so .group raises AttributeError
        final_video_url = re.search('hd_src":"(.+?)",', api_call.text).group(1)
    except AttributeError:
        final_video_url = re.search('sd_src":"(.+?)"', api_call.text).group(1)
    print(final_video_url)

You might be wondering what the data dictionary is doing and why there are a lot of keys with empty values. Like I said during the recon process, I tried making successful POST requests using the minimal amount of data. As it turns out, Facebook only cares about the fb_dtsg and __user keys. You can let everything else be an empty string. Make sure that you do send these keys with the request though. It doesn't work if a key is entirely absent.
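One way to keep that rule in a single place is a small helper that fills every expected key with an empty string and then sets the two values that matter. This is only a sketch: build_payload and EXPECTED_KEYS are names I made up, and the key list is simply the one captured during recon.

```python
# Keys seen in the captured request. All of them must be present,
# but only __user and fb_dtsg need real values.
EXPECTED_KEYS = ['__user', '__a', '__dyn', '__req', '__be', '__pc', '__rev',
                 'fb_dtsg', 'jazoest', '__spin_r', '__spin_b', '__spin_t']

def build_payload(user_id, fb_dtsg):
    """Return the POST data with every key present and unused keys left empty."""
    payload = dict.fromkeys(EXPECTED_KEYS, '')
    payload['__user'] = user_id
    payload['fb_dtsg'] = fb_dtsg
    return payload

# Placeholder values, only to show the resulting dictionary.
print(build_payload('1234567890', 'example-token'))
```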

At the very end of the script we first search for the HD source and then the SD source of the video. If the HD source is found we output that, and if not we output the SD source.
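The same fallback can be written as a loop over the two keys, which avoids relying on an exception for control flow. The sketch below assumes the URL in the response is JSON-escaped (slashes written as \/). The function name pick_video_url and the sample response are invented for illustration and are not real API output.

```python
import re

def pick_video_url(response_text):
    """Prefer the HD source, fall back to SD, and return None if neither is found."""
    for key in ('hd_src', 'sd_src'):
        match = re.search(key + r'":"(.+?)"', response_text)
        if match:
            # Assumption: the URL is JSON-escaped, so drop the backslashes.
            return match.group(1).replace('\\', '')
    return None

# Invented sample where no HD source is available.
sample = r'{"hd_src":null,"sd_src":"https:\/\/video.example\/clip.mp4"}'
print(pick_video_url(sample))
```

Returning None when neither key is present lets the caller print a clear message instead of crashing on a missing match.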

Our final script looks something like this:

import requests
import re
import urllib.parse
import sys

# Usage: python facebook_downloader.py video_url email password
email = sys.argv[-2]
password = sys.argv[-1]

print("Email: "+email)
print("Pass:  "+password)

session = requests.session()
session.headers.update({
  'User-Agent': 'Mozilla/5.0 (X11; Linux i686; rv:39.0) Gecko/20100101 Firefox/39.0'
})
response = session.get('https://m.facebook.com')
response = session.post('https://m.facebook.com/login.php', data={
  'email': email,
  'pass': password
}, allow_redirects=False)

if 'c_user' in response.cookies:
    # Login was successful
    homepage_resp = session.get('https://m.facebook.com/home.php')
    fb_dtsg = re.search('name="fb_dtsg" value="(.+?)"', homepage_resp.text).group(1)
    user_id = response.cookies['c_user']

    video_url = sys.argv[-3]
    print("Video url:  "+video_url)
    video_id = re.search('videos/(.+?)/', video_url).group(1)

    video_page = session.get(video_url)
    identifier = re.search('ref=tahoe","(.+?)"', video_page.text).group(1)
    final_url = "https://www.facebook.com/video/tahoe/async/{0}/?chain=true&isvideo=true&originalmediaid={0}&playerorigin=permalink&playersuborigin=tahoe&ispermalink=true&numcopyrightmatchedvideoplayedconsecutively=0&storyidentifier={1}&dpr=2".format(video_id,identifier)

    data = {'__user': user_id,
            '__a': '',
            '__dyn': '',
            '__req': '',
            '__be': '',
            '__pc': '',
            '__rev': '',
            'fb_dtsg': fb_dtsg,
            'jazoest': '',
            '__spin_r': '',
            '__spin_b': '',
            '__spin_t': '',
    }
    api_call = session.post(final_url, data=data)
    try:
        final_video_url = re.search('hd_src":"(.+?)",', api_call.text).group(1)
    except AttributeError:
        final_video_url = re.search('sd_src":"(.+?)"', api_call.text).group(1)

    # Strip the backslashes that escape the slashes in the returned URL
    print(final_video_url.replace('\\', ''))

I made a couple of changes to the script. I used sys.argv to get video_url, email and password from the command line. You can hardcode your username and password if you want.
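If you would rather not leave your password in the shell history, one possible variation (not what the script above does) is to parse the arguments with argparse and prompt for the password with getpass when it is left out. The helper names parse_args and resolve_password are my own.

```python
import argparse
import getpass

def parse_args(argv=None):
    """Parse video_url, email and an optional password from the command line."""
    parser = argparse.ArgumentParser(description="Print the source URL of a Facebook video")
    parser.add_argument("video_url")
    parser.add_argument("email")
    # The password is optional; when omitted it can be prompted for instead.
    parser.add_argument("password", nargs="?")
    return parser.parse_args(argv)

def resolve_password(args, prompt=getpass.getpass):
    """Use the password from the command line if given, otherwise ask for it."""
    if args.password is not None:
        return args.password
    return prompt("Password: ")

# Example with explicit arguments instead of the real command line.
args = parse_args(["https://www.facebook.com/username/videos/101214393524261127/",
                   "me@example.com", "secret"])
print(args.video_url, args.email)
```

With this variation the arguments keep the same order as before, and the prompt only appears when the third argument is missing.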

Save the final script as facebook_downloader.py and run it like this:

$ python facebook_downloader.py video_url email password

Replace video_url with the actual video URL, like https://www.facebook.com/username/videos/101214393524261127/, and replace email and password with your actual email and password.

After running this script, it will output the source URL of the video to the terminal. You can open the URL in your browser and from there you should be able to right-click and download the video easily.

I hope you guys enjoyed this quick tutorial on reverse engineering the Facebook API for making a video downloader. If you have any questions/comments/suggestions please put them in the comments below or email me. I will look at reverse engineering a different website for my next post. Follow my blog to stay updated!

Thanks! Have a great day!
