Hi everyone, in this post we'll learn how to bypass downloading restrictions on Soundcloud. We are going to create a Python script which will let us download even those songs which aren't enabled for downloading. We will work on this project step by step, dealing with each problem as we encounter it. I'll try to keep it as general as possible so that you can follow this project even if Soundcloud has changed its website layout or the way it serves media files. So without any further ado, let's get started:
Note: I don't endorse illicit downloading of someone else's content. This is merely an educational guide and should be used to download your own content only.
1. Reverse Engineering the MP3 URL Generation Logic
Let's start by opening up Chrome. Soundcloud doesn't provide us with the .mp3 url on the media page, so we need to figure out how and from where Soundcloud gets the .mp3 url. Open up Soundcloud and open this publicly accessible music file, which we will be using for testing purposes.
Now we need to open the Chrome developer tools. The network tab in the Chrome developer tools lets us see all of the requests which the browser makes when we open Soundcloud. After opening up the developer tools and navigating to the network tab you should end up with something similar to this:
Now refresh the page with the developer tools open. You should start seeing the requests pane getting populated by tons of different links. Don't feel intimidated, we'll make sense of all of this in just a bit. You can see that there are already 100+ requests being made by Soundcloud. We need to find a way to filter the requests so that they become manageable for us to sift through.
While looking at the requests in general I noticed that Soundcloud is making several requests to an api.soundcloud.com endpoint. If you ever see any requests being made to an api endpoint, always explore those first. Here is what you will end up with after filtering those requests which have api in their url:
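The same filtering idea can be sketched in Python for a list of urls copied out of the network tab. This is only an illustration: the sample urls other than the stream url are made up, and matching on a hostname that starts with "api." is an assumption that is a bit stricter than typing "api" into the filter box.

```python
from urllib.parse import urlparse


def filter_api_requests(urls):
    """Keep only the urls whose hostname starts with 'api.'."""
    return [u for u in urls if (urlparse(u).hostname or "").startswith("api.")]


# Hypothetical sample of urls copied from the network tab
captured = [
    "https://soundcloud.com/m-yasoob-khalid/shutdown",
    "https://a-v2.sndcdn.com/assets/app.js",
    "https://api.soundcloud.com/i1/tracks/391350885/streams?client_id=6pDzV3ImgWPohE7UmVQOCCepAaKOgrVL",
]

api_requests = filter_api_requests(captured)
for u in api_requests:
    print(u)
```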
After filtering the requests I noticed that there was a stream url. That caught my attention because most of the time stream urls do exactly what the name suggests: they stream the media content. So I clicked on the stream link and looked at the response we were getting from Soundcloud on that endpoint:
And lo and behold, that endpoint returns a couple of media links. The one we are interested in is the http_mp3_128_url, because plain MP3 links are usually the most straightforward to download.
There seems to be a problem. Whenever we try opening the http_mp3_128_url url in a new tab we are greeted with a 403 Forbidden error. There is definitely something fishy going on, because if I scroll down in the developer tools I can see that Soundcloud is successfully accessing that url without any Forbidden error. Most of the time the server checks the headers and cookies of the browser to verify that an authorized person is accessing the endpoint. However, I'm not logged in, so there must be something else going on.
After refreshing the page a couple of times I noticed that the http_mp3_128_url url changed after every refresh. That must mean that the urls are for one-time use only and are programmatically generated on every access. After the browser plays the media file for the first time, the urls expire, and that is why we were getting a Forbidden error. To verify my observation I opened the stream url in a new tab and then tried accessing the http_mp3_128_url url myself, before the Soundcloud player did.
Suddenly we are able to access the media file without the Forbidden error!
Now we need to deconstruct the stream url as well so that we can generate it ourselves. The stream url in my case is this:
https://api.soundcloud.com/i1/tracks/391350885/streams?client_id=6pDzV3ImgWPohE7UmVQOCCepAaKOgrVL
Everything seems pretty generic. The client_id is definitely the SoundCloud API key, because I'm not logged in. The interesting part of the url is 391350885, which isn't a part of the original media url. Where did this number come from?
I filtered the network requests with this number and couldn't find its source. The very next thing I did was to search the HTML source of the page and bam! The track number was embedded in that!
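To make the two variable parts of the stream url explicit, here is a small sketch that splits the url shown earlier into the track number and the client_id, and then rebuilds it. The helper names split_stream_url and build_stream_url are made up for this illustration, and the url layout is assumed to stay as shown in this post.

```python
from urllib.parse import urlparse, parse_qs

# The stream url observed in the developer tools
STREAM_URL = "https://api.soundcloud.com/i1/tracks/391350885/streams?client_id=6pDzV3ImgWPohE7UmVQOCCepAaKOgrVL"


def split_stream_url(stream_url):
    """Return (track_id, client_id) from a stream url (hypothetical helper)."""
    parsed = urlparse(stream_url)
    # The path looks like /i1/tracks/<track_id>/streams
    segments = parsed.path.strip("/").split("/")
    track_id = segments[segments.index("tracks") + 1]
    client_id = parse_qs(parsed.query)["client_id"][0]
    return track_id, client_id


def build_stream_url(track_id, client_id):
    """Rebuild the stream url from its two variable parts (hypothetical helper)."""
    return "https://api.soundcloud.com/i1/tracks/{0}/streams?client_id={1}".format(track_id, client_id)


track_id, client_id = split_stream_url(STREAM_URL)
print(track_id)
print(client_id)
```

If both parts are known, the stream url can be generated for any track without opening the browser.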
Now that we know how Soundcloud generates the .mp3 url, it is the perfect time to write a script to automate this. The script should take in a Soundcloud url and return an mp3 url. So let's get started.
2. Creating a Python Script for Automating the URL Generation
Start by creating an app.py file in your directory. This will hold all of the required code.
$ touch app.py
Now import the required libraries. We will be using requests for making the HTTP requests, sys for taking command-line inputs and re for extracting the text from the HTML page. A lot of people object to the usage of re for extracting text from HTML, but in this case, where we know that we are only extracting a small piece of text from the page, it is fine.
import requests
import re
import sys
Let's write down the initial code for taking in a Soundcloud URL from the command line and opening up the Soundcloud page using requests.
import sys
import requests
import re
url = sys.argv[-1]
html = requests.get(url)
We aren't using argparse because we will soon be converting this script into a web API. Now we need to find a way to extract the track id from the page. Here is a simple regex which works:
track_id = re.search(r'soundcloud://sounds:(.+?)"', html.text)
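To see what this regex captures, here is a small offline sketch. The HTML snippet is a made-up stand-in for the real page source, and extract_track_id is a hypothetical helper name. It also returns None when the pattern is missing, which is useful because re.search returns None in that case and calling .group(1) on it would raise an error.

```python
import re

# Same pattern as in the script: capture everything after "sounds:" up to the next quote
TRACK_ID_PATTERN = re.compile(r'soundcloud://sounds:(.+?)"')


def extract_track_id(html_text):
    """Return the track id found in the page source, or None if it is absent."""
    match = TRACK_ID_PATTERN.search(html_text)
    return match.group(1) if match else None


# Made-up snippet that only mimics the part of the page we care about
sample_html = '<meta property="al:ios:url" content="soundcloud://sounds:391350885">'

print(extract_track_id(sample_html))      # 391350885
print(extract_track_id("<html></html>"))  # None
```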
Now we need to open up the api url and get the actual mp3 stream link. To do that, add the following code to your Python file:
final_page = requests.get("https://api.soundcloud.com/i1/tracks/{0}/streams?client_id=6pDzV3ImgWPohE7UmVQOCCepAaKOgrVL".format(track_id.group(1)))
print(final_page.json()['http_mp3_128_url'])
And there you go. You have the complete script which will give you the mp3 link from a Soundcloud media url. Here is the complete code:
import sys
import requests
import re

# The Soundcloud track url is the last command-line argument
url = sys.argv[-1]
html = requests.get(url)
# The numeric track id is embedded in the page source
track_id = re.search(r'soundcloud://sounds:(.+?)"', html.text)
# Ask the streams endpoint for the media links of that track
final_page = requests.get("https://api.soundcloud.com/i1/tracks/{0}/streams?client_id=6pDzV3ImgWPohE7UmVQOCCepAaKOgrVL".format(track_id.group(1)))
print(final_page.json()['http_mp3_128_url'])
Go on, save this in a file and run it. But the problem is that this isn't terribly useful. How about we turn this into a web app which anyone can use? Now that would be much more useful.
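Before moving on, note that the script fails with an unhelpful traceback if the page has no track id or the API response has no http_mp3_128_url key. The sketch below is one possible way to make those failures explicit. It is not the script from this post: it swaps requests for the standard library's urllib so it has no dependencies, the client id is passed in as a parameter, and get_mp3_url, fetch_text and fake_fetch are hypothetical names. The canned responses in fake_fetch are made up so the example runs offline.

```python
import json
import re
from urllib.request import urlopen

STREAMS_URL = "https://api.soundcloud.com/i1/tracks/{0}/streams?client_id={1}"


def fetch_text(url):
    """Default fetcher: download a url and return the body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def get_mp3_url(page_url, client_id, fetch=fetch_text):
    """Resolve a track page url to its mp3 url, raising ValueError on failure."""
    html = fetch(page_url)
    match = re.search(r'soundcloud://sounds:(.+?)"', html)
    if match is None:
        raise ValueError("no track id found in page: " + page_url)
    streams = json.loads(fetch(STREAMS_URL.format(match.group(1), client_id)))
    if "http_mp3_128_url" not in streams:
        raise ValueError("streams response has no http_mp3_128_url")
    return streams["http_mp3_128_url"]


def fake_fetch(url):
    # Canned, made-up responses so the sketch runs without network access
    if "api.soundcloud.com" in url:
        return '{"http_mp3_128_url": "https://example.com/track.mp3"}'
    return '<meta content="soundcloud://sounds:391350885">'


print(get_mp3_url("https://soundcloud.com/m-yasoob-khalid/shutdown", "YOUR_CLIENT_ID", fetch=fake_fetch))
```

Passing the fetcher in as a parameter is just a convenience for trying the logic without hitting Soundcloud; leaving it out uses the real network fetcher.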
3. Turning this into a web app
We will be using Flask to convert this into a web app. The Flask website provides us with some very basic code which we can use as our starting point.
from flask import Flask
app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"
Save the above code in an app.py file. Run the following command in the terminal:
$ FLASK_APP=app.py flask run
This will tell the flask command line program where to find the Flask code which it needs to serve. If everything is working fine, you should see the following output:
* Running on http://localhost:5000/
Now we need to implement a custom URL endpoint which will take the Soundcloud media URL as the input and redirect the user to the MP3 file URL. Let's name our custom endpoint /generate_link and make it accept query parameters.
@app.route("/generate_link")
def generate_link():
    # Read the "url" query parameter; fall back to an empty string if it is missing
    media_url = request.args.get('url', '')
    return media_url
Our custom endpoint doesn't really do anything yet. It simply echoes back whatever you pass it through the url query parameter. The reason for not implementing the rest of the functionality is that we haven't actually converted our previous script into a module. Let's do that real quick first:
import requests
import re

def get_link(url):
    # Fetch the Soundcloud track page
    html = requests.get(url)
    # The numeric track id is embedded in the page source
    track_id = re.search(r'soundcloud://sounds:(.+?)"', html.text)
    # Ask the streams endpoint for the media links of that track
    final_page = requests.get("https://api.soundcloud.com/i1/tracks/{0}/streams?client_id=6pDzV3ImgWPohE7UmVQOCCepAaKOgrVL".format(track_id.group(1)))
    return final_page.json()['http_mp3_128_url']
I'm assuming that this module is saved in the same directory as your Flask app. Here is my current directory structure:
$ ls
app.py soundcloudDownload.py
The soundcloudDownload.py file contains the script (now converted to a module) and the app.py file contains the Flask app. Now let's import soundcloudDownload.py into our app.py file and implement the functionality to make the web app a bit more useful:
from soundcloudDownload import get_link
from flask import Flask, request
app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

@app.route("/generate_link")
def generate_link():
    media_url = request.args.get('url', '')
    # Respond with the mp3 url as plain text
    return get_link(media_url)
Now restart your Flask app in your terminal and try accessing the following url:
http://localhost:5000/generate_link?url=https://soundcloud.com/m-yasoob-khalid/shutdown
If everything works fine you should get something similar to this in the response in your browser:
https://cf-media.sndcdn.com/og4Ho8QAsLWj.128.mp3?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiKjovL2NmLW1lZGl
hLnNuZGNkbi5jb20vb2c0SG84UUFzTFdqLjEyOC5tcDMiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOj
E1MTczNTA3NDV9fX1dfQ__&Signature=XQAyN~Atl8OGeqwmxKa7Zx7S50YX229mdIq-XiU753cGKEmWac8FGK~GdSylj0Uo2sqBnJxzDA
fC3Ahv1MbY~LPGQ8A-q36-vwF6Z5v88-BvflDMmYuXnj0gqWvolR1GMq6SsgMPRGCfNu4D8cS0NckRCif8dGCEQxQVQ2laSCC4e4lpkuqtS
gJOJ6L26N8zrma~2lCJc7TxqCp3~aROuejC-4JVm7P6f4vtB38-l7vT-nWjrsHNC33YLI~Kex6ciOeRGGmFU-eyUDSpooIzrfj6wiR-1A66
MLWFkuUoKboSRfy9Zz6zFSqgPTXZKePHKoKuMzDjEAV42j5Gbm8dgQ__&Key-Pair-Id=APKAJAGZ7VMH2PFPW6UQ
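As an aside, the long first query parameter of this sample url can be decoded, and doing so backs up the earlier observation that these links expire. The sketch below assumes the value uses the character substitution that Amazon CloudFront signed urls apply to base64 ('-' for '+', '_' for '=' and '~' for '/'), which fits the Key-Pair-Id parameter at the end of the url. The function name decode_policy is made up for this example.

```python
import base64
import json

# Value of the first query parameter, copied from the sample response url
ENCODED_POLICY = (
    "eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiKjovL2NmLW1lZGl"
    "hLnNuZGNkbi5jb20vb2c0SG84UUFzTFdqLjEyOC5tcDMiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOj"
    "E1MTczNTA3NDV9fX1dfQ__"
)


def decode_policy(encoded):
    """Undo the url-safe substitutions, then base64-decode and parse the JSON."""
    standard = encoded.replace("-", "+").replace("_", "=").replace("~", "/")
    return json.loads(base64.b64decode(standard).decode("utf-8"))


policy = decode_policy(ENCODED_POLICY)
statement = policy["Statement"][0]
print(statement["Resource"])
# Unix timestamp after which the link is no longer valid
print(statement["Condition"]["DateLessThan"]["AWS:EpochTime"])
```

The decoded policy names the mp3 file and carries a DateLessThan timestamp, which suggests the generated links are at least time-limited.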
The user still needs to copy the url and open it in a new tab. Let's improve the situation by implementing automatic redirection to the MP3 page:
from soundcloudDownload import get_link
from flask import Flask, request, redirect
app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

@app.route("/generate_link")
def generate_link():
    media_url = request.args.get('url', '')
    # Send the browser straight to the mp3 file with a temporary redirect
    return redirect(get_link(media_url), code=302)
Now when you try opening the same generate_link url in your browser you should be redirected to an mp3 file. Great! Everything is working perfectly fine and, as promised, you have reverse engineered the Soundcloud web app and figured out a way to download mp3 files.
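One thing the endpoint does not do is check its input: an empty or non-Soundcloud url parameter goes straight into get_link. Here is a minimal sketch of a check that could run first. The helper name is_soundcloud_url and the rule "http or https on soundcloud.com or one of its subdomains" are assumptions for this illustration, not something Soundcloud or Flask prescribes.

```python
from urllib.parse import urlparse


def is_soundcloud_url(url):
    """Return True only for http(s) urls on soundcloud.com or a subdomain of it."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    on_soundcloud = host == "soundcloud.com" or host.endswith(".soundcloud.com")
    return parsed.scheme in ("http", "https") and on_soundcloud


print(is_soundcloud_url("https://soundcloud.com/m-yasoob-khalid/shutdown"))
print(is_soundcloud_url(""))
print(is_soundcloud_url("https://example.com/soundcloud.com"))
```

Inside generate_link, the result of this check could decide whether to call get_link or to respond with an error message instead.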
4. Further Steps
Now we could go ahead and implement a usable web interface for this web API, but I will leave that as an exercise for the reader. Search online for how to use Jinja templates with Flask and then make a front-end for this. You can also create a browser extension which injects a download button into all of the Soundcloud media pages. That way the user won't even have to copy the url. They can simply click the download button and the download will start. The end goal is to remove as many steps as possible and streamline the process. A simple rule of thumb is that the fewer steps required to accomplish a task, the more usable a service or app is.
I might turn this into a web app with search functionality and an MP3 player. To stay tuned, please follow my blog.