How to download website offline with authenticated username and password?

Writer Matthew Martinez

I have an account on the tutorial website testdriven.io, and I would like to download the tutorials for offline use so that my team members can learn from them without having to log in with my credentials.

So, I tried several ways without success.

First, I logged in to my account in the browser and started the download with wget -r --mirror -p --convert-links -P . . However, the result was an offline copy of the site as seen without logging in, so the tutorial content was limited accordingly.

Second, I tried passing the login credentials as POST data, like this:

wget --save-cookies cookies.txt \
     --keep-session-cookies \
     --post-data 'login=&password=z9vi2gE82lO@sTN' \
     --delete-after \

However, it returned:

--2019-12-18 02:01:22--
Resolving testdriven.io (testdriven.io)... 104.27.143.239, 104.27.142.239, 2606:4700:30::681b:8eef, ...
Connecting to testdriven.io (testdriven.io)|104.27.143.239|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-12-18 02:01:23 ERROR 403: Forbidden.

So, how can I download the full tutorial for offline use while authenticating with my username and password? Thanks.

2 Answers

The website will store your auth information in a cookie.

You can find this in your browser's network inspector. Look under the request headers and grab the cookies for use with wget.

You will need to pass the cookie into wget, and theoretically maintain a cookie jar as well using --save-cookies and --load-cookies.

For example:

wget -r --mirror -p --convert-links -P . \
     --header="Cookie: __cfduid=ddebc00435655a6a20430c65436f729851576611229; csrftoken=6QuufXScgoQkyEe18dAL9YmqhxlyJpegNtyMCr4LgAUuvBs3KUzQwqEYBvWZV4yg; sessionid=c5gbfxkhqwpblxlhatgfh3wtfgy0zgpp" \
     --save-cookies cookies.txt \
     --load-cookies cookies.txt \
     --accept-regex '/courses/' \
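
If you would rather keep the cookies in a file than in a --header string, the following is a minimal sketch that turns the cookie string copied from the browser into a Netscape-format cookies.txt, which is the format --load-cookies reads. The helper name cookie_header_to_jar is hypothetical (it is not part of wget), and the one-day expiry, the "/" path, and the TRUE flags for subdomains and HTTPS-only are assumptions you may need to adjust for your site.

```shell
#!/bin/sh
# Convert a "name=value; name2=value2" cookie string, copied from the
# browser's network inspector, into Netscape cookie-file lines on stdout.
# cookie_header_to_jar is a hypothetical helper, not a wget feature.
cookie_header_to_jar() {
    domain=$1    # site the cookies belong to, e.g. testdriven.io
    header=$2    # cookie string, e.g. "csrftoken=...; sessionid=..."
    # Assumption: treat every cookie as valid for one day from now.
    expiry=$(( $(date +%s) + 86400 ))
    printf '# Netscape HTTP Cookie File\n'
    printf '%s\n' "$header" | tr ';' '\n' | while IFS= read -r pair; do
        pair=${pair# }              # drop the space that follows each ';'
        [ -n "$pair" ] || continue  # skip empty pieces (trailing ';')
        name=${pair%%=*}            # text before the first '='
        value=${pair#*=}            # text after the first '='
        # Fields: domain, subdomains flag, path, HTTPS-only flag,
        # expiry (epoch seconds), name, value. All tab-separated.
        printf '%s\tTRUE\t/\tTRUE\t%s\t%s\t%s\n' \
            "$domain" "$expiry" "$name" "$value"
    done
}
```

Redirect the function's output to cookies.txt and pass that file to wget with --load-cookies cookies.txt in place of the --header option. The session will still stop working once the server expires it, at which point the cookie string has to be copied from the browser again.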

Read man wget, especially the part that says:

 --user=user --password=password Specify the username user and password password for both FTP and HTTP file retrieval. These parameters can be overridden using the --ftp-user and --ftp-password options for FTP connections and the --http-user and --http-password options for HTTP connections.
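
Keep in mind that --user and --password supply HTTP authentication credentials (for example Basic or Digest), which is a different mechanism from a login form that sets a session cookie. A server that wants HTTP authentication answers with a WWW-Authenticate header, so one rough way to tell the two apart is to look for that header in the response. The sketch below assumes you feed it the header output of wget --server-response; the function name uses_http_auth is hypothetical.

```shell
#!/bin/sh
# Read HTTP response headers on stdin and succeed (exit status 0) only if
# the server asked for HTTP authentication via a WWW-Authenticate header.
# uses_http_auth is a hypothetical helper, not a wget feature.
uses_http_auth() {
    # wget indents the headers it prints, so allow leading whitespace.
    grep -qi '^[[:space:]]*WWW-Authenticate:'
}

# Possible usage, with $url standing in for the page you want to fetch
# (wget writes the headers to stderr, hence the 2>&1):
#   wget --server-response --spider "$url" 2>&1 | uses_http_auth \
#       && echo "HTTP auth: --user/--password may apply" \
#       || echo "no HTTP auth challenge: probably a form or cookie login"
```

If no such header appears, --user and --password are unlikely to help, and the cookie-based approach is the one to pursue.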

Read about all the wget options. Would this one help?

--metalink-over-http Issues HTTP HEAD request instead of GET and extracts Metalink metadata from response headers. Then it switches to Metalink download. If no valid Metalink metadata is found, it falls back to ordinary HTTP download.
