Wget and the user agent option

Wget is a command-line utility for downloading files.

The man page on my Linux distribution describes it as a “free utility for non-interactive download of files from the Web” that “supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.”

It comes in especially handy when I need to download an image from another Web resource. I had never had a problem with it until today, when a download failed with a 403 Forbidden error. The edited output is shown in this code block:

wget http://www.example-site.com/image.png
--2015-01-12 19:23:14--  http://www.example-site.com/image.png
Resolving www.example-site.com (www.example-site.com)... 555.111.111.111, 555.111.113.112
Connecting to www.example-site.com (www.example-site.com)|555.111.111.111|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2015-01-12 19:23:14 ERROR 403: Forbidden.

Years ago I read somewhere that this usually indicates an Apache server that has been configured via .htaccess to deny file downloads, or something of that nature. Since I really needed the image file for an article, I decided to poke around the documentation and search the Internet for clues. That effort drew my attention to one of the tool’s options: the --user-agent (or -U) option.

Here’s what the man page has to say about it:

The HTTP protocol allows the clients to identify themselves using a “User-Agent” header field. This enables distinguishing the WWW software, usually for statistical purposes or for tracing of protocol violations. Wget normally identifies as Wget/version, version being the current version number of Wget.

However, some sites have been known to impose the policy of tailoring the output according to the “User-Agent”-supplied information. While this is not such a bad idea in theory, it has been abused by servers denying information to clients other than (historically) Netscape or, more frequently, Microsoft Internet Explorer. This option allows you to change the “User-Agent” line issued by Wget. Use of this option is discouraged, unless you really know what you are doing.

Specifying empty user agent with --user-agent="" instructs Wget not to send the “User-Agent” header in HTTP requests.
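To check which “User-Agent” line is actually going out on the wire, Wget's debug output can help. The following is only a minimal sketch: it assumes the installed Wget was compiled with debug support (the man page warns that -d will not work otherwise) and that the request is printed between “---request begin---” and “---request end---” markers. The helper name show_request is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print only the HTTP request that wget sends.
# Assumes wget's debug output wraps the request in
# "---request begin---" / "---request end---" marker lines.
show_request() {
    # "$@" lets extra options (for example --user-agent="") precede the URL.
    # -O /dev/null discards the downloaded body; debug output goes to stderr,
    # so it is merged into stdout before filtering.
    wget --debug -O /dev/null "$@" 2>&1 |
        sed -n '/^---request begin---/,/^---request end---/p'
}
```

Running show_request http://www.example-site.com/image.png should list the request headers, including a User-Agent: Wget/... line. Adding --user-agent="" before the URL should make that line disappear from the listing.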

With that, I decided to retry the download by specifying an empty user agent as shown in this code block:

wget --user-agent="" http://www.example-site.com/image.png
--2015-01-12 19:23:14--  http://www.example-site.com/image.png
Resolving www.example-site.com (www.example-site.com)... 555.111.111.111, 555.111.113.112
Connecting to www.example-site.com (www.example-site.com)|555.111.111.111|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 39946 (39K) [image/png]
Saving to: ‘image.png’

image.png         100%[=========================>]  39.01K  --.-KB/s   in 0.07s  

2015-01-12 19:37:58 (567 KB/s) - ‘image.png’ saved [39946/39946]

That was all it took. There may be other situations where the --user-agent (or -U) option won’t help, but in this specific case, it was just what the doctor ordered.
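For servers where an empty user agent is not enough, a small shell helper could cycle through several values until one is accepted. This is only a sketch under a few assumptions: the function name fetch_with_fallback is invented for illustration, and the only Wget feature it relies on is the --user-agent option quoted from the man page above. It also assumes that wget exits with a non-zero status when the server answers with an error such as 403.

```shell
#!/bin/sh
# Hypothetical helper: try a download with each User-Agent value in turn
# and stop at the first one the server accepts.
# Usage: fetch_with_fallback URL UA [UA...]
fetch_with_fallback() {
    url=$1
    shift
    for ua in "$@"; do
        # An empty value means "send no User-Agent header at all".
        if wget --user-agent="$ua" "$url"; then
            printf 'succeeded with User-Agent: "%s"\n' "$ua"
            return 0
        fi
    done
    echo "all User-Agent values were rejected" >&2
    return 1
}
```

For example, fetch_with_fallback http://www.example-site.com/image.png "" "Mozilla/5.0 (X11; Linux x86_64)" would first send no User-Agent header and then fall back to a browser-like string. That string is just an illustrative value, and the man page's caution about using this option only when you know what you are doing still applies.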
