
Wget tricks and hacks with Linux part 1

wget stands for "web get". Wget is a terminal-based download manager that lets you download content from the web to your system efficiently. It supports the HTTP, HTTPS and FTP protocols, and files can be retrieved through proxies too. Wget is open source software. It is very powerful and has many awesome features:

Features of wget

One of its cool features is that it works in the background even if you are not logged in. It does not need your constant presence like many other download managers do.

It has resume support.

Wget has been designed for robustness over slow or unstable network connections. If a download fails due to a network problem, it will keep retrying until the whole file has been retrieved. If the server supports resuming, it will instruct the server to continue the download from where it left off.

Let's explore its awesome features with some real work. So fire up your terminal and type the following command:

wget --help

It will show you all the options available with wget. There are a lot of cool things you can do with it:

root@seven:~# wget --help
GNU Wget 1.16, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...

Mandatory arguments to long options are mandatory for short options too.

Startup:
  -V,  --version                   display the version of Wget and exit.
  -h,  --help                      print this help.
  -b,  --background                go to background after startup.
  -e,  --execute=COMMAND           execute a `.wgetrc'-style command.

Logging and input file:
  -o,  --output-file=FILE          log messages to FILE.
  -a,  --append-output=FILE        append messages to FILE.
  -d,  --debug                     print lots of debugging information.
  -q,  --quiet                     quiet (no output).
  -v,  --verbose                   be verbose (this is the default).
  -nv, --no-verbose                turn off verboseness, without being quiet.
       --report-speed=TYPE         Output bandwidth as TYPE.  TYPE can be bits.
  -i,  --input-file=FILE           download URLs found in local or external FILE.
  -F,  --force-html                treat input file as HTML.
  -B,  --base=URL                  resolves HTML input-file links (-i -F)
                                   relative to URL.
       --config=FILE               Specify config file to use.
       --no-config                 Do not read any config file.


Download a single web page with wget

Just pass the address or URL to wget and it will download the page for you.

root@seven:~# wget www.google.com

Your files are saved into your present working directory, so check your directory and you will find a downloaded file named index.html.

Download Entire website with Wget

In order to download a full website we need to give the -r option. wget will then download recursively, which means it downloads everything linked from the webpage, like images, pages, etc.

root@seven:~# wget -r http://www.demoweb.com

-r     tells wget to download recursively. It downloads all the images, pages and data linked from the front page.

But many sites do not want you to download their entire site. To prevent this, they check how the client identifies itself. Many sites refuse the connection or send a blank page if they detect you are not using a web browser. You might get a message like:
Sorry, but the download manager you are using to view this site is not supported. We do not support use of such download managers as flashget, go!zilla, or getright.

There is a -U option that can be used to tell the site that you are using a commonly accepted browser.

So use the command below:

root@seven:~# wget  -r -p -U Mozilla http://www.demosite.com

-U     with -U you specify the browser (the user agent string) that wget should identify itself as.


Save file under different name and location

When you download a webpage or any other file, it is always saved into your present working directory and the name is inherited from the source. We can change both the name and the location. Type the following command:

root@seven:~# wget -O /root/Desktop/googlesourcecode www.google.com

-O specifies the name and the location where you want your file to be saved.

Resuming files with Wget

Now let's say you are downloading a large file and you stopped it for some reason. Don't worry, wget takes good care of you. Even a file you started downloading with another download manager can be resumed.

root@seven:~# wget http://kali2.mirror.garr.it/mirrors/kali-images/kali-2.0/kali-linux-2.0-amd64.iso

When you hit Enter it will start downloading the Kali Linux ISO image, which is obviously quite large. Note down the size of the file and exit by pressing Ctrl+C. Now we will resume the same file with the following command:

root@seven:~# wget -c http://kali2.mirror.garr.it/mirrors/kali-images/kali-2.0/kali-linux-2.0-amd64.iso

In order to resume we have to give the -c option, which tells wget to continue the file. So just insert -c after wget and re-enter the same URL. It will start where you left off.

Limit the download speed

When you start downloading a file, wget will use your full bandwidth. If you want to limit your download speed to a certain amount, you can do so with --limit-rate:

root@seven:~# wget --limit-rate=220k http://kali2.mirror.garr.it/mirrors/kali-images/kali-2.0/kali-linux-2.0-amd64.iso

The --tries option with wget

If you have a slow connection, it sometimes takes a while to reach servers. In such cases your download stops and you have to start over again, but wget can reconnect automatically; you just need to specify how many times you want wget to try.

root@seven:~# wget --tries 10 www.google.com

So you supply the --tries parameter and wget will attempt to download the file 10 times before giving up.

Send files to background with wget

The -b option sends the download to the background. The file will now be downloaded without interrupting your workflow. You can check the log file to see the progress.

root@seven:~# wget -b www.google.com
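A background download writes its progress to a file named wget-log in the current directory. The sketch below simulates one line of that log so it runs without a network fetch; a real wget -b run creates the file itself, and tail -f wget-log follows it live.

```shell
# wget -b logs its progress to ./wget-log. Simulate one log line here so the
# example runs anywhere; a real background run writes this file automatically.
printf '2015-10-01 12:00:00 (1.2 MB/s) - "index.html" saved [12345/12345]\n' > wget-log

# Check the latest progress (use tail -f wget-log to follow a live download)
tail -n 1 wget-log
```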

Kill wget processes

root@seven:~# killall wget

It will kill all running wget processes. You can also stop one specific download; for that you have to supply its process ID.
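Killing by PID can be sketched like this. It uses sleep as a stand-in for a long-running wget process so it runs without a network connection; with a real download you would get the PID from pgrep wget instead.

```shell
# Start a long-running background process (stand-in for a wget download)
sleep 300 &
pid=$!

# Kill just that one process by its PID (for a real download: pid=$(pgrep wget))
kill "$pid"
wait "$pid" 2>/dev/null

# kill -0 only checks existence: it fails once the process is gone
kill -0 "$pid" 2>/dev/null && echo "still running" || echo "stopped"
```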

Bulk Downloading with wget

This is an awesome feature. You can put URLs in a single text file and give it to wget. The syntax goes like this:

root@seven:~# wget -i bulkdownload.txt 
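The input file is just plain text with one URL per line. A minimal sketch, using placeholder example.com URLs; the actual wget call is commented out so the example does not hit the network:

```shell
# One URL per line; wget downloads them in order
cat > bulkdownload.txt <<'EOF'
http://www.example.com/file1.iso
http://www.example.com/file2.iso
EOF

# Hand the whole list to wget:
# wget -i bulkdownload.txt

wc -l < bulkdownload.txt   # the list holds 2 URLs
```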

Check for broken links

This feature is especially useful for web developers. It will check for broken links. You can use wget as a spider too.

root@seven:~# wget --spider -o broken.txt -r -p http://www.zeeroseven.com

--spider makes wget crawl the site like a spider, checking pages without downloading them.
-o writes the log to a file.
-r is for recursive.
-p fetches page requisites such as images and stylesheets.
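When the crawl finishes, the log ends with a summary line such as "Found N broken links." followed by the offending URLs, so a quick grep pulls them out. The log below is simulated (with placeholder URLs) so the sketch runs without an actual crawl:

```shell
# Simulated tail end of a spider log; a real run writes this via -o broken.txt
cat > broken.txt <<'EOF'
Found 2 broken links.

http://www.example.com/missing.html
http://www.example.com/old/gone.png
EOF

# Summary line, then the broken URLs themselves
grep 'broken link' broken.txt
grep '^http' broken.txt
```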