Curl
curl vs wget
sudo apt-get install curl
For example, the Download link at the National Geographic Travel Photo Contest 2014 works for curl but not for wget. curl with the -o option works, but wget does not in this case (note that wget's lowercase -o only names a log file; its output-document option is the capital -O). With curl, we can also use the -O (capital O) option, which writes the output to a local file named like the remote file.
curl \
  http://travel.nationalgeographic.com/u/TvyamNb-BivtNwcoxtkc5xGBuGkIMh_nj4UJHQKuoXEsSpOVjL0t9P0vY7CvlbxSYeJUAZrEdZUAnSJk2-sJd-XIwQ_nYA/ \
  -o owl.jpg
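For comparison, a sketch with a placeholder URL showing the capital -O behavior (the local file takes the remote file's name):

curl -O http://example.com/photos/owl.jpg   # saves as owl.jpg, the last path segment of the URL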
Should I Use Curl Or Wget? and curl vs Wget
- The main benefit of using the wget command is that it can be used to recursively download files.
- The curl command lets you use wildcards (URL "globbing") to specify the URLs you wish to retrieve (see the sketch after this list). curl also supports many more protocols than wget, which handles only HTTP, HTTPS, and FTP.
- The wget command can recover when a download fails, whereas the curl command cannot.
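Regarding the wildcard point above, curl's URL globbing accepts numeric/alphabetic ranges and brace sets. A minimal sketch with placeholder URLs:

curl -O "http://example.com/images/img[1-5].jpg"           # downloads img1.jpg ... img5.jpg under their remote names
curl "http://example.com/{index,about}.html" -o "#1.html"  # #1 expands to each brace entry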
Actually, curl supports resuming interrupted downloads too (with -C -). However, not every FTP server supports resuming. The following examples show that the resume option works in wget/curl when downloading a file from the NCBI FTP server, but not from the Illumina FTP server.
$ wget -c ftp://igenome:*password*@ussd-ftp.illumina.com/Drosophila_melanogaster/Ensembl/BDGP6/Drosophila_melanogaster_Ensembl_BDGP6.tar.gz
--2017-04-13 10:46:16--  ftp://igenome:*password*@ussd-ftp.illumina.com/Drosophila_melanogaster/Ensembl/BDGP6/Drosophila_melanogaster_Ensembl_BDGP6.tar.gz
           => ‘Drosophila_melanogaster_Ensembl_BDGP6.tar.gz’
Resolving ussd-ftp.illumina.com (ussd-ftp.illumina.com)... 66.192.10.36
Connecting to ussd-ftp.illumina.com (ussd-ftp.illumina.com)|66.192.10.36|:21... connected.
Logging in as igenome ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /Drosophila_melanogaster/Ensembl/BDGP6 ... done.
==> SIZE Drosophila_melanogaster_Ensembl_BDGP6.tar.gz ... 762893718
==> PASV ... done.    ==> REST 1706053 ...
REST failed, starting from scratch.
==> RETR Drosophila_melanogaster_Ensembl_BDGP6.tar.gz ... done.
Length: 762893718 (728M), 761187665 (726M) remaining (unauthoritative)
 0% [                                        ] 374,832     79.7KB/s  eta 2h 35m ^C

$ curl -L -O -C - ftp://igenome:*password*@ussd-ftp.illumina.com/Drosophila_melanogaster/Ensembl/BDGP6/Drosophila_melanogaster_Ensembl_BDGP6.tar.gz
** Resuming transfer from byte position 1706053
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  727M    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
curl: (31) Couldn't use REST

$ wget -c ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
--2017-04-13 10:52:02--  ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
           => ‘common_all_20160601.vcf.gz’
Resolving ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)... 2607:f220:41e:250::7, 130.14.250.10
Connecting to ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)|2607:f220:41e:250::7|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /snp/organisms/human_9606_b147_GRCh37p13/VCF ... done.
==> SIZE common_all_20160601.vcf.gz ... 1023469198
==> EPSV ... done.    ==> RETR common_all_20160601.vcf.gz ... done.
Length: 1023469198 (976M) (unauthoritative)
24% [===========================>           ] 255,800,120 55.2MB/s  eta 15s    ^C

$ wget -c ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
--2017-04-13 10:52:11--  ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
           => ‘common_all_20160601.vcf.gz’
Resolving ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)... 2607:f220:41e:250::7, 130.14.250.10
Connecting to ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)|2607:f220:41e:250::7|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /snp/organisms/human_9606_b147_GRCh37p13/VCF ... done.
==> SIZE common_all_20160601.vcf.gz ... 1023469198
==> EPSV ... done.    ==> REST 267759996 ... done.
==> RETR common_all_20160601.vcf.gz ... done.
Length: 1023469198 (976M), 755709202 (721M) remaining (unauthoritative)
47% [++++++++++++++++++++++++++++++========================>  ] 491,152,032 50.6MB/s  eta 12s   ^C

$ curl -L -O -C - ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 65  976M   65  639M    0     0  83.7M      0  0:00:11  0:00:07  0:00:04 90.4M^C
curl man page, supported protocols
https://curl.haxx.se/docs/manpage.html
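To see which protocols and features the locally installed curl build supports:

curl --version      # the "Protocols:" line lists what this build can speak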
wget overwrites the existing file
Use the -N or --timestamping option to turn on time-stamping: wget will not re-retrieve a file unless the remote copy is newer than the local one, and when it does re-download, it overwrites the existing local file instead of creating a numbered copy such as file.1.
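A minimal sketch with a placeholder URL:

wget -N http://example.com/data/file.csv    # re-download (and overwrite) only if the remote copy is newer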
wget and username/password
http://www.cyberciti.biz/faq/wget-command-with-username-password/
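A sketch with placeholder credentials and URLs; wget takes --user/--password (or the FTP/HTTP-specific variants), while curl uses -u:

wget --user=NAME --password=PASSWORD http://example.com/protected/file.zip
wget --ftp-user=NAME --ftp-password=PASSWORD ftp://example.com/pub/file.tar.gz
curl -u NAME:PASSWORD -O ftp://example.com/pub/file.tar.gz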
Download and Un-tar(Extract) in One Step
If we want to avoid saving a temporary file, we can use one piped statement.
curl http://download.osgeo.org/geos/geos-3.5.0.tar.bz2 | tar xvjf -   # .tar.bz2, so use -j (bzip2), not -z (gzip)
# OR
wget http://download.osgeo.org/geos/geos-3.5.0.tar.bz2 -O - | tar jxf -
# For a .gz file
wget -O - ftp://ftp.direcory/file.gz | gunzip -c > gunzip.out
See shellhacks.com. The magic part is the wget option "-O -", which writes the document to standard output instead of to a file.
The "-c" in gunzip is to have gzip output to the console. PS. it seems not necessary to use the "-c" option.
Download and execute the script in one step
See Execute bash script from URL. Note that the "-s" option in curl means silent mode (no progress meter or error messages).
curl -s https://server/path/script.sh | sudo sh
curl -s http://server/path/script.sh | sudo bash /dev/stdin arg1 arg2
sudo -v && wget -nv -O- https://download.calibre-ebook.com/linux-installer.sh | sudo sh /dev/stdin
Download and install binary software using sudo
One example (Calibre) looks like this:
sudo -v && wget -nv -O- https://raw.githubusercontent.com/kovidgoyal/calibre/master/setup/linux-installer.py | \
  sudo python -c "import sys; main=lambda:sys.stderr.write('Download failed\n'); exec(sys.stdin.read()); main()"
Note that in wget the option "-O-" means writing to standard output (so the file from the URL is NOT written to disk) and "-nv" means non-verbose output.
If the "-O-" option is not used, it is better to use the "-N" option in wget so that an existing file is overwritten.
See the Logging and Download options in wget's manual.
-O file
--output-document=file
    The documents will not be written to the appropriate files, but all will be concatenated together and written to file. If - is used as file, documents will be printed to standard output, disabling link conversion. (Use ./- to print to a file literally named -.)
curl and POST request
- http://superuser.com/questions/149329/what-is-the-curl-command-line-syntax-to-do-a-post-request
- https://learn.adafruit.com/raspberry-pi-physical-dashboard?view=all (the original post I saw)
- http://conqueringthecommandline.com/book/curl
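A couple of minimal sketches with a placeholder URL and payload (-d implies a POST request):

curl -d "param1=value1&param2=value2" http://example.com/api/resource
curl -H "Content-Type: application/json" -d '{"key": "value"}' http://example.com/api/resource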
curl and proxy
How to use curl command with proxy username/password on Linux/ Unix
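A sketch with a placeholder proxy host, port, and credentials:

curl -x http://proxy.example.com:3128 -U user:password http://example.com/
# or set an environment variable that curl honors
export http_proxy="http://user:password@proxy.example.com:3128"
curl http://example.com/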
Website performance
httpstat – A Curl Statistics Tool to Check Website Performance
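httpstat is essentially a pretty wrapper around curl's -w timing variables; a plain-curl sketch against a placeholder URL:

curl -o /dev/null -s -w 'lookup: %{time_namelookup}s  connect: %{time_connect}s  TTFB: %{time_starttransfer}s  total: %{time_total}s\n' https://example.com/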
wget/curl a file with correct name when redirected
wget --trust-server-names <url>
# Or
wget --content-disposition <url>
# Or
curl -JLO <url>
wget to download a folder
# -A: accept only these extensions; -m: mirror (recursive + timestamping); -p: page requisites;
# -E: adjust extensions; -k: convert links; -K: keep originals as .orig; -np: do not ascend to the parent directory
wget -A pdf,jpg,PDF,JPG -m -p -E -k -K -np http://site/path/
wget to download a website
- http://linux.about.com/od/commands/a/Example-Uses-Of-The-Command-Wget.htm
- https://www.gnu.org/software/wget/manual/wget.html
- 11 Best Free Website Downloader Software For Windows
- WebHTTrack Website Copier: sudo apt install webhttrack. On Ubuntu the app runs as a web application (http://HOSTNAME:8080); typing 'webhttrack' launches it in the default browser. See Grabbing Websites with WebHTTrack in Linux-magazine.
To download a copy of a complete web site, use the recursive option ('-r'). By default it will go up to five levels deep. You can change the depth with the '-l' option.
All files linked to in the documents are downloaded to enable complete offline viewing ('-p' and '--convert-links' options). Instead of having the progress messages displayed on standard output, you can save them to a log file with the '-o' option.
wget -p --convert-links -r -l2 linux.about.com -o logfile
wget -p --convert-links -r -l1 https://csgillespie.github.io/efficientR   # creates csgillespie/efficientR
Save Web Pages As Single HTML Files With Monolith
Save Web Pages As Single HTML Files For Offline Use With Monolith (Console)
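A sketch, assuming monolith is installed (e.g. via cargo install monolith) and that -o names the output file:

monolith https://example.com -o example.html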
Internet application: wttr.in, check weather from console/terminal
- https://github.com/chubin/wttr.in. https://wttr.in is located in Germany.
- The weather and visualization backend is wego (the output of wttr.in and wego look very similar). The weather data is based on forecast.io, which leads to darksky.net.
- Display Weather Forecast In Your Terminal With Wttr.in
$ curl wttr.in
$ curl wttr.in/?m           # use the metric system
$ curl wttr.in/washington
$ curl wttr.in/olney        # not sure which olney this is
$ curl wttr.in/~olney       # show the exact location at the bottom
$ curl wttr.in/taipei
- 10 Ways to Check the Weather From Your Linux Desktop
Internet application: cheat.sh
See man -> Cheat.sh.
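cheat.sh can also be queried directly with curl, e.g.:

curl cheat.sh/tar
curl cheat.sh/python/lambda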
Files downloaded from a browser and wget
The same file downloaded through a browser and through the wget command can end up with a different file size and behave differently:
$ ls -lh biotrip*.gz
-rw-r--r-- 1 brb brb 198M May 15 09:11 biotrip_0.1.0_may19.tar.gz
-rw-rw-r-- 1 brb brb 195M May 14 16:57 biotrip_0.1.0.tar.gz

$ file biotrip_0.1.0_may19.tar.gz     # downloaded from a browser (chrome browser, Mac or Linux)
biotrip_0.1.0_may19.tar.gz: POSIX tar archive

$ file biotrip_0.1.0.tar.gz           # downloaded from the wget command
biotrip_0.1.0.tar.gz: gzip compressed data, from HPFS filesystem (OS/2, NT)

$ tar xzvf biotrip_0.1.0_may19.tar.gz
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
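Since the file command shows the browser-downloaded copy is already an uncompressed POSIX tar archive, dropping the z flag should extract it (modern GNU tar also auto-detects compression with plain xf):

tar xvf biotrip_0.1.0_may19.tar.gz    # no -z: this archive is not gzip-compressed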