Curl
curl vs wget
sudo apt-get install curl
For example, the Download link at the National Geographic Travel Photo Contest 2014 works for curl but not for wget. curl with the -o option works, but wget does not in this case (note that wget's lowercase -o only names a log file; its output-document option is the capital -O). With curl, we can also use the -O (capital O) option, which writes the output to a local file named like the remote file.
curl \
  http://travel.nationalgeographic.com/u/TvyamNb-BivtNwcoxtkc5xGBuGkIMh_nj4UJHQKuoXEsSpOVjL0t9P0vY7CvlbxSYeJUAZrEdZUAnSJk2-sJd-XIwQ_nYA/ \
  -o owl.jpg
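For comparison, a sketch with a placeholder URL showing the capital -O behavior (the local file takes the remote file's name):

curl -O http://example.com/photos/owl.jpg   # saves as owl.jpg, the last path segment of the URL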
Should I Use Curl Or Wget? and curl vs Wget
- The main benefit of using the wget command is that it can be used to recursively download files.
- The curl command lets you use wildcards (URL "globbing") to specify the URLs you wish to retrieve (see the sketch after this list). curl also supports many more protocols than wget, which handles only HTTP, HTTPS, and FTP.
- The wget command can recover when a download fails, whereas the curl command cannot.
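Regarding the wildcard point above, curl's URL globbing accepts numeric/alphabetic ranges and brace sets. A minimal sketch with placeholder URLs:

curl -O "http://example.com/images/img[1-5].jpg"           # downloads img1.jpg ... img5.jpg under their remote names
curl "http://example.com/{index,about}.html" -o "#1.html"  # #1 expands to each brace entry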
Actually, curl supports resuming interrupted downloads too (with -C -). However, not every FTP server supports resuming. The following examples show that the resume option works in wget/curl when downloading a file from the NCBI FTP server, but not from the Illumina FTP server.
$ wget -c ftp://igenome:*password*@ussd-ftp.illumina.com/Drosophila_melanogaster/Ensembl/BDGP6/Drosophila_melanogaster_Ensembl_BDGP6.tar.gz
--2017-04-13 10:46:16--  ftp://igenome:*password*@ussd-ftp.illumina.com/Drosophila_melanogaster/Ensembl/BDGP6/Drosophila_melanogaster_Ensembl_BDGP6.tar.gz
           => ‘Drosophila_melanogaster_Ensembl_BDGP6.tar.gz’
Resolving ussd-ftp.illumina.com (ussd-ftp.illumina.com)... 66.192.10.36
Connecting to ussd-ftp.illumina.com (ussd-ftp.illumina.com)|66.192.10.36|:21... connected.
Logging in as igenome ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /Drosophila_melanogaster/Ensembl/BDGP6 ... done.
==> SIZE Drosophila_melanogaster_Ensembl_BDGP6.tar.gz ... 762893718
==> PASV ... done.    ==> REST 1706053 ...
REST failed, starting from scratch.
==> RETR Drosophila_melanogaster_Ensembl_BDGP6.tar.gz ... done.
Length: 762893718 (728M), 761187665 (726M) remaining (unauthoritative)
 0% [                                        ] 374,832     79.7KB/s  eta 2h 35m ^C

$ curl -L -O -C - ftp://igenome:*password*@ussd-ftp.illumina.com/Drosophila_melanogaster/Ensembl/BDGP6/Drosophila_melanogaster_Ensembl_BDGP6.tar.gz
** Resuming transfer from byte position 1706053
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  727M    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
curl: (31) Couldn't use REST

$ wget -c ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
--2017-04-13 10:52:02--  ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
           => ‘common_all_20160601.vcf.gz’
Resolving ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)... 2607:f220:41e:250::7, 130.14.250.10
Connecting to ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)|2607:f220:41e:250::7|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /snp/organisms/human_9606_b147_GRCh37p13/VCF ... done.
==> SIZE common_all_20160601.vcf.gz ... 1023469198
==> EPSV ... done.    ==> RETR common_all_20160601.vcf.gz ... done.
Length: 1023469198 (976M) (unauthoritative)
24% [===========================>           ] 255,800,120 55.2MB/s  eta 15s    ^C

$ wget -c ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
--2017-04-13 10:52:11--  ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
           => ‘common_all_20160601.vcf.gz’
Resolving ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)... 2607:f220:41e:250::7, 130.14.250.10
Connecting to ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)|2607:f220:41e:250::7|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /snp/organisms/human_9606_b147_GRCh37p13/VCF ... done.
==> SIZE common_all_20160601.vcf.gz ... 1023469198
==> EPSV ... done.    ==> REST 267759996 ... done.
==> RETR common_all_20160601.vcf.gz ... done.
Length: 1023469198 (976M), 755709202 (721M) remaining (unauthoritative)
47% [++++++++++++++++++++++++++++++========================>  ] 491,152,032 50.6MB/s  eta 12s   ^C

$ curl -L -O -C - ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/common_all_20160601.vcf.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 65  976M   65  639M    0     0  83.7M      0  0:00:11  0:00:07  0:00:04 90.4M^C
curl man page, supported protocols
https://curl.haxx.se/docs/manpage.html
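To see which protocols and features the locally installed curl build supports:

curl --version      # the "Protocols:" line lists what this build can speak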
wget overwrites the existing file
Use the -N or --timestamping option to turn on time-stamping: wget will not re-retrieve a file unless the remote copy is newer than the local one, and when it does re-download, it overwrites the existing local file instead of creating a numbered copy such as file.1.
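A minimal sketch with a placeholder URL:

wget -N http://example.com/data/file.csv    # re-download (and overwrite) only if the remote copy is newer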
wget and username/password
http://www.cyberciti.biz/faq/wget-command-with-username-password/
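A sketch with placeholder credentials and URLs; wget takes --user/--password (or the FTP/HTTP-specific variants), while curl uses -u:

wget --user=NAME --password=PASSWORD http://example.com/protected/file.zip
wget --ftp-user=NAME --ftp-password=PASSWORD ftp://example.com/pub/file.tar.gz
curl -u NAME:PASSWORD -O ftp://example.com/pub/file.tar.gz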
Download and Un-tar(Extract) in One Step
If we want to avoid saving a temporary file, we can use one piped statement.
curl http://download.osgeo.org/geos/geos-3.5.0.tar.bz2 | tar xvjf -   # .tar.bz2, so use -j (bzip2), not -z (gzip)
# OR
wget http://download.osgeo.org/geos/geos-3.5.0.tar.bz2 -O - | tar jxf -
# For a .gz file
wget -O - ftp://ftp.direcory/file.gz | gunzip -c > gunzip.out
See shellhacks.com. The magic part is the wget option "-O -", which writes the document to standard output instead of to a file.
The "-c" in gunzip is to have gzip output to the console. PS. it seems not necessary to use the "-c" option.
Download and execute the script in one step
See Execute bash script from URL. Note that the "-s" option in curl means silent mode (no progress meter or error messages).
curl -s https://server/path/script.sh | sudo sh
curl -s http://server/path/script.sh | sudo bash /dev/stdin arg1 arg2
sudo -v && wget -nv -O- https://download.calibre-ebook.com/linux-installer.sh | sudo sh /dev/stdin
Download and install binary software using sudo
One example (Calibre) looks like this:
sudo -v && wget -nv -O- https://raw.githubusercontent.com/kovidgoyal/calibre/master/setup/linux-installer.py | \
  sudo python -c "import sys; main=lambda:sys.stderr.write('Download failed\n'); exec(sys.stdin.read()); main()"
Note that in wget the option "-O-" means writing to standard output (so the file from the URL is NOT written to disk) and "-nv" means non-verbose output.
If the "-O-" option is not used, it is better to use the "-N" option in wget so that an existing file is overwritten.
See the Logging and Download options in wget's manual.
-O file
--output-document=file
    The documents will not be written to the appropriate files, but all will be concatenated together and written to file. If - is used as file, documents will be printed to standard output, disabling link conversion. (Use ./- to print to a file literally named -.)
curl and POST request
- http://superuser.com/questions/149329/what-is-the-curl-command-line-syntax-to-do-a-post-request
- https://learn.adafruit.com/raspberry-pi-physical-dashboard?view=all (the original post I saw)
- http://conqueringthecommandline.com/book/curl
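A couple of minimal sketches with a placeholder URL and payload (-d implies a POST request):

curl -d "param1=value1&param2=value2" http://example.com/api/resource
curl -H "Content-Type: application/json" -d '{"key": "value"}' http://example.com/api/resource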
curl and proxy
How to use curl command with proxy username/password on Linux/ Unix
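A sketch with a placeholder proxy host, port, and credentials:

curl -x http://proxy.example.com:3128 -U user:password http://example.com/
# or set an environment variable that curl honors
export http_proxy="http://user:password@proxy.example.com:3128"
curl http://example.com/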
Website performance
httpstat – A Curl Statistics Tool to Check Website Performance
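httpstat is essentially a pretty wrapper around curl's -w timing variables; a plain-curl sketch against a placeholder URL:

curl -o /dev/null -s -w 'lookup: %{time_namelookup}s  connect: %{time_connect}s  TTFB: %{time_starttransfer}s  total: %{time_total}s\n' https://example.com/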
wget/curl a file with correct name when redirected
wget --trust-server-names <url>
# Or
wget --content-disposition <url>
# Or
curl -JLO <url>
wget to download a folder
# -A: accept only these extensions; -m: mirror (recursive + timestamping); -p: page requisites;
# -E: adjust extensions; -k: convert links; -K: keep originals as .orig; -np: do not ascend to the parent directory
wget -A pdf,jpg,PDF,JPG -m -p -E -k -K -np http://site/path/
wget to download a website
- http://linux.about.com/od/commands/a/Example-Uses-Of-The-Command-Wget.htm
- https://www.gnu.org/software/wget/manual/wget.html
- 11 Best Free Website Downloader Software For Windows
- WebHTTrack Website Copier: sudo apt install webhttrack. On Ubuntu the app runs as a web application (http://HOSTNAME:8080); typing 'webhttrack' launches it in the default browser. See Grabbing Websites with WebHTTrack in Linux-magazine.
To download a copy of a complete web site, use the recursive option ('-r'). By default it will go up to five levels deep. You can change the depth with the '-l' option.
All files linked to in the documents are downloaded to enable complete offline viewing ('-p' and '--convert-links' options). Instead of having the progress messages displayed on standard output, you can save them to a log file with the '-o' option.
wget -p --convert-links -r -l2 linux.about.com -o logfile
wget -p --convert-links -r -l1 https://csgillespie.github.io/efficientR   # creates csgillespie/efficientR
Save Web Pages As Single HTML Files With Monolith
Save Web Pages As Single HTML Files For Offline Use With Monolith (Console)
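A sketch, assuming monolith is installed (e.g. via cargo install monolith) and that -o names the output file:

monolith https://example.com -o example.html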
Internet application: wttr.in, check weather from console/terminal
- https://github.com/chubin/wttr.in. https://wttr.in is located in Germany.
- The weather and visualization backend is wego (the output of wttr.in and wego look very similar). The weather data is based on forecast.io, which leads to darksky.net.
- Display Weather Forecast In Your Terminal With Wttr.in
$ curl wttr.in
$ curl wttr.in/?m           # use the metric system
$ curl wttr.in/washington
$ curl wttr.in/olney        # not sure which olney this is
$ curl wttr.in/~olney       # show the exact location at the bottom
$ curl wttr.in/taipei
- 10 Ways to Check the Weather From Your Linux Desktop
Internet application: cheat.sh
See man -> Cheat.sh.
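cheat.sh can also be queried directly with curl, e.g.:

curl cheat.sh/tar
curl cheat.sh/python/lambda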
Files downloaded from a browser and wget
The same file downloaded through a browser and through the wget command can end up with a different file size and behave differently:
$ ls -lh biotrip*.gz
-rw-r--r-- 1 brb brb 198M May 15 09:11 biotrip_0.1.0_may19.tar.gz
-rw-rw-r-- 1 brb brb 195M May 14 16:57 biotrip_0.1.0.tar.gz

$ file biotrip_0.1.0_may19.tar.gz     # downloaded from a browser (chrome browser, Mac or Linux)
biotrip_0.1.0_may19.tar.gz: POSIX tar archive

$ file biotrip_0.1.0.tar.gz           # downloaded from the wget command
biotrip_0.1.0.tar.gz: gzip compressed data, from HPFS filesystem (OS/2, NT)

$ tar xzvf biotrip_0.1.0_may19.tar.gz
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
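Since the file command shows the browser-downloaded copy is already an uncompressed POSIX tar archive, dropping the z flag should extract it (modern GNU tar also auto-detects compression with plain xf):

tar xvf biotrip_0.1.0_may19.tar.gz    # no -z: this archive is not gzip-compressed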