Linux Programming: Difference between revisions
= Shell Programming =
== Some Resources ==
* [https://hpc.nih.gov/training/handouts/BashScripting.pptx Bash shell scripting for Helix and Biowulf]
* [http://google.github.io/styleguide/shell.xml Shell Style Guide] from Google
* http://learnshell.org/
* http://tldp.org '''T'''he '''L'''inux '''D'''ocumentation '''P'''roject
** [http://tldp.org/LDP/Bash-Beginners-Guide/html/index.html Bash Guide for Beginners]
** [http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html BASH Programming] - Introduction HOW-TO
** [http://tldp.org/LDP/abs/html/index.html Advanced Bash-Scripting Guide]
* [https://bash.cyberciti.biz/guide/Main_Page Linux Shell Scripting Tutorial] from cyberciti.biz
* [http://www.tecmint.com/enable-shell-debug-mode-linux/ Shell debugging]
* [https://www.tecmint.com/useful-tips-for-writing-bash-scripts-in-linux/ 10 Useful Tips for Writing Effective Bash Scripts in Linux]
* [https://zwischenzugs.com/2018/01/06/ten-things-i-wish-id-known-about-bash/ Ten Things I Wish I’d Known About bash] & [https://leanpub.com/learnbashthehardway Learn Bash the Hard Way] $4.99
* [https://opensource.com/article/20/1/improve-bash-scripts 5 ways to improve your Bash scripts]
=== Understand shell command options ===
[http://explainshell.com/ explainshell.com]. For example, https://explainshell.com/explain?cmd=rsync+-avz+--progress+--partial+-e
=== Check shell scripts ===
* [https://www.howtogeek.com/788955/how-to-validate-the-syntax-of-a-linux-bash-script-before-running-it/ How To Validate the Syntax of a Linux Bash Script Before Running It]
* [http://www.shellcheck.net/ ShellCheck] & download the binary from [https://launchpad.net/ubuntu/+source/shellcheck Launchpad].
If a statement is missing a single quote, the shell may report the error on a different line (though the error message is still useful). It is therefore worth checking the syntax of a script before running it, for example with <code>bash -n script.sh</code>.
=== Writing Secure Shell Scripts ===
[https://www.linuxjournal.com/content/writing-secure-shell-scripts Writing Secure Shell Scripts]
=== Bioinformatics ===
[https://github.com/stephenturner/oneliners Bioinformatics one-liners]
=== Data science ===
[https://datascienceatthecommandline.com/2e/chapter-4-creating-command-line-tools.html Data Science at the Command Line] Obtain, Scrub, Explore, and Model Data with Unix Power Tools
== Special characters ==
[https://www.howtogeek.com/439199/15-special-characters-you-need-to-know-for-bash/ 15 Special Characters You Need to Know for Bash]
== Progress bar ==
[https://www.linuxjournal.com/content/how-add-simple-progress-bar-shell-script How to Add a Simple Progress Bar in Shell Script]
== Simple calculation ==
=== echo ===
<syntaxhighlight lang='bash'>
echo $(( 11/5 ))
# or
echo $((11/5))
</syntaxhighlight>
Note: shell arithmetic works on integers only, so the result is truncated (the commands above print 2).
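Since <code>$(( ))</code> only handles integers, one common workaround is to take the remainder with <code>%</code> or to scale the numerator before dividing. The following is a minimal sketch of that idea, not something from the linked resources; the variable names are made up for illustration.

<syntaxhighlight lang='bash'>
quotient=$(( 11 / 5 ))       # integer part of 11/5
remainder=$(( 11 % 5 ))      # what is left over
scaled=$(( 11 * 100 / 5 ))   # 11/5 multiplied by 100, still an integer

echo "$quotient remainder $remainder"
# Print the scaled value with two decimal places
printf '%d.%02d\n' $(( scaled / 100 )) $(( scaled % 100 ))
</syntaxhighlight>

This fixed-point trick is only suitable for simple cases; for anything more involved a tool such as '''bc''' is a better fit.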
=== bc: an arbitrary precision calculator language ===
<syntaxhighlight lang='bash'>
bc -l <<< "11/5"
# Without '-l' we only get the integer part
# Or interactive
bc -i
scale=5
11/5
quit
</syntaxhighlight>
where '''-l''' loads the predefined math library (which also sets the default scale to 20 decimal places) and '''<<<''' is a [http://linux.die.net/abs-guide/x15683.html here string]. Note that '''bc''' can return a real (non-integer) number.
== Here documents ==
=== << ===
* http://linux.die.net/abs-guide/here-docs.html
* [https://www.cyberciti.biz/faq/using-heredoc-rediection-in-bash-shell-script-to-write-to-file/ How to use a here documents to write data to a file in bash script]
<syntaxhighlight lang='bash'>
#!/bin/bash
cat <<!FUNKY!
hello
this is a here
document
$var on line
!FUNKY!
</syntaxhighlight>
To disable parameter (variable) expansion such as $HOME, command substitution, and arithmetic expansion inside the here document, quote the delimiter, e.g. 'EOF' instead of EOF (or '!FUNKY!' in the example above).
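The effect of quoting the delimiter can be seen by capturing the same here document twice. This is a small illustrative sketch; the variable names <code>expanded</code> and <code>literal</code> are made up.

<syntaxhighlight lang='bash'>
var="world"

# Unquoted delimiter: $var is expanded
expanded=$(cat <<EOF
hello $var
EOF
)

# Quoted delimiter: the text is taken literally
literal=$(cat <<'EOF'
hello $var
EOF
)

echo "$expanded"
echo "$literal"
</syntaxhighlight>

The first echo shows the expanded value, while the second shows the text exactly as written, dollar sign included.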
=== <<< here string ===
http://linux.die.net/abs-guide/x15683.html
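A here string feeds a single string to a command's standard input, which avoids an extra <code>echo ... |</code>. A minimal sketch (bash-specific; the variable names are chosen for illustration):

<syntaxhighlight lang='bash'>
# Split a string into two variables without a pipe
read -r first second <<< "hello world"
echo "$first"
echo "$second"

# Feed a string to any command that reads stdin
upper=$(tr 'a-z' 'A-Z' <<< "hello")
echo "$upper"
</syntaxhighlight>

Because <code>read</code> runs in the current shell here (not in a pipeline subshell), the variables remain available afterwards.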
== Redirect ==
=== stdin, stdout, and stderr ===
[https://www.howtogeek.com/435903/what-are-stdin-stdout-and-stderr-on-linux/ What Are stdin, stdout, and stderr on Linux?]
Redirecting output. File descriptor 1 is standard output and file descriptor 2 is standard error.
<syntaxhighlight lang='bash'>
./myProgram > stdout.txt                 # redirect std out to <stdout.txt>
./myProgram 2> stderr.txt                # redirect std err to <stderr.txt> by using the 2> operator
./myProgram > stdout.txt 2> stderr.txt   # combination of above two
./myProgram > stdout.txt 2>&1            # redirect std err to std out <stdout.txt>
./myProgram >& /dev/null                 # prevent writing std out and std err to the screen
ps >> output.txt                         # append
</syntaxhighlight>
Redirecting input
<syntaxhighlight lang='bash'>
./myProgram < input.txt
</syntaxhighlight>
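To try these redirections without a real program, a small function can stand in for <code>./myProgram</code>. In the sketch below, <code>emit</code> and the temporary directory are hypothetical stand-ins, not part of the original examples.

<syntaxhighlight lang='bash'>
# emit is a hypothetical stand-in for ./myProgram
emit() {
  echo "normal output"
  echo "error output" >&2
}

tmpdir=$(mktemp -d)
emit > "$tmpdir/stdout.txt" 2> "$tmpdir/stderr.txt"   # separate files
emit > "$tmpdir/both.txt" 2>&1                        # both streams in one file

cat "$tmpdir/stdout.txt"
cat "$tmpdir/stderr.txt"
wc -l < "$tmpdir/both.txt"
</syntaxhighlight>

Remove the temporary directory with <code>rm -rf "$tmpdir"</code> when done.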
=== Using cat or echo to create a new file that needs sudo right ===
The following command does not work, because the redirection is performed by the current (unprivileged) shell rather than by sudo
<syntaxhighlight lang='bash'>
sudo cat myFile > /opt/myFile
</syntaxhighlight>
Solution 1 ('''sudo sh -c'''). We can use [http://stackoverflow.com/questions/82256/how-do-i-use-sudo-to-redirect-output-to-a-location-i-dont-have-permission-to-wr something] like
<syntaxhighlight lang='bash'>
sudo sh -c 'cat myFile > /opt/myFile'
</syntaxhighlight>
Solution 2 ('''sudo tee'''). See '[https://www.digitalocean.com/community/tutorials/how-to-configure-nginx-as-a-web-server-and-reverse-proxy-for-apache-on-one-ubuntu-16-04-server How To Configure Nginx as a Web Server and Reverse Proxy for Apache on One Ubuntu 16.04 Server]'
<syntaxhighlight lang='bash'>
echo "<?php phpinfo(); ?>" | sudo tee /var/www/html/info.php
</syntaxhighlight>
To append to an existing file instead of overwriting it, use the '''-a''' option of the '''tee''' command.
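The difference between overwriting and appending with '''tee''' can be checked on a scratch file without sudo. This is only a sketch; the file created by <code>mktemp</code> is a placeholder for the real target.

<syntaxhighlight lang='bash'>
log=$(mktemp)   # scratch file standing in for the real target

echo "first line"  | tee    "$log" > /dev/null   # overwrite
echo "second line" | tee -a "$log" > /dev/null   # append

cat "$log"
</syntaxhighlight>

Redirecting tee's own output to /dev/null keeps the text from also being echoed to the terminal.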
=== Create a simple text file with multiple lines; write data to a file in bash script ===
Each of the methods below can be used in a bash script.
<syntaxhighlight lang='bash'>
# Method 1: printf. We can add \t for tab delimiter
$ printf '%s\n' 'Line 1' 'Line 2' 'Line 3' > out.txt
# Method 2: echo. We can add \t for tab delimiter
$ echo -e 'Line 1\t12\t13
Line 2\t22\t23
Line 3\t32\t33' > out.txt
# Method 3: echo
$ echo $'Line 1\nLine 2\nLine 3' > out.txt
# Method 4: here document, http://tldp.org/LDP/abs/html/here-docs.html
# For the TAB character, use Ctrl-V, TAB.
# Note that first line can be: cat <<EOF > out.txt
# The filename can be a variable if this is used inside a bash file
$ cat > out.txt <<EOF
> line1 Second
> lin2 abcd
> line3ss dkflaf
> EOF
</syntaxhighlight>
See also [https://www.cyberciti.biz/faq/using-heredoc-rediection-in-bash-shell-script-to-write-to-file/ How to use a here documents to write data to a file in bash script]
Inside $'...' quoting, a single quote is escaped with a backslash. For example
{{Pre}}
echo $'#!/bin/bash\nmodule load R/3.6.0\nRscript --vanilla -e "rmarkdown::render(\'gse6532.Rmd\')"'
</pre>
produces
<pre>
#!/bin/bash
module load R/3.6.0
Rscript --vanilla -e "rmarkdown::render('gse6532.Rmd')"
</pre>
=== >& ===
Bash accepts both >& file and &> file. Neither form is part of the official POSIX shell spec; &> file has been added to many Bourne shells as a convenience extension (it originally comes from csh). In a portable shell script (and if you don't need portability, why are you writing a shell script?), use > file 2>&1 only.
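With the portable form, the order of the two redirections matters, because they are processed left to right. The sketch below illustrates this; <code>emit</code>, <code>right</code>, <code>wrong</code> and <code>leaked</code> are names invented for the example.

<syntaxhighlight lang='bash'>
# emit is a hypothetical command that writes to both streams
emit() { echo "out"; echo "err" >&2; }

right=$(mktemp)
wrong=$(mktemp)

# stdout goes to the file first, then stderr is pointed at the same place
emit > "$right" 2>&1

# stderr is pointed at the *old* stdout first, then only stdout goes to the file;
# the command substitution captures what escaped the file
leaked=$(emit 2>&1 > "$wrong")

cat "$right"
cat "$wrong"
echo "leaked: $leaked"
</syntaxhighlight>

In the second form the error text never reaches the file, which is a frequent source of confusion.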
=== Redirect Output and Errors To /dev/null ===
http://www.cyberciti.biz/faq/how-to-redirect-output-and-errors-to-devnull/
<syntaxhighlight lang='bash'>
command > /dev/null 2>&1
# OR
command &>/dev/null
</syntaxhighlight>
In addition we can put a process in the background by adding the '&' sign; see the [[Linux#dclock_.28digital.29|dclock]] example.
=== tee - redirect to both a file and the screen at the same time ===
To redirect to both a file and the screen at the same time, use the tee command. See
* http://www.cyberciti.biz/faq/linux-redirect-error-output-to-file/
* http://www.cyberciti.biz/faq/saving-stdout-stderr-into-separate-files/
* https://en.wikipedia.org/wiki/Tee_(command)
* [https://www.howtoforge.com/linux-tee-command/ Linux tee Command Explained for Beginners (6 Examples)]
* [https://stackoverflow.com/a/6991563 Since bash version 4 you may use |& as an abbreviation for 2>&1 |]
<syntaxhighlight lang='bash'>
command1 |& tee log.txt
## or ##
command1 -arg |& tee log.txt
## or ##
command1 2>&1 | tee log.txt
# use the option '-a' for *append*
echo "new line of text" | sudo tee -a /etc/apt/sources.list
# redirect output of one command to another
ls file* | tee output.txt | wc -l
# streaming file (e.g. running an arduino sketch on Udoo)
# for streaming files, the cp command (which still needs Ctrl + c) does not
# show anything on screen although the copying is executed.
cat /dev/ttymxc3 | tee out.txt # Ctrl + c
</syntaxhighlight>
* [http://stackoverflow.com/questions/692000/how-do-i-write-stderr-to-a-file-while-using-tee-with-a-pipe How do I write stderr to a file while using “tee” with a pipe?]
<syntaxhighlight lang='bash'>
command > >(tee stdout.log) 2> >(tee stderr.log >&2)
</syntaxhighlight>
=== Methods To Create A File In Linux ===
[https://www.2daygeek.com/linux-command-to-create-a-file/ 10 Methods To Create A File In Linux]
=== Prepend ===
[https://www.cyberciti.biz/faq/bash-prepend-text-lines-to-file/ BASH Prepend A Text / Lines To a File]
== Pipe ==
The operator is |.
<syntaxhighlight lang='bash'>
ps > psout.txt
sort psout.txt > pssort.out
</syntaxhighlight>
can be simplified to
<syntaxhighlight lang='bash'>
ps | sort > pssort.out
</syntaxhighlight>
For example,
<syntaxhighlight lang='bash'>
$ head /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
$ cat /etc/passwd | cut -d: -f7 | sort | uniq -c | sort -nr
18 /bin/sh
13 /bin/false
2 /bin/bash
1 /bin/sync
</syntaxhighlight>
where the '''cut''' command extracts the 7th field (fields are separated by the : character) and writes it to its output stream. The first '''sort''' sorts its input lines alphabetically, so that identical lines end up next to each other. The '''uniq -c''' command collapses adjacent duplicate lines and prefixes each with its count. The final '''sort -nr''' sorts its input numerically in reverse order.
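The same pipeline can be tried on a few inline sample lines instead of the real /etc/passwd. This is a sketch with made-up input; <code>counts</code> and <code>top</code> are illustrative variable names.

<syntaxhighlight lang='bash'>
# Count login shells in a small, made-up passwd-style sample
counts=$(printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/bin/sh' \
  'bin:x:2:2:bin:/bin:/bin/sh' \
  'sync:x:4:65534:sync:/bin:/bin/sync' |
  cut -d: -f7 | sort | uniq -c | sort -nr)

echo "$counts"

# The most frequent shell is the second column of the first line
top=$(echo "$counts" | head -n 1 | awk '{print $2}')
echo "most common: $top"
</syntaxhighlight>

Sorting before <code>uniq -c</code> is essential, since <code>uniq</code> only merges lines that are adjacent.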
=== What does a dash (-) at the end of a command mean? ===
* http://unix.stackexchange.com/questions/16357/usage-of-dash-in-place-of-a-filename. By convention, a lone dash in place of a filename means standard input (or standard output); how it is interpreted is up to the program. The following example is from [https://opensource.com/article/18/7/how-use-dd-linux How to use dd command] <syntaxhighlight lang='bash'>
# ssh [email protected] "dd if=/dev/sda | gzip -1 -" | dd of=backup.gz
</syntaxhighlight>
* http://unix.stackexchange.com/questions/41828/what-does-dash-at-the-end-of-a-command-mean
=== Process substitution ===
https://en.wikipedia.org/wiki/Process_substitution
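Process substitution lets the output of a command be used where a filename is expected, so no temporary files are needed. A minimal sketch (bash-specific, with made-up sample data) that finds the lines two unsorted lists have in common:

<syntaxhighlight lang='bash'>
# comm needs sorted input; each <( ... ) behaves like a file holding the sorted list
common=$(comm -12 <(printf '%s\n' b a c | sort) <(printf '%s\n' c d b | sort))
echo "$common"
</syntaxhighlight>

Here <code>comm -12</code> suppresses the lines unique to either input and prints only the shared ones.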
=== Powerfulness of pipes ===
Consider the following commands (<span style="color: red">samtools gives its output on stdout which is a good opportunity to use pipes</span>)
<syntaxhighlight lang='bash'>
samtools mpileup -go temp.bcf -uf genome.fa dedup.bam
bcftools call -vmO v -o sample1_raw.vcf temp.bcf
</syntaxhighlight>
The disadvantage of this approach is that it creates a temporary file (temp.bcf in this case). If the temporary file is very large (several hundred GB), it wastes disk space, not to mention the time spent writing it. With a pipe we save both the time and the disk space.
<syntaxhighlight lang='bash'>
samtools mpileup -uf genome.fa dedup.bam | bcftools call -vmO v -o sample1_raw.vcf
</syntaxhighlight>
=== Send a stdout to a remote computer ===
See [[Linux#Bypass_SSH_password_login_.28convenient_for_CVS.2C_git_etc.29|here (bypass SSH password)]] for a case (utilize '''cat''', '''ssh''' and '''>>''' commands).
=== Execute a bash script downloaded (without saving first) from the internet ===
See the example of [https://about.gitlab.com/downloads/#raspberrypi2 install Gitlab]
<syntaxhighlight lang='bash'>
sudo curl -sS https://packages.gitlab.com/install/repositories/gitlab/raspberry-pi2/script.deb.sh | sudo bash
</syntaxhighlight>
where '''-s''' means silent and '''-S''' means show an error message if it fails. Because '''curl''' writes the downloaded file to standard output, piping it into '''bash''' runs the script without saving it first.
=== Use wget to download and decompress at one line ===
https://stackoverflow.com/questions/16262980/redirect-pipe-wget-download-directly-into-gunzip
<syntaxhighlight lang='bash'>
wget -O - ftp://ftp.direcory/file.gz | gunzip -c > file.out
</syntaxhighlight>
where "-O -" means to print to standard output (sort of like the default behavior of "curl"). See https://www.gnu.org/software/wget/manual/wget.html
=== Use pipe and while loop to process multiple files ===
See an example at [[#while|while]].
=== Pipe vs redirect ===
* Pipe is used to pass output to another program or utility.
* Redirect is used to pass output to either a file or stream.
In other words, ''thing1 | thing2'' gives the same result as ''thing1 > temp_file'' && ''thing2 < temp_file'', except that the pipe needs no temporary file and runs both commands at the same time.
== Shebang (#!) ==
A shebang is the character sequence consisting of the characters number sign and exclamation mark (that is, "#!") at the beginning of a script. See the [http://en.wikipedia.org/wiki/Shebang_%28Unix%29 Wikipedia] page.
The syntax looks like
<pre>
#! interpreter [optional-arg]
</pre>
For example,
* <code>#!/bin/sh</code>: Execute the file using sh, the [[Bourne shell]], or a compatible shell
* <code>#!/bin/csh -f</code>: Execute the file using csh, the [[C shell]], or a compatible shell, and suppress the execution of the user’s ''.cshrc'' file on startup
* <code>#!/usr/bin/perl -T</code>: Execute using [[Perl]] with the option for [[Taint checking|''taint'' checks]]
=== When Is It Better to Use #!/bin/bash Instead of #!/bin/sh in a Shell Script? ===
http://www.howtogeek.com/276607/when-is-it-better-to-use-bin-bash-instead-of-bin-sh-in-a-shell-script/
=== Howto Make Script More '''Portable''' With #!/usr/bin/env As a Shebang ===
https://www.cyberciti.biz/tips/finding-bash-perl-python-portably-using-env.html
This is useful if the interpreter location differs between Linux and macOS.
<syntaxhighlight lang='bash'>
# Linux
$ which Rscript
/usr/bin/Rscript
# Mac
$ which Rscript
/usr/local/bin/Rscript
</syntaxhighlight>
We can use the following on the first line of the shell script.
<syntaxhighlight lang='bash'>
#!/usr/bin/env Rscript
</syntaxhighlight>
== Comments ==
For a single line, we can use the '#' sign. [https://www.cyberciti.biz/faq/bash-comment-out-multiple-line-code/ Shell Script Put Multiple Line Comments under Bash/KSH].
For a block of code, we use
<syntaxhighlight lang='bash'>
#!/bin/bash
echo before comment
: <<'END'
bla bla
blurfl
END
echo after comment
</syntaxhighlight>
== Variables ==
<syntaxhighlight lang='bash'>
food=Banana
echo $food
food="Apple"
echo $food
</syntaxhighlight>
=== When do I need to use the '''export''' command ===
Consider the following
<pre>
MY_DIRECTORY=/path/to/my/directory
export MY_DIRECTORY
./my_script.sh
</pre>
If you don’t use the export command in the above example, the MY_DIRECTORY variable will not be available to the my_script.sh script.
Without export, a variable is an ordinary shell variable: it is visible only within the '''current shell session''' and is '''not passed to child processes (e.g. my_script.sh) or other programs that are started from the command line'''. With export, it becomes part of the environment that child processes inherit.
Cf. When I put '''LS_COLORS''' in the .bashrc file, I don't need to use the export command.
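The effect of '''export''' can be observed by asking a child shell for the variable before and after exporting it. A minimal sketch, in which <code>bash -c</code> stands in for my_script.sh and the names <code>before</code> and <code>after</code> are made up:

<syntaxhighlight lang='bash'>
unset MY_DIRECTORY                      # start from a clean state
MY_DIRECTORY=/path/to/my/directory      # plain shell variable, not exported

# The child shell prints "unset" when it cannot see the variable
before=$(bash -c 'echo "${MY_DIRECTORY:-unset}"')

export MY_DIRECTORY
after=$(bash -c 'echo "${MY_DIRECTORY:-unset}"')

echo "before export: $before"
echo "after export:  $after"
</syntaxhighlight>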
=== '''export -n''' command: remove from environment ===
https://linuxconfig.org/learning-linux-commands-export
'''export''' marks a variable so that it is passed to subshells/forked processes; '''export -n''' removes that mark again (the variable itself stays set in the current shell). For example
<syntaxhighlight lang='bash'>
$ export MYVAR=10   # export a variable
$ export -n MYVAR   # stop exporting it; MYVAR is still set in the current shell
</syntaxhighlight>
To see the current process ID, use
<syntaxhighlight lang='bash'>
echo $$
</syntaxhighlight>
To start a new (child) shell process, use
<syntaxhighlight lang='bash'>
bash
</syntaxhighlight>
Run without any options or arguments, the export command simply prints all names marked for export to child processes.
<syntaxhighlight lang='bash'>
$ export
declare -x EDITOR="nano"
declare -x HISTTIMEFORMAT="%d/%m/%y %T "
declare -x HOME="/home/brb"
declare -x LANG="en_US.UTF-8"
declare -x LESSCLOSE="/usr/bin/lesspipe %s %s"
declare -x LESSOPEN="| /usr/bin/lesspipe %s"
declare -x LOGNAME="brb"
...
declare -x PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
declare -x PWD="/home/brb"
declare -x SHELL="/bin/bash"
...
declare -x USER="brb"
declare -x VISUAL="nano"
</syntaxhighlight>
=== echo command ===
* https://en.wikipedia.org/wiki/Echo_(command)
* [https://www.howtogeek.com/446071/how-to-use-the-echo-command-on-linux/ How to Use the Echo Command on Linux]
** Writing Text to the Terminal
** Using Variables With echo
** Using Commands With echo
** Formatting Text With echo
** Using echo With Files and Directories
** Writing to Files with echo
=== String manipulation ===
http://www.thegeekstuff.com/2010/07/bash-string-manipulation/
==== '''dirname''' and '''basename''' commands ====
http://www.tldp.org/LDP/LG/issue18/bash.html
<syntaxhighlight lang='bash'>
# On directories
$ dirname ~/Downloads
/home/chronos/user
$ basename ~/Downloads
Downloads
# On files
$ dirname ~/Downloads/DNA_Helix.zip
/home/chronos/user/Downloads
$ basename ~/Downloads/DNA_Helix.zip
DNA_Helix.zip
$ basename ~/Downloads/DNA_Helix.zip .zip
DNA_Helix
$ basename ~/Downloads/annovar.latest.tar.gz
annovar.latest.tar.gz
$ basename ~/Downloads/annovar.latest.tar.gz .gz
annovar.latest.tar
$ basename ~/Downloads/annovar.latest.tar.gz .tar.gz
annovar.latest
$ basename ~/Downloads/annovar.latest.tar.gz .latest.tar.gz
annovar
</syntaxhighlight>
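For simple paths, similar results can be obtained with shell parameter expansion, without starting an external command. The sketch below uses a hypothetical path; note that edge cases (a trailing slash, or a name with no slash at all) are handled differently from the real '''dirname''' and '''basename'''.

<syntaxhighlight lang='bash'>
path=/home/user/Downloads/annovar.latest.tar.gz   # hypothetical path

dir=${path%/*}          # remove the shortest match of /* from the end   (like dirname)
file=${path##*/}        # remove the longest match of */ from the start  (like basename)
stem=${file%.tar.gz}    # strip a known suffix  (like basename "$path" .tar.gz)
ext=${file##*.}         # text after the last dot

echo "$dir"
echo "$file"
echo "$stem"
echo "$ext"
</syntaxhighlight>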
==== Escape characters and quotes ====
<pre>
echo $USER               # brb
echo My name is $USER
echo "My name is $USER"  # My name is brb
echo 'My name is $USER'  # My name is $USER; single quotes prevent variable expansion
# we use the single quotes if we want to present the characters literally or
# pass the characters to the shell.
grep '.*/udp' /etc/services  # normally . and * and slash characters have special meaning
echo \$USER              # $USER; escaping $ removes its special meaning
echo '\'                 # \
echo \'text\'            # 'text'
</pre>
==== When to use double quotes with a variable ====
[https://unix.stackexchange.com/questions/78002/when-to-use-double-quotes-with-a-variable-in-shell-script when to use double quotes with a variable in shell script?]
==== Concatenate string variables (not safe) ====
http://stackoverflow.com/questions/4181703/how-can-i-concatenate-string-variables-in-bash
<syntaxhighlight lang='bash'>
a='hello'
b='world'
c=$a$b
echo $c
# Bash also supports a += operator
$ A="X Y"
$ A+="Z"
$ echo "$A"
</syntaxhighlight>
We often need "double quotes" around a variable whose value may contain spaces, for example a directory name.
<syntaxhighlight lang='bash'>
mkdir "tmp 1"
touch "tmp 1/tmpfile"
tmpvar="tmp 1"
echo "$tmpvar"
# tmp 1
ls $tmpvar
ls: cannot access tmp: No such file or directory
ls: cannot access 1: No such file or directory
ls "$tmpvar"
# tmpfile
</syntaxhighlight>
However, for integers
<syntaxhighlight lang='bash'>
a=24
echo $a
24
((a+=12))
echo $a
36
</syntaxhighlight>
Note that the [http://tldp.org/LDP/abs/html/dblparens.html double parentheses construct] in ((a+=12)) permits arithmetic expansion and evaluation.
==== '''${parameter}''' - Concatenate a string variable and a constant string; variable substitution ====
[http://tldp.org/LDP/abs/html/parameter-substitution.html#PARAMSUBREF Parameter substitution ${}]. Cf. $() for command substitution.
<syntaxhighlight lang='bash'>
x=foo
y=bar
z=$x$y        # $z is now "foobar"
z="$x$y"      # $z is still "foobar"
z="$xand$y"   # does not work: the shell looks for a variable named xand
z="${x}and$y" # does work, "fooandbar"
</syntaxhighlight>
And
<syntaxhighlight lang='bash'>
your_id=${USER}-on-${HOSTNAME}
echo "$your_id"
echo "Old \$PATH = $PATH"
PATH=${PATH}:/opt/bin  # Add /opt/bin to $PATH for duration of script.
echo "New \$PATH = $PATH"
</syntaxhighlight>
Braces are also needed to build a new string from an existing variable
<pre>
pdir="/tmp/files/today"
fname="report"
mkdir -p $pdir
touch $pdir/$fname      # OK
ls -l $pdir/$fname
touch $pdir/$fname_new  # No error, but no new file is created:
                        # $fname_new is an unset variable, so this expands to $pdir/
ls $pdir/$fname_new
touch $pdir/${fname}_new
ls $pdir/${fname}_new   # Works
</pre>
==== '''$(command)''' - Command Execution and Assign Output of Shell Command To a Variable; Command substitution ====
[https://www.cyberciti.biz/faq/unix-linux-bsd-appleosx-bash-assign-variable-command-output/ Bash Assign Output of Shell Command To Variable]
<syntaxhighlight lang='bash'>
$(command)
`command`   # ` is a backquote/backtick, not a single quotation sign
            # legacy syntax; https://www.shellcheck.net/ recommends $(...) instead
</syntaxhighlight>
Note all new scripts should use the $(...) form, which avoids the rather complex quoting and nesting rules of backquotes.

Example 1.
<syntaxhighlight lang='bash'>
sudo apt-get install linux-headers-$(uname -r)
</syntaxhighlight>
Example 2.
<syntaxhighlight lang='bash'>
user=$(echo "$UID")
</syntaxhighlight>
Example 3.
<syntaxhighlight lang='bash'>
#!/bin/sh
echo The current directory is $PWD
echo The current users are $(who)
sudo chown `id -u` SomeDir  # change the ownership to the current user. Dangerous!
# Or sudo chown `whoami` SomeDirOrSomeFile
exit 0
</syntaxhighlight>
Example 4. Create a new file with automatically generated filename
<pre>
touch file-$(date -I)
</pre>
Example 5. Use '''$(your expression)''' to run nested expressions. For example,
<syntaxhighlight lang='bash'>
# cd into the directory containing the 'touch' command.
cd $(dirname $(type -P touch))
BACKUPDIR=/nas/backup
LASTDAYPATH=${BACKUPDIR}/$(ls ${BACKUPDIR} | tail -n 1)
</syntaxhighlight>
The concept of putting the result of a command into a script variable is very powerful, as it makes it easy to use existing commands in scripts and capture their output.
'''Arithmetic Expansion'''
<syntaxhighlight lang='bash'>
$((...))
</syntaxhighlight>
is a better alternative to the '''expr''' command. More examples:
<syntaxhighlight lang='bash'>
for i in $(seq 1 3)
do echo SRR$(( i + 1027170 ))'_1'.fastq
done
</syntaxhighlight>
Note that the single quotes above are optional here, because the closing )) already marks the end of the arithmetic expansion; they (or braces) would be required after a plain variable such as $i. The above will output SRR1027171_1.fastq, SRR1027172_1.fastq and SRR1027173_1.fastq.
'''Parameter Expansion'''
<syntaxhighlight lang='bash'>
${parameter}
</syntaxhighlight>
==== Double Parentheses (()) ====
[https://fedoramagazine.org/bash-shell-scripting-for-beginners-part-1/ Bash Shell Scripting for beginners (Part 1)] fedoramagazine. Double parentheses are simple: they are for arithmetic.
==== extract substring ====
https://www.cyberciti.biz/faq/how-to-extract-substring-in-bash/
<syntaxhighlight lang='bash'>
${parameter:offset:length}
</syntaxhighlight>
Example:
<syntaxhighlight lang='bash'>
## define var named u ##
u="this is a test"
var="${u:10:4}"
echo "${var}"
</syntaxhighlight>
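A few more variations of the same expansion may help, shown here as a sketch on the same string (bash-specific; the variable names on the left are made up). The offset is zero-based, the length is optional, and a negative offset counts from the end; the space before the minus sign is required so that it is not read as the <code>:-</code> default-value operator.

<syntaxhighlight lang='bash'>
u="this is a test"

len=${#u}        # length of the string
rest=${u:5}      # from offset 5 to the end
head=${u:0:4}    # first four characters
tail=${u: -4}    # last four characters (note the space before -4)

echo "$len"
echo "$rest"
echo "$head"
echo "$tail"
</syntaxhighlight>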
Or use the '''cut''' command.
<syntaxhighlight lang='bash'>
u="this is a test"
echo "$u" | cut -d' ' -f 4
echo "$u" | cut --delimiter=' ' --fields=4
##########################################
## WHERE
##   -d' ' : Use a whitespace as delimiter
##   -f 4  : Select only 4th field
##########################################
var="$(cut -d' ' -f 4 <<< "$u")"
echo "${var}"
</syntaxhighlight>
=== Environment variables ===
[https://www.howtogeek.com/668503/how-to-set-environment-variables-in-bash-on-linux/ How to Set Environment Variables in Bash on Linux]
<syntaxhighlight lang='bash'>
$HOME
$PATH
$0 -- name of the shell script
$# -- number of parameters passed (it does not include the program name itself)
$$ -- process ID of the shell script, often used inside a script for generating unique temp filenames
$? -- the exit value of the last run command; 0 means OK and non-zero means something went wrong
$_ -- previous command's last argument
</syntaxhighlight>
Example 1 (check if a command ran successfully):
<syntaxhighlight lang='bash'>
some_command
if [ $? -eq 0 ]; then
    echo OK
else
    echo FAIL
fi
# OR
if some_command; then
    printf 'some_command succeeded\n'
else
    printf 'some_command failed\n'
fi
$ tabix -f -p vcf ~/SeqTestdata/usefulvcf/hg19/CosmicCodingMuts.vcf.gz
$ echo $?
0
$ tabix -f -p vcf ~/Downloads/CosmicCodingMuts.vcf.gz
Not a BGZF file: /home/brb/Downloads/CosmicCodingMuts.vcf.gz
tbx_index_build failed: /home/brb/Downloads/CosmicCodingMuts.vcf.gz
$ echo $?
1
</syntaxhighlight>
Example 2 (check whether a host is reachable)
<syntaxhighlight lang='bash'>
ping DOMAIN -c2 &> /dev/null
if [ $? -eq 0 ];
then
    echo Successful
else
    echo Failure
fi
</syntaxhighlight>
where -c is used to limit the number of packets to be sent and &> /dev/null is used to redirect both ''stderr'' and ''stdout'' to /dev/null so that it won't be printed on the terminal.
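One thing to keep in mind with <code>$?</code> is that every command overwrites it, so the value should be saved immediately if it is needed later. The helper below is a hypothetical sketch (<code>run_and_report</code> is not a standard command) that stores the exit status in a variable before doing anything else.

<syntaxhighlight lang='bash'>
# run_and_report is a hypothetical helper: run a command quietly, then report its status
run_and_report() {
  local status=0
  "$@" > /dev/null 2>&1 || status=$?   # save the exit status right away
  if [ "$status" -eq 0 ]; then
    echo "$*: OK"
  else
    echo "$*: FAIL (exit $status)"
  fi
}

run_and_report true
run_and_report false
</syntaxhighlight>

The <code>cmd || status=$?</code> form also keeps working in scripts that use <code>set -e</code>, where a bare failing command would end the script.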
Example 3 (check whether the user has supplied the correct number of parameters):
<syntaxhighlight lang='bash'>
#!/bin/bash
if [ $# -ne 2 ]; then
    echo "Usage: $0 ProgramName filename"
    exit 1
fi
match_text=$1
filename=$2
</syntaxhighlight>
Example 4 (make a new directory and cd to it)
<syntaxhighlight lang='bash'>
mkdir -p "newDir/subDir"; cd "$_"
</syntaxhighlight>
==== How to List Environment Variables ====
[https://www.howtogeek.com/842780/linux-list-environment-variables/ How to List Environment Variables on Linux]
<pre>
printenv
</pre>
==== Unset/Remove an environment variable ====
<syntaxhighlight lang='bash'>
$ export MSG="HELLO WORLD"
$ echo $MSG
HELLO WORLD
$ unset MSG
$ echo $MSG

$
</syntaxhighlight>
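An empty <code>echo</code> does not distinguish a variable that was unset from one that is set to an empty string. One way to tell them apart, sketched here with made-up result variables, is the <code>${VAR+x}</code> expansion, which yields <code>x</code> only when the variable is set.

<syntaxhighlight lang='bash'>
MSG="HELLO WORLD"
if [ -n "${MSG+x}" ]; then before="set"; else before="unset"; fi

MSG=""           # empty, but still set
if [ -n "${MSG+x}" ]; then empty="set"; else empty="unset"; fi

unset MSG
if [ -n "${MSG+x}" ]; then after="set"; else after="unset"; fi

echo "before: $before, empty: $empty, after unset: $after"
</syntaxhighlight>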
==== Set an environment variable and run a command on the same line, env command ==== | |||
<ul> | |||
<li>[https://stackoverflow.com/a/10856348 Setting an environment variable before a command in Bash is not working for the second command in a pipe] | |||
<li>[https://stackoverflow.com/a/20858414 What does 'bash -c' do?] | |||
<pre>
FOO=bar bash -c 'somecommand someargs | somecommand2'
</pre>
<li>env: run a program in a modified environment. [https://www.man7.org/linux/man-pages/man1/env.1.html man env], [https://www.geeksforgeeks.org/env-command-in-linux-with-examples/# env command in Linux with Examples]
<pre>
env RSTUDIO_WHICH_R=/opt/R/4.2.3/bin/R rstudio ~/Project/project.Rproj
</pre>
Note that the environment of the current shell is not changed: RSTUDIO_WHICH_R is set only for the rstudio process and is not exported in the shell.
<li>https://en.wikipedia.org/wiki/Env. ''Note that this use of env is often unnecessary since most shells support setting environment variables in front of a command''.
<pre>
env DISPLAY=foo.bar:1.0 xcalc
# OR | |||
DISPLAY=foo.bar:1.0 xcalc | |||
</pre> | |||
</ul> | |||
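The scoping described above can be checked with a minimal sketch, assuming Bash. GREETING is a hypothetical variable name used only for illustration.
<syntaxhighlight lang='bash'>
#!/bin/bash
# GREETING is a hypothetical variable used only for this illustration.
unset GREETING

# The assignment prefix applies only to the environment of the child process.
child_value=$(GREETING=hello bash -c 'echo "$GREETING"')
echo "child saw: $child_value"

# env does the same thing through an external program.
env_value=$(env GREETING=hi bash -c 'echo "$GREETING"')
echo "env child saw: $env_value"

# The current shell is unaffected: GREETING is still unset here.
echo "parent sees: ${GREETING:-<unset>}"
</syntaxhighlight>
Use '''export''' instead when the variable should stay visible to every later command started from the same shell.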
=== Parameter variables === | |||
* [https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html Shell Parameter Expansion] - Important !! | |||
* http://tldp.org/LDP/abs/html/othertypesv.html | |||
* https://bash.cyberciti.biz/guide/Pass_arguments_into_a_function | |||
<syntaxhighlight lang='bash'> | |||
$1, $2, ... -- parameters given to the script
$# -- the number of parameters
$* -- all the parameters; "$*" expands to a single word
$@ -- all the parameters; "$@" expands to one word per parameter
$$ -- the process id of the current shell
$! -- the process id of the last command run in the background.
</syntaxhighlight> | |||
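The difference between $* and $@ only shows up once quoting is involved. The following is a small sketch, assuming Bash with the default IFS; count_args is a hypothetical helper introduced only for this illustration.
<syntaxhighlight lang='bash'>
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo $#; }

set -- "file one.txt" "file two.txt"   # two parameters, each containing a space

n_at=$(count_args "$@")       # "$@" keeps each parameter as a separate word
n_star=$(count_args "$*")     # "$*" joins all parameters into one word
n_unquoted=$(count_args $*)   # unquoted: the values are split again on whitespace

echo "quoted @: $n_at, quoted *: $n_star, unquoted *: $n_unquoted"
</syntaxhighlight>
In most scripts "$@" is the form to use when passing the parameters on to another command.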
Example 1. | |||
<syntaxhighlight lang='bash'> | |||
#!/bin/bash | |||
echo "$1 likes to eat $2 and $3 every day." | |||
echo "bye:-)" | |||
</syntaxhighlight> | |||
Example 2. | |||
<syntaxhighlight lang='bash'> | |||
$ touch /tmp/tmpfile_$$ | |||
$ set foo bar bam | |||
$ echo $#
3
$ echo $@
foo bar bam
$ set foo bar bam &
[1] 28212
$ echo $!
28212
[1]+ Done set foo bar bam
</syntaxhighlight> | |||
Example 3. [https://www.lifewire.com/pass-arguments-to-bash-script-2200571 $@] parameter for a variable number of parameters | |||
<syntaxhighlight lang='bash'> | |||
$ cat stats.sh | |||
for FILE1 in "$@" | |||
do | |||
wc "$FILE1"
done | |||
$ sh stats.sh songlist1 songlist2 songlist3 | |||
</syntaxhighlight> | |||
We can also use curly braces around the variable name, which marks where the name ends when it is followed by other text.
<syntaxhighlight lang='bash'> | |||
QT_ARCH=x86_64 | |||
QT_SDK_BINARY=QtSDK-4.8.0-${QT_ARCH}.tar.gz | |||
QT_SD_URL=https://xxx.com/$QT_SDK_BINARY | |||
</syntaxhighlight> | |||
[http://stackoverflow.com/questions/1224766/how-do-i-rename-the-extension-for-a-batch-of-files How do I rename the extension for a batch of/multiple files?] See '''man bash''' [https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html Shell Parameter Expansion] | |||
<syntaxhighlight lang='bash'> | |||
# Solution 1: | |||
for file in *.html; do | |||
mv "$file" "`basename "$file" .html`.txt" | |||
done | |||
# Solution 2: | |||
for file in *.html | |||
do | |||
mv "$file" "${file%.html}.txt" | |||
done | |||
</syntaxhighlight> | |||
==== Get filename without Path ====
[https://tecadmin.net/how-to-extract-filename-extension-in-shell-script/ How to Extract Filename & Extension in Shell Script] | |||
<pre> | |||
fullfilename="/var/log/mail.log" | |||
filename=$(basename "$fullfilename") | |||
echo $filename | |||
</pre> | |||
==== Extension without filename ==== | |||
[https://tecadmin.net/how-to-extract-filename-extension-in-shell-script/ How to Extract Filename & Extension in Shell Script] | |||
<pre>
fullfilename="/var/log/mail.log"
filename=$(basename "$fullfilename")
ext="${filename##*.}"
echo $ext
</pre>
==== Discard the extension name and "%" symbol ==== | |||
<syntaxhighlight lang='bash'> | |||
$ vara=fillename.ext | |||
$ echo $vara | |||
fillename.ext | |||
$ echo ${vara::-4} # works on Bash 4.3, eg Ubuntu | |||
fillename | |||
$ echo ${vara::${#vara}-4} # works on Bash 4.1, eg Red Hat on Biowulf
</syntaxhighlight> | |||
http://stackoverflow.com/questions/27658675/how-to-remove-last-n-characters-from-a-bash-variable-string | |||
Another way (not assuming 3 letters for the suffix) https://www.cyberciti.biz/faq/unix-linux-extract-filename-and-extension-in-bash/ | |||
<syntaxhighlight lang='bash'>
dest="/nas100/backups/servers/z/zebra/mysql.tgz" | |||
## get file name i.e. basename such as mysql.tgz | |||
tempfile="${dest##*/}" | |||
## display filename | |||
echo "${tempfile%.*}" | |||
</syntaxhighlight>
Or better with (See [https://stackoverflow.com/questions/965053/extract-filename-and-extension-in-bash Extract filename and extension in Bash] and [https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html Shell parameter expansion]). [https://tecadmin.net/how-to-extract-filename-extension-in-shell-script/ How to Extract Filename & Extension in Shell Script] | |||
<syntaxhighlight lang='bash'> | |||
fullfilename="/var/log/mail.log"
filename=$(basename "$fullfilename") | |||
fname="${filename%.*}" | |||
echo $fname # mail | |||
$ UEFI_ZIP_FILE="UDOOX86_B02-UEFI_Update_rel102.zip" | |||
$ UEFI_ZIP_DIR="${UEFI_ZIP_FILE%.*}" | |||
$ echo $UEFI_ZIP_DIR | |||
UDOOX86_B02-UEFI_Update_rel102 | |||
$ FILE="example.tar.gz"
$ echo "${FILE%%.*}"
example
$ echo "${FILE%.*}" | |||
example.tar | |||
$ echo "${FILE#*.}" | |||
tar.gz | |||
$ echo "${FILE##*.}" | |||
gz | |||
</syntaxhighlight> | |||
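The four operators (%, %%, #, ##) can be combined into one small function. This is only a sketch: split_path is a hypothetical name, and it assumes the path does not sit directly under the root directory.
<syntaxhighlight lang='bash'>
#!/bin/bash
# split_path is a hypothetical helper; it prints directory, base name and extension,
# one per line, using only parameter expansion (no external commands).
split_path() {
  local path=$1
  local dir=${path%/*}       # drop the shortest trailing /...
  local file=${path##*/}     # drop the longest leading .../
  local stem=${file%.*}      # drop the last extension
  local ext=${file##*.}      # keep only the last extension
  [ "$dir" = "$path" ] && dir=.      # no slash in the path
  [ "$ext" = "$file" ] && ext=""     # no dot in the file name
  printf '%s\n' "$dir" "$stem" "$ext"
}

split_path /var/log/mail.log
</syntaxhighlight>
For the path above it prints /var/log, mail and log on separate lines.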
==== Space in variable value ====
Suppose we have a script file called 'foo' that can remove spaces from a file name. Note: '''tr''' command is used to delete characters specified by the '-d' parameter. | |||
<syntaxhighlight lang='bash'> | |||
#!/bin/sh | |||
NAME=`ls $1 | tr -d ' '` | |||
echo $NAME | |||
mv $1 $NAME | |||
</syntaxhighlight> | |||
Now we try the program: | |||
<syntaxhighlight lang='bash'> | |||
$ touch 'file 1.txt' | |||
$ ./foo 'file 1.txt' | |||
ls: cannot access file: No such file or directory | |||
ls: cannot access 1.txt: No such file or directory | |||
mv: cannot stat ‘file’: No such file or directory | |||
</syntaxhighlight> | |||
The way to fix the program is to use double quotes around $1 | |||
<syntaxhighlight lang='bash'> | |||
#!/bin/sh | |||
NAME=`ls "$1" | tr -d ' '` | |||
echo $NAME | |||
mv "$1" $NAME | |||
</syntaxhighlight> | |||
and test it | |||
<syntaxhighlight lang='bash'>
$ ./foo "file 1.txt" | |||
file1.txt | |||
</syntaxhighlight> | |||
When a path combines variables with a wildcard, put the double quotes around the variables only, not around the whole string: a wildcard inside double quotes is not expanded.
<syntaxhighlight lang='bash'> | |||
$ rm "$outputDir/tmp/$tmpfd/tmpa" # fine | |||
$ rm "$outputDir/tmp/$tmpfd/tmp*.txt" | |||
rm: annovar6-12/tmp/tmp_bt20_raw/tmp*.txt: No such file or directory | |||
$ rm "$outputDir"/tmp/$tmpfd/tmp*.txt | |||
</syntaxhighlight> | |||
See https://unix.stackexchange.com/questions/131766/why-does-my-shell-script-choke-on-whitespace-or-other-special-characters | |||
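The renaming script above can also be written without '''ls''' and '''tr'''. The following is a sketch that assumes Bash (for the <nowiki>${var// /}</nowiki> substitution); strip_spaces and the scratch directory are hypothetical and only there for illustration.
<syntaxhighlight lang='bash'>
#!/bin/bash
# strip_spaces is a hypothetical helper: rename a file so its name contains no spaces.
strip_spaces() {
  local old=$1 dir file new
  dir=$(dirname "$old")
  file=$(basename "$old")
  new=${file// /}                     # Bash pattern substitution: delete every space
  [ "$file" = "$new" ] && return 0    # nothing to rename
  mv -i "$old" "$dir/$new"            # every expansion is quoted, so spaces are safe
}

# Try it in a scratch directory.
tmpdir=$(mktemp -d)
touch "$tmpdir/file 1.txt"
strip_spaces "$tmpdir/file 1.txt"
result=$(ls "$tmpdir")
rm -rf "$tmpdir"
echo "$result"
</syntaxhighlight>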
==== getopts function - parse options from shell script command line ==== | |||
* https://www.lifewire.com/pass-arguments-to-bash-script-2200571 | |||
* https://www.computerhope.com/unix/bash/getopts.htm | |||
* [https://www.howtogeek.com/778410/how-to-use-getopts-to-parse-linux-shell-script-options/ How to Use getopts to Parse Linux Shell Script Options] | |||
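A minimal sketch of '''getopts''', assuming Bash. The function name parse_args, the options -v and -o FILE, and the default out.txt are all made up for illustration.
<syntaxhighlight lang='bash'>
#!/bin/bash
# parse_args is a hypothetical function; -v and -o FILE are illustrative options.
parse_args() {
  local OPTIND=1 opt          # reset OPTIND so the function can be called repeatedly
  verbose=0
  outfile=out.txt
  # Leading ":" selects silent error reporting; "o:" means -o takes an argument.
  while getopts ":vo:" opt; do
    case $opt in
      v) verbose=1 ;;
      o) outfile=$OPTARG ;;
      :) echo "Option -$OPTARG requires an argument" >&2; return 1 ;;
      \?) echo "Unknown option -$OPTARG" >&2; return 1 ;;
    esac
  done
  shift $((OPTIND - 1))       # drop the options that were parsed
  rest=("$@")                 # remaining positional arguments
}

parse_args -v -o result.csv input1 input2
echo "verbose=$verbose outfile=$outfile rest=${rest[*]}"
</syntaxhighlight>
In a standalone script the same loop is usually run directly on the script's own arguments instead of inside a function.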
==== Check if command line argument is missing (? :) and specifying the default (:-) ==== | |||
Search for [https://stackoverflow.com/a/3953666 ternary (conditional) operator] and check out [https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html parameter Expansion] in Bash Reference Manual. [https://linuxhint.com/bash_operator_examples/ 74 Bash Operators Examples] | |||
<pre> | |||
#!/usr/bin/env bash | |||
NAME=${1?Error: no name given} | |||
NAME2=${2:-friend} | |||
echo "HELLO! $NAME and $NAME2" | |||
</pre> | |||
=== Shell expansion === | |||
https://www.gnu.org/software/bash/manual/html_node/Shell-Expansions.html#Shell-Expansions | |||
==== Curly brace {} expansion and array ==== | |||
* A [https://wizardzines.com/comics/parameter-expansion/?s=09 Comic] from Wizard zines. | |||
* [https://www.cyberciti.biz/faq/explain-brace-expansion-in-cp-mv-bash-shell-commands/ Explain: {,} in cp or mv Bash Shell Commands] | |||
* [https://unix.stackexchange.com/questions/157286/copying-files-with-multiple-extensions Copy multiple types of extensions] | |||
: <syntaxhighlight lang='bash'> | |||
cp -v *.{txt,jpg,png} destination/ | |||
</syntaxhighlight>
* [https://www.linux.com/blog/learn/2019/2/all-about-curly-braces-bash All about {Curly Braces} in Bash] | |||
** Array Builder <syntaxhighlight lang='bash'> | |||
echo {0..10} | |||
echo {10..0..2} | |||
echo {z..a..2} | |||
mkdir test{10..12} # test10, test11, test12 directories | |||
rm -rf test{10..12} | |||
</syntaxhighlight>
** Parameter expansion <syntaxhighlight lang='bash'>
# convert jpg to png
for i in *.jpg; do convert "$i" "${i%jpg}png"; done
# | |||
a="Hello World!" | |||
echo Goodbye${a#Hello} | |||
# Goodbye World! | |||
</syntaxhighlight>
** Output Grouping | |||
* [https://www.makeuseof.com/bash-script-array-usage/ How to Use Arrays in a Bash Script] | |||
==== Square brackets ==== | |||
[https://www.linux.com/blog/2019/3/using-square-brackets-bash-part-1 Using Square Brackets in Bash: Part 1] | |||
Globbing means using wildcards to get all the results that fit a certain pattern.
<syntaxhighlight lang='bash'>
ls *.jpg # the asterisk means "zero or more characters" | |||
ls d*k? # ?, which means "exactly one character" | |||
touch file0{0..9}{0..9} # This will create files file000, file001, file002, etc., through file097, file098 and file099. | |||
ls file0[78]? # list the files in the 70s and 80s | |||
ls file0[259][278] # list file022, file027, file028, file052, file057, file058, file092, file097, and file098
</syntaxhighlight>
== Conditions ==
We can use the '''test''' command to check if a file exists. The command is test -f <filename>.
'''[ ]''' is the same as writing '''test'''; always leave a space after the opening '''[''' and before the closing ''']'''.
<pre> | |||
if test -f fred.c; then ...; fi | |||
if [ -f fred.c ] | |||
then | |||
... | |||
fi | |||
if [ -f fred.c ]; then | |||
... | |||
fi | |||
</pre>
=== Boolean variables === | |||
[https://www.cyberciti.biz/faq/how-to-declare-boolean-variables-in-bash-and-use-them-in-a-shell-script/ How to declare Boolean variables in bash and use them in a shell script]
<pre>
failed=0 # False
jobdone=1 # True
if [ "$failed" -eq 1 ]
then
echo "Job failed"
else
echo "Job done"
fi
## more readable syntax ##
failed=false
jobdone=true
if [ "$failed" = true ]
then
echo "Job failed"
else
echo "Job done"
fi
</pre>
We can define them as a string and make our code more readable. | |||
=== What is the difference between test, [ and [[ ? === | |||
http://mywiki.wooledge.org/BashFAQ/031 | |||
[ ("test" command) and [[ ("new test" command) are used to evaluate expressions. [[ works only in Bash, Zsh and the Korn shell, and is more powerful; [ and ''test'' are available in POSIX shells. | |||
''test'' implements the old, portable syntax of the command. In almost all shells (the oldest Bourne shells are the exception), [ is a synonym for ''test'' (but requires a final argument of ]). | |||
[[ is a new improved version of it, and is a keyword, not a program. | |||
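Two practical differences are shown in the sketch below, assuming Bash. The variable name and its value are hypothetical.
<syntaxhighlight lang='bash'>
#!/bin/bash
# name is a hypothetical variable holding a value with a space in it.
name="my file.txt"

# [[ ]] does not word-split unquoted variables, so this is safe.
if [[ $name == "my file.txt" ]]; then r1=match; else r1=nomatch; fi

# Inside [[ ]], an unquoted right-hand side of == is treated as a glob pattern.
if [[ $name == *.txt ]]; then r2=match; else r2=nomatch; fi

# With [ ], the variable must be quoted, and the quoted pattern is compared literally.
if [ "$name" = "*.txt" ]; then r3=match; else r3=nomatch; fi

echo "$r1 $r2 $r3"
</syntaxhighlight>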
=== String comparison === | |||
<pre> | |||
== ==> strings are equal (== is a synonym for =) | |||
= ==> strings are equal | |||
!= ==> strings are not equal | |||
-z ==> string is null | |||
-n ==> string is not null | |||
</pre> | |||
For example, the following script checks whether the user has provided an argument to the script.
<syntaxhighlight lang='bash'>
#!/bin/sh
if [ -z "$1" ]; then
echo "Provide a \"file name\", using quotes to nullify the space."
exit 1
fi
mv -i "$1" `ls "$1" | tr -d ' '`
</syntaxhighlight>
where the '''-i''' parameter makes '''mv''' ask for confirmation before overwriting an existing file.
To check whether Xcode (either full Xcode or command line developer tools only) has been installed or not on Mac | |||
<syntaxhighlight lang='bash'>
if [ -z "$(xcode-select -p 2>&1 | grep error)" ] | |||
then | |||
echo "Xcode has been installed"; | |||
else | |||
echo "Xcode has not been installed"; | |||
fi | |||
# only print out message if xcode was not found | |||
if [ -n "$(xcode-select -p 2>&1 | grep error)" ] | |||
then | |||
echo "Xcode has not been installed"; | |||
fi | |||
</syntaxhighlight> | |||
Note that the 'error' keyword comes from macOS when [[#Install_Xcode|Xcode has not been installed]]. Also, the double quotes around '''$( )''' are needed to avoid the [http://stackoverflow.com/questions/13781216/bash-meaning-of-too-many-arguments-error-from-if-square-brackets "[: too many arguments" error].
[https://www.cyberciti.biz/faq/bash-check-if-string-starts-with-character-such-as/ Check if string starts with such as "#"]. <syntaxhighlight lang='bash'> | |||
if [[ "$var" =~ ^#.* ]]; then | |||
echo "yes" | |||
fi | |||
</syntaxhighlight>
=== Arithmetic/Integer comparison ===
<pre> | |||
expr1 -eq expr2 ==> check equal | |||
expr1 -ne expr2 ==> check not equal | |||
expr1 -gt expr2 ==> expr1 > expr2 | |||
expr1 -ge expr2 ==> expr1 >= expr2 | |||
expr1 -lt expr2 ==> expr1 < expr2 | |||
expr1 -le expr2 ==> expr1 <= expr2 | |||
! expr ==> opposite of expr | |||
</pre> | |||
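A short sketch of these operators in use. compare_ints is a hypothetical helper, and the numbers are arbitrary.
<syntaxhighlight lang='bash'>
#!/bin/bash
# compare_ints is a hypothetical helper that reports how two integers relate.
compare_ints() {
  if [ "$1" -lt "$2" ]; then
    echo "less"
  elif [ "$1" -gt "$2" ]; then
    echo "greater"
  else
    echo "equal"
  fi
}

compare_ints 9 10     # numeric comparison: 9 is less than 10

# The string operator < compares character by character, so "9" sorts after "10".
if [[ 9 < 10 ]]; then str_result="less"; else str_result="not less"; fi
echo "as strings: $str_result"
</syntaxhighlight>
This is why -lt, -gt and the other operators in the table should be used for numbers rather than the string operators.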
=== File conditionals === | |||
<pre>
-d file ==> True if the file is a directory | |||
-e file ==> True if the file exists | |||
-f file ==> True if the file is a regular file | |||
-r file ==> True if the file is readable | |||
-s file ==> True if the file has non-zero size | |||
-w file ==> True if the file is writable | |||
-x file ==> True if the file is executable | |||
</pre>
Example 1: Suppose we want to know whether the first argument (if given) matches a specific string. We can use (note the space before and after '==')
<syntaxhighlight lang='bash'>
#!/bin/bash
if [ "$1" == "console" ]; then
echo 'Console'
else
echo 'Non-console' | |||
fi | |||
</syntaxhighlight> | |||
Example 2: [https://www.cyberciti.biz/faq/linux-unix-script-check-if-file-empty-or-not/ Check If File Is Empty Or Not Using Shell Script] | |||
<syntaxhighlight lang='bash'> | |||
#!/bin/bash | |||
_file="$1" | |||
[ $# -eq 0 ] && { echo "Usage: $0 filename"; exit 1; } | |||
[ ! -f "$_file" ] && { echo "Error: $_file file not found."; exit 2; }
if [ -s "$_file" ] | |||
then | |||
echo "$_file has some data." | |||
# do something as file has data | |||
else | |||
echo "$_file is empty." | |||
# do something as file is empty | |||
fi | |||
</syntaxhighlight> | |||
=== Check if running as root === | |||
<syntaxhighlight lang='bash'> | |||
if [ $UID -ne 0 ]; | |||
then | |||
echo "Run as root" | |||
exit 1; | |||
fi | |||
</syntaxhighlight> | |||
== Control Structures == | |||
=== '''if''' === | |||
<pre> | |||
if condition | |||
then | |||
statements | |||
elif [ condition ]; then | |||
statements | |||
else | |||
statements | |||
fi | |||
</pre> | |||
For example, we can run a '''cp''' command if two files are different. | |||
<pre> | |||
if ! cmp -s "$filesrc" "$filecur" | |||
then | |||
cp "$filesrc" "$filecur"
fi | |||
</pre> | |||
==== String Comparison ==== | |||
http://stackoverflow.com/questions/2237080/how-to-compare-strings-in-bash | |||
<syntaxhighlight lang='bash'> | |||
answer=no | |||
if [ -f "genome.fa" ]; then | |||
echo -n 'Do you want to continue [yes/no]: ' | |||
read answer | |||
fi | |||
if [ "$answer" == "no" ]; then | |||
echo AAA | |||
fi | |||
if [ "$answer"=="no" ]; then | |||
# failed if condition | |||
echo BBB | |||
fi | |||
</syntaxhighlight> | |||
# You want the quotes around $answer: if $answer is empty and unquoted, the test becomes [ == "no" ], which is a syntax error.
# Spaces in bash are important.
#* Spaces between '''if''' and '''[''', and before ''']''', are important.
#* A space before and after the double equal signs is important. Without the spaces, "$answer"=="no" is a single non-empty string, which always tests as true, so even if we reply with 'yes' the code still runs the 'echo BBB' statement.
=== '''while''' === | |||
<pre> | |||
while condition
do
statements | |||
done | |||
</pre> | |||
* https://www.cyberciti.biz/faq/bash-while-loop/, https://bash.cyberciti.biz/guide/While_loop | |||
* http://www.tldp.org/LDP/Bash-Beginners-Guide/html/sect_09_02.html, http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-7.html | |||
* Pipe and while <syntaxhighlight lang='bash'> | |||
$ function mylist() { | |||
ls *.r | |||
} | |||
$ mylist | while read -r file; do wc -l "$file"; done
</syntaxhighlight> | |||
=== '''until''' ===
<pre> | |||
until condition | |||
do | |||
statements | |||
done | |||
</pre> | |||
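'''until''' runs the body as long as the condition is false, so it behaves like '''while''' with the condition negated. A minimal sketch with an arbitrary counter:
<syntaxhighlight lang='bash'>
#!/bin/bash
# Count down from 3; the starting value is arbitrary.
count=3
until [ "$count" -eq 0 ]; do
  echo "count is $count"
  count=$((count - 1))
done
echo "finished"
</syntaxhighlight>
The same loop could be written as <nowiki>while [ "$count" -ne 0 ]</nowiki>.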
=== case === | |||
[https://www.howtogeek.com/766978/how-to-use-case-statements-in-bash-scripts/ How to Use Case Statements in Bash Scripts] | |||
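A small sketch of a '''case''' statement, assuming Bash. classify and the file-name patterns are hypothetical and chosen only for illustration.
<syntaxhighlight lang='bash'>
#!/bin/bash
# classify is a hypothetical function; the patterns are illustrative.
classify() {
  case $1 in
    *.tar.gz|*.tgz) echo "gzip tarball" ;;          # | separates alternative patterns
    *.zip)          echo "zip archive" ;;
    [0-9]*)         echo "starts with a digit" ;;
    *)              echo "unknown" ;;               # default branch
  esac
}

classify backup.tgz
classify 2024-report.pdf
classify notes.txt
</syntaxhighlight>
The patterns are shell globs, not regular expressions, and the first matching branch wins.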
=== Semicolon === | |||
command1; command2; command3; command4
Every command will be executed in turn, whether or not the previous one succeeded.
=== '''AND list &&''' === | |||
[https://www.linuxuprising.com/2021/11/how-to-run-command-after-previous-one.html How To Run A Command After The Previous One Has Finished On Linux] | |||
<pre> | |||
statement1 && statement2 && statement3 && ... | |||
</pre> | |||
If statement1 finishes successfully, then statement2 runs, and so on; the list stops at the first statement that fails.
<syntaxhighlight lang='bash'> | |||
touch /tmp/f1 | |||
echo "data" >/tmp/f2 | |||
[ -s /tmp/f1 ] | |||
echo $? # 1 | |||
[ -s /tmp/f2 ] | |||
echo $? # 0 | |||
[ -s /tmp/f1 ] && echo "not empty" || echo "empty" # empty | |||
[ -s /tmp/f2 ] && echo "not empty" || echo "empty" # not empty | |||
</syntaxhighlight> | |||
=== '''OR list ||''' === | |||
<pre> | |||
statement1 || statement2 || statement3 || ... | |||
</pre> | |||
If statement1 fails, then statement2 runs, and so on; the list stops at the first statement that succeeds.
For example, | |||
<syntaxhighlight lang='bash'> | |||
codename=$(lsb_release -s -c) | |||
if [ $codename == "rafaela" ] || [ $codename == "rosa" ]; then | |||
codename="trusty" | |||
fi | |||
</syntaxhighlight> | |||
=== Chaining rule (command1 && command2 || command3) === | |||
[https://opensource.com/article/18/11/control-operators-bash-shell Coupled commands with control operators in Bash] | |||
[https://www.tecmint.com/chaining-operators-in-linux-with-practical-examples/ 10 Useful Chaining Operators in Linux with Practical Examples]. | |||
* Ampersand Operator (&), | |||
* semi-colon Operator (;), | |||
* AND Operator (&&), | |||
* OR Operator (||), | |||
* NOT Operator (!), | |||
* AND – OR operator (&& – ||), | |||
* PIPE Operator (|), | |||
* Command Combination Operator {}, | |||
* Precedence Operator (), | |||
* Concatenation Operator (\). | |||
A combination of the 'AND' and 'OR' operators is much like an 'if-else' statement, with one difference: command3 also runs when command1 succeeds but command2 fails.
<syntaxhighlight lang='bash'> | |||
$ ping -c3 www.google.com && echo "Verified" || echo "Host Down" | |||
</syntaxhighlight> | |||
[https://opensource.com/article/19/10/programming-bash-syntax-tools How to program with Bash: Syntax and tools] | |||
<pre> | |||
# command1 && command2 | |||
$ Dir=/root/testdir ; mkdir $Dir/ && cd $Dir | |||
# command1 || command2 | |||
$ Dir=/root/testdir ; mkdir $Dir || echo "$Dir was not created." | |||
# preceding commands ; command1 && command2 || command3 ; following commands | |||
# "If command1 exits with a return code of 0, then execute command2, otherwise execute command3." | |||
$ Dir=/root/testdir ; mkdir $Dir && cd $Dir || echo "$Dir was not created." | |||
$ Dir=~/testdir ; mkdir $Dir && cd $Dir || echo "$Dir was not created." | |||
</pre> | |||
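The chain command1 && command2 || command3 is not fully equivalent to if/else. A minimal sketch using '''true''' and '''false''' as stand-ins for real commands:
<syntaxhighlight lang='bash'>
#!/bin/bash
# In A && B || C, C runs when A fails OR when B fails.
out_chain=$(true && false || echo "fallback ran")

# An explicit if/else runs the else branch only when the condition itself fails.
out_if=$(if true; then false; else echo "fallback ran"; fi)

echo "chain: '$out_chain'   if/else: '$out_if'"
</syntaxhighlight>
Here the chain prints the fallback message even though the first command succeeded, while the if/else version prints nothing.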
=== for + do + done === | |||
<pre> | |||
for variable in values | |||
do | |||
statements | |||
done | |||
</pre> | |||
The values can be an explicit list | |||
<syntaxhighlight lang='bash'> | |||
i=1 | |||
for day in Mon Tue Wed Thu Fri | |||
do | |||
echo "Weekday $((i++)) : $day" | |||
done | |||
</syntaxhighlight> | |||
or a variable | |||
<syntaxhighlight lang='bash'> | |||
i=1 | |||
weekdays="Mon Tue Wed Thu Fri" | |||
for day in $weekdays | |||
do | |||
echo "Weekday $((i++)) : $day" | |||
done | |||
# Output | |||
# Weekday 1 : Mon | |||
# Weekday 2 : Tue | |||
# Weekday 3 : Wed | |||
# Weekday 4 : Thu | |||
# Weekday 5 : Fri | |||
</syntaxhighlight> | |||
Note that we should not put double quotes around the $weekdays variable. Double quotes prevent word splitting, so the loop would run only once with the whole string. See [http://www.thegeekstuff.com/2011/07/bash-for-loop-examples/ thegeekstuff] article.
<syntaxhighlight lang='bash'> | |||
i=1 | |||
weekdays="Mon Tue Wed Thu Fri" | |||
for day in "$weekdays" | |||
do | |||
echo "Weekday $((i++)) : $day" | |||
done | |||
# Output | |||
# Weekday 1 : Mon Tue Wed Thu Fri | |||
</syntaxhighlight> | |||
To loop over all script files in a directory | |||
<syntaxhighlight lang='bash'> | |||
FILES=/path/to/PATTERN*.sh | |||
for f in $FILES; | |||
do | |||
( | |||
"$f" | |||
)& | |||
done | |||
wait | |||
</syntaxhighlight> | |||
OR | |||
<syntaxhighlight lang='bash'> | |||
FILES=" | |||
file1 | |||
/path/to/file2 | |||
/path/to/file3 | |||
" | |||
for f in $FILES; | |||
do | |||
( | |||
"$f" | |||
)& | |||
done | |||
wait | |||
</syntaxhighlight> | |||
Here we run the script in the background and wait to exit until all are finished. | |||
See [http://www.cyberciti.biz/faq/bash-loop-over-file/ loop over files] from cyberciti.biz. | |||
==== Example 1: convert pdfs to tifs using ImageMagick ====
For "for" looping over files, see [http://www.cyberciti.biz/faq/bash-loop-over-file/ cyberciti.biz].
<syntaxhighlight lang="bash"> | |||
outdir="../plosone" | |||
indir="../fig" | |||
if [[ ! -d $outdir ]]; | |||
then | |||
mkdir $outdir | |||
fi | |||
in=(file1.pdf file2.pdf file3.pdf) | |||
for (( i=0; i<${#in[@]} ; i++ )) | |||
do | |||
convert -strip -units PixelsPerInch -density 300 -resample 300 \ | |||
-alpha off -colorspace RGB -depth 8 -trim -bordercolor white \ | |||
-border 1% -resize '2049x2758>' -resize '980x980<' +repage \ | |||
-compress lzw $indir/${in[$i]} $outdir/Figure$((i+1)).tiff
done | |||
</syntaxhighlight> | |||
==== Example 2: download with wget and parsing with 'sed' ==== | |||
A second [http://www.everydayanalytics.ca/2015/01/WTIandOntarioGasPrices.html example] downloads all the (Ontario gasoline price) data with wget, then parses and concatenates the data with other *nix tools like 'sed':
<syntaxhighlight lang="bash"> | |||
# Download data | |||
for i in $(seq 1990 2014) | |||
do wget http://www.energy.gov.on.ca/fuelupload/ONTREG$i.csv | |||
done | |||
# Retain the header | |||
head -n 2 ONTREG1990.csv | sed 1d > ONTREG_merged.csv | |||
# Loop over the files and use sed to extract the relevant lines | |||
for i in $(seq 1990 2014) | |||
do | |||
tail -n 15 ONTREG$i.csv | sed 13,15d | sed 's/./-01-'$i',/4' >> ONTREG_merged.csv | |||
done | |||
</syntaxhighlight> | |||
==== Example 3: download ==== | |||
Download all 20 sra files (60GB in total) from [ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP032/SRP032789 SRP032789]. | |||
<syntaxhighlight lang="bash"> | |||
for x in $(seq 1027175 1027180) | |||
do wget ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP032/SRP032789/SRR$x/SRR$x.sra | |||
done | |||
</syntaxhighlight> | |||
https://github.com/MarioniLab/EmptyDrops2017/blob/master/data/download_10x.sh | |||
<pre> | |||
for x in \ | |||
http://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc4k/pbmc4k_raw_gene_bc_matrices.tar.gz \ | |||
http://cf.10xgenomics.com/samples/cell-exp/2.1.0/neurons_900/neurons_900_raw_gene_bc_matrices.tar.gz \ | |||
http://cf.10xgenomics.com/samples/cell-exp/1.1.0/293t/293t_raw_gene_bc_matrices.tar.gz \ | |||
http://cf.10xgenomics.com/samples/cell-exp/1.1.0/jurkat/jurkat_raw_gene_bc_matrices.tar.gz \ | |||
http://cf.10xgenomics.com/samples/cell-exp/2.1.0/t_4k/t_4k_raw_gene_bc_matrices.tar.gz \ | |||
http://cf.10xgenomics.com/samples/cell-exp/2.1.0/neuron_9k/neuron_9k_raw_gene_bc_matrices.tar.gz | |||
do | |||
wget $x | |||
destname=$(basename $x) | |||
stub=$(echo $destname | sed "s/_raw_.*//") | |||
mkdir -p $stub | |||
tar -xvf $destname -C $stub | |||
rm $destname | |||
done | |||
</pre> | |||
==== Example 4: convert files from DOS to Unix ==== | |||
Convert all files from DOS to Unix format | |||
<syntaxhighlight lang="bash"> | |||
for f in *.txt; do tr -d '\r' < "$f" > tmp.txt; mv tmp.txt "$f"; done
# Or, for files given as arguments
for f in "$@"; do tr -d '\r' < "$f" > tmp.txt; mv tmp.txt "$f"; done
</syntaxhighlight> | |||
==== Example 5: print all files in a directory ==== | |||
<syntaxhighlight lang="bash"> | |||
for f in /etc/*.conf | |||
do | |||
echo "$f" | |||
done | |||
</syntaxhighlight> | |||
==== Example 6: use ping to find all the live machines on the network ==== | |||
<syntaxhighlight lang="bash"> | |||
for ip in 192.168.0.{1..255} ; | |||
do | |||
ping $ip -c 2 &> /dev/null ; | |||
if [ $? -eq 0 ]; | |||
then | |||
echo $ip is alive | |||
fi | |||
done | |||
</syntaxhighlight> | |||
==== Example 7: sed on multiple files ==== | |||
<pre> | |||
for i in *.htm*; do sed -i 's/String1/String2/' "$i"; done | |||
</pre> | |||
Note if the string contains special characters like forward slashes (eg https://www.google.com), we need to escape them by using the backslash sign. | |||
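As an alternative to escaping every slash, '''sed''' accepts another delimiter for the s command. A sketch, where both URLs are placeholders used for illustration:
<syntaxhighlight lang='bash'>
#!/bin/bash
# Both URLs are placeholders for illustration.
old='https://www.google.com'
new='https://www.example.org'

# Escaping each slash with a backslash:
r1=$(echo "see https://www.google.com" | sed 's/https:\/\/www.google.com/https:\/\/www.example.org/')

# Same substitution with | as the delimiter, so slashes need no escaping:
r2=$(echo "see https://www.google.com" | sed "s|$old|$new|")

echo "$r1"
echo "$r2"
</syntaxhighlight>
The delimiter must not occur in either string; characters that are special in regular expressions (such as the dot) still keep their meaning in the pattern.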
==== Example 8: run in parallel ==== | |||
<syntaxhighlight lang="bash"> | |||
for ip in 192.168.0.{1..255} ; | |||
do | |||
( | |||
ping $ip -c2 &> /dev/null ; | |||
if [ $? -eq 0 ]; | |||
then | |||
echo $ip is alive | |||
fi | |||
)& | |||
done | |||
wait | |||
</syntaxhighlight> | |||
where we enclose the loop body in ()&. () encloses a block of commands to run as a subshell and & sends it to the background. '''wait''' waits for all background jobs to complete. | |||
'''Good technique !!!''' | |||
* [[#GNU_Parallel|GNU '''parallel''' command]] | |||
* http://unix.stackexchange.com/questions/103920/parallelize-a-bash-for-loop | |||
* http://stackoverflow.com/questions/27934784/shell-script-to-loop-and-start-processes-in-parallel | |||
* http://superuser.com/questions/158165/parallel-shell-loops | |||
=== wait command === | |||
<ul> | |||
<li>An example where we shall wait until files are deleted before continuing the script. | |||
<syntaxhighlight lang='sh'> | |||
cd /home/ubuntu | |||
if [ -d "R-devel" ]; then | |||
rm -rf "R-devel" & | |||
wait # Wait for the deletion to complete | |||
echo "R-devel folder deleted successfully." | |||
else | |||
echo "R-devel folder does not exist." | |||
fi | |||
wget -O - https://stat.ethz.ch/R/daily/R-devel.tar.gz | tar -xzk | |||
cd R-devel | |||
./configure --prefix=/opt/R/devel --enable-R-shlib | |||
make | |||
</syntaxhighlight> | |||
</ul> | |||
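'''wait''' can also be given a process id, in which case it returns the exit status of that job. A minimal sketch, where the sleep durations and exit codes are arbitrary:
<syntaxhighlight lang='bash'>
#!/bin/bash
# Start two background jobs; durations and exit codes are arbitrary.
( sleep 0.2; exit 0 ) & pid_ok=$!
( sleep 0.1; exit 3 ) & pid_bad=$!

wait "$pid_ok";  status_ok=$?     # wait PID returns that job's exit status
wait "$pid_bad"; status_bad=$?

echo "job $pid_ok exited with $status_ok, job $pid_bad exited with $status_bad"
</syntaxhighlight>
This makes it possible to react to the failure of an individual background job, which a bare '''wait''' does not report.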
== Functions == | |||
* http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-8.html, http://tldp.org/LDP/abs/html/functions.html | |||
* http://www.thegeekstuff.com/2010/04/unix-bash-function-examples/ | |||
* https://www.howtoforge.com/tutorial/linux-shell-scripting-lessons-5/ | |||
* [https://wizardzines.com/comics/bash-functions/ Cartoon] from wizardzines.com | |||
<syntaxhighlight lang='bash'> | |||
#!/bin/bash | |||
fun () { echo "This is a function"; echo; } | |||
# fun () { echo "This is a function"; echo } # Error! A ";" or a newline is required before "}"
function quit { | |||
exit | |||
} | |||
function hello { | |||
echo Hello! | |||
} | |||
function e { | |||
echo $1 | |||
} | |||
e World # call the function with one argument
</syntaxhighlight> | |||
=== [https://www.cyberciti.biz/faq/how-to-find-bash-shell-function-source-code-on-linuxunix/ How to find bash shell function source code on Linux/Unix] === | |||
<syntaxhighlight lang='bash'> | |||
$ type -a function_name | |||
# To list all function names | |||
$ declare -F | |||
$ declare -F | grep function_name | |||
$ declare -F | grep foo | |||
</syntaxhighlight> | |||
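How do I find the file where a bash function is defined? With the '''extdebug''' shell option enabled, '''declare -F''' also prints the line number and the source file.
<syntaxhighlight lang='bash'>
shopt -s extdebug
declare -F function_name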
</syntaxhighlight> | |||
=== Function arguments === | |||
<syntaxhighlight lang='bash'> | |||
source ~/bin/setpath # add bgzip & tabix directories to $PATH | |||
function raw2exon { | |||
# put your comments here | |||
inputvcf=$1 | |||
outputvcf=$2 | |||
inputbed=$3 | |||
if [[ $4 ]]; then | |||
oldpath=$PWD | |||
cd $4 | |||
fi | |||
bgzip -c $inputvcf > $inputvcf.gz | |||
tabix -p vcf $inputvcf.gz | |||
head -$(grep '#' $inputvcf | wc -l) $inputvcf > $outputvcf # header | |||
tabix -R $inputbed $inputvcf.gz >> $outputvcf | |||
wc -l $inputvcf | |||
wc -l $outputvcf | |||
rm $inputvcf.gz $inputvcf.gz.tbi | |||
if [[ $4 ]]; then | |||
cd $oldpath | |||
fi | |||
} | |||
inputbed=S04380110_Regions.bed | |||
raw2exon 'mu0001_raw.vcf' 'mu0001_exon.vcf' $inputbed ~/Downloads/ | |||
</syntaxhighlight> | |||
==== Exit function ==== | |||
[https://bash.cyberciti.biz/guide/Exit_command exit command and the exit statuses] | |||
<pre> | |||
$ cat testfun.sh | |||
#!/bin/bash | |||
ping -q -c 1 $1 >/dev/null 2>&1 | |||
if [ $? -ne 0 ] | |||
then | |||
echo "An error occurred while checking the server status". | |||
exit 3 | |||
fi | |||
exit 0 | |||
$ chmod +x testfun.sh | |||
$ ./testfun.sh www.cyberciti.biz999 | |||
An error occurred while checking the server status. | |||
$ echo $? | |||
3 | |||
</pre> | |||
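Inside a function, '''return''' sets the function's exit status without ending the script, whereas '''exit''' terminates the whole script. A sketch with a hypothetical function is_even:
<syntaxhighlight lang='bash'>
#!/bin/bash
# is_even is a hypothetical function; return hands a status back to the caller.
is_even() {
  local n=$1
  if (( n % 2 == 0 )); then
    return 0      # success
  fi
  return 1        # failure
}

is_even 4; s1=$?
is_even 7; s2=$?
echo "is_even 4 -> $s1, is_even 7 -> $s2"

# The status can be tested directly, like that of any command.
if is_even 10; then echo "10 is even"; fi
</syntaxhighlight>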
== List of commands == | |||
<pre> | |||
break ==> escaping from an enclosing for, while or until loop | |||
: ==> null command | |||
continue ==> make the enclosing for, while or until loop continue at the next iteration
. ==> executes the commands from a file in the current shell
eval ==> evaluate arguments
exec ==> replacing the current shell with a different program
export ==> make the variable named as its parameter available in subshells
expr ==> evaluate its arguments as an expression
printf ==> similar to echo
set ==> sets the parameter variables for the shell. Useful for using fields in commands that output space-separated values
shift ==> moves all the parameter variables down by one. | |||
trap ==> specify the actions to take on receipt of signals. | |||
unset ==> remove variables or functions from the environment. | |||
mktemp ==> create a temporary file | |||
</pre> | |||
== Run the previous command == | |||
[https://unix.stackexchange.com/a/3748 Understanding the exclamation mark (!) in bash] | |||
<pre> | |||
$ apt update # Permission denied | |||
$ sudo !! # Equivalent to sudo apt update
</pre> | |||
''' "!" ''' invokes history expansion. To run the most recent command ''beginning'' with “foo”: | |||
<pre> | |||
!foo | |||
# Run the most recent command beginning with "service" as root | |||
sudo !service | |||
</pre> | |||
== Cache console output on the CLI? == | |||
Try the '''script''' command line utility to create a typescript of everything printed on your terminal.
To exit (to end the script session), type '''exit''' or logout, or press control-D.
== '''set -e''', '''set -x''' and '''trap''' == | |||
'''set -e''' makes the shell exit immediately if a command exits with a non-zero status. Type '''help set''' in command line. Very useful!
See also the [[#trap|trap]] command that is related to non-zero exit. | |||
See | |||
* [http://stackoverflow.com/questions/19622198/what-means-the-set-e-operation-in-a-bash-script-and-some-other-information-abo stackoverflow.com] | |||
* [http://www.peterbe.com/plog/set-ex set -ex] | |||
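A minimal sketch of the effect of '''set -e'''. The small script is run in a child shell so that the failure does not end the calling shell.
<syntaxhighlight lang='bash'>
#!/bin/bash
# Run a small script in a child shell so that set -e does not end this one.
output=$(bash -c '
  set -e
  echo "before"
  false            # non-zero status: the child script stops here
  echo "after"     # never reached
')
status=$?
echo "output: $output, exit status: $status"
</syntaxhighlight>
Only "before" is printed by the child script, and its exit status is that of the failing command.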
=== '''bash -x''' === | |||
Call your script with something like | |||
<syntaxhighlight lang='bash'> | |||
bash -x -v hello_world.sh
</syntaxhighlight> | |||
OR | |||
<syntaxhighlight lang='bash'> | |||
#!/bin/bash -xv
echo Hello World! | |||
</syntaxhighlight> | |||
where | |||
* '''-x''' prints each command, after expansion, before it is executed
* '''-v''' prints each input line as it is read, including comments
Using '''bash -x''' is the same as using '''set -x''' in your bash script.
=== '''set -x''' example === | |||
Bash script | |||
<syntaxhighlight lang='bash'> | |||
set -ex | |||
export DEBIAN_FRONTEND=noninteractive | |||
codename=$(lsb_release -s -c) | |||
if [ $codename == "rafaela" ] || [ $codename == "rosa" ]; then | |||
codename="trusty" | |||
fi | |||
echo $codename | |||
echo step 1 | |||
echo step 2 | |||
exit 0 | |||
</syntaxhighlight> | |||
Without '''-x''' option: | |||
<pre> | |||
trusty | |||
step 1 | |||
step 2 | |||
</pre> | |||
With '''-x''' option: | |||
<pre> | |||
+ export DEBIAN_FRONTEND=noninteractive | |||
+ DEBIAN_FRONTEND=noninteractive | |||
++ lsb_release -s -c | |||
+ codename=rafaela | |||
+ '[' rafaela == rafaela ']' | |||
+ codename=trusty | |||
+ echo trusty | |||
trusty | |||
+ echo step 1 | |||
step 1 | |||
+ echo step 2 | |||
step 2 | |||
+ exit 0 | |||
</pre> | |||
=== trap and error handler === | |||
* http://www.computerhope.com/unix/utrap.htm | |||
* http://linuxcommand.org/wss0160.php | |||
* http://www.tutorialspoint.com/unix/unix-signals-traps.htm | |||
* http://www.ibm.com/developerworks/aix/library/au-usingtraps/ | |||
* http://bash.cyberciti.biz/guide/Trap_statement | |||
* http://steve-parker.org/sh/trap.shtml (trap with a user-defined function) | |||
* http://www.turnkeylinux.org/blog/shell-error-handling (set -e) | |||
* http://unix.stackexchange.com/questions/17314/what-is-signal-0-in-a-trap-command (do something on EXIT) | |||
* http://unix.stackexchange.com/questions/79648/how-to-trigger-error-using-trap-command | |||
* [https://opensource.com/article/20/6/bash-trap Using Bash traps in your scripts] | |||
* [http://redsymbol.net/articles/bash-exit-traps/ How "Exit Traps" Can Make Your Bash Scripts Way More Robust And Reliable] | |||
The syntax to use '''trap''' command is | |||
<pre> | |||
trap command signal | |||
</pre> | |||
For example, | |||
<pre> | |||
$ cat traptest.sh | |||
#!/bin/sh | |||
trap 'rm -f /tmp/tmp_file_$$' INT | |||
echo creating file /tmp/tmp_file_$$ | |||
date > /tmp/tmp_file_$$ | |||
echo 'press interrupt to interrupt ...' | |||
while [ -f /tmp/tmp_file_$$ ]; do | |||
echo file exists | |||
sleep 1 | |||
done | |||
echo the file no longer exists | |||
trap - INT | |||
echo creating file /tmp/tmp_file_$$
date > /tmp/tmp_file_$$
echo 'press interrupt to interrupt ...'
while [ -f /tmp/tmp_file_$$ ]; do
echo file exists
sleep 1
done
echo we never get here
exit 0
</pre>
will get an output like
<pre>
$ ./traptest.sh
creating file /tmp/tmp_file_21389
press interrupt to interrupt ...
file exists
file exists
^Cthe file no longer exists
creating file /tmp/tmp_file_21389
press interrupt to interrupt ... | |||
file exists | |||
file exists | |||
^C | |||
</pre> | |||
The first time we use trap, it deletes the file when we press Ctrl+C, so the while loop ends and the script continues. The second time, '''trap - INT''' resets INT to its default action without specifying any command to be executed. Pressing Ctrl+C then terminates the script, so the final echo and exit statements are never executed.
Note that the following two are different. | |||
<pre> | |||
trap - INT | |||
trap '' INT | |||
</pre> | |||
The first form restores the default action for the signal. The second form IGNORES the signal (Ctrl+C in this case), so if we used it in the script above we would not be able to stop the execution with Ctrl+C.
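Several of the links above use the EXIT pseudo-signal to clean up temporary files. A minimal sketch of that pattern follows; the function name <code>cleanup_demo</code> is an assumption for illustration, and the function body runs in a subshell so that the trap fires when the subshell ends rather than when the interactive shell exits.

```bash
#!/bin/bash
# cleanup_demo is a hypothetical function. The parentheses make its body a
# subshell, so the EXIT trap runs as soon as the function finishes.
cleanup_demo() (
    tmp_file=$(mktemp) || exit 1
    trap 'rm -f "$tmp_file"' EXIT    # runs on normal exit and after a failure
    date > "$tmp_file"
    echo "$tmp_file"                 # print the name so the caller can check it
)

created=$(cleanup_demo)
[ -e "$created" ] || echo "temporary file was removed on exit"
```

Unlike the INT trap in traptest.sh, the EXIT trap does not depend on which signal, if any, ended the script.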
=== DEBUG trap to step through line by line === | |||
[https://twitter.com/b0rk/status/1312413117436104705 You can use the "DEBUG" trap to step through a bash script line by line] | |||
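A rough sketch of the idea, not copied from the linked post: bash runs the DEBUG trap before each simple command, and <code>$BASH_COMMAND</code> holds the command that is about to run. The handler name <code>step_trace</code> and the <code>STEP</code> variable are assumptions made for this example.

```bash
#!/bin/bash
# step_trace is a hypothetical handler: show the command that is about to
# run and, when STEP=1, wait for Enter on the terminal before continuing.
step_trace() {
    echo "about to run: $1" >&2
    if [ "${STEP:-0}" = 1 ]; then
        read -r -p "[Enter] to continue " _ < /dev/tty
    fi
}

# A subshell keeps the DEBUG trap from affecting the rest of the session.
(
    trap 'step_trace "$BASH_COMMAND"' DEBUG
    x=1
    y=$((x + 1))
    echo "y=$y"
)
```

Saved as a script and started with <code>STEP=1</code> in the environment, it pauses before every command; without it, the script only logs each command to stderr.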
== Bash shell find out if a command exists or not == | |||
http://www.cyberciti.biz/faq/unix-linux-shell-find-out-posixcommand-exists-or-not/ | |||
=== POSIX === | |||
* [https://en.wikipedia.org/wiki/POSIX Portable Operating System Interface] | |||
* [https://statisticsglobe.com/as-posixlt-function-r as.POSIXlt Function in R (2 Examples)] | |||
=== POSIX built-in commands === | |||
* '''command''' is one of bash built-in commands (alias, bind, command, declare, echo, help, let, printf, read, source, type, typeset, ulimit and unalias). | |||
* [https://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html Bash Builtin Commands] and [https://www.gnu.org/software/bash/manual/bashref.html#Shell-Builtin-Commands Shell Builtin Commands] | |||
* [http://ftp.gnu.org/gnu/bash/ Bash source code] | |||
* [https://unix.stackexchange.com/questions/319667/what-is-command-on-bash What is '''command''' on bash?] | |||
* [https://unix.stackexchange.com/questions/11454/what-is-the-difference-between-a-builtin-command-and-one-that-is-not What is the difference between a builtin command and one that is not?] | |||
* Use '''command''' command to tell if a command can be found. | |||
* Use '''type''' command to tell if a command is built-in. | |||
<syntaxhighlight lang='bash'> | |||
# command -v will return >0 when the command1 is not found | |||
command -v command1 >/dev/null && echo "command1 Found In \$PATH" || echo "command1 Not Found in \$PATH" | |||
$ help command | |||
command: command [-pVv] command [arg ...] | |||
Execute a simple command or display information about commands. | |||
Runs COMMAND with ARGS suppressing shell function lookup, or display | |||
information about the specified COMMANDs. Can be used to invoke commands | |||
on disk when a function with the same name exists. | |||
Options: | |||
-p use a default value for PATH that is guaranteed to find all of | |||
the standard utilities | |||
-v print a description of COMMAND similar to the `type' builtin | |||
-V print a more verbose description of each COMMAND | |||
Exit Status: | |||
Returns exit status of COMMAND, or failure if COMMAND is not found. | |||
$ type command | |||
command is a shell builtin | |||
$ type export | |||
export is a shell builtin | |||
$ type wget | |||
wget is /usr/bin/wget | |||
$ type tophat | |||
-bash: type: tophat: not found | |||
$ type sleep | |||
sleep is /bin/sleep | |||
$ command -v tophat | |||
$ command -v wget | |||
/usr/bin/wget | |||
</syntaxhighlight> | |||
On macOS, | |||
<syntaxhighlight lang='bash'> | |||
$ help command | |||
command: command [-pVv] command [arg ...] | |||
Runs COMMAND with ARGS ignoring shell functions. If you have a shell | |||
function called `ls', and you wish to call the command `ls', you can | |||
say "command ls". If the -p option is given, a default value is used | |||
for PATH that is guaranteed to find all of the standard utilities. If | |||
the -V or -v option is given, a string is printed describing COMMAND. | |||
The -V option produces a more verbose description. | |||
</syntaxhighlight> | |||
=== type -P === | |||
<pre> | |||
type -P command1 &>/dev/null && echo "Found" || echo "Not Found" | |||
$ help type | |||
type: type [-afptP] name [name ...] | |||
Display information about command type. | |||
For each NAME, indicate how it would be interpreted if used as a | |||
command name. | |||
Options: | |||
-a display all locations containing an executable named NAME; | |||
includes aliases, builtins, and functions, if and only if | |||
the `-p' option is not also used | |||
-f suppress shell function lookup | |||
-P force a PATH search for each NAME, even if it is an alias, | |||
builtin, or function, and returns the name of the disk file | |||
that would be executed | |||
-p returns either the name of the disk file that would be executed, | |||
or nothing if `type -t NAME' would not return `file'. | |||
-t output a single word which is one of `alias', `keyword', | |||
`function', `builtin', `file' or `', if NAME is an alias, shell | |||
reserved word, shell function, shell builtin, disk file, or not | |||
found, respectively | |||
Arguments: | |||
NAME Command name to be interpreted. | |||
Exit Status: | |||
Returns success if all of the NAMEs are found; fails if any are not found. | |||
typeset: typeset [-aAfFgilrtux] [-p] name[=value] ... | |||
Set variable values and attributes. | |||
Obsolete. See `help declare'. | |||
</pre> | |||
=== Find all bash builtin commands === | |||
https://www.cyberciti.biz/faq/linux-unix-bash-shell-list-all-builtin-commands/ | |||
<pre> | |||
$ help | |||
$ help | less | |||
$ help | grep read | |||
</pre> | |||
=== Find if a command is internal or external === | |||
<pre> | |||
$ type -a COMMAND-NAME-HERE | |||
$ type -a cd | |||
$ type -a uname | |||
$ type -a : | |||
$ command -V ls | |||
$ command -V cd | |||
$ command -V food | |||
</pre> | |||
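The '''command -v''' test can be wrapped in small functions to check a script's dependencies up front. This is a sketch only; <code>have_cmd</code> and <code>missing_cmds</code> are made-up names, not standard utilities.

```bash
#!/bin/bash
# have_cmd (hypothetical name) succeeds if the name resolves to something
# runnable: a file on PATH, a builtin, a function or an alias.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# missing_cmds (hypothetical name) prints every name that cannot be found,
# so all missing tools are reported at once.
missing_cmds() {
    local cmd
    for cmd in "$@"; do
        have_cmd "$cmd" || echo "$cmd"
    done
}

missing_cmds ls cd no_such_command_12345
```

Here <code>ls</code> is an external file and <code>cd</code> is a builtin, so only the made-up third name is printed.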
== pause by '''read -p''' command == | |||
http://www.cyberciti.biz/tips/linux-unix-pause-command.html | |||
<pre> | |||
read -p "Press [Enter] key to start backup..." | |||
</pre> | |||
If we want to ask users about a yes/no question, we can use [http://stackoverflow.com/questions/226703/how-do-i-prompt-for-input-in-a-linux-shell-script this method] | |||
<pre> | |||
while true; do | |||
read -p "Do you wish to install this program? " yn | |||
case $yn in | |||
[Yy]* ) make install; break;; | |||
[Nn]* ) exit;; | |||
* ) echo "Please answer yes or no.";; | |||
esac | |||
done | |||
</pre> | |||
OR | |||
<pre> | |||
echo "Do you wish to install this program?" | |||
select yn in "Yes" "No"; do | |||
case $yn in | |||
Yes ) make install; break;; | |||
No ) exit;; | |||
esac | |||
done | |||
</pre> | |||
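The yes/no loop can also be packaged as a reusable function whose exit status carries the answer. This is a sketch under the assumption that a helper named <code>confirm</code> is acceptable; the here-string only feeds a canned answer so the example runs unattended.

```bash
#!/bin/bash
# confirm is a hypothetical helper: returns 0 for yes, 1 for no, and asks
# again on anything else. It gives up (returns 1) when the input ends.
confirm() {
    local reply
    while true; do
        read -r -p "$1 [y/n] " reply || return 1
        case $reply in
            [Yy]*) return 0 ;;
            [Nn]*) return 1 ;;
            *) echo "Please answer yes or no." ;;
        esac
    done
}

# The here-string supplies the answer; drop it to prompt interactively.
if confirm "Do you wish to install this program?" <<< "yes"; then
    echo "would run: make install"
fi
```

Returning a status instead of calling <code>exit</code> inside the loop lets the caller decide what a "no" means.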
=== Keyboard input and Arithmetic === | |||
http://linuxcommand.org/wss0110.php | |||
read | |||
<pre> | |||
#!/bin/bash | |||
echo -n "Enter some text > " | |||
read text | |||
echo "You entered: $text" | |||
</pre> | |||
Arithmetic | |||
<pre> | |||
#!/bin/bash | |||
# An applications of the simple command | |||
# echo $((2+2)) | |||
# That is, when you surround an arithmetic expression with the double parentheses, | |||
# the shell will perform arithmetic evaluation. | |||
first_num=0 | |||
second_num=0 | |||
echo -n "Enter the first number --> " | |||
read first_num | |||
echo -n "Enter the second number -> " | |||
read second_num | |||
echo "first number + second number = $((first_num + second_num))" | |||
echo "first number - second number = $((first_num - second_num))" | |||
echo "first number * second number = $((first_num * second_num))" | |||
echo "first number / second number = $((first_num / second_num))" | |||
echo "first number % second number = $((first_num % second_num))" | |||
echo "first number raised to the" | |||
echo "power of the second number = $((first_num ** second_num))" | |||
</pre> | |||
and a program that formats an arbitrary number of seconds into hours and minutes: | |||
<pre> | |||
#!/bin/bash | |||
seconds=0 | |||
echo -n "Enter number of seconds > " | |||
read seconds | |||
# use the division operator to get the quotient | |||
hours=$((seconds / 3600)) | |||
# use the modulo operator to get the remainder | |||
seconds=$((seconds % 3600)) | |||
minutes=$((seconds / 60)) | |||
seconds=$((seconds % 60)) | |||
echo "$hours hour(s) $minutes minute(s) $seconds second(s)" | |||
</pre> | |||
== xargs == | |||
xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (the default command is echo, located at /bin/echo) one or more times with any initial-arguments followed by items read from standard input. | |||
* [https://en.wikipedia.org/wiki/Xargs Wikipedia] | |||
<ul> | |||
<li>[https://www.howtogeek.com/435164/how-to-use-the-xargs-command-on-linux/ How to Use the xargs Command on Linux]. Need to string some Linux commands together, but one of them doesn’t accept piped input. | |||
<syntaxhighlight lang='bash'> | |||
$ touch a.txt b.txt | |||
$ ls -1 ./*.txt | |||
./a.txt | |||
./b.txt | |||
$ ls -1 ./*.txt | xargs | |||
./a.txt ./b.txt | |||
</syntaxhighlight> | |||
</li> | |||
<li>[https://www.cloudsavvyit.com/7984/using-xargs-in-combination-with-bash-c-to-create-complex-commands/ Using xargs in Combination With bash -c to Create Complex Commands] | |||
</li> | |||
</ul> | |||
* [https://www.howtoforge.com/tutorial/linux-xargs-command/ 8 Practical Examples of Linux Xargs Command for Beginners] | |||
* [http://www.computerhope.com/unix/xargs.htm man] page | |||
=== Example1 - Find files named core in or below the directory /tmp and delete them === | |||
<syntaxhighlight lang='bash'> | |||
find /tmp -name core -type f -print0 | xargs -0 /bin/rm -f | |||
</syntaxhighlight> | |||
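where '''-0''' tells xargs to expect input items terminated by a null character instead of whitespace, matching the output of '''find -print0'''. Without it, file names containing blanks, quotes, or newlines are split or misparsed, and many commands will not work.
Another case: suppose I have a file named ''-sT''. ''rm'' parses the name as a set of options, so quoting it does not help. Prefixing the name with ''./'' (as the error message suggests) or passing it through find works.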
<syntaxhighlight lang='bash'> | |||
$ rm "-sT" | |||
rm: invalid option -- 's' | |||
Try 'rm ./-sT' to remove the file ‘-sT’. | |||
Try 'rm --help' for more information. | |||
$ ls *T
ls: option requires an argument -- 'T' | |||
Try 'ls --help' for more information. | |||
$ ls "*T" | |||
ls: cannot access *T: No such file or directory | |||
$ ls "*s*" | |||
ls: cannot access *s*: No such file or directory | |||
$ find . -maxdepth 1 -iname '*-sT' | |||
./-sT | |||
$ find . -maxdepth 1 -iname '*-sT' | xargs -0 /bin/rm -f  # does NOT work: find prints newline-terminated names here, not null-terminated ones
$ find . -maxdepth 1 -iname '*-sT' | xargs /bin/rm -f  # WORKS
</syntaxhighlight> | |||
Similarly, suppose I have a file of zero size named "-f3". A plain ''rm -f3'' fails because the name is parsed as options.
<syntaxhighlight lang='bash'> | |||
$ ls -lt | |||
total 448 | |||
-rw-r--r-- 1 mingc mingc 0 Jan 16 11:35 -f3 | |||
$ rm -f3 | |||
rm: invalid option -- '3' | |||
Try `rm ./-f3' to remove the file `-f3'. | |||
Try `rm --help' for more information. | |||
$ find . -size 0 -print0 |xargs -0 rm  # caution: removes every zero-size file under the current directory
</syntaxhighlight> | |||
=== Example2 - Find files from the grep command and sort them by date ===
<syntaxhighlight lang='bash'> | |||
grep -l "Polyphen" tmp/*.* | xargs ls -lt | |||
</syntaxhighlight> | |||
=== Example3 - [http://stackoverflow.com/questions/4341442/gzip-with-all-cores Gzip with multiple jobs] === | |||
<syntaxhighlight lang='bash'> | |||
CORES=$(grep -c '^processor' /proc/cpuinfo) | |||
find /source -type f -print0 | xargs -0 -n 1 -P $CORES gzip -9 | |||
</syntaxhighlight> | |||
where | |||
* find -print0 / xargs -0 protects you from whitespace in filenames | |||
* xargs -n 1 means one gzip process per file | |||
* xargs -P specifies the number of jobs | |||
* gzip -9 means maximum compression | |||
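The whitespace point can be checked with a throwaway directory. The sketch below (file names invented for the demonstration) counts how many arguments xargs hands to its command with and without '''-0'''.

```bash
#!/bin/bash
# Create two files, one of which has a space in its name.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt"

# Null-terminated names: each file arrives as exactly one argument,
# so printf prints two lines.
safe_count=$(find "$demo_dir" -type f -print0 | xargs -0 printf '%s\n' | wc -l)

# Default splitting on blanks: "with space.txt" becomes two arguments,
# so printf prints three lines.
naive_count=$(find "$demo_dir" -type f | xargs printf '%s\n' | wc -l)

echo "with -0: $safe_count arguments; without -0: $naive_count arguments"
rm -rf "$demo_dir"
```

The example assumes the temporary directory path itself contains no blanks, which is the usual case for <code>mktemp -d</code>.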
== [https://en.wikipedia.org/wiki/GNU_parallel GNU Parallel] == | |||
* http://www.gnu.org/software/parallel/ | |||
* https://www.gnu.org/software/parallel/parallel_tutorial.html | |||
* https://www.biostars.org/p/63816/ | |||
* https://biowize.wordpress.com/2015/03/23/task-automation-with-bash-and-parallel/ | |||
* http://www.shakthimaan.com/posts/2014/11/27/gnu-parallel/news.html | |||
* https://www.msi.umn.edu/support/faq/how-can-i-use-gnu-parallel-run-lot-commands-parallel | |||
* http://deepdish.io/2014/09/15/gnu-parallel/ | |||
* http://davetang.org/muse/2013/11/18/using-gnu-parallel/ | |||
* https://vimeo.com/20838834, https://youtu.be/OpaiGYxkSuQ | |||
A simple trick without using GNU Parallel is [[#Example_7:_run_in_parallel|run the commands in background]]. | |||
=== Example: same command, different command line argument === | |||
Input from the command line ([https://www.gnu.org/software/parallel/man.html#SYNOPSIS Synopsis] about the triple colon ":::"): | |||
<syntaxhighlight lang='bash'> | |||
parallel echo ::: A B C | |||
parallel gzip --best ::: *.html # '--best' means best compression | |||
parallel gunzip ::: *.CEL.gz | |||
</syntaxhighlight> | |||
Input from a file: | |||
<syntaxhighlight lang='bash'> | |||
parallel -a abc-file echo | |||
</syntaxhighlight> | |||
Input from STDIN:
<syntaxhighlight lang='bash'> | |||
cat abc-file | parallel echo | |||
find . -iname "*after*" | parallel wc -l | |||
</syntaxhighlight> | |||
Another similar example is to gzip each individual file, as in the '''parallel gzip --best''' line above.
=== Example: each command containing an index === | |||
Instead of | |||
<syntaxhighlight lang='bash'> | |||
for i in $(seq 1 100) | |||
do | |||
someCommand data$i.fastq > output$i.txt & | |||
done | |||
</syntaxhighlight> | |||
, we can use | |||
<syntaxhighlight lang='bash'> | |||
parallel --jobs 16 someCommand data{}.fastq '>' output{}.txt ::: {1..100} | |||
</syntaxhighlight> | |||
=== Example: each command not containing an index === | |||
<syntaxhighlight lang='bash'> | |||
for i in *gz; do | |||
zcat "$i" > "$(basename "$i" .gz).unpacked"
done | |||
</syntaxhighlight> | |||
can be written as | |||
<syntaxhighlight lang='bash'> | |||
parallel 'zcat {} > {.}.unpacked' ::: *.gz | |||
</syntaxhighlight> | |||
=== Example: run several subscripts from a master script === | |||
Suppose I have a bunch of script files: script1.sh, script2.sh, ... And an optional master script (file ext does not end with .sh). | |||
My goal is to run them using GNU Parallel. | |||
I can just run them using | |||
<syntaxhighlight lang='bash'> | |||
parallel './{}' ::: *.sh | |||
</syntaxhighlight> | |||
where "./" means the .sh files are located in the current directory and {} denotes each individual .sh file. | |||
More detail: | |||
<syntaxhighlight lang='bash'> | |||
$ mkdir test-par; cd test-par | |||
$ echo echo A > script1.sh | |||
$ echo echo B > script2.sh | |||
$ echo echo C > script3.sh | |||
$ echo echo D > script4.sh | |||
$ chmod +x *.sh | |||
$ cat > script # master script (not needed for GNU parallel method) | |||
./script1.sh | |||
./script2.sh | |||
./script3.sh | |||
./script4.sh | |||
$ time bash script | |||
A | |||
B | |||
C | |||
D | |||
real 0m0.025s | |||
user 0m0.004s | |||
sys 0m0.004s | |||
$ time parallel './{}' ::: *.sh # No need of a master script | |||
# may need to add --gnu option if asked. | |||
A | |||
B | |||
C | |||
D | |||
real 0m0.778s | |||
user 0m0.588s | |||
sys 0m0.144s # longer time because of the parallel overhead | |||
</syntaxhighlight> | |||
=== Note === | |||
* When I run scripts (seqtools_vc) sequentially I can get the standard output on screen. However, I may not get these output when I use GNU parallel. | |||
* There is a risk/problem if all scripts are trying to generate required/missing files when they detect the required files are absent. | |||
== [https://github.com/shenwei356/rush rush] - cross-platform tool for executing jobs in parallel == | |||
== Debugging Scripts == | |||
* [https://www.tecmint.com/enable-shell-debug-mode-linux/ How To Enable Shell Script Debugging Mode in Linux] (very good) Some options (note options can be used in 1. the '''set''' command 2. the first line of the shell file or 3. the terminal where the shell is invoked) | |||
** -e: exit if a command yields a nonzero exit status | |||
** -v: short for verbose | |||
** -n: short for noexec or no execution
** -x: short for xtrace or execution trace | |||
* [http://www.tecmint.com/trace-shell-script-execution-in-linux/ How to Trace Execution of Commands in Shell Script with Shell Tracing] | |||
* [https://www.tecmint.com/check-syntax-in-shell-script/ How to Perform Syntax Checking Debugging Mode in Shell Scripts] | |||
* http://www.cyberciti.biz/tips/debugging-shell-script.html | |||
Run a shell script with the -x option. Each command of the script is then printed to stderr as it runs, so we can see which line takes a long time or which line broke the code (''it still runs through the script'').
<pre> | |||
$ bash -x script-name | |||
</pre> | |||
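When the trace is long it helps to know which line each command came from. Bash expands the <code>PS4</code> variable in front of every traced command, so it can carry <code>$LINENO</code>. The sketch below assumes a made-up function name, <code>trace_demo</code>, and runs in a subshell so the settings do not leak.

```bash
#!/bin/bash
# trace_demo is a hypothetical function; the parentheses run it in a
# subshell so that set -x and PS4 do not affect the calling shell.
trace_demo() (
    PS4='+ line ${LINENO}: '   # single quotes: expanded again for every traced command
    set -x
    greeting=hello
    echo "$greeting"
)

trace_demo 2>&1   # the trace is written to stderr; merge it with stdout here
```

Each traced command is then prefixed with its line number, followed by the normal output of the script.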
* Use of set builtin command | |||
* Use of intelligent DEBUG function | |||
To run a bash script line by line: | |||
* [http://bashdb.sourceforge.net/ Bash Debugger] | |||
* Use '''Geany'''. See the next section.
=== Geany === | |||
* (Ubuntu 12.04 only): By default, it does not have the terminal tab. Install virtual terminal emulator. Run | |||
<syntaxhighlight lang='bash'> | |||
sudo apt-get install libvte-dev | |||
</syntaxhighlight> | |||
* Step 1: Keyboard shortcut. Select a region of code. Edit -> Commands -> Send Selection to Terminal. You can also assign a keybinding for this. To do so, go to Edit -> Preferences and pick the Keybindings tab. See a screenshot [http://askubuntu.com/questions/528367/shortcut-to-send-selection-to-terminal-in-geany here]. I assign F12 (without quotes) as the shortcut. [http://www.geany.org/manual/current/#keybindings This is a complete list of the keybindings].
* Step 2: Newline character. Another issue is that the last line of sent code does not have a newline character. So I need to switch to the Terminal and press Enter. The solution is to modify the <geany.conf> (find its location using locate geany.conf. On my ubuntu 14 (geany 1.26), it is under '''~/.config/geany/geany.conf''') and set send_selection_unsafe=true. See [http://www.r-bloggers.com/using-geany-for-programming-in-r/ here]. | |||
* Step 3: PATH variable. | |||
<pre> | |||
$ tmpname=$(basename $inputVCF) | |||
Command 'basename' is available in '/usr/bin/basename' | |||
The command could not be located because '/usr/bin' is not included in the PATH environment variable. | |||
</pre> | |||
The solution is to run '''PATH=$PATH:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin''' in the Terminal window before running our script. | |||
* Step 4 (optional): Change background color. | |||
Another handy change to geany is to change its background to black. To do that, go to Edit -> Preferences -> Editor. Once on the Editor options level, select the Display tab to the far right of the dialog, and you will notice a checkbox marked ''invert syntax highlighting colors''. | |||
See [https://ask.fedoraproject.org/en/question/25734/how-to-set-gnome-terminal-in-geany-instead-of-xterm/ this post] about changing the default terminal in the ''Terminal'' window. The default is xterm (see the output of '''echo $TERM'''). | |||
== Examples == | |||
* <[http://nebc.nerc.ac.uk/downloads/bl8_only/upgrade8.sh upgrade8.sh]> file from [http://environmentalomics.org/bio-linux-installation/ BioLinux installation] page | |||
* [http://padamson.github.io/r/shiny/2016/03/13/install-required-r-packages.html Install required R packages] using a mixture of bash and R. | |||
== How to wrap a long linux command == | |||
Use the backslash character. Make sure the backslash is the very last character on the line. For example, the first example below does not work because there is an extra space character after the \.
Example 1 (does not work)
<pre> | |||
sudo apt-get install libcap-dev libbz2-dev libgcrypt11-dev libpci-dev libnss3-dev libxcursor-dev \ | |||
libxcomposite-dev libxdamage-dev libxrandr-dev libdrm-dev libfontconfig1-dev libxtst-dev \ | |||
libcups2-dev libpulse-dev libudev-dev | |||
</pre> | |||
vs example 2 (works)
<pre> | |||
sudo apt-get install libcap-dev libbz2-dev libgcrypt11-dev libpci-dev libnss3-dev libxcursor-dev \ | |||
libxcomposite-dev libxdamage-dev libxrandr-dev libdrm-dev libfontconfig1-dev libxtst-dev \ | |||
libcups2-dev libpulse-dev libudev-dev | |||
</pre> | |||
== Command line path navigation == | |||
'''pushd''' and '''popd''' are used to switch between multiple directories without copying and pasting directory paths. They operate on a stack, a last-in, first-out ('''LIFO''') data structure.
<syntaxhighlight lang='bash'> | |||
pushd /var/www | |||
pushd /usr/src | |||
dirs | |||
pushd +2 | |||
popd
</syntaxhighlight>
When we have only two locations, an alternative and easier way is '''cd -'''.
<syntaxhighlight lang='bash'> | |||
cd /usr/src | |||
# Do something | |||
cd /var/www | |||
cd - # /usr/src | |||
</syntaxhighlight> | |||
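In scripts, the directory stack is handy for running one command somewhere else and then coming back. A minimal sketch, assuming a helper named <code>run_in_dir</code> (not a standard command):

```bash
#!/bin/bash
# run_in_dir is a hypothetical helper: run a command inside another
# directory, then return to the directory we started in.
run_in_dir() {
    local dir=$1 status=0
    shift
    pushd "$dir" > /dev/null || return 1   # save the current directory, cd to $dir
    "$@" || status=$?
    popd > /dev/null || return 1           # pop back to the saved directory
    return "$status"
}

run_in_dir / pwd      # prints /
pwd                   # still the original directory
```

The output of pushd and popd is discarded because both print the stack every time they are called.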
== bd – Quickly Go Back to a Parent Directory == | |||
* https://www.tecmint.com/bd-quickly-go-back-to-a-linux-parent-directory/ | |||
* https://raw.github.com/vigneshwaranr/bd/master/bd | |||
== Create log file == | |||
* Create a log file with date | |||
<syntaxhighlight lang='bash'> | |||
logfile="output_$(date +"%Y%m%d%H%M").log" | |||
</syntaxhighlight> | |||
* Redirect the error to a log file | |||
<syntaxhighlight lang='bash'> | |||
logfile="output_$(date +"%Y%m%d%H%M").log" | |||
module load XXX || exit 1 | |||
echo "All output redirected to '$logfile'" | |||
set -ex | |||
exec 2>$logfile | |||
# Task 1 (times are seconds since the epoch)
start_time=$(date +%s)
# Do something with possible error output
end_time=$(date +%s)
echo "Task 1 Started: $start_time; Ended: $end_time; Elapsed time: $((end_time - start_time)) sec" >> "$logfile"
# Task 2
start_time=$(date +%s)
# Do something with possible error output
end_time=$(date +%s)
echo "Task 2 Started: $start_time; Ended: $end_time; Elapsed time: $((end_time - start_time)) sec" >> "$logfile"
</syntaxhighlight> | |||
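The repeated start/end bookkeeping can be folded into one function. The following is a sketch only: <code>run_task</code> is a made-up name, and the log file is placed in the temporary directory here purely so the example does not clutter the working directory.

```bash
#!/bin/bash
# Log file named after the current date and time, as in the example above.
logfile="${TMPDIR:-/tmp}/output_$(date +%Y%m%d%H%M).log"

# run_task (hypothetical name): run a command, send its error output to the
# log, then append the start/end times (epoch seconds) and elapsed time.
run_task() {
    local name=$1 start_time end_time status=0
    shift
    start_time=$(date +%s)
    "$@" 2>> "$logfile" || status=$?
    end_time=$(date +%s)
    echo "$name: start=$start_time end=$end_time elapsed=$((end_time - start_time)) sec status=$status" >> "$logfile"
    return "$status"
}

run_task "Task 1" sleep 1
tail -n 1 "$logfile"
```

Because the function returns the command's own exit status, it still works together with '''set -e''' or an explicit <code>|| exit 1</code>.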
= Text processing = | |||
== tr (similar to sed) == | |||
''tr operates on sets of characters; it does not take general regular expressions.''
The '''tr''' utility copies its input to its output, substituting or deleting selected characters. The name '''tr''' is short for translate or transliterate.
* http://www.thegeekstuff.com/2012/12/linux-tr-command/ | |||
* http://www.cyberciti.biz/faq/how-to-use-linux-unix-tr-command/ | |||
* https://www.howtoforge.com/linux-tr-command/ | |||
It will read from STDIN and write to STDOUT. The syntax is | |||
<syntaxhighlight lang='bash'> | |||
tr [OPTION] SET1 [SET2] | |||
</syntaxhighlight> | |||
If both SET1 and SET2 are specified and the '-d' option is not, tr replaces each character in SET1 with the character at the same position in SET2. For example,
<syntaxhighlight lang='bash'> | |||
# translate to uppercase | |||
$ echo 'linux' | tr "[:lower:]" "[:upper:]" | |||
# Translate braces into parenthesis | |||
$ tr '{}' '()' < inputfile > outputfile | |||
# Replace comma with line break | |||
$ tr ',' '\n' < inputfile | |||
# Split a long line using the space | |||
$ echo $line | tr ' ' '\n' | |||
# Translate white-space to tabs (quote the class so the shell does not treat it as a glob)
$ echo "This is for testing" | tr '[:space:]' '\t'
# Join/merge all the lines in a file into a single line
$ tr -s '\n' ' ' < file.txt
# Note: sed cannot match \n as easily as tr can.
# See | |||
# http://stackoverflow.com/questions/1251999/how-can-i-replace-a-newline-n-using-sed | |||
# https://unix.stackexchange.com/questions/26788/using-sed-to-convert-newlines-into-spaces | |||
</syntaxhighlight> | |||
tr can also be used to remove particular characters using -d option. For example, | |||
<syntaxhighlight lang='bash'> | |||
$ echo "the geek stuff" | tr -d 't' | |||
he geek suff | |||
$ tr -d "\15" < input > output # octal 15 is the carriage return character (\r)
</syntaxhighlight> | |||
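Deleting octal 15 is the classic way to convert DOS line endings to Unix ones. A small sketch, with <code>dos_to_unix</code> as a made-up function name:

```bash
#!/bin/bash
# dos_to_unix (hypothetical name): delete every carriage return
# (octal 015, written \r) from standard input.
dos_to_unix() {
    tr -d '\r'
}

printf 'line one\r\nline two\r\n' | dos_to_unix
```

Note that this removes every carriage return, not only those directly before a newline.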
A practical example | |||
<syntaxhighlight lang='bash'> | |||
#!/bin/bash | |||
echo -n "Enter file name : " | |||
read myfile | |||
echo -n "Are you sure ( yes or no ) ? " | |||
read confirmation | |||
confirmation="$(echo "${confirmation}" | tr 'A-Z' 'a-z')"
if [ "$confirmation" == "yes" ]; then
[ -f "$myfile" ] && /bin/rm "$myfile" || echo "Error - file $myfile not found"
else | |||
: # do nothing | |||
fi | |||
</syntaxhighlight> | |||
Second example | |||
<syntaxhighlight lang='bash'> | |||
$ ifconfig | cut -c-10 | tr -d ' ' | tr -s '\n' | |||
eth0 | |||
eth1 | |||
ip6tnl0 | |||
lo | |||
sit0 | |||
# without tr -s '\n' | |||
eth0 | |||
eth1 | |||
ip6tnl0 | |||
lo | |||
sit0 | |||
</syntaxhighlight> | |||
where tr -d ' ' deletes every space character in each line. The \n newline character is squeezed using tr -s '\n' to produce a list of interface names. We use cut to extract the first 10 characters of each line. | |||
== Regular Expression and grep == | |||
* https://regexper.com/ You can type for example '[a-z]*.[0-9]' to see what it is doing. | |||
** ( ?[a-zA-Z]+ ?) match all words in a given text | |||
** [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} match an IP address | |||
* [http://www.thegeekstuff.com/2009/03/15-practical-unix-grep-command-examples/ 15 Practical Grep Command Examples In Linux] | |||
* [https://www.cyberciti.biz/faq/sed-remove-last-character-from-each-line/ Sed bracket expressions]. sed remove last character from each line. | |||
* A period matches any single character. [https://www.digitalocean.com/community/tutorials/using-grep-regular-expressions-to-search-for-text-patterns-in-linux Using Grep & Regular Expressions to Search for Text Patterns in Linux]
* Linux command line: '''grep PATTERN FILENAME''' or '''grep -E 'PATTERN1|PATTERN2' FILENAME''' (extended regular expression) | |||
<syntaxhighlight lang='bash'> | |||
echo -e "today is Monday\nHow are you" | grep Monday | |||
grep -E "[a-z]+" filename | |||
# or | |||
egrep "[a-z]+" filename | |||
grep -i PATTERN FILENAME # ignore case | |||
grep -v PATTERN FILENAME # inverse match | |||
grep -c PATTERN FILENAME # count the number of lines in which a matching string appears | |||
grep -n PATTERN FILENAME # print the line number | |||
grep -R PATTERN DIR # recursively search many files and follow symbolic links | |||
grep -r PATTERN DIR # recursively search many files | |||
grep -e "pattern1" -e "pattern2" FILENAME # multiple patterns, OR operation
grep -E 'pattern1|pattern2' FILENAME # same, using extended regex alternation (egrep is a deprecated alias for grep -E)
grep -f PATTERNFILE FILENAME # PATTERNFILE contains patterns line-by-line | |||
grep -F PATTERN FILENAME # Interpret PATTERN as a list of fixed strings, separated by | |||
# newlines, any of which is to be matched. | |||
grep -r --include \*.Rmd --include \*.R "file\.csv" ./ # search with only Rmd & R files | |||
grep -r --exclude "README" PATTERN DIR # excluding files in which to search | |||
grep -o '<dt>.*</dt>' FILENAME # print only the matched string (<dt> .... </dt>); quote the pattern so the shell does not treat < and > as redirections
grep -w PATTERN FILENAME # match whole words only, not sub-strings
grep -E -w "SRR2923335\.1|SRR2923335\.1999" FILENAME # match whole words (either SRR2923335.1 or SRR2923335.1999)
</syntaxhighlight> | |||
* Extract the IP address from ifconfig command | |||
<syntaxhighlight lang='bash'> | |||
$ ifconfig eth1 | |||
eth1 Link encap:Ethernet HWaddr 00:14:d1:b0:df:9f | |||
inet addr:192.168.1.172 Bcast:192.168.1.255 Mask:255.255.255.0 | |||
inet6 addr: fe80::214:d1ff:feb0:df9f/64 Scope:Link | |||
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 | |||
RX packets:29113 errors:0 dropped:0 overruns:0 frame:0 | |||
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 | |||
collisions:0 txqueuelen:1000 | |||
RX bytes:28561660 (28.5 MB) TX bytes:3516957 (3.5 MB) | |||
$ ifconfig eth1 | egrep -o "inet addr:[^ ]*" | grep -o "[0-9.]*" | |||
192.168.1.172 | |||
</syntaxhighlight> | |||
where egrep -o "inet addr:[^ ]*" matches text that starts with inet addr: and continues through the following run of non-space characters (specified by [^ ]*). The second grep then prints only the run of digits and '.' characters.
=== --include option === | |||
<ul> | |||
<li>[https://stackoverflow.com/a/10628271 how do I use the grep --include option for multiple file types?] You can use multiple --include flags. '''grep -r --include=*.{html,php,htm} "pattern" /some/path/''' | |||
<syntaxhighlight lang='bash'> | |||
grep -r --include=*.{c,cpp} PATTERN DIR # brace expansion yields --include=*.c --include=*.cpp
</syntaxhighlight> | |||
<li>[https://stackoverflow.com/a/24197797 grep --include command doesn't work in OSX Zsh]. The trick is to use '''quotes'''. | |||
<syntaxhighlight lang='bash'> | |||
grep -rl --include='*.Rmd' "pattern" ./ | |||
grep --include='*.rb' --include='*.h*' -rnw . -e "pattern"
</syntaxhighlight> | |||
</ul> | |||
=== Bash Find Out IF a Variable Contains a Substring === | |||
* [https://www.cyberciti.biz/faq/bash-find-out-if-variable-contains-substring/ Bash Find Out IF a Variable Contains a Substring] | |||
* [https://www.howtogeek.com/825503/how-to-tell-if-a-bash-string-contains-a-substring-on-linux/ How to Tell If a Bash String Contains a Substring on Linux] | |||
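One common way to do this, sketched here without relying on the linked articles, is a <code>case</code> pattern with wildcards on both sides. The function name <code>contains</code> is an assumption for illustration.

```bash
#!/bin/bash
# contains (hypothetical name) succeeds if the string $2 occurs inside $1.
# Quoting "$2" inside the pattern makes it match literally, even if it
# contains glob characters such as * or ?.
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

contains "hello world" "lo wo" && echo "substring found"
```

The same test can be written in bash as <code>[[ $string == *"$substring"* ]]</code>; the <code>case</code> form also works in plain POSIX shells.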
=== grep returns TRUE or FALSE === | |||
[https://unix.stackexchange.com/a/48536 Can grep return true/false or are there alternative methods] | |||
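In practice the true/false answer is grep's exit status: 0 when a line matched, 1 when none did. With '''-q''' nothing is printed, so grep can be used directly as a condition. A small sketch, using a made-up function name <code>stream_matches</code>:

```bash
#!/bin/bash
# stream_matches (hypothetical name): succeed if any line on standard
# input matches the pattern given as $1. -q suppresses all output.
stream_matches() {
    grep -q -e "$1"
}

if printf 'today is Monday\nHow are you\n' | stream_matches 'Monday'; then
    echo "match"
else
    echo "no match"
fi
```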
== less -S: print long lines == | |||
Causes lines longer than the screen width to be chopped rather than folded. [https://www.man7.org/linux/man-pages/man1/less.1.html man less]. | |||
== cut: extract columns or character positions from text files == | |||
http://www.thegeekstuff.com/2013/06/cut-command-examples/ | |||
<syntaxhighlight lang='bash'> | |||
cut -f 5-7 somefile # columns 5-7. | |||
cut -c 5-7 somefile # character positions 5-7 | |||
</syntaxhighlight> | |||
'''The default delimiter is TAB'''. If the field delimiter is different from TAB you need to specify it using -d: | |||
<syntaxhighlight lang='bash'> | |||
cut -d' ' -f100-105 myfile > outfile | |||
# | |||
cut -d: -f6 somefile # colon-delimited file | |||
# | |||
grep "/bin/bash" /etc/passwd | cut -d':' -f1-4,6,7 # field 1 through 4, 6 and 7 | |||
cut -f3 --complement somefile # print all the columns except the third column | |||
</syntaxhighlight> | |||
To set the output delimiter, use --output-delimiter. NOTE that to pass a Tab character to '''cut''', write it as $'\t' (bash ANSI-C quoting). See http://www.computerhope.com/unix/ucut.htm. For example,
<syntaxhighlight lang='bash'> | |||
cut -f 1,3 -d ':' --output-delimiter=$'\t' somefile | |||
</syntaxhighlight> | |||
If I am not sure about the number of the final field, I can leave the number off. | |||
<syntaxhighlight lang='bash'> | |||
cut -f 1- -d ':' --output-delimiter=$'\t' somefile | |||
</syntaxhighlight> | |||
=== A simple shell function to show the first 3 columns and 3 rows of the matrix === | |||
<syntaxhighlight lang='sh'> | |||
function show_matrix() { | |||
if [ -z "$1" ] || [ -z "$2" ]; then | |||
echo "Usage: show_matrix <filename> <delimiter>" | |||
return 1 | |||
fi | |||
if [ "$2" != "tab" ] && [ "$2" != "comma" ]; then | |||
echo "Delimiter must be 'tab' or 'comma'" | |||
return 1 | |||
fi | |||
if [ "$2" == "tab" ]; then | |||
cut -f1-3 "$1" | head -n 3 | |||
elif [ "$2" == "comma" ]; then | |||
cut -d',' -f1-3 "$1" | head -n 3 | |||
fi | |||
} | |||
# show_matrix data.txt tab | |||
# show_matrix data.txt comma | |||
</syntaxhighlight> | |||
== awk: operate on rows and/or columns == | |||
'''awk''' is a tool designed to work with data streams. It can operate on columns and rows. It supports many built-in features, such as arrays and functions, with a syntax similar to the C programming language. Its biggest advantage is its flexibility.
* https://en.wikipedia.org/wiki/AWK | |||
* https://www.tutorialspoint.com/awk/awk_workflow.htm | |||
* http://www.thegeekstuff.com/2010/01/awk-introduction-tutorial-7-awk-print-examples | |||
* http://www.theunixschool.com/p/awk-sed.html | |||
* http://www.grymoire.com/Unix/Awk.html | |||
* https://www.howtogeek.com/562941/how-to-use-the-awk-command-on-linux/ | |||
* [https://www.networkworld.com/article/3454979/the-many-faces-of-awk.html The many faces of awk] | |||
** Plucking out columns of data | |||
** Printing simple text | |||
** Doing math with awk | |||
Structure of an awk script | |||
<syntaxhighlight lang='bash'> | |||
awk pattern { action } | |||
awk ' BEGIN{ print "start" } pattern { AWK commands } END { print "end" } ' file | |||
</syntaxhighlight> | |||
All three components ('''BEGIN''', '''END''' and the main block with its optional '''pattern''') are optional, and any of them can be absent from the script. The pattern is also called a '''condition'''.
By default, fields are separated by runs of whitespace (spaces or tabs).
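A short sketch of how default field splitting differs from an explicit separator; the input strings are made up for illustration.

```bash
# Default splitting: any run of spaces/tabs separates fields,
# and leading blanks are ignored.
printf '  a   b\tc\n' | awk '{ print NF ": " $2 }'       # 3: b

# With an explicit single-character separator every occurrence counts,
# so empty fields are preserved.
printf 'a::c\n' | awk -F: '{ print NF ": [" $2 "]" }'     # 3: []
```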
Some examples: | |||
<syntaxhighlight lang='bash'> | |||
awk 'BEGIN { i=0 } { i++ } END { print i}' filename | |||
echo -e "line1\nline2" | awk 'BEGIN { print "start" } { print } END { print "End" }' | |||
seq 5 | awk 'BEGIN { sum=0; print "Summation:" } { print $1"+"; sum+=$1 } END { print "=="; print sum }' | |||
awk -F : '{print $6}' somefile # colon-delimited file, print the 6th field (cut can do it) | |||
# | |||
awk --field-separator="\\t" '{print $6}' filename # tab-delimited, GNU awk long option (cut can do it)
awk -F":" '{ print $1 " " $3 }' /etc/passwd # (cut can do it) | |||
awk -F "\t" '{OFS="\t"} {$1="mouse"$1; print $0}' genes.gtf > genescb.gtf | |||
# or | |||
awk -F "\t" 'BEGIN {OFS="\t"} {$1="mouse"$1; print $0}' genes.gtf > genescb.gtf | |||
# replace ELEMENT with mouseELEMENT for data on the 1st column; tab separator was used for input (-F) and output (OFS) | |||
awk 'NR % 4 == 1 {print ">" $0 } NR % 4 == 2 {print $0}' input > output | |||
# extract rows 1,2,5,6,9,10,13,14,.... from input | |||
awk 'NR % 4 == 0 {print ">" $0 } NR % 4 == 3 {print $0}' input > output | |||
# extract rows 3,4,7,8,11,12,15,16,.... from input | |||
awk '(NR==2),(NR==4) {print $0}' input | |||
# print rows 2-4. | |||
awk '{ print ($1-32)*(5/9) }' | |||
# fahrenheit-to-celsius calculator, http://www.hcs.harvard.edu/~dholland/computers/awk.html | |||
# http://stackoverflow.com/questions/3700957/printing-lines-from-a-file-where-a-specific-field-does-not-start-with-something | |||
awk '$7 !~ /^mouse/ { print $0 }' input # column 7 not starting with 'mouse' | |||
awk '$7 ~ /^mouse/ { print $0 }' input # column 7 starting with 'mouse' | |||
awk '$7 ~ /mouse/ { print $0 }' input # column 7 containing 'mouse' | |||
</syntaxhighlight> | |||
AWK is useful for selecting or counting a subset of rows or columns. It is less commonly used for string substitution, where sed is the usual choice.
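As an illustration of counting, the sketch below tallies how often each value occurs in one column using an awk array. The helper name count_column and the sample records are assumptions made for this example.

```bash
# count_column SEP COL: read records on stdin and print "count value"
# for every distinct value in column COL, most frequent first.
# 'count_column' is a hypothetical helper name.
count_column() {
    awk -F"$1" -v col="$2" '{ n[$col]++ } END { for (k in n) print n[k], k }' |
        sort -k1,1nr -k2
}

printf 'u1:/bin/sh\nu2:/bin/bash\nu3:/bin/sh\n' | count_column : 2
# 2 /bin/sh
# 1 /bin/bash
```

This gives the same kind of result as cut | sort | uniq -c | sort -nr, but in a single awk pass over the input.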
=== Print the string between two parentheses === | |||
https://unix.stackexchange.com/questions/108250/print-the-string-between-two-parentheses | |||
<syntaxhighlight lang='bash'> | |||
$ awk -F"[()]" '{print $2}' file | |||
$ echo ">gi|52546690|ref|NM_001005239.1| subfamily H, member 1 (OR11H1), mRNA" | awk -F"[()]" '{print $2}' | |||
OR11H1 | |||
$ echo ">gi|284172348|ref|NM_002668.2| proteolipid protein 2 (colonic epithelium-enriched) (PLP2), mRNA" | awk -F"[()]" '{print $2}' | |||
colonic epithelium-enriched # WRONG | |||
</syntaxhighlight> | |||
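One possible fix for the second case, assuming the wanted name always sits in the last pair of (non-nested) parentheses: with "(" and ")" as separators, that text is the second-to-last field, $(NF-1). The helper name last_paren is made up for this sketch.

```bash
# last_paren: print the text inside the last pair of parentheses on each line.
# Lines without parentheses have fewer than 3 fields and are skipped.
last_paren() {
    awk -F'[()]' 'NF >= 3 { print $(NF-1) }'
}

echo ">gi|52546690|ref|NM_001005239.1| subfamily H, member 1 (OR11H1), mRNA" | last_paren
# OR11H1
echo ">gi|284172348|ref|NM_002668.2| proteolipid protein 2 (colonic epithelium-enriched) (PLP2), mRNA" | last_paren
# PLP2
```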
=== Insert a line === | |||
https://stackoverflow.com/a/18276534 | |||
<pre> | |||
awk '/KEYWORDS/ { print; print "new line"; next }1' foo.input | |||
</pre> | |||
=== Count number of columns in file === | |||
https://stackoverflow.com/a/8629351 | |||
<pre> | |||
awk -F'|' '{print NF; exit}' stores.dat # Change '|' as needed | |||
</pre> | |||
== sed (stream editor): substitution of text == | |||
* https://en.wikipedia.org/wiki/Sed | |||
By default, ''sed'' writes the edited text to standard output and leaves the input file unchanged. To save the changes back to the same file, use the '''-i''' option.
<syntaxhighlight lang='bash'> | |||
sed 's/text/replace/' file > newfile | |||
mv newfile file | |||
# OR better | |||
sed -i 's/text/replace/' file | |||
</syntaxhighlight> | |||
The '''sed''' command will replace the first occurrence of the pattern in each line. If we want to replace every occurrence, we need to add the '''g''' parameter at the end, as follows: | |||
<syntaxhighlight lang='bash'> | |||
sed -i 's/pattern/replace/g' file | |||
</syntaxhighlight> | |||
To remove blank lines | |||
<syntaxhighlight lang='bash'> | |||
sed '/^$/d' filename | |||
</syntaxhighlight> | |||
To [http://serverfault.com/questions/466118/using-sed-to-remove-both-an-opening-and-closing-square-bracket-around-a-string remove square brackets] | |||
<syntaxhighlight lang='bash'> | |||
# method 1. replace ] & [ by the empty string | |||
$ echo '00[123]44' | sed 's/[][]//g' | |||
0012344 | |||
# method 2 - use tr | |||
$ echo '00[123]00' | tr -d '[]' | |||
0012300 | |||
</syntaxhighlight> | |||
To replace all three-digit numbers with another specified word in a file | |||
<syntaxhighlight lang='bash'> | |||
sed -i 's/\b[0-9]\{3\}\b/NUMBER/g' filename | |||
echo -e "I love 111 but not 1111." | sed 's/\b[0-9]\{3\}\b/NUMBER/g' | |||
</syntaxhighlight> | |||
where \{3\} matches the preceding element exactly three times (the backslashes give { and } their special meaning in basic regular expressions) and \b is the word boundary marker.
Variable string and quoting | |||
<syntaxhighlight lang='bash'> | |||
text=hello | |||
echo hello world | sed "s/$text/HELLO/" | |||
</syntaxhighlight> | |||
With double quotes, the shell expands $text before sed sees the expression. With single quotes, the expression is passed to sed unchanged.
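The difference between the two quoting styles can be seen side by side in this small sketch (the variable and strings are the ones used above):

```bash
text=hello

# Double quotes: the shell replaces $text with "hello" before sed runs.
echo "hello world" | sed "s/$text/HELLO/"     # HELLO world

# Single quotes: sed receives the literal characters $text,
# which do not occur in the input, so nothing is replaced.
echo "hello world" | sed 's/$text/HELLO/'     # hello world
```

If the variable may contain a slash, pick another delimiter for the s command (for example s|...|...|), otherwise the expanded text breaks the expression.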
=== sed takes whatever follows the "s" as the separator === | |||
* [http://backreference.org/2010/02/20/using-different-delimiters-in-sed/ Using different delimiters in sed] | |||
* http://www.grymoire.com/Unix/Sed.html#uh-2 , | |||
* https://en.wikipedia.org/wiki/Sed#Substitution_command | |||
Suppose I like to replace "../jquery-ui.min.js" with "jquery-ui.js", I can use | |||
{{Pre}} | |||
echo '<script src="../jquery-ui.min.js"></script>' | sed 's|../jquery-ui.min.js|jquery-ui.js|g' | |||
# <script src="jquery-ui.js"></script> | |||
</pre> | |||
<syntaxhighlight lang='bash'> | |||
$ cat tmp | |||
@SQ SN:chrX LN:155270560 | |||
@SQ SN:chrY LN:59373566 | |||
@RG ID:NEAT | |||
$ sed 's,^@RG.*,@RG\tID:None\tSM:None\tLB:None\tPL:Illumina,g' tmp | |||
@SQ SN:chrX LN:155270560 | |||
@SQ SN:chrY LN:59373566 | |||
@RG ID:None SM:None LB:None PL:Illumina | |||
$ sed 's/^@RG.*/@RG\tID:None\tSM:None\tLB:None\tPL:Illumina/g' tmp | |||
@SQ SN:chrX LN:155270560 | |||
@SQ SN:chrY LN:59373566 | |||
@RG ID:None SM:None LB:None PL:Illumina | |||
</syntaxhighlight> | |||
=== Case insensitive === | |||
https://www.cyberciti.biz/faq/unixlinux-sed-case-insensitive-search-replace-matching/ | |||
<pre> | |||
# Newer version - add 'i' or 'I' after 'g' | |||
sed 's/find-word/replace-word/gI' input.txt > output.txt | |||
sed -i 's/find-word/replace-word/gI' input.txt | |||
# Older version/macOS | |||
sed 's/[wW][oO][rR][dD]/replace-word/g' input.txt > output.txt | |||
sed 's/[Ll]inux/Unix/g' input.txt > output.txt | |||
</pre> | |||
=== macOS === | |||
[https://www.mkyong.com/mac/sed-command-hits-undefined-label-error-on-mac-os-x/ "undefined label" error on Mac OS X] | |||
<pre> | |||
$ sed -i 's/mkyong/google/g' testing.txt | |||
sed: 1: "testing.txt": undefined label 'esting.txt' | |||
# Solution | |||
$ sed -i '.bak' 's/mkyong/google/g' testing.txt | |||
</pre> | |||
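A form that should work with both GNU sed and the BSD sed shipped with macOS is to attach the backup suffix directly to -i, without a space. The sketch below runs on a throwaway file created with mktemp; the file content is made up.

```bash
# Work on a temporary file so the example is safe to run.
f=$(mktemp)
echo "mkyong was here" > "$f"

# -i.bak (suffix attached, no space) is accepted by GNU and BSD sed alike;
# the original content is kept in "$f.bak".
sed -i.bak 's/mkyong/google/g' "$f"

cat "$f"          # google was here
cat "$f.bak"      # mkyong was here
rm -f "$f" "$f.bak"
```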
=== Application: Get the top directory name of a tarball or zip file without extracting it ===
<syntaxhighlight lang='bash'> | |||
dn=`unzip -vl filename.zip | sed -n '5p' | awk '{print $8}'` # 5 is the line number to print | |||
echo -e "$(basename $dn)" | |||
dn=`tar -tf filename.tar.bz2 | grep -o '^[^/]\+' | sort -u` # '-u' means unique | |||
echo -e $dn | |||
dn=`tar -tf filename.tar.gz | grep -o '^[^/]\+' | sort -u` | |||
echo -e $dn | |||
# Assume there is a sub-directory called htslibXXXX | |||
dn=$(basename `find -maxdepth 1 -name 'htslib*'`) | |||
echo -e $dn | |||
</syntaxhighlight> | |||
=== Application: Grab the line number from the 'grep -n' command output === | |||
Follow [http://stackoverflow.com/questions/10589929/find-the-line-number-where-a-specific-word-appears-with-grep here] | |||
<syntaxhighlight lang='bash'> | |||
grep -n 'regex' filename | sed 's/^\([0-9]\+\):.*$/\1/' # return line numbers for each matches | |||
# OR | |||
grep -n 'regex' filename | awk -F: '{print $1}' | |||
echo 123:ABCD | sed 's/^\([0-9]\+\):.*$/\1/' # 123 | |||
</syntaxhighlight> | |||
where '''\(''' and '''\)''' mark a group in the pattern and '''\1''' refers to the text matched by that first group. See http://www.grymoire.com/Unix/Sed.html for more examples, e.g. search repeating words or special patterns.
If we want to find the top directory for a zipped file (see [https://en.wikipedia.org/wiki/Zip_(file_format) wikipedia] for the zip format), we can use
<syntaxhighlight lang='bash'> | |||
unzip -vl snpEff.zip | head | grep -n 'CRC-32' | awk -F: '{print $1}' | |||
</syntaxhighlight> | |||
=== Application: Delete first few characters on each row === | |||
http://www.theunixschool.com/2014/08/sed-examples-remove-delete-chars-from-line-file.html | |||
* To remove 1st n characters of every line: | |||
<syntaxhighlight lang='bash'> | |||
# delete the first 4 characters from each line | |||
$ sed -r 's/.{4}//' file | |||
</syntaxhighlight> | |||
=== Application: delete lines === | |||
[https://linuxhint.com/sed-command-to-delete-a-line/ Sed Command to Delete a Line] | |||
* Delete a single line | |||
* Delete a range of lines | |||
* Delete multiple lines | |||
* Delete all lines except specified range | |||
* Delete empty lines | |||
* Delete lines based on pattern | |||
* Delete lines starting with a specific character | |||
* Delete lines ending with specific character | |||
* Deleting lines that match the pattern and the next line | |||
* Deleting line from the pattern match to the end | |||
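A few of the cases in the list above, sketched on the numbers 1 to 6 (one per line) so the effect is easy to see. These are illustrative commands, not copied from the linked article; each prints to standard output, and adding -i would edit a file in place instead.

```bash
seq 6 | sed '3d'          # delete a single line (line 3)
seq 6 | sed '2,4d'        # delete a range of lines (2 through 4)
seq 6 | sed '2,4!d'       # delete all lines except the range 2 through 4
seq 6 | sed '/^5/d'       # delete lines starting with "5"
seq 6 | sed '/3/,$d'      # delete from the first line matching 3 to the end
seq 6 | sed '/2/,+1d'     # GNU sed only: delete the matching line and the next one
```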
=== Application: comment out certain lines === | |||
https://unix.stackexchange.com/a/128595. To comment lines 2 through 4 of bla.conf: | |||
<pre> | |||
sed -i '2,4 s/^/#/' bla.conf | |||
</pre> | |||
This is useful when I need to comment out line 240 & 242 on shell scripts (related to pdf file) generated from BRB-SeqTools. | |||
== Substitution of text: perl == | |||
* Add or remove 'chr' from vcf file https://www.biostars.org/p/18530/ | |||
== How to delete the first few rows of a text file == | |||
https://unix.stackexchange.com/questions/37790/how-do-i-delete-the-first-n-lines-of-an-ascii-file-using-shell-commands | |||
Suppose we want to remove the first 3 rows of a text file | |||
* sed | |||
: <syntaxhighlight lang='bash'> | |||
$ sed -e '1,3d' < t.txt # output to screen | |||
$ sed -i -e 1,3d yourfile # directly change the file | |||
</syntaxhighlight> | |||
* tail | |||
: <syntaxhighlight lang='bash'> | |||
$ tail -n +4 t.txt # output to screen | |||
</syntaxhighlight> | |||
* awk | |||
: <syntaxhighlight lang='bash'> | |||
$ awk 'NR > 3 { print }' < t.txt # output to screen | |||
</syntaxhighlight> | |||
== Delete the last row of a file == | |||
<syntaxhighlight lang='bash'> | |||
sed -i '$d' FILE | |||
</syntaxhighlight> | |||
== Show the first few characters from a text file == | |||
<syntaxhighlight lang='bash'> | |||
head -c 50 file # return the first 50 bytes | |||
</syntaxhighlight> | |||
== Remove/Delete The Empty Lines In A File == | |||
https://www.2daygeek.com/remove-delete-empty-lines-in-a-file-in-linux/ | |||
<pre> | |||
sed -i '/^$/d' File        # delete empty lines
sed -i '/KEYWORD/d' File   # delete lines containing KEYWORD
</pre> | |||
== cat: merge by rows == | |||
<syntaxhighlight lang='bash'> | |||
cat file1 file2 > output | |||
</syntaxhighlight> | |||
== paste: merge by columns == | |||
<syntaxhighlight lang='bash'> | |||
paste -d"\t" file1 file2 file3 > output | |||
paste file1 file2 file3 | column -t -s $'\t' > output   # -t aligns the columns as a table
</syntaxhighlight> | |||
= Web = | |||
Reference: [http://www.amazon.com/Linux-Scripting-Cookbook-Second-Edition/dp/1782162747 Linux Shell Scripting Cookbook] | |||
== Copy a complete website ==
<syntaxhighlight lang='bash'> | |||
wget --mirror --convert-links URL | |||
# OR | |||
wget -r -N -k -l DEPTH URL | |||
</syntaxhighlight> | |||
== HTTP or FTP authentication == | |||
<syntaxhighlight lang='bash'> | |||
wget --user username --password pass URL | |||
</syntaxhighlight> | |||
== Download a web page as plain text (instead of HTML text) == | |||
<syntaxhighlight lang='bash'> | |||
lynx URL -dump > TextWebPage.txt | |||
</syntaxhighlight> | |||
== cURL == | |||
<syntaxhighlight lang='bash'> | |||
curl http://google.com -o index.html --progress-bar
curl http://google.com --silent -o index.html | |||
# Cookies | |||
curl http://example.com --cookie "user=ABCD;pass=EFGH" | |||
curl URL --cookie-jar cookie_file | |||
# Setting a user agent string | |||
# http://www.useragentstring.com/pages/useragentstring.php | |||
curl URL --user-agent "Mozilla/5.0" | |||
# Authenticating | |||
curl -u user:pass http://test_auth.com | |||
curl -u user http://test_auth.com | |||
# Printing response headers excluding the data | |||
# For example, to check whether a page is reachable or not | |||
# by checking the 'Content-length' parameter. | |||
curl -I URL | |||
</syntaxhighlight> | |||
== Image crawler and downloader == | |||
<syntaxhighlight lang='bash'>
#!/bin/bash
#Desc: Images downloader
#Filename: img_downloader.sh
if [ $# -ne 3 ];
then
  echo "Usage: $0 URL -d DIRECTORY"
  exit 1
fi
# Parse the arguments: "-d DIRECTORY" may come before or after the URL
for i in {1..4}
do
  case $1 in
  -d) shift; directory=$1; shift ;;
   *) url=${url:-$1}; shift;;
  esac
done
mkdir -p "$directory"
baseurl=$(echo "$url" | egrep -o "https?://[a-z.]+")
echo Downloading $url
# Extract the src attribute of every <img> tag into a temporary list
curl -s "$url" | egrep -o "<img src=[^>]*>" |
sed 's/<img src=\"\([^"]*\).*/\1/g' > /tmp/$$.list
# Turn site-relative paths (starting with /) into absolute URLs
sed -i "s|^/|$baseurl/|" /tmp/$$.list
cd "$directory"
while read filename;
do
  echo Downloading $filename
  curl -s -O "$filename" --silent
done < /tmp/$$.list
</syntaxhighlight>
== Find broken links in a website by '''lynx -traversal''' ==
<syntaxhighlight lang='bash'>
#!/bin/bash
#Desc: Find broken links in a website
if [ $# -ne 1 ];
then
  echo -e "Usage: $0 URL\n"
  exit 1;
fi
echo Broken links:
mkdir /tmp/$$.lynx
cd /tmp/$$.lynx
lynx -traversal $1 > /dev/null
count=0;
sort -u reject.dat > links.txt
while read link;
do
  output=`curl -I $link -s | grep "HTTP/.*OK"`;
  if [[ -z $output ]];
  then
    echo $link;
    let count++
  fi
done < links.txt
[ $count -eq 0 ] && echo No broken links found.
</syntaxhighlight>
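The test grep "HTTP/.*OK" in the script above relies on the reason phrase "OK" in the status line. HTTP/2 responses have no reason phrase (curl prints just "HTTP/2 200"), so a working link can be reported as broken. A hedged alternative is to read the numeric status code instead; the helper name status_code is hypothetical and the header text below is simulated rather than fetched.

```bash
# status_code: read response headers on stdin and print the numeric status
# of the last status line (the final response after any redirects).
# 'status_code' is a hypothetical helper name.
status_code() {
    awk '/^HTTP\// { code = $2; sub(/\r$/, "", code) } END { if (code != "") print code }'
}

# Simulated headers; with a real site one would run: curl -sIL "$link" | status_code
printf 'HTTP/2 200 \r\ncontent-type: text/html\r\n' | status_code     # 200
printf 'HTTP/1.1 404 Not Found\r\n' | status_code                     # 404
```

A link could then be treated as broken when the printed code is empty or is 400 or higher.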
== Track changes to a website ==
<syntaxhighlight lang='bash'>
#!/bin/bash
#Desc: Script to track changes to webpage
if [ $# -ne 1 ];
then
  echo -e "Usage: $0 URL\n"
  exit 1;
fi
first_time=0
# Not first time
if [ ! -e "last.html" ];
then
  first_time=1
Line 1,089: | Line 2,922: | ||
get URL --post-data "postvar1=var1&postvar2=var2" -O out.html | get URL --post-data "postvar1=var1&postvar2=var2" -O out.html | ||
</syntaxhighlight> | </syntaxhighlight> | ||
== Change detection of a website == | |||
* http://bhfsteve.blogspot.com/2013/03/monitoring-web-page-for-changes-using.html | |||
* https://www.reddit.com/r/commandline/comments/2e2bkj/linux_software_to_monitor_website_changes/ | |||
* http://specto.sourceforge.net/ and https://www.linux.com/news/monitor-web-page-changes-specto | |||
* http://www.mostlymaths.net/2010/01/cron-diff-wget-watch-changes-in-webpage.html | |||
= Working with Files =
== '''iconv''' command == | |||
* [https://www.howtogeek.com/iconv-command-linux/ How To Use the iconv Command on Linux] | |||
* [http://www.tecmint.com/convert-files-to-utf-8-encoding-in-linux/ How to Convert Files to UTF-8 Encoding in Linux] | |||
* https://stackoverflow.com/questions/11316986/how-to-convert-iso8859-15-to-utf8 | |||
<pre> | |||
$ file test.R | |||
test.R: ISO-8859 text, with CRLF line terminators | |||
$ iconv -f ISO-8859 -t UTF-8 test.R # 'ISO-8859' is not supported | |||
$ iconv -t UTF-8 test.R # partial conversion?? | |||
$ iconv -f ISO-8859-1 -t UTF-8 test.R # Works
</pre> | |||
== '''nl''' command ==
Add line numbers to a text file
== '''file''' command ==
<pre style="white-space: pre-wrap; /* CSS 3 */ white-space: -moz-pre-wrap; /* Mozilla, since 1999 */ white-space: -pre-wrap; /* Opera 4-6 */ white-space: -o-pre-wrap; /* Opera 7 */ word-wrap: break-word; /* IE 5.5+ */ " >
$ file thumbs/g7.jpg
thumbs/g7.jpg: JPEG image data, JFIF standard 1.01, resolution (DPI), density 72x72, segment length 16, Exif Standard: [TIFF image data, little-endian, direntries=10, orientation=upper-left, xresolution=134, yresolution=142, resolutionunit=2, software=Adobe Photoshop CS Windows, datetime=2004:03:31 22:28:58], baseline, precision 8, 100x75, frames 3
$ file R-3.2.3.tar.gz
R-3.2.3.tar.gz: gzip compressed data, last modified: Thu Dec 10 03:12:50 2015, from Unix
</pre> | |||
== date == | |||
[https://www.networkworld.com/article/3481602/displaying-dates-and-times-your-way-in-linux.html Displaying dates and times your way in Linux] | |||
== print by skipping rows == | |||
http://stackoverflow.com/questions/604864/print-a-file-skipping-x-lines-in-bash | |||
<syntaxhighlight lang='bash'> | |||
$ tail -n +<N+1> <filename> # excluding first N lines | |||
# print by starting at line N+1. | |||
$ tail -n +11 /tmp/myfile # starting at line 11, or skipping the first 10 lines | |||
</syntaxhighlight>
== '''tail -f''' (follow) ==
With the '-f' (follow) option we can monitor a growing file. For example, create a new file called tmp.txt and run 'tail -f tmp.txt'. Then open another terminal and run 'for i in {0..100}; do sleep 2; echo $i >> tmp.txt ; done'. The first terminal shows each new line as it is appended to tmp.txt.
= Terminals =
== Fun command line utilities == | |||
[https://ostechnix.com/fun-linux-command-line-tools/ Turn Your Terminal Into A Playground: 20+ Funny Linux Command Line Tools]: cowsay, fortune, figlet, sl, ASCIIquarium, cmatrix, lolcat, ponysay, charasay, party parrot, ternimal, paclear, lavat, pond, cbonsai, dotacat, finger, pinky, no more secrets, hollywood, bucklespring, bb, toilet, sl-alt, fetch utilities, telehack, display star wars episode. | |||
== Reading from and Writing to the Terminal ==
== The termios Structure ==
== Terminal Output ==
= Development Tools =
== Books ==
[https://www.hpe.com/us/en/insights/articles/top-linux-developers-recommended-programming-books-1808.html Top Linux developers' recommended programming books] | |||
== GNU Make and Makefiles == | |||
* [http://kbroman.org/minimal_make/ minimal make] A minimal tutorial on make from Karl Broman.
* http://makefiletutorial.com/index.html
* [http://gromnitsky.users.sourceforge.net/articles/notes-for-new-make-users/ Notes for new Make users] | |||
== Writing a Manual Page ==
= Debugging =
== debug a bash shell == | |||
[https://www.cyberciti.biz/tips/debugging-shell-script.html How To Debug a Bash Shell Script Under Linux or UNIX] | |||
== gdb ==
= Processes and Signals =
== Search a process ID by its name == | |||
Use '''pgrep''' https://askubuntu.com/questions/612315/how-do-i-search-for-a-process-by-name-without-using-grep. For example (tested on Linux and macOS), | |||
<syntaxhighlight lang='bash'> | |||
$ pgrep RStudio # assume RStudio is running | |||
27043 | |||
$ pgrep geany # geany is not running. | |||
$ | |||
</syntaxhighlight> | |||
= POSIX threads =
Latest revision as of 16:38, 21 November 2024
Shell Programming
Some Resources
- Bash shell scripting for Helix and Biowulf
- Shell Style Guide from Google
- http://learnshell.org/
- http://tldp.org The Linux Documentation Project
- Bash Guide for Beginners
- BASH Programming - Introduction HOW-TO
- Advanced Bash-Scripting Guide
- Linux Shell Scripting Tutorial from cyberciti.biz
- Shell debugging
- 10 Useful Tips for Writing Effective Bash Scripts in Linux
- Ten Things I Wish I’d Known About bash & Learn Bash the Hard Way $4.99
- 5 ways to improve your Bash scripts
Understand shell command options
explainshell.com. For example, https://explainshell.com/explain?cmd=rsync+-avz+--progress+--partial+-e
Check shell scripts
How To Validate the Syntax of a Linux Bash Script Before Running It
ShellCheck & download the binary from Launchpad.
If a statement is missing a single quote, the shell may report the error on a different line (though the error message is still useful). It is therefore worth verifying the syntax of a script before running it.
Writing Secure Shell Scripts
Bioinformatics
Data science
Data Science at the Command Line Obtain, Scrub, Explore, and Model Data with Unix Power Tools
Special characters
15 Special Characters You Need to Know for Bash
Progress bar
How to Add a Simple Progress Bar in Shell Script
Simple calculation
echo
echo $(( 11/5 )) # or echo $((11/5))
Note: shell arithmetic returns only the integer part of the result.
bc: an arbitrary precision calculator language
bc -l <<< "11/5" # Without '-l' we only get the integer part
# Or interactive
bc -i
scale=5
11/5
quit
where -l means to use the predefined math routines and <<< is a here string. Note bc returns a real number.
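When bc is not installed, awk can do the same floating-point division. A small sketch comparing it with shell arithmetic (this awk alternative is a suggestion, not part of the original notes):

```bash
echo $(( 11 / 5 ))                          # 2      (integer division)
echo $(( 11 % 5 ))                          # 1      (remainder)
awk 'BEGIN { printf "%.3f\n", 11 / 5 }'     # 2.200  (floating point)
```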
Here documents
<<
- http://linux.die.net/abs-guide/here-docs.html
- How to use a here documents to write data to a file in bash script
#!/bin/bash
cat <<!FUNKY!
hello this is a here document
$var on line
!FUNKY!
To disable parameter/variable expansion, command substitution and arithmetic expansion (for example of $HOME) inside the here document, quote the delimiter: 'EOF' instead of EOF.
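The effect of quoting the delimiter can be checked with a two-line experiment; the variable name and value are made up for this sketch.

```bash
var="expanded"

# Unquoted delimiter: $var is expanded inside the body.
cat <<EOF
value: $var
EOF
# prints: value: expanded

# Quoted delimiter: the body is taken literally.
cat <<'EOF'
value: $var
EOF
# prints: value: $var
```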
<<< here string
http://linux.die.net/abs-guide/x15683.html
Redirect
stdin, stdout, and stderr
What Are stdin, stdout, and stderr on Linux?
Redirecting output. File descriptor 1 is standard output and file descriptor 2 is standard error.
./myProgram > stdout.txt                 # redirect std out to <stdout.txt>
./myProgram 2> stderr.txt                # redirect std err to <stderr.txt> by using the 2> operator
./myProgram > stdout.txt 2> stderr.txt   # combination of above two
./myProgram > stdout.txt 2>&1            # redirect std err to std out <stdout.txt>
./myProgram >& /dev/null                 # prevent writing std out and std err to the screen
ps >> output.txt                         # append
Redirecting input
./myProgram < input.txt
Using cat or echo to create a new file that needs sudo right
The following command does not work
sudo cat myFile > /opt/myFile
Solution 1 (sudo sh -c). We can use something like
sudo sh -c 'cat myFile > /opt/myFile'
Solution 2 (sudo tee). See 'How To Configure Nginx as a Web Server and Reverse Proxy for Apache on One Ubuntu 16.04 Server'
echo "<?php phpinfo(); ?>" | sudo tee /var/www/html/info.php
If we want to append something to an existing file, use -a option in the tee command.
Create a simple text file with multiple lines; write data to a file in bash script
Each of the methods below can be used in a bash script.
# Method 1: printf. We can add \t for tab delimiter
$ printf '%s \n' 'Line 1' 'Line 2' 'Line 3' > out.txt
# Method 2: echo. We can add \t for tab delimiter
$ echo -e 'Line 1\t12\t13
$ Line 2\t22\t23
$ Line 3\t32\t33' > out.txt
# Method 3: echo
$ echo $'Line 1\nLine 2\nLine 3' > out.txt
# Method 4: here document, http://tldp.org/LDP/abs/html/here-docs.html
# For the TAB character, use Ctrl-V, TAB.
# Note that first line can be: cat <<EOF > out.txt
# The filename can be a variable if this is used inside a bash file
$ cat > out.txt <<EOF
> line1 Second
> lin2 abcd
> line3ss dkflaf
> EOF
$
See also How to use a here documents to write data to a file in bash script
To escape the quotes, use a back slash. For example
echo $'#!/bin/bash\nmodule load R/3.6.0\nRscript --vanilla -e "rmarkdown::render(\'gse6532.Rmd\')"'
will obtain
#!/bin/bash
module load R/3.6.0
Rscript --vanilla -e "rmarkdown::render('gse6532.Rmd')"
>&
&> file is not part of the official POSIX shell spec, but has been added to many Bourne shells as a convenience extension (it originally comes from csh). In a portable shell script (and if you don't need portability, why are you writing a shell script?), use > file 2>&1 only.
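With the portable form the order of the two redirections matters, because they are processed from left to right. A sketch with a hypothetical helper both that writes one line to each stream:

```bash
# both: hypothetical helper that writes one line to stdout and one to stderr.
both() { echo "to stdout"; echo "to stderr" >&2; }

out=$(mktemp)

both > "$out" 2>&1       # stdout goes to the file first, then stderr follows it
grep -c . "$out"         # 2

both 2>&1 > "$out"       # stderr is pointed at the old stdout, then only stdout moves
grep -c . "$out"         # 1 ("to stderr" still appears on the screen)

rm -f "$out"
```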
Redirect Output and Errors To /dev/null
http://www.cyberciti.biz/faq/how-to-redirect-output-and-errors-to-devnull/
command > /dev/null 2>&1
# OR
command &>/dev/null
In addition we can put a process in the background by adding the '&' sign; see the dclock example.
tee: redirect to both a file and the screen at the same time
To redirect to both a file and the screen at the same time, use the tee command. See
- http://www.cyberciti.biz/faq/linux-redirect-error-output-to-file/
- http://www.cyberciti.biz/faq/saving-stdout-stderr-into-separate-files/
- https://en.wikipedia.org/wiki/Tee_(command)
- Linux tee Command Explained for Beginners (6 Examples)
- Since bash version 4 you may use |& as an abbreviation for 2>&1 |
command1 |& tee log.txt
## or ##
command1 -arg |& tee log.txt
## or ##
command1 2>&1 | tee log.txt
# use the option '-a' for *append*
echo "new line of text" | sudo tee -a /etc/apt/sources.list
# redirect output of one command to another
ls file* | tee output.txt | wc -l
# streaming file (e.g. running an arduino sketch on Udoo)
# for streaming files, cp command (still need Ctrl + c) will not
# show anything on screen though copying is executed.
cat /dev/ttymxc3 | tee out.txt # Ctrl + c
command > >(tee stdout.log) 2> >(tee stderr.log >&2)
Methods To Create A File In Linux
10 Methods To Create A File In Linux
Prepend
BASH Prepend A Text / Lines To a File
Pipe
The operator is |.
ps > psout.txt
sort psout.txt > pssort.out
can be simplified to
ps | sort > pssort.out
For example,
$ head /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
$ cat /etc/passwd | cut -d: -f7 | sort | uniq -c | sort -nr
     18 /bin/sh
     13 /bin/false
      2 /bin/bash
      1 /bin/sync
where the cut command extracts the 7th field (fields are separated by the : character) and writes it to its output stream. The first sort command sorts the lines it reads alphabetically, so that identical lines become adjacent. The uniq -c command collapses adjacent duplicate lines and prefixes each with its count. The final sort command sorts its input numerically in reverse order.
What does a dash (-) at the end of a command mean?
- http://unix.stackexchange.com/questions/16357/usage-of-dash-in-place-of-a-filename. It means 'standard input' or anything that will be used (required or interpreted) by the software. The following example is from How to use dd command
# ssh [email protected] "dd if=/dev/sda | gzip -1 -" | dd of=backup.gz
- http://unix.stackexchange.com/questions/41828/what-does-dash-at-the-end-of-a-command-mean
Process substitution
https://en.wikipedia.org/wiki/Process_substitution
Powerfulness of pipes
Consider the following commands (samtools gives its output on stdout which is a good opportunity to use pipes)
samtools mpileup -go temp.bcf -uf genome.fa dedup.bam
bcftools call -vmO v -o sample1_raw.vcf temp.bcf
The disadvantage of this approach is that it creates a temporary file (temp.bcf in this case). If the temporary file is very large (several hundred GB), it wastes disk space, not to mention the time spent writing it. Using a pipe saves both the time and the disk space.
samtools mpileup -uf genome.fa dedup.bam | bcftools call -vmO v -o sample1_raw.vcf
Send a stdout to a remote computer
See here (bypass SSH password) for a case (utilize cat, ssh and >> commands).
Execute a bash script downloaded (without saving first) from the internet
See the example of installing GitLab
sudo curl -sS https://packages.gitlab.com/install/repositories/gitlab/raspberry-pi2/script.deb.sh | sudo bash
where -s means silent and -S means showing error messages if it fails. Note that curl will download the file to standard output. So using the pipe operator is a reasonable sequence after running the curl.
Use wget to download and decompress in one line
https://stackoverflow.com/questions/16262980/redirect-pipe-wget-download-directly-into-gunzip
wget -O - ftp://ftp.direcory/file.gz | gunzip -c > file.out
where "-O -" means to print to standard output (sort of like the default behavior of "curl"). See https://www.gnu.org/software/wget/manual/wget.html
Use pipe and while loop to process multiple files
See an example at while.
Pipe vs redirect
- Pipe is used to pass output to another program or utility.
- Redirect is used to pass output to either a file or stream.
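In other words, thing1 | thing2 gives roughly the same result as thing1 > temp_file && thing2 < temp_file. The pipe, however, runs both commands at the same time, needs no temporary file, and runs thing2 even if thing1 fails.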
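The comparison above can be sketched as two small functions that count distinct words, one with a pipe and one with a temporary file. The function names are hypothetical; both are expected to give the same count.

```bash
#!/usr/bin/env bash
# Pipe version: printf and sort run concurrently, nothing is written to disk.
count_with_pipe() {
    printf '%s\n' "$@" | sort -u | wc -l
}

# Temporary-file version: the first step must finish (and succeed)
# before the second one starts reading the file.
count_with_tempfile() {
    local tmp
    tmp=$(mktemp)
    printf '%s\n' "$@" > "$tmp" && sort -u < "$tmp" | wc -l
    rm -f "$tmp"
}

count_with_pipe b a b c
count_with_tempfile b a b c
```

Both calls report three distinct words; the exact padding of the number depends on the wc implementation.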
Shebang (#!)
A shebang is the character sequence consisting of the characters number sign and exclamation mark (that is, "#!") at the beginning of a script. See the Wikipedia page.
The syntax looks like
#! interpreter [optional-arg]
For example,
- #!/bin/sh : Execute the file using sh, the Bourne shell, or a compatible shell
- #!/bin/csh -f : Execute the file using csh, the C shell, or a compatible shell, and suppress the execution of the user’s .cshrc file on startup
- #!/usr/bin/perl -T : Execute using Perl with the option for taint checks
When Is It Better to Use #!/bin/bash Instead of #!/bin/sh in a Shell Script?
Howto Make Script More Portable With #!/usr/bin/env As a Shebang
https://www.cyberciti.biz/tips/finding-bash-perl-python-portably-using-env.html
This is useful if the interpreter location is different on Linux and Mac OSs.
# Linux
$ which Rscript
/usr/bin/Rscript
# Mac
$ which Rscript
/usr/local/bin/Rscript
We can use the following on the first line of the shell script.
#!/usr/bin/env Rscript
Comments
For a single line, we can use the '#' sign. Shell Script Put Multiple Line Comments under Bash/KSH.
For a block of code, we use
#!/bin/bash
echo before comment
: <<'END'
bla bla
blurfl
END
echo after comment
Variables
food=Banana
echo $food
food="Apple"
echo $food
When do I need to use the export command
Consider the following
MY_DIRECTORY=/path/to/my/directory
export MY_DIRECTORY
./my_script.sh
If you don’t use the export command in the above example, the MY_DIRECTORY variable will not be available to the my_script.sh script. It will only be available within the current shell session as a local shell variable.
When you set a variable in a shell session without using the export command, it is only available within that shell session as a local shell variable. This means that the variable and its value are only accessible within the current shell session and are not passed to child processes (e.g. my_script.sh) or other programs that are started from the command line.
Cf. When I put LS_COLORS in the .bashrc file, I don't need to use the export command.
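The difference can be checked directly with a small sketch. The helper child_sees and the variable names LOCAL_ONLY and SHARED are hypothetical; the child process here is a fresh bash started with bash -c, standing in for my_script.sh.

```bash
#!/usr/bin/env bash
# Ask a child process whether it can see the variable named in $1.
# printenv exits non-zero when the name is not in the environment.
child_sees() {
    bash -c 'printenv "$1" || echo "<unset>"' _ "$1"
}

LOCAL_ONLY=/path/a        # plain shell variable: stays in this shell
export SHARED=/path/b     # exported: copied into every child's environment

child_sees LOCAL_ONLY     # prints <unset>
child_sees SHARED         # prints /path/b
```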
export -n command: remove from environment
https://linuxconfig.org/learning-linux-commands-export
It will export an environment variable to the subshell/forked process. For example
$ export MYVAR=10    # export a variable
$ export -n MYVAR    # remove the export attribute; MYVAR stays set in the current shell
To see the current process ID, use
echo $$
To create a new process, use
bash
When using the export command without any option and arguments it will simply print all names marked for an export to a child process.
$ export
declare -x EDITOR="nano"
declare -x HISTTIMEFORMAT="%d/%m/%y %T "
declare -x HOME="/home/brb"
declare -x LANG="en_US.UTF-8"
declare -x LESSCLOSE="/usr/bin/lesspipe %s %s"
declare -x LESSOPEN="| /usr/bin/lesspipe %s"
declare -x LOGNAME="brb"
...
declare -x PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
declare -x PWD="/home/brb"
declare -x SHELL="/bin/bash"
...
declare -x USER="brb"
declare -x VISUAL="nano"
echo command
- https://en.wikipedia.org/wiki/Echo_(command)
- How to Use the Echo Command on Linux
- Writing Text to the Terminal
- Using Variables With echo
- Using Commands With echo
- Formatting Text With echo
- Using echo With Files and Directories
- Writing to Files with echo
String manipulation
http://www.thegeekstuff.com/2010/07/bash-string-manipulation/
dirname and basename commands
http://www.tldp.org/LDP/LG/issue18/bash.html
# On directories
$ dirname ~/Downloads
/home/chronos/user
$ basename ~/Downloads
Downloads

# On files
$ dirname ~/Downloads/DNA_Helix.zip
/home/chronos/user/Downloads
$ basename ~/Downloads/DNA_Helix.zip
DNA_Helix.zip
$ basename ~/Downloads/DNA_Helix.zip .zip
DNA_Helix
$ basename ~/Downloads/annovar.latest.tar.gz
annovar.latest.tar.gz
$ basename ~/Downloads/annovar.latest.tar.gz .gz
annovar.latest.tar
$ basename ~/Downloads/annovar.latest.tar.gz .tar.gz
annovar.latest
$ basename ~/Downloads/annovar.latest.tar.gz .latest.tar.gz
annovar
Escape characters and quotes
echo $USER                  # brb
echo My name is $USER
echo "My name is $USER"     # My name is brb
echo 'My name is $USER'     # My name is $USER; single quotes prevent variable expansion
# We use single quotes when we want the characters to be taken literally,
# i.e. passed to the command unchanged.
grep '.*/udp' /etc/services # normally . and * and slash characters have special meaning
echo \$USER                 # $USER; escaping $ removes its special meaning
echo '\'                    # \
echo \'text\'               # 'text'
When to use double quotes with a variable
when to use double quotes with a variable in shell script?
Concatenate string variables (not safe)
http://stackoverflow.com/questions/4181703/how-can-i-concatenate-string-variables-in-bash
a='hello'
b='world'
c=$a$b
echo $c

# Bash also supports a += operator
$ A="X Y"
$ A+="Z"
$ echo "$A"
Often we need to use "double quotes" around the string variables if the string variables represent some directories.
mkdir "tmp 1"
touch "tmp 1/tmpfile"
tmpvar="tmp 1"
echo $tmpvar
# tmp 1
ls $tmpvar
ls: cannot access tmp: No such file or directory
ls: cannot access 1: No such file or directory
ls "$tmpvar"
# tmpfile
However, for integers
$ echo $a
24
$ ((a+=12))
$ echo $a
36
Note that the double parentheses construct in ((a+=12)) permits arithmetic expansion and evaluation.
${parameter} - Concatenate a string variable and a constant string; variable substitution
Parameter substitution ${}. Cf $() for command execution
x=foo
y=bar
z=$x$y         # $z is now "foobar"
z="$x$y"       # $z is still "foobar"
z="$xand$y"    # does not work
z="${x}and$y"  # does work, "fooandbar"
And
your_id=${USER}-on-${HOSTNAME}
echo "$your_id"

echo "Old \$PATH = $PATH"
PATH=${PATH}:/opt/bin  # Add /opt/bin to $PATH for duration of script.
echo "New \$PATH = $PATH"
And using "{" in order to create a new string based on an existing variable
pdir="/tmp/files/today"
fname="report"
mkdir -p $pdir
touch $pdir/$fname       # OK
ls -l $pdir/$fname
touch $pdir/$fname_new   # No error, but no new file is created:
                         # the shell looks up a variable named fname_new, which is unset
ls $pdir/$fname_new
touch $pdir/${fname}_new
ls $pdir/${fname}_new    # Works
$(command) - Command Execution and Assign Output of Shell Command To a Variable; Command substitution
Bash Assign Output of Shell Command To Variable
$(command)
`command`  # ` is a backquote/backtick, not a single quotation sign
           # this is a legacy support; not recommended by https://www.shellcheck.net/
Note all new scripts should use the $(...) form, which was introduced to avoid some rather complex rules.
Example 1.
sudo apt-get install linux-headers-$(uname -r)
Example 2.
user=$(echo "$UID")
Example 3.
#!/bin/sh
echo The current directory is $PWD
echo The current users are $(who)

sudo chown `id -u` SomeDir  # change the ownership to the current user. Dangerous!
# Or sudo chown `whoami` SomeDirOrSomeFile
exit 0
Example 4. Create a new file with automatically generated filename
touch file-$(date -I)
Example 5. Use $(your expression) to run nested expressions. For example,
# cd into the directory containing the 'touch' command.
cd $(dirname $(type -P touch))

BACKUPDIR=/nas/backup
LASTDAYPATH=${BACKUPDIR}/$(ls ${BACKUPDIR} | tail -n 1)
The concept of putting the result of a command into a script variable is very powerful, as it makes it easy to use existing commands in scripts and capture their output.
Arithmetic Expansion
$((...))
is a better alternative to the expr command. More examples:
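for i in $(seq 1 3)
do
    echo SRR$(( i + 1027170 ))'_1'.fastq
done
The single quotes around _1 are optional here, because the arithmetic expansion already ends at the closing )); they only make the boundary easier to see. The above will output SRR1027171_1.fastq, SRR1027172_1.fastq and SRR1027173_1.fastq.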
Parameter Expansion
${parameter}
Double Parentheses (())
Bash Shell Scripting for beginners (Part 1) fedoramagazine. Double parentheses are simple, they are for mathematical equations.
extract substring
https://www.cyberciti.biz/faq/how-to-extract-substring-in-bash/
${parameter:offset:length}
Example:
## define var named u ##
u="this is a test"
var="${u:10:4}"
echo "${var}"
Or use the cut command.
u="this is a test"
echo "$u" | cut -d' ' -f 4
echo "$u" | cut --delimiter=' ' --fields=4
##########################################
## WHERE
##   -d' ' : Use a whitespace as delimiter
##   -f 4  : Select only 4th field
##########################################
var="$(cut -d' ' -f 4 <<< $u)"
echo "${var}"
Environment variables
How to Set Environment Variables in Bash on Linux
$HOME
$PATH
$0 -- name of the shell script
$# -- number of parameters passed (the script name itself is not counted)
$$ -- process ID of the shell script, often used inside a script for generating unique temp filenames
$? -- the exit value of the last run command; 0 means OK and non-zero means something went wrong
$_ -- previous command's last argument
Example 1 (check if a command ran successfully):
some_command
if [ $? -eq 0 ]; then
    echo OK
else
    echo FAIL
fi

# OR
if some_command; then
    printf 'some_command succeeded\n'
else
    printf 'some_command failed\n'
fi

$ tabix -f -p vcf ~/SeqTestdata/usefulvcf/hg19/CosmicCodingMuts.vcf.gz
$ echo $?
0
$ tabix -f -p vcf ~/Downloads/CosmicCodingMuts.vcf.gz
Not a BGZF file: /home/brb/Downloads/CosmicCodingMuts.vcf.gz
tbx_index_build failed: /home/brb/Downloads/CosmicCodingMuts.vcf.gz
$ echo $?
1
Example 2 (check whether a host is reachable)
ping DOMAIN -c2 &> /dev/null
if [ $? -eq 0 ]; then
    echo Successful
else
    echo Failure
fi
where -c is used to limit the number of packets to be sent and &> /dev/null is used to redirect both stderr and stdout to /dev/null so that it won't be printed on the terminal.
Example 3 (check if users have supplied the correct number of parameters):
#!/bin/bash
if [ $# -ne 2 ]; then
    echo "Usage: $0 match_text filename"
    exit 1
fi
match_text=$1
filename=$2
Example 4 (make a new directory and cd to it)
mkdir -p "newDir/subDir"; cd "$_"
How to List Environment Variables
How to List Environment Variables on Linux
printenv
Unset/Remove an environment variable
$ export MSG="HELLO WORLD"
$ echo $MSG
HELLO WORLD
$ unset MSG
$ echo $MSG

$
Set an environment variable and run a command on the same line, env command
- Setting an environment variable before a command in Bash is not working for the second command in a pipe
- What does 'bash -c' do?
FOO=bar bash -c 'somecommand someargs | somecommand2'
- env: run a program in a modified environment. man env, env command in Linux with Examples
env RSTUDIO_WHICH_R=/opt/R/4.2.3/bin/R rstudio ~/Project/project.Rproj
Note that the environment of the current shell is not changed: RSTUDIO_WHICH_R is set only for the rstudio process and is not exported in the shell itself.
- https://en.wikipedia.org/wiki/Env. Note that this use of env is often unnecessary since most shells support setting environment variables in front of a command.
env DISPLAY=foo.bar:1.0 xcalc
# OR
DISPLAY=foo.bar:1.0 xcalc
Parameter variables
- Shell Parameter Expansion - Important !!
- http://tldp.org/LDP/abs/html/othertypesv.html
- https://bash.cyberciti.biz/guide/Pass_arguments_into_a_function
$1, $2, ... -- parameters given to the script
$* -- list of all the parameters, in a single variable
$@ -- subtle variation on $*; when quoted as "$@", each parameter expands to a separate word
$! -- the process id of the last command run in the background
Example 1.
#!/bin/bash
echo "$1 likes to eat $2 and $3 every day."
echo "bye:-)"
Example 2.
$ touch /tmp/tmpfile_$$
$ set foo bar bam
$ echo $#
3
$ echo $@
foo bar bam
$ set foo bar bam &
[1] 28212
$ echo $!
28212
[1]+  Done                    set foo bar bam
Example 3. $@ parameter for a variable number of parameters
$ cat stats.sh
for FILE1 in "$@"
do
    wc $FILE1
done
$ sh stats.sh songlist1 songlist2 songlist3
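Example 3 relies on the quoted form "$@". The difference from $* only shows up when a parameter contains whitespace, as in this short sketch (count_args and demo are hypothetical helper names, and the default IFS is assumed):

```bash
#!/usr/bin/env bash
# Print how many arguments were received.
count_args() { echo "$#"; }

demo() {
    count_args "$@"   # each parameter stays a separate word
    count_args "$*"   # all parameters are joined into one word
    count_args $*     # unquoted: the joined text is split again on whitespace
}

demo "my song.txt" list2   # prints 2, then 1, then 3
```

This is why loops over script arguments are usually written as for f in "$@".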
We can also use curly braces around the variable name.
QT_ARCH=x86_64
QT_SDK_BINARY=QtSDK-4.8.0-${QT_ARCH}.tar.gz
QT_SD_URL=https://xxx.com/$QT_SDK_BINARY
How do I rename the extension for a batch of/multiple files? See man bash Shell Parameter Expansion
# Solution 1:
for file in *.html; do
    mv "$file" "`basename "$file" .html`.txt"
done

# Solution 2:
for file in *.html
do
    mv "$file" "${file%.html}.txt"
done
Get filename without Path
How to Extract Filename & Extension in Shell Script
fullfilename="/var/log/mail.log"
filename=$(basename "$fullfilename")
echo $filename
Extension without filename
How to Extract Filename & Extension in Shell Script
fullfilename="/var/log/mail.log"
filename=$(basename "$fullfilename")
ext="${filename##*.}"
echo $ext
Discard the extension name and "%" symbol
$ vara=fillename.ext
$ echo $vara
fillename.ext
$ echo ${vara::-4}            # works on Bash 4.3, e.g. Ubuntu
fillename
$ echo ${vara::${#vara}-4}    # works on Bash 4.1, e.g. Biowulf (Red Hat)
Another way (not assuming 3 letters for the suffix) https://www.cyberciti.biz/faq/unix-linux-extract-filename-and-extension-in-bash/
dest="/nas100/backups/servers/z/zebra/mysql.tgz"
## get file name i.e. basename such as mysql.tgz
tempfile="${dest##*/}"

## display filename
echo "${tempfile%.*}"
Or better with (See Extract filename and extension in Bash and Shell parameter expansion). How to Extract Filename & Extension in Shell Script
fullfilename="/var/log/mail.log"
filename=$(basename "$fullfilename")
fname="${filename%.*}"
echo $fname
# mail

$ UEFI_ZIP_FILE="UDOOX86_B02-UEFI_Update_rel102.zip"
$ UEFI_ZIP_DIR="${UEFI_ZIP_FILE%.*}"
$ echo $UEFI_ZIP_DIR
UDOOX86_B02-UEFI_Update_rel102

$ FILE="example.tar.gz"
$ echo "${FILE%%.*}"
example
$ echo "${FILE%.*}"
example.tar
$ echo "${FILE#*.}"
tar.gz
$ echo "${FILE##*.}"
gz
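The expansions above (#, ##, %, %%) can be combined into one helper that splits a path without calling dirname or basename. This is only a sketch: split_path is a hypothetical name, and names that start with a dot (such as .bashrc) or paths directly under / are not treated specially.

```bash
#!/usr/bin/env bash
# Split a path into directory, file stem and extension
# using only shell parameter expansion.
split_path() {
    local path=$1 dir file stem ext
    file=${path##*/}              # remove the longest prefix ending in "/"
    case $path in
        */*) dir=${path%/*} ;;    # remove the shortest suffix starting with "/"
        *)   dir=. ;;             # no slash: current directory
    esac
    stem=${file%.*}               # remove the shortest ".suffix"
    ext=${file##*.}               # keep what follows the last "."
    [ "$ext" = "$file" ] && ext=""    # no dot in the name: no extension
    printf 'dir=%s stem=%s ext=%s\n' "$dir" "$stem" "$ext"
}

split_path /var/log/mail.log    # dir=/var/log stem=mail ext=log
split_path example.tar.gz       # dir=. stem=example.tar ext=gz
```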
Space in variable value
Suppose we have a script file called 'foo' that can remove spaces from a file name. Note: tr command is used to delete characters specified by the '-d' parameter.
#!/bin/sh
NAME=`ls $1 | tr -d ' '`
echo $NAME
mv $1 $NAME
Now we try the program:
$ touch 'file 1.txt'
$ ./foo 'file 1.txt'
ls: cannot access file: No such file or directory
ls: cannot access 1.txt: No such file or directory
mv: cannot stat ‘file’: No such file or directory
The way to fix the program is to use double quotes around $1
#!/bin/sh
NAME=`ls "$1" | tr -d ' '`
echo $NAME
mv "$1" $NAME
and test it
$ ./foo "file 1.txt"
file1.txt
When a path built from variables also contains a wildcard, put the double quotes around the variables only, not around the whole string, because wildcards are not expanded inside quotes.
$ rm "$outputDir/tmp/$tmpfd/tmpa" # fine
$ rm "$outputDir/tmp/$tmpfd/tmp*.txt"
rm: annovar6-12/tmp/tmp_bt20_raw/tmp*.txt: No such file or directory
$ rm "$outputDir"/tmp/$tmpfd/tmp*.txt
getopts function - parse options from shell script command line
- https://www.lifewire.com/pass-arguments-to-bash-script-2200571
- https://www.computerhope.com/unix/bash/getopts.htm
- How to Use getopts to Parse Linux Shell Script Options
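A minimal getopts sketch, assuming a hypothetical command with a -v flag and a -o option that takes a file name. The function name parse_args and the default out.txt are made up for illustration; the leading colon in the option string selects getopts' silent error mode so the script can print its own messages.

```bash
#!/usr/bin/env bash
parse_args() {
    local opt verbose=0 output=out.txt
    OPTIND=1                          # reset so the function can be called again
    while getopts ':vo:' opt; do
        case $opt in
            v)  verbose=1 ;;
            o)  output=$OPTARG ;;     # "o:" means -o takes an argument
            :)  echo "Option -$OPTARG requires an argument" >&2; return 1 ;;
            \?) echo "Unknown option -$OPTARG" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))             # drop the options that were parsed
    echo "verbose=$verbose output=$output rest=$*"
}

parse_args -v -o result.txt file1 file2
# verbose=1 output=result.txt rest=file1 file2
```

In a script (rather than a function) the same loop works on the script's own arguments, and return 1 would become exit 1.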
Check if command line argument is missing (? :) and specifying the default (:-)
Search for ternary (conditional) operator and check out parameter Expansion in Bash Reference Manual. 74 Bash Operators Examples
#!/usr/bin/env bash
NAME=${1?Error: no name given}
NAME2=${2:-friend}
echo "HELLO! $NAME and $NAME2"
Shell expansion
https://www.gnu.org/software/bash/manual/html_node/Shell-Expansions.html#Shell-Expansions
Curly brace {} expansion and array
- A Comic from Wizard zines.
- Explain: {,} in cp or mv Bash Shell Commands
- Copy multiple types of extensions
cp -v *.{txt,jpg,png} destination/
- All about {Curly Braces} in Bash
- Array Builder
echo {0..10}
echo {10..0..2}
echo {z..a..2}
mkdir test{10..12}    # test10, test11, test12 directories
rm -rf test{10..12}
- Parameter expansion
# convert jpg to png
for i in *.jpg; do convert $i ${i%jpg}png; done

a="Hello World!"
echo Goodbye${a#Hello}
# Goodbye World!
- Output Grouping
- Array Builder
- How to Use Arrays in a Bash Script
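For the array items above, here is a short Bash sketch of the most common operations. The file names are hypothetical; quoting "${files[@]}" keeps elements that contain spaces intact.

```bash
#!/usr/bin/env bash
files=("report 1.pdf" notes.txt "data set.csv")   # three elements
files+=(extra.log)                                # append a fourth

echo "count: ${#files[@]}"      # number of elements: 4
echo "first: ${files[0]}"       # indexing starts at 0
for f in "${files[@]}"; do      # one iteration per element
    echo "item: $f"
done
```

Arrays are a Bash (and ksh/zsh) feature; a script that uses them should not start with #!/bin/sh.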
Square brackets
Using Square Brackets in Bash: Part 1
Globbing means using wildcards to get all the results that fit a certain pattern.
ls *.jpg   # the asterisk means "zero or more characters"
ls d*k?    # ? means "exactly one character"
touch file0{0..9}{0..9}   # This will create files file000, file001, file002, etc., through file097, file098 and file099.
ls file0[78]?             # list the files in the 70s and 80s
ls file0[259][278]        # list file022, file027, file028, file052, file057, file058, file092, file097, and file098
Conditions
We can use the test command to check if a file exists. The command is test -f <filename>.
[ is just the same as writing test; always leave a space after the [ and before the closing ].
if test -f fred.c; then ...; fi

if [ -f fred.c ]
then
    ...
fi

if [ -f fred.c ]; then
    ...
fi
Boolean variables
How to declare Boolean variables in bash and use them in a shell script
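failed=0   # False
jobdone=1  # True
if [ $failed -eq 1 ]
then
    echo "Job failed"
else
    echo "Job done"
fi

## more readable syntax ##
failed=false
jobdone=true
if [ "$failed" = true ]
then
    echo "Job failed"
else
    echo "Job done"
fi
Defining them as the strings true and false makes the code more readable; such values must be compared with = rather than with -eq, which only accepts integers.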
What is the difference between test, [ and [[ ?
http://mywiki.wooledge.org/BashFAQ/031
[ ("test" command) and [[ ("new test" command) are used to evaluate expressions. [[ works only in Bash, Zsh and the Korn shell, and is more powerful; [ and test are available in POSIX shells.
test implements the old, portable syntax of the command. In almost all shells (the oldest Bourne shells are the exception), [ is a synonym for test (but requires a final argument of ]).
[[ is a new improved version of it, and is a keyword, not a program.
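One practical difference is pattern matching. As a sketch (the function names are hypothetical), the first helper uses the [[ keyword, where the right-hand side of == is a glob pattern; the second gets the same result in portable sh with case, since [ has no pattern matching.

```bash
#!/usr/bin/env bash
# Bash-only: inside [[ ]] an unquoted variable is not word-split,
# and the unquoted * on the right of == acts as a wildcard.
is_comment() {
    [[ $1 == '#'* ]]
}

# Portable equivalent for POSIX shells.
is_comment_posix() {
    case $1 in
        '#'*) return 0 ;;
        *)    return 1 ;;
    esac
}

line="# a comment with spaces"
is_comment "$line" && echo "bash: comment"
is_comment_posix "$line" && echo "posix: comment"
```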
String comparison
==  ==> strings are equal (== is a synonym for =)
=   ==> strings are equal
!=  ==> strings are not equal
-z  ==> string is null
-n  ==> string is not null
For example, the following script checks whether the user has provided an argument to the script.
#!/bin/sh
if [ -z "$1" ]; then
    echo "Provide a \"file name\", using quotes to nullify the space."
    exit 1
fi
mv -i "$1" `ls "$1" | tr -d ' '`
where the -i option makes mv ask for confirmation before overwriting an existing file.
To check whether Xcode (either full Xcode or command line developer tools only) has been installed or not on Mac
if [ -z "$(xcode-select -p 2>&1 | grep error)" ]
then
    echo "Xcode has been installed";
else
    echo "Xcode has not been installed";
fi

# only print out message if xcode was not found
if [ -n "$(xcode-select -p 2>&1 | grep error)" ]
then
    echo "Xcode has not been installed";
fi
Note that the 'error' keyword comes from the message macOS prints when Xcode has not been installed. The double quotes around $( ) are needed to avoid the "[: too many arguments" error.
Check whether a string starts with a given character, such as "#".
if [[ "$var" =~ ^#.* ]]; then
    echo "yes"
fi
Arithmetic/Integer comparison
expr1 -eq expr2 ==> check equal
expr1 -ne expr2 ==> check not equal
expr1 -gt expr2 ==> expr1 > expr2
expr1 -ge expr2 ==> expr1 >= expr2
expr1 -lt expr2 ==> expr1 < expr2
expr1 -le expr2 ==> expr1 <= expr2
! expr          ==> opposite of expr
File conditionals
-d file ==> True if the file is a directory
-e file ==> True if the file exists
-f file ==> True if the file is a regular file
-r file ==> True if the file is readable
-s file ==> True if the file has non-zero size
-w file ==> True if the file is writable
-x file ==> True if the file is executable
Example 1: Suppose we want to know if the first argument (if given) matches a specific string. We can use the following (note the space before and after '==', and the quotes around $1 so that the test still works when no argument is given)
#!/bin/bash
if [ "$1" == "console" ]; then
    echo 'Console'
else
    echo 'Non-console'
fi
Example 2: Check If File Is Empty Or Not Using Shell Script
#!/bin/bash
_file="$1"
[ $# -eq 0 ] && { echo "Usage: $0 filename"; exit 1; }
[ ! -f "$_file" ] && { echo "Error: $0 file not found."; exit 2; }

if [ -s "$_file" ]
then
    echo "$_file has some data."
    # do something as file has data
else
    echo "$_file is empty."
    # do something as file is empty
fi
Check if running as root
if [ $UID -ne 0 ]; then
    echo "Run as root"
    exit 1;
fi
Control Structures
if
if condition
then
    statements
elif [ condition ]; then
    statements
else
    statements
fi
For example, we can run a cp command if two files are different.
if ! cmp -s "$filesrc" "$filecur"
then
    cp "$filesrc" "$filecur"
fi
String Comparison
http://stackoverflow.com/questions/2237080/how-to-compare-strings-in-bash
answer=no
if [ -f "genome.fa" ]; then
    echo -n 'Do you want to continue [yes/no]: '
    read answer
fi

if [ "$answer" == "no" ]; then
    echo AAA
fi

if [ "$answer"=="no" ]; then   # failed if condition
    echo BBB
fi
- You want the quotes around $answer, because if $answer is empty the unquoted test becomes [ == "no" ], which is a syntax error.
- Spaces matter in bash.
- The spaces between if and [, and before ], are important.
- A space before and after the double equal signs is important. Without them, "$answer"=="no" is a single non-empty string, which is always true, so even if we reply with 'yes' the code still runs the 'echo BBB' statement.
while
while condition
do
    statements
done
- https://www.cyberciti.biz/faq/bash-while-loop/, https://bash.cyberciti.biz/guide/While_loop
- http://www.tldp.org/LDP/Bash-Beginners-Guide/html/sect_09_02.html, http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-7.html
- Pipe and while
$ function mylist() {
    ls *.r
  }
$ mylist | while read file; do wc -l ${file}; done
until
until condition
do
    statements
done
case
How to Use Case Statements in Bash Scripts
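A minimal case sketch, using a hypothetical classify function: each branch is a glob pattern, alternatives are separated by |, the first matching branch wins, and * serves as the default.

```bash
#!/usr/bin/env bash
classify() {
    case $1 in
        *.tar.gz|*.tgz) echo "gzipped tarball" ;;
        *.zip)          echo "zip archive" ;;
        [0-9]*)         echo "starts with a digit" ;;
        "")             echo "empty" ;;
        *)              echo "unknown" ;;
    esac
}

classify backup.tgz     # gzipped tarball
classify 2024-notes     # starts with a digit
classify readme         # unknown
```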
Semicolon
Command1; command2; command3; command4
Every command is executed in turn, whether or not the previous one succeeded.
AND list &&
How To Run A Command After The Previous One Has Finished On Linux
statement1 && statement2 && statement3 && ...
Each statement runs only if the one before it finished successfully (exit status 0).
touch /tmp/f1
echo "data" >/tmp/f2
[ -s /tmp/f1 ]
echo $?    # 1
[ -s /tmp/f2 ]
echo $?    # 0
[ -s /tmp/f1 ] && echo "not empty" || echo "empty"    # empty
[ -s /tmp/f2 ] && echo "not empty" || echo "empty"    # not empty
OR list ||
statement1 || statement2 || statement3 || ...
Each statement runs only if the one before it failed (non-zero exit status).
For example,
codename=$(lsb_release -s -c)
if [ $codename == "rafaela" ] || [ $codename == "rosa" ]; then
    codename="trusty"
fi
Chaining rule (command1 && command2 || command3)
Coupled commands with control operators in Bash
10 Useful Chaining Operators in Linux with Practical Examples.
- Ampersand Operator (&),
- semi-colon Operator (;),
- AND Operator (&&),
- OR Operator (||),
- NOT Operator (!),
- AND – OR operator (&& – ||),
- PIPE Operator (|),
- Command Combination Operator {},
- Precedence Operator (),
- Concatenation Operator (\).
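A combination of the ‘AND‘ and ‘OR‘ operators is much like an ‘if-else‘ statement, with one difference: in command1 && command2 || command3, command3 also runs when command1 succeeds but command2 fails.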
$ ping -c3 www.google.com && echo "Verified" || echo "Host Down"
How to program with Bash: Syntax and tools
# command1 && command2
$ Dir=/root/testdir ; mkdir $Dir/ && cd $Dir

# command1 || command2
$ Dir=/root/testdir ; mkdir $Dir || echo "$Dir was not created."

# preceding commands ; command1 && command2 || command3 ; following commands
# "If command1 exits with a return code of 0, then execute command2, otherwise execute command3."
$ Dir=/root/testdir ; mkdir $Dir && cd $Dir || echo "$Dir was not created."
$ Dir=~/testdir ; mkdir $Dir && cd $Dir || echo "$Dir was not created."
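The chained form is not an exact if/else. In the sketch below, true and false stand in for real commands and the function names are hypothetical: the fallback of the chain runs because the middle command fails, while the else branch of the if statement does not.

```bash
#!/usr/bin/env bash
# Chained form: the || part runs whenever the command before it fails,
# including the middle command.
chained() {
    true && false || echo "fallback ran"
}

# if/else form: the else branch depends only on the first command.
branched() {
    if true; then
        false
    else
        echo "fallback ran"
    fi
}

chained                                       # prints "fallback ran"
branched || echo "branched printed nothing"   # else branch was skipped
```

When the middle command can fail and the fallback must not run in that case, a real if statement is the safer choice.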
for + do + done
for variable in values
do
    statements
done
The values can be an explicit list
i=1
for day in Mon Tue Wed Thu Fri
do
    echo "Weekday $((i++)) : $day"
done
or a variable
i=1
weekdays="Mon Tue Wed Thu Fri"
for day in $weekdays
do
    echo "Weekday $((i++)) : $day"
done
# Output
# Weekday 1 : Mon
# Weekday 2 : Tue
# Weekday 3 : Wed
# Weekday 4 : Thu
# Weekday 5 : Fri
Note that we should not put double quotes around the $weekdays variable here. Double quotes prevent word splitting, so the loop would run only once, with the whole string as a single value. See thegeekstuff article.
i=1
weekdays="Mon Tue Wed Thu Fri"
for day in "$weekdays"
do
    echo "Weekday $((i++)) : $day"
done
# Output
# Weekday 1 : Mon Tue Wed Thu Fri
To loop over all script files in a directory
FILES=/path/to/PATTERN*.sh
for f in $FILES; do
(
    "$f"
)&
done
wait
OR
FILES="
file1
/path/to/file2
/path/to/file3
"
for f in $FILES; do
(
    "$f"
)&
done
wait
Here we run each script in the background and wait until all of them have finished.
See loop over files from cyberciti.biz.
Example 1: convert pdfs to tifs using ImageMagick
For "for" looping over files, check cyberciti.biz.
outdir="../plosone"
indir="../fig"
if [[ ! -d $outdir ]]; then
    mkdir $outdir
fi

in=(file1.pdf file2.pdf file3.pdf)

for (( i=0; i<${#in[@]} ; i++ ))
do
    convert -strip -units PixelsPerInch -density 300 -resample 300 \
        -alpha off -colorspace RGB -depth 8 -trim -bordercolor white \
        -border 1% -resize '2049x2758>' -resize '980x980<' +repage \
        -compress lzw $indir/${in[$i]} $outdir/Figure$((i+1)).tiff
done
Example 2: download with wget and parsing with 'sed'
A second example is to download all the (Ontario gasoline price) data with wget and parsing and concatenating the data with other *nix tools like 'sed':
# Download data
for i in $(seq 1990 2014)
do
    wget http://www.energy.gov.on.ca/fuelupload/ONTREG$i.csv
done

# Retain the header
head -n 2 ONTREG1990.csv | sed 1d > ONTREG_merged.csv

# Loop over the files and use sed to extract the relevant lines
for i in $(seq 1990 2014)
do
    tail -n 15 ONTREG$i.csv | sed 13,15d | sed 's/./-01-'$i',/4' >> ONTREG_merged.csv
done
Example 3: download
Download all 20 sra files (60GB in total) from SRP032789.
for x in $(seq 1027175 1027180)
do
    wget ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP032/SRP032789/SRR$x/SRR$x.sra
done
https://github.com/MarioniLab/EmptyDrops2017/blob/master/data/download_10x.sh
for x in \
    http://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc4k/pbmc4k_raw_gene_bc_matrices.tar.gz \
    http://cf.10xgenomics.com/samples/cell-exp/2.1.0/neurons_900/neurons_900_raw_gene_bc_matrices.tar.gz \
    http://cf.10xgenomics.com/samples/cell-exp/1.1.0/293t/293t_raw_gene_bc_matrices.tar.gz \
    http://cf.10xgenomics.com/samples/cell-exp/1.1.0/jurkat/jurkat_raw_gene_bc_matrices.tar.gz \
    http://cf.10xgenomics.com/samples/cell-exp/2.1.0/t_4k/t_4k_raw_gene_bc_matrices.tar.gz \
    http://cf.10xgenomics.com/samples/cell-exp/2.1.0/neuron_9k/neuron_9k_raw_gene_bc_matrices.tar.gz
do
    wget $x
    destname=$(basename $x)
    stub=$(echo $destname | sed "s/_raw_.*//")
    mkdir -p $stub
    tar -xvf $destname -C $stub
    rm $destname
done
Example 4: convert files from DOS to Unix
Convert all files from DOS to Unix format
for f in *.txt; do tr -d '\r' < "$f" > tmp.txt; mv tmp.txt "$f" ; done
# Or, for the files given as arguments to a script
for f in "$@"; do tr -d '\r' < "$f" > tmp.txt; mv tmp.txt "$f" ; done
Example 5: print all files in a directory
for f in /etc/*.conf
do
    echo "$f"
done
Example 6: use ping to find all the live machines on the network
for ip in 192.168.0.{1..255} ;
do
    ping $ip -c 2 &> /dev/null ;
    if [ $? -eq 0 ];
    then
        echo $ip is alive
    fi
done
Example 7: sed on multiple files
for i in *.htm*; do sed -i 's/String1/String2/' "$i"; done
Note if the string contains special characters like forward slashes (eg https://www.google.com), we need to escape them by using the backslash sign.
Example 8: run in parallel
for ip in 192.168.0.{1..255} ;
do
    (
        ping $ip -c2 &> /dev/null ;
        if [ $? -eq 0 ];
        then
            echo $ip is alive
        fi
    )&
done
wait
where we enclose the loop body in ()&. () encloses a block of commands to run as a subshell and & sends it to the background. wait waits for all background jobs to complete.
Good technique !!!
- GNU parallel command
- http://unix.stackexchange.com/questions/103920/parallelize-a-bash-for-loop
- http://stackoverflow.com/questions/27934784/shell-script-to-loop-and-start-processes-in-parallel
- http://superuser.com/questions/158165/parallel-shell-loops
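A plain wait does not tell us which background jobs failed. One possible extension, sketched here with a made-up job that fails for the argument "bad", is to remember each job's process ID from $! and wait for them one by one; wait PID returns that job's exit status. The name run_all is hypothetical.

```bash
#!/usr/bin/env bash
run_all() {
    local pids="" failed=0 n pid
    for n in "$@"; do
        ( sleep 0.1; [ "$n" != bad ] ) &    # placeholder job: fails only for "bad"
        pids="$pids $!"                     # $! is the PID of the job just started
    done
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))   # collect each job's exit status
    done
    echo "failed jobs: $failed"
}

run_all a bad c    # failed jobs: 1
```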
wait command
- An example where we shall wait until files are deleted before continuing the script.
cd /home/ubuntu
if [ -d "R-devel" ]; then
    rm -rf "R-devel" &
    wait   # Wait for the deletion to complete
    echo "R-devel folder deleted successfully."
else
    echo "R-devel folder does not exist."
fi
wget -O - https://stat.ethz.ch/R/daily/R-devel.tar.gz | tar -xzk
cd R-devel
./configure --prefix=/opt/R/devel --enable-R-shlib
make
Functions
- http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-8.html, http://tldp.org/LDP/abs/html/functions.html
- http://www.thegeekstuff.com/2010/04/unix-bash-function-examples/
- https://www.howtoforge.com/tutorial/linux-shell-scripting-lessons-5/
- Cartoon from wizardzines.com
#!/bin/bash
fun () { echo "This is a function"; echo; }

fun () { echo "This is a function"; echo } # Error!

function quit {
    exit
}
function hello {
    echo Hello!
}
function e {
    echo $1
}

$ ./e World
How to find bash shell function source code on Linux/Unix
$ type -a function_name

# To list all function names
$ declare -F
$ declare -F | grep function_name
$ declare -F | grep foo
How do I find the file where a bash function is defined?
shopt -s extdebug   # with extdebug on, declare -F also prints the line number and source file
declare -F function_name
Function arguments
source ~/bin/setpath # add bgzip & tabix directories to $PATH

function raw2exon {
    # put your comments here
    inputvcf=$1
    outputvcf=$2
    inputbed=$3
    if [[ $4 ]]; then
        oldpath=$PWD
        cd $4
    fi

    bgzip -c $inputvcf > $inputvcf.gz
    tabix -p vcf $inputvcf.gz

    head -$(grep '#' $inputvcf | wc -l) $inputvcf > $outputvcf # header
    tabix -R $inputbed $inputvcf.gz >> $outputvcf
    wc -l $inputvcf
    wc -l $outputvcf
    rm $inputvcf.gz $inputvcf.gz.tbi

    if [[ $4 ]]; then
        cd $oldpath
    fi
}

inputbed=S04380110_Regions.bed
raw2exon 'mu0001_raw.vcf' 'mu0001_exon.vcf' $inputbed ~/Downloads/
Exit function
exit command and the exit statuses
$ cat testfun.sh
#!/bin/bash
ping -q -c 1 $1 >/dev/null 2>&1
if [ $? -ne 0 ]
then
    echo "An error occurred while checking the server status".
    exit 3
fi
exit 0

$ chmod +x testfun.sh
$ ./testfun.sh www.cyberciti.biz999
An error occurred while checking the server status.
$ echo $?
3
List of commands
break    ==> escape from an enclosing for, while or until loop
:        ==> null command
continue ==> make the enclosing for, while or until loop continue at the next iteration
.        ==> execute the commands from a file in the current shell
eval     ==> evaluate arguments
exec     ==> replace the current shell with a different program
export   ==> make the variable named as its parameter available to child processes
expr     ==> evaluate its arguments as an expression
printf   ==> similar to echo
set      ==> set the parameter variables for the shell. Useful for using fields in commands that output space-separated values
shift    ==> move all the parameter variables down by one
trap     ==> specify the actions to take on receipt of signals
unset    ==> remove variables or functions from the environment
mktemp   ==> create a temporary file
Run the previous command
Understanding the exclamation mark (!) in bash
$ apt update
# Permission denied
$ sudo !!
# Equivalent sudo apt update
"!" invokes history expansion. To run the most recent command beginning with “foo”:
!foo
# Run the most recent command beginning with "service" as root
sudo !service
Cache console output on the CLI?
Try the script command line utility to create a typescript of everything printed on your terminal.
To exit (to end the script session), type exit or logout, or press Control-D.
set -e, set -x and trap
Exit immediately if a command exits with a non-zero status. Type help set in command line. Very useful!
See also the trap command that is related to non-zero exit.
See
bash -x
Call your script with something like
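bash -x -v hello_world.sh
OR (the two options are merged into -xv because on Linux everything after the interpreter path is passed as a single argument)
#!/bin/bash -xv
echo Hello World!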
where
- -x displays commands and their results
- -v displays everything, even comments and spaces
This is the same as using set -x in your bash script.
set -x example
Bash script
set -ex
export DEBIAN_FRONTEND=noninteractive
codename=$(lsb_release -s -c)
if [ $codename == "rafaela" ] || [ $codename == "rosa" ]; then
    codename="trusty"
fi
echo $codename
echo step 1
echo step 2
exit 0
Without -x option:
trusty
step 1
step 2
With -x option:
+ export DEBIAN_FRONTEND=noninteractive
+ DEBIAN_FRONTEND=noninteractive
++ lsb_release -s -c
+ codename=rafaela
+ '[' rafaela == rafaela ']'
+ codename=trusty
+ echo trusty
trusty
+ echo step 1
step 1
+ echo step 2
step 2
+ exit 0
trap and error handler
- http://www.computerhope.com/unix/utrap.htm
- http://linuxcommand.org/wss0160.php
- http://www.tutorialspoint.com/unix/unix-signals-traps.htm
- http://www.ibm.com/developerworks/aix/library/au-usingtraps/
- http://bash.cyberciti.biz/guide/Trap_statement
- http://steve-parker.org/sh/trap.shtml (trap with a user-defined function)
- http://www.turnkeylinux.org/blog/shell-error-handling (set -e)
- http://unix.stackexchange.com/questions/17314/what-is-signal-0-in-a-trap-command (do something on EXIT)
- http://unix.stackexchange.com/questions/79648/how-to-trigger-error-using-trap-command
- Using Bash traps in your scripts
- How "Exit Traps" Can Make Your Bash Scripts Way More Robust And Reliable
The syntax to use trap command is
trap command signal
For example,
$ cat traptest.sh
#!/bin/sh
trap 'rm -f /tmp/tmp_file_$$' INT
echo creating file /tmp/tmp_file_$$
date > /tmp/tmp_file_$$
echo 'press interrupt to interrupt ...'
while [ -f /tmp/tmp_file_$$ ]; do
  echo file exists
  sleep 1
done
echo the file no longer exists
trap - INT
echo creating file /tmp/tmp_file_$$
date > /tmp/tmp_file_$$
echo 'press interrupt to interrupt ...'
while [ -f /tmp/tmp_file_$$ ]; do
  echo file exists
  sleep 1
done
echo we never get here
exit 0
will get an output like
$ ./traptest.sh
creating file /tmp/tmp_file_21389
press interrupt to interrupt ...
file exists
file exists
^Cthe file no longer exists
creating file /tmp/tmp_file_21389
press interrupt to interrupt ...
file exists
file exists
^C
The first trap installs a handler that deletes the file when we hit Ctrl+C, so the first loop ends and the script continues. The second trap (trap - INT) does not specify any command to be executed when an INT signal occurs, so the default behavior is restored: Ctrl+C terminates the script, and the final echo and exit statements are never executed.
Note that the following two are different.
trap - INT
trap '' INT
The second command will IGNORE signals (Ctrl+C in this case) so if we apply this statement above, we will not be able to use Ctrl+C to kill the execution.
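Several of the links above use a trap on EXIT for cleanup. The following is a minimal sketch of that idea; the function name make_report is hypothetical, and its body runs in a subshell so that the EXIT trap fires when the function finishes rather than when the calling shell ends.

```shell
# Hypothetical helper: does its work in a subshell and cleans up on exit.
make_report() (
  tmpfile=$(mktemp)
  trap 'rm -f "$tmpfile"' EXIT   # runs when the subshell exits, however it exits
  date > "$tmpfile"
  echo "$tmpfile"                # report the path that was used
)

used_path=$(make_report)
if [ -e "$used_path" ]; then
  cleanup_result="temporary file left behind"
else
  cleanup_result="temporary file removed by the EXIT trap"
fi
echo "$cleanup_result"
```

Compared with a trap on INT only, the EXIT trap also covers a normal exit and an exit caused by set -e.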
DEBUG trap to step through line by line
You can use the "DEBUG" trap to step through a bash script line by line
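A minimal sketch of the DEBUG trap, assuming bash. The three-line demo script is made up for illustration and only prints each command before it runs; for interactive stepping, the echo in the trap body could be replaced by a read that waits for Enter.

```shell
# Write a tiny demo script whose DEBUG trap announces every command.
demo_script=$(mktemp)
cat > "$demo_script" <<'EOF'
trap 'echo "next: $BASH_COMMAND"' DEBUG
a=1
b=$((a + 1))
echo "b=$b"
EOF

# The trap fires before each simple command that follows it.
trace=$(bash "$demo_script")
rm -f "$demo_script"
printf '%s\n' "$trace"
```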
Bash shell find out if a command exists or not
http://www.cyberciti.biz/faq/unix-linux-shell-find-out-posixcommand-exists-or-not/
POSIX
POSIX built-in commands
- command is one of bash built-in commands (alias, bind, command, declare, echo, help, let, printf, read, source, type, typeset, ulimit and unalias).
- Bash Builtin Commands and Shell Builtin Commands
- Bash source code
- What is command on bash?
- What is the difference between a builtin command and one that is not?
- Use command command to tell if a command can be found.
- Use type command to tell if a command is built-in.
# command -v will return >0 when the command1 is not found
command -v command1 >/dev/null && echo "command1 Found In \$PATH" || echo "command1 Not Found in \$PATH"
$ help command
command: command [-pVv] command [arg ...]
    Execute a simple command or display information about commands.
    Runs COMMAND with ARGS suppressing shell function lookup, or display
    information about the specified COMMANDs. Can be used to invoke commands
    on disk when a function with the same name exists.
    Options:
      -p    use a default value for PATH that is guaranteed to find all of
            the standard utilities
      -v    print a description of COMMAND similar to the `type' builtin
      -V    print a more verbose description of each COMMAND
    Exit Status:
    Returns exit status of COMMAND, or failure if COMMAND is not found.
$ type command
command is a shell builtin
$ type export
export is a shell builtin
$ type wget
wget is /usr/bin/wget
$ type tophat
-bash: type: tophat: not found
$ type sleep
sleep is /bin/sleep
$ command -v tophat
$ command -v wget
/usr/bin/wget
On macOS,
$ help command
command: command [-pVv] command [arg ...]
    Runs COMMAND with ARGS ignoring shell functions. If you have a shell
    function called `ls', and you wish to call the command `ls', you can
    say "command ls". If the -p option is given, a default value is used
    for PATH that is guaranteed to find all of the standard utilities. If
    the -V or -v option is given, a string is printed describing COMMAND.
    The -V option produces a more verbose description.
type -P
type -P command1 &>/dev/null && echo "Found" || echo "Not Found"
$ help type
type: type [-afptP] name [name ...]
    Display information about command type.
    For each NAME, indicate how it would be interpreted if used as a
    command name.
    Options:
      -a    display all locations containing an executable named NAME;
            includes aliases, builtins, and functions, if and only if
            the `-p' option is not also used
      -f    suppress shell function lookup
      -P    force a PATH search for each NAME, even if it is an alias,
            builtin, or function, and returns the name of the disk file
            that would be executed
      -p    returns either the name of the disk file that would be executed,
            or nothing if `type -t NAME' would not return `file'.
      -t    output a single word which is one of `alias', `keyword',
            `function', `builtin', `file' or `', if NAME is an alias, shell
            reserved word, shell function, shell builtin, disk file, or not
            found, respectively
    Arguments:
      NAME  Command name to be interpreted.
    Exit Status:
    Returns success if all of the NAMEs are found; fails if any are not found.
typeset: typeset [-aAfFgilrtux] [-p] name[=value] ...
    Set variable values and attributes.
    Obsolete. See `help declare'.
Find all bash builtin commands
https://www.cyberciti.biz/faq/linux-unix-bash-shell-list-all-builtin-commands/
$ help
$ help | less
$ help | grep read
Find if a command is internal or external
$ type -a COMMAND-NAME-HERE
$ type -a cd
$ type -a uname
$ type -a :
$ command -V ls
$ command -V cd
$ command -V food
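The checks above can be wrapped in two small helpers. This is only a sketch: the function names have_cmd and cmd_kind are hypothetical, and cmd_kind relies on type -t, which is a bash feature.

```shell
# have_cmd NAME: succeed if NAME would be found as a command of any kind.
have_cmd() {
  command -v "$1" > /dev/null 2>&1
}

# cmd_kind NAME: print alias, keyword, function, builtin or file (from type -t),
# or "missing" when NAME cannot be found.
cmd_kind() {
  type -t "$1" 2> /dev/null || echo "missing"
}

for name in cd sh have_cmd no-such-command-xyz; do
  if have_cmd "$name"; then
    echo "$name: $(cmd_kind "$name")"
  else
    echo "$name: not found"
  fi
done
```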
pause by read -p command
http://www.cyberciti.biz/tips/linux-unix-pause-command.html
read -p "Press [Enter] key to start backup..."
If we want to ask users about a yes/no question, we can use this method
while true; do
  read -p "Do you wish to install this program? " yn
  case $yn in
    [Yy]* ) make install; break;;
    [Nn]* ) exit;;
    * ) echo "Please answer yes or no.";;
  esac
done
OR
echo "Do you wish to install this program?"
select yn in "Yes" "No"; do
  case $yn in
    Yes ) make install; break;;
    No ) exit;;
  esac
done
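The read loop above can also be packaged as a reusable function. A minimal sketch, assuming bash; the name confirm is hypothetical, and the function returns a status instead of running make install so that the caller decides what to do.

```shell
# confirm PROMPT: return 0 for yes, 1 for no or end of input.
confirm() {
  local reply
  while read -r -p "$1 [y/n] " reply; do
    case $reply in
      [Yy]*) return 0 ;;
      [Nn]*) return 1 ;;
      *) echo "Please answer yes or no." >&2 ;;
    esac
  done
  return 1   # end of input without a clear answer
}

# Answers can be piped in, which is handy for unattended runs.
if echo "yes" | confirm "Do you wish to install this program?"; then
  echo "would run: make install"
fi
```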
Keyboard input and Arithmetic
http://linuxcommand.org/wss0110.php
read
#!/bin/bash
echo -n "Enter some text > "
read text
echo "You entered: $text"
Arithmetic
#!/bin/bash
# An application of the simple command
#   echo $((2+2))
# That is, when you surround an arithmetic expression with the double parentheses,
# the shell will perform arithmetic evaluation.
first_num=0
second_num=0
echo -n "Enter the first number --> "
read first_num
echo -n "Enter the second number -> "
read second_num
echo "first number + second number = $((first_num + second_num))"
echo "first number - second number = $((first_num - second_num))"
echo "first number * second number = $((first_num * second_num))"
echo "first number / second number = $((first_num / second_num))"
echo "first number % second number = $((first_num % second_num))"
echo "first number raised to the"
echo "power of the second number = $((first_num ** second_num))"
and a program that formats an arbitrary number of seconds into hours and minutes:
#!/bin/bash
seconds=0
echo -n "Enter number of seconds > "
read seconds
# use the division operator to get the quotient
hours=$((seconds / 3600))
# use the modulo operator to get the remainder
seconds=$((seconds % 3600))
minutes=$((seconds / 60))
seconds=$((seconds % 60))
echo "$hours hour(s) $minutes minute(s) $seconds second(s)"
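The same arithmetic can be written as a function that takes the number of seconds as an argument, which makes it easy to reuse and to check. A sketch, assuming bash; the name format_duration is hypothetical.

```shell
# format_duration SECONDS: split a number of seconds into hours, minutes, seconds.
format_duration() {
  local total=$1
  local hours=$((total / 3600))              # quotient: whole hours
  local minutes=$(( (total % 3600) / 60 ))   # whole minutes left after the hours
  local seconds=$((total % 60))              # seconds left after the minutes
  echo "$hours hour(s) $minutes minute(s) $seconds second(s)"
}

format_duration 3725   # prints: 1 hour(s) 2 minute(s) 5 second(s)
```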
xargs
xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (the default command is echo, located at /bin/echo) one or more times with any initial-arguments followed by items read from standard input.
- How to Use the xargs Command on Linux. Need to string some Linux commands together, but one of them doesn’t accept piped input.
$ touch a.txt b.txt
$ ls -1 ./*.txt
./a.txt
./b.txt
$ ls -1 ./*.txt | xargs
./a.txt ./b.txt
- Using xargs in Combination With bash -c to Create Complex Commands
Example1 - Find files named core in or below the directory /tmp and delete them
find /tmp -name core -type f -print0 | xargs -0 /bin/rm -f
where -0 tells xargs that input items are terminated by a NUL character (as produced by find -print0) instead of whitespace. Without it, file names containing blanks, quotes, or newlines are split or misparsed, and many commands will not work.
Another case: suppose I have a file named -sT. rm treats the name as a set of options, so it cannot be deleted by its bare name; rm ./-sT or rm -- -sT works, and so does find combined with xargs.
$ rm "-sT"
rm: invalid option -- 's'
Try 'rm ./-sT' to remove the file ‘-sT’.
Try 'rm --help' for more information.
$
$ ls *T
ls: option requires an argument -- 'T'
Try 'ls --help' for more information.
$ ls "*T"
ls: cannot access *T: No such file or directory
$ ls "*s*"
ls: cannot access *s*: No such file or directory
$ find . -maxdepth 1 -iname '*-sT'
./-sT
$ find . -maxdepth 1 -iname '*-sT' | xargs -0 /bin/rm -f
$ find . -maxdepth 1 -iname '*-sT' | xargs /bin/rm -f # WORKS
Similarly, suppose I have a zero-size file named "-f3". Plain rm -f3 fails because the name is parsed as options.
$ ls -lt
total 448
-rw-r--r-- 1 mingc mingc 0 Jan 16 11:35 -f3
$ rm -f3
rm: invalid option -- '3'
Try `rm ./-f3' to remove the file `-f3'.
Try `rm --help' for more information.
$ find . -size 0 -print0 |xargs -0 rm
Example2 - Find files from the grep command and sort them by date
grep -l "Polyphen" tmp/*.* | xargs ls -lt
Example3 - Gzip with multiple jobs
CORES=$(grep -c '^processor' /proc/cpuinfo)
find /source -type f -print0 | xargs -0 -n 1 -P $CORES gzip -9
where
- find -print0 / xargs -0 protects you from whitespace in filenames
- xargs -n 1 means one gzip process per file
- xargs -P specifies the number of jobs
- gzip -9 means maximum compression
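To see what -print0 and -0 protect against, the sketch below counts how many items xargs hands to echo with and without NUL delimiters. The file names are made up, and the example assumes the temporary directory path itself contains no whitespace.

```shell
workdir=$(mktemp -d)
touch "$workdir/plain.txt" "$workdir/with space.txt"

# Whitespace-delimited input: "with space.txt" is split into two items.
naive_count=$(find "$workdir" -type f | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL-delimited input: each file name stays one item.
safe_count=$(find "$workdir" -type f -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

rm -rf "$workdir"
echo "naive: $naive_count items, safe: $safe_count items"
```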
GNU Parallel
- http://www.gnu.org/software/parallel/
- https://www.gnu.org/software/parallel/parallel_tutorial.html
- https://www.biostars.org/p/63816/
- https://biowize.wordpress.com/2015/03/23/task-automation-with-bash-and-parallel/
- http://www.shakthimaan.com/posts/2014/11/27/gnu-parallel/news.html
- https://www.msi.umn.edu/support/faq/how-can-i-use-gnu-parallel-run-lot-commands-parallel
- http://deepdish.io/2014/09/15/gnu-parallel/
- http://davetang.org/muse/2013/11/18/using-gnu-parallel/
- https://vimeo.com/20838834, https://youtu.be/OpaiGYxkSuQ
A simple trick that does not need GNU Parallel is to run the commands in the background.
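A minimal sketch of the background approach, using & and wait. The function slow_job is a hypothetical stand-in for a long-running command. Unlike parallel --jobs, this starts every job at once, so it suits a small, known number of jobs.

```shell
outdir=$(mktemp -d)

# Hypothetical job: stands in for a long-running command.
slow_job() {
  sleep 1
  echo "job $1 finished"
}

for i in 1 2 3; do
  slow_job "$i" > "$outdir/job$i.log" &   # start each job in the background
done
wait                                       # block until all background jobs exit

results=$(cat "$outdir"/job*.log)
rm -rf "$outdir"
printf '%s\n' "$results"
```

Each job writes to its own log file so that the outputs do not interleave.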
Example: same command, different command line argument
Input from the command line (Synopsis about the triple colon ":::"):
parallel echo ::: A B C
parallel gzip --best ::: *.html # '--best' means best compression
parallel gunzip ::: *.CEL.gz
Input from a file:
parallel -a abc-file echo
Input from STDIN:
cat abc-file | parallel echo
find . -iname "*after*" | parallel wc -l
Another similar example is gzipping each file individually.
Example: each command containing an index
Instead of
for i in $(seq 1 100)
do
  someCommand data$i.fastq > output$i.txt &
done
, we can use
parallel --jobs 16 someCommand data{}.fastq '>' output{}.txt ::: {1..100}
Example: each command not containing an index
for i in *gz; do
  zcat $i > $(basename $i .gz).unpacked
done
can be written as
parallel 'zcat {} > {.}.unpacked' ::: *.gz
Example: run several subscripts from a master script
Suppose I have a bunch of script files: script1.sh, script2.sh, ... And an optional master script (file ext does not end with .sh). My goal is to run them using GNU Parallel.
I can just run them using
parallel './{}' ::: *.sh
where "./" means the .sh files are located in the current directory and {} denotes each individual .sh file.
More detail:
$ mkdir test-par; cd test-par
$ echo echo A > script1.sh
$ echo echo B > script2.sh
$ echo echo C > script3.sh
$ echo echo D > script4.sh
$ chmod +x *.sh
$ cat > script # master script (not needed for GNU parallel method)
./script1.sh
./script2.sh
./script3.sh
./script4.sh
$ time bash script
A
B
C
D
real 0m0.025s
user 0m0.004s
sys 0m0.004s
$ time parallel './{}' ::: *.sh # No need of a master script
                                # may need to add --gnu option if asked.
A
B
C
D
real 0m0.778s
user 0m0.588s
sys 0m0.144s
# longer time because of the parallel overhead
Note
- When I run scripts (seqtools_vc) sequentially I can get the standard output on screen. However, I may not get these output when I use GNU parallel.
- There is a risk/problem if all scripts are trying to generate required/missing files when they detect the required files are absent.
rush - cross-platform tool for executing jobs in parallel
Debugging Scripts
- How To Enable Shell Script Debugging Mode in Linux (very good) Some options (note options can be used in 1. the set command 2. the first line of the shell file or 3. the terminal where the shell is invoked)
- -e: exit if a command yields a nonzero exit status
- -v: short for verbose
- -n: short for noexec or no execution
- -x: short for xtrace or execution trace
- How to Trace Execution of Commands in Shell Script with Shell Tracing
- How to Perform Syntax Checking Debugging Mode in Shell Scripts
- http://www.cyberciti.biz/tips/debugging-shell-script.html
Run a shell script with the -x option. Each command in the script is then printed as it is executed (the trace is written to stderr), so we can see which line takes a long time or which line broke the code (the script still runs through to the end).
$ bash -x script-name
- Use of set builtin command
- Use of intelligent DEBUG function
To run a bash script line by line:
- Bash Debugger
- Use Geany. See the next section.
Geany
- (Ubuntu 12.04 only): By default, it does not have the terminal tab. Install virtual terminal emulator. Run
sudo apt-get install libvte-dev
- Step 1: Keyboard shortcut. Select a region of code, then choose Edit -> Commands -> Send Selection to Terminal. You can also assign a keybinding for this: go to Edit -> Preferences and pick the Keybindings tab. See a screenshot here. I assign F12 (typed without quotes) as the shortcut. This is a complete list of the keybindings.
- Step 2: Newline character. Another issue is that the last line of sent code does not have a newline character. So I need to switch to the Terminal and press Enter. The solution is to modify the <geany.conf> (find its location using locate geany.conf. On my ubuntu 14 (geany 1.26), it is under ~/.config/geany/geany.conf) and set send_selection_unsafe=true. See here.
- Step 3: PATH variable.
$ tmpname=$(basename $inputVCF)
Command 'basename' is available in '/usr/bin/basename'
The command could not be located because '/usr/bin' is not included in the PATH environment variable.
The solution is to run PATH=$PATH:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin in the Terminal window before running our script.
- Step 4 (optional): Change background color.
Another handy change to geany is to change its background to black. To do that, go to Edit -> Preferences -> Editor. Once on the Editor options level, select the Display tab to the far right of the dialog, and you will notice a checkbox marked invert syntax highlighting colors.
See this post about changing the default terminal in the Terminal window. The default is xterm (see the output of echo $TERM).
Examples
- <upgrade8.sh> file from BioLinux installation page
- Install required R packages using a mixture of bash and R.
How to wrap a long linux command
Use a backslash character. However, make sure the backslash is the last character on the line. For example, the first example below does not work because there is an extra space character after \.
Example 1 (does not work)
sudo apt-get install libcap-dev libbz2-dev libgcrypt11-dev libpci-dev libnss3-dev libxcursor-dev \ libxcomposite-dev libxdamage-dev libxrandr-dev libdrm-dev libfontconfig1-dev libxtst-dev \ libcups2-dev libpulse-dev libudev-dev
vs example 2 (works)
sudo apt-get install libcap-dev libbz2-dev libgcrypt11-dev libpci-dev libnss3-dev libxcursor-dev \ libxcomposite-dev libxdamage-dev libxrandr-dev libdrm-dev libfontconfig1-dev libxtst-dev \ libcups2-dev libpulse-dev libudev-dev
pushd and popd are used to switch between multiple directories without copying and pasting directory paths. They operate on a stack, a last-in, first-out (LIFO) data structure.
pushd /var/www
pushd /usr/src
dirs
pushd +2
popd
When we have only two locations, an alternative and easier way is cd -.
cd /usr/src
# Do something
cd /var/www
cd - # /usr/src
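A small sketch of the stack behavior of pushd and popd, assuming bash. It uses two throwaway directories (the names www and src only echo the example above) and runs in a subshell so that the caller's working directory is not changed.

```shell
base=$(mktemp -d)
mkdir "$base/www" "$base/src"

visited=$(
  cd "$base/www" || exit 1
  pushd "$base/src" > /dev/null   # stack is now: src www
  basename "$PWD"                 # we are in src
  popd > /dev/null                # pop src, return to www
  basename "$PWD"                 # we are back in www
)
rm -rf "$base"
printf '%s\n' "$visited"
```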
bd – Quickly Go Back to a Parent Directory
- https://www.tecmint.com/bd-quickly-go-back-to-a-linux-parent-directory/
- https://raw.github.com/vigneshwaranr/bd/master/bd
Create log file
- Create a log file with date
logfile="output_$(date +"%Y%m%d%H%M").log"
- Redirect the error to a log file
logfile="output_$(date +"%Y%m%d%H%M").log"
module load XXX || exit 1
echo "Error output redirected to '$logfile'"
set -ex
exec 2>"$logfile"
# Task 1
start_time=$(date +%s)
# Do something with possible error output
end_time=$(date +%s)
echo "Task 1 Started: $start_time; Ended: $end_time; Elapsed time: $((end_time - start_time)) sec" >> "$logfile"
# Task 2
start_time=$(date +%s)
# Do something with possible error output
end_time=$(date +%s)
echo "Task 2 Started: $start_time; Ended: $end_time; Elapsed time: $((end_time - start_time)) sec" >> "$logfile"
Text processing
tr (similar to sed)
tr does not accept general regular expressions; it works on sets of characters.
The tr utility copies its input to its output, substituting or deleting selected characters. The name tr is short for translate (or transliterate).
- http://www.thegeekstuff.com/2012/12/linux-tr-command/
- http://www.cyberciti.biz/faq/how-to-use-linux-unix-tr-command/
- https://www.howtoforge.com/linux-tr-command/
It will read from STDIN and write to STDOUT. The syntax is
tr [OPTION] SET1 [SET2]
If both SET1 and SET2 are specified and the -d option is not, tr replaces each character in SET1 with the character at the same position in SET2. For example,
# translate to uppercase
$ echo 'linux' | tr "[:lower:]" "[:upper:]"
# Translate braces into parentheses
$ tr '{}' '()' < inputfile > outputfile
# Replace comma with line break
$ tr ',' '\n' < inputfile
# Split a long line using the space
$ echo $line | tr ' ' '\n'
# Translate white-space to tabs
$ echo "This is for testing" | tr '[:space:]' '\t'
# Join/merge all the lines in a file into a single line
$ tr -s '\n' ' ' < file.txt
# note sed cannot match \n as easily as the tr command.
# See
# http://stackoverflow.com/questions/1251999/how-can-i-replace-a-newline-n-using-sed
# https://unix.stackexchange.com/questions/26788/using-sed-to-convert-newlines-into-spaces
tr can also be used to remove particular characters using -d option. For example,
$ echo "the geek stuff" | tr -d 't'
he geek suff
$ tr -d "\15" < input > output # octal 15 is the carriage return character
A practical example
#!/bin/bash
echo -n "Enter file name : "
read myfile
echo -n "Are you sure ( yes or no ) ? "
read confirmation
confirmation="$(echo ${confirmation} | tr 'A-Z' 'a-z')"
if [ "$confirmation" == "yes" ]; then
  [ -f $myfile ] && /bin/rm $myfile || echo "Error - file $myfile not found"
else
  : # do nothing
fi
Second example
$ ifconfig | cut -c-10 | tr -d ' ' | tr -s '\n' eth0 eth1 ip6tnl0 lo sit0 # without tr -s '\n' eth0 eth1 ip6tnl0 lo sit0
where tr -d ' ' deletes every space character in each line. The \n newline character is squeezed using tr -s '\n' to produce a list of interface names. We use cut to extract the first 10 characters of each line.
Regular Expression and grep
- https://regexper.com/ You can type for example '[a-z]*.[0-9]' to see what it is doing.
- ( ?[a-zA-Z]+ ?) match all words in a given text
- [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} match an IP address
- 15 Practical Grep Command Examples In Linux
- Sed bracket expressions. sed remove last character from each line.
- Period means a single character. Using Grep & Regular Expressions to Search for Text Patterns in Linux
- Linux command line: grep PATTERN FILENAME or grep -E 'PATTERN1|PATTERN2' FILENAME (extended regular expression)
echo -e "today is Monday\nHow are you" | grep Monday
grep -E "[a-z]+" filename   # or egrep "[a-z]+" filename
grep -i PATTERN FILENAME    # ignore case
grep -v PATTERN FILENAME    # inverse match
grep -c PATTERN FILENAME    # count the number of lines in which a matching string appears
grep -n PATTERN FILENAME    # print the line number
grep -R PATTERN DIR         # recursively search many files and follow symbolic links
grep -r PATTERN DIR         # recursively search many files
grep -e "pattern1" -e "pattern2" FILENAME   # multiple patterns OR operation (older Linux)
egrep 'pattern1|pattern2' FILENAME          # multiple patterns (newer Linux)
grep -f PATTERNFILE FILENAME   # PATTERNFILE contains patterns line-by-line
grep -F PATTERN FILENAME       # Interpret PATTERN as a list of fixed strings, separated by
                               # newlines, any of which is to be matched.
grep -r --include \*.Rmd --include \*.R "file\.csv" ./   # search with only Rmd & R files
grep -r --exclude "README" PATTERN DIR   # excluding files in which to search
grep -o '<dt>.*</dt>' FILENAME   # print only the matched string (<dt> .... </dt>); quote the pattern so < and > are not taken as redirections
grep -w                          # checking for full words, not for sub-strings
grep -E -w "SRR2923335.1|SRR2923335.1999"   # match in words (either SRR2923335.1 or SRR2923335.1999)
- Extract the IP address from ifconfig command
$ ifconfig eth1
eth1      Link encap:Ethernet HWaddr 00:14:d1:b0:df:9f
          inet addr:192.168.1.172 Bcast:192.168.1.255 Mask:255.255.255.0
          inet6 addr: fe80::214:d1ff:feb0:df9f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:29113 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:28561660 (28.5 MB) TX bytes:3516957 (3.5 MB)
$ ifconfig eth1 | egrep -o "inet addr:[^ ]*" | grep -o "[0-9.]*"
192.168.1.172
where egrep -o "inet addr:[^ ]*" matches text that starts with inet addr: and continues with a sequence of non-space characters (specified by [^ ]*). The next command in the pipe prints only the run of digits and '.' characters.
--include option
- how do I use the grep --include option for multiple file types? You can use multiple --include flags. grep -r --include=*.{html,php,htm} "pattern" /some/path/
grep -r --include=*.{c,cpp} PATTERN DIR # including files in which to search
- grep --include command doesn't work in OSX Zsh. The trick is to use quotes.
grep -rl --include='*.Rmd' "pattern" ./
grep --include='*.rb' --include='*.h*' -rnw . -e "pattern"
Bash Find Out IF a Variable Contains a Substring
- Bash Find Out IF a Variable Contains a Substring
- How to Tell If a Bash String Contains a Substring on Linux
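Two common ways to test for a substring, sketched below. The helper name contains and the sample values are made up; the case form works in any POSIX shell, while [[ ]] assumes bash.

```shell
# contains STRING SUBSTRING: succeed if SUBSTRING occurs in STRING.
contains() {
  case $1 in
    *"$2"*) return 0 ;;   # quoting "$2" keeps glob characters in it literal
    *)      return 1 ;;
  esac
}

path="/usr/local/bin:/usr/bin:/bin"

if contains "$path" "/usr/bin"; then echo "case: found"; fi

# The same test with bash's [[ ]]; the quoted part is matched literally.
needle="local"
if [[ $path == *"$needle"* ]]; then echo "[[ ]]: found"; fi
```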
grep returns TRUE or FALSE
Can grep return true/false or are there alternative methods
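grep reports a match through its exit status (0 for a match, 1 for no match), and -q suppresses the output so that only the status is used. A short sketch with made-up log text:

```shell
log=$(printf '%s\n' "job started" "warning: disk almost full" "job finished")

# -q prints nothing; the exit status alone drives the if statement.
if printf '%s\n' "$log" | grep -q "warning"; then
  has_warning=true
else
  has_warning=false
fi

# Capture the status itself: it stays 0 unless grep fails to match.
error_status=0
printf '%s\n' "$log" | grep -q "error" || error_status=$?

echo "has_warning=$has_warning error_status=$error_status"
```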
less -S: print long lines
Causes lines longer than the screen width to be chopped rather than folded. man less.
cut: extract columns or character positions from text files
http://www.thegeekstuff.com/2013/06/cut-command-examples/
cut -f 5-7 somefile # columns 5-7.
cut -c 5-7 somefile # character positions 5-7
The default delimiter is TAB. If the field delimiter is different from TAB you need to specify it using -d:
cut -d' ' -f100-105 myfile > outfile
#
cut -d: -f6 somefile # colon-delimited file
#
grep "/bin/bash" /etc/passwd | cut -d':' -f1-4,6,7 # field 1 through 4, 6 and 7
cut -f3 --complement somefile # print all the columns except the third column
To specify the output delimiter, we shall use --output-delimiter. NOTE that to specify the Tab delimiter in cut, we shall use $'\t'. See http://www.computerhope.com/unix/ucut.htm. For example,
cut -f 1,3 -d ':' --output-delimiter=$'\t' somefile
If I am not sure about the number of the final field, I can leave the number off.
cut -f 1- -d ':' --output-delimiter=$'\t' somefile
A simple shell function to show the first 3 columns and 3 rows of the matrix
function show_matrix() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "Usage: show_matrix <filename> <delimiter>"
    return 1
  fi
  if [ "$2" != "tab" ] && [ "$2" != "comma" ]; then
    echo "Delimiter must be 'tab' or 'comma'"
    return 1
  fi
  if [ "$2" == "tab" ]; then
    cut -f1-3 "$1" | head -n 3
  elif [ "$2" == "comma" ]; then
    cut -d',' -f1-3 "$1" | head -n 3
  fi
}
# show_matrix data.txt tab
# show_matrix data.txt comma
awk: operate on rows and/or columns
awk is a tool designed to work with data streams. It can operate on columns and rows. It supports many built-in features, such as arrays and functions, with a syntax similar to the C programming language. Its biggest advantage is its flexibility.
- https://en.wikipedia.org/wiki/AWK
- https://www.tutorialspoint.com/awk/awk_workflow.htm
- http://www.thegeekstuff.com/2010/01/awk-introduction-tutorial-7-awk-print-examples
- http://www.theunixschool.com/p/awk-sed.html
- http://www.grymoire.com/Unix/Awk.html
- https://www.howtogeek.com/562941/how-to-use-the-awk-command-on-linux/
- The many faces of awk
- Plucking out columns of data
- Printing simple text
- Doing math with awk
Structure of an awk script
awk pattern { action }
awk ' BEGIN{ print "start" }
      pattern { AWK commands }
      END { print "end" } ' file
All three components (the BEGIN block, the END block, and the main block with its optional pattern) are optional, and any of them can be absent from the script. The pattern is also called a condition.
By default, fields are separated by runs of whitespace (spaces and tabs).
Some examples:
awk 'BEGIN { i=0 } { i++ } END { print i}' filename
echo -e "line1\nline2" | awk 'BEGIN { print "start" } { print } END { print "End" }'
seq 5 | awk 'BEGIN { sum=0; print "Summation:" } { print $1"+"; sum+=$1 } END { print "=="; print sum }'
awk -F : '{print $6}' somefile # colon-delimited file, print the 6th field (cut can do it)
# awk --field-separator="\\t" '{print $6}' filename # tab-delimited (cut can do it)
awk -F":" '{ print $1 " " $3 }' /etc/passwd # (cut can do it)
awk -F "\t" '{OFS="\t"} {$1="mouse"$1; print $0}' genes.gtf > genescb.gtf
# or
awk -F "\t" 'BEGIN {OFS="\t"} {$1="mouse"$1; print $0}' genes.gtf > genescb.gtf
# replace ELEMENT with mouseELEMENT for data on the 1st column; tab separator was used for input (-F) and output (OFS)
awk 'NR % 4 == 1 {print ">" $0 } NR % 4 == 2 {print $0}' input > output
# extract rows 1,2,5,6,9,10,13,14,.... from input
awk 'NR % 4 == 0 {print ">" $0 } NR % 4 == 3 {print $0}' input > output
# extract rows 3,4,7,8,11,12,15,16,.... from input
awk '(NR==2),(NR==4) {print $0}' input
# print rows 2-4.
awk '{ print ($1-32)*(5/9) }'
# fahrenheit-to-celsius calculator, http://www.hcs.harvard.edu/~dholland/computers/awk.html
# http://stackoverflow.com/questions/3700957/printing-lines-from-a-file-where-a-specific-field-does-not-start-with-something
awk '$7 !~ /^mouse/ { print $0 }' input # column 7 not starting with 'mouse'
awk '$7 ~ /^mouse/ { print $0 }' input # column 7 starting with 'mouse'
awk '$7 ~ /mouse/ { print $0 }' input # column 7 containing 'mouse'
AWK is useful for finding or counting a subset of rows or columns. It is less commonly used for string substitution.
Print the string between two parentheses
https://unix.stackexchange.com/questions/108250/print-the-string-between-two-parentheses
$ awk -F"[()]" '{print $2}' file
$ echo ">gi|52546690|ref|NM_001005239.1| subfamily H, member 1 (OR11H1), mRNA" | awk -F"[()]" '{print $2}'
OR11H1
$ echo ">gi|284172348|ref|NM_002668.2| proteolipid protein 2 (colonic epithelium-enriched) (PLP2), mRNA" | awk -F"[()]" '{print $2}'
colonic epithelium-enriched # WRONG
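One possible way around the WRONG case above, assuming the wanted name is always the last parenthesized group on the line: with -F"[()]" that group is field NF-1, because the text after the final ")" is the last field. The function name last_group is hypothetical.

```shell
# last_group: print the last parenthesised group of each input line.
# Lines without a complete (...) pair have fewer than 3 fields and are skipped.
last_group() {
  awk -F"[()]" 'NF >= 3 { print $(NF-1) }'
}

echo ">gi|284172348|ref|NM_002668.2| proteolipid protein 2 (colonic epithelium-enriched) (PLP2), mRNA" | last_group
echo ">gi|52546690|ref|NM_001005239.1| subfamily H, member 1 (OR11H1), mRNA" | last_group
```

For these two headers the function prints PLP2 and OR11H1. It does not handle nested parentheses.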
Insert a line
https://stackoverflow.com/a/18276534
awk '/KEYWORDS/ { print; print "new line"; next }1' foo.input
Count number of columns in file
https://stackoverflow.com/a/8629351
awk -F'|' '{print NF; exit}' stores.dat # Change '|' as needed
sed (stream editor): substitution of text
By default, sed writes the edited text to standard output and leaves the input file unchanged. To save the substitutions back to the same file, use the -i option.
sed 's/text/replace/' file > newfile
mv newfile file
# OR better
sed -i 's/text/replace/' file
The sed command will replace the first occurrence of the pattern in each line. If we want to replace every occurrence, we need to add the g parameter at the end, as follows:
sed -i 's/pattern/replace/g' file
To remove blank lines
sed '/^$/d' filename
# method 1. replace ] & [ by the empty string
$ echo '00[123]44' | sed 's/[][]//g'
0012344
# method 2 - use tr
$ echo '00[123]00' | tr -d '[]'
0012300
To replace all three-digit numbers with another specified word in a file
sed -i 's/\b[0-9]\{3\}\b/NUMBER/g' filename
echo -e "I love 111 but not 1111." | sed 's/\b[0-9]\{3\}\b/NUMBER/g'
where \{3\} matches the preceding expression exactly three times (the backslashes give { and } their special meaning in a basic regular expression), and \b is the word boundary marker.
Variable string and quoting
text=hello echo hello world | sed "s/$text/HELLO/"
With double quotes, the shell expands $text before sed sees the expression; with single quotes, $text would be passed to sed literally.
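A short sketch of the difference between the two quoting styles, using the same variable as above:

```shell
text=hello

# Single quotes: sed receives the literal characters $text, which do not occur in the input.
single=$(echo 'hello world' | sed 's/$text/HELLO/')

# Double quotes: the shell expands $text first, so sed receives s/hello/HELLO/.
double=$(echo 'hello world' | sed "s/$text/HELLO/")

echo "single quotes: $single"   # hello world
echo "double quotes: $double"   # HELLO world
```

If the variable may contain the delimiter or regular expression metacharacters, it has to be escaped before it is used this way.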
sed takes whatever follows the "s" as the separator
- Using different delimiters in sed
- http://www.grymoire.com/Unix/Sed.html#uh-2 ,
- https://en.wikipedia.org/wiki/Sed#Substitution_command
Suppose I would like to replace "../jquery-ui.min.js" with "jquery-ui.js". Using | as the separator avoids having to escape the slashes:
echo '<script src="../jquery-ui.min.js"></script>' | sed 's|../jquery-ui.min.js|jquery-ui.js|g'
# <script src="jquery-ui.js"></script>
$ cat tmp
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@RG ID:NEAT
$ sed 's,^@RG.*,@RG\tID:None\tSM:None\tLB:None\tPL:Illumina,g' tmp
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@RG ID:None SM:None LB:None PL:Illumina
$ sed 's/^@RG.*/@RG\tID:None\tSM:None\tLB:None\tPL:Illumina/g' tmp
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@RG ID:None SM:None LB:None PL:Illumina
Case insensitive
https://www.cyberciti.biz/faq/unixlinux-sed-case-insensitive-search-replace-matching/
# Newer version - add 'i' or 'I' after 'g'
sed 's/find-word/replace-word/gI' input.txt > output.txt
sed -i 's/find-word/replace-word/gI' input.txt
# Older version/macOS
sed 's/[wW][oO][rR][dD]/replace-word/g' input.txt > output.txt
sed 's/[Ll]inux/Unix/g' input.txt > output.txt
macOS
"undefined label" error on Mac OS X
$ sed -i 's/mkyong/google/g' testing.txt
sed: 1: "testing.txt": undefined label 'esting.txt'
# Solution
$ sed -i '.bak' 's/mkyong/google/g' testing.txt
Application: Get the top directory name of a tarball or zip file without extracting it
dn=`unzip -vl filename.zip | sed -n '5p' | awk '{print $8}'` # 5 is the line number to print
echo -e "$(basename $dn)"
dn=`tar -tf filename.tar.bz2 | grep -o '^[^/]\+' | sort -u` # '-u' means unique
echo -e $dn
dn=`tar -tf filename.tar.gz | grep -o '^[^/]\+' | sort -u`
echo -e $dn
# Assume there is a sub-directory called htslibXXXX
dn=$(basename `find -maxdepth 1 -name 'htslib*'`)
echo -e $dn
Application: Grab the line number from the 'grep -n' command output
Follow here
grep -n 'regex' filename | sed 's/^\([0-9]\+\):.*$/\1/' # return the line number of each match
# OR
grep -n 'regex' filename | awk -F: '{print $1}'
echo 123:ABCD | sed 's/^\([0-9]\+\):.*$/\1/' # 123
where \( and \) mark a group in the pattern and \1 in the replacement refers to the text matched by that group. See http://www.grymoire.com/Unix/Sed.html for more examples, e.g. search repeating words or special patterns.
If we want to find the top directory for a zipped file (see wikipedia for the zip format), we can use
unzip -vl snpEff.zip | head | grep -n 'CRC-32' | awk -F: '{print $1}'
Application: Delete first few characters on each row
http://www.theunixschool.com/2014/08/sed-examples-remove-delete-chars-from-line-file.html
- To remove 1st n characters of every line:
# delete the first 4 characters from each line
$ sed -r 's/.{4}//' file
Application: delete lines
- Delete a single line
- Delete a range of lines
- Delete multiple lines
- Delete all lines except specified range
- Delete empty lines
- Delete lines based on pattern
- Delete lines starting with a specific character
- Delete lines ending with specific character
- Deleting lines that match the pattern and the next line
- Deleting line from the pattern match to the end
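The cases listed above can be sketched with sed's `d` command. This is only an illustration on a hypothetical 12-line sample file (the numbers 1 to 10 with an empty line and a `# note` line in the middle); the addresses and patterns are assumptions to adapt to a real file. Add `-i` to edit a file in place.

```shell
#!/bin/bash
# Hypothetical sample: 1..5, an empty line, "# note", then 6..10
demo=$(mktemp)
{ seq 1 5; echo; echo "# note"; seq 6 10; } > "$demo"

single=$(sed '3d' "$demo")             # delete a single line (line 3)
range=$(sed '2,4d' "$demo")            # delete a range of lines
multi=$(sed '1d;5d' "$demo")           # delete several specific lines
keep=$(sed '2,4!d' "$demo")            # delete all lines except lines 2-4
noempty=$(sed '/^$/d' "$demo")         # delete empty lines
nopattern=$(sed '/note/d' "$demo")     # delete lines matching a pattern
nohash=$(sed '/^#/d' "$demo")          # delete lines starting with '#'
noend=$(sed '/0$/d' "$demo")           # delete lines ending with '0'
andnext=$(sed '/^2$/{N;d;}' "$demo")   # delete the matching line and the next line
toend=$(sed '/^6$/,$d' "$demo")        # delete from the first match to the end

echo "$keep"
rm -f "$demo"
```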
Application: comment out certain lines
https://unix.stackexchange.com/a/128595. To comment lines 2 through 4 of bla.conf:
sed -i '2,4 s/^/#/' bla.conf
This is useful when I need to comment out line 240 & 242 on shell scripts (related to pdf file) generated from BRB-SeqTools.
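When the lines are not contiguous, as with lines 240 and 242, each line number can be given its own substitution. A minimal sketch on a hypothetical five-line file, commenting out lines 2 and 4 only (use `-i` and the real line numbers to change an actual script in place):

```shell
#!/bin/bash
conf=$(mktemp)                      # hypothetical stand-in for the real script
printf 'a\nb\nc\nd\ne\n' > "$conf"

# One -e expression per line number to comment out
result=$(sed -e '2s/^/#/' -e '4s/^/#/' "$conf")
echo "$result"
rm -f "$conf"
```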
Substitution of text: perl
- Add or remove 'chr' from vcf file https://www.biostars.org/p/18530/
How to delete the first few rows of a text file
Suppose we want to remove the first 3 rows of a text file
- sed
$ sed -e '1,3d' < t.txt # output to screen
$ sed -i -e 1,3d yourfile # directly change the file
- tail
$ tail -n +4 t.txt # output to screen
- awk
$ awk 'NR > 3 { print }' < t.txt # output to screen
Delete the last row of a file
sed -i '$d' FILE
Show the first few characters from a text file
head -c 50 file # return the first 50 bytes
Remove/Delete The Empty Lines In A File
https://www.2daygeek.com/remove-delete-empty-lines-in-a-file-in-linux/
sed -i '/^$/d' File # delete empty lines
sed -i '/KEYWORD/d' File # delete lines containing KEYWORD
cat: merge by rows
cat file1 file2 > output
paste: merge by columns
paste -d"\t" file1 file2 file3 > output
paste file1 file2 file3 | column -s $'\t' -t > output
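To make the difference between the two merges concrete, here is a small sketch using two hypothetical two-line files created in a temporary directory:

```shell
#!/bin/bash
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/file1"
printf '1\n2\n' > "$workdir/file2"

# cat stacks the files: a, b, 1, 2 (four lines)
by_rows=$(cat "$workdir/file1" "$workdir/file2")

# paste joins line i of each file with a tab: "a<TAB>1", "b<TAB>2" (two lines)
by_cols=$(paste -d'\t' "$workdir/file1" "$workdir/file2")

echo "$by_rows"
echo "$by_cols"
rm -rf "$workdir"
```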
Web
Reference: Linux Shell Scripting Cookbook
Copy a complete website
wget --mirror --convert-links URL
# OR
wget -r -N -k -l DEPTH URL
HTTP or FTP authentication
wget --user username --password pass URL
Download a web page as plain text (instead of HTML text)
lynx URL -dump > TextWebPage.txt
cURL
curl http://google.com -o index.html --progress-bar
curl http://google.com --silent -o index.html

# Cookies
curl http://example.com --cookie "user=ABCD;pass=EFGH"
curl URL --cookie-jar cookie_file

# Setting a user agent string
# http://www.useragentstring.com/pages/useragentstring.php
curl URL --user-agent "Mozilla/5.0"

# Authenticating
curl -u user:pass http://test_auth.com
curl -u user http://test_auth.com

# Printing response headers excluding the data
# For example, to check whether a page is reachable or not
# by checking the 'Content-length' parameter.
curl -I URL
Image crawler and downloader
#!/bin/bash
#Desc: Images downloader
#Filename: img_downloader.sh

if [ $# -ne 3 ]; then
  echo "Usage: $0 URL -d DIRECTORY"
  exit 1
fi

# Parse the arguments: '-d DIRECTORY' and the URL may come in either order
for i in {1..4}
do
  case $1 in
    -d) shift; directory=$1; shift ;;
    *) url=${url:-$1}; shift;;
  esac
done

mkdir -p "$directory"
baseurl=$(echo "$url" | grep -E -o "https?://[a-z.]+")

echo Downloading $url
# Extract the src attribute of every <img> tag
curl -s "$url" | grep -E -o "<img src=[^>]*>" |
  sed 's/<img src=\"\([^"]*\).*/\1/g' > /tmp/$$.list

# Turn site-relative paths into absolute URLs
sed -i "s|^/|$baseurl/|" /tmp/$$.list

cd "$directory"

while read filename; do
  echo Downloading $filename
  curl -s -O "$filename"
done < /tmp/$$.list
Find broken links in a website by lynx -traversal
#!/bin/bash
#Desc: Find broken links in a website

if [ $# -ne 1 ]; then
  echo -e "Usage: $0 URL\n"
  exit 1
fi

echo Broken links:

mkdir /tmp/$$.lynx
cd /tmp/$$.lynx

lynx -traversal $1 > /dev/null
count=0

sort -u reject.dat > links.txt

while read link; do
  output=`curl -I $link -s | grep "HTTP/.*OK"`
  if [[ -z $output ]]; then
    echo $link
    let count++
  fi
done < links.txt

[ $count -eq 0 ] && echo No broken links found.
Track changes to a website
#!/bin/bash
#Desc: Script to track changes to webpage

if [ $# -ne 1 ]; then
  echo -e "Usage: $0 URL\n"
  exit 1
fi

first_time=0
# Not first time

if [ ! -e "last.html" ]; then
  first_time=1
  # Set it is first time run
fi

curl --silent $1 -o recent.html

if [ $first_time -ne 1 ]; then
  changes=$(diff -u last.html recent.html)
  if [ -n "$changes" ]; then
    echo -e "Changes:\n"
    echo "$changes"
  else
    echo -e "\nWebsite has no changes"
  fi
else
  echo "[First run] Archiving.."
fi

cp recent.html last.html
POST/GET
Look at the source of the web page and find the 'name' field in each <input> tag.
http://www.w3schools.com/html/html_forms.asp
# -d is used for posting in curl
curl URL -d "postvar1=var1&postvar2=var2"
# OR the 'wget' command with the '--post-data' option
wget URL --post-data "postvar1=var1&postvar2=var2" -O out.html
Change detection of a website
- http://bhfsteve.blogspot.com/2013/03/monitoring-web-page-for-changes-using.html
- https://www.reddit.com/r/commandline/comments/2e2bkj/linux_software_to_monitor_website_changes/
- http://specto.sourceforge.net/ and https://www.linux.com/news/monitor-web-page-changes-specto
- http://www.mostlymaths.net/2010/01/cron-diff-wget-watch-changes-in-webpage.html
Working with Files
iconv command
- How To Use the iconv Command on Linux
- How to Convert Files to UTF-8 Encoding in Linux
- https://stackoverflow.com/questions/11316986/how-to-convert-iso8859-15-to-utf8
$ file test.R
test.R: ISO-8859 text, with CRLF line terminators
$ iconv -f ISO-8859 -t UTF-8 test.R # 'ISO-8859' is not supported
$ iconv -t UTF-8 test.R # partial conversion??
$ iconv -f ISO-8859-1 -t UTF-8 test.R # Works
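The conversion can be checked on a tiny file. The sketch below writes a hypothetical one-line ISO-8859-1 file containing the byte 0xE9 (é) and converts it; in UTF-8 the same character is the two bytes 0xC3 0xA9. It assumes an iconv that knows the name ISO-8859-1, as in the session above.

```shell
#!/bin/bash
latin1=$(mktemp)                 # hypothetical sample file
printf 'caf\351\n' > "$latin1"   # octal 351 = 0xE9, 'é' in ISO-8859-1

utf8=$(iconv -f ISO-8859-1 -t UTF-8 "$latin1")
echo "$utf8"
rm -f "$latin1"
```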
nl command
Add line numbers to a text file
$ cat demo_file
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.

Two lines above this line is empty.
And this is the last line.

$ nl demo_file
     1  THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
     2  this line is the 1st lower case line in this file.
     3  This Line Has All Its First Character Of The Word With Upper Case.

     4  Two lines above this line is empty.
     5  And this is the last line.
file command
$ file thumbs/g7.jpg
thumbs/g7.jpg: JPEG image data, JFIF standard 1.01, resolution (DPI), density 72x72, segment length 16, Exif Standard: [TIFF image data, little-endian, direntries=10, orientation=upper-left, xresolution=134, yresolution=142, resolutionunit=2, software=Adobe Photoshop CS Windows, datetime=2004:03:31 22:28:58], baseline, precision 8, 100x75, frames 3
$ file index.html
index.html: HTML document, ASCII text
$ file 2742OS_5_01.sh
2742OS_5_01.sh: Bourne-Again shell script, ASCII text executable
$ file R-3.2.3.tar.gz
R-3.2.3.tar.gz: gzip compressed data, last modified: Thu Dec 10 03:12:50 2015, from Unix
date
Displaying dates and times your way in Linux
print by skipping rows
http://stackoverflow.com/questions/604864/print-a-file-skipping-x-lines-in-bash
$ tail -n +<N+1> <filename> # excluding first N lines
                            # print by starting at line N+1.
$ tail -n +11 /tmp/myfile # starting at line 11, or skipping the first 10 lines
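The N+1 arithmetic is easy to get wrong, so it can be wrapped in a small function. This is only a sketch; `skip_lines` is a hypothetical helper name, and the sample file simply holds the numbers 1 to 15.

```shell
#!/bin/bash
# skip_lines N FILE: print FILE without its first N lines
skip_lines() {
  tail -n +"$(( $1 + 1 ))" "$2"
}

sample=$(mktemp)
seq 1 15 > "$sample"

rest=$(skip_lines 10 "$sample")   # lines 11 to 15
echo "$rest"
rm -f "$sample"
```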
tail -f (follow)
With the '-f' (follow) option we can monitor a growing file. For example, create a new file called tmp.txt and run 'tail -f tmp.txt'. Then open another terminal in the same directory and run 'for i in {0..100}; do sleep 2; echo $i >> tmp.txt ; done'. The first terminal shows each new line as it is appended to tmp.txt.
Practical examples:
- Monitor system change
sudo tail -f /var/log/syslog
- Monitor a process and terminate when the given process dies
PID=$(pidof Foo)
tail -f textfile --pid $PID
Here a process Foo (e.g. gedit) is appending data to a file, and tail -f keeps running until the process Foo dies.
Low-level File Access
- file descriptors: 0 means standard input, 1 means standard output, 2 means standard error.
- ssize_t write(int fildes, const void *buf, size_t nbytes);
#include <unistd.h>
#include <stdlib.h>

int main()
{
    /* "Here is some data\n" is 18 bytes long, so a full write returns 18 */
    if ((write(1, "Here is some data\n", 18)) != 18)
        write(2, "A write error has occurred on file descriptor\n", 46);
    exit(0);
}
- ssize_t read(int fildes, void *buf, size_t nbytes); returns the number of data bytes actually read. If a read call returns 0, it had nothing to read; it reached the end of the file. An error on the call will cause it to return -1.
- To create a new file descriptor we use the open system call. int open(const char *path, int oflags, mode_t mode);
- The next program will do file copy.
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>

int main()
{
    char c;
    int in, out;

    in = open("file.in", O_RDONLY);
    /* create file.out with read/write permission for the owner */
    out = open("file.out", O_WRONLY|O_CREAT, S_IRUSR|S_IWUSR);
    /* copy one byte at a time until read() reports end of file */
    while(read(in,&c,1) == 1)
        write(out,&c,1);

    exit(0);
}
The Standard I/O Library
- fopen, fclose
- fread, fwrite
- fflush
- fseek
- fgetc, getc, getchar
- fputc, putc, putchar
- fgets, gets
- printf, fprintf and sprintf
- scanf, fscanf and sscanf
Formatted Input and Output
- printf, fprintf and sprintf
- scanf, fscanf and sscanf
Stream Errors
File and Directory Maintenance
Scanning Directories
- opendir, closedir
- readdir
- telldir
- seekdir
UNIX environment
Logging
Resources and Limits
Terminals
Fun command line utilities
Turn Your Terminal Into A Playground: 20+ Funny Linux Command Line Tools: cowsay, fortune, figlet, sl, ASCIIquarium, cmatrix, lolcat, ponysay, charasay, party parrot, ternimal, paclear, lavat, pond, cbonsai, dotacat, finger, pinky, no more secrets, hollywood, bucklespring, bb, toilet, sl-alt, fetch utilities, telehack, display star wars episode.
Reading from and Writing to the Terminal
The termios Structure
Terminal Output
Detecting Keystrokes
Curses
A technique between command line and full GUI.
Example: vi.
Data Management
Development Tools
Books
Top Linux developers' recommended programming books
GNU Make and Makefiles
- minimal make: a minimal tutorial on make from Karl Broman.
- http://makefiletutorial.com/index.html
- Notes for new Make users
Writing a Manual Page
Distributing Software
The patch Program
Debugging
debug a bash shell
How To Debug a Bash Shell Script Under Linux or UNIX
gdb
Processes and Signals
Search a process ID by its name
Use pgrep https://askubuntu.com/questions/612315/how-do-i-search-for-a-process-by-name-without-using-grep. For example (tested on Linux and macOS),
$ pgrep RStudio # assume RStudio is running
27043
$ pgrep geany # geany is not running.
$
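Because pgrep returns a zero exit status only when at least one process matches, it can be used directly in a script condition. The sketch below starts a background `sleep` as a stand-in for a real program and tests for it; the `-x` option (match the process name exactly) and the variable names are choices made for this illustration.

```shell
#!/bin/bash
sleep 30 &            # stand-in for the process we want to detect
bgpid=$!
sleep 1               # give the background process a moment to start

if pgrep -x sleep > /dev/null; then
  status="running"
else
  status="not running"
fi
echo "sleep is $status"

kill "$bgpid"
wait "$bgpid" 2>/dev/null || true   # reap the killed process quietly
```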