Python: Difference between revisions

(13 intermediate revisions by the same user not shown)
Line 28: Line 28:


== Install, setup ==
=== Multiple python versions: pyenv ===
 
* https://github.com/pyenv/pyenv
=== Alias ===
* [https://opensource.com/article/20/4/pyenv How to use pyenv to run multiple versions of Python on a Mac]
[https://www.freecodecamp.org/news/how-to-fix-python-installation-errors-on-mac/ How to Fix Common Python Installation Errors on macOS]
* [https://ubuntushell.com/install-multiple-python-versions-on-ubuntu/ How to Install Multiple Python Versions on Ubuntu Using Pyenv]
<pre>
nano ~/.bash_profile
# or
nano ~/.zshrc
</pre>
Add the following line
<pre>
alias python=python3
</pre>


=== Ubuntu ===
Line 54: Line 62:
* On my 2021 Mac running macOS Ventura, the default python3 is at "/usr/bin". However, the first time I ran 'python3', it prompted me to install the command line developer tools. After that installation python3 works. Still, there is no "~/Library/Python" directory.


== Online compiler ==
== Multiple python versions ==
* [https://www.python.org/shell/ Python.org]. It seems this has most modules.
[https://www.freecodecamp.org/news/manage-multiple-python-versions-and-virtual-environments-venv-pyenv-pyvenv-a29fb00c296f/ How to manage multiple Python versions and virtual environments] 2018.  
* [https://www.tutorialspoint.com/online_python_compiler.php Tutorialspoint]
* [https://onecompiler.com/python OneCompiler]


== IDE ==
=== conda/mamba ===
* [https://www.jetbrains.com/pycharm/ PyCharm]
<pre>
** Pycharm was used by [https://youtu.be/rfscVS0vtbw Learn Python - Full Course for Beginners ] (freeCodeCamp.org)
conda create --name myenv python=3.6.12
** [https://stackoverflow.com/questions/7681431/run-python-source-code-line-by-line Run a code line by line] by changing the keyboard shortcuts from Settings -> Keymap -> Other.
conda activate myenv
** To run the current file, right click the tab and select Run XXX. (Frustrated)
* [http://thonny.org/ Thonny]
* [https://www.spyder-ide.org/ Spyder]
* RStudio
** Create a file (xxx.py)
** Click the terminal tab. Type 'python' (or ipython3).
** Use Ctrl/CMD + Alt + Enter to run your python code line by line or a chunk.


=== Visual Studio Code ===
mamba create --name myenv python=3.6.12
* [https://youtu.be/h1sAzPojKMg Get started with Jupyter Notebooks in less than 4 minutes] (video)
mamba activate myenv
* [https://code.visualstudio.com/docs/datascience/jupyter-notebooks Jupyter Notebooks in VS Code], [https://code.visualstudio.com/docs/languages/python Python in Visual Studio Code]
</pre>
* [https://ithome.com.tw/news/156795 The VS Code Python extension no longer pre-installs the Jupyter extension] (in Chinese)


The ipynb file can contain figures.  
=== pyenv ===
* https://github.com/pyenv/pyenv
* [https://github.com/posit-dev/positron/wiki Positron wiki]
* [https://opensource.com/article/20/4/pyenv How to use pyenv to run multiple versions of Python on a Mac]
* [https://ubuntushell.com/install-multiple-python-versions-on-ubuntu/ How to Install Multiple Python Versions on Ubuntu Using Pyenv]


[https://github.com/immunogenomics/harmony2019/tree/master/notebooks This (Harmony Manuscript)] has several notebook files in which the code was written in R, not Python.
<pre>
pyenv install 3.6.12
pyenv virtualenv 3.6.12 myenv
pyenv activate myenv
</pre>


I can use [https://stackoverflow.com/a/61707097 VS Code] to open an ipynb file.
=== virtualenv ===
<pre>
pip install virtualenv
virtualenv -p python3.6 myenv
source myenv/bin/activate
</pre>


Conversion
=== venv (python 3.3+) ===
* Rmd to ipynb
<ul>
** [https://github.com/mkearney/rmd2jupyter rmd2jupyter] package
<li>For Python 3, '''venv''' is the more common choice because it has been part of the standard library since Python 3.3, so nothing extra needs to be installed. '''virtualenv''' is still popular, especially among developers who need more advanced features or compatibility with older Python versions; it works with both Python 2 and Python 3.
** [https://codes.correlaid.org/first%20steps/r/jupyter/2021/03/02/Convert-Rmd-files-to-Jupyter-Notebook.html How to convert Rmd to ipynb notebook]: '''Jupytext''' and '''notedown'''.
<ul>
** [https://vatlab.github.io/sos-docs/index.html Script of Scripts (SoS) ]
<li>The primary purpose of '''venv''' is to create isolated environments for managing Python packages and dependencies. It does not install or select a Python version; the environment uses whichever interpreter runs the command. For instance, <syntaxhighlight lang='sh' inline>python3.10 -m venv myenv</syntaxhighlight>.
* ipynb to Rmd
<li>On the other hand, '''virtualenv''' lets you choose the interpreter for the environment with the '''-p''' option, provided that interpreter is already installed. For instance, <syntaxhighlight lang='sh' inline>virtualenv -p /usr/bin/python3.10 myenv</syntaxhighlight>.
** [https://rmarkdown.rstudio.com/docs/reference/convert_ipynb.html  Convert ipynb to Rmd]
</ul>


=== nbdev ===
<li>[https://www.howtogeek.com/start-project-python-virtual-environments/ Don’t Make This Mistake When You Start Your Python Project]
* [https://github.com/fastai/nbdev nbdev]
* [https://towardsdatascience.com/jupyter-is-now-a-full-fledged-ide-c99218d33095 Jupyter is now a full-fledged IDE] Literate programming is now a reality through nbdev and the new visual debugger for Jupyter.


=== Emacs ===
<li>[https://python-code.dev/articles/236456323 Understanding Python's Virtual Environment Landscape: venv vs. virtualenv, Wrapper Mania, and Dependency Control]
[https://stackoverflow.com/a/7053298 Emacs Shell mode: how to send region to shell?]


== JupyterLab ==
<li>Here is another example
* [https://stackoverflow.com/a/52392304 What is the difference between Jupyter Notebook and JupyterLab?]
<syntaxhighlight lang='bash'>
* Jupyter Notebook (classic) and JupyterLab are both web-based interactive computing environments for working with data and code, but they have some key differences in terms of their user interface, features, and capabilities.
~/github/PUREE$ python3 -m venv myenv
* JupyterLab is a more modern and powerful tool than Jupyter Notebook, and is recommended for users who want a more flexible and feature-rich interface for working with data and code. However, Jupyter Notebook remains a popular and widely used tool, particularly for working with Jupyter notebooks.
The virtual environment was not created successfully because ensurepip is not
* [https://towardsdatascience.com/7-reasons-why-you-should-use-jupyterlab-for-data-science-7c2a3db8755a 7 Reasons Why You Should Use Jupyterlab for Data Science]
available. On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.


=== Some resources ===
    apt install python3.10-venv
* [https://www.dataschool.io/cloud-services-for-jupyter-notebook/ Six easy ways to run your Jupyter Notebook in the cloud]
...
* [http://www.win-vector.com/blog/2020/03/cross-methods-are-a-leak-variance-trade-off/ Cross-Methods are a Leak/Variance Trade-Off]
~/github/PUREE$ sudo apt install python3.10-venv
* [https://opensource.com/article/20/11/daily-journal-jupyter Journal five minutes a day with Jupyter]
~/github/PUREE$ python3 -m venv myenv
* [https://www.dataquest.io/blog/jupyter-notebook-tutorial/ How to Use Jupyter Notebook in 2020: A Beginner’s Tutorial]
* [https://www.makeuseof.com/get-started-with-jupyter-notebook/ Get Started With Jupyter Notebook: A Tutorial]
* [https://datalya.com/blog/python-data-science/jupyter-notebook-command-mode-keyboard-shortcuts Jupyter Notebook Command Mode Keyboard Shortcuts]
** Enter: edit mode
** Esc: command mode
** Ctrl-Enter: run cell
** '''Shift-Enter''': run current cell, and select cell below
** '''Alt-Enter''': run cell, insert a cell below
** Y: to code
** M: to markdown
** 1: to insert heading 1
** 2,3,4,5,6: to insert heading 2,3,4,5,6


=== Online tools rendering ipynb ===
~/github/PUREE$ source myenv/bin/activate
* Github
(myenv) ~/github/PUREE$ which python
* NBViewer (nbviewer.jupyter.org)
/home/brb/github/PUREE/myenv/bin/python
* Google Colaboratory (colab.research.google.com)
(myenv) ~/github/PUREE$ pip freeze > requirements.txt
* Binder (mybinder.org)
(myenv) ~/github/PUREE$ deactivate
* Kaggle Notebooks (kaggle.com)
~/github/PUREE$
* Azure Notebooks (notebooks.azure.com)
* Datalore (datalore.jetbrains.com)
* Deepnote (deepnote.com)
* CoCalc (cocalc.com)


=== Different installation methods ===
~/github/PUREE$ ls myenv/bin
<ul>
activate      activate.fish  f2py  f2py3.10    pip  pip3.10  python3    wheel
<li>https://pypi.org/project/jupyterlab/
activate.csh  Activate.ps1  f2py3  normalizer  pip3  python  python3.10
<li>https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html
</syntaxhighlight>
<li>[https://www.howtoforge.com/guide-to-install-jupyterlab-on-debian-12/ Guide to Install JupyterLab on Debian 12]. Hint: this requires Node.js. '''Node.js''' is used for building and managing JupyterLab’s '''JavaScript''' dependencies. Many Jupyter extensions need working '''npm''' (which comes with Node.js) and '''jlpm''' commands to download extensions or other JavaScript dependencies. Node.js itself is built with '''GYP''', a cross-platform build tool written in Python, which is another reason why Python is needed.
</li>
<li>[https://www.howtoforge.com/how-to-install-jupyterlab-on-rocky-linux-9/ How to Install JupyterLab on Rocky Linux 9]
</ul>
<syntaxhighlight lang='sh'>
mkdir -p ~/project; cd ~/project
python3 -m venv myenv    # '-m venv' means to run venv module as a script
source myenv/bin/activate
pip3 install jupyter
which jupyter
jupyter --version


jupyter server --generate-config
== Online compiler ==
jupyter server password
* [https://www.python.org/shell/ Python.org]. It seems this has most modules.
* [https://www.tutorialspoint.com/online_python_compiler.php Tutorialspoint]
* [https://onecompiler.com/python OneCompiler]


jupyter lab --generate-config
== IDE ==
jupyter lab --show-config
* [https://www.jetbrains.com/pycharm/ PyCharm]
** Pycharm was used by [https://youtu.be/rfscVS0vtbw Learn Python - Full Course for Beginners ] (freeCodeCamp.org)
** [https://stackoverflow.com/questions/7681431/run-python-source-code-line-by-line Run a code line by line] by changing the keyboard shortcuts from Settings -> Keymap -> Other.
** To run the current file, right click the tab and select Run XXX. (Frustrated)
* [http://thonny.org/ Thonny]
* [https://www.spyder-ide.org/ Spyder]
* RStudio
** Create a file (xxx.py)
** Click the terminal tab. Type 'python' (or ipython3).
** Use Ctrl/CMD + Alt + Enter to run your python code line by line or a chunk.


sudo firewall-cmd --add-port=8888/tcp
=== Visual Studio Code ===
jupyter lab --ip 192.168.5.120
* [https://youtu.be/h1sAzPojKMg Get started with Jupyter Notebooks in less than 4 minutes] (video)
# http://192.168.5.120:8888/
* [https://code.visualstudio.com/docs/datascience/jupyter-notebooks Jupyter Notebooks in VS Code], [https://code.visualstudio.com/docs/languages/python Python in Visual Studio Code]
</syntaxhighlight>
* [https://ithome.com.tw/news/156795 The VS Code Python extension no longer pre-installs the Jupyter extension] (in Chinese)
<Li>[https://www.linuxcapable.com/how-to-install-jupyter-notebook-on-ubuntu-linux/ How to Install Jupyter Notebook on Ubuntu 24.04, 22.04 or 20.04]
</ul>


=== JupyterLab Desktop ===
The ipynb file can contain figures.  
* https://github.com/jupyterlab/jupyterlab-desktop
* [https://blog.jupyter.org/introducing-the-new-jupyterlab-desktop-bca1982bdb23 Introducing the new JupyterLab Desktop!] 2/9/2023


=== pip/pip3 Jupyter ===
[https://github.com/immunogenomics/harmony2019/tree/master/notebooks This (Harmony Manuscript)] has several notebook files in which the code was written in R, not Python.
<ul>
<li>https://jupyter.org/
<pre>
which python3
# /usr/bin/python3


pip3 install jupyterlab
I can use [https://stackoverflow.com/a/61707097 VS Code] to open an ipynb file.
jupyter-lab
# http://localhost:8888/lab
# The current directory will be available on the file browser panel in JupyterLab.
</pre>
On Mac, it shows the following when I run 'pip3 install jupyterlab'
{{Pre}}
Installing collected packages: pip
  WARNING: The scripts pip, pip3 and pip3.9 are installed in '/Users/XXX/Library/Python/3.9/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-23.0
WARNING: You are using pip version 21.2.4; however, version 23.0 is available.
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
</pre>
That is, I need to use '''/Users/XXX/Library/Python/3.9/bin/jupyter-lab''' to launch '''jupyter-lab'''  OR add the path to ".zshrc"; see [https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html jupyterlab doc].
</li>
</ul>


=== conda + Jupyter ===
Conversion
* [https://stackoverflow.com/a/58068850 How to use Jupyter notebooks in a conda environment?] Option 1 works for me.
* Rmd to ipynb
* https://github.com/dylkot/cNMF/#analyze_simulated_example_data.ipynb
** [https://github.com/mkearney/rmd2jupyter rmd2jupyter] package
<pre>
** [https://codes.correlaid.org/first%20steps/r/jupyter/2021/03/02/Convert-Rmd-files-to-Jupyter-Notebook.html How to convert Rmd to ipynb notebook]: '''Jupytext''' and '''notedown'''.
conda install --yes jupyterlab && conda clean --yes --all
** [https://vatlab.github.io/sos-docs/index.html Script of Scripts (SoS) ]
</pre>
* ipynb to Rmd
** [https://rmarkdown.rstudio.com/docs/reference/convert_ipynb.html  Convert ipynb to Rmd]


=== IPython shell ===
=== nbdev ===
<ul>
* [https://github.com/fastai/nbdev nbdev]
<li>[https://opensource.com/article/21/3/ipython-shell-jupyter-notebooks Why I love using the IPython shell and Jupyter notebooks]
* [https://towardsdatascience.com/jupyter-is-now-a-full-fledged-ide-c99218d33095 Jupyter is now a full-fledged IDE] Literate programming is now a reality through nbdev and the new visual debugger for Jupyter.
<pre>
ipython  # Shell


jupyter notebook # auto open the browser
=== Emacs ===
</pre>
[https://stackoverflow.com/a/7053298 Emacs Shell mode: how to send region to shell?]
</li>
<li>[https://medium.com/analytics-vidhya/why-switch-to-jupyterlab-from-jupyter-notebook-c6d98362945b Why switch to JupyterLab from jupyter-notebook?]
</li>
</ul>


=== Extract python code from Jupyter notebook ===
== JupyterLab ==
* [https://stackoverflow.com/a/56832942 Get only the code out of Jupyter Notebook]. '''nbconvert''' or '''jq'''. Or File -> Download as -> Python (.py) — this should export all code cells as single .py file.
* [https://stackoverflow.com/a/52392304 What is the difference between Jupyter Notebook and JupyterLab?]
* [https://win-vector.com/2022/04/30/separating-code-from-presentation-in-jupyter-notebooks/ Separating Code from Presentation in Jupyter Notebooks]
* Jupyter Notebook (classic) and JupyterLab are both web-based interactive computing environments for working with data and code, but they have some key differences in terms of their user interface, features, and capabilities.
* JupyterLab is a more modern and powerful tool than Jupyter Notebook, and is recommended for users who want a more flexible and feature-rich interface for working with data and code. However, Jupyter Notebook remains a popular and widely used tool, particularly for working with Jupyter notebooks.
* [https://towardsdatascience.com/7-reasons-why-you-should-use-jupyterlab-for-data-science-7c2a3db8755a 7 Reasons Why You Should Use Jupyterlab for Data Science]


=== Execute Javascript in a Jupyter Notebook ===
=== Some resources ===
[https://linuxtldr.com/run-javascript-jupyter-notebook/ How to Execute Javascript in a Jupyter Notebook on Linux]
* [https://www.dataschool.io/cloud-services-for-jupyter-notebook/ Six easy ways to run your Jupyter Notebook in the cloud]
* [http://www.win-vector.com/blog/2020/03/cross-methods-are-a-leak-variance-trade-off/ Cross-Methods are a Leak/Variance Trade-Off]
* [https://opensource.com/article/20/11/daily-journal-jupyter Journal five minutes a day with Jupyter]
* [https://www.dataquest.io/blog/jupyter-notebook-tutorial/ How to Use Jupyter Notebook in 2020: A Beginner’s Tutorial]
* [https://www.makeuseof.com/get-started-with-jupyter-notebook/ Get Started With Jupyter Notebook: A Tutorial]
* [https://datalya.com/blog/python-data-science/jupyter-notebook-command-mode-keyboard-shortcuts Jupyter Notebook Command Mode Keyboard Shortcuts]
** Enter: edit mode
** Esc: command mode
** Ctrl-Enter: run cell
** '''Shift-Enter''': run current cell, and select cell below
** '''Alt-Enter''': run cell, insert a cell below
** Y: to code
** M: to markdown
** 1: to insert heading 1
** 2,3,4,5,6: to insert heading 2,3,4,5,6


=== Google colab ===
=== Online tools rendering ipynb ===
* https://colab.research.google.com/
* Github
* An example [https://github.com/MariaNattestad/pca-on-genotypes Mini bioinformatics project: PCA on genotypes]
* NBViewer (nbviewer.jupyter.org)
* Google Colaboratory (colab.research.google.com)
* Binder (mybinder.org)
* Kaggle Notebooks (kaggle.com)
* Azure Notebooks (notebooks.azure.com)
* Datalore (datalore.jetbrains.com)
* Deepnote (deepnote.com)
* CoCalc (cocalc.com)


=== R programming ===
=== Different installation methods ===
[https://developers.refinitiv.com/en/article-catalog/article/setup-jupyter-notebook-r Setup Jupyter Notebook for R] or [https://dzone.com/articles/using-r-on-jupyternbspnotebook Using R on Jupyter Notebook]
<ul>
<li>https://pypi.org/project/jupyterlab/
<li>https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html
<li>[https://www.howtoforge.com/guide-to-install-jupyterlab-on-debian-12/ Guide to Install JupyterLab on Debian 12]. Hint: this requires Node.js. '''Node.js''' is used for building and managing JupyterLab’s '''JavaScript''' dependencies. Many Jupyter extensions need working '''npm''' (which comes with Node.js) and '''jlpm''' commands to download extensions or other JavaScript dependencies. Node.js itself is built with '''GYP''', a cross-platform build tool written in Python, which is another reason why Python is needed.
<li>[https://www.howtoforge.com/how-to-install-jupyterlab-on-rocky-linux-9/ How to Install JupyterLab on Rocky Linux 9]
<syntaxhighlight lang='sh'>
mkdir -p ~/project; cd ~/project
python3 -m venv myenv    # '-m venv' means to run venv module as a script
source myenv/bin/activate
pip3 install jupyter
which jupyter
jupyter --version


To use R in JupyterLab, you will first need to install the IRkernel package in your R environment using the following command: 
jupyter server --generate-config
<pre>
jupyter server password
install.packages('IRkernel')
</pre>


Once you have installed the IRkernel package, you can register it with JupyterLab using the following command in your R console:
jupyter lab --generate-config
<pre>
jupyter lab --show-config
IRkernel::installspec()
</pre>
After you have registered the kernel, you can start a new Jupyter notebook or JupyterLab session and select the "R" kernel from the kernel dropdown menu. This will allow you to run R code in JupyterLab, including data analysis, visualization, and other tasks.


=== Xeus-R: a future-proof Jupyter kernel for R ===
sudo firewall-cmd --add-port=8888/tcp
[https://blog.jupyter.org/meet-xeus-r-a-future-proof-jupyter-kernel-for-r-1adc5fdd09ab Meet Xeus-R: a future-proof Jupyter kernel for R]
jupyter lab --ip 192.168.5.120
# http://192.168.5.120:8888/
</syntaxhighlight>
<Li>[https://www.linuxcapable.com/how-to-install-jupyter-notebook-on-ubuntu-linux/ How to Install Jupyter Notebook on Ubuntu 24.04, 22.04 or 20.04]
</ul>


== Run Jupyter Notebooks on an Apple M1 Mac ==
=== JupyterLab Desktop ===
* [https://blog.roboflow.com/how-to-run-jupyter-notebooks-on-a-mac-m1/ How to Run Jupyter Notebooks on an Apple M1 Mac]
* https://github.com/jupyterlab/jupyterlab-desktop
* [https://towardsdatascience.com/how-to-easily-set-up-python-on-any-m1-mac-5ea885b73fab How to Easily Set Up Python on Any M1 Mac]
* [https://blog.jupyter.org/introducing-the-new-jupyterlab-desktop-bca1982bdb23 Introducing the new JupyterLab Desktop!] 2/9/2023
* [https://alexmanrique.com/blog/development/2021/03/05/installing-jupyter-in-macbook-air-m1.html Installing Jupyter in Macbook Air M1]


== Cheat sheet ==
=== pip/pip3 Jupyter ===
* http://datasciencefree.com/python.pdf
<ul>
* https://s3.amazonaws.com/assets.datacamp.com/blog_assets/PythonForDataScience.pdf
<li>https://jupyter.org/
<pre>
which python3
# /usr/bin/python3


== The Most Frequently Asked Questions About Python Programming ==
pip3 install jupyterlab
https://www.makeuseof.com/tag/python-programming-faq/
jupyter-lab
 
# http://localhost:8888/lab
== Running ==
# The current directory will be available on the file browser panel in JupyterLab.
=== Interactively ===
</pre>
Use Ctrl+d to quit.
On Mac, it shows the following when I run 'pip3 install jupyterlab'
{{Pre}}
Installing collected packages: pip
  WARNING: The scripts pip, pip3 and pip3.9 are installed in '/Users/XXX/Library/Python/3.9/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-23.0
WARNING: You are using pip version 21.2.4; however, version 23.0 is available.
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
</pre>
That is, I need to use '''/Users/XXX/Library/Python/3.9/bin/jupyter-lab''' to launch '''jupyter-lab'''  OR add the path to ".zshrc"; see [https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html jupyterlab doc].
</li>
</ul>


=== How to run a python script file ===
=== conda + Jupyter ===
<syntaxhighlight lang='bash'>
* [https://stackoverflow.com/a/58068850 How to use Jupyter notebooks in a conda environment?] Option 1 works for me.
python mypython.py
* https://github.com/dylkot/cNMF/#analyze_simulated_example_data.ipynb
</syntaxhighlight>
<pre>
conda install --yes jupyterlab && conda clean --yes --all
</pre>


=== Run python statements from a command line ===
=== IPython shell ===
Use '''-c''' (command) option.
<ul>
<syntaxhighlight lang='bash'>
<li>[https://opensource.com/article/21/3/ipython-shell-jupyter-notebooks Why I love using the IPython shell and Jupyter notebooks]
python -c "import psutil"
<pre>
</syntaxhighlight>
ipython  # Shell


=== run python source code line by line ===
jupyter notebook # auto open the browser
[https://stackoverflow.com/a/7681464 run python source code line by line]
</pre>
<syntaxhighlight lang='bash'>
</li>
python -m pdb <script.py>
<li>[https://medium.com/analytics-vidhya/why-switch-to-jupyterlab-from-jupyter-notebook-c6d98362945b Why switch to JupyterLab from jupyter-notebook?]
</syntaxhighlight>
</li>
</ul>
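The '''-c''' option is often used just to check whether a module can be imported. A minimal sketch of the same check driven from a script follows; the helper name <code>can_import</code> and the deliberately missing module name are hypothetical, and <code>sys.executable</code> is used so that the child process runs the same interpreter as the caller.

```python
import subprocess
import sys


def can_import(module_name: str) -> bool:
    """Return True if `python -c "import <module_name>"` exits successfully."""
    completed = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,  # keep the traceback of a failed import off the terminal
    )
    return completed.returncode == 0


# json ships with Python; the second name is a made-up module that should not exist.
print(can_import("json"))
print(can_import("surely_not_an_installed_module_xyz"))
```

Because the check runs in a separate process, a failed import does not affect the calling script.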


== Install a new module ==
=== Extract python code from Jupyter notebook ===
* See an example of installing [[Anders2013#HTSeq|HTSeq]].
* [https://stackoverflow.com/a/56832942 Get only the code out of Jupyter Notebook]. '''nbconvert''' or '''jq'''. Or File -> Download as -> Python (.py) — this should export all code cells as single .py file.
* [https://win-vector.com/2022/04/30/separating-code-from-presentation-in-jupyter-notebooks/ Separating Code from Presentation in Jupyter Notebooks]


=== Module != Package ===
=== Execute Javascript in a Jupyter Notebook ===
* A Python '''module''' is a single file containing Python code, while a '''package''' is a collection of modules organized in a specific way.
[https://linuxtldr.com/run-javascript-jupyter-notebook/ How to Execute Javascript in a Jupyter Notebook on Linux]
* A module has a filename with the suffix '''.py''' added.
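The module/package distinction can be checked at runtime. The sketch below builds a throwaway package in a temporary directory (the names <code>demo_pkg</code> and <code>greet</code> are hypothetical) and relies on the fact that an imported package has a <code>__path__</code> attribute while a plain module does not.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk: demo_pkg/ with __init__.py and one module.
# The names demo_pkg and greet are hypothetical, for illustration only.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "greet.py").write_text("def hello():\n    return 'hello'\n")

sys.path.insert(0, str(root))  # make the temporary directory importable
importlib.invalidate_caches()

package = importlib.import_module("demo_pkg")
module = importlib.import_module("demo_pkg.greet")

# A package carries __path__ (where it looks for submodules); a plain module does not.
is_package = hasattr(package, "__path__")
is_plain_module = not hasattr(module, "__path__")
print(is_package, is_plain_module, module.hello())
```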


=== PyPI/Python Package Index ===
=== Google colab ===
* https://en.wikipedia.org/wiki/Python_Package_Index
* https://colab.research.google.com/
* [https://pypi.python.org/pypi The Python Package Index (PyPI)] is the definitive list of packages (or modules)
* An example [https://github.com/MariaNattestad/pca-on-genotypes Mini bioinformatics project: PCA on genotypes]
* [https://opensource.com/article/20/3/pip-linux-mac-windows How to install pip to manage PyPI packages easily]
* [https://opensource.com/article/19/9/pypi-guide Introducing the guide to 7 essential PyPI libraries and how to use them]


=== pip ===
=== R programming ===
'''pip''' uses PyPI as the default source for packages and their dependencies.
[https://developers.refinitiv.com/en/article-catalog/article/setup-jupyter-notebook-r Setup Jupyter Notebook for R] or [https://dzone.com/articles/using-r-on-jupyternbspnotebook Using R on Jupyter Notebook]


As an example, [https://pypi.org/project/motioneye/ motionEye] can be installed by pip install or pip2 install; see its [https://github.com/ccrisan/motioneye/wiki/Installation wiki] and source code on Github.
To use R in JupyterLab, you will first need to install the IRkernel package in your R environment using the following command:
<pre>
install.packages('IRkernel')
</pre>


<syntaxhighlight lang='bash'>
Once you have installed the IRkernel package, you can register it with JupyterLab using the following command in your R console:
sudo apt-get install python-pip
pip --version
pip install SomePackage
pip show --files SomePackage
pip install --upgrade SomePackage
pip install --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip
pip install --upgrade pip  # Upgrade itself
pip uninstall SomePackage
 
sudo apt install python3-pip
pip3 --version
</syntaxhighlight>
 
==== Upgrade packages ====
[https://itsfoss.com/upgrade-pip-packages/ How to Upgrade Python Packages with Pip]
 
==== List installed packages and their versions, location/directory ====
<pre>
<pre>
pip3 list -v
IRkernel::installspec()
</pre>
</pre>
On my Ubuntu 20.04, the packages installed by '''pip3''' are located in ''~/.local/lib/python3.8/site-packages/''. It does not matter where I issued the ''pip3 install'' command.
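The same information (package name, version, install location) can also be read from inside Python with the standard-library <code>importlib.metadata</code> module (Python 3.8+). This is only a sketch of one way to do it, not what pip itself runs; the limit of ten rows is arbitrary.

```python
from importlib import metadata

# Collect (name, version, location) for every distribution this interpreter can see.
installed = []
for dist in metadata.distributions():
    name = dist.metadata.get("Name")
    version = dist.metadata.get("Version")
    if name is None or version is None:
        continue  # skip distributions with incomplete metadata
    location = str(dist.locate_file(""))  # directory that holds the package files
    installed.append((name, version, location))

# Print the first ten entries, sorted by name.
for name, version, location in sorted(installed, key=lambda row: row[0].lower())[:10]:
    print(f"{name}=={version}  ({location})")
```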
After you have registered the kernel, you can start a new Jupyter notebook or JupyterLab session and select the "R" kernel from the kernel dropdown menu. This will allow you to run R code in JupyterLab, including data analysis, visualization, and other tasks.


==== The danger of upgrading pip ====
=== Xeus-R: a future-proof Jupyter kernel for R ===
* [https://stackoverflow.com/questions/49836676/error-after-upgrading-pip-cannot-import-name-main Error after upgrading pip]: cannot import name 'main'
[https://blog.jupyter.org/meet-xeus-r-a-future-proof-jupyter-kernel-for-r-1adc5fdd09ab Meet Xeus-R: a future-proof Jupyter kernel for R]
* [https://askubuntu.com/a/783442 You should consider upgrading via the 'pip install --upgrade pip' command]


==== Don't use sudo + pip ====
== Run Jupyter Notebooks on an Apple M1 Mac ==
https://askubuntu.com/questions/802544/is-sudo-pip-install-still-a-broken-practice
* [https://blog.roboflow.com/how-to-run-jupyter-notebooks-on-a-mac-m1/ How to Run Jupyter Notebooks on an Apple M1 Mac]
* [https://towardsdatascience.com/how-to-easily-set-up-python-on-any-m1-mac-5ea885b73fab How to Easily Set Up Python on Any M1 Mac]
* [https://alexmanrique.com/blog/development/2021/03/05/installing-jupyter-in-macbook-air-m1.html Installing Jupyter in Macbook Air M1]


==== "--user" option in pip ====
== Cheat sheet ==
* Running pip with sudo on Linux is not recommended. It can cause permission problems and can potentially damage the system Python installation. Instead, install packages into your home directory with the '''--user''' flag.
* http://datasciencefree.com/python.pdf
* [https://askubuntu.com/a/641194 Upgrade python packages with pip: use "sudo" or "--user"? ]
* https://s3.amazonaws.com/assets.datacamp.com/blog_assets/PythonForDataScience.pdf
* If you are tempted to install packages system-wide, use a virtual environment instead of sudo. Virtual environments are isolated Python environments that do not interfere with the system Python installation.
* [https://github.com/pypa/pip/issues/4186 Permission denied: '/usr/local/lib/python2.7/dist-packages/pip']
<pre style="white-space: pre-wrap; /* CSS 3 */ white-space: -moz-pre-wrap; /* Mozilla, since 1999 */ white-space: -pre-wrap; /* Opera 4-6 */ white-space: -o-pre-wrap; /* Opera 7 */ word-wrap: break-word; /* IE 5.5+ */ " >
$ pip install Pygments
...
OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/Pygments-2.2.0.dist-info'
/usr/local/lib/python2.7/dist-packages/pip-9.0.1-py2.7.egg/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
$ pip install --user Pygments
Collecting Pygments
  Using cached Pygments-2.2.0-py2.py3-none-any.whl
Installing collected packages: Pygments
Successfully installed Pygments-2.2.0
</pre>
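To see where '''--user''' installs end up for a given interpreter, the standard-library <code>site</code> module can be queried. A small sketch; the printed paths depend on the platform and Python version.

```python
import site
import sys

user_base = site.getuserbase()          # base directory of the per-user install scheme
user_site = site.getusersitepackages()  # where `pip install --user` places packages

print("user base:         ", user_base)
print("user site-packages:", user_site)
print("user site enabled: ", site.ENABLE_USER_SITE)
print("on sys.path:       ", user_site in sys.path)
```

Inside a virtual environment created without access to the system site-packages, the user site directory is disabled, so <code>site.ENABLE_USER_SITE</code> is reported as False there.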


==== pip -t option ====
== The Most Frequently Asked Questions About Python Programming ==
We can force pip to install a package into a directory of our choice, for example a user directory, with the '''-t''' (target) option. This helps when a package is already installed in the global directory /usr/lib/python3/dist-packages but some applications cannot find it. See [https://stackoverflow.com/questions/17216689/pip-install-python-package-into-a-specific-directory-other-than-the-default-inst Pip install python package into a specific directory other than the default install location].
https://www.makeuseof.com/tag/python-programming-faq/
<pre>
pip3 install -t ~/.local/bin/python3.10/site-packages pytz
</pre>
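A package installed with '''-t''' is importable only if the target directory is on the module search path. The sketch below imitates this with a temporary directory standing in for the target; the module name <code>target_demo</code> is hypothetical and no real package is installed.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Stand-in for a directory populated by `pip install -t <dir> <package>`.
target_dir = Path(tempfile.mkdtemp())
(target_dir / "target_demo.py").write_text("VALUE = 42\n")

# Before the directory is on sys.path, the module cannot be found.
found_before = importlib.util.find_spec("target_demo") is not None

sys.path.insert(0, str(target_dir))
importlib.invalidate_caches()

# Once the directory is on sys.path, the import succeeds.
found_after = importlib.util.find_spec("target_demo") is not None
target_demo = importlib.import_module("target_demo")
print(found_before, found_after, target_demo.VALUE)
```

Setting the PYTHONPATH environment variable to the target directory has the same effect without editing <code>sys.path</code> in code.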


==== virtualenv ====
== Running ==
Python “Virtual Environments” allow us to install a Python package in an isolated location, rather than installing it globally.
=== Interactively ===
<ul>
Use Ctrl+d to quit.
<li>[https://www.ostechnix.com/manage-python-packages-using-pip/ How To Manage Python Packages Using Pip].  


First, create a new project folder and cd to it in your terminal.
=== How to run a python script file ===
<syntaxhighlight lang='bash'>
<syntaxhighlight lang='bash'>
# Python 2
python mypython.py
$ sudo pip install virtualenv
</syntaxhighlight>
$ virtualenv <DIR_NAME>
$ source <DIR_NAME>/bin/activate
(<DIR_NAME>) ~$ which python
....
$ deactivate


# Python 3, https://docs.python.org/3/tutorial/venv.html
=== Run python statements from a command line ===
$ python3 -m venv <DIR_NAME>  # DIR_NAME is also called an environment
Use '''-c''' (command) option.
$ source <DIR_NAME>/bin/activate
<syntaxhighlight lang='bash'>
(<DIR_NAME>) ~$ which python
python -c "import psutil"
....
$ deactivate
</syntaxhighlight>
</syntaxhighlight>
Here is an example
 
=== run python source code line by line ===
[https://stackoverflow.com/a/7681464 run python source code line by line]
<syntaxhighlight lang='bash'>
<syntaxhighlight lang='bash'>
~/github/PUREE$ python3 -m venv myenv
python -m pdb <script.py>
The virtual environment was not created successfully because ensurepip is not
</syntaxhighlight>
available.  On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.


    apt install python3.10-venv
== Install a new module ==
...
* See an example of installing [[Anders2013#HTSeq|HTSeq]].
~/github/PUREE$ sudo apt install python3.10-venv
~/github/PUREE$ python3 -m venv myenv


~/github/PUREE$ source myenv/bin/activate
=== Module != Package ===
(myenv) ~/github/PUREE$ which python
* A Python '''module''' is a single file containing Python code, while a '''package''' is a collection of modules organized in a specific way.
/home/brb/github/PUREE/myenv/bin/python
* A module has a filename with the suffix '''.py''' added.
(myenv) ~/github/PUREE$ pip freeze > requirements.txt
(myenv) ~/github/PUREE$ deactivate
~/github/PUREE$


~/github/PUREE$ ls myenv/bin
=== PyPI/Python Package Index ===
activate      activate.fish  f2py  f2py3.10    pip  pip3.10  python3    wheel
* https://en.wikipedia.org/wiki/Python_Package_Index
activate.csh  Activate.ps1  f2py3  normalizer  pip3  python  python3.10
* [https://pypi.python.org/pypi The Python Package Index (PyPI)] is the definitive list of packages (or modules)
</syntaxhighlight>
* [https://opensource.com/article/20/3/pip-linux-mac-windows How to install pip to manage PyPI packages easily]
</li>
* [https://opensource.com/article/19/9/pypi-guide Introducing the guide to 7 essential PyPI libraries and how to use them]
<li>
[https://youtu.be/N5vscPTWKOk Python Tutorial: virtualenv and why you should use virtual environments]. '''pip freeze'''.
<pre>
pip list
pip freeze --local > requirements.txt
...
pip install -r requirements.txt
pip list
</pre>
</li>
</ul>
* [https://opensource.com/article/20/10/learn-python-ebook Learn Python by creating a video game]
* [https://www.pythonforbeginners.com/basics/how-to-use-python-virtualenv How to use Python virtualenv]
* [https://www.dabapps.com/blog/introduction-to-pip-and-virtualenv-python/ A non-magical introduction to Pip and Virtualenv for Python beginners]
* As an alternative to ''virtualenv'', add "--user" to the '''pip''' command. See the installation guide of [https://lasagne.readthedocs.io/en/latest/user/installation.html#python-pip lasagne] or [https://stackoverflow.com/a/15912917 easy_install or pip as a limited user?]


==== pipenv ====
=== pip ===
* [https://www.makeuseof.com/pipenv-python-environment/ Why Use Pipenv to Create a Python Environment?]
'''pip''' uses PyPI as the default source for packages and their dependencies.
* Pipenv is a Python virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, pyenv and virtualenv. It automatically creates and manages a virtualenv for your projects, as well as adds/removes packages from your Pipfile as you install/uninstall packages. It also generates the ever-important '''Pipfile.lock''', which is used to produce deterministic builds.
* https://pipenv.pypa.io/en/latest/pipfile/


==== Poetry ====
As an example, [https://pypi.org/project/motioneye/ motionEye] can be installed by pip install or pip2 install; see its [https://github.com/ccrisan/motioneye/wiki/Installation wiki] and source code on Github.
* https://python-poetry.org/  
* Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you. Poetry offers a '''poetry.lock''' lockfile to ensure repeatable installs, and can build your project for distribution.


==== pipx (alternative to pip3) ====
<ul>
<li>Pipx is a tool that helps you install and run end-user applications written in Python. It is similar to macOS’s brew, JavaScript’s npx, and Linux’s apt.
* Pipx is focused on installing and managing Python packages that can be '''run from the command line directly''' as applications.
* pipx is made specifically for application installation, as it adds isolation yet still makes the apps available in your shell: pipx creates an '''isolated environment''' for each application and its associated packages.
<syntaxhighlight lang='bash'>
<syntaxhighlight lang='bash'>
python3 -m pip install --user pipx
sudo apt-get install python-pip
python3 -m pipx ensurepath
pip --version
# OR sudo apt install pipx 
pip install SomePackage
# Or pip3 install pipx
pip show --files SomePackage
pip install --upgrade SomePackage
pip install --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip
pip install --upgrade pip  # Upgrade itself
pip uninstall SomePackage


pipx install <package_name> # no sudo needed
sudo apt install python3-pip
pipx list
pip3 --version
pipx uninstall <package_name>
</syntaxhighlight>
</syntaxhighlight>
<li>I need to find an alternative to '''pip''' utility because of a problem when I used pip command. '''Error: externally-managed-environment'''
 
* [https://www.omgubuntu.co.uk/2023/04/pip-install-error-externally-managed-environment-fix 3 Ways to Solve Pip Install Error on Ubuntu 23.04]
==== Upgrade packages ====
* See the example of installing [[Linux#Asciinema_.26_agg|Asciinema & agg]]
[https://itsfoss.com/upgrade-pip-packages/ How to Upgrade Python Packages with Pip]
* I can see "asciinema" was installed under ~/.local/bin directory that's because "pipx ensurepath" adds ~/.local/bin to ~/.bashrc file.
 
<li>Official website [https://pypa.github.io/pipx/ pipx] - Install and Run Python Applications in Isolated Environments, [https://github.com/pypa/pipx github]
==== requirements.txt ====
<li>pipx vs pip
<ul>
* pipx is made specifically for application installation and adds isolation yet still makes the apps available in your shell. '''pipx creates an isolated environment for each application''' and its associated packages. On the other hand, '''pip is a general-purpose package installer for both libraries and apps with no environment isolation'''.
<li>[https://learnpython.com/blog/python-requirements-file/ The Python Requirements File and How to Create it]
* If you want to install an application that has dependencies that conflict with other applications or libraries on your system, you can use pipx to create an isolated environment for that application and its dependencies. This way, '''you can avoid conflicts between different versions of the same package'''.
<pre>
* On my Ubuntu, pip installs packages to /usr/local/lib/python3.8/dist-packages/.
pip freeze > requirements.txt
<li>[https://www.ostechnix.com/pipx-install-and-run-python-applications-in-isolated-environments/ Pipx – Install And Run Python Applications In Isolated Environments]
</pre>
<li>[https://opensource.com/article/21/7/python-pipx Run Python applications in virtual environments]
<li>An example
<pre>
tensorflow==2.3.1
uvicorn==0.12.2
fastapi==0.63.0
</pre>
We can use '''pip freeze''' or '''pip list''' to verify available packages in an environment.
</ul>
</ul>


=== python setup.py ===
==== List installed packages and their versions, location/directory ====
If a package has been bundled by its creator using the standard approach to
<pre>
bundling modules (with Python’s distutils tool), all you need to do is download
pip3 list -v
the package, uncompress it and type:
</pre>
<syntaxhighlight lang='bash'>
On my Ubuntu 20.04, the packages installed by '''pip3''' are located in ''~/.local/lib/python3.8/site-packages/'', regardless of the directory where I ran the ''pip3 install'' command.
python setup.py build
 
sudo python setup.py install
==== The danger of upgrading pip ====
</syntaxhighlight>
* [https://stackoverflow.com/questions/49836676/error-after-upgrading-pip-cannot-import-name-main Error after upgrading pip]: cannot import name 'main'
For Python 2, the packages are installed under '''/usr/local/lib/python2.7/dist-packages/'''.
* [https://askubuntu.com/a/783442 You should consider upgrading via the 'pip install --upgrade pip' command]
<syntaxhighlight lang='bash'>
 
$ ls -l /usr/local/lib/python2.7/dist-packages/
==== Don't use sudo + pip ====
total 12
https://askubuntu.com/questions/802544/is-sudo-pip-install-still-a-broken-practice
-rw-r--r-- 1 root staff  273 Jan 12 13:45 easy-install.pth
drwxr-sr-x 4 root staff 4096 Jan 12 13:45 HTSeq-0.6.1p1-py2.7-linux-x86_64.egg
drwxr-sr-x 4 root staff 4096 Jan 12 13:42 pysam-0.9.1.4-py2.7-linux-x86_64.egg
</syntaxhighlight>


=== python setup.py bdist_wheel ===
==== "--user" option in pip ====
* [https://stackoverflow.com/a/65480764 Why does python setup.py bdist_wheel creates a build folder?]
* It is not recommended to use sudo before calling pip on Linux (actually can we?). This is because using sudo can cause permission issues and can potentially damage your system. Instead, you can install packages locally using the '''--user''' flag.
* [https://realpython.com/python-wheels/ What Are Python Wheels and Why Should You Care?] The purpose of creating a wheel file in Python is to package and distribute your code in a way that makes it easier for others to install and use your code.
* [https://askubuntu.com/a/641194 Upgrade python packages with pip: use "sudo" or "--user"? ]
* [https://docs.python.org/3/distutils/builtdist.html Creating Built Distributions]
* If you need to install packages system-wide, you can use virtual environments instead of sudo. Virtual environments allow you to create isolated Python environments that do not interfere with the system Python installation
* [https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/#id67 Wheels]
* [https://github.com/pypa/pip/issues/4186 Permission denied: '/usr/local/lib/python2.7/dist-packages/pip']
* In Python programming, “wheel” refers to a built-package format for Python. It is designed to contain all the files for a PEP 376 compatible install in a way that is very close to the on-disk format.
<pre style="white-space: pre-wrap; /* CSS 3 */ white-space: -moz-pre-wrap; /* Mozilla, since 1999 */ white-space: -pre-wrap; /* Opera 4-6 */ white-space: -o-pre-wrap; /* Opera 7 */ word-wrap: break-word; /* IE 5.5+ */ " >
* [https://www.geeksforgeeks.org/what-is-a-python-wheel/# What is a Python wheel?]
$ pip install Pygments
* To create a wheel file in Python.
...
** First, make sure you have the wheel package installed: pip install wheel
OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/Pygments-2.2.0.dist-info'
** Navigate to the directory containing your package and run the following command: python setup.py bdist_wheel
/usr/local/lib/python2.7/dist-packages/pip-9.0.1-py2.7.egg/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
** This will create a .whl file in the dist directory of your package. (people can use [https://stackoverflow.com/a/28300854 '''pip''' to install the project])
  InsecurePlatformWarning
$ pip install --user Pygments
Collecting Pygments
  Using cached Pygments-2.2.0-py2.py3-none-any.whl
Installing collected packages: Pygments
Successfully installed Pygments-2.2.0
</pre>
 
==== pip -t option ====
We can force to install a package in user's directory (i.e. a package is already installed in the global directory /usr/lib/python3/dist-packages but some applications cannot find it). [https://stackoverflow.com/questions/17216689/pip-install-python-package-into-a-specific-directory-other-than-the-default-inst Pip install python package into a specific directory other than the default install location]
<pre>
pip3 install -t ~/.local/bin/python3.10/site-packages pytz
</pre>


=== Get a list of installed modules ===
==== virtualenv ====
http://stackoverflow.com/questions/739993/how-can-i-get-a-list-of-locally-installed-python-modules
Python “Virtual Environments” allow us to install a Python package in an isolated location, rather than installing it globally.
<syntaxhighlight lang='bash'>
<ul>
pydoc modules
<li>[https://www.ostechnix.com/manage-python-packages-using-pip/ How To Manage Python Packages Using Pip].  
</syntaxhighlight>
Not helpful. See the '''pip list''' command.


=== Check installed packages' versions ===
First, create a new project folder and cd into it in your terminal.
If you install packages through '''pip''', use
<syntaxhighlight lang='bash'>
<syntaxhighlight lang='bash'>
$ pip list
# Python 2
...
$ sudo pip install virtualenv
pyOpenSSL (0.13.1)
$ virtualenv <DIR_NAME>
pyparsing (2.0.1)
$ source <DIR_NAME>/bin/activate
pysam (0.10.0)
(<DIR_NAME>) ~$ which python
python-dateutil (1.5)
....
pytz (2013.7)
$ deactivate
rudix (2016.12.13)
 
scipy (0.13.0b1)
# For Python 3, https://docs.python.org/3/tutorial/venv.html it is more common to use venv instead
setuptools (1.1.6)
$ python3 -m venv <DIR_NAME>  # DIR_NAME is also called an environment
singledispatch (3.4.0.3)
$ source <DIR_NAME>/bin/activate
six (1.4.1)
(<DIR_NAME>) ~$ which python
tornado (4.4.2)
....
vboxapi (1.0)
$ deactivate
xattr (0.6.4)
zope.interface (4.1.1)
</syntaxhighlight>
 
And more information about a package by using '''pip show PACKAGE'''.
<syntaxhighlight lang='bash'>
$ pip show pysam
Name: pysam
Version: 0.10.0
Summary: pysam
Home-page: https://github.com/pysam-developers/pysam
Author: Andreas Heger
Author-email: andreas.heger@gmail.com
License: MIT
Location: /Users/XXX/Library/Python/2.7/lib/python/site-packages
Requires:
</syntaxhighlight>
</syntaxhighlight>
<li>
[https://youtu.be/N5vscPTWKOk Python Tutorial: virtualenv and why you should use virtual environments]. '''pip freeze'''.
<pre>
pip list
pip freeze --local > requirements.txt
...
pip install -r requirements.txt
pip list
</pre>
</li>
</ul>
* [https://opensource.com/article/20/10/learn-python-ebook Learn Python by creating a video game]
* [https://www.pythonforbeginners.com/basics/how-to-use-python-virtualenv How to use Python virtualenv]
* [https://www.dabapps.com/blog/introduction-to-pip-and-virtualenv-python/ A non-magical introduction to Pip and Virtualenv for Python beginners]
* As an alternative to ''virtualenv'', add "--user" to the '''pip''' command. See the installation guide of [https://lasagne.readthedocs.io/en/latest/user/installation.html#python-pip lasagne] or [https://stackoverflow.com/a/15912917 easy_install or pip as a limited user?]
==== pipenv ====
* [https://www.makeuseof.com/pipenv-python-environment/ Why Use Pipenv to Create a Python Environment?]
* Pipenv is a Python virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, pyenv and virtualenv. It automatically creates and manages a virtualenv for your projects, as well as adds/removes packages from your Pipfile as you install/uninstall packages. It also generates the ever-important '''Pipfile.lock''', which is used to produce deterministic builds.
* https://pipenv.pypa.io/en/latest/pipfile/


The following method works whether the package is installed by source or binary package
==== Poetry ====
<syntaxhighlight lang='python'>
* https://python-poetry.org/
>>> import pysam
* Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you. Poetry offers a '''poetry.lock''' lockfile to ensure repeatable installs, and can build your project for distribution.
>>> print(pysam.__version__)
0.10.0
>>> print pysam.__version__
0.10.0
</syntaxhighlight>


See http://hammelllab.labsites.cshl.edu/tetoolkit-faq/
==== pipx (alternative to pip3) ====
 
<ul>
=== Install a specific version of package through pip ===
<li>Pipx is a tool that helps you install and run end-user applications written in Python. It is similar to macOS’s brew, JavaScript’s npx, and Linux’s apt.  
https://stackoverflow.com/questions/5226311/installing-specific-package-versions-with-pip
* Pipx is focused on installing and managing Python packages that can be '''run from the command line directly''' as applications.
 
* pipx is made specifically for application installation, as it adds isolation yet still makes the apps available in your shell: pipx creates an '''isolated environment''' for each application and its associated packages.
For example, pysam package was actively released. But the new release (0.11.2.2) may introduce some bugs. So I have to install an older version (0.10.0 works for me on Mac El Capitan and Sierra).  
<syntaxhighlight lang='bash'>
<syntaxhighlight lang='bash'>
$ sudo -H pip uninstall pysam
python3 -m pip install --user pipx
Uninstalling pysam-0.11.2.2:
python3 -m pipx ensurepath
......
# OR sudo apt install pipx    
$ sudo -H pip install pysam==0.10.0
# Or pip3 install pipx
Collecting pysam==0.10.0
   Downloading pysam-0.10.0.tar.gz (2.3MB)
    100% |████████████████████████████████| 2.3MB 418kB/s
Installing collected packages: pysam
  Running setup.py install for pysam ... done
Successfully installed pysam-0.10.0
</syntaxhighlight>


=== warning: Please check the permissions and owner of that directory ===
pipx install <package_name> # no sudo needed
I got this message when I use root to run the 'sudo pip install PACKAGE' command.  
pipx list
pipx uninstall <package_name>
</syntaxhighlight>
<li>I need to find an alternative to '''pip''' utility because of a problem when I used pip command. '''Error: externally-managed-environment'''
* [https://www.omgubuntu.co.uk/2023/04/pip-install-error-externally-managed-environment-fix 3 Ways to Solve Pip Install Error on Ubuntu 23.04]
* See the example of installing [[Linux#Asciinema_.26_agg|Asciinema & agg]]
* I can see "asciinema" was installed under ~/.local/bin directory that's because "pipx ensurepath" adds ~/.local/bin to ~/.bashrc file.
<li>Official website [https://pypa.github.io/pipx/ pipx] - Install and Run Python Applications in Isolated Environments, [https://github.com/pypa/pipx github]
<li>pipx vs pip
* pipx is made specifically for application installation and adds isolation yet still makes the apps available in your shell. '''pipx creates an isolated environment for each application''' and its associated packages. On the other hand, '''pip is a general-purpose package installer for both libraries and apps with no environment isolation'''.
* If you want to install an application that has dependencies that conflict with other applications or libraries on your system, you can use pipx to create an isolated environment for that application and its dependencies. This way, '''you can avoid conflicts between different versions of the same package'''.
* On my Ubuntu, pip installs packages to /usr/local/lib/python3.8/dist-packages/.
<li>[https://www.ostechnix.com/pipx-install-and-run-python-applications-in-isolated-environments/ Pipx – Install And Run Python Applications In Isolated Environments]
<li>[https://opensource.com/article/21/7/python-pipx Run Python applications in virtual environments]
</ul>


See
=== python setup.py ===
* http://stackoverflow.com/questions/27870003/pip-install-please-check-the-permissions-and-owner-of-that-directory
If a package has been bundled by its creator using the standard approach to
* http://askubuntu.com/questions/578869/python-pip-permissions
bundling modules (with Python’s distutils tool), all you need to do is download
 
the package, uncompress it and type:
=== python3-pip installed but pip3 command not found? ===
<syntaxhighlight lang='bash'>
<syntaxhighlight lang='bash'>
sudo apt-get remove python3-pip; sudo apt-get install python3-pip
python setup.py build
sudo python setup.py install
</syntaxhighlight>
</syntaxhighlight>
 
For Python 2, the packages are installed under '''/usr/local/lib/python2.7/dist-packages/'''.
=== DeepSurv example ===
https://github.com/jaredleekatzman/DeepSurv
<syntaxhighlight lang='bash'>
<syntaxhighlight lang='bash'>
git clone https://github.com/jaredleekatzman/DeepSurv.git
$ ls -l /usr/local/lib/python2.7/dist-packages/
sudo cp /usr/bin/pip /usr/bin/pip.bak
total 12
sudo nano /usr/bin/pip # See https://stackoverflow.com/a/50187211 more detail
-rw-r--r-- 1 root staff  273 Jan 12 13:45 easy-install.pth
drwxr-sr-x 4 root staff 4096 Jan 12 13:45 HTSeq-0.6.1p1-py2.7-linux-x86_64.egg
drwxr-sr-x 4 root staff 4096 Jan 12 13:42 pysam-0.9.1.4-py2.7-linux-x86_64.egg
</syntaxhighlight>


# Method 1 for Theano
=== python setup.py bdist_wheel ===
sudo pip install theano
* [https://stackoverflow.com/a/65480764 Why does python setup.py bdist_wheel creates a build folder?]
# Method 2 for Theano
* [https://realpython.com/python-wheels/ What Are Python Wheels and Why Should You Care?] The purpose of creating a wheel file in Python is to package and distribute your code in a way that makes it easier for others to install and use your code.
pip install --user --upgrade https://github.com/Theano/Theano/archive/master.zip
* [https://docs.python.org/3/distutils/builtdist.html Creating Built Distributions]
* [https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/#id67 Wheels]
* In Python programming, “wheel” refers to a built-package format for Python. It is designed to contain all the files for a PEP 376 compatible install in a way that is very close to the on-disk format.
* [https://www.geeksforgeeks.org/what-is-a-python-wheel/# What is a Python wheel?]
* To create a wheel file in Python.
** First, make sure you have the wheel package installed: pip install wheel
** Navigate to the directory containing your package and run the following command: python setup.py bdist_wheel
** This will create a .whl file in the dist directory of your package. (people can use [https://stackoverflow.com/a/28300854 '''pip''' to install the project])


pip install --user --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip
=== Get a list of installed modules ===
cd DeepSurv/
http://stackoverflow.com/questions/739993/how-can-i-get-a-list-of-locally-installed-python-modules
pip install . --user
<syntaxhighlight lang='bash'>
sudo apt install python-pytest
pydoc modules
pip install h5py --user
sudo pip uninstall protobuf # https://stackoverflow.com/a/33623372
pip install protobuf --user
sudo apt install python-tk
py.test
============ test session starts ===========
platform linux2 -- Python 2.7.12, pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /home/brb/github/DeepSurv, inifile:
collected 7 items
 
tests/test_deepsurv.py .......
 
========== 7 passed in 5.77 seconds ========
</syntaxhighlight>
</syntaxhighlight>
Not helpful. See the '''pip list''' command.


== How to list all installed modules ==
=== Check installed packages' versions ===
 
If you install packages through '''pip''', use
<pre>
<syntaxhighlight lang='bash'>
help('modules') # the output is not pretty
$ pip list
</pre>
...
 
pyOpenSSL (0.13.1)
== Comment ==
pyparsing (2.0.1)
pysam (0.10.0)
python-dateutil (1.5)
pytz (2013.7)
rudix (2016.12.13)
scipy (0.13.0b1)
setuptools (1.1.6)
singledispatch (3.4.0.3)
six (1.4.1)
tornado (4.4.2)
vboxapi (1.0)
xattr (0.6.4)
zope.interface (4.1.1)
</syntaxhighlight>


# Use the comment symbol # for a single line
And more information about a package by using '''pip show PACKAGE'''.
# Use a delimiter “”” on each end of the comment. '''Attention''': [https://stackoverflow.com/questions/675442/how-to-comment-out-a-block-of-code-in-python Don't use triple-quotes]
<syntaxhighlight lang='bash'>
$ pip show pysam
Name: pysam
Version: 0.10.0
Summary: pysam
Home-page: https://github.com/pysam-developers/pysam
Author: Andreas Heger
Author-email: andreas.heger@gmail.com
License: MIT
Location: /Users/XXX/Library/Python/2.7/lib/python/site-packages
Requires:
</syntaxhighlight>


[http://www.zentut.com/python-tutorial/python-comments/ Python Comments] from zentut.com.
The following method works whether the package is installed by source or binary package
<syntaxhighlight lang='python'>
>>> import pysam
>>> print(pysam.__version__)
0.10.0
>>> print pysam.__version__
0.10.0
</syntaxhighlight>
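Besides reading '''__version__''', the installed version can also be looked up from the package metadata. This is only a minimal sketch, assuming Python 3.8 or later (where ''importlib.metadata'' is part of the standard library). The helper name ''installed_version'' is made up for illustration, and pysam may not be installed on your machine.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


# "pysam" is the example package used above; prints None if it is not installed.
print(installed_version("pysam"))
```

Unlike '''__version__''', this does not import the package, so it also works for packages that do not define that attribute.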


=== Docstring ===
See http://hammelllab.labsites.cshl.edu/tetoolkit-faq/
* https://en.wikipedia.org/wiki/Docstring
* Python Developer's Guide [https://www.python.org/dev/peps/pep-0257/ Docstring Conventions]


== Try / Except ==
=== Install a specific version of package through pip ===
[https://pythonbasics.org/try-except/ Try and Except in Python]
https://stackoverflow.com/questions/5226311/installing-specific-package-versions-with-pip
<pre>
try:
    number = int(input("Enter a number: "))
    print(number)
except:
    print("Invalid Input")
</pre>
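A bare ''except:'' also catches things like Ctrl+C, so it is usually safer to name the exception. A minimal sketch of the same idea, assuming we only care about invalid integers; the function name ''parse_int'' is hypothetical.

```python
def parse_int(text):
    """Convert text to int; return None when it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings such as "abc" or "1.5".
        return None


print(parse_int("42"))
print(parse_int("abc"))
```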


== if __name__ == "__main__": ==
For example, pysam package was actively released. But the new release (0.11.2.2) may introduce some bugs. So I have to install an older version (0.10.0 works for me on Mac El Capitan and Sierra).
* [https://stackoverflow.com/a/419185 What does if __name__ == “__main__”: do?]
<syntaxhighlight lang='bash'>
** [https://stackoverflow.com/a/419986 Simplest example]
$ sudo -H pip uninstall pysam
** [https://stackoverflow.com/a/419189 A more complicated example]
Uninstalling pysam-0.11.2.2:
* [https://www.freecodecamp.org/news/whats-in-a-python-s-name-506262fe61e8/ What’s in a (Python’s) __name__?]
......
$ sudo -H pip install pysam==0.10.0
Collecting pysam==0.10.0
  Downloading pysam-0.10.0.tar.gz (2.3MB)
    100% |████████████████████████████████| 2.3MB 418kB/s
Installing collected packages: pysam
  Running setup.py install for pysam ... done
Successfully installed pysam-0.10.0
</syntaxhighlight>


== How to Get the Current Directory in Python ==
=== warning: Please check the permissions and owner of that directory ===
[https://www.makeuseof.com/how-to-get-the-current-directory-in-python/ How to Get the Current Directory in Python]
I got this message when I use root to run the 'sudo pip install PACKAGE' command.


== Import a compiled C module ==
See
* An [http://www.swig.org/tutorial.html example] based on SWIG compiler.
* http://stackoverflow.com/questions/27870003/pip-install-please-check-the-permissions-and-owner-of-that-directory
* http://askubuntu.com/questions/578869/python-pip-permissions


== string and string operators ==
=== python3-pip installed but pip3 command not found? ===
Reference:
<syntaxhighlight lang='bash'>
# Python for Genomic Data Science from coursera.
sudo apt-get remove python3-pip; sudo apt-get install python3-pip
# [https://www.codementor.io/mgalarny/python-hello-world-and-string-manipulation-gdgwd8ymp Python Hello World and String Manipulation]
</syntaxhighlight>


* Use double quote instead of single quote to define a string
=== DeepSurv example ===
* Use triple double quotes """ to write a [https://docs.python.org/3/tutorial/introduction.html long string spanning multiple lines] or [http://stackoverflow.com/questions/7696924/multiline-comments-in-python comments in a python script]
https://github.com/jaredleekatzman/DeepSurv
* if dna="gatagc", then
<syntaxhighlight lang='bash'>
<syntaxhighlight lang='python'>
git clone https://github.com/jaredleekatzman/DeepSurv.git
dna[0]='g'
sudo cp /usr/bin/pip /usr/bin/pip.bak
dna[-1]='c' (start counting from the right)
sudo nano /usr/bin/pip # See https://stackoverflow.com/a/50187211 more detail
dna[-2]='g'
dna[0:3]='gat' (the end always excluded)
dna[:3]='gat'
dna[2:]='tagc'
len(dna)=6
type(dna)
print(dna)
dna.count('c')
dna.upper()
dna.find('ag')=3  (only the first occurrence of 'ag' is reported)
dna.find('ag', 2) (start looking from pos 2)
dna.rfind('ag')  ( search backwards in string)
dna.islower()    (True)
dna.isupper()    (False)
dna.replace('a', 'A')
print(dna.upper().isupper())
</syntaxhighlight>
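The notation above (dna[0]='g' and so on) is shorthand for "this expression gives this value", not valid assignments. A runnable sketch of a few of them, using the same example string:

```python
dna = "gatagc"

first, last = dna[0], dna[-1]   # indexing from the left and from the right
prefix = dna[:3]                # the end index is excluded
suffix = dna[2:]                # from index 2 to the end
pos = dna.find('ag')            # index of the first occurrence
missing = dna.find('tt')        # find() returns -1 when nothing matches

print(first, last, prefix, suffix, pos, missing)
```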


=== Format ===
# Method 1 for Theano
[https://docs.python.org/3.8/library/string.html#format-specification-mini-language Format Specification Mini-Language]
sudo pip install theano
# Method 2 for Theano
pip install --user --upgrade https://github.com/Theano/Theano/archive/master.zip


== Regular expression ==
pip install --user --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip
[https://www.makeuseof.com/regular-expressions-python/ The Beginner’s Guide to Regular Expressions With Python]
cd DeepSurv/
pip install . --user
sudo apt install python-pytest
pip install h5py --user
sudo pip uninstall protobuf # https://stackoverflow.com/a/33623372
pip install protobuf --user
sudo apt install python-tk
py.test
============ test session starts ===========
platform linux2 -- Python 2.7.12, pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /home/brb/github/DeepSurv, inifile:
collected 7 items


== User's input ==
tests/test_deepsurv.py .......
<syntaxhighlight lang='python'>
dna=raw_input("Enter a DNA sequence: ")  # python 2
dna=input("Enter a DNA sequence: ")      # python 3
</syntaxhighlight>
To convert a user's input (a string) to other types
<syntaxhighlight lang='python'>
int(x[, base])
float(x)
str(x) #converts x to a string
str(65) # '65'


chr(x)  # converts an integer to a character
========== 7 passed in 5.77 seconds ========
chr(65) # 'A'
</syntaxhighlight>
</syntaxhighlight>
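A short sketch of the conversions listed above. The literal strings stand in for what ''input()'' would return, so the snippet runs without typing anything.

```python
raw = "42"            # input() always returns a string like this
n = int(raw)          # string to integer
x = float("3.5")      # string to floating point number
b = int("ff", 16)     # the optional second argument is the base
s = str(65)           # number to string
c = chr(65)           # integer code point to character
code = ord("A")       # the inverse of chr()

print(n, x, b, s, c, code)
```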


== Print ==
== How to list all installed modules ==
[https://stackoverflow.com/a/6183002 Why is parenthesis in print voluntary in Python 2.7?]
 
== Fancy Output ==
<syntaxhighlight lang='python'>
print("THE DNA's GC content is ", gc, "%") # gives too many digits following the dot
print("THE DNA's GC content is %5.3f %%" % gc)
# the percent operator separating the formatting string and the value to
# replace the format placeholder
print("%d" % 10.6)  # 10
print("%e" % 10.6)  # 1.060000e+01
print("%s" % dna)  # gatagc
</syntaxhighlight>
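The percent operator still works in Python 3, but ''str.format'' and f-strings (Python 3.6+) accept the same format specification. A minimal sketch; the value of ''gc'' is made up for illustration.

```python
gc = 37.483871  # hypothetical GC percentage, for illustration only

old_style = "The DNA's GC content is %5.3f %%" % gc
new_style = "The DNA's GC content is {:5.3f} %".format(gc)
f_string = f"The DNA's GC content is {gc:5.3f} %"

# All three strings are identical; only the syntax differs.
print(f_string)
```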


== Type ==
[https://docs.python.org/3/library/functions.html Built-in Functions], [https://www.golinuxcloud.com/python-type-of-variable/#Conclusion How to check type of variable (object) in Python]
<pre>
<pre>
type(object)
help('modules') # the output is not pretty
</pre>
</pre>


== List ==
== Comment ==
A list is an ordered set of values
<syntaxhighlight lang='python'>
gene_expr=['gene', 5.16e-08, 0.001385, 7.33e-08]
print(gene_expr[2])
gene_expr[0]='Lif'
</syntaxhighlight>


Slice a list (it will create a new list)
# Use the comment symbol # for a single line
<syntaxhighlight lang='python'>
# Use a delimiter “”” on each end of the comment. '''Attention''': [https://stackoverflow.com/questions/675442/how-to-comment-out-a-block-of-code-in-python Don't use triple-quotes]
gene_expr[-3:]  # [5.16e-08, 0.001385, 7.33e-08]
gene_expr[1:3] = [6.09e-07]
</syntaxhighlight>
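A small sketch showing that a slice is a new list, while slice assignment changes the original list in place. It reuses the gene_expr values from above.

```python
gene_expr = ['Lif', 5.16e-08, 0.001385, 7.33e-08]

tail = gene_expr[-3:]        # a new list holding the last three items
tail[0] = 0.0                # changing the copy does not touch gene_expr

gene_expr[1:3] = [6.09e-07]  # slice assignment replaces two items with one
print(gene_expr)
print(tail)
```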


Clear the list
[http://www.zentut.com/python-tutorial/python-comments/ Python Comments] from zentut.com.
<syntaxhighlight lang='python'>
gene_expr[:] = []
</syntaxhighlight>


=== List functions ===
=== Docstring ===
Size of the list
* https://en.wikipedia.org/wiki/Docstring
<syntaxhighlight lang='python'>
* Python Developer's Guide [https://www.python.org/dev/peps/pep-0257/ Docstring Conventions]
len(gene_expr)
</syntaxhighlight>


Delete an element
== Try / Except ==
<syntaxhighlight lang='python'>
[https://pythonbasics.org/try-except/ Try and Except in Python]
del gene_expr[1]
<pre>
</syntaxhighlight>
try:
    number = int(input("Enter a number: "))
    print(number)
except:
    print("Invalid Input")
</pre>


Extend/append to a list
== if __name__ == "__main__": ==
<syntaxhighlight lang='python'>
* [https://stackoverflow.com/a/419185 What does if __name__ == “__main__”: do?]
gene_expr.extend([5.16e-08, 0.00123])
** [https://stackoverflow.com/a/419986 Simplest example]
</syntaxhighlight>
** [https://stackoverflow.com/a/419189 A more complicated example]
* [https://www.freecodecamp.org/news/whats-in-a-python-s-name-506262fe61e8/ What’s in a (Python’s) __name__?]
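A minimal sketch of the idiom discussed in the links above. The function name ''main'' is a common convention, not a requirement.

```python
def main():
    """Entry point used only when the file is executed as a script."""
    return "running as a script"


if __name__ == "__main__":
    # True when run as `python thisfile.py`, False when the file is imported.
    print(main())
```

When another file imports this module, the guarded block is skipped, so importing has no side effects.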


Count the number of times an element appears in a list
== How to Get the Current Directory in Python ==
<syntaxhighlight lang='python'>
[https://www.makeuseof.com/how-to-get-the-current-directory-in-python/ How to Get the Current Directory in Python]
print(gene_expr.count('Lif'), gene_expr.count('gene'))
</syntaxhighlight>


Reverse all elements in a list
== Import a compiled C module ==
<syntaxhighlight lang='python'>
* An [http://www.swig.org/tutorial.html example] based on SWIG compiler.
gene_expr.reverse()
print(gene_expr)
help(list)
</syntaxhighlight>


Lists as Stacks
== string and string operators ==
<syntaxhighlight lang='python'>
Reference:
stack=['a', 'b', 'c', 'd']
# Python for Genomic Data Science from coursera.  
stack.append('e')
# [https://www.codementor.io/mgalarny/python-hello-world-and-string-manipulation-gdgwd8ymp Python Hello World and String Manipulation]
</syntaxhighlight>


Sorting lists
* Use double quote instead of single quote to define a string
* Use triple double quotes """ to write a [https://docs.python.org/3/tutorial/introduction.html long string spanning multiple lines] or [http://stackoverflow.com/questions/7696924/multiline-comments-in-python comments in a python script]
* if dna="gatagc", then
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
mylist=[3, 31, 123, 1, 5]
dna[0]    # 'g'
sorted(mylist)
dna[-1]   # 'c' (start counting from the right)
mylist  # not changed
dna[-2]   # 'g'
mylist.sort()
dna[0:3]  # 'gat' (the end is always excluded)
 
dna[:3]   # 'gat'
mylist=['c', 'g', 'T', 'a', 'A']
dna[2:]   # 'tagc'
mylist.sort()
len(dna)            # 6
type(dna)           # <class 'str'>
print(dna)          # gatagc
dna.count('c')      # 1
dna.upper()         # 'GATAGC'
dna.find('ag')      # 3 (only the first occurrence of 'ag' is reported)
dna.find('ag', 17)  # -1 (start looking from position 17; -1 means not found)
dna.rfind('ag')     # 3 (search backwards in the string)
dna.islower()       # True
dna.isupper()       # False
dna.replace('a', 'A')         # 'gAtAgc'
print(dna.upper().isupper())  # True
</syntaxhighlight>
</syntaxhighlight>
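The indexing, slicing and search operations on a string can be checked with a short runnable sketch. The variable names other than dna are only for illustration.

```python
dna = "gatagc"

first = dna[0]       # first character
last = dna[-1]       # negative indices count from the right
prefix = dna[0:3]    # the end index is excluded
suffix = dna[2:]     # from index 2 to the end
pos = dna.find('ag')         # index of the first occurrence
missing = dna.find('ag', 4)  # search starts at index 4; -1 means not found
replaced = dna.replace('a', 'A')  # returns a new string, dna is unchanged

print(first, last, prefix, suffix, pos, missing, replaced)
```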


Don't change an element in a string!
=== Format ===
<syntaxhighlight lang='python'>
[https://docs.python.org/3.8/library/string.html#format-specification-mini-language Format Specification Mini-Language]


motif = 'nacggggtc'
== Regular expression ==
motif[0] = 'a'    # ERROR
[https://www.makeuseof.com/regular-expressions-python/ The Beginner’s Guide to Regular Expressions With Python]
</syntaxhighlight>


== Tuples ==
== User's input ==
A tuple consists of a number of values separated by commas, and is another standard sequence data type, like strings and lists.
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
t=1,2,3
dna=raw_input("Enter a DNA sequence: ")  # python 2
t
dna=input("Enter a DNA sequence: ")     # python 3
t=(1,2,3) # we may input tuples with or without surrounding parentheses
</syntaxhighlight>
</syntaxhighlight>
To convert a user's input (a string) to others
<syntaxhighlight lang='python'>
int(x[, base])
float(x)
str(x) #converts x to a string
str(65) # '65'


== Sets ==
chr(x)  # converts an integer to a character
A set is an unordered collection with no duplicate elements.
chr(65) # 'A'
<syntaxhighlight lang='python'>
brca1={'DNA repair', 'zinc ion binding'}
brca2={'protein binding', 'H4 histone'}
brca1 | brca2
brca1 & brca2
brca1 - brca2
</syntaxhighlight>
</syntaxhighlight>
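A runnable sketch of the three set operators. The annotation strings are illustrative, and 'DNA repair' is added to brca2 here only so that the intersection is not empty.

```python
brca1 = {'DNA repair', 'zinc ion binding'}
brca2 = {'protein binding', 'H4 histone', 'DNA repair'}

union = brca1 | brca2       # elements in either set
common = brca1 & brca2      # elements in both sets
only_brca1 = brca1 - brca2  # elements in brca1 but not in brca2

print(sorted(union))
print(common)
print(only_brca1)
```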


== Dictionaries ==
== Print ==
A '''dictionary''' is a collection of ''key'' and ''value'' pairs, with the requirement that the keys are unique (within one dictionary). Since Python 3.7, dictionaries preserve insertion order.
[https://stackoverflow.com/a/6183002 Why is parenthesis in print voluntary in Python 2.7?]


== Fancy Output ==
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
TF_motif = {'SP1' : 'gggcgg',
print("THE DNA's GC content is ", gc, "%") # gives too many digits following the dot
            'C/EBP' : 'attgcgcaat',
print("THE DNA's GC content is %5.3f %%" % gc)
            'ATF' : 'tgacgtca',
# the percent operator separating the formatting string and the value to
            'c-Myc' : 'cacgtg',
# replace the format placeholder
            'Oct-1' : 'atgcaaat'}
print("%d" % 10.6) # 10
# Access
print("%e" % 10.6) # 1.060000e+01
print("The recognition sequence for the ATF transcription is %s." % TF_motif['ATF'])  
print("%s" % dna)   # gatagc
# Update
TF_motif['AP-1'] = 'tgagtca'
# Delete
del TF_motif['SP1']
# Size of a dictionary
len(TF_motif)
# Get a list of all the 'keys' in a dictionary
list(TF_motif.keys())
# Get a list of all the 'values'
list(TF_motif.values())
# sort
sorted(TF_motif.keys())
sorted(TF_motif.values())
</syntaxhighlight>
</syntaxhighlight>
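The same format specification (width 5, 3 digits after the dot) also works with str.format() and f-strings. A minimal sketch, where the value 37.5 is made up for illustration:

```python
gc = 37.5  # hypothetical GC percentage

old_style = "GC content is %5.3f %%" % gc           # percent operator
new_style = "GC content is {:5.3f} %".format(gc)    # str.format()
f_string = f"GC content is {gc:5.3f} %"             # f-string (Python 3.6+)

print(old_style)
print(new_style)
print(f_string)
```

Note that only the percent operator needs %% to print a literal percent sign.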


We can retrieve data from dictionaries using the '''items()''' method.
== Type ==
[https://docs.python.org/3/library/functions.html Built-in Functions], [https://www.golinuxcloud.com/python-type-of-variable/#Conclusion How to check type of variable (object) in Python]
<pre>
type(object)
</pre>
 
== List ==
A list is an ordered set of values
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
for name,seq in seqs.items():
gene_expr=['gene', 5.16e-08, 0.001385, 7.33e-08]
    print(name, seq)
print(gene_expr[2])
gene_expr[0]='Lif'
</syntaxhighlight>
</syntaxhighlight>
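A self-contained sketch of looping over a dictionary with items(). The dictionary seqs and its contents are made up for illustration.

```python
seqs = {'seq1': 'gatagc', 'seq2': 'ttgaccgg'}  # hypothetical name -> sequence map

lengths = {}
for name, seq in seqs.items():
    # items() yields (key, value) pairs
    lengths[name] = len(seq)
    print(name, seq, len(seq))
```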


In summary, '''strings''', '''lists''' and '''dictionaries''' are the most useful data types for bioinformatics.
Slice a list (it will create a new list)
<syntaxhighlight lang='python'>
gene_expr[-3:]  # [5.16e-08, 0.001385, 7.33e-08]
gene_expr[1:3] = [6.09e-07]
</syntaxhighlight>


== '''if''' statement ==
Clear the list
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
dna=input('Enter DNA sequence: ')
gene_expr[:] = []
if 'n' in dna :
  nbases=dna.count('n')
  print("dna sequence has %d undefined bases " % nbases)
 
# General form:
# if condition_1:
#     do action 1
# elif condition_2:
#     do action 2
# else:
#     do action 3
</syntaxhighlight>
</syntaxhighlight>
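A runnable instance of the if / elif / else form. The function name classify_gc and the thresholds 40 and 60 are arbitrary choices for illustration, not established cut-offs.

```python
def classify_gc(gc_percent):
    """Label a GC percentage as low, medium or high (illustrative thresholds)."""
    if gc_percent < 40:
        return "low"
    elif gc_percent <= 60:
        return "medium"
    else:
        return "high"

print(classify_gc(35.0), classify_gc(50.0), classify_gc(72.5))
```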


== Logical operators ==
=== List functions ===
Use and, or, not.
Size of the list
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
dna=input('Enter DNA sequence: ')
len(gene_expr)  
if 'n' in dna or 'N' in dna:
    nbases=dna.count('n')+dna.count('N')
    print("dna sequence has %d undefined bases " % nbases)
else:
    print("dna sequence has no undefined bases")
</syntaxhighlight>
</syntaxhighlight>


== Loops ==
Delete an element
while
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
dna=input('Enter DNA sequence:')
del gene_expr[1]
pos=dna.find('gt', 0)
 
while pos>-1 :
    print("Donor splice site candidate at position %d" %pos)
    pos=dna.find('gt', pos+1)
</syntaxhighlight>
</syntaxhighlight>
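The same while/find pattern can be wrapped in a function that returns all positions instead of printing them. A sketch, where the name find_all and the test sequence are made up:

```python
def find_all(dna, motif):
    """Return the start positions of every occurrence of motif in dna."""
    positions = []
    pos = dna.find(motif)
    while pos > -1:
        positions.append(pos)
        # continue the search one character after the last hit
        pos = dna.find(motif, pos + 1)
    return positions

print(find_all("aggtccgtaagt", "gt"))
```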


for
Extend/append to a list
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
motifs=["attccgt", "aggggggttttttcg", "gtagc"]
gene_expr.extend([5.16e-08, 0.00123])
for m in motifs:
    print(m, len(m))
</syntaxhighlight>
</syntaxhighlight>


range
Count the number of times an element appears in a list
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
for i in range(4):
print(gene_expr.count('Lif'), gene_expr.count('gene'))
    print(i)
</syntaxhighlight>
for i in range(1,10,2):
    print(i)
</syntaxhighlight>


Problem: check whether all characters in a given protein sequence are valid amino acids.
Reverse all elements in a list
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
protein='SDVIHRYKUUPAKSHGWYVCJRSRFTWMVWWRFRSCRA'
gene_expr.reverse()
for i in range(len(protein)):
print(gene_expr)
    if protein[i] not in 'ABCDEFGHIKLMNPQRSTVWXYZ':
help(list)
        print("this is not a valid protein sequence!")
        break
</syntaxhighlight>
</syntaxhighlight>


continue
Lists as Stacks
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
protein='SDVIHRYKUUPAKSHGWYVCJRSRFTWMVWWRFRSCRA'
stack=['a', 'b', 'c', 'd']
corrected_protein=''
stack.append('e')
for i in range(len(protein)):
    if protein[i] not in 'ABCDEFGHIKLMNPQRSTVWXYZ':
        continue
    corrected_protein=corrected_protein+protein[i]
print("Corrected protein seq is %s" % corrected_protein)
</syntaxhighlight>
</syntaxhighlight>
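The break and continue examples can also be written without index arithmetic. A sketch using all() and a generator expression; the function names are hypothetical and the alphabet is the one used in the examples above.

```python
VALID_AMINO_ACIDS = set('ABCDEFGHIKLMNPQRSTVWXYZ')

def is_valid_protein(protein):
    """True if every character is in the accepted alphabet."""
    return all(aa in VALID_AMINO_ACIDS for aa in protein)

def strip_invalid(protein):
    """Return the sequence with characters outside the alphabet removed."""
    return ''.join(aa for aa in protein if aa in VALID_AMINO_ACIDS)

protein = 'SDVIHRYKUUPAKSHGWYVCJRSRFTWMVWWRFRSCRA'
print(is_valid_protein(protein))
print(strip_invalid(protein))
```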


else Statement used with loops
Sorting lists
* If used with a for loop, the else statement is executed when the loop has exhausted iterating the list
* If used with a while loop, the else statement is executed when the condition becomes false
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
# Find all prime numbers smaller than a given integer
mylist=[3, 31, 123, 1, 5]
N=10
sorted(mylist)
for y in range(2, N):
mylist  # not changed
    for x in range(2, y):
mylist.sort()
        if y %x == 0:
 
            print(y, 'equals', x, '*', y//x)
mylist=['c', 'g', 'T', 'a', 'A']
            break
mylist.sort()
        else:
            # loop fell through without finding a factor
            print(y, 'is a prime number')
</syntaxhighlight>
</syntaxhighlight>
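In the prime-number example, the else belongs to the inner for loop, not to the if. A sketch of the same logic as a function (the name primes_below is made up) that makes the indentation explicit:

```python
def primes_below(n):
    """Return all prime numbers smaller than n using for/else."""
    primes = []
    for y in range(2, n):
        for x in range(2, y):
            if y % x == 0:
                break          # a factor was found, skip the else branch
        else:
            # runs only when the inner loop finished without break
            primes.append(y)
    return primes

print(primes_below(10))
```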


The '''pass''' statement is a placeholder
Don't change an element in a string!
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
if motif not in dna:
 
    pass
motif = 'nacggggtc'
else:
motif[0] = 'a'    # ERROR
    some_function_here()
</syntaxhighlight>
</syntaxhighlight>


== Functions ==
== Tuples ==
[https://opensource.com/article/19/7/get-modular-python-functions Get modular with Python function]
A tuple consists of a number of values separated by commas, and is another standard sequence data type, like strings and lists.
 
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
def function_name(arguments) :
t=1,2,3
    function_code_block
t
    return output
t=(1,2,3) # we may input tuples with or without surrounding parentheses
</syntaxhighlight>
</syntaxhighlight>


For example,
== Sets ==
A set is an unordered collection with no duplicate elements.
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
def gc(dna) :
brca1={'DNA repair', 'zinc ion binding'}
    "this function computes the gc perc of a dna seq"
brca2={'protein binding', 'H4 histone'}
    nbases=dna.count('n')+dna.count('N')
brca1 | brca2
    gcpercent=float(dna.count('c')+dna.count('C')+dna.count('g')
brca1 & brca2
+dna.count('G'))*100.0/(len(dna)-nbases)
brca1 - brca2
    return gcpercent
gc('AAAAGTNNAGTCC')
help(gc)
</syntaxhighlight>
</syntaxhighlight>
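A compact variant of the gc function that lowercases the sequence once instead of counting both cases separately. Returning 0.0 when every base is undefined is an assumption made here to avoid a division by zero; it is not part of the original function.

```python
def gc(dna):
    """Compute the GC percentage of a DNA sequence, ignoring undefined bases (n/N)."""
    dna = dna.lower()
    nbases = dna.count('n')
    defined = len(dna) - nbases
    if defined == 0:
        return 0.0  # assumption: no defined bases means 0 percent
    return (dna.count('g') + dna.count('c')) * 100.0 / defined

print(gc('AAAAGTNNAGTCC'))
```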


=== SyntaxError: invalid syntax ===
== Dictionaries ==
https://stackoverflow.com/a/11890194
A '''dictionary''' is a collection of ''key'' and ''value'' pairs, with the requirement that the keys are unique (within one dictionary). Since Python 3.7, dictionaries preserve insertion order.


In the Python shell, add an empty line at the end of a function definition. E.g.
<syntaxhighlight lang='python'>
<pre>
TF_motif = {'SP1' : 'gggcgg',
>>> def fun(a):
            'C/EBP' : 'attgcgcaat',
...    return a+1
            'ATF' : 'tgacgtca',
...
            'c-Myc' : 'cacgtg',
>>> fun(9)
            'Oct-1' : 'atgcaaat'}
10
# Access
>>> exit()
print("The recognition sequence for the ATF transcription is %s." % TF_motif['ATF'])
</pre>
# Update
TF_motif['AP-1'] = 'tgagtca'
# Delete
del TF_motif['SP1']
# Size of a dictionary
len(TF_motif)
# Get a list of all the 'keys' in a dictionary
list(TF_motif.keys())
# Get a list of all the 'values'
list(TF_motif.values())
# sort
sorted(TF_motif.keys())
sorted(TF_motif.values())
</syntaxhighlight>


In a Python script
We can retrieve data from dictionaries using the '''items()''' method.
<pre>
<syntaxhighlight lang='python'>
def fun(a):
for name,seq in seqs.items():
     return a+1
     print(name, seq)
print(fun(9))
</syntaxhighlight>
</pre>


=== Debug functions ===
In summary, '''strings''', '''lists''' and '''dictionaries''' are the most useful data types for bioinformatics.
https://stackoverflow.com/a/4929267


You can launch a Python program through [https://docs.python.org/3/library/pdb.html pdb] by using '''pdb myscript.py''' or '''python -m pdb myscript.py'''
== '''if''' statement ==
<syntaxhighlight lang='python'>
dna=input('Enter DNA sequence: ')
if 'n' in dna :
  nbases=dna.count('n')
  print("dna sequence has %d undefined bases " % nbases)


<pre>
if condition 1:
$ cat debug.py
  do action 1
def fun(a):
elif condition 2:
    a= a*2
  do action 2
    a= a*3
else:
    return a+1
  do action 3
print(fun(5))
</syntaxhighlight>


$ python -m pdb debug.py
== Logical operators ==
> /home/pi/Downloads/debug.py(1)<module>()
Use and, or, not.
-> def fun(a):
(Pdb) b fun
Breakpoint 1 at /home/pi/Downloads/debug.py:1
(Pdb) c
> /home/pi/Downloads/debug.py(2)fun()
-> a= a*2
(Pdb) n
> /home/pi/Downloads/debug.py(3)fun()
-> a= a*3
(Pdb)
> /home/pi/Downloads/debug.py(4)fun()
-> return a+1
(Pdb) p a
30
(Pdb) n
--Return--
> /home/pi/Downloads/debug.py(4)fun()->31
-> return a+1
(Pdb) exit
</pre>
 
=== Boolean functions ===
Problem: check if a given DNA sequence contains an in-frame stop codon
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
dna=input("Enter a dna seq: ")
dna=input('Enter DNA sequence: ')
if (has_stop_codon(dna)) :
if 'n' in dna or 'N' in dna:
     print("input seq has an in frame stop codon.")
    nbases=dna.count('n')+dna.count('N')
else :
    print("dna sequence has %d undefined bases " % nbases)
     print("input seq has no in frame stop codon.")
else:
    print("dna sequence has no undefined bases")
</syntaxhighlight>
 
== Loops ==
while
<syntaxhighlight lang='python'>
dna=input('Enter DNA sequence:')
pos=dna.find('gt', 0)


def has_stop_codon(dna) :
while pos>-1 :
    "This function checks if given dna seq has in frame stop codons."
     print("Donor splice site candidate at position %d" %pos)
    stop_codon_found=False
     pos=dna.find('gt', pos+1)
    stop_codons=['tga', 'tag', 'taa']
    for i in range(0, len(dna), 3) :
        codon=dna[i:i+3].lower()
        if codon in stop_codons :
            stop_codon_found=True
            break
    return stop_codon_found
</syntaxhighlight>
</syntaxhighlight>


=== Function default parameter values ===
for
Suppose the has_stop_codon function also accepts a frame argument (equal to 0, 1, or 2) which specifies in what frame we want to look for stop codons.
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
def has_stop_codon(dna, frame=0) :
motifs=["attccgt", "aggggggttttttcg", "gtagc"]
    "This function checks if given dna seq has in frame stop codons."
for m in motifs:
    stop_codon_found=False
     print(m, len(m))
    stop_codons=['tga', 'tag', 'taa']
    for i in range(frame, len(dna), 3) :
        codon=dna[i:i+3].lower()
        if codon in stop_codons :  
            stop_codon_found=True
            break
    return stop_codon_found
 
dna="atgagcggccggct"
has_stop_codon(dna)    # False
has_stop_codon(dna, 0) # False
has_stop_codon(dna, 1) # True
has_stop_codon(frame=0, dna=dna)
</syntaxhighlight>
</syntaxhighlight>


=== More examples ===
range
Reverse complement of a dna sequence
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
def reversecomplement(seq):
for i in range(4):
    """Return the reverse complement of the dna string."""
     print(i)
    seq = reverse_string(seq)
for i in range(1,10,2):
    seq = complement(seq)
     print(i)
    return seq
 
reversecomplement('CCGGAAGAGCTTACTTAG')
</syntaxhighlight>
</syntaxhighlight>


To reverse a string
Problem: check whether all characters in a given protein sequence are valid amino acids.
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
def reverse_string(seq):
protein='SDVIHRYKUUPAKSHGWYVCJRSRFTWMVWWRFRSCRA'
     return seq[::-1]
for i in range(len(protein)):
 
     if protein[i] not in 'ABCDEFGHIKLMNPQRSTVWXYZ':  
reverse_string(dna)
        print("this is not a valid protein sequence!")
        break
</syntaxhighlight>
</syntaxhighlight>


Complement a DNA Sequence
continue
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
def complement(dna):
protein='SDVIHRYKUUPAKSHGWYVCJRSRFTWMVWWRFRSCRA'
    """Return the complementary sequence string."""
corrected_protein=''
    basecomplement = {'A':'T', 'C':'G', 'G':'C', 'T':'A',
for i in range(len(protein)):
                      'N':'N', 'a':'t', 'c':'g', 'g':'c', 't':'a', 'n':'n'} # dictionary
    if protein[i] not in 'ABCDEFGHIKLMNPQRSTVWXYZ':  
    letters = list(dna) # list comprehensions
        continue
    letters = [basecomplement[base] for base in letters]
    corrected_protein=corrected_protein+protein[i]
    return ''.join(letters)
print("Corrected protein seq is %s" % corrected_protein)
</syntaxhighlight>
</syntaxhighlight>
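An alternative to the dictionary plus list comprehension is a translation table built with str.maketrans(). This is a sketch of a different technique, not the approach used above; one difference is that characters missing from the table are passed through unchanged instead of raising KeyError.

```python
# Translation table built once; covers both cases plus the undefined base N.
COMPLEMENT_TABLE = str.maketrans('ACGTNacgtn', 'TGCANtgcan')

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT_TABLE)[::-1]

print(reverse_complement('CCGGAAGAGCTTACTTAG'))
```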


=== Split and Join functions ===
else Statement used with loops
* If used with a for loop, the else statement is executed when the loop has exhausted iterating the list
* If used with a while loop, the else statement is executed when the condition becomes false
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
sentence="enzymes and other proteins come in many shapes"
# Find all prime numbers smaller than a given integer
sentence.split() # split on all whitespaces
N=10
sentence.split('and') # use 'and' as the separator
for y in range(2, N):
 
    for x in range(2, y):
'-'.join(['enzymes', 'and', 'other', 'proteins', 'come', 'in', 'many', 'shapes'])
        if y %x == 0:
            print(y, 'equals', x, '*', y//x)
            break
        else:
            # loop fell through without finding a factor
            print(y, 'is a prime number')
</syntaxhighlight>
</syntaxhighlight>
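A runnable sketch of split() and join() that keeps the results in variables so they can be inspected. The variable names are only for illustration.

```python
sentence = "enzymes and other proteins come in many shapes"

words = sentence.split()        # split on any whitespace
parts = sentence.split('and')   # split on the substring 'and'
joined = '-'.join(words)        # glue the words back with '-'

print(words)
print(parts)
print(joined)
```

Note that split('and') keeps the spaces around the separator in the resulting pieces.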


=== Variable number of function arguments ===
The '''pass''' statement is a placeholder
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
def newfunction(fi, se, th, *rest):
if motif not in dna:
  print("First: %s" % fi)
    pass
  print("Second: %s" % se)
else:
  print("Third: %s" % th)
    some_function_here()
  print("Rest... %s" % (rest,))
  return
</syntaxhighlight>
</syntaxhighlight>
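Extra positional arguments collected by *rest arrive as a tuple. When such a tuple is formatted with the percent operator it has to be wrapped in another tuple, otherwise Python treats its elements as separate format arguments. A sketch with a hypothetical function describe:

```python
def describe(first, *rest):
    """Format the first argument and the tuple of remaining arguments."""
    # (first, rest) is a 2-tuple, so each %s receives exactly one value
    return "first=%s rest=%s" % (first, rest)

print(describe(1, 2, 3))
print(describe(1))
```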


== Modules and packages ==
== Functions ==
* [https://www.programiz.com/python-programming/modules Python Modules]
[https://opensource.com/article/19/7/get-modular-python-functions Get modular with Python function]
* [https://www.w3schools.com/python/python_modules.asp Python Modules] from w3schools
 
* [https://realpython.com/python-import/ Python import: Advanced Techniques and Tips]
<syntaxhighlight lang='python'>
* [https://docs.python.org/3/py-modindex.html Python Module Index]
def function_name(arguments) :
    function_code_block
    return output
</syntaxhighlight>


'''Packages''' group multiple '''modules''' under one name, by using "dotted module names". For example, the module name A.B designates a submodule named B in a package named A. See [https://stackoverflow.com/questions/7948494/whats-the-difference-between-a-python-module-and-a-python-package What's the difference between a Python module and a Python package?]
For example,
 
<dnautil.py>
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
#!/usr/bin/python
"""
dnautil module contains a few useful functions for dna seq
"""
def gc(dna) :
def gc(dna) :
     blah
     "this function computes the gc perc of a dna seq"
     blah
     nbases=dna.count('n')+dna.count('N')
     gcpercent=float(dna.count('c')+dna.count('C')+dna.count('g')
+dna.count('G'))*100.0/(len(dna)-nbases)
     return gcpercent
     return gcpercent
gc('AAAAGTNNAGTCC')
help(gc)
</syntaxhighlight>
</syntaxhighlight>


When a module is imported, Python first searches for a built-in module with that name.
=== SyntaxError: invalid syntax ===
https://stackoverflow.com/a/11890194


If a built-in module is not found, Python then searches for a file obtained by
In the Python shell, add an empty line at the end of a function definition. E.g.
adding the extension .py to the name of the module that is being imported:
<pre>
* in your current directory,
>>> def fun(a):
* the directory where Python has been installed
...    return a+1
* in a path, i.e., a colon(':') separated list of file paths, stored in the environment variable PYTHONPATH.
...
>>> fun(9)
10
>>> exit()
</pre>
 
In a Python script
<pre>
def fun(a):
    return a+1
print(fun(9))
</pre>


You can use the '''sys.path''' variable from the '''sys''' built-in module to check the list of all directories where Python looks for files
=== Debug functions ===
<syntaxhighlight lang='python'>
https://stackoverflow.com/a/4929267
import sys
sys.path
</syntaxhighlight>


If the sys.path variable does not contain the directory where you put your module, you can extend it:
You can launch a Python program through [https://docs.python.org/3/library/pdb.html pdb] by using '''pdb myscript.py''' or '''python -m pdb myscript.py'''
<syntaxhighlight lang='python'>
sys.path.append("/home/username/python")  # use the real path; Python does not expand $USER inside a string
sys.path
</syntaxhighlight>
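Shell variables such as $USER are not expanded inside a Python string, so the home directory has to be resolved explicitly. A minimal sketch, assuming the modules live in a directory named python under the home directory (the directory name is only an example):

```python
import os
import sys

# Build the path from the current user's home directory.
module_dir = os.path.join(os.path.expanduser("~"), "python")

# Avoid adding the same directory twice.
if module_dir not in sys.path:
    sys.path.append(module_dir)

print(module_dir in sys.path)
```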


=== Using modules (from PACKAGE/DIRNAME/FILENAME import CLASS) ===
<pre>
<syntaxhighlight lang='python'>
$ cat debug.py
from math import *
def fun(a):
print(floor(3.7))
    a= a*2
    a= a*3
    return a+1
print(fun(5))


import dnautil
$ python -m pdb debug.py
dna="atgagggctaggt"
> /home/pi/Downloads/debug.py(1)<module>()
gc(dna)         # gc is not defined
-> def fun(a):
dnautil.gc(dna) # Good
(Pdb) b fun
</syntaxhighlight>
Breakpoint 1 at /home/pi/Downloads/debug.py:1
 
(Pdb) c
Import Names from a Module
> /home/pi/Downloads/debug.py(2)fun()
<syntaxhighlight lang='python'>
-> a= a*2
from dnautil import *
(Pdb) n
gc(dna)         # OK
> /home/pi/Downloads/debug.py(3)fun()
-> a= a*3
(Pdb)
> /home/pi/Downloads/debug.py(4)fun()
-> return a+1
(Pdb) p a
30
(Pdb) n
--Return--
> /home/pi/Downloads/debug.py(4)fun()->31
-> return a+1
(Pdb) exit
</pre>


from dnautil import gc, has_stop_codon
=== Boolean functions ===
Problem: check if a given DNA sequence contains an in-frame stop codon
<syntaxhighlight lang='python'>
def has_stop_codon(dna) :
    "This function checks if given dna seq has in frame stop codons."
    stop_codon_found=False
    stop_codons=['tga', 'tag', 'taa']
    for i in range(0, len(dna), 3) :
        codon=dna[i:i+3].lower()
        if codon in stop_codons :
            stop_codon_found=True
            break
    return stop_codon_found

# the function must be defined before it is called
dna=input("Enter a dna seq: ")
if (has_stop_codon(dna)) :
    print("input seq has an in frame stop codon.")
else :
    print("input seq has no in frame stop codon.")
</syntaxhighlight>
</syntaxhighlight>


[https://opensource.com/article/19/7/get-modular-python-functions Get modular with Python functions] & [https://opensource.com/article/19/7/get-modular-python-classes Learn object-oriented programming with Python] from opensource.com.
=== Function default parameter values ===
Suppose the has_stop_codon function also accepts a frame argument (equal to 0, 1, or 2) which specifies in what frame we want to look for stop codons.
<syntaxhighlight lang='python'>
def has_stop_codon(dna, frame=0) :
    "This function checks if given dna seq has in frame stop codons."
    stop_codon_found=False
    stop_codons=['tga', 'tag', 'taa']
    for i in range(frame, len(dna), 3) :
        codon=dna[i:i+3].lower()
        if codon in stop_codons :
            stop_codon_found=True
            break
    return stop_codon_found


=== from...import vs import vs import...as ===
dna="atgagcggccggct"
<ul>
has_stop_codon(dna)    # False
<li>[https://vinesmsuic.github.io/2020/06/05/python-import-vs-fromimport/ Difference between 'import' and 'from...import' in Python] </li>
has_stop_codon(dna, 0) # False
<li>[https://core-electronics.com.au/tutorials/import-from-as-python.html Import, From and As Keywords in Python] </li>
has_stop_codon(dna, 1) # True
<li>[https://stackoverflow.com/a/21547572 `from … import` vs `import .`]  </li>
has_stop_codon(frame=0, dna=dna)
<li>[http://www.wellho.net/mouth/418_Difference-between-import-and-from-in-Python.html Difference between import and from in Python].
Python's '''import''' loads a Python module into its own namespace, so that you have to add the module name followed by a dot in front of references to any names from the imported module that you refer to:
<syntaxhighlight lang='python'>
import feathers
duster = feathers.ostrich("South Africa")
</syntaxhighlight>
</syntaxhighlight>
'''from''' loads a Python module into the current namespace, so that you can refer to it without the need to mention the module name again:
 
=== More examples ===
Reverse complement of a dna sequence
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
from feathers import *
def reversecomplement(seq):
duster = ostrich("South Africa")
    """Return the reverse complement of the dna string."""
    seq = reverse_string(seq)
    seq = complement(seq)
    return seq
 
reversecomplement('CCGGAAGAGCTTACTTAG')
</syntaxhighlight>
</syntaxhighlight>
<ul>
<li>Question: Why are both import and from provided? Can't I always use from?


Answer: If you were to load a lot of modules using from, you would find sooner or later that there was a conflict of names; from is fine for a small program but if it was used throughout a big program, you would hit problems from time to time
To reverse a string
</li>
<li>
Question: Should I always use import then?
 
Answer: No ... '''use import most of the time, but use from if you want to refer to the members of a module many, many times in the calling code'''; that way, you save yourself having to write "feathers." (in our example) time after time, but yet you don't end up with a cluttered namespace. You could describe this approach as being the best of both worlds.
</li>
</ul>
<li>[https://stackoverflow.com/a/22245722 from … import OR import … as for modules] </li>
<li>Some examples:
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
from numpy import array  # Run file; load specific 'attribute'
def reverse_string(seq):
arr = array([1,2,3])   # Use name directly: no need to qualify
    return seq[::-1]
print(arr) # print [1 2 3]


from math import pi
reverse_string(dna)
pi # 3.141592653589793
</syntaxhighlight>
math.pi  # NameError: name 'math' is not defined
 
Complement a DNA Sequence
<syntaxhighlight lang='python'>
def complement(dna):
    """Return the complementary sequence string."""
    basecomplement = {'A':'T', 'C':'G', 'G':'C', 'T':'A',
                      'N':'N', 'a':'t', 'c':'g', 'g':'c', 't':'a', 'n':'n'} # dictionary
    letters = list(dna) # list comprehensions
    letters = [basecomplement[base] for base in letters]
    return ''.join(letters)
</syntaxhighlight>
</syntaxhighlight>


VS
=== Split and Join functions ===
 
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
import numpy  # Run file; load module as a whole
sentence="enzymes and other proteins come in many shapes"
arr = numpy.array([1,2,3])  # Use its attribute names: '.' to qualify
sentence.split()  # split on all whitespaces
print(arr) # print [1 2 3]
sentence.split('and') # use 'and' as the separator


import math
'-'.join(['enzymes', 'and', 'other', 'proteins', 'come', 'in', 'many', 'shapes'])
math.pi # 3.141592653589793
dir(math)
</syntaxhighlight>
</syntaxhighlight>


VS
=== Variable number of function arguments ===
 
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
import numpy as np
def newfunction(fi, se, th, *rest):
dir(np)
  print("First: %s" % fi)
 
  print("Second: %s" % se)
import math as m
  print("Third: %s" % th)
m.pi # 3.141592653589793
  print("Rest... %s" % (rest,))
  return
</syntaxhighlight>
</syntaxhighlight>
</li>
<li>[https://github.com/qianhuiSenn/scRNA_cell_deconv_benchmark/blob/master/Scripts/Result_preprocessing/metrics_calculation_example.ipynb scRNA_cell_deconv_benchmark] example.
</li>
</ul>


=== help ===
== Modules and packages ==
<pre>
* [https://www.programiz.com/python-programming/modules Python Modules]
from AAA import BBB
* [https://www.w3schools.com/python/python_modules.asp Python Modules] from w3schools
help(BBB)
* [https://realpython.com/python-import/ Python import: Advanced Techniques and Tips]
help(BBB.FunctionName)
* [https://docs.python.org/3/py-modindex.html Python Module Index]


import BBB as CCC
'''Packages''' group multiple '''modules''' under one name, by using "dotted module names". For example, the module name A.B designates a submodule named B in a package named A. See [https://stackoverflow.com/questions/7948494/whats-the-difference-between-a-python-module-and-a-python-package What's the difference between a Python module and a Python package?]
help(CCC)
</pre>


=== Packages & __init__.py ===
<dnautil.py>
Each regular package in Python is a directory that contains a special file __init__.py. This file can be empty; it indicates that the directory is a Python package, so it can be imported the same way a module can be imported. (Since Python 3.3, a directory without __init__.py can also be imported as a namespace package.) https://docs.python.org/2/tutorial/modules.html
<syntaxhighlight lang='python'>
#!/usr/bin/python
"""
dnautil module contains a few useful functions for dna seq
"""
def gc(dna) :
    blah
    blah
    return gcpercent
</syntaxhighlight>


Example: suppose you have several modules dnautil.py, rnautil.py , and proteinutil.py. You want to group them in a package called "bioseq" which processes all types of biological sequences. The structure of the package:
When a module is imported, Python first searches for a built-in module with that name.
<pre>
bioseq/
  __init__.py
  dnautil.py
  rnautil.py
  proteinutil.py
  fasta/
    __init__.py
    fastautil.py
  fastq/
    __init__.py
    fastqutil.py
</pre>


Loading from packages:
If a built-in module is not found, Python then searches for a file obtained by
<syntaxhighlight lang='python'>
adding the extension .py to the name of the module that is being imported:
import bioseq.dnautil
* in your current directory,
bioseq.dnautil.gc(dna)
* the directory where Python has been installed
* in a path, i.e., a colon(':') separated list of file paths, stored in the environment variable PYTHONPATH.


from bioseq import dnautil
You can use the '''sys.path''' variable from the '''sys''' built-in module to check the list of all directories where Python looks for files
dnautil.gc(dna)
<syntaxhighlight lang='python'>
 
import sys
from bioseq.fasta.fastautil import fastqseqread
sys.path
</syntaxhighlight>
</syntaxhighlight>


=== Example ===
If the sys.path variable does not contain the directory where you put your module, you can extend it:
[https://www.youtube.com/watch?v=rfscVS0vtbw&t=14257s Building a Multiple Choice Quiz] by freeCodeCamp.org
 
'''QuestionFile.py'''
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
class Question:
sys.path.append("/home/username/python")  # use the real path; Python does not expand $USER inside a string
    def __init__(self, prompt, answer):
sys.path
        self.prompt = prompt
        self.answer = answer
</syntaxhighlight>
</syntaxhighlight>


'''app.py'''
=== Using modules (from PACKAGE/DIRNAME/FILENAME import CLASS) ===
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
from QuestionFile import Question
from math import *
print(floor(3.7))


question_prompts = [
import dnautil
    "What color are apples?\n(a) Red/Green\n(b) Purple\n(c) Orange\n\n",
dna="atgagggctaggt"
    "What color are Bananas?\n(a) Teal\n(b) Magenta\n(c) Yellow\n\n",
gc(dna)         # gc is not defined
    "What color are strawberries?\n(a) Yellow\n(b) Red\n(c) Blue\n\n"
dnautil.gc(dna) # Good
]
</syntaxhighlight>


questions = [
Import Names from a Module
    Question(question_prompts[0], "a"),
<syntaxhighlight lang='python'>
    Question(question_prompts[1], "c"),
from dnautil import *
    Question(question_prompts[2], "b")
gc(dna)         # OK
]


def run_test(questions):
from dnautil import gc, has_stop_codon
    score = 0
    for question in questions:
        answer = input(question.prompt)
        if answer == question.answer:
            score += 1
    print("You got " + str(score) + " /" + str(len(questions))+ " correct")
 
run_test(questions)
</syntaxhighlight>
</syntaxhighlight>
Run the program by '''python3 app.py'''


== Files - Communicate with the outside  ==
[https://opensource.com/article/19/7/get-modular-python-functions Get modular with Python functions] & [https://opensource.com/article/19/7/get-modular-python-classes Learn object-oriented programming with Python] from opensource.com.
* [https://www.pythonforbeginners.com/files/reading-and-writing-files-in-python Reading and Writing Files in Python]
* [https://realpython.com/read-write-files-python/#iterating-over-each-line-in-the-file Reading and Writing Files in Python (Guide)]
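A self-contained sketch of writing, appending and reading a file with the '''with''' statement, which closes the file automatically. It uses a temporary directory so nothing is left behind; the file names are made up. In Python 3 a missing file raises FileNotFoundError, a subclass of OSError (IOError is an alias of OSError).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "myfile")

    with open(path, "w") as f:   # write (creates or truncates the file)
        f.write("gatagc\n")
    with open(path, "a") as f:   # append
        f.write("ttgacc\n")
    with open(path) as f:        # read is the default mode
        lines = [line.rstrip("\n") for line in f]

    try:
        open(os.path.join(tmpdir, "no_such_file"))
        missing = False
    except FileNotFoundError:
        missing = True

print(lines, missing)
```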


=== from...import vs import vs import...as ===
<ul>
<li>[https://vinesmsuic.github.io/2020/06/05/python-import-vs-fromimport/ Difference between 'import' and 'from...import' in Python] </li>
<li>[https://core-electronics.com.au/tutorials/import-from-as-python.html Import, From and As Keywords in Python] </li>
<li>[https://stackoverflow.com/a/21547572 `from … import` vs `import .`]  </li>
<li>[http://www.wellho.net/mouth/418_Difference-between-import-and-from-in-Python.html Difference between import and from in Python].
Python's '''import''' loads a Python module into its own namespace, so that you have to add the module name followed by a dot in front of references to any names from the imported module that you refer to:
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
f=open('myfile', 'r') # read
import feathers
f=open('myfile')
duster = feathers.ostrich("South Africa")
f=open('myfile', 'w') # write
f=open('myfile', 'a') # append
</syntaxhighlight>
 
=== Take care if a file does not exist ===
<syntaxhighlight lang='python'>
try:
    f = open('myfile')
except IOError:
    print("the file myfile does not exist!!")
</syntaxhighlight>
</syntaxhighlight>
 
'''from''' loads a Python module into the current namespace, so that you can refer to it without the need to mention the module name again:
=== Reading ===
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
for line in f:
from feathers import *
    print(line)
duster = ostrich("South Africa")
</syntaxhighlight>
</syntaxhighlight>
<ul>
<li>Question: Why are both import and from provided? Can't I always use from?


Change positions within a file object
Answer: If you were to load a lot of modules using from, you would find sooner or later that there was a conflict of names; from is fine for a small program but if it was used throughout a big program, you would hit problems from time to time
</li>
<li>
Question: Should I always use import then?
 
Answer: No ... '''use import most of the time, but use from if you want to refer to the members of a module many, many times in the calling code'''; that way, you save yourself having to write "feathers." (in our example) time after time, but yet you don't end up with a cluttered namespace. You could describe this approach as being the best of both worlds.
</li>
</ul>
<li>[https://stackoverflow.com/a/22245722 from … import OR import … as for modules] </li>
<li>Some examples:
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
f.seek(0) # go to the beginning of the file
from numpy import array  # Run file; load specific 'attribute'
f.read()
arr = array([1,2,3])   # Use name directly: no need to qualify
print(arr) # print [1 2 3]
 
from math import pi
pi # 3.141592653589793
math.pi  # NameError: name 'math' is not defined
</syntaxhighlight>
</syntaxhighlight>


Read a single line
VS
 
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
f.seek(0)
import numpy  # Run file; load module as a whole
f.readline()
arr = numpy.array([1,2,3]) # Use its attribute names: '.' to qualify
print(arr) # print [1 2 3]
 
import math
math.pi # 3.141592653589793
dir(math)
</syntaxhighlight>
</syntaxhighlight>


Write into a file
VS
<syntaxhighlight lang='python'>
 
f=open("/home/$USER/myfile", 'a')  # Python does not expand $USER; write the real path
<syntaxhighlight lang='python'>
f.write("this is a new line")
import numpy as np
f.close()
dir(np)


>>> with open("file.txt", "w") as f:
import math as m
...  f.write(str(object))
m.pi # 3.141592653589793
...
</syntaxhighlight>
</syntaxhighlight>
</li>
<li>[https://github.com/qianhuiSenn/scRNA_cell_deconv_benchmark/blob/master/Scripts/Result_preprocessing/metrics_calculation_example.ipynb scRNA_cell_deconv_benchmark] example.
</li>
</ul>


[https://stackoverflow.com/a/16999000 Importing large tab-delimited .txt file into Python]
=== help ===
<pre>
<pre>
# R
from AAA import BBB
write.table(iris[1:10,], file="iris.txt", sep="\t", quote=F, row.names=F)
help(BBB)
help(BBB.FunctionName)
 
import BBB as CCC
help(CCC)
</pre>


# Python
=== Packages & __init__.py ===
import csv
Each regular package in Python is a directory that contains a special file __init__.py. This file can be empty; it indicates that the directory is a Python package, so it can be imported the same way a module can be imported. (Since Python 3.3, a directory without __init__.py can also be imported as a namespace package.) https://docs.python.org/2/tutorial/modules.html
with open('iris.txt') as f:
    reader = csv.reader(f, delimiter="\t")
    d = list(reader)
print(d[0][2])
print(d[1][2])


# Shell
Example: suppose you have several modules dnautil.py, rnautil.py , and proteinutil.py. You want to group them in a package called "bioseq" which processes all types of biological sequences. The structure of the package:
$ python test_csv.py
<pre>
Petal.Length
bioseq/
1.4
  __init__.py
  dnautil.py
  rnautil.py
  proteinutil.py
  fasta/
    __init__.py
    fastautil.py
  fastq/
    __init__.py
    fastqutil.py
</pre>
</pre>
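The csv example above indexes columns by position. As a minimal sketch (assuming the file has a header row, as iris.txt does), '''csv.DictReader''' lets you look up a column by name instead. The helper name read_tsv and the in-memory sample data are made up for illustration.

```python
import csv
import io


def read_tsv(handle):
    """Return the rows of a tab-delimited file as a list of dicts keyed by header."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# Stand-in for open('iris.txt'): a header line and two data rows.
sample = io.StringIO("Sepal.Length\tPetal.Length\n5.1\t1.4\n4.9\t1.4\n")
rows = read_tsv(sample)

# Values are strings until converted explicitly.
print(rows[0]["Petal.Length"])
print(float(rows[1]["Sepal.Length"]))
```

With a real file, pass the handle from open('iris.txt', newline='') to read_tsv().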
If the data are all numerical, we can use the numpy package.
<pre>
# R
write.table(iris[1:10, 1:4],
            file="~/Downloads/iris2.txt",
            sep="\t", quote=F, row.names=F, col.names=F)


# Python
Loading from packages:
import numpy as np
d = np.loadtxt('iris2.txt', delimiter="\t")
print(d[0][2])
print(d[1][2])
 
# Shell
$ python test_csv2.py
1.4
1.4
</pre>
 
=== Read text file from a URL ===
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
import urllib.request
import bioseq.dnautil
bioseq.dnautil.gc(dna)


url = "http://textfiles.com/adventure/aencounter.txt"
from bioseq import dnautil
file = urllib.request.urlopen(url)
dnautil.gc(dna)


for line in file:
from bioseq.fasta.fastautil import fastqseqread
  print(line.decode('utf-8'))
</syntaxhighlight>
</syntaxhighlight>
* [https://docs.python.org/3.0/library/urllib.request.html urllib.request — extensible library for opening URLs]
* [https://www.guru99.com/accessing-internet-data-with-python.html Python Internet Access using Urllib.Request and urlopen()]
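As a rough sketch of the bioseq package layout, the snippet below builds a throwaway package in a temporary directory and imports from it. The body of gc() is a hypothetical stand-in (fraction of G and C bases), not the real dnautil code, and the temporary directory is only there to keep the example self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway "bioseq" package in a temporary directory.
root = Path(tempfile.mkdtemp())
pkg = root / "bioseq"
pkg.mkdir()
(pkg / "__init__.py").write_text("")  # marks the directory as a regular package
(pkg / "dnautil.py").write_text(
    "def gc(dna):\n"
    "    '''Hypothetical stand-in: fraction of G and C bases in dna.'''\n"
    "    dna = dna.upper()\n"
    "    return (dna.count('G') + dna.count('C')) / len(dna)\n"
)

sys.path.insert(0, str(root))  # make the parent directory importable
importlib.invalidate_caches()

from bioseq import dnautil  # same form as in the examples above

print(dnautil.gc("ACGT"))
```

In a real project the package directory would simply sit next to the script or be installed, so the sys.path manipulation would not be needed.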


=== Command line arguments ===
=== Example ===
Suppose we run 'python processfasta.py myfile.fa' in the command line, then
[https://www.youtube.com/watch?v=rfscVS0vtbw&t=14257s Building a Multiple Choice Quiz] by freeCodeCamp.org
 
'''QuestionFile.py'''
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
import sys
class Question:
print(sys.argv) #  ['processfasta.py', 'myfile.fa']
    def __init__(self, prompt, answer):
        self.prompt = prompt
        self.answer = answer
</syntaxhighlight>
</syntaxhighlight>


More completely
'''app.py'''
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
#!/usr/bin/python
from QuestionFile import Question
"""
processfasta.py builds a dictionary with all sequences from a FASTA file.
"""


import sys
question_prompts = [
filename=sys.argv[1]
    "What color are apples?\n(a) Red/Green\n(b) Purple\n(c) Orange\n\n",
    "What color are Bananas?\n(a) Teal\n(b) Magenta\n(c) Yellow\n\n",
    "What color are strawberries?\n(a) Yellow\n(b) Red\n(c) Blue\n\n"
]


try:
questions = [
  f = open(filename)
    Question(question_prompts[0], "a"),
except IOError:
    Question(question_prompts[1], "c"),
     print("File %s does not exist!" % filename)
     Question(question_prompts[2], "b")
</syntaxhighlight>
]


Parsing command line arguments with '''getopt'''. Suppose we want to store in the dictionary the sequences bigger than a given length provided in the command line: 'processfasta.py -l 250 myfile.fa'
def run_test(questions):
<syntaxhighlight lang='python'>
    score = 0
#!/usr/bin/python
    for question in questions:
import sys
        answer = input(question.prompt)
import getopt
        if answer == question.answer:
            score += 1
    print("You got " + str(score) + " /" + str(len(questions))+ " correct")


def usage():
run_test(questions)
    print """
</syntaxhighlight>
processfasta.py: reads a FASTA file and builds a
Run the program by '''python3 app.py'''
dictionary with all sequence bigger than a given length


processfasta.py [-h] [-l <length>] <filename>
== Files - Communicate with the outside  ==
* [https://www.pythonforbeginners.com/files/reading-and-writing-files-in-python Reading and Writing Files in Python]
* [https://realpython.com/read-write-files-python/#iterating-over-each-line-in-the-file Reading and Writing Files in Python (Guide)]


-h          print this message
<syntaxhighlight lang='python'>
-l <length>  filter all sequences with a length
f=open('myfile', 'r') # read
              smaller than <length>
f=open('myfile')
              (default <length>=0)
f=open('myfile', 'w') # write
<filename>   the file has to be in FASTA format
f=open('myfile', 'a') # append
 
"""
o, a = getopt.getopt(sys.argv[1:], 'l:h')  # 'l:' takes a value, 'h' is a flag
opts = {} # empty dictionary
seqlen = 0
 
for k,v in o:
    opts[k] = v
if '-h' in opts:  # -h means the user wants help
    usage(); sys.exit()
if len(a) < 1:
    usage(); sys.exit("input fasta file is missing")
if '-l' in opts:
    seqlen = int(opts['-l'])  # getopt returns option values as strings
    if seqlen < 0:
        print("length of seq should be positive!"); sys.exit(0)
</syntaxhighlight>
</syntaxhighlight>
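The same command line ('processfasta.py -l 250 myfile.fa') can also be handled with the standard '''argparse''' module, which generates the -h message for you. This is only a sketch: the long option name --length and the function name parse_args are assumptions added for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the processfasta.py command line; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        prog="processfasta.py",
        description="Read a FASTA file and keep sequences longer than a given length.",
    )
    parser.add_argument("-l", "--length", type=int, default=0,
                        help="minimum sequence length (default: 0)")
    parser.add_argument("filename", help="input file in FASTA format")
    args = parser.parse_args(argv)
    if args.length < 0:
        # parser.error() prints the usage text and exits with status 2
        parser.error("length of seq should be positive!")
    return args


# Passing a list makes the function easy to try without a real command line.
args = parse_args(["-l", "250", "myfile.fa"])
print(args.length, args.filename)
```

Because type=int is given, the length is already an integer, so no manual conversion is needed.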


=== stdin and stdout ===
=== Take care if a file does not exist ===
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
sys.stdin.read()
try:
 
    f = open('myfile')
sys.stdout.write("Some useful output.\n")
except IOError:
 
    print("the file myfile does not exist!!")
sys.stderr.write("Warning: input file was not found\n")
</syntaxhighlight>
</syntaxhighlight>
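A small sketch of how the three standard streams are typically combined in a filter script: data goes to stdout and diagnostics go to stderr. The function name copy_nonempty is made up for this example, and in-memory streams are used so that it can run without piped input.

```python
import io


def copy_nonempty(instream, outstream, errstream):
    """Copy non-blank lines from instream to outstream; report the count on errstream."""
    kept = 0
    for line in instream:
        if line.strip():
            outstream.write(line)
            kept += 1
    errstream.write("kept %d lines\n" % kept)
    return kept


# Demonstration with in-memory streams. In a script, after "import sys", call
# copy_nonempty(sys.stdin, sys.stdout, sys.stderr) instead.
out, err = io.StringIO(), io.StringIO()
n = copy_nonempty(io.StringIO("ACGT\n\nTTGA\n"), out, err)
print(out.getvalue(), end="")
```

Keeping messages on stderr means the output can be piped into another program without the diagnostics mixing into the data.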


=== Call external programs ===
=== Reading ===
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
import subprocess
for line in f:
subprocess.call(["ls", "-l"]) # return code indicates the success or failure of the execution
    print(line)
</syntaxhighlight>


subprocess.call(["tophat", "genome_mouse_idx", "PE_reads_1.fq.gz", "PE_reads_2.fq.gz"])
Change positions within a file object
<syntaxhighlight lang='python'>
f.seek(0)  # go to the beginning of the file
f.read()
</syntaxhighlight>
</syntaxhighlight>
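If the output of the external program is needed as well as its return code, '''subprocess.run''' (Python 3.5+; the capture_output argument needs 3.7+) is one option. The sketch below launches a child Python process rather than "ls" or "tophat" purely so that it runs on any platform; that substitution is an assumption made for the example.

```python
import subprocess
import sys

# Run a child Python process so the example does not depend on "ls" being available.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode bytes to str
    check=False,          # do not raise an exception on a non-zero exit code
)

print(result.returncode)       # 0 means success
print(result.stdout.strip())   # text the child wrote to stdout
```

Setting check=True instead would raise subprocess.CalledProcessError when the program fails, which is often convenient in pipelines.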


== Exceptions ==
Read a single line
[https://www.thegeekstuff.com/2019/05/python-try-except-examples/ 5 Python Examples to Handle Exceptions using try, except and finally]
 
== Debugging ==
[https://www.makeuseof.com/debug-python-code/ How to Debug Your Python Code]
 
== [http://biopython.org/wiki/Main_Page Biopython] & Pubmed ==
* Parsers for various bioinformatics file formats (FASTA, Genbank)
* Access to online services like NCBI Entrez or Pubmed databases
* Interfaces to common bioinformatics programs such as BLAST, Clustalw and others.
 
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
import Bio
f.seek(0)
print(Bio.__version__)
f.readline()
</syntaxhighlight>
</syntaxhighlight>


Running BLAST over the internet
Write into a file
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
from Bio.Blast import NCBIWWW
f=open("/home/$USER/myfile", 'a')  # Python does not expand $USER; write the real path
fasta_string = open("myseq.fa").read()
f.write("this is a new line")
result_handle = NCBIWWW.qblast("blastn", "nt", fasta_string)
f.close()
# blastn is the program to use
# nt is the database to search against
# default output is xml
help(NCBIWWW.qblast)
</syntaxhighlight>


The BLAST record
>>> with open("file.txt", "w") as f:
<syntaxhighlight lang='python'>
...  f.write(str(object))
from Bio.Blast import NCBIXML
...
blast_record = NCBIXML.read(result_handle)
</syntaxhighlight>
</syntaxhighlight>


Parse BLAST output
[https://stackoverflow.com/a/16999000 Importing large tab-delimited .txt file into Python]
<syntaxhighlight lang='python'>
<pre>
len(blast_record.alignments)
# R
write.table(iris[1:10,], file="iris.txt", sep="\t", quote=F, row.names=F)


E_VALUE_THRESH = 0.01
# Python
for alignment in blast_record.alignments:
import csv
  for hsp in alignment.hsps:
with open('iris.txt') as f:
    if hsp.expect < E_VALUE_THRESH:
    reader = csv.reader(f, delimiter="\t")
      print('***Alignment***')
    d = list(reader)
      print('sequence:', alignment.title)
print(d[0][2])
      print('length:', alignment.length)
print(d[1][2])
      print('e value:', hsp.expect)
      print(hsp.query)
      print(hsp.match)
      print(hsp.sbjct)
</syntaxhighlight>


More help with Biopython
# Shell
* Biopython tutorial and cookbook: http://biopython.org/DIST/docs/tutorial/Tutorial.html
$ python test_csv.py
* Biopython FAQ: http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc5
Petal.Length
1.4
</pre>
If the data are all numerical, we can use the numpy package.
<pre>
# R
write.table(iris[1:10, 1:4],
            file="~/Downloads/iris2.txt",
            sep="\t", quote=F, row.names=F, col.names=F)


== pubmed_parser ==
# Python
[https://github.com/titipata/pubmed_parser Parser for Pubmed Open-Access XML Subset and MEDLINE XML Dataset]
import numpy as np
d = np.loadtxt('iris2.txt', delimiter="\t")
print(d[0][2])
print(d[1][2])


== pyTest ==
# Shell
* https://wiki.python.org/moin/PyTest
$ python test_csv2.py
* https://docs.python-guide.org/writing/tests/
1.4
1.4
</pre>
 
=== Read text file from a URL ===
<syntaxhighlight lang='python'>
import urllib.request
 
url = "http://textfiles.com/adventure/aencounter.txt"
file = urllib.request.urlopen(url)


== pyc file ==
for line in file:
[https://stackoverflow.com/a/3918716 What is the difference between .py and .pyc files? [duplicate]]. I have observed that this can cause a problem: after I modify a Python file, Python may keep using the old pyc file, so my change is not picked up (Raspberry Pi e-ink example).
  print(line.decode('utf-8'))
</syntaxhighlight>
* [https://docs.python.org/3.0/library/urllib.request.html urllib.request — extensible library for opening URLs]
* [https://www.guru99.com/accessing-internet-data-with-python.html Python Internet Access using Urllib.Request and urlopen()]


== Shutdown or restart OS ==
=== Command line arguments ===
Below is tested on Raspbian
Suppose we run 'python processfasta.py myfile.fa' in the command line, then
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
import os
import sys
os.system('sudo shutdown -h now')
print(sys.argv)  #  ['processfasta.py', 'myfile.fa']
</syntaxhighlight>
</syntaxhighlight>


= Popular python libraries =
More completely
[https://pythontips.com/2013/07/30/20-python-libraries-you-cant-live-without/ 20 Python libraries you can’t live without]
== psutil ==
* [https://programtalk.com/python-examples/psutil.cpu_percent/ psutil.cpu_percent() examples]. Inspired by the e-ink example from Raspberry Pi.
* https://github.com/arvydas/blinkstick-python/wiki/Example%3A-Display-CPU-usage
* https://www.liaoxuefeng.com/wiki/1016959663602400/1183565811281984
<syntaxhighlight lang='python'>
<syntaxhighlight lang='python'>
# pip install psutil --user
import psutil
#!/usr/bin/python
for x in range(10):
"""
    print(psutil.cpu_percent(interval=1))
processfasta.py builds a dictionary with all sequences from a FASTA file.
</syntaxhighlight>
"""


== [http://www.numpy.org/ numpy] ==
import sys
* [https://engineering.ucsb.edu/~shell/che210d/numpy.pdf An introduction to Numpy and Scipy]
filename=sys.argv[1]
* https://docs.scipy.org/doc/numpy-dev/user/quickstart.html
* [https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf Cheat sheets]
* [https://www.geeksforgeeks.org/program-to-find-the-sum-of-each-row-and-each-column-of-a-matrix/ Program to find the Sum of each Row and each Column of a Matrix]


== pandas ==
try:
[https://www.makeuseof.com/pandas-manipulate-dataframes/ 30 pandas Commands for Manipulating DataFrames]
  f = open(filename)
except IOError:
    print("File %s does not exist!" % filename)
</syntaxhighlight>


Write a pandas dataframe to a text file using [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html to_csv()]. https://stackoverflow.com/a/41514539
Parsing command line arguments with '''getopt'''. Suppose we want to store in the dictionary the sequences bigger than a given length provided in the command line: 'processfasta.py -l 250 myfile.fa'
<pre>
<syntaxhighlight lang='python'>
a.to_csv('xgboost.txt', header=True, index=True, sep='\t')
#!/usr/bin/python
</pre>
import sys
import getopt


== [https://www.scipy.org/ scipy] ==
def usage():
* [http://blog.nextgenetics.net/?e=94 Hypergeometric test]
    print """
* [http://www2.stat.duke.edu/~ar182/rr/examples-gallery/PermutationTest.html Permutation Test]
processfasta.py: reads a FASTA file and builds a
dictionary with all sequence bigger than a given length


== seaborn ==
processfasta.py [-h] [-l <length>] <filename>
* https://seaborn.pydata.org/
* [https://github.com/georgezakinih/exploratory-data-analysis Examples of performing Exploratory Data Analysis for a few public clinical data sets]
* ChatGPT [https://twitter.com/arjunrajlab/status/1650134944641871878 GPT-4]


== matplotlib ==
-h          print this message
* https://matplotlib.org/users/installing.html
-l <length>  filter all sequences with a length
* [https://www.makeuseof.com/draw-graphs-jupyter-notebook/ How to Draw Graphs in Jupyter Notebook]
              smaller than <length>
              (default <length>=0)
<filename>  the file has to be in FASTA format


Installation. <syntaxhighlight lang='bash'>
"""
o, a = getopt.getopt(sys.argv[1:], 'l:h')  # 'l:' takes a value, 'h' is a flag
python -m pip install -U pip
opts = {} # empty dictionary
python -m pip install -U matplotlib
seqlen=0;


# https://stackoverflow.com/a/50328517
for k,v in o:
sudo apt-get install python3.5-tk
    opts[k] = v
if '-h' in opts:  # -h means the user wants help
    usage(); sys.exit()
if len(a) < 1:
    usage(); sys.exit("input fasta file is missing")
if '-l' in opts:
    seqlen = int(opts['-l'])  # getopt returns option values as strings
    if seqlen < 0:
        print("length of seq should be positive!"); sys.exit(0)
</syntaxhighlight>
</syntaxhighlight>


Example. <syntaxhighlight lang='python'>
=== stdin and stdout ===
from sklearn import datasets
<syntaxhighlight lang='python'>
iris = datasets.load_iris()
sys.stdin.read()
import matplotlib.pyplot as plt
iris = iris.data


# Scatterplot
sys.stdout.write("Some useful output.\n")
plt.scatter(iris[:,1], iris[:,2])
plt.show()


# Boxplot
sys.stderr.write("Warning: input file was not found\n")
plt.boxplot(iris[:,1])
plt.show()
 
# Histogram
plt.hist(iris[:,1])
plt.show()
</syntaxhighlight>
</syntaxhighlight>


== scikit-learn ==
=== Call external programs ===
[https://scikit-learn.org/stable/index.html scikit-learn]: Machine Learning in Python
<syntaxhighlight lang='python'>
import subprocess
subprocess.call(["ls", "-l"]) # return code indicates the success or failure of the execution


Installation. <syntaxhighlight lang='bash'>
subprocess.call(["tophat", "genome_mouse_idx", "PE_reads_1.fq.gz", "PE_reads_2.fq.gz"])
pip install -U scikit-learn
</syntaxhighlight>
 
Example.
<syntaxhighlight lang='python'>
$ python
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()
</syntaxhighlight>
</syntaxhighlight>


== PyTorch ==
== Exceptions ==
* https://pytorch.org/
[https://www.thegeekstuff.com/2019/05/python-try-except-examples/ 5 Python Examples to Handle Exceptions using try, except and finally]
* [https://github.com/PyTorchLightning/pytorch-lightning PyTorch Lightning] - The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.


== feedparser ==
== Debugging ==
[https://fedoramagazine.org/never-miss-magazines-article-build-rss-notification-system/ Never miss a Magazine article — build your own RSS notification system]
[https://www.makeuseof.com/debug-python-code/ How to Debug Your Python Code]


== Boto ==
== [http://biopython.org/wiki/Main_Page Biopython] & Pubmed ==
A Python interface to Amazon Web Services
* Parsers for various bioinformatics file formats (FASTA, Genbank)
* Access to online services like NCBI Entrez or Pubmed databases
* Interfaces to common bioinformatics programs such as BLAST, Clustalw and others.


* http://docs.pythonboto.org/en/latest/
<syntaxhighlight lang='python'>
* https://hpc.nih.gov/training/handouts/object_storage_class_2018_oct.pdf
import Bio
print(Bio.__version__)
</syntaxhighlight>


== PIL, Pillow ==
Running BLAST over the internet
* Installation <syntaxhighlight lang='bash'>sudo apt install python-imaging </syntaxhighlight>
<syntaxhighlight lang='python'>
* [https://stackoverflow.com/a/24103766 How I can load a font file with PIL.ImageFont.truetype without specifying the absolute path?]
from Bio.Blast import NCBIWWW
fasta_string = open("myseq.fa").read()
result_handle = NCBIWWW.qblast("blastn", "nt", fasta_string)
# blastn is the program to use
# nt is the database to search against
# default output is xml
help(NCBIWWW.qblast)
</syntaxhighlight>


== plotnine ==
The BLAST record
[https://www.r-bloggers.com/2020/11/python-and-r-part-2-visualizing-data-with-plotnine/ Python and R – Part 2: Visualizing Data with Plotnine]
<syntaxhighlight lang='python'>
from Bio.Blast import NCBIXML
blast_record = NCBIXML.read(result_handle)
</syntaxhighlight>


== nltk: Natural Language Toolkit ==
Parse BLAST output
https://www.nltk.org/
<syntaxhighlight lang='python'>
len(blast_record.alignments)


== pygame ==
E_VALUE_THRESH = 0.01
[https://opensource.com/article/20/10/learn-python-ebook Learn Python by creating a video game]
for alignment in blast_record.alignments:
  for hsp in alignment.hsps:
    if hsp.expect < E_VALUE_THRESH:
      print('***Alignment***')
      print('sequence:', alignment.title)
      print('length:', alignment.length)
      print('e value:', hsp.expect)
      print(hsp.query)
      print(hsp.match)
      print(hsp.sbjct)
</syntaxhighlight>


== scanpy ==
More help with Biopython
* [https://github.com/theislab/scanpy scanpy] and the [https://scanpy.readthedocs.io/en/stable/installation.html installation instruction]
* Biopython tutorial and cookbook: http://biopython.org/DIST/docs/tutorial/Tutorial.html
* [https://github.com/chriscainx/mnnpy mnnpy]
* Biopython FAQ: http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc5


= Trouble shooting =
== pubmed_parser ==
== ImportError: cannot import name main when running pip ==
[https://github.com/titipata/pubmed_parser Parser for Pubmed Open-Access XML Subset and MEDLINE XML Dataset]
https://stackoverflow.com/a/50187211


== Error: externally-managed-environment ==
== pyTest ==
See [[Python#pipx_.28alternative_to_pip3.29|pipx]]
* https://wiki.python.org/moin/PyTest
* https://docs.python-guide.org/writing/tests/


== TypeError: ‘module’ object is not callable ==
== pyc file ==
I was trying to run "bbknn.py" from [https://github.com/oxwang/fda_scRNA-seq/tree/master/4_Batch_correction/Code here].
[https://stackoverflow.com/a/3918716 What is the difference between .py and .pyc files? [duplicate]]. I have observed that this can cause a problem: after I modify a Python file, Python may keep using the old pyc file, so my change is not picked up (Raspberry Pi e-ink example).


[https://www.thecrazyprogrammer.com/2020/11/typeerror-module-object-is-not-callable.html Solve “TypeError: ‘module’ object is not callable” in Python], [https://stackoverflow.com/a/4534443 TypeError: 'module' object is not callable]
== Shutdown or restart OS ==
Below is tested on Raspbian
<syntaxhighlight lang='python'>
import os
os.system('sudo shutdown -h now')
</syntaxhighlight>


The problem is that I have a file called "bbknn.py" that contains "import bbknn", so Python imports my own script instead of the installed bbknn package. The solution is to rename my script "bbknn.py" ('''avoid MODULE.py''') to something else, such as "bbknnDemo.py".
= Popular python libraries =
[https://pythontips.com/2013/07/30/20-python-libraries-you-cant-live-without/ 20 Python libraries you can’t live without]
== psutil ==
* [https://programtalk.com/python-examples/psutil.cpu_percent/ psutil.cpu_percent() examples]. Inspired by the e-ink example from Raspberry Pi.
* https://github.com/arvydas/blinkstick-python/wiki/Example%3A-Display-CPU-usage
* https://www.liaoxuefeng.com/wiki/1016959663602400/1183565811281984
<syntaxhighlight lang='python'>
# pip install psutil --user
import psutil

for x in range(10):
    print(psutil.cpu_percent(interval=1))  # CPU usage in percent over a 1-second window
</syntaxhighlight>


[https://www.quora.com/When-I-import-my-module-in-python-it-automatically-runs-all-of-the-defined-functions-inside-of-it-How-do-I-prevent-it-from-auto-executing-my-functions-but-still-allow-me-to-call-them-in-my-main-script When I import my module in python, it automatically runs all of the defined functions inside of it. How do I prevent it from auto executing my functions, but still allow me to call them in my main script?]
== [http://www.numpy.org/ numpy] ==
* [https://engineering.ucsb.edu/~shell/che210d/numpy.pdf An introduction to Numpy and Scipy]
* https://docs.scipy.org/doc/numpy-dev/user/quickstart.html
* [https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf Cheat sheets]
* [https://www.geeksforgeeks.org/program-to-find-the-sum-of-each-row-and-each-column-of-a-matrix/ Program to find the Sum of each Row and each Column of a Matrix]


== Illegal instruction ==
== pandas ==
I got this error after I called python3 -c 'import scanpy'. [https://hpc.nih.gov/apps/python.html Python on Biowulf].
[https://www.makeuseof.com/pandas-manipulate-dataframes/ 30 pandas Commands for Manipulating DataFrames]


Write a pandas dataframe to a text file using [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html to_csv()]. https://stackoverflow.com/a/41514539
<pre>
<pre>
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
a.to_csv('xgboost.txt', header=True, index=True, sep='\t')
TMPDIR=/tmp bash Miniconda3-latest-Linux-x86_64.sh -p ~/conda -b
</pre>


source ~/conda/etc/profile.d/conda.sh # ~/conda/condabin is added to PATH
== [https://www.scipy.org/ scipy] ==
* [http://blog.nextgenetics.net/?e=94 Hypergeometric test]
* [http://www2.stat.duke.edu/~ar182/rr/examples-gallery/PermutationTest.html Permutation Test]


conda activate base
== seaborn ==
python -V # Python 3.9.4
* https://seaborn.pydata.org/
* [https://github.com/georgezakinih/exploratory-data-analysis Examples of performing Exploratory Data Analysis for a few public clinical data sets]
* ChatGPT [https://twitter.com/arjunrajlab/status/1650134944641871878 GPT-4]


conda create -n project1 pandas numpy scipy -y
== matplotlib ==
conda activate project1
* https://matplotlib.org/users/installing.html
pip3 install scanpy bbknn
* [https://www.makeuseof.com/draw-graphs-jupyter-notebook/ How to Draw Graphs in Jupyter Notebook]
ls ~/conda/envs/project1/lib/python3.9/site-packages
# bbknn and scanpy are there 
python3 -c 'import scanpy'
# Illegal instruction


conda info --env
Installation. <syntaxhighlight lang='bash'>
conda deactivate
python -m pip install -U pip
conda remove --all -n project1 -y
python -m pip install -U matplotlib
conda deactivate
</pre>


== No matching distribution found for XXX ==
# https://stackoverflow.com/a/50328517
Got an error ''No matching distribution found for lasagne==0.2.dev1'' when I ran '' 'pip install .' '' on [https://github.com/jaredleekatzman/DeepSurv DeepSurv].
sudo apt-get install python3.5-tk
</syntaxhighlight>


https://github.com/imatge-upc/saliency-salgan-2017/issues/29
Example. <syntaxhighlight lang='python'>
from sklearn import datasets
iris = datasets.load_iris()
import matplotlib.pyplot as plt
iris = iris.data


== Python AttributeError: 'module' object has no attribute 'SSL_ST_INIT' ==
# Scatterplot
See https://stackoverflow.com/a/52398193. I got this message after I ran ''sudo pip install --upgrade cryptography'' and ''pip show cryptography''. I tried to upgrade cryptography because of the following message:
plt.scatter(iris[:,1], iris[:,2])
<pre>
plt.show()
$ pip show protobuf
/home/brb/.local/lib/python2.7/site-packages/pip/_vendor/requests/__init__.py:83:
  RequestsDependencyWarning: Old version of cryptography ([1, 2, 3]) may cause slowdown.
  warnings.warn(warning, RequestsDependencyWarning)
Name: protobuf
...
</pre>


And OpenSSL & pyOpenSSL-0.15.1.egg-inf are under /usr/lib/python2.7/dist-packages directory on my Ubuntu 16.04.
# Boxplot
plt.boxplot(iris[:,1])
plt.show()


Note the following solutions do not work
# Histogram
<pre>
plt.hist(iris[:,1])
$ sudo pip uninstall pyopenssl
plt.show()
$ sudo pip install pyOpenSSL==16.2.0
</syntaxhighlight>
</pre>


I always get an error message
== scikit-learn ==
<pre>
[https://scikit-learn.org/stable/index.html scikit-learn]: Machine Learning in Python
...
  File "/usr/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 118, in <module>
    SSL_ST_INIT = _lib.SSL_ST_INIT
AttributeError: 'module' object has no attribute 'SSL_ST_INIT'
</pre>


And a quick solution is to do '''sudo rm -r /usr/local/lib/python2.7/dist-packages/OpenSSL'''. I also did '''sudo pip install pyopenssl''' but I did not follow [https://stackoverflow.com/a/53195423 this answer] ('''sudo apt install --reinstall python-openssl''').
Installation. <syntaxhighlight lang='bash'>
pip install -U scikit-learn
</syntaxhighlight>


== /usr/bin/env: ‘python’: No such file or directory ==
Example.
On [https://askubuntu.com/a/1234598 Ubuntu 20.04],
<syntaxhighlight lang='python'>
<pre>
$ python
sudo apt-get install python-is-python3
>>> from sklearn import datasets
</pre>
>>> iris = datasets.load_iris()
This solved an error when I used [https://yt-dl.org/ youtube-dl].
>>> digits = datasets.load_digits()
</syntaxhighlight>


= Projects based on python =
== PyTorch ==
* https://pytorch.org/
* [https://github.com/PyTorchLightning/pytorch-lightning PyTorch Lightning] - The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.


* [http://kevinmehall.net/p/pithos/ pithos] Pandora on linux
== feedparser ==
* Many Raspberry Pi GPIO projects
[https://fedoramagazine.org/never-miss-magazines-article-build-rss-notification-system/ Never miss a Magazine article — build your own RSS notification system]
* [http://csbio.unc.edu/genescissors/instruction.html GeneScissors] It also requires pip and scikit-learn packages.
* [http://keepnote.org KeepNote] It depends on Python 2.X, [http://www.sqlite.org sqlite] and [http://www.pygtk.org PyGTK].
* [http://www.zim-wiki.org Zim] It depends on Python, Gtk and the python-gtk bindings.
* [http://www.giuspen.com/cherrytree Cherrytree] It depends on Python2, Python-gtk2, Python-gtksourceview2, p7zip-full, python-enchant and python-dbus.


= Send emails =
== Boto ==
* [https://support.google.com/accounts/answer/6010255?hl=en&utm_source=google-account&utm_medium=profile-less-secure-apps-card Less secure apps & your Google Account]. To help keep your account secure, from May 30, 2022, ​​Google no longer supports the use of third-party apps or devices which ask you to sign in to your Google Account using only your username and password.
A Python interface to Amazon Web Services
** [https://pythonassets.com/posts/send-email-via-gmail-and-smtp/ Send Email via Gmail and SMTP] Use an '''App Password''' 2022/9. Click on Security -> 2-Step Verification (You may need to enter your PW first). Scroll to the bottom of the page, and you'll see the "App passwords" section. You can delete/create app passwords but you can't view any existing passwords.
* [https://www.makeuseof.com/python-send-email/ How to Send Automated Email Messages in Python 3] 2021/3


= GUI programming =
* http://docs.pythonboto.org/en/latest/
[https://www.raspberrypi.org/blog/create-graphical-user-interfaces-with-python/ New book: Create Graphical User Interfaces with Python]
* https://hpc.nih.gov/training/handouts/object_storage_class_2018_oct.pdf


== Qt for GUI development ==
== PIL, Pillow ==
* http://zetcode.com/gui/pyqt4/
* Installation <syntaxhighlight lang='bash'>sudo apt install python-imaging </syntaxhighlight>
* http://wiki.wildsong.biz/index.php/PyQt Create GUI in Qt Designer and convert/use it in PyQt.
* [https://stackoverflow.com/a/24103766 How I can load a font file with PIL.ImageFont.truetype without specifying the absolute path?]


= Python 3 =
== plotnine ==
* Python 2.7 will not be maintained past 2020. See https://pythonclock.org/.
[https://www.r-bloggers.com/2020/11/python-and-r-part-2-visualizing-data-with-plotnine/ Python and R – Part 2: Visualizing Data with Plotnine]
* [https://github.com/arogozhnikov/python3_with_pleasure Migrating to Python 3 with pleasure]
* [https://github.com/wesm/pydata-book Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython] by Wes McKinney.


== pip3 ==
== nltk: Natural Language Toolkit ==
Use '''pip3''' instead of '''pip''' for Python 3. For example,
https://www.nltk.org/
<syntaxhighlight lang='bash'>
pip3 install --upgrade pip


pip3 install -U scikit-learn
== pygame ==
[https://opensource.com/article/20/10/learn-python-ebook Learn Python by creating a video game]


pip3 install -U matplotlib
== scanpy ==
</syntaxhighlight>
* [https://github.com/theislab/scanpy scanpy] and the [https://scanpy.readthedocs.io/en/stable/installation.html installation instruction]
* [https://github.com/chriscainx/mnnpy mnnpy]


== http.server ==
= Trouble shooting =
[https://developers.google.com/web/tools/chrome-devtools/workspaces/ Edit Files With Workspaces]. The 'http.server' module is contained in python3.
== ImportError: cannot import name main when running pip ==
<pre>
https://stackoverflow.com/a/50187211
cd ~/website
python3 -m http.server
</pre>
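The same static server can also be started from a script. The following is a minimal sketch, assuming Python 3.7 or later (for the ''directory'' argument and '''ThreadingHTTPServer'''); the function name ''start_static_server'' is my own and not part of the standard library.

```python
import functools
import http.server
import threading


def start_static_server(directory, port=0):
    """Serve `directory` on localhost in a background thread and return the server.

    port=0 asks the operating system for any free port; the chosen port
    is available afterwards as server.server_address[1].
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Call ''server.shutdown()'' followed by ''server.server_close()'' when the server is no longer needed.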


= C vs Python =
== Error: externally-managed-environment ==
[https://www.makeuseof.com/c-python-core-differences/ C vs. Python: The Key Differences]
See [[Python#pipx_.28alternative_to_pip3.29|pipx]]
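
To check in advance whether pip is likely to refuse an install, one can look for the marker file described in PEP 668. This is only a sketch: the helper names are mine, and it assumes the distribution places the '''EXTERNALLY-MANAGED''' file in the standard library directory as PEP 668 specifies.

```python
import os
import sys
import sysconfig


def is_externally_managed():
    """Return True if the PEP 668 marker file sits next to the standard library."""
    marker = os.path.join(sysconfig.get_path("stdlib"), "EXTERNALLY-MANAGED")
    return os.path.exists(marker)


def in_virtualenv():
    """Return True inside a venv, where pip does not apply the marker."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("externally managed:", is_externally_managed())
    print("inside a virtual environment:", in_virtualenv())
```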
 
== TypeError: ‘module’ object is not callable ==
I was trying to run "bbknn.py" from [https://github.com/oxwang/fda_scRNA-seq/tree/master/4_Batch_correction/Code here].
 
[https://www.thecrazyprogrammer.com/2020/11/typeerror-module-object-is-not-callable.html Solve “TypeError: ‘module’ object is not callable” in Python], [https://stackoverflow.com/a/4534443 TypeError: 'module' object is not callable]
 
 The problem is that my script is called "bbknn.py" and it contains "import bbknn". Python searches the script's own directory first, so it imports my script instead of the installed bbknn package. The solution is to rename the script ('''avoid MODULE.py''') to something else, such as "bbknnDemo.py".
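
A small sketch of how to reproduce and diagnose this kind of error. The helper ''origin_of'' is a hypothetical name; it only wraps '''importlib.util.find_spec''' to show which file Python would load for a given module name, and ''math'' and ''json'' are used here only as stand-ins.

```python
import importlib.util
import math


def origin_of(module_name):
    """Return the file a module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec else None


# Calling a module object itself (instead of a function inside it)
# raises the TypeError discussed above.
try:
    math(2)
except TypeError as err:
    message = str(err)

if __name__ == "__main__":
    print(message)
    # If this prints a path inside my own project, my script shadows the package.
    print(origin_of("json"))
```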
 
[https://www.quora.com/When-I-import-my-module-in-python-it-automatically-runs-all-of-the-defined-functions-inside-of-it-How-do-I-prevent-it-from-auto-executing-my-functions-but-still-allow-me-to-call-them-in-my-main-script When I import my module in python, it automatically runs all of the defined functions inside of it. How do I prevent it from auto executing my functions, but still allow me to call them in my main script?]
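
The usual answer to that question is the ''__name__'' guard. A minimal sketch, reusing the ''add_three'' function from my reticulate test file; the ''main'' function is added here for illustration only.

```python
def add_three(x):
    """Importable helper: importing the module defines it but does not run it."""
    return x + 3


def main():
    # Code that should run only when the file is executed as a script.
    print(add_three(5))


if __name__ == "__main__":
    # True for "python3 thisfile.py", False for "import thisfile".
    main()
```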
 
== Illegal instruction ==
I got this error after I called python3 -c 'import scanpy'. [https://hpc.nih.gov/apps/python.html Python on Biowulf].


= R and Python: reticulate package =
* [https://blog.rstudio.com/2018/03/26/reticulate-r-interface-to-python/ reticulate: R interface to Python] (2018). [https://rstudio.github.io/reticulate Latest version].
<ul>
<li>[https://support.rstudio.com/hc/en-us/articles/360023654474-Installing-and-Configuring-Python-with-RStudio Installing and Configuring Python with RStudio]
<ul>
<li>The instruction is based on ''virtualenv''. But I'm following Biowulf's Python [https://hpc.nih.gov/apps/python.html#envs miniconda instruction] to create a new project/environment. One caveat is I need to run ''source ~/$USER/conda/etc/profile.d/conda.sh'' each time before I start R in order to make conda available OR I need to set the [https://rstudio.github.io/reticulate/reference/miniconda_path.html RETICULATE_MINICONDA_PATH] variable (see below). </li>
<li>The conda-related reticulate functions include conda_create(), [https://www.rdocumentation.org/packages/reticulate/versions/1.20/topics/use_python use_condaenv()], conda_install(), conda_list(), conda_remove() </li>
<li>Use '''py_config()''' to check the current python path and other python versions found. </li>
<li>My example
<pre>
<pre>
library(reticulate)
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Assume I followed Biowulf's instruction to create 'project1'
TMPDIR=/tmp bash Miniconda3-latest-Linux-x86_64.sh -p ~/conda -b
Sys.setenv(RETICULATE_MINICONDA_PATH = "~/conda")
 
conda_list()
source ~/conda/etc/profile.d/conda.sh # ~/conda/condabin is added to PATH
use_condaenv("project1", required=T)
 
py_config()
conda activate base
python -V # Python 3.9.4
 
conda create -n project1 pandas numpy scipy -y
conda activate project1
pip3 install scanpy bbknn
ls ~/conda/envs/project1/lib/python3.9/site-packages
# bbknn and scanpy are there 
python3 -c 'import scanpy'
# Illegal instruction
 
conda info --env
conda deactivate
conda remove --all -n project1 -y
conda deactivate
</pre>
</pre>
</li>
 
</ul>
== No matching distribution found for XXX ==
</li>
Got an error ''No matching distribution found for lasagne==0.2.dev1'' when I ran '' 'pip install .' '' on [https://github.com/jaredleekatzman/DeepSurv DeepSurv].
</ul>
 
* [[Rstudio#Python|RStudio -> Python]]
https://github.com/imatge-upc/saliency-salgan-2017/issues/29
* https://cran.r-project.org/web/packages/reticulate/index.html, [https://github.com/rstudio/reticulate Github]
 
** Using Python in R markdown
== Python AttributeError: 'module' object has no attribute 'SSL_ST_INIT' ==
** Importing Python modules and call its functions directly from R — '''import()''' function
See https://stackoverflow.com/a/52398193. I got this message after I ran ''sudo pip install --upgrade cryptography'' and ''pip show cryptography''. The reason I tried to upgrade cryptography was the following message:
** Sourcing Python scripts — '''source_python()''' function
<pre>
** Python REPL — The '''repl_python()''' function creates an interactive Python console within R.  
$ pip show protobuf
<ul>
/home/brb/.local/lib/python2.7/site-packages/pip/_vendor/requests/__init__.py:83:
<li>[https://rstudio.github.io/reticulate/articles/versions.html Python Version Configuration]. Suppose I have installed miniconda and created a new environment called 'project1'. Then after calling '''source ~/conda/etc/profile.d/conda.sh''' I can start in R
  RequestsDependencyWarning: Old version of cryptography ([1, 2, 3]) may cause slowdown.
  warnings.warn(warning, RequestsDependencyWarning)
Name: protobuf
...
</pre>
 
And OpenSSL & pyOpenSSL-0.15.1.egg-inf are under /usr/lib/python2.7/dist-packages directory on my Ubuntu 16.04.
 
Note the following solutions do not work
<pre>
<pre>
library(reticulate)
$ sudo pip uninstall pyopenssl
use_condaenv("project1", required = TRUE)
$ sudo pip install pyOpenSSL==16.2.0
</pre>
</li>
</ul>
* On my macOS, even though I have python3 installed, it still asks to install miniconda (/Users/$USER/Library/r-miniconda). So I get another version of Python3 in '''/Users/$USER/Library/r-miniconda/envs/r-reticulate/bin/python'''.
* I found RStudio IDE is better than PyCharm and Thonny editors.
* Install Python packages https://rstudio.github.io/reticulate/articles/python_packages.html
** Better to have [https://www.anaconda.com/distribution/ anaconda3] installed. 2.26G space is required on macOS.
** Running py_install("pandas") directly asks me to upgrade virtualenv
** Running virtualenv_create("r-reticulate") and then py_install("pandas") works
* Cheat sheet https://ugoproto.github.io/ugo_r_doc/pdf/reticulate.pdf
* [https://www.brodrigues.co/blog/2018-12-30-reticulate/ R or Python? Why not both? Using Anaconda Python within R with {reticulate}]
* [https://www.listendata.com/2018/03/run-python-from-r.html?m=1 Run Python from R]
* [https://www.statworx.com/de/blog/r-and-python-using-reticulate-to-get-the-best-of-both-worlds/ R and Python: Using reticulate to get the best of both worlds]. Note
** [https://rstudio.github.io/reticulate/articles/r_markdown.html RStudio v1.2 preview release includes support for using reticulate to execute Python chunks within R Notebooks]
** Error from my execution: ''ValueError: 'RBF' is not in list''
* [https://rviews.rstudio.com/2019/03/18/the-reticulate-package-solves-the-hardest-problem-in-data-science-people/ The reticulate package solves the hardest problem in data science: people]
* [https://rviews.rstudio.com/2019/06/10/reticulate-virtualenv-and-python-in-linux/ reticulate, virtualenv, and Python in Linux]
* Bugs
** [https://stackoverflow.com/a/49556037 Pass Python objects to R]: Works. Or use py_run_string()
** [https://stackoverflow.com/a/52542230 Cannot pass R variables to Python]: use source_python()
* [https://github.com/matloff/R-vs.-Python-for-Data-Science R vs Python for data science] by Norm Matloff.
* [https://bensstats.wordpress.com/2020/11/05/rvspython-5-1-making-the-game-even-with-pythons-best-practices/ RvsPython #5.1: Making the Game even with Python’s Best Practices]
* [https://bensstats.wordpress.com/2020/11/04/rvspython-5-using-monte-carlo-to-simulate-%CF%80/ RvsPython #5: Using Monte Carlo To Simulate π]
* [https://www.business-science.io/learn-r/2020/04/20/setup-python-in-r-with-rmarkdown.html How to Run Python's Scikit-Learn in R in 5 minutes]
<ul>
<li>Test python and markdown files
{{Pre}}
def add_three(x):
    z = x + 3
    return z
</pre>
</pre>


I always get an error message
<pre>
<pre>
---
...
title: "R Notebook"
  File "/usr/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 118, in <module>
output: html_notebook
    SSL_ST_INIT = _lib.SSL_ST_INIT
---
AttributeError: 'module' object has no attribute 'SSL_ST_INIT'
</pre>
 
And a quick solution is to do '''sudo rm -r /usr/local/lib/python2.7/dist-packages/OpenSSL'''. I also did '''sudo pip install pyopenssl''' but I did not follow [https://stackoverflow.com/a/53195423 this answer] ('''sudo apt install --reinstall python-openssl''').


```{r}
== /usr/bin/env: ‘python’: No such file or directory ==
library(reticulate)
On [https://askubuntu.com/a/1234598 Ubuntu 20.04],
py_discover_config()
<pre>
x <- 5
sudo apt-get install python-is-python3
source_python("test.py")
</pre>
y <- add_three(x)
This solved an error when I used [https://yt-dl.org/ youtube-dl].
print(y)
```


Pass R variables to Python. Works
= Projects based on python =
```{python}
a = 7
print(r.x)
```


Pass python variables to R. Works.
* [http://kevinmehall.net/p/pithos/ pithos] Pandora on linux
```{r}
* Many Raspberry Pi GPIO projects
py$a
* [http://csbio.unc.edu/genescissors/instruction.html GeneScissors] It also requires pip and scikit-learn packages.
py_run_string("y = 10"); py$y
* [http://keepnote.org KeepNote] It depends on Python 2.X, [http://www.sqlite.org sqlite] and [http://www.pygtk.org PyGTK].
```
* [http://www.zim-wiki.org Zim] It depends on Python, Gtk and the python-gtk bindings.
</pre>
* [http://www.giuspen.com/cherrytree Cherrytree] It depends on Python2, Python-gtk2, Python-gtksourceview2, p7zip-full, python-enchant and python-dbus.
</li>
</ul>
* [https://hutsons-hacks.info/reticulate-webinar-r-and-python-a-happy-union Reticulate webinar – R and Python – a happy union]
* [https://datascienceplus.com/linking-r-and-python-to-retrieve-financial-data-and-plot-a-candlestick/ Linking R and Python to retrieve financial data and plot a candlestick]
* [https://www.r-bloggers.com/2022/04/getting-started-with-python-using-r-and-reticulate/ Getting started with Python using R and reticulate]


== How to quit python ==
= Send emails =
Type '''exit''' and hit Enter. See https://rstudio.github.io/reticulate/.
* [https://support.google.com/accounts/answer/6010255?hl=en&utm_source=google-account&utm_medium=profile-less-secure-apps-card Less secure apps & your Google Account]. To help keep your account secure, from May 30, 2022, Google no longer supports the use of third-party apps or devices which ask you to sign in to your Google Account using only your username and password.
** [https://pythonassets.com/posts/send-email-via-gmail-and-smtp/ Send Email via Gmail and SMTP] Use an '''App Password''' 2022/9. Click on Security -> 2-Step Verification (You may need to enter your PW first). Scroll to the bottom of the page, and you'll see the "App passwords" section. You can delete/create app passwords but you can't view any existing passwords.
* [https://www.makeuseof.com/python-send-email/ How to Send Automated Email Messages in Python 3] 2021/3
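
A minimal sketch of sending a message through Gmail with an App Password, using only the standard library. The function names and all addresses are placeholders of mine; it assumes Gmail's SMTP-over-SSL endpoint (smtp.gmail.com, port 465) and an App Password created as described above.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Create a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_gmail(msg, user, app_password):
    """Log in with an App Password (not the account password) and send msg."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(user, app_password)
        server.send_message(msg)
```

Usage would look like ''send_via_gmail(build_message("me@gmail.com", "you@example.com", "Hello", "Test"), "me@gmail.com", APP_PASSWORD)'', where APP_PASSWORD is read from an environment variable or a file rather than written into the script.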


== R vs Python ==
= GUI programming =
* [https://www.business-science.io/business/2021/07/12/R-is-for-research-Python-is-for-production.html R is for Research, Python is for Production]
[https://www.raspberrypi.org/blog/create-graphical-user-interfaces-with-python/ New book: Create Graphical User Interfaces with Python]
* [https://datasciencetut.com/difference-between-r-and-python/ Difference between R and Python]


== Call R from Python ==
== Qt for GUI development ==
* [https://github.com/kpj/rwrap rwrap] Seamlessly integrate R packages into Python.
* http://zetcode.com/gui/pyqt4/
* [https://pypi.org/project/rpy2/ rpy2]
* http://wiki.wildsong.biz/index.php/PyQt Create GUI in Qt Designer and convert/use it in PyQt.


= Conda, Anaconda, miniconda =
= Python 3 =
* [[Docker#Conda|Docker]]
* Python 2.7 will not be maintained past 2020. See https://pythonclock.org/.
* [https://hpc.nih.gov/apps/python.html Python on Biowulf]. Users who need stable, reproducible environments are encouraged to install miniconda in their data directory and create their own private environments.
* [https://github.com/arogozhnikov/python3_with_pleasure Migrating to Python 3 with pleasure]
* [https://towardsdatascience.com/a-guide-to-conda-environments-bc6180fc533 The Definitive Guide to Conda Environments]
* [https://github.com/wesm/pydata-book Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython] by Wes McKinney.
 
== pip3 ==
Use '''pip3''' instead of '''pip''' for Python 3. For example,
<syntaxhighlight lang='bash'>
pip3 install --upgrade pip
 
pip3 install -U scikit-learn


== Private environment ==
pip3 install -U matplotlib
[https://hpc.nih.gov/docs/diy_installation/conda.html Conda on Biowulf] & [https://github.com/conda-forge/miniforge#mambaforge mambaforge]
</syntaxhighlight>
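
After installing or upgrading, the installed version can be confirmed from Python itself. A small sketch assuming Python 3.8 or later (for '''importlib.metadata'''); ''installed_version'' is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("pip", "scikit-learn", "matplotlib"):
        print(name, installed_version(name))
```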


== Transfer a conda environment to another computer: YAML files ==
== http.server ==
[https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html Managing environments]
[https://developers.google.com/web/tools/chrome-devtools/workspaces/ Edit Files With Workspaces]. The 'http.server' module is contained in python3.
<pre>
<pre>
# computer 1
cd ~/website
conda env export > environment.yml
python3 -m http.server
</pre>


# computer 2
= C vs Python =
conda env create -f environment.yml
[https://www.makeuseof.com/c-python-core-differences/ C vs. Python: The Key Differences]
</pre>


== Conda environment create, activate, deactivate, info (see a list) ==
= R and Python: reticulate package =
[https://conda.io/projects/conda/en/latest/user-guide/getting-started.html#managing-conda Getting started with conda]. More details are in [https://conda.io/projects/conda/en/latest/user-guide/tasks/index.html Tasks].
* [https://blog.rstudio.com/2018/03/26/reticulate-r-interface-to-python/ reticulate: R interface to Python] (2018). [https://rstudio.github.io/reticulate Latest version].
<syntaxhighlight lang='bash'>
<ul>
conda --version
<li>[https://support.rstudio.com/hc/en-us/articles/360023654474-Installing-and-Configuring-Python-with-RStudio Installing and Configuring Python with RStudio]
 
<ul>
# Manage environment
<li>The instruction is based on ''virtualenv''. But I'm following Biowulf's Python [https://hpc.nih.gov/apps/python.html#envs miniconda instruction] to create a new project/environment. One caveat is I need to run ''source ~/$USER/conda/etc/profile.d/conda.sh'' each time before I start R in order to make conda available OR I need to set the [https://rstudio.github.io/reticulate/reference/miniconda_path.html RETICULATE_MINICONDA_PATH] variable (see below). </li>
conda info --envs  # see a list of environments.  
<li>The conda-related reticulate functions include conda_create(), [https://www.rdocumentation.org/packages/reticulate/versions/1.20/topics/use_python use_condaenv()], conda_install(), conda_list(), conda_remove() </li>
                  # The active environment is the one with an asterisk (*)
<li>Use '''py_config()''' to check the current python path and other python versions found. </li>
# create a new environment
<li>My example
conda create --name myenv
<pre>
# remove an environment
library(reticulate)
conda remove --name myenv --all
# Assume I followed Biowulf's instruction to create 'project1'
 
Sys.setenv(RETICULATE_MINICONDA_PATH = "~/conda")
# Manage Python
conda_list()
conda create --name snakes python=3.5
use_condaenv("project1", required=T)
conda activate snakes # activate the environment created above
py_config()
conda info --envs
</pre>
python --version
</li>
conda activate  # Change your current environment back to the default (base)
</ul>
conda deactivate # exit any python virtualenv
</li>
 
</ul>
# Managing packages
* [[Rstudio#Python|RStudio -> Python]]
conda search beautifulsoup4
* https://cran.r-project.org/web/packages/reticulate/index.html, [https://github.com/rstudio/reticulate Github]
conda install beautifulsoup4
** Using Python in R markdown
conda list
** Importing Python modules and call its functions directly from R — '''import()''' function
 
** Sourcing Python scripts — '''source_python()''' function
# Updating Anaconda or Miniconda
** Python REPL — The '''repl_python()''' function creates an interactive Python console within R.
conda update conda
<ul>
</syntaxhighlight>
<li>[https://rstudio.github.io/reticulate/articles/versions.html Python Version Configuration]. Suppose I have installed miniconda and created a new environment called 'project1'. Then after calling '''source ~/conda/etc/profile.d/conda.sh''' I can start in R
 
<pre>
== Anaconda ==
library(reticulate)
* [https://research.computing.yale.edu/sites/default/files/files/anaconda.pdf Introduction to Anaconda]. Simplifies installation of Python packages
use_condaenv("project1", required = TRUE)
** Platform-independent package manager
</pre>
** Doesn’t require administrative privileges
</li>
** Installs non-Python library dependencies (MKL, HDF5, Boost)
</ul>
** Provides ”virtual environment” capabilities
* On my macOS, even though I have python3 installed, it still asks to install miniconda (/Users/$USER/Library/r-miniconda). So I get another version of Python3 in '''/Users/$USER/Library/r-miniconda/envs/r-reticulate/bin/python'''.
** Many channels exist that support additional packages
* I found RStudio IDE is better than PyCharm and Thonny editors.
* [https://docs.anaconda.com/anaconda/install/mac-os/ Install Anaconda on macOS]. Better to use the command line method in order to install it to the user's directory. The new python can be manually loaded into the shell by using '''source ~/.bash_profile'''. Like Ubuntu, anaconda3 is installed under ~/ directory. In addition, '''Anaconda-Navigator''' is available under Finder -> Applications.
* Install Python packages https://rstudio.github.io/reticulate/articles/python_packages.html
** Better to have [https://www.anaconda.com/distribution/ anaconda3] installed. 2.26G space is required on macOS.
** Running py_install("pandas") directly asks me to upgrade virtualenv
** Running virtualenv_create("r-reticulate") and then py_install("pandas") works
* Cheat sheet https://ugoproto.github.io/ugo_r_doc/pdf/reticulate.pdf
* [https://www.brodrigues.co/blog/2018-12-30-reticulate/ R or Python? Why not both? Using Anaconda Python within R with {reticulate}]
* [https://www.listendata.com/2018/03/run-python-from-r.html?m=1 Run Python from R]
* [https://www.statworx.com/de/blog/r-and-python-using-reticulate-to-get-the-best-of-both-worlds/ R and Python: Using reticulate to get the best of both worlds]. Note
** [https://rstudio.github.io/reticulate/articles/r_markdown.html RStudio v1.2 preview release includes support for using reticulate to execute Python chunks within R Notebooks]
** Error from my execution: ''ValueError: 'RBF' is not in list''
* [https://rviews.rstudio.com/2019/03/18/the-reticulate-package-solves-the-hardest-problem-in-data-science-people/ The reticulate package solves the hardest problem in data science: people]
* [https://rviews.rstudio.com/2019/06/10/reticulate-virtualenv-and-python-in-linux/ reticulate, virtualenv, and Python in Linux]
* Bugs
** [https://stackoverflow.com/a/49556037 Pass Python objects to R]: Works. Or use py_run_string()
** [https://stackoverflow.com/a/52542230 Cannot pass R variables to Python]: use source_python()
* [https://github.com/matloff/R-vs.-Python-for-Data-Science R vs Python for data science] by Norm Matloff.
* [https://bensstats.wordpress.com/2020/11/05/rvspython-5-1-making-the-game-even-with-pythons-best-practices/ RvsPython #5.1: Making the Game even with Python’s Best Practices]
* [https://bensstats.wordpress.com/2020/11/04/rvspython-5-using-monte-carlo-to-simulate-%CF%80/ RvsPython #5: Using Monte Carlo To Simulate π]
* [https://www.business-science.io/learn-r/2020/04/20/setup-python-in-r-with-rmarkdown.html How to Run Python's Scikit-Learn in R in 5 minutes]
<ul>
<ul>
<li>[https://www.digitalocean.com/community/tutorials/how-to-install-the-anaconda-python-distribution-on-ubuntu-16-04 How To Install the Anaconda Python Distribution on Ubuntu 16.04]. As we can see Anaconda3 will be installed under '''/home/$USER/anaconda3'''.
<li>Test python and markdown files
<ul>
{{Pre}}
<li>Download '''Anaconda3-2018.12-Linux-x86_64.sh''' from https://www.anaconda.com/distribution/#download-section </li>
def add_three(x):
<li>bash Anaconda3-2018.12-Linux-x86_64.sh </li>
    z = x + 3
<li>There is a question: <span style="color: red">Do you wish the installer to initialize Anaconda3</span>. If you answer Yes, it will run '''conda init''' & modify ~/.bashrc file. # <span style="color: red"> This will overwrite system's Python</span>. So the default python/python3 will now be in /home/$USER/anaconda3/bin/.
    return z
</pre>
 
<pre>
<pre>
Do you wish the installer to initialize Anaconda3
---
by running conda init? [yes|no]
title: "R Notebook"
[no] >>> yes
output: html_notebook
no change    /home/brb/anaconda3/condabin/conda
---
no change    /home/brb/anaconda3/bin/conda
no change    /home/brb/anaconda3/bin/conda-env
no change    /home/brb/anaconda3/bin/activate
no change    /home/brb/anaconda3/bin/deactivate
no change    /home/brb/anaconda3/etc/profile.d/conda.sh
no change    /home/brb/anaconda3/etc/fish/conf.d/conda.fish
no change    /home/brb/anaconda3/shell/condabin/Conda.psm1
no change    /home/brb/anaconda3/shell/condabin/conda-hook.ps1
no change    /home/brb/anaconda3/lib/python3.8/site-packages/xontrib/conda.xsh
no change    /home/brb/anaconda3/etc/profile.d/conda.csh
modified      /home/brb/.bashrc


==> For changes to take effect, close and re-open your current shell. <==
```{r}
library(reticulate)
py_discover_config()
x <- 5
source_python("test.py")
y <- add_three(x)
print(y)
```


If you'd prefer that conda's base environment not be activated on startup,
Pass R variables to Python. Works
  set the auto_activate_base parameter to false:
```{python}
a = 7
print(r.x)
```


conda config --set auto_activate_base false
Pass python variables to R. Works.
```{r}
py$a
py_run_string("y = 10"); py$y
```
</pre>
</pre>
If I choose not to modify .bashrc file,
</li>
<pre>
</ul>
Do you wish the installer to initialize Anaconda3
* [https://hutsons-hacks.info/reticulate-webinar-r-and-python-a-happy-union Reticulate webinar – R and Python – a happy union]
by running conda init? [yes|no]
* [https://datascienceplus.com/linking-r-and-python-to-retrieve-financial-data-and-plot-a-candlestick/ Linking R and Python to retrieve financial data and plot a candlestick]
[no] >>> no
* [https://www.r-bloggers.com/2022/04/getting-started-with-python-using-r-and-reticulate/ Getting started with Python using R and reticulate]


You have chosen to not have conda modify your shell scripts at all.
== How to quit python ==
To activate conda's base environment in your current shell session:
Type '''exit''' and hit Enter. See https://rstudio.github.io/reticulate/.


eval "$(/home/brb/anaconda3/bin/conda shell.YOUR_SHELL_NAME hook)"
== R vs Python ==
* [https://www.business-science.io/business/2021/07/12/R-is-for-research-Python-is-for-production.html R is for Research, Python is for Production]
* [https://datasciencetut.com/difference-between-r-and-python/ Difference between R and Python]


To install conda's shell functions for easier access, first activate, then:
== Call R from Python ==
* [https://github.com/kpj/rwrap rwrap] Seamlessly integrate R packages into Python.
* [https://pypi.org/project/rpy2/ rpy2]


conda init
= Conda, Anaconda, miniconda =
* [[Docker#Conda|Docker]]
* [https://hpc.nih.gov/apps/python.html Python on Biowulf]. Users who need stable, reproducible environments are encouraged to install miniconda in their data directory and create their own private environments.
* [https://towardsdatascience.com/a-guide-to-conda-environments-bc6180fc533 The Definitive Guide to Conda Environments]


If you'd prefer that conda's base environment not be activated on startup,
== Private environment ==
  set the auto_activate_base parameter to false:  
[https://hpc.nih.gov/docs/diy_installation/conda.html Conda on Biowulf] & [https://github.com/conda-forge/miniforge#mambaforge mambaforge]


conda config --set auto_activate_base false
== Transfer a conda environment to another computer: YAML files ==
[https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html Managing environments]
<pre>
# computer 1
conda env export > environment.yml


Thank you for installing Anaconda3!
# computer 2
conda env create -f environment.yml
</pre>


</pre></li>
== Conda environment create, activate, deactivate, info (see a list) ==
<li>'''Anaconda-Navigator''' (including jupyter notebook, Spyder IDE, ...) can be launched by typing '''anaconda-navigator''' in a terminal </li>
[https://conda.io/projects/conda/en/latest/user-guide/getting-started.html#managing-conda Getting started with conda]. More details are in [https://conda.io/projects/conda/en/latest/user-guide/tasks/index.html Tasks].
</ul>
<syntaxhighlight lang='bash'>
</li>
conda --version
</ul>
* [https://opensource.com/article/18/4/getting-started-anaconda-python Getting started with Anaconda Python for data science]  
* Differences:
** [https://stackoverflow.com/questions/30034840/what-are-the-differences-between-conda-and-anaconda What are the differences between Conda and Anaconda]
** [https://www.quora.com/What-is-the-comparison-among-conda-vs-pip-vs-anaconda What is the comparison among conda vs pip vs anaconda?]
** [https://bioconda.github.io/faqs.html#conda-anaconda-minconda What’s the difference between Anaconda, conda, and Miniconda?]
* Comparisons:
** [https://conda.io/docs/ Conda]: an open source package management system and environment management system
** [https://conda.io/miniconda.html Miniconda], which is a smaller alternative to Anaconda that is just conda and its dependencies. Once you have Miniconda, you can easily install Anaconda into it with '''conda install anaconda'''.
** [https://anaconda.org/ Anaconda]: Anaconda is a set of about a hundred packages including conda, numpy, scipy, ipython notebook, and so on.
*** [https://anaconda.org/r/rstudio RStudio]
*** [https://bioconda.github.io/ Bioconda] is a '''channel''' for the conda package manager specializing in bioinformatics software.
* [https://docs.anaconda.com/anaconda/install/uninstall/ Uninstall]
* Used in [https://github.com/MaxSalm/pdxBlacklist pdxBlacklist]
* [https://stringfestanalytics.com/what-is-open-source-distribution/ What is an open source software distribution?]


== Miniconda ==
# Manage environment
* https://docs.conda.io/en/latest/miniconda.html As you can see miniconda installers were separated by the Python version.
conda info --envs  # see a list of environments.  
* [https://ostechnix.com/how-to-install-miniconda-in-linux/ How To Install Miniconda In Linux] 2021. It includes Install Miniconda interactively, '''unattended installation''', '''Update Miniconda''', and '''Uninstall Miniconda'''. If you've chosen the default location, the installer will display “PREFIX=/var/home/<user>/miniconda3”. To manually activate conda's base environment, do '''/home/<user>/miniconda3/etc/profile.d/conda.sh''' where we assume miniconda is installed under /home/<user>/miniconda3 directory.
                  # The active environment is the one with an asterisk (*)
* [https://youtu.be/bbIG5d3bOmk Miniconda Installation for macOS users] 2019. At the end of installation, we see if we don't want conda's base environment to be activated on start up, we can do '''conda config --set auto_activate_base false'''
# create a new environment
* See also [https://hpc.nih.gov/apps/python.html Python on Biowulf] about how to specify '''prefix'''.
conda create --name myenv
* We can add/install a module to an existing environment. See [https://afni.nimh.nih.gov/pub/dist/doc/htmldoc/background_install/miniconda.html#sidenote Miniconda: Python(s) in a convenient setup].
# remove an environment
:<syntaxhighlight lang='bash'>
conda remove --name myenv --all
conda install -n <env_name> <package>
conda create -n myenv python=3 # create a new environment named “myenv” with Python 3 installed
  # after that, use "conda activate myenv" and use "conda install numpy" to install the numpy
</syntaxhighlight>


=== Install and "conda init" ===
# Manage Python
* Windows: screenshots are included [https://katiekodes.com/setup-python-windows-miniconda/ Setting up Python on Windows with Miniconda by Anaconda] & [https://docs.anaconda.com/free/anaconda/install/windows/ Anaconda documentation]. The default is not to add Anaconda to my PATH environment variable.
conda create --name snakes python=3.5
* Ubuntu: [https://varhowto.com/install-miniconda-ubuntu-20-04/ How to Install Miniconda on Ubuntu 20.04]. After installation, PATH variable will prepend '''~/miniconda3/condabin''' which contains only 1 file: '''conda'''.
conda activate snakes # activate the environment created above
* '''conda init'''
conda info --envs
** Running conda init initializes conda for shell interaction by writing some shell code in the relevant startup scripts of your shell (e.g. ~/.bashrc). This allows the conda command to interact more closely with the shell context and provides a cleaner PATH manipulation and snappier responses in some conda commands. The main advantage of running conda init is that it enables the use of the [https://docs.conda.io/projects/conda/en/latest/dev-guide/deep-dives/activation.html conda activate and conda deactivate] commands, which are used to activate and deactivate conda environments.
python --version
** We only need to call "conda init" once after installing conda, no matter how many conda environments we work with.
conda activate  # Change your current environment back to the default (base)
** One disadvantage of running conda init is that it can sometimes cause issues if the initialization is not done correctly or if there are conflicts with other configurations in your [https://stackoverflow.com/questions/58388190/conda-init-doesnt-work-in-bash-on-windows shell startup scripts]. However, these issues can usually be resolved by troubleshooting and making the necessary changes to your configuration.
conda deactivate # exit any python virtualenv


=== conda environment ===
# Managing packages
A conda environment is a directory that contains a specific collection of conda packages that you have installed.
conda search beautifulsoup4
conda install beautifulsoup4
conda list


You need to create an environment first before you can activate it. The conda activate command does not create an environment for you, it only activates an existing one.
# Updating Anaconda or Miniconda
conda update conda
</syntaxhighlight>


== Anaconda ==
* [https://research.computing.yale.edu/sites/default/files/files/anaconda.pdf Introduction to Anaconda]. Simplifies installation of Python packages
** Platform-independent package manager
** Doesn’t require administrative privileges
** Installs non-Python library dependencies (MKL, HDF5, Boost)
** Provides ”virtual environment” capabilities
** Many channels exist that support additional packages
* [https://docs.anaconda.com/anaconda/install/mac-os/ Install Anaconda on macOS]. Better to use the command line method in order to install it to the user's directory. The new python can be manually loaded into the shell by using '''source ~/.bash_profile'''. Like Ubuntu, anaconda3 is installed under ~/ directory. In addition, '''Anaconda-Navigator''' is available under Finder -> Applications.
<ul>
<li>[https://www.digitalocean.com/community/tutorials/how-to-install-the-anaconda-python-distribution-on-ubuntu-16-04 How To Install the Anaconda Python Distribution on Ubuntu 16.04]. As we can see Anaconda3 will be installed under '''/home/$USER/anaconda3'''.
<ul>
<li>Download '''Anaconda3-2018.12-Linux-x86_64.sh''' from https://www.anaconda.com/distribution/#download-section </li>
<li>bash Anaconda3-2018.12-Linux-x86_64.sh </li>
<li>The installer asks: <span style="color: red">Do you wish the installer to initialize Anaconda3</span>. If you answer yes, it runs '''conda init''' and modifies the ~/.bashrc file. <span style="color: red">This shadows the system's Python</span> (the system files are not overwritten, but Anaconda's bin directory comes first on PATH), so the default python/python3 will now be the one in /home/$USER/anaconda3/bin/.
<pre>
<pre>
conda create --name myenv
Do you wish the installer to initialize Anaconda3
 
by running conda init? [yes|no]
conda activate myenv
[no] >>> yes
</pre>
no change    /home/brb/anaconda3/condabin/conda
 
no change    /home/brb/anaconda3/bin/conda
Q: Where are the environments located? A: Conda environments are typically stored in the envs subdirectory of your Anaconda installation directory. For example, if you have an environment named myenv, it would be located in a directory like ~/anaconda3/envs/myenv. The exact path can be found by using '''conda env list''' command.
no change    /home/brb/anaconda3/bin/conda-env
 
no change    /home/brb/anaconda3/bin/activate
Q: How do I list all existing environments? A: To list all existing conda environments, you can use the conda env list or conda info --envs command. Here’s how you do it:
no change    /home/brb/anaconda3/bin/deactivate
<pre>
no change    /home/brb/anaconda3/etc/profile.d/conda.sh
conda env list
no change    /home/brb/anaconda3/etc/fish/conf.d/conda.fish
#
no change    /home/brb/anaconda3/shell/condabin/Conda.psm1
base      * /opt/conda
no change    /home/brb/anaconda3/shell/condabin/conda-hook.ps1
DrivR-Base  /opt/conda/envs/DrivR-Base
no change    /home/brb/anaconda3/lib/python3.8/site-packages/xontrib/conda.xsh
</pre>
no change    /home/brb/anaconda3/etc/profile.d/conda.csh
modified      /home/brb/.bashrc
 
==> For changes to take effect, close and re-open your current shell. <==


Q: How to quit a conda environment?
If you'd prefer that conda's base environment not be activated on startup,
<pre>
  set the auto_activate_base parameter to false:
conda deactivate  # Return to base
conda deactivate  # Exit base
</pre>


Q: Check the disk space used by a specific conda environment.
conda config --set auto_activate_base false
<pre>
du -sh /path/to/conda/envs/your_environment_name
</pre>
</pre>
 
If I choose not to modify .bashrc file,
Q: How to delete a conda environment?
<pre>
<pre>
conda deactivate
Do you wish the installer to initialize Anaconda3
conda env remove --name your_environment_name
by running conda init? [yes|no]
# OR
[no] >>> no
conda remove --name your_environment_name --all
</pre>


=== Install all anaconda packages ===
You have chosen to not have conda modify your shell scripts at all.
* https://stackoverflow.com/a/52316549 '''conda install anaconda'''
To activate conda's base environment in your current shell session:
* https://docs.conda.io/en/latest/miniconda.html '''conda create -n py3k anaconda python=3'''
* how much space is needed for installing anaconda? The minimum disk space required for installing Anaconda is 3 GB, but it is recommended to have at least 6 GB of free disk space available.


=== Uninstall miniconda ===
eval "$(/home/brb/anaconda3/bin/conda shell.YOUR_SHELL_NAME hook)"
# rm -rf ~/miniconda3
# nano ~/.bash_profile and delete conda initialize block


=== What's the purpose of the “base” (for best practices) in Anaconda? ===
To install conda's shell functions for easier access, first activate, then:
https://stackoverflow.com/a/56504279


=== Does Conda replace the need for virtualenv? ===
conda init
[https://stackoverflow.com/a/34398794 Yes]. Conda is not limited to Python but can be used for other languages too.


== GCC/gFortran ==
If you'd prefer that conda's base environment not be activated on startup,
* '''conda install gcc'''
  set the auto_activate_base parameter to false:  
* Using [https://anaconda.org/conda-forge/gfortran/  conda-forge] channel - '''conda install -c conda-forge gfortran'''


== Using R language with Anaconda ==
conda config --set auto_activate_base false
<ul>
 
<li>[https://docs.anaconda.com/free/anaconda/packages/using-r-language/ Using R language with Anaconda]
Thank you for installing Anaconda3!
<pre>
 
conda create -n r_env r-essentials r-base
</pre></li>
conda activate r_env
<li>'''Anaconda-Navigator''' (including jupyter notebook, Spyder IDE, ...) can be launched by typing '''anaconda-navigator''' in a terminal </li>
</pre>
</ul>
<li>What is the difference between installing a package with the '''install.packages()''' function in R and with the '''conda install''' command?
</li>
* The install.packages() function in R and the conda install command are two different ways to install R packages. The install.packages() function is used to install packages from the Comprehensive R Archive Network (CRAN), while the conda install command is used to install packages from the Anaconda repository.
</ul>
* One key difference between the two methods is that conda can manage dependencies across multiple programming languages, while install.packages() only manages dependencies within R.
* [https://opensource.com/article/18/4/getting-started-anaconda-python Getting started with Anaconda Python for data science]
* Another difference is that conda allows you to create and manage multiple isolated environments, each with its own set of packages. This can be useful if you want to have different versions of packages available for different projects. With install.packages(), all packages are installed in the same global library, which can make it more difficult to manage dependencies and avoid conflicts.
* Differences:
<li>
** [https://stackoverflow.com/questions/30034840/what-are-the-differences-between-conda-and-anaconda What are the differences between Conda and Anaconda]
[https://towardsdatascience.com/a-guide-to-conda-environments-bc6180fc533 The Definitive Guide to Conda Environments], [https://docs.anaconda.com/anaconda/user-guide/tasks/using-r-language/ Using R language with Anaconda]. '''Environments''' created with ''conda create'' live by default in the '''envs/''' folder of your Conda directory, whose path will look something like ''/Users/user-name/miniconda3/envs'' or ''/Users/user-name/anaconda3/envs''.
** [https://www.quora.com/What-is-the-comparison-among-conda-vs-pip-vs-anaconda What is the comparison among conda vs pip vs anaconda?]
<pre>
** [https://bioconda.github.io/faqs.html#conda-anaconda-minconda What’s the difference between Anaconda, conda, and Miniconda?]
Activate conda base                  Create a new env    Activate a new env          Deactivate an env
* Comparisons:
----------------------------> (base) ----------------->  -------------------> (r-env) -----------------> (base)
** [https://conda.io/docs/ Conda]: an open source package management system and environment management system
eval "$(conda shell.bash hook)"     conda create r-env  conda activate r-env        conda deactivate
** [https://conda.io/miniconda.html Miniconda], which is a smaller alternative to Anaconda that is just conda and its dependencies. Once you have Miniconda, you can easily install Anaconda into it with '''conda install anaconda'''.
</pre>
** [https://anaconda.org/ Anaconda]: Anaconda is a set of about a hundred packages including conda, numpy, scipy, ipython notebook, and so on.
<pre>
*** [https://anaconda.org/r/rstudio RStudio]
$ eval "$(/home/brb/anaconda3/bin/conda shell.bash hook)"
*** [https://bioconda.github.io/ Bioconda] is a '''channel''' for the conda package manager specializing in bioinformatics software.
(base) $ mkdir mypythonproj; cd mypythonproj  # This step seems not necessary
* [https://docs.anaconda.com/anaconda/install/uninstall/ Uninstall]
(base) $ conda create -n r-env r-base
* Used in [https://github.com/MaxSalm/pdxBlacklist pdxBlacklist]
...
* [https://stringfestanalytics.com/what-is-open-source-distribution/ What is an open source software distribution?]
#
 
# To activate this environment, use
== Miniconda ==
#
* https://docs.conda.io/en/latest/miniconda.html As you can see miniconda installers were separated by the Python version.
#    $ conda activate r-env
* [https://ostechnix.com/how-to-install-miniconda-in-linux/ How To Install Miniconda In Linux] 2021. It covers interactive installation, '''unattended installation''', '''Update Miniconda''', and '''Uninstall Miniconda'''. If you've chosen the default location, the installer will display “PREFIX=/var/home/<user>/miniconda3”. To manually make conda available in the current shell, run '''source /home/<user>/miniconda3/etc/profile.d/conda.sh''' and then '''conda activate''', assuming miniconda is installed under the /home/<user>/miniconda3 directory.
#
* [https://youtu.be/bbIG5d3bOmk Miniconda Installation for macOS users] 2019. At the end of installation, we see if we don't want conda's base environment to be activated on start up, we can do '''conda config --set auto_activate_base false'''
# To deactivate an active environment, use
* See also [https://hpc.nih.gov/apps/python.html Python on Biowulf] about how to specify '''prefix'''.
#
* We can add/install a module to an existing environment. See [https://afni.nimh.nih.gov/pub/dist/doc/htmldoc/background_install/miniconda.html#sidenote Miniconda: Python(s) in a convenient setup].
#    $ conda deactivate
:<syntaxhighlight lang='bash'>
(base) $ conda activate r-env
conda install -n <env_name> <package>
(r-env) $ ls anaconda3/envs
conda create -n myenv python=3 # create a new environment named “myenv” with Python 3 installed
r-env
  # after that, use "conda activate myenv" and use "conda install numpy" to install the numpy
(r-env) $ conda install r-essentials
</syntaxhighlight>
(r-env) $ which R
 
/home/brb/anaconda3/envs/r-env/bin/R
=== Install and "conda init" ===
(r-env) $ ls -la  # Still Empty
* Windows: screenshots are included [https://katiekodes.com/setup-python-windows-miniconda/ Setting up Python on Windows with Miniconda by Anaconda] & [https://docs.anaconda.com/free/anaconda/install/windows/ Anaconda documentation]. The default is not to add Anaconda to my PATH environment variable.
(r-env) $ R --version
* Ubuntu: [https://varhowto.com/install-miniconda-ubuntu-20-04/ How to Install Miniconda on Ubuntu 20.04]. After installation, PATH variable will prepend '''~/miniconda3/condabin''' which contains only 1 file: '''conda'''.
R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
* '''conda init'''
# Note that the current R version should be 4.0.3
** Running conda init initializes conda for shell interaction by writing some shell code into the relevant startup scripts of your shell (e.g., ~/.bashrc). This allows the conda command to interact more closely with the shell context and provides cleaner PATH manipulation and snappier responses in some conda commands. The main advantage of running conda init is that it enables the use of the [https://docs.conda.io/projects/conda/en/latest/dev-guide/deep-dives/activation.html conda activate and conda deactivate] commands, which are used to activate and deactivate conda environments.
(r-env) $ conda env list
** We only need to run "conda init" once after installing conda, no matter how many conda environments we will work with.
base                    /home/brb/anaconda3
** One disadvantage of running conda init is that it can sometimes cause issues if the initialization is not done correctly or if there are conflicts with other configurations in your [https://stackoverflow.com/questions/58388190/conda-init-doesnt-work-in-bash-on-windows shell startup scripts]. However, these issues can usually be resolved by troubleshooting and making the necessary changes to your configurat
r-env                *  /home/brb/anaconda3/envs/r-env
 
(r-env) $ conda deactivate
=== conda environment ===
(base)  $  
A conda environment is a directory that contains a specific collection of conda packages that you have installed.
</pre>
 
It seems better to keep the environment inside the project directory, so the '''python -m venv /path/to/new/environment''' method is preferred. You can also use '''conda create --prefix /path/to/new/environment'''. Placing environments outside of the default envs/ folder comes with some drawbacks; see 'The Definitive Guide to Conda Environments'.
You need to create an environment first before you can activate it. The conda activate command does not create an environment for you, it only activates an existing one.
</li>
 
<li> [https://conda-forge.org/ conda-forge channel], [https://conda-forge.org/docs/user/introduction.html A brief introduction], https://anaconda.org/conda-forge/r-base. Following the instruction seems to mess things up though the conda-forge says the latest version is 4.0.3 (3 years late).
<pre>
{{Pre}}
conda create --name myenv
$ eval "$(/home/brb/anaconda3/bin/conda shell.bash hook)"
 
(base) $ conda install -c conda-forge r-base
conda activate myenv
...
</pre>
## Package Plan ##
 
Q: Where are the environments located? A: Conda environments are typically stored in the envs subdirectory of your Anaconda installation directory. For example, if you have an environment named myenv, it would be located in a directory like ~/anaconda3/envs/myenv. The exact path can be found by using '''conda env list''' command.
 
Q: How do I list all existing environments? A: To list all existing conda environments, you can use the conda env list or conda info --envs command. Here’s how you do it:
<pre>
conda env list
#
base      * /opt/conda
DrivR-Base  /opt/conda/envs/DrivR-Base
</pre>
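The listing above is easy to post-process in a script. The following is a minimal sketch, assuming the plain-text layout shown above (name, an optional * marking the active environment, then the path) and paths without spaces. The names CondaEnv and parse_conda_env_list are hypothetical helpers, not part of conda.

```python
from typing import List, NamedTuple


class CondaEnv(NamedTuple):
    name: str     # empty string for environments listed by path only
    path: str
    active: bool  # True for the line marked with '*'


def parse_conda_env_list(text: str) -> List[CondaEnv]:
    """Parse plain-text `conda env list` output (layout assumed from the example above)."""
    envs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        tokens = line.split()
        active = "*" in tokens
        tokens = [t for t in tokens if t != "*"]
        if len(tokens) == 1:
            # only a path is shown (environment created with --prefix)
            envs.append(CondaEnv("", tokens[0], active))
        else:
            envs.append(CondaEnv(tokens[0], tokens[-1], active))
    return envs


# Sample text copied from the listing above.
sample = """\
#
base      * /opt/conda
DrivR-Base  /opt/conda/envs/DrivR-Base
"""
for env in parse_conda_env_list(sample):
    print(env)
```

In a real script the text could come from subprocess.run(["conda", "env", "list"], capture_output=True, text=True).stdout instead of the hard-coded sample.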
 
Q: How to quit a conda environment?
<pre>
conda deactivate  # Return to base
conda deactivate  # Exit base
</pre>
 
Q: Check the disk space used by a specific conda environment.
<pre>
du -sh /path/to/conda/envs/your_environment_name
</pre>
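When du is not available (for example on Windows), a similar number can be computed in Python. This is a rough sketch rather than a replacement for du: it sums apparent file sizes instead of disk blocks, and files that conda hard-links from its package cache may be counted even though they take no extra space. The function names dir_size_bytes and human are made up for this example, and the demo runs on a throwaway directory rather than a real environment.

```python
import os
import tempfile


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files below root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def human(n: float) -> str:
    """Format a byte count with a 1024-based unit suffix."""
    for unit in ("B", "K", "M", "G", "T"):
        if n < 1024 or unit == "T":
            return f"{n:.1f}{unit}"
        n /= 1024


# Demo on a temporary directory standing in for /path/to/conda/envs/<name>.
with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "lib"))
    with open(os.path.join(demo_dir, "a.bin"), "wb") as fh:
        fh.write(b"x" * 1000)
    with open(os.path.join(demo_dir, "lib", "b.bin"), "wb") as fh:
        fh.write(b"y" * 24)
    demo_total = dir_size_bytes(demo_dir)
    print(human(demo_total))
```

To measure a real environment, pass its path (as printed by conda env list) to dir_size_bytes.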
 
Q: How to delete a conda environment?
<pre>
conda deactivate
conda env remove --name your_environment_name
# OR
conda remove --name your_environment_name --all
</pre>
 
=== Install all anaconda packages ===
* https://stackoverflow.com/a/52316549 '''conda install anaconda'''
* https://docs.conda.io/en/latest/miniconda.html '''conda create -n py3k anaconda python=3'''
* how much space is needed for installing anaconda? The minimum disk space required for installing Anaconda is 3 GB, but it is recommended to have at least 6 GB of free disk space available.
 
=== Uninstall miniconda ===
# rm -rf ~/miniconda3
# nano ~/.bash_profile and delete conda initialize block
 
=== What's the purpose of the “base” (for best practices) in Anaconda? ===
https://stackoverflow.com/a/56504279
 
=== Does Conda replace the need for virtualenv? ===
[https://stackoverflow.com/a/34398794 Yes]. Conda is not limited to Python but can be used for other languages too.
 
== GCC/gFortran ==
* '''conda install gcc'''
* Using [https://anaconda.org/conda-forge/gfortran/  conda-forge] channel - '''conda install -c conda-forge gfortran'''
 
== Using R language with Anaconda ==
<ul>
<li>[https://docs.anaconda.com/free/anaconda/packages/using-r-language/ Using R language with Anaconda]
<pre>
conda create -n r_env r-essentials r-base
conda activate r_env
</pre>
<li>What is the difference between installing a package with the '''install.packages()''' function in R and with the '''conda install''' command?
* Both install R packages, but from different sources: install.packages() installs from the Comprehensive R Archive Network (CRAN) by default, while conda install installs pre-built packages from conda channels such as the Anaconda repository.
* One key difference between the two methods is that conda can manage dependencies across multiple programming languages, while install.packages() only manages dependencies within R.
* Another difference is that conda allows you to create and manage multiple isolated environments, each with its own set of packages. This can be useful if you want to have different versions of packages available for different projects. With install.packages(), packages go by default into a single shared library, which can make it more difficult to manage dependencies and avoid conflicts.
<li>
[https://towardsdatascience.com/a-guide-to-conda-environments-bc6180fc533 The Definitive Guide to Conda Environments], [https://docs.anaconda.com/anaconda/user-guide/tasks/using-r-language/ Using R language with Anaconda]. '''Environments''' created with ''conda create'' live by default in the '''envs/''' folder of your Conda directory, whose path will look something like ''/Users/user-name/miniconda3/envs'' or ''/Users/user-name/anaconda3/envs''.
<pre>
Activate conda base                  Create a new env    Activate a new env          Deactivate an env
----------------------------> (base) ----------------->  -------------------> (r-env) -----------------> (base)
eval "$(conda shell.bash hook)"     conda create r-env  conda activate r-env        conda deactivate
</pre>
<pre>
$ eval "$(/home/brb/anaconda3/bin/conda shell.bash hook)"
(base) $ mkdir mypythonproj; cd mypythonproj  # This step seems not necessary
(base) $ conda create -n r-env r-base
...
#
# To activate this environment, use
#
#    $ conda activate r-env
#
# To deactivate an active environment, use
#
#    $ conda deactivate
(base) $ conda activate r-env
(r-env) $ ls anaconda3/envs
r-env
(r-env) $ conda install r-essentials
(r-env) $ which R
/home/brb/anaconda3/envs/r-env/bin/R
(r-env) $ ls -la  # Still Empty
(r-env) $ R --version
R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
# Note that the current R version should be 4.0.3
(r-env) $ conda env list
base                    /home/brb/anaconda3
r-env                *  /home/brb/anaconda3/envs/r-env
(r-env) $ conda deactivate
(base)  $  
</pre>
It seems better to keep the environment inside the project directory, so the '''python -m venv /path/to/new/environment''' method is preferred. You can also use '''conda create --prefix /path/to/new/environment'''. Placing environments outside of the default envs/ folder comes with some drawbacks; see 'The Definitive Guide to Conda Environments'.
</li>
<li> [https://conda-forge.org/ conda-forge channel], [https://conda-forge.org/docs/user/introduction.html A brief introduction], https://anaconda.org/conda-forge/r-base. Following the instructions seems to mess things up, even though conda-forge says the latest version is 4.0.3 (3 years late).
{{Pre}}
$ eval "$(/home/brb/anaconda3/bin/conda shell.bash hook)"
(base) $ conda install -c conda-forge r-base
...
## Package Plan ##


   environment location: /home/brb/anaconda3
 
   added / updated specs:
     - r-base
...
Downloading and Extracting Packages
r-base-3.2.2    ...
(base) $ R --version
/home/brb/anaconda3/lib/R/bin/exec/R: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory
(base) $ which R
/home/brb/anaconda3/bin/R
</pre>
</pre>
</li>
</li>
</ul>
 
== Run R with Jupyter notebook ==
<ul>
<li>[https://marketsplash.com/tutorials/r/how-to-use-r-in-jupyter-notebooks/ How To Use R In Jupyter Notebooks: A Step-By-Step Approach]
<pre>
mkdir project; cd project
python3 -m venv myenv
source myenv/bin/activate
pip3 install jupyter
# Same terminal
R  # this is from the global environment
    # so the local environment for R does not work
</pre>
<pre>
install.packages('IRkernel')  # Install IRkernel from within R
        # Make sure build-essential has been installed before
        # running install.packages().
IRkernel::installspec()      # Make IRkernel available to JupyterLab
q()
</pre>
After running these commands in R, you should be able to select R as a kernel when creating a new notebook in JupyterLab.
<pre>
jupyter lab # automatically launch jupyter in a browser
 
Ctrl+c      # stop
deactivate
</pre>
 
<li>[https://docs.anaconda.com/anaconda/navigator/tutorials/r-lang/ Using the R programming language in Jupyter Notebook] (Anaconda)
 
<li>[https://developers.refinitiv.com/en/article-catalog/article/setup-jupyter-notebook-r Setup Jupyter Notebook for R] (Windows OS, no conda)
<li>[https://datatofish.com/r-jupyter-notebook/ How to Add R to Jupyter Notebook (full steps)] using Anaconda
<li>[https://www.storybench.org/install-r-jupyter-notebook/ How to install R on a Jupyter notebook] using homebrew
<li>[http://www.rebeccabarter.com/blog/2017-11-17-ggplot2_tutorial/ ggplot2: Mastering the basics] & [https://github.com/rlbarter/ggplot2-thw Jupyter Notebook]. To set up the Jupyter environment, see the [[Docker#Python_Jupyter_including_R|Docker method]].
<pre>
docker run --rm -p 8888:8888 \
      -e JUPYTER_ENABLE_LAB=yes \
      -v "$PWD":/home/jovyan \
      jupyter/datascience-notebook:r-4.0.3
</pre>
We first have to use "git clone https://github.com/rlbarter/ggplot2-thw.git" to download the repo and "cd ggplot2-thw". Then, after opening http://IP:8888/?token=XXXXXXX, we will see "ggplot2.ipynb" in the left panel. Double-clicking the file opens it in the notebook.
</li>
</ul>
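For the IRkernel steps in the first item above, it can help to check that the R kernel was actually registered. Jupyter registers each kernel as a directory containing a kernel.json file. The sketch below lists such directories using only the standard library. The function name list_kernelspecs is hypothetical, the demo uses a made-up temporary directory, and the assumption that IRkernel registers itself under the name "ir" in the per-user kernels folder (typically ~/.local/share/jupyter/kernels on Linux) should be verified with '''jupyter kernelspec list'''.

```python
import json
import os
import tempfile
from typing import Dict


def list_kernelspecs(kernels_dir: str) -> Dict[str, str]:
    """Map kernel name -> language for every <kernels_dir>/<name>/kernel.json."""
    found: Dict[str, str] = {}
    if not os.path.isdir(kernels_dir):
        return found
    for name in sorted(os.listdir(kernels_dir)):
        spec_file = os.path.join(kernels_dir, name, "kernel.json")
        if os.path.isfile(spec_file):
            with open(spec_file, encoding="utf-8") as fh:
                spec = json.load(fh)
            found[name] = spec.get("language", "unknown")
    return found


# Demo on a throwaway directory that mimics a kernels folder (contents are made up).
with tempfile.TemporaryDirectory() as demo_dir:
    for name, language in (("ir", "R"), ("python3", "python")):
        os.makedirs(os.path.join(demo_dir, name))
        with open(os.path.join(demo_dir, name, "kernel.json"), "w", encoding="utf-8") as fh:
            json.dump({"argv": [], "display_name": name, "language": language}, fh)
    demo_result = list_kernelspecs(demo_dir)
    print(demo_result)
```

Pointing list_kernelspecs at the real per-user kernels folder should show an entry whose language is R once IRkernel::installspec() has run.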
 
== Example 1: GEO2RNAseq ==
[[Genome#GEO2RNAseq|GEO2RNAseq]]
 
== Example 2: p-NET ==
[https://github.com/marakeby/pnet_prostate_paper Biologically informed deep neural network for prostate cancer classification and discovery] and the [https://www.nature.com/articles/s41586-021-03922-4 paper] 2021.
 
= Mamba =
<ul>
<li>'''Mamba''' is a high-performance package manager that is fully compatible with '''Conda''', the package management system widely used in the Python ecosystem. It was developed to provide a faster and more efficient alternative to Conda, addressing some of the performance issues, especially in terms of dependency resolution and package installation speed.
<li> https://github.com/mamba-org/mamba The Fast Cross-Platform Package Manager
<li> [https://hpc.nih.gov/apps/python.html#envs Biowulf]. Mambaforge: a derivative of miniconda that includes mamba and uses the conda-forge channel in place of the defaults channel
<li> [https://youtu.be/yeXDyF6_VwQ How to install Mamba on Ubuntu 21.10], [https://youtu.be/q0qn8EOOP6c How to install R using Mamba], [https://youtu.be/rlTXjOXjJGE How to install RStudio on Ubuntu 21.10 with R installed using Mamba]
<li> Debian 12
<ul>
<li>(This step is unnecessary, miniforge includes Conda already) Install [https://docs.anaconda.com/anaconda/install/linux/ anaconda] on debian. I choose 'no' at the final question.
<syntaxhighlight lang='sh'>
sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 \
          libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6
 
curl -O https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Linux-x86_64.sh
 
bash Anaconda3-2024.06-1-Linux-x86_64.sh
 
# Anaconda3 will now be installed into this location: /home/$USER/anaconda3
# Do you wish to update your shell profile to automatically initialize conda? no
# Log out and log in again
 
# verify conda
conda list
 
conda activate base
 
conda deactivate
</syntaxhighlight>
<li>Install [https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html mamba]. Note that I choose 'yes' at the final question.
<syntaxhighlight lang='sh'>
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"  # -L follows the GitHub redirect
bash Miniforge3-$(uname)-$(uname -m).sh
# press q to exit the agreement
# Miniforge3 will now be installed into this location:
# /home/brb/miniforge3
 
# To activate this environment, use:
#
#    micromamba activate /home/brb/miniforge3
#
# Or to execute a single command in this environment, use:
#
#    micromamba run -p /home/brb/miniforge3 mycommand
#
# ...
# You can undo this by running `conda init --reverse $SHELL`? [yes|no]
# [no] >>> yes    <----- IMPORTANT; o.w. mamba will not be available
# You have chosen to not have conda modify your shell scripts at all.
# To activate conda's base environment in your current shell session:
#
# eval "$(/home/brb/miniforge3/bin/conda shell.YOUR_SHELL_NAME hook)"
#
# To install conda's shell functions for easier access, first activate, then:
#
# conda init
#
# Thank you for installing Miniforge3!
</syntaxhighlight>
<li>Using mamba. Mamba commands are the same as Conda commands, so you can seamlessly switch between using the two.
<syntaxhighlight lang='sh'>
mamba create -n myenv python=3.6.12
 
mamba activate myenv
 
mamba install numpy=1.19.2  pandas=1.1.3
pip list
 
mamba deactivate
# deactivating an environment does not delete it; it simply changes your working context.
 
mamba env list
 
mamba env remove -n myenv
 
$ which mamba
/home/brb/miniforge3/condabin/mamba
$ which conda
/home/brb/miniforge3/condabin/conda
</syntaxhighlight>
</ul>
</ul>
</ul>
== Example 1: GEO2RNAseq ==
[[Genome#GEO2RNAseq|GEO2RNAseq]]
== Example 2: p-NET ==
[https://github.com/marakeby/pnet_prostate_paper Biologically informed deep neural network for prostate cancer classification and discovery] and the [https://www.nature.com/articles/s41586-021-03922-4 paper] 2021.
= Mamba =
* https://github.com/mamba-org/mamba The Fast Cross-Platform Package Manager
* [https://hpc.nih.gov/apps/python.html#envs Biowulf]. Mambaforge: a derivative of miniconda that includes mamba and uses the conda-forge channel in place of the defaults channel
* [https://youtu.be/yeXDyF6_VwQ How to install Mamba on Ubuntu 21.10], [https://youtu.be/q0qn8EOOP6c How to install R using Mamba], [https://youtu.be/rlTXjOXjJGE How to install RStudio on Ubuntu 21.10 with R installed using Mamba]


= Web framework =

Latest revision as of 12:07, 12 September 2024

Basic

Resources

Python end of life

https://endoflife.date/python or https://devguide.python.org/. By default, the end-of-life is scheduled 5 years after the first release, but can be adjusted by the release manager of each branch.
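The five-year rule can be turned into a quick estimate. The snippet below is only a sketch of that default rule: the function name approximate_eol is made up, and the real end-of-life date is whatever the release manager announces, so the result should be checked against the pages linked above.

```python
from datetime import date


def approximate_eol(first_release: date, years: int = 5) -> date:
    """Estimate end-of-life as `years` after the first release (default rule only)."""
    try:
        return first_release.replace(year=first_release.year + years)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February
        return first_release.replace(year=first_release.year + years, day=28)


# Python 3.10.0 was first released on 2021-10-04.
print(approximate_eol(date(2021, 10, 4)))
```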

Install, setup

Alias

How to Fix Common Python Installation Errors on macOS

nano ~/.bash_profile
# or
nano ~/.zshrc

Add the following line

alias python=python3

Ubuntu

How to Install the Latest Python Version on Ubuntu Linux

Mac

# check pip version
python -m pip --version

# install
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py

# Upgrading pip
python -m pip install -U pip
  • On my 2018 mac, the default python3 is at "/usr/local/bin/python3". In "~/Library/Python" directory, it has "2.7", "3.8" and "3.9".
  • On my 2021 mac Ventura, the default python3 is at "/usr/bin". But when we try to run 'python3', it asked to install the command line developer tools. After the installation we can use python3. Still, there is no "~/Library/Python" directory.

Multiple python versions

How to manage multiple Python versions and virtual environments 2018.

conda/mamba

conda create --name myenv python=3.6.12
conda activate myenv

mamba create --name myenv python=3.6.12
mamba activate myenv

pyenv

pyenv install 3.6.12
pyenv virtualenv 3.6.12 myenv
pyenv activate myenv

virtualenv

pip install virtualenv
virtualenv -p python3.6 myenv
source myenv/bin/activate

venv (python 3.3+)

  • For Python 3, venv is the usual choice because it has been part of the Python standard library since Python 3.3, so nothing extra needs to be installed. However, virtualenv is still popular, especially among developers who need more advanced features or compatibility with older Python versions. It offers more flexibility and can be used with both Python 2 and Python 3.
    • The primary purpose of venv is to create isolated environments for managing Python packages and dependencies; it does not install or select Python itself. The environment uses whichever interpreter runs the command, for instance python3.10 -m venv myenv.
    • On the other hand, virtualenv can choose among the Python versions already installed on the system for your virtual environments. For instance, virtualenv -p /usr/bin/python3.10 myenv.
  • Don’t Make This Mistake When You Start Your Python Project
  • Understanding Python's Virtual Environment Landscape: venv vs. virtualenv, Wrapper Mania, and Dependency Control
  • Here is another example
    ~/github/PUREE$ python3 -m venv myenv
    The virtual environment was not created successfully because ensurepip is not
    available.  On Debian/Ubuntu systems, you need to install the python3-venv
    package using the following command.
    
        apt install python3.10-venv
    ...
    ~/github/PUREE$ sudo apt install python3.10-venv
    ~/github/PUREE$ python3 -m venv myenv
    
    ~/github/PUREE$ source myenv/bin/activate
    (myenv) ~/github/PUREE$ which python
    /home/brb/github/PUREE/myenv/bin/python
    (myenv) ~/github/PUREE$ pip freeze > requirements.txt
    (myenv) ~/github/PUREE$ deactivate
    ~/github/PUREE$
    
    ~/github/PUREE$ ls myenv/bin
    activate      activate.fish  f2py   f2py3.10    pip   pip3.10  python3     wheel
    activate.csh  Activate.ps1   f2py3  normalizer  pip3  python   python3.10
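The same environment can be created from Python through the standard-library venv module, which also shows that a venv is tied to the interpreter that created it. This is a minimal sketch: the helper name create_env is hypothetical, the environment is built in a temporary directory, and with_pip=False is used on purpose so the example does not depend on ensurepip (the missing piece in the Debian/Ubuntu error above).

```python
import os
import sys
import tempfile
import venv


def create_env(env_dir: str) -> str:
    """Create a virtual environment without pip and return its pyvenv.cfg text."""
    # with_pip=False skips ensurepip, so this works even without python3-venv extras
    venv.create(env_dir, with_pip=False)
    with open(os.path.join(env_dir, "pyvenv.cfg"), encoding="utf-8") as fh:
        return fh.read()


with tempfile.TemporaryDirectory() as tmp:
    cfg_text = create_env(os.path.join(tmp, "myenv"))
    # pyvenv.cfg records the base interpreter ("home") and its version
    print(cfg_text)
    print("running interpreter:", sys.version.split()[0])
```

The version line in pyvenv.cfg matches the interpreter that ran the script, which is why a different Python version requires running venv with that interpreter (or using virtualenv -p).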

Online compiler

IDE

  • PyCharm
  • Thonny
  • Spyder
  • RStudio
    • Create a file (xxx.py)
    • Click the terminal tab. Type 'python' (or ipython3).
    • Use Ctrl/CMD + Alt + Enter to run your python code line by line or a chunk.

Visual Studio Code

The ipynb file can contain figures.

This (Harmony Manuscript) has several notebook files whose code was written in R, not Python.

I can use VS Code to open an ipynb file.

Conversion

nbdev

Emacs

Emacs Shell mode: how to send region to shell?

JupyterLab

  • What is the difference between Jupyter Notebook and JupyterLab?
  • Jupyter Notebook (classic) and JupyterLab are both web-based interactive computing environments for working with data and code, but they have some key differences in terms of their user interface, features, and capabilities.
  • JupyterLab is a more modern and powerful tool than Jupyter Notebook, and is recommended for users who want a more flexible and feature-rich interface for working with data and code. However, Jupyter Notebook remains a popular and widely used tool, particularly for working with Jupyter notebooks.
  • 7 Reasons Why You Should Use Jupyterlab for Data Science

Some resources

Online tools rendering ipynb

  • Github
  • NBViewer (nbviewer.jupyter.org)
  • Google Colaboratory (colab.research.google.com)
  • Binder (mybinder.org)
  • Kaggle Notebooks (kaggle.com)
  • Azure Notebooks (notebooks.azure.com)
  • Datalore (datalore.jetbrains.com)
  • Deepnote (deepnote.com)
  • CoCalc (cocalc.com)

Different installation methods

JupyterLab Desktop

pip/pip3 Jupyter

  • https://jupyter.org/
    which python3
    # /usr/bin/python3
    
    pip3 install jupyterlab
    jupyter-lab
    # http://localhost:8888/lab
    # The current directory will be available on the file browser panel in JupyterLab.
    

    On Mac, it shows the following when I run 'pip3 install jupyterlab'

    Installing collected packages: pip
      WARNING: The scripts pip, pip3 and pip3.9 are installed in '/Users/XXX/Library/Python/3.9/bin' which is not on PATH.
      Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
    Successfully installed pip-23.0
    WARNING: You are using pip version 21.2.4; however, version 23.0 is available.
    You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
    

    That is, I need to use /Users/XXX/Library/Python/3.9/bin/jupyter-lab to launch jupyter-lab OR add the path to ".zshrc"; see jupyterlab doc.

conda + Jupyter

conda install --yes jupyterlab && conda clean --yes --all

IPython shell

Extract python code from Jupyter notebook
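
A notebook file is plain JSON, so the code cells can be pulled out with the standard library alone. The sketch below assumes the nbformat 4 layout (a top-level "cells" list whose entries carry "cell_type" and "source"); the function name extract_code and the in-memory demo notebook are hypothetical and only serve as illustration.

```python
import json

def extract_code(notebook_text):
    """Return the source of every code cell in an nbformat-4 notebook."""
    nb = json.loads(notebook_text)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # "source" may be a single string or a list of lines
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks)

# A tiny in-memory notebook, used only for illustration
demo = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["x = 1\n", "print(x)"]},
        {"cell_type": "code", "source": "y = 2"},
    ],
})
print(extract_code(demo))
```

From the shell, jupyter nbconvert --to script notebook.ipynb does a similar job.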

Execute Javascript in a Jupyter Notebook

How to Execute Javascript in a Jupyter Notebook on Linux

Google colab

R programming

Setup Jupyter Notebook for R or Using R on Jupyter Notebook

To use R in JupyterLab, you will first need to install the IRkernel package in your R environment using the following command:

install.packages('IRkernel')

Once you have installed the IRkernel package, you can register it with JupyterLab using the following command in your R console:

IRkernel::installspec()

After you have registered the kernel, you can start a new Jupyter notebook or JupyterLab session and select the "R" kernel from the kernel dropdown menu. This will allow you to run R code in JupyterLab, including data analysis, visualization, and other tasks.

Xeus-R: a future-proof Jupyter kernel for R

Meet Xeus-R: a future-proof Jupyter kernel for R

Run Jupyter Notebooks on an Apple M1 Mac

Cheat sheet

The Most Frequently Asked Questions About Python Programming

https://www.makeuseof.com/tag/python-programming-faq/

Running

Interactively

Use Ctrl+d to quit.

How to run a python script file

python mypython.py

Run python statements from a command line

Use -c (command) option.

python -c "import psutil"

run python source code line by line

run python source code line by line

python -m pdb <script.py>

Install a new module

  • See an example of installing HTSeq.

Module != Package

  • A Python module is a single file containing Python code, while a package is a collection of modules organized in a specific way.
  • A module has a filename with the suffix .py added.

PyPI/Python Package Index

pip

pip uses PyPI as the default source for packages and their dependencies.

As an example, motionEye can be installed by pip install or pip2 install; see its wiki and source code on Github.

sudo apt-get install python-pip
pip --version
pip install SomePackage
pip show --files SomePackage
pip install --upgrade SomePackage
pip install --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip 
pip install --upgrade pip  # Upgrade itself
pip uninstall SomePackage

sudo apt install python3-pip
pip3 --version

Upgrade packages

How to Upgrade Python Packages with Pip

requirements.txt

List installed packages and their versions, location/directory

pip3 list -v

On my Ubuntu 20.04, the packages installed by pip3 are located in ~/.local/lib/python3.8/site-packages/. It does not matter where I issued the pip3 install command.
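
The same information can be read from inside Python. This is a minimal sketch using the standard importlib.metadata module (Python 3.8+); the helper name installed_packages is hypothetical, and the reported directory is the default install location of the current interpreter, which may differ from the per-user directory mentioned above.

```python
import sysconfig
from importlib import metadata

def installed_packages():
    """Return a sorted list of (name, version) for installed distributions."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken metadata
            found.add((name, dist.version))
    return sorted(found, key=lambda item: item[0].lower())

# Default site-packages directory of the running interpreter
print(sysconfig.get_path("purelib"))

# Show the first few installed distributions
for name, version in installed_packages()[:5]:
    print(name, version)
```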

The danger of upgrading pip

Don't use sudo + pip

https://askubuntu.com/questions/802544/is-sudo-pip-install-still-a-broken-practice

"--user" option in pip

$ pip install Pygments
...
OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/Pygments-2.2.0.dist-info'
/usr/local/lib/python2.7/dist-packages/pip-9.0.1-py2.7.egg/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
$ pip install --user Pygments
Collecting Pygments
  Using cached Pygments-2.2.0-py2.py3-none-any.whl
Installing collected packages: Pygments
Successfully installed Pygments-2.2.0

pip -t option

We can force pip to install a package into a user's directory (e.g. when a package is already installed in the global directory /usr/lib/python3/dist-packages but some applications cannot find it). See Pip install python package into a specific directory other than the default install location

pip3 install -t ~/.local/bin/python3.10/site-packages pytz

virtualenv

Python “Virtual Environments” allow us to install a Python package in an isolated location, rather than installing it globally.

  • How To Manage Python Packages Using Pip. First Create a new project folder and cd to the project folder in your terminal.
    # Python 2
    $ sudo pip install virtualenv
    $ virtualenv <DIR_NAME>
    $ source <DIR_NAME>/bin/activate
    (<DIR_NAME>) ~$ which python
    ....
    $ deactivate
    
    # For Python 3, https://docs.python.org/3/tutorial/venv.html it is more common to use venv instead
    $ python3 -m venv <DIR_NAME>  # DIR_NAME is also called an environment
    $ source <DIR_NAME>/bin/activate
    (<DIR_NAME>) ~$ which python
    ....
    $ deactivate
  • Python Tutorial: virtualenv and why you should use virtual environments. pip freeze.
    pip list
    pip freeze --local > requirements.txt
    ...
    pip install -r requirements.txt
    pip list
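
A virtual environment can also be created from Python itself with the standard venv module, which is what python3 -m venv runs. A minimal sketch, assuming a throwaway temporary directory; the helper name make_env is hypothetical, and with_pip=False is chosen here only to keep the example fast and offline.

```python
import tempfile
import venv
from pathlib import Path

def make_env(parent_dir, name="myenv"):
    """Create a virtual environment under parent_dir and return its path."""
    env_dir = Path(parent_dir) / name
    # with_pip=False skips bootstrapping pip; the command line
    # `python3 -m venv` installs pip into the environment by default
    venv.create(env_dir, with_pip=False)
    return env_dir

with tempfile.TemporaryDirectory() as tmp:
    env = make_env(tmp)
    # pyvenv.cfg records which base interpreter the environment uses
    print((env / "pyvenv.cfg").read_text())
```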
    

pipenv

  • Why Use Pipenv to Create a Python Environment?
  • Pipenv is a Python virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, pyenv and virtualenv. It automatically creates and manages a virtualenv for your projects, as well as adds/removes packages from your Pipfile as you install/uninstall packages. It also generates the ever-important Pipfile.lock, which is used to produce deterministic builds.
  • https://pipenv.pypa.io/en/latest/pipfile/

Poetry

  • https://python-poetry.org/
  • Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you. Poetry offers a poetry.lock lockfile to ensure repeatable installs, and can build your project for distribution.

pipx (alternative to pip3)

  • Pipx is a tool that helps you install and run end-user applications written in Python. It is similar to macOS’s brew, JavaScript’s npx, and Linux’s apt.
    • Pipx is focused on installing and managing Python packages that can be run from the command line directly as applications.
    • pipx is made specifically for application installation, as it adds isolation yet still makes the apps available in your shell: pipx creates an isolated environment for each application and its associated packages.
    python3 -m pip install --user pipx
    python3 -m pipx ensurepath
    # OR sudo apt install pipx   
    # Or pip3 install pipx
    
    pipx install <package_name> # no sudo needed
    pipx list
    pipx uninstall <package_name>
  • I needed an alternative to the pip utility because the pip command failed. Error: externally-managed-environment
  • Official website [1] - Install and Run Python Applications in Isolated Environments, github
  • pipx vs pip
    • pipx is made specifically for application installation and adds isolation yet still makes the apps available in your shell. pipx creates an isolated environment for each application and its associated packages. On the other hand, pip is a general-purpose package installer for both libraries and apps with no environment isolation.
    • If you want to install an application that has dependencies that conflict with other applications or libraries on your system, you can use pipx to create an isolated environment for that application and its dependencies. This way, you can avoid conflicts between different versions of the same package.
    • On my Ubuntu, pip installs packages to /usr/local/lib/python3.8/dist-packages/.
  • Pipx – Install And Run Python Applications In Isolated Environments
  • Run Python applications in virtual environments

python setup.py

If a package has been bundled by its creator using the standard approach to bundling modules (with Python’s distutils tool), all you need to do is download the package, uncompress it and type:

python setup.py build
sudo python setup.py install

For Python 2, the packages are installed under /usr/local/lib/python2.7/dist-packages/.

$ ls -l /usr/local/lib/python2.7/dist-packages/
total 12
-rw-r--r-- 1 root staff  273 Jan 12 13:45 easy-install.pth
drwxr-sr-x 4 root staff 4096 Jan 12 13:45 HTSeq-0.6.1p1-py2.7-linux-x86_64.egg
drwxr-sr-x 4 root staff 4096 Jan 12 13:42 pysam-0.9.1.4-py2.7-linux-x86_64.egg

python setup.py bdist_wheel

Get a list of installed modules

http://stackoverflow.com/questions/739993/how-can-i-get-a-list-of-locally-installed-python-modules

pydoc modules

Not helpful. See the pip list command.

Check installed packages' versions

If you install packages through pip, use

$ pip list
...
pyOpenSSL (0.13.1)
pyparsing (2.0.1)
pysam (0.10.0)
python-dateutil (1.5)
pytz (2013.7)
rudix (2016.12.13)
scipy (0.13.0b1)
setuptools (1.1.6)
singledispatch (3.4.0.3)
six (1.4.1)
tornado (4.4.2)
vboxapi (1.0)
xattr (0.6.4)
zope.interface (4.1.1)

And more information about a package by using pip show PACKAGE.

$ pip show pysam
Name: pysam
Version: 0.10.0
Summary: pysam
Home-page: https://github.com/pysam-developers/pysam
Author: Andreas Heger
Author-email: [email protected]
License: MIT
Location: /Users/XXX/Library/Python/2.7/lib/python/site-packages
Requires:

The following method works whether the package is installed from source or from a binary package

>>> import pysam
>>> print(pysam.__version__)
0.10.0
>>> print pysam.__version__
0.10.0
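
Not every package defines __version__. As a hedged alternative, the standard importlib.metadata module (Python 3.8+) reads the version from the installed metadata without importing the package; the helper name package_version and the second package name below are hypothetical.

```python
from importlib import metadata

def package_version(name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None

print(package_version("pip"))
print(package_version("surely-not-installed-pkg"))
```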

See http://hammelllab.labsites.cshl.edu/tetoolkit-faq/

Install a specific version of package through pip

https://stackoverflow.com/questions/5226311/installing-specific-package-versions-with-pip

For example, the pysam package is released frequently, and a new release (0.11.2.2) may introduce some bugs. So I had to install an older version (0.10.0 works for me on Mac El Capitan and Sierra).

$ sudo -H pip uninstall pysam
Uninstalling pysam-0.11.2.2:
......
$ sudo -H pip install pysam==0.10.0
Collecting pysam==0.10.0
  Downloading pysam-0.10.0.tar.gz (2.3MB)
    100% |████████████████████████████████| 2.3MB 418kB/s 
Installing collected packages: pysam
  Running setup.py install for pysam ... done
Successfully installed pysam-0.10.0

warning: Please check the permissions and owner of that directory

I got this message when I ran the 'sudo pip install PACKAGE' command as root.

See

python3-pip installed but pip3 command not found?

sudo apt-get remove python3-pip; sudo apt-get install python3-pip

DeepSurv example

https://github.com/jaredleekatzman/DeepSurv

git clone https://github.com/jaredleekatzman/DeepSurv.git
sudo cp /usr/bin/pip /usr/bin/pip.bak
sudo nano /usr/bin/pip # See https://stackoverflow.com/a/50187211 more detail

# Method 1 for Theano
sudo pip install theano
# Method 2 for Theano
pip install --user --upgrade https://github.com/Theano/Theano/archive/master.zip 

pip install --user --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip
cd DeepSurv/
pip install . --user
sudo apt install python-pytest
pip install h5py --user
sudo pip uninstall protobuf # https://stackoverflow.com/a/33623372
pip install protobuf --user
sudo apt install python-tk
py.test
============ test session starts ===========
platform linux2 -- Python 2.7.12, pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /home/brb/github/DeepSurv, inifile:
collected 7 items

tests/test_deepsurv.py .......

========== 7 passed in 5.77 seconds ========

How to list all installed modules

help('modules')  # the output is not pretty

Comment

  1. Use the comment symbol # for a single line
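  2. For a multi-line comment, a triple-quote delimiter """ is sometimes placed on each end. Attention: this creates a string literal, not a true comment, so prefer # at the start of each line.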

Python Comments from zentut.com.

Docstring

Try / Except

Try and Except in Python

try:
    number = int(input("Enter a number: "))
    print(number)
except ValueError:  # catch only the conversion error, not every exception
    print("Invalid Input")

if __name__ == "__main__":
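
The guard lets a file act both as an importable module and as a script: the block under it runs only when the file is executed directly, not when it is imported. A minimal sketch; the function names gc_fraction and main are hypothetical.

```python
def gc_fraction(dna):
    """Return the fraction of G and C bases in a DNA string."""
    dna = dna.lower()
    return (dna.count("g") + dna.count("c")) / len(dna)

def main():
    # Demo code that should run only when the file is executed directly
    result = gc_fraction("gatagc")
    print("GC fraction:", result)
    return result

if __name__ == "__main__":
    # __name__ is "__main__" for `python thisfile.py`,
    # but it is the module name when the file is imported
    main()
```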

How to Get the Current Directory in Python

How to Get the Current Directory in Python
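
Two standard-library ways to get the current working directory, shown as a short sketch. Both report the directory the interpreter was started from (or last changed to), which is not necessarily the directory containing the script.

```python
import os
from pathlib import Path

cwd_os = os.getcwd()    # current working directory as a string
cwd_path = Path.cwd()   # the same location as a Path object

print(cwd_os)
print(cwd_path)
print(cwd_path.name)    # last component of the path
```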

Import a compiled C module

string and string operators

Reference:

  1. Python for Genomic Data Science from coursera.
  2. Python Hello World and String Manipulation
# Here dna is 'gatagc'; the value after = is the result of the expression
dna[0]='g' 
dna[-1]='c' (start counting from the right)
dna[-2]='g'
dna[0:3]='gat' (the end always excluded)
dna[:3]='gat'
dna[2:]='tagc'
len(dna)=6
type(dna)
print(dna)
dna.count('c')
dna.upper()
dna.find('ag')=3  (only the first occurrence of 'ag' is reported)
dna.find('ag', 17) (start looking from position 17; returns -1 if not found)
dna.rfind('ag')   (search backwards in the string)
dna.islower()    (True)
dna.isupper()    (False)
dna.replace('a', 'A')
print(dna.upper().isupper())

Format

Format Specification Mini-Language
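
A few format specifications in action, as a small sketch. The variable names and values are made up for illustration; the same specs work in f-strings and in str.format().

```python
gc = 42.857142
gene = "BRCA1"

fixed = f"GC content: {gc:.2f} %"                 # two digits after the decimal point
aligned = "{:>10}|{:<10}|".format(gene, gene)     # right- and left-aligned, width 10
grouped = f"{1234567:,}"                          # thousands separator
scientific = f"{0.000123:.2e}"                    # scientific notation
padded = f"{7:03d}"                               # zero-padded integer, width 3

for text in (fixed, aligned, grouped, scientific, padded):
    print(text)
```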

Regular expression

The Beginner’s Guide to Regular Expressions With Python
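
A short sketch of the standard re module applied to a DNA string. The sequence is made up for illustration.

```python
import re

dna = "atgagtacgtgtaa"

# Start position of every 'gt' in the sequence
positions = [m.start() for m in re.finditer("gt", dna)]

# The three bases following the start codon, captured with a group
match = re.search(r"atg([acgt]{3})", dna)
first_codon = match.group(1) if match else None

# Remove every character that is not a, c, g or t
cleaned = re.sub(r"[^acgt]", "", "atgnn-c")

print(positions, first_codon, cleaned)
```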

User's input

dna=raw_input("Enter a DNA sequence: ")  # python 2
dna=input("Enter a DNA sequence: ")      # python 3

To convert a user's input (a string) to others

int(x[, base])
float(x)
str(x) #converts x to a string
str(65) # '65'

chr(x)  # converts an integer to a character
chr(65) # 'A'
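
User input is always a string, and a conversion fails with ValueError when the text is not a valid number. A minimal sketch of a forgiving conversion; the helper name to_int and its default argument are hypothetical.

```python
def to_int(text, base=10, default=None):
    """Convert text to an integer in the given base, or return default."""
    try:
        return int(text, base)
    except ValueError:
        return default

print(to_int("42"))        # decimal
print(to_int("ff", 16))    # hexadecimal
print(to_int("101", 2))    # binary
print(to_int("abc"))       # not a number, so the default is returned
print(float("3.5"), ord("A"), chr(65))
```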

Print

Why is parenthesis in print voluntary in Python 2.7?

Fancy Output

print("THE DNA's GC content is ", gc, "%") # gives too many digits following the dot
print("THE DNA's GC content is %5.3f %%" % gc) 
# the percent operator separates the formatting string from the value that
# replaces the format placeholder
print("%d" % 10.6)  # 10
print("%e" % 10.6)  # 1.060000e+01
print("%s" % dna)   # gatagc

Type

Built-in Functions, How to check type of variable (object) in Python

type(object)

List

A list is an ordered set of values

gene_expr=['gene', 5.16e-08, 0.001385, 7.33e-08]
print(gene_expr[2])
gene_expr[0]='Lif'

Slice a list (it will create a new list)

gene_expr[-3:]  # [5.16e-08, 0.001385, 7.33e-08]
gene_expr[1:3] = [6.09e-07]

Clear the list

gene_expr[:]=[]

List functions

Size of the list

len(gene_expr)

Delete an element

del gene_expr[1]

Extend/append to a list

gene_expr.extend([5.16e-08, 0.00123])

Count the number of times an element appears in a list

print(gene_expr.count('Lif'), gene_expr.count('gene'))

Reverse all elements in a list

gene_expr.reverse()
print(gene_expr)
help(list)

Lists as Stacks

stack=['a', 'b', 'c', 'd']
stack.append('e')

Sorting lists

mylist=[3, 31, 123, 1, 5]
sorted(mylist)
mylist  # not changed
mylist.sort()

mylist=['c', 'g', 'T', 'a', 'A']
mylist.sort()
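
By default, strings sort by character code, so uppercase letters come before lowercase ones. The sketch below shows how the key and reverse arguments change the order; the variable names are made up for illustration.

```python
mylist = ['c', 'g', 'T', 'a', 'A']
default_order = sorted(mylist)               # uppercase letters sort first
ignore_case = sorted(mylist, key=str.lower)  # compare lowercased copies

numbers = [3, 31, 123, 1, 5]
descending = sorted(numbers, reverse=True)

motifs = ["attccgt", "aggggggttttttcg", "gtagc"]
by_length = sorted(motifs, key=len)          # shortest motif first

print(default_order)
print(ignore_case)
print(descending)
print(by_length)
```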

Don't change an element in a string!


motif = 'nacggggtc'
motif[0] = 'a'    # ERROR

Tuples

A tuple consists of a number of values separated by commas, and is another standard sequence data type, like strings and lists.

t=1,2,3
t
t=(1,2,3)  # we may input tuples with or without surrounding parentheses

Sets

A set is an unordered collection with no duplicate elements.

brca1={'DNA repair', 'zinc ion binding'}
brca2={'protein binding', 'H4 histone'}
brca1 | brca2
brca1 & brca2
brca1 - brca2

Dictionaries

A dictionary is a collection of key and value pairs, with the requirement that the keys are unique (within one dictionary). Since Python 3.7, a dictionary preserves insertion order.

TF_motif = {'SP1' : 'gggcgg', 
            'C/EBP' : 'attgcgcaat',
            'ATF' : 'tgacgtca',
            'c-Myc' : 'cacgtg',
            'Oct-1' : 'atgcaaat'}
# Access
print("The recognition sequence for the ATF transcription is %s." % TF_motif['ATF']) 
# Update
TF_motif['AP-1'] = 'tgagtca'
# Delete
del TF_motif['SP1']
# Size of a dictionary
len(TF_motif)
# Get a list of all the 'keys' in a dictionary
list(TF_motif.keys())
# Get a list of all the 'values'
list(TF_motif.values())
# sort
sorted(TF_motif.keys())
sorted(TF_motif.values())

We can retrieve data from dictionaries using the items() method.

for name,seq in TF_motif.items():
    print(name, seq)

In summary, strings, lists and dictionaries are the most useful data types for bioinformatics.

if statement

dna=input('Enter DNA sequence: ')
if 'n' in dna :
  nbases=dna.count('n')
  print("dna sequence has %d undefined bases " % nbases)

if condition 1:
  do action 1
elif condition 2:
  do action 2
else:
  do action 3

Logical operators

Use and, or, not.

dna=input('Enter DNA sequence: ')
if 'n' in dna or 'N' in dna:
    nbases=dna.count('n')+dna.count('N')
    print("dna sequence has %d undefined bases " % nbases)
else:
    print("dna sequence has no undefined bases")

Loops

while

dna=input('Enter DNA sequence:')
pos=dna.find('gt', 0)

while pos>-1 :
    print("Donor splice site candidate at position %d" %pos)
    pos=dna.find('gt', pos+1)

for

motifs=["attccgt", "aggggggttttttcg", "gtagc"]
for m in motifs:
    print(m, len(m))

range

for i in range(4):
    print(i)
for i in range(1,10,2):
    print(i)

Problem: check whether all characters in a given protein sequence are valid amino acids.

protein='SDVIHRYKUUPAKSHGWYVCJRSRFTWMVWWRFRSCRA'
for i in range(len(protein)):
    if protein[i] not in 'ABCDEFGHIKLMNPQRSTVWXYZ': 
        print("this is not a valid protein sequence!")
        break

continue

protein='SDVIHRYKUUPAKSHGWYVCJRSRFTWMVWWRFRSCRA'
corrected_protein=''
for i in range(len(protein)):
    if protein[i] not in 'ABCDEFGHIKLMNPQRSTVWXYZ': 
        continue
    corrected_protein=corrected_protein+protein[i]
print("Corrected protein seq is %s" % corrected_protein)

else Statement used with loops

  • If used with a for loop, the else statement is executed when the loop has exhausted iterating the list
  • If used with a while loop, the else statement is executed when the condition becomes false
# Find all prime numbers smaller than a given integer
N=10
for y in range(2, N):
    for x in range(2, y):
        if y %x == 0:
            print(y, 'equals', x, '*', y//x)
            break
    else:
        # loop fell through without finding a factor
        print(y, 'is a prime number')

The pass statement is a placeholder

if motif not in dna:
    pass
else:
    some_function_here()

Functions

Get modular with Python function

def function_name(arguments) :
    function_code_block
    return output

For example,

def gc(dna) :
    "this function computes the gc perc of a dna seq"
    nbases=dna.count('n')+dna.count('N')
    gcpercent=float(dna.count('c')+dna.count('C')+dna.count('g')
                    +dna.count('G'))*100.0/(len(dna)-nbases)
    return gcpercent
gc('AAAAGTNNAGTCC')
help(gc)

SyntaxError: invalid syntax

https://stackoverflow.com/a/11890194

In the Python shell, add an empty line at the end of the function definition. E.g.

>>> def fun(a):
...     return a+1
... 
>>> fun(9)
10
>>> exit()

On a python script

def fun(a):
    return a+1
print(fun(9))

Debug functions

https://stackoverflow.com/a/4929267

You can launch a Python program through pdb by using pdb myscript.py or python -m pdb myscript.py

$ cat debug.py
def fun(a):
    a= a*2
    a= a*3
    return a+1
print(fun(5))

$ python -m pdb debug.py
> /home/pi/Downloads/debug.py(1)<module>()
-> def fun(a):
(Pdb) b fun
Breakpoint 1 at /home/pi/Downloads/debug.py:1
(Pdb) c
> /home/pi/Downloads/debug.py(2)fun()
-> a= a*2
(Pdb) n
> /home/pi/Downloads/debug.py(3)fun()
-> a= a*3
(Pdb) 
> /home/pi/Downloads/debug.py(4)fun()
-> return a+1
(Pdb) p a
30
(Pdb) n
--Return--
> /home/pi/Downloads/debug.py(4)fun()->31
-> return a+1
(Pdb) exit

Boolean functions

Problem: check if a given dna seq contains an in-frame stop codon

# The function must be defined before it is called
def has_stop_codon(dna) :
    "This function checks if given dna seq has in frame stop codons."
    stop_codon_found=False
    stop_codons=['tga', 'tag', 'taa']
    for i in range(0, len(dna), 3) :
        codon=dna[i:i+3].lower()
        if codon in stop_codons : 
            stop_codon_found=True
            break
    return stop_codon_found

dna=input("Enter a dna seq: ")
if (has_stop_codon(dna)) :
    print("input seq has an in frame stop codon.")
else :
    print("input seq has no in frame stop codon.")

Function default parameter values

Suppose the has_stop_codon function also accepts a frame argument (equal to 0, 1, or 2) which specifies in what frame we want to look for stop codons.

def has_stop_codon(dna, frame=0) :
    "This function checks if given dna seq has in frame stop codons."
    stop_codon_found=False
    stop_codons=['tga', 'tag', 'taa']
    for i in range(frame, len(dna), 3) :
        codon=dna[i:i+3].lower()
        if codon in stop_codons : 
            stop_codon_found=True
            break
    return stop_codon_found

dna="atgagcggccggct"
has_stop_codon(dna)    # False
has_stop_codon(dna, 0) # False
has_stop_codon(dna, 1) # True
has_stop_codon(frame=0, dna=dna)

More examples

Reverse complement of a dna sequence

def reversecomplement(seq):
    """Return the reverse complement of the dna string."""
    seq = reverse_string(seq)
    seq = complement(seq)
    return seq

reversecomplement('CCGGAAGAGCTTACTTAG')

To reverse a string

def reverse_string(seq):
    return seq[::-1]

reverse_string(dna)

Complement a DNA Sequence

def complement(dna):
    """Return the complementary sequence string."""
    basecomplement = {'A':'T', 'C':'G', 'G':'C', 'T':'A', 
                      'N':'N', 'a':'t', 'c':'g', 'g':'c', 't':'a', 'n':'n'} # dictionary
    letters = list(dna) # split the string into a list of characters
    letters = [basecomplement[base] for base in letters] # list comprehension
    return ''.join(letters)
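
As a hedged alternative to the dictionary lookup, str.maketrans() and str.translate() can do the same base-by-base substitution in one call. This is a sketch, not the course's method; the names COMPLEMENT_TABLE and reverse_complement are hypothetical.

```python
# Translation table mapping each base to its complement (both cases)
COMPLEMENT_TABLE = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complemented = seq.translate(COMPLEMENT_TABLE)
    return complemented[::-1]  # reverse with a slice

print(reverse_complement("CCGGAAGAGCTTACTTAG"))
```

Characters missing from the table are left unchanged by translate(), whereas the dictionary version raises KeyError for an unexpected letter.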

Split and Join functions

sentence="enzymes and other proteins come in many shapes"
sentence.split()  # split on all whitespaces
sentence.split('and') # use 'and' as the separator

'-'.join(['enzymes', 'and', 'other', 'proteins', 'come', 'in', 'many', 'shapes'])

Variable number of function arguments

def newfunction(fi, se, th, *rest):
  print("First: %s" % fi)
  print("Second: %s" % se)
  print("Third: %s" % th)
  print("Rest... %s" % (rest,))  # wrap the tuple so % treats it as one value
  return
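
Extra positional arguments are collected into a tuple, and extra keyword arguments (with **) into a dictionary. A minimal sketch; the function name describe and its arguments are hypothetical.

```python
def describe(first, *rest, **options):
    """Return the arguments as received, to show how they are collected."""
    return first, rest, options

# Extra positional values land in the tuple, keyword values in the dict
print(describe("a", "b", "c", sep="-"))

# A list can be unpacked into positional arguments with *
values = [1, 2, 3]
print(describe(*values))
```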

Modules and packages

Packages group multiple modules under one name, by using "dotted module names". For example, the module name A.B designates a submodule named B in a package named A. See What's the difference between a Python module and a Python package?

<dnautil.py>

#!/usr/bin/python
"""
dnautil module contains a few useful functions for dna seq
"""
def gc(dna) :
    blah
    blah
    return gcpercent

When a module is imported, Python first searches for a built-in module with that name.

If no built-in module is found, Python then searches for a file whose name is the imported module's name plus the extension .py:

  • in your current directory,
  • the directory where Python has been installed
  • in a path, i.e., a colon(':') separated list of file paths, stored in the environment variable PYTHONPATH.

You can use the sys.path variable from the sys built-in module to check the list of all directories where Python looks for files

import sys
sys.path

If the sys.path variable does not contain the directory where you put your module, you can extend it:

import os
sys.path.append(os.path.expanduser("~/python"))  # Python does not expand $USER inside a string
sys.path

Using modules (from PACKAGE/DIRNAME/FILENAME import CLASS)

from math import *
print(floor(3.7))

import dnautil
dna="atgagggctaggt"
gc(dna)          # gc is not defined
dnautil.gc(dna)  # Good

Import Names from a Module

from dnautil import *
gc(dna)          # OK

from dnautil import gc, has_stop_codon

Get modular with Python functions & Learn object-oriented programming with Python from opensource.com.

from...import vs import vs import...as

  • Difference between 'import' and 'from...import' in Python
  • Import, From and As Keywords in Python
  • `from … import` vs `import .`
  • Difference between import and from in Python. Python's import loads a Python module into its own namespace, so that you have to add the module name followed by a dot in front of references to any names from the imported module that you refer to:
    import feathers
    duster = feathers.ostrich("South Africa")

    from loads a Python module into the current namespace, so that you can refer to it without the need to mention the module name again:

    from feathers import *
    duster = ostrich("South Africa")
    • Question: Why are both import and from provided? Can't I always use from? Answer: If you were to load a lot of modules using from, you would find sooner or later that there was a conflict of names; from is fine for a small program but if it was used throughout a big program, you would hit problems from time to time
    • Question: Should I always use import then? Answer: No ... use import most of the time, but use from if you want to refer to the members of a module many, many times in the calling code; that way, you save yourself having to write "feathers." (in our example) time after time, but yet you don't end up with a cluttered namespace. You could describe this approach as being the best of both worlds.
  • from … import OR import … as for modules
  • Some examples:
    from numpy import array   # Run file; load specific 'attribute'
    arr = array([1,2,3])    # Use name directly: no need to qualify
    print(arr) # print [1 2 3]
    
    from math import pi
    pi # 3.141592653589793
    math.pi  # NameError: name 'math' is not defined

    VS

    import numpy   # Run file; load module as a whole
    arr = numpy.array([1,2,3])  # Use its attribute names: '.' to qualify
    print(arr) # print [1 2 3]
    
    import math
    math.pi # 3.141592653589793
    dir(math)

    VS

    import numpy as np
    dir(np)
    
    import math as m
    m.pi # 3.141592653589793
  • scRNA_cell_deconv_benchmark example.

help

from AAA import BBB
help(BBB)
help(BBB.FunctionName)

import BBB as CCC
help(CCC)

Packages & __init__.py

Each package in Python is a directory which contains a special file __init__.py (required in Python 2; since Python 3.3 a directory without it can still be imported as a namespace package). This file can be empty, and it indicates that the directory containing it is a Python package, so it can be imported the same way a module can be imported. https://docs.python.org/2/tutorial/modules.html

Example: suppose you have several modules dnautil.py, rnautil.py , and proteinutil.py. You want to group them in a package called "bioseq" which processes all types of biological sequences. The structure of the package:

bioseq/
  __init__.py
  dnautil.py
  rnautil.py
  proteinutil.py
  fasta/
    __init__.py
    fastautil.py
  fastq/
    __init__.py
    fastqutil.py

Loading from packages:

import bioseq.dnautil
bioseq.dnautil.gc(dna)

from bioseq import dnautil
dnautil.gc(dna)

from bioseq.fasta.fastautil import fastqseqread

Example

Building a Multiple Choice Quiz by freeCodeCamp.org

QuestionFile.py

class Question:
    def __init__(self, prompt, answer):
        self.prompt = prompt
        self.answer = answer

app.py

from QuestionFile import Question

question_prompts = [
    "What color are apples?\n(a) Red/Green\n(b) Purple\n(c) Orange\n\n",
    "What color are Bananas?\n(a) Teal\n(b) Magenta\n(c) Yellow\n\n",
    "What color are strawberries?\n(a) Yellow\n(b) Red\n(c) Blue\n\n"
]

questions = [
    Question(question_prompts[0], "a"),
    Question(question_prompts[1], "c"),
    Question(question_prompts[2], "b")
]

def run_test(questions):
    score = 0
    for question in questions:
        answer = input(question.prompt)
        if answer == question.answer:
            score += 1
    print("You got " + str(score) + " /" + str(len(questions))+ " correct")

run_test(questions)

Run the program by python3 app.py

Files - Communicate with the outside

f=open('myfile', 'r') # read
f=open('myfile')
f=open('myfile', 'w') # write
f=open('myfile', 'a') # append

Take care if a file does not exist

try:
    f = open('myfile')
except IOError:
    print("the file myfile does not exist!!")

Reading

for line in f:
    print(line)

Change positions within a file object

f.seek(0)  # go to the beginning of the file
f.read()

Read a single line

f.seek(0)
f.readline()

Write into a file

import os
f=open(os.path.expanduser("~/myfile"), 'a')  # Python does not expand $USER inside a string
f.write("this is a new line")
f.close()

>>> with open("file.txt", "w") as f:
...   f.write(str(object))
...
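
The with statement closes the file automatically, even if an error occurs inside the block. A small round-trip sketch, writing to a temporary directory so that no real file is touched; the file name is made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "myfile.txt")

with open(path, "w") as f:       # create or overwrite
    f.write("first line\n")

with open(path, "a") as f:       # append to the existing content
    f.write("second line\n")

with open(path) as f:            # read mode is the default
    lines = [line.rstrip("\n") for line in f]

print(lines)
print(f.closed)                  # the file is already closed here
```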

Importing large tab-delimited .txt file into Python

# R
write.table(iris[1:10,], file="iris.txt", sep="\t", quote=F, row.names=F)

# Python
import csv
with open('iris.txt') as f:
    reader = csv.reader(f, delimiter="\t")
    d = list(reader)
print(d[0][2])
print(d[1][2])

# Shell
$ python test_csv.py
Petal.Length
1.4

If the data are all numerical, we can use the numpy package.

# R
write.table(iris[1:10, 1:4], 
            file="~/Downloads/iris2.txt", 
            sep="\t", quote=F, row.names=F, col.names=F)

# Python
import numpy as np
d = np.loadtxt('iris2.txt', delimiter="\t")
print(d[0][2])
print(d[1][2])

# Shell
$ python test_csv2.py
1.4
1.4

Read text file from a URL

import urllib.request

url = "http://textfiles.com/adventure/aencounter.txt"
file = urllib.request.urlopen(url)

for line in file:
   print(line.decode('utf-8'))

Command line arguments

Suppose we run 'python processfasta.py myfile.fa' in the command line, then

import sys
print(sys.argv)  #  ['processfasta.py', 'myfile.fa']

More completely

#!/usr/bin/python
"""
processfasta.py builds a dictionary with all sequences from a FASTA file.
"""

import sys
filename=sys.argv[1]

try:
    f = open(filename)
except IOError:
    print("File %s does not exist!" % filename)

Parsing command line arguments with getopt. Suppose we want to store in the dictionary the sequences bigger than a given length provided in the command line: 'processfasta.py -l 250 myfile.fa'

#!/usr/bin/python
import sys
import getopt

def usage():
    print("""
processfasta.py: reads a FASTA file and builds a 
dictionary with all sequences bigger than a given length

processfasta.py [-h] [-l <length>] <filename>

 -h           print this message
 -l <length>  filter all sequences with a length
              smaller than <length>
              (default <length>=0)
 <filename>   the file has to be in FASTA format
""")

o, a = getopt.getopt(sys.argv[1:], 'l:h')
opts = {} # empty dictionary
seqlen=0

for k,v in o:
    opts[k] = v  # keys keep the leading dash, e.g. '-l'
if '-h' in opts.keys():  # -h means the user wants help
    usage(); sys.exit()
if len(a) < 1:
    usage(); sys.exit("input fasta file is missing")
if '-l' in opts.keys():
    if int(opts['-l']) <0 :  # option values arrive as strings
        print("length of seq should be positive!"); sys.exit(0)
    seqlen=int(opts['-l'])
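
The same command line can be described with the standard argparse module, which generates the help message and converts the option value to an integer. This is a sketch of an alternative, not part of the course material; the helper name build_parser and the long option --length are hypothetical.

```python
import argparse

def build_parser():
    """Build a parser for: processfasta.py [-h] [-l <length>] <filename>"""
    parser = argparse.ArgumentParser(
        prog="processfasta.py",
        description="Read a FASTA file and keep sequences of at least a given length.")
    parser.add_argument("-l", "--length", type=int, default=0,
                        help="filter out sequences shorter than this (default 0)")
    parser.add_argument("filename", help="input file in FASTA format")
    return parser

# Parse an explicit list here; a real script would call parse_args()
# without arguments so that sys.argv is used
args = build_parser().parse_args(["-l", "250", "myfile.fa"])
print(args.length, args.filename)
```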

stdin and stdout

sys.stdin.read()

sys.stdout.write("Some useful output.\n")

sys.stderr.write("Warning: input file was not found\n")

Call external programs

import subprocess
subprocess.call(["ls", "-l"]) # return code indicates the success or failure of the execution

subprocess.call(["tophat", "genome_mouse_idx", "PE_reads_1.fq.gz", "PE_reads_2.fq.gz"])
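
When the output of the external program is needed, subprocess.run() (Python 3.7+ for capture_output) returns an object holding the return code and the captured text. A minimal sketch; it launches the current Python interpreter as the child process so that it runs on any platform.

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)

print(result.returncode)
print(result.stdout.strip())
```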

Exceptions

5 Python Examples to Handle Exceptions using try, except and finally
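
A short sketch of how the try, except, else and finally clauses fit together: else runs only when no exception was raised, and finally runs in every case. The function name safe_divide and the log list are hypothetical and exist only to make the order of execution visible.

```python
def safe_divide(a, b):
    """Divide a by b and record which clauses were executed."""
    log = []
    try:
        result = a / b
    except ZeroDivisionError:
        log.append("except")   # runs only when the division fails
        result = None
    else:
        log.append("else")     # runs only when no exception was raised
    finally:
        log.append("finally")  # always runs
    return result, log

print(safe_divide(6, 3))
print(safe_divide(1, 0))
```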

Debugging

How to Debug Your Python Code

Biopython & Pubmed

  • Parsers for various bioinformatics file formats (FASTA, Genbank)
  • Access to online services like NCBI Entrez or Pubmed databases
  • Interfaces to common bioinformatics programs such as BLAST, Clustalw and others.
import Bio
print(Bio.__version__)

Running BLAST over the internet

from Bio.Blast import NCBIWWW
fasta_string = open("myseq.fa").read()
result_handle = NCBIWWW.qblast("blastn", "nt", fasta_string)
# blastn is the program to use
# nt is the database to search against
# default output is xml
help(NCBIWWW.qblast)

The BLAST record

from Bio.Blast import NCBIXML
blast_record = NCBIXML.read(result_handle)

Parse BLAST output

len(blast_record.alignments)

E_VALUE_THRESH = 0.01
for alignment in blast_record.alignments:
  for hsp in alignment.hsps:
    if hsp.expect < E_VALUE_THRESH:
      print('***Alignment***')
      print('sequence:', alignment.title)
      print('length:', alignment.length)
      print('e value:', hsp.expect)
      print(hsp.query)
      print(hsp.match)
      print(hsp.sbjct)

More help with Biopython

pubmed_parser

Parser for Pubmed Open-Access XML Subset and MEDLINE XML Dataset

pyTest

pyc file

What is the difference between .py and .pyc files? [duplicate]. I observed that a stale .pyc file can cause a problem: after I modified a Python file, the old .pyc file kept being used, so my change had no effect (Raspberry Pi e-ink example).
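
One way to rule out a stale cache is to delete the cached bytecode so that the next import recompiles the source. The sketch below assumes Python 3, where the cache lives in a __pycache__ directory next to the source file (Python 2 wrote the .pyc file beside the .py file instead). The helper names cached_pyc_path and remove_cached_pyc are hypothetical.

```python
import importlib.util
import os


def cached_pyc_path(source_path):
    """Return the cached .pyc path for source_path if that file exists, else None."""
    cached = importlib.util.cache_from_source(source_path)
    return cached if os.path.exists(cached) else None


def remove_cached_pyc(source_path):
    """Delete the cached bytecode (if any) and return the path that was removed."""
    cached = cached_pyc_path(source_path)
    if cached is not None:
        os.remove(cached)
    return cached
```

Setting the environment variable PYTHONDONTWRITEBYTECODE to a non-empty value stops Python from writing .pyc files at all.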

Shutdown or restart OS

Below is tested on Raspbian

import os
os.system('sudo shutdown -h now')

Popular python libraries

20 Python libraries you can’t live without

psutil

# pip install psutil --user
import psutil

for x in range(10):
    print(psutil.cpu_percent(interval=1))

numpy

pandas

30 pandas Commands for Manipulating DataFrames

Write a pandas dataframe to a text file using to_csv(). https://stackoverflow.com/a/41514539

a.to_csv('xgboost.txt', header=True, index=True, sep='\t')

scipy

seaborn

matplotlib

Installation.

python -m pip install -U pip
python -m pip install -U matplotlib

# https://stackoverflow.com/a/50328517
sudo apt-get install python3.5-tk

Example.

from sklearn import datasets
iris = datasets.load_iris()
import matplotlib.pyplot as plt
iris = iris.data

# Scatterplot
plt.scatter(iris[:,1], iris[:,2])
plt.show()

# Boxplot
plt.boxplot(iris[:,1])
plt.show()

# Histogram
plt.hist(iris[:,1])
plt.show()

scikit-learn

scikit-learn: Machine Learning in Python

Installation.

pip install -U scikit-learn

Example.

$ python
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()

PyTorch

feedparser

Never miss a Magazine article — build your own RSS notification system

Boto

A Python interface to Amazon Web Services

PIL, Pillow

plotnine

Python and R – Part 2: Visualizing Data with Plotnine

nltk: Natural Language Toolkit

https://www.nltk.org/

pygame

Learn Python by creating a video game

scanpy

Troubleshooting

ImportError: cannot import name main when running pip

https://stackoverflow.com/a/50187211

Error: externally-managed-environment

See pipx

TypeError: ‘module’ object is not callable

I was trying to run "bbknn.py" from here.

Solve “TypeError: ‘module’ object is not callable” in Python, TypeError: 'module' object is not callable

The problem is that my script is called "bbknn.py" and it contains "import bbknn", so Python imports my own script instead of the installed package. The solution is to rename the script (avoid naming it MODULE.py), e.g. to "bbknnDemo.py".
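
The error itself is easy to reproduce by calling any module object, and the module's __file__ attribute shows which file Python actually imported. The sketch below uses the standard-library json module purely as a stand-in for bbknn.

```python
import json  # stand-in for any module; bbknn is not needed for the demonstration

try:
    json()  # calling the module itself instead of a function inside it
except TypeError as err:
    message = str(err)
    print(message)

# If __file__ points at your own script (e.g. ./bbknn.py) rather than
# site-packages, the module name is shadowed by a local file
print(json.__file__)
```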

When I import my module in python, it automatically runs all of the defined functions inside of it. How do I prevent it from auto executing my functions, but still allow me to call them in my main script?
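
The usual answer to this question is the __name__ guard: code under it runs only when the file is executed directly, not when it is imported. A minimal sketch, where add_three and main are illustrative names:

```python
def add_three(x):
    """Function that other scripts can import and call."""
    return x + 3


def main():
    # code that should run only when this file is executed directly
    print(add_three(5))


if __name__ == "__main__":
    main()
```

If this is saved as mymodule.py (a hypothetical file name), 'import mymodule' defines add_three() without printing anything, while 'python mymodule.py' runs main().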

Illegal instruction

I got this error after I called python3 -c 'import scanpy'. Python on Biowulf.

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
TMPDIR=/tmp bash Miniconda3-latest-Linux-x86_64.sh -p ~/conda -b

source ~/conda/etc/profile.d/conda.sh # ~/conda/condabin is added to PATH

conda activate base
python -V # Python 3.9.4

conda create -n project1 pandas numpy scipy -y
conda activate project1
pip3 install scanpy bbknn
ls ~/conda/envs/project1/lib/python3.9/site-packages
# bbknn and scanpy are there  
python3 -c 'import scanpy'
# Illegal instruction

conda info --env
conda deactivate
conda remove --all -n project1 -y
conda deactivate

No matching distribution found for XXX

Got an error No matching distribution found for lasagne==0.2.dev1 when I ran 'pip install .' on DeepSurv.

https://github.com/imatge-upc/saliency-salgan-2017/issues/29

Python AttributeError: 'module' object has no attribute 'SSL_ST_INIT'

See https://stackoverflow.com/a/52398193. I got this message after I ran sudo pip install --upgrade cryptography and pip show cryptography. I tried to upgrade cryptography because of the following warning

$ pip show protobuf
/home/brb/.local/lib/python2.7/site-packages/pip/_vendor/requests/__init__.py:83: 
  RequestsDependencyWarning: Old version of cryptography ([1, 2, 3]) may cause slowdown.
  warnings.warn(warning, RequestsDependencyWarning)
Name: protobuf
...

And OpenSSL & pyOpenSSL-0.15.1.egg-inf are under /usr/lib/python2.7/dist-packages directory on my Ubuntu 16.04.

Note the following solutions do not work

$ sudo pip uninstall pyopenssl
$ sudo pip install pyOpenSSL==16.2.0

I always get an error message

...
  File "/usr/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 118, in <module>
    SSL_ST_INIT = _lib.SSL_ST_INIT
AttributeError: 'module' object has no attribute 'SSL_ST_INIT'

And a quick solution is to do sudo rm -r /usr/local/lib/python2.7/dist-packages/OpenSSL. I also did sudo pip install pyopenssl but I did not follow this answer (sudo apt install --reinstall python-openssl).

/usr/bin/env: ‘python’: No such file or directory

On Ubuntu 20.04,

sudo apt-get install python-is-python3

This solved an error when I used youtube-dl.
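
To check from Python which interpreter names are available on PATH, shutil.which() can be used. This is only a diagnostic sketch; it does not change anything on the system.

```python
import shutil
import sys

for name in ("python", "python3"):
    path = shutil.which(name)  # None when the name is not on PATH
    print(name, "->", path if path else "not found")

# the interpreter running this script, regardless of what is on PATH
print("current interpreter:", sys.executable)
```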

Projects based on python

  • pithos Pandora on linux
  • Many Raspberry Pi GPIO projects
  • GeneScissors It also requires pip and scikit-learn packages.
  • KeepNote It depends on Python 2.X, sqlite and PyGTK.
  • Zim It depends on Python, Gtk and the python-gtk bindings.
  • Cherrytree It depends on Python2, Python-gtk2, Python-gtksourceview2, p7zip-full, python-enchant and python-dbus.

Send emails

  • Less secure apps & your Google Account. To help keep your account secure, from May 30, 2022, Google no longer supports the use of third-party apps or devices which ask you to sign in to your Google Account using only your username and password.
    • Send Email via Gmail and SMTP Use an App Password 2022/9. Click on Security -> 2-Step Verification (You may need to enter your PW first). Scroll to the bottom of the page, and you'll see the "App passwords" section. You can delete/create app passwords but you can't view any existing passwords.
  • How to Send Automated Email Messages in Python 3 2021/3

GUI programming

New book: Create Graphical User Interfaces with Python

Qt for GUI development

Python 3

pip3

Use pip3 instead of pip for Python 3. For example,

pip3 install --upgrade pip

pip3 install -U scikit-learn

pip3 install -U matplotlib

http.server

Edit Files With Workspaces. The 'http.server' module is contained in python3.

cd ~/website
python3 -m http.server
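
The same server can also be started from a script. The sketch below serves a directory on localhost in a background thread, fetches the root page once, and shuts down again. The function name start_server is hypothetical, and port=0 is used so the operating system picks a free port.

```python
import functools
import http.client
import http.server
import threading


def start_server(directory=".", port=0):
    """Serve 'directory' on localhost in a background thread; port=0 picks a free port."""
    handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=directory)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_server(".")
port = server.server_address[1]

# Request the root page to confirm the server answers
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
status = conn.getresponse().status
conn.close()
print("http://127.0.0.1:%d/" % port, status)

server.shutdown()      # stop serve_forever()
server.server_close()  # release the socket
```

The directory argument of SimpleHTTPRequestHandler and ThreadingHTTPServer both require Python 3.7 or later.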

C vs Python

C vs. Python: The Key Differences

R and Python: reticulate package

  • Installing and Configuring Python with RStudio
    • The instruction is based on virtualenv. But I'm following Biowulf's Python miniconda instruction to create a new project/environment. One caveat is I need to run source ~/$USER/conda/etc/profile.d/conda.sh each time before I start R in order to make conda available OR I need to set the RETICULATE_MINICONDA_PATH variable (see below).
    • The conda-related reticulate functions include conda_create(), use_condaenv(), conda_install(), conda_list(), conda_remove()
    • Use py_config() to check the current python path and other python versions found.
    • My example
      library(reticulate)
      # Assume I followed Biowulf's instruction to create 'project1'
      Sys.setenv(RETICULATE_MINICONDA_PATH = "~/conda") 
      conda_list()
      use_condaenv("project1", required=T)
      py_config()
      
  • Python Version Configuration. Suppose I have installed miniconda and create a new environment called 'project1'. Then after calling source ~/conda/etc/profile.d/conda.sh I can start in R
    library(reticulate)
    use_condaenv("project1", required = TRUE)
    
  • Test python and markdown files
    def add_three(x):
        z = x + 3
        return z
    
    ---
    title: "R Notebook"
    output: html_notebook
    ---
    
    ```{r}
    library(reticulate)
    py_discover_config()
    x <- 5
    source_python("test.py")
    y <- add_three(x)
    print(y)
    ```
    
    Pass R variables to Python. Works
    ```{python}
    a = 7
    print(r.x)
    ```
    
    Pass python variables to R. Works.
    ```{r}
    py$a
    py_run_string("y = 10"); py$y
    ```
    

How to quit python

Type exit and hit Enter. See https://rstudio.github.io/reticulate/.

R vs Python

Call R from Python

  • rwrap Seamlessly integrate R packages into Python.
  • rpy2

Conda, Anaconda, miniconda

Private environment

Conda on Biowulf & mambaforge

Transfer a conda environment to another computer: YAML files

Managing environments

# computer 1
conda env export > environment.yml

# computer 2
conda env create -f environment.yml

Conda environment create, activate, deactivate, info (see a list)

Getting started with conda. More details are in Tasks.

conda --version

# Manage environment
conda info --envs  # see a list of environments. 
                   # The active environment is the one with an asterisk (*)
# create a new environment
conda create --name myenv
# remove an environment
conda remove --name myenv --all

# Manage Python
conda create --name snakes python=3.5
conda activate snakes # activate the environment created above
conda info --envs
python --version
conda activate  # Change your current environment back to the default (base)
conda deactivate # exit any python virtualenv

# Managing packages
conda search beautifulsoup4
conda install beautifulsoup4
conda list

# Updating Anaconda or Miniconda
conda update conda

Anaconda

  • Introduction to Anaconda. Simplifies installation of Python packages
    • Platform-independent package manager
    • Doesn’t require administrative privileges
    • Installs non-Python library dependencies (MKL, HDF5, Boost)
    • Provides "virtual environment" capabilities
    • Many channels exist that support additional packages
  • Install Anaconda on macOS. Better to use the command line method in order to install it to the user's directory. The new python can be manually loaded into the shell by using source ~/.bash_profile. Like Ubuntu, anaconda3 is installed under the ~/ directory. In addition, Anaconda-Navigator is available under Finder -> Applications.
  • How To Install the Anaconda Python Distribution on Ubuntu 16.04. As we can see Anaconda3 will be installed under /home/$USER/anaconda3.
    • Download Anaconda3-2018.12-Linux-x86_64.sh from https://www.anaconda.com/distribution/#download-section
    • bash Anaconda3-2018.12-Linux-x86_64.sh
    • There is a question: Do you wish the installer to initialize Anaconda3? If you answer yes, it will run conda init and modify the ~/.bashrc file. This does not overwrite the system's Python, but it puts Anaconda's Python first on PATH, so the default python/python3 will now be in /home/$USER/anaconda3/bin/.
      Do you wish the installer to initialize Anaconda3
      by running conda init? [yes|no]
      [no] >>> yes
      no change     /home/brb/anaconda3/condabin/conda
      no change     /home/brb/anaconda3/bin/conda
      no change     /home/brb/anaconda3/bin/conda-env
      no change     /home/brb/anaconda3/bin/activate
      no change     /home/brb/anaconda3/bin/deactivate
      no change     /home/brb/anaconda3/etc/profile.d/conda.sh
      no change     /home/brb/anaconda3/etc/fish/conf.d/conda.fish
      no change     /home/brb/anaconda3/shell/condabin/Conda.psm1
      no change     /home/brb/anaconda3/shell/condabin/conda-hook.ps1
      no change     /home/brb/anaconda3/lib/python3.8/site-packages/xontrib/conda.xsh
      no change     /home/brb/anaconda3/etc/profile.d/conda.csh
      modified      /home/brb/.bashrc
      
      ==> For changes to take effect, close and re-open your current shell. <==
      
      If you'd prefer that conda's base environment not be activated on startup, 
         set the auto_activate_base parameter to false: 
      
      conda config --set auto_activate_base false
      

      If I choose not to modify .bashrc file,

      Do you wish the installer to initialize Anaconda3
      by running conda init? [yes|no]
      [no] >>> no
      
      You have chosen to not have conda modify your shell scripts at all.
      To activate conda's base environment in your current shell session:
      
      eval "$(/home/brb/anaconda3/bin/conda shell.YOUR_SHELL_NAME hook)" 
      
      To install conda's shell functions for easier access, first activate, then:
      
      conda init
      
      If you'd prefer that conda's base environment not be activated on startup, 
         set the auto_activate_base parameter to false: 
      
      conda config --set auto_activate_base false
      
      Thank you for installing Anaconda3!
      
      
    • Anaconda-Navigator (including jupyter notebook, Spyder IDE, ...) can be launched by typing anaconda-navigator in a terminal

Miniconda

  • https://docs.conda.io/en/latest/miniconda.html As you can see miniconda installers were separated by the Python version.
  • How To Install Miniconda In Linux 2021. It covers interactive installation, unattended installation, updating Miniconda, and uninstalling Miniconda. If you've chosen the default location, the installer will display “PREFIX=/var/home/<user>/miniconda3”. To make conda available manually, run source /home/<user>/miniconda3/etc/profile.d/conda.sh (assuming miniconda is installed under the /home/<user>/miniconda3 directory) and then conda activate to enter the base environment.
  • Miniconda Installation for macOS users 2019. At the end of installation, we see if we don't want conda's base environment to be activated on start up, we can do conda config --set auto_activate_base false
  • See also Python on Biowulf about how to specify prefix.
  • We can add/install a module to an existing environment. See Miniconda: Python(s) in a convenient setup.
conda install -n <env_name> <package>
conda create -n myenv python=3 # create a new environment named “myenv” with Python 3 installed
   # after that, use "conda activate myenv" and use "conda install numpy" to install the numpy

Install and "conda init"

  • Windows: screenshots are included Setting up Python on Windows with Miniconda by Anaconda & Anaconda documentation. The default is not to add Anaconda to my PATH environment variable.
  • Ubuntu: How to Install Miniconda on Ubuntu 20.04. After installation, ~/miniconda3/condabin, which contains only one file (conda), is prepended to the PATH variable.
  • conda init
    • Running conda init initializes conda for shell interaction by writing some shell code in the relevant startup scripts of your shell (e.g., ~/.bashrc). This allows the conda command to interact more closely with the shell context and provides a cleaner PATH manipulation and snappier responses in some conda commands. The main advantage of running conda init is that it enables the use of the conda activate and conda deactivate commands, which are used to activate and deactivate conda environments.
    • We only need to call "conda init" once after installing conda, no matter how many conda environments we will work with.
    • One disadvantage of running conda init is that it can sometimes cause issues if the initialization is not done correctly or if there are conflicts with other configurations in your shell startup scripts. However, these issues can usually be resolved by troubleshooting and making the necessary changes to your configuration.

conda environment

A conda environment is a directory that contains a specific collection of conda packages that you have installed.

You need to create an environment first before you can activate it. The conda activate command does not create an environment for you, it only activates an existing one.

conda create --name myenv

conda activate myenv

Q: Where are the environments located? A: Conda environments are typically stored in the envs subdirectory of your Anaconda installation directory. For example, if you have an environment named myenv, it would be located in a directory like ~/anaconda3/envs/myenv. The exact path can be found by using conda env list command.
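
From inside Python, the active environment can be checked through the CONDA_DEFAULT_ENV and CONDA_PREFIX environment variables, which conda activate sets, and through sys.prefix. A minimal sketch; the function name describe_environment is hypothetical.

```python
import os
import sys


def describe_environment():
    """Report the active conda environment (if any) and the interpreter prefix."""
    return {
        # set by 'conda activate'; None when no conda environment is active
        "conda_env_name": os.environ.get("CONDA_DEFAULT_ENV"),
        "conda_env_path": os.environ.get("CONDA_PREFIX"),
        # root directory of the running interpreter, e.g. ~/anaconda3/envs/myenv
        "python_prefix": sys.prefix,
    }


for key, value in describe_environment().items():
    print(key, "=", value)
```

If python_prefix and conda_env_path disagree, the python found on PATH is probably not the one from the activated environment.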

Q: How do I list all existing environments? A: To list all existing conda environments, you can use the conda env list or conda info --envs command. Here’s how you do it:

conda env list
#
base       * /opt/conda
DrivR-Base   /opt/conda/envs/DrivR-Base

Q: How to quit a conda environment?

conda deactivate  # Return to base
conda deactivate  # Exit base 

Q: Check the disk space used by a specific conda environment.

du -sh /path/to/conda/envs/your_environment_name
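
A rough Python equivalent, useful where du is not available. It sums file sizes rather than disk blocks, so the number will generally differ somewhat from du. Conda may hard-link files from its package cache, in which case both tools can overstate the space that deleting the environment would free. The function name directory_size_bytes is hypothetical.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files under path, skipping symbolic links."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


if __name__ == "__main__":
    # Measures the current directory; replace "." with the environment path
    print("%.1f MB" % (directory_size_bytes(".") / 1e6))
```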

Q: How to delete a conda environment?

conda deactivate
conda env remove --name your_environment_name
# OR
conda remove --name your_environment_name --all

Install all anaconda packages

Uninstall miniconda

  1. rm -rf ~/miniconda3
  2. nano ~/.bash_profile and delete conda initialize block

What's the purpose of the “base” (for best practices) in Anaconda?

https://stackoverflow.com/a/56504279

Does Conda replace the need for virtualenv?

Yes. Conda is not limited to Python but can be used for other languages too.

GCC/gFortran

  • conda install gcc
  • Using conda-forge channel - conda install -c conda-forge gfortran

Using R language with Anaconda

  • Using R language with Anaconda
    conda create -n r_env r-essentials r-base
    conda activate r_env
    
  • What is the difference between installing a package with the install.packages() function in R and with the conda install command?
    • The install.packages() function in R and the conda install command are two different ways to install R packages. The install.packages() function is used to install packages from the Comprehensive R Archive Network (CRAN), while the conda install command is used to install packages from the Anaconda repository.
    • One key difference between the two methods is that conda can manage dependencies across multiple programming languages, while install.packages() only manages dependencies within R.
    • Another difference is that conda allows you to create and manage multiple isolated environments, each with its own set of packages. This can be useful if you want to have different versions of packages available for different projects. With install.packages(), all packages are installed in the same global library, which can make it more difficult to manage dependencies and avoid conflicts.
  • The Definitive Guide to Conda Environments, Using R language with Anaconda. Environments created with conda create live by default in the envs/ folder of your Conda directory, whose path will look something like /Users/user-name/miniconda3/envs or /Users/user-name/anaconda3/envs.
    Activate conda base                  Create a new env       Activate a new env           Deactivate an env
    ----------------------------> (base) -------------------->  -------------------> (r-env) -----------------> (base)
    eval "$(conda shell.bash hook)"      conda create -n r-env  conda activate r-env         conda deactivate
    
    $ eval "$(/home/brb/anaconda3/bin/conda shell.bash hook)"
    (base) $ mkdir mypythonproj; cd mypythonproj  # This step seems not necessary
    (base) $ conda create -n r-env r-base
    ...
    #
    # To activate this environment, use
    #
    #     $ conda activate r-env
    #
    # To deactivate an active environment, use
    #
    #     $ conda deactivate
    (base) $ conda activate r-env
    (r-env) $ ls anaconda3/envs
    r-env
    (r-env) $ conda install r-essentials
    (r-env) $ which R
    /home/brb/anaconda3/envs/r-env/bin/R
    (r-env) $ ls -la   # Still Empty
    (r-env) $ R --version
    R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
    # Note that the current R version should be 4.0.3
    (r-env) $ conda env list
    base                     /home/brb/anaconda3
    r-env                 *  /home/brb/anaconda3/envs/r-env
    (r-env) $ conda deactivate
    (base)  $ 
    

    It seems better to save the environment inside a project directory, so the python -m venv /path/to/new/environment method is preferred. You can also use conda create --prefix /path/to/new/environment. Placing environments outside of the default envs/ folder comes with some drawbacks; see 'The Definitive Guide to Conda Environments'.

  • conda-forge channel, A brief introduction, https://anaconda.org/conda-forge/r-base. Following the instruction seems to mess things up though the conda-forge says the latest version is 4.0.3 (3 years late).
    $ eval "$(/home/brb/anaconda3/bin/conda shell.bash hook)"
    (base) $ conda install -c conda-forge r-base
    ...
    ## Package Plan ##
    
      environment location: /home/brb/anaconda3
    
      added / updated specs:
        - r-base
    ...
    Downloading and Extracting Packages
    r-base-3.2.2    ...
    (base) $ R --version
    /home/brb/anaconda3/lib/R/bin/exec/R: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory
    (base) $ which R
    /home/brb/anaconda3/bin/R
    

Run R with Jupyter notebook

Example 1: GEO2RNAseq

GEO2RNAseq

Example 2: p-NET

Biologically informed deep neural network for prostate cancer classification and discovery and the paper 2021.

Mamba

  • Mamba is a high-performance package manager that is fully compatible with Conda, the package management system widely used in the Python ecosystem. It was developed to provide a faster and more efficient alternative to Conda, addressing some of the performance issues, especially in terms of dependency resolution and package installation speed.
  • https://github.com/mamba-org/mamba The Fast Cross-Platform Package Manager
  • Biowulf. Mambaforge: a derivative of miniconda that includes mamba and uses the conda-forge channel in place of the defaults channel
  • How to install Mamba on Ubuntu 21.10, How to install R using Mamba, How to install RStudio on Ubuntu 21.10 with R installed using Mamba
  • Debian 12
    • (This step is unnecessary, miniforge includes Conda already) Install anaconda on debian. I choose 'no' at the final question.
      sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 \
                libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6
      
      curl -O https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Linux-x86_64.sh
      
      bash Anaconda3-2024.06-1-Linux-x86_64.sh
      
      # Anaconda3 will now be installed into this location: /home/$USER/anaconda3
      # Do you wish to update your shell profile to automatically initialize conda? no
      # Log out and log in again
      
      # verify conda
      conda list
      
      conda activate base
      
      conda deactivate
    • Install mamba. Note that I choose 'yes' at the final question.
      curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh  # -L follows the GitHub redirect
      bash Miniforge3-$(uname)-$(uname -m).sh
      # press q to exit the agreement
      # Miniforge3 will now be installed into this location:
      # /home/brb/miniforge3
      
      # To activate this environment, use:
      #
      #    micromamba activate /home/brb/miniforge3
      #
      # Or to execute a single command in this environment, use:
      #
      #    micromamba run -p /home/brb/miniforge3 mycommand
      # 
      # ...
      # You can undo this by running `conda init --reverse $SHELL`? [yes|no]
      # [no] >>> yes    <----- IMPORTANT; o.w. mamba will not be available
      # You have chosen to not have conda modify your shell scripts at all.
      # To activate conda's base environment in your current shell session:
      # 
      # eval "$(/home/brb/miniforge3/bin/conda shell.YOUR_SHELL_NAME hook)"
      #
      # To install conda's shell functions for easier access, first activate, then:
      #
      # conda init
      # 
      # Thank you for installing Miniforge3!
    • Using mamba. Mamba commands are the same as Conda commands, so you can seamlessly switch between using the two.
      mamba create -n myenv python=3.6.12
      
      mamba activate myenv
      
      mamba install numpy=1.19.2  pandas=1.1.3
      pip list
      
      mamba deactivate
      # deactivating an environment does not delete it; it simply changes your working context. 
      
      mamba env list
      
      mamba env remove -n myenv
      
      $ which mamba
      /home/brb/miniforge3/condabin/mamba
      $ which conda
      /home/brb/miniforge3/condabin/conda

Web framework

Flask

Django

Games

Simulate gravity in your Python game