PDF: Difference between revisions
= Rmd to PDF =
[https://datawookie.dev/blog/2021/05/using-pagedown-in-docker/ Using {pagedown} in Docker]
Revision as of 19:06, 13 June 2021
Ubuntu PDF viewer
Linux PDF Viewer: Best 15 PDF Readers Reviewed for Linux Users
- Okular (install through app store, annotation function, trim margins/selection) Best
- How to use annotations, How to annotate documents using Okular. Click Tools -> Review to display the annotation tools.
- Click 'Reviews' icon to see a list of annotations that were made.
- Adobe Reader
- Qoppa PDF Studio
- Foxit Reader (By default it will be installed to ~/opt/foxitsoftware/foxitreader). It freezes my Pop_OS 20.04.
- MuPDF (lightweight, seems no thumbnail option, no GUI interface)
- XPDF
- Qpdfview
- GNU GV
- Zathura
- Atril Document Reader
- ePDF Viewer
- Calibre
- Google Drive
- Master PDF Editor
Change the default viewer
Right Click(pdf)-> Properties-> Open With-> Okular (or anything) -> Set as default.
PDF reader
The default viewer, Evince, seems slow when I view an issue of ODROID Magazine.
MuPDF is good at speed. Okular is good at annotation.
I installed and tried MuPDF (github source code). It seems faster, and I don't see blank pages when I view an ODROID Magazine issue. In terms of speed, mupdf >> xpdf >> okular >> Evince.
To make it the default program for opening PDF files, right-click the file and select Properties. Go to the Open With tab and choose your viewer.
sudo apt-get install mupdf
Keyboard shortcuts for mupdf (man mupdf) or http://mupdf.com/docs/manual. Note these are case-sensitive.
W - fit to width
H - fit to height
L - rotate page left (counter-clockwise)
R - rotate page right (clockwise)
12g - go to page 12
. and , - go to the next or previous page
> and < - skip forward or back 10 pages
+,- - zoom in or out
/ - search for text
n,N - find the next or previous search result
h,j,k,l - scroll page left, down, up, or right
Tip: to copy text, select it with the right mouse button, then press Ctrl+C. It does not seem to work every time :(
Other pdf viewer choices are
- acroread
- Allows custom colors for the page background and document text.
- The custom colors work well on a MacBook Pro (2880 x 1440). Background color #494949 and text color #494949.
- xpdf. old-fashioned. slow.
- evince. slow.
- okular (KDE/Qt application)
- Annotation tool such as highlighter is under Tools > Review (F6).
- Allows changing the background color. It works, but the result of the 'invert colors' option is not good on a Dell U2312HM. Another option such as 'dark & light colors' lets us set the individual colors for the background (say #494949) and the text.
- Not as fast as mupdf. It can open a variety of ebook formats.
- It should work on macOS, but KDE has to be installed first.
- Able to show file properties eg Page Size (eg 50x36 in), Creator (eg PowerPoint), Producer (eg Mac OS X Quartz PDFContext), PDF version (eg 1.3)
- kpdf
- gv
- qpdfview. slow. Used by Raspbian (June 2018).
- Foxit or PDF-XChange Viewer(needs wine)
Browsers
Why You Don't Need Adobe Reader (And What to Use Instead)
PDF crop
6 Best PDF Page Cropping Tools For Linux
krop
It is easy to use and works fine on Ubuntu 20.04 (I am using 0.5.1 though the current version is 0.6.0).
http://arminstraub.com/software/krop
Install manually
$ sudo apt install python3-poppler-qt5 python3-pypdf2 python3-pip
$ pip3 install https://github.com/arminstraub/krop/archive/v0.6.0.tar.gz --user
Successfully built krop
Installing collected packages: krop
WARNING: The script krop is installed in '/home/brb/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
I can add ~/.local/bin to the global PATH. But how do I make it available through Activities?
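One possible answer to the question above, as a minimal sketch: desktop launchers such as GNOME Activities usually pick up `.desktop` files from `~/.local/share/applications`. The entry below is my own assumption about what krop needs (the `Comment` and `Categories` values are made up), and it assumes pip put the script in `~/.local/bin` as the warning says.

```shell
# Assumption: krop was installed by "pip3 install --user" into ~/.local/bin.
bin_dir="$HOME/.local/bin"
app_dir="${XDG_DATA_HOME:-$HOME/.local/share}/applications"

# Put ~/.local/bin on PATH for this shell if it is missing
# (adding the same lines to ~/.profile would make it permanent).
case ":$PATH:" in
  *":$bin_dir:"*) ;;
  *) PATH="$bin_dir:$PATH"; export PATH ;;
esac

# Write a desktop entry so the launcher can list krop.
# Exec uses the full path, so it does not depend on PATH.
mkdir -p "$app_dir"
cat > "$app_dir/krop.desktop" <<EOF
[Desktop Entry]
Type=Application
Name=krop
Comment=Crop PDF pages
Exec=$bin_dir/krop
Terminal=false
Categories=Office;
EOF
```

Logging out and back in may be needed before the new entry shows up.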
pdfcrop
pdfcrop (briss is better)
https://askubuntu.com/questions/124692/command-line-tool-to-crop-pdf-files
sudo apt-get install texlive-extra-utils
pdfcrop input.pdf output.pdf # no margins, works but seems too tight
pdfcrop --margins 5 input.pdf output.pdf # crop pdf but keep 5 bp on each side of the page
pdfcrop --margins '5 10 20 30' input.pdf output.pdf # left, top, right and bottom margins of 5, 10, 20, and 30 bp
# To actually crop something away, use negative values for --margins.
# For example, to crop 50 bp from the left, top, right, bottom (in this order):
pdfcrop --margins '-50 -50 -50 -50' input.pdf output.pdf
One problem I found (for newer PDFs with metadata) is that --margins first removes the entire existing margin and only then applies the adjustment. This can chop off content on some pages.
briss
This Java program gives me better control over cropping.
- Download the file briss-0.9.tar.gz (8.7 MB) and extract it
- Run java -jar briss-0.9.jar
- Load the pdf file. It will ask which pages to exclude from merging (this function does not work). Click 'Cancel' to continue.
- It will automatically create two rectangular areas: one for odd (left) pages and the other for even (right) pages.
- Work on the left page first. Enlarge the selection as needed. Then right click and choose 'Select/Deselect rectangle' (a dashed line will be added to the edges of the rectangle) and then 'Copy rectangles'.
- Work on the right page. Right click and choose 'Delete rectangle'. Then 'Paste rectangles'.
- Click 'Action -> Preview' to preview the result. If the result looks right, click 'Action -> Crop PDF'. Done.
pdftk
Extract pages
pdftk oldfile.pdf cat 3-8 output newfile.pdf
pdftk oldfile.pdf cat 5 9 11 output newfile.pdf
Remove certain pages
https://www.linux.com/learn/manipulating-pdfs-pdf-toolkit
sudo apt install pdftk
# remove pages 10 to 25 from a PDF file
pdftk myDocument.pdf cat 1-9 26-end output removedPages.pdf
# remove the last page
pdftk infile.pdf cat 1-r2 output outfile.pdf
# remove the last 2 pages
pdftk infile.pdf cat 1-r3 output outfile.pdf
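The "remove pages 10 to 25" case comes down to simple arithmetic on the page ranges. A small sketch of that step is below; `pages_without` is a hypothetical helper name of my own, not part of pdftk, and it assumes the removed range does not include the final page of the document.

```shell
# Hypothetical helper: print the pdftk "cat" ranges that keep every page
# except first..last. Assumes 'last' is not the final page of the document.
pages_without() {
  first=$1
  last=$2
  if [ "$first" -gt 1 ]; then
    printf '1-%d ' "$((first - 1))"
  fi
  printf '%d-end\n' "$((last + 1))"
}

pages_without 10 25   # prints: 1-9 26-end

# Usage with pdftk (unquoted on purpose so the ranges split into words):
# pdftk myDocument.pdf cat $(pages_without 10 25) output removedPages.pdf
```

To drop pages at the very end of a file, the `r` (reverse) page numbers shown above are the simpler route.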
Rotate using pdftk
First I convert the jpg files to a pdf using ImageMagick.
convert *.jpg INPUT.pdf
Then I install pdftk and follow this to do a rotation.
$ sudo apt install snapd
$ sudo snap install pdftk
# Rotate pages 1 through 2 by 270 degrees (west).
$ /snap/bin/pdftk INPUT.pdf rotate 1-2west output OUTPUT.pdf
PDF highlight and annotation
Install Okular by
sudo apt-get install okular
To highlight a line, press F6 (Tools -> Review) to turn on the annotation toolbar (it is shown on the left-hand side of the document). You can then click
- the 4th icon to highlight a line (it may not select exactly the text we want, but when it works the result is nice)
- the last icon to draw an ellipse or a rectangle (to change from an ellipse to a rectangle, click Settings -> Configure Okular... -> Annotations)
Another method is to use a windows program and run it using Wine. See the discussion here.
Android & iOS
Xodo Free. Cross platform.
How to convert pdf to image on Linux command line
How to convert pdf to image on Linux command line. I got an error when I used the convert command; see ImageMagick security policy 'PDF' blocking conversion for a solution by editing the file </etc/ImageMagick-6/policy.xml> on my Ubuntu 20.04.
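The policy edit can be sketched as a one-line `sed` substitution. This assumes the blocking entry looks like the line in the sample below, which is how I remember it from Ubuntu 20.04; check your own file first. The sketch works on a scratch copy, so it is safe to run. For the real file, point `sed` at /etc/ImageMagick-6/policy.xml and run it with sudo.

```shell
# Scratch copy standing in for /etc/ImageMagick-6/policy.xml.
# Assumption: the real file contains a PDF line like the one below.
policy=$(mktemp)
cat > "$policy" <<'EOF'
<policymap>
  <policy domain="coder" rights="none" pattern="PDF" />
</policymap>
EOF

# Relax the coder policy so ImageMagick may read and write PDF files.
sed -i 's/rights="none" pattern="PDF"/rights="read|write" pattern="PDF"/' "$policy"

# Show the edited line.
grep -F 'pattern="PDF"' "$policy"
```

Keeping a backup of the original policy file before editing it is a good idea.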
Merge multiple pdf files into one pdf file
https://stackoverflow.com/questions/2507766/merge-convert-multiple-pdf-files-into-one-pdf
pdfunite in-1.pdf in-2.pdf in-n.pdf out.pdf
Arrange, merge, split, rotate, crop
PDFArranger: Merge, Split, Rotate, Crop Or Rearrange PDF Documents (PDF-Shuffler Fork)
https://github.com/jeromerobert/pdfarranger
Editing
- Download Master PDF Editor 4 For Linux (Free To Use Version)
- Xournal, Handwritten Notes And PDF Annotation Tool Xournal++ Update Brings New Floating Toolbox
- PDF Arranger 1.7.0 Released With New Features And Enhancements Jan 2021
TOC/table of contents
- How to create clickable table of contents in a PDF?
- Preview Tip: Making a linked Table of Contents
- PDFOutliner
Print scale
Print > Scale > scale to page
Print multiple pages per sheet: pdfnup
The program is similar to psnup.
sudo apt install texlive-extra-utils
Search
sudo apt install pdfgrep # or brew install on macOS
pdfgrep 'pattern' *.pdf
Extract tables from pdf
Split view
It is useful if we want to compare two pages side by side.
- Use split-window view from Adobe reader.
- How to compare two PDF documents side by side from foxit (Windows, mac, Linux).
- Using browsers
Optimize for mobile device
Adobe reader
Close the Tools pane in Acrobat Reader DC
Sign/signature
- How to Sign a PDF: 6 Ways to Secure Electronic Signatures
- service.cancer.gov -> Search pdf signature -> Digitally Sign a Document in Adobe Acrobat or Reader
Print a text file with line numbers
How To Add Line Numbers To Text Files On Linux.
To print R source code, we should keep only the code starting at the function definition, because that is how RStudio displays it.
# 1. Using 'nl' command
$ nl -b a file.txt
# 2. Using 'cat' command
$ cat -n file.txt
# 3. Using 'awk' command
$ awk 'BEGIN{i=1} /.*/{printf "%d. %s\n",i,$0; i++}' file.txt
# 4. Using 'sed' command
$ sed '/./=' file.txt | sed '/./N; s/\n/ /'
# 5. Using 'less' command
$ less -N file.txt
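For the R source note above, one way to drop everything before the function definition and then number the rest is to chain `sed` and `nl`. This is only a sketch: the sample file content is made up, and matching on `<- function` is my assumption about how the function is defined.

```shell
# Scratch file standing in for an R script (made-up content).
src=$(mktemp)
printf '%s\n' '# notes at the top' '' 'add <- function(x, y) {' '  x + y' '}' > "$src"

# Print from the first line containing "<- function" to the end of the file,
# then number every line, blank ones included.
numbered=$(sed -n '/<- function/,$p' "$src" | nl -b a)
printf '%s\n' "$numbered"
```

With this sample, numbering starts at the `add <- function` line, so the printed line numbers count from the function definition rather than from the top of the file.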
How to turn web pages into PDFs using 'google-chrome'
- How to Create PDF of Webpage Using Google Chrome Headless. pagedown::chrome_print()
google-chrome --headless --disable-gpu --print-to-pdf=file1.pdf http://www.example.com/
- How to turn web pages into PDFs with Puppeteer and NodeJS
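To convert several pages in one go, the headless command above can be wrapped in a loop. The sketch below is an assumption-laden illustration: `url_to_pdf_name` is a hypothetical helper of my own that derives a file name from the URL, and the loop only prints a message when `google-chrome` is not installed.

```shell
# Hypothetical helper: turn a URL into a safe PDF file name.
url_to_pdf_name() {
  printf '%s\n' "$1" | sed -e 's|^[a-z]*://||' -e 's|/*$||' \
    -e 's|[^A-Za-z0-9._-]|_|g' -e 's|$|.pdf|'
}

for url in "http://www.example.com/"; do
  out=$(url_to_pdf_name "$url")
  if command -v google-chrome >/dev/null 2>&1; then
    # Same flags as the single-page command above.
    google-chrome --headless --disable-gpu --print-to-pdf="$out" "$url"
  else
    echo "google-chrome not found; would write $out"
  fi
done
```

More URLs can be added to the `for` list, each one producing its own PDF.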
Print website to PDF online
- The problem with using 'google-chrome' is that I don't have much control over the output.
- https://webtopdf.com/ (it worked great when I tested this page. It turned the website into a 44-page pdf, though it lost the table of contents). It has lots of options such as zoom, margins, ...
- To preserve the table of contents, I checked Auto Bookmark.