AI: Difference between revisions
Revision as of 13:54, 26 December 2022
How humans can outperform AI
Applications
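- Is the time of death completely unknowable? US-developed AI predicts end of life with 90% accuracy
- US FDA approves the first AI medical device for market, able to detect diabetic retinopathy automatically in real time
- Aging at home: technology is a big help
- A new helper for pathology research: Google combines an AR microscope with deep learning to spot cancer cells in real time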
- This New App Is Like Shazam for Your Nature Photos. Seek App.
- Draw This camera prints crappy drawings of the things you photograph (DIY) with Google's quickdraw.
- What Are Machine Learning Algorithms? Here’s How They Work
- How to Read Articles That Use Machine Learning Users’ Guides to the Medical Literature
- Google's open-source AI tool turns three, and it is used in many places you would not expect, Nov 2018
- What is Natural Language Processing and How Does It Work? NLP works via preprocessing the text and then running it through the machine learning-trained algorithm.
- Why Machine Learning Cannot Ignore Maximum Likelihood Estimation van der Laan & Rose 2021
ChatGPT
- https://openai.com/api/pricing/. Language models cost 0.04¢ to 2¢ per 1k tokens. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. How much does OpenAI's API cost per session? About $0.05 to $0.10 if you send 25 to 50 requests. There is an $18.00 free trial. The cost depends on how many words are in the input and output, at $0.000002 per word.
- https://openai.com/blog/chatgpt/
- Sign Up to OpenAI without Your Phone Number | OpenAI SMS Verification, https://smsverification.xyz/
- https://beta.openai.com/overview, examples
- Tip: use SHIFT + ENTER to add a line break for entering some code.
- This AI chatbot is dominating social media with its frighteningly good essays 2022/12/5
- How does it actually work?
- ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness
- Plagiarism is an outdated concept for AI
- The End of High-School English
- Some examples:
- Andrew Ng
- Bioinformatics
- ChatGPT can Create Datasets, Program in R… and when it makes an Error it can Fix that too!
- It can also update code to more modern syntax as long as the syntax predates April 2021
- http://rtutor.ai/, which is built on R's openai package.
- 15 Creative Ways to Use ChatGPT by OpenAI
- Setting up macOS as an R data science rig in 2023
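The pricing note in the ChatGPT list above boils down to simple arithmetic. The following is a minimal Python sketch of that arithmetic, assuming the quoted rule of thumb (1,000 tokens is about 750 words) and a price given in cents per 1k tokens. The function name estimate_cost_usd is made up for illustration, and actual billing may differ.

```python
# Rule of thumb quoted in the pricing note: 1,000 tokens is about 750 words.
WORDS_PER_TOKEN = 0.75


def estimate_cost_usd(n_words, cents_per_1k_tokens):
    """Rough dollar cost of processing n_words (input plus output)."""
    n_tokens = n_words / WORDS_PER_TOKEN             # words -> tokens
    cents = n_tokens / 1000 * cents_per_1k_tokens    # price is quoted per 1k tokens
    return cents / 100                               # cents -> dollars


# 750 words at the top quoted rate of 2 cents per 1k tokens
cost = estimate_cost_usd(750, 2.0)
print(f"${cost:.4f}")   # $0.0200
```

The same call with the lowest quoted rate (0.04¢) gives a figure fifty times smaller, which is why the model choice matters more than the word count for small sessions.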
AI, ML and DL
AI, ML and DL: What’s the Difference?
AI and statistics
Artificial Intelligence and Statistics: Just the Old Wine in New Wineskins? Faes 2022
What are the most important statistical ideas of the past 50 years
Four Deep Learning Papers to Read in June 2021
Neural network
- My presentations on ‘Elements of Neural Networks & Deep Learning’ -Part1,2,3, Part4,5, Part 6,7,8 by Ganesh
- Understanding the Magic of Neural Networks
- Biological interpretation of deep neural network for phenotype prediction based on gene expression 2020
Types of artificial neural networks
https://en.wikipedia.org/wiki/Types_of_artificial_neural_networks
neuralnet package
- https://cran.r-project.org/web/packages/neuralnet/
- Fitting a Neural Network in R; neuralnet package
- neuralnet: Train and Test Neural Networks Using R
- Creating & Visualizing Neural Network in R
- A Beginner’s Guide to Neural Networks with R!
nnet package
sauron package
Explaining predictions of Convolutional Neural Networks with 'sauron' package
OneR package
h2o package
https://cran.r-project.org/web/packages/h2o/index.html
shinyML package
shinyML - Compare Supervised Machine Learning Models Using Shiny App
LSBoost
Explainable 'AI' using Gradient Boosted randomized networks Pt2 (the Lasso)
LightGBM/Light Gradient Boosting Machine
Survival data
- Building a survival-neuralnet from scratch in base R
- Gradient descent for the elastic net Cox-PH model
Simulated neural network
Simulated Neural Network with Bootstrapping Time Series Data
Languages
GitHub: The top 10 programming languages for machine learning
Keras (high level library)
Keras is a model-level library, providing high-level building blocks for developing deep-learning models. It doesn’t handle low-level operations such as tensor manipulation and differentiation. Instead, it relies on a specialized, well-optimized tensor library to do so, serving as the backend engine of Keras.
Currently, the three existing backend implementations are the TensorFlow backend, the Theano backend, and the Microsoft Cognitive Toolkit (CNTK) backend.
On Ubuntu, we can install required packages by
    $ sudo apt-get install build-essential cmake git unzip \
        pkg-config libopenblas-dev liblapack-dev
    $ sudo apt-get install python-numpy python-scipy python-matplotlib python-yaml
    $ sudo apt-get install libhdf5-serial-dev python-h5py
    $ sudo apt-get install graphviz
    $ sudo pip install pydot-ng
    $ sudo apt-get install python-opencv
    $ sudo pip install tensorflow       # CPU only
    $ sudo pip install tensorflow-gpu   # GPU support
    $ sudo pip install theano
    $ sudo pip install keras
    $ python -c "import keras; print(keras.__version__)"
    $ sudo pip install --upgrade keras  # Upgrade Keras
To configure the backend of Keras, see Introduction to Python Deep Learning with Keras.
TensorFlow (backend library)
Basic
- https://www.tensorflow.org/
- https://tensorflow.rstudio.com/
- R interface to Keras. I followed the installation instructions but got an "illegal operand" error. The solution is to use an older version of tensorflow; see here. library(keras); install_keras(tensorflow = "1.5") (Ubuntu 16.04, Phenom(tm) II X6 1055T)
- https://rviews.rstudio.com/2018/04/03/r-and-tensorflow-presentations/, Slides
- https://hub.docker.com/r/andrie/tensorflowr/, https://hub.docker.com/r/rocker/ml/dockerfile (outdated)
- Deep Learning on Biowulf
- Raspberry Pi
- Books
- Deep Learning with R by François Chollet with J. J. Allaire, 2018. ISBN-10: 161729554X (available on safaribooksonline)
- Deep Learning with Python by François Chollet, 2017 (available on safaribooksonline)
- Deep Learning by Ian Goodfellow and Yoshua Bengio and Aaron Courville
- Enterprise Web Services with Neural Networks Using R and TensorFlow A docker image was created based on R 3.5.0 using R libraries from MRAN’s July 2nd, 2018 snapshot, as well as Miniconda 3 version 4.4.10 for python.
- Deep Learning Glossary
- http://www.wildml.com/deep-learning-glossary/
- What is an epoch (related to batch) in deep learning?, Epoch vs Iteration when training neural networks. Example: if you have 1000 training examples and your batch size is 500, then it takes 2 iterations to complete 1 epoch. Because each batch depends on how the full sample is partitioned, training runs for several epochs, each with a different partition, in order to get an unbiased result.
- Best Books to learn Tensorflow
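The epoch/iteration example in the list above (1000 training examples, batch size 500, 2 iterations per epoch) can be checked with a few lines of Python. This is only a sketch; the function names are made up for illustration, and the last, smaller batch is assumed to count as one iteration.

```python
import math


def iterations_per_epoch(n_samples, batch_size):
    """Number of batches (iterations) needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


def total_iterations(n_samples, batch_size, epochs):
    """Total number of weight updates over the whole training run."""
    return iterations_per_epoch(n_samples, batch_size) * epochs


print(iterations_per_epoch(1000, 500))   # 2
print(total_iterations(1000, 500, 10))   # 20
```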
Some terms
Machine Learning Glossary from developers.google.com
Tensor
Tensors for Neural Networks, Clearly Explained!!!
Dense layer and dropout layer
In Keras, what is a "dense" and a "dropout" layer?
Fully-connected layer (= dense layer). You can choose "relu" or "sigmoid" or "softmax" activation function.
Activation function
- Artificial neural network -> Neural networks as functions [math]\displaystyle{ \textstyle f (x) = K \left(\sum_i w_i g_i(x)\right) }[/math] where K (commonly referred to as the activation function) is some predefined function, such as the hyperbolic tangent or sigmoid function or softmax function or rectifier function.
- Rectifier/ReLU f(x) = max(0, x).
- Sigmoid. Binary problem. Logistic function and hyperbolic tangent tanh(x) are two examples of sigmoid functions.
- Softmax. Multiclass classification.
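For reference, the three activation functions listed above can be written in a few lines of plain Python. This is a minimal sketch for scalars and lists, not how Keras implements them; the max-subtraction in softmax and the two-branch sigmoid are standard numerical-stability tricks added here.

```python
import math


def relu(x):
    """Rectifier: zero out negative values."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(scores)                        # subtracting the max does not change the result
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


print(relu(-2.0), relu(3.0))
print(sigmoid(0.0))
probs = softmax([1.0, 2.0, 3.0])
print(probs, sum(probs))
```

Sigmoid gives one probability (binary problems), while softmax gives one probability per class (multiclass problems).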
Backpropagation
https://en.wikipedia.org/wiki/Backpropagation
Convolutional network
https://en.wikipedia.org/wiki/Convolutional_neural_network
Deep Learning with Python
Jupyter notebooks for the code samples of the book "Deep Learning with Python"
    sudo apt install python3-pip python3-dev
    sudo apt install build-essential cmake git unzip \
        pkg-config libopenblas-dev liblapack-dev
    sudo apt-get install python3-numpy python3-scipy python3-matplotlib \
        python3-yaml
    sudo apt install libhdf5-serial-dev python3-h5py
    sudo apt install graphviz
    sudo pip3 install pydot-ng
    # sudo apt-get install python-opencv
    # https://stackoverflow.com/questions/37188623/ubuntu-how-to-install-opencv-for-python3
    # https://askubuntu.com/questions/783956/how-to-install-opencv-3-1-for-python-3-5-on-ubuntu-16-04-lts
    sudo pip3 install keras
Colorize black-and-white photos
Colorize black-and-white photos
Keras using R
- R Markdown Notebooks for "Deep Learning with R"
- R interface to Keras
- Deep Neural Network in R
- Python vs R
- Derivative of a tensor operation: the gradient
- Define loss_value = f(W) = dot(W, x)
- W1 = W0 - step * gradient(f)(W0)
- Stochastic gradient descent
- Tensor operations:
- relu(x) = max(0, x)
- Each neural layer from our first network example transforms its input data: output = relu(dot(W, input) + b), where W and b are the weights, or trainable parameters, of the layer.
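The two formulas in the list above, output = relu(dot(W, input) + b) and W1 = W0 - step * gradient(f)(W0), can be tried out in plain Python. This is a sketch only: weights are nested lists rather than tensors, and the toy loss f(w) = (w - 3)^2 is a made-up example, not one from the book.

```python
def dot(W, x):
    """Matrix-vector product; W is a list of rows."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def dense_relu(W, b, x):
    """One dense layer: output = relu(dot(W, input) + b)."""
    return [max(0.0, z + b_i) for z, b_i in zip(dot(W, x), b)]


def gradient_step(w0, grad, step):
    """One update: W1 = W0 - step * gradient(f)(W0), for a flat list of weights."""
    return [w - step * g for w, g in zip(w0, grad)]


W = [[1.0, -1.0], [0.5, 0.5]]
b = [0.0, 1.0]
out = dense_relu(W, b, [2.0, 3.0])
print(out)   # [0.0, 3.5]

# Hypothetical toy loss f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = [0.0]
for _ in range(50):
    w = gradient_step(w, [2.0 * (w[0] - 3.0)], step=0.1)
print(w)     # close to [3.0], the minimizer of the toy loss
```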
Training process:
- Draw a batch of training samples x and corresponding targets y
- Run the network on x (a step called the forward pass) to obtain predictions y_pred.
- How many layers to use.
- How many “hidden units” to choose for each layer.
- Compute the loss of the network on the batch
- loss
- optimizer: determines how learning proceeds (how the network will be updated based on the loss function). It implements a specific variant of stochastic gradient descent (SGD).
- metrics
- Update all weights of the network in a way that slightly reduces the loss on this batch.
- batch_size
- epochs (one epoch = one pass over all training samples, processed in batches of batch_size samples)
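The training process above (draw a batch, forward pass, compute the loss, update the weights) can be sketched as a mini-batch SGD loop in plain Python. Assumptions: a one-weight linear model y = w * x + b instead of a real network, mean squared error as the loss, and made-up noiseless data. The function name train and its default values are illustrative only.

```python
import random


def train(xs, ys, epochs=300, batch_size=4, step=0.05, seed=0):
    """Fit y = w * x + b by mini-batch stochastic gradient descent on MSE."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)                                  # new partition into batches each epoch
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]         # 1. draw a batch of x and y
            preds = [w * xs[i] + b for i in batch]        # 2. forward pass -> y_pred
            errs = [p - ys[i] for p, i in zip(preds, batch)]
            # 3. the batch loss is mean(err^2); these are its gradients
            grad_w = 2 * sum(e * xs[i] for e, i in zip(errs, batch)) / len(batch)
            grad_b = 2 * sum(errs) / len(batch)
            w -= step * grad_w                            # 4. update the weights
            b -= step * grad_b
    return w, b


xs = [i / 10 for i in range(20)]        # 0.0, 0.1, ..., 1.9
ys = [2.0 * x + 1.0 for x in xs]        # noiseless targets with true w = 2, b = 1
w, b = train(xs, ys)
print(round(w, 3), round(b, 3))
```

In Keras the same four steps are hidden behind fit(); batch_size and epochs are the only parts of this loop you set directly.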
Keras (in order to use Keras, you need to install TensorFlow or CNTK or Theano):
- Define your training data: input tensors and target tensors.
- Define a network of layers (or model). Two ways to define a model:
- using the keras_model_sequential() function (only for linear stacks of layers, which is the most common network architecture by far) or
    model <- keras_model_sequential() %>%
      layer_dense(units = 32, input_shape = c(784)) %>%
      layer_dense(units = 10, activation = "softmax")
- the functional API (for directed acyclic graphs of layers, which let you build completely arbitrary architectures)
    input_tensor <- layer_input(shape = c(784))
    output_tensor <- input_tensor %>%
      layer_dense(units = 32, activation = "relu") %>%
      layer_dense(units = 10, activation = "softmax")
    model <- keras_model(inputs = input_tensor, outputs = output_tensor)
- Compile the learning process by choosing a loss function, an optimizer, and some metrics to monitor.
    model %>% compile(
      optimizer = optimizer_rmsprop(lr = 0.0001),
      loss = "mse",
      metrics = c("accuracy")
    )
- Iterate on your training data by calling the fit() method of your model.
model %>% fit(input_tensor, target_tensor, batch_size = 128, epochs = 10)
Custom loss function
Custom Loss functions for Deep Learning: Predicting Home Values with Keras for R
Metrics
https://machinelearningmastery.com/custom-metrics-deep-learning-keras-python/
Docker RStudio IDE
Assuming we are using the rocker/rstudio image, we first need to install some packages in the OS.
    $ docker run -d -p 8787:8787 -e USER=XXX -e PASSWORD=XXX --name rstudio rocker/rstudio
    $ docker exec -it rstudio bash
    # apt update
    # apt install python-pip python-dev
    # pip install virtualenv
And then in R,
    install.packages("keras")
    library(keras)
    install_keras(tensorflow = "1.5")
Use your own Dockerfile
Data Science for Startups: Containers Building reproducible setups for machine learning
Some examples
See Tensorflow for R from RStudio for several examples.
Binary data (Chapter 3.4)
- The final layer will use a sigmoid activation so as to output a probability (a score between 0 and 1, indicating how likely the sample is to have the target “1”).
- A relu (rectified linear unit) is a function meant to zero-out negative values, while a sigmoid “squashes” arbitrary values into the [0, 1] interval, thus outputting something that can be interpreted as a probability.
    library(keras)
    imdb <- dataset_imdb(num_words = 10000)
    c(c(train_data, train_labels), c(test_data, test_labels)) %<-% imdb

    # Preparing the data
    vectorize_sequences <- function(sequences, dimension = 10000) {...}
    x_train <- vectorize_sequences(train_data)
    x_test <- vectorize_sequences(test_data)
    y_train <- as.numeric(train_labels)
    y_test <- as.numeric(test_labels)

    # Build the network
    ## Two intermediate layers with 16 hidden units each
    ## The final layer will output the scalar prediction
    model <- keras_model_sequential() %>%
      layer_dense(units = 16, activation = "relu", input_shape = c(10000)) %>%
      layer_dense(units = 16, activation = "relu") %>%
      layer_dense(units = 1, activation = "sigmoid")
    model %>% compile(
      optimizer = "rmsprop",
      loss = "binary_crossentropy",
      metrics = c("accuracy")
    )
    model %>% fit(x_train, y_train, epochs = 4, batch_size = 512)
    ## Error in py_call_impl(callable, dots$args, dots$keywords) : MemoryError:
    ## 10.3GB memory is necessary on my 16GB machine

    # Validation
    results <- model %>% evaluate(x_test, y_test)

    # Prediction on new data
    model %>% predict(x_test[1:10,])
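The body of vectorize_sequences() is elided in the code above. One common way to write such a function is multi-hot encoding: each review becomes a 0/1 vector with a 1 at every word index that occurs. The Python sketch below shows that idea under this assumption; it is not the elided original, and it uses 0-based indices whereas R is 1-based.

```python
def vectorize_sequences(sequences, dimension=10000):
    """Turn lists of word indices into 0/1 rows of length `dimension`."""
    results = []
    for seq in sequences:
        row = [0.0] * dimension
        for index in seq:
            row[index] = 1.0      # mark each word index that occurs (repeats collapse to 1)
        results.append(row)
    return results


# Two tiny "reviews" over a 5-word vocabulary
x = vectorize_sequences([[0, 2, 2], [3]], dimension=5)
print(x)   # [[1.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0]]
```

With 25,000 reviews and dimension = 10000 the dense result is large, which is consistent with the memory error noted in the R code.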
Multi class data (Chapter 3.5)
- Goal: build a network to classify Reuters newswires into 46 different mutually-exclusive topics.
- You end the network with a dense layer of size 46. This means for each input sample, the network will output a 46-dimensional vector. Each entry in this vector (each dimension) will encode a different output class.
- The last layer uses a softmax activation. You saw this pattern in the MNIST example. It means the network will output a probability distribution over the 46 different output classes: that is, for every input sample, the network will produce a 46-dimensional output vector, where output[i] is the probability that the sample belongs to class i. The 46 scores will sum to 1.
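    library(keras)
    reuters <- dataset_reuters(num_words = 10000)
    c(c(train_data, train_labels), c(test_data, test_labels)) %<-% reuters

    # Prepare the data; vectorize_sequences() is the helper from the binary example
    x_train <- vectorize_sequences(train_data)
    x_test <- vectorize_sequences(test_data)
    one_hot_train_labels <- to_categorical(train_labels)
    one_hot_test_labels <- to_categorical(test_labels)

    # Assumed validation split (not shown in the original snippet):
    # hold out the first 1000 training samples
    val_indices <- 1:1000
    x_val <- x_train[val_indices,]
    partial_x_train <- x_train[-val_indices,]
    y_val <- one_hot_train_labels[val_indices,]
    partial_y_train <- one_hot_train_labels[-val_indices,]

    model <- keras_model_sequential() %>%
      layer_dense(units = 64, activation = "relu", input_shape = c(10000)) %>%
      layer_dense(units = 64, activation = "relu") %>%
      layer_dense(units = 46, activation = "softmax")
    model %>% compile(
      optimizer = "rmsprop",
      loss = "categorical_crossentropy",
      metrics = c("accuracy")
    )
    history <- model %>% fit(
      partial_x_train,
      partial_y_train,
      epochs = 9,
      batch_size = 512,
      validation_data = list(x_val, y_val)
    )
    results <- model %>% evaluate(x_test, one_hot_test_labels)

    # Prediction on new data
    predictions <- model %>% predict(x_test)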
- MNIST dataset.
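To make the "categorical_crossentropy" loss used in the multi-class example concrete, here is a minimal Python sketch with 3 classes instead of 46. The function names and the example probabilities are made up for illustration, and Keras computes this loss over whole batches rather than one sample at a time.

```python
import math


def to_one_hot(labels, n_classes):
    """Encode integer class labels as 0/1 vectors with a single 1."""
    rows = []
    for label in labels:
        row = [0.0] * n_classes
        row[label] = 1.0
        rows.append(row)
    return rows


def categorical_crossentropy(one_hot, probs):
    """Loss for one sample: -sum_i t_i * log(p_i)."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


target = to_one_hot([2], n_classes=3)[0]                          # true class is 2
loss_good = categorical_crossentropy(target, [0.1, 0.1, 0.8])    # confident and right
loss_bad = categorical_crossentropy(target, [0.8, 0.1, 0.1])     # confident and wrong
print(target, loss_good, loss_bad)
```

The loss only looks at the probability the softmax layer assigns to the true class, so it is small when that probability is close to 1 and grows quickly as it approaches 0.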
Regression data (Chapter 3.6)
- Because so few samples are available, we will be using a very small network with two hidden layers. In general, the less training data you have, the worse overfitting will be, and using a small network is one way to mitigate overfitting.
- Our network ends with a single unit and no activation (i.e. it will be a linear layer). This is a typical setup for scalar regression (i.e. regression where we are trying to predict a single continuous value). Applying an activation function would constrain the range that the output can take. Here, because the last layer is purely linear, the network is free to learn to predict values in any range.
- We are also monitoring a new metric during training: mae. This stands for Mean Absolute Error.
    library(keras)
    dataset <- dataset_boston_housing()
    c(c(train_data, train_targets), c(test_data, test_targets)) %<-% dataset

    build_model <- function() {
      model <- keras_model_sequential() %>%
        layer_dense(units = 64, activation = "relu",
                    input_shape = dim(train_data)[[2]]) %>%
        layer_dense(units = 64, activation = "relu") %>%
        layer_dense(units = 1)
      model %>% compile(
        optimizer = "rmsprop",
        loss = "mse",
        metrics = c("mae")
      )
    }

    # K-fold CV
    k <- 4
    indices <- sample(1:nrow(train_data))
    folds <- cut(1:length(indices), breaks = k, labels = FALSE)
    num_epochs <- 100
    all_scores <- c()
    for (i in 1:k) {
      cat("processing fold #", i, "\n")
      # Prepare the validation data: data from partition # k
      val_indices <- which(folds == i, arr.ind = TRUE)
      val_data <- train_data[val_indices,]
      val_targets <- train_targets[val_indices]

      # Prepare the training data: data from all other partitions
      partial_train_data <- train_data[-val_indices,]
      partial_train_targets <- train_targets[-val_indices]

      # Build the Keras model (already compiled)
      model <- build_model()

      # Train the model (in silent mode, verbose=0)
      model %>% fit(partial_train_data, partial_train_targets,
                    epochs = num_epochs, batch_size = 1, verbose = 0)

      # Evaluate the model on the validation data
      results <- model %>% evaluate(val_data, val_targets, verbose = 0)
      all_scores <- c(all_scores, results$mean_absolute_error)
    }
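The K-fold logic in the R code above does not depend on Keras. The Python sketch below shows only the index bookkeeping and the MAE metric, assuming shuffled indices split into k nearly equal folds; the function names are made up for illustration, and the model fitting step is left out.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    fold_size, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # the first `extra` folds get one more sample so every sample is used
        stop = start + fold_size + (1 if fold < extra else 0)
        val_idx = idx[start:stop]
        train_idx = idx[:start] + idx[stop:]
        yield train_idx, val_idx
        start = stop


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


folds = list(k_fold_indices(10, k=4))
for train_idx, val_idx in folds:
    print(len(train_idx), len(val_idx))
print(mean_absolute_error([1.0, 2.0, 3.0], [2.0, 2.0, 1.0]))   # 1.0
```

Each sample lands in exactly one validation fold, so averaging the k validation scores uses all of the small training set for both fitting and evaluation.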
PyTorch
An R Shiny app to recognize flower species
Google Cloud Platform
- Choosing between TensorFlow/Keras, BigQuery ML, and AutoML Natural Language for text classification Comparing text classification done three ways on Google Cloud Platform
Amazon
Amazon's Machine Learning University is making its online courses available to the public
Workshops
Notebooks from the Practical AI Workshop 2019
OpenML.org
Biology
- Predicting Splicing from Primary Sequence with Deep Learning Jaganathan et al 2018
- Intelligent diagnosis with Chinese electronic medical records based on convolutional neural networks Li et al BMC Bioinformatics 2019
- DL 101: Basic introduction to deep learning with its application in biomedical related fields 2022