C


Socket Programming

Terms

There are two types of address domains.

  • AF_UNIX: the Unix domain, for two processes which share a common file system, and
  • AF_INET: the Internet domain for any two hosts on the Internet.

There are two types of sockets.

  • a stream socket in which characters are read in a continuous stream as if from a file or pipe, and
  • a datagram socket, in which messages are read in chunks.

The two symbolic constants are SOCK_STREAM and SOCK_DGRAM.

There are two types of ports:

  • well-known ports | those that rarely change over time. For instance, servers that provide mail, file transfer, remote login, etc.
  • dynamic ports | typically used only for the life of a process. For instance, pipes can be implemented using message passing.

Examples of servers:

  • mail server
  • login server
  • file server
  • web server
  • streaming server

TCP vs UDP

  • TCP is used for services that transfer a large amount of data over a persistent connection.
  • UDP is more commonly used for quick lookups and single-use query-reply actions.
  • Some common examples of TCP and UDP with their default ports:
    • DNS lookup UDP 53
    • FTP TCP 21
    • HTTP TCP 80
    • POP3 TCP 110
    • Telnet TCP 23

General Idea

The steps involved in establishing a socket on the client side are as follows:

  1. Create a socket with the socket() system call
  2. Connect the socket to the address of the server using the connect() system call
  3. Send and receive data. There are a number of ways to do this, but the simplest is to use the read() and write() system calls.

The steps involved in establishing a socket on the server side are as follows:

  1. Create a socket with the socket() system call
  2. Bind the socket to an address using the bind() system call. For a server socket on the Internet, an address consists of a port number on the host machine.
  3. Listen for connections with the listen() system call
  4. Accept a connection with the accept() system call. This call typically blocks until a client connects with the server.
  5. Receive and send data by using read() and write().
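The client and server steps above can be sketched in one small C++ function. This is only an illustration under assumptions: the helper name loopback_roundtrip is hypothetical, both ends live in the same process on 127.0.0.1, and port 0 is used so that the kernel picks a free port. A real server and client would be separate programs, as in the examples later on this page.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>
#include <string>

// Hypothetical helper: sends a short, non-empty msg from a client socket to a
// server socket in the same process and returns what the server read.
// Returns an empty string if any system call fails.
std::string loopback_roundtrip(const std::string& msg) {
    // Server steps 1-3: socket(), bind(), listen().
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    if (srv < 0) return "";
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                 // 0 = let the kernel choose a port
    socklen_t len = sizeof(addr);
    if (bind(srv, (sockaddr*)&addr, sizeof(addr)) < 0 ||
        listen(srv, 1) < 0 ||
        getsockname(srv, (sockaddr*)&addr, &len) < 0) {  // learn the chosen port
        close(srv);
        return "";
    }

    // Client steps 1-3: socket(), connect(), write().
    int cli = socket(AF_INET, SOCK_STREAM, 0);
    if (cli < 0 || connect(cli, (sockaddr*)&addr, sizeof(addr)) < 0) {
        if (cli >= 0) close(cli);
        close(srv);
        return "";
    }
    ssize_t sent = write(cli, msg.data(), msg.size());

    // Server steps 4-5: accept() the queued connection, then read().
    std::string got;
    int conn = accept(srv, 0, 0);
    if (conn >= 0 && sent == (ssize_t)msg.size()) {
        char buf[256];
        ssize_t n = read(conn, buf, sizeof(buf));  // one read is enough for a short message
        if (n > 0) got.assign(buf, n);
    }
    if (conn >= 0) close(conn);
    close(cli);
    close(srv);
    return got;
}
```

Calling loopback_roundtrip("Who are you?") should hand back the same text when the loopback interface is available. The connect() call can finish before accept() is called because the kernel completes the handshake for connections waiting in the listen queue.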

Internet Hearsay

View all TCP sockets currently active
$ netstat --tcp

View all UDP sockets
$ netstat --udp

View all TCP sockets in the listening state
$ netstat --tcp --listening

View the multicast group membership information
$ netstat --groups

Display the list of masqueraded connections
$ netstat --masquerade

View statistics for each protocol
$ netstat --statistics

Display all traffic on the eth0 interface for the local host
$ tcpdump -l -i eth0

Show all traffic on the network coming from or going to host plato
$ tcpdump host plato

Show all HTTP traffic for host camus
$ tcpdump 'host camus and (port http)'

View traffic coming from or going to TCP port 45000 on the local host
$ tcpdump tcp port 45000

Make http request via telnet

Below, we only input two lines. One is telnet linus.nci.nih.gov 80 and the other is HEAD / HTTP/1.0 followed by an empty line. Remember that the request line ends with one carriage return character and one line feed, and that the empty line marks the end of the request. We can change the HTTP method in the 2nd input to GET / HTTP/1.0 to fetch the full page. See the book HTTP: The Definitive Guide and wikipedia.

$ telnet linus.nci.nih.gov 80
Trying 137.187.182.124...
Connected to ncias-p942-v-1.nci.nih.gov.
Escape character is '^]'.
HEAD / HTTP/1.0

HTTP/1.1 200 OK
Date: Thu, 21 Mar 2013 14:47:26 GMT
Server: Apache
Last-Modified: Tue, 12 Mar 2013 13:52:32 GMT
ETag: "302a-4d7ba99db0800"
Accept-Ranges: bytes
Content-Length: 12330
Connection: close
Content-Type: text/html

Connection closed by foreign host.
$
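The request typed into telnet above can also be built in code. The following is a minimal sketch; the helper name make_request is an assumption made for illustration and is not part of any library.

```cpp
#include <string>

// Hypothetical helper: builds a minimal HTTP/1.0 request like the one typed
// into telnet. The request line ends with CR LF, and a second CR LF (an empty
// line) marks the end of the request.
std::string make_request(const std::string& method, const std::string& path) {
    return method + " " + path + " HTTP/1.0\r\n\r\n";
}
```

For example, make_request("HEAD", "/") gives the header-only request used in the session above, and make_request("GET", "/") asks for the full page.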

Socket Programming Examples using C/C++/Qt

Example 1 - Linux/Unix

http://www.linuxhowtos.org/C_C++/socket.htm. The code is saved here.

$ gcc server.c -o server
$ gcc client.c -o client
$ ./server 51717
$ ./client 192.168.0.21 51717
Please enter the message: Who are you?
I got your message

where 192.168.0.21 is the IP address of the server. The server side will show

Here is the message: Who are you?

If everything works correctly, the server will display your message on stdout, send an acknowledgement message to the client and terminate. The client will print the acknowledgement message from the server and then terminate.

  • We can actually run the client on a different machine (eg server on my Ubuntu and client on my Windows) although we can also run both client and server on the same machine.
  • After the server has used a port (51717) once, starting it again on the same port right away may fail with the error "ERROR on binding: Address already in use". The old connection lingers in the TIME_WAIT state, so we may need to wait up to about 4 minutes before the message goes away. See the solution in here or the SO_REUSEADDR option in setsockopt(). That is, we just need to change server.c to add the following
 ...
 int iOption = 1; // SO_REUSEADDR flag: 0 = disable, 1 = enable
 ...
 // Immediately after sockfd is created by socket() and before bind(), we do
 if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, (const char *) &iOption,  sizeof(int)) == -1) {
    error("setsockopt");
    exit(1);
 }
  • On Windows, we can use TCPView to see which process is listening on which port, socket status. On Linux, we can use netstat -a command (it gives a long output)
mli@PhenomIIx6:~/Downloads$ sudo netstat -a | grep 51717
tcp        0      0 *:51717                 *:*                     LISTEN
  • We can choose any port number between 2000 and 65535.
  • If we use port 51717 for example, the server will open that port. Once the program has finished, the port is closed immediately. Use the linux command netstat -lp --inet to check which ports are open.

Similar examples:

Example 2 (in C)

http://www.prasannatech.net/2008/07/socket-programming-tutorial.html

Server side (The code is here):

$ ./tcpserver

TCPServer Waiting for client on port 5000
 I got a connection from (127.0.0.1 , 36123)
 SEND (q or Q to quit) : yes

 RECIEVED DATA = Got it.
 SEND (q or Q to quit) : how are you

 RECIEVED DATA = I am fine.
 SEND (q or Q to quit) : q

q
^C
$

Client side (The code is here):

$ ./tcpclient

Recieved data = yes
SEND (q or Q to quit) : Got it.

Recieved data = how are you
SEND (q or Q to quit) : I am fine.

Example 3 - Simple HTTP server

The example is modified (header files only) from http://rosettacode.org/wiki/Hello_world/Web_server (a wonderful website which shows how to create a hello world web server in many different programming languages). PS. the instructions in http://mwaidyanatha.blogspot.com/2011/05/writing-simple-web-server-in-c.html are worth a look, but the 4 lines that create the standard HTML headers did not work for me and are too complicated.

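We can test the server by opening http://localhost:8080/ in a web browser on the same machine while the program below is running (8080 is the port set in the code).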

#include <err.h>          /* err() */
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

char response[] = "HTTP/1.1 200 OK\r\n"
"Content-Type: text/html; charset=UTF-8\r\n\r\n"
"<html>\r\n"
"<head><title>Bye-bye baby bye-bye</title>\r\n"
"<style>\r\n"
"   body { background-color: #111 }\r\n"
"   h1 { font-size:4cm; text-align: center; color: black;"
"   text-shadow: 0 0 2mm red} \r\n"
"</style></head>\r\n"
"<body><h1>Goodbye, world!</h1></body></html>\r\n";
 
int main()
{
	int one = 1, client_fd;
	struct sockaddr_in svr_addr, cli_addr;
	socklen_t sin_len = sizeof(cli_addr);
 
	int sock = socket(AF_INET, SOCK_STREAM, 0);
	if (sock < 0)
		err(1, "can't open socket");
 
	setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(int));
 
	int port = 8080;
	svr_addr.sin_family = AF_INET;
	svr_addr.sin_addr.s_addr = INADDR_ANY;
	svr_addr.sin_port = htons(port);
 
	if (bind(sock, (struct sockaddr *) &svr_addr, sizeof(svr_addr)) == -1) {
		close(sock);
		err(1, "Can't bind");
	}
 
	listen(sock, 5);
	while (1) {
		client_fd = accept(sock, (struct sockaddr *) &cli_addr, &sin_len);
		printf("got connection\n");
 
		if (client_fd == -1) {
			perror("Can't accept");
			continue;
		}
 
		write(client_fd, response, sizeof(response) - 1); /*-1:'\0'*/
		close(client_fd);
	}
}

Compile and run it by gcc testServer.c; ./a.out.

Example 4 - Mimic browser request

The code is based on the post. http://codebase.eu/tutorial/linux-socket-programming-c/. My local copy of [1].

This is another similar post. http://www.binarytides.com/receive-full-data-with-recv-socket-function-in-c/ which teaches how to receive full data with recv socket function in C.

Testing tcpclient.c

The result is a program that connects to google and downloads (the first 1000 bytes of) the google homepage.

$ g++ tcpclient.cpp
$ ./a.out
Setting up the structs...
Creating a socket...
Connect()ing...
send()ing message...
Waiting to recieve data...
1000 bytes recieved :
HTTP/1.1 200 OK
Date: Wed, 20 Mar 2013 14:41:54 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=78ef359985426090:FF=0:TM=1363790513:LM=1363790514:S=UO5PtdM9ETqX6Mm_; 
Set-Cookie: 
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked

8000
<!doctype html><html itemscope="itemscope" itemtype="http://schema.org/WebPage">
<head><meta content="Search the world's information,
Receiving complete. Closing socket...
$

Testing tcpserver.c and tcpclient2.c

Server side:

$ ./tcpserver
Setting up the structs...
Creating a socket...
Binding socket...
Listen()ing for connections...
Connection accepted. Using new socketfd : 4
Waiting to recieve data...
37 bytes recieved :
GET / HTTP/1.1
host: www.google.com


send()ing back a message...
Stopping server...
$

Client side (modify tcpclient.c to use IP 127.0.0.1 and port 5556):

$ ./tcpclient2
Setting up the structs...
Creating a socket...
Connect()ing...
send()ing message...
Waiting to recieve data...
10 bytes recieved :
thank you.#▒
Receiving complete. Closing socket...
$

Example 5 - Windows socket (almost implies C++)

Example 6 Get image using Qt

See Chapter 14. Foundation of Qt Development. The code is on https://github.com/arraytools/Qt/tree/master/FQD/Chapter14.

Example 7 Trip planner using Qt

See Chapter 15. Networking on C++ GUI Programming Qt 4. The code is on http://taichi.selfip.net:81/lang/c/qt-book/chap15/.

Basic(s)

C++ standard

Cheat sheet, Tutorial, Crash course

Differences between C and C++

Books for C++

The Definitive C++ Book Guide and List from the stackoverflow post.

Books for C

C++11 for Ubuntu 12.04

http://askubuntu.com/questions/113291/how-do-i-install-gcc-4-7

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-4.7 g++-4.7

# Also, don't forget to update-alternatives, as suggested here
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.6 60 --slave /usr/bin/g++ g++ /usr/bin/g++-4.6 
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.7 40 --slave /usr/bin/g++ g++ /usr/bin/g++-4.7 
sudo update-alternatives --config gcc

When issuing the last command, it will ask what version of gcc to use.

First code

The simplest c/c++ code

Mentioned in the book Exploring Beaglebone: Tools and Techniques for Building with Embedded Linux.

main() { }

Hello World

C version (compiled by gcc helloworld.c)

#include <stdio.h>
int main(int argc, char *argv[]) 
{
    printf("Hello World!\n");
    return 0;
}  

C++ version (compiled by g++ <helloworld.cpp>). Note that although gcc is installed by default, g++ is not installed by default in Ubuntu 14.04 and Linux Mint 17.3. We need to install g++ manually.

#include <iostream>
#include <string>

using namespace std;

int main()
{
	string name; 
	cout << "Enter your name: "; 
	cin >> name; 
	cout << "\nHello, " << name << "! It's nice to meet you!";
	return 0;
}

Note that cout is followed by the insertion operator (or output stream operator) '<<' and cin is followed by the extraction operator (or input stream operator) '>>'. cout and cin are not keywords; they are objects of the classes ostream and istream. The operators are overloaded functions: public member functions of those classes for most built-in types, and non-member functions for types such as string.

The angle brackets (<>) around the include filename mean that it is a standard, rather than a user-defined, include.

The header stdio.h is located in /usr/include/ directory and iostream header is in /usr/include/c++/4.X/iostream.

Running g++ helloworld.cpp actually involves several steps (preprocess the source, compile it to assembly code, assemble that into object code, then link to create an executable). See the book Exploring Beaglebone: Tools and Techniques for Building with Embedded Linux Chapter 2.3.

#include <iostream>

When you do #include <iostream> it causes a set of classes and other things to be included in your source file. For iostream, and most of the standard library headers, they place these things in a namespace named std.

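So the code for #include <iostream> looks something like this (cin, cout, cerr and clog are objects, not classes):

namespace std { 
    extern istream cin;   // standard input
    extern ostream cout;  // standard output
    extern ostream cerr;  // standard error, unbuffered
    extern ostream clog;  // standard error, buffered
    ...
}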

namespace

#include <iostream>
using namespace std;

namespace foo
{
  int value() { return 5; }
}

namespace bar
{
  const double pi = 3.1416;
  double value() { return 2*pi; }
}

int main () {
  cout << foo::value() << '\n';
  cout << bar::value() << '\n';
  cout << bar::pi << '\n';
  return 0;
}

The std namespace --- C++ Standard Library

https://en.wikipedia.org/wiki/C%2B%2B_Standard_Library

  • The C++ Standard Library is a collection of classes and functions.
  • Features of the C++ Standard Library are declared within the std namespace.
  • The C++ Standard Library also incorporates 18 headers of the ISO C90 C standard library ending with ".h", but their use is deprecated. No other headers in the C++ Standard Library end in ".h".

In the above <helloworld.cpp> C++ program, <iostream> is one of standard headers included in the C++ Standard Library. The wikipedia page categorizes the standard headers by

  1. Containers: <array>, <list>, <map>, <vector>, ...
  2. General: <algorithm>, <utility>, ...
  3. Localization: <locale>, <codecvt>
  4. Strings: <string>, <regex>
  5. Streams and Input/Output: <fstream>, <iostream>, <istream>, <ostream>, <sstream>
  6. Language support
  7. Thread support library
  8. Numerics library
  9. C standard library

int main() return value

The return value for main should indicate how the program exited. Normal exit is generally represented by a 0 return value from main. Abnormal termination is usually signalled by a non-zero return but there is no standard for how non-zero codes are interpreted.

Naming convention

Some Words about Standalone Application vs Web Application

  • Users can make use of the hardware power of their own machines
  • Users don't need to worry that their data will be used by a 3rd party
  • Users don't need to worry about potential network problems
  • Users don't need to worry that their jobs have to wait in a queue
  • Maintainers don't need to worry that the server can be hacked (purposely or incidentally). A server has to pass some security examination before it can be opened to the public.

Header/Include guard, Preprocessor

#ifndef ... #define ... #endif. See wikipedia and cplusplus.com.
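A minimal sketch of how the guard works is given below. The header name point.h, the macro POINT_H, the Point struct and the sum function are all made up for illustration. The header body is pasted twice into one file to imitate what the preprocessor sees when the same header is included twice.

```cpp
// ---- contents of a hypothetical header point.h ----
#ifndef POINT_H
#define POINT_H
struct Point { double x; double y; };
#endif

// ---- the same header pasted again, as a second #include "point.h" would do ----
#ifndef POINT_H          // false now: POINT_H was defined by the first copy
#define POINT_H
struct Point { double x; double y; };   // skipped, so Point is not redefined
#endif

// Ordinary code that uses the type declared in the header.
double sum(const Point& p) { return p.x + p.y; }
```

Without the #ifndef/#define/#endif lines, the second copy would redefine Point and the compiler would stop with an error.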

Basic Arithmetic

   double a = 10, b=3;
   cout << int (a/b) << endl;       // 3
   cout << int (a/b + .5) << endl;   // 3
   a= 11;
   cout << int (a/b) << endl;    // 3
   cout << int (a/b + .5) << endl;  // 4

bool type

Data Type Ranges

type                 bytes  range
int                  4      -2,147,483,648 to 2,147,483,647
unsigned int         4      0 to 4,294,967,295
long                 4      -2,147,483,648 to 2,147,483,647
unsigned long        4      0 to 4,294,967,295
long long            8      -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
unsigned long long   8      0 to 18,446,744,073,709,551,615
float                4      3.4E +/- 38 (7 digits)
double               8      1.7E +/- 308 (15 digits)

These sizes are platform-dependent. For example, long is 4 bytes with Visual C++ but 8 bytes with gcc on 64-bit Linux.

Conditional operation ? :

int main(int argc, char** argv)
{
  // logical-OR expression  ?  expression  :  conditional-expression
  char* filename = argc >= 2 ? argv[1] : (char*)"input.txt";
}

Increment ++ operator

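Note that ++variable can be slightly faster than its alternative variable++, since the postfix form needs to make a copy of the old value before returning the result. This matters mostly for class types such as iterators; for built-in types, compilers usually generate the same code for both forms when the result is not used. (see SeqAn tutorial here)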

namespace

Create Utilities Function

Don't put them in a class; just make them non-member functions at namespace scope.

// header file <utility.hpp>
namespace utility
{
   void function1();
   void function2();
   void function3();

   template<typename T>
   void function4()
   {
      //function template definition should be here!
   }
}

// cpp <utility.cpp>
#include "utility.hpp"
namespace utility
{
   void function1() 
   {
        //code
   }
   void function2() 
   {
        //code
   }
   void function3() 
   {
        //code
   }
}

Memory Management

On page 25 of the memory management lecture note, it mentions

  • If using int x; the allocation occurs on a region of memory called the stack.
  • If using new int; the allocation occurs on a region of memory called the heap.

The heap is outside the scope of a given function, and memory allocated on it must be explicitly released: with free for memory obtained from malloc, and with delete for memory obtained from new. When free is called, the memory is released back to the system but the pointer is not set to null. Therefore, if malloc and free are used within a loop, the pointer should be set to null after free, to allow a test on the pointer after the next malloc call. See issue 17 of The MagPi.

Accessing the memory of a variable after it has gone out of scope (for example, through the returned address of a local variable) causes compilation warnings and may result in unexpected crashes. The following example appeared in the MagPi magazine. A similar example also appeared in section 10.6 (Three kinds of memory management) of Accelerated C++.

#include <stdio.h>
int* fun() {
  int i=0;   /* A solution is to use 'static int i=0;'          */
  return &i; /* Return the memory address of i. Do not do this! */
}

int main() {
  printf("%p\n",fun()); /* Print the memory address of i within fun() */
  return 0;
}

Solutions:

1. Use the 'static' keyword as in the comment of the code

2. If a pointer is assigned the address of memory on the heap inside a function, then it can be accessed afterwards. After the function call, it is still necessary to call free to release the dynamically allocated memory.

#include <stdio.h>
#include <stdlib.h>
int* fun() {
  int *i=0;
  i=(int*)malloc(sizeof(int));
  return i;
}
int main() {
  int *p = 0;
  p = fun();
  printf("%p\n",p);
  free(p);
  return 0;
}

3. Use standard containers such as vector. See the split function in Accelerated C++ Chapter 5.

#include <string>
#include <vector>
using std::string;
using std::vector;

vector<string> split(const string& s)
{
   vector<string> ret;
   ...
   return ret;
}

As mentioned in the Stroustrup C++ Style and Technique FAQ, a vector keeps track of the memory it uses to store its elements. When a vector needs more memory for elements, it allocates more; when a vector goes out of scope, it frees that memory. Therefore, the user need not be concerned with the allocation and deallocation of memory for vector elements.
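As a small sketch of solution 3, here is one possible split function that returns a vector by value. It is not the implementation from Accelerated C++; it uses std::istringstream to break the string on whitespace, which is a swapped-in technique chosen to keep the example short.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits s into whitespace-separated words.
// The vector is returned by value; it owns and frees its own memory,
// so the caller never has to call delete or free.
std::vector<std::string> split(const std::string& s) {
    std::vector<std::string> ret;
    std::istringstream in(s);
    std::string word;
    while (in >> word)        // operator>> skips runs of whitespace
        ret.push_back(word);
    return ret;
}
```

A call such as split("a bc  d") produces three words, and the memory behind them is released automatically when the returned vector goes out of scope.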

Heap vs Stack

When an object is needed within several different functions calls, it might be helpful to create it on the heap instead.

The difference between creating an object on the stack and the heap is

  • Objects (eg automatic variables, including ordinary local variables, local variables declared using auto, and local variables declared using register) on the stack are automatically cleaned up when they go out of scope. The advantages of the stack are speed and automatic book-keeping. The drawback of the stack is that the data is lost once the function returns.
  • Objects on the heap (eg those created with new and accessed through pointers) stay in memory. They have to be explicitly deleted.
// Using stack
#include <iostream>
#include "Square.h"
using namespace std;
int main() {
  Square s; // Using the default constructor
  Square s2(3.0, 'b'); // Using the second constructor
  cout << "s.area()=" << s.area() << ", s.colour()=" << s.colour() << endl;
  cout << "s2.area()=" << s2.area() << ", s2.colour()=" << s2.colour() << endl;
  return 0;
}

// Using heap
#include <iostream>
#include "Square.h"
using namespace std;
int main() {
  Square *s = new Square(); // Using the default constructor
  Square *s2 = new Square(3.0, 'b'); // Using the second constructor
  cout << "s->area()=" << s->area() << ", s->colour()=" << s->colour() << endl;
  cout << "s2->area()=" << s2->area() << ", s2->colour()=" << s2->colour() << endl;
  delete s;
  delete s2;
  return 0;
}

Some references:

  1. What could be the advantages of stack over heap dynamic memory allocation in C?
  2. The MagPi issue 23
  3. C++ Primer Plus Chapter 9.

The auto_ptr Class (deprecated as of C++11)

  • See Chapter 16 of C++ Primer Plus.
  • http://www.cplusplus.com/reference/memory/auto_ptr/. This class template is deprecated as of C++11. unique_ptr is a new facility with a similar functionality, but with improved security (no fake copy assignments), added features (deleters) and support for arrays.
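A minimal sketch of unique_ptr (C++11) follows. The Tracker struct and the function alive_inside_scope are hypothetical names used only to count how many objects are alive.

```cpp
#include <memory>
#include <utility>

// Hypothetical type that counts how many of its objects currently exist.
struct Tracker {
    static int alive;
    Tracker() { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

// Returns the number of live Tracker objects while the smart pointers exist.
int alive_inside_scope() {
    std::unique_ptr<Tracker> p(new Tracker);    // p owns the object
    std::unique_ptr<Tracker> q = std::move(p);  // ownership moves to q; p becomes empty
    // std::unique_ptr<Tracker> r = q;          // would not compile: no copying
    return Tracker::alive;                      // one object, owned by q
}   // q is destroyed here and deletes the Tracker; no explicit delete is needed
```

Unlike auto_ptr, the transfer of ownership has to be written out with std::move, so an accidental "copy" that silently empties the source pointer is rejected at compile time.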

new, delete operators, pointers and dynamic memory

Use new/delete instead of malloc()/free() which depends on <cstdlib> in C++. For example, the following code was a modification from Listing 4.22 (p169) <delete.cpp> from the book 'C++ Primer Plus'.

The memory area used by the new operator is called the heap or free store. Forgetting to use the delete operator will cause a memory leak. This kind of storage is called dynamic storage, which differs from automatic storage (the stack) and static storage.

#include <iostream>
#include <cstring> // strlen() 

using namespace std;

int main()
{
   // Square brackets allocate an array; new char(n) would allocate a single char
   char * pn = new char[strlen("/home") + strlen("/Downloads") + 1];
   strcpy(pn, "/home");
   cout << pn << endl;
   cout << strlen(pn) << endl;

   strcat(pn, "/Downloads");
   cout << pn << endl;
   cout << strlen(pn) << endl;
   delete [] pn;
   return 0;
}
  • new can allocate an array whose length is only known at run time, whereas the size of an array on the stack must be a compile-time constant. See p52 of MIT Open Course or p33 of TheMagPi issue 23.
  • new and delete should be used together. See p62 to p67 of the above MIT note.
  • Pay attention to the scope of variables. If a variable is declared within a pair of braces, it will evaporate once execution exits the braces. See p69 of the above MIT note for the problem and p75 for the solution.
  • See Chapter 4, 9, 12 of C++ Primer Plus. Or Chapter 14 of C++ Without Fear
  • When a program interacts with other programs in a GUI or network environment, it typically passes or receives pointers to objects.

scalar

int *pn = new int;  // dynamic
delete pn;

VS

int higgens;
int *pt = &higgens;  // not dynamic

Array

  • Every array contains a sequence of one or more objects of the same type. The number of elements in the array must be known at compile time, which requirement implies that arrays cannot grow or shrink dynamically the way library containers do. See 10.1.3 of Accelerated C++.
  • Because arrays are not class types, they have no members. In particular, they do not have the size_type member to name an appropriate type to deal with the size of an array.
  • The <cstddef> header defines size_t, which is a more general type.
const size_t NDim = 3;
double coords[NDim];

static const double numbers[] = {97, 94, 90, 0 };
static const size_t ngrades = sizeof(numbers) / sizeof(*numbers);
  • There is a fundamental relationship between arrays and pointers.
*coords = 1.5;  // the array's initial element
int *pn = new int[10]; // OR   int *pn; pn = new int[10];
pn[1] =3;
delete[] pn;

int *parr[40]; // array of 40 pointers to int

function

// Listing 4.22.1 <delete.cpp> from C++ Primer Plus (5th ed)
char *getname(); // prototype

char *getname() {
  char* out = new char[5];
  ...
  return out;
}

int main() {
  char *name = getname();
  delete [] name;
  return 0;
}
// 1. It is possible to use 'new' in a function and use 'delete' in the main function.
// 2. This memory is not controlled by scope. It means new & delete give you more
//    power to control how a program uses memory.
// 3. See also PPP Chapter 20.1.
//    double* jack_data = get_from_jack(&jack_count);
//    vector<double>* jill_data = get_from_jill();
//    //...
//    delete[] jack_data;
//    delete jill_data;

use new(delete) in a class constructor(destructor)

// Listing 12.4, 12.5 and p591-593 of C++ Primer Plus (5th ed)
String::String (const String & st)
{
  len = st.len;
  str = new char [len + 1];
  std::strcpy(str, st.str);
}
String::~String()
{
  delete [] str;
}

dereference

Fraction *pFrac = new Fraction(1, 2);
(*pFrac).get_num(); // OR pFrac->get_num();
(*pFrac).member;   //  pFrac->member;

combining dereference and increment in a single expression

auto pbeg = v.begin();
// print elements up to the first negative value
while (pbeg != v.end() && *pbeg >= 0)
    cout << *pbeg++ << endl; // print the current value and advance pbeg

The precedence of postfix increment is higher than that of the dereference operator, so *pbeg++ is equivalent to *(pbeg++) . The operand of * is the unincremented value of pbeg. Thus, the statement prints the element to which pbeg originally pointed and increments pbeg.
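The same idiom can be wrapped in a small function that collects the values instead of printing them. This is only a sketch; the function name leading_nonnegative is made up for illustration.

```cpp
#include <vector>

// Copies elements of v up to (but not including) the first negative value.
std::vector<int> leading_nonnegative(const std::vector<int>& v) {
    std::vector<int> out;
    std::vector<int>::const_iterator pbeg = v.begin();
    while (pbeg != v.end() && *pbeg >= 0)
        out.push_back(*pbeg++);   // *(pbeg++): use the current element, then advance
    return out;
}
```

For an input holding 3, 0, 7, -1, 5 the loop stops at -1, so only the first three values are copied.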

variable array size in new

int n;
cout << "How many elements?";
cin >> n;
int *p = new int[n];
....
delete [] p;

null pointer

See Null Pointers in comp.lang.c Frequently Asked Questions.

There are several ways to create a null pointer.

int *p1 = nullptr; // C++11
int *p2 = 0;
int *p3 = NULL; // must #include <cstddef> (or <cstdlib>), which defines the preprocessor macro NULL as a null pointer constant such as 0.

Modern C++ programs generally should avoid using NULL and use nullptr instead.

To test whether a pointer is valid, we can use either

if (p0 != nullptr)  // consider p0 valid

OR

if (p0)  // consider p0 valid; p0 is not zero.

dealing with problems with memory allocation

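If the memory requested is not available, the plain new operator throws a std::bad_alloc exception. Only the nothrow form, new (std::nothrow), returns a null pointer instead.

With the nothrow form, you can test for this possibility and take the appropriate action.

int *p = new (std::nothrow) int[1000]; // needs #include <new>
if (!p) {
    cout << "Insufficient memory.";
    exit(-1);
}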

Memory leak

Does calling new operator twice on the same pointer without calling delete operator in between cause a memory leak?

The answer is Yes.
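The leak can be made visible with a small counting type. This is a sketch under assumptions: Counted and leak_demo are hypothetical names, and the function leaks one object on purpose.

```cpp
// Hypothetical type that counts how many of its objects currently exist.
struct Counted {
    static int live;
    Counted() { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

// Calls new twice on the same pointer with no delete in between.
// Returns how many objects are still alive afterwards.
int leak_demo() {
    Counted* p = new Counted;  // first object
    p = new Counted;           // the address of the first object is lost here
    delete p;                  // frees only the second object
    return Counted::live;      // the first object can no longer be deleted
}
```

After the single delete, one object is still alive and nothing points to it any more, which is exactly the leak described above.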

Test memory leak

The following C++ code was used to test a memory leak and also the capacity of memory. Note that when 'n' is declared as unsigned int, the maximum value can be 4,294,967,295=2^32-1 which corresponds to 32,767 MB (about 32 GB) in size for a double vector (assume 8 bytes). If I change the type of n to "unsigned long long", its range can go up to 18,446,744,073,709,551,615=2^64-1 or 137438953472 GB in size for a double vector.

Note that on Windows the program should be built as an x64 rather than a win32 target in Visual Studio if we want to test it on an OS other than Windows XP. On a Linux OS we do not need to worry about this...

#include <iostream>
#include <new>      // std::nothrow

using namespace std;

void foo(unsigned long long n)
{
    double *ptr = new (std::nothrow) double[n];
    if (!ptr) {
        cout << "Failed to allocate double[n]" << endl;
    } else {
        ptr[1] = 1.0;
        ptr[n-1] = 2.0;
        cout << "double[n] is allocated successfully!" << endl;
    }
}

int main()
{
    cout << "The program is used to test memory leak. \n";
    cout << "Do not worry. It won't crash your computer.\n";
    cout << "The source code is available on https://gist.github.com/arraytools/6689581 \n" << endl;
    unsigned long long n;
    int testChoice;
    cout << "Enter your choice 1=main, 2=sub: ";
    cin >> testChoice;
    cout << "\nNext enter the array size (<= 18,446,744,073,709,551,615) : ";
    cout << "\nSome common scenarios";
    cout << "\n268,435,456-1 = 2GB";
    cout << "\n536,870,912-1 = 4GB";
    cout << "\n1,073,741,824-1 = 8GB";
    cout << "\n2,147,483,648-1 = 16GB";
    cout << "\n4,294,967,296-1 = 32GB";
    cout << "\n8,589,934,592-1 = 64GB";
    cout << "\n";
    cin >> n;
    if (testChoice == 1)
    {
        double *ptr = new (std::nothrow) double[n];
        if (!ptr)
        {
            cout << "Failed to allocate double[n]" << endl;
        } else {
            ptr[1] = 1.0;
            ptr[n-1] = 2.0;
            cout << "double[n] is allocated successfully!" << endl;
        }
    } else {
        foo(n);
    }
    return 0;
}

My testing results on Windows XP with 2GB physical memory & 2GB virtual memory are shown below. The program was compiled into a 32-bit console application. It is a different story when tested on 64-bit Windows 7.

Array Size                  Main     Function
20,000,000 = 150 MB         OK       OK
200,000,000 = 1.5 GB        OK       OK
2,000,000,000 = 15 GB       Not OK   Not OK
500,000,000 = 4 GB          Not OK   Not OK
250,000,000 = 1.91 GB       OK       OK
270,000,000 = 2.06 GB       Not OK   Not OK

Enumerations

See 2.3.3 of The C++ Programming language or http://en.cppreference.com/w/cpp/language/enum

Unscoped enumeration

enum Color { RED, GREEN, BLUE };
Color r = RED;
switch(r)
{
    case RED  : std::cout << "red\n";   break;
    case GREEN: std::cout << "green\n"; break;
    case BLUE : std::cout << "blue\n";  break;
}

Scoped enumerations

enum class Color { RED, GREEN = 20, BLUE };
Color r = Color::BLUE;
switch(r)
{
    case Color::RED  : std::cout << "red\n";   break;
    case Color::GREEN: std::cout << "green\n"; break;
    case Color::BLUE : std::cout << "blue\n";  break;
}
// int n = r; // error: no scoped enum to int conversion
int n = static_cast<int>(r); // OK, n = 21

this keyword/pointer

The this keyword is valid only inside a member function, where it denotes a pointer to the object on which the member function is operating. For example, inside Vec::operator=, the type of this is Vec*, because this is a pointer to the Vec object of which operator= is a member. For a binary operator, such as assignment, this is bound to the left-hand operand. Ordinarily, this is used when we need to refer to the object itself, as we do here both in the initial if test and in the return.
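The assignment operator described above is not shown on this page, so here is a simplified sketch of the idea. IntVec is a hypothetical stand-in, not the Vec class from Accelerated C++; it only shows where this appears in the initial if test and in the return.

```cpp
#include <cstddef>

// Hypothetical, simplified container that owns a dynamic array of int.
class IntVec {
public:
    explicit IntVec(std::size_t n, int value = 0) : n_(n), data_(new int[n]) {
        for (std::size_t i = 0; i != n_; ++i) data_[i] = value;
    }
    IntVec(const IntVec& rhs) : n_(rhs.n_), data_(new int[rhs.n_]) {
        for (std::size_t i = 0; i != n_; ++i) data_[i] = rhs.data_[i];
    }
    ~IntVec() { delete[] data_; }

    IntVec& operator=(const IntVec& rhs) {
        if (this != &rhs) {                 // 'this' points to the left-hand operand
            int* fresh = new int[rhs.n_];   // copy first, then release the old array
            for (std::size_t i = 0; i != rhs.n_; ++i) fresh[i] = rhs.data_[i];
            delete[] data_;
            data_ = fresh;
            n_ = rhs.n_;
        }
        return *this;                       // lets assignments be chained
    }

    std::size_t size() const { return n_; }
    int& operator[](std::size_t i) { return data_[i]; }

private:
    std::size_t n_;
    int* data_;
};
```

The if test protects against self-assignment, where deleting the old array would also destroy the data being copied. Returning *this by reference is what makes an expression such as a = b = c work.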

Communication between objects

See The MagPi issue 24. The trick is to use the this pointer to pass a pointer to the enclosing object to its data member object. In header files,

// Child.h
#ifndef CHILD_H
#define CHILD_H
class Parent; // forward declaration to reduce compile time
class Child {
....
};
#endif
// Parent.h
#ifndef PARENT_H
#define PARENT_H
class Child; // forward declaration to reduce compile time
class Parent {
...
};
#endif

In C++ files,

// Child.cpp
Child::Child(Parent *parent):
  m_parent(parent) {
}

void Child::run() {
  cout << m_parent->x() << ", " << m_parent->y() << endl;
}

// Parent.cpp
Parent::Parent(unsigned int x, unsigned int y):
  m_child(0),
  m_x(x),
  m_y(y) {
  if (!m_child) m_child = new Child(this);
  m_child->run();
}

bad_alloc error

http://stackoverflow.com/questions/6833143/how-to-check-memory-allocation-failures-with-new-operator

In C++ there are 2 primary ways in which new allocates memory and each requires different error checking.

The standard new operator will throw a std::bad_alloc exception on failure and this can be handled like a normal exception

try {
  char* c = new char[100];
} catch (std::bad_alloc&) {
  // Handle error
}

Alternatively, the nothrow version of new will simply return NULL on failure

char* c = new (std::nothrow) char[100];
if (!c) {
  // Handle error
}

C++ videos

Type casting

A value can be given a type either by casting explicitly, like (float)5, or by using a literal suffix, like 5.0f (note that 5f is not a valid literal).

Global variables, header

http://stackoverflow.com/questions/9702053/how-to-declare-a-global-variable-in-c

1. Put extern int myvar; in the header file 2. Put int myvar; in exactly one cpp file

// header file <myheader.h>
extern int x;   // declaration only

// cpp file
#include "myheader.h"
int x;          // the single definition
void foo()
{
  x = 5;
}

C vs C++ with functions

In non-object programming, we use

function(x, parameter)

In C++ programming, we use

x->function(parameter) // if x is a pointer
x.function(parameter)   // if x is not a pointer

<cstdlib> vs <stdlib.h>

The first one is for C++ and the other is for C. See here.

Function Prototypes

For example,

void cheers(int);    // prototype: no return value
double cube(double x);   // prototype: returns a double
int main()
{
...
}

Function prototypes are often hidden in the include files.

Pass function name in a function

See

int main() {
    ....
    vector<double> homework;
    read_hw(cin, homework);

    // note that returning the stream allows our caller to write
    // if (read_hw(cin, homework)) { /* ... */ }
    // as an abbreviation for 
    // read_hw(cin, homework); 
    // if (cin) { /* ... */ }
    ....
}

istream& read_hw(istream& in, vector<double>& hw)
{
    // 'in' is not copied. We will modify it and return it.
    ....
    return in;
}
int main() {
   ...
   write_analysis(cout, "median", median_analysis, did, didnt);
   write_analysis(cout, "average", average_analysis, did, didnt);
   ...
}

void write_analysis(ostream& out, const string& name,
                    double analysis(const vector<Student_info>&),
                    const vector<Student_info>& did,
                    const vector<Student_info>& didnt)
{
	out << name << ": median(did) = " << analysis(did) <<
	               ", median(didnt) = " << analysis(didnt) << endl;
}

For the ostream & parameter, see Write a Function to Contain an Argument of Output Device.

Argument Passing in Functions

Reference arguments seem more common for numerical values, while pointer arguments seem more common for char arrays. Argument passing is introduced in C++ Primer Chapter 6.2.2.

The argument can be a scalar or an array, and it can hold numerical values or character strings.

Pointers as a function argument

The following example is coming from http://www.nongnu.org/c-prog-book/online/x641.html. See also cplusplus.com tutorial about pointers.

#include <stdio.h>

int swap_ints(int *first_number, int *second_number);

int main()
{
  int a = 4, b = 7;
  printf("pre-swap values are: a == %d, b == %d\n", a, b);
  swap_ints(&a, &b);
  printf("post-swap values are: a == %d, b == %d\n", a, b);
  return 0;
}

int swap_ints(int *first_number, int *second_number)
{
  int temp;
  temp = *first_number;
  *first_number = *second_number;
  *second_number = temp;
  return 0;
}

References as a function argument

A reference variable is a name that acts as an alias for another variable. This is described in C++ Primer Plus Chapter 8.

Here we create a reference that looks and acts like a standard C++ variable except that it operates on the same data as the variable that it references.

The following example is modified from C++ pitfalls.

int foo = 3;     // foo == 3
int &bar = foo;  // foo == 3. bar is of type int &. The ampersand & is NOT the address operator.
bar = 5;         // foo == 5
int * prats = &foo;  // prats is a pointer. *prats and bar can be used interchangeably with foo
                               // and use the expression &bar and prats interchangeably with &foo.

and the same concept of references is used when passing variables.

#include <iostream>
using namespace std;

void foo( int &i )
{
    i++;
}

int main()
{
    int bar = 5;   // bar == 5
    cout << bar << endl;
    foo( bar );    // bar == 6
    cout << bar << endl;
    foo( bar );    // bar == 7
    cout << bar << endl;
    return 0;
}

However, a reference variable has two restrictions: 1. it must be initialized when it is declared; 2. it is rather like a const pointer: once a reference pledges its allegiance to a particular variable, it sticks to its pledge.

More examples from this post.

// OK case.
main: int i=6;
         chgInt(&i);
function:  void chgInt(int *p);
                   *p = 10 + *p;

// Not OK case.
main: char *name = "old";
          chgStr(name);
function: void chgStr(char *n);
              n = "new"; // changes only the local copy of the pointer; the caller's name still points to "old"

// Correction
main: SAME
function: void chgStr(char* &n);
              n = "new";

In fact, the way of using char* &n in function argument is also used by Foundation of Qt Development List 1.1 and 1.2 where string is used instead of char*. Below is the code of List 1.1:

class MyClass
{
public:
    MyClass( const string& text );
    const string& text() const;
    void setText( const string& text );
    int getLengthOfText() const;
private:
    string m_text;
};

In PPP book, Stroustrup also uses call by references everywhere, such as:

// 21.9 Container algorithms
void test(vector<int>& v)
{
  sort(v.begin(), v.end());
} 
int main()
{
  vector<int> vs;
  test(vs);
}

A real strength of references is shown in the PPP book, Chapter 20.1, where a reference lets us access the elements of a C++ vector with the same syntax as a C array.

// C array
double* jack_data = get_from_jack(&jack_count);    
//   jack_data[i]  ---- value
//   &jack_data[i] ---- address

// C++ vector
vector<double>* jill_data = get_from_jill();    
// Then instead of using the basic way to access the data
//   (*jill_data)[i]  ---- value
//   &(*jill_data)[i] ---- address
// we can use the reference method
vector<double>& v = *jill_data;
//   v[i]  ---- value
//   &v[i] ---- address

Comparison of Reference and Pointer

            Reference (pass by reference)              Pointer (pass by address)

Main        int a;                                     int a;
            int & bar = a;                             int * bar = &a;
            foo(a);                                    foo(&a);
            // pass the variable itself (not the       // pass the address of the variable
            // address of the variable);
            // the same call 'can' also pass the
            // 'value' of the variable; this is
            // determined by the prototype of the
            // function definition.
            note: no * is used. & is used in the       note: * is used in the declaration.
            declaration.

Function    foo(int &b)                                foo(int *b)
            // b is an alias. It can be used           // b is a pointer to an int
            // directly. For example, b = 10;          // For example, *b = 10;
            note: no * is used. & is used in the       note: * is used in the declaration.
            declaration.

lvalue and rvalue

  • Chapter 8.5.6 Pass-by-value vs pass-by-reference of PPP (an easy to follow book with essential concepts there).
  • Chapter 6.4 Objects and Values of The C++ Programming Language.
  • Chapter 7.7.1 Lvalue References and 7.7.2 Rvalue References of The C++ Programming Language.
void g(int a, int& r, const int& cr)
{
    ++a;
    ++r;
    int x = cr;
}
int main()
{
    int x = 0;
    int y = 0;
    int z = 0;
    g(x, y, z); // x==0; y==1; z==0
    g(1, 2, 3); // error: reference argument r needs a variable to refer to 
    g(1, y, 3); // OK since cr is const we can pass a literal
}

rvalue reference, && and Move semantics

Move semantics are used to avoid copying data. The need arises in the copy constructor and in copy assignment, and it matters most for large data.

  • See Programming: Principles and Practice (2nd ed) 18.3.4. It is a new feature in C++11.
  • Google it.

Pass the (address) of the pointer (char array) as function argument

http://stackoverflow.com/questions/1217173/how-to-have-a-char-pointer-as-an-out-parameter-for-c-function.

#include <string.h>
#include <iostream>
using namespace std;

void SetName( char **pszStr )
{
    char* pTemp = new char[10];
    strcpy(pTemp,"Mark");
    *pszStr = pTemp; // store the address of the new buffer in the caller's pointer
}

void SetNameBetter(char *& pszStr )
{
    char* pTemp = new char[11];   // "MarkBetter" is 10 characters plus '\0'
    strcpy(pTemp,"MarkBetter");
    pszStr = pTemp; // this works because pszStr *is* the pointer in main
}

void SetNameNotOK( char *pszStr )
{
    char* pTemp = new char[10];
    strcpy(pTemp,"Mark");
    pszStr = pTemp;
}

int main(void){
    
    char* pszName = NULL;
    // SetName( pszName );
    SetName( &pszName ); // pass the address of this pointer so it can change
    cout<<"Name - "<< pszName<<endl; // not *pszName

    delete [] pszName; // free the first buffer before the pointer is overwritten
    SetNameBetter( pszName ); // pass the pointer into the function, using a reference
    cout<<"Name - "<< pszName<<endl; // not *pszName

    delete [] pszName;
    return 0;
}

Function and Arrays

(C++ Primer Ch 6.2.4 Array Parameters) Because arrays are passed as pointers, functions ordinarily don't know the size of the array they are given. There are three techniques used to manage pointer parameters

  1. using a marker to specify the extent of a character array (null character for C-style strings)
  2. using the standard library conventions (begin and end pointers) and
  3. explicitly passing a size parameter.

See Chapter 7 of C++ Primer Plus (5th ed).

// Version 1.
int sumarray(int arr[], int n)
{
  int total = 0;
  for(int i=0; i<n; i++) total += arr[i];
  return total;
}

// Version 2.
int sumarray(int * arr, int n)
{
  int total = 0;
  for(int i=0; i<n; i++) total += arr[i];
  // arr[i] is equivalent to *(arr + i)
  return total;
}

// Version 3a.
int sumarray(const int arr[], int n); //protect input array

// Version 3b.
int sumarray(const int *begin, const int *end)
{
  const int *pt;
  int total = 0;
  for(pt=begin; pt != end; pt++) total += *pt;
  return total;
}

Pointer and const

See Chapter 7 of C++ Primer Plus (5th ed).

int sloth = 3;
const int * ps = &sloth;  
// a pointer to const int, prevent using the pointer to change the pointed-to value

int * const finger = &sloth;  
// a const pointer to int, prevent from changing where the pointer points.

Functions and Two-Dimensional Arrays

int data[3][4] = {{1,2,3,4}, {5,6,7,8}, {9,10,11,12}}; 
// data is an array with 3 elements. 
int total = sum(data, 3); 
// Each element of data is an array of 4 int values, so the parameter ar
// should be a pointer to array-of-four-int.
// What is the prototype?

int sum(int (*ar)[4], int size); 
// the parentheses are needed because int *ar[4] would declare 
// an array of 4 pointers to int.
// OR
int sum(int ar[][4], int size);

int sum(int ar[][4], int size) 
{
  int total = 0;
  for(int r=0; r< size; r++)
    for(int c=0; c<4; c++)
      total += ar[r][c];
  return total;
}

Note that the parentheses around *ar are necessary:

int *matrix[10];    // array of 10 pointers
int (*matrix)[10];  // pointer to an array of ten ints

command line arguments, arguments to main

int main(int argc, char *argv[]) { ... } // argv is an array of pointers to C-style character strings.
int main(int argc, char **argv) { ... }  // argv points to a char *.

argv[0] = "prog"; // actually it is 'p', 'r', 'o', 'g', '\0'.
argv[1] = "-i";   // or '-', 'i', '\0'.
argv[2] = "inputfile";
argv[3] = "-o";
argv[4] = "outputfile";

Varying parameters

(C++11) initializer_list parameters.

void error_msg(initializer_list<string> i1)
{
  for (auto beg = i1.begin(); beg != i1.end(); ++beg)  cout << *beg << " ";
  // OR for (const auto &elem : i1)  cout << elem << " ";
  cout << endl;
}
...
if (expected != actual)
  error_msg({"foo", expected, actual});
else
  error_msg({"foo", "Okay"});

Ellipsis parameters

Ellipsis parameters are in C++ to allow programs to interface to C code that uses a C library facility named varargs.

void foo(param_list, ...);
void foo(...);

Return a pointer to an array

See C++ Primer Ch 6.3.3.

A function cannot return an array. It can return a pointer or a reference to an array. However, the syntax used to define functions that return pointers or references to arrays can be intimidating.

(C++11) Trailing return type. Trailing returns can be defined for any function with complicated return types, such as pointers (or references) to arrays. A trailing return type follows the parameter list and is preceded by ->. To signal that the return follows the parameter list, we use auto where the return type ordinarily appears.

// func takes an int argument and returns a pointer to an array of 10 ints
auto func(int i) -> int(*)[10];


Alternatively, if we know the array(s) to which our function can return a pointer, we can use decltype to declare the return type. For example, the following function returns a pointer to one of two arrays, depending on the value of its parameter.

int odd[] = {1, 3, 5, 7, 9};
int even[] = {2, 4, 6, 8, 10};
// returns a pointer to an array of 5 int elements
decltype(odd) *arrptr(int i)
{
  return(i % 2) ? &odd : &even; // return a pointer to the array
}

Pointer, vector and element

  • See PPP 20.1 (Storing and processing data) and 20.1.1 (Working with data).
// If jill_data is a pointer from get_from_jill() which returns a pointer to a vector

vector<double>* get_from_jill(); 

vector<double>* jill_data = get_from_jill();

int i = 1;
cout << *jill_data[i];   // means *(jill_data[i]), which is not what we want.
cout << (*jill_data)[i]; // good: the parentheses are needed because [ ] binds tighter than *. 

delete jill_data;

pass a vector of strings using reference

See how to pass a vector of strings using reference.

#include <iostream>
#include <vector>
#include <string>

using namespace std;

void foo(char* &s) {
    /* Goal: print the character array s */
    /* approach 1: not working. Only the first character is shown.
       cout << *s << " ";     */
    /* approach 2: simplest solution */
    cout << s << " ";
    /* approach 3: loop. Works fine
       while(*s) printf("%c",*s++);    */
    /* approach 4: non-loop. Works fine (needs <cstring> for strlen).
       std::string str(s, s + strlen(s));  convert char* to string
       cout << str << " ";                 and then print it out   */
}

void foo2(vector<string> vs) {
    // for(auto it : vs) cout << it << endl; for C++11
    for (std::vector<string>::iterator it = vs.begin(); it!=vs.end(); ++it)
        cout << ' ' << *it;
}

void foo3(vector<string> &vs) {
    for (std::vector<string>::iterator it = vs.begin(); it!=vs.end(); ++it)
        cout << ' ' << *it;
    cout << endl;
    for (unsigned int i=0; i < vs.size(); ++i)
        cout << ' ' << vs[i];
}

int main()
{
    const int argc=4;
    // C style of an array of pointers to characters
    char** argv = new char *[argc];
    argv[0] = "This";
    argv[1] = "Is";
    argv[2] = "A";
    argv[3] = "Book";
    cout << "char** type" << endl;
    cout << "Call directly" << endl;
    for(int i=0; i<argc; i++)   cout << argv[i] << " ";
    cout << endl;

    cout << "Call foo()" << endl;
    for(int i=0; i<argc; i++) foo(argv[i]); // only 1 string at a time
    cout << endl << endl;

    // std::vector<std::string> vs = {"This", "Is", "A", "Book"}; C++11
    std::vector<std::string> vs;
    vs.push_back("This");
    vs.push_back("Is");
    vs.push_back("A");
    vs.push_back("Book");
    cout << "string vector type" << endl;
    cout << "Print from main()" << endl;
    for (std::vector<string>::iterator it = vs.begin(); it!=vs.end(); ++it)
        cout << ' ' << *it;
    cout << endl;

    cout << "Call foo2() pass by value" << endl;
    foo2(vs);  // pass whole vector
    cout << endl;

    cout << "Call foo3() pass by reference" << endl;
    foo3(vs);  // pass whole vector
    cout << endl;

    delete[] argv;
    return 0;
}

And the output

char** type
Call directly
This Is A Book
Call foo()
This Is A Book

string vector type
Print from main()
 This Is A Book
Call foo2() pass by value
 This Is A Book
Call foo3() pass by reference
 This Is A Book
 This Is A Book

A similar example is to create the vector elements in subroutine and again the vector is passed by reference.

#include <iostream>
#include <vector>
#include <string>

using namespace std;

void foo(vector<string> &vs) {
    vs.push_back("This");
    vs.push_back("Is");
    vs.push_back("A");
    vs.push_back("Book");
}

void foo2(vector<string> &vs) {
    for (std::vector<string>::iterator it = vs.begin(); it!=vs.end(); ++it)
        cout << ' ' << *it;
    cout << endl;
}

int main()
{
    std::vector<std::string> vs;
    cout << "string vector type" << endl;
    cout << "Print from main()" << endl;
    for (std::vector<string>::iterator it = vs.begin(); it!=vs.end(); ++it)
        cout << ' ' << *it;
    cout << endl;

    cout << "Pass by reference" << endl;
    foo(vs);  // pass whole vector
    foo2(vs);
    return 0;
}

For 2 dimensional matrix of string, see the example below.

Container class in C++ vs simple array in C

https://isocpp.org/wiki/faq/containers

try { ... } catch { ... } for exception handling

try
{
    throw 20;
}
catch (int e)
{
    cout << "An exception occurred. Exception Nr. " << e << endl;
}

Second example.

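   char *buf;
   try {
      // plain new throws std::bad_alloc instead of returning 0,
      // so the nothrow form (from <new>) is needed for this check
      buf = new (std::nothrow) char[512];
      if( buf == 0 )
         throw "Memory allocation failure!";
   }
   catch( const char * str ) {   // a string literal is thrown as const char*
      cout << "Exception raised: " << str << '\n';
   }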

Third example.

try {
    return grade(s);
} catch (domain_error) {
    return grade(s.midterm, s.final, 0);
}

Fourth example,

#include <iostream>
using namespace std;

int divide_numbers(int a, int b)
{
    if(b==0)
        throw 1;
    return a/b;
}

int main()
{
    int a, b;

    cout << "One: ";
    cin >> a;
    cout << "Two: ";
    cin >> b;

    try
    {
        cout << divide_numbers(a, b);
    }
    catch(int& code)
    {
        cout << "ERROR CODE: " << code;
    }
    catch(...)
    {
        cout << "An unknown error has occurred.";
    }
    //Continue doing whatever afterwards like normal

    return 0;
}

Fifth example,

#include <iostream>
using namespace std;

int main()
{
    try
    {
        int* myarray = new int[100000000000000];
        delete [] myarray;
    }
    catch(exception& e) //Takes a reference to an 'exception' object
    {
        cout << "Error allocating memory: " << e.what() << endl;
    }
}

Sixth example,

// http://www.tenouk.com/cpluscodesnippet/domainerrortypeid.html
// http://www.cplusplus.com/reference/exception/exception/?kw=exception
#include <iostream>
#include <exception>  // std::exception
#include <stdexcept>  // std::domain_error
#include <typeinfo>   // operator typeid
using namespace std;
 
int main(void)
{
	try
	{
		throw domain_error("Some error with your domain!");
	}
	catch (std::exception &err)
	{
		cerr<<"Caught: "<<err.what()<<endl;
		cerr<<"Type: "<<typeid(err).name()<<endl;
	}
}

Class

The fundamental ideas behind classes are data abstraction and encapsulation.

Data abstraction separates interface from implementation. The interface of a class consists of the operations that users of the class can execute. The implementation includes the class's data members and the bodies of the functions.

Encapsulation enforces the separation of a class' interface and implementation.

Class vs struct

The only difference between struct and class is the default access level.

Members in a class are private by default while members defined in struct are public.

Single colon and double colons

Single colon
  • A single colon ":" is used in inheritance. For instance,
class fourwheeler {};
class car: public fourwheeler {};
  • A single colon also introduces the member initializer list of a constructor. For instance,
class Vector {
public:
  Vector(int s) :elem{new double[s]}, sz{s} {} 
  double& operator[](int i) { return elem[i]; }
  int size() { return sz; }
private:
  double* elem;
  int sz;
};
Double colons
  • Double colons "::" are used to define or refer to a class's function/method. Sometimes they can be used to resolve a namespace problem. See here.
void MyClass::setText() {}

:: operator

The :: in the function name is the same scope operator. For example, <Student_info.cc> in Accelerated C++,

double Student_info::grade() const
{
	return ::grade(midterm, final, homework);
}

Note that the :: in front of a name insists on using a version of that name that is not a member of anything. In this case, we call the version of grade that takes two doubles and a vector<double>.

The const in the declaration is a promise that calling the grade function will not change any of the data members of the Student_info object (Accelerated C++ 9.2.1). We can understand this usage by comparing the new function declaration with the original:

double Student_info::grade() const { ... }   // member-function version

double grade(const Student_info&) { ... }    // original

The grade() declared in Student_info::grade() is a const member function.

Another more explicit example is on Why does C++ need the scope resolution operator.

#include <iostream>

int a = 10;
namespace M
{
    int a = 20;
    namespace N
    {
        int a = 30;
        void f()
        {
            int x = a; //a refers to the name inside N, same as M::N::a
            int y = M::a; //M::a refers to the name inside M
            int z = ::a; //::a refers to the name in the global namespace

            std::cout << x << ", "<< y << ", " << z << std::endl; //30,20,10
        }
    }
}
int main() 
{
    M::N::f();
}

Access Member Functions

Use "." dot for regular objects. See the next section.

For pointers to a class, use arrows "->".

Pointers to classes
// http://www.cplusplus.com/doc/tutorial/classes/
#include <iostream>
using namespace std;

class Rectangle {
  int width, height;
public:
  Rectangle(int x, int y) : width(x), height(y) {}
  int area(void) { return width * height; }
};

int main() {
  Rectangle obj (3, 4);
  Rectangle * foo, * bar, * baz;
  foo = &obj;
  bar = new Rectangle (5, 6);
  baz = new Rectangle[2] { {2,5}, {3,6} };
  cout << "obj's area: " << obj.area() << '\n';
  cout << "*foo's area: " << foo->area() << '\n';
  cout << "*bar's area: " << bar->area() << '\n';
  cout << "baz[0]'s area:" << baz[0].area() << '\n';
  cout << "baz[1]'s area:" << baz[1].area() << '\n';       
  delete bar;
  delete[] baz;
  return 0;
}

Access parent class data member

http://stackoverflow.com/questions/11525418/how-to-access-parent-classs-data-member-from-child-class-when-both-parent-and

Default constructor

The default constructor is the constructor that takes no parameters. It is called when an object is declared but is not initialized with any arguments. In fact, empty parentheses cannot be used to call the default constructor.

// http://www.cplusplus.com/doc/tutorial/classes/
// overloading class constructors
#include <iostream>
using namespace std;

class Rectangle {
    int width, height;
  public:
    Rectangle ();
    Rectangle (int,int);
    int area (void) {return (width*height);}
};

Rectangle::Rectangle () {
  width = 5;
  height = 5;
}

Rectangle::Rectangle (int a, int b) {
  width = a;
  height = b;
}

int main () {
  Rectangle rect (3,4);
  Rectangle rectb;    // ok, default constructor called
  Rectangle rectc();  // function declaration, default constructor NOT called 
  cout << "rect area: " << rect.area() << endl;
  cout << "rectb area: " << rectb.area() << endl;
  cout << "rectc area: " << rectc.area() << endl; // error
  return 0;
}

When we compile it, we will get an error

In function 'int main()': 29:35: error: request for member 'area' in 'rectc', which is of non-class type 'Rectangle()'

Defining class member functions and member initializer list

Use a single colon followed by each member with its initial value in parentheses. The parenthesis form can also be used to initialize ordinary variables.

This method is particularly necessary when we want to initialize a constant member value. See how to initialize const member variable in a class C++ in stackoverflow.com. A const member is set in the initializer list, before the body of the constructor runs.

See Single colon or 9.4.4 of Programming: principles and practice, 9.5.1 of Accelerated C++.

class Date {
public:
    Date(int yy, int mm, int dd) 
        :y(yy), m(mm), d(dd)        
    {   ...  }
    void add_day(int n)
    {   ...  }
    int month() { return m; }
    ...
private:
    int y, m, d;            // year, month, day
};

The :y(yy), m(mm), d(dd) notation is how we initialize members. It is called a member initializer list (See also this article in cplusplus.com). We could also use assignment to do the same job (not as good as the above way)

Date(int yy, int mm, int dd) 
{
    y = yy;
    m = mm;
    d = dd;
    // ...
}

We can also use an initializer list to set private variables to 0.

class Student_info {
public:
  Student_info();
  Student_info(std::istream&);
private:
  std::string n;
  double midterm, final;
  std::vector<double> homework;
};

Student_info::Student_info() : midterm(0), final(0) { }

explicit keyword and implicit conversion to resolve the parameters to a function

http://stackoverflow.com/questions/121162/what-does-the-explicit-keyword-in-c-mean

Class constructor can take parameter via assignment

For example (see p6 & p76 of the MIT lecture note)

class Integer {
  public:
  int val;
  Integer(int v) {
    val = v; cout << "constructor with arg " << v << endl;
  }
};
int main() {
  Integer i(3);
  Integer j = 5;
}

The output will be

constructor with arg 3 
constructor with arg 5

copy constructor and copy assignment (operator=)

  • See Chapter 11.3 of Accelerated C++
  • Cplusplus.com
  • See Chapter 18.3.1 and 18.3.2 of Programming: Principles and Practice (2nd ed).

copy constructor

MyClass(const MyClass& arg);

To use it,

MyClass v1(3);
MyClass v2 = v1; // OR MyClass v2 {v1};

copy assignment

MyClass& MyClass::operator=(const MyClass& a)

To use it,

MyClass v1(3);
v1.set(2, 2.2);
MyClass v2(4);
v2 = v1;

Operator Overloading

An operator function has the form

operatorOP(argument-list)

For many operators, you have a choice between using member functions or nonmember functions to implement operator overloading. Typically, the nonmember version is a friend function so it can directly access the private data for a class.

Case 1 (member function): Suppose both object1 and object2 belong to some class T, then the operation

object1 + object2

is the same as

object1.operator+(object2)

if we have implemented an appropriate operator+(const T &t) function in class T.

Case 2 (nonmember function):

friend T operator+(const T & t1, const T & t2);

Case 3 (nonmember function): Another example is object1 and object2 belong to different classes.

cout << object

comes from a nonmember friend function

void operator<<(ostream & os, const Time &t)  {  ...  }
// OR better with (see p519-520)
friend std::ostream & operator<<(std::ostream &os, const Time &t) { ... }

Case 4 (nonmember function): In another situation (see C++ Primer Plus -> Chapter 11 -> Creating Friends), the statement

ConstantVariable + object
// OR
operator+(ConstantVariable, object)

where operator+() is a nonmember friend function; e.g.

friend T operator+(double m, const T &t) { ... }
An example

From Chapter 11 of the C++ Primer Plus.

      Class                                     Main

Ex1   class Time {                              Time time1, time2, total;
      public:
        Time Sum(const Time &t) const;          total = time1.Sum(time2);
      };

Ex2   class Time {                              total = time1.operator+(time2);
      public:                                   // OR
        Time operator+(const Time &t) const;    total = time1 + time2;
      };

Friend functions & classes

Idea of friend functions

We want some general function (see examples below) or a member function of some class (see http://www.cprogramming.com/tutorial/friends.html) to access the (private, protected and public) members of one or more classes.

By declaring a nonmember function a friend, we can give it access to the private part of the class declaration. A friend declaration can be placed in either the private or the public part of a class declaration; it does not matter where. (Chapter 19.4 Friends of The C++ Programming Language)

Note that when friends are specified within a class, this does not give the class itself access to the friend function. That function is not within the scope of the class; it's only an indication that the class will grant access to the function (http://www.cprogramming.com/tutorial/friends.html).

For example, we could define an operator that multiplies a Matrix and a Vector. The implementation routine cannot be a member of both. Also, we don't want to provide low-level access function to allow every user to both read and write the complete representation of both Matrix and Vector. To avoid this, we declare the operator* a friend of both.

constexpr int rc_max {4}; // row and column size

class Matrix;

class Vector {
    float c[rc_max];
    // ...
    friend Vector operator*(const Matrix&, const Vector&); 
    // operator*() can reach into the implementation of Vector.
};

class Matrix {
    Vector v[rc_max];
    // ...
    friend Vector operator*(const Matrix&, const Vector&);
    // operator*() can reach into the implementation of Matrix.
};

Vector operator*(const Matrix& m, const Vector& v)
{
    Vector r;
    ...
    return r;
}

Now operator*() can reach into the implementation of both Vector and Matrix.

Other examples: std::ostream& operator<< operator as in freebayes and Rserve.

More examples from operator functions

Consider the following (C++ Primer Plus)

time2 = time1 * 2.75;
// is the same as
time2 = time1.operator*(2.75);

But the following statement is unclear because 2.75 is not a type Time object.

time2 = 2.75 * time1;

One solution is to use a nonmember function, so that the compiler can match the expression time2 = 2.75 * time1 to the following nonmember function call: time2 = operator*(2.75, time1). However, nonmember functions can't directly access private data in a class. This is where friends come in.

See <mytime3.h> (the friend keyword at the start of line 20 is required). Lines 20 and 21 are a little tricky to understand at first. The function is a nonmember function (written inline inside the class) and thus the keyword friend is necessary.

Overloading the << Operator

A common use of friend is overloading the << operator.

// mytime3.h
class Time 
{
public:
    friend std::ostream & operator<<(std::ostream &os, const Time &t);
};

// mytime3.cpp
std::ostream & operator<<(std::ostream& os, const Time &t)
{
  os << t.hours << " hours, " << t.minutes << " minutes";
  return os;   // returning the stream allows chaining: cout << aida << "; " << tosca
}

// usetime3.cpp
int main()
{
    Time aida, tosca;
    cout << aida << "; " << tosca << endl;
}

Inheritance

protected keyword in the base class

See an example from Accelerated C++ Chapter 13.1.

Suppose there is a base class and a derived class. The private functions and members of the base class cannot be accessed from outside the base class. If we declare these functions and members as protected instead, derived classes gain access to the protected members of their constituent base-class objects, while these elements stay inaccessible to users of the classes.

That is, members declared as protected are designed for the derived classes. This should not be confused with the friend keyword.

See the example in cplusplus.com.

Polymorphism

http://www.cplusplus.com/doc/tutorial/polymorphism/

A function's signature is its argument list. You can define two functions having the same name, provided that they have different signatures. This is called function polymorphism.

Function polymorphism is also called function overloading.

Virtual functions

http://www.cplusplus.com/doc/tutorial/polymorphism/

   Class 1
     ^
     |
     |
     v
   Class 2 (child)

(See 13.2.1 in Accelerated C++) Suppose we have two classes: one is a base class and the other is a derived class. Both of them have defined their own function called grade(). Then if we have defined a function as

bool compare_grades(const Base& c1, const Base& c2) {
   return c1.grade() < c2.grade();
}

then there is no way to distinguish between these two versions of grade() function. When we execute the compare_grades() function, it will always execute the Base::grade() member.

If we want C++ to use Derived::grade() member in compare_grades(), we can use C++ virtual functions:

class Base {
public:
    virtual double grade() const;  
    ...
};

Now when we call compare_grades(), the implementation will determine the version of grade() to execute by looking at the actual types of the objects to which the reference c1 and c2 are bound. That is, if the argument is a Derived object, it will run Derived::grade() function; if the argument is a Base object it will run the Base::grade() function.

The virtual keyword may be used only inside the class definition. If the functions are defined separately from their declaration, we do not repeat virtual in the definitions.

(See OOP Demystified Chapter 4.3: Run-time polymorphism) Virtual functions may be actual functions or merely placeholders for real functions that derived classes must provide. If you define a virtual function without a body, that means the derived class must provide it (it has no choice, and the program will not compile otherwise). Classes with such functions are called abstract classes, because they aren’t complete classes and are more a guideline for creating actual classes. (For example, an abstract class might state “you must create the Display() method.”) In C++, you can create a virtual function without a body by appending =0 after its signature (also known as a pure virtual function).

(Rserve R package). The cxx client header file <Rconnection.h>

class Rexp {
public:
    virtual Rsize_t length() { return len; }
    virtual std::ostream& os_print(std::ostream& os) {
        return os << "Rexp[type=" << type << ",len=" << len <<"]";
    }
};

class Rinteger : public Rexp {
public:
    virtual Rsize_t length() { return len/4; }
    virtual std::ostream& os_print (std::ostream& os) {
        return os << "Rinteger[" << (len/4) <<"]";   
    }
};

class Rdouble : public Rexp {
public:
    virtual Rsize_t length() { return len/8; }
    virtual std::ostream& os_print (std::ostream& os) {
        return os << "Rdouble[" << (len/8) <<"]";
    }    
};

Template

C++ uses templates to enable generic programming techniques. The C++ Standard Library includes the Standard Template Library (STL) that provides a framework of templates for common data structures and algorithms.

There are two kinds of templates: function templates (e.g. algorithm library) and class templates (e.g. array, vector, list containers).

Generic Programming & STL

  • Generic programming is an approach to programming that focuses on algorithm reuse (to contrast, OOP focuses on data reuse). Read What is generic Programming ?. For example, a find function would work with arrays (int, double,...) or linked lists or any other container type. That is, not only should the function be independent of the data type stored in the container, it should be independent of the data structure of the container itself.
  • A goal of generic programming is to write code that is independent of data types and data structures/containers. Templates are the C++ tools for creating generic programs. The STL goes further by providing a generic representation of algorithms. Chapter 16 of C++ Primer Plus.

Iterators (an extension of pointers)

Understanding iterators is the key to understanding the STL.

Just as templates make algorithms independent of the type of data stored, iterators make the algorithms independent of the type of container (array, list, set, map, ...) used.

For example consider the find function which would work on different container types. We need a generic representation of the process of moving through the values in a container. The iterator is that generalized representation.

An iterator should contain some properties. See also Iterators.

  • Able to dereference an iterator in order to access the value. If p is an iterator, *p should be defined.
  • Able to assign one iterator to another. The expression p = q should be defined.
  • Able to compare two iterators for equality. The expressions p == q and p != q should be defined.
  • Able to move an iterator through all the elements of a container. ++p and p++ should be defined.

The following example shows how the pointers can be extended to iterators.

// C++ Primer Plus Chapter 16. Why Iterators?
// Method 1. pointers
// If the function finds the value in the array, it returns the address in the array 
// where the value is found; otherwise, it returns the null pointer.
double * find_ar(double * ar, int n, const double & val)
{
    for (int i = 0; i < n; i++)
        if (ar[i] == val)
            return &ar[i];
    return 0;
}

// Method 2. Self-defined iterator type
typedef double * iterator;
iterator find_ar(iterator ar, int n, const double & val)
{
    for (int i = 0; i < n; i++, ar++)
        if (*ar == val)
            return ar;
    return 0;
}

// Method 3. More like a STL function style
// The function can return the end pointer as a sign that the value was not found.
// http://www.cplusplus.com/reference/algorithm/find/?kw=find
typedef double * iterator;
iterator find_ar(iterator begin, iterator end, const double & val)
{
    iterator ar;
    for (ar = begin; ar != end; ar++)
        if (*ar == val)
            return ar;
    return end; // indicates val not found
}

Function templates

http://www.cplusplus.com/doc/tutorial/templates/

// function template
#include <iostream>
using namespace std;

template <class T>
T GetMax (T a, T b) {
  T result;
  result = (a>b)? a : b;
  return (result);
}

int main () {
  int i=5, j=6, k;
  long l=10, m=5, n;
  k=GetMax<int>(i,j);
  n=GetMax<long>(l,m);
  cout << k << endl;
  cout << n << endl;
  return 0;
}

Another example is exchanging two variables. See Listing 8.11 <funtemp.cpp> in C++ Primer Plus (5th ed).

#include <iostream>
using namespace std;

template <class Any>
void Swap(Any &a, Any &b)
{
 // References as Function Arguments
  Any temp;
  temp = a;
  a = b;
  b = temp;
}

int main()
{
  int i = 10, j= 20;
  Swap(i, j);
  cout << "New i, j = " << i << ", " << j << ".\n";
  double x = 24.5, y = 81.7;
  Swap(x, y);
  cout << "New x, y = " << x << ", " << y << ".\n";
}

Or Overloaded Templates. See Listing 8.12 <twotemps.cpp> in C++ Primer Plus (5th ed).

template <class Any>
void Swap(Any &a, Any &b);

template <class Any>
void Swap(Any *a, Any *b, int n);

int main()
{
  int i=10, j=20;
  Swap(i, j);

  int d1[8] = {0, 7, 0, 4, 1, 7, 7, 6};
  int d2[8] = {0, 6, 2 ,0, 1, 9, 6, 9};
  Swap(d1, d2, 8);
}

template <class Any>
void Swap(Any &a, Any &b)
{
  Any temp; temp=a; a=b; b=temp;
}

template <class Any>
void Swap(Any a[], Any b[], int n)
{
  Any temp;
  for (int i=0; i<n; i++) 
  {
    temp = a[i]; a[i] = b[i]; b[i]=temp;
  }
}

Class templates

http://www.cplusplus.com/doc/tutorial/templates/

// class templates
#include <iostream>
using namespace std;

template <class T>
class mypair {
    T a, b;
  public:
    mypair (T first, T second)
      {a=first; b=second;}
    T getmax ();
};

template <class T>
T mypair<T>::getmax ()
{
  T retval;
  retval = a>b? a : b;
  return retval;
}

int main () {
  mypair <int> myobject (100, 75);
  cout << myobject.getmax();
  return 0;
}

Alternative to int

std::size_t

http://en.cppreference.com/w/cpp/types/size_t and http://www.cplusplus.com/reference/cstring/size_t/

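According to http://www.cplusplus.com/forum/beginner/15959/, size_type is the type mostly used inside the standard library containers (for example std::string::size_type), while std::size_t is the general-purpose unsigned type for sizes and indices.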

#include <iostream>
#include <string>

int main ()
{
  std::string str ("This is a book.");
  std::size_t i, j; 
  std::cout << "The size of str is " << str.size() << " bytes.\n"; // 15
  i=0;
  j=4;
  std::cout << str.substr(i, j) << std::endl; // "This"
  return 0;
}

std::string::size_type

http://www.cplusplus.com/reference/string/string/

size_type is an unsigned type of container subscripts, element counts, etc. It is conceivable that an int is insufficient to contain the length.

const std::string::size_type cols = myString.size() + 2;

std::vector<TYPE>::size_type

http://www.cplusplus.com/reference/vector/vector/

See an example at vector case.

String, iterator and printing

The iterator is one of string's member types and begin() & end() are two string's member functions with return type of iterators.

An iterator is a generalization of a pointer. You can read the value by using star (*) to dereference it, use ++/-- to move it and use != to compare two iterators. Moreover, with random-access iterators such as those of string and vector, we can use +/- to jump several positions and the subscript operator [] to get elements (eg iter[-1]).

Usually an iterator is used in sequences or vectors, see Sequence.

// string::begin/end
#include <iostream>
#include <string>

int main ()
{
  std::string str ("Test string");
  for ( std::string::iterator it=str.begin(); it!=str.end(); ++it)
    std::cout << *it;
  std::cout << '\n';

  return 0;
}

snprintf() and sprintf()

See http://www.cplusplus.com/reference/cstdio/sprintf/

  • int snprintf(char *s, size_t n, const char * format, ...)
  • int sprintf(char *s, const char * format, ...)

String, string, string

http://stackoverflow.com/questions/11322200/unable-to-build-my-c-code-with-g-4-6-3

http://stackoverflow.com/questions/2258561/getting-the-length-of-an-array-using-strlen-in-g-compiler

Learn C Essentials by MagPi

  • C++ uses <cstring> for char*, strlen, strcpy ...
  • C uses <string.h> for char*
  • C++ uses <string> for string class & <sstream> for stringstream class.
  • Qt uses QString & QStringList from <QtCore>, QTextStream from <QTextStream>. Cf QDataStream from <QDataStream>.

Convert a c-style string to a c++ string

const char * mystr = "klajlfdjlajfd";
std::string mycppstr(mystr);

string::size_type

padding

Example:

#include <iostream>
#include <string>
#include <iomanip>      // std::setw
using namespace std;

int main() {
	string str;
	str = "abcd";
	cout << setw (10) << str << endl;
	cout << str + string(4, ' ') + str << endl;
	return 0;
}
/* Output:
      abcd
abcd    abcd
*/

Comparison

strcmp and compare

P50 of Learn C Essentials - <string.h> library.

  • C-string
  char key[] = "apple";
  char buffer[80];
  do {
     printf ("Guess my favorite fruit? ");
     fflush (stdout);
     scanf ("%79s",buffer);
  } while (strcmp (key,buffer) != 0);
  • C++ string
  std::string str1 ("green apple");
  std::string str2 ("red apple");

  if (str1.compare(str2) != 0)
    std::cout << str1 << " is not " << str2 << '\n';

Remove file name & Get the basename from a full path

#include <iostream>
#include <string>
using namespace std;

int main() {
    // This solution works with both forward and back slashes.
    string s1("../somepath/somemorepath/somefile.ext");
    string s2("..\\somepath\\somemorepath\\somefile.ext");
    // Remove file name from a full path
    cout << s1.substr(0, s1.find_last_of("\\/")) << endl; // s1 is not changed yet
    cout << s2.substr(0, s2.find_last_of("\\/")) << endl; // use s1=s1.substr() to change it
    // Get the basename
    cout << s1.substr(s1.find_last_of("\\/")+1) << endl;
    cout << s2.substr(s2.find_last_of("\\/")+1) << endl;
    return 0;
}

The output is

../somepath/somemorepath
..\somepath\somemorepath
somefile.ext
somefile.ext

Replace a substring

#include <iostream>
#include <string>
using namespace std;

bool replace(std::string& str, const std::string& from, const std::string& to) {
    size_t start_pos = str.find(from);
    if(start_pos == std::string::npos)
        return false;
    str.replace(start_pos, from.length(), to);
    return true;
}

int main() {
    std::string string("/home/brb/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/");
    replace(string, "GRCh37", "GRCh38");
    cout << string << endl;
    return 0;
}

The output is

/home/brb/igenomes/Homo_sapiens/Ensembl/GRCh38/Sequence/BWAIndex/

convert backslash to forward slash in string

In C++ (not tested; std::replace is declared in <algorithm>)

string path = "C:\\Program Files";  // a backslash must be escaped inside a string literal
std::replace(path.begin(), path.end(), '\\', '/');

In Qt (tested), we can use

QString sReturnedValue = "C:\\Program Files";
sReturnedValue.replace("\\", "/");

Convert

Danger of implicit type conversion

Implicit type conversion (coercion) can result in unexpected results. See the following example.

#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
  float myFloat;
  long int myInt = 123456789; 
  // long int is at least 32 bits wide (it is 64 bits on 64-bit Linux); the value fits either way.
  myFloat = myInt;   // a float holds only about 7 significant decimal digits
  streamsize prec=cout.precision();
  cout << "myInt is changed to " << setprecision(9) << myFloat << setprecision(prec) << endl;  
  // 123456792. It is not a typo!
  return 0;
}

Implicit types conversion rules:

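  • long int to float can give an inexact result, because a float holds only about 7 significant digits
  • float to int discards the fractional part (truncation toward zero)
  • double to float rounds to the nearest representable float, so precision is lost
  • long int to int typically drops the higher bits when the value does not fit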

Convert char* to string

std::string has a constructor that takes a char*.

const char *path = "Eggs on toast.";
std::string str = std::string(path);

Convert std::string to c-string/char *

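Given a std::string R, copy its contents into a newly allocated char buffer (std::strcpy is declared in <cstring>; release the buffer with delete[] when done):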

char * S = new char[R.length() + 1];
std::strcpy(S,R.c_str());

This is useful when we want to use the ifstream class to open a file where the file name is a string. Before C++11, the file stream classes did not accept a std::string, only a c-string. So pass the string using c_str():

Table::Table(string filename)
{
    ifstream fin;
    fin.open(filename.c_str());
    ...
}

Convert an integer to character string

See www.cplusplus.com.

We need to include

#include <sstream>

Convert a numerical number to char

Use

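char mychar[256]="";
double number = -7.035425;
snprintf(mychar, sizeof mychar, "%0.2f", number);  // sprintf_s(mychar, "%0.2f", number) is the Microsoft-specific variant

This will save a number, for example -7.035425, as the characters -7.04 (the value is rounded to two decimals, not truncated).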

C++ IO Streams

The first example is to read the input filename using command line argument (Accelerated C++ Chapter 10.5).

#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(int argc, char **argv)
{
    int fail_count = 0;
    // for each file in the input list
    for (int i = 1; i < argc; ++i) {
        ifstream in(argv[i]);
        // if it exists, write its contents, otherwise generate an error message
        // another way is to use (in.is_open() == false) to check if the file can be opened; see other places
        if (in) {
            string s;
            while (getline(in, s))
                cout << s << endl;
        } else {
            cerr << "cannot open file " << argv[i] << endl;
            ++fail_count;
        }
    }
    return fail_count;
}

The 2nd example is to hard-code the input filename.

#include <fstream>
#include <string>

using std::endl;
using std::getline;
using std::ifstream;
using std::ofstream;
using std::string;

int main()
{
    ifstream infile("in");
    ofstream outfile("out");

    string s;

    while (getline(infile, s))
   	outfile << s << endl;

    infile.close();
    outfile.close();

    return 0;
}

No matching function - ifstream open()

If we want to use a string type as a file name instead of specifying the file name in the code (hard-code), we need to use the c_str member function first.

See here. Change the code to:

std::string filename;

std::ifstream infile;
infile.open(filename.c_str());
// Or
std::ifstream infile(filename.c_str());

An argument natural to ifstream is argv[i] from main(). For example,

ifstream in(argv[i]);

If we want to check whether the open operation was successful, we cannot use a statement like

if (file.open())  { ... }

because ifstream::open() returns void. We need to use either if (infile) or the infile.is_open() function. Alternatively, use a filebuf object, whose open() function returns a null pointer if the file cannot be opened. See an example in cplusplus.com.

Support for std::string argument was added in c++11.

fail(), bad() and eof() functions

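  • fail() means a logical error: an operation failed, for example because the input did not have the expected format (failbit or badbit is set). A typical use is to call it right after .open().
  • bad() means a read/write error on the underlying device. Possible causes are more complex, like memory shortage or the buffer throwing an exception. See an answer from stackoverflow.com.
  • eof() means that the end of the input was reached during an operation.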

Here is an example from cppreference.com to test bad(), eof() and fail() methods.

stringstream

stringstream is declared in the <sstream> header and derives from iostream, so a stringstream behaves more like a stream (or file) than like a string.

Simple example (using multiple << and >> operators)

// inserting into and extracting from a stringstream
#include <string>       // std::string
#include <iostream>     // std::cout
#include <sstream>      // std::stringstream

int main () {

  std::stringstream ss;

  ss << 100 << ' ' << 200;

  int foo,bar;
  ss >> foo >> bar;

  std::cout << "foo: " << foo << '\n';
  std::cout << "bar: " << bar << '\n';

  return 0;
}
// Output:
// foo: 100
// bar: 200

stringstream::str

#include <string>       // std::string
#include <iostream>     // std::cout
#include <sstream>      // std::stringstream, std::stringbuf

int main () {
  std::stringstream ss;
  ss.str ("Example string");
  std::string s = ss.str();
  std::cout << s << '\n';
  return 0;
}
// Output:
// Example string

Clean/empty a stringstream

See stringstream.

std::stringstream ss;
ss.str("");

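P.S. The clear() member function is inherited from ios and only clears the error state of the stream; it does not empty the buffer. When the same stringstream is reused inside a loop, clear() is needed as well, because an extraction that reaches the end of the buffer leaves eofbit (and possibly failbit) set, and str("") does not reset these flags. Another method is to declare the stringstream variable inside the loop.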

clear() and rdstate() functions

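clear() resets the error state flags of a stream and rdstate() returns them. In the following example the file is opened for input only, so the write fails and sets failbit; clear() then resets the flag so that the file can be read.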

#include <iostream>     // std::cout
#include <fstream>      // std::fstream

int main () {
  char buffer [80];
  std::fstream myfile;

  myfile.open ("test.txt",std::fstream::in);

  myfile << "test";
  if (myfile.fail())
  {
    std::cout << "Error writing to test.txt\n";
    myfile.clear();
  }

  myfile.getline (buffer,80);
  std::cout << buffer << " successfully read from file.\n";

  return 0;
}

The next example converts a string of input to integers.

#include <sstream>
#include <vector>
#include <iostream>

using namespace std;

int main()
{
    string Digits("11 22 33");
    stringstream ss(Digits);
    string Temp;
    vector<string>Tokens;

    while(ss >> Temp)
        Tokens.push_back(Temp);

    if (ss.rdstate() != 0) {
        cout << "not goodbit" << endl;
    } else cout << "goodbit" << endl;

    // When the stream extracts the last of the three numbers "11 22 33", eofbit is set;
    // the next (failed) extraction ends the loop and sets failbit as well.

    // While these flags are set, every further read operation fails, so we have to
    // reset the status flags and bring the stream back into a good state (goodbit).
    // After clearing the flags and resetting the string, we can go on extracting the integers.

    ss.clear(); // clear the flags; it is not needed for the next line (ss.str(Tokens[0])), 
                // but for the line of ss >> Num;
                // Without this line, building the program is OK but the output of Num is 0.

    if (ss.rdstate() != 0) {
        cout << "not goodbit" << endl;
    } else cout << "goodbit" << endl;

    ss.str(Tokens[0]);
    cout << ss.str() << endl;

    int Num = 0;
    ss >> Num;
    cout << Num << endl;
}
// Output:
// not goodbit
// goodbit
// 11
// 11

And an example to verify if the string is an integer (by converting a string to a stringstream, and then use the ">>" extract operator)

#include <sstream>
#include <iostream>

using namespace std;

int main() {

std::stringstream ss;
std::string input = "a b c 4 e";
ss << input;
if (ss.rdstate() != 0) {
        cout << "not goodbit" << endl;
    } else cout << "goodbit" << endl;

int found;
std::string temp;

while(std::getline(ss, temp,' ')) {
    if(std::stringstream(temp)>>found)
    {
        std::cout<<found<<std::endl;
    }
}

if (ss.rdstate() != 0) {
        cout << "not goodbit" << endl;
    } else cout << "goodbit" << endl;

return 0;
}
// Output:
// goodbit
// 4
// not goodbit

getline() function to extract

istream/ostream, iostream and fstream headers

See 4.1.2 the standard-library headers and namespace in The C++ Programming Language. The standard library is defined in a namespace called std.

  1. iostream includes cin, cout, cerr, clog, istream and ostream
  2. fstream includes ifstream and ofstream. These two are useful for file i/o. Note ostream is termed a base class and ofstream class is based on it. ofstream is termed a derived class.

(10.5.2 in Accelerated C++) It is possible to use an ifstream wherever the library expects an istream and an ofstream wherever the library expects an ostream, because ifstream is derived from istream and ofstream is derived from ostream.

Read from console --- std::cin or std::istream

We need to include the header file <iostream>.

Create a vector

#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<double> temps;         // temperatures
    double temp;
    while (cin >> temp)             // read 
        temps.push_back(temp);    // put into vector
    cout << "length of the vector is " << temps.size() << endl;
    for (const auto tmp : temps)
       cout << tmp << " ";
    cout << "\n";
}

Hit the 'Enter' key first and then use 'ctrl+d' to end the input.

$ g++ -std=c++11 foo.cpp
$ cat > input.txt
4 3 2 1 9.0
$ ./a.out < input.txt
length of the vector is 5
4 3 2 1 9

Another example is <main1.cc> from Accelerated C++ Chapter 4. The function read_hw() takes a std::istream& parameter instead of using std::cin directly, so it can read from the console or from any other input stream. Its return type is also std::istream&.

#include <iostream>   // std::cin, std::istream
#include <vector>     // std::vector
using std::cin;
using std::istream;
using std::vector;

istream& read_hw(istream& in, vector<double>& hw) {
  ...
}

int main() {
  ...
  vector<double> homework;
  read_hw(cin, homework);
  ...
}

Create a map (key-index vector) object (word count)

This is example code from Chapter 21.6.1 "Map" of PPP.

#include <iostream>
#include <map>
#include <string>

using namespace std;

//------------------------------------------------------------------------------

int main()
{
    map<string,int> words;    // keep (word,frequency) pairs

    string s;
    while (cin>>s) ++words[s];  // reads every whitespace-separated word on input    
                                // note: words is subscripted by a string
                                    
    typedef map<string,int>::const_iterator Iter;
    for (Iter  p = words.begin(); p!=words.end(); ++p)
        cout << p->first << ": " << p->second << '\n';
}
$ ./chapter.21.6.1.exe
b b c c a ab
ab
a: 1
ab: 2
b: 2
c: 2

The words are printed in sorted order because std::map keeps its elements ordered by key.

Directory

Check if a directory exists or not

http://stackoverflow.com/questions/3828192/checking-if-a-directory-exists-in-unix-system-call

#include <sys/stat.h>

struct stat sb;

if (stat(pathname, &sb) == 0 && S_ISDIR(sb.st_mode))
{
    // ...it is a directory...
}

Create a directory in Linux

http://codeyarns.com/2014/08/07/how-to-create-directory-using-c-on-linux/

#include <sys/stat.h>
#include <stdio.h>    // printf
#include <stdlib.h>   // exit

const int dir_err = mkdir("foo", S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH);
if (-1 == dir_err)
{
    printf("Error creating directory!\n");
    exit(1);
}

Note that this does not create parent directories.

List files in a directory

http://www.linuxquestions.org/questions/programming-9/c-list-files-in-directory-379323/

File

Delete a file

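Use C library stdio; i.e. include <stdio.h> or <cstdio>. See also remove. Note that according to The Linux Programming Interface book, the remove() function removes a file or an empty directory. This is POSIX behavior (remove() calls unlink() for files and rmdir() for directories); the C standard only specifies it for files, which may be why the empty directory part is not mentioned on the website.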

#include <iostream>
#include <stdio.h>

int main() {
	remove("myfile.txt");
	return 0;
}

A convenient way to see the manual is to use man remove.

On Linux,

REMOVE(3)                Linux Programmer's Manual                  REMOVE(3)

NAME
       remove - remove a file or directory

SYNOPSIS
       #include <stdio.h>

       int remove(const char *pathname);

On Mac,

REMOVE(3)                BSD Library Functions Manual                REMOVE(3)

NAME
     remove -- remove directory entry
SYNOPSIS
     #include <stdio.h>

     int
     remove(const char *path);

Reading a table file

Line ending difference between DOS and UNIX text file

The DOS text file has two ending characters (CR + LF) while UNIX text file has only 1 ending character (LF). This may create two answers when we want to count the number of columns in a row or compare the elements from the last column.

For example, if a text file is created from DOS, it will look like below on Linux OS.

"ID	1	2	3	4	5	6	7	8	9	10
" 

In this case, the last element is read in as 10[CR] instead of 10 on Linux. So if you need to run a string/character comparison, you may not be able to get what you want.

But if we remove the trailing \r character ( tr -d '\r' < INPUT > OUTPUT), we will obtain

"ID	1	2	3	4	5	6	7	8	9	10"

Interestingly, if we use winscp to transfer text files from Linux to Windows, it automatically adds the \r character to text files.

  • If we want to convert files using dos2unix/unix2dos, we just need to specify the input file. By default, the input file will be overwritten.

dos2unix inputoutputfile
unix2dos inputoutputfile

  • Normally when we create a file, the last line ends with a LF character (or CR + LF). So when we use the 'cat' command, the output is normal.

brb@brb-T3500:~/Downloads$ cat combobox.txt
ID	1	2	3
1	-0.80	-0.30	-0.41
brb@brb-T3500:~/Downloads$

But if we remove the last line's LF character (eg using Windows Notepad or the geany editor), the shell prompt follows the last line directly:

brb@brb-T3500:~/Downloads$ cat combobox.txt
ID	1	2	3
1	-0.80	-0.30	-0.41brb@brb-T3500:~/Downloads$

Count number of lines in a text file --- std::ifstream & getline()

We need to include the header <fstream>.

http://stackoverflow.com/questions/3482064/counting-the-number-of-lines-in-a-text-file

Method 1. Succinct

#include <iostream>     // std::cout
#include <fstream>      // std::ifstream
#include <string>       // std::getline
using namespace std;
int main() { 
    int number_of_lines = 0;
    std::string line;
    ifstream fin;
    fin.open("combobox.txt"); // assume the file uses tab as delimiter

    while (std::getline(fin, line))
        ++number_of_lines;
    std::cout << "Number of lines in text file: " << number_of_lines;
    return 0;
}

Very reliable. Don't need to worry about an empty line at the end of the file.

Method 2. Good, but lengthy

#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main() { 
    int number_of_lines = 0;
    std::string line;
    ifstream fin;
    fin.open("combobox.txt"); // assume the file uses tab as delimiter

    std::getline(fin, line);
    while (fin) 
    {
	  ++number_of_lines;
	  std::getline(fin, line);
    }
    std::cout << "Number of lines in text file: " << number_of_lines;
    return 0;
}

Method 3. Not good. Need extra correction

#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main() { 
    int number_of_lines = 0;
    std::string line;
    ifstream fin;
    fin.open("combobox.txt"); // assume the file uses tab as delimiter

    while (fin.good()) 
    {
	  ++number_of_lines;
	  std::getline(fin, line);
    }    
    --number_of_lines; // need an extra step
    std::cout << "Number of lines in text file: " << number_of_lines;
    return 0;
}

Read a text file with one row only (using getline())

#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main() 
{
  ifstream fin;
  fin.open("combobox.txt"); // assume the file uses tab as delimiter
  if (fin.is_open() == false)
  {
    cout << "Can't open file combobox.txt. Bye." << endl;
    return 1;
  }
  string item;
  int count = 0;
  getline(fin, item, '\t');
  while (fin)
  {
    ++count;
    cout << count << ": " << item << endl;
    getline(fin, item, '\t');
  }
  cout << "Done\n";
  fin.close();
}

If the input file 'combobox.txt' looks like this (the [LF] is the line feed, a hidden character),

ID	1	2	3[LF]
1	-0.80	-0.30	-0.41[LF]

the output will look like

1: ID
2: 1
3: 2
4: 3
1
5: -0.80
6: -0.30
7: -0.41

Done

Explanation

  • After the element 2, the next element is 3[LF]1, because only the tab is used as a delimiter. So the output looks a little strange.
  • After the element -0.30, the next element is -0.41[LF]. Thus there is an extra blank line there.
  • The output is the same if the end of line is in DOS format (CR + LF).

Read a text file with multiple columns --- std::stringstream and getline()

With the following examples, we can count the number of columns and number of rows of a text file.

Solution 1.

#include <iostream>
#include <fstream>
#include <string>
#include <sstream> // stringstream
#include <stdlib.h>  /* exit, EXIT_FAILURE */
using namespace std;
int main() 
{
  ifstream fin;
  fin.open("combobox.txt"); // step 1. open a file
  if (fin.is_open() == false)
  {
    cout << "Can't open file. Bye." << endl;
    exit(EXIT_FAILURE);
  }
  string line, item;
  int ncol = 0, nrow = 0;
  stringstream iss;
  while(getline(fin, line)) {  // step 2. extract a line ("\n" is the default delimiter) from a (file) stream
    nrow++;                  //         accumulate row number.
    iss << line;             // step 3. insert a string to a stringstream  
    // Assume data is tab delimited
    // http://www.cplusplus.com/reference/string/string/getline/
    while (getline(iss, item, '\t')) { // step 4. extract a string from a stringstream
        if (nrow == 1 ) ++ncol;        //         accumulate column number. 
        cout << item << endl;
    }
    iss.clear();              // step 5.  clear eofbit/failbit so that iss can be reused for the next line
  }
  cout << "There are " << nrow << " rows and " << ncol << " columns\n";
  fin.close();
}

If we want to assign the column names to an array of strings and the elements to a 2D string array, we need to determine the dimensions and then declare the variables first.

col_names = new string[ncol];
for(int i=0;i<ncol;i++){
    getline(iss, col_names[i], '\t');
}

fin.clear();
fin.seekg(0, ios::beg);
element = new string *[nrow];
stringstream iss;
string str;
for(int i=0;i<nrow;i++){
    getline(fin, line);
    iss << line;
    element[i] = new string[ncol];
    for(int j=0;j< ncol;j++){
        if(getline(iss, str, '\t') )	{
	    element[i][j]=str;
        }else{
	    element[i][j]=string("");
	}
    }
    iss.clear();
}

Solution 2

typedef vector<vector<string> > Rows;
Rows rows;
ifstream input("filename.csv");
char const row_delim = '\n';
char const field_delim = '\t';
for (string row; getline(input, row, row_delim); ) {
  rows.push_back(Rows::value_type());
  istringstream ss(row);
  for (string field; getline(ss, field, field_delim); ) {
    rows.back().push_back(field);
  }
}

reset position to the beginning of file

    ifstream orderfile;
    orderfile.open (fullpath);

    std::string line;
    long total_count=0;

    while (orderfile.good()){
        getline(orderfile, line);
        if (!line.empty()) total_count++;
    }

    orderfile.clear();
    orderfile.seekg(0,ios::beg);
    getline(orderfile, line);

Write a Function to Contain an Argument of Output Device (eg file, screen)

See Listing 8.8 <filefunc.cpp> in C++ Primer Plus (5th ed). It uses ostream & as the class of output device type in the function definition. For the main function, we can use objects of class ofstream or ostream (such as cout). Recall, ostream is a base class and ofstream is a derived class.

The program teaches

  1. use reference (ostream &) as a function argument to refer to an ostream object such as cout (#include <iostream>) and to an ofstream (#include <fstream>) object.
  2. how ostream formatting methods such as precision(), setf() and width() can be used for both types.

int main() {
    ofstream fout;
    ...
    file_it(fout, objective, eps, LIMIT);
    ...
}

void file_it(ostream & os, double fo, const double fe[],int n) {}

Another example <analysis.gcc> can be found in Chapter 6 of Accelerated C++.

using std::cout;
int main() {
    ...
    write_analysis(cout, "median", median_analysis, did, didnt);
    ...
}    

void write_analysis(ostream& out, const string& name,
                    double analysis(const vector<Student_info>&),
                    const vector<Student_info>& did,
                    const vector<Student_info>& didnt) {}

Sorting only

#include <iostream>
#include <string>
#include <iterator>
#include <algorithm>

int main() {

   std::string obj[4] = {"fine", "ppoq", "tri", "get"};
   std::sort(obj, obj + 4);
   std::copy(obj, obj + 4, std::ostream_iterator<std::string>(std::cout, "\n"));
// And for vector
// #include <vector>
// std::vector<std::string> stringarray;
// std::sort(stringarray.begin(), stringarray.end());
}

and

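#include <stdlib.h>   // qsort
#include <string.h>   // strcmp
#include <iostream>
#include <iterator>   // std::ostream_iterator
#include <algorithm>  // std::copy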

int compare_cstr(const void* c1, const void* c2) 
{ 
   return strcmp(*(const char**)(c1), *(const char**)(c2)); 
}

int main() {

   const char* obj[4] = {"fine", "ppoq", "tri", "get"};
   qsort(obj, 4, sizeof(obj[0]), compare_cstr);
   std::copy(obj, obj + 4, std::ostream_iterator<const char*>(std::cout, "\n"));
}

Note: The use of ostream_iterator<string>(cout, "\n") was explained & used in Accelerated C++ Chapter 8.3 (Input and output iterators) & 8.4. It was also explained in cplusplus.com.

Return permutation (R's order() function) using 3 approaches

Good example. This uses a lambda expression from C++11 (formerly called C++0x), but it can be replaced with a simple functor object.

#include <vector>
#include <algorithm>
#include <iostream>

template<class Vals>
void sortingPermutation(const Vals& values, std::vector<int>& v){
  int size = values.size(); 
  v.clear(); v.reserve(size);
  for(int i=0; i < size; ++i)
    v.push_back(i);

  std::sort(v.begin(), v.end(), [&values](int a, int b) -> bool { 
    return values[a] < values[b];
  });
}

int main()
{
    std::vector<double> values;
    values.push_back(24);
    values.push_back(55);
    values.push_back(22);
    values.push_back(1);

    std::vector<int> permutation;
    sortingPermutation(values, permutation);

    typedef std::vector<int>::const_iterator I;
    for (I p = permutation.begin(); p != permutation.end(); ++p)
        std::cout << *p << " ";
    std::cout << "\n";
}

This will return values 3, 2, 0, 1. In fact, the code is so general: if I add #include <string> and change double to std::string in the declaration of values, the code works for string data type.

Method 2. You can use std::sort to sort the list of (value, index) pairs {(24, 0), (55, 1), (22, 2), (1, 3)}.

#include <vector>
#include <algorithm>
#include <utility>

typedef std::pair<double, int> Pair;

struct CmpPair
{
    bool operator()(const Pair& a, const Pair& b)
    { return a.first < b.first; }
};

void sortingPermutation(
    const std::vector<double>& values,
    std::vector<int>& permutation)
{
    std::vector<Pair> pairs;
    for (int i = 0; i < (int)values.size(); i++)
        pairs.push_back(Pair(values[i], i));

    std::sort(pairs.begin(), pairs.end(), CmpPair());

    typedef std::vector<Pair>::const_iterator I;
    for (I p = pairs.begin(); p != pairs.end(); ++p)
        permutation.push_back(p->second);
}

#include <iostream>

int main()
{
    std::vector<double> values;
    values.push_back(24);
    values.push_back(55);
    values.push_back(22);
    values.push_back(1);

    std::vector<int> permutation;
    sortingPermutation(values, permutation);

    typedef std::vector<int>::const_iterator I;
    for (I p = permutation.begin(); p != permutation.end(); ++p)
        std::cout << *p << " ";
    std::cout << "\n";
}

This will give the same result as above. Unlike Method 1, this version is not a template: to use it with the string data type, add #include <string> and change double to std::string in the Pair typedef, in the signature of sortingPermutation(), and in the declaration of values. See also c version http://stackoverflow.com/questions/2804493/finding-unique-elements-in-an-string-array-in-c.

Method 3. Create a vector of ints 0..N-1 and then sort that vector with a comparison function that compares the corresponding elements of the vector you're trying to find the sorted permutation of. Something like:

#include <iostream>
#include <algorithm>
#include <vector>
 
template<class T> class sorter {
    const std::vector<T> &values;
public:
    sorter(const std::vector<T> &v) : values(v) {}
    bool operator()(int a, int b) { return values[a] < values[b]; }
};

template<class T> std::vector<int> order(const std::vector<T> &values)
{
    std::vector<int> rv(values.size());
    int idx = 0;
    for (std::vector<int>::iterator i = rv.begin(); i != rv.end(); i++)
        *i = idx++;
    std::sort(rv.begin(), rv.end(), sorter<T>(values));
    return rv;
} 
 
int main()
{
    std::vector<double> values;
    values.push_back(24);
    values.push_back(55);
    values.push_back(22);
    values.push_back(1);
 
    std::vector<int> permutation;
    permutation = order(values);
 
    typedef std::vector<int>::const_iterator I;
    for (I p = permutation.begin(); p != permutation.end(); ++p)
        std::cout << *p << " ";
    std::cout << "\n";
}

This also gives the same result. In fact, the code is so general: if I add #include <string> and change double to std::string in the declaration of values, the code works for string data type.

Infinity value

http://en.cppreference.com/w/cpp/types/numeric_limits/infinity

    double max = std::numeric_limits<double>::max();
    double inf = std::numeric_limits<double>::infinity();
    if(inf > max)
        std::cout << inf << " is greater than " << max << '\n';

Scope

Method 1: Created object cannot be used in other places.

class aClass {
public:
  aClass();
  void foo();
};
aClass::aClass() {
   bClass *obj = new bClass; // obj is a local variable of the constructor
}
void aClass::foo() {
   obj->myfunction(); // Won't work!! obj is not declared in this scope
}

Method 2: Created object can be used within the class.

class aClass {
public:
  aClass();
  void foo();
private:
  bClass *obj;   // declared as a data member, so all member functions can use it
};
aClass::aClass() {
   obj = new bClass;
}
void aClass::foo() {
   obj->myfunction();
}

C++11

To compile code that uses C++11 features (gcc 4.7 and up), pass the -std=c++11 or -std=c++0x option to g++:

g++ -std=c++11 MYCODE.cc

regex

regex_match() function

#include <algorithm>
#include <iostream>
#include <string>
#include <regex>
int main()
{
    if (std::regex_match("yx12", std::regex("(.*x1.*)")))
        std::cout << "string yx12 contains x1.\n";
    if (std::regex_match("yx12", std::regex("(.*x2.*)")))
        std::cout << "string yx12 contains x2.\n";
    if (std::regex_match("yx12", std::regex("(.*yx1.*)")))
        std::cout << "string yx12 contains yx1.\n";
}

See this post to match a string that does not contain some string.

Lambda functions

A lambda function is essentially an anonymous function (a function without a name) that’s defined inline.

A simple example from an article from oracle.com.

#include <stdlib.h>
#include <stdio.h>
#include <algorithm>

int main()
{
  int a[10] = { 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 };
  std::sort( a, &a[10], [](int x, int y){ return x < y; } );
  for(int i=0; i<10; i++) { printf("%i ", a[i]); }
  printf("\n");
  return 0;
}

STL vector vs C++ new

Basic STL vector

Vectors examples:

#include <vector>
using std::vector;

vector<int> ivec;             // ivec holds objects of type int
vector<Sales_item> Sales_vec; // holds Sales_items
vector<vector<string>> file;  // vector whose elements are vectors

// Initialize vectors
vector<int> ivec;             // initially empty
vector<int> ivec2(ivec);      // copy elements of ivec into ivec2
vector<int> ivec3 = ivec;     // copy elements of ivec into ivec3
vector<string> svec(ivec2);   // error: svec holds strings, not ints
vector<string> articles = {"a", "an", "the"};
vector<int> ivec(10, -1);       // ten int elements, each initialized to -1
vector<string> svec(10, "hi!"); // ten strings; each element is "hi!"
vector<int> ivec(10);    // ten elements, each initialized to 0
vector<string> svec(10); // ten elements, each an empty string
vector<int> v1(10);    // v1 has ten elements with value 0
vector<int> v2{10};    // v2 has one element with value 10
vector<int> v3(10, 1); // v3 has ten elements with value 1
vector<int> v4{10, 1}; // v4 has two elements with values 10 and 1

// Add an element to a vector
vector<int> v2;         // empty vector
for (int i = 0; i != 100; ++i)
    v2.push_back(i);    // append sequential integers to v2
                  // at end of loop v2 has 100 elements, values 0 . . . 99

// vector operators
// v.empty()
// v.size()
// v.push_back(t)
// v[n]
// v1 = v2
// v1 = {a, b, c}
// v1 == v2
// v1 != v2
// <, <=, >, >=
vector<int> v{1,2,3,4,5,6,7,8,9};
for (auto &i : v)     // for each element in v (note: i is a reference)
    i *= i;           // square the element value
for (auto i : v)      // for each element in v
    cout << i << " "; // print the element
vector<int> ivec;   // empty vector
for (decltype(ivec.size()) ix = 0; ix != 10; ++ix)
    ivec.push_back(ix);  // ok: adds a new element with value ix

Iterator examples:

auto b = v.begin(), e = v.end(); // begin and end are two iterator members. 
                                 // Return type is an iterator.
// standard container iterator operators
// *iter         returns a reference to the element denoted by the iterator iter
// iter->mem
// ++iter
// --iter
// iter1 == iter2
// iter1 != iter2
string s("some string");
if (s.begin() != s.end()) { // make sure s is not empty
    auto it = s.begin();    // it denotes the first character in s
    *it = toupper(*it);     // make that character uppercase
}
// changed the case of the first word in a string to use iterators instead
// process characters in s until we run out of characters or we hit a whitespace
for (auto it = s.begin(); it != s.end() && !isspace(*it); ++it)
    *it = toupper(*it); // capitalize the current character
// Combining Dereference and Member Access
(*it).empty() // dereferences it and calls the member empty on the resulting object
// Iterator arithmetic
// iter + n
// iter - n
// iter1 += n
// iter1 -= n
// iter1 - iter2
// >, >=, <, <=
// compute an iterator to the element closest to the midpoint of vi
auto mid = vi.begin() + vi.size() / 2;

convert a C-style array to a c++ vector

#include <vector>

int data[] = { 1, 2, 3, 4, 4, 3, 7, 8, 9, 10 };
std::vector<int> v(data, data+10);

WINVER

http://stackoverflow.com/questions/1439752/what-is-winver WINVER determines the minimum platform SDK required to build your application, which in turn will determine at compile time which routines are found by the headers.

You can use this to verify, at compile time, that your application will work on Windows 2000 (0x0500), for example, or on Windows XP (0x0501).

This was used in win32 disk imager program.

Period

What is the difference between the dot (.) operator and -> in C++?

foo->bar() is the same as (*foo).bar().

Member Access Operators: . and -> from msdn.microsoft.com

Protected vs private members

See here or C++ FAQ

Private members are only accessible within the class defining them.

Protected members are accessible in the class that defines them and in classes that inherit from that class.

Both are also accessible by friends of their class, and in the case of protected members, by friends of their derived classes.

Use whatever makes sense in the context of your problem. You should try to make members private whenever you can to reduce coupling and protect the implementation of the base class, but if that's not possible then use protected members. Check C++ FAQ Lite for a better understanding of the issue. This question about protected variables might also help.

alignment of pointers

https://stat.ethz.ch/pipermail/r-devel/2013-August/067314.html

Find sample quantiles (percentiles) using STL

The STL algorithms used below are defined in the <algorithm> and <numeric> headers.

#include <iostream>
#include <numeric>      // accumulate
#include <algorithm>    // std::sort, std::nth_element
#include <vector>       // std::vector
using namespace std;

int main()
{
  int grades[]={89, 74, 89, 63, 100};
  
  size_t elements=sizeof(grades)/sizeof(grades[0]);
  double res= accumulate(grades, grades+ elements, 0)/double(elements);
  std::cout << "Raw data:";
  for(int i=0; i< elements; i++) std::cout << ' ' << grades[i];
  std::cout << '\n';
  cout << res << " is the mean" << endl << endl;
    
  // the iterator constructor can also be used to construct from arrays:  
  std::vector<int> myvector (grades, grades + elements);  
  std::sort(myvector.begin(), myvector.end());
  std::cout << "myvector contains:";
  for (std::vector<int>::iterator it = myvector.begin(); it!=myvector.end(); ++it)
    std::cout << ' ' << *it;
  std::cout << '\n';
  double median= *(myvector.begin()+myvector.size()/2); //89
  cout<<median << " is the median" << endl  << endl;

  std::vector<int> myvector2 (grades, grades + elements);  
  // None of the elements preceding nth (25-th perct) are greater than it, 
  // and none of the elements following it are less.
  nth_element(myvector2.begin(),
              myvector2.begin()+ (int)(myvector2.size()*0.25), 
              myvector2.end());
  std::cout << "myvector2 contains:";
  for (std::vector<int>::iterator it = myvector2.begin(); it!=myvector2.end(); ++it)
     std::cout << ' ' << *it;
  std::cout << '\n';            
  // This method will select one data as the percentile!  
  int p_25= *(myvector2.begin()+ (int)(myvector2.size()*.25)); 
  cout<<p_25<< " is the 25th percentile" << endl;

  return 0;
}

The output looks like (note in myvector2 data preceding 74 is less than 74 and data following 74 is greater than 74)

$ g++ example.cpp 
$ ./a.out
Raw data: 89 74 89 63 100
83 is the mean

myvector contains: 63 74 89 89 100
89 is the median

myvector2 contains: 63 74 89 89 100
74 is the 25th percentile
$

Calculate execution time

http://stackoverflow.com/questions/5248915/execution-time-of-c-program. The following code is from Programming Arduino Next Steps. It took .04 seconds on Xeon W3690 @ 3.47GHz and 28 seconds on Arduino Uno.

#include <stdio.h>
#include <time.h>  

int main()
{
  printf("\nStarting Test\n");
  clock_t startTime = clock();   // clock() returns clock_t, not time_t
  
  // test code here
  long  i = 0;
  long j = 0;
  for (i = 0; i < 20000000; i ++)
  {
    j = i + i * 10;
    if (j > 10) j = 0;
  }
  // end of test code
  clock_t endTime = clock();
  
  printf("%ld\n", j); // prevent loop being optimized out
  printf("Finished Test\n");
  double timeSpent = (double)(endTime - startTime) / CLOCKS_PER_SEC;
  printf("seconds taken: %f\n", timeSpent); 
  return 0;
}

Time the iterations from 0 to 2147483647

#include <ctime>      // clock(), clock_t, CLOCKS_PER_SEC
#include <iostream>
using namespace std;

int main() {
    /* Purpose: time the iterations from 0 to
     * the maximum positive integer.
     * See
     * http://en.cppreference.com/w/cpp/language/types
     * http://stackoverflow.com/questions/728068/how-to-calculate-a-time-difference-in-c
     */
    const clock_t begin_time = clock();
    int flag=0;
    for(int i=0; i<2147483647; i++)
      if (!(i%50)) flag++;
    std::cout << float( clock () - begin_time ) /  CLOCKS_PER_SEC;
    cout << "\nflag " <<  flag << endl;

    int flag2 = 2147483647;
    cout << "\nflag2=" << flag2 <<endl;
    flag2++;  // signed overflow is undefined behavior in C++; the wrap-around shown below is not guaranteed
    cout << "\nflag2 + 1=" << flag2 << endl;
    flag2++;
    cout << "\nflag2 + 2=" << flag2 << endl;
    return 0;
}

On Xeon(R) CPU E5-1650 0 @ 3.20GHz, it took about 7 seconds and on my UDOO Dual, it took about 1 minute. On Phenom(tm) II X6 1055T, it took 9 seconds. On Raspberry Pi 2, it took 97 seconds. On ODroid xu4, it took 13 seconds.

brb@T3600 /tmp $ g++ tmp.cpp; ./a.out
6.83841
flag 42949673

flag2=2147483647

flag2 + 1=-2147483648

flag2 + 2=-2147483647

execute a command line command from a C++ program

#include <stdlib.h>

int main() {
  system("cp ~/Downloads/testInt.cpp ~/Downloads/testInt2.cpp");
}
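
// To capture the command's output, use popen() (POSIX) instead of system()
#include <stdio.h>
#include <iostream>
#include <string>

int main() {
  FILE* pipe = popen("your shell command here", "r");
  if (!pipe) {
    std::cerr << "popen error" << std::endl;
    return 1;
  }
  char buffer[128];
  std::string result;
  // fgets() returns NULL at end of input, so no separate feof() test is needed
  while (fgets(buffer, sizeof(buffer), pipe) != NULL)
    result += buffer;
  pclose(pipe);
  // strip the trailing newline, if any
  if (!result.empty() && result[result.size() - 1] == '\n')
    result.erase(result.size() - 1);
  std::cout << result << std::endl;
}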

Check OS

http://stackoverflow.com/questions/142508/how-do-i-check-os-with-a-preprocessor-directive

#ifdef _WIN32
 ...
#else
 ...
#endif

C libraries

cctype

cerrno

climits

cmath

cstddef

cstdio

cstdlib

cstring

ctime

C++ standard libraries

STL and c++ standard library

  • C++ standard library from wikipedia.
  • STL vs C++ standard library. STL was written in the days long before C++ was standardized. Parts of the C++ Standard Library were based on parts of the STL, and it is these parts that many people (including several authors and the notoriously error-ridden cplusplus.com) still refer to as "the STL". However, this is inaccurate; indeed, the C++ standard never mentions "STL", and there are content differences between the two.

Containers

Four categories

  1. Sequence containers: vector, array, deque, list, forward_list.
  2. Associative containers: map, set, multimap, multiset.
  3. Unordered associative containers: unordered_map, unordered_set, unordered_multimap, unordered_multiset.
  4. Container adaptors: stack, queue, priority_queue.

The most important containers are vector, list and map.

Common operations

See Accelerated C++ 5.9.

container<T>::iterator 
container<T>::const_iterator
container<T>::size_type

c.begin()
c.end()
container<T> c;
container<T> c(c2);

container<T> c(n);
container<T> c(n, t);

c=c2;

c.size()
c.empty()
c.insert(d, b, e)

c.erase(it)
c.erase(b, e);    // remove [b, e)

c.push_back(t)

c[n]

// iterator
*it
(*it).x
it->x

++it;

b == e;
b != e;

// string type
s.substr(i, j)
getline(is, s)
s += s2;

// vector
v.reserve(n)
v.resize(n)

// list
l.sort()
l.sort(cmp)

// <cctype> header
isspace(c)
isalpha(c)
isdigit(c)
isalnum(c)
isupper(c)
islower(c)
toupper(c)
tolower(c)

vector

http://www.cplusplus.com/reference/vector/vector/

push_back() method

http://www.cplusplus.com/reference/vector/vector/push_back/

The vector length does not have to be determined beforehand. The push_back() method can be used to append an element to the end of a vector object.

clear() method

Removes all elements from the vector (which are destroyed), leaving the container with a size of 0. See http://www.cplusplus.com/reference/vector/vector/clear/.

resize() method

See an application of transform.

Vector of own type
  • See Accelerated C++ Chapter 4. See the <main2.cc> program, a standalone program that does not depend on other source files. An example of its output:
brb@T3600 ~/github/C/accelerated_unix/chapter04 $ ./main2
Taylor 80 90
1 2 3
Jones 90 80
3 2 1
Jones  50.8
Taylor 52.8
size_type
vector<int>::size_type x;

size_type is a (static) member type of the type vector<int>. Usually, it is a typedef for std::size_t, which itself is usually a typedef for unsigned int or unsigned long long.

Declare x as a variable of a type suitable for holding the size of a vector

For example,

string::size_type width(const vector<string>& v)
{
	string::size_type maxlen = 0;
#ifdef _MSC_VER
	for(std::vector<string>::size_type i = 0; i != v.size(); ++i)
#else
	for(vector<string>::size_type i = 0; i != v.size(); ++i)
#endif
		maxlen = max(maxlen, v[i].size());
	return maxlen;
}
square bracket operator

For example,

const vector<double> v(3, 1.5);  // three elements, so v[1] exists
double d = v[1];

The square bracket operation is actually doing

double d1 = v.operator[](1);
2 dimensional matrix

The first example uses resize() method to specify the number of rows and columns.

#include <vector>
using std::vector;

#define HEIGHT 5
#define WIDTH 3

int main() {
  vector<vector<double> > array2D;

  // Set up sizes. (HEIGHT x WIDTH)
  array2D.resize(HEIGHT);
  for (int i = 0; i < HEIGHT; ++i)
    array2D[i].resize(WIDTH);

  // Put some values in
  array2D[1][2] = 6.0;
  array2D[3][1] = 5.5;

  return 0;
}

The second example (Method 2 below) needs to use push_back() twice. So it is probably less efficient.

#include <vector>
using std::vector;

int main () 
{
    /**************
        1   2   3
        4   5   6
    ***************/
    // Method 1
    const int ROW = 2;
    const int COL = 3;
    int array1[ROW][COL];
    for(int i=0; i<ROW; i++)
        for(int j=0; j<COL; j++)
            array1[i][j] = i*COL+j+1;

    // Method 2
    typedef vector<vector<int> > ARRAY; 
    ARRAY array2;
    vector<int> rowvector;
    for(int i=0; i<ROW; i++)
    {
        rowvector.clear();
        for(int j=0; j<COL; j++)
            rowvector.push_back(i*COL+j+1);
        array2.push_back(rowvector);
    }
    return 0;
}

The first example also works on string type.

#include <iostream>
#include <vector>
#include <string>

using namespace std;

void foo1(vector<vector<string>> &vs) {
    vs.resize(2);
    for(int i=0; i < 2; ++i)
        vs[i].resize(4);
    vs[0][0] = "This";
    vs[0][1] = "Is";
    vs[0][2] = "A";
    vs[0][3] = "Book";
    vs[1][0] = "That";
    vs[1][1] = "Is not";
    vs[1][2] = "An";
    vs[1][3] = "Apple";
}

void foo2(vector<vector<string>> &vs) {
    cout << vs.size() << endl;
    for(unsigned int i=0; i < vs.size(); ++i) cout << "row " << i << ", size " << vs[i].size() << endl;
    for(unsigned int i=0; i < vs.size(); ++i)
    {
        for(unsigned int j=0; j < vs[i].size(); ++j)
        {
            cout << vs[i][j] << " ";
        }
        cout << endl;
    }
}

int main()
{
    vector<vector<std::string>> vs;

    foo1(vs);
    foo2(vs);
    cout << endl;

    return 0;
}
Comparison of a C++ array and C++ vector

See the example from PPP Chapter 20.1.

#include <vector>
using std::vector;

double* get_from_jack(int* count);  // jack puts doubles into an array
                                    // and returns the number of elements in *count
vector<double>* get_from_jill();    // Jill fills the vector

int main()
{
    int jack_count = 0;
    double* jack_data = get_from_jack(&jack_count);    
    vector<double>* jill_data = get_from_jill();    
    // ... Process ...
    // jack_data[i]     ---- value
    // &jack_data[i]    ---- address
    //
    // (*jill_data)[i]  ---- value, dereference the pointer first
    // &(*jill_data)[i] ---- address, dereference the pointer first
    //
    // note *jill_data[i] is not what we want; that means *(jill_data[i])
    //
    delete[] jack_data;
    delete jill_data;
}

double* get_from_jack(int* count)
{
    if (!count)
        return 0;
    const int n = 10;
    double* arr = new double[n];
    if (arr)  {
        *count = n;

        for (int i = 0; i < n; ++i)
            arr[i] = i;
    }
    return arr;
}

vector<double>* get_from_jill()
{
    const int n = 10;
    vector<double>* arr = new vector<double>(n);
    if (arr)
    {
        for (int i = 0; i < n; ++i)
            (*arr)[i] = i;
    }
    return arr;
}

list

(From The C++ PL) A list is a doubly-linked list. We use a list for sequences where we want to insert and delete elements without moving other elements.

(From Accelerated C++) Just as vectors are optimized for fast random access, lists are optimized for fast insertion and deletion anywhere within the container. Because lists have to maintain a more complicated structure, they are slower than vectors if the container is accessed only sequentially. That is, if the container grows and shrinks only or primarily from the end, a vector will outperform a list. However, if a program deletes many elements from the middle of the container, then lists will be faster for large inputs. ... lists and vectors share many operations. As a result, we can often translate programs that operate on vectors into programs that operate on lists, and vice versa.

One key operation that vectors support, but lists do not, is indexing. But if we use iterators instead of indices, the two types become even more similar. See 5.5.1 for some important differences between these two.

list<Student_info> extract_fails(list<Student_info>& students)
{
  list<Student_info> fail;
  list<Student_info>::iterator iter = students.begin();
  while (iter != students.end()) {
     if (fgrade(*iter)) {
         fail.push_back(*iter);
         iter = students.erase(iter);
     } else 
         ++iter;
  }
  return fail;
}

(From The C++ PL) When we use a linked list, we tend not to access elements using subscripting the way we do for vectors. Instead, we might search the list looking for an element with a given value.

struct Entry {
   string name;
   int number;
};

list<Entry> phone_book = {
  {"David Hume", 123456},
  {"Karl Popper", 234547},
  {"Bert Arthur", 345678}
};

int get_number(const string& s)
{
  for (const auto& x : phone_book)
    if (x.name == s)
       return x.number;
  return 0; // use 0 to represent 'number not found'
}
// OR using the iterator
int get_number(const string& s)
{
  for (auto p = phone_book.begin(); p!= phone_book.end(); ++p) 
      if (p->name == s)
          return p->number;     
  return 0;
}

// To delete or insert an element
void f(const Entry& ee, list<Entry>::iterator p, list<Entry>::iterator q)
{
   phone_book.insert(p, ee); // add ee before the element referred to by p
   phone_book.erase(q);      // remove the element referred to by q
}

map

A map is like a vector but using a key instead of an integer to index it (R's vector object can do it already; see An introduction to R).

The following example is from Chapter 7 of Accelerated C++. See also the example and a list of member types/functions from cplusplus.com. The interesting thing is the elements are ordered by their key at all times (see the example below).

#include <iostream>
#include <map>
#include <string>

using std::cin;
using std::cout;
using std::endl;
using std::map;
using std::string;

int main()
{
	string s;
	map<string, int> counters; // store each word and an associated counter

	// read the input, keeping track of each word and how often we see it
	while (cin >> s)
		++counters[s];

	// write the words and associated counts
	for (map<string, int>::const_iterator it = counters.begin();
	     it != counters.end(); ++it) {
		cout << it->first << "\t" << it->second << endl;
	}
	return 0;
}

And the output

brb@T3600 ~/Downloads $ g++ mapTest.cpp
brb@T3600 ~/Downloads $ ./a.out
this is a book
a       1
book    1
is      1
this    1
brb@T3600 ~/Downloads $

array

deque

queue

set

stack

unordered_map

unordered_set

Algorithms

Accelerated C++ 6.4 says that algorithms act on container elements; they do not act on containers. For example, remove_if() and partition() do not change the size of the container on which they operate. To really shorten a vector, we need to use the erase() method from the container operations. That is, whenever we use remove() or remove_if(), we will most likely also want to apply the container's erase() method.

/* 6.4 Algorithms, containers and iterators in Accelerated C++ */
students.erase(remove_if(students.begin(), students.end(), fgrade), 
               students.end());

all_of/ any_of/ none_of/ for_each

count/ count_if/ find/ find_if/ mismatch/ search/ search_n

N.B.

  1. string, map and set types also have a member function called find(). See string::find, map::find and set::find.
  2. Qt has its own solution: QRegExp.

The algorithms find() and find_if() return an iterator to the first element that matches a value and a predicate (bool return type), respectively.

void f(const string& s)
{
  auto p_space = find(s.begin(), s.end(), ' ');
  auto p_whitespace = find_if(s.begin(), s.end(), isspace);  // find_if() takes a predicate
}

See also the split() function example in 6.1.1 of Accelerated C++ that uses find_if() function.

Example of using the std::find() function to return the index of an element (only the first match) in a vector, and std::distance to find the distance between two iterators.

#include <algorithm>
#include <iostream>
#include <vector>
#include <iterator>   // std::distance, std::end
int main()
{
    int data[] = { 1, 2, 3, 4, 4, 3, 7, 8, 9, 10 };
    std::vector<int> v(data, data+10);
    std::vector<int> indexResult;

    std::cout << "Original data is ";
    for(std::vector<int>::iterator it=v.begin(); it != v.end(); ++it)
      std::cout << *it << " ";
    std::cout << '\n';

    for(std::vector<int>::iterator it=v.begin(); it != v.end(); ++it) {
      // std::cout << "Begin shift= " << std::distance(v.begin(), it) << ", ";
      std::cout << "Current data is ";
      for(std::vector<int>::iterator it2=it; it2 != v.end(); ++it2) std::cout << *it2 << " ";
      std::cout << ", ";

      auto p = std::find(it, v.end(), 3);
      if (p != std::end(v))
        std::cout << "element 3 was found in v at position (starting from 0): " << 
           std::distance(it, p) << std::endl;
      else
        std::cout << "element 3 was not found in Data\n";
    }  
}

And the output

$ g++ -std=c++11 count.cpp; ./a.out
Original data is 1 2 3 4 4 3 7 8 9 10 
Current data is 1 2 3 4 4 3 7 8 9 10 , element 3 was found in v at position (starting from 0): 2
Current data is 2 3 4 4 3 7 8 9 10 , element 3 was found in v at position (starting from 0): 1
Current data is 3 4 4 3 7 8 9 10 , element 3 was found in v at position (starting from 0): 0
Current data is 4 4 3 7 8 9 10 , element 3 was found in v at position (starting from 0): 2
Current data is 4 3 7 8 9 10 , element 3 was found in v at position (starting from 0): 1
Current data is 3 7 8 9 10 , element 3 was found in v at position (starting from 0): 0
Current data is 7 8 9 10 , element 3 was not found in Data
Current data is 8 9 10 , element 3 was not found in Data
Current data is 9 10 , element 3 was not found in Data
Current data is 10 , element 3 was not found in Data

But the string type case is more complicated. It only finds the EXACT match string.

#include <algorithm>
#include <iostream>
#include <vector>
#include <string>
#include <iterator>   // std::distance, std::end
int main()
{
    std::vector<std::string> v = {"x1", "yx12", "x1", "x3", "x4"};
    std::vector<int> indexResult;

    std::cout << "Search position (starting from 0)" << std::endl;
    std::cout << "Original data is ";
    for(auto it : v)
      std::cout << it << " ";
    std::cout << '\n';

    for(std::vector<std::string>::iterator it=v.begin(); it != v.end(); ++it) {
      std::cout << "Current data is ";
      for(std::vector<std::string>::iterator it2=it; it2 != v.end(); ++it2) 
          std::cout << *it2 << " ";
      std::cout << ", ";
      auto p = std::find(it, v.end(), "x1");
      if (p != std::end(v))
        std::cout << "element x1 was found in v at position: " << 
                     std::distance(it, p) << std::endl;
      else
        std::cout << "element x1 was not found in Data\n";
    }
}

After compiling & running it, we will see the find() function cannot pick up the 2nd case.

$ g++ -std=c++11 count.cpp; ./a.out
Search position (starting from 0)
Original data is x1 yx12 x1 x3 x4 
Current data is x1 yx12 x1 x3 x4 , element x1 was found in v at position: 0
Current data is yx12 x1 x3 x4 , element x1 was found in v at position: 1
Current data is x1 x3 x4 , element x1 was found in v at position: 0
Current data is x3 x4 , element x1 was not found in Data
Current data is x4 , element x1 was not found in Data

copy/ copy_if/ swap/ swap_ranges/ remove/ remove_if/ fill/ replace/ replace_if/ shuffle/ unique/ transform

The transform() function is similar to the for_each() function.

An application of the transform() function is to copy data from one container to another (e.g. from a vector to a set) while converting it to lowercase. Note that the following example overwrites the original string because the output iterator is the same as the input iterator.

#include <iostream>
#include <algorithm>
#include <string> 
int main() {
    std::string data = "Abc"; 
    std::transform(data.begin(), data.end(), data.begin(), ::tolower);
    std::cout << data << std::endl;
    return 0;
}

If the output length is known we can create a new object of the desired size as in this example in cplusplus.com. If the output is unknown, we can use back_inserter() function as in 6.2.3 of Accelerated C++.

Examples of remove_copy() can be found in 6.2.4 of Accelerated C++.

Examples of remove_copy_if() and remove_if() can be found in 6.3.1 of Accelerated C++.

sort/ stable_sort/ is_sorted/ partial_sort

merge/ set_difference/ set_intersection/ set_union

max/ max_element/ min/ min_element/ next_permutation

accumulate/ inner_product/ partial_sum

iterator & sequence

Container   -------   Iterator  --------   Algorithm

(PPP 20.3 Sequences and iterators) The central concept of the STL is the sequence. From the STL point of view, a collection of data is a sequence.

The reason STL algorithms and containers work so well together is that they don't know anything about each other. Instead, both understand about sequences defined by pairs of iterators.

A sequence has a beginning and an end. We identify the beginning and the end of a sequence by a pair of iterators. An iterator is an object that identifies an element of a sequence. An STL sequence is what is usually called "half-open"; the element identified by begin is part of the sequence, but the end iterator points one beyond the end of the sequence.

What is an iterator?

  • An iterator points to an element of a sequence
  • You can compare two iterators using == and !=
  • You can refer to the value of the element pointed to by an iterator using the unary * operator ("dereference")
  • You can get an iterator to the next element by using ++.
  • The idea of an iterator is related to the idea of a pointer. However, many iterators are not just pointers; for example, we could define a range-checked iterator that throws an exception if you try to make it point outside its [begin:end) sequence or dereference end. We get enormous flexibility and generality from having iterator as an abstract notion rather than as a specific type.

Advantages of using an iterator

Basic standard iterator operations

// if p and q are two iterators
p == q
p != q
*p
*p = val
val = *p
++p

Using iterators in string type

auto b=v.begin(), e=v.end();

string s("some string");
for (auto it = s.begin(); it != s.end() && !isspace(*it); ++it) {
    *it = toupper(*it);      // make that char uppercase
}

Using iterator in vector type

vector<int>::iterator it;  // it can read and write vector<int> elements
string::iterator it2;      // it2 can read and write characters in a string
vector<int>::const_iterator it3; // it3 can read but not write elements
string::const_iterator it4;      // it4 can read but not write characters

vector<int> v;
const vector<int> cv;
auto it1 = v.begin();  // it1 has type vector<int>::iterator
auto it2 = cv.begin(); // it2 has type vector<int>::const_iterator
auto it3 = v.cbegin(); // it3 has type vector<int>::const_iterator

Assuming it is an iterator into a vector of strings, we can check whether the string that it denotes is empty as follows:

(*it).empty()  // *it.empty() is an error: it is parsed as *(it.empty()), and an iterator has no member named empty

Print each element of a vector of strings.

vector<string> text;
for (auto it = text.cbegin(); it != text.cend() && !it->empty(); ++it)
    cout << *it << endl;

Examples

  • split() function (+ find_if()) in 6.1.1 of Accelerated C++
  • find_urls() function in 6.1.3 of Accelerated C++

numeric

string

utilities

Function objects

pair

Debugging

valgrind

http://valgrind.org/docs/manual/quick-start.html

For example,

valgrind --leak-check=yes build-Qheatmap-Desktop_Qt_4_8_5-Debug/Qheatmap /home/brb/Qt/example/BRCACC/

GDB, DDD, Nemiver

gdb

To run gdb with its TUI mode,

gcc -g -o foo foo.c
gdb -tui foo 
# Press return key to see the source showing on the top panel

To run a command with arguments, do

gdb --args executablename arg1 arg2 arg3

To combine tui and args parameters, do

gdb -tui --args executablename arg1 arg2 arg3

To run a program with redirected input (e.g. when the source code reads from std::cin), do

gdb executablename
run < input
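
Commonly used gdb commands:

Command                      Description
file a.out                   Load an executable file by name
break hello.c:100            Set a breakpoint (at a function or a line number)
break hello.c:100 if i == 5  Set a breakpoint that stops only if a condition is true
cond N Condition             N is the breakpoint number; Condition is any expression (e.g. i < .5, *p == 'a', strcmp(msg,"OK") == 0)
info break                   List the breakpoints and their numbers
delete N                     Delete breakpoint number N
clear Location               Delete any breakpoints set at or within the code of the specified line
list                         Show source lines
run                          Start the program
next                         Execute the next line, stepping over function calls
step                         Execute the next line, stepping into function calls
until                        Run until the program reaches a source line greater than the current one (e.g. to leave a loop)
finish                       Run until the end of the current function and return to the caller
continue                     Resume execution until the next breakpoint
print Expression             Show the value of Expression
set Variable=Expression      Assign to a variable; for example, set x=5
backtrace (bt)               Show the call stack; frame #0 is where the code broke
frame N                      Switch to frame N (see the backtrace output)
quit                         Leave gdb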

ddd

  • A Brief Introduction to DDD by knuth.luther.edu.
  • To use command line argument, go to Program -> Run where you can specify your command line arguments.
  • Check Source -> Display Line Numbers.
  • It seems there are no keyboard shortcuts for ddd, and trying to change the fonts gives errors.
  • A good feature of ddd is that once a program has aborted, ddd can show the backtrace (Status -> Backtrace...), so it is easy to find out which line of code broke the program and how that line was reached. I did not see this feature in Nemiver.

For example,

g++ -Wall -g -o XXX.o -c XXX.cpp
g++ -o XXX.exe XXX.o -lstdc++
ddd XXX.exe

nemiver

Nemiver is an ongoing effort to write an easy-to-use standalone C/C++ debugger that integrates well in the GNOME environment.

sudo apt-get install nemiver 
nemiver
  • Qt Creator. See the above discussion link for an instruction.

Qt Creator

A screenshot based on Qt Creator 3.3 and Qt 5.4.

QtCreatorDebug.png

Tools

IDE editor

Geany

Pros: shows a list of symbols/functions in the left-hand panel; shows the open files as a tree in the left-hand panel; code folding. Con: the GDB debugger? To install the latest version of geany and geany-plugin-debugger:

sudo add-apt-repository ppa:geany-dev/ppa
sudo apt-get update
sudo apt-get install geany
sudo apt-get install geany-plugin-debugger

Code::Blocks

sudo apt-get install codeblocks

Pros: built-in GDB debugger.

This is the one people recommend.

If you already have source code and a Makefile, then

  1. Put source code together with Makefile under the same directory as ProjectName.cbp file.
  2. Project->Properties check 'This is a custom Makefile' and modify the makefile name as needed.
  3. Don't worry about the Execution directory shown there.
  4. Go to the 'Build targets' tab, change the Output filename to the one we want to run or debug.
  5. Go to Project->Build Options. Go to "Make" commands tab and change the 'Build project/target:' to $make -f $makefile (i.e. remove $target). This needs to be done for 'Debug' build.

Now we can click the build, run or debug button on the toolbar.

As we can see, the steps required to debug an executable in Code::Blocks are much more complicated than in ddd.

The only thing I want to change is the font of the execution terminal. The font is too small.

Qt Creator

I haven't found a way to use my own Makefile for debugging.

GCC

Introduction to GCC

Show all libraries used (dynamically linked) by an executable program

Use ldd (list dynamic dependencies) command on Linux environment. On Windows OS, we can use Dependency Walker; see this post.

debian@beaglebone:~$ ldd /usr/bin/netsurf
	libjpeg.so.8 => /usr/lib/arm-linux-gnueabihf/libjpeg.so.8 (0xb6f1c000)
	libz.so.1 => /lib/arm-linux-gnueabihf/libz.so.1 (0xb6f02000)
	libxml2.so.2 => /usr/lib/arm-linux-gnueabihf/libxml2.so.2 (0xb6e22000)
        ....
	libcurl.so.4 => /usr/lib/arm-linux-gnueabihf/libcurl.so.4 (0xb6dd7000)
	libtasn1.so.3 => /usr/lib/arm-linux-gnueabihf/libtasn1.so.3 (0xb5d5b000)
	libp11-kit.so.0 => /usr/lib/arm-linux-gnueabihf/libp11-kit.so.0 (0xb5d47000)

Linking with external libraries

$ gcc -Wall calc.c /usr/lib/libm.a -o calc
$ gcc -Wall calc.c -lm -o calc

To specify the library path, we can use "-L", and/or "-Wl,-rpath" in gcc/g++.

When shared libraries are present in nondefault directories, the option "-Wl,-rpath" is needed in linker options. See here. If we don't specify "-Wl,-rpath" in linker options, we need to define "LD_LIBRARY_PATH" environment variable. See an example in RInside.

Header files/Include path

The list of directories for header files is often referred to as the include path, and the list of directories for libraries as the library search path or link path.

When additional libraries are installed in other directories it is necessary to extend the search paths, in order for the libraries to be found. The compiler options -I and -L add new directories to the beginning of the include path and library search path respectively.

$ gcc -Wall -I/opt/gdbm-1.8.3/include -L/opt/gdbm-1.8.3/lib dbmain.c -lgdbm



Environment variables

We can use some environment variables to replace the -I and -L flags.

Header files (CPATH, C_INCLUDE_PATH, CPLUS_INCLUDE_PATH, OBJC_INCLUDE_PATH)

$ CPLUS_INCLUDE_PATH=/opt/gdbm-1.8.3/include 
$ export CPLUS_INCLUDE_PATH

Library files during link time

$ LIBRARY_PATH=/opt/gdbm-1.8.3/lib
$ export LIBRARY_PATH

Library files during run time (needed only for dynamic libraries). On Windows platform, the PATH variable will be used.

$ export LD_LIBRARY_PATH=/opt/gdbm-1.8.3/lib

Check environment variables

echo $SHELL
echo $PATH
echo $LIBRARY_PATH
env

Makefile

This is an example of <Makefile>. The source code files are Parent.cpp, Parent.h, Child.cpp, Child.h, and main.cpp. Remember the indents should be a single tab.

CC=g++
TARGET=pc
OBJECTS=main.o Parent.o Child.o

$(TARGET): $(OBJECTS)
	@echo "** Linking Executable"
	$(CC) $(OBJECTS) -o $(TARGET)
	
clean:
	@rm -f *.o *~
		
veryclean: clean
	@rm -f $(TARGET)
	
%.o: %.cpp
	@echo "** Compiling C++ Source"
	$(CC) -c $(INCFLAGS) $<

'all' and '.PHONY' targets

minimal make

http://kbroman.org/minimal_make/

cmake

On Windows, it is installed in the C:\Program Files (x86)\Cmake 2.8 folder. By default, it is not added to the system PATH. The 'Cmake' program asks for the source folder, the binary folder and the compiler. After clicking the 'configure' and 'generate' buttons, it creates a VS solution file; double-click the solution file to open the project in Visual Studio, where we can build the solution (Ctrl + Shift + B). To debug the code:

  1. Right-click on the project and select Properties. Change the working directory to the source code directory (note that the .exe file will be generated there).
  2. Set the project as the startup project.

The idea of cmake is that it creates a <Makefile> from a <CMakeLists.txt> file.

For example, we can use git clone to get the example files for Effective Modern C++:

$ git clone https://github.com/BartVandewoestyne/Effective-Modern-Cpp.git
$ cd Effective-Modern-Cpp/Item01_Understand_template_type_deduction
$ ls
CMakeLists.txt
[and other cpp files]
$ sudo apt-get install cmake
$ mkdir build
$ cd build
$ cmake ..
$ ls
CMakeCache.txt  CMakeFiles  cmake_install.cmake  Makefile
$ make

Autotools

Makefile.am is a programmer-defined file and is used by automake to generate the Makefile.in file. The ./configure script typically seen in source tarballs will use the Makefile.in to generate a Makefile.

The ./configure script itself is generated from a programmer-defined file named either configure.ac or configure.in. I prefer .ac (for AutoConf), since that differentiates it from the generated Makefile.in files; that way I can have rules such as make dist-clean that run rm -f *.in. Since configure is a generated file, it is not typically stored in a revision control system such as SVN or CVS; the .ac file is stored instead.

Read more on GNU build system/Autotools. Read about make and Makefile first, then learn about automake, autoconf, libtool, etc.

Profiling

C++ Libraries

pthreads - POSIX Threads Programming

  • pbzip2 - a parallel implementation of the bzip2 block-sorting file compressor that uses pthreads and achieves near-linear speedup on SMP machines.
  • bowtie makes use of the pthread library. Similarly, tophat and cuffmerge have a '-p' argument to support pthreads.

Wt

C++ library for developing web applications

SeqAn

C++ for sequencing data. On Windows OS, it requires Python 2 and CMake in addition to VS.

Boost

Boost is a set of C++ libraries that provide support for tasks and structures such as linear algebra, pseudorandom number generation, multithreading, image processing, regular expressions, and unit testing. Release 1.52 contains over eighty individual libraries.

The R package BH provides the Boost header files (template libraries) for use from R. See http://gallery.rcpp.org/articles/using-boost-with-bh/ for an example.

Boost contains a lot of libraries.

brb@brb-VirtualBox:~/Downloads/boost_1_56_0$ ./bootstrap.sh --show-libraries
Building Boost.Build engine with toolset gcc... tools/build/src/engine/bin.linuxx86/b2

The following Boost libraries have portions that require a separate build
and installation step. Any library not listed here can be used by including
the headers only.

The Boost libraries requiring separate building and installation are:
    - atomic
    - chrono
    - container
    - context
    - coroutine
    - date_time
    - exception
    - filesystem
    - graph
    - graph_parallel
    - iostreams
    - locale
    - log
    - math
    - mpi
    - program_options
    - python
    - random
    - regex
    - serialization
    - signals
    - system
    - test
    - thread
    - timer
    - wave

build under Ubuntu OS

To install via apt-get, run 'sudo apt-get install libboost-all-dev'. The software center shows the version is 1.48 (kind of old) on Ubuntu 12.04 and 1.54 on Ubuntu 14.04.

We can also download the source code and build it ourselves. See this post on ubuntuforums.org.

./bootstrap.sh 
./b2 install

At the end of building, it shows the paths to the header files (sometimes /usr/local/include) and to the libraries themselves (sometimes /usr/local/lib). If we use apt-get to install boost, the header files go to /usr/include and the library files libboost*.a and libboost*.so go to /usr/lib.

The Boost C++ Libraries were successfully built!

The following directory should be added to compiler include paths:

    /home/brb/Downloads/boost_1_55_0

The following directory should be added to linker library paths:

    /home/brb/Downloads/boost_1_55_0/stage/lib

The source directory contains an <index.html> file. It tells us how to run a simple test program. For example, the following example tests header-only libraries and requires no separately-compiled library binaries or special treatment when linking.

g++ -I /home/brb/Downloads/boost_1_55_0 example.cpp -o example
echo 1 2 3 | ./example

To link to boost binary libraries, we can do

$ g++ -I /home/brb/Downloads/boost_1_55_0 example.cpp -o example2 \
   -L/home/brb/Downloads/boost_1_55_0/stage/lib/ -lboost_regex
$ ./example2 < jayne.txt
Will Success Spoil Rock Hunter?

Note that under the boost_1_55_0/stage/lib directory, both static and dynamic libraries are available for boost_regex. My system picks the static library. To link to the dynamic library instead, we can name it explicitly:

$ g++ -I /home/brb/Downloads/boost_1_55_0/ example2.cpp -o example2 \
    ~/Downloads/boost_1_55_0/stage/lib/libboost_regex.so
$ ./example2 < jayne.txt
./example2: error while loading shared libraries: libboost_regex.so.1.55.0: cannot open shared object file:
 No such file or directory
$ export LD_LIBRARY_PATH=/home/brb/Downloads/boost_1_55_0/stage/lib

Here we see the purpose of specifying the environment variable LD_LIBRARY_PATH.

To try an example (Math and numerics > math/statistical distributions > Calculating confidence intervals on the mean with the Students-t distribution), we can compile and generate the executable file by (no need to link to library in this case)

g++ -I /home/brb/Downloads/boost_1_55_0/ \
    ~/Downloads/boost_1_55_0/libs/math/example/students_t_single_sample.cpp 

If we would like the headers and libraries to be available automatically on Linux, without passing -I and -L to g++, we can run

./bootstrap.sh --prefix=/usr/local
sudo ./b2 install

Running ./b2 took about 1 hour on my single-core VM. At the end, it creates the /usr/local/include/boost subdirectory and a bunch of libboost*.a and libboost*.so files under the /usr/local/lib directory.

Check BOOST version

  • On Ubuntu, use 'tail /usr/include/boost/version.hpp'
  • In C++ code, use the Boost version macro 'BOOST_LIB_VERSION':

#include <boost/version.hpp>
#include <iostream>

using namespace std;

int main()
{
    cout << "Boost version: " << BOOST_LIB_VERSION << endl;
    return 0;
}

Build boost from source on Windows OS

MinGW

bootstrap --with-toolset=gcc
.\b2 --build-type=complete toolset=gcc link=shared runtime-link=shared install 

use of statistical distributions in Boost

See an article in www.quantnet.com

For each distribution, we can compute the cdf, quantile, pdf, mean, mode, median, variance, etc.

To compute t(alpha/2, df), we can use

#include <boost/math/distributions/students_t.hpp>
#include <iostream>

using namespace std;
using boost::math::students_t;

int main()
{
  int df = 195;
  double alpha = .1;
  students_t dist(df);
  double T = quantile(complement(dist, alpha / 2));
  cout << T << endl;
  return 0;
}

We can compile the code and get the result (check by using R statement qt(.95, 195)).

$ g++ example.cpp; ./a.out
1.65271

PS For some reason, I don't need to worry about the environment variables here.

Googling shows that we can use g++ --print-search-dirs to show the library search paths, and add -v to the g++ command to show the include search paths.

Boost thread example

http://ashishgrover.com/boost-multi-threadingfor-c/

#include <boost/thread.hpp>
#include <iostream>   // std::cout
#include <unistd.h>   // usleep
void readerApi()
{
  for (int i=0; i < 10; i++) {
    usleep(400);
    std::cout << "readerApi: " << i
              << std::endl;
  }
}
void writerApi()
{
  for (int i=0; i < 10; i++) {
    std::cout << "writerApi: " << i
              << std::endl;
    usleep(400);
  }
}

int main()
{
  boost::thread readerThread(readerApi);
  boost::thread writerThread(writerApi);

  readerThread.join();
  writerThread.join();
}

Then compile the code by

$ g++ -o b1 boost1.cpp -lboost_thread

Before you can run it, set the LD_LIBRARY_PATH environment variable if you built the Boost library yourself.

$ export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
$ ./b1
writerApi: 0
readerApi: 0
writerApi: 1
readerApi: 1
writerApi: 2
readerApi: 2
writerApi: 3
writerApi: 4
readerApi: 3
writerApi: 5
readerApi: 4
writerApi: 6
readerApi: 5
writerApi: 7
readerApi: 6
writerApi: 8
readerApi: 7
writerApi: 9
readerApi: 8
readerApi: 9

To make changes permanent in Ubuntu you have to add a new configuration file for ldconfig:

sudo nano /etc/ld.so.conf.d/libboost.conf

(my editor of choice is nano, and the name of the file itself doesn’t matter) Add the library path to that file, i.e. /usr/local/lib/. Save the file, quit and reload your configuration by calling

sudo ldconfig

Note that your LD_LIBRARY_PATH won’t change, but your program will now run!

Another working example is from http://www.codeproject.com/Articles/279053/How-to-get-started-using-Boost-threads

#include <iostream>  
#include <boost/thread.hpp>   
#include <boost/date_time.hpp>       
      
void workerFunc()  
{  
    boost::posix_time::seconds workTime(3);          
    std::cout << "Worker: running" << std::endl;    
      
    // Pretend to do something useful... 
    boost::this_thread::sleep(workTime);          
    std::cout << "Worker: finished" << std::endl;  
}    
      
int main(int argc, char* argv[])  
{  
    std::cout << "main: startup" << std::endl;          
    boost::thread workerThread(workerFunc);    
      
    std::cout << "main: waiting for thread" << std::endl;          
    workerThread.join();    
      
    std::cout << "main: done" << std::endl;          
    return 0;  
}

It can be built by g++ test.cpp -lboost_thread -lboost_system.

Another excellent example is from Jeff Benshetler. Check out this page http://advancedcplusplus.com/5min-threads/ (I cannot build it successfully).

Books

Boost and Rcpp package

Boost.python

http://edyfox.codecarver.org/html/boost_python.html

Rcpp was inspired by Boost.python. See the Rcpp module vignette.

The Scythe Statistical Library

Scythe: An Open Source C++ Library for Statistical Computation from J of Stat Software.

Armadillo

Armadillo is a C++ linear algebra library (matrix maths) aiming towards a good balance between speed and ease of use.

GNU Scientific Library

BLAS, LAPACK

IMSL

Numerical Recipes

Some C++ Projects

Trending in Github

Approximate nearest neighbor search

The R wrap is here

BEDTools

https://github.com/arq5x/bedtools2

freebayes

https://github.com/ekg/freebayes/

NGS++

A programming library in C++11 specialized in manipulating both next-generation sequencing (NGS) datasets and genomic information files. See the paper.

LIBSVM

SAMTools (in C only)

Tophat

It also requires the packages

Salmon - fast and bias-aware quantification of transcript expression

http://www.rna-seqblog.com/salmon-fast-and-bias-aware-quantification-of-transcript-expression/

Parana2

It also depends on a few other tools.

  • Bio++ - a set of C++ libraries for Bioinformatics, including sequence analysis, phylogenetics, molecular evolution and population genetics.
  • Boost
  • GMP - The GNU Multiple Precision Arithmetic Library
  • MPFR - C library for multiple-precision floating-point computations with correct rounding.
  • pugixml - light-weight C++ XML processing library.

Comprehensive short read mapping

Short read alignment with populations of genomes

https://github.com/viq854/bwbble

Janus-comprehensive tool investigating the two faces of transcription

It depends on bamtools.

Rcount-simple and flexible RNA-Seq read counting

RSEM/RNA-Seq by Expectation-Maximization

RNA-Pareto

Interactive Analysis of Pareto-optimal RNA Sequence-Structure Alignments

The software is written in Java 6 (graphical user interface) and C++ (dynamic programming algorithms). To run, a Java Runtime Environment, version ≥1.6.0 is required. It is well tested with GCC 4.6.

ANGSD

Analysis of Next Generation Sequencing Data

Open MS

http://open-ms.sourceforge.net/. It uses external libraries such as: (i) Qt, which provides visualization and database support; (ii) Xerces for XML file parsing; (iii) libSVM, for machine learning algorithms; and (iv) the GNU Scientific Library (GSL), used for mathematical and statistical analysis. One of the strong points of OpenMS is a complete set of examples showing how to extend and use the libraries; the TOPP (The OpenMS Proteomics Pipeline) and TOPPView tutorials describe OpenMS in detail.

MACAU - Differential Expression Analysis for RNAseq using Poisson Mixed Models

http://www.xzlab.org/software.html. GSL and Lapack are used.

GUI Programming

Windows Programming

Resource

Difference between Win32 project and CLR (common language runtime) project

See here.

A Win32 project is used if you want to end up with a DLL or a Win32 application usually using the bare WinAPI. A CLR project is used to create C++/CLI project, i.e. to use C++/CLI to target the .NET platform.

The main difference between projects is what Visual Studio comes up with in terms of pre-created files. A windowed Win32 application for example (what you get when you choose Win32 project, but not a DLL) is created with a file for resources (menus, accelerators, icons etc.) and some default code to create and register a window class and to instantiate this window.

Quoted from http://en.wikipedia.org/wiki/Common_Language_Runtime

The Common Language Runtime (CLR) is the virtual machine component of Microsoft's .NET framework and is responsible for managing the execution of .NET programs. In a process known as Just-in-time compilation, the compiled code is converted into machine instructions that, in turn, are executed by the computer's CPU. The CLR provides additional services including memory management, type safety and exception handling. All programs written for the .NET framework, regardless of programming language, are executed by the CLR. It provides exception handling, garbage collection and thread management. CLR is common to all versions of the .NET framework.

The CLR is Microsoft's implementation of the Common Language Infrastructure (CLI) standard.

Difference between Win32, MFC and .NET

http://stackoverflow.com/questions/821676/how-do-i-decide-whether-to-use-atl-mfc-win32-or-clr-for-a-new-c-project

Using the CLR will provide you with the most expressive set of libraries (the entire .NET framework), at the cost of restricting your executable to requiring the .NET framework to be installed at runtime, as well as limiting you to the Windows platform (however, all 4 listed technologies are windows only, so the platform limitation is probably the least troublesome).

However, CLR requires you to use the C++/CLI extensions to the C++ language, so you'll, in essence, need to learn some extra language features in order to use this. Doing so gives you many "extras," such as access to the .NET libraries, full garbage collection, etc.

Using Win32 directly provides the smallest executables, with the fewest dependencies, but is more work to write. You have the least amount of helper libraries, so you're writing more of the code.

Win32 is the raw, bare-metal way of doing it. It's tedious, difficult to use, and has a lot of small details you need to remember; otherwise things will fail in relatively mysterious ways.

MFC builds upon Win32 to provide you an object-oriented way of building your application. It's not a replacement for Win32, but rather an enhancement: it does a lot of the hard work for you.

System.Windows.Forms (which is what I assume you meant by CLR) is completely different, but has large similarities to MFC from its basic structure. It's by far the easiest to use, but requires the .NET framework, which may or may not be a hindrance in your case.

Why not MFC

http://win32-framework.sourceforge.net/explanation.htm The website also provides an alternative software called Win32++ to replace MFC. It also provides useful links for C++ compilers, tools, tutorial and references.

Qt

wxwidgets

Some projects:

Simple OpenGL GUI

As mentioned in http://www.oppi.uef.fi/bioinformatics/forg3d/downloads.php

OpenGL Programming on Windows

We need to include

#include <gl/gl.h>
#include <gl/glu.h>

Then go to the project's linker properties and add <opengl32.lib> and <glu32.lib>. Check the directories C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Include and C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\lib.

  • Header files: gl\gl.h and gl\glu.h
  • Libraries: openGL32.lib and GLU32.lib

x64 libs can be found in C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\lib\x64. You can put the freeglut lib and header files in those locations to use freeglut with Visual Studio 2010. When you copy the freeglut DLLs to C:\Windows\System32, don't copy the 64-bit DLL to SysWOW64; this gives error 0xc000007b when running the code. I don't know what it means, but if you have freeglut only in System32 you should be fine.

Resource

Example 1

http://openglbook.com/setting-up-opengl-glew-and-freeglut-in-visual-c/

  1. Download freeglut (freeglut-MSVC-2.8.0-1.mp.zip) & glew (glew-1.9.0-win32.zip)
  2. Copy files include and lib to appropriate location
  3. Copy freeglut.dll to the Project's Release or Debug folder

I don't need Step 5 (Compiler) and Step 6 (Linker).

I keep a copy of the instruction in Evernote.

Example 2

Teapot and Glut shapes (WireTeapot, SolidTeapot, SolidCube & SolidSphere) from http://openglsamples.sourceforge.net/

Example 3 (no Glut, Windows OS only)

http://www.nullterminator.net/opengl32.html

Examples from opengl.org

http://www.opengl.org/sdk/docs/tutorials/

Example of American Flag

http://www.youtube.com/watch?v=9xjBlde4Cew

Computer Systems

MIPS Instruction Set