Naming

From 太極
= [https://code.google.com/p/google-styleguide/ Google Style Guide] =
* [http://google-styleguide.googlecode.com/svn/trunk/cppguide.html C++]
* [http://google-styleguide.googlecode.com/svn/trunk/Rguide.xml R]
= JavaScript =
* [https://www.makeuseof.com/javascript-naming-conventions/ 10 Essential JavaScript Naming Conventions Every Developer Should Know]
** Naming Variables: totalPrice ('''camelCaseNaming''')
** Naming Booleans: isValid
** Naming Functions: calculateTotalPrice
** Naming Constants: MAX_PRICE
** Naming Classes: '''PascalCase'''
** Naming Components: '''PascalCase() '''
** Naming Methods: startEngine()
** Naming Private Functions: _startEngine()
** Naming Global Variables: MAX_PRICE
** Naming Files: string-utils.js (lowercase letters)


= Five CS Skills I Wish I Learned in College =
http://stattrak.amstat.org/2017/07/01/csskills/
* Working in an existing codebase
* Testing code
* Writing design documents
* Conducting code reviews
* Working on large-group projects
 
= Good Practices in R Programming =
* http://user2014.stat.ucla.edu/#invited and [https://www.youtube.com/watch?v=ytbX-T1A8wE youtube] video.
* http://stat.ethz.ch/Teaching/maechler/R/useR_2014/
 
== Gene expression ==
* normalized_counts, normalized_expression, exprs_values, expression_values, expression_data
* sample_info, sample_ids, p_data, ps_data
* gene_info, gene_ids
 
'''normalized.counts''' vs '''normalized_counts''': Both are valid variable names in R. However, dots (.) in variable names are generally discouraged because the dot has a special role in R's S3 '''object-oriented programming''' system: an S3 '''method''' is named by joining the generic function and the '''class''' with a dot (for example, print.data.frame is the print method for class data.frame). A dotted variable name can therefore be mistaken for a method, so underscores (_) are generally the better choice when naming variables in R.
 
Don't use
* exprs as a variable name since [https://rdrr.io/bioc/Biobase/man/exprs.html exprs] is a function name defined by Biobase. See [https://www.bioconductor.org/packages/release/bioc/vignettes/Biobase/inst/doc/ExpressionSetIntroduction.pdf An Introduction to Bioconductor's ExpressionSet Class].
* pData (pheno) as a variable name since pData is a function defined by Biobase.
* fData (feature) as a variable name since fData is a function defined by Biobase.
 
In BRB-ArrayTools plugins (see the last COMMANDS block in a .plug file), the following are commonly used:
* lr
* edw
* genes
 
In BRB-ArrayTools export plugins, the following objects are exported:
* NORMALIZEDLOGINTENSITY (and LOGINTENSITY)
* EXPDESIGN
* GENEID (and FILTER)
 
Other choices:
* data
* samples
* features
 
== shinyExprPortal ==
[https://academic.oup.com/bioinformatics/article/40/4/btae172/7637675 shinyExprPortal: a configurable ‘shiny’ portal for sharing analysis of molecular expression data] 2024 and [https://c4tb.github.io/shinyExprPortal/articles/dataprep.html Github].
 
* Expression matrix: matrix_object/expression
* Measures table: measures
* Lookup table: It is for datasets where a subject has more than one sample (e.g. samples over time, from different tissues or combinations thereof).
 
== Examples ==
* data.expr  (data.measure, data.abundance)
* data.pheno (data.clinical, data.expdesign, data.sampleinfo)
* data.anno (data.genes, data.features)
* optional
** data.lookup
** data.labs
** data.models
 
= Top 10 Programming Languages used on Github =
* [https://github.com/blog/2047-language-trends-on-github from 2008-2015]
 
= The most popular programming languages =
* On [http://blog.revolutionanalytics.com/2015/07/the-most-popular-programming-languages-on-stackoverflow.html 2015]

Latest revision as of 10:09, 18 September 2024

The Practice of Programming

Naming

  • Use capital letters for constants.
#define ONE 1
  • Include a brief comment with the declaration of each global.
  • Use descriptive names for globals.
int npending = 0; // current length of input queue
  • Use short names for locals. Compare
for (theElementIndex = 0; theElementIndex < numberOfElements; theElementIndex++)
  elementArray[theElementIndex] = theElementIndex;

to

for (int i = 0; i < nelems; i++)
   elem[i] = i;
  • Other naming conventions and local customs.
    • Use names that begin or end with p, such as nodep, for pointers;
    • Initial capital letters for globals;
    • All capitals for constants;
    • Use pch to mean a pointer to a character;
    • strTo and strFrom to mean strings that will be written to and read from;
    • The spelling of the names themselves is a matter of taste: npending, numPending, or num_pending.
  • Be consistent. Give related things related names that show their relationship and highlight their difference. For example
class UserQueue {
  int noOfItemsInQ, frontOfTheQueue, queueCapacity;
  public int noOfUsersInQueue() { ... }
}

is better written as

class UserQueue {
  int nitems, front, capacity;
  public int nusers() { ... }
}
  • Use active names for functions. Function names should be based on active verbs, perhaps followed by nouns:
now = date.getTime();
putchar('\n');

Functions that return a boolean value should be named so that the return value is unambiguous. Thus

if (checkoctal(c)) ...

does not indicate which value is true and which is false, while

if (isoctal(c)) ...

makes it clear that the function returns true if the argument is octal and false if not.
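
As a minimal sketch of this naming rule, the predicate below is named for the case in which it returns true. Both isoctal and count_leading_octal are hypothetical helpers written for this illustration; neither is a standard library function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when c is an octal digit ('0'..'7'). */
bool isoctal(int c)
{
    return c >= '0' && c <= '7';
}

/* Counts the leading octal digits of s. The loop condition reads
   naturally because the predicate name states what "true" means. */
int count_leading_octal(const char *s)
{
    int n = 0;

    assert(s != NULL);
    while (isoctal((unsigned char)s[n]))
        n++;
    return n;
}
```

A call such as count_leading_octal("0179") stops at the '9', because the name of the test leaves no doubt about which characters pass.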

  • Be accurate. A name not only labels, it conveys information to the reader. For example, the function below does the opposite of what its name says: assuming getIndex returns nTable when the object is not found, inTable returns true exactly when the object is not in the table.
public boolean inTable(Object obj) {
  int j = this.getIndex(obj);
  return (j == nTable);
}

Expressions and Statements

  • Indent to show structure
  • Use the natural form for expressions.
if (!(block_id < actblks) || !(block_id >= unblocks)) ...

is harder to read than the equivalent

if ((block_id >= actblks) || (block_id < unblocks)) ...
  • Parenthesize to resolve ambiguity.
  • Break up complex expressions
  • Be clear. For example, the ?: operator is fine for short expressions, as in
max = (a > b) ? a : b;
printf("The list has %d item%s\n", n, n == 1 ? "" : "s");

but it is not a general replacement for conditional statements.

  • Be careful with side effects, such as those of the '++' operator and of I/O. For example, the following is wrong because all the arguments to scanf are evaluated before the routine is called, so &profit[yr] is computed with the old value of yr, not the one just read.
scanf("%d %d", &yr, &profit[yr]);

The fix is

scanf("%d", &yr);
scanf("%d", &profit[yr]);
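
The same two-step fix can be sketched in a form that is easy to test, reading from a string with sscanf instead of standard input. The function name read_profit and the table size NYEARS are assumptions made for this example; the range check is an addition that the original fragment does not have.

```c
#include <assert.h>
#include <stdio.h>

enum { NYEARS = 10 };   /* hypothetical table size for this sketch */

/* Parses "year profit" from line and stores the profit in profit[year].
   The year is read and range-checked first, so the index is known
   before it is used. Returns 1 on success, 0 on malformed input. */
int read_profit(const char *line, int profit[NYEARS])
{
    int yr, consumed;

    assert(line != NULL && profit != NULL);
    if (sscanf(line, "%d%n", &yr, &consumed) != 1)
        return 0;
    if (yr < 0 || yr >= NYEARS)
        return 0;
    return sscanf(line + consumed, "%d", &profit[yr]) == 1;
}
```

Splitting the read into two calls makes the order of the side effects explicit: yr is assigned before &profit[yr] is ever evaluated.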

Consistency and Idioms

  • Use a consistent indentation and brace style.
  • Use idioms for consistency. For example,
for (int i = 0; i < n; i++)
    array[i] = 1.0;

is better than any of these forms:

// Form 1
int i = 0;
while (i <= n-1)
    array[i++] = 1.0;
// Form 2
int i;
for (i = 0; i < n; )
    array[i++] = 1.0;
// Form 3
int i;
for (i = 0; i < n; i++)
    array[i] = 1.0;

Other standard idioms include

for (p = list; p != NULL; p = p->next) ...
for (;;) ...

Another common idiom is to nest an assignment inside a loop condition, as in

while ((c = getchar()) != EOF)
    putchar(c);

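Departing from the idiom invites mistakes. The following code is wrong: the loop condition uses <= instead of <, so it writes one element past the end of the array, and the error may not be detected until the damage has been done.

int i, *iArray, nmemb;
iArray = malloc(nmemb * sizeof(int));
for (i = 0; i <= nmemb; i++)
    iArray[i] = i;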

The following code, which allocates one byte too few because strlen does not count the terminating '\0',

char *p, buf[256];
gets(buf);
p = malloc(strlen(buf));
strcpy(p, buf);

should be replaced by

p = malloc(strlen(buf) + 1);
strcpy(p, buf);
// or, in C++
p = new char[strlen(buf)+1];
strcpy(p, buf);
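
The allocate-then-copy idiom can be wrapped in a small function, sketched below. The name copy_string is hypothetical, and the NULL check on malloc is an addition not present in the fragment above. Note also that gets was removed from the language in C11; fgets is the bounded alternative for reading a line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of s, or NULL if allocation fails.
   The caller must free() the result. */
char *copy_string(const char *s)
{
    size_t len;
    char *p;

    assert(s != NULL);
    len = strlen(s);
    p = malloc(len + 1);          /* +1 for the terminating '\0' */
    if (p != NULL)
        memcpy(p, s, len + 1);    /* copies the '\0' as well */
    return p;
}
```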
  • Use else-ifs for multi-way decisions.

Function Macros

  • Avoid function macros.
  • Parenthesize the macro body and arguments. Given a macro like this,
#define square(x)  (x) * (x)

the expression 1/square(x) will be expanded to the erroneous

1 / (x) * (x)

The macro should be rewritten as

#define square(x)  ((x) * (x))
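
The difference between the two macro definitions can be checked with a short sketch. The names SQUARE_BAD, SQUARE_OK, ratio_bad and ratio_ok are hypothetical and exist only for this comparison; the plain function square shows the alternative that the first bullet recommends.

```c
#include <assert.h>

#define SQUARE_BAD(x)   (x) * (x)      /* body not parenthesized */
#define SQUARE_OK(x)    ((x) * (x))    /* body and argument parenthesized */

/* Expands to n / (x) * (x), which groups as (n / x) * x. */
int ratio_bad(int n, int x)
{
    assert(x != 0);
    return n / SQUARE_BAD(x);
}

/* Expands to n / ((x) * (x)), as intended. */
int ratio_ok(int n, int x)
{
    assert(x != 0);
    return n / SQUARE_OK(x);
}

/* A function avoids the precedence problem and evaluates x only once. */
int square(int x)
{
    return x * x;
}
```

With n = 100 and x = 5, ratio_bad computes (100 / 5) * 5 = 100, while ratio_ok computes 100 / 25 = 4.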

Magic Numbers

  • Give names to magic numbers.
enum {
    MINROW    = 1,               /* top edge */
    MINCOL    = 1,               /* left edge */
    MAXROW    = 24,              /* bottom edge */
    MAXCOL    = 80,              /* right edge */
    NLET      = 26,              /* size of alphabet */
    HEIGHT    = MAXROW - 4,      /* height of bars */
    WIDTH     = (MAXCOL-1)/NLET  /* width of bars */
};
...
fac = (lim + HEIGHT-1) / HEIGHT; 
  • Define numbers as constants, not macros. Macros defined with #define are a dangerous way to program because they change the lexical structure of the program underfoot.
const int MAXROW = 24, MAXCOL = 80;

C also has const values but they cannot be used as array bounds, so the enum statement remains the method of choice in C.
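
A minimal sketch of the enum approach in C follows. The screen array and the functions clear_screen and put_char are hypothetical names for this example. An enumeration constant is an integer constant expression, so it can size a file-scope array, whereas a const int cannot in C; in C99 and later, a const int bound inside a function would produce a variable-length array instead of a fixed-size one.

```c
#include <assert.h>
#include <string.h>

enum { MAXROW = 24, MAXCOL = 80 };

/* Enumeration constants are valid array bounds at file scope. */
static char screen[MAXROW][MAXCOL];

/* Fills the whole screen buffer with blanks. */
void clear_screen(void)
{
    memset(screen, ' ', sizeof screen);
}

/* Stores ch at (row, col); both indices are zero-based in this sketch. */
void put_char(int row, int col, char ch)
{
    assert(row >= 0 && row < MAXROW && col >= 0 && col < MAXCOL);
    screen[row][col] = ch;
}
```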

  • Use character constants, not integers. Better still, use the library.
if (c >= 65 && c <= 90)    // not good
if (c >= 'A' && c <= 'Z')  // better
if (isupper(c))            // best

The number 0 should be avoided in certain situations. For example, use (void*)0 or NULL to represent a zero pointer in C, and '\0' instead of 0 to represent the null byte at the end of a string. In other words, don't write

str = 0;
name[i] = 0;
x = 0;

but rather:

str = NULL;
name[i] = '\0';
x = 0.0;

However, in C++ code written before C++11, 0 rather than NULL was the accepted notation for a null pointer; since C++11, nullptr is preferred.
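
The following C sketch puts these conventions together: the library classifier isupper instead of numeric character codes, '\0' for the end of a string, and NULL for a pointer check. The function name count_upper is an assumption made for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Counts the uppercase letters in s using the library classifier. */
size_t count_upper(const char *s)
{
    size_t n = 0;

    assert(s != NULL);              /* NULL, not 0, for a pointer */
    for (; *s != '\0'; s++)         /* '\0', not 0, for end of string */
        if (isupper((unsigned char)*s))
            n++;
    return n;
}
```

The cast to unsigned char is there because passing a negative char value to isupper is undefined behavior.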

  • Use the language to calculate the size of an object. Don't use an explicit size for any data type: use sizeof(int) instead of 2 or 4, for instance.
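
As a sketch of letting the language compute sizes, the fragment below never writes a byte count by hand. NELEMS and make_ints are names chosen for this example. NELEMS is a function macro, which the earlier advice says to avoid in general, but it is one case a C function cannot replace, because an array passed to a function decays to a pointer and loses its length.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of elements in a true array (not a pointer). */
#define NELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* Allocates n zero-initialized ints. Writing sizeof *p rather than
   sizeof(int) keeps the size correct if the element type changes.
   Returns NULL if allocation fails; the caller must free() the result. */
int *make_ints(size_t n)
{
    int *p;

    assert(n > 0);
    p = calloc(n, sizeof *p);
    return p;
}
```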

Comments

  • Don't belabor the obvious.
  • Comment functions and global data.
  • Don't comment bad code, rewrite it. Comment anything unusual or potentially confusing, but when the comment outweighs the code, the code probably needs fixing.
  • Don't contradict the code. Most comments agree with the code when they are written, but as bugs are fixed and the program evolves, the comments are often left in their original form, resulting in disagreement with the code.
  • Clarify, don't confuse. Do not take more words than necessary to explain what is happening.
