Software development convention



Why another convention?

Well, no major convention really fits.

Firstly, most of them are unreasonably large and complex. Sometimes the rules cover only very particular situations and are as numerous as their exceptions.
And their documents tend to mix it all up: style, practices and idioms. Often a style is used as a misnomer for practices and rules.

Secondly, no convention is complete. Even after reading quite a number of them, one keeps discovering decisions that one encodes in various artifacts unconsciously and implicitly.

Thirdly, conventions rarely help one come to an agreement with oneself. Just stating a rule with no reasoning and expecting everyone to follow it is unconstructive. Statements want grounding. Adoption happens where following the guidance becomes preferable to falling back on habits.

What is required is a document that is revisited iteratively and extended with descriptions of deliberate and explicit choices. A thorough, structured and contradiction free system of such choices is required to bring awareness.
Ideally, the convention should be so simple that reading it once would be enough to adhere to it without referring back or needing to recall. The rules should not get hung up on specifics but be general and versatile. The reasoning would make the decisions more appealing than personal preferences.
This is not achievable by only extending some other convention.

Thus, this convention focuses on:
minimizing the number of rules and keeping them simple;
stating every aspect explicitly, systematically and in a structured way;
providing rationale for the guidance and making choices based on requirements;
enumerating all sane alternatives, describing and comparing their advantages and figuring out a general rationale;
ensuring extensibility and full inner consistency.

The final goal is not to impose the rules but to provide a framework. It is perfectly fine that not all decisions fit everyone. Anyone coming from a different perspective is always free to fork the convention, add their own reasons wherever necessary and draw a different conclusion. In this case the convention provides the system and the alternatives.

Some topics are too broad. Some parts of the convention may not be elaborated enough yet. Sometimes the suitable solution depends entirely on requirements. The best effort strategy is at least to declare a request for comments, or to list multiple choices for reference, or to name all points of improvement. The convention is always open for new topics to discuss.

One such subject is automated tooling. It would be so nice to have a linter that not only checks the style but can also recognize places for pattern and syntax improvements or even detect unsafe constructions. In practice, these tools are limited to working within a certain technology or a single environment. They operate well in basic scenarios, but their degree of flexibility is rarely sufficient.
Another such topic is personal preference.

Benefits of conventions in general

It is common knowledge that artifacts are read more often than written. Readability and consistency are attributes of artifacts and should be regarded as components of the quality of the product itself. By fixing rules and guidelines, conventions not only improve quality but also help avoid interoperability issues and bugs. Having a convention sorted out makes the development process more efficient. It prevents wasting time on aspects that have already been agreed upon. Making up one's mind on those things and putting them aside allows focusing on creating without getting distracted.

Methods, approaches and practices



Software is hard. Software development comprises a multitude of disciplines and activities, and the field of knowledge about it is vast. Although most emerging issues have likely been solved already, approaching a problem in the optimal way requires a broad outlook on computer science and mainly software engineering, information technology with its history, management science, marketing... There are essential considerations worth taking into account and fixing at the project or organization level. And the solutions apply only on a case by case basis and are entirely dictated by requirements and available resources.

Picking a methodology is one of the core decisions. Agile software development with its variations is mainstream. But the waterfall approach can still be found as the only option for cooperation with certain contractors. TDD and BDD have undeniable advantages but are excessive for the earlier stages of the software life cycle and may even be too expensive for some projects.

The use of version control is mandatory. And there are questions that come with it. A branching strategy is to be chosen; trunk based branching is considered the default. Commit messages are subject to guidelines as well. To avoid bloating the repository, it is necessary to ignore artifacts that do not belong to it, such as builds, binaries, libraries, archives, settings, but to keep image and audio resources or datasets.

The concerns are inexhaustible. Many aspects are preferred to be determined early: the documentation approach, the testing and reviewing policies, the governance structure...

Paradigms, idioms, patterns and principles


When it comes to implementation and design, stick to the proven solutions: code and software reuse, KISS, SOLID, ...

Source artifact encoding



To prevent interoperability issues, all text files, such as source code or settings, must be UTF-8 encoded. There are other Unicode and non Unicode alternatives, but UTF-8 is by far the most common and is the core of many technologies.

The byte order mark should not be used because there is still a lot of BOM unaware software.

Line ending consistency is often broken and forgotten about. CRLF is regarded as redundant by some. LF is more widespread overall because of Unix. In general, it is safer to stick to one way of line breaking.
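The three encoding rules above can be checked mechanically. The following is a minimal sketch, assuming the file content has already been read as raw bytes; the function name check_text_bytes and the wording of the messages are hypothetical and not part of the convention.

```python
# Byte order mark as it appears at the start of a UTF-8 file.
BOM = b"\xef\xbb\xbf"


def check_text_bytes(data: bytes) -> list[str]:
    """Return the encoding problems found in the raw bytes of a text file."""
    problems = []

    # Rule 1: the content must decode as UTF-8.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")

    # Rule 2: no byte order mark at the start.
    if data.startswith(BOM):
        problems.append("starts with a byte order mark")

    # Rule 3: one way of line breaking, so CRLF and bare LF must not mix.
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf
    if crlf and lf:
        problems.append("mixes CRLF and LF line endings")

    return problems
```

A file could be checked with something like check_text_bytes(open(path, "rb").read()), where an empty list means that no problem was found.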

Structure



It is a quite prevalent approach to impose technology specific standards on the program structure and the project layout. There are conventions to put source code into the "src" directory and to split the functionality into separate controller and model modules or packages. Some languages and frameworks go even further in regulating program elements, entities and their relations. A twisted interpretation of the encapsulation principle results in plain getters and setters for data types without any behavior. Methods and fields get grouped and ordered solely on the basis of whether they are static or instance and public or private.

Even controversial standards definitely have advantages in clarity and comprehension speed. But blindly following them leads to contradictions, to the use of misnomers or contractions, to all the shortcomings listed in the introduction above. Naming a directory containing front end styles "css" is clear but not accurate enough. A top level project organization that conveys nothing other than that the project is written in Java or Ruby says nothing about the problem being solved or how it is approached.

Instead, the structure, whether of a program or directories, should be based on and reflect the mental model of the project domain over technologies, types of artifacts and implementation details.




project_1/
├── src/
│   ├── models/
│   │   ├── module_1.js
│   │   └── feature_2.java
│   ├── controllers/
│   │   ├── module_1.rb
│   │   └── feature_2.cpp
│   └── main.c
├── tests/
│   ├── module_1/
│   ├── feature_2/
│   └── integration_3/
├── css/
└── readme.md




project_1/
├── sources/
│   ├── module_1/
│   │   ├── model.java
│   │   ├── controller.java
│   │   └── tests.java
│   ├── feature_2/
│   │   ├── model.js
│   │   ├── controller.js
│   │   └── tests.js
│   ├── tests/
│   │   └── integration_3/
│   └── main.c
├── resources/
└── project_1.md

Applying indentation and spacing to convey structure


When representing different structures, such as plain, tree or generally hierarchical ones and graphs, it is desirable not only to denote relations but also to emphasize them with indentation and spacing.

If one tries to map nesting depth onto horizontal indentation levels directly, one is likely to run into confusion and resort to introducing various rules and exceptions. If we look at a program as a hierarchical structure, a good example is a class with data and methods contained inside a namespace. The direct mapping would oblige us to indent the class within the namespace just as the methods and the data are indented inside the class. But namespaces do not want indentation: it is unnecessary.

The only seemingly universal but simple approach is to reconsider the attitude towards the structure itself. In the example, the data and the methods of the class are the central meaningful components, while the namespace is just a service entity, just like the module declaration, probably introduced to prevent any possible name collisions. This model works well when expanded. In a typical web application markup elements are arranged hierarchically and indented according to semantics and layout: there is a navigation widget in a header; a form is comprised of inputs. But when it comes to writing text documents, the focus is on the content, and the rest becomes utility and auxiliary. The paragraphs are just written as is and are not indented relative to chapters.

The use of vertical spacing shall not be unreasonable. It is a trade off between fitting more text and keeping it comfortable to read, and it depends on the case. So as not to define all the cases, the general mindset is to visually split sibling blocks within an indentation level when either some of them group semantically or at least one of the items has inner structure. Blank lines are added at the start and the end of groups as well.



export module module_1;

namespace namespace_2
{
    class class_1
    {
        void method_1();
        int data_1;
    };
}

namespace namespace_1
{

class class_1
{
    void method_1();
    int data_1;
};

}





\starttext
\startsection[title={Trying ConTeXt}]
This is a {\em ConTeXt} document.
\stopsection
\stoptext







services:

    web:
        image: web
        depends_on:
            - redis
            - sql

    redis:
        image: redis

    sql:
        image: postgres





Dependency order


Import, include or dependency declarations and statements in source code, manifests and build configuration files are much easier to read and maintain when they are ordered and grouped. There appears to be only one common approach, and it is also quite intuitive. There might have been performance and interoperability considerations regarding overriding, recurrence and circular dependencies before. But those do not seem to matter today.

The current module declaration always goes first. In C family languages, that would be an include statement rather than a separate keyword. But it should be perceived in its special meaning and stand alone regardless. Then the libraries or dependencies come in order from generics to specifics. First any system and language abstractions, next perhaps some libraries, such as networking ones. Finally, current project or module parts are at the end.
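The ordering described above can be sketched as a sort key. This is only an illustration: the function order_dependencies, its parameters and the alphabetical order inside each group are assumptions, since the text fixes the order of the groups but not the order within them.

```python
def order_dependencies(names, own, system, project_prefix):
    """Order dependency names from generics to specifics.

    own            -- the declaration of the current module, always first
    system         -- names of system and language abstractions
    project_prefix -- prefix shared by parts of the current project
    """

    def rank(name):
        if name == own:
            return 0  # the current module declaration stands first
        if name in system:
            return 1  # system and language abstractions
        if name.startswith(project_prefix):
            return 3  # current project or module parts go last
        return 2      # everything else is treated as a library

    # Alphabetical order within a group is an assumption of this sketch.
    return sorted(names, key=lambda name: (rank(name), name))
```

For a hypothetical module project.parser, the result starts with project.parser itself, continues with system names, then libraries, and ends with the other project parts.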

Control and data flows


Control flow and data flow are among the keys to understanding what a program does and how it operates, along with its data model, module breakdown, requirements, maybe something else... A program structure is mainly derived from the paradigm used. And there are many different ideas and problems to it. But both control flow and data flow are always to some extent inherent to it, whether in declarative or imperative programming.

The return early pattern is one of the most popular concepts, and it is ubiquitously applicable, especially in the imperative style. It promotes clarity of traversing a successful path and discarding concerns first. Failing fast is a way of scaling the logic of a program. Also, indentation levels get reduced drastically compared to nesting control and data flow alteration constructs. But the same can be achieved by introducing functions and extracting long blocks. The return early pattern requires changing the way of thought about the flow. And if it fits, the benefits are worth it. Overall, following the mental model is still primary. Say, in the case of mutually exclusive conditions, the else branch is the most appropriate. Even if the first branch alters the flow, visually parsing its body is not required to reason about the second one.
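To make the difference in indentation tangible, here is a small sketch of the same check written twice: once with nested conditions and once with early exits. The function names and the port validation task are hypothetical and serve only as an illustration.

```python
def parse_port_nested(text):
    """Nested style: the successful path sits at the deepest level."""
    if text is not None:
        text = text.strip()
        if text.isdecimal():
            port = int(text)
            if 0 < port < 65536:
                return port
            else:
                raise ValueError("port out of range")
        else:
            raise ValueError("port is not a number")
    else:
        raise ValueError("port is missing")


def parse_port(text):
    """Return early style: each concern is discarded as soon as possible."""
    if text is None:
        raise ValueError("port is missing")

    text = text.strip()
    if not text.isdecimal():
        raise ValueError("port is not a number")

    port = int(text)
    if not 0 < port < 65536:
        raise ValueError("port out of range")

    return port
```

Both functions behave the same. In the second one each failure message stays next to its condition, and the successful path reads top to bottom at a single indentation level.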

It can also be said that returning early is incompatible with system level programming, since having a single exit point enables centralized cleanup, which is most relevant when working with system APIs (not to be confused with the single entry single exit principle, which only suggests that functions should return control to where it came from). This is solved with the scope guard pattern. It uses function objects that perform cleanup on their destruction.
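Python has no deterministic destructors, so the sketch below uses contextlib.ExitStack from the standard library as an analogue of a scope guard rather than the destructor based form itself. The function process and the log of acquire and release messages are hypothetical stand-ins for real resources.

```python
from contextlib import ExitStack


def process(items, log):
    """Sum items while recording how resources are acquired and released."""
    with ExitStack() as guard:
        log.append("acquire first")
        # Register the cleanup right after the acquisition.
        guard.callback(log.append, "release first")

        if not items:
            return 0  # early exit: the registered cleanup still runs

        log.append("acquire second")
        guard.callback(log.append, "release second")

        return sum(items)
```

Registered callbacks run in reverse order when the with block is left, whichever return statement is taken, so cleanup stays centralized without forcing a single exit point.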

Composing and grouping entities

Flat organization might result in longer names while hierarchical structure may impose unnecessary burden. Nested namespaces, modules, classes should be introduced with caution and in balance with creating identifiers by combining names.

Language and semantics



Programming languages are usually based on natural languages, typically English, but greatly simplified. Syntax and formal structure are the only significant aspects to machines. Excessive expressiveness would overwhelm humans as well. And while punctuation and complex grammar are disregarded, orthography and basic grammar still apply. It is common
to omit articles,
not to add the suffixes to third person singular present verb endings,
to drop and imply subjects.
Only simple and continuous verb tenses are used.
The verb be and its forms as auxiliary verbs are usually skipped.
And so on...
But spelling mistakes are discouraged.

When certain semantic patterns and habits become established, it may be preferable to formulate them explicitly so that the wording of recurring phrases is consistent over the entire codebase. There is no use trying to anticipate all occasions as they can be very specific. But here are some examples.
Functions are verbs in the imperative mood.
Signals, events and handlers are indicative.
Possession relation is denoted with the verb have.
Descriptive, attributive, distributive adjectives are chosen over limiting, demonstrative ones.
Setting names should clearly indicate what their defaults are. The words set and unset can help signify setting values if no action has been taken.
...

It takes countless considerations to make artifacts concise. The domain is the best source of names. The number of entities is better to be kept minimal. It is important that unrelated terms are named distinctly and close terms are called similarly. There should be no meaningless and implicit distinctions between instances of the same term. All of these points improve comprehension. The searching convenience is also not to be forgotten when naming.

Names are often given unnecessary specificity. It could be for naming repeated terms or even resolving name clashes. Say, when input is processed, it comes as a string and is then converted to a number. One could say that we have an input_string variable and an input_number variable. But that would result in having two different terms for the same purpose of reading input. Shadowing is the best means of keeping identifiers concise in this case, given that it is supported by the technology used. And if it is not supported, or some occurrences cannot be shadowed because they are used later, then just numbering variables is still better. This way it is obvious that all instances serve the same intention. Another technique encountered is embedding type information into identifiers. Sometimes this is to decorate types in dynamic languages, sometimes just for the sake of it. For example, in C, C family languages and POSIX, types are conventionally defined with the _t suffix. This is a slippery slope and should not be abused. It all ends up with adding the words method, function and field to each program component name, to the point where it is unreadable.



function process_input (input_string)
{
    let input_number = number.from (input_string)
    ...
}

function process_input (input)
{
    input = number.from (input)
    ...
}



Terms might be not specific enough or might emphasize unimportant aspects. There is nothing wrong with having the same name for both a variable and its type in general. But the purpose of a variable file of type file may be unclear without context. If it is a log file, then perhaps it should be called just that: log. And repeating the word file in log file is redundant, since this information is already present in the type. Naming a variable callback conveys only that it gives control back at some point, which is not as important as that it gets called after some operation finishes or when some event occurs.

Some components are more specific than others. This may be because they apply broader functionality to certain use cases, as in the log file example from before. Or it may be an implementation of some functionality for a platform. The best way to express this relation and set the context is by structure using means of composition typical to the language: modules, namespaces, classes. That would be putting fields file and level into a structure log. When for some reason a flatter structure is chosen, the name specifies meaning additional to the base term. Nouns and adjectives, complements and objects appear in postposition rather than preposition. So service_unix and service_windows are implementations of service. This groups identifiers by structure and relation relative to the base concept.

Thus, to strengthen the point, if some concepts are not only close but very similar, call them the same; and names of distinct concepts are to differ directly, explicitly and significantly.




class file { ... }

let log_file = file.open (...);

let log : file = file.open (...);




let log_file = file.open (...);
let log_level = "debug";

class log
{
    file = file.open (...);
    level = "debug";
}

let service_log = new log ();

class log
{
    ...
    set_verbose () { ... }
    unset_verbose () { ... }
}

service_log.set_verbose ();

class log
{
    ...
    verbose = { ... };
}

service_log.verbose.set ();


class service
{
    start (started) { ... }
    started = null;

    static unix = class { ... }
}

class service_unix { ... }
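The grouping effect of putting the specifying word after the base term shows up in any sorted listing, such as a directory view or an autocompletion menu. A small sketch follows; all identifiers in it are hypothetical.

```python
# Hypothetical identifiers, only for illustration.
prefixed = ["unix_service", "windows_service", "unix_socket", "service", "socket"]
postfixed = ["service_unix", "service_windows", "socket_unix", "service", "socket"]

# Specifier first: implementations are scattered away from their base terms.
print(sorted(prefixed))

# Base term first: every implementation lands next to its base term.
print(sorted(postfixed))
```

In the second listing service, service_unix and service_windows are adjacent, while in the first one the platform names dominate the order.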

If in doubt whether to use the plural or the singular form of a term, a heuristic can be utilized. Plurals are for things that consist of items of the same type, and singulars are for things composed of different items. A folder and an SQL entity are great examples. A folder that contains source code of implementations for various platforms is plural. An implementation split into multiple files is singular. The table users lists many users.

Contractions and abbreviations make it shorter but are informal and lead to confusion. Therefore, one should refrain from using them.

Style



Letter case


Different letter case styles exist. And there are different models and theories of reading. It would be great to have scientific evidence that a certain letter case is statistically faster and more accurate to recognize. But this just does not seem to be the case. Instead, it is simply taken for granted here that the speed and accuracy of reading a letter case style are a matter of habit, which happens to be the dominant view.

It is suggested to use lowercase letters only. There are no capitalization rules to remember then. Sticking to the same letter case everywhere is also interoperable in the sense that some systems, such as databases and older file systems, are case insensitive. Furthermore, there are few letters that are difficult to distinguish when presented in a combination of majuscule and minuscule glyphs. Without getting into the readability dispute, choosing between all lower and all upper case, capitalized text may be perceived as yelling.

What is the value of using mixed cases and particularly upper camel case, anyway? Maybe one is comfortable always starting class names with capital letters. This distinguishes class names from other identifiers such as variables. Does it have something to do with treating components as proper nouns? And how is it any different from namespaces, which start with a small letter in many conventions? And acronyms become very confusing as well. This is nothing more than embedding implicit information into an identifier. As discussed earlier, relying on minor subtle distinctions between terms is impractical. The language might not support the same name for a type and its instance. Well, it should. Otherwise, it is worth making disambiguation more explicit than letter case. Coming up with a variable name hinting at its purpose is even better. What if one wants the code to scream at them, after all? That is to denote constants or global scopes, an utterly popular stylistic device. Yet it has no semantics whatsoever. Putting constants into a separate namespace, module or class does a better job of expressing intentions.

In natural languages words are separated by spaces. Most programming and markup languages do not allow whitespace characters in identifiers. Since there is no case alternation of the camel style and all letters are lowercase, there has to be a word delimiter. snake_case and kebab-case are the two alternatives. It is hard to give preference to one over the other universally. Software treats underscores and hyphens differently. Search engines used to not consider underscores as word breaks in URLs. Selection behavior varies for these delimiters in text editors, browsers and just about any program that displays text, including operating systems. Double clicking a word usually selects the full identifier if underscores are used as separators, and only the clicked word in the case of hyphens. Neither symbol is applicable everywhere. Underscores are not valid in hostnames, unlike hyphens, and should not be used in domain names either. And the hyphen is the same character as the minus operator in most programming languages and so cannot appear in identifiers there. If one really wants to stand for a single delimiter everywhere, the strongest argument for the underscore is that it resembles the space character and has the same width. But it might blend with underlined text. The major argument against the hyphen as a word separator is that it is already a part of natural language and is used for producing hyphenated compound words. But dates actually look organic with the hyphen. Overall, the underscore seems to be the better choice, especially if you treat the identifier as a whole. And for when the underscore is not a valid option, it might be better either to resort to single word identifiers or to introduce some structure with periods in hostnames and subdomains and slashes in URLs.
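When existing identifiers have to be brought to the lowercase underscore form, a conversion along the following lines may help. It is a sketch, not a complete solution: the function name to_snake_case is hypothetical, and the way acronyms are split is an assumption that will not suit every identifier.

```python
import re


def to_snake_case(identifier):
    """Convert camelCase, PascalCase or kebab-case to snake_case."""
    # Put a delimiter between a lowercase letter or digit and a capital.
    words = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", identifier)
    # Split an acronym from the capitalized word that follows it.
    words = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", words)
    # Unify the delimiter and drop the capitalization.
    return words.replace("-", "_").lower()
```

For example, to_snake_case("parseHTTPResponse") gives "parse_http_response". The second substitution exists only because acronyms in mixed case identifiers are ambiguous, which echoes the confusion mentioned earlier.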

Spacing


The use of spaces is the hardest part of a coding style, both for guides and for codebases. The nonprinting character is easy to miss. Yet altogether spaces change the appearance of code significantly. Conventions usually enumerate situations and tell whether to put a space in each or not. Making another list would not scale over technologies and languages. The set of cases would be neither complete nor feasible to properly maintain because there are so many of them. Even formatting tools will inevitably fail and emit false negatives and positives as more rules begin to overlap, especially if the rules are inconsistent with each other. Moreover, the countless spacing habits are usually written down as they have been established, without any reflection or reasoning. The more of them there are, the more unhealthy it becomes to try to remember and differentiate them all. A systematic solution is required. Otherwise, the guidelines will get broken often.

It needs to be determined where spaces are due and where they are not. The trend is clear: spacing is there to improve readability. But there are places where the space character is discouraged. And opinions on those are often very strong. Starting the examination with characteristic language constructions immediately unveils irregularities.

Keywords usually have spaces put next to them, but not all of them. Some styles particularly emphasize control flow statements. But perhaps this is rather about the opposite kind of keywords, those that form statements resembling function calls: sizeof, typeof, super, the function keyword itself, ... Because everyone has a habit of not putting spaces between function names and argument lists. This is taught with mathematical notation and looks very familiar. Occasionally one can even come across the point of view that spaces should be used to help distinguish keywords from functions. The latter sounds more like an after the fact excuse for the existing state of affairs than an actual argument: development environments highlight reserved tokens differently; and the visual structure of the corresponding expressions is distinctive. The division between functions and keywords is futile. No second order justification is needed in place of simply saying that something looks right. But this is indeed a major quandary. If there must be spaces only after some keywords, then we would have to deal with criteria for belonging to this group. If it is all reserved words, then what to do with function like statements? The prevalent style from mathematics, a field larger than programming, does not fit. Maybe just embrace the space even in function calls and declarations to make everything consonant.

Giving up on spaces altogether could ease the rules. But there are narrow limiting cases that rule this option out. Some constructs and keywords literally require a space, such as return, new, delete, typeof, ... Some languages omit parentheses and braces in various situations, and the space character becomes the token separator. A prime example is the conditional statements in Lua and Go. Using spaces only when required by the compiler or interpreter is simple but stylistically incoherent. That would conflict with the tendency of adorning code to improve readability. One could try the opposite, maximalist approach and put spaces everywhere just to see how it works. Say, if at least a second is spent contemplating whether to use a space in a particular position or not, then do. And when it inevitably goes beyond readable or just starts feeling forced, there is nothing else to do other than follow something more traditional. That being, if not mathematics, then natural language punctuation.



// the same with for, while, match, switch, ...
if(true){ ... }
if (true) { ... }
// in Lua that would be

if true then ... end


function(){ ... }
function function_1(){ ... }

// half measures
function() { ... }
function function_1() { ... }

function () { return 42; }
function function_1 () { typeof true === "boolean"; }


function_1();
super();

function_1 ();
super ();

Lists demonstrate this well: function parameters and arguments, array literals, tuples, ... In natural language there is a space after a comma and never before, which is very straightforward. Next, parenthetical phrases do not have spaces after the opening and before the closing parentheses, and neither do functions in math. This is reflected in forming statements and grouping expressions. However, inline object literals can be regarded either as series separated by commas inside braces without spaces near them or as structural entities. The spaces are actually wanted because they add the shape and the impression of a block or a class. That correlates with the multiple line writing of objects: newline characters are simply replaced with spaces. Further, array literals with brackets can be assigned to either of these two types of usage. So for one who is not bothered by too many spaces, simply opting for them removes many concerns.




// too obtrusive? does not really add much.
if ( true ) { ... }
function_2 ( argument_1, argument_2 );
( a + b ) / c




if (true) { ... }
function_2 (argument_1, argument_2);
(a + b) / c




{property_1: 123, property_2: "321"}
[1, 2, 3]




{
    property_1 : 123,
    property_2 : "321"
}
[
    1,
    2,
    3
]

{ property_1 : 123, property_2 : "321" }
[ 1, 2, 3 ]

The colon mark has diverse purposes. The space after it is always there. The colon mostly clings to the introductory clause when there is one. The downside of having no space before the colon character is that visual clarity might be harmed. Type declaration and indication, inheritance, loop iteration, properties in object literals, initializer lists in constructors, cases in switch statements or labels, ternary conditional expressions, ... The variants are too many to distinguish.



for (String key: map) { ... }

let name: string = 'First Last';
{ property_1: 123, property_2: "321" }


for (String key : map) { ... } // Java
let name : string = 'First Last'; // TypeScript
{ property_1 : 123, property_2 : "321" }
let result = true ? a : b;
struct derived : base { ... } // C++

Sometimes it is just a question of whether it is one word or multiple. A template type inside angle brackets is still a part of a class or function identifier. The double colon scope operator and the dot reference operator form compound terms. So there are no spaces. All the odder it is to see the lack of spaces in expressions with unary operators, such as increment, boolean negation and arithmetic negation, just as with binary ones.




// single compound term
object_1.property_1;
using namespace_1::class_1;




let result=a+b;
a++;


// just like words are separated in writing
let result = a + b;
++ a;
a ++;
! b;
print (- i);

Textual comments prefer being slightly detached so they do not blend with what they describe. Every style inserts a space on both sides of the comment marker: between the code and the marker and between the marker and the text. That should not always be expected of comments that temporarily remove instructions from execution or modify plain text data during tinkering.



int a (0); // helpful message
//disabled_line ();



There should be no trailing whitespace at the end of lines. No strong reason behind this, maybe except for rare issues with careless regular expression searches. Just to be diligent.
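Trailing whitespace is invisible, so it is easier to find by machine than by eye. A minimal sketch, with the hypothetical function name find_trailing_whitespace:

```python
def find_trailing_whitespace(text):
    """Return the 1-based numbers of lines that end with spaces or tabs."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]
```

An empty result means the text is clean. Many editors can do the same on save, which is probably the more convenient place for this check.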

These rational designs should be enough. The results will not fit existing codebases as the divergences are numerous. Most of the time the answer is just a direct matter of what is primary: preferences, programming and cultural background, fanciness, uniformity or simplicity. Hopefully, at least a proper overview on the topic is provided.

Line breaking and indentation


It is best not to have long statements in code at all. There are always means of reducing cognitively loaded constructs, such as extraction of expressions into separate functions or variables. If they are not applicable for some reason, the line is to be broken and wrapped. Some statements always want to occupy multiple lines regardless of how much space they would take up on a single line. This is to outline the visual structure so it is recognized easily. Technical limitations on line length are a thing of the past. Extra eye movement can be positive in general, though the unhealthy effect of repetitive trajectories on the eye muscle balance is uncertain. What is certain is that text out of sight interrupts the reading flow.

Having figured out why, the next question is when to break lines. An objective restriction on the margin of the line eliminates the need to evaluate its lengthiness. So just pick the maximum number of characters per line that suits: 80, 88, 99, 100, 101, 111.
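Once a limit is picked, checking it is mechanical. The sketch below assumes a limit of 80 only as a default, and the function name find_long_lines is hypothetical. Note that it counts characters, which is not always the same as the width displayed on screen.

```python
def find_long_lines(text, limit=80):
    """Return the 1-based numbers of lines longer than limit characters."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

A line of exactly limit characters passes. Only longer lines are reported.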

When the code wraps and gets indented so that the track of the program structure is not lost, whitespace fills the padding. Tabs can be adjusted to the preferred width. On the other hand, different platforms and software have different defaults. Chances are that tabs will not look the same as when they were written. A space is reliably one character wide. Spaces are more flexible and give control within columns, which is not necessarily a benefit, though. The appropriate number of spaces probably varies for programming and markup languages. The most common are 4 and 2 spaces.
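The difference between the two can be simulated with str.expandtabs, which replaces tabs with spaces for a given tab width. The helper name render and the sample lines are hypothetical.

```python
def render(line, tab_width):
    """Show how a line looks in software with the given tab width."""
    return line.expandtabs(tab_width)


with_tabs = "\t\treturn value"
with_spaces = "        return value"

# The tab indented line moves depending on the viewer settings.
print(repr(render(with_tabs, 4)))
print(repr(render(with_tabs, 8)))

# The space indented line stays where it was written.
print(repr(render(with_spaces, 4)))
print(repr(render(with_spaces, 8)))
```

With a tab width of 4 the first line starts at column 8, and with a width of 8 it starts at column 16, while the second line starts at column 8 in both cases.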

As for where to break and what to wrap... All indentation levels are to be aligned. The number of levels is to be minimized more in the sense of creating symmetry between code blocks rather than maintaining indentation depth, but of course along with utilizing the extraction technique and the early return pattern. And the logic for newline placement shall be so innate and intuitive that it reproduces consistently and does not have to be looked up.
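
The early return pattern mentioned above can be sketched like this. The functions and messages are hypothetical; the point is only that the flat version keeps all checks on one indentation level.

```javascript
// Nested version: every check adds an indentation level.
function describe_nested (user)
{
    if (user)
    {
        if (user.active)
        {
            return "active user " + user.name;
        }
        else
        {
            return "inactive user";
        }
    }
    else
    {
        return "no user";
    }
}

// Early return version: the checks stay on one level.
function describe_flat (user)
{
    if (!user)
    {
        return "no user";
    }

    if (!user.active)
    {
        return "inactive user";
    }

    return "active user " + user.name;
}
```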

Navigating the syntax tree, equal (equally significant) parts of a node stay on the same indentation level. Being wrapped, they do not add an offset relative to the beginning of the statement, as it would be needless. This repeats the way usual multiline conditional constructs are written. The closing parenthesis of the argument list of a function call is another example.




module_1.module_2
    .module_3.function_1 ();

a =
    b + c;

while
    (true)
    {
        ...
    }




module_1.submodule_2.submodule_3
.function_1 ();

a =
b + c;

while
(true)
{
    ...
}

The choice of which parts to wrap is free. However, it is wiser to break at higher levels of the syntax tree, because this makes the overall blocks more organized. Sometimes this has no effect: when only the nested part does not fit on the line, when other parts already span several lines, or when they do not take that much space. In conditional statements it is the condition that gets broken.

And so, when deeper expressions become multiline, they add indentation levels. Given the above requirements, this firm translation of syntactic depth into indentation levels leads to a few provisions. The parts of a node get the same indentation as a whole, and their beginning always goes to a new line. In contrast, wrapping in the middle of a node would create unaligned indentation levels. And leaving the start on the previous line as an alignment beacon would make the style hard to maintain: the position may change during refactoring and tear the alignment. Additionally, all parent levels also wrap and indent. This makes the formatting regular.




// breaks alignment. problematic to maintain.
long_function_1 (argument_1,
                 argument_2);
slightly_longer_2 (argument_1,
                   argument_2);
name_changed_1 (argument_1,
                 argument_2); // refactoring tools do not respect alignment




long_function_1 (argument_1, argument_2);

function_1 (
    long_argument_1, // function name is fine, parameters are too many.
    long_argument_2
); // aligned with the outset

if (
    condition_1 // nowhere else to break
    && condition_2
)
{ ... }


if (
    expression_1 && function_2 (
    long_argument_1, // what indentation level should it be?
    long_argument_2
))
{ ... }

if (
    expression_1 && function_2 (
        long_argument_1,
        long_argument_2
    )
)
{ ... }

if (
    expression_1 // has to be on a separate line because of function_2
    && function_2 (
        long_argument_1,
        long_argument_2
    )
)
{ ... }

Bracket placement is another long-running debate. There is the one true brace style, derived from the Kernighan and Ritchie style, with the advantage of saving a line of vertical space. It is also said that with this style a mistakenly placed semicolon between the condition and the branch is easier to catch, which is too minor an argument. Curly braces are meant to denote blocks of code, and keeping them aligned with the indentation for easier detection is what the Allman style is for. Similarly to the use of spaces for creating an impression of structure, an extra line with a brace on it adds visual clarity. And even when following the style that keeps the brace on the same line, one might still add an empty line at the beginning of a longer block and negate the line economy. Opening curly braces on separate lines already act as vertical spacing. Opening round brackets residing on the same line are probably alright when used in their parenthetical or statement continuation meaning.




// no extra line. but the code block is not denoted as clearly.
function function_1 () {
    ...
}

if (true) {
    ...
}

if (true); { // typo causing the branch to always execute is easier to spot
    ...
}

if (
    condition_1
    && condition_2) {
    statement_1; // blends
}

if (
    condition_1
    && condition_2
) {
    statement_1;
}




// the code block is visually aligned
function function_1 ()
{
    ...
}

if (true)
{
    ...
}

if (
    condition_1
    && condition_2
)
{
    statement_1;
}

Note, however, that automatic semicolon insertion sometimes turns newline characters into statement terminators. This should not be allowed to have much effect on line breaking.



// returns undefined
function (...)
{
    return // ;
    {
        ...
    };
}

function (...)
{
    return (
        {
            ...
        }
    );
}

function (...)
{
    let result =
    {
        ...
    };
    return result;
}
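
The same three cases as a runnable sketch, assuming JavaScript semantics. The function names and the object contents are hypothetical.

```javascript
// Returns undefined: a semicolon is inserted right after "return",
// so the braces below are parsed as a separate block with a label inside.
function broken ()
{
    return
    {
        value: 1
    };
}

// Returns the object: the opening parenthesis keeps the statement going.
function wrapped ()
{
    return (
        {
            value: 1
        }
    );
}

// Returns the object: the assignment operator keeps the statement going.
function via_variable ()
{
    const result =
    {
        value: 1
    };
    return result;
}
```

Only a line break directly after the return keyword is affected; breaks after an operator or an opening bracket are safe.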



Correctness and operability aside, the placement of operators around a line break has stylistic implications. Operators that reside at the end of the previous line indicate that the statement is incomplete. Operators placed at the start of the next line show the links between expressions right at the indentation start, so one does not have to jump across the code to understand what it does.




if (
    expression_1 &&
    long_expression_2 ||
    expression_3
)
{ ... }

let result =
variable_1 +
long_variable_2 -
variable_3;

let result =
condition_1 ?
long_expression_2 :
expression_3;


if (
    expression_1
    && long_expression_2
    || expression_3
)
{ ... }

let result =
variable_1
+ long_variable_2
- variable_3;

let result =
condition_1
? long_expression_2
: expression_3;

As already shown in the indentation and structure topic, some utility entities may slightly float out of indentation. Apart from the namespaces mentioned there, labels make another example. But this is rather a question of how one views the relation of such elements to the program structure: do they complement other components or form a level of their own?



let i = 5;

switch (i)
{
case 0 :
    ...
    break;

case 1 : ... break;

default :
    {
        ...
    }
    break;
}

switch (0)
{
    case i % 15 :
    ...
    break;

    case i % 3 : ... break;

    case i % 5 : ... break;

    default :
    {
        ...
    }
    break;
}

label_1 :
for (...)
{
    label_2 :
    for (...)
    {
        if (...)
        {
            break label_1;
        }
    }
}



There are some tempting indentation techniques that turn out to be unusable. Column alignment looks fancy but, again, is tiresome to maintain. The first refactoring will either leave the layout unrestored or clutter the diff with irrelevant changes. Aligning selectively does not work either. Aligning, say, identifiers while shifting operators off the indentation grid is limited by the space available and is difficult for editors to assist with.



class class_1
{
    int                 field_1;
    String              other_field_2;
    type_changed_3                 field_3;
}





let result =
    condition_1
  ? expression_2
  : expression_3;

if (
    condition_1
 && expression_2
=== expression_3 // pops beyond its alignment
)
{ ... }



And just as with trailing whitespace, ending either every text document or none with a blank line is a matter of diligence.

Other considerations


That is about it. But there are always more considerations. At a lower level of detail, specific to technologies, organizations and projects, they are to be documented and communicated. The structure of the convention is to be reused if deemed convenient.