Don't take the following text as truth, but rather as a model I have found very helpful for understanding and finding better ways of writing code. I think every programmer strives to write better code, and readability is one aspect of "good code". There are plenty of articles and books on the topic, yet I find many of them lack something: not recommendations, but analysis. What makes some code more readable than other code? Saying it uses better variable names is one thing, but what makes one variable name easier to read than another? I mean digging deeper, into human psychology. After all, it is our brain that does all the processing.
Psychology Primer
As any programmer knows, we have a limited capacity to think about things. This is our working memory limit. There's an old myth going around that we can hold 7±2 objects in our head. It is known as "The Magical Number Seven" and it isn't entirely accurate. The number has since been refined to 4±1, and some even suggest there isn't a limit, but rather a degradation of ideas over time. For all intents and purposes, we can assume that we can process a small number of ideas in our head at a given time. The exact number isn't that important.
But some would still confidently say that they can handle problems involving more than 4 ideas. Luckily there’s another process going on in our brain called chunking. Our brain automatically groups information pieces into larger pieces (chunks).
Dates and phone-numbers are good examples of this:
two levels of chunking a date
From these chunks we build up our long term memory. I like to imagine it as a large web consisting of many chunks, chunk sequences and groupings.
memory web
You might guess from this image that moving from one place to another in memory is slow. And you would be right. In UX there's a concept called singular focus of attention, which means that we can focus on a single thing at a time. It also has a friend called locus of attention, which says that our attention is also localized in space.
You might think this is the same thing as the working memory limit, however there is a slight difference. Working memory capacity describes how big our focusing area is; the focus/locus of attention says that we can only focus when there is a place in our brain that contains the ideas.
focus of attention and working memory capacity
The focus and locus of attention are important to know because switching costs are significant. Switching is even slower when we need to create new chunks and groupings. It also goes the other way: the more familiar something is, the less time it takes to make it our focus.
We also remember things better when we are in a similar context. This is called the encoding specificity principle. It means that by designing our encoding and recalling conditions, we can better shape what we remember.
In an experiment, divers were assigned to memorize words on land and under water, then recall them on land or in water. The best results were for people who memorized and recalled on land. Surprisingly, the second best were the people who memorized and recalled in water. This showed that the context in which you learn things has an impact on how well you can remember them.
encoding specificity principle
To make things shorter, I’ll use context to refer to “focus and locus of attention” and how it relates to other chunks and loci. Effectively our brain is moving from one context to another. When we move our focus of attention we also remember what our previous contexts were, until our memory fades.
related contexts
From these contexts and chunks we build up mental representations and a mental model. There's a slight difference between these two things. A mental representation is our internal cognitive symbol for representing the external world or a mental process. A mental model can be thought of as an explanation of a mental representation. Often these terms are used interchangeably.
Mental models are vitally important to our ability to precisely describe a solution to a problem. There are many different mental models possible for a single problem, each having its own benefits and problems.
All of these ideas sound nice and precise, however our brains are quite imprecise. There are many other problems with our brain.
Our brains need to do more work when dealing with abstractions.
When ideas are similar, their chunks are related and linked in our brains in a similar way. This can leave our brain unable to "rebuild the context properly", because we are uncertain which chunk is the right one. Example: I and 1; O and 0.
Ambiguity is another source of uncertainty. When a thing is ambiguous, there are multiple interpretations for the same thing. Homonyms are the best example of this property. Example: crane, the bird or the machine.
Uncertainty causes us to slow down. It might be a few milliseconds, but that can be enough to disrupt our state of flow or make us use more working memory than necessary.
There are of course interruptions that can disrupt our working memory, but there are also "smaller interruptions" called noise. If someone is saying random numbers while you are trying to calculate, you can accidentally start processing them and use up some of your working memory. The same can happen visually, when there are many irrelevant things on screen between the important things.
Many studies suggest that our brains have trouble processing negation. The effect of negation depends on the context, but it should be used with care.
All of these together add up to cognitive load. It is the total amount of mental effort being used. Our processing capacity decreases with prolonged cognitive load and it is restored with rest. With prolonged cognitive load our minds also start to wander.
If this is new information to you, then I highly suggest taking a break now. These form fundamental properties that code analysis will rest upon.
I’m going to use the term *programming artifact*. By that I mean everything that is created as a result of programming. It might be a method you write, type declarations for a function, variable names, comments, Unreal Engine Blueprints, UML diagrams etc. Effectively anything that is a direct result of programming.
Here are a few recommendations, rules-of-thumb and paradigms analyzed in the context of psychology. By no means is this an exhaustive list or even a guide on what exactly to do. Probably there are many places where the analysis could be better, but this is more about showing how we can gain deeper insight into code readability by using psychology.
Length is not a virtue in a name; clarity of expression is. — Rob Pike
Let's take a simple for loop:
A. for(i=0 to N)
B. for(theElementIndex=0 to theNumberOfElementsInTheList)
Most programmers would recommend A. Why?
B uses longer names, which prevents us from recognizing this as a single chunk. The longer names also don't help create a better context; effectively they are just noise.
However, let’s imagine different ways of writing packages / units / modules / namespaces:
A. strings.IndexOf(x, y)
B. s.IndexOf(x, y)
C. std.utils.strings.IndexOf(x, y)
D. IndexOf(x, y)
In example B the namespace s is too short and doesn't help "to find the right chunk".
In example C the namespace std.utils.strings is too long; most of it is unnecessary, because strings itself is descriptive enough (unless you need to use multiple of them).
In example D, without a namespace, the call becomes ambiguous: you might be unsure where IndexOf comes from and what it is related to.
It's important to mention that if all of the code is dealing with strings, it will be quite easy to assume that IndexOf is some string-related function. In such cases even the strings part might be too noisy. For example int16.Add(a, b), compared to a + b, would be much harder to read.
With variables it would be easy to conclude that "modification is bad, because it makes it harder to track what is happening". But let's take these examples:
// A.
func foo(values []int) (int, int) {
	sum, sumOfSquares := 0, 0
	for _, v := range values {
		sum += v
		sumOfSquares += v * v
	}
	return sum, sumOfSquares
}

// B.
func GCD(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// C.
func GCD(a, b int) int {
	if b == 0 {
		return a
	}
	return GCD(b, a%b)
}
Here foo is probably the easiest to understand. Why? The problem isn't modifying the variables, but rather how they are modified. A doesn't have any complex interactions, which both B and C do. I would also guess that even though C doesn't have modifications, our brain still processes it as such.
// D.
sum = sum + v.x
sum = sum + v.y
sum = sum + v.z
sum = sum + v.w

// E.
sum1 := v.x
sum2 := sum1 + v.y
sum3 := sum2 + v.z
sum4 := sum3 + v.w
Here is another example where the modification-based version (D) is easier to follow. E introduces new variables for the same idea; effectively, the different variables become noise.
Let's take another for loop:
A. for(i = 0; i < N; i++)
B. for(i = 0; N > i; i++)
C. for(i = 0; i <= N-1; i += 1)
D. for(i = 0; N-1 >= i; i += 1)
How long did it take for you to figure out what each line is doing? For anyone who has been coding for a while, A probably took the least time. Why is that?
The main reason is familiarity. To be more precise, we have a chunk in our long-term memory for A, but not for any of the others. This means we need to do more processing before we can extract the meaning and concept from it.
For any complete beginner, all of these would be processed quite similarly. They wouldn’t notice that one is “better” than any other.
A proficient programmer reads A as a single chunk or idea: "i is looped for N items". However, a beginner reads it as "We initialize i to zero. Then each time we test whether we are still smaller than N. Then we add one to i."
A is what you call the “idiomatic way” of writing the for loop. It’s not really better in terms of intrinsic complexity. However, most programmers can read it more easily, because it is part of our common vocabulary.
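To make the familiarity point concrete in Go (the language of the GCD examples above), here is a sketch contrasting the idiomatic loop with an equivalent but unusual one; the function names are invented for illustration:

```go
package main

import "fmt"

// sumIdiomatic uses the loop form most programmers recognize as a single chunk.
func sumIdiomatic(values []int) int {
	total := 0
	for i := 0; i < len(values); i++ {
		total += values[i]
	}
	return total
}

// sumUnusual computes exactly the same thing, but the reversed condition
// forces the reader to decode the loop step by step.
func sumUnusual(values []int) int {
	total := 0
	for i := 0; len(values)-1 >= i; i += 1 {
		total += values[i]
	}
	return total
}

func main() {
	vs := []int{1, 2, 3, 4}
	fmt.Println(sumIdiomatic(vs)) // 10
	fmt.Println(sumUnusual(vs))   // 10
}
```

Both compile and return the same result; only the reading cost differs.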
Most languages have an idiomatic way of writing things. There are even papers and books about them, starting with APL idioms, [C++ idioms](http://www.dre.vanderbilt.edu/~sutambe/documents/More C++ Idioms.pdf) and more structural idioms like in GoF Design Patterns. These books can be regarded as a vocabulary for writing sentences and paragraphs, such that it will be recognized by people.
There's however a downside to all of this. The more idioms there are, the bigger the vocabulary you have to have to understand something. Languages with unlimited flexibility often suffer from this. People end up creating "idioms" that help them write more concise code, but everybody else is slowed down by them.
With regard to repeated structures, names such as "model" and "controller" act as chunks that remind us how these structures relate to each other.
Frameworks, micro-architectures and game engines all try to create and enforce such relations. This means people have to spend less time figuring out how things communicate and are wired up. Once you grok the structures it becomes easier to jump from one code base to another.
However the main factor with all of this is consistency. The more consistent the code base is in naming, formatting, structure, interaction etc. the easier it is to jump into arbitrary code and understand it.
As previously mentioned uncertainty can cause stuttering when reading or writing code.
Let's take ambiguity as our first example. The simplest example would be [1,2,3].filter(v => v >= 2). The question is: what will this return, "2 and 3" or "1"? It's a simple question, but it can cause a reading/writing stutter when you don't use it day in and day out.
The source of the stutter is ambiguity. In the real world there are two uses for a filter: one keeps the part that gets stuck in the filter, the other keeps the part that passes through. For example, when you have gold in water, you want to get rid of the water. When you have dirt in water, you probably want to get rid of the dirt.
Even if we precisely define what filter does, it can still cause a stutter, because it's hardwired with two meanings in our brain. The common solution is to use functions such as select, discard or keep.
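A sketch of that solution in Go: instead of an ambiguous filter, the direction is encoded in the name. Keep and Discard are hypothetical helpers, not part of any standard library:

```go
package main

import "fmt"

// Keep returns the elements for which pred is true.
func Keep(values []int, pred func(int) bool) []int {
	var out []int
	for _, v := range values {
		if pred(v) {
			out = append(out, v)
		}
	}
	return out
}

// Discard returns the elements for which pred is false.
func Discard(values []int, pred func(int) bool) []int {
	return Keep(values, func(v int) bool { return !pred(v) })
}

func main() {
	atLeastTwo := func(v int) bool { return v >= 2 }
	fmt.Println(Keep([]int{1, 2, 3}, atLeastTwo))    // [2 3]
	fmt.Println(Discard([]int{1, 2, 3}, atLeastTwo)) // [1]
}
```

Neither name leaves room for the "which side of the filter?" question.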
We can also attach meaning in different ways, such as with types. For example, instead of GetUser(string) you can use type CustomerID string and GetUser(CustomerID) to make clear that the interpretation is "get user using a customer id" instead of other possibilities such as "get user by name".
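A sketch of this in Go, matching the GetUser example above; the User type and the lookup body are hypothetical stand-ins:

```go
package main

import "fmt"

// CustomerID gives the identifier its own type, so call sites state
// which interpretation of the string is meant.
type CustomerID string

// User is a hypothetical stand-in for a real user record.
type User struct{ Name string }

func GetUser(id CustomerID) User {
	// real lookup elided; return a placeholder
	return User{Name: "user-" + string(id)}
}

func main() {
	id := "C-1042" // a plain string variable
	// GetUser(id) would not compile: the string must be converted
	// explicitly, which documents the intended interpretation.
	u := GetUser(CustomerID(id))
	fmt.Println(u.Name) // user-C-1042
}
```

The explicit conversion at the call site is precisely the point: it forces the writer to state "this string is a customer id".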
Similarity is also easy to understand conceptually. For example, having variables such as total1, total2, total3 can lead to situations where you make copy-paste mistakes or, over a longer piece of code, lose track of what each meant. Names such as sum, sum_of_squares, total_error can provide more meaning.
Having multiple names for the same thing can also be a source of confusion when moving between packages. For example, in one package you use the variable names c and cl, in another client, and in a third source. It's also interesting to think about special variables such as *this* and *self* in this light.
Ambiguity and similarity is not a problem just at the source level. Eric Evans noted this in DDD with the Ubiquitous Language pattern. The notion is that in different contexts such as billing and shipping, words such as “client” can have widely different usages and meanings, so it’s helpful to keep a vocabulary around to ensure that everyone communicates clearly.
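A sketch of the idea in Go. In a real code base these would be billing.Client and shipping.Client in separate packages; the prefixes keep this sketch in one file, and all fields are invented for illustration:

```go
package main

import "fmt"

// BillingClient is what "client" means in the billing context.
type BillingClient struct {
	ID            string
	PaymentMethod string
}

// ShippingClient is what "client" means in the shipping context.
type ShippingClient struct {
	ID      string
	Address string
}

func main() {
	b := BillingClient{ID: "C-1", PaymentMethod: "invoice"}
	s := ShippingClient{ID: "C-1", Address: "12 Harbor Rd"}
	// The same real-world customer, modeled separately per context.
	fmt.Println(b.PaymentMethod, s.Address)
}
```

Keeping the two models separate avoids one bloated Client struct whose fields only make sense in some contexts.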
We have all seen the “stupid beginner examples” of commenting:
// makes variable i go from 0 to 99
for(var i = 0; i < 100; i++) {
    // sets value 4 to variable a
    var a = 4;
While it may look stupid, it might have some purpose. Think about learning a second or third language. You usually learn the new language by understanding the translation in your primary language. These are the “chunks” written out explicitly.
Once you have learned the "chunk", the comments become noise, because you already know that information by looking at the code itself.
As programmers get better, the intent of comments shifts to condensing information and providing a context for understanding the code: why a particular approach was taken, or what needs to be considered when modifying the code.
Effectively, it’s for setting up the right mental model for reading the code.
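For instance, a "why" comment that condenses the intent rather than restating the code; the example is invented for illustration:

```go
package main

import (
	"fmt"
	"sort"
)

// topScores returns the n highest scores in descending order.
func topScores(scores []int, n int) []int {
	// Copy before sorting: callers elsewhere rely on the original
	// order of scores, so we must not sort it in place.
	copied := append([]int(nil), scores...)
	sort.Sort(sort.Reverse(sort.IntSlice(copied)))
	if n > len(copied) {
		n = len(copied)
	}
	return copied[:n]
}

func main() {
	fmt.Println(topScores([]int{3, 9, 1, 7}, 2)) // [9 7]
}
```

The comment doesn't describe what append and sort do; it records the constraint that makes the copy necessary, which is exactly the context a future reader needs.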
Working memory limitation leads us to decompose and partition our code into different interacting pieces. We must be mindful in how we relate different pieces and how they interact.
For example when we have a very deep inheritance chain and we use things from all different inheritance levels, the class might be too complicated, even if each class has maybe two methods and each method is five lines of code. The class and all the parents form a single “whole”. Illustratively you can count each “inheritance step” as a “single idea” that you need to remember when you use that particular class.
The other side of contexts is moving between function calls. Each call is a “context in our mental model”, so we need to remember where we came from and how it relates to the current situation. The deeper the call stack, the more stuff we have to keep in mind.
One way to reduce the depth of our mental model contexts is to clearly separate them. One of such examples is early return:
public void SomeFunction(int age)
{
    if (age >= 0) {
        // Do Something
    } else {
        System.out.println("invalid age");
    }
}

public void SomeFunction(int age)
{
    if (age < 0) {
        System.out.println("invalid age");
        return;
    }
    // Do Something
}
In the first version, when we read the "Do Something" part, we understand that it only happens when the age is non-negative. However, by the time we reach the "else" part we may have forgotten what the condition was, because the condition can be quite far away.
The second version is somewhat nicer. We no longer need to keep multiple "contexts" in our head; instead we can focus on a single context that is set up and verified by the checks at the beginning.
One of the usual recommendations is "don't have global variables". But when a variable is set during startup and never changed again, is that a problem? The problem isn't in the "variableness" or "globalness" of something, but rather in how it affects our capability to understand code. When something is modified at a distance, we cannot build a contained model of it. The "globalness" of course clutters the namespace (depending on the language) and means there are more places it can be accessed from. There are many other things with the same properties, such as the Singleton. So why is it considered better than a global variable?
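As a sketch of the "set once at startup" case in Go (names and values hypothetical): the variable is global, but because nothing modifies it after initialization, readers can treat it as a constant:

```go
package main

import "fmt"

// maxRetries is package-level ("global"), but it is assigned exactly once
// during startup and never modified again, so readers can treat it as a
// constant and the modification-at-a-distance problem never arises.
var maxRetries int

func initConfig() {
	// In a real program this might come from a flag or an environment
	// variable; the point is that it is set once, before anything runs.
	maxRetries = 3
}

func attempt(task func() error) error {
	var err error
	for i := 0; i < maxRetries; i++ {
		if err = task(); err == nil {
			return nil
		}
	}
	return err
}

func main() {
	initConfig()
	fmt.Println(attempt(func() error { return nil })) // <nil>
}
```

The trouble starts only when some distant code reassigns maxRetries mid-run; then every reader must track who writes it and when.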
The single responsibility principle (SRP) is easy to understand with these concepts. It tries to ensure that we have a proper chunk for a thing, and this constraint often makes chunks smaller. Having a single responsibility also means we end up with things that have a smaller working memory footprint. However, we need to consider that when we separate a class or function into multiple pieces, we introduce many new artifacts. When these artifacts are deeply bound together, we may not even gain the benefits of SRP.
Carmack's comments on inlined functions are a good example of this. The three examples he gave were these:
// A
void MinorFunction1( void ) {
}
void MinorFunction2( void ) {
}
void MinorFunction3( void ) {
}
void MajorFunction( void ) {
MinorFunction1();
MinorFunction2();
MinorFunction3();
}
// B
void MajorFunction( void ) {
MinorFunction1();
MinorFunction2();
MinorFunction3();
}
void MinorFunction1( void ) {
}
void MinorFunction2( void ) {
}
void MinorFunction3( void ) {
}
// C.
void MajorFunction( void ) {
{ // MinorFunction1
}
{ // MinorFunction2
}
{ // MinorFunction3
}
}
By making pieces smaller we made the chunks smaller, however understanding the system became harder. We cannot read our code from top-to-bottom and understand what it does, but instead we have to jump around in the code base to read it. Version C preserves the linear ordering while still maintaining the conceptual chunks.
Overall we can summarize the code readability as trying to balance different aspects:
- Names help us retrieve the right chunks from memory and help us figure out their meaning. Too long a name can end up being noisy in our code. Too short a name may not help us figure out its true meaning. Bad names are misleading and confusing.
- To minimize the cost of shifting attention, we try to write all related code close together. To minimize the burden to our working memory, we try to split the code into smaller and more fathomable units.
- Using common vocabulary allows the author as well as the team to rely on previous code-reading experience. That means reading, understanding and contributing to code is easier. Using unique solutions in places where a common one would do can slow down new readers of that code.
In practice there is no "perfect" way of organizing code, only many trade-offs. While I focused on readability, it is never the end goal; there are many other things to consider, such as reliability, maintainability, performance and speed of prototyping.