Chapter 18 Code Should be Obvious
Obscurity is one of the two main causes of complexity described in Section 2.3. Obscurity occurs when important information about a system is not obvious to new developers. The solution to the obscurity problem is to write code in a way that makes it obvious; this chapter discusses some of the factors that make code more or less obvious.
If code is obvious, it means that someone can read the code quickly, without much thought, and their first guesses about the behavior or meaning of the code will be correct. If code is obvious, a reader doesn’t need to spend much time or effort to gather all the information they need to work with the code. If code is not obvious, then a reader must expend a lot of time and energy to understand it. Not only does this reduce their efficiency, but it also increases the likelihood of misunderstanding and bugs. Obvious code needs fewer comments than nonobvious code.
“Obvious” is in the mind of the reader: it’s easier to notice that someone else’s code is nonobvious than to see problems with your own code. Thus, the best way to determine the obviousness of code is through code reviews. If someone reading your code says it’s not obvious, then it’s not obvious, no matter how clear it may seem to you. By trying to understand what made the code nonobvious, you will learn how to write better code in the future.
Two of the most important techniques for making code obvious have already been discussed in previous chapters. The first is choosing good names (Chapter 14). Precise and meaningful names clarify the behavior of the code and reduce the need for documentation. If a name is vague or ambiguous, then readers will have to read through the code in order to deduce the meaning of the named entity; this is time-consuming and error-prone. The second technique is consistency (Chapter 17). If similar things are always done in similar ways, then readers can recognize patterns they have seen before and immediately draw (safe) conclusions without analyzing the code in detail.
Here are a few other general-purpose techniques for making code more obvious:
Judicious use of white space. The way code is formatted can impact how easy it is to understand. Consider the following parameter documentation, in which whitespace has been squeezed out:
/**
 * ...
 * @param numThreads The number of threads that this manager should
 * spin up in order to manage ongoing connections. The MessageManager
 * spins up at least one thread for every open connection, so this
 * should be at least equal to the number of connections you expect
 * to be open at once. This should be a multiple of that number if
 * you expect to send a lot of messages in a short amount of time.
 * @param handler Used as a callback in order to handle incoming
 * messages on this MessageManager's open connections. See
 * {@code MessageHandler} and {@code handleMessage} for details.
 */
It’s hard to see where the documentation for one parameter ends and the next begins. It’s not even obvious how many parameters there are, or what their names are. If a little whitespace is added, the structure suddenly becomes clear and the documentation is easier to scan:
/**
 * @param numThreads
 *        The number of threads that this manager should spin up in
 *        order to manage ongoing connections. The MessageManager spins
 *        up at least one thread for every open connection, so this
 *        should be at least equal to the number of connections you
 *        expect to be open at once. This should be a multiple of that
 *        number if you expect to send a lot of messages in a short
 *        amount of time.
 *
 * @param handler
 *        Used as a callback in order to handle incoming messages on
 *        this MessageManager's open connections. See
 *        {@code MessageHandler} and {@code handleMessage} for details.
 */
Blank lines are also useful to separate major blocks of code within a method, such as in the following example:
void* Buffer::allocAux(size_t numBytes) {
    // Round up the length to a multiple of 8 bytes, to ensure alignment.
    uint32_t numBytes32 = (downCast<uint32_t>(numBytes) + 7) & ~0x7;
    assert(numBytes32 != 0);

    // If there is enough memory at firstAvailable, use that. Work down
    // from the top, because this memory is guaranteed to be aligned
    // (memory at the bottom may have been used for variable-size chunks).
    if (availableLength >= numBytes32) {
        availableLength -= numBytes32;
        return firstAvailable + availableLength;
    }

    // Next, see if there is extra space at the end of the last chunk.
    if (extraAppendBytes >= numBytes32) {
        extraAppendBytes -= numBytes32;
        return lastChunk->data + lastChunk->length + extraAppendBytes;
    }

    // Must create a new space allocation; allocate space within it.
    uint32_t allocatedLength;
    firstAvailable = getNewAllocation(numBytes32, &allocatedLength);
    availableLength = allocatedLength - numBytes32;
    return firstAvailable + availableLength;
}
This approach works particularly well if the first line after each blank line is a comment describing the next block of code: the blank lines make the comments more visible.
White space within a statement helps to clarify the structure of the statement. Compare the following two statements, one of which has whitespace and one of which doesn’t:
for(int pass=1;pass>=0&&!empty;pass--) {
for (int pass = 1; pass >= 0 && !empty; pass--) {
Comments. Sometimes it isn’t possible to avoid code that is nonobvious. When this happens, it’s important to use comments to compensate by providing the missing information. To do this well, you must put yourself in the position of the reader and figure out what is likely to confuse them, and what information will clear up that confusion. The next section shows a few examples.
There are many things that can make code nonobvious; this section provides a few examples. Some of these, such as event-driven programming, are useful in some situations, so you may end up using them anyway. When this happens, extra documentation can help to minimize reader confusion.
Event-driven programming. In event-driven programming, an application responds to external occurrences, such as the arrival of a network packet or the press of a mouse button. One module is responsible for reporting incoming events. Other parts of the application register interest in certain events by asking the event module to invoke a given function or method when those events occur.
Event-driven programming makes it hard to follow the flow of control. The event handler functions are never invoked directly; they are invoked indirectly by the event module, typically using a function pointer or interface. Even if you find the point of invocation in the event module, it still isn’t possible to tell which specific function will be invoked: this will depend on which handlers were registered at runtime. Because of this, it’s hard to reason about event-driven code or convince yourself that it works.
To compensate for this obscurity, use the interface comment for each handler function to indicate when it is invoked, as in this example:
/**
 * This method is invoked in the dispatch thread by a transport if a
 * transport-level error prevents an RPC from completing.
 */
void Transport::RpcNotifier::failed() {
    ...
}
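The indirection described above can be seen in a small Java sketch. The names EventModule, register, dispatch, and RpcTracker are hypothetical, invented for illustration and not taken from any particular library. Note that nothing in dispatch reveals which methods will run; the interface comment on the handler is what tells the reader when it is invoked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical event module: maps event names to registered handlers. */
class EventModule {
    private final Map<String, List<Runnable>> handlers = new HashMap<>();

    /** Registers a handler to be invoked whenever the named event occurs. */
    void register(String eventName, Runnable handler) {
        handlers.computeIfAbsent(eventName, k -> new ArrayList<>()).add(handler);
    }

    /**
     * Invokes every handler registered for the event and returns how many
     * ran. Which methods execute here depends entirely on earlier calls
     * to register(), so it cannot be determined by reading this method.
     */
    int dispatch(String eventName) {
        List<Runnable> registered =
                handlers.getOrDefault(eventName, Collections.emptyList());
        for (Runnable handler : registered) {
            handler.run();
        }
        return registered.size();
    }
}

/** Hypothetical handler owner that counts failed RPCs. */
class RpcTracker {
    int failures = 0;

    /**
     * Invoked by EventModule.dispatch when an "rpcFailed" event occurs;
     * never called directly by application code.
     */
    void onRpcFailed() {
        failures++;
    }
}
```

A caller would register the handler once, for example with events.register("rpcFailed", tracker::onRpcFailed), and from then on the only link between the event and the handler is that runtime registration.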
Red Flag: Nonobvious Code
If the meaning and behavior of code cannot be understood with a quick reading, it is a red flag. Often this means that there is important information that is not immediately clear to someone reading the code.
Generic containers. Many languages provide generic classes for grouping two or more items into a single object, such as Pair in Java or std::pair in C++. These classes are tempting because they make it easy to pass around several objects with a single variable. One of the most common uses is to return multiple values from a method, as in this Java example:
return new Pair<Integer, Boolean>(currentTerm, false);
Unfortunately, generic containers result in nonobvious code because the grouped elements have generic names that obscure their meaning. In the example above, the caller must reference the two returned values with result.getKey() and result.getValue(), which give no clue about the actual meaning of the values.
Thus, it’s better not to use generic containers. If you need a container, define a new class or structure that is specialized for the particular use. You can then use meaningful names for the elements, and you can provide additional documentation in the declaration, which is not possible with the generic container.
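As a rough sketch of what such a specialized class might look like for the Pair<Integer, Boolean> example: the class name VoteResponse and the field voteGranted are assumptions made for illustration, since the text does not say what the Boolean represents. Only the meaning of the first value (the current term number) comes from the chapter.

```java
/**
 * Hypothetical replacement for Pair<Integer, Boolean>. The class name and
 * the meaning of the boolean field are assumptions for illustration only.
 */
final class VoteResponse {
    /** Number of the current term, as seen by the responding server. */
    final int currentTerm;

    /** Assumed meaning: true if the responding server granted its vote. */
    final boolean voteGranted;

    VoteResponse(int currentTerm, boolean voteGranted) {
        this.currentTerm = currentTerm;
        this.voteGranted = voteGranted;
    }
}
```

With this in place the method would end with return new VoteResponse(currentTerm, false); and callers would read result.currentTerm and result.voteGranted instead of result.getKey() and result.getValue(). The declaration also gives each element a place for its own comment.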
This example illustrates a general rule: software should be designed for ease of reading, not ease of writing. Generic containers are expedient for the person writing the code, but they create confusion for all the readers that follow. It’s better for the person writing the code to spend a few extra minutes to define a specific container structure, so that the resulting code is more obvious.
Different types for declaration and allocation. Consider the following Java example:
private List<Message> incomingMessageList;
...
incomingMessageList = new ArrayList<Message>();
The variable is declared as a List, but the actual value is an ArrayList. This code is legal, since List is an interface that ArrayList implements, but it can mislead a reader who sees the declaration but not the actual allocation. The actual type may impact how the variable is used (ArrayLists have different performance and thread-safety properties than other implementations of List), so it is better to match the declaration with the allocation.
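A minimal sketch of the matched form follows. The Message and MessageReceiver classes are hypothetical, included only so that the fragment compiles on its own.

```java
import java.util.ArrayList;

/** Minimal message type, assumed here only so the sketch compiles. */
class Message {
    final String text;

    Message(String text) {
        this.text = text;
    }
}

/** Hypothetical owner of the list from the example above. */
class MessageReceiver {
    // Declared as ArrayList, matching the allocation in the constructor,
    // so a reader who sees only this declaration knows the concrete type
    // and its performance and thread-safety properties.
    private ArrayList<Message> incomingMessageList;

    MessageReceiver() {
        incomingMessageList = new ArrayList<Message>();
    }

    /** Appends a newly arrived message to the list. */
    void receive(Message message) {
        incomingMessageList.add(message);
    }

    /** Returns the number of messages received so far. */
    int pendingCount() {
        return incomingMessageList.size();
    }
}
```

This follows the recommendation in the text. Declaring the interface type remains a common Java practice when the concrete type is deliberately meant to be hidden from readers and callers.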
Code that violates reader expectations. Consider the following code, which is the main program for a Java application:
public static void main(String[] args) {
    ...
    new RaftClient(myAddress, serverAddresses);
}
Most applications exit when their main programs return, so readers are likely to assume that will happen here. However, that is not the case. The constructor for RaftClient creates additional threads, which continue to operate even though the application’s main thread finishes. This behavior should be documented in the interface comment for the RaftClient constructor, but the behavior is nonobvious enough that it’s worth putting a short comment at the end of main as well. The comment should indicate that the application will continue executing in other threads. Code is most obvious if it conforms to the conventions that readers will be expecting; if it doesn’t, then it’s important to document the behavior so readers aren’t confused.
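The two comments suggested above might look roughly like this. BackgroundClient is a hypothetical stand-in for RaftClient, whose real implementation is not shown in the text; the sketch assumes the extra thread is an ordinary non-daemon Java thread, which is what keeps the JVM alive after main returns.

```java
/** Hypothetical stand-in for RaftClient; the real class is not shown. */
class BackgroundClient {
    final Thread worker;

    /**
     * Starts a non-daemon worker thread that runs the given work. The
     * application keeps running until this thread finishes, even if the
     * thread that called the constructor has already returned from main.
     */
    BackgroundClient(Runnable work) {
        worker = new Thread(work, "client-worker");
        worker.setDaemon(false);
        worker.start();
    }
}

class ClientMain {
    public static void main(String[] args) {
        new BackgroundClient(
                () -> System.out.println("worker thread is running"));
        // main returns here, but the application does not exit: it
        // continues executing in the worker thread started by the
        // BackgroundClient constructor.
    }
}
```

The constructor comment documents the behavior at its source, and the comment at the end of main catches readers who would otherwise assume the program exits at that point.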
Another way of thinking about obviousness is in terms of information. If code is nonobvious, that usually means there is important information about the code that the reader does not have: in the RaftClient example, the reader might not know that the RaftClient constructor created new threads; in the Pair example, the reader might not know that result.getKey() returns the number of the current term.
To make code obvious, you must ensure that readers always have the information they need to understand it. You can do this in three ways. The best way is to reduce the amount of information that is needed, using design techniques such as abstraction and eliminating special cases. Second, you can take advantage of information that readers have already acquired in other contexts (for example, by following conventions and conforming to expectations) so readers don’t have to learn new information for your code. Third, you can present the important information to them in the code, using techniques such as good names and strategic comments.