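[ { "title": "Data Structures: Trie", "url": "https://chunyoupeng.tech/2025/shu-ju-jie-gou-trie/", "body": "A Trie is a data structure for fast dictionary lookups. In this post we store English words, so each node covers the 26 lowercase letters a-z. struct TrieNode { TrieNode* children[26] = {}; bool leaf = false; }; Here, children[] is an array of 26 pointers. Since c - 'a' maps each letter to a number from 0 to 25, every letter corresponds to an array index. leaf marks whether the node is the end of a word. Image source: https://www.geeksforgeeks.org/ Basic Trie operations A Trie has two main operations: insert and search. Insert void insert(string word) { TrieNode* node = root; for (auto c : word) { int idx = c - 'a'; if (!node->children[idx]) node->children[idx] = new TrieNode(); node = node->children[idx]; } node->leaf = true; } Notes: root is the root node of the Trie. idx corresponds to one letter. If the child does not exist, create a new node. If it exists, follow that idx downward. Search bool search(string word) { TrieNode* node = root; for (auto c : word) { int idx = c - 'a'; if (!node->children[idx]) return false; node = node->children[idx]; } return node->leaf; } Notes: If a child is missing at any point during the lookup, return false immediately: the word was not found. If every letter of the word was visited, there are two cases. If node->leaf is true, the word was found. If node->leaf is false, word is not a complete word, though it may be a prefix of some word. To search for a prefix instead, just change the final line to return true. " }, { "title": "Difference Array", "url": "https://chunyoupeng.tech/2025/chai-fen-shu-zu/", "body": "Suppose we have an array of length 6: a=[0,0,0,0,0,0]. To add 3 to every element between L and R, the direct approach updates each element one by one, which is too slow when there are many updates. This is where a difference array helps: record only what happens at the two endpoints, then resolve everything in a single pass at the end. For example, with L=1 and R=3, the element-by-element loop gives a=[0,3,3,3,0,0]. Instead we can record d[1]=3 and d[4]=-3 in a difference array, so d=[0,3,0,0,-3,0], and the final result is recovered with: $a[i]=a[i-1]+d[i]$ So $a[0]=0, a[1]=a[0]+d[1]=3, a[2]=a[1]+d[2]=3 … a[4]=a[3]+d[4]=3+(-3)=0$. Adding a value at one position of the difference array carries it forward through every later element. Changing d[1] raises a[1] relative to the element before it, and since the differences $d[2]$ and $d[3]$ are unchanged (still 0), the increase keeps propagating. Why is d[4]=-3? 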
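Because the increase applies only to positions 1-3, each of those elements is 3 larger. For a[4] to stay unchanged, it must be 3 lower than a[3], so d[4]=-3. In other words, the propagation is cut off here and carries no further. In short: the start point “propagates”, the end point “cuts off the flow”. " }, { "title": "BIT (Binary Indexed Tree) Notes", "url": "https://chunyoupeng.tech/2025/shen-me-shi-bit/", "body": "What is a BIT? The full name is Binary Indexed Tree, also called a Fenwick tree. It is used for range queries and updates. Its advantage over a prefix-sum array is that values can be updated dynamically, with both query and update taking $O(log n)$. Core formula: $tree[k] = sum_q(k - p(k) + 1, k)$ Here $sum_q(a, b)$ is the sum of the original array over positions a to b, and $p(k)$ is the largest power of two that divides k, with $p(k) = k & -k$. Why? $k & -k$ isolates the lowest set bit of k, and the value of that bit is exactly the largest power of two dividing k, so $p(k)$ can be computed quickly. One consequence of the formula: when k is odd, the range contains only k itself. Core implementation Since $sum(a, b) = sum(1, b) - sum(1, a-1)$, knowing $sum(1, k)$ is enough to compute the sum of any range. A C++ implementation follows. Sum of 1 to k: int sum(int k) { int res = 0; while (k >= 1) { res += tree[k]; k -= k & -k; } return res; } Code notes: tree[k] stores the sum of the p(k) original elements ending at k, so the idea is to add that block, step back past it, and repeat until k reaches 0. Updating a value: Compared with prefix sums, the advantage of a BIT is that values can be updated dynamically. void add(int k, int x) { while (k <= n) { tree[k] += x; k += k & -k; } } Code notes: An update does not affect the entries before k. Only later entries whose ranges cover k are affected, so each of them is increased by x following the same rule. " }, { "title": "How to Implement DFS Iteratively?", "url": "https://chunyoupeng.tech/2025/yong-die-dai-de-fang-shi-shi-xian-dfs/", "body": "DFS, depth-first search, is a common way to traverse a graph. The core idea is to start from a node and keep following child nodes as deep as possible, hence the name. It is usually implemented recursively, but recursion depth can become a problem (a very deep graph can overflow the call stack), so in some cases an iterative version is needed. At its core DFS is a stack. A stack is a last-in, first-out structure, meaning the most recently added item is processed first, which matches how DFS works. So we can use an explicit stack to simulate DFS. To avoid visiting a node twice, we need a vis array to track visited nodes. The iterative C++ implementation: /* Takes a root node. The graph is an adjacency list: given a node, it lists all of that node's neighbors. */ #include <stack> #include <vector> using namespace std; void dfs(int root, vector<bool>& vis, const vector<vector<int>>& g) { stack<int> s; vis[root] = true; s.emplace(root); while (!s.empty()) { auto u = s.top(); s.pop(); for (auto v : g[u]) { if (!vis[v]) { s.emplace(v); vis[v] = true; } } } } " }, { "title": "Binary Lifting Example: Company Queries II", "url": "https://chunyoupeng.tech/2025/bei-zeng-fa-li-ti-company-queries-ii/", "body": "The problem is from CSES: Company Queries II. Approach: This is a classic tree problem: finding the lowest common ancestor. The first idea is to bring the two queried nodes to the same depth and then climb upward one level at a time. But that costs $O(q (log n + n))$, which is close to $O(N^2)$ and far too slow. The question is how to speed up the level-by-level climb. Binary lifting works here too. After bringing the two nodes to the same depth, be greedy: let both nodes jump upward together as far as possible. But they must not land on the same node, because a shared node may be any common ancestor above the lowest one (in the extreme, root node 1). Every jump distance can be written in binary; see the binary lifting post. 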
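So we can move both nodes to just below the lowest common ancestor and then return the parent of either one. The final C++ code is below (fastio, vi, rep and eb are shorthand macros that the post does not show; presumably fast I/O setup, vector<int>, a half-open for loop, and emplace_back): int main() { fastio; int n, q; cin >> n >> q; int LOG = 1; while ((1 << LOG) <= n) ++LOG; vector<vi> g(n + 1), up(n + 1, vi(LOG)); vi depth(n + 1, 0); rep(i, 2, n + 1) { int e; cin >> e; g[e].eb(i); g[i].eb(e); up[i][0] = e; } /* dfs: compute the depth of every node */ auto dfs = [&](auto &&self, int u) -> void { for (auto &&v : g[u]) { if (depth[v] == 0 && v != 1) { depth[v] = depth[u] + 1; self(self, v); } } }; dfs(dfs, 1); /* build the jump table: up[i][j] is the node 2^j levels above i, or 0 if there is none */ for (int j = 1; j < LOG; j++) { for (int i = 1; i <= n; i++) { int mid = up[i][j - 1]; up[i][j] = mid == 0 ? 0 : up[mid][j - 1]; } } while (q--) { int a, b; cin >> a >> b; if (a == 1 || b == 1) { cout << 1 << endl; continue; } if (a == b) { cout << a << endl; continue; } while (a != b) { int da = depth[a], db = depth[b]; if (da == db) { for (int j = LOG - 1; j >= 0; --j) { if (up[a][j] != up[b][j]) { a = up[a][j]; b = up[b][j]; } } cout << up[a][0] << endl; break; } /* if the nodes are at different depths, first bring them to the same depth */ int min_depth = min(da, db); /* ka and kb are the number of levels each node has to jump */ int ka = da - min_depth, kb = db - min_depth; da -= ka; db -= kb; for (int j = 0; j < LOG; j++) { if (ka & (1 << j)) a = up[a][j]; if (kb & (1 << j)) b = up[b][j]; } if (a == b) { cout << a << endl; break; } } } } " }, { "title": "My Thoughts on Vibe Coding", "url": "https://chunyoupeng.tech/2025/tan-tan-wo-dui-vibe-codingde-kan-fa/", "body": "Vibe coding has been a hot topic lately, so I want to share my take on it based on some projects I am working on. Rather than vibe coding as such, this is really about my recent views on AI-assisted programming. One real project of mine was vehicle recognition. I knew nothing about the field, yet with AI I got through the project step by step, and it now has close to ten thousand lines of Python. Here is how it went. I will skip the part about simply instructing the AI and focus on the pitfalls I hit. Don't rely on AI too much Searching Google, reading blogs, and reading open-source code worked better than only asking ChatGPT. ChatGPT's search ability is strong now, but relying on AI alone is not a good way to build this kind of software. In my experience this area is already fairly mature, so online tutorials and open-source projects helped more than ChatGPT did: when I asked ChatGPT I was completely lost, and things only made sense after I read blog posts. Documentation is the core Whatever you want to build, write it down clearly in the docs, including the project structure, and the AI will follow it. Pick the right tools The tool must be able to read the whole repository. I recommend Claude Code and codex cli. " }, { "title": "Algorithms: Using Binary Lifting and How I Understand It", "url": "https://chunyoupeng.tech/2025/bei-zeng-fa/", "body": " Explanation Binary lifting rests on the fact that every number can be written in binary: 13 = 8 + 4 + 1 = 0b1000 + 0b0100 + 0b0001 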
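The binary decomposition above can be sketched in C++ as follows. This is only an illustration under stated assumptions: powers_of_two is a hypothetical helper name that does not appear in the original post. It tests each bit of k and collects the powers of two that add up to k.

```cpp
#include <vector>

// Hypothetical helper (not from the original post): returns the powers of
// two that sum to k, smallest first, by testing each bit of k.
std::vector<long long> powers_of_two(long long k) {
    std::vector<long long> parts;
    // Stop once the current power of two exceeds k; j < 62 avoids overflow.
    for (int j = 0; j < 62 && (1LL << j) <= k; ++j) {
        if (k & (1LL << j)) {
            parts.push_back(1LL << j);  // bit j is set, so 2^j is one term
        }
    }
    return parts;
}
```

For k = 13 the helper returns 1, 4 and 8, which are the set bits j = 0, 2, 3. A jump of 13 levels can therefore be split into three power-of-two jumps.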
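So if we precompute a table of power-of-two jumps, we can use it directly to get answers quickly. Compared with the O(n) cost of climbing one level at a time, binary lifting takes O(log n) per query. Data structure: up[i][j] is the node reached from node i after $2^j$ jumps. up[i][0] is the direct parent of i. Core formula: mid = up[i][j-1] /* 2^(j-1) jumps from i reach mid */ up[i][j] = up[mid][j-1] /* another 2^(j-1) jumps from mid reach up[i][j], for 2*2^(j-1) = 2^j jumps in total */ Preprocessing for (int j = 1; j < LOG; ++j) { for (int i = 1; i <= n; ++i) { int mid = up[i][j - 1]; up[i][j] = (mid == 0 ? 0 : up[mid][j - 1]); } } Note that the outer loop is over j: the table is filled level by level, so level j-1 must be complete for every node before level j is computed, and so on. Why split the jump into two halves? Because preprocessing builds larger jumps from smaller ones, and each step doubles the jump length. Usage Once the table is built, we need to use it. A common pattern is: if (k & (1LL << j)) u = up[u][j]; Let us look at (k & (1LL << j)) on its own. Taking k=13, binary 1101, again, the test is true only for j=0, 2, 3. So 13 single jumps become just 3 jumps, which cuts the work considerably. Practice CSES problem: Company Queries I " }, { "title": "Good News for Chinese Users: Emacs Keybindings in VS Code, Which I Find Nicer Than Vim", "url": "https://chunyoupeng.tech/2025/yong-emacsde-kuai-jie-jian-bang-ding-vs-code/", "body": "I really like Vim. It got me used to working without a mouse, efficiently and smoothly. But as a Chinese user I need to type pinyin. I heard that Emacs is driven by Ctrl key chords, so there is no need to keep switching between Chinese and English input, and shortcuts work smoothly even with the Chinese input method on. So I started learning Emacs. At first it felt unfamiliar and less fluid than Vim, but once I got used to it, typing became smoother, especially in a Chinese environment. The one downside is that Emacs is hard to configure. I know a little Lisp, but I still struggle with the configuration, and many of the plugins are old and not very usable. VS Code, by comparison, has a much richer plugin ecosystem. My favorite is the SSH extension, which I recommend: once connected, it is almost the same as developing locally. Over time I used VS Code more and more. At first I still installed a Vim extension, but I ran into the same problem again: typing Chinese was awkward. So I wondered whether I could use Emacs keybindings instead. It only takes some configuration, and once set up it worked well, nearly seamless with Emacs itself. The configuration is as follows [ "key": "ctrl+[", "command": "scrollLineDown", "when": "textInputFocus" , "key": "ctrl+pagedown", "command": "-scrollLineDown", "when": "textInputFocus" , "key": "ctrl+]", "command": "scrollLineUp", "when": "textInputFocus" , "key": "ctrl+shift+o", "command": "workbench.action.files.toggleActiveEditorReadonlyInSession" , "key": "ctrl+shift+w", "command": "workbench.action.files.resetActiveEditorReadonlyInSession" , "key": "ctrl+/", "command": "undo" , "key": "ctrl+shift+p", "command": "cursorUpSelect" , "key": "ctrl+shift+n", "command": "cursorDownSelect" , "key": "shift+alt+f", "command": "cursorWordRightSelect" , "key": "shift+alt+b", "command": "cursorWordLeftSelect" , "key": "alt+f", "command": 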
"cursorWordRight" , "key": "alt+d", "command": "deleteWordRight", "when": "textInputFocus" , "key": "alt+b", "command": "cursorWordLeft" , "key": "shift+alt+f", "command": "-notebook.formatCell", "when": "editorHasDocumentFormattingProvider && editorTextFocus && inCompositeEditor && notebookEditable && !editorReadonly && activeEditor == 'workbench.editor.notebook'" , "key": "shift+alt+f", "command": "-editor.action.formatDocument", "when": "editorHasDocumentFormattingProvider && editorTextFocus && !editorReadonly && !inCompositeEditor" , "key": "shift+alt+f", "command": "-editor.action.formatDocument.none", "when": "editorTextFocus && !editorHasDocumentFormattingProvider && !editorReadonly" , // ===== 基础移动 ===== "key": "ctrl+b", "command": "cursorLeft", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+f", "command": "cursorRight", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+p", "command": "cursorUp", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+n", "command": "cursorDown", "when": "editorTextFocus && !editorReadOnly" , "key": "alt+b", "command": "cursorWordLeft", "when": "editorTextFocus && !editorReadOnly" , "key": "alt+f", "command": "cursorWordRight", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+a", "command": "cursorHome", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+e", "command": "cursorEnd", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+v", "command": "cursorPageDown", "when": "editorTextFocus && !editorReadOnly" , "key": "alt+v", "command": "cursorPageUp", "when": "editorTextFocus && !editorReadOnly" , // ===== 选择(Emacs 的 C-Shift-* 风格)===== "key": "ctrl+shift+b", "command": "cursorLeftSelect", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+shift+f", "command": "cursorRightSelect", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+shift+p", "command": "cursorUpSelect", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+shift+n", "command": 
"cursorDownSelect", "when": "editorTextFocus && !editorReadOnly" , "key": "alt+shift+b", "command": "cursorWordLeftSelect", "when": "editorTextFocus && !editorReadOnly" , "key": "alt+shift+f", "command": "cursorWordRightSelect", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+shift+a", "command": "cursorHomeSelect", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+shift+e", "command": "cursorEndSelect", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+d", "command": "deleteRight", "when": "editorTextFocus && !editorReadOnly" , // C-d "key": "ctrl+h", "command": "deleteLeft", "when": "editorTextFocus && !editorReadOnly" , // C-h "key": "alt+d", "command": "deleteWordRight", "when": "editorTextFocus && !editorReadOnly" , // M-d "key": "alt+backspace", "command": "deleteWordLeft", "when": "editorTextFocus && !editorReadOnly" , // M-Backspace "key": "ctrl+k", "command": "deleteAllRight", "when": "editorTextFocus && !editorReadOnly" , // 近似 kill-line "key": "ctrl+t", "command": "editor.action.transposeLetters", "when": "editorTextFocus && !editorReadOnly" , // ===== 复制/剪切/粘贴 & 撤销 ===== "key": "alt+w", "command": "editor.action.clipboardCopyAction", "when": "editorTextFocus" , // M-w "key": "ctrl+w", "command": "editor.action.clipboardCutAction", "when": "editorTextFocus" , // C-w "key": "ctrl+y", "command": "editor.action.clipboardPasteAction", "when": "editorTextFocus" , // C-y "key": "ctrl+/", "command": "undo" , "key": "ctrl+shift+/", "command": "redo" , // C-_ 近似 // ===== 搜索 ===== "key": "ctrl+s", "command": "actions.find", "when": "editorFocus && !findWidgetVisible" , "key": "ctrl+s", "command": "editor.action.nextMatchFindAction", "when": "editorFocus && findWidgetVisible" , "key": "ctrl+r", "command": "editor.action.previousMatchFindAction", "when": "editorFocus" , "key": "alt+shift+5", "command": "editor.action.startFindReplaceAction", "when": "editorFocus" , // M-% // ===== 注释 / 代码辅助 ===== "key": "alt+;", "command": 
"editor.action.commentLine", "when": "editorTextFocus && !editorReadOnly" , // M-; "key": "alt+/", "command": "editor.action.triggerSuggest", "when": "editorTextFocus && !editorReadOnly" , // M-/ "key": "alt+", "command": "editor.action.formatDocument", "when": "editorHasDocumentFormattingProvider && editorTextFocus && !editorReadOnly" , // M- "key": "ctrl+g", "command": "editor.action.cancelSelectionAnchor", "when": "editorTextFocus" , "key": "ctrl+g", "command": "closeFindWidget", "when": "findWidgetVisible" , "key": "ctrl+g", "command": "cancelSelection", "when": "textInputFocus && !findWidgetVisible" , "key": "ctrl+g", "command": "workbench.action.closeQuickOpen", "when": "inQuickOpen" , // ===== 文件/缓冲区 & 命令 ===== "key": "ctrl+x ctrl+f", "command": "workbench.action.quickOpen" , // 访文件 "key": "ctrl+x b", "command": "workbench.action.quickOpenPreviousRecentlyUsedEditor" , // 切缓冲(近似) "key": "alt+x", "command": "workbench.action.showCommands" , // M-x "key": "ctrl+x 2", "command": "workbench.action.splitEditorDown" , "key": "ctrl+x 3", "command": "workbench.action.splitEditorRight" , "key": "ctrl+x 0", "command": "workbench.action.closeActivePinnedEditor" , "key": "ctrl+x 1", "command": "workbench.action.closeEditorsInOtherGroups" , "key": "ctrl+x o", "command": "workbench.action.focusNextGroup" , // ===== 跳转 / 问题导航 ===== "key": "alt+g alt+g", "command": "workbench.action.gotoLine" , // M-g M-g "key": "alt+g s", "command": "workbench.action.gotoSymbol" , // ===== 选择全部 / 打开资源管理器 ===== "key": "ctrl+x h", "command": "editor.action.selectAll" , "key": "ctrl+x ctrl+d", "command": "workbench.view.explorer" , // 近似 dired // ===== 终端 / Shell ===== "key": "ctrl+x m", "command": "workbench.action.terminal.toggleTerminal" , "key": "ctrl+x ctrl+m", "command": "workbench.action.terminal.new" , "key": "alt+shift+1", "command": "workbench.action.terminal.toggleTerminal" , // ===== 放大/缩小(字体)===== "key": "ctrl+x ctrl+=", "command": "editor.action.fontZoomIn" , "key": "ctrl+x 
ctrl+-", "command": "editor.action.fontZoomOut" , "key": "ctrl+x ctrl+0", "command": "editor.action.fontZoomReset" , // ===== 行/段落编辑 ===== "key": "ctrl+x ctrl+j", "command": "editor.action.joinLines", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+alt+", "command": "editor.action.formatSelection", "when": "editorHasDocumentFormattingProvider && editorTextFocus && !editorReadOnly" , "key": "alt+c", "command": "editor.action.transformToTitlecase", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+x ctrl+u", "command": "editor.action.transformToUppercase", "when": "editorTextFocus && !editorReadOnly" , "key": "ctrl+x ctrl+l", "command": "editor.action.transformToLowercase", "when": "editorTextFocus && !editorReadOnly" , // ===== 定义跳转 / 返回 / 括号跳转 ===== "key": "alt+.", "command": "editor.action.revealDefinition", "when": "editorHasDefinitionProvider && editorTextFocus" , // M-. "key": "alt+,", "command": "workbench.action.navigateBack" , // M-, "key": "alt+m", "command": "editor.action.jumpToBracket", "when": "editorTextFocus" , // ===== 折叠 ===== "key": "ctrl+x shift+4", "command": "editor.toggleFold", "when": "editorTextFocus" , // C-x $ "key": "ctrl+x 4 0", "command": "editor.unfoldAll", "when": "editorTextFocus" , // ===== 多光标 / 扩选到下一个匹配 ===== "key": "alt+shift+.", "command": "editor.action.addSelectionToNextFindMatch", "when": "editorFocus" , "key": "alt+shift+,", "command": "editor.action.moveSelectionToPreviousFindMatch", "when": "editorFocus" , // ===== Expand-Region 风格(内置 Smart Select)===== "key": "ctrl+=", "command": "editor.action.smartSelect.expand", "when": "editorTextFocus" , "key": "ctrl+-", "command": "editor.action.smartSelect.shrink", "when": "editorTextFocus" , // ===== 选中括号内 / 含括号 / 跳到括号 ===== "key": "alt+", "command": "editor.action.selectToBracket", "args": "selectBrackets": false , "when": "editorTextFocus" , // 仅括号内 "key": "alt+shift+", "command": "editor.action.selectToBracket", "args": "selectBrackets": true , "when": 
"editorTextFocus" , // 连同括号 "key": "ctrl+shift+", "command": "editor.action.jumpToBracket", "when": "editorTextFocus" , "key": "ctrl+alt+backspace", "command": "editor.action.removeBrackets", "when": "editorTextFocus" ] " }, { "title": "Repeat Yourself", "url": "https://chunyoupeng.tech/2025/repeat-yourself/", "body": "One of the most repeated pieces of advice throughout my career in software has been “don’t repeat yourself,” also known as the DRY principle.1 For the longest time, I took that at face value, never questioning its validity. That was until I saw actual experts write code: they copy code all the time2. I realized that repeating yourself has a few great benefits. Why People Love DRY The common wisdom is that if you repeat yourself, you have to fix the same bug in multiple places, but if you have a shared abstraction, you only have to fix it once. Another reason why we avoid repetition is that it makes us feel clever. “Look, I know all of these smart ways to avoid repetition! I know how to use interfaces, generics, higher-order functions, and inheritance!” Both reasons are misguided. There are many benefits of repeating yourself that might get us closer to our goals in the long run. Keeping Up The Momentum When you’re writing code, you want to keep the momentum going to get into a flow state. If you constantly pause to design the perfect abstraction, it’s easy to lose momentum. Instead, if you allow yourself to copy-paste code, you keep your train of thought going and work on the problem at hand. You don’t introduce another problem of trying to find the right abstraction at the same time. It’s often easier to copy existing code and modify it until it becomes too much of a burden, at which point you can go and refactor it. I would argue that “writing mode” and “refactoring mode” are two different modes of programming. During writing mode, you want to focus on getting the idea down and stop your inner critic, which keeps telling you that your code sucks. 
During refactoring mode, you take the opposite role: that of the critic. You look for ways to improve the code by finding the right abstractions, removing duplication, and improving readability. Keep these two modes separate. Don’t try to do both at the same time.3 Finding The Right Abstraction Is Hard When you start to write code, you don’t know the right abstraction just yet. But if you copy code, the right abstraction reveals itself; it’s too tedious to copy the same code over and over again, at which point you start to look for ways to abstract it away. For me, this typically happens after the first copy of the same code, but I try to resist the urge until the 2nd or 3rd copy. If you start too early, you might end up with a bad abstraction that doesn’t fit the problem. You know it’s wrong because it feels clunky. Some typical symptoms include: Generic names that don’t convey intent, e.g., render_pdf_file instead of generate_invoice Difficult to understand without additional context The abstraction is only used in one or two places Tight coupling to implementation details It’s Hard To Get Rid Of Wrong Abstractions We easily settle for the first abstraction that comes to mind, but most often, it’s not the right one. And removing the wrong abstraction is hard work, because now the data flow depends on it. We also tend to fall in love with our own abstractions because they took time and effort to create. This makes us reluctant to discard them even when they no longer fit the problem—it’s a sunk cost fallacy. It gets worse when other programmers start to depend on it, too. Then you have to be careful about changing it, because it might break other parts of the codebase. Once you introduce an abstraction, you have to work with it for a long time, sometimes forever. If you had a copy of the code instead, you could just change it in one place without worrying about breaking anything else. 
Duplication is far cheaper than the wrong abstraction —Sandi Metz, The Wrong Abstraction Better to wait until the last moment to settle on the abstraction, when you have a solid understanding of the problem space.4 The Mental Overhead of Abstractions Abstraction reduces code duplication, but it comes at a cost. Abstractions can make code harder to read, understand, and maintain because you have to jump between multiple levels of indirection to understand what the code does. The abstraction might live in different files, modules, or libraries. The cost of traversing these layers is high. An expert programmer might be able to keep a few levels of abstraction in their head, but we all have a limited context window (which depends on familiarity with the codebase). When you copy code, you can keep all the logic in one place. You can just read the whole thing and understand what it does. Resist The Urge Of Premature Abstraction Sometimes, code looks similar but serves different purposes. For example, consider two pieces of code that calculate a sum by iterating over a collection of items. total = 0 for item in shopping_cart: total += item.price * item.quantity And elsewhere in the code, we have total = 0 for item in package_items: total += item.weight * item.rate In both cases, we iterate over a collection and calculate a total. You might be tempted to introduce a helper function, but the two calculations are very different. 
After a few iterations, these two pieces of code might evolve in different directions: def calculate_total_price(shopping_cart): if not shopping_cart: raise ValueError("Shopping cart cannot be empty") total = 0.0 for item in shopping_cart: # Round for financial precision total += round(item.price * item.quantity, 2) return total In contrast, the shipping cost calculation might look like this: def calculate_shipping_cost(package_items, destination_zone): # Use higher of actual weight vs dimensional weight total_weight = sum(item.weight for item in package_items) total_volume = sum(item.length * item.width * item.height for item in package_items) dimensional_weight = total_volume / 5000 # FedEx formula billable_weight = max(total_weight, dimensional_weight) return billable_weight * shipping_rates[destination_zone] Had we applied “don’t repeat yourself” too early, we would have lost the context and specific requirements of each calculation. DRY Can Introduce Complexity The DRY principle is misinterpreted as a blanket rule to avoid any duplication at all costs, which can lead to complexity. When you try to avoid repetition by introducing abstractions, you have to deal with all the edge cases in a place far away from the actual business logic. You end up adding redundant checks and conditions to the abstraction, just to make sure it works in all cases. Later on, you might forget the reasoning behind those checks, but you keep them around “just in case” because you don’t want to break any callers. The result is dead code that adds complexity to the codebase; all because you wanted to avoid repeating yourself. The common wisdom is that if you repeat yourself, you have to fix the same bug in multiple places. But the assumption is that the bug exists in all copies. In reality, each copy might have evolved in different ways, and the bug might only exist in one of them. 
When you create a shared abstraction, a bug in that abstraction breaks every caller, breaking multiple features at once. With duplicated code, a bug is isolated to just one specific use case. Clean Up Afterwards Knowing that you didn’t break anything in a shared abstraction is much harder than checking a single copy of the code. Of course, if you have a lot of copies, there is a risk of forgetting to fix all of them. The key to making this work is to clean up afterwards. This can happen before you commit the code or during a code review. At this stage, you can look at the code you copied and see if it makes sense to keep it as is or if you can see the right abstraction. I try to refactor code once I have a better understanding of the problem, but not earlier. A trick to undo a bad abstraction is to inline the code back into the places where it was used. For a while, you end up “repeating yourself” again in the codebase, but that’s okay. Rethink the problem based on the new information you have. Often you’ll find a better abstraction that fits the problem better. When the abstraction is wrong, the fastest way forward is back. - Sandi Metz, The Wrong Abstraction tl;dr It’s fine to look for the right abstraction, but don’t obsess over it. Don’t be afraid to copy code when it helps you keep momentum and find the right abstraction. It bears repeating: “Repeat yourself.” This is why. ↩ For some examples, see Ferris working on Rustendo64 or tokiospliff working on a C++ game engine. ↩ This is also how I write prose: I first write a draft and block my inner critic, and then I play the role of the editor/critic and “refactor” the text. This way, I get the best of both worlds: a quick feedback loop which doesn’t block my creativity, and a final product which is more polished and well-structured. Of course, I did not invent this approach. 
I recommend reading “Shitty first drafts” from Anne Lamott’s book Bird by Bird: Instructions on Writing and Life if you want to learn more about this technique. ↩ This is similar to the OODA loop concept, which stands for “Observe, Orient, Decide, Act.” It was developed by military strategist John Boyd. Fighter pilots use it to wait until the last responsible moment to decide on a course of action, which allows them to make the best decision based on the current situation and available information. ↩ " }, { "title": "Search Index", "url": "https://chunyoupeng.tech/tinysearch.json/", "body": "" } ]