杂谈：数据库相关经验

1. 不要 `select *`

虽然会写一些比较恶心的 sql 字符串拼接，但是也有收益。这里直接参考 GPT：

Reduced Overhead
Each SQL statement incurs an overhead for parsing, planning, and network communication between the application and the database. By batching rows in a single INSERT, you reduce this overhead significantly by dealing with it only once.
Fewer Network Round-Trips:
A single batched insert minimizes the number of network round-trips between your application and the database server. Network latency can be a non-trivial component of the total operation time, especially in distributed environments.
Efficient Use of Transactions:
If you're using multiple separate INSERT statements, and each statement is executed in its own transaction, the overhead of starting and committing multiple transactions can be significant. A single INSERT for multiple rows can be executed within a single transaction, reducing the transaction management overhead.
Bulk Insert Optimizations:
Databases like MySQL can perform internal optimizations when handling a batch of inserts, such as minimizing index updates or reducing write operations until all data is inserted. This can lead to further performance benefits.
Locking and Concurrency:
Fewer transactions mean reduced locking and thus better concurrency and reduced potential for contention, especially in tables that are frequently written to.

先关闭 autocommit。

SET autocommit=0;
LOCK TABLES t1 WRITE, t2 READ, ...;
... do something with tables t1 and t2 here ...
COMMIT;
UNLOCK TABLES;

这么操作的原因参考 mysql 官方文档。另外，如果条件允许，也可以考虑换用 locking functions，这是一种不经过存储引擎的锁，更加高效。

几个原则：

判断自己的业务场景需要强一致性（或称实时一致性）还是最终一致性
- 强一致性：用于财务等严格业务。依赖于全局唯一锁（表锁、分布式锁等）实现，会降低吞吐量。
- 最终一致性：依靠各种保障策略来实现延迟的一致性。例如，读操作时的刷缓存就是天然的手段；此外还可以有后台的修复进程等。
标准的 cache-aside 方案
- 目的：最终一致性
- 读操作：若 cache miss，则读库并刷入缓存
- 写操作：先读库，然后删除缓存
- 另外，可以加入延迟双删，用来解决缓存恰好失效时的并发读写带来的不一致问题。