Repos / hi.imnhan.com / 89e6fdb4ff
commit 89e6fdb4ffb9fe33259fb230cd13547b392912ab
Author: Bùi Thành Nhân <hi@imnhan.com>
Date:   Sun Jan 30 17:24:21 2022 +0700

    add sqlite post

diff --git a/content/posts/working-with-sqlite-without-an-orm.md b/content/posts/working-with-sqlite-without-an-orm.md
new file mode 100644
index 0000000..259d608
--- /dev/null
+++ b/content/posts/working-with-sqlite-without-an-orm.md
@@ -0,0 +1,173 @@
+Title: Working with SQLite in Python without an ORM or migration framework
+Date: 2022-01-30 14:11
+Summary: |
+    Some notes on handling migrations, linking the latest SQLite, and sane
+    driver defaults.
+
+
+I learned about SQLite's user_version pragma some time ago from a comment on
+Hacker News (as one does). Not sure which comment it was specifically, but it
+went something [like this][1]:
+
+> [...] One thing you can look into using is the SQLite user_version pragma.
+> We use this right now to roll our own migrators and it's light years
+> better than how migrators work for Entity Framework, et. al.
+>
+> [https://www.sqlite.org/pragma.html#pragma_user_version]()
+
+It made a lot of sense, and as I've become more and more [disillusioned][3]
+with Django's ORM (and similar "kitchen sink" web frameworks in general), I
+decided to try it out on [Pytaku][2] as a learning experience. Things went
+okay, all things considered. Here I'll outline some of my findings.
+
+## APSW as the driver
+
+Using [apsw][5] instead of the standard library's sqlite3 package has a couple
+of advantages:
+
+### It's easy to link against the latest sqlite3 version
+
+I originally ran pytaku on a cheap Vietnamese VPS provider, which only offered
+Ubuntu 12.04. This came with a relatively old sqlite3 version that lacked
+UPSERT support (probably among other things that I forgot). I guess it's
+possible to compile a custom python version that links to a newer sqlite, but
+that would defeat the purpose of pytaku being an easy-to-deploy program. Apsw,
+on the other hand, provides a pip [one-liner][4] that compiles and links to the
+latest sqlite. (still kinda bad, but it's less bad than compiling custom
+python)
+
+### It has the same defaults as upstream sqlite
+
+Python stdlib's `sqlite3` has a few default configurations that deviate from
+sqlite's. A couple of things that actually bit me are:
+
+- autocommit mode is off by default
+- `executescript()` automatically issues a `COMMIT` statement
+
+To be fair, both of them are written in the docs, and these custom defaults are
+probably to maintain consistency with [PEP 249][6]. Still, as I was learning
+sqlite, it's frustrating to jump between sqlite docs and python docs to
+interpret the correct behavior at times. Apsw does none of those things: it's
+simply an unopinionated, honest-to-god python binding to sqlite.
+
+To be completely honest though, in the long run it seems more reasonable to
+learn the pysqlite3 API so that I can avoid an extra dependency. I'm also now
+using Debian 11 which has a reasonably recent sqlite, so the compilation
+advantange is no longer that great.
+
+## A minimum viable DB migration scheme
+
+With [<100 lines][7] of python, I ended up with a migrator that:
+
+- Finds migration files in the form of `./migrations/mXXXX.sql`
+- Uses `user_version` pragma to figure out what migrations are pending
+- Is forward-only -- I did say that this is minimally viable didn't I ;)
+
+Coming from Django, I missed a definitive place to see the latest definition of
+the whole db (which, in Django, is the models file). That's why I set up the
+migrator to always write the latest db definition out to a file using
+[`sqlite3 <db_file> .schema > latest_schema.sql`][8], and keep that file [in
+version control][9]:
+
+```sql
+-- This file is auto-generated by the migration script
+-- for reference purposes only. DO NOT EDIT.
+CREATE TABLE title (
+    id text,
+    name text,
+    site text,
+    cover_ext text,
+    chapters text,
+    alt_names text,
+    descriptions text,
+    updated_at text default (datetime('now')), is_webtoon boolean not null default false, descriptions_format text not null default 'text',
+    unique(id, site)
+);
+CREATE TABLE user (
+    id integer primary key,
+    username text unique,
+    password text,
+    created_at text default (datetime('now'))
+);
+[...]
+```
+
+Now to address the elephant in the room: SQLite has... _limited_ ALTER TABLE
+capabilities. The upside is it's [well-documented][10]. What this means in
+practice is that sometimes an otherwise simple `ALTER TABLE` in other RDBMS-es
+will require more manual gymnastics in SQLite: you'll need to create a new
+table with the desired properties, copy existing data over to the new table,
+then drop the old table. There are subtle bear traps in the specific order of
+steps to take, but thankfully the docs, again, deliver: as long as you
+follow the [12 steps][15] correctly, you won't mess up your data. It sounds
+intimidating but it's not _that_ bad. Here's a specific example from pytaku
+where I removed a FOREIGN KEY constraint:
+
+```sql
+-- Remove foreign key from "read" table pointing to "chapter".
+-- So we can, say, mark all chapters of a title as read even if some of those
+-- chapters haven't been created.
+
+pragma foreign_keys = off; -- to let us do anything at all
+begin transaction;
+
+create table new_read (
+    user_id integer not null,
+    site text not null,
+    title_id text, -- nullable to accomodate existing mangadex rows, urgh.
+    chapter_id text not null,
+    updated_at text default (datetime('now')),
+
+    foreign key (user_id) references user (id),
+    unique(user_id, site, title_id, chapter_id)
+);
+insert into new_read select * from read;
+drop table read;
+alter table new_read rename to read;
+
+pragma foreign_key_check;
+commit;
+pragma foreign_keys = on;
+```
+
+Besides the boilerplaty dance with foreign_key pragmas and transactions, all I
+had to do was copy the existing table definition from the aforementioned
+latest_schema.sql file, tweak it to my desired state, then do the table
+switcheroo. Again, the specific ordering of steps is important. I won't go into
+details, but I had actually tripped on a failure mode, which I then realized
+was already nicely warned against in the docs. RTMF is actually fine advice for
+projects that have good documentation, who would have thought?
+
+## Recommended "sane" defaults
+
+SQLite comes with some default settings that may be surprising for people
+coming from e.g. Postgres. Here are some tweaks that worked better for me.
+
+[Enable WAL mode][11]. This allows for concurrent readers, which is usually
+what you want from a web service.
+
+[Enforce foreign key constraints][12]. Yep, you read that right: SQLite
+doesn't enforce foreign key constraints by default. This is just one of the
+various consequences of SQLite being veeeeery lax about what you store. Another
+potential surprise is column types not being enforced, whose alternative only
+landed recently in the form of [STRICT Tables][13].
+
+[Set a non-zero busytimeout][14]. Otherwise if a query is blocked, it will
+crash immediately instead of waiting for the blocking query to finish, no
+matter how short the wait is.
+
+[1]: https://news.ycombinator.com/item?id=23510382
+[2]: https://sr.ht/~nhanb/pytaku/
+[3]: https://code.djangoproject.com/ticket/21961
+[4]: https://rogerbinns.github.io/apsw/download.html#i-really-want-to-use-pip
+[5]: https://rogerbinns.github.io/apsw/
+[6]: https://www.python.org/dev/peps/pep-0249/
+[7]: https://github.com/nhanb/pytaku/blob/65a6c08128ebbc2b7d33a6b043798c69ac7dfebe/src/pytaku/database/migrator.py
+[8]: https://github.com/nhanb/pytaku/blob/65a6c08128ebbc2b7d33a6b043798c69ac7dfebe/src/pytaku/database/migrator.py#L44-L51
+[9]: https://github.com/nhanb/pytaku/blob/65a6c08128ebbc2b7d33a6b043798c69ac7dfebe/src/pytaku/database/migrations/latest_schema.sql
+[10]: https://www.sqlite.org/lang_altertable.html
+[11]: https://sqlite.org/wal.html
+[12]: https://www.sqlite.org/pragma.html#pragma_foreign_keys
+[13]: https://www.sqlite.org/stricttables.html
+[14]: https://www.sqlite.org/c3ref/busy_timeout.html
+[15]: https://www.sqlite.org/lang_altertable.html#otheralter
diff --git a/nhanb-theme/static/css/main.css b/nhanb-theme/static/css/main.css
index 51fc2e3..57e8963 100644
--- a/nhanb-theme/static/css/main.css
+++ b/nhanb-theme/static/css/main.css
@@ -43,6 +43,12 @@ video {
   max-width: 100%;
 }
 
+article {
+  /* prevent long URLs from causing a horizontal scroll on mobile */
+  overflow-wrap: break-word;
+  word-wrap: break-word;
+}
+
 video {
   /* On netsurf browser, a border for <video> tag will enclose all its
    * following siblings, which is probably a bug.
@@ -159,7 +165,7 @@ .highlight .err {
 /* site-specific stuff follows */
 
 #main {
-  max-width: 600px;
+  max-width: 800px;
   margin: 0 auto;
   padding: 0 0.6rem;
 }