The kna CLI: Master Database

Query 110K bills, 2.4M votes, and ideal points from your terminal

Overview

The kna (Korean National Assembly) CLI provides offline, instant access to a pre-built master database that integrates 8 Open Assembly API endpoints into a single queryable interface. Unlike the live API (Chapters 3-4), kna works on processed Parquet files - no API key needed, no rate limits, no network latency.

| Feature | Live API (Ch 3-4) | kna CLI |
|---|---|---|
| Bills | 16-22nd (per query) | 17-22nd (110K pre-loaded) |
| Lifecycle timestamps | Requires chaining 4+ endpoints | Single row per bill, 8 APIs merged |
| Roll call votes | Per-bill, rate-limited | 2.4M votes, instant |
| Member metadata | Per-query, current assembly only | 1,933 member-terms (17-22nd) |
| Asset disclosures | Not available | 2,928 member-year wealth rows |
| DW-NOMINATE | Not available | 936 legislator-terms |
| Propose-reason texts | Not available | 60K bill texts |
| Speed | ~2 req/sec | Instant (local Parquet) |

Installation

Step 1: Install the CLI

pip install kna

Alternatively, use pipx install kna, which installs the CLI in its own environment and handles PATH for you.

Step 2: Get the data

The CLI needs processed Parquet data files hosted via Git LFS:

# Install Git LFS (one-time)
brew install git-lfs    # macOS
# or: sudo apt install git-lfs    # Ubuntu/Debian

git lfs install

# Clone with data
git clone https://github.com/kyusik-yang/kna.git
cd kna

Warning: If you already cloned without LFS, the Parquet files will be tiny pointer files and kna info will fail. Fix with: git lfs install && git lfs pull
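
To check for pointer files directly, you can inspect the first bytes of each file. The sketch below is not part of kna. It relies on two file signatures: a Git LFS pointer is a small text file beginning with `version https://git-lfs.github.com/spec/v1`, and a real Parquet file begins with the 4-byte magic `PAR1`. The function names are hypothetical.

```python
from pathlib import Path

LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"
PARQUET_MAGIC = b"PAR1"


def classify_header(head):
    """Classify a file from its leading bytes."""
    if head.startswith(PARQUET_MAGIC):
        return "parquet"
    if head.startswith(LFS_PREFIX):
        return "lfs-pointer"
    return "unknown"


def classify_file(path):
    """Read just enough of the file to tell a pointer from real data."""
    with open(path, "rb") as fh:
        return classify_header(fh.read(len(LFS_PREFIX)))


def report(data_dir):
    """Print one line per .parquet file in data_dir (hypothetical helper)."""
    for f in sorted(Path(data_dir).glob("*.parquet")):
        print(f"{f.name}: {classify_file(f)}")
```

Calling `report("data/processed")` from the repository root lists each file's status. Any `lfs-pointer` result means `git lfs pull` has not fetched the data yet.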

Step 3: Point the CLI to the data

export KBL_DATA=~/kna/data/processed

# Make it permanent (zsh shown; for bash, append to ~/.bashrc instead)
echo 'export KBL_DATA="$HOME/kna/data/processed"' >> ~/.zshrc
source ~/.zshrc

# Verify
kna info
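
If kna info fails, it can help to confirm that KBL_DATA points at a directory that actually contains the data. The following is a minimal sketch, not kna's own lookup logic. It assumes the per-assembly file naming `master_bills_<N>.parquet` used in the R examples later in this chapter, and the helper names are hypothetical.

```python
import os
from pathlib import Path


def resolve_data_dir(env=None):
    """Return the directory named by KBL_DATA (with ~ expanded), or None if unset."""
    env = os.environ if env is None else env
    raw = env.get("KBL_DATA")
    return Path(raw).expanduser() if raw else None


def missing_master_files(data_dir, assemblies=range(17, 23)):
    """List the master_bills_<N>.parquet files that are absent from data_dir."""
    expected = [f"master_bills_{n}.parquet" for n in assemblies]
    return [name for name in expected if not (Path(data_dir) / name).is_file()]


data_dir = resolve_data_dir()
if data_dir is None:
    print("KBL_DATA is not set")
else:
    missing = missing_master_files(data_dir)
    status = "OK" if not missing else "missing " + ", ".join(missing)
    print(f"{data_dir}: {status}")
```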

Quick start

Database overview

kna info
         Korean National Assembly Database
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃ Assembly           ┃    Bills ┃  Enacted ┃  Cols ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ 17th (2004-08)     │    8,369 │    2,547 │    49 │
│ 18th (2008-12)     │   14,762 │    2,930 │    49 │
│ ...                │          │          │       │
│ 22nd (2024-)       │   17,205 │    1,399 │    54 │
├────────────────────┼──────────┼──────────┼───────┤
│ Total              │  110,778 │   17,638 │       │
└────────────────────┴──────────┴──────────┴───────┘

  Roll call votes    2,425,113  (16-22nd)
  Ideal points             936  (20-22nd, DW-NOMINATE)
  Committee mtgs       572,127  (17-22nd)
  Bill texts            60,925  (20-22nd, propose-reason)

Search bills by title

# Keyword search
kna search "인공지능" --assembly 22

# Combine filters
kna search "부동산" --assembly 22 --status enacted --committee 국토

# By proposer
kna search "형법" --proposer 박범계 --assembly 21 -n 30

Filter options:

| Option | Description | Values |
|---|---|---|
| --assembly | Assembly number | 17-22 |
| --committee | Committee name (partial match) | e.g. 과방, 국토 |
| --proposer | Lead proposer name | e.g. 김영식 |
| --status | Status group | passed, enacted, pending, rejected |
| --kind | Bill type | 법률안, 결의안, 동의안 |
| --from / --to | Proposal date range | YYYY-MM-DD |
| -n | Max results (default 20) | integer |
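
When scripting many searches, it can be convenient to assemble the command line programmatically. The sketch below uses only the flags listed above and assumes that --from and --to each take a single YYYY-MM-DD value. `build_search_args` and `run_search` are hypothetical helpers, not part of the kna package.

```python
import shutil
import subprocess


def build_search_args(keyword, assembly=None, committee=None, proposer=None,
                      status=None, kind=None, date_from=None, date_to=None,
                      limit=None):
    """Build the argument list for `kna search`, skipping unset filters."""
    args = ["kna", "search", keyword]
    options = [
        ("--assembly", assembly),
        ("--committee", committee),
        ("--proposer", proposer),
        ("--status", status),
        ("--kind", kind),
        ("--from", date_from),
        ("--to", date_to),
        ("-n", limit),
    ]
    for flag, value in options:
        if value is not None:
            args += [flag, str(value)]
    return args


def run_search(args):
    """Run the command if kna is on PATH; return its stdout, else None."""
    if shutil.which("kna") is None:
        return None
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    return result.stdout


args = build_search_args("부동산", assembly=22, status="enacted", committee="국토")
print(" ".join(args))
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems with Korean keywords and spaces.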

Search propose-reason texts

kna search queries bill titles only. kna text searches within the full propose-reason text (제안이유) - the substantive rationale behind each bill.

# Find bills mentioning climate change in their rationale
kna text "기후변화" --assembly 22

# Find bills discussing AI regulation rationale
kna text "인공지능 규제" -n 10

This covers 60,925 bills (20-22nd Assembly, 99.4% text coverage).

Bill lifecycle timeline

The “killer feature” of kna: kna show displays any bill’s full journey through the legislative process, from proposal (발의) to promulgation (공포):

kna show 2217673
╭──────────────────────────────────────────────╮
│  약사법 일부개정법률안                          │
│                                              │
│  bill_no      2217673                        │
│  assembly     22nd                           │
│  proposer     김종민 (의원) 등 10인             │
│  committee    보건복지위원회                    │
│  status       원안가결                         │
│                                              │
│  LIFECYCLE              date    days          │
│  ● 발의         2024-09-12       -            │
│  ● 소관위 회부   2024-09-13      +1            │
│  ● 소관위 상정   2024-11-20     +69            │
│  ● 소관위 처리   2025-01-15    +125            │
│  ● 법사위 회부   2025-01-20    +130            │
│  ● 본회의 의결   2025-02-10    +151            │
│  ● 공포         2025-03-01    +170            │
│                                              │
│  VOTE  찬성 245 / 반대 3 / 기권 12 (재석 260)  │
│                                              │
│  PROPOSE REASON                               │
│  현행법상 약사법 제... (full text shown)        │
╰──────────────────────────────────────────────╯

Each stage is shown in one of two states:

  • reached (with its date and the number of days since proposal)
  • not reached (the bill did not advance to this stage)

Lifecycle stages: 발의 → 소관위 회부 → 소관위 상정 → 소관위 처리 → 법사위 회부 → 본회의 의결 → 공포
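
The "days" column is the elapsed time between the proposal date and each stage date. The sketch below reproduces that arithmetic for the first stages of the example above. It is an illustration rather than kna's implementation, and the final "not reached" entry is a hypothetical addition to show how a missing stage could be represented.

```python
from datetime import date


def days_since_proposal(stages):
    """Map (stage, ISO date or None) pairs to (stage, days since the first stage).

    The first entry is treated as the proposal date; None marks a stage
    the bill never reached.
    """
    proposed = date.fromisoformat(stages[0][1])
    result = []
    for name, iso in stages:
        if iso is None:
            result.append((name, None))
        else:
            result.append((name, (date.fromisoformat(iso) - proposed).days))
    return result


stages = [
    ("발의", "2024-09-12"),
    ("소관위 회부", "2024-09-13"),
    ("소관위 상정", "2024-11-20"),
    ("소관위 처리", None),  # hypothetical: stage not reached
]

for name, days in days_since_proposal(stages):
    label = "not reached" if days is None else f"+{days}"
    print(f"{name}\t{label}")
```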

Legislator profiles

kna legislator 이재명 --assembly 22

Shows:

  • Party and DW-NOMINATE ideal point (ideological position)
  • Rank within the assembly (← liberal … conservative →)
  • Bill record: total led, enacted count, passage rate
  • Top enacted bills with dates

# Search by MONA code for exact match
kna legislator --mona M2Q9024I --assembly 21

Note: DW-NOMINATE ideal points are available for the 20th-22nd Assemblies only (936 legislator-terms). For the 17th-19th Assemblies, the bill record is shown without ideological positioning.
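
To make the rank concrete: once ideal points are oriented so that negative values are liberal and positive values are conservative (the orientation used in the R example later in this chapter), ranking is a sort on the score. The sketch below uses made-up legislators and scores. How kna itself breaks ties or numbers ranks is not documented here, so treat this as an assumption.

```python
def rank_by_ideal_point(scores):
    """Return {name: rank}, where rank 1 is the most liberal (lowest) score."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    return {name: rank for rank, (name, _) in enumerate(ordered, start=1)}


# Hypothetical scores for four hypothetical legislators.
scores = {"A": -0.62, "B": 0.15, "C": -0.08, "D": 0.71}

ranks = rank_by_ideal_point(scores)
for name in sorted(ranks, key=ranks.get):
    print(f"{ranks[name]}/{len(ranks)}  {name}  {scores[name]:+.2f}")
```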

Aggregate statistics

Legislative funnel

How many bills survive each stage of the legislative process?

kna stats funnel --assembly 22
  22nd Assembly · Legislative Funnel (법률안 only)

  Stage           Bills     Rate
  발의           16,907   100.0%  ████████████████████
  소관위 회부    16,071    95.1%  ███████████████████
  소관위 상정    12,935    76.5%  ███████████████
  소관위 처리     4,060    24.0%  █████
  법사위 회부       510     3.0%  █
  본회의 의결     4,431    26.2%  █████
  공포            1,060     6.3%  █
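
Each rate in the funnel is the stage count divided by the number of proposed bills (발의). In the output above, 본회의 의결 (4,431) is larger than 법사위 회부 (510), so the stages are not strictly nested and the rates are best read as shares of all proposed bills rather than stage-to-stage survival. A minimal sketch of the calculation, using the counts shown above:

```python
# Stage counts copied from the `kna stats funnel --assembly 22` output above.
funnel = [
    ("발의", 16907),
    ("소관위 회부", 16071),
    ("소관위 상정", 12935),
    ("소관위 처리", 4060),
    ("법사위 회부", 510),
    ("본회의 의결", 4431),
    ("공포", 1060),
]


def funnel_rates(stages):
    """Return (stage, count, percent of the first stage) for each stage."""
    base = stages[0][1]
    return [(name, n, round(100 * n / base, 1)) for name, n in stages]


for name, n, rate in funnel_rates(funnel):
    print(f"{name}\t{n:>7,}\t{rate:5.1f}%")
```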

Passage rate trend

kna stats passage-rate

Shows the enacted rate declining from 25.6% (17th) to 6.6% (22nd) - legislative inflation at work: the number of bills proposed has grown much faster than the number enacted.

Data export

Extract filtered subsets for downstream analysis in R, Stata, or Python:

# Health committee bills, enacted only
kna export health.csv --assembly 22 --committee 보건복지 --status enacted

# All 22nd Assembly bills as Parquet
kna export all_22.parquet --assembly 22

# All bills of type 법률안 (legislative bills) across all assemblies
kna export laws.csv --kind 법률안

Format auto-detected from extension: .csv, .parquet, .tsv.
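
Extension-based detection of this kind usually amounts to a lookup on the file suffix. The sketch below illustrates the idea; it is not kna's code. Lowercasing the suffix and raising an error on unknown extensions are assumptions, and `detect_format` is a hypothetical name.

```python
from pathlib import Path

# The three extensions listed above.
FORMATS = {".csv": "csv", ".tsv": "tsv", ".parquet": "parquet"}


def detect_format(path):
    """Return the export format implied by the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMATS:
        supported = ", ".join(sorted(FORMATS))
        raise ValueError(f"unsupported extension {suffix!r}; expected one of {supported}")
    return FORMATS[suffix]


for name in ("health.csv", "all_22.parquet", "votes.tsv"):
    print(name, "->", detect_format(name))
```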

Using kna data in Python

The kna package also works as a Python library:

from kna.data import BillDB

db = BillDB()

# Load all 22nd Assembly bills
bills = db.bills(assembly=22)
print(f"{len(bills):,} bills, {len(bills.columns)} columns")

# Load with column pruning (fast)
df = db.bills(assembly=22, columns=["bill_id", "bill_nm", "status", "ppsl_dt"])

# Ideal points
ip = db.ideal_points()  # 936 legislator-terms, sign-flipped

# Roll call votes
votes = db.roll_calls(assembly=22)  # 383K member-level votes

# Bill texts
texts = db.bill_texts()  # 60K propose-reason texts

# Member metadata (party, district, committee, gender, birth date)
members = db.members(assembly=22)  # 306 members

# Asset disclosures (net_worth, real estate, stocks, etc. in 천원)
assets = db.assets(assembly=22)  # 299 member-year wealth rows

# Committee meetings
meetings = db.committee_meetings(assembly=22)  # 108K records

Using kna data in R

library(arrow)
library(dplyr)

# Load master bills
master <- read_parquet("data/processed/master_bills_22.parquet")

# All assemblies
all_bills <- bind_rows(
  lapply(17:22, function(age)
    read_parquet(sprintf("data/processed/master_bills_%d.parquet", age)))
)

# Ideal points
ip <- read.csv("data/processed/dw_ideal_points_20_22.csv") %>%
  mutate(aligned = -aligned)  # flip: negative = liberal, positive = conservative

# Bill texts
texts <- read_parquet("data/processed/bill_texts_linked.parquet")

Command reference

| Command | Description |
|---|---|
| kna info | Database overview |
| kna search KEYWORD | Search bill titles |
| kna text KEYWORD | Search propose-reason texts |
| kna show BILL_NO | Bill detail + lifecycle timeline |
| kna legislator NAME | Legislator profile + ideal point |
| kna stats funnel | Legislative funnel |
| kna stats passage-rate | Cross-assembly passage rates |
| kna export PATH | Export filtered bills to CSV/Parquet |

Companion data

For hearing and audit transcripts (9.9M speech-level records), see the companion dataset: