Catalog
DuckDB-backed index for filings. Core columns and indexed fields are stored as physical columns; all other fields go into a JSON data column. On read, a FilingResolver restores the Filing subclass from the _filing_class value, a fully qualified class name (FQCN).
Constructor
Catalog(db_file_path: str, resolver: FilingResolver | None = None) -> Catalog
- db_file_path: Path to DuckDB file.
- resolver: Used to resolve _filing_class to a class on get/search. Defaults to default_resolver if None.
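To make the FQCN lookup concrete, here is a minimal sketch of how a dotted class name can be turned back into a class using only the standard library. The helper names resolve_class and fqcn_of are hypothetical and are not part of this API; the real FilingResolver may work differently.

```python
import importlib


def fqcn_of(cls: type) -> str:
    """Return the dotted 'module.ClassName' string for a class."""
    return f"{cls.__module__}.{cls.__qualname__}"


def resolve_class(fqcn: str) -> type:
    """Resolve a dotted 'package.module.ClassName' string to the class object.

    Hypothetical helper: illustrates the idea behind a resolver, not the
    actual FilingResolver implementation.
    """
    module_name, _, class_name = fqcn.rpartition(".")
    if not module_name:
        raise ValueError(f"not a fully qualified class name: {fqcn!r}")
    module = importlib.import_module(module_name)
    obj = getattr(module, class_name)
    if not isinstance(obj, type):
        raise TypeError(f"{fqcn!r} does not name a class")
    return obj


# Round trip with a standard-library class standing in for a Filing subclass.
cls = resolve_class("fractions.Fraction")
print(fqcn_of(cls))
```

A custom resolver is mainly useful when stored class names no longer match the current module layout, for example after a package rename.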
Methods
index
index(filing: Filing) -> None
Inserts or replaces one filing. Adds physical columns for any new indexed fields of the filing’s class. Raises CatalogRequiredValueError if a core required value is missing/empty.
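The storage split described above can be sketched in plain Python. This is an illustration under assumptions: the core column names (id, source, created_at), the split_record helper, and the RequiredValueError class are all hypothetical stand-ins, and the real Catalog may validate and serialize differently.

```python
from __future__ import annotations

import json
from typing import Any

# Assumed core columns, chosen for illustration only.
CORE_COLUMNS = ("id", "source", "created_at")


class RequiredValueError(ValueError):
    """Stand-in for CatalogRequiredValueError."""


def split_record(
    record: dict[str, Any], indexed_fields: set[str]
) -> tuple[dict[str, Any], str]:
    """Split a record into physical column values and a JSON string of extras."""
    # Reject records whose core values are missing or empty.
    for name in CORE_COLUMNS:
        if record.get(name) in (None, ""):
            raise RequiredValueError(f"missing required value: {name}")
    physical_names = set(CORE_COLUMNS) | indexed_fields
    columns = {k: v for k, v in record.items() if k in physical_names}
    extra = {k: v for k, v in record.items() if k not in physical_names}
    return columns, json.dumps(extra, sort_keys=True)


columns, data = split_record(
    {"id": "f1", "source": "edgar", "created_at": "2024-01-01",
     "form_type": "10-K", "note": "amended"},
    indexed_fields={"form_type"},
)
print(columns)
print(data)
```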
index_batch
index_batch(filings: list[Filing]) -> None
Bulk insert/replace. Ensures all indexed columns exist, then inserts in one batch.
get
get(id: str) -> Filing | None
Returns Filing instance (subclass resolved via resolver) or None.
get_raw
get_raw(id: str) -> dict[str, Any] | None
Returns merged dict (physical columns + data JSON) or None. No Filing instantiation.
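As a rough picture of what "merged" means here, the sketch below combines physical column values with the decoded JSON payload. The merge_row helper is hypothetical, and letting physical columns win on a key conflict is an assumption, not documented behavior.

```python
from __future__ import annotations

import json
from typing import Any


def merge_row(columns: dict[str, Any], data_json: str | None) -> dict[str, Any]:
    """Merge physical column values with the decoded JSON data column."""
    extra = json.loads(data_json) if data_json else {}
    # Assumption: on a key conflict the physical column value takes precedence.
    return {**extra, **columns}


row = merge_row({"id": "f1", "form_type": "10-K"}, '{"note": "amended"}')
print(row)
```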
search
search(
expr: Expr | None = None,
limit: int = 100,
offset: int = 0,
order_by: str = "created_at",
desc: bool = True,
) -> list[Filing]
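Builds the WHERE clause from expr and the ORDER BY clause from order_by (a column or a json_extract expression), then applies LIMIT/OFFSET. Raises CatalogExprTypeError if expr is a Python bool. See Collection Search for how expressions are compiled.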
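The sketch below shows one way a query with these parts could be assembled, including the bool guard. Everything in it is an assumption made for illustration: the Cond class, the build_query helper, the ExprTypeError stand-in, and the filings table name are hypothetical, and the real expression compiler is described in Collection Search. A Python bool typically reaches such an API when a comparison collapses to True or False before the call, which is why it is rejected rather than treated as a filter.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


class ExprTypeError(TypeError):
    """Stand-in for CatalogExprTypeError."""


@dataclass(frozen=True)
class Cond:
    """Hypothetical minimal expression: `column op ?` with one bound value."""
    column: str
    op: str
    value: Any


def build_query(
    expr: Cond | None,
    physical_columns: set[str],
    limit: int = 100,
    offset: int = 0,
    order_by: str = "created_at",
    desc: bool = True,
) -> tuple[str, list[Any]]:
    """Assemble SQL text and bound parameters for a search."""
    if isinstance(expr, bool):
        raise ExprTypeError("expr must be an expression, not a Python bool")
    if not order_by.isidentifier():
        raise ValueError(f"invalid order_by: {order_by!r}")
    sql = "SELECT * FROM filings"  # table name is an assumption
    params: list[Any] = []
    if expr is not None:
        sql += f" WHERE {expr.column} {expr.op} ?"
        params.append(expr.value)
    # Physical columns are ordered directly; other fields via the JSON column.
    if order_by in physical_columns:
        order_sql = order_by
    else:
        order_sql = f"json_extract(data, '$.{order_by}')"
    sql += f" ORDER BY {order_sql} {'DESC' if desc else 'ASC'} LIMIT ? OFFSET ?"
    params += [limit, offset]
    return sql, params


sql, params = build_query(Cond("form_type", "=", "10-K"), {"created_at", "form_type"})
print(sql)
print(params)
```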
search_raw
search_raw(sql: str, params: list[Any] | None = None) -> list[Any]
Executes raw SQL and returns fetched rows. Advanced use.
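The sql/params shape can be illustrated with positional ? placeholders. To keep the example runnable with only the standard library, it uses sqlite3 as a stand-in for DuckDB; the run_raw helper and the filings table layout are hypothetical, and the rows returned by the real method may be shaped differently.

```python
from __future__ import annotations

import sqlite3
from typing import Any

# sqlite3 is a stand-in here; the Catalog itself is backed by DuckDB.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE filings (id TEXT PRIMARY KEY, source TEXT)")
con.executemany(
    "INSERT INTO filings VALUES (?, ?)",
    [("f1", "edgar"), ("f2", "edgar"), ("f3", "edinet")],
)


def run_raw(sql: str, params: list[Any] | None = None) -> list[Any]:
    """Execute SQL with bound parameters and return all fetched rows."""
    return con.execute(sql, params or []).fetchall()


rows = run_raw(
    "SELECT source, COUNT(*) FROM filings WHERE source = ? GROUP BY source",
    ["edgar"],
)
print(rows)
```

Passing values through params instead of formatting them into the SQL string avoids quoting mistakes and SQL injection.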
count
count(expr: Expr | None = None) -> int
Returns the number of rows matching expr (all rows if expr is None). Raises CatalogExprTypeError if expr is a Python bool. See Collection Search for how expressions are compiled.
stats
stats() -> dict[str, Any]
Returns a dict with keys such as total, sources, earliest, and latest.
clear
clear() -> None
Deletes all rows.
close
close() -> None
Closes the DuckDB connection.