Catalog

A DuckDB-backed index for filings. Core columns and indexed fields are stored as physical columns; any extra fields go into a JSON data column. A FilingResolver restores the Filing subclass from the _filing_class value, which holds a fully qualified class name (FQCN).
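To make the FQCN idea concrete, here is a minimal sketch of turning a dotted class path back into a class object with importlib. The helper name resolve_class is hypothetical and is not part of this library; how FilingResolver actually does the lookup is not documented here. A standard-library class stands in for a Filing subclass.

```python
import importlib


def resolve_class(fqcn: str) -> type:
    """Import the module part of a dotted path and return the named class.

    Hypothetical helper for illustration only; not the library's FilingResolver.
    """
    module_name, _, class_name = fqcn.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {fqcn!r}")
    module = importlib.import_module(module_name)
    obj = getattr(module, class_name)
    if not isinstance(obj, type):
        raise TypeError(f"{fqcn!r} does not name a class")
    return obj


# A standard-library class stands in for a stored _filing_class value.
cls = resolve_class("collections.OrderedDict")
```

Storing the class path as a string keeps the database free of pickled objects, at the cost of requiring the module to be importable when rows are read back.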

Constructor

Catalog(db_file_path: str, resolver: FilingResolver | None = None) -> Catalog
  • db_file_path: Path to DuckDB file.
  • resolver: Used to resolve _filing_class to a class on get/search. Defaults to default_resolver if None.

Methods

index

index(filing: Filing) -> None

Inserts or replaces one filing. Adds physical columns for any new indexed fields of the filing’s class. Raises CatalogRequiredValueError if a core required value is missing/empty.
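The insert-or-replace behaviour with on-demand columns can be sketched as follows. This is an illustration under stated assumptions, not the library's code: it uses the standard-library sqlite3 module as a stand-in for DuckDB, and the table name filings, the function index_row, and the choice of id as the only required value are all hypothetical. ValueError stands in for CatalogRequiredValueError.

```python
import sqlite3


def index_row(conn: sqlite3.Connection, core: dict, indexed: dict) -> None:
    """Insert or replace one row, adding columns for unseen indexed fields."""
    # Stand-in for the required-value check (assumption: only 'id' is required).
    if not core.get("id"):
        raise ValueError("core value 'id' is missing or empty")

    # Column names cannot be bound as parameters, so validate them first.
    for name in list(core) + list(indexed):
        if not name.isidentifier():
            raise ValueError(f"unsafe column name: {name!r}")

    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(filings)")}
    for name in indexed:
        if name not in existing:
            conn.execute(f'ALTER TABLE filings ADD COLUMN "{name}"')

    row = {**core, **indexed}
    columns = ", ".join(f'"{name}"' for name in row)
    marks = ", ".join("?" for _ in row)
    conn.execute(
        f"INSERT OR REPLACE INTO filings ({columns}) VALUES ({marks})",
        list(row.values()),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filings (id TEXT PRIMARY KEY, source TEXT)")
index_row(conn, {"id": "a1", "source": "demo"}, {"form_type": "10-K"})
index_row(conn, {"id": "a1", "source": "demo"}, {"form_type": "10-Q"})  # replaces a1
rows = conn.execute("SELECT id, form_type FROM filings").fetchall()
```

Indexing the same id twice leaves a single row holding the later values, which is the "inserts or replaces" behaviour described above.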

index_batch

index_batch(filings: list[Filing]) -> None

Bulk insert/replace. Ensures all indexed columns exist, then inserts in one batch.

get

get(id: str) -> Filing | None

Returns Filing instance (subclass resolved via resolver) or None.

get_raw

get_raw(id: str) -> dict[str, Any] | None

Returns merged dict (physical columns + data JSON) or None. No Filing instantiation.
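As a rough picture of the merged dict, the sketch below combines a row's physical columns with the decoded JSON data column. The function merge_row is hypothetical, and the rule that physical columns win on a key collision is an assumption; the actual precedence is not stated in this reference.

```python
from __future__ import annotations

import json
from typing import Any


def merge_row(physical: dict[str, Any], data_json: str | None) -> dict[str, Any]:
    """Merge physical column values with the decoded JSON 'data' column."""
    extra = json.loads(data_json) if data_json else {}
    merged = dict(extra)
    merged.update(physical)  # assumption: physical columns take precedence
    return merged


merged = merge_row({"id": "a1", "source": "demo"}, '{"form_type": "10-K"}')
```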

search

search(
    expr: Expr | None = None,
    limit: int = 100,
    offset: int = 0,
    order_by: str = "created_at",
    desc: bool = True,
) -> list[Filing]

Builds the WHERE clause from expr, ORDER BY from order_by (a physical column or a json_extract expression), and applies LIMIT/OFFSET. Raises CatalogExprTypeError if expr is a Python bool. See Collection Search for how expressions are compiled.
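The way these arguments could map onto a SQL statement is sketched below. Everything here is an assumption made for illustration: the function build_search_sql, the table name filings, the JSON column name data, and the use of an already compiled WHERE string in place of an Expr. TypeError stands in for CatalogExprTypeError.

```python
from __future__ import annotations


def build_search_sql(
    where_sql: str | None,
    physical_columns: set[str],
    limit: int = 100,
    offset: int = 0,
    order_by: str = "created_at",
    desc: bool = True,
) -> str:
    """Assemble a SELECT with optional WHERE, then ORDER BY and LIMIT/OFFSET."""
    # A bool usually means a plain Python comparison was passed by mistake.
    if isinstance(where_sql, bool):
        raise TypeError("expected a compiled expression, got a Python bool")
    if not order_by.isidentifier():
        raise ValueError(f"unsafe order_by: {order_by!r}")

    # Sort on the physical column when one exists, else on the JSON field.
    if order_by in physical_columns:
        order_expr = f'"{order_by}"'
    else:
        order_expr = f"json_extract(data, '$.{order_by}')"

    sql = "SELECT * FROM filings"
    if where_sql:
        sql += f" WHERE {where_sql}"
    direction = "DESC" if desc else "ASC"
    sql += f" ORDER BY {order_expr} {direction} LIMIT {int(limit)} OFFSET {int(offset)}"
    return sql


default_sql = build_search_sql(None, {"created_at"})
json_sql = build_search_sql(
    "source = ?", {"created_at"}, limit=10, order_by="period", desc=False
)
```

Values in the WHERE clause stay as ? placeholders and would be bound separately; only validated identifiers and integers are interpolated into the string.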

search_raw

search_raw(sql: str, params: list[Any] | None = None) -> list[Any]

Executes raw SQL and returns fetched rows. Advanced use.

count

count(expr: Expr | None = None) -> int

Returns the number of rows matching expr (all rows if expr is None). Raises CatalogExprTypeError if expr is a Python bool. See Collection Search for how expressions are compiled.

stats

stats() -> dict[str, Any]

Returns a dict of summary statistics, with keys such as total, sources, earliest, and latest.

clear

clear() -> None

Deletes all rows.

close

close() -> None

Closes the DuckDB connection.
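Any object with a close() method can be managed by contextlib.closing, which calls close() when the block exits, even on error. The sketch below uses a hypothetical FakeCatalog stand-in so that it runs without the library; whether Catalog itself supports the with statement directly is not stated in this reference.

```python
from contextlib import closing


class FakeCatalog:
    """Hypothetical stand-in exposing only close(), for illustration."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


catalog = FakeCatalog()
with closing(catalog) as cat:
    # Work with the catalog here; close() runs when the block exits.
    same_object = cat is catalog
```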