Implementation overview
Summary of the current implementation by boundary. See Codebase for file locations.
Collection boundary (Facade)
Collection(collection/collection.py) — Single entry for add/get/search. Delegates to Storage, Catalog, and Locator.- add(filing, content): Checks checksum, resolves path via Locator, indexes via Catalog (skip if same id), saves via Storage. Returns
(filing, path). - get / get_filing / get_content / get_path: Catalog for metadata, Locator for path, Storage for content.
- search(expr, limit, offset, order_by, desc): Delegates to
Catalog.search. Uses Expr for WHERE.
- add(filing, content): Checks checksum, resolves path via Locator, indexes via Catalog (skip if same id), saves via Storage. Returns
- Default setup: If
storage/catalogare not provided, uses.fino/collectionunder CWD with LocalStorage and Catalog(index.db).
Catalog and indexing
Catalog(collection/catalog.py) — DuckDB-backed index. Core columns:id,source,checksum,name,is_zip,format,created_at,_filing_class,data(JSON). Indexed fields from Filing are added as physical columns.FilingResolver(collection/filing_resolver.py) — Maps_filing_class(FQCN) to Filing subclass for restore. Built-in: register; fallback: dynamic import.default_resolveris used by Catalog;EDINETFiling,EdgarFiling, andEdgarCompanyFactsFilingare registered in__init__.py.register_filing_class— Convenience that registers a class withdefault_resolver.
Storage and path
Storage(collection/storage.py) — Protocol:save(content, storage_key?),load_by_path(relative_path),base_dir.LocalStorage(collection/storages/local.py) — Saves underbase_dir;storage_keyis optional (default behavior for compatibility).Locator(collection/locator.py) — Strategy: Filing → relative path. Default:{source}/{id}{suffix}; suffix fromformatoris_zip(e.g..xbrl,.zip).
Filing boundary
Filing(filing/filing.py) — Document model with metaclassFilingMeta. Core fields: id, source, checksum, name, is_zip, format, created_at. Identity (for id generation) = core fields minus id/created_at/checksum plus any user-defined fields. Supportsto_dict/from_dict,get_indexed_fields(), etc.Field(filing/field.py) — Descriptor and query DSL (e.g.Field("x") == 1,in_(),between()). Used in model definitions and in Expr.Expr(filing/expr.py) — WHERE abstraction:sql+params. Supports&,|,~. Catalog compiles to DuckDB.EDINETFiling/EdgarFiling/EdgarCompanyFactsFiling— Built-in subclasses with source-specific fields; registered ondefault_resolver.
Collector boundary
BaseCollector(collector/base.py) — Template Method:iter_collect()yields per item from_fetch_documents()→_parse_response(meta)→_build_filing(parsed, content)→add_to_collection(filing, content);collect()islist(iter_collect(...)). Subclasses implement the three abstract methods.RawDocument—content: bytes,meta: dict. One item per fetch.Parsed—dict[str, Any]; intermediate before building a Filing.- Edgar (
collector/edgar/): 共有 EdgarConfig, EdgarClient, _helpers。documents/・bulk/→ EdgarArchiveCollector / EdgarBulkCollector(EdgarFiling);facts/→ EdgarFactsCollector(EdgarCompanyFactsFiling)。
Errors
- Core:
FinoFilingException(core/error.py). - Filing:
FilingRequiredError,FieldValidationError,FieldImmutableError,FilingValidationError,FilingImmutableError(filing/error.py). - Collection:
CollectionChecksumMismatchError,LocatorPathResolutionError,CatalogRequiredValueError,CatalogAlreadyExistsError(collection/error.py).
Public API and exception behavior are specified in Spec: Exception.