TTM Calculations & XBRL standardizations #573
Complete XBRL standardization system for extracting financial statement data from SEC filings using ML-learned canonical concept mappings.

Major Components:
- 3 financial statement schemas (income, balance sheet, cash flow)
- 3 extraction scripts with robust fallback logic
- ML audit tool for semantic validation
- 36 ML-learned data files (global + 3 sectors)
- Comprehensive documentation and compliance reports

Features:
- ML-optimized concept ordering (totals before components)
- Sector-specific overlays (banking, insurance, utilities)
- Industry-aware rule prioritization
- Component vs total semantic validation
- 200+ XBRL concept mappings

Testing:
- Validated on AAPL, BAC, MSFT
- Income statement: 89.5% extraction rate
- Balance sheet: 93.5% extraction rate
- Cash flow: 74.1% extraction rate

Soft Fork Compliance:
- All code in quant/ folder
- Edgar repo clean (no modifications)
- ML training enhancements as optional patch file
- Zero coupling to edgar repo internals

Additional Changes:
- Clean up obsolete test files and financial templates
- Update Claude agent documentation
- Rename soft_fork.md agent

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Replace initial debug tests with comprehensive validation infrastructure for XBRL standardization mappings.

Changes:
- Remove: 5 debug/prototype test files
- Add: Validation framework with comparison tools
- Add: API wrappers for SEC and Nasdaq data sources
- Add: Cache system for 15 test companies (AAPL, BAC, NVDA, etc.)
- Add: Individual validation reports per company

New Test Infrastructure:
- compare_all.py: Batch validation across test companies
- compare_financials.py: Compare SEC vs Nasdaq data
- validate_mappings.py: XBRL mapping accuracy validation
- nasdaq_api.py, sec_api.py: Data source wrappers
- fetch_fresh_nasdaq.py: Refresh test data cache

Test Coverage:
- 15 companies across diverse industries
- Cache files for repeatable testing
- Individual reports tracking extraction accuracy
- Support for both annual and quarterly filings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Implemented all critical and high-priority fixes from analyst review:

Critical Fixes:
- Remove basic/diluted shares cross-contamination (prevents incorrect EPS)
- Add correct IFRS diluted shares concept
- Remove EBIT fallback that incorrectly used pretaxIncome

Structural Improvements:
- Split 6 rules that mixed selectAny + computeAny for clarity
- Fix REIT revenue to prevent double-counting rental income

New Fields Added (6):
- stockBasedCompensation: Major expense line, increasingly material
- ebitda: Most common valuation metric
- interestIncome: Standalone field for interest income
- nonControllingInterest: Required for parent-level accuracy
- incomeFromEquityMethodInvestments: JV/associate income
- discontinuedOperations: Material impact on net income

Schema Changes:
- Version: 2026-01-02.1 → 2026-01-02.2
- Fields: 19 → 25 (+6 new fields)
- Lines: 964 → 1110 (+146 lines)
- Updated schema notes with clarifications

Impact:
- Prevents incorrect EPS calculations from wrong share counts
- Clearer rule evaluation with separated select vs compute logic
- Comprehensive coverage of material income statement items

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Critical Fixes (4):
- Remove PPE gross fallback (prevented $50M+ overstatements)
- Remove inventory gross fallback (prevented reserve overstatements)
- Remove commonStock APIC-inclusive concept (prevented double-counting)
- Add IFRS treasury stock concept (ifrs-full:TreasuryShares)

High Priority Fixes (3):
- Split longTermInvestments mixed rule (direct 155, compute 150)
- Split shortTermDebt mixed rule (direct 155, compute 150)
- Split deposits mixed rule (direct 155, compute 150)

New Fields (8):
- prepaidExpenses (current assets)
- operatingLeaseRightOfUseAsset (ASC 842/IFRS 16)
- deferredTaxAssets (non-current assets)
- deferredRevenue (current liabilities)
- accruedExpenses (current liabilities)
- operatingLeaseLiability (ASC 842/IFRS 16)
- preferredStock (equity)
- nonControllingInterest (equity)

Schema Version: 2026-01-02.1 → 2026-01-02.2

Documentation: BALANCE_SHEET_FIXES_2026-01-02.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
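To illustrate what splitting a mixed rule into a direct rule (priority 155) and a computed rule (priority 150) might look like, here is a minimal sketch. The rule representation, the `select_rule`, `compute_rule`, and `evaluate` helpers, and the choice of concepts are all hypothetical; the PR's actual schema format is not shown in this thread.

```python
from typing import Callable, Optional

Facts = dict[str, float]
Rule = tuple[int, Callable[[Facts], Optional[float]]]


def select_rule(priority: int, concepts: list[str]) -> Rule:
    """Direct rule: return the first listed concept that is reported."""
    def run(facts: Facts) -> Optional[float]:
        for concept in concepts:
            if concept in facts:
                return facts[concept]
        return None
    return priority, run


def compute_rule(priority: int, terms: list[tuple[str, int]]) -> Rule:
    """Computed rule: signed sum of components, only if all are reported."""
    def run(facts: Facts) -> Optional[float]:
        if not all(concept in facts for concept, _ in terms):
            return None
        return sum(sign * facts[concept] for concept, sign in terms)
    return priority, run


def evaluate(rules: list[Rule], facts: Facts) -> Optional[float]:
    """Try rules from highest to lowest priority; first non-None value wins."""
    for _, run in sorted(rules, key=lambda rule: rule[0], reverse=True):
        value = run(facts)
        if value is not None:
            return value
    return None


# Illustrative concepts only (assumption): a reported total is preferred
# over a value computed from its components.
SHORT_TERM_DEBT_RULES = [
    compute_rule(150, [("us-gaap:ShortTermBorrowings", 1),
                       ("us-gaap:LongTermDebtCurrent", 1)]),
    select_rule(155, ["us-gaap:DebtCurrent"]),
]
```

With the two rules kept separate, a filing that reports the total directly never falls through to the computed path, and the priorities make that ordering explicit.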
Critical Fixes (4):
- Fixed incomplete OCF computation (added changeInOtherOperatingAssets and changeInOtherOperatingLiabilities)
- Added freeCashFlow field (OCF - CapEx, most important valuation metric)
- Documented period context requirement for cashBeginning/cashEnd
- Documented sign conventions for working capital changes and outflows

High Priority Fixes (2):
- Added short-term debt concepts to proceedsFromDebt and repaymentsOfDebt
- Added paymentsOfLeaseObligations field (ASC 842/IFRS 16 required)

New Fields (4):
- proceedsFromSaleOfPPE (investing inflow)
- impairmentCharges (OCF non-cash adjustment)
- gainsLossesOnSalesOfAssets (OCF non-cash adjustment)
- dividendsPaidToNonControllingInterests (financing outflow)

Schema Version: 2026-01-02.1 → 2026-01-02.2

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
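The freeCashFlow definition (OCF - CapEx) depends on the sign convention used for outflows. Here is a small sketch assuming capital expenditures may arrive either as a positive payment amount or as a negative outflow, depending on the data source. The function name and the normalization via `abs` are assumptions for illustration, not the PR's implementation.

```python
def free_cash_flow(operating_cash_flow: float, capex: float) -> float:
    """Return OCF minus CapEx, treating CapEx as a magnitude.

    Assumption: some sources report capital expenditures as a positive
    payment and others as a negative outflow, so the sign is normalized
    before subtracting.
    """
    return operating_cash_flow - abs(capex)


# Both sign conventions give the same result.
fcf_from_positive_payment = free_cash_flow(100.0, 30.0)
fcf_from_negative_outflow = free_cash_flow(100.0, -30.0)
```

Normalizing the sign in one place avoids accidentally adding CapEx back when a source already stores it as a negative number.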
Added complete documentation for the quant package including:

**Documentation Added:**
- README.md: Comprehensive guide with TTM methodology and XBRL standardization
  - Detailed Q4 derivation methods (FY - YTD9, FY - Q1-Q3)
  - Stock split detection and adjustment algorithms
  - EPS calculation with split integration
  - YTD period handling and quarterization
  - Non-summation metrics (EPS, shares) explanation
  - 5 practical usage examples
- CLAUDE.md: Development guidelines for quant package
- README_TEST_RESULTS.md: Testing results and issue documentation

**Code Changes:**
- Renamed extractors/is.py → ic.py (avoid Python reserved keyword)
- Updated all imports to use ic module

**Testing:**
- test_readme_examples.py: Comprehensive test suite (10 examples, all passing)
- test_xbrl_example.py: Quick XBRL validation script

**Key Fixes:**
- XBRL examples now use correct us-gaap: prefix for concepts
- All Python examples tested and verified working
- Removed outdated cache and report files

**Impact:**
Provides developers, investors, and analysts with complete documentation for TTM calculations and XBRL standardization functionality.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
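The two Q4 derivation methods named above (FY - YTD9 and FY - Q1-Q3) and the TTM sum can be sketched as follows. The function names and signatures are hypothetical and do not reflect the package's API. The arithmetic applies only to additive flow metrics such as revenue or net income. As the commit notes, EPS and share counts are non-summation metrics and need separate handling.

```python
from typing import Optional, Sequence


def derive_q4(fiscal_year: float,
              ytd9: Optional[float] = None,
              q1_to_q3: Optional[Sequence[float]] = None) -> Optional[float]:
    """Derive the fourth-quarter value of an additive metric.

    Prefers FY - YTD9 (nine-month year-to-date) when available, otherwise
    falls back to FY - (Q1 + Q2 + Q3). Returns None if neither input is
    usable.
    """
    if ytd9 is not None:
        return fiscal_year - ytd9
    if q1_to_q3 is not None and len(q1_to_q3) == 3:
        return fiscal_year - sum(q1_to_q3)
    return None


def trailing_twelve_months(last_four_quarters: Sequence[float]) -> float:
    """Sum the four most recent discrete quarters of an additive metric."""
    if len(last_four_quarters) != 4:
        raise ValueError("TTM needs exactly four quarterly values")
    return sum(last_four_quarters)


# Example with made-up numbers: FY = 400, nine-month YTD = 290.
q4 = derive_q4(400, ytd9=290)
ttm = trailing_twelve_months([95, 105, q4, 100])
```

Trying YTD9 first is a design choice in this sketch: it needs one subtraction from a single reported figure, whereas the Q1-Q3 path depends on three separately reported quarters all being present.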
- Add 536-line README.md with feature overview, API reference, usage examples
- Document LLM optimization (70% token reduction), filing metadata extraction
- Include 20+ practical code examples and troubleshooting guide
- Demonstrate integration with EdgarTools and anthropic SDK

fix(quant): Improve code quality - docstrings and import sorting

- Add/fix docstrings and type annotations in core.py, utils.py, entity_facts_wrapper.py
- Fix missing __init__ docstring in QuantEntityFacts
- Complete argument descriptions for all public methods
- Sort imports in xbrl_standardize tools and extractors
- Fix unused variable in _prepare_quarterly_facts (rename to _)
- Improve markdown helpers module documentation

These changes address CodeFactor violations and improve API documentation.

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <[email protected]>
Thanks for this PR. I have been backed up, but I will now start reviewing it.
Thank you for your efforts in developing EdgarTools; it is amazing. If you need any ticker's financial data to validate your research, I would be happy to provide it.
Thank you for this comprehensive PR! The TTM calculations, Q4 derivation, and stock split adjustment logic are exactly what EdgarTools needs. However, we have concerns about merging the full PR:
Proposed path forward: We'd like to extract and integrate the TTM, Q4, and split adjustment code into the core Would you be open to:
We're happy to credit you as co-author on the commits. Regarding the standardization approach: we'd love to discuss how your sector-aware mappings could enhance our existing system. Perhaps we could collaborate on that as a separate effort?
Hi, my intention with this PR is just to test and explore the new feature.
Beta extensions for EdgarTools enabling cross-company analysis from SEC XBRL data: TTM calculations (including Q4 derivation from FY/YTD filings), stock split detection plus retrospective adjustment for per-share metrics and share counts, and XBRL standardization that maps 1,000+ concepts into a compact set of standardized fields with sector-aware overrides.
Includes a drop-in QuantCompanyAPI, TTMCalculator, schema-based extractors, plus utilities and tests (pytest + real-company validation scripts). This release is beta / tested, not production-ready. More details are in the README file.
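As a rough illustration of retrospective split adjustment for per-share metrics and share counts, the sketch below rescales values reported for periods that ended before a split took effect. The `Split` type and the helper names are hypothetical, and the sketch assumes the inputs are as-originally-reported values (a later filing may already restate earlier periods, in which case no adjustment should be applied).

```python
from dataclasses import dataclass
from datetime import date
from typing import Sequence


@dataclass(frozen=True)
class Split:
    effective: date
    ratio: float  # e.g. 4.0 for a 4-for-1 split


def cumulative_ratio(period_end: date, splits: Sequence[Split]) -> float:
    """Product of the ratios of all splits effective after the period end."""
    ratio = 1.0
    for split in splits:
        if split.effective > period_end:
            ratio *= split.ratio
    return ratio


def adjust_per_share(value: float, period_end: date,
                     splits: Sequence[Split]) -> float:
    """Per-share metrics (e.g. EPS) shrink by the cumulative split ratio."""
    return value / cumulative_ratio(period_end, splits)


def adjust_share_count(shares: float, period_end: date,
                       splits: Sequence[Split]) -> float:
    """Share counts grow by the cumulative split ratio."""
    return shares * cumulative_ratio(period_end, splits)


# Made-up example: one 4-for-1 split, applied to a period that ended before it.
splits = [Split(effective=date(2020, 8, 31), ratio=4.0)]
adjusted_eps = adjust_per_share(12.0, date(2019, 9, 28), splits)
adjusted_shares = adjust_share_count(1000.0, date(2019, 9, 28), splits)
```

Adjusting per-share values and share counts with the same cumulative ratio keeps their product (for example, EPS times shares) unchanged across the split.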