Commit ba8680b

Author: ProofCore Team
Sync project files and updates to new repository
1 parent 4ffa666 commit ba8680b

26 files changed: 798 additions & 46 deletions

CHANGELOG.md

Lines changed: 17 additions & 0 deletions
@@ -7,6 +7,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ---

+## [1.0.3] - 2025-11-06
+
+### Added
+- Introduced the Research Benchmark Module (RBM) foundation: loader/parser, cascade validator, metrics helpers, and CLI reporting (`rbm_cli`) with a sample dataset.
+- Added Python smoke tests under `backend/tests_rbm/` covering CA proof hooks, cascade evaluation, metrics, and CLI execution.
+- Replaced vendored Pyodide assets with a manifest-driven fetch/verify pipeline (`npm run setup:pyodide`, `npm run verify:offline-assets`).
+
+### In Progress
+- Extend the manifest workflow with Subresource Integrity hashes and automated dependency scanning (npm audit / pip-audit / OSV).
+- Implement Service Worker caching and true lazy loading so Pyodide assets download only when offline features are invoked.
+
+---
+
 ## [1.0.2] - 2025-10-24

 ### [*] Complete Optimization & Offline-First Certification Release
@@ -29,6 +42,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **Documentation & Versioning**
   - `.env.example`, backend configuration, README, and packaging metadata now consistently reflect v1.0.2.
   - CHANGELOG and release notes point to the offline-first maintenance scope.
+- **Research Benchmark Module (RBM)**
+  - Introduced `proofcore/research_benchmark` with computer-assisted proof hooks, cascade validator, and metrics helpers.
+  - Added `rbm_cli` reporting utility and sample dataset `data_examples/sample_set.json` for smoke tests.
+  - Python smoke tests under `backend/tests_rbm` cover hooks, metrics, and CLI integration.

 #### Added
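The Subresource Integrity item listed under "In Progress" could be prototyped along the following lines. This is a minimal sketch rather than the project's manifest tooling: the helper name `sri_hash` is hypothetical, and SHA-384 is assumed as the default only because it is a commonly used SRI algorithm.

```python
import base64
import hashlib


def sri_hash(data: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI-style string of the form '<algorithm>-<base64 digest>'."""
    # The SRI specification names sha256, sha384, and sha512.
    if algorithm not in {"sha256", "sha384", "sha512"}:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # Hypothetical asset bytes; a real run would read a downloaded file.
    print(sri_hash(b"example asset bytes"))
```

A string produced this way could be stored next to each manifest entry and compared again after download.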

README.md

Lines changed: 12 additions & 4 deletions
@@ -95,15 +95,23 @@ npm run dev
 ### Offline Mode (Default)

 - ProofCore ships with `VITE_OFFLINE_MODE=true`, so verification works entirely in-browser without starting the FastAPI backend.
-- Run `npm run verify:offline-assets` to confirm Pyodide bundles exist in `public/pyodide` before going air-gapped.
+- Run `npm run setup:pyodide` when you need offline Pyodide support; it downloads assets declared in `pyodide-manifest.json`.
+- Run `npm run verify:offline-assets` to confirm the downloaded files match the manifest before going air-gapped.
 - Remote LLM providers stay disabled unless you set `ENABLE_LLM_PROVIDERS=true` (backend) and provide API keys. Leave them unset for the OSS/offline profile.
 - If you do enable networking, also set `VITE_ALLOW_NETWORK=true` (frontend) so guarded fetches can reach the backend.

 #### Vendoring Pyodide for Offline Use

-1. Grab the latest Pyodide release tarball from https://github.com/pyodide/pyodide/releases (once, while online).
-2. Extract the `pyodide` directory into `public/pyodide/` (keep `pyodide.js`, `packages.json`, and, depending on the release version, `pyodide_py.tar` or `python_stdlib.zip`).
-3. Run `npm run verify:offline-assets` to double-check the files are in place.
+1. Review `pyodide-manifest.json` and list the assets you plan to ship (include hashes or SRI strings when possible).
+2. Run `npm run setup:pyodide` to download the files locally under `public/pyodide/`.
+3. Run `npm run verify:offline-assets` to confirm the downloaded files match the manifest.
+
+### Research Benchmark Module (RBM)
+
+- `proofcore/research_benchmark/` hosts ProofCore's step-level benchmarking utilities (loader, parser, cascade validator, and metrics).
+- Evaluate a dataset via `python -m proofcore.research_benchmark.rbm_cli --dataset proofcore/research_benchmark/data_examples/sample_set.json`.
+- Hooks and metrics mirror the CA proof pipeline and will expand to cover IMO-Bench style evaluations (AnswerBench / ProofBench / GradingBench).
+- Python smoke tests under `backend/tests_rbm/` keep the RBM stack regression-safe.

 ### Run Tests
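The `rbm_cli` invocation in the README hunk above can also be driven from Python. The sketch below assumes, based on `backend/tests_rbm/test_cli.py` in this commit, that the CLI prints a JSON report with a `metrics` key to stdout. The helper names (`build_rbm_command`, `parse_rbm_report`, `run_rbm`) are hypothetical and not part of ProofCore.

```python
import json
import subprocess
import sys
from pathlib import Path
from typing import Any


def build_rbm_command(dataset: Path) -> list[str]:
    """Build the argv list for the module invocation shown in the README."""
    return [
        sys.executable,
        "-m",
        "proofcore.research_benchmark.rbm_cli",
        "--dataset",
        str(dataset),
    ]


def parse_rbm_report(stdout: str) -> dict[str, Any]:
    """Extract the metrics mapping from the JSON report (assumed layout)."""
    payload = json.loads(stdout)
    if "metrics" not in payload:
        raise ValueError("report is missing the 'metrics' key")
    return payload["metrics"]


def run_rbm(dataset: Path, repo_root: Path) -> dict[str, Any]:
    """Run the CLI from the repository root so `proofcore` is importable."""
    result = subprocess.run(
        build_rbm_command(dataset),
        cwd=repo_root,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_rbm_report(result.stdout)
```

Running with `cwd` set to the repository root is a deliberate choice here: the child process does not inherit `sys.path` edits made in the parent, so the package has to be importable from its working directory.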

RELEASE_NOTES_v1.0.2.md

Lines changed: 6 additions & 0 deletions
@@ -26,6 +26,12 @@ This maintenance refresh for ProofCore v1.0.2 reinforces the offline-first contr
 - Added `pyodide` dependency declaration plus `npm run verify:offline-assets` to confirm required WASM artifacts (`pyodide.js`, `packages.json`, and `pyodide_py.tar` or `python_stdlib.zip`) are vendored under `public/pyodide/`.
 - `public/pyodide/README.md` documents the manual download workflow for air-gapped deployments.

+### Research Benchmark Module (RBM)
+
+- Introduced `proofcore/research_benchmark` skeleton with loader/parser, cascade validator, and metrics helpers.
+- Added `rbm_cli` for report generation plus `data_examples/sample_set.json` for quick smoke tests.
+- Python regression tests (`backend/tests_rbm`) cover hooks, metrics, and CLI execution.
+
 ### Dependency Refresh

 - Upgraded to `[email protected]`, `[email protected]`, `@mswjs/[email protected]`, and `[email protected]`, clearing previously reported low/moderate CVEs in the stack.

RELEASE_NOTES_v1.0.3.md

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+# ProofCore v1.0.3 – Research Benchmark Module Preview
+
+**Release Date**: 2025-11-06
+**Status**: ✅ Production Ready
+**License**: MIT
+
+---
+
+## Overview
+
+This release introduces the Research Benchmark Module (RBM) skeleton to ProofCore. RBM standardises step-level dataset ingestion, computer-assisted proof checks, evaluation metrics, and reporting so we can integrate internal corpora and public benchmarks (e.g., IMO-Bench) in a repeatable way. It also kicks off a broader plan to externalise Pyodide assets for security and performance.
+
+---
+
+## Highlights
+
+### Research Benchmark Module (RBM)
+
+- Added `proofcore/research_benchmark/` with loader/parser helpers, a cascade validator that leverages CA proof hooks, and the first metrics helpers (`balanced_scores`, `omega_rbm`).
+- Delivered `rbm_cli` to run end-to-end evaluations and emit JSON reports; bundled `data_examples/sample_set.json` for smoke testing.
+- Added Python regression suites in `backend/tests_rbm/` covering the CA proof hooks, cascade pipeline, metrics, and CLI execution.
+
+### Pyodide Asset Pipeline
+
+- Repository now ships with an empty `public/pyodide/` plus `pyodide-manifest.json`; assets are fetched on demand.
+- Added `npm run setup:pyodide` (fetch) and `npm run verify:offline-assets` (manifest verification) to manage downloads safely.
+- Hash verification is supported via manifest entries and should be enabled for production deployments.
+
+### Next Steps
+
+- Expand manifest generation to include Subresource Integrity hashes automatically.
+- Integrate dependency scanning (npm audit, pip-audit, OSV) into CI for Pyodide bundles.
+- Introduce Service Worker background caching to improve first-use latency.
+
+### Documentation & Versioning
+
+- CHANGELOG, README, and release notes updated to describe RBM usage and Pyodide asset strategy.
+- Version bumped to **1.0.3** in `pyproject.toml`, `package.json`, and `setup.py`.
+
+---
+
+## Installation Notes
+
+1. `npm install` (updates lockfile to 1.0.3).
+2. `python -m pytest backend/tests_rbm -q --no-cov` (verifies RBM stack).
+3. `npm run test -- tests/offline/offline_guarantee.test.ts` (ensures offline hardening remains intact).
+
+---
+
+## Known Issues / Follow Ups
+
+- Manifest entries currently ship without hashes; add SRI/hash values before production deployment.
+- Remaining npm audit warnings require larger upgrades (Vite/Vitest majors).
+
+---
+
+## Acknowledgements
+
+Thanks to the ProofCore maintainers for laying the groundwork for research dataset integration and tightening our asset security posture.
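The manifest hash verification described under "Pyodide Asset Pipeline" and "Known Issues" could look roughly like the sketch below. The commit does not show the schema of `pyodide-manifest.json`, so the layout used here (a top-level `files` list whose entries carry `path` and `sha256`) is purely an assumption for illustration, as are the function names. The actual check lives in `scripts/verify-offline-assets.mjs`, which is not reproduced in this diff.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large WASM bundles are not read in one go."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path: Path, asset_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means every entry checked out."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    problems: list[str] = []
    for entry in manifest.get("files", []):  # "files" is an assumed key
        target = asset_dir / entry["path"]
        expected = entry.get("sha256")  # assumed field name
        if not target.is_file():
            problems.append(f"missing: {entry['path']}")
        elif expected is None:
            # Mirrors the known issue: entries without hashes are flagged.
            problems.append(f"no hash recorded: {entry['path']}")
        elif sha256_of(target) != expected.lower():
            problems.append(f"hash mismatch: {entry['path']}")
    return problems
```

Reporting entries that lack a hash as problems, instead of skipping them, keeps an unhashed manifest from passing silently.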

backend/tests_rbm/test_ca_proof.py

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
+import pathlib
+import sys
+
+ROOT = pathlib.Path(__file__).resolve().parents[2]
+if str(ROOT) not in sys.path:
+    sys.path.insert(0, str(ROOT))
+
+from proofcore.research_benchmark.ca_proof import (
+    formal_check,
+    numeric_sanity,
+    passes_assumption_guard,
+)
+
+
+def test_formal_check_valid_expression():
+    assert formal_check("x^2 + 2*x + 1") is True
+
+
+def test_formal_check_rejects_invalid_expression():
+    assert formal_check("x^2 + ") is False
+
+
+def test_numeric_sanity_with_safe_expression():
+    assert numeric_sanity("4 / (2 + 2)") is True
+
+
+def test_numeric_sanity_detects_invalid_operation():
+    assert numeric_sanity("1 / 0") is False
+
+
+def test_assumption_guard_blocks_banned_phrase():
+    assert passes_assumption_guard("WLOG, assume x > 0") is False
+
+
+def test_assumption_guard_allows_regular_text():
+    assert passes_assumption_guard("Assume x > 0 for contradiction.") is True
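The module `proofcore/research_benchmark/ca_proof.py` is not rendered in this part of the diff, so the behaviour the tests above pin down can only be illustrated with stand-ins. The following is a minimal sketch, assuming a syntax-only check, a restricted arithmetic evaluator, and a banned-phrase list; it uses only the standard library and is not the project's implementation. The phrase list and the choice of `ast` are assumptions.

```python
import ast
import math
import operator

# Assumed banned phrases; the real list is not shown in this commit.
_BANNED_PHRASES = ("wlog", "without loss of generality")

_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def formal_check(expr: str) -> bool:
    """True if expr parses as one expression once '^' is read as a power."""
    try:
        ast.parse(expr.replace("^", "**"), mode="eval")
    except (SyntaxError, ValueError):
        return False
    return True


def _evaluate(node: ast.AST) -> float:
    """Evaluate literals combined with + - * / only; reject anything else."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def numeric_sanity(expr: str) -> bool:
    """True if the arithmetic expression evaluates to a finite number."""
    try:
        return math.isfinite(_evaluate(ast.parse(expr, mode="eval")))
    except (SyntaxError, ValueError, ArithmeticError):
        return False


def passes_assumption_guard(text: str) -> bool:
    """False when the step text contains a banned shortcut phrase."""
    lowered = text.lower()
    return not any(phrase in lowered for phrase in _BANNED_PHRASES)
```

Walking the AST instead of calling `eval` keeps the numeric check limited to literals and the four basic operators, so names, calls, and attribute access are rejected.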

backend/tests_rbm/test_cli.py

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
+import json
+import pathlib
+import subprocess
+import sys
+
+ROOT = pathlib.Path(__file__).resolve().parents[2]
+if str(ROOT) not in sys.path:
+    sys.path.insert(0, str(ROOT))
+
+
+def test_cli_generates_report():
+    dataset = ROOT / "proofcore" / "research_benchmark" / "data_examples" / "sample_set.json"
+    result = subprocess.run(
+        [sys.executable, "-m", "proofcore.research_benchmark.rbm_cli", "--dataset", str(dataset)],
+        capture_output=True,
+        text=True,
+        check=True,
+    )
+    payload = json.loads(result.stdout)
+    assert "metrics" in payload
+    assert payload["metrics"].get("omega", 0.0) >= 0.0

package-lock.json

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

package.json

Lines changed: 2 additions & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@proofcore/engine",
-  "version": "1.0.2",
+  "version": "1.0.3",
   "description": "ProofCore - Hybrid Mathematical Proof Verification Engine",
   "type": "module",
   "scripts": {
@@ -20,6 +20,7 @@
     "api:generate:local": "openapi-typescript backend/openapi.json -o src/api/schema.d.ts && echo 'OpenAPI types generated from local schema'",
     "storybook": "storybook dev -p 6006",
     "build-storybook": "storybook build",
+    "setup:pyodide": "node scripts/fetch-pyodide-assets.mjs",
     "verify:offline-assets": "node scripts/verify-offline-assets.mjs"
   },
   "dependencies": {
2526
"dependencies": {

proofcore/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+"""ProofCore Python helpers."""
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+"""Research Benchmark Module (RBM) bootstrap."""
+
+from .rbm_loader import load_rbm_set
+from .rbm_parser import to_steps
+from .rbm_validator import verify_steps_cascade
+from .rbm_metrics import balanced_scores, omega_rbm
+
+__all__ = ["load_rbm_set", "to_steps", "verify_steps_cascade", "balanced_scores", "omega_rbm"]
