Conversation

@luizhsuperti luizhsuperti commented Mar 23, 2025

Proposal for Documentation Reorganization

linked issue
I was thinking about reorganizing the documentation to improve clarity and usability. Here’s the proposed structure:

- Home (README)
- Quickstart (Installation + Package Overview: Splitter, Perturbator, and Analysis classes)
- Documentation (APIs, details on classes and methods)
- Usage Examples (Notebooks with practical examples)
- Contribution (Guidelines for contributing)
In my fork, I revamped the README to be more appealing to a broader audience, and I also modified the Quickstart section to align with this idea.

I’d love your feedback on whether the Quickstart section information is accurate—I’m still new to the package, so there might be some errors in definitions or concepts. (The Python notebooks currently in the docs are safe and should still be included in some form.)

For inspiration, I looked at the documentation structures of:

- Ambrosia
- pysurvival

Why This Change?

- Makes it easier for new users to navigate and understand the package.
- Provides a clearer structure for future contributors.
- The examples section can be refined over time for better clarity and accuracy.
- In the future, we could also add a "Stats 101" section for foundational concepts.
Let me know what you think!

@luizhsuperti
Author

It's probably worth adding experiment analysis scorecards.

Owner

@david26694 david26694 left a comment

Hey @luizhsuperti , thanks for the PR! I've had a quick read, let me know what you think

README.md Outdated
- 🏢 **Cluster randomization**
- 🔄 **Switchback experiments**

### 🛠 **Data Preprocessing**
Owner

I'd remove this section; I don't think the pandas integration is very relevant, nor are there tools for data preparation in the lib

Author

see 1st comment

README.md Outdated
- Seamlessly integrates with **Pandas** for streamlined workflows

### 📊 **Comprehensive Experiment Analysis**
##### **✅ Metrics**
Owner

I'd drop the metrics one for now since it looks like we have a bug (see last issue)

README.md Outdated
- 📌 **Generalized Estimating Equations (GEE)**
- 📌 **Mixed Linear Models** for robust inference
- 📌 **Ordinary Least Squares (OLS)** and **Clustered OLS** with covariates
- 📌 **T-tests** with variance reduction techniques (**CUPED, CUPAC**)
Owner

I'd merge this and the one above, and not mention t-tests, since mostly it's OLS with covariates, CUPED, CUPAC

README.md Outdated
pip install cluster-experiments
=======
# MDE calculation
mde = npw.mde(df, power=0.8)
Owner

for the MDE example, I have to ask: it needs to be reproducible (so the dataframe needs to be created), and to show the methods power_analysis, mde, power_line and mde_line. wdyt?

Author

Definitely what we should do. I'm thinking that, for a 1st time user, we should show the MDE calculation process and scorecard. wdyt?
The other question is where this example should be, in Readme (Home) or quickstart.

Owner

> Definitely what we should do. I'm thinking that, for a 1st time user, we should show the MDE calculation process and scorecard. wdyt?

yes! in the simplest set-up but yes.

> The other question is where this example should be, in Readme (Home) or quickstart.

I really think there's value in a reproducible hello-world example in what all the users see, which is the readme
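For reference, the analytical calculation behind such a hello-world is short enough to inline. The sketch below is only the CLT math a normal-approximation MDE example would rest on, not the cluster-experiments API itself (the function name `mde_normal` is made up here), using nothing beyond the standard library:

```python
from statistics import NormalDist

def mde_normal(std: float, n_per_group: int, alpha: float = 0.05, power: float = 0.8) -> float:
    """CLT-based minimum detectable effect for a two-sample difference in means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)        # critical value for a two-sided test
    z_power = z.inv_cdf(power)                # quantile matching the target power
    se = (2 * std ** 2 / n_per_group) ** 0.5  # standard error of the difference in means
    return (z_alpha + z_power) * se

print(round(mde_normal(std=10.0, n_per_group=1000), 3))  # → 1.253
```

A README example built on the real API would wrap the same quantities, but with a generated dataframe so readers can copy/paste and run it.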

Owner

I think the variance reduction example can go to quickstart instead of readme

```

!!! info "Python Version Support"
**Cluster Experiments** requires **Python 3.9 or higher**. Make sure your environment meets this requirement before proceeding with the installation.
Owner

it's 3.8 I think

README.md Outdated
## Quick Start

### Power Analysis Example
**`cluster experiments`** is a comprehensive Python library for end-to-end A/B testing workflows, designed for seamless integration with Pandas in production environments.
Owner

> designed for seamless integration with Pandas in production environments.
I'd remove any production mention, I don't think it's fair to call this production. "seamless integration" sounds generated by an LLM, do you have a more natural equivalent?

Author

Hey! Finally had time to return. Thank you for the feedback! I'm adding the answers to the other suggestions you made in the README file. I modified the following as suggested:

i. I added, at the beginning, the simulation-based / normal-approximation support.
ii. For support of complex experimental designs, I kept it as is; I'd keep Variance Reduction Techniques under statistical methods of analysis, to clarify the difference between experimental design and how we evaluate effects.
iii. Deleted t-tests.
iv. Deleted metrics.
v. Added the Scorecards feature; it's a neat feature that's definitely a plus.
vi. Removed pre-processing.
vii. Dropped the "why use it?" section (How did you know it was LLM-adjusted? :D)

  • The README section should already be clear with the examples.

It's been a while, so the pkg got updated; let me know of additional changes.

Designing and analyzing experiments can feel overwhelming at times. After formulating a testable hypothesis,
you're faced with a series of routine tasks. From collecting and transforming raw data to measuring the statistical significance of your experiment results and constructing confidence intervals,
it can quickly become a repetitive and error-prone process.
*Cluster Experiments* is here to change that. Built on top of well-known packages like `pandas`, `numpy`, `scipy` and `statsmodels`, it automates the core steps of an experiment, streamlining your workflow, saving you time and effort, while maintaining statistical rigor.
Owner

I'd make the paragraph shorter and stress what it automates, namely MDE/power calculation and inference scorecards

Owner

given the next examples, I think it's worth mentioning that you're describing the simulation-based power analysis, and that there are other pipelines, like power analysis based on normal approximation and scorecard generation
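To make that distinction concrete for the docs, here is a minimal sketch of what "simulation-based" means, boiled down to a plain two-sample test with standard-library code only (the function name and parameters are illustrative, not the cluster-experiments API):

```python
import random
from statistics import NormalDist, mean

def simulated_power(effect: float, std: float, n_per_group: int,
                    alpha: float = 0.05, n_sims: int = 500, seed: int = 42) -> float:
    """Estimate power empirically: simulate many experiments, count rejections."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = (2 * std ** 2 / n_per_group) ** 0.5
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, std) for _ in range(n_per_group)]
        treatment = [rng.gauss(effect, std) for _ in range(n_per_group)]
        z_stat = (mean(treatment) - mean(control)) / se
        rejections += abs(z_stat) > z_crit
    return rejections / n_sims

# An effect near the analytical 80%-power MDE should come out close to 0.8
print(simulated_power(effect=1.77, std=10.0, n_per_group=500))
```

The normal-approximation pipeline replaces the loop with a closed-form formula; the simulation pipeline is what generalizes to clustered splits and arbitrary analysis methods.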

Owner

I like the explanation style, maybe you could write a similar thing for NormalPowerAnalysis and AnalysisPlan

```python
from cluster_experiments import TTestClusteredAnalysis

analysis = TTestClusteredAnalysis(
Owner

let's use ClusteredOLS, I think this analysis method is a bit weird

@david26694 david26694 linked an issue Mar 26, 2025 that may be closed by this pull request
@david26694
Owner

Hey @luizhsuperti, I was playing with this and found an issue, all examples are under switchback, whenever you can have a look please :)

Author

@luizhsuperti luizhsuperti left a comment

Hey! Added comments on the suggestions in the review doc. After that I'm jumping into the QuickStart, and finally into the Examples we should have in the pkg

@david26694
Owner

thanks @luizhsuperti, I'll have a look later! Be careful with the conflicts in mkdocs.yml; you'll need to resolve them.

Owner

@david26694 david26694 left a comment

Asked for a general iteration. Keep in mind the conflicts with the main branch!

README.md Outdated
power_line_normal = npw.power_line(df, average_effects=[0.1, 0.2, 0.3])
### 📌 Experiment Design & Planning
- **Power analysis** and **Minimal Detectable Effect (MDE)** estimation
- **Normal Approximation (CLT-based)**: Fast, analytical formulas assuming approximate normality
Owner

I think we need to add more spaces to render inside of the above, or asterisks


README.md Outdated
- **Normal Approximation (CLT-based)**: Fast, analytical formulas assuming approximate normality
- Best for large sample sizes and standard A/B tests
- **Monte Carlo Simulation**: Empirically estimate power or MDE by simulating many experiments
- Ideal for complex or non-standard designs (e.g., clustering, non-normal outcomes)
Owner

Suggested change
- Ideal for complex or non-standard designs (e.g., clustering, non-normal outcomes)
- Ideal for complex or non-standard designs (e.g., small samples, complex effect distribution)

README.md Outdated

`cluster experiments` empowers analysts and data scientists with **scalable, reproducible, and statistically robust** A/B testing workflows.

🔗 **Get Started:** [Documentation Link]
Owner

missing a link?

mkdocs.yml Outdated
- Dimension: api/dimension.md
- Hypothesis Test: api/hypothesis_test.md
- Analysis Plan: api/analysis_plan.md
- Home: ../README.md
Owner

not found, let's revert to index.md


You can install **Cluster Experiments** via pip:

```bash
Owner

I recommend adding simpler examples, like dictionary-based inputs in the quickstart. Not saying that we should remove what we have, but I'd also add the simplest use of the library
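As a starting point for that simplest example, a dictionary-based input could be as small as the sketch below. The key names are assumptions modeled on the library's `PowerAnalysis.from_dict` interface and should be verified against the current API before publishing:

```python
# Hypothetical dictionary-based config for a power analysis quickstart.
# Key names are assumptions and must be checked against the current docs.
config = {
    "analysis": "ols_clustered",   # analysis method applied to each simulation
    "perturbator": "constant",     # how the fake treatment effect is injected
    "splitter": "clustered",       # randomization scheme
    "cluster_cols": ["cluster"],   # column(s) defining the clusters
    "n_simulations": 50,           # Monte Carlo iterations
}

# pw = PowerAnalysis.from_dict(config)
# power = pw.power_analysis(df, average_effect=0.1)
```

Something this short lets a first-time user see the whole pipeline in one glance before the class-based examples.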

mkdocs.yml Outdated
- Pre experiment outcome model: api/cupac_model.md
- Power config: api/power_config.md
- Power analysis: api/power_analysis.md
- Washover: api/washover.md
Owner

can this move into switchback?

mkdocs.yml Outdated
- Power config: api/power_config.md
- Power analysis: api/power_analysis.md
- Washover: api/washover.md
- Metric: api/metric.md
Owner

can you group all of these into experiment analysis?

mkdocs.yml Outdated
- Synthetic control: synthetic_control.ipynb
- Delta Method Analysis: delta_method.ipynb
- Experiment analysis workflow: experiment_analysis.ipynb
- Contribute:
Owner

either we create these docs or we remove this, right?

@luizhsuperti
Author

@david26694 I changed the readme and quickstart (it's in .md, but maybe a notebook is better?). The notebooks are in the docs but maybe should move to docs/examples, and then we rework them there.

Reformulate the home and quickstart, as well as the examples and API reference (Examples will probably need to be evaluated case by case)
@codecov-commenter

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.51%. Comparing base (ca58612) to head (601ec18).
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #232   +/-   ##
=======================================
  Coverage   96.51%   96.51%           
=======================================
  Files          17       17           
  Lines        1809     1809           
=======================================
  Hits         1746     1746           
  Misses         63       63           

☔ View full report in Codecov by Sentry.

Owner

@david26694 david26694 left a comment

it looks good, requested some last changes

- Hypothesis Test: api/hypothesis_test.md
- Analysis Plan: api/analysis_plan.md

- Contributing: CONTRIBUTING.md
Owner

when executing locally, I found a broken link here

- **Power Analysis & Sample Size Calculation**
- Simulation-based (Monte Carlo) for any design complexity
- Analytical, (CLT-based) for standard designs
- Minimal Detectable Effect (MDE) estimation
Owner

Suggested change
- Minimal Detectable Effect (MDE) estimation
- Minimum Detectable Effect (MDE) estimation


print(power, power_line_normal, power_normal, mde, mde_timeline)
### **Experiment Design**
- **Power Analysis & Sample Size Calculation**
Owner

this is not rendering well, it's just highlighting, I think you need more line breaks


- **Variance Reduction Techniques**
- CUPED (Controlled-experiment Using Pre-Experiment Data)
- CUPAC (CUPED with Pre-experiment Aggregations)
Owner

CUPAC is Control Using Predictions As Covariates
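Since the acronym expansions are easy to mix up, a one-screen sketch of what CUPED actually does might help the quickstart. This is the generic adjustment (helper name `cuped_adjust` is made up for illustration), not the library's implementation:

```python
from statistics import mean

def cuped_adjust(y: list, x: list) -> list:
    """CUPED: residualize the outcome y on a pre-experiment covariate x."""
    x_bar, y_bar = mean(x), mean(y)
    # theta = cov(x, y) / var(x), computed by hand to stay on the stdlib
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    theta = cov_xy / var_x
    return [yi - theta * (xi - x_bar) for yi, xi in zip(y, x)]

# Outcome strongly driven by pre-period data: the adjustment removes that variance
pre = [1.0, 2.0, 3.0, 4.0, 5.0]
post = [2.1, 3.9, 6.2, 7.8, 10.1]  # roughly 2 * pre plus noise
adjusted = cuped_adjust(post, pre)
```

CUPAC follows the same shape, but with a model's predictions standing in for the raw pre-experiment covariate.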

)
print(power_curve)
# Tip: You can plot this using matplotlib:
# plt.plot(power_curve['average_effect'], power_curve['power'])
Owner

perhaps you can do it without matplotlib, like power_curve.plot(); would that work and have fewer dependencies?

historical_data,
average_effects=[2.0, 4.0, 6.0, 8.0, 10.0]
)
print(power_curve)
Owner

could you print with fewer decimals?


```python
analysis_plan = AnalysisPlan.from_metrics_dict({
'metrics': [...],
Owner

what are your thoughts on having code that will run in here? perhaps more verbose, but better if the user needs to copy/paste


### 2.1. MDE

Calculate the Minimum Detectable Effect (MDE) for a given sample size ($), $/alpha$ and $\beta$. parameters.
Owner

alpha and beta are not rendering well

})

mde = power_analysis.mde(historical_data, power=0.8)
print(f"Minimum Detectable Effect: {mde}")
Owner

wdyt about fewer decimals?


```python
power = power_analysis.power_analysis(historical_data, average_effect=3.5)
print(f"Power: {power}")
Owner

wdyt about fewer decimals?

@david26694
Owner

another thing: let's add all your .md files to test_docs.py; this way we have tests for more docs

Development

Successfully merging this pull request may close these issues.

Reorg docs

3 participants