Redesign Project Structure #22

zkhotanlou · 2025-01-29T03:52:29Z

After asserting that Model.data == data.obj, all instances of data access through the model have been eliminated. This necessitated structural changes to the Model class and a redesign of workflows, particularly in evaluation and benchmarking. The revised approach now explicitly passes data where needed, improving modularity and clarity.

Several deprecated methods, including scatterDataset, scatterDecisionBoundry, and visualizeDatasetAndFixedModel, have been moved to a dedicated deprecated/load_model.py file. These methods were previously used in loadModel.py, which is now commented out. Similarly, the pipelining folder has been deprecated and relocated, as it was only utilized in api/model.py for returning features. Loadmodelfordataset now accepts dataset objects rather than dataset names.
A key highlight of this pull request is the introduction of a more intuitive project workflow. The new flow encapsulates the dataset, model, and recourse method setup, ensuring clean interactions between components:

dataset = DataCatalog(data_name, model_type)
model = ModelCatalog(dataset, model_type, backend)
method = RecourseMethod(dataset, model)
benchmark = Benchmark(method, factuals)
This approach aligns with the decoupling of data and model, simplifying how evaluation and benchmarking pipelines access these components. The method object now serves as the single source for both the model and data, ensuring encapsulation and cleaner dependencies.

amirhk · 2025-01-29T12:36:50Z

Great new workflow! A few questions below:

These methods were previously used in loadModel.py, which is now commented out.

the deprecated code should be removed, not commented.

dataset = DataCatalog(data_name, model_type)

why does the dataset (I suppose data_obj) need information on the model_type?

method = RecourseMethod(dataset, model)

should the method_type not be included as an argument?

benchmark = Benchmark(method, factuals)

question: other than runtime, are there any other metrics that rely on the method itself? it seems cleaner if the method footprint was: performance = Benchmark(method, factuals)

another question, is Benchmark a class (hence starting with upper case B), or is it a method (should start with lower case b)?

zkhotanlou added 9 commits January 12, 2025 19:35

redesign model structure

19b570b

redesign: data folder structure

3d81dc9

redesign: methods structure

d09ff8e

refactor: data flow structure

c39f7c0

refactor: evaluation structure

5331be1

refactor: methods flow structure

423a1ca

refactor: models structure

3048f5c

fix: tools impacted by structure changes

1abcc89

fix: experiments impacted by structure changes

693cc76

zkhotanlou requested review from a4bello and amirhk January 29, 2025 03:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Redesign Project Structure #22

Redesign Project Structure #22

Uh oh!

zkhotanlou commented Jan 29, 2025

Uh oh!

amirhk commented Jan 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Redesign Project Structure #22

Are you sure you want to change the base?

Redesign Project Structure #22

Uh oh!

Conversation

zkhotanlou commented Jan 29, 2025

Uh oh!

amirhk commented Jan 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants