Related to the half-merged PR (see #173).
Sciunit has great existing methods for scoring collections of models. However, when network simulators are applied to the task of single-neuron model optimization, three notable simulators (PyNN, NEST, SNN.jl, and Brian2) make it more convenient to model a population of cells (N>1) than a single cell (N=1). Indeed, these simulators appear to be optimized at the byte-code level for evaluating a cell population.
Note that there are pre-existing alternative implementations that solve this problem; for example, the Numba JIT adaptive exponential and Izhikevich models mean you can avoid using network-level simulators to run single-cell models.
In the context of genetic algorithm (GA) optimization, a NEST neuron population maps conveniently onto a GA chromosome population, so in theory NeuronUnit+GA should be well suited to exploit the efficiency of network simulators to evaluate whole populations.
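The chromosome-to-population mapping can be sketched in a few lines. This is an illustrative stand-alone snippet, not real NEST/PyNN code: the function name and the parameter names `a`/`b` are hypothetical, and in practice the resulting column-wise parameter arrays would be handed to something like a simulator's population constructor.

```python
# Hypothetical sketch: mapping a GA chromosome population onto the
# column-wise parameter arrays a population-based simulator expects,
# so that N candidate models are evaluated in a single simulator call.

def chromosomes_to_population_params(chromosomes):
    """Each chromosome is a dict of model parameters; return one list
    per parameter, ordered consistently across the population."""
    keys = sorted(chromosomes[0])
    return {k: [c[k] for c in chromosomes] for k in keys}

# One GA generation == one simulator population of size N.
chromosomes = [
    {"a": 0.02, "b": 0.20},
    {"a": 0.03, "b": 0.25},
]
params = chromosomes_to_population_params(chromosomes)
# params["a"] is [0.02, 0.03]; params["b"] is [0.20, 0.25]
```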
Unfortunately, in practice the NEST and PyNN simulation platforms are a bad match for the current design pattern of Sciunit/NeuronUnit/BluePyOpt. To clarify: collecting models by putting them inside a sciunit model collection container is essentially a separate and different task from collecting models in a NEST/PyNN/SNN.jl population.
To confound matters, many so-called Python neuron simulators are not Python native. When trying to store collections of external-code simulator models inside sciunit containers, memory problems ensue, byte-code efficiency is poor, and parallel distributed methods are very unlikely to work.
The problem is that PyNN and NEST have their own preferred, safe ways to parallelize population model evaluations. Containing many single-cell models in a sciunit collection and distributing that code with dask is not what NEST expects: relative to the external simulator, the sciunit container class is foreign and unsuited, and distributing it using dask in Python is not safe because of nested parallelization. Moreover, PyNN and NEST don't really support modelling a single cell so much as creating cell populations of size N=1.
I propose a second design of sciunit model collections called SimPopulationModel. In this design a simulator population is simply contained by a new sciunit class, which inherits sciunit attributes (RunnableModel, etc.).
In the SimPopulationModel sciunit class, model predictions and observations are regular getter/setter-based attributes that have been decoupled from the sciunit generate_prediction/extract_feature methods. This means simulator users are completely free to generate predictions/features independently of the sciunit infrastructure. This is necessary to exploit NEST's ability to act on whole populations quickly, which means features are obtained in parallel in a highly NEST/PyNN/SNN.jl-specific way.
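A minimal sketch of what the proposed class could look like. In the real design it would inherit sciunit's RunnableModel; here a plain stand-in base class is used so the snippet is self-contained, and all names besides SimPopulationModel are illustrative.

```python
# Hedged sketch of the proposed SimPopulationModel. The `RunnableModel`
# below is a stand-in for sciunit.models.RunnableModel, used only to
# keep this example self-contained.

class RunnableModel:  # stand-in for sciunit.models.RunnableModel
    def __init__(self, name):
        self.name = name

class SimPopulationModel(RunnableModel):
    """Contains a whole simulator population. Predictions and
    observations are plain settable attributes, decoupled from the
    generate_prediction/extract_feature methods."""

    def __init__(self, name, backend=None):
        super().__init__(name)
        self.backend = backend      # e.g. a NEST/PyNN/SNN.jl handle
        self._predictions = None
        self.observations = None

    @property
    def predictions(self):
        return self._predictions

    @predictions.setter
    def predictions(self, value):
        # External simulator code extracts features for the whole
        # population in parallel and simply assigns them here; sciunit
        # never has to drive the simulator itself.
        self._predictions = value
```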
By the time a SimPopulationModel is ready to be judged, all the computationally heavy work of feature extraction has been completed, and sciunit is free to do what it does best, since predictions (features) and observations have already been provided by other infrastructure.
This method expects users to have pre-existing code that performs something like judge/generate_prediction. The prediction is then just treated as a settable object attribute which can be updated. Doing things this way is more amenable to high-throughput feature generation, i.e. if EFEL generates 100 new features per model.
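The resulting judging step can be sketched as pure bookkeeping over pre-computed features. Everything here is hypothetical: the feature dicts stand in for output from a bulk extractor such as EFEL, and `z_score` stands in for whatever scoring a sciunit test would apply once predictions and observations are both attributes.

```python
# Illustrative workflow: features were already extracted in bulk outside
# sciunit (e.g. by EFEL over a whole population), so scoring is cheap.

precomputed = {                       # hypothetical per-cell features
    "cell_0": {"spike_count": 14},
    "cell_1": {"spike_count": 9},
}
observation = {"spike_count": {"mean": 10, "std": 2}}

def z_score(prediction, observation):
    """Stand-in for a sciunit test's scoring step."""
    obs = observation["spike_count"]
    return (prediction["spike_count"] - obs["mean"]) / obs["std"]

scores = {name: z_score(pred, observation)
          for name, pred in precomputed.items()}
# scores: cell_0 -> 2.0, cell_1 -> -0.5
```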
@all-contributors please add @russelljjarvis for ideas