Hi!
I've been working with your IH/IG implementation lately, running some experiments with it in an NLP context. What I've noticed is that increasing the length of my input has an adverse effect on the convergence of my IH interactions with respect to the attributions I'm getting from IG.
IG itself converges nicely with respect to the completeness axiom and the model output, but the interaction completeness axiom of section 2.2.1 of your paper does not seem to hold at all in these cases.
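If I'm reading section 2.2.1 correctly, the property I'm checking is that the interactions, summed over the last dimension, recover the attributions (writing $\Gamma_{i,j}$ for the IH interactions and $\phi_i$ for the IG attributions):

$$\sum_j \Gamma_{i,j}(x) = \phi_i(x)$$

so the MSE below measures the gap between the two sides of this equation.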
In this plot you can see that, as the input length increases, the Mean Squared Error between the interactions (summed over the last dimension) and the attributions no longer converges to a reasonable margin of error. The number of interpolation points for IH is on the x-axis (note the log scale on the y-axis):

I tested this on a very tiny 1-layer LSTM (only 16 hidden units), using the TensorFlow implementation of IH+IG with a fixed zero-valued baseline (so not using the expectation).
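For reference, this is roughly how I set up the check; a minimal sketch, assuming the `PathExplainerTF` interface from this repo (`attributions`/`interactions` with `num_samples`, `use_expectation`, `output_indices`) and toy values for the sequence length `T` and feature dimension `D`:

```python
import numpy as np
import tensorflow as tf
from path_explain import PathExplainerTF  # assuming this repo's TF explainer

T, D = 20, 8  # hypothetical sequence length and per-step feature dim

# Flat input reshaped inside the model, so that the per-example
# interaction tensor stays two-dimensional: (T*D, T*D).
inp = tf.keras.Input(shape=(T * D,))
hidden = tf.keras.layers.Reshape((T, D))(inp)
hidden = tf.keras.layers.LSTM(16)(hidden)   # tiny 1-layer LSTM
out = tf.keras.layers.Dense(1)(hidden)
model = tf.keras.Model(inp, out)

x = np.random.randn(4, T * D).astype(np.float32)   # toy batch
baseline = np.zeros((1, T * D), dtype=np.float32)  # fixed zero baseline

explainer = PathExplainerTF(model)
for k in (32, 64, 128, 256, 512):  # interpolation points (x-axis above)
    attr = explainer.attributions(inputs=x, baseline=baseline,
                                  batch_size=128, num_samples=k,
                                  use_expectation=False, output_indices=0)
    inter = explainer.interactions(inputs=x, baseline=baseline,
                                   batch_size=128, num_samples=k,
                                   use_expectation=False, output_indices=0)
    # Interaction completeness: summing the interactions over the
    # last dimension should recover the IG attributions.
    mse = np.mean((inter.sum(axis=-1) - attr) ** 2)
    print(f"num_samples={k}: MSE={mse:.3e}")
```

(The flatten/reshape is only there to keep the interaction tensor two-dimensional per example; it doesn't change the attributions themselves.)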
What I was wondering is whether you encountered similar issues when testing your approach on larger models. I see that Theorem 1 of the paper touches on related issues, but it only seems to cover the simple feedforward layer case, not more complex models like LSTMs.