The sample level prediction function could be incorrect? (correct me if I'm wrong) #18

Hi there,

Thank you for your work! It's a lot of help.

However, I think this code differs from the original paper and the original Theano implementation, and the difference may lead to errors. In the original paper and code, the input to sample-level prediction is partitioned into overlapping frames of length frame_size. For example, if seq_input has shape (batch, seq_len), the sample-level input would consist of seq_input[:, 0:frame_size], seq_input[:, 1:frame_size+1], seq_input[:, 2:frame_size+2], and so on. As a result, the sample-level input would have shape [total_number_of_overlapping_frames (batch*seq_len), frame_size]. In the original Theano implementation, the images2neibs function does this work; you can find it here: https://github.com/soroushmehr/sampleRNN_ICLR2017/blob/2a3dbdf9eb00f03e64adf58e6780e2a48b9ff6dc/models/two_tier/two_tier.py#L394
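
To make the framing concrete, here is a minimal pure-Python sketch of the stride-1 overlapping partition described above. It is only an illustration, not the code from this repository or from the original Theano implementation; the function name overlapping_frames and the toy input values are hypothetical.

```python
def overlapping_frames(seq_input, frame_size):
    """Split each sequence into stride-1 overlapping frames.

    seq_input: list of equal-length sequences, i.e. shape (batch, seq_len).
    Returns a flat list of frames with shape
    (batch * (seq_len - frame_size + 1), frame_size).
    """
    frames = []
    for seq in seq_input:
        # Number of full frames that fit in one sequence at stride 1.
        n_frames = len(seq) - frame_size + 1
        for start in range(n_frames):
            # Equivalent to seq_input[:, start:start + frame_size] for one row.
            frames.append(list(seq[start:start + frame_size]))
    return frames


# Toy input (hypothetical values): batch of 2 sequences, seq_len = 5.
batch = [[0, 1, 2, 3, 4], [10, 11, 12, 13, 14]]
frames = overlapping_frames(batch, frame_size=3)
for frame in frames:
    print(frame)
```

With this framing each sequence yields seq_len - frame_size + 1 frames, so getting exactly seq_len frames per sequence would require the input to carry frame_size - 1 extra samples. In PyTorch, I believe torch.Tensor.unfold(dimension, size, step) with step=1 could produce the same windows, though I have not checked how it would fit into this codebase.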

I am not sure whether this has been implemented in the sample_level_prediction function. I noticed the issue because I cannot generate useful audio when frame_size is anything other than 2.

Also, please don't hesitate to correct me if I am wrong somewhere.

Best regards,

Nic

 
