Thanks a lot for this exciting Cython extension for Avro serialization! It makes my code approximately 2x faster.
But the next round of profiling shows a bottleneck in io.py (BytesIO.write), which is used during serialization. Perhaps I am using it the wrong way (please correct me if so; maybe I need to use something other than BytesIO):
from io import BytesIO  # FastDatumWriter / FastBinaryEncoder come from this library

buff = BytesIO()
writer = FastDatumWriter(schema_object)
encoder = FastBinaryEncoder(buff)
for dict_record in dict_records:
    # rewind the buffer and serialize the next record
    buff.seek(0)
    writer.write(dict_record, encoder)
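For completeness, here is a self-contained version of the loop above with plain io.BytesIO handling added so each serialized record can actually be read back out (truncate(0) so a shorter record does not leave stale bytes from the previous iteration, getvalue() to grab the bytes). FastDatumWriter / FastBinaryEncoder are assumed to be importable from this library as above; serialized is just my own name for the output list:

from io import BytesIO

# FastDatumWriter / FastBinaryEncoder imported from this library, as above
buff = BytesIO()
writer = FastDatumWriter(schema_object)
encoder = FastBinaryEncoder(buff)

serialized = []
for dict_record in dict_records:
    # reset the buffer completely before serializing the next record
    buff.seek(0)
    buff.truncate(0)
    writer.write(dict_record, encoder)
    # getvalue() returns the Avro-encoded bytes of this single record
    serialized.append(buff.getvalue())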
If this is the correct use case, maybe you could suggest how to use native Cython data structures in fast_binary.pyx instead (I'm still a newbie in Cython). Then I could create my own fork and try to implement it to avoid using BytesIO.
My own take on this "problem" is that the .write() method is invoked for each field of the schema. If the schema is fairly complex, this leads to many invocations of .write() (in my profiling report it accounts for 50% of the execution time). It should be possible to fill some internal Cython data structure (maybe a char*) instead and convert it to bytes only once at the end; see the sketch below.
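Just to illustrate the idea, here is a rough Cython sketch (my own naming, nothing taken from fast_binary.pyx): a cdef class that appends into a growing char* buffer and builds a Python bytes object only once at the end, instead of calling BytesIO.write() per field:

from libc.stdlib cimport malloc, realloc, free
from libc.string cimport memcpy

cdef class ByteBuffer:
    cdef char* data
    cdef size_t size
    cdef size_t capacity

    def __cinit__(self, size_t initial=1024):
        self.data = <char*>malloc(initial)
        if self.data == NULL:
            raise MemoryError()
        self.size = 0
        self.capacity = initial

    cdef void write(self, const char* src, size_t n) except *:
        # grow geometrically so many small per-field writes stay amortized O(1)
        cdef size_t needed = self.size + n
        cdef char* tmp
        if needed > self.capacity:
            while self.capacity < needed:
                self.capacity *= 2
            tmp = <char*>realloc(self.data, self.capacity)
            if tmp == NULL:
                raise MemoryError()
            self.data = tmp
        memcpy(self.data + self.size, src, n)
        self.size += n

    def getvalue(self):
        # single conversion to a Python bytes object at the very end
        return self.data[:self.size]

    def __dealloc__(self):
        if self.data != NULL:
            free(self.data)

The write functions in fast_binary.pyx could then append into such a buffer instead of a file-like object, and only the final getvalue() would create a Python object. But that is just my guess at an approach, which is why I'd like your opinion.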
I would be happy to hear any answer from you!
Thank you!