Add ability to specify text encoding or disable transcoding #39

@ofajardo

Hi,

It was reported here in pyreadr that trying to open this file raises the following error:

Unable to convert string to the requested encoding (invalid byte sequence)

i.e. RDATA_ERROR_CONVERT_BAD_STRING.

Looking at the first 30 bytes of the file, I got the impression that it is encoded in CP1252 (maybe I am looking in completely the wrong place; I don't actually know how this file is structured):

RDX3\nX\n\x00\x00\x00\x03\x00\x03\x06\x01\x00\x03\x05\x00\x00\x00\x00\x06CP1252\x00
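
For reference, here is a minimal sketch of how those 30 bytes can be decoded by hand. It assumes the usual layout of an R version 3 XDR serialization header: the magic `RDX3\n`, the format marker `X\n`, three big-endian 32-bit integers (format version, writer R version, minimum reader R version), then a length-prefixed native encoding name. The function names are hypothetical, and the input is assumed to be already decompressed; this is not how librdata itself parses the header.

```python
import struct


def unpack_r_version(packed: int) -> tuple:
    """Split R's packed version integer into (major, minor, patch)."""
    return (packed >> 16, (packed >> 8) & 0xFF, packed & 0xFF)


def read_rdx3_header(data: bytes) -> dict:
    """Parse the start of a decompressed RDX3 stream (hypothetical helper)."""
    if data[:5] != b"RDX3\n":
        raise ValueError("not an RDX3 stream")
    if data[5:7] != b"X\n":
        raise ValueError("only the XDR (big-endian binary) format is handled here")

    # Three big-endian 32-bit integers follow the 7-byte preamble.
    fmt_version, writer, min_reader = struct.unpack_from(">3i", data, 7)
    if fmt_version != 3:
        raise ValueError("the encoding field is only present in format version 3")

    # Length-prefixed native encoding name, e.g. b"CP1252".
    (enc_len,) = struct.unpack_from(">i", data, 19)
    encoding = data[23:23 + enc_len].decode("ascii")

    return {
        "format_version": fmt_version,
        "writer_version": unpack_r_version(writer),
        "min_reader_version": unpack_r_version(min_reader),
        "encoding": encoding,
    }


# The bytes quoted in this issue.
header = (
    b"RDX3\nX\n\x00\x00\x00\x03\x00\x03\x06\x01"
    b"\x00\x03\x05\x00\x00\x00\x00\x06CP1252\x00"
)
info = read_rdx3_header(header)
print(info["encoding"], info["writer_version"])
```

Under these assumptions, the bytes describe a format version 3 file written by R 3.6.1 with CP1252 as the native encoding.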

Looking at the source code, I was expecting to get RDATA_ERROR_UNSUPPORTED_CHARSET instead. Maybe librdata is not extracting the encoding correctly for this file?

And actually, would it be possible to support non-UTF-8 files?
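
To illustrate why an "invalid byte sequence" error is plausible for such a file, here is a small sketch using only Python's built-in codecs. The sample string is made up for illustration and is not taken from the reported file; the point is only that bytes which are valid CP1252 are often not valid UTF-8, so they have to be decoded with the declared source encoding first.

```python
# "café" encoded as CP1252: the "é" is the single byte 0xE9.
raw = "café".encode("cp1252")

# Interpreting those bytes as UTF-8 fails, because 0xE9 starts a
# multi-byte sequence that is never completed.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the declared encoding works, and the result can then
# be re-encoded as UTF-8 (two bytes for "é").
text = raw.decode("cp1252")
as_utf8 = text.encode("utf-8")

print(utf8_ok, text, as_utf8)
```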

thanks!
