[feature] source encoding #198

@Nikokrock

Summary

The proposal is to support UTF-8 as the only source encoding, as some other modern languages do, and to restrict the set of characters allowed in identifiers.

Optionally, making the language case-sensitive would be great.

Motivation

The current situation, which relies on switches to select the source encoding, can make entire libraries unusable when users choose a different encoding for their own sources. Both Rust and Go support only UTF-8 for source files.
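To make the "UTF-8 only" rule concrete, here is a minimal sketch of what a front end could do when reading a source file: decode strictly and reject anything else, with no switch involved. The function name `decode_source` is hypothetical and not taken from any existing compiler.

```python
def decode_source(raw: bytes) -> str:
    """Decode source bytes as strict UTF-8.

    Hypothetical helper: raises ValueError instead of guessing another
    encoding, which is the behaviour the proposal asks for.
    """
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"source is not valid UTF-8 at byte offset {exc.start}"
        ) from exc
```

With this rule, a file saved as Latin-1 that contains `procédure` fails at the first non-ASCII byte rather than being silently misread.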

Restricting the set of characters allowed in identifiers would improve readability and help avoid attacks based on visually confusable characters. I think the Go approach is probably the sanest: only Unicode letters, digits, and _.
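As an illustration of the Go-style rule, the sketch below accepts an identifier only if it starts with a letter or `_` and continues with letters, digits, or `_`. It assumes "letter" means the Unicode general categories Lu, Ll, Lt, Lm, and Lo, and "digit" means Nd, which is how the Go specification defines them. The function names are hypothetical, and whether a leading `_` should be allowed here is left open.

```python
import unicodedata

# Assumption: "Unicode letter" = general categories Lu, Ll, Lt, Lm, Lo,
# and "digit" = Nd, mirroring the Go specification's definitions.
LETTER_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo"}


def is_letter(ch: str) -> bool:
    """True for '_' or any character in a Unicode letter category."""
    return ch == "_" or unicodedata.category(ch) in LETTER_CATEGORIES


def is_digit(ch: str) -> bool:
    """True for any Unicode decimal digit (category Nd)."""
    return unicodedata.category(ch) == "Nd"


def is_valid_identifier(name: str) -> bool:
    """Check `name` against: letter { letter | digit }."""
    if not name or not is_letter(name[0]):
        return False
    return all(is_letter(ch) or is_digit(ch) for ch in name[1:])
```

For example, `Buffer_Size` and `größe2` pass, while `2fast`, `a-b`, and a name containing a zero-width space (U+200B) are rejected. A decomposed `e` followed by U+0301 is also rejected, because combining marks are not letters in this set.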

Regarding identifier normalization, the compiler is supposed to use NFKC, but I think it's not fully implemented yet. Switching to case-sensitive identifiers could bring the language closer to Go and Rust, and would allow using just NFC (which is simpler, especially if the Go character set is used for identifiers).
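The difference between the two normalization forms can be sketched with Python's standard `unicodedata` module. The helper names below are hypothetical; the sketch only shows which spellings each form treats as equal and does not describe how any compiler implements it.

```python
import unicodedata


def normalize_identifier(name: str, form: str = "NFC") -> str:
    """Return `name` in the given Unicode normalization form."""
    return unicodedata.normalize(form, name)


def same_identifier(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two spellings after normalization (case-sensitive)."""
    return normalize_identifier(a, form) == normalize_identifier(b, form)


# Precomposed and decomposed 'é' are unified by both forms.
print(same_identifier("caf\u00e9", "cafe\u0301", "NFC"))   # True
print(same_identifier("caf\u00e9", "cafe\u0301", "NFKC"))  # True

# The 'ﬁ' ligature (U+FB01) equals "fi" only under NFKC.
print(same_identifier("\ufb01le", "file", "NFC"))   # False
print(same_identifier("\ufb01le", "file", "NFKC"))  # True
```

NFC only merges canonically equivalent sequences, while NFKC also folds compatibility characters such as ligatures, so NFKC makes more spellings collide.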

Labels: enhancement (New feature or request)
