@alejandro-colomar alejandro-colomar commented Jan 2, 2026

Reported-by: @stoeckmann
Cc: @eggert


Revisions:

v2
  • Reimplement the isascii_c() family in a way that evaluates the argument only once, which is more robust and allows implementing them more compactly, as one-liner macros.
$ git rd 
 1:  43fa84b9c !  1:  a22f8b4b0 lib/string/ctype/isascii.[ch]: is*_c(): functions
    @@ Metadata
     Author: Alejandro Colomar <[email protected]>
     
      ## Commit message ##
    -    lib/string/ctype/isascii.[ch]: is*_c(): functions
    +    lib/string/ctype/isascii.[ch]: is*_c(): Add APIs
     
         These are like the isascii(3) family of APIs, but use the C locale, as
         the _c suffix hints.
     
    -    These functions behave well with non-casted input, unlike isascii(3).
    +    These macros behave well with non-casted input, unlike isascii(3).
    +
    +    The isascii_c() and iscntrl_c() implementations are different from the
    +    rest because they must return true for '\0'.
     
         Reported-by: Tobias Stoeckmann <[email protected]>
         Cc: Paul Eggert <[email protected]>
    @@ lib/string/ctype/isascii.c (new)
     +#include "config.h"
     +
     +#include "string/ctype/isascii.h"
    -+
    -+#include <stdbool.h>
    -+
    -+
    -+extern inline bool iscntrl_c(unsigned char c);
    -+extern inline bool islower_c(unsigned char c);
    -+extern inline bool isupper_c(unsigned char c);
    -+extern inline bool isdigit_c(unsigned char c);
    -+extern inline bool ispunct_c(unsigned char c);
    -+extern inline bool isspace_c(unsigned char c);
    -+extern inline bool isalpha_c(unsigned char c);
    -+extern inline bool isalnum_c(unsigned char c);
    -+extern inline bool isgraph_c(unsigned char c);
    -+extern inline bool isprint_c(unsigned char c);
    -+extern inline bool isxdigit_c(unsigned char c);
    -+extern inline bool isascii_c(unsigned char c);
     
      ## lib/string/ctype/isascii.h (new) ##
     @@
    @@ lib/string/ctype/isascii.h (new)
     +
     +#include "config.h"
     +
    -+#include <ctype.h>
    -+#include <stdbool.h>
     +#include <string.h>
     +
    ++#include "string/strcmp/streq.h"
    ++
     +
     +#define CTYPE_CNTRL_C                                                      \
     +  "\x1F\x1E\x1D\x1C\x1B\x1A\x19\x18\x17\x16\x15\x14\x13\x12\x11\x10" \
    @@ lib/string/ctype/isascii.h (new)
     +#define CTYPE_ASCII_C   CTYPE_PRINT_C CTYPE_CNTRL_C /*NUL*/
     +
     +
    -+inline bool iscntrl_c(unsigned char c);   // iscntrl_c - is [:cntrl:] C-locale
    -+inline bool islower_c(unsigned char c);   // islower_c - is [:lower:] C-locale
    -+inline bool isupper_c(unsigned char c);   // isupper_c - is [:upper:] C-locale
    -+inline bool isdigit_c(unsigned char c);   // isdigit_c - is [:digit:] C-locale
    -+inline bool ispunct_c(unsigned char c);   // ispunct_c - is [:punct:] C-locale
    -+inline bool isspace_c(unsigned char c);   // isspace_c - is [:space:] C-locale
    -+inline bool isalpha_c(unsigned char c);   // isalpha_c - is [:alpha:] C-locale
    -+inline bool isalnum_c(unsigned char c);   // isalnum_c - is [:alnum:] C-locale
    -+inline bool isgraph_c(unsigned char c);   // isgraph_c - is [:graph:] C-locale
    -+inline bool isprint_c(unsigned char c);   // isprint_c - is [:print:] C-locale
    -+inline bool isxdigit_c(unsigned char c);  // isxdigit_c - is [:xdigit:] C-locale
    -+inline bool isascii_c(unsigned char c);   // isascii_c - is [:ascii:] C-locale
    -+
    -+
    -+inline bool iscntrl_c(unsigned char c)
    -+{
    -+  return strchr(CTYPE_CNTRL_C, c);
    -+}
    -+inline bool islower_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_LOWER_C, c);
    -+}
    -+inline bool isupper_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_UPPER_C, c);
    -+}
    -+inline bool isdigit_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_DIGIT_C, c);
    -+}
    -+inline bool ispunct_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_PUNCT_C, c);
    -+}
    -+inline bool isspace_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_SPACE_C, c);
    -+}
    -+inline bool isalpha_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_ALPHA_C, c);
    -+}
    -+inline bool isalnum_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_ALNUM_C, c);
    -+}
    -+inline bool isgraph_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_GRAPH_C, c);
    -+}
    -+inline bool isprint_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_PRINT_C, c);
    -+}
    -+inline bool isxdigit_c(unsigned char c)
    -+{
    -+  return c && strchr(CTYPE_XDIGIT_C, c);
    -+}
    -+inline bool isascii_c(unsigned char c)
    -+{
    -+  return strchr(CTYPE_ASCII_C, c);
    -+}
    ++// isascii_c - is [:ascii:] C-locale
    ++#define isascii_c(c)   strchr(CTYPE_ASCII_C, c)
    ++#define iscntrl_c(c)   strchr(CTYPE_CNTRL_C, c)
    ++#define islower_c(c)   (!streq(strchrnul(CTYPE_LOWER_C, c), ""))
    ++#define isupper_c(c)   (!streq(strchrnul(CTYPE_UPPER_C, c), ""))
    ++#define isdigit_c(c)   (!streq(strchrnul(CTYPE_DIGIT_C, c), ""))
    ++#define ispunct_c(c)   (!streq(strchrnul(CTYPE_PUNCT_C, c), ""))
    ++#define isspace_c(c)   (!streq(strchrnul(CTYPE_SPACE_C, c), ""))
    ++#define isalpha_c(c)   (!streq(strchrnul(CTYPE_ALPHA_C, c), ""))
    ++#define isalnum_c(c)   (!streq(strchrnul(CTYPE_ALNUM_C, c), ""))
    ++#define isgraph_c(c)   (!streq(strchrnul(CTYPE_GRAPH_C, c), ""))
    ++#define isprint_c(c)   (!streq(strchrnul(CTYPE_PRINT_C, c), ""))
    ++#define isxdigit_c(c)  (!streq(strchrnul(CTYPE_XDIGIT_C, c), ""))
     +
     +
     +#endif  // include guard
 2:  843a79718 =  2:  9609c9283 lib/, src/: Use isascii_c() functions instead of isascii(3)
 3:  2e1cdda17 =  3:  3af861a40 lib/: Merge directories "lib/string/ctype/*" into unified files
 4:  5b808a921 =  4:  3f60bf818 lib/string/ctype/strisascii.h: strisprint(): Simplify implementation
 5:  386253773 =  5:  c510529d7 lib/string/ctype/strisascii.h: Compact definitions
 6:  2fc54ace4 =  6:  9c1ff4d47 lib/: stris*(), strchris*(): Rename C-locale APIs with a _c suffix
 7:  55ba3f24b =  7:  1e98c8787 lib/string/strspn/: stpcspn(): Add API
 8:  21e95bc62 =  8:  1334b5024 lib/string/ctype/strchrisascii.h: Use stpcspn() to simplify
 9:  35a5d391c =  9:  68cbbe330 lib/, src/, lib/string/strspn/: Compact files
v3
  • Convert return value to boolean.
$ git rd 
 1:  a22f8b4b0 !  1:  0c78ce137 lib/string/ctype/isascii.[ch]: is*_c(): Add APIs
    @@ lib/string/ctype/isascii.h (new)
     +
     +
     +// isascii_c - is [:ascii:] C-locale
    -+#define isascii_c(c)   strchr(CTYPE_ASCII_C, c)
    -+#define iscntrl_c(c)   strchr(CTYPE_CNTRL_C, c)
    ++#define isascii_c(c)   (!!strchr(CTYPE_ASCII_C, c))
    ++#define iscntrl_c(c)   (!!strchr(CTYPE_CNTRL_C, c))
     +#define islower_c(c)   (!streq(strchrnul(CTYPE_LOWER_C, c), ""))
     +#define isupper_c(c)   (!streq(strchrnul(CTYPE_UPPER_C, c), ""))
     +#define isdigit_c(c)   (!streq(strchrnul(CTYPE_DIGIT_C, c), ""))
 2:  9609c9283 =  2:  0ae2f2f18 lib/, src/: Use isascii_c() functions instead of isascii(3)
 3:  3af861a40 =  3:  662e21875 lib/: Merge directories "lib/string/ctype/*" into unified files
 4:  3f60bf818 =  4:  fbbf189e5 lib/string/ctype/strisascii.h: strisprint(): Simplify implementation
 5:  c510529d7 =  5:  81513e66f lib/string/ctype/strisascii.h: Compact definitions
 6:  9c1ff4d47 =  6:  6d8bbc30c lib/: stris*(), strchris*(): Rename C-locale APIs with a _c suffix
 7:  1e98c8787 =  7:  eb1bea820 lib/string/strspn/: stpcspn(): Add API
 8:  1334b5024 =  8:  83c735597 lib/string/ctype/strchrisascii.h: Use stpcspn() to simplify
 9:  68cbbe330 =  9:  0d2063a4a lib/, src/, lib/string/strspn/: Compact files
v4
  • Use strpbrk(3) instead of open-coding its pattern.
$ git rd 
 1:  0c78ce137 =  1:  0c78ce137 lib/string/ctype/isascii.[ch]: is*_c(): Add APIs
 2:  0ae2f2f18 =  2:  0ae2f2f18 lib/, src/: Use isascii_c() functions instead of isascii(3)
 3:  662e21875 =  3:  662e21875 lib/: Merge directories "lib/string/ctype/*" into unified files
 4:  fbbf189e5 =  4:  fbbf189e5 lib/string/ctype/strisascii.h: strisprint(): Simplify implementation
 5:  81513e66f =  5:  81513e66f lib/string/ctype/strisascii.h: Compact definitions
 6:  6d8bbc30c =  6:  6d8bbc30c lib/: stris*(), strchris*(): Rename C-locale APIs with a _c suffix
 7:  eb1bea820 <  -:  --------- lib/string/strspn/: stpcspn(): Add API
 8:  83c735597 !  7:  347f93174 lib/string/ctype/strchrisascii.h: Use stpcspn() to simplify
    @@ Metadata
     Author: Alejandro Colomar <[email protected]>
     
      ## Commit message ##
    -    lib/string/ctype/strchrisascii.h: Use stpcspn() to simplify
    +    lib/string/ctype/strchrisascii.h: Use strpbrk(3) to simplify
     
         This compacts it into a one-liner, more similar to the strisascii_c()
         functions.
    @@ lib/string/ctype/strchrisascii.h
      #include "config.h"
      
     -#include <stdbool.h>
    --
    ++#include <string.h>
    + 
      #include "string/ctype/isascii.h"
    - #include "string/strcmp/streq.h"
    +-#include "string/strcmp/streq.h"
     -
     -
     -inline bool strchriscntrl_c(const char *s);
    -+#include "string/strspn/stpcspn.h"
      
      
      // strchriscntrl_c - string character is [:cntrl:] C-locale
    @@ lib/string/ctype/strchrisascii.h
     -
     -  return false;
     -}
    -+#define strchriscntrl_c(s)  (!streq(stpcspn(s, CTYPE_CNTRL_C), ""))
    ++#define strchriscntrl_c(s)  (!!strpbrk(s, CTYPE_CNTRL_C))
      
      
      #endif  // include guard
 9:  0d2063a4a <  -:  --------- lib/, src/, lib/string/strspn/: Compact files
v5
  • Simplify strisascii.h, by not special-casing an empty string.
$ git rd 
 1:  0c78ce137 =  1:  0c78ce137 lib/string/ctype/isascii.[ch]: is*_c(): Add APIs
 2:  0ae2f2f18 =  2:  0ae2f2f18 lib/, src/: Use isascii_c() functions instead of isascii(3)
 3:  662e21875 =  3:  662e21875 lib/: Merge directories "lib/string/ctype/*" into unified files
 4:  fbbf189e5 =  4:  fbbf189e5 lib/string/ctype/strisascii.h: strisprint(): Simplify implementation
 5:  81513e66f =  5:  81513e66f lib/string/ctype/strisascii.h: Compact definitions
 6:  6d8bbc30c =  6:  6d8bbc30c lib/: stris*(), strchris*(): Rename C-locale APIs with a _c suffix
 7:  347f93174 =  7:  347f93174 lib/string/ctype/strchrisascii.h: Use strpbrk(3) to simplify
 -:  --------- >  8:  7c37e884f lib/fields.c: valid_field(): Check empty string before strisprint_c()
 -:  --------- >  9:  67d289d35 lib/string/ctype/strisascii.*: Don't special-case ""
v5b
  • Remove unused include.
$ git rd 
 1:  0c78ce137 =  1:  0c78ce137 lib/string/ctype/isascii.[ch]: is*_c(): Add APIs
 2:  0ae2f2f18 =  2:  0ae2f2f18 lib/, src/: Use isascii_c() functions instead of isascii(3)
 3:  662e21875 =  3:  662e21875 lib/: Merge directories "lib/string/ctype/*" into unified files
 4:  fbbf189e5 =  4:  fbbf189e5 lib/string/ctype/strisascii.h: strisprint(): Simplify implementation
 5:  81513e66f =  5:  81513e66f lib/string/ctype/strisascii.h: Compact definitions
 6:  6d8bbc30c =  6:  6d8bbc30c lib/: stris*(), strchris*(): Rename C-locale APIs with a _c suffix
 7:  347f93174 =  7:  347f93174 lib/string/ctype/strchrisascii.h: Use strpbrk(3) to simplify
 8:  7c37e884f =  8:  7c37e884f lib/fields.c: valid_field(): Check empty string before strisprint_c()
 9:  67d289d35 !  9:  554e412e7 lib/string/ctype/strisascii.*: Don't special-case ""
    @@ lib/string/ctype/strisascii.c
     
      ## lib/string/ctype/strisascii.h ##
     @@
    + 
    + #include "config.h"
    + 
    +-#include <stdbool.h>
    +-
    + #include "string/ctype/isascii.h"
    + #include "string/strcmp/streq.h"
      #include "string/strspn/stpspn.h"
      
      
v5c
  • Add white space.
$ git rd 
 1:  0c78ce137 =  1:  0c78ce137 lib/string/ctype/isascii.[ch]: is*_c(): Add APIs
 2:  0ae2f2f18 =  2:  0ae2f2f18 lib/, src/: Use isascii_c() functions instead of isascii(3)
 3:  662e21875 =  3:  662e21875 lib/: Merge directories "lib/string/ctype/*" into unified files
 4:  fbbf189e5 =  4:  fbbf189e5 lib/string/ctype/strisascii.h: strisprint(): Simplify implementation
 5:  81513e66f =  5:  81513e66f lib/string/ctype/strisascii.h: Compact definitions
 6:  6d8bbc30c =  6:  6d8bbc30c lib/: stris*(), strchris*(): Rename C-locale APIs with a _c suffix
 7:  347f93174 =  7:  347f93174 lib/string/ctype/strchrisascii.h: Use strpbrk(3) to simplify
 8:  7c37e884f =  8:  7c37e884f lib/fields.c: valid_field(): Check empty string before strisprint_c()
 9:  554e412e7 !  9:  d311b0d09 lib/string/ctype/strisascii.*: Don't special-case ""
    @@ lib/string/ctype/strisascii.h
     -  return !streq(s, "") && streq(stpspn(s, CTYPE_PRINT_C), "");
     -}
     +// strisascii_c - string is [:ascii:] C-locale
    -+#define strisdigit_c(s)  streq(stpspn(s, CTYPE_DIGIT_C), "")
    -+#define strisprint_c(s)  streq(stpspn(s, CTYPE_PRINT_C), "")
    ++#define strisdigit_c(s)   streq(stpspn(s, CTYPE_DIGIT_C), "")
    ++#define strisprint_c(s)   streq(stpspn(s, CTYPE_PRINT_C), "")
      
      
      #endif  // include guard

@alejandro-colomar alejandro-colomar self-assigned this Jan 2, 2026
@alejandro-colomar alejandro-colomar marked this pull request as ready for review January 2, 2026 21:08
@alejandro-colomar alejandro-colomar changed the title from "Add C-locale variants of <ctype.h> APIs, and use them" to "Add C-locale variants of isascii(3) APIs, and use them" Jan 2, 2026
@alejandro-colomar alejandro-colomar force-pushed the isascii_c branch 2 times, most recently from 35a5d39 to 68cbbe3 January 3, 2026 00:36
These are like the isascii(3) family of APIs, but use the C locale, as
the _c suffix hints.

These macros behave well with non-casted input, unlike isascii(3).

The isascii_c() and iscntrl_c() implementations are different from the
rest because they must return true for '\0'.

Reported-by: Tobias Stoeckmann <[email protected]>
Cc: Paul Eggert <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
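
The '\0' special case in the commit message above follows from strchr(3) treating the terminating null byte as part of the string, so a search for '\0' always succeeds. Below is a minimal sketch of both macro shapes, not the PR's actual code: the `ex_` names and the shortened character sets are hypothetical, and `ex_strchrnul()` is a portable stand-in for the GNU strchrnul(3) extension.

```c
#include <assert.h>
#include <string.h>

// Hypothetical, shortened stand-ins for the PR's CTYPE_*_C strings.
#define EX_CNTRL  "\x01\x1B\x7F"
#define EX_DIGIT  "0123456789"

// Portable stand-in for strchrnul(3): like strchr(3), but returns a
// pointer to the terminating null byte instead of NULL when not found.
static const char *ex_strchrnul(const char *s, int c)
{
	const char *p = strchr(s, c);

	return p ? p : s + strlen(s);
}

// strchr(3) finds '\0' at the terminator, so this is true for '\0'.
// The !! converts the returned pointer to 0 or 1.
#define ex_iscntrl(c)  (!!strchr(EX_CNTRL, c))

// Both "not found" and "found the terminator" end up pointing at '\0',
// so this is false for '\0'.  The argument is evaluated only once.
#define ex_isdigit(c)  (*ex_strchrnul(EX_DIGIT, c) != '\0')

static void ex_check(void)
{
	assert(ex_iscntrl('\0'));
	assert(ex_iscntrl('\x1B'));
	assert(!ex_iscntrl('a'));
	assert(ex_isdigit('7'));
	assert(!ex_isdigit('\0'));
	assert(!ex_isdigit('x'));
}
```

Because strchr(3) converts its second argument to char, a plain `char` holding a non-ASCII byte can be passed without a cast; it simply does not match any character in the set.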
We want to use the C locale.

Reported-by: Tobias Stoeckmann <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
The APIs defined under each of those subdirs are so similar and related
that it makes more sense to define them in the same files.  (BTW, we
only had one API per subdir, except in one subdir that had two APIs, so
in the end, we have almost the same separation.)

Signed-off-by: Alejandro Colomar <[email protected]>
This also makes it consistent with strisdigit().

Signed-off-by: Alejandro Colomar <[email protected]>
By being closer together, I find them more readable.  The pattern and
the differences are easier to spot.

Signed-off-by: Alejandro Colomar <[email protected]>
This compacts it into a one-liner, more similar to the strisascii_c()
functions.

Since we only use the argument once, we can even turn this into a macro.

Signed-off-by: Alejandro Colomar <[email protected]>
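
The one-liner described in the commit message above can be sketched as follows. This is an illustration under assumptions, not the PR's code: the `ex_` names and the shortened control-character set are hypothetical, and `ex_stpcspn()` assumes that the project's stpcspn() returns `s + strcspn(s, reject)`.

```c
#include <assert.h>
#include <string.h>

// Hypothetical, shortened stand-in for the PR's CTYPE_CNTRL_C string.
#define EX_CNTRL  "\x01\x09\x0A\x1B\x7F"

// Assumed behavior of stpcspn(): pointer to the first byte of s that is
// in reject, or to the terminating null byte if there is none.
static const char *ex_stpcspn(const char *s, const char *reject)
{
	return s + strcspn(s, reject);
}

// Earlier shape: skip the leading non-control bytes, then test whether
// anything is left.
#define ex_strchriscntrl_old(s)  (*ex_stpcspn(s, EX_CNTRL) != '\0')

// strpbrk(3) expresses the same search directly: it returns NULL when
// no byte of s is in the set.  The argument is evaluated only once.
#define ex_strchriscntrl(s)      (!!strpbrk(s, EX_CNTRL))

static void ex_check(void)
{
	assert(ex_strchriscntrl("user\tname"));
	assert(ex_strchriscntrl_old("user\tname"));
	assert(!ex_strchriscntrl("username"));
	assert(!ex_strchriscntrl_old("username"));
	assert(!ex_strchriscntrl(""));
	assert(!ex_strchriscntrl_old(""));
}
```

Unlike strchr(3), strpbrk(3) does not match the terminating null byte, so an empty string contains no control characters under either form.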
This allows us to not depend on whether strisprint_c("") returns true or
false.

Signed-off-by: Alejandro Colomar <[email protected]>
It is not intuitive or clear what the right behavior should be for an
empty string.  If we define these APIs as "return true if all characters
in the string belong to the specified character set", then an empty
string should return true.  On the other hand, if you ask me if an empty
string is a numeric string, I might naively say no.

It is irrelevant whether we return true or false for an empty string.
All of the callers already handle the case of an empty string correctly.

This makes the implementation simpler, using the argument only once.
This allows implementing these as macros.

Signed-off-by: Alejandro Colomar <[email protected]>
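
A minimal sketch of the empty-string behavior discussed in the commit message above. The `ex_` names are hypothetical, `ex_stpspn()` assumes that the project's stpspn() returns `s + strspn(s, accept)`, and `ex_valid_number()` is an invented caller that only illustrates checking for "" before calling the macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

// Hypothetical stand-in for the PR's CTYPE_DIGIT_C string.
#define EX_DIGIT  "0123456789"

// Assumed behavior of stpspn(): pointer to the first byte of s that is
// not in accept, or to the terminating null byte if there is none.
static const char *ex_stpspn(const char *s, const char *accept)
{
	return s + strspn(s, accept);
}

// "All characters belong to the set": nothing is left after skipping
// the accepted prefix.  The argument is evaluated only once.
#define ex_strisdigit(s)  (*ex_stpspn(s, EX_DIGIT) == '\0')

// Invented caller: rejects "" itself, so it does not depend on what
// ex_strisdigit("") returns.
static bool ex_valid_number(const char *s)
{
	if (s[0] == '\0')
		return false;

	return ex_strisdigit(s);
}

static void ex_check(void)
{
	assert(ex_strisdigit("2026"));
	assert(!ex_strisdigit("20x6"));
	assert(ex_strisdigit(""));       // vacuously true
	assert(ex_valid_number("42"));
	assert(!ex_valid_number(""));
}
```

With this shape, the macro follows the "all characters belong to the set" definition, and any caller that wants "" rejected says so explicitly.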