
String Function Summary

Yoav Nir edited this page Sep 16, 2021 · 9 revisions

This page lists the two kinds of string functions: those that process the current record, and those that process an input string.

One of the goals of version 0.9 is to reduce duplication: a string function that accepts an input string should treat an elided input string as the current record. So if length(s) returns the length of the string s, then length(), with its first and only argument elided, returns the length of the current record, just as if it had been written length(record()).
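
The defaulting rule can be modelled outside specs. The following is a minimal Python sketch, not the specs implementation: the module-level `_current_record` variable is a hypothetical stand-in for whatever record is being processed, and `None` stands in for an elided argument.

```python
from typing import Optional

# Hypothetical stand-in for the record currently being processed.
_current_record = "the quick brown fox"


def record() -> str:
    """Return the entire current record."""
    return _current_record


def length(s: Optional[str] = None) -> int:
    """Return the length of s; an elided s defaults to the current record."""
    if s is None:
        s = record()
    return len(s)
```

Under this model, `length()` and `length(record())` give the same result, so a separate record-only variant of the function is not needed.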

So, without further ado, here's the list of functions that work on the current record:

| function | what it does | equivalent with string arg |
|---|---|---|
| record() | returns the entire record | n/a |
| wordcount() | returns count of words | words() |
| wordstart(i) | returns the starting position of the i-th word | wordindex() |
| wordend(i) | returns the ending position of the i-th word | - |
| wordlen(i) | returns the length of the i-th word | wordlength() |
| word(i) | returns the i-th word | - |
| wordrange(i,j) | returns a substring of the record starting with the i-th word and ending with the j-th | - |
| fieldcount() | returns count of fields | - |
| fieldstart(i) | returns the starting position of the i-th field | - |
| fieldend(i) | returns the ending position of the i-th field | - |
| fieldlen(i) | returns the length of the i-th field | - |
| field(i) | returns the i-th field | - |
| fieldrange(i,j) | returns a substring of the record starting with the i-th field and ending with the j-th | - |
| range(i,j) | returns a substring of the record starting with the i-th character and ending with the j-th | - |
| wordwith(substr) | returns a word that contains the substring substr | - |
| wordwithidx(substr) | returns the index of a word that contains the substring substr | - |
| fieldwith(substr) | returns a field that contains the substring substr | - |
| fieldwithidx(substr) | returns the index of a field that contains the substring substr | - |
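
To make the relationship between the word functions concrete, here is a rough Python model. It rests on assumptions that this page does not state: words are separated by whitespace, and positions and word numbers are 1-based. The record is passed explicitly as `rec`, which is a modelling convenience only; the real functions take it implicitly.

```python
import re


def _word_spans(rec: str) -> list:
    """Return (start, end) pairs for each word, 1-based and inclusive."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"\S+", rec)]


def wordcount(rec: str) -> int:
    return len(_word_spans(rec))


def wordstart(rec: str, i: int) -> int:
    return _word_spans(rec)[i - 1][0]


def wordend(rec: str, i: int) -> int:
    return _word_spans(rec)[i - 1][1]


def wordlen(rec: str, i: int) -> int:
    return wordend(rec, i) - wordstart(rec, i) + 1


def word(rec: str, i: int) -> str:
    return rec[wordstart(rec, i) - 1:wordend(rec, i)]


def wordrange(rec: str, i: int, j: int) -> str:
    # Everything from the start of word i to the end of word j,
    # including the separators in between.
    return rec[wordstart(rec, i) - 1:wordend(rec, j)]
```

In this model `word(rec, i)` is the same as `wordrange(rec, i, i)`, and the field functions would follow the same pattern with a different separator.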

Functions that work on an input string:

| Function | What it does | Default for str | Equivalent |
|---|---|---|---|
| substr(str,start,length) | returns a substring of str | current record | - |
| pos(needle,str) | returns position of needle within str | current record | - |
| lastpos(needle,str) | returns the last position of needle within str | current record | - |
| includes(str, needle...) | returns TRUE if str contains any needle | current record | - |
| includesall(str, needle...) | returns TRUE if str contains all needles | current record | - |
| rmatch / rsearch / rreplace | regular expression match / search / replace | current record | - |
| left / right / center | a more specialized substr | current record | - |
| x2d / d2x / x2ch / c2x | conversion | may not be elided | - |
| ucase / lcase / bswap | conversion | may not be elided | - |
| substitute(str, needle,subst,[max]) | in-string substitution | may not be elided | - |
| sfield(str,n,[sep]) | extracts the n-th field of str | may not be elided | field |
| sword(str,n,[sep]) | extracts the n-th word of str | may not be elided | word |
| lvalue / rvalue(str,[sep]) | left/right part of str separated by sep | may not be elided | - |
| abbrev(str,s,len) | returns TRUE if s is an abbreviation of str | may not be elided | - |
| compare(s1,s2,pad) | returns index of first mismatched character | may not be elided | - |
| copies(str,n) | returns n copies of str | may not be elided | - |
| delstr/delword(str,start,[length]) | deletes a part of the middle of the string | may not be elided | - |
| find(str,phrase) | returns word number of first occurrence of phrase | may not be elided | - |
| index(str,needle) | returns index of first occurrence of needle in str | may not be elided | - |
| insert(str,target,pos,len,pad) | returns string str with target inserted in the middle | may not be elided | - |
| justify/overlay/reverse | what it says | may not be elided | - |
| length(s) | returns the length of the string s | may not be elided | - |
| words(s) | returns the number of words in the string s | may not be elided | - |
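
The difference between includes and includesall, and the effect of a defaulted str, can be illustrated with a small Python sketch. It is only an approximation of the descriptions above: `CURRENT_RECORD` is a hypothetical stand-in for the current record, and passing `None` stands in for an elided str (how elision is actually written in specs is not shown on this page).

```python
from typing import Optional

# Hypothetical stand-in for the record currently being processed.
CURRENT_RECORD = "error: disk full on /dev/sda1"


def includes(s: Optional[str], *needles: str) -> bool:
    """True if the string contains at least one of the needles."""
    haystack = CURRENT_RECORD if s is None else s
    return any(needle in haystack for needle in needles)


def includesall(s: Optional[str], *needles: str) -> bool:
    """True if the string contains every one of the needles."""
    haystack = CURRENT_RECORD if s is None else s
    return all(needle in haystack for needle in needles)
```

For example, with the record above, `includes(None, "warning", "error")` is true because one needle matches, while `includesall(None, "warning", "error")` is false because "warning" is missing.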

Other functions that may require their string argument to default to the current record:

  • space
  • strip
  • subword (similar to wordrange?)
  • translate
  • verify
  • wordindex (like wordstart)
  • wordlength (like wordlen)
  • wordpos
  • words (like wordcount)
  • countocc (already defaults to current record)

NOTE: sword(s,i) seems to do what word(s,i) does in CMS Pipelines. Perhaps we wish to eliminate sword (and sfield) and harmonize with the way it's done there?
