CHRPAK
Characters and Strings
CHRPAK
is a FORTRAN90 library which
handles characters and strings.
CHRPAK began when I simply wanted to be able to capitalize
a string. It has since expanded to cover a number of other uses.
Many unusual situations are provided for, including:
-
string '31.2' <=> numeric value 31.2;
-
uppercase <=> lowercase;
-
removal of control characters or blanks;
-
sorting, merging, searching.
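As an illustration of the first two items, here is a minimal sketch written with standard Fortran intrinsics only. It is not taken from the CHRPAK source: the names convert_sketch, sketch_cap and sketch_s_to_r8 are hypothetical, and the actual CHRPAK routines may use different arguments and handle more cases.

```fortran
module convert_sketch
  implicit none
contains

  ! Return a copy of S with the letters a-z replaced by A-Z.
  ! (Hypothetical helper, not the CHRPAK routine.)
  function sketch_cap ( s ) result ( t )
    character ( len = * ), intent ( in ) :: s
    character ( len = len ( s ) ) :: t
    integer :: i, k

    t = s
    do i = 1, len ( s )
      k = iachar ( s(i:i) )
      if ( iachar ( 'a' ) <= k .and. k <= iachar ( 'z' ) ) then
        ! In ASCII, each upper case letter is 32 below its lower case form.
        t(i:i) = achar ( k - 32 )
      end if
    end do
  end function sketch_cap

  ! Read a real value from S using an internal READ.
  ! IERROR is zero on success, and R is set to zero on failure.
  subroutine sketch_s_to_r8 ( s, r, ierror )
    character ( len = * ), intent ( in ) :: s
    real ( kind = 8 ), intent ( out ) :: r
    integer, intent ( out ) :: ierror
    real ( kind = 8 ) :: temp

    r = 0.0D+00
    read ( s, *, iostat = ierror ) temp
    if ( ierror == 0 ) then
      r = temp
    end if
  end subroutine sketch_s_to_r8

end module convert_sketch

program convert_demo
  use convert_sketch
  implicit none
  character ( len = 14 ) :: s
  real ( kind = 8 ) :: r
  integer :: ierror

  ! uppercase <=> lowercase
  write ( *, '(a)' ) sketch_cap ( 'hello, world' )

  ! string '31.2' => numeric value 31.2
  call sketch_s_to_r8 ( '31.2', r, ierror )

  ! numeric value => string, using an internal WRITE
  if ( ierror == 0 ) then
    write ( s, '(g14.6)' ) r
    write ( *, '(a)' ) trim ( adjustl ( s ) )
  end if
end program convert_demo
```

The internal READ and WRITE statements do the numeric conversion here; a library routine can add finer control over error reporting and over how many characters were consumed.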
Many of the routine names begin with the name of the data type they
operate on:
-
B4 - a 4 byte word;
-
CH - a character;
-
CHVEC - a vector of characters;
-
DEC - a decimal fraction;
-
DIGIT - a character representing a numeric digit;
-
I4 - an integer ( kind = 4 );
-
R4 - a real ( kind = 4 );
-
R8 - a real ( kind = 8 );
-
RAT - a ratio I/J;
-
S - a string;
-
SVEC - a vector of strings;
-
SVECI - a vector of strings, implicitly capitalized.
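The prefixes above can be read as shorthand for ordinary Fortran declarations. The sketch below is only one plausible reading: the module, function and variable names are hypothetical, and the DEC and RAT interpretations (IVAL * 10^JVAL and IVAL / JVAL) are taken from the routine descriptions in the list below.

```fortran
module naming_sketch
  implicit none
contains

  ! DEC: the pair (IVAL, JVAL) stands for the value IVAL * 10^JVAL.
  function sketch_dec_value ( ival, jval ) result ( x )
    integer ( kind = 4 ), intent ( in ) :: ival, jval
    real ( kind = 8 ) :: x
    x = real ( ival, kind = 8 ) * 10.0D+00 ** jval
  end function sketch_dec_value

  ! RAT: the pair (IVAL, JVAL) stands for the ratio IVAL / JVAL.
  function sketch_rat_value ( ival, jval ) result ( x )
    integer ( kind = 4 ), intent ( in ) :: ival, jval
    real ( kind = 8 ) :: x
    x = real ( ival, kind = 8 ) / real ( jval, kind = 8 )
  end function sketch_rat_value

end module naming_sketch

program naming_demo
  use naming_sketch
  implicit none
  character :: ch                                   ! CH
  character, dimension ( 3 ) :: chvec               ! CHVEC
  character ( len = 8 ) :: s                        ! S
  character ( len = 5 ), dimension ( 2 ) :: svec    ! SVEC
  integer ( kind = 4 ) :: i4                        ! I4
  real ( kind = 4 ) :: r4                           ! R4
  real ( kind = 8 ) :: r8                           ! R8

  ch = 'q'
  chvec = (/ 'a', 'b', 'c' /)
  s = 'a string'
  svec = (/ 'alpha', 'beta ' /)
  i4 = 42
  r4 = 1.5E+00
  r8 = 1.5D+00

  write ( *, '(a,a)' ) 'CH:    ', ch
  write ( *, '(a,3a1)' ) 'CHVEC: ', chvec
  write ( *, '(a,a)' ) 'S:     ', s
  write ( *, '(a,2(a,1x))' ) 'SVEC:  ', svec
  write ( *, '(a,i0)' ) 'I4:    ', i4
  write ( *, '(a,g14.6)' ) 'R4:    ', r4
  write ( *, '(a,g14.6)' ) 'R8:    ', r8
  write ( *, '(a,g14.6)' ) 'DEC:   ', sketch_dec_value ( 312, -1 )
  write ( *, '(a,g14.6)' ) 'RAT:   ', sketch_rat_value ( 3, 4 )
end program naming_demo
```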
Licensing:
The computer code and data files made available on this web page
are distributed under
the GNU LGPL license.
Languages:
CHRPAK is available in
a C version, a C++ version, a FORTRAN90 version,
a MATLAB version, and a Python version.
Related Software and Data:
CAESAR,
a FORTRAN90 library which
can apply a Caesar Shift Cipher to a string of text.
ROT13,
a FORTRAN90 library which
can encipher a string using the ROT13 cipher for letters, and the
ROT5 cipher for digits.
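For readers unfamiliar with these ciphers, the following is a minimal sketch of ROT13 for letters combined with ROT5 for digits. It is an independent illustration, not the code of the ROT13 library; the names rot13_sketch and sketch_rot13 are hypothetical.

```fortran
module rot13_sketch
  implicit none
contains

  ! Rotate letters by 13 places and digits by 5 places;
  ! every other character is copied unchanged.
  function sketch_rot13 ( s ) result ( t )
    character ( len = * ), intent ( in ) :: s
    character ( len = len ( s ) ) :: t
    integer :: i, k

    t = s
    do i = 1, len ( s )
      k = iachar ( s(i:i) )
      if ( 65 <= k .and. k <= 90 ) then
        ! Upper case letters 'A' through 'Z'.
        t(i:i) = achar ( 65 + mod ( k - 65 + 13, 26 ) )
      else if ( 97 <= k .and. k <= 122 ) then
        ! Lower case letters 'a' through 'z'.
        t(i:i) = achar ( 97 + mod ( k - 97 + 13, 26 ) )
      else if ( 48 <= k .and. k <= 57 ) then
        ! Digits '0' through '9'.
        t(i:i) = achar ( 48 + mod ( k - 48 + 5, 10 ) )
      end if
    end do
  end function sketch_rot13

end module rot13_sketch

program rot13_demo
  use rot13_sketch
  implicit none

  ! Enciphering once scrambles the text.
  write ( *, '(a)' ) sketch_rot13 ( 'Hello, World 2016!' )

  ! Enciphering twice recovers the original text.
  write ( *, '(a)' ) sketch_rot13 ( sketch_rot13 ( 'Hello, World 2016!' ) )
end program rot13_demo
```

Because 13 is half of 26 and 5 is half of 10, the same function both enciphers and deciphers.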
List of Routines:
-
A_TO_I4 returns the index of an alphabetic character.
-
B4_IEEE_TO_R4 converts a 4 byte IEEE word into an R4.
-
B4_IEEE_TO_SEF converts an IEEE real word to S * 2^E * F format.
-
BASE_TO_I4 returns the value of an I4 represented in some base.
-
BINARY_TO_I4 converts a binary representation into an I4.
-
BINARY_TO_R4 converts a binary representation into an R4.
-
BINARY_TO_R8 converts a binary representation into an R8.
-
CH_CAP capitalizes a single character.
-
CH_COUNT_CHVEC_ADD adds a character vector to a character count.
-
CH_COUNT_FILE_ADD adds characters in a file to a character count.
-
CH_COUNT_HISTOGRAM_PRINT prints a histogram of a set of character counts.
-
CH_COUNT_INIT initializes a character count.
-
CH_COUNT_PRINT prints a set of character counts.
-
CH_COUNT_S_ADD adds a character string to a character histogram.
-
CH_EQI is a case insensitive comparison of two characters for equality.
-
CH_EXTRACT extracts the next nonblank character from a string.
-
CH_INDEX_FIRST is the first occurrence of a character in a string.
-
CH_INDEX_LAST is the last occurrence of a character in a string.
-
CH_INDEXI is the (case insensitive) first occurrence of a character in a string.
-
CH_IS_ALPHA is TRUE if CH is an alphabetic character.
-
CH_IS_ALPHANUMERIC is TRUE if CH is alphanumeric.
-
CH_IS_CONTROL is TRUE if a character is a control character.
-
CH_IS_DIGIT is TRUE if a character is a decimal digit.
-
CH_IS_FORMAT_CODE is TRUE if a character is a FORTRAN format code.
-
CH_IS_ISBN_DIGIT is TRUE if a character is an ISBN digit.
-
CH_IS_LOWER is TRUE if a character is a lower case letter.
-
CH_IS_PRINTABLE is TRUE if C is printable.
-
CH_IS_SPACE is TRUE if a character is a whitespace character.
-
CH_IS_UPPER is TRUE if CH is an upper case letter.
-
CH_LOW lowercases a single character.
-
CH_NEXT reads the next character from a string, ignoring blanks and commas.
-
CH_NOT_CONTROL = CH is NOT a control character.
-
CH_ROMAN_TO_I4 converts a single Roman digit to an I4.
-
CH_SCRABBLE returns the character on a given Scrabble tile.
-
CH_SCRABBLE_FREQUENCY returns the Scrabble frequency of a character.
-
CH_SCRABBLE_POINTS returns the Scrabble point value of a character.
-
CH_SCRABBLE_SELECT selects a character with the Scrabble probability.
-
CH_SWAP swaps two characters.
-
CH_TO_AMINO_NAME converts a character to an amino acid name.
-
CH_TO_BRAILLE converts an ASCII character to a Braille character string.
-
CH_TO_CH3_AMINO converts a 1 character to a 3 character code for amino acids.
-
CH_TO_DIGIT returns the value of a base 10 digit.
-
CH_TO_DIGIT_BIN returns the value of a binary digit.
-
CH_TO_DIGIT_OCT returns the value of an octal digit.
-
CH_TO_EBCDIC converts a character to EBCDIC.
-
CH_TO_MILITARY converts an ASCII character to a Military code word.
-
CH_TO_MORSE converts an ASCII character to a Morse character string.
-
CH_TO_ROT13 converts a character to its ROT13 equivalent.
-
CH_TO_SCRABBLE returns the Scrabble index of a character.
-
CH_TO_SOUNDEX converts an ASCII character to a Soundex character.
-
CH_TO_SYM returns a printable symbol for any ASCII character.
-
CH_UNIFORM returns a random character in a given range.
-
CH3_TO_CH_AMINO converts a 3 character to a 1 character code for amino acids.
-
CH4_TO_I4 converts a four character string to an I4.
-
CH4_TO_R4 converts a 4 character string to an R4.
-
CH4VEC_TO_I4VEC converts a string of characters into an array of integers.
-
CHR4_TO_8 replaces pairs of hexadecimal digits by a character.
-
CHR8_TO_4 replaces characters by a pair of hexadecimal digits.
-
CHRA_TO_S replaces control characters by printable symbols.
-
CHRASC converts a vector of ASCII codes into character strings.
-
CHRASS "understands" an assignment statement of the form LHS = RHS.
-
CHRCTF reads an integer or rational fraction from a string.
-
CHRCTG reads an integer, decimal fraction or a ratio from a string.
-
CHRCTI2 finds and reads an integer from a string.
-
CHRCTP reads a parenthesized complex number from a string.
-
CHRS_TO_A replaces all control symbols by control characters.
-
CHVEC_PERMUTE permutes a character vector in place.
-
CHVEC_PRINT prints a character vector.
-
CHVEC_REVERSE reverses the elements of a character vector.
-
CHVEC_TO_S converts a character vector to a string.
-
CHVEC2_PRINT prints two vectors of characters.
-
COMMA moves commas left through blanks in a string.
-
DEC_TO_S_LEFT returns a left-justified representation of IVAL * 10^JVAL.
-
DEC_TO_S_RIGHT returns a right-justified representation of IVAL * 10^JVAL.
-
DIGIT_BIN_TO_CH returns the character representation of a binary digit.
-
DIGIT_INC increments a decimal digit.
-
DIGIT_OCT_TO_CH returns the character representation of an octal digit.
-
DIGIT_TO_CH returns the character representation of a decimal digit.
-
EBCDIC_TO_CH converts an EBCDIC character to ASCII.
-
EBCDIC_TO_S converts a string of EBCDIC characters to ASCII.
-
FILLCH writes a string into a subfield of a string.
-
FILLIN writes an integer into a subfield of a string.
-
FILLRL writes a real into a subfield of a string.
-
FLT_TO_S returns a representation of MANT * 10**IEXP.
-
FORCOM splits a FORTRAN line into "fortran" and "comment".
-
GET_UNIT returns a free FORTRAN unit number.
-
HEX_DIGIT_TO_I4 converts a hexadecimal digit to an I4.
-
HEX_TO_BINARY_DIGITS converts a hexadecimal digit to 4 binary digits.
-
HEX_TO_I4 converts a hexadecimal string to an I4.
-
HEX_TO_S converts a hexadecimal string into characters.
-
I2_BYTE_SWAP swaps bytes in an 8-byte word.
-
I4_BYTE_SWAP swaps bytes in a 4-byte word.
-
I4_EXTRACT "extracts" an I4 from the beginning of a string.
-
I4_GCD finds the greatest common divisor of I and J.
-
I4_HUGE returns a "huge" I4.
-
I4_INPUT prints a prompt string and reads an I4 from the user.
-
I4_LENGTH computes the number of characters needed to print an I4.
-
I4_MODP returns the nonnegative remainder of I4 division.
-
I4_NEXT "reads" I4's from a string, one at a time.
-
I4_NEXT_READ finds and reads the next I4 in a string.
-
I4_RANGE_INPUT reads a pair of I4's from the user, representing a range.
-
I4_SWAP swaps two I4's.
-
I4_TO_A returns the I-th alphabetic character.
-
I4_TO_AMINO_CODE converts an integer to an amino code.
-
I4_TO_BASE represents an integer in any base up to 16.
-
I4_TO_BINARY produces the binary representation of an I4.
-
I4_TO_BINHEX returns the I-th character in the BINHEX encoding.
-
I4_TO_CH4 converts an I4 to a 4 character string.
-
I4_TO_HEX produces the hexadecimal representation of an I4.
-
I4_TO_HEX_DIGIT converts a (small) I4 to a hexadecimal digit.
-
I4_TO_ISBN_DIGIT converts an I4 to an ISBN digit.
-
I4_TO_MONTH_ABB returns the 3 character abbreviation of a given month.
-
I4_TO_MONTH_NAME returns the name of a given month.
-
I4_TO_NUNARY produces the "base -1" representation of an I4.
-
I4_TO_OCT produces the octal representation of an integer.
-
I4_TO_S_LEFT converts an I4 to a left-justified string.
-
I4_TO_S_RIGHT converts an I4 to a right justified string.
-
I4_TO_S_RIGHT_COMMA converts an I4 to a right justified string with commas.
-
I4_TO_S_ROMAN converts an I4 to a string of Roman numerals.
-
I4_TO_S_ZERO converts an I4 to a string, with zero padding.
-
I4_TO_S32 converts an I4 to an S32.
-
I4_TO_UNARY produces the "base 1" representation of an I4.
-
I4_TO_UUDECODE returns the I-th character in the UUDECODE encoding.
-
I4_TO_XXDECODE returns the I-th character in the XXDECODE encoding.
-
I4_UNIFORM returns a scaled pseudorandom I4.
-
I4VEC_INDICATOR sets an I4VEC to the indicator vector.
-
I4VEC_PRINT prints an I4VEC.
-
I4VEC_TO_CH4VEC converts an I4VEC into a string.
-
IC_TO_IBRAILLE converts an ASCII integer code to a Braille code.
-
IC_TO_IEBCDIC converts an ASCII character code to an EBCDIC code.
-
IC_TO_IMORSE converts an ASCII integer code to a Morse integer code.
-
IC_TO_ISOUNDEX converts an ASCII integer code to a Soundex integer code.
-
IEBCDIC_TO_IC converts an EBCDIC character code to ASCII.
-
ISBN_DIGIT_TO_I4 converts an ISBN digit to an I4.
-
ISTRCMP compares two strings, returning +1, 0, or -1.
-
ISTRNCMP compares the start of two strings, returning +1, 0, or -1.
-
LEN_NONNULL returns the length of a string up to the last non-null character.
-
MALPHNUM2 is TRUE if a string contains only alphanumerics and underscores.
-
MILITARY_TO_CH converts a Military code word to an ASCII character.
-
MONTH_NAME_TO_I4 returns the month number of a given month.
-
NAMEFL replaces "lastname, firstname" by "firstname lastname".
-
NAMELF replaces "firstname lastname" by "lastname, firstname".
-
NAMELS reads a NAMELIST line, returning the variable name and value.
-
NEXCHR returns the next nonblank character from a string.
-
NEXSTR returns the next nonblank characters from a string.
-
NUMBER_INC increments the integer represented by a string.
-
OCT_TO_I4 converts an octal string to an I4.
-
PERM_CHECK checks that a vector represents a permutation.
-
PERM_INVERSE3 produces the inverse of a given permutation.
-
PERM_UNIFORM selects a random permutation of N objects.
-
R4_TO_B4_IEEE converts an R4 to a 4 byte IEEE word.
-
R4_TO_BINARY represents an R4 as a string of binary digits.
-
R4_TO_CH4 converts an R4 to a 4 character string.
-
R4_TO_FLT computes the scientific representation of an R4.
-
R4_TO_S_LEFT writes an R4 into a left justified character string.
-
R4_TO_S_RIGHT writes an R4 into a right justified character string.
-
R4_TO_S32 encodes an R4 as 32 characters.
-
R4_TO_SEF represents an R4 as R = S * 2**E * F.
-
R4_UNIFORM_01 returns a unit pseudorandom R4.
-
R8_EXTRACT "extracts" an R8 from the beginning of a string.
-
R8_INPUT prints a prompt string and reads an R8 from the user.
-
R8_NEXT "reads" R8's from a string, one at a time.
-
R8_TO_BINARY represents an R8 as a string of binary digits.
-
R8_TO_S_LEFT writes an R8 into a left justified string.
-
R8_TO_S_RIGHT writes an R8 into a right justified string.
-
R8_UNIFORM_01 returns a unit pseudorandom R8.
-
R8VEC_TO_S "writes" an R8VEC into a string.
-
RANGER "understands" a range defined by a string like '4:8'.
-
RAT_TO_S_LEFT returns a left-justified representation of IVAL/JVAL.
-
RAT_TO_S_RIGHT returns a right-justified representation of IVAL/JVAL.
-
S_ADJUSTL flushes a string left.
-
S_ADJUSTR flushes a string right.
-
S_AFTER_SS_COPY copies a string after a given substring.
-
S_ALPHA_LAST returns the location of the last alphabetic character.
-
S_ANY_ALPHA is TRUE if a string contains any alphabetic character.
-
S_ANY_CONTROL is TRUE if a string contains any control characters.
-
S_B2U replaces interword blanks by underscores.
-
S_BEFORE_SS_COPY copies a string up to a given substring.
-
S_BEGIN is TRUE if one string matches the beginning of the other.
-
S_BEHEAD_SUBSTRING "beheads" a string, removing a given substring.
-
S_BLANK_DELETE removes blanks from a string, left justifying the remainder.
-
S_BLANKS_DELETE replaces consecutive blanks by one blank.
-
S_BLANKS_INSERT inserts blanks into a string, sliding old characters over.
-
S_CAP replaces any lowercase letters by uppercase ones in a string.
-
S_CAT concatenates two strings to make a third string.
-
S_CAT1 concatenates two strings, with a single blank separator.
-
S_CENTER centers the non-blank portion of a string.
-
S_CENTER_INSERT inserts one string into the center of another.
-
S_CH_BLANK replaces each occurrence of a particular character by a blank.
-
S_CH_COUNT counts occurrences of a particular character in a string.
-
S_CH_DELETE removes all occurrences of a character from a string.
-
S_CH_LAST returns the last nonblank character in a string.
-
S_CHOP "chops out" a portion of a string, and closes up the hole.
-
S_COMPARE compares two strings.
-
S_CONTROL_BLANK replaces control characters with blanks.
-
S_CONTROL_COUNT returns the number of control characters in a string.
-
S_CONTROL_DELETE removes all control characters from a string.
-
S_COPY copies one string into another.
-
S_DETAG removes from a string all substrings marked by angle brackets.
-
S_DETROFF removes obnoxious "character" + backspace pairs from a string.
-
S_DIGITS_COUNT counts the digits in a string.
-
S_EQI is a case insensitive comparison of two strings for equality.
-
S_EQIDB compares two strings, ignoring case and blanks.
-
S_ESCAPE_TEX de-escapes TeX escape sequences.
-
S_FILL overwrites every character of a string by a given character.
-
S_FIRST_NONBLANK returns the location of the first nonblank.
-
S_GEI = ( S1 is lexically greater than or equal to S2 ).
-
S_GTI = ( S1 is lexically greater than S2 ).
-
S_INDEX seeks the first occurrence of a substring.
-
S_INDEX_SET searches a string for any of a set of characters.
-
S_INDEXI is a case-insensitive INDEX function.
-
S_INDEX_LAST finds the LAST occurrence of a given substring.
-
S_INDEX_LAST_C finds the LAST occurrence of a given character.
-
S_I_APPEND appends an integer to a string.
-
S_INC_C "increments" the characters in a string.
-
S_INC_N increments the digits in a string.
-
S_INPUT prints a prompt string and reads a string from the user.
-
S_IS_ALPHA returns TRUE if the string contains only alphabetic characters.
-
S_IS_ALPHANUMERIC = string contains only alphanumeric characters.
-
S_IS_DIGIT returns TRUE if a string contains only decimal digits.
-
S_IS_F77_NAME = input string represents a legal FORTRAN77 identifier.
-
S_IS_F90_NAME = input string represents a legal FORTRAN90 identifier.
-
S_IS_I is TRUE if a string represents an integer.
-
S_IS_R is TRUE if a string represents a real number.
-
S_LEFT_INSERT inserts one string flush left into another.
-
S_LEI = ( S1 is lexically less than or equal to S2 ).
-
S_LEN_TRIM returns the length of a string to the last nonblank.
-
S_LOW replaces all uppercase letters by lowercase ones.
-
S_LTI = ( S1 is lexically less than S2 ).
-
S_NEQI compares two strings for non-equality, ignoring case.
-
S_NO_CONTROL = string contains no control characters.
-
S_NONALPHA_DELETE removes nonalphabetic characters from a string.
-
S_OF_I4 converts an integer to a left-justified string.
-
S_ONLY_ALPHAB checks if a string is only alphabetic and blanks.
-
S_ONLY_DIGITB returns TRUE if the string contains only digits or blanks.
-
S_OVERLAP determines the overlap between two strings.
-
S_PAREN_CHECK checks the parentheses in a string.
-
S_R_APPEND appends a real number to a string.
-
S_REPLACE_CH replaces all occurrences of one character by another.
-
S_REPLACE_ONE replaces the first occurrence of SUB1 with SUB2.
-
S_REPLACE_REC is a recursive replacement of one string by another.
-
S_REPLACE replaces all occurrences of SUB1 by SUB2 in a string.
-
S_REPLACE_I replaces all occurrences of SUB1 by SUB2 in a string, ignoring case.
-
S_REVERSE reverses the characters in a string.
-
S_RIGHT_INSERT inserts a string flush right into another.
-
S_ROMAN_TO_I4 converts a Roman numeral to an integer.
-
S_S_DELETE removes all occurrences of a substring from a string.
-
S_S_DELETE2 recursively removes a substring from a string.
-
S_S_INSERT inserts a substring into a string.
-
S_S_SUBANAGRAM determines if S2 is a "subanagram" of S1.
-
S_S_SUBANAGRAM_SORTED determines if S2 is a "subanagram" of S1.
-
S_SCRABBLE_POINTS returns the Scrabble point value of a string.
-
S_SET_DELETE removes any characters in one string from another string.
-
S_SHIFT_CIRCULAR circular shifts the characters in a string to the right.
-
S_SHIFT_LEFT shifts the characters in a string to the left and blank pads.
-
S_SHIFT_RIGHT shifts the characters in a string to the right and blank pads.
-
S_SKIP_SET finds the first entry of a string that is NOT in a set.
-
S_SORT_A sorts a string into ascending order.
-
S_SPLIT divides a string into three parts, given the middle.
-
S_SWAP swaps two strings.
-
S_TAB_BLANK replaces each TAB character by one space.
-
S_TAB_BLANKS replaces TAB characters by 6 spaces.
-
S_TO_C4 reads a complex number from a string.
-
S_TO_CAESAR applies a Caesar shift cipher to a string.
-
S_TO_CHVEC converts a string to a character vector.
-
S_TO_DATE converts the F90 date string to a more usual format.
-
S_TO_DEC reads a number from a string, returning a decimal result.
-
S_TO_DIGITS extracts N digits from a string.
-
S_TO_EBCDIC converts a character string from ASCII to EBCDIC.
-
S_TO_FORMAT reads a FORTRAN format from a string.
-
S_TO_HEX replaces a character string by a hexadecimal representation.
-
S_TO_I4 reads an I4 from a string.
-
S_TO_I4VEC reads an integer vector from a string.
-
S_TO_ISBN_DIGITS extracts N ISBN digits from a string.
-
S_TO_L4 reads a logical value from a string.
-
S_TO_R4 reads an R4 value from a string.
-
S_TO_R4VEC reads an R4VEC from a string.
-
S_TO_R8 reads an R8 value from a string.
-
S_TO_R8_OLD reads an R8 value from a string.
-
S_TO_R8VEC reads an R8VEC from a string.
-
S_TO_ROT13 "rotates" the alphabetical characters in a string by 13 positions.
-
S_TO_SOUNDEX computes the Soundex code of a string.
-
S_TO_W reads the next blank-delimited word from a string.
-
S_TOKEN_EQUAL checks whether a string is equal to any of a set of strings.
-
S_TOKEN_MATCH matches the beginning of a string and a set of tokens.
-
S_TRIM_ZEROS removes trailing zeros from a string.
-
S_U2B replaces underscores by blanks.
-
S_WORD_APPEND appends a word to a string.
-
S_WORD_CAP capitalizes the first character of each word in a string.
-
S_WORD_COUNT counts the number of "words" in a string.
-
S_WORD_EXTRACT_FIRST extracts the first word from a string.
-
S_WORD_FIND finds the word of a given index in a string.
-
S_WORD_INDEX finds the word of a given index in a string.
-
S_WORD_NEXT "reads" words from a string, one at a time.
-
S_WORD_PERMUTE permutes the words in a string.
-
S32_TO_I4 returns an I4 equivalent to a 32 character string.
-
S32_TO_R4 converts a 32-character variable into an R4.
-
SEF_TO_B4_IEEE converts SEF information to a 4 byte IEEE real word.
-
SEF_TO_R4 converts SEF information to an R4 = S * 2.0**E * F.
-
SORT_HEAP_EXTERNAL externally sorts a list of items into ascending order.
-
STATE_ID returns the 2 letter Postal Code for one of the 50 states.
-
STATE_NAME returns the name of one of the 50 states.
-
SVEC_LAB makes an index array for an array of (repeated) strings.
-
SVEC_MERGE_A merges two ascending sorted string arrays.
-
SVEC_PERMUTE permutes a string vector in place.
-
SVEC_REVERSE reverses the elements of a string vector.
-
SVEC_SEARCH_BINARY_A searches an ascending sorted string vector.
-
SVEC_SORT_HEAP_A ascending sorts an SVEC using heap sort.
-
SVEC_SORT_HEAP_A_INDEX: case-sensitive indexed heap sort of an SVEC.
-
SVEC_SORTED_UNIQUE: number of unique entries in a sorted SVEC.
-
SVECI_SEARCH_BINARY_A searches an ascending sorted, implicitly capitalized SVEC.
-
SVECI_SORT_HEAP_A heap sorts an SVEC of implicitly capitalized strings.
-
SVECI_SORT_HEAP_A_INDEX index heap sorts an SVECI.
-
SYM_TO_CH returns the character represented by a symbol.
-
TIMESTAMP prints the current YMDHMS date as a time stamp.
-
TOKEN_EXPAND makes sure certain tokens have spaces surrounding them.
-
TOKEN_EXTRACT "extracts" a token from the beginning of a string.
-
TOKEN_INDEX finds the N-th FORTRAN variable name in a string.
-
TOKEN_NEXT finds the next FORTRAN variable name in a string.
-
WORD_BOUNDS returns the start and end of each word in a string.
-
WORD_LAST_READ returns the last word from a string.
-
WORD_NEXT finds the next (blank separated) word in a string.
-
WORD_NEXT_READ "reads" words from a string, one at a time.
-
WORD_NEXT2 returns the first word in a string.
-
WORD_SWAP swaps two words in a given string.
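To give a flavor of what two of the simpler entries above describe (counting the words in a string, and replacing consecutive blanks by one blank), here is a minimal sketch in standard Fortran. It does not reproduce the CHRPAK source; the names routine_sketch, sketch_word_count and sketch_blanks_squeeze are hypothetical, and dropping leading blanks is a choice made for this sketch only.

```fortran
module routine_sketch
  implicit none
contains

  ! Count the blank-delimited words in S.
  function sketch_word_count ( s ) result ( n )
    character ( len = * ), intent ( in ) :: s
    integer :: n
    integer :: i
    logical :: in_word

    n = 0
    in_word = .false.
    do i = 1, len ( s )
      if ( s(i:i) == ' ' ) then
        in_word = .false.
      else if ( .not. in_word ) then
        ! A nonblank after a blank (or at the start) begins a new word.
        in_word = .true.
        n = n + 1
      end if
    end do
  end function sketch_word_count

  ! Replace each run of blanks inside S by a single blank.
  ! Leading blanks are dropped; the result is blank padded on the right.
  function sketch_blanks_squeeze ( s ) result ( t )
    character ( len = * ), intent ( in ) :: s
    character ( len = len ( s ) ) :: t
    integer :: i, j

    t = ' '
    j = 0
    do i = 1, len ( s )
      if ( s(i:i) /= ' ' ) then
        j = j + 1
        t(j:j) = s(i:i)
      else if ( 0 < j ) then
        if ( t(j:j) /= ' ' ) then
          ! Keep one blank; T was initialized to blanks, so just advance.
          j = j + 1
        end if
      end if
    end do
  end function sketch_blanks_squeeze

end module routine_sketch

program routine_demo
  use routine_sketch
  implicit none
  character ( len = 24 ) :: s

  s = '  the quick   brown fox '
  write ( *, '(a,i0)' ) 'Words: ', sketch_word_count ( s )
  write ( *, '(a)' ) '"' // trim ( sketch_blanks_squeeze ( s ) ) // '"'
end program routine_demo
```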
Last revised on 30 January 2016.