CHRPAK
Characters and Strings

CHRPAK is a FORTRAN90 library which handles characters and strings.

CHRPAK began when I simply wanted to be able to capitalize a string. It has since grown to cover a much wider range of tasks, including some unusual ones:

string '31.2' <=> numeric value 31.2;
uppercase <=> lowercase;
removal of control characters or blanks;
sorting, merging, searching.
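
As a rough illustration of the first three items above, the following Python sketch mimics a string-to-number round trip, capitalization, and control character removal. The function names are hypothetical stand-ins modeled on the routine naming scheme; they are not CHRPAK's actual interface, and the library's own handling of edge cases may differ.

```python
def s_to_r8(s):
    """Read a real value from a string such as '31.2'."""
    return float(s.strip())


def r8_to_s_left(x):
    """Write a real value as a left-justified string."""
    return repr(float(x))


def s_cap(s):
    """Replace ASCII lowercase letters by uppercase ones (assumption: ASCII only)."""
    return "".join(ch.upper() if "a" <= ch <= "z" else ch for ch in s)


def s_control_delete(s):
    """Remove ASCII control characters (codes 0 to 31, and 127)."""
    return "".join(ch for ch in s if not (ord(ch) < 32 or ord(ch) == 127))


def s_blank_delete(s):
    """Remove all blanks, closing up the remainder to the left."""
    return s.replace(" ", "")
```

For example, s_to_r8('31.2') yields the numeric value 31.2, and r8_to_s_left applied to that value gives back the string '31.2'.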

Many of the routine names begin with the name of the data type they operate on:

B4 - a 4 byte word;
CH - a character;
CHVEC - a vector of characters;
DEC - a decimal fraction;
DIGIT - a character representing a numeric digit;
I4 - an integer ( kind = 4 );
R4 - a real ( kind = 4 );
R8 - a real ( kind = 8 );
RAT - a ratio I/J;
S - a string;
SVEC - a vector of strings;
SVECI - a vector of strings, implicitly capitalized.
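
The difference between SVEC and SVECI can be pictured with a small Python sketch. It assumes that "implicitly capitalized" means strings are compared as though they had been capitalized, while the stored strings are left unchanged. The helper name sveci_sort_a is hypothetical and only echoes the naming scheme above.

```python
def sveci_sort_a(strings):
    """Sort ascending, comparing strings as if they were capitalized."""
    return sorted(strings, key=str.upper)


words = ["pear", "Apple", "banana", "Cherry"]

# Case-sensitive ordering: in ASCII, uppercase letters precede lowercase ones.
plain = sorted(words)

# Implicitly capitalized ordering: case is ignored during comparison.
implicit = sveci_sort_a(words)
```

Here plain is ['Apple', 'Cherry', 'banana', 'pear'], while implicit is ['Apple', 'banana', 'Cherry', 'pear'].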

Licensing:

The computer code and data files made available on this web page are distributed under the GNU LGPL license.

Languages:

CHRPAK is available in a C version, a C++ version, a FORTRAN90 version, a MATLAB version, and a Python version.

Related Software and Data:

CAESAR, a FORTRAN90 library which can apply a Caesar Shift Cipher to a string of text.

ROT13, a FORTRAN90 library which can encipher a string using the ROT13 cipher for letters, and the ROT5 cipher for digits.
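
The two ciphers mentioned above are simple enough to sketch in a few lines of Python. This is only an illustration of the general technique (a Caesar shift of K positions on letters, and ROT13 on letters combined with ROT5 on digits); the function names are hypothetical, and the CAESAR and ROT13 libraries may treat details such as non-ASCII characters differently.

```python
def _shift(ch, first, modulus, k):
    """Rotate ch by k positions within a block of `modulus` consecutive characters."""
    return chr((ord(ch) - ord(first) + k) % modulus + ord(first))


def s_to_caesar(s, k):
    """Apply a Caesar shift of k positions to the letters of s."""
    out = []
    for ch in s:
        if "a" <= ch <= "z":
            out.append(_shift(ch, "a", 26, k))
        elif "A" <= ch <= "Z":
            out.append(_shift(ch, "A", 26, k))
        else:
            out.append(ch)  # everything else is left unchanged
    return "".join(out)


def s_to_rot13(s):
    """Apply ROT13 to letters and ROT5 to digits."""
    out = []
    for ch in s_to_caesar(s, 13):
        out.append(_shift(ch, "0", 10, 5) if "0" <= ch <= "9" else ch)
    return "".join(out)
```

Because 13 is half of 26 and 5 is half of 10, applying s_to_rot13 twice returns the original string; a Caesar shift by K is undone by a shift of -K.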

Source Code:

chrpak.f90, the source code.

Examples and Tests:

chrpak_test.f90, the calling program;
chrpak_test.txt, the output file.

List of Routines:

A_TO_I4 returns the index of an alphabetic character.
B4_IEEE_TO_R4 converts a 4 byte IEEE word into an R4.
B4_IEEE_TO_SEF converts an IEEE real word to S * 2^E * F format.
BASE_TO_I4 returns the value of an I4 represented in some base.
BINARY_TO_I4 converts a binary representation into an I4.
BINARY_TO_R4 converts a binary representation into an R4.
BINARY_TO_R8 converts a binary representation into an R8.
CH_CAP capitalizes a single character.
CH_COUNT_CHVEC_ADD adds a character vector to a character count.
CH_COUNT_FILE_ADD adds characters in a file to a character count.
CH_COUNT_HISTOGRAM_PRINT prints a histogram of a set of character counts.
CH_COUNT_INIT initializes a character count.
CH_COUNT_PRINT prints a set of character counts.
CH_COUNT_S_ADD adds a character string to a character histogram.
CH_EQI is a case insensitive comparison of two characters for equality.
CH_EXTRACT extracts the next nonblank character from a string.
CH_INDEX_FIRST is the first occurrence of a character in a string.
CH_INDEX_LAST is the last occurrence of a character in a string.
CH_INDEXI: (case insensitive) first occurrence of a character in a string.
CH_IS_ALPHA is TRUE if CH is an alphabetic character.
CH_IS_ALPHANUMERIC is TRUE if CH is alphanumeric.
CH_IS_CONTROL is TRUE if a character is a control character.
CH_IS_DIGIT is TRUE if a character is a decimal digit.
CH_IS_FORMAT_CODE is TRUE if a character is a FORTRAN format code.
CH_IS_ISBN_DIGIT is TRUE if a character is an ISBN digit.
CH_IS_LOWER is TRUE if a character is a lower case letter.
CH_IS_PRINTABLE is TRUE if C is printable.
CH_IS_SPACE is TRUE if a character is a whitespace character.
CH_IS_UPPER is TRUE if CH is an upper case letter.
CH_LOW lowercases a single character.
CH_NEXT reads the next character from a string, ignoring blanks and commas.
CH_NOT_CONTROL = CH is NOT a control character.
CH_ROMAN_TO_I4 converts a single Roman digit to an I4.
CH_SCRABBLE returns the character on a given Scrabble tile.
CH_SCRABBLE_FREQUENCY returns the Scrabble frequency of a character.
CH_SCRABBLE_POINTS returns the Scrabble point value of a character.
CH_SCRABBLE_SELECT selects a character with the Scrabble probability.
CH_SWAP swaps two characters.
CH_TO_AMINO_NAME converts a character to an amino acid name.
CH_TO_BRAILLE converts an ASCII character to a Braille character string.
CH_TO_CH3_AMINO converts a 1 character to a 3 character code for amino acids.
CH_TO_DIGIT returns the value of a base 10 digit.
CH_TO_DIGIT_BIN returns the value of a binary digit.
CH_TO_DIGIT_OCT returns the value of an octal digit.
CH_TO_EBCDIC converts a character to EBCDIC.
CH_TO_MILITARY converts an ASCII character to a Military code word.
CH_TO_MORSE converts an ASCII character to a Morse character string.
CH_TO_ROT13 converts a character to its ROT13 equivalent.
CH_TO_SCRABBLE returns the Scrabble index of a character.
CH_TO_SOUNDEX converts an ASCII character to a Soundex character.
CH_TO_SYM returns a printable symbol for any ASCII character.
CH_UNIFORM returns a random character in a given range.
CH3_TO_CH_AMINO converts a 3 character to a 1 character code for amino acids.
CH4_TO_I4 converts a four character string to an I4.
CH4_TO_R4 converts a 4 character string to an R4.
CH4VEC_TO_I4VEC converts a string of characters into an array of integers.
CHR4_TO_8 replaces pairs of hexadecimal digits by a character.
CHR8_TO_4 replaces characters by a pair of hexadecimal digits.
CHRA_TO_S replaces control characters by printable symbols.
CHRASC converts a vector of ASCII codes into character strings.
CHRASS "understands" an assignment statement of the form LHS = RHS.
CHRCTF reads an integer or rational fraction from a string.
CHRCTG reads an integer, decimal fraction or a ratio from a string.
CHRCTI2 finds and reads an integer from a string.
CHRCTP reads a parenthesized complex number from a string.
CHRS_TO_A replaces all control symbols by control characters.
CHVEC_PERMUTE permutes a character vector in place.
CHVEC_PRINT prints a character vector.
CHVEC_REVERSE reverses the elements of a character vector.
CHVEC_TO_S converts a character vector to a string.
CHVEC2_PRINT prints two vectors of characters.
COMMA moves commas left through blanks in a string.
DEC_TO_S_LEFT returns a left-justified representation of IVAL * 10^JVAL.
DEC_TO_S_RIGHT returns a right justified representation of IVAL * 10**JVAL.
DIGIT_BIN_TO_CH returns the character representation of a binary digit.
DIGIT_INC increments a decimal digit.
DIGIT_OCT_TO_CH returns the character representation of an octal digit.
DIGIT_TO_CH returns the character representation of a decimal digit.
EBCDIC_TO_CH converts an EBCDIC character to ASCII.
EBCDIC_TO_S converts a string of EBCDIC characters to ASCII.
FILLCH writes a string into a subfield of a string.
FILLIN writes an integer into a subfield of a string.
FILLRL writes a real into a subfield of a string.
FLT_TO_S returns a representation of MANT * 10**IEXP.
FORCOM splits a FORTRAN line into "fortran" and "comment".
GET_UNIT returns a free FORTRAN unit number.
HEX_DIGIT_TO_I4 converts a hexadecimal digit to an I4.
HEX_TO_BINARY_DIGITS converts a hexadecimal digit to 4 binary digits.
HEX_TO_I4 converts a hexadecimal string to an I4.
HEX_TO_S converts a hexadecimal string into characters.
I2_BYTE_SWAP swaps bytes in a 2-byte word.
I4_BYTE_SWAP swaps bytes in a 4-byte word.
I4_EXTRACT "extracts" an I4 from the beginning of a string.
I4_GCD finds the greatest common divisor of I and J.
I4_HUGE returns a "huge" I4.
I4_INPUT prints a prompt string and reads an I4 from the user.
I4_LENGTH computes the number of characters needed to print an I4.
I4_MODP returns the nonnegative remainder of I4 division.
I4_NEXT "reads" I4's from a string, one at a time.
I4_NEXT_READ finds and reads the next I4 in a string.
I4_RANGE_INPUT reads a pair of I4's from the user, representing a range.
I4_SWAP swaps two I4's.
I4_TO_A returns the I-th alphabetic character.
I4_TO_AMINO_CODE converts an integer to an amino code.
I4_TO_BASE represents an integer in any base up to 16.
I4_TO_BINARY produces the binary representation of an I4.
I4_TO_BINHEX returns the I-th character in the BINHEX encoding.
I4_TO_CH4 converts an I4 to a 4 character string.
I4_TO_HEX produces the hexadecimal representation of an I4.
I4_TO_HEX_DIGIT converts a (small) I4 to a hexadecimal digit.
I4_TO_ISBN_DIGIT converts an I4 to an ISBN digit.
I4_TO_MONTH_ABB returns the 3 character abbreviation of a given month.
I4_TO_MONTH_NAME returns the name of a given month.
I4_TO_NUNARY produces the "base -1" representation of an I4.
I4_TO_OCT produces the octal representation of an integer.
I4_TO_S_LEFT converts an I4 to a left-justified string.
I4_TO_S_RIGHT converts an I4 to a right justified string.
I4_TO_S_RIGHT_COMMA converts an I4 to a right justified string with commas.
I4_TO_S_ROMAN converts an I4 to a string of Roman numerals.
I4_TO_S_ZERO converts an I4 to a string, with zero padding.
I4_TO_S32 converts an I4 to an S32.
I4_TO_UNARY produces the "base 1" representation of an I4.
I4_TO_UUDECODE returns the I-th character in the UUDECODE encoding.
I4_TO_XXDECODE returns the I-th character in the XXDECODE encoding.
I4_UNIFORM returns a scaled pseudorandom I4.
I4VEC_INDICATOR sets an I4VEC to the indicator vector.
I4VEC_PRINT prints an I4VEC.
I4VEC_TO_CH4VEC converts an I4VEC into a string.
IC_TO_IBRAILLE converts an ASCII integer code to a Braille code.
IC_TO_IEBCDIC converts an ASCII character code to an EBCDIC code.
IC_TO_IMORSE converts an ASCII integer code to a Morse integer code.
IC_TO_ISOUNDEX converts an ASCII integer code to a Soundex integer code.
IEBCDIC_TO_IC converts an EBCDIC character code to ASCII.
ISBN_DIGIT_TO_I4 converts an ISBN digit to an I4.
ISTRCMP compares two strings, returning +1, 0, or -1.
ISTRNCMP compares the start of two strings, returning +1, 0, or -1.
LEN_NONNULL returns the length of a string up to the last non-null character.
MALPHNUM2 is TRUE if a string contains only alphanumerics and underscores.
MILITARY_TO_CH converts a Military code word to an ASCII character.
MONTH_NAME_TO_I4 returns the month number of a given month.
NAMEFL replaces "lastname, firstname" by "firstname lastname".
NAMELF replaces "firstname lastname" by "lastname, firstname".
NAMELS reads a NAMELIST line, returning the variable name and value.
NEXCHR returns the next nonblank character from a string.
NEXSTR returns the next nonblank characters from a string.
NUMBER_INC increments the integer represented by a string.
OCT_TO_I4 converts an octal string to an I4.
PERM_CHECK checks that a vector represents a permutation.
PERM_INVERSE3 produces the inverse of a given permutation.
PERM_UNIFORM selects a random permutation of N objects.
R4_TO_B4_IEEE converts an R4 to a 4 byte IEEE word.
R4_TO_BINARY represents an R4 as a string of binary digits.
R4_TO_CH4 converts an R4 to a 4 character string.
R4_TO_FLT computes the scientific representation of an R4.
R4_TO_S_LEFT writes an R4 into a left justified character string.
R4_TO_S_RIGHT writes an R4 into a right justified character string.
R4_TO_S32 encodes an R4 as 32 characters.
R4_TO_SEF represents an R4 as R = S * 2**E * F.
R4_UNIFORM_01 returns a unit pseudorandom R4.
R8_EXTRACT "extracts" an R8 from the beginning of a string.
R8_INPUT prints a prompt string and reads an R8 from the user.
R8_NEXT "reads" R8's from a string, one at a time.
R8_TO_BINARY represents an R8 as a string of binary digits.
R8_TO_S_LEFT writes an R8 into a left justified string.
R8_TO_S_RIGHT writes an R8 into a right justified string.
R8_UNIFORM_01 returns a unit pseudorandom R8.
R8VEC_TO_S "writes" an R8VEC into a string.
RANGER "understands" a range defined by a string like '4:8'.
RAT_TO_S_LEFT returns a left-justified representation of IVAL/JVAL.
RAT_TO_S_RIGHT returns a right-justified representation of IVAL/JVAL.
S_ADJUSTL flushes a string left.
S_ADJUSTR flushes a string right.
S_AFTER_SS_COPY copies a string after a given substring.
S_ALPHA_LAST returns the location of the last alphabetic character.
S_ANY_ALPHA is TRUE if a string contains any alphabetic character.
S_ANY_CONTROL is TRUE if a string contains any control characters.
S_B2U replaces interword blanks by underscores.
S_BEFORE_SS_COPY copies a string up to a given substring.
S_BEGIN is TRUE if one string matches the beginning of the other.
S_BEHEAD_SUBSTRING "beheads" a string, removing a given substring.
S_BLANK_DELETE removes blanks from a string, left justifying the remainder.
S_BLANKS_DELETE replaces consecutive blanks by one blank.
S_BLANKS_INSERT inserts blanks into a string, sliding old characters over.
S_CAP replaces any lowercase letters by uppercase ones in a string.
S_CAT concatenates two strings to make a third string.
S_CAT1 concatenates two strings, with a single blank separator.
S_CENTER centers the non-blank portion of a string.
S_CENTER_INSERT inserts one string into the center of another.
S_CH_BLANK replaces each occurrence of a particular character by a blank.
S_CH_COUNT counts occurrences of a particular character in a string.
S_CH_DELETE removes all occurrences of a character from a string.
S_CH_LAST returns the last nonblank character in a string.
S_CHOP "chops out" a portion of a string, and closes up the hole.
S_COMPARE compares two strings.
S_CONTROL_BLANK replaces control characters with blanks.
S_CONTROL_COUNT returns the number of control characters in a string.
S_CONTROL_DELETE removes all control characters from a string.
S_COPY copies one string into another.
S_DETAG removes from a string all substrings marked by angle brackets.
S_DETROFF removes obnoxious "character" + backspace pairs from a string.
S_DIGITS_COUNT counts the digits in a string.
S_EQI is a case insensitive comparison of two strings for equality.
S_EQIDB compares two strings, ignoring case and blanks.
S_ESCAPE_TEX de-escapes TeX escape sequences.
S_FILL overwrites every character of a string by a given character.
S_FIRST_NONBLANK returns the location of the first nonblank.
S_GEI = ( S1 is lexically greater than or equal to S2 ).
S_GTI = ( S1 is lexically greater than S2 ).
S_INDEX seeks the first occurrence of a substring.
S_INDEX_SET searches a string for any of a set of characters.
S_INDEXI is a case-insensitive INDEX function.
S_INDEX_LAST finds the LAST occurrence of a given substring.
S_INDEX_LAST_C finds the LAST occurrence of a given character.
S_I_APPEND appends an integer to a string.
S_INC_C "increments" the characters in a string.
S_INC_N increments the digits in a string.
S_INPUT prints a prompt string and reads a string from the user.
S_IS_ALPHA returns TRUE if the string contains only alphabetic characters.
S_IS_ALPHANUMERIC = string contains only alphanumeric characters.
S_IS_DIGIT returns TRUE if a string contains only decimal digits.
S_IS_F77_NAME = input string represents a legal FORTRAN77 identifier.
S_IS_F90_NAME = input string represents a legal FORTRAN90 identifier.
S_IS_I is TRUE if a string represents an integer.
S_IS_R is TRUE if a string represents a real number.
S_LEFT_INSERT inserts one string flush left into another.
S_LEI = ( S1 is lexically less than or equal to S2 ).
S_LEN_TRIM returns the length of a string to the last nonblank.
S_LOW replaces all uppercase letters by lowercase ones.
S_LTI = ( S1 is lexically less than S2 ).
S_NEQI compares two strings for non-equality, ignoring case.
S_NO_CONTROL = string contains no control characters.
S_NONALPHA_DELETE removes nonalphabetic characters from a string.
S_OF_I4 converts an integer to a left-justified string.
S_ONLY_ALPHAB checks if a string is only alphabetic and blanks.
S_ONLY_DIGITB returns TRUE if the string contains only digits or blanks.
S_OVERLAP determines the overlap between two strings.
S_PAREN_CHECK checks the parentheses in a string.
S_R_APPEND appends a real number to a string.
S_REPLACE_CH replaces all occurrences of one character by another.
S_REPLACE_ONE replaces the first occurrence of SUB1 with SUB2.
S_REPLACE_REC is a recursive replacement of one string by another.
S_REPLACE replaces all occurrences of SUB1 by SUB2 in a string.
S_REPLACE_I replaces all occurrences of SUB1 by SUB2 in a string, ignoring case.
S_REVERSE reverses the characters in a string.
S_RIGHT_INSERT inserts a string flush right into another.
S_ROMAN_TO_I4 converts a Roman numeral to an integer.
S_S_DELETE removes all occurrences of a substring from a string.
S_S_DELETE2 recursively removes a substring from a string.
S_S_INSERT inserts a substring into a string.
S_S_SUBANAGRAM determines if S2 is a "subanagram" of S1.
S_S_SUBANAGRAM_SORTED determines if S2 is a "subanagram" of S1.
S_SCRABBLE_POINTS returns the Scrabble point value of a string.
S_SET_DELETE removes any characters in one string from another string.
S_SHIFT_CIRCULAR circular shifts the characters in a string to the right.
S_SHIFT_LEFT shifts the characters in a string to the left and blank pads.
S_SHIFT_RIGHT shifts the characters in a string to the right and blank pads.
S_SKIP_SET finds the first entry of a string that is NOT in a set.
S_SORT_A sorts a string into ascending order.
S_SPLIT divides a string into three parts, given the middle.
S_SWAP swaps two strings.
S_TAB_BLANK replaces each TAB character by one space.
S_TAB_BLANKS replaces TAB characters by 6 spaces.
S_TO_C4 reads a complex number from a string.
S_TO_CAESAR applies a Caesar shift cipher to a string.
S_TO_CHVEC converts a string to a character vector.
S_TO_DATE converts the F90 date string to a more usual format.
S_TO_DEC reads a number from a string, returning a decimal result.
S_TO_DIGITS extracts N digits from a string.
S_TO_EBCDIC converts a character string from ASCII to EBCDIC.
S_TO_FORMAT reads a FORTRAN format from a string.
S_TO_HEX replaces a character string by a hexadecimal representation.
S_TO_I4 reads an I4 from a string.
S_TO_I4VEC reads an integer vector from a string.
S_TO_ISBN_DIGITS extracts N ISBN digits from a string.
S_TO_L4 reads a logical value from a string.
S_TO_R4 reads an R4 value from a string.
S_TO_R4VEC reads an R4VEC from a string.
S_TO_R8 reads an R8 value from a string.
S_TO_R8_OLD reads an R8 value from a string.
S_TO_R8VEC reads an R8VEC from a string.
S_TO_ROT13 "rotates" the alphabetical characters in a string by 13 positions.
S_TO_SOUNDEX computes the Soundex code of a string.
S_TO_W reads the next blank-delimited word from a string.
S_TOKEN_EQUAL checks whether a string is equal to any of a set of strings.
S_TOKEN_MATCH matches the beginning of a string and a set of tokens.
S_TRIM_ZEROS removes trailing zeros from a string.
S_U2B replaces underscores by blanks.
S_WORD_APPEND appends a word to a string.
S_WORD_CAP capitalizes the first character of each word in a string.
S_WORD_COUNT counts the number of "words" in a string.
S_WORD_EXTRACT_FIRST extracts the first word from a string.
S_WORD_FIND finds the word of a given index in a string.
S_WORD_INDEX finds the word of a given index in a string.
S_WORD_NEXT "reads" words from a string, one at a time.
S_WORD_PERMUTE permutes the words in a string.
S32_TO_I4 returns an I4 equivalent to a 32 character string.
S32_TO_R4 converts a 32-character variable into an R4.
SEF_TO_B4_IEEE converts SEF information to a 4 byte IEEE real word.
SEF_TO_R4 converts SEF information to an R4 = S * 2.0**E * F.
SORT_HEAP_EXTERNAL externally sorts a list of items into ascending order.
STATE_ID returns the 2 letter Postal Code for one of the 50 states.
STATE_NAME returns the name of one of the 50 states.
SVEC_LAB makes an index array for an array of (repeated) strings.
SVEC_MERGE_A merges two ascending sorted string arrays.
SVEC_PERMUTE permutes a string vector in place.
SVEC_REVERSE reverses the elements of a string vector.
SVEC_SEARCH_BINARY_A searches an ascending sorted string vector.
SVEC_SORT_HEAP_A ascending sorts an SVEC using heap sort.
SVEC_SORT_HEAP_A_INDEX: case-sensitive indexed heap sort of an SVEC.
SVEC_SORTED_UNIQUE: number of unique entries in a sorted SVEC.
SVECI_SEARCH_BINARY_A: search ascending sorted implicitly capitalized SVEC.
SVECI_SORT_HEAP_A heap sorts an SVEC of implicitly capitalized strings.
SVECI_SORT_HEAP_A_INDEX index heap sorts an SVECI.
SYM_TO_CH returns the character represented by a symbol.
TIMESTAMP prints the current YMDHMS date as a time stamp.
TOKEN_EXPAND makes sure certain tokens have spaces surrounding them.
TOKEN_EXTRACT "extracts" a token from the beginning of a string.
TOKEN_INDEX finds the N-th FORTRAN variable name in a string.
TOKEN_NEXT finds the next FORTRAN variable name in a string.
WORD_BOUNDS returns the start and end of each word in a string.
WORD_LAST_READ returns the last word from a string.
WORD_NEXT finds the next (blank separated) word in a string.
WORD_NEXT_READ "reads" words from a string, one at a time.
WORD_NEXT2 returns the first word in a string.
WORD_SWAP swaps two words in a given string.
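
To give a flavor of what paired routines in the list do, here is a Python sketch of a Roman numeral conversion in both directions, in the spirit of I4_TO_S_ROMAN and S_ROMAN_TO_I4. It uses the standard subtractive notation and assumes a range of 1 to 3999; both choices are assumptions for this illustration and are not taken from the CHRPAK source.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
_ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

_DIGIT = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def i4_to_s_roman(n):
    """Convert an integer in 1..3999 (assumed range) to a Roman numeral string."""
    if not 1 <= n <= 3999:
        raise ValueError("value out of range for this sketch")
    parts = []
    for value, symbol in _ROMAN:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def s_roman_to_i4(s):
    """Convert a Roman numeral string to an integer, ignoring case."""
    digits = [_DIGIT[ch] for ch in s.upper()]
    total = 0
    for i, v in enumerate(digits):
        # A smaller digit placed before a larger one is subtracted.
        if i + 1 < len(digits) and v < digits[i + 1]:
            total -= v
        else:
            total += v
    return total
```

With these definitions, i4_to_s_roman(2016) gives 'MMXVI', and s_roman_to_i4 reverses the conversion for every value in the assumed range.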


Last revised on 30 January 2016.