          P R E L I M I N A R Y   S P E C I F I C A T I O N

                 Due 2:00 AM, Friday, 23 October 2020

CPSC 323   Homework #3   The Shell Game: The Macroprocessor

REMINDER:  Do not under any circumstances copy someone else's code for this
assignment, give your code to someone else, or make it publicly available.
After discussing any aspect of the assignment with anyone other than a member
of the teaching staff (such discussions must be noted in your log file), do
not keep any written or electronic record and engage in some mind-numbing
activity before you work on the assignment again.  Sharing ANY related
document (e.g., code or test cases) is a violation of this policy.

Since code reuse is an important part of programming, you may study published
code (e.g., from textbooks or the Net) and/or incorporate it in a program,
provided that you give proper attribution in your log file and in your source
files (see the syllabus for details) and that the bulk of the code submitted
is your own.  Note:  Removing/rewriting comments, renaming functions/variables,
or reformatting statements does not convey ownership.


(40 points) /bin/bash (the Bourne Again SHell) performs variable expansion on
the command line.  For example, if the value of the environment variable HWK
is the string "/c/cs323/Hwk3", then the command

  % $HWK/test.mcBash

is expanded to

  % /c/cs323/Hwk3/test.mcBash

More generally, bash performs a large set of variable expansions, including

  Syntax          Action
  ~~~~~~          ~~~~~~
  $NAME           Replace by the value of the environment variable NAME, or
                  by the empty string if NAME is not defined.

  ${NAME}         Replace by the value of the environment variable NAME, or
                  by the empty string if NAME is not defined.

  ${NAME-WORD}    Replace by the value of NAME, or by the expansion of WORD
                  if NAME is not defined.

  ${NAME=WORD}    Replace by the value of the environment variable NAME, or
                  by the expansion of WORD if NAME is not defined (in which
                  case NAME is immediately assigned the expansion of WORD).

  $0, ..., $9     Replace $D (where D is a decimal digit) by the D-th command
                  line argument to mcBash, or by the empty string if there is
                  no D-th argument.

  ${N}            Replace by the N-th argument to mcBash, or by the empty
                  string if there is no N-th argument.

  $*              Replace by a list of all arguments to mcBash (not including
                  $0), separated by single space characters.

where

* NAME is a maximal sequence of one or more alphanumeric or _ characters that
  begins with an alphabetic or _ character;

* WORD is any sequence of characters that ends with the first } not escaped
  by a backslash and not part of one of the expansions above; and

* N is a nonempty sequence of decimal digits.

The expansion of WORD takes place before the substitution so that the search
for substrings to expand proceeds from left to right and continues at the end
of the replacement string after each substitution.

The escape character \ removes any special meaning that is associated with
the following non-null, non-newline character.  This can be used to include
$, {, }, \, and whitespace (but not newlines) in a command.  The \ is not
removed.

Collectively these expansions turn one stage in the front-end of bash into a
macroprocessor with a somewhat unusual syntax (e.g., when compared with that
of the C preprocessor /bin/cpp or /bin/m4).

Your task is to implement this stage in Perl or Python or Ruby; i.e., to write
a bash-like macroprocessor "mcBash" that

* prompts for and reads lines from the standard input,

* performs the variable expansions described above, and

* writes the expanded command to the standard output.

Examples:

  % /c/cs323/Hwk3/mcBash
  (1)$ OSTYPE = $OSTYPE
  >> OSTYPE = linux
  (2)$ ${OSTYPE-Linux}
  >> linux
  (3)$ ${NONEXISTENT}
  >>
  (4)$ ${NONEXISTENT=VALUE}
  >> VALUE
  (5)$ ${NONEXISTENT}
  >> VALUE
  (6)$ $0
  >> /c/cs323/Hwk3/mcBash
  (7)$ \$OSTYPE
  >> \$OSTYPE
  (8)$ ${ANOTHER=$OSTYPE}
  >> linux
  (9)$ $ANOTHER
  >> linux

Use the submit command to turn in your source and log files for mcBash as
assignment 3.
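
As an illustration of the simpler rows of the table above, here is a minimal
Python sketch, not a reference solution.  It handles only $NAME, ${NAME},
$0, ..., $9, ${N}, $*, and the backslash escape; the ${NAME-WORD} and
${NAME=WORD} forms and all error detection are deliberately left out, so such
text passes through unchanged.  The names TOKEN and expand_simple, and the
choice to pass the environment and argument list as parameters, are
assumptions made for this example.

```python
import re

# One alternative per token kind.  The escape alternative comes first so that
# an escaped $ is consumed (backslash included) before it can start a macro.
TOKEN = re.compile(r"""
    \\[^\n\0]                               # backslash + escaped character
  | \$(?P<name>[A-Za-z_][A-Za-z0-9_]*)      # $NAME (maximal NAME)
  | \$\{(?P<bname>[A-Za-z_][A-Za-z0-9_]*)\} # ${NAME}
  | \$(?P<digit>[0-9])                      # $0 ... $9
  | \$\{(?P<num>[0-9]+)\}                   # ${N}
  | \$(?P<star>\*)                          # $*
""", re.VERBOSE)


def expand_simple(line, env, args):
    """Expand the simple macro forms in line (hypothetical helper).

    env is a mapping such as os.environ; args is the argument list with the
    program name at index 0.
    """
    def replace(match):
        name = match.group("name") or match.group("bname")
        if name is not None:
            return env.get(name, "")          # undefined NAME -> empty string
        number = match.group("digit") or match.group("num")
        if number is not None:
            index = int(number)
            return args[index] if index < len(args) else ""
        if match.group("star"):
            return " ".join(args[1:])         # $* excludes $0
        return match.group(0)                 # escape: keep the backslash

    # re.sub scans left to right and resumes after each replacement, so the
    # replacement text itself is never rescanned.
    return TOKEN.sub(replace, line)
```

For instance, with env = {"OSTYPE": "linux"}, the call
expand_simple("OSTYPE = $OSTYPE", env, ["mcBash"]) yields "OSTYPE = linux",
which agrees with Example (1)$ above.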

YOU MUST SUBMIT YOUR FILES (INCLUDING THE LOG FILE) AT THE END OF ANY SESSION
WHERE YOU WRITE OR DEBUG CODE, AND AT LEAST ONCE EVERY HOUR DURING SESSIONS
LASTING LONGER THAN ONE HOUR.  (All submissions are retained.)


Notes
~~~~~
1. The environment is a list of name-value pairs that is passed to an executed
   program in the same way as a normal argument list.  The names must be NAMEs
   and the values may be any null-terminated character strings.  The command
   /usr/bin/env prints all environment variables and their values; the command
   /usr/bin/printenv prints the value of its argument.  See [Matthew and
   Stones, pp. 144-148] for details.

   To access the environment variables you can use the hash %ENV in Perl, the
   mapping object os.environ in Python, and the hash-like accessor ENV in
   Ruby.

2. A $ begins an expansion ONLY when not escaped and immediately followed by

   * an alphabetic or _ as in $NAME

   * a digit as in $0, ..., $9

   * a { as in ${NAME}, ${NAME-WORD}, ${NAME=WORD}, or ${N}

   * a * as in $*

   Otherwise it is just another character.

3. If WORD itself contains macros, mcBash must recursively expand them before
   performing the substitution (e.g., see Example (8)$ above).  However, if
   NAME exists, then during the expansion mcBash must suppress any assignments
   to environment variables and must ignore errors other than a missing }
   (e.g., ${NAME=${:}} if NAME is defined).

   This behavior will be worth at most 8 points (i.e., WORD will contain
   macros in at most 8 tests).

4. When mcBash detects any of the following errors:

   * ${ not followed by an alphanumeric or _

   * ${NAME not followed by a } or - or =

   * WORD is not followed by a }

   * ${N not followed by a }

   it prints a one-line message to stderr and gives a new prompt (unless the
   error must be ignored; see Note (3)).  It does not print the (partially)
   expanded line or increment the line number.  If an expansion earlier in the
   command defined an environment variable (e.g., ${NAME=a}${:} where NAME is
   not initially defined), that definition is retained.

5. [Matthew and Stones, pp. 27-31, 70-72] and the man page for bash ("man
   bash" or "info bash") have more complete descriptions of these expansions
   (and others that mcBash does not implement) as well as examples of their
   use.

   If you use bash to improve your understanding of how the expansion process
   works, bear in mind the following differences between mcBash and bash:

   * bash prompts for additional lines when a matching } is missing.

   * bash allows escaped newlines and WORD fields that span multiple lines,
     prompting for additional input lines.

   * bash allows the use of single and double quotes to remove the special
     meaning of the characters within quotes.

   * bash also does brace expansion, tilde expansion, command substitution,
     arithmetic expansion, word splitting, and pathname expansion.

   * bash expands shell variables as well as environment variables.

   * bash defines other variable expansions and special macros such as $@, $#,
     $?, $-, $$, or $!.

   * bash tokenizes the command BEFORE expanding variables, which gives a
     somewhat different effect than tokenizing after expanding macros.

   Note:  This list will expand as I learn of other discrepancies.

6. Like bash, when a line contains only whitespace, mcBash does not print the
   line or increment the line number used in the prompt.

7. Although all test scripts redirect stdin to a file, you should think of
   mcBash as an interactive program that handles one line at a time and takes
   the appropriate action.  Thus if there are multiple lines and only one
   contains an error, then the expansions of the correct lines should be
   printed (to stdout) as should the error (to stderr).

8. mcBash writes a newline on reaching the end of the input.

9. Hwk3/mcBash runs a Perl solution that contains 39 lines of code (ignoring
   comments and blank/brace-only lines).  A Python solution contains 56 lines
   (ignoring comments and blank/brace-only/continue-only lines).  A C version
   would be _much_ longer.

A. This assignment is designed to acquaint you with a high-level scripting
   language (Perl/Python/Ruby) and with regular expressions.  There are links
   to tutorials for each of these languages on the class web site for those
   who do not know any of them.  There are also links to sites that help you
   understand regular expressions.

                                                              CS-323-09/30/20
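
The line-handling rules in Notes 4, 6, 7, and 8 can be pictured with the
following Python sketch.  It is only an illustration under stated
assumptions:  the prompt "(N)$ " and the output prefix ">> " are inferred from
the examples, the function name run and its parameters are hypothetical, and
the expander is assumed to signal a detected error by raising ValueError
(the specification does not say how errors are represented internally).

```python
import sys


def run(lines, expand, out=sys.stdout, err=sys.stderr):
    """Drive a prompt/expand/print loop over an iterable of input lines.

    expand is a hypothetical function that returns the expanded line or
    raises ValueError with a one-line message when it detects an error.
    """
    count = 1
    out.write(f"({count})$ ")                 # prompt before the first read
    for line in lines:
        line = line.rstrip("\n")
        if line.strip():                      # Note 6: skip whitespace-only
            try:
                out.write(f">> {expand(line)}\n")
                count += 1                    # only successful lines count
            except ValueError as exc:
                err.write(f"{exc}\n")         # Note 4: message to stderr,
                                              # no output, no increment
        out.write(f"({count})$ ")             # next prompt
    out.write("\n")                           # Note 8: newline at end of input
```

Passing the output streams as parameters keeps the loop easy to exercise with
io.StringIO objects; an actual mcBash would iterate over sys.stdin.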