P R E L I M I N A R Y S P E C I F I C A T I O N Due 2:00 AM, Friday, 18 September 2020 CPSC 323 Homework #1 Directory "Tree" Search REMINDER: Do not under any circumstances copy someone else's code for this assignment, give your code to someone else, or make it publicly available. After discussing any aspect of the assignment with anyone other than a member of the teaching staff (such discussions must be noted in your log file), do not keep any written or electronic record and engage in some mind-numbing activity before you work on the assignment again. Sharing ANY related document (e.g., code or test cases), is a violation of this policy. Since code reuse is an important part of programming, you may study published code (e.g., from textbooks or the Net) and/or incorporate it in a program, provided that you give proper attribution in your log file and in your source files (see the syllabus for details) and that the bulk of the code submitted is your own. Note: Removing/rewriting comments, renaming functions/variables, or reformatting statements does not convey ownership. (60 points) Write a utility % ./fiend [-P | -L]* FILENAME* EXPRESSION that traverses the directory tree(s) rooted at the FILENAME(s) specified in the left-to-right order that they appear in the command line. For each file in each tree, it evaluates the EXPRESSION specified from left to right, according to the rules of precedence, until the outcome is known (the left-hand side of an AND operation is false or the left-hand side of an OR operation is true). fiend (pronounced fie-end) is a stripped down version of the Linux utility find that implements only the subset of command-line arguments containing: * the "real" options -P and -L; * the expression components -depth, -exec, -maxdepth, -name, -newer, -print; * the operators -o and -a. Except as noted below, fiend matches the behavior of /bin/find (see "man find" or Matthew and Stones, pp. 181-183) for the subset of arguments above. Thus * If present, the options -P (never follow symbolic links) and -L (always follow symbolic links) must appear before the first FILENAME. The default is -P. If more than one of -P and -L is specified, then the rightmost takes effect. * The list of FILENAMEs begins with the first argument that is neither -P nor -L and ends just before the next argument that begins with a -. If no FILENAMEs are specified, the search is rooted at the current working directory (i.e., .). * The EXPRESSION begins after the last filename (if any) and consists of + options (-depth and -maxdepth), which affect the overall traversal (rather than the evaluation of EXPRESSION) and always return true; + tests (-name and -newer), which return true or false; + actions (-exec and -print), which may have side effects and return true or false (-print always returns true); and + operators (-o and -a). If more than one -maxdepth is specified, then the rightmost takes effect. Use the submit command (see below) to turn in the C source file(s) for fiend, a Makefile, and your log file (see below) as assignment 1; e.g., % /c/cs323/bin/submit 1 Makefile fiend.c log.fiend YOU MUST SUBMIT YOUR FILES (INCLUDING A LOG FILE) AT THE END OF ANY SESSION WHERE YOU WRITE OR DEBUG CODE, AND AT LEAST ONCE EVERY HOUR DURING SESSIONS THAT LAST LONGER THAN ONE HOUR. (All submissions are retained.) Notes ~~~~~ A. When fiend encounters an error, it writes a one-line message to stderr. If the error is in a command-line argument (e.g., when stat() or lstat() fails on the argument to -newer), it exits immediately; otherwise it continues execution. B. fiend must: * Free all storage before exiting, unless there is a command-line error. * Check for errors from system calls. (But if stat() or lstat() of a file is successful, fiend may assume that a subsequent stat() or lstat() of that file will also be successful.) * Detect a loop involving symbolic links (with the -L flag) and report an error. (fiend may ignore the device number and use only the inode number when testing for identical files.) * Issue a warning (a one-line message to stderr) but not exit when -depth or -maxdepth appears after the first test, action, or operator. * Use strtoul() CORRECTLY to read maxdepth; i.e., fiend must deal with errno when calling strtoul() (see Matthew and Stones, pp. 127-128, or the man page for details). * Implement -newer using the st_mtim entry of the struct stat returned by stat() and lstat() rather than the library function difftime(). This entry contains two fields: tv_sec and tv_nsec, which are (respectively) the number of seconds since the start of Unix time, and the number of nanoseconds since the start of the current second. The command % ls -l --full-time shows times to this resolution. * Replace every {} that is, or is embedded in, an argument to -exec by the full name of the current file. (The time for the replacement operation must grow no worse than quadraticly in the length of the command passed to system().) * Perform the -print action whenever EXPRESSION does not contain any actions and the value of EXPRESSION is true. * Preserve trailing /'s in filenames that appear on the command line. E.g., % fiend .//// -maxdepth 0 .//// C. fiend may * Assume that malloc() and realloc() never fail. * Fail gracefully (i.e., exit with no valgrind errors or segmentation faults) if the depth of the depth-first traversal exceeds the number of file pointers available. * Assume that -exec does not create new files or modify/delete old ones. Thus for "-newer file", the time can be fixed when the command-line arguments are first processed. * Ignore all security considerations (see "man find"). D. Keep track of how you spend your time in completing this assignment. Your log file should be of the general form (the log file below is *FICTIONAL*): ESTIMATE of time to complete assignment: 10 hours Start Time Date Time Spent Work completed ---- ----- ---- -------------- 9/03 10:15 1:20 Read assignment and documentation for find 9/05 20:15 1:00 Played with Hwk1/fiend to discover how it works 9/08 12:45 3:30 Sketched solution using recursion 9/08 14:00 0:40 Recursive solution cannot handle all cases 9/10 21:20 4:00 Sketched new solution using two stacks and a queue 9/12 09:00 4:45 Wrote/typed-in program; eliminated compile-time errors 9/13 20:00 4:30 Debugged program; it passes every public test ----- 19:45 TOTAL time spent I discussed depth-first search with Peter Salovey on 9/1; finite-state machines with Scott Strobel on 9/2; queues with Tamar Gendler on 9/3; and stacks with Marvin Chun on 9/4. (the log file above is *FICTIONAL*), but MUST contain * your estimate of the time required (made prior to writing any code); * the total time you actually spent on the assignment, including reading background material; * the names of all others (except members of the teaching staff) with whom you discussed the assignment for more than 10 minutes total and what topic(s) you talked about; and * a brief discussion (100 words = 8 lines of 80 characters MINIMUM) of the major difficulties that you encountered in developing/debugging (and there will always be some). This log will generally be worth 5-10% of the total grade. N.B. To facilitate analysis, your log file MUST be the only file submitted whose name contains the string "log"; and the estimate / total MUST be on the only line in that file that contains the string "ESTIMATE" / "TOTAL". E. The submit program can be invoked in nine different ways: /c/cs323/bin/submit 1 Makefile fiend.c time.log submits the named source files as your solution to Homework #1; /c/cs323/bin/check 2 lists the files that you submitted for Homework #2; /c/cs323/bin/unsubmit 3 error.submit bogus.solution deletes the named files that you submitted previously for Homework #3 (which is useful if you accidentally submit the wrong file); /c/cs323/bin/makeit 4 fiend runs "make" on the files that you submitted previously for Homework #4; /c/cs323/bin/testit 5 fiend runs the public test script for fiend using the files that you submitted previously for Homework #5; /c/cs323/bin/protect 6 fiend.c time.log protects the named files that you submitted previously for Homework #6 (so they cannot be deleted accidentally); /c/cs323/bin/unprotect 7 fiend.c time.log unprotects the named files that you submitted previously for Homework #7 (so they can be deleted); and /c/cs323/bin/retrieve 8 fiend.c time.log and /c/cs323/bin/retrieve 8 -d"2019/09/11 20:00" fiend.c retrieve copies of the named files that you submitted previously for Homework #8 (in case you accidentally delete your own copies). The day and hour are optional and request the latest submission prior to that time (see the -d flag under "man co" for how to specify times). G. The public grading script is /c/cs323/Hwk1/test.fiend. To run it type % /c/cs323/Hwk2/test.fiend The script runs the command % make fiend to create the executable fiend. It then runs each test, capturing the output in a temporary file and comparing it with the file containing the expected output. If the two outputs are identical, the test is passed. The command % /c/cs323/Hwk1/test.fiend 02 03 05 runs only Tests #02, #03, and #05 (you may specify any nonempty set of tests here). The command % /c/cs323/Hwk1/Tests/t01 | cmp - /c/cs323/Hwk1/Tests/t01.t and the command % /c/cs223/Hwk1/Tests/t01 | diff - /c/cs323/Hwk1/Tests/t01.t run Test #01 using your fiend and compare the output with what is expected. Here Hwk1/Tests/t01 is a shell script (i.e., a sequence of shell commands) that executes your fiend one or more times; Hwk1/Tests/t01.t is a text file that contains the expected output; cmp prints the first character where the files differ; and diff prints the lines where they differ but uses a looser definition for "identical". If your output looks the same as what is expected, but your program still fails the test, there may be some invisible characters in your output. To make all characters visible (except blanks), use the command % /c/cs323/Hwk1/Tests/t01 | cat -vet or the command % /c/cs323/Hwk1/Tests/t01 | od -bc See the man pages for cat and od for details. You may use the script /c/cs323/bin/tester to run your own tests. F. Prudence (and a 5-point penalty for code that does not make) suggests that you run % makeit 1 fiend after submitting your source files. Better yet, run % testit 1 fiend The first creates the executable fiend from the sources that you submitted; the second also runs the test script on those sources. H. Differences between fiend and find: * find and fiend may print different one-line error messages. * find prints an extra blank line when warning that -maxdepth or -depth follows a non-option argument. * find treats anything larger than INT_MAX as out of range. * find prints a warning when the NAME in -name NAME contains a /. * find leaks storage. Hwk1/Compare runs fiend and find and reports the differences in their outputs, if any. I. Useful library functions: * stat(2) and lstat(2) return the status of a file. * opendir(3), readdir(3), and closedir(3) deal with directories (see /c/cs323/Hwk1/Dir.c for one example, and Matthew and Stones pp. 124-126 for another). * strdup(3) returns a a block of storage allocated by malloc() that contains a copy of a string. * strncat(3) appends a prefix of one string to another. * strrchr(3) and strstr(3) search one string for a character or another string. * strtoul(3) converts a string to an unsigned long. * system(3) spawns a shell to execute a command. (You may use it ONLY to implement -exec.) * fflush(3) flushes the buffer associated with a FILE pointer; to intermix output from -print and -exec this must be done before calling system(). * fnmatch(3) with the FLAGS argument set to 0 does the regular expression matching needed for -name. See their man pages (e.g., "man 2 stat" or "man 3 opendir") or Matthew and Stones for details. Note: You must add the line #define _GNU_SOURCE before any #include-s to define prototypes for some of these functions. J. The reference solution /c/cs323/Hwk1/fiend exists to answer your questions about how fiend should work. But ask if its behavior is not consistent with your interpretation of this specification since it may have bugs. K. Hwk1/fiend.h contains the #include files and some macros that are used by from the reference solution. L. Excluding code that provides more informative error messages and compares device numbers and and not just inode numbers when checking for loops, the reference solution is ~160 lines as measured by the command % cat fiend.c fiendlib.c | /c/cs323/bin/xLines which counts the number of lines remaining after deleting all braces, comments, and the keyword else, and then deleting all whitespace-only lines. Fine Points ~~~~~~~~~~~ 1. find uses execvp() rather than system() to execute shell commands. To achieve the same effect fiend would have to escape embedded blanks and shell metacharacters in the string passed to system(). For example, % ./fiend . -exec echo foo '>' bar \; writes the output from each invocation of echo to the file bar, while % find . -exec echo foo '>' bar \; writes "foo > bar" to stdout for each invocation. Explanation: bash removes the single quotes from '>'. fiend passes the string "echo foo > bar" to system, and the bash treats > as a redirection operator. find passes the array of strings ("echo", "foo", ">", "bar") to exec(), and exec() passes ">" to the echo command. You may ignore this point. 2. Using "cd .." in bash and using chdir("..") in your C program have different effects if you reached the current working directory via a symbolic link. In particular, "cd .." moves to the symbolic link's parent directory, while chdir("..") moves to the parent of the current directory. Since the latter is generally not what you want for this assignment, you may want to ignore that part of the directory scanning example in Matthew and Stones and avoid using chdir(). 3. If you use basename(), be aware that there are two different versions, the POSIX version and the GNU version (see man 3 basename). 4. You must submit a Makefile along with your code and log. (See the 8/31 reading in Matthew and Stones on make.) From the Style guide (Guidelines for grading programming style: 23) Programs must compile under "gcc -std=c99 -pedantic -Wall" without generating ANY warning messages. CS-323-08/28/20