Profs. Dawn Lawrie and
Dave Binkley
are working on a project with the aim of splitting and expanding identifiers
to aid program comprehension and search. For example, if a programmer wants
to find the parts of a program where the radius of a sphere is computed
or used, they might not know that the code uses the
identifier sphrad
for that value and so would not know what to
search for. Profs. Lawrie and Binkley use nearby comments in order
to determine how to split
and expand identifiers, so that if sphrad
appears near the
comment "compute the radius of the sphere" in one place, sphrad
would match the search "sphere radius" everywhere.
You will develop a tool to extract comments from C source code as the first part of this analysis (we will leave extraction of identifiers and all subsequent steps to someone else).
Write a program called Comments
that reads C source code from
standard input and writes the text of the comments to standard output.
The text of a comment does not include the delimiters
/*
and */
at the beginning or end of a C-style
comment, or the marker //
that starts a C++-style
comment. In both cases, leading whitespace and asterisks (*) on each line
should not be output; internal and trailing whitespace should be output as is,
and each non-empty comment should be output with a newline character at the end
("non-empty" means that the comment contains something other than whitespace,
asterisks, and tags).
Furthermore, doxygen/javadoc style tags
– any word within a comment that starts with an at symbol (@) –
should not be output regardless of what type of
comment it is in and whether it is actually a valid tag (so don't output
something like @DrGlennNFA even though that is not a tag
used by either tool).
Additionally, note that a backslash (\) at the end of a line immediately
before a newline character is a line-continuation character
and should be ignored along with the following newline; for all
purposes consider those two characters to not appear in the input.
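One way to honor the line-continuation rule is to hide it behind a single input function, so that the rest of the program never sees a backslash-newline pair. The following is a minimal sketch under that assumption; the name next_logical_char is hypothetical and not part of the assignment.

```c
#include <stdio.h>

/* Hypothetical helper (not prescribed by the assignment): returns the
   next character from in with every backslash-newline pair removed,
   or EOF at end of input. */
int next_logical_char(FILE *in)
{
    int c = fgetc(in);
    while (c == '\\') {
        int next = fgetc(in);
        if (next == '\n') {
            /* Line continuation: drop both characters and keep reading. */
            c = fgetc(in);
        } else {
            /* Ordinary backslash: push the lookahead back for later. */
            if (next != EOF) {
                ungetc(next, in);
            }
            return '\\';
        }
    }
    return c;
}
```

The sketch relies on ungetc, which the C standard guarantees for one character of pushback; it stores only two int variables, so it does not grow with the input.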
Other things to consider:
/*
and //
do not mark the beginning of a comment
Your program must use only a constant amount of space (in other words, the amount
of memory used must not vary with the amount of input read). So, for example,
you may not use arrays or call malloc.
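A common way to stay within constant space is to process the input one character at a time and remember only a small state value. The sketch below illustrates the idea for recognizing where comments begin and end; the State type, its member names, and the step function are hypothetical, and the sketch deliberately ignores output formatting, tags, and any context in which /* and // do not start a comment.

```c
/* Hypothetical state names; the assignment does not prescribe any. */
typedef enum {
    CODE,        /* outside any comment */
    SLASH,       /* just read a '/' that might start a comment */
    BLOCK,       /* inside a C-style comment */
    BLOCK_STAR,  /* inside a C-style comment, just read a '*' */
    LINE         /* inside a C++-style comment */
} State;

/* Returns the state reached from s after reading character c. */
State step(State s, int c)
{
    switch (s) {
    case CODE:
        return c == '/' ? SLASH : CODE;
    case SLASH:
        if (c == '*') return BLOCK;
        if (c == '/') return LINE;
        return CODE;
    case BLOCK:
        return c == '*' ? BLOCK_STAR : BLOCK;
    case BLOCK_STAR:
        if (c == '/') return CODE;
        return c == '*' ? BLOCK_STAR : BLOCK;
    case LINE:
        return c == '\n' ? CODE : LINE;
    }
    return s;
}
```

A driver would call step once per character read from standard input, keeping a single State variable; no array or malloc call is needed because nothing about earlier input is stored beyond that one value.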
Your submission must include a makefile that produces an executable file
named Comments
when make
is run with no arguments.
If input_0.txt contains

    /* This is a single-line C comment */
    #include <stdio.h>

    /******
     * This is a nicely formatted
     * multi-line comment.
     ******/

    int main(int argc, char **argv)
    {
      // This is a C++ comment.
    }

then the execution of the program would be as follows.

    $ ./Comments < input_0.txt
    This is a single-line C comment
    This is a nicely formatted
    multi-line comment.
    This is a C++ comment.