CS520
Fall 2013
Program 5
Due Sunday November 3

Posix Threads

Program Specifications

Write a program that reads in a file full of words (the dictionary) and builds a hash table of all of the words. Next, the program should read in a second file containing text, and check each string in that text against the dictionary.

For every string in the text file that is not a word in the dictionary, your program should output a message saying that the string is not a word, in the format "Not a word: %s\n", where %s denotes the string and \n is a newline. For example:

lorelai:spellcheck chris$ cat wordfile
chris hello orange
lorelai:spellcheck chris$ cat dictionary
chris
lorelai:spellcheck chris$ ./spellcheck_debug dictionary wordfile 
Not a word: hello
Not a word: orange
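
As a rough illustration of the lookup and output format described above, here is a minimal sketch of a chained hash table in C. It is only one possible design, not the required one. The names NUM_BUCKETS, table_insert, table_contains, and check_word are made up for this example, and the hash function is the well-known djb2 string hash, chosen here only because it is short.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NUM_BUCKETS 1024   /* hypothetical size; tune it for the real dictionary */

struct node {
    char *word;
    struct node *next;
};

static struct node *buckets[NUM_BUCKETS];

/* djb2 string hash, reduced to a bucket index. */
static unsigned long hash_word(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h % NUM_BUCKETS;
}

/* Adds a copy of word to the table. Returns 0 on success, -1 if out of memory. */
int table_insert(const char *word)
{
    unsigned long i = hash_word(word);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->word = malloc(strlen(word) + 1);
    if (n->word == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->word, word);
    n->next = buckets[i];   /* push onto the front of the bucket's chain */
    buckets[i] = n;
    return 0;
}

/* Returns 1 if word is in the table, 0 otherwise. */
int table_contains(const char *word)
{
    const struct node *n;
    for (n = buckets[hash_word(word)]; n != NULL; n = n->next)
        if (strcmp(n->word, word) == 0)
            return 1;
    return 0;
}

/* Prints the required message for a non-word. Returns 1 if it printed, else 0. */
int check_word(const char *word)
{
    if (table_contains(word))
        return 0;
    printf("Not a word: %s\n", word);
    return 1;
}
```

With "chris" inserted, calling check_word on "hello" and "orange" prints the two lines shown in the example session above.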

Command Line Arguments

The program is to accept two command line arguments. The first command line argument is a file to be used as a dictionary. The second command line argument is a file to be searched for words/non-words. You should verify that the program has the correct number of command line arguments, and that the program's command line arguments are valid files.
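
One way to perform these checks is sketched below. This is an assumption about what "valid" means, not a statement of how your program will be graded: here a file is treated as valid if it exists, is a regular file, and can be opened for reading. The function name validate_args and the wording of the error messages are made up for this example.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Returns 0 if exactly two arguments were given and both name readable
 * regular files; otherwise prints a message to stderr and returns -1.
 */
int validate_args(int argc, char *argv[])
{
    int i;

    if (argc != 3) {
        fprintf(stderr, "usage: %s dictionary wordfile\n",
                argc > 0 ? argv[0] : "spellcheck");
        return -1;
    }
    for (i = 1; i < 3; i++) {
        struct stat st;
        FILE *f;

        if (stat(argv[i], &st) != 0) {
            perror(argv[i]);            /* e.g. the file does not exist */
            return -1;
        }
        if (!S_ISREG(st.st_mode)) {     /* reject directories, devices, etc. */
            fprintf(stderr, "%s: not a regular file\n", argv[i]);
            return -1;
        }
        f = fopen(argv[i], "r");
        if (f == NULL) {
            perror(argv[i]);            /* e.g. permission denied */
            return -1;
        }
        fclose(f);
    }
    return 0;
}
```

A main function could call validate_args(argc, argv) first and exit with a nonzero status if it returns -1.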

Input Format Specification

A "word" is a sequence of one or more consecutive non-space characters. A space character is any character for which the isspace function returns true. Valid input files will have exactly ONE space character separating consecutive words, which should help you parse the files.

You may assume a "word" will contain no more than 50 characters; any file with a "word" longer than 50 characters constitutes invalid input.
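
The word definition and length limit above can be sketched as a small reader function. This is only an illustration under the stated rules. The names MAX_WORD_LEN and next_word are made up for this example, and the sketch tolerates extra whitespace even though valid files will not contain any.

```c
#include <ctype.h>
#include <stdio.h>

#define MAX_WORD_LEN 50   /* the limit stated in the specification */

/*
 * Reads the next word from in into buf, which must hold MAX_WORD_LEN + 1 bytes.
 * Returns 1 if a word was read, 0 at end of file, and -1 if a word longer
 * than MAX_WORD_LEN characters (invalid input) was found.
 */
int next_word(FILE *in, char buf[MAX_WORD_LEN + 1])
{
    int c;
    size_t len = 0;

    /* Skip any whitespace before the word. */
    while ((c = getc(in)) != EOF && isspace(c))
        ;
    if (c == EOF)
        return 0;

    /* Copy characters until the next whitespace character or end of file. */
    do {
        if (len == MAX_WORD_LEN)
            return -1;
        buf[len++] = (char)c;
    } while ((c = getc(in)) != EOF && !isspace(c));

    buf[len] = '\0';
    return 1;
}
```

Calling next_word in a loop until it returns 0 visits every word in the file. A return value of -1 signals invalid input, and how to report that is left to you.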

Source Files

In order for me to test your program effectively, I will compile it using a variety of header files, which will control the number of threads used to do various tasks. I provide the following files:

  1. hashtable_constants.h: a header file containing constants governing the number of threads to use
  2. makefile: a starter makefile I used to build my program, which you may find helpful
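
To show how a compile-time thread count might drive the work, here is a minimal sketch that splits an array of words evenly among POSIX threads and counts the non-words. The constant name NUM_LOOKUP_THREADS is hypothetical; the real names are whatever hashtable_constants.h defines. The functions count_misses and demo_is_word are also made up, and demo_is_word is only a stand-in for a real hash table lookup. The sketch assumes the table is fully built before the threads start, so the lookups are read-only and need no locking.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#ifndef NUM_LOOKUP_THREADS
#define NUM_LOOKUP_THREADS 4   /* hypothetical; would come from hashtable_constants.h */
#endif

/* Work assigned to one thread: the half-open range [begin, end) of words. */
struct slice {
    const char **words;
    size_t begin, end;
    int (*is_word)(const char *);
    size_t misses;              /* result, written only by the owning thread */
};

/* Stand-in lookup for this sketch: only "chris" counts as a word. */
int demo_is_word(const char *w)
{
    return strcmp(w, "chris") == 0;
}

static void *check_slice(void *arg)
{
    struct slice *s = arg;
    size_t i;

    for (i = s->begin; i < s->end; i++)
        if (!s->is_word(s->words[i]))
            s->misses++;
    return NULL;
}

/* Returns the number of non-words, or (size_t)-1 if a thread could not start. */
size_t count_misses(const char **words, size_t n, int (*is_word)(const char *))
{
    pthread_t tids[NUM_LOOKUP_THREADS];
    struct slice slices[NUM_LOOKUP_THREADS];
    size_t total = 0;
    int t, started = 0, failed = 0;

    for (t = 0; t < NUM_LOOKUP_THREADS; t++) {
        slices[t].words = words;
        slices[t].begin = n * t / NUM_LOOKUP_THREADS;
        slices[t].end = n * (t + 1) / NUM_LOOKUP_THREADS;
        slices[t].is_word = is_word;
        slices[t].misses = 0;
        if (pthread_create(&tids[t], NULL, check_slice, &slices[t]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    /* Join every thread that started, then combine the per-thread counts. */
    for (t = 0; t < started; t++) {
        pthread_join(tids[t], NULL);
        total += slices[t].misses;
    }
    return failed ? (size_t)-1 : total;
}
```

Each thread writes only to its own slice, so no mutex is needed for the counters. Printing from several threads at once is a separate matter, since the order of the output lines would then depend on scheduling.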

Other Files

To assist you, I also provide two additional files. I make no guarantees about how correct or useful either file is.

  1. wordlist.txt: a file containing many English words, but not all of them.
  2. converter.py: a Python script that, I believe, reads an entire file into memory and then replaces every run of 1 or more consecutive whitespace characters with a single space character.

Building your program

I expect you to submit a makefile along with your code that will build your program. I have included a starter makefile that links to the pthread library for you.

Speed Contest

A portion of your grade will be based upon how well your program performs relative to that of your peers. The student whose program runs the fastest will receive 5 additional bonus points. Any student whose program's speed is within 5% of the fastest program will receive 4 points. Any student whose program is in the top 50% will receive an additional 1 point. Programs which produce incorrect output are ineligible to win the speed contest. For the speed contest, your program will be run with the constants supplied in your header file, which will allow you to experiment with optimizing the number of threads for each task.

Any student whose program is faster than my serial solution will receive 1 additional point. Any student whose program is faster than my parallel solution will receive 1 additional point.

These extra points will be applied independently of one another.

Timing stuff

If you want a quick and dirty way to time stuff, there is a program called "time" that can do that:
lorelai:cs520 chris$ time ls
classes		old_assts	p2		p4		scripts
midterm		p1		p3		p5		website

real	0m0.009s
user	0m0.002s
sys	0m0.003s

Grading

Remember that you may lose points if your program is not properly structured or adequately documented. Coding guidelines are given on the course overview webpage.

Your programs will be graded on agate.cs.unh.edu, so be sure to test in that environment.

Remember: as always you are expected to do your own work on this assignment. Copying code from another student or from sites on the internet is explicitly forbidden!

Submission

Your programs should be submitted for grading from agate.cs.unh.edu. To turn in this program, first make sure you are logged in to agate via SSH, then type:
% ~cs520/bin/DoSubmission.py prog5 file1 file2 file3 file4

This submission script is new. It has passed the testing I have done on it, but it may still have issues. If there are any problems, please contact me via email and I will do my best to assist you. If I cannot be reached, please send me a copy of your assignment via email, and we will deal with the submission script later.

Due Date

This assignment is due Sunday November 3. The standard late policy will be in effect.


Last modified: Sun Oct 20 03:23:24 EDT 2013