BU CAS CS 113
Introduction to Computer Science II with Intensive C
Fall 1997


Assignment 5: General Sorting and Searching


Last Modified: Tue Oct 28 13:29:11 1997

Deadline

October 30, 1997

What to Submit

You should submit five files. Be sure the files you submit have exactly the following names: hw5-types1.h, hw5-types1.c, hw5-types2.h, hw5-types2.c, hw5-sort.c. As you will see below, most of your task will be to make small changes to code that appears on this page or in the textbook, which you can get online.

These programs are to be electronically submitted by using the submit program on csa. The code you submit should conform with the program assignment guidelines.

Assignment

A general sorting algorithm is one which makes all of its sorting decisions based solely on comparing two elements in the array, and not on any additional properties of the data elements. The algorithms presented in class and in the text are of this type, although we only implemented them for arrays of int.

There are at least two ways to implement general sorting algorithms so that a single package can be used to sort arrays of different types. One method is to write general sorting methods for arrays of void* and have the user of the package keep track of the pointer work and pass a function variable which tells how to compare (we will learn how to do something like this later). In fact, C has a built-in quicksort function, qsort(), which works this way.
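As a preview of the void* approach, here is a minimal sketch of calling the standard library function qsort() from <stdlib.h> on an array of int. The names CompareInts and SortInts are made up for this illustration and are not part of the assignment.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Comparison callback in the form qsort() expects: it receives pointers
 * to two array elements as const void* and returns a negative, zero, or
 * positive int.
 */
static int CompareInts(const void *p1, const void *p2)
{
    int a = *(const int *)p1;
    int b = *(const int *)p2;

    return (a > b) - (a < b);   /* sign only; a - b could overflow */
}

/* Hypothetical wrapper: sorts n ints in increasing order. */
void SortInts(int array[], size_t n)
{
    assert(array != NULL || n == 0);
    qsort(array, n, sizeof array[0], CompareInts);
}
```

Calling SortInts(data, 5) on an array holding 42, 7, 19, 3, 25 leaves it in increasing order. Note that the caller, not the sorting code, supplies both the element size and the comparison function.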

A second way is to have the user of the package provide the package with the type of thing stored in the array and a compare function for that type. This method is a bit easier for us at the moment, although it has two drawbacks: you can only sort one kind of array in a given program, and the client becomes responsible for part of the implementation in a way that is not convenient to do in C. Nevertheless, for simplicity this will be our method.

Our method for this will be to have TWO interfaces. hw5-types will contain the information about the type of thing stored in the array and how to compare them. hw5-sort will do the sorting based on this. Note that to write a program using this method, there are essentially two roles (think of two programmers), each one being a client of the other's implementation. To write a program which can do sorting, each one is responsible for the files listed below:

	Programmer 1			Programmer 2
	(program using sorts)		(sorting algorithms)
	-----------			------------
 	main.c				sort.c
	types.h				sort.h
	types.c

  1. hw5-types1.h and hw5-types2.h

    These should look like the following example for the type double. Create two files (1 for int, 2 for string). You should only need to edit one line in the header file below for each of them.

    
    #ifndef _hw5_types_h
    #define _hw5_types_h
    #include <stdio.h>
    
    /* edit the line below to determine the type of array elements */
    typedef double elementT;
    
    /*
     * Function: Compare
     * Usage: result = Compare(e1, e2);
     * -----------------------------------
     * This function determines the ordering of things of type elementT.
     *
     * Returns a value of 
     * 	0 		if e1 and e2 are equivalent  
     *   a negative int 	if e1 precedes e2
     *   a positive int 	if e1 follows e2
     */
    int Compare(elementT e1, elementT e2);
    
    /*
     * Functions: FPrintElement, FScanElement
     * Usage: FPrintElement(file, element);  
     * result = FScanElement(file, &element); 
     * -----------------------------------------------------------------
     * These mimic fprintf() and fscanf() so that client code can be written to
     * read and write elements independently of what type they are.
     *
     */
    int FPrintElement(FILE* file, elementT e);
    int FScanElement(FILE* file, elementT *ePtr);
    
    #endif
    
  2. hw5-types1.c and hw5-types2.c

    Implement each of the interfaces above.

    In each of hw5-types1.c (the one for integers) and hw5-types2.c (the one for strings) the body of Compare() can be written in just one line. (Hint: for file 1, think subtraction; for file 2, think libraries.)

    Use fprintf() and fscanf() for FPrintElement() and FScanElement(). For strings, be sure to allocate memory for the string read. You may assume a string will not be longer than 100 characters. Each of these functions returns a value (although we often ignore it), so call these using something like

    return ( fprintf( .... ) );
    

    To make use of these, we will copy one of them to hw5-types.h and hw5-types.c, so it can be used with hw5-sort.h below.

    Here is an example for the type double:

    #include "hw5-types.h"	/* notice this is hw5-types for all versions */
    
    int Compare(elementT e1, elementT e2)
    {
      /* e1 - e2 would be truncated to int, so 0.5 and 0.2 would compare equal */
      return (e1 > e2) - (e1 < e2);
    }
    
    
    int FPrintElement(FILE* file, elementT e)
    {
      return( fprintf( file, "%f", e) ) ;
    }
    
    int FScanElement(FILE* file, elementT *ePtr)
    {
      return( fscanf( file, "%lf", ePtr) );
    }
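
    A note on Compare() for floating-point elements: returning a difference from a function whose return type is int converts the result toward zero, so two doubles that differ by less than 1 look equal. The sketch below contrasts that with a sign-based comparison. The names CompareByDifference, CompareBySign and CheckCompare are made up for this illustration.

```c
#include <assert.h>

typedef double elementT;

/* Truncating difference: shown only to illustrate the pitfall. */
int CompareByDifference(elementT e1, elementT e2)
{
    return (int)(e1 - e2);
}

/* Sign-based comparison: yields -1, 0, or 1 and never truncates. */
int CompareBySign(elementT e1, elementT e2)
{
    return (e1 > e2) - (e1 < e2);
}

/* Small self-check contrasting the two versions. */
void CheckCompare(void)
{
    assert(CompareBySign(0.5, 0.2) > 0);        /* 0.5 follows 0.2 */
    assert(CompareByDifference(0.5, 0.2) == 0); /* wrongly reported as equal */
}
```

    For int elements the difference is already an int, which is why the subtraction hint applies there.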
    
    

  3. hw5-sort.c

    Implement the following interface, hw5-sort.h:

    #include "hw5-types.h"  /* for types in arrays and compare function */
    
    /*
     * Function: NumOfSorts
     * Usage: n = NumOfSorts();
     * ---------------------------
     * Returns the number of sorts implemented in SortName and Sort
     */
    
    int NumOfSorts(void);
    
    /*
     * Function: SortName
     * Usage: name = SortName(sortNum);
     * -----------------------------------
     * if 0 <= sortNum < NumOfSorts(), then SortName returns the name of 
     * sort number sortNum.  Else it returns the value "No Sort Performed".
     * Note: counting starts at 0 as per usual C conventions.
     */
    
    char* SortName(int sort);
    
    /*
     * Function: Sort
     * Usage: Sort(sortNumber, array, size);
     * ----------------------------------------
     * If 0 <= sortNumber < NumOfSorts(), then sorts the array, which has size
     * elements of type elementT.  Else it returns without doing anything.
     */
    
    void Sort(int sort, elementT array[], int size);
    
    /*
     * Function: SearchSortedArray
     * Usage: location = SearchSortedArray(array, size, key);
     * ----------------------------------------
     * This is an O(log n) search.  array must be sorted prior to calling.
     * The return value is the index of some array element which matches key
     * or -1 if key is not in the array.
     */
    
    int SearchSortedArray(elementT array[], int size, elementT key);
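
    The O(log n) requirement on SearchSortedArray suggests a binary search. The following is one possible sketch, not the required implementation. It assumes elementT is double and defines a local Compare() standing in for the one from hw5-types.h.

```c
#include <assert.h>

typedef double elementT;    /* assumption: stands in for hw5-types.h */

/* Stand-in for the Compare() declared in hw5-types.h. */
static int Compare(elementT e1, elementT e2)
{
    return (e1 > e2) - (e1 < e2);
}

/* Binary search: halves the range [low, high] on every iteration. */
int SearchSortedArray(elementT array[], int size, elementT key)
{
    int low = 0;
    int high = size - 1;

    assert(size >= 0);
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* avoids overflow of low + high */
        int cmp = Compare(key, array[mid]);

        if (cmp == 0)
            return mid;         /* found a matching element */
        else if (cmp < 0)
            high = mid - 1;     /* key precedes array[mid]: search left half */
        else
            low = mid + 1;      /* key follows array[mid]: search right half */
    }
    return -1;                  /* range is empty: key is not in the array */
}
```

    Because every decision goes through Compare(), the same code works unchanged when elementT is redefined as another type.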
    

    For full credit you must implement at least three sorts:

    There is information about all of these in the exercises of the book on pages 319 - 322. See exercises 2, 8 and 9 especially. You may implement more sorts if you wish. Doing so may get you a few extra points.

    Much of the code you need is available. You will need to make some changes to the types (from int to elementT) and use your Compare() functions to do the comparing instead of <, <=, etc. Insertion sort is the only one you need to do from scratch.

    I have written a program which will use your sorting package and test the sorts. It will display information about correctness and efficiency. You should run this with your submissions both so that you can see how your sorts compare and because that is how I will grade it. If you use the Makefile for this assignment, the executables for this will be made and called hw5-itest and hw5-stest. To see it run on some large data files, type

    make test
    (Don't copy the data files to your own directory. We need to conserve space on csa.)

    You should see output something like:

    
     Loading data from file: data.random . . . done. 
    
     Loading data from file: data.increasing . . . done. 
    
     Loading data from file: data.decreasing . . . done. 
    
    EFFICIENCY TESTING
    
    SelectionSort
    =============
      Sort: SelectionSort  File: data.random
       Sorted?       Size  Time(msec)    n^2/Time     n log n/Time
             S        100          10     1000.00           46.052
             S        200          50      800.00           21.193
             S        400         150     1066.67           15.977
             S        800         540     1185.19            9.903
             S       1600        2110     1213.27            5.595
    
    
      Sort: SelectionSort  File: data.increasing
       Sorted?       Size  Time(msec)    n^2/Time     n log n/Time
             S        100          10     1000.00           46.052
             S        200          50      800.00           21.193
             S        400         150     1066.67           15.977
             S        800         560     1142.86            9.549
             S       1600        2180     1174.31            5.415
    
    
      Sort: SelectionSort  File: data.decreasing
       Sorted?       Size  Time(msec)    n^2/Time     n log n/Time
             S        100          10     1000.00           46.052
             S        200          50      800.00           21.193
             S        400         160     1000.00           14.979
             S        800         550     1163.64            9.723
             S       1600        2170     1179.72            5.440
    
    
    BubbleSort
    ==========
      Sort: BubbleSort  File: data.random
       Sorted?       Size  Time(msec)    n^2/Time     n log n/Time
             S        100          20      500.00           23.026
             S        200         100      400.00           10.597
             S        400         360      444.44            6.657
             S        800        1390      460.43            3.847
    
    
      Sort: BubbleSort  File: data.increasing
       Sorted?       Size  Time(msec)    n^2/Time     n log n/Time
             S        800          10    64000.00          534.769
             S       3200          10  1024000.00         2582.690
             S       6400          10  4096000.00         5608.994
             S      12800          30  5461333.33         4035.072
             S      25600          40  16384000.00         6496.222
             S      51200          90  -18594747.73         6168.744
    
    
      Sort: BubbleSort  File: data.decreasing
       Sorted?       Size  Time(msec)    n^2/Time     n log n/Time
             S        100          20      500.00           23.026
             S        200         130      307.69            8.151
             S        400         480      333.33            4.993
             S        800        1840      347.83            2.906
    
    
    
    ACCURACY TESTING
    
    SelectionSort:
       data.random: 	2749	4086	5627	5758	7419	10113
    		12767	16212	16838	17515	23010	31051
    
       data.increasing: 	0	0	2	2	4	4
    		6	6	8	9	10	11
    
       data.decreasing: 	9	10	11	12	13	14
    		15	15	16	16	17	18
    
    BubbleSort:
       data.random: 	2749	4086	5627	5758	7419	10113
    		12767	16212	16838	17515	23010	31051
    
       data.increasing: 	0	0	2	2	4	4
    		6	6	8	9	10	11
    
       data.decreasing: 	9	10	11	12	13	14
    		15	15	16	16	17	18
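
    A note on reading the ratio columns: for a quadratic sort, n^2/Time should stay roughly constant as the size grows, which is what the SelectionSort tables show. The n log n/Time column matches n ln(n)/Time, for example 100 x ln(100) / 10 is about 46.052. The negative n^2/Time entry for size 51200 appears consistent with n * n being computed in a 32-bit int: 51200^2 = 2,621,440,000 does not fit, and the wrapped value -1,673,527,296 divided by 90 gives -18594747.73. How the test program actually computes the column is an assumption here. The sketch below, with the made-up name QuadraticRatio, shows a version that avoids the overflow by converting to double first.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns n^2 / msec.  Converting n to double
 * before multiplying keeps the product from overflowing an int.
 */
double QuadraticRatio(int n, int msec)
{
    assert(msec > 0);
    return ((double)n * (double)n) / msec;
}
```

    With this version, QuadraticRatio(51200, 90) is positive, roughly 29127111.11.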
    

    You can run it on your own (smaller) data files by typing

    		hw5-itest datafiles
    
    or
    		hw5-stest datafiles
    
    where datafiles is a list of data files separated by spaces. Be sure the files contain the correct kind of data.

    There is also hw5-dtest which works on data of type double. You can use that to see if your sorts are correct independent of your types.h and types.c stuff.

    Makefile

    To check that your assignment compiles correctly, use the Makefile for this assignment:
    1. Copy this Makefile into the directory where your programs are.
    2. Name it Makefile.
      • Note: If you named it Makefile.a5, you will have to use the mv (short for move) command to rename it by typing
        mv Makefile.a5 Makefile
        or the cp command to copy it by typing
        cp Makefile.a5 Makefile
    3. Type: make -k (The -k means "keep going", which instructs make to build as much as it can, rather than quit after the first error.)

    Academic Honesty and Collaboration

    It is reasonable to discuss with others possible general approaches to problems. It is unreasonable to work together on a detailed solution, to copy a solution, or to give away a solution. If your common discussion can be detected by looking at the solutions, then there is too much collaboration. Such instances of academic dishonesty may result in a course grade of F or expulsion from Boston University.

    Do not allow your work to be used by others:

    Warning: If someone cheats by using your work, you will also be penalized.