hw8-ttt.c, hw8-symtab.c, hw8-inv.c, hw8-Makefile.
These programs are to be submitted electronically using the submit program on csa. The code you submit should conform to the program assignment guidelines.
Implement hw8-ttt.h, which contains the following interface for a 2-3 tree.
/*
* File: ttt.h
* -----------
* This file provides an interface for a general 2-3 search
* tree facility that allows the client to maintain control of
* the structure of the node.
*/
#ifndef _ttt_h
#define _ttt_h
#include "genlib.h"
/* Types: clientDataT, keyT, clientBlockT
* --------------------------------------
* These are all void*, but the separate names emphasize the role
* each void* plays:
* keyT is used for keys.
* clientBlockT is used for the blocks of memory (allocated by the
* client) stored in the tree. These blocks must include both a key
* and the data associated with the key.
* clientDataT is used for any additional piece of data which the client
* might pass along to callback functions.
*/
typedef void *clientDataT, *keyT, *clientBlockT;
/*
* Type: tttADT
* ------------
* This is the abstract type for a 2-3 search tree.
*/
typedef struct tttCDT *tttADT;
/*
* Type: compFnT
* ------------
* This type defines the class of comparison functions.
* Both arguments to comparison functions are void*, but the following
* convention will be used: the first argument is always an "isolated" key,
* the second is the address of a client-allocated block of memory
* which contains a key. The two keys are compared and an integer
* is returned which is negative, 0, or positive depending on
* whether the first key is less than, equal to, or
* greater than the second.
*
*/
typedef int (*compFnT)(const keyT p1, const clientBlockT p2);
/*
* Type: nodeFnT
* -------------
* This type defines the class of callback functions for nodes.
*/
typedef void (*nodeFnT)(clientBlockT blockPtr, clientDataT clientData);
/*
* Function: NewTTT
* Usage: ttt = NewTTT(compFn);
* -------------------------------------------------------
* This function allocates and returns a new empty 2-3 search
* tree. The argument is a comparison function.
*/
tttADT NewTTT(compFnT compFn);
/*
* Function: FindTTTNode
* Usage: np = FindTTTNode(ttt, key);
* -----------------------------------
* This function applies the 2-3 search algorithm to find a
* particular key in the tree represented by ttt. The second
* argument represents the address of the key in the client
* space rather than the key itself, which makes it possible to
* use this package for keys that are not pointer types. If a
* node matching the key appears in the tree, FindTTTNode
* returns a pointer to it; if not, FindTTTNode returns NULL.
*/
clientBlockT FindTTTNode(tttADT ttt, keyT kp);
/*
* Function: InsertTTTNode
* Usage: InsertTTTNode(ttt, key, clientBlock);
* --------------------------------------------
* This function is used to insert a new client block into a 2-3 search
* tree. The ttt and key arguments are interpreted as they are
* in FindTTTNode. If the key already exists, the associated client
* block is overwritten. If the key is not found, then the clientBlock
* is added to the tree.
*/
void InsertTTTNode(tttADT ttt, keyT kp, clientBlockT clientBlock);
/*
* Function: MapTTT
* Usage: MapTTT(fn, ttt, order, clientData);
* ------------------------------------------
* This function calls fn on every node in the 2-3 search tree,
* passing it a pointer to a node and the clientData pointer. The
* type of traversal is given by the order argument, which must
* be one of the constants InOrder, PreOrder, or PostOrder.
*/
typedef enum { InOrder, PreOrder, PostOrder } traversalOrderT;
void MapTTT(nodeFnT fn, tttADT ttt, traversalOrderT order,
clientDataT clientData);
/*
* Note: it is possible to implement functions for deleting and freeing,
* but you are not required to do so. Also, note that this implementation
* may leak memory when an item is added which has a key which is already
* in the tree.
*/
#endif
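The comparison convention described in the interface (isolated key first, client block second) can be illustrated with a small sketch. The typedefs are copied locally so the fragment compiles on its own; recordT and CompareIdToRecord are hypothetical names chosen for illustration, and the integer key is an assumption, not something the assignment requires.

```c
#include <assert.h>

/* Local copies of the interface typedefs so this sketch stands alone. */
typedef void *clientDataT, *keyT, *clientBlockT;
typedef int (*compFnT)(const keyT p1, const clientBlockT p2);

/* Hypothetical client block: the key is stored inside the block,
 * next to the data associated with it. */
typedef struct {
    int id;             /* the key */
    const char *name;   /* data associated with the key */
} recordT;

/* p1 is the address of an isolated key (an int here);
 * p2 is the address of a client block that contains a key. */
int CompareIdToRecord(const keyT p1, const clientBlockT p2)
{
    const int *key = p1;
    const recordT *rec = p2;

    /* Negative, zero, or positive without risking integer overflow. */
    return (*key > rec->id) - (*key < rec->id);
}
```

A client would pass CompareIdToRecord to NewTTT and then hand the address of an int to FindTTTNode or InsertTTTNode as the key.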
The algorithm for inserting into a 2-3 tree is not particularly hard once you understand the process and see how to view it recursively. First, consider the process holistically, as if you could see the entire tree. Suppose it looked like this:
        (h|n)
       /  |  \
    (c)  (j)  (q|v)
Inserting f would be easy: it can join c and form a node with 2 data values.
The result is:
         (h|n)
        /  |  \
   (c|f)  (j)  (q|v)
But if we try to insert z, things are more interesting.
First, we try to insert into a leaf, but (q|v) is "full".
So we split the node into two nodes with one data value each, promoting
the "middle" data value. In this case, (q) and (z) are the new leaves,
and (v) gets promoted to (h|n), which is also "full", so again we split and
promote the middle (this time n).
Now the result is:
            (n)
          /     \
       (h)       (v)
      /   \     /   \
  (c|f)   (j) (q)   (z)
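The split step used twice above can be sketched in isolation. This is only an illustration of choosing the middle value: splitT and SplitFullNode are hypothetical names, and single characters stand in for the client blocks a real node would hold.

```c
#include <assert.h>

/* Result of splitting a full node: two one-value nodes plus the
 * middle value that is promoted to the parent. */
typedef struct {
    char left;     /* stays behind as the left one-value node */
    char middle;   /* promoted to the parent */
    char right;    /* stays behind as the right one-value node */
} splitT;

/* lo and hi are the two values already in the full node (lo < hi);
 * newKey is the value being inserted and is assumed to differ from both. */
splitT SplitFullNode(char lo, char hi, char newKey)
{
    splitT s;

    assert(lo < hi);
    if (newKey < lo) {
        s.left = newKey; s.middle = lo; s.right = hi;
    } else if (newKey < hi) {
        s.left = lo; s.middle = newKey; s.right = hi;
    } else {
        s.left = lo; s.middle = hi; s.right = newKey;
    }
    return s;
}
```

Calling SplitFullNode('q', 'v', 'z') yields the leaves q and z with v promoted, and SplitFullNode('h', 'n', 'v') then promotes n, matching the two splits in the example.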
The key to writing code for this is to understand a more local perspective,
namely the view from a single node of the tree. Suppose in the preceding
example you take the view of the node (h|n) while we insert z. The process
goes something like this:
     (v)
    /   \
  (q)   (z)

        (n)
      /     \
   (h)       (v)
   / \       / \
Notice I have left off the bottom of the tree. We will be using pointers,
and as far as the current node is concerned, it is irrelevant if (h) and (n)
are leaves or not.
So the process really boils down to two phases: down and up. On the way down, all we do is continue the search in the appropriate subtree; on the way up, we deal with promotion.
Well, almost. Unfortunately, we have ignored two extreme cases. Fortunately, they are not too bad to deal with. The first case should be obvious: we have no base case for the recursion! Clearly the base case has something to do with the leaves. It turns out to be easiest to consider the base case to be an empty tree (represented in C by a NULL). That is, we will not check to see if a node is a leaf; rather, we will deal with the situation when a NULL is being handled by the code.
What do we do with an empty tree? If the entire tree were empty, we would build a new root node. Even if the tree is not entirely empty, the idea is the same: we build a new node and return it to the caller as a promotion.
The other extreme case is at the root. What if the root node returns something to promote? This is the case in the example above. The root node (h|n) inserts into its right subtree and receives a promotion in return. The root is already full, so this produces another promotion which results in a new root for the tree. This will cause the CDT to need a new value for its root, namely the promoted tree.
Putting this all together, we see that the recursive insertion function will look something like:
treeT RecInsert(..., treeT tree, ...)
{
    if (tree == NULL) { return NewPromotedTree(...); }
    /* figure out which subtree to search using key comparisons */
    promotedNodePtr = RecInsert(...);
    if (promotedNodePtr == NULL) { return NULL; }
    /* deal with promotion (set links/data in this node and promoted node) */
    return promotedNodePtr; /* will be NULL if this node previously had 1 data value */
}
This will be called from a wrapper which will have something like the following in it:
    promotedNodePtr = RecInsert(..., ttt->root, ...);
    if (promotedNodePtr != NULL) { ttt->root = promotedNodePtr; }
The pseudo-code above is, of course, very sketchy and leaves out a lot of
details, but it does demonstrate the algorithm and gives a good outline from
which to start.
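To make the down/up structure concrete, here is one possible sketch of the recursion under simplifying assumptions: nodes store plain int keys rather than client blocks, there is no comparison callback, and a duplicate key is simply ignored instead of overwriting a block. The names nodeT, NewNode, Insert, and InOrder are hypothetical and are not part of the required interface; your hw8-ttt.c will need to adapt the idea to tttADT, keyT, and clientBlockT.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct nodeT {
    int nkeys;               /* 1 or 2 keys stored in this node */
    int key[2];
    struct nodeT *child[3];  /* all NULL in a leaf */
} nodeT, *treeT;

/* Builds a one-key node; also serves as the "promoted tree". */
treeT NewNode(int key, treeT left, treeT right)
{
    treeT n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->nkeys = 1;
    n->key[0] = key;
    n->child[0] = left;
    n->child[1] = right;
    n->child[2] = NULL;
    return n;
}

/* Returns a promoted one-key node, or NULL if nothing was promoted. */
treeT RecInsert(treeT tree, int key)
{
    int k[3], i = 0, j;
    treeT c[4], p;

    if (tree == NULL) return NewNode(key, NULL, NULL);   /* base case */

    /* Way down: pick the subtree using key comparisons. */
    while (i < tree->nkeys && key > tree->key[i]) i++;
    if (i < tree->nkeys && key == tree->key[i]) return NULL;  /* duplicate */
    p = RecInsert(tree->child[i], key);
    if (p == NULL) return NULL;

    /* Way up: merge the promoted key and its two subtrees into slot i. */
    for (j = 0; j < i; j++) { k[j] = tree->key[j]; c[j] = tree->child[j]; }
    k[i] = p->key[0]; c[i] = p->child[0]; c[i + 1] = p->child[1];
    for (j = i; j < tree->nkeys; j++) {
        k[j + 1] = tree->key[j];
        c[j + 2] = tree->child[j + 1];
    }

    tree->key[0] = k[0]; tree->child[0] = c[0]; tree->child[1] = c[1];
    if (tree->nkeys == 1) {          /* room here: absorb the promotion */
        tree->key[1] = k[1]; tree->child[2] = c[2];
        tree->nkeys = 2;
        free(p);
        return NULL;
    }
    /* Full node: keep k[0] here, move k[2] to a new node, promote k[1]. */
    tree->nkeys = 1; tree->child[2] = NULL;
    p->key[0] = k[1];
    p->child[0] = tree;
    p->child[1] = NewNode(k[2], c[2], c[3]);
    return p;
}

/* Wrapper: a promotion out of the root becomes the new root. */
void Insert(treeT *root, int key)
{
    treeT p = RecInsert(*root, key);
    if (p != NULL) *root = p;
}

/* In-order walk that appends every key to out[] in sorted order. */
void InOrder(treeT t, int *out, int *n)
{
    if (t == NULL) return;
    InOrder(t->child[0], out, n);
    out[(*n)++] = t->key[0];
    InOrder(t->child[1], out, n);
    if (t->nkeys == 2) { out[(*n)++] = t->key[1]; InOrder(t->child[2], out, n); }
}
```

Inserting the characters c, h, j, n, q, v, f, z in that order with this sketch reproduces the example above, ending with n at the root. Reusing the promoted node as the new parent during a split is just one design choice; allocating a fresh node there would work equally well.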
Be sure to look at the information on the lab web pages as well.
Rob has put together some
code for 2-3 trees which should be very helpful in completing this
assignment.
Your program hw7-inv.c should still work with this new implementation of a symbol table, except that there is no delete now, since I am not requiring you to delete from 2-3 trees. In fact, it will have the nice property that the lists will now be in alphabetical order instead of "random" order. Make a copy of hw7-inv.c and call it hw8-inv.c. Modify it to work with hw8-symtab.h. (If it didn't work quite right last time, fix it.)
Do not allow your work to be used by others:
Warning: If someone cheats by using your work, you will also be penalized.