COS 226 Programming Assignment

Bin packing

princetonshipping.com has just hired you as a consultant to help them optimally pack items of various weights into shipping crates. You are permitted to pack multiple items in the same crate, but union regulations prohibit you from packing more than 100 pounds of total weight to any one crate. Since your marginal cost is proportional to the total number of crates used, your task is to assign items to crates to use as few crates as possible. Admirably, you recall from COS 126 that this problem is NP-hard; thus it seems hopeless to find an efficient algorithm for finding the optimal packing. Instead, your goal is to design heuristics that run fast and produce high quality solutions.

We formulate the bin packing problem as follows:  given a set of N weights between 0.0 and 1.0, find a way to assign them to a minimal number of bins, each of capacity 1.0. Two intuitive heuristics immediately spring to mind. The worst-fit heuristic is to consider the weights in the order they are presented:  if the weight won't fit in any bin, create a new bin; otherwise place the weight in a bin that has the most remaining space. For example, this algorithm would put the weights 0.7 0.8 0.2 0.15 0.15 into three bins: {0.7, 0.2}, {0.8, 0.15}, {0.15}. The best-fit heuristic is identical, except that the weight is placed in a bin that has the least space among all bins with enough room to store the weight. This algorithm would put the same sequence of weights into two bins: {0.8, 0.2}, {0.7, 0.15, 0.15}.

Your task is to implement the two heuristics and compare their performance.

Perspective.   The bin packing problem is a fundamental problem for minimizing the consumption of a scarce resource, usually space. Applications include: packing the data for Internet phone calls into ATM packets, optimizing file storage on removable media, assigning commercials to station breaks, allocating computer memory, and minimizing the number of identical processors needed to process a set of tasks by a given deadline. In the latter example, the scarce resource is time instead of space. Cloth, paper, and sheet metal manufacturers use a two-dimensional version of the problem to optimally cut pieces of material (according to customer demand) from standard sized rectangular sheets. Shipping companies use a three-dimensional version to optimally pack containers or trucks.

Item data type.   To use the priority queue and symbol table ADTs, you need to specify and implement an appropriate Item data type to represent the bins. In addition to a list of the weights, this data type should contain a key field for use with a priority queue or symbol table ADT. The files item.h and item.c provide a sample interface and implementation of the Item data type for doubles. One of your challenges is to rewrite these two files to be useful in your bin packing applications. This will require some thought (and perhaps a few iterations on the problem). You must use the same Item data type in both of your heuristics.

Priority queue and symbol table ADTs.   For each of the two basic heuristics (worst-fit and best-fit), you will need to develop an efficient data structure that supports all of the basic operations needed to implement that heuristic. You are required to implement each of your heuristics using the client-implementation-interface paradigm.

For the worst-fit heuristic, create an interface pq.h that specifies which operations your data structure will support. You will certainly need insert and delete the maximum, so a priority queue seems like a judicious starting point. Create a file pq.c that implements these operations and a client program worstfit.c that uses these operations to perform the worst-fit heuristic. As a starting point, you may use this priority queue implementation and interface. They are adaptions of the heap code in Sedgewick Program 9.2. You are free to add functionality to the interface and implementation to support the client.

For the best-fit heuristic, create an interface st.h that specifies which operations you will need to support the best-fit heuristic. You will certainly need an insert operations. You will also need an efficient way to find the most full bin among those that still have enough room to store a given weight. Implementing this will require some careful thought and design. You might start by implementing a brute force solution, but you will quickly see that this approach is too slow to be useful in practice. Binary search trees provide more flexibility than priority queues, so as a starting point, consider using the following symbol table implementation and interface. They are adaptions of the randomized binary search tree code in Sedgewick Programs 13.2-13.4.

Input and output. Your client program will read in the set of weights (guaranteed to be between 0 and 1) from standard input. Your program should output the number of bins used by the heuristic and the total weight of all items (a lower bound on the number of bins required). If the number of items is less than 100, you should also print out the bins in decreasing order of remaining space. For each bin, print out the amount of remaining space and the items that it includes (in the order that they were inserted). For debugging, the output below also includes a unique id number for each bin that represents the order in which the bins were created.

% worstfit < input20.txt
total weight = 6.580996
total bins   = 8
     5 0.325755: 0.347661 0.326585 
     0 0.227744: 0.420713 0.351543 
     7 0.224009: 0.383972 0.392019 
     4 0.190811: 0.324387 0.484802 
     6 0.142051: 0.340190 0.263485 0.254274 
     3 0.116562: 0.347560 0.204065 0.331812 
     2 0.109806: 0.396295 0.230973 0.262926 
     1 0.082266: 0.311011 0.286309 0.320414 

Worst-fit decreasing and best-fit decreasing.   Experienced travelers know that if small items are packed last, they can fit snugly in the odd gaps in nearly filled luggage. This motivates a smarter strategy:  process the items from biggest to smallest. The worst-fit decreasing heuristic is to do worst-fit, but first preprocess the weights so that they are in descending order. The best-fit decreasing heuristic is define analogously.

Don't write any extra code to implement the worst- and best-fit decreasing heuristics. Instead, sort the input file in descending order using an independent sorting routine, then pipe the results through your best-fit heuristic. You can accomplish this in Unix with:

% gcc bestfit.c st.c item.c -o bestfit
% sort -r < input20.txt | bestfit
or from the Windows command prompt with:
C:\> lcc bestfit.c st.c item.c
C:\> sort /r < input20.txt | bestfit

Analysis.   Run your heuristics on a variety of inputs, with N = 100, 1,000, 10,000, 100,000, and 1,000,000. Do a few trials for each value of N using random weights between 0.0 and 1.0. In your writeup, comment on the relative effectiveness (number of bins used) of the four heuristics (best-fit, worst-fit, and the decreasing variants).


Extra Credit. Design and implement a heuristic that consistently beats best-fit decreasing for N =100,000.