The purpose of this assignment is to help you learn about ARMv8 computer architecture, ARMv8 assembly language programming, and testing strategies. It also will give you the opportunity to learn more about the GNU/Linux programming tools, especially bash
, emacs
, gcc
, gdb
, and gprof
.
The assignment consists of two parts, each of which has subparts. We encourage you to complete Part 1 during the first half of the assginment period.
A similar assignment was given in past semesters. Since the switch to the ARMv8 architecture and assembly language, students reported taking, on average, 21.5 hours to complete the assignment — certainly a serious undertaking! (But in the recent terms for which this was a partnered assignment, the average has been marginally lower: approximately 19 hours, counting both partnered and individual submissions.)
You may work with one partner on this assignment. You need not work with a partner, but we strongly prefer that you do. Your preceptor will help you with your partner search, if you so desire.
Part 2f, as defined below, is the "challenge" part of this assignment. While doing the challenge part of the assignment, you are bound to observe the course policies regarding assignment conduct as given in the course Policies web page, plus one additional policy: you may not use any "human" sources of information. That is, you may not consult with the course's staff members, the Intro Lab TAs, other current students via Ed, or any other people while working on the challenge part of an assignment, except for clarification of requirements.
The challenge part is worth 10 percent of this assignment. So if you don't do any of the challenge part and all other parts of your assignment solution are perfect and submitted on time, then your grade for the assignment will be 90 percent.
As you have been doing thus far in the course, you may write code in whatever development environment and on whatever computer you see fit. For this assignment it is necessary to do all testing, debugging, and timing on armlab, however.
As with the other assignments, you will create your own repository, separate from ours, that has the same contents as ours. As a reminder, you can complete Setup Step 5 from the Git and GitHub Primer or follow these procedures to import your own version of our contents into your GitHub account.
The repository that you will import into your own
is:
https://github.com/COS217/Assembly.
This repository contains files that you will find useful. Subsequent parts of this document describe them. Once you have your working copy, complete the parts of the assignment given below, in the order of their appearance.
Do not use a C compiler to produce any of your assembly language code. Do not reference assembly code generated by a C compiler for the programs that you must produce. Doing so would be considered academically dishonest. Instead produce your assembly language code manually.
We encourage you to develop "flattened" C code (as described in lectures and precepts) to bridge the gap between the given "normal" C code and your assembly language code. Using flattened C code as a bridge can eliminate logic errors from your assembly language code, leaving only the possibility of translation errors.
We also encourage you to use your flattened C code as comments in your assembly language code. Such comments can clarify your assembly language code substantially.
The GNU toolset contains a program named wc
(word count). In its simplest form, wc
reads characters from stdin
until end-of-file, and writes to stdout
a count of how many lines, words, and characters it has read. A word is a sequence of characters that is delimited by one or more white space characters.
Consider some examples. In the following, a space is shown as s
and a newline character as n
.
If the file named proverb
contains these characters:
Learningsissan treasureswhichn accompaniessitsn ownerseverywhere.n --sChinesesproverbn
then the command:
$ wc < proverb
writes this line to stdout
:
5 12 82
If the file proverb2
contains these characters:
Learningsissan treasureswhichn accompaniessitsn ownerseverywhere.n --sssChinesesproverb
(note that the last "line" does not end with a newline character) then the command:
$ wc < proverb2
writes this line to stdout
:
4 12 83
The given mywc.c
file contains a C program that implements the subset of the wc
command described above. Translate that program into ARMv8 assembly language, thus creating a file named mywc.s
. Your mywc.s
program must be an accurate translation of mywc.c
: it must behave exactly the same as the given C program does, i.e., it must write exactly the same characters to stdout
, interact with memory in the way that we have described that an unoptimized C program would do, etc.
One complication with this is that the given mywc.c
program uses global variables, and so your mywc.s
must use global variables too. As described in precept, where a global variable explicitly initialized to a value of all zero bits is placed in memory is compiler-dependent behavior -- it might be placed with any other initialized variables in the DATA section or with any uninitialized variables that will assume the value zero at runtime in the BSS section. Although the compiler in our gcc217
environment would put these variables in the BSS section, for this assignment the mywc.s
program you write will instead use the DATA section for any global variables that were explicitly initialized to zero.
Compose data files that, when read by your mywc.s
program, perform boundary tests, statement tests, and stress tests of that program. Name each test file such that its prefix is mywc
and its suffix is .txt
. Thus, the command ls mywc*.txt
must display the names of all test files, and only those files.
Because the logic implemented by your mywc.s
program must be identical to the logic implemented by the given mywc.c
program, let's assume that any test file that boundary/statement/stress tests mywc.c
also boundary/statement/stress tests mywc.s
. (This means you could do Part 1b before Part 1a, if you'd like. In fact, it might even be a good idea!)
Describe your test files in your readme
file. Your descriptions must have this structure:
Boundary tests:
mywcXXX.txt
: Description of the characteristics of that file, and how it tests boundary conditions of the mywc.c
and mywc.s
programs.
mywcYYY.txt
: Description of the characteristics of that file, and how it tests boundary conditions of the mywc.c
and mywc.s
programs.
...
Statement tests:
mywcXXX.txt
: Description of the characteristics of that file, and which lines of the mywc.c
and mywc.s
programs it causes the computer to execute.
mywcYYY.txt
: Description of the characteristics of that file, and which lines of the mywc.c
and mywc.s
programs it causes the computer to execute.
...
Your description of each test file must be of the form "This file contains such-and-such characteristics, and so causes the computer to execute lines x, y, ..., and z of the given mywc.c
" (where x, y, ..., and z are line numbers). Collectively your statement test files should cause the computer to execute all lines of the given mywc.c
file (and so, by assumption, all lines of your mywc.s
file).
Stress tests:
mywcXXX.txt
: Description of the characteristics of that file, and how it stress tests the mywc.c
and mywc.s
programs.
mywcYYY.txt
: Description of the characteristics of that file, and how it stress tests the mywc.c
and mywc.s
programs.
...
In a more realistic context, some of your test files should contain character codes that are not valid in ASCII. However, in the context of this assignment, submit test files that contain character codes for only ASCII printable characters. Specifically, make sure your computer-generated test files contain only the character codes (in hexadecimal) 09, 0A, and 20 through 7E (inclusive). It would be difficult for your grader to examine files that contain other character codes.
You may submit as many test files as you want. However at most three of your test files may be large, and a large test file must contain no more than 50000 characters and no more than 1000 lines. It would be difficult for your grader to scroll through a test file that exceeds those limits.
To test your mywc.s
program, make sure that it writes the same output as mywc.c
does when given each your test files as input. The given testmywc
and testmywcdiff
are Bash shell scripts that automate your testing. Comments at the beginning of those files describe how to use them.
Many programming environments contain modules to handle high-precision integer arithmetic. For example, the Java Development Kit (JDK) contains the BigDecimal
and BigInteger
classes.
The Fibonacci numbers are used often in computer science. See http://en.wikipedia.org/wiki/Fibonacci_number for some background information. Note in particular that Fibonacci numbers can be very large integers.
This part of the assignment asks you to use assembly language to compose a minimal but fast high-precision integer arithmetic module, and use it to compute large Fibonacci numbers.
BigInt
Objects Using C CodeSuppose you must compute Fibonacci number 250000, that is, fib(250000)
...
You are given a C program that computes Fibonacci numbers. It consists of two modules: a client and a BigInt
ADT.
The client consists of the file fib.c
. The client accepts an integer x
as a command-line argument, where x
must be a non-negative integer. The client computes and writes fib(x)
to stdout
as a hexadecimal number. Then it writes to stderr
the amount of CPU time consumed while performing the computation. Finally the client performs some boundary condition and stress tests, writing the results to stdout
. The client module delegates most of the work to BigInt
objects.
The BigInt
ADT performs high precision integer arithmetic. It is a minimal ADT; essentially it implements only an "add" operation. The BigInt
ADT consists of four files:
bigint.h
is the interface. Note that the ADT makes eight functions available to clients: BigInt_new
, BigInt_free
, BigInt_assignFromHexString
, BigInt_largest
, BigInt_random
, BigInt_writeHex
, BigInt_writeHexAbbrev
, and BigInt_add
.
bigint.c
contains implementations of the BigInt_new
, BigInt_free
, BigInt_assignFromHexString
, BigInt_largest
, BigInt_random
, BigInt_writeHex
, and BigInt_writeHexAbbrev
functions.
bigintadd.c
contains an implementation of the BigInt_add
function.
bigintprivate.h
is a private header file — private in the sense that clients never use it. It allows code to be shared between the two implementation files, bigint.c
and bigintadd.c
.
Study the given code. Then build a fib
program consisting of the files fib.c
, bigint.c
, and bigintadd.c
, with no optimizations (that is, without the -D NDEBUG
option and without the -O
option, as described below). Run the program to compute fib(250000)
. In your readme
file note the amount of CPU time consumed.
BigInt
Objects Using C Code Built with Compiler OptimizationSuppose you decide that the amount of CPU time consumed is unacceptably large. You decide to instruct the compiler to optimize the code that it produces...
Build the fib
program using optimization. Specifically, build with the -D NDEBUG
option so the preprocessor disables the assert
macro, and with the -O
(that's an upper case letter, not a digit) option so the compiler generates optimized code. Run the resulting program to compute fib(250000)
. In your readme
file note the amount of CPU time consumed.
Suppose you decide that the amount of CPU time consumed still is too large. You decide to investigate by doing a gprof
analysis to determine which functions are consuming the most time...
Perform a gprof
analysis of the executable binary file from Part 2b. Save the textual report in a file named performance
. Don't delete the file; as described later in this document, you must submit that file.
BigInt
Objects Using Assembly Language CodeSuppose, not surprisingly, your gprof
analysis shows that most CPU time is spent executing the BigInt_add
function. In an attempt to gain speed, you decide to code the BigInt_add
function manually in assembly language...
Manually translate the C code in the bigintadd.c
file into ARMv8 assembly language, thus creating the file bigintadd.s
. Do not translate the code in other files into assembly language.
Your assembly language code must store all parameters and local variables defined in the BigInt_larger
and BigInt_add
functions in memory, on the stack.
Note that assert
is a parameterized macro, not a function. (See Section 14.3 of the King book for a description of parameterized macros.) So assembly language code cannot call assert
. When translating bigintadd.c
to assembly language, simply pretend that the calls of assert
are not in the C code.
Use .equ
directives to avoid "magic numbers." In particular, in bigintadd.s
use .equ
directives to give meaningful names to:
Enumerated constants, for example: .equ TRUE, 1
.
Parameter stack offsets, for example: .equ OADDEND1, 48
.
Local variable stack offsets, for example: .equ ULCARRY, 24
.
Structure field offsets, for example: .equ LLENGTH, 0
.
Build a fib
program consisting of the files fib.c
, bigint.c
, and bigintadd.s
using the -D NDEBUG
and -O
options. Run the program to compute fib(x)
for various values of x
, and make sure it writes the same output to stdout
as the program built from C code does.
It's error prone to compare the output of the two programs manually. Instead use the
diff
command. Commands such as these are appropriate:gcc217 fib.c bigint.c bigintadd.c -o fibc gcc217 -D NDEBUG -O fib.c bigint.c bigintadd.s -o fibs ./fibc somepositiveinteger > file1 ./fibs somepositiveinteger > file2 diff file1 file2 rm file1 rm file2If and only if the
diff
command generates no output, then the contents offile1
andfile2
are the same. Similarly, using thediff
command is appropriate in subsequent steps of this assignment.
The output generated by the
diff
command can be difficult to understand. If you have trouble understanding it, then one course of action is to study theman
page fordiff
(which isn't particularly helpful) or the GNU Comparing and Merging Files website.Another course of action is to widen your terminal window and then issue the
diff
command with the-y
option. When given the-y
optiondiff
writes complete side-by-side listings of the two given files, and marks lines that differ. Section 2.3 of the GNU Comparing and Merging Files website will help you to interpret the marks.
The given
simple.c
file defines a client of theBigInt
module that is much simpler than the client defined byfib.c
. If yourbigintadd.s
implementation is failing tests performed by thefib.c
client, then you might find it helpful to debug using thesimple.c
client.
Finally, run the program to compute fib(250000)
. In your readme
file note the amount of CPU time consumed.
BigInt
Objects Using Optimized Assembly Language CodeSuppose, to your horror, you discover that you have taken a step backward: the CPU time consumed by your assembly language code is approximately the same as that of the non-optimized compiler-generated code! So you decide to optimize your assembly language code...
Manually optimize your assembly language code in bigintadd.s
, thus creating the file bigintaddopt.s
. Specifically, perform this optimization:
Store all parameters and local variables defined in the BigInt_larger
and BigInt_add
functions in callee-saved registers instead of in memory.
Optimized assembly code can be (even more) difficult to read. Thus, you should supplement the .equ
directives in your code from the previous part with .req
directives to give meaningful names to:
Registers that now store parameters, for example: OADDEND1 .req x19
.
Registers that now store local variables, for example: ULCARRY .req x22
.
Build a fib
program consisting of the files fib.c
, bigint.c
, and bigintaddopt.s
using the -D NDEBUG
and -O
options. Run the program to compute fib(x)
for various values of x
, and make sure it writes the same output to stdout
as the program built from C code does. Finally, run the program to compute fib(250000)
. In your readme
file note the amount of CPU time consumed.
If your
bigintaddopt.s
implementation is failing tests performed by thefib.c
client, then you might find it helpful to debug using thesimple.c
client.
Can you write assembly language code that is approximately as fast as the optimized code that the compiler generates? That is, can you approximately tie the compiler?
BigInt
Objects Using Highly Optimized Assembly Language CodeFinally, suppose you decide to optimize your assembly language code even further, moving away from a statement-by-statement translation of C code into assembly language...
Further optimize your assembly language code in bigintaddopt.s
, thus creating the file bigintaddoptopt.s
. Specifically, perform these optimizations:
Use the assembly language guarded loop pattern described in Section 3.2 of Chapter 5 of the Pyeatt with Ughetta book instead of the simpler but less efficient loop patterns described in lecture and precept.
"Inline" the call of the BigInt_larger
function. That is, eliminate the BigInt_larger
function, placing its code within the BigInt_add
function.
Use the adcs
("add with carry and set condition flags") instruction effectively. The adcs
instruction computes the sum of its source operand, its destination operand, and the C condition flag, places the sum in the destination operand, and assigns 1 (or 0) to the C condition flag if a carry occurred (or did not occur) during the addition. Effective use of the adcs
instruction will use the C condition flag instead of a ulCarry
variable to keep track of carries during addition.
Feel free to implement any additional optimizations you want. You are not limited to the subset of ARMv8 assembly language covered in COS 217 lectures and precepts. However, your BigInt_add
function must be a general-purpose function for adding two given BigInt
objects; the function cannot be specific to the task of adding two Fibonacci numbers to generate a third Fibonacci number. In other words, your function must work with the given fib.c
and simple.c
clients.
This part is challenging, particularly the open-ended portion. We will not think unkindly of you if you stop after the specified operations above, or if you only complete some of them, or even if you decide not to do part f at all.
Build a fib
program consisting of the files fib.c
, bigint.c
, and bigintaddoptopt.s
using the -D NDEBUG
and -O
options. Run the program to compute fib(x)
for various values of x
, and make sure it writes the same output to stdout
as the program built from C code does. Finally, run the program to compute fib(250000)
. In your readme
file note the amount of CPU time consumed.
If your
bigintaddoptopt.s
implementation is failing tests performed by thefib.c
client, then you might find it helpful to debug using thesimple.c
client.
So it comes down to this: can you beat the compiler?
Edit your copy of the given readme
file by answering each question that is expressed therein.
One of the sections of the readme
file requires you to list the authorized sources of information that you used to complete the assignment. Another section requires you to list the unauthorized sources of information that you used to complete the assignment. Your grader will not grade your submission unless you have completed those sections. To complete the authorized sources section of your readme
file, copy the list of authorized sources given in the Policies web page to that section, and edit it as appropriate.
Provide the instructors with your feedback on the assignment. To do that, issue this command:
FeedbackCOS217.py 5
and answer the questions that it asks. (If you worked with a partner, then when answering the numeric questions, please enter the average of the responses that you and your partner would provide individually.) That command stores its questions and your answers in a file named feedback
in your working directory.
Submit your work electronically on armlab. If you worked with a partner, then one of the partners must submit all of your team's files, and the other partner must submit none of your team's files. Your readme
file and your source code must contain both your name and your partner's name. Use these commands to submit:
submit 5 mywc.s submit 5 mywc*.txt submit 5 performance submit 5 bigintadd.s submit 5 bigintaddopt.s submit 5 bigintaddoptopt.s submit 5 readme feedback
Finally, if you have worked with a partner, the submitting partner must create a partner file indicating the non-submitting partner's netid and submit it (the non-submitting partner should still submit no files). You can use the following commands, substituting in your partner's netid into each:
touch netid.partner submit 5 netid.partner
WARNING 1: please make sure the file has the.partner
extension, since that's how we can recognize it; please don't submit a file called, literally,netid.partner
-- changenetid
to be your actual partner's actual netid; yes, the actual netid used to log in to armlab, not their email alias. Non-submitting partners should make sure the submitting partner did this, because otherwise we have no way to know to give you credit!
WARNING 2: if you named your files (e.g. your test files from Part 1b) something different than you were supposed to and copied+pasted the lines above (in particular, the one with the wildcard), some of your files will not be submitted! Thesubmit
command prints an error message, but invariably multiple students per term ignore this. Don't be one of them: name your files correctly and, at the very least, submit the filenames that you actually have.
Comments in your assembly language code are especially important. Each assembly language function — especially the main
function — must have a comment with the same information previous assignments have expected for a C function comment. Local comments within your assembly language functions are equally important. Comments copied from corresponding flattened C code are particularly helpful.
Minimal requirement to receive credit for the "Translate to Assembly Language" (Part 1a) implementation:
The mywc.s
implementation must build.
Minimal requirement to receive credit for the "Add BigInt
Objects Using Assembly Language Code" (Part 2d) implementation:
The bigintadd.s
implementation must build with fib.c
and bigint.c
using the -D NDEBUG
and -O
options.
Minimal requirement to receive credit for the "Add BigInt Objects Using Optimized Assembly Language Code" (Part 2e) implementation:
The bigintaddopt.s
implementation must build with fib.c
and bigint.c
using the -D NDEBUG
and -O
options.
Minimal requirement to receive credit for the "Add BigInt
Objects Using Highly Optimized Assembly Language Code" (Part 2f) implementation:
The bigintaddoptopt.s
implementation must build with fib.c
and bigint.c
using the -D NDEBUG
and -O
options.
We will grade your work on two kinds of quality:
Quality from the user's point of view. From the user's point of view, your code has quality if it behaves as it must. The correct behavior of your code is defined by the previous sections of this assignment specification.
Quality from the programmer's point of view. From the programmer's point of view, your code has quality if it is well styled and thereby easy to maintain. Good program style is defined by the previous section of this assignment specification.
To encourage good coding practices, we will deduct points if gcc217
generates warning messages.
Testing is a substantial aspect of the assignment. At least 10 percent of the grade will be based upon your mywc
test plan as described in your readme
file and as implemented by your data files.
Your grade for Part 2f will be based upon:
Approach and style (6 percent). That is, does your code correctly apply the specified optimizations that logically should improve performance?
Raw performance (4 percent). That is, how quickly does your code compute fib(250000)
? The record is 1.08 seconds by partners Michelle Thurber '26 and Godwin Duan '26 in Fall 2023, but you can be assured of earning full raw performance points if you can get under twice their time: 2.15 seconds. Do note that we cannnot award any raw performance points unless your code passes all tests in the fib.c
client. (After all, one can get a wrong answer remarkably quickly ...)
This assignment was written by Robert M. Dondero, Jr.,
based in part upon suggestions from Andrew Appel.