Implementing quicksort Part 1
Implementing alphabetical sorting in C using Test Driven Development
Introduction
Let's examine a classic computer science question - how to alphabetically sort words.
In the first part, we'll examine the problem and set up a test-driven-development template.
In the second part we'll examine the Quick Sort algorithm, and implement it to get our test to pass.
Heres our list of words (or 'strings' in C speak). They're the names of some of my friend's cats. We'll be using this list throughout the tutorial.
- Bryaxis
- Batman
- Sadie
- Milo
- Lou
Part 1
In Ruby, my weapon of choice for dealing with strings, it goes like this.
Open irb in your terminal window and type
@names = ['Bryaxis', 'Batman', 'Sadie', 'Milo', 'Lou']
@names.sort
To skip the interpretor look for file
part-1/01-alphabetical-sort-ruby.rb
ruby 01-alphabetical-sort-ruby.rb
Python is also good with strings
Open the python interpreter and type
python
names = ['Bryaxis', 'Batman', 'Sadie', 'Milo', 'Lou']
sorted(names)
To skip the the interpretor
part-1/03-alphabetical-sort-python.py
python 03-alphabetical-sort-python.py
In these languages, string sorting has already be implemented, but we want to examine how an algorithm like this is structured and implemented. So, let's get low-level and work it out in C.
If we're gonna reinvent the wheel, let's do it right and use test driven development. A software development technique designed to reduce frustration and make software development FUN.
The idea is simple, you write a failing test and then you write the code required to pass it. Its a built in reward system. Instead of wondering "did my program do what I wanted it to do?" you can have the software tell you it works.
We'll start with a simple C file
part-1/05-simple-assert.c
#include <stdio
#include <assert.h>
int main() {
assert(0);
return 0;
}
To Compile & run
bash
gcc 05-simple-assert.c && ./a.out
you should see:
bash
Assertion failed: (0), function main, file main.c, line 5.
Abort trap
If you are unfamiliar with TDD, you might be wondering what assert() is all about.
'assert()' is like a sanity test. It passes if true (the integer 1 in C) and fails if false (0). Our first bit of code here tests the test. We can make sure it passes on true by inserting a positive assert. This way, our code compiles without errors, but stops on execution if our algorithm isn't doing what we want it to.
part-1/06-simple-meta-assert.c
#include <stdio
#include <assert.h>
int main() {
assert(1);
assert(0);
return 0;
}
To compile and run
gcc 06-simple-meta-assert.c && ./a.out
you should see:
bash
Assertion failed: (0), function main, file main.c, line 6.
Abort trap
Notice that it failed on line 6, but not on line 5. this should make you comfortable enough with the assert() function to continue on.
Since we will be dealing with words, lets add the string.h header, and some string comparison assertions to our code.
part-1/07-simple-assert-two-strings.c
#include <stdio
#include <assert.h>
#include <string.h>
int main() {
char unsortedWords[] = "false";
char alphabetizedWords[] = "true";
assert(!strcmp(unsortedWords,unsortedWords));
assert(!strcmp(unsortedWords,alphabetizedWords));
return 0;
}
To compile and run
bash
gcc 07-simple-assert-two-strings.c && ./a.out
you should see
bash
Assertion failed: (!strcmp(unsortedWords,sortedWords)), function main, file main.c, line 8.
Abort trap
We are using the strcmp() function to compare two strings, notice the ! before strcmp(), because strcmp() is weird and returns 0 if the strings don't match and 1 if they do. (welcome to C, where up is down and down is up, I miss ruby already)
This gets us closer to being able to test if we correctly implemented a way to alphabetize words. We're not all the way there yet, because we can only test if one word is the same as another, so lets modify our code further to make a robust failing test to see if two arrays of strings are the same.
We'll need a way to iterate through an array of strings and compare each individual string in the array.
part-1/08-compare-first-element-of-string-array.c
#include <stdio
#include <assert.h>
#include <string.h>
int main() {
char *unsortedWords[5] = {"Bryaxis", "Batman" , "Sadie", "Milo", "Lou" };
char *alphabetizedWords[5] = {"Batman" , "Bryaxis", "Lou" , "Milo", "Sadie"};
assert(!strcmp(unsortedWords[0], unsortedWords[0]));
assert(!strcmp(unsortedWords[0],alphabetizedWords[0]));
return 0;
}
To compile and run
bash
gcc 08-compare-first-element-of-string-array.c && ./a.out
If we compile and run this, then we should get this message
bash
Assertion failed: (!strcmp(unsortedWords[0],alphabetizedWords[0])),
function main, file main.c, line 8.
Abort trap
Which tells us that the first word (aka string) in the array is not the same.
So lets modify our code furthur to check every single word in the array.
We'll have to quit using assert() because a failed assert stops execution of the program.
09-compare-array-of-strings.c
#include <stdio
#include <assert.h>
#include <string.h>
int main() {
char *unsortedWords[5] = {"Bryaxis", "Batman" , "Sadie", "Milo", "Lou" };
char *alphabetizedWords[5] = {"Batman" , "Bryaxis", "Lou" , "Milo", "Sadie"};
int metaTest[5];
int realTest[5];
int i;
for (i=0; i < 5; i++){
metaTest[i] = (!strcmp(unsortedWords[i], unsortedWords[i]));
realTest[i] = (!strcmp(unsortedWords[i],alphabetizedWords[i]));
}
printf("MetaTest:\n");
for (i=0; i < 5; i++){
printf("Success! unsortedWords[%d] '%s' matches
unsortedWords[%d] '%s'.\n",
i, unsortedWords[i], i, unsortedWords[i] );
}
printf("RealTest:\n");
for (i=0; i < 5; i++){
if (realTest[i]) {
printf("Success! unsortedWords[%d] '%s' matches
unsortedWords[%d] '%s'.\n",
i, unsortedWords[i], i, unsortedWords[i] );
}
else {
printf("Failure! unsortedWords[%d] '%s' doesn't match
alphabetizedWords[%d] '%s'.\n",
i, unsortedWords[i],i, alphabetizedWords[i] );
}
}
return 0;
}
To compile and run
gcc 09-compare-array-of-strings.c && ./a.out
And you should see:
MetaTest:
Success! unsortedWords[0] 'Bryaxis' matches unsortedWords[0] 'Bryaxis'.
Success! unsortedWords[1] 'Batman' matches unsortedWords[1] 'Batman'.
Success! unsortedWords[2] 'Sadie' matches unsortedWords[2] 'Sadie'.
Success! unsortedWords[3] 'Milo' matches unsortedWords[3] 'Milo'.
Success! unsortedWords[4] 'Lou' matches unsortedWords[4] 'Lou'.
RealTest:
Failure! unsortedWords[0] 'Bryaxis' doesn't match alphabetizedWords[0] 'Batman'.
Failure! unsortedWords[1] 'Batman' doesn't match alphabetizedWords[1] 'Bryaxis'.
Failure! unsortedWords[2] 'Sadie' doesn't match alphabetizedWords[2] 'Lou'.
Success! unsortedWords[3] 'Milo' matches unsortedWords[3] 'Milo'.
Failure! unsortedWords[4] 'Lou' doesn't match alphabetizedWords[4] 'Sadie'.
We have a automatic way to test if two arrays of words are the same. But its a little hard to infer that so, at the risk of going on a tangent here, I'm going to add some terminal colors to this output. Here I should mention I'm on OSX 10.6 so this color codes might not work for you. Along the way I'll enforce some better coding style-like not having lines longer than 79 columns.
10-compare-array-of-strings-colors.c
#include <stdio
#include <assert.h>
#include <string.h>
#define RESET "\033[0m"
#define BLACK "\033[30m" /* Black */
#define RED "\033[31m" /* Red */
#define GREEN "\033[32m" /* Green */
int main() {
char *unsortedWords[5] = {"Bryaxis",
"Batman",
"Sadie",
"Milo",
"Lou"};
char *alphabetizedWords[5] = {"Batman",
"Bryaxis",
"Lou",
"Milo",
"Sadie"};
int metaTest[5];
int realTest[5];
int i;
for (i=0; i < 5; i++){
metaTest[i] = (!strcmp(unsortedWords[i], unsortedWords[i]));
realTest[i] = (!strcmp(unsortedWords[i],alphabetizedWords[i]));
}
printf("MetaTest:\n");
for (i=0; i < 5; i++){
printf(GREEN);
printf("Success! unsortedWords[%d] '%s'",i, unsortedWords[i]);
printf("matches");
printf("unsortedWords[%d] '%s'.\n", i, unsortedWords[i]);
printf(RESET);
}
printf("RealTest:\n");
for (i=0; i < 5; i++){
if (realTest[i]) {
printf(GREEN);
printf("Success! unsortedWords[%d] '%s'",i, unsortedWords[i]);
printf("matches");
printf("alphabetizedWords[%d] '%s'.\n", i, unsortedWords[i]);
printf(RESET);
}
else {
printf(RED);
printf("Failure! unsortedWords[%d] '%s'",i, unsortedWords[i]);
printf("doesn't match");
printf("alphabetizedWords[%d] '%s'.\n", i, alphabetizedWords[i]);
printf(RESET);
}
}
return 0;
}
Our code is getting a little crowded, so let's put our test into a function. And comment out our metatest
#include <stdio
#include <assert.h>
#include <string.h>
#define RESET "\033[0m"
#define BLACK "\033[30m" /* Black */
#define RED "\033[31m" /* Red */
#define GREEN "\033[32m" /* Green */
void doesAlphabetizeWork(char** unsortedWords,
char** alphabetizedWords,
int howManyWords);
int main() {
char *unsortedWords[5] = {"Bryaxis",
"Batman",
"Sadie",
"Milo",
"Lou"};
char *alphabetizedWords[5] = {"Batman",
"Bryaxis",
"Lou",
"Milo",
"Sadie"};
doesAlphabetizeWork(unsortedWords,alphabetizedWords, 5);
return 0;
}
void doesAlphabetizeWork(char** unsortedWords,
char** alphabetizedWords,
int howManyWords) {
int i;
int metaTest[howManyWords];
int realTest[howManyWords];
for (i=0; i < howManyWords; i++){
metaTest[i] = (!strcmp(unsortedWords[i], unsortedWords[i]));
realTest[i] = (!strcmp(unsortedWords[i],alphabetizedWords[i]));
}
// printf("MetaTest:\n");
// for (i=0; i < howManyWords; i++){
// printf(GREEN);
// printf("Success! unsortedWords[%d] '%s'",i, unsortedWords[i]);
// printf("matches");
// printf("unsortedWords[%d] '%s'.\n", i, unsortedWords[i]);
// printf(RESET);
// }
// printf("RealTest:\n");
for (i=0; i < howManyWords; i++){
if (realTest[i]) {
printf(GREEN);
printf("Success! unsortedWords[%d] '%s'",i, unsortedWords[i]);
printf("matches");
printf("alphabetizedWords[%d] '%s'.\n", i, unsortedWords[i]);
printf(RESET);
}
else {
printf(RED);
printf("Failure! unsortedWords[%d] '%s'",i, unsortedWords[i]);
printf("doesn't match");
printf("alphabetizedWords[%d] '%s'.\n", i, alphabetizedWords[i]);
printf(RESET);
}
}
}
To compile and run
gcc 11-does-alphabetize-work-function.c && ./a.out
Now when we compile and run this code we should see (with colors of course)
Failure! unsortedWords[0] "Bryaxis'doesn't matchalphabetizedWords[0] 'Batman'".
Failure! unsortedWords[1] "Batman'doesn't matchalphabetizedWords[1] 'Bryaxis'".
Failure! unsortedWords[2] "Sadie'doesn't matchalphabetizedWords[2] 'Lou'".
Success! unsortedWords[3] "Milo'matchesalphabetizedWords[3] 'Milo'".
Failure! unsortedWords[4] "Lou'doesn't matchalphabetizedWords[4] 'Sadie'".