This is a little experiment in literate programming - binary search is a small enough testcase to get a taste of it.
This program will try to find the target number inside the sorted array of numbers.
We allow only ascending order of the array, and do a couple of checks that the order is indeed ascending.
(Originally indended to do sort-independent search)
First, let's make an outline:
<<*>>=
<>
<>
<>
@
So, first the boilerplate.
<>=
#include
#include
@
Search. Our function will take four arguments:
<>=
<>,
<>,
<>,
<>
@
Note that the right interval end is noninclusive, so if right<=left, means that there
are no elements of interest at all. We assume the caller takes care that
right <= length of the array, so we do not step over the boundaries.
Since we do not know the array length, we will use pointer. Therefore, we have the following definitions:
<>=
int left
<>=
int right
<>=
int *arr
<>=
int target
@
We will return the index of the found value, or, in the good old tradition, -1 if nothing has been found.
There goes the function declaration:
<>=
int mybsearch(<>)
@
Now, let's think about the possible scenarios. First, the simplest scenario of an empty interval.
This means we can not possibly find here, so we can return the -1 right away.
<>=
if (right <= left) {
/* empty interval */
return -1;
}
@
The next scenario to consider is the interval of the length one. Here, we can do a simple comparison of
the target with the value at the index "left" and return -1 if they are different, else we return "left".
<>=
else if (right == left + 1) {
/* interval of length 1 */
if(arr[left] == target) {
return left;
} else {
return -1;
}
}
@
Now, we've got the interval of length two. The simplest to do here, is, again - compare both values,
and return the corresponding index if one of them matches. Else, we return -1.
<>=
else if (right == left + 2) {
/* interval of length 2 */
if(arr[left] == target) {
return left;
} else if (arr[left+1] == target) {
return left+1;
} else {
return -1;
}
}
@
From here we have the intervals of increasing length (three and longer),
and this is where we need to remember we did not specify the sort order - ascending or descending.
So, we need to decide upon that. To do that we need to compare [left] element with [right-1].
<>=
else if (arr[right-1] >= arr[left]) {
/* ascending longer interval */
<>
} else {
/* descending longer interval */
<>
}
@
The behaviour of the code with the ascending and the descending intervals is mirrored, however,
we will consider both cases without optimizations.
We will handle each of these cases by splitting it into two subsets - odd number of elements (so the middle test
number will be exact middle), and even number of elements (so the middle number will be biased by one).
It is trivial to see that we have the odd number of elements in case (right-left mod 2 == 1) and even number
of elements otherwise.
<>=
if((right-left)%2) {
/* odd number of elements in the interval */
<>
} else {
/* even number of elements in the interval */
<>
}
@
For the ascending sequence with odd number of elements, we compare the middle element value with the target value.
Three cases are possible. It can be smaller, larger, or equal when compared to target.
We will not hunt down for the "first-most" element equal to the target and shoot for the first one we find, it is
a bit easier.
<>=
<>
<>
<>
<>
@
Calculating the middle index is merely a matter of averaging the leftmost and rightmost indices.
Assuming there is no integer overflow.
<>=
int mid = ((right-1)%2 + left%2)/2 + (right-1)/2 + left/2;
@
When we recurse into the right side, the middle index will need to be incremented
by one, because the left edge is inclusive, and we already checked it.
<>=
if (arr[mid] < target) {
return mybsearch(mid+1, right, arr, target);
}
@
When we recurse into the left side, we again would pass the mid index untouched, but note
that this time it will not be inclusive.
<>=
else if(arr[mid] > target) {
return mybsearch(left, mid, arr, target);
}
@
If the middle element is the same as target, unless we are satisfied with the random
element, we'd need to know if this element is the only one or there are some other to the left/right.
But, as I said, it is a separate exercise - so for now be happy with whatever little we do.
<>=
else {
return mid;
}
@
The sequence with even number of elements is a bit more fun. We can not really pick the "middle" element - the middle is
inbetween the two elements. So, while not terribly efficient, let's consider both of them. Their indices will be (right+left)/2
for the more one closer to the right, and (right+left)/2-1 for the one closer to the left.
In case we are happy with the random element found out of sequence, we have merely three cases here - one where we recurse to the right,
one where we recurse to the left, and one where we need to check both elements.
Useful to note that we have at a minimum four elements in the interval - so the scenarios into which we will recurse are
already defined in the beginning.
<>=
<>
<>
<>
<>
@
Let's calculate indices as we discussed before, with the naive trick for the integer overflow.
<>=
int mid_l = (right%2 + left%2)/2 + right/2 + left/2 - 1;
int mid_r = mid_l + 1;
@
And do the recurse parts:
<>=
if(arr[mid_l] > target) {
return mybsearch(left, mid_l, arr, target);
}
@
<>=
if(arr[mid_r] < target) {
return mybsearch(mid_r, right, arr, target);
}
@
Now, the middle part. It's effectively a recurse into 2-element interval to check both values.
<>=
else {
return mybsearch(mid_l, mid_r+1, arr, target);
}
@
Now, let's turn to the descending order. For now, disallow it, and return -1.
<>=
fprintf(stderr, "Descending intervals not supported, sorry!\n");
return -1;
@
Now we have defined all the steps, and we can assemble
the full definition of the search function:
<>=
<> {
<>
<>
<>
<>
}
@
Finally, we can define the main program. The first argument will be the number to find, the rest of the arguments will be the array.
We do not verify the possible failure of calloc, also the array that we make within the test is a bit unusual - we make it symmetric
to argv[] - so the indices 0 and 1 can not ever occur.
<>=
int main(int argc, char *argv[]) {
if(argc >= 1) {
int i;
/* let's make the arr index-compatible with argv - simpler if we want error reporting */
int *arr = calloc(argc, sizeof(int));
for(i=1;i= 0 ? arr[index] : -1 );
} else {
printf("Usage: %s \n", argv[0]);
}
return 0;
}
@