C++: Sorting the Contents of an Array Based on the Contents of Another

I often find while programming that I want to sort the contents of one array based on that of another.  For instance, say I had the arrays

int rankings[] = {1,4,2,5,3}
string names[] = {"George", "Fred", "Gary", "John", "Bob"}

and I wanted to resort both arrays based on ranking.  That is, I would want the following result:

int rankings[] = {1,2,3,4,5}
string names[] = {"George",  "Gary", "Bob", "Fred", "John"}

Now in Python, this would be easy to do.

rankings = [1, 4, 2, 5, 3]
names = ["George", "Fred", "Gary", "John", "Bob"]
toSort = zip(rankings, names)
toSort.sort()
rankings, names = zip(*toSort)

Unfortunately you can’t zip C arrays or C++ STL containers up like that. There is probably some mechanism for doing so using Boost, but I figured it would be better to just use the standard C++ algorithms and iterators. I thus present the following code (dualsort.h):

#pragma once
 
#include <algorithm>
#include <stdexcept>
#include <vector>
#include <utility>
 
namespace xenoscope
{
	namespace utility
	{
		template <class Compare, class RandomAccessIterator>
		struct dual_sort_compare_wrapper
		{
			dual_sort_compare_wrapper(Compare comparer) :
			_comparer(comparer)
			{
 
			}
 
			bool operator() (std::pair<size_t, RandomAccessIterator> x, 
							std::pair<size_t, RandomAccessIterator> y)
			{
				return _comparer(*x.second, *y.second);
			}
 
			Compare _comparer;
		};
 
		template <class T>
		struct dual_sort_basic_compare
		{
			bool operator() (T x, T y)
			{
				return x < y;
			}
		};
 
		template <class IndexIterator, class RandomAccessIterator, class OutputRandomAccessIterator>
		void index_shuffle_copy(IndexIterator begin, 
								IndexIterator end, 
								RandomAccessIterator toSort,
								OutputRandomAccessIterator output)
		{
			while(begin != end)
			{
				*output = *(toSort + begin->first);
				begin++;
				output++;
			}
		}
 
		//Unfortunately we have to make a copy buffer.  Moving
		//data in-place would be O(n log n) at minimum, if not more.
		template <class RandomAccessIteratorType1, class RandomAccessIteratorType2, class Compare>
		void dual_sort(RandomAccessIteratorType1 it1_begin, 
						RandomAccessIteratorType1 it1_end, 
						RandomAccessIteratorType2 it2_begin, 
						RandomAccessIteratorType2 it2_end,
						Compare comp)
		{
			//Count how many elements we have
			unsigned n1 = it1_end - it1_begin;
			unsigned n2 = it2_end - it2_begin;
 
			//Do checking
			if(n1 == 0)
			{
				return;
			}
			if(n1 < 0)
			{
				//We got some messed up values
				throw std::invalid_argument("First set of iterator arguments are invalid.");
			}
			if(n1 != n2)
			{
				//The arrays were different sizes
				throw std::invalid_argument("Containers have different sizes.");
			}
 
			//Create a tracker with indices 0, n1-1 that we can use to see how the
			//order of the data changed.
			std::vector<std::pair<size_t, RandomAccessIteratorType1> > order(n1);
			size_t n = 0;
			for(RandomAccessIteratorType1 it = it1_begin; it != it1_end; it++, n++)
			{
				order[n] = std::make_pair(n, it);
			}
 
			//Do the sorting
			std::sort(order.begin(), 
						order.end(), 
						dual_sort_compare_wrapper<Compare, 
													RandomAccessIteratorType1 >(comp));
 
			//Order vectors appropriately
			std::vector<typename std::iterator_traits<RandomAccessIteratorType1>::value_type > copyBuffer1(n1);
			std::vector<typename std::iterator_traits<RandomAccessIteratorType2>::value_type > copyBuffer2(n2);
 
			index_shuffle_copy(order.begin(), order.end(), it1_begin, copyBuffer1.begin());	
			index_shuffle_copy(order.begin(), order.end(), it2_begin, copyBuffer2.begin());
 
			std::copy(copyBuffer1.begin(), copyBuffer1.end(), it1_begin);
			std::copy(copyBuffer2.begin(), copyBuffer2.end(), it2_begin);
		}
 
		template <class RandomAccessIteratorType1, class RandomAccessIteratorType2>
		void dual_sort(RandomAccessIteratorType1 it1_begin, 
						RandomAccessIteratorType1 it1_end, 
						RandomAccessIteratorType2 it2_begin, 
						RandomAccessIteratorType2 it2_end)
		{
			dual_sort(it1_begin, 
						it1_end, 
						it2_begin, 
						it2_end, 
						dual_sort_basic_compare<typename std::iterator_traits<RandomAccessIteratorType1>::value_type>());
		}
	}
}

This code implements two templated algorithms which extend the standard sort algorithm. They function identically to std::sort, except that they take an extra pair of iterators to a second container which you wanted to be reordered corresponding to the new order in the first container. Here’s an example of how to use it:

#include <vector>
#include <string>
#include <cstdio>
 
#include "dualsort.h"
 
using namespace std;
using namespace xenoscope::utility;
 
int origRankings[] = {1, 4, 2, 5, 3};
string origNames[] = {"George", "Fred", "Gary", "John", "Bob"};
vector<int> rankings;
vector<string> names;
 
void printData()
{
	printf("Rankings ordering is:\n");
	vector<int>::iterator rit;
	for(rit = rankings.begin(); rit != rankings.end(); rit++)
	{
		printf("\t%i\n", *rit);
	}
	printf("Names ordering is:\n");
	vector<string>::iterator nit;
	for(nit = names.begin(); nit != names.end(); nit++)
	{
		printf("\t%s\n", nit->c_str());
	}
	printf("\n\n");
}
 
struct ReverseCompare
{
	bool operator() (int i, int j) { return (i > j); }
} revComparer;
 
int main()
{	
	//Create a dataset
	rankings = vector<int>(origRankings, origRankings + 5);
	names = vector<string>(origNames, origNames + 5);
	printData();
 
	//Sort into ascending order
	dual_sort(rankings.begin(), rankings.end(), names.begin(), names.end());
	printData();
 
	//Sort into descending order using a custom comparer
	dual_sort(rankings.begin(), rankings.end(), names.begin(), names.end(), revComparer);
	printData();
 
	return 0;
}

It produces the following output:

Rankings ordering is:
        1
        4
        2
        5
        3
Names ordering is:
        George
        Fred
        Gary
        John
        Bob

Rankings ordering is:
        1
        2
        3
        4
        5
Names ordering is:
        George
        Gary
        Bob
        Fred
        John

Rankings ordering is:
        5
        4
        3
        2
        1
Names ordering is:
        John
        Fred
        Bob
        Gary
        George

The code is currently just limited to two containers, but you can easily extend it to more. You can also just make copies of the index vector and call dual_sort for as many vectors as you have. Let me know if you have any trouble with it, or any recommendations. I’m making this code public-domain, so feel free to use it in any project, commercial, open source, whatever. I hate licenses. Credit is of course appreciated, but by no means required.

Downloads:
dualsort.h

Comments

Leave a Reply

You must be logged in to post a comment.