[Gllug] [OT] sort algorithms

Tue Nov 29 22:51:53 UTC 2011

On 29 November 2011 22:36, David L Neil <GLLUG at getaroundtoit.co.uk> wrote:
> James,
>
>>>> Target pattern is a list of numbers in a certain predefined sequence.
>>>> The source data is a set of numbers that is always a subset of the
>>>> predefined sequence.
>>>> I need a way to sort the source data so that the numbers appear in the
>>>> same order as the predefined sequence.
>>>
>>> What is the order of magnitude of the numbers in the source data set/in
>>> the
>>> predefined sequence?
>>
>> Magnitude is the input/predefined sequence is between 0 and 2^32.  I.e
>> 0 to MAXINT32
>
> =ouch!
>
>
>>> How large/small a proportion of the predefined sequence is/are the source
>>> data set(s) likely to be?
>>
>> The source sequence will always be shorter than the predefined
>> sequence. Source is a subset of the predefined sequence.
>
> =understood from spec.
>
> =a selection process is only likely to be more efficient than a sort as the
> size of the sample (source) approaches the size of the population
> (predefined sequence, or MAXINT32). Hence the question. Any idea of the
> ratio?
>
>
>>> Is each record in the source unique with respect to the number in the set
>>> or
>>> may there be several with the same 'number',
>>> eg result may be 1 1 8 8 4 4 3 3 9 9
>>
>> There should not be any duplicates in the source or predefined sequence.
>
> =then efficient selection is possible...
>

I have just thought of an edge case. If values in the source do not
appear in the predefined sequence, I wish them to be dropped.
I was planning to do some filtering on the source data to pre-drop the
values outside the predefined sequence, but if the sort algorithm can
handle that at the same time, all the better.
--
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug