Unique Values Without Rearrangement
In MATLAB, the simplest form of the function unique returns the unique values contained in a numeric vector, with the results sorted. This is often acceptable, but sometimes a user prefers the results in the order originally found in the data.
Algorithm for unique
The results are sorted because of the algorithm unique uses. Conceptually, the input data is sorted, and then adjacent elements are compared. Where elements are equal, all but the first or the last are removed (depending on how you call the function). Hence, the output is sorted.
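Here is a minimal sketch of that idea, not the actual implementation of unique. The variable names v, vs, keep, and u are mine, chosen for illustration, and the sketch keeps the first element of each run of equal values.

```matlab
v = [3 1 2 1 3];                % sample data (illustrative)
vs = sort(v);                   % sorted copy: 1 1 2 3 3
keep = [true, diff(vs) ~= 0];   % true where a value differs from its left neighbor
u = vs(keep);                   % 1 2 3, same as unique(v) for this input
```

Because the comparison only looks at neighbors in the sorted copy, the original order of v never enters the result.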
Avoid the Sorted Output
To avoid the sorted output, you can sort the data yourself, retaining the indices from the sorting operation, and then use those indices to map the results back to the original order. Study the examples for sort to see how to use its second output, the vector of indices.
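As a quick reminder of how that second output behaves, here is a small sketch with made-up data (v, vs, idx, and isSame are illustrative names). Note that sort keeps equal elements in their original relative order, which is what lets the first sorted copy of a value point back to its first appearance in the data.

```matlab
v = [30 10 20 10];              % sample data (illustrative)
[vs, idx] = sort(v);            % vs = 10 10 20 30, idx = 2 4 3 1
isSame = isequal(v(idx), vs);   % true: idx maps sorted positions back into v
```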
There were a couple of solutions posted with similar ideas, but different implementations. I'll walk you through the one posted by Jan Simon. The idea Jan uses is to take the difference of the sorted results and find where the differences are not zero (i.e., where a new value begins). Then use the sort indices to place these flags at the original positions of the values, giving the logical vector UV. Finally, use this logical vector to extract the required values from the original data. Notice that this solution doesn't call the function unique and only calls the function sort one time.
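Before the full example, here is the same sequence of steps sketched on a vector small enough to trace by hand. The names Xs, SortVec, and UV follow the walkthrough; the sample data, the helper name isFirst, and the preallocation of UV are my own additions for illustration.

```matlab
X = [3 1 3 2 1];                    % small sample vector (illustrative)
[Xs, SortVec] = sort(X(:));         % Xs = [1 1 2 3 3]', SortVec = [2 5 4 1 3]'
isFirst = [true; diff(Xs) ~= 0];    % [1 0 1 1 0]': start of each run of equal values
UV = false(size(X));                % preallocate (my addition, not in Jan's code)
UV(SortVec) = isFirst;              % UV = [1 1 0 1 0]
Y = X(UV);                          % Y = [3 1 2]
```

Preallocating UV guards against a leftover variable of the same name and a different size in the workspace.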
Code in Action
Let's create X and see what happens in the code.
myString = 'now is the time for cheering, tgif!';
X = double(myString)
X =
  Columns 1 through 13
   110   111   119    32   105   115    32   116   104   101    32   116   105
  Columns 14 through 26
   109   101    32   102   111   114    32    99   104   101   101   114   105
  Columns 27 through 35
   110   103    44    32   116   103   105   102    33
You can see the data X is now sorted in Xs and SortVec tracks the original locations of the values.
[Xs, SortVec] = sort(X(:))
Xs = (column vector, shown across rows to save space)
    32    32    32    32    32    32    33    44    99   101   101   101
   101   102   102   103   103   104   104   105   105   105   105   109
   110   110   111   111   114   114   115   116   116   116   119
SortVec = (column vector, shown across rows to save space)
     4     7    11    16    20    30    35    29    21    10    15    23
    24    17    34    28    32     9    22     5    13    26    33    14
     1    27     2    18    19    25     6     8    12    31     3
Now flag the first occurrence of each value (where diff isn't 0) and place the flags into a logical vector at the original positions given by SortVec.
UV(SortVec) = ([1; diff(Xs)] ~= 0)
UV =
  Columns 1 through 13
     1     1     1     1     1     1     0     1     1     1     0     0     0
  Columns 14 through 26
     1     0     0     1     0     1     0     1     0     0     0     0     0
  Columns 27 through 35
     0     1     1     0     0     0     0     0     1
Use the logical vector to index into the original data. This returns the unique values in the order in which they first appear.
Y = X(UV)
Y =
  Columns 1 through 13
   110   111   119    32   105   115   116   104   101   109   102   114    99
  Columns 14 through 16
   103    44    33
finalString = char(Y)
finalString =
now isthemfrcg,!
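If your MATLAB release supports the 'stable' option of unique, you can cross-check the result against it. This is only a sketch of such a check: Ystable and agree are names I made up, and I preallocate UV here, which the walkthrough above doesn't do.

```matlab
myString = 'now is the time for cheering, tgif!';
X = double(myString);
[Xs, SortVec] = sort(X(:));             % sort once, keeping the original positions
UV = false(size(X));                    % preallocate the logical mask (my addition)
UV(SortVec) = [true; diff(Xs) ~= 0];    % flag the first occurrence of each value
Y = X(UV);                              % unique values in order of first appearance
Ystable = unique(X, 'stable');          % assumes a release with the 'stable' option
agree = isequal(Y, Ystable);            % true when both approaches give the same answer
```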
Do You Need Unique Data Values Unsorted?
Do you need unsorted unique values as part of your data processing? I'd love to hear more here. In the meantime, perhaps you could create a cryptic signature of the day by running your thoughts through this algorithm!