# Unique Values Without Rearrangement

Posted by Loren Shure

In MATLAB, the simplest form of the function unique returns the unique values contained in a numeric vector, with the results sorted. This is often acceptable, but sometimes a user prefers the results in the order originally found in the data.

### Algorithm for unique

The results are sorted because of the algorithm unique uses. Conceptually, the input data is sorted, and then adjacent elements are compared. Within each run of equal elements, all elements except the first or the last are removed (depending on how you call the function). Hence, the output is sorted.
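As a rough illustration of the sort-then-compare idea described above, here is a minimal sketch. It is only a conceptual model, not the actual implementation of unique, and the variable names (keep, U) are made up for this example.

```matlab
% Conceptual sketch only; not how unique is actually implemented.
X    = [3 1 2 3 1];
Xs   = sort(X);                  % Xs = [1 1 2 3 3]
keep = [true, diff(Xs) ~= 0];    % flag the first element in each run of equal values
U    = Xs(keep);                 % U = [1 2 3], in sorted order
```

Because the duplicates are found by comparing neighbors in the sorted data, the result naturally comes out sorted.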

### Avoid the Sorted Output

To get the unique values in their original order, you can sort the data yourself and keep the indices returned by the sorting operation, which record where each sorted value came from. Study the examples for sort to see how to use this second output of indices.
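For readers who haven't used the second output of sort, a small sketch with made-up data shows what the indices mean. The name idx is just for illustration.

```matlab
X = [30 10 20];
[Xs, idx] = sort(X);             % Xs = [10 20 30], idx = [2 3 1]
% idx(k) is the position in X that the k-th sorted value came from,
% so indexing X with idx reproduces the sorted data.
sameAsSorted = isequal(X(idx), Xs);   % true
```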

There were a couple of solutions posted with similar ideas but different implementations. I'll walk you through the one posted by Jan Simon. Jan takes the difference of the sorted results and finds where the differences are nonzero (i.e., where a new value starts). The sort indices then place these flags at the corresponding original positions in the logical vector UV. Finally, this logical vector is used to extract the required values from the original data. Notice that this solution doesn't call the function unique and calls the function sort only once.

### Code in Action

Let's create X and see what happens in the code.

myString = 'now is the time for cheering, tgif!';
X = double(myString)
X =
Columns 1 through 13
110   111   119    32   105   115    32   116   104   101    32   116   105
Columns 14 through 26
109   101    32   102   111   114    32    99   104   101   101   114   105
Columns 27 through 35
110   103    44    32   116   103   105   102    33


Now sort X. The sorted data is in Xs, and SortVec tracks the original locations of the values.

[Xs, SortVec] = sort(X(:))
Xs =
32
32
32
32
32
32
33
44
99
101
101
101
101
102
102
103
103
104
104
105
105
105
105
109
110
110
111
111
114
114
115
116
116
116
119
SortVec =
4
7
11
16
20
30
35
29
21
10
15
23
24
17
34
28
32
9
22
5
13
26
33
14
1
27
2
18
19
25
6
8
12
31
3


Now flag the first occurrence of each value (the leading element, plus every place where diff isn't 0), and use SortVec to place those flags at the original positions in a logical vector.

UV(SortVec) = ([1; diff(Xs)] ~= 0)
UV =
Columns 1 through 13
1     1     1     1     1     1     0     1     1     1     0     0     0
Columns 14 through 26
1     0     0     1     0     1     0     1     0     0     0     0     0
Columns 27 through 35
0     1     1     0     0     0     0     0     1
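The indexed assignment into UV is the step that is easiest to misread, so here is a smaller sketch of the same idea on made-up data. It relies on sort keeping equal values in their original relative order, so the first element of each run in Xs is also the first occurrence in X. Preallocating UV with false is an addition in this sketch (not part of Jan's code); it guards against a stale UV variable of a different type or size already being in the workspace.

```matlab
X = [20 10 20 10 30];
[Xs, SortVec] = sort(X(:));          % Xs = [10;10;20;20;30], SortVec = [2;4;1;3;5]
firstInRun = ([1; diff(Xs)] ~= 0);   % [1;0;1;0;1]: first element of each run in Xs
UV = false(size(X));                 % preallocation added for this sketch
UV(SortVec) = firstInRun;            % send each flag back to its original position
Y = X(UV);                           % Y = [20 10 30], first occurrences in original order
```

Reading the assignment element by element: the flag for the k-th sorted value is stored at position SortVec(k), which is where that value sits in X.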


Use the logical vector to index the original data, which extracts the unique values in the order they first appeared.

Y = X(UV)
Y =
Columns 1 through 13
110   111   119    32   105   115   116   104   101   109   102   114    99
Columns 14 through 16
103    44    33

finalString = char(Y)
finalString =
now isthemfrcg,!
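As a cross-check, the steps can be collected into one short script and compared against the 'stable' option of unique. That option was added to MATLAB in a later release than the 7.9 used for this post, so treat the comparison as a sketch that assumes a newer version. The preallocation of UV and the name Ystable are additions for this example.

```matlab
X = double('now is the time for cheering, tgif!');
[Xs, SortVec] = sort(X(:));
UV = false(size(X));                      % preallocation added for this sketch
UV(SortVec) = ([1; diff(Xs)] ~= 0);
Y = X(UV);
finalString = char(Y);                    % 'now isthemfrcg,!'

% Assumes a release where unique accepts 'stable' (not available in MATLAB 7.9).
Ystable = unique(X, 'stable');
matches = isequal(Y, Ystable);            % true
```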


### Do You Need Unique Data Values Unsorted?

Do you need unsorted unique values as part of your data processing? I'd love to hear more here. In the meantime, perhaps you could create a cryptic signature of the day by running your thoughts through this algorithm!

Published with MATLAB® 7.9

Jerker Wågberg replied (1 of 4):

My two cents and lines:

[~,ix]=unique(myString, 'first');
finalString=myString(sort(ix))

Jerker Wågberg replied (2 of 4):

Sorry, the two cents just got devalued. Should have read the discussion first…

OysterEngineer replied (3 of 4):

Your technique is clever, but it is difficult to follow at a glance. You’ve kind of left to the reader several important details. Like:

2. The details of how the chained use of SortVec & UV assures the order is related to the original order that we want.

Also, won’t the technique work properly with the string left in the character class? e.g., why was it necessary to convert the characters to numbers?

Loren replied (4 of 4):

OysterEngineer-

The first value is unique since there are no other values to compare to, that’s why the leading 1.

I didn’t explain the details of the SortVec and UV because I think it helps to study the help and examples for sort/sortrows to understand and I didn’t feel like reproducing that here.

I converted to double since I have the Symbolic Toolbox on my path and there is an overloaded diff for the char datatype there that produces a derivative, which I didn’t want.

–Loren

These postings are the author's and don't necessarily represent the opinions of MathWorks.