Finding the Closest Value Less than a Threshold
I was just asked for a good way to find the closest value in a vector that is less than a threshold. My solution is fairly short and demonstrates some of my favorite MATLAB techniques. I will also show you an "obvious" solution for comparison.
Set up
First let's set up the data for our problem.
thresh = 75;
nvals = 10^6;
data = 100*rand(1,nvals);
Solution via looping
We could solve this by brute force, just looping over the values. Let's try that. I'm going to set the index value to empty ([]). That way, if we end up with an array in which no value meets the criterion, we can tell. Also, I am setting the current best candidate to -Inf so that any finite value below thresh will be closer to thresh than the starting value, assuming we can find one.
loopindex = [];
candidate = -inf;
for ind = 1:numel(data)
    dval = data(ind);
    if dval < thresh && dval > candidate
        candidate = dval;
        loopindex = ind;
    end
end
Solution using logical indexing
Next I show you my non-loop solution.
First collect the list of possible data entries - the ones that are less than the threshold value thresh. This list is a logical variable, essentially true and false values for each entry in data, selecting the candidate values that are less than the thresh value.
possibles = data < thresh;
Let's find the actual best value, plus its index into the reduced set selected by possibles. The index we find is not an index into data but rather into a smaller array, the subset of values meeting the threshold criterion.
[posmax, posind] = max(data(possibles));
Convert the answer into the correct index in the original array, data.
inddatapos = find(possibles);   % possible indices
inddata = inddatapos(posind);   % find the index we care about
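To make the two-step index conversion concrete, here is a small worked example on a five-element vector. The values and the variable names (smallData, smallThresh, and so on) are made up for illustration and are not part of the original problem.

```matlab
% Hypothetical five-element example (values chosen for illustration only)
smallData = [80 10 74 90 60];
smallThresh = 75;

smallPossibles = smallData < smallThresh;                   % logical mask: [0 1 1 0 1]
[smallMax, smallPosInd] = max(smallData(smallPossibles));   % max over the subset [10 74 60]

smallCandidates = find(smallPossibles);                     % original indices: [2 3 5]
smallIndex = smallCandidates(smallPosInd);                  % map back into smallData
```

Here smallPosInd is 2, the position of 74 within the reduced set [10 74 60], while smallIndex is 3, its position in smallData.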
If inddata is empty, then there were no possible values meeting the criterion. So we set the final values accordingly.
if isempty(inddata)
    posmax = -Inf;
end
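As a quick check of the empty case, the sketch below runs the same steps on a vector in which every value is at or above the threshold. The data values and the variable names (highData, hiPossibles, and so on) are assumptions chosen for illustration.

```matlab
% Hypothetical data: nothing is below the threshold
highData = [80 90 100];
highThresh = 75;

hiPossibles = highData < highThresh;              % all false
[hiMax, hiPosInd] = max(highData(hiPossibles));   % max of an empty array is empty
hiCandidates = find(hiPossibles);                 % empty: no candidate indices
hiIndex = hiCandidates(hiPosInd);                 % indexing with an empty index stays empty

if isempty(hiIndex)
    hiMax = -Inf;                                 % same starting value the loop version uses
end
```

With no candidates, hiIndex stays empty and hiMax ends up as -Inf, matching what the loop version would report for the same input.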
Let's make sure both solutions match
sameSolution = isequal([inddata posmax],[loopindex candidate])
sameSolution = 1
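One case worth checking separately is a tie, where the best value occurs more than once. max returns the index of the first occurrence, and the loop's strict > comparison also keeps the first occurrence, so the two approaches should still agree. The sketch below uses made-up data and hypothetical variable names (tieData, tieLoopIndex, and so on).

```matlab
% Hypothetical data with the best value (70) appearing twice
tieData = [50 70 70 80];
tieThresh = 75;

% Loop version: the strict > keeps the first occurrence of the best value
tieLoopIndex = [];
tieCandidate = -Inf;
for k = 1:numel(tieData)
    v = tieData(k);
    if v < tieThresh && v > tieCandidate
        tieCandidate = v;
        tieLoopIndex = k;
    end
end

% Logical-indexing version: max also reports the first occurrence
tiePossibles = tieData < tieThresh;
[tieMax, tiePosInd] = max(tieData(tiePossibles));
tieCandidates = find(tiePossibles);
tieIndex = tieCandidates(tiePosInd);

% Compare the two answers the same way as above
tieMatch = isequal([tieIndex tieMax], [tieLoopIndex tieCandidate]);
```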
Which solution do you prefer?
You might guess correctly which solution is more natural to me at this point :-). I am wondering which solution you prefer, and why? Let me know here.