It's reached that time for me. I will be retiring from MathWorks at the end of March 2022. It's been 35 years of tremendous growth for MathWorks, and for me.

I still love math and linear algebra. I've had the great good fortune to work with MathWorks founders, including my colleague and friend, Cleve Moler. I am grateful to Jack Little, who gave me (or I took!) opportunities that I never would have dreamt of - helping create world-class software, and learning new domains in depth, such as some areas in signal and image processing.

We have added great new functionality and technology over the years, enabling you to write code in many styles, ranging from quick one-off calculations to large systems that can be shared and deployed for many to use.

MATLAB started as a language for matrix computation and evolved to be replete with tools for a wide array of technical problems in technical computing and model-based design.

During that time, we have continued to focus on what you, our users, need. Sometimes it was completely new features, sometimes it was making certain features easier to use, and sometimes it was to make your code run faster (a focus from our teams all the time, even in the presence of other goals).

I have been lucky to help MathWorks by relocating to Europe several times, for months at a time. In addition to those travels far and wide, the most rewarding part of my job has been talking to you: users, researchers, engineers, scientists, professors, deans, rectors, and heads of large groups at small and large educational and research institutions, as well as at a broad range of companies.

From you, I have learned a lot, including things about MATLAB that I didn't know! And certainly things we could do to improve MATLAB. Please keep these suggestions and hassles coming our way. I've also learned plenty about science and engineering. And lots about the world we live in. What a beautiful place with an incredible collection of talented people.

These days I am very glad to see that the issue of inclusion of women and minorities is gaining the traction and focus that it deserves in STEM. In the broader world, and on a daily basis at MathWorks, there is evidence of progress. I didn't think early in my career that we'd have to work so hard to get there, but it's been totally worth every bit of effort.

The first major activity I am planning after I retire is taking a weaving class to create a Mobius scarf! So I'm not straying so far from all the technical stuff I love, but venturing into a more visually artistic rendition.

In the meantime, I leave you in the capable hands of my colleagues here at MathWorks. I hope you'll keep reading as Mike Croucher will very soon continue the tradition of blogging about MATLAB. May you all have prosperous futures and satisfaction as you learn new things and help the world become a better place.

Any comments? Leave them here.

A few nostalgic pictures for you.

Cleve and Loren at MathWorks 25th anniversary

MathWorks Team in the early days - our first real office, lines show me in the front middle, Cleve on far left.

Worldwide gathering of MathWorks for 35th anniversary, lines showing locations for Cleve (right) and me (left)

Copyright 2022 The MathWorks, Inc.

Today's guest blogger is Matt Tearle, who works on the team that creates our online training content, such as our various Onramp courses to get you started on MATLAB, Simulink, and applications. Matt has written several blog posts here in the past, usually prompted by a puzzle - and today is no different.

Wordle. It has captured us all. When Adam wrote his post about using MATLAB to solve Wordle puzzles, I had been thinking about doing exactly the same thing. (In the past, I have written code to cheat at Wordscapes and the NY Times Spelling Bee puzzle.) I've seen other friends post about letter distributions. I guess this is what nerds do.

When I read Adam's post, I knew I had to see if I could do better. My first thought was what reader Peter Wittenberg suggested: weighting letter probabilities by where they occurred in the word. I then tried something close to what another reader, TallBrian, suggested, by scoring words according to how much they cut down the possibilities in the next turn. I also experimented with how to choose the word with the most new letters.

But nothing worked. I couldn't make any significant improvement on Adam's 94% success rate. He had noted that there were some words in the official Wordle canon that were not in the set used to develop the algorithm. I was getting suspicious. My data-senses were tingling.

According to the letter probabilities, TARES is a great opening guess. But I felt like I hadn't seen many words ending in -S when playing Wordle as a human. Maybe it was time to compare the two word sets. Here I'm just repurposing Adam's code to get the two word lists. The dictionary words (called word5 in Adam's code) are in the string array trainwords. The actual Wordle list (called mystery_words in Adam's code) is in testwords.

[trainwords,testwords] = getdictionaries;

whos

The variable names show my bias: I was now thinking about this like a machine learning problem. One set of words had been used to train an algorithm - in this case, not a standard machine learning method, but a bespoke algorithm based on statistics. The other was the test set. Anyone who does machine learning knows that the quality of your model depends critically on the quality of your data. Specifically, you need your training data to accurately represent the actual data your model will be used on.

How do the letter distributions for the "training" and "test" data sets compare? First, I need to count the number of appearances of each letter in each location (for both sets of words):

% Make a list of all capital letters (A-Z)

AZ = string(char((65:90)'));

% Calculate distribution of letters

lprobTrain = letterdistribution(trainwords,AZ);

lprobTest = letterdistribution(testwords,AZ);

whos

I now have two 26-by-5 matrices of the frequency of each letter of the alphabet in each position. First, let's see the overall distribution:

% Average across the 5 letter positions

distTrain = mean(lprobTrain,2);

distTest = mean(lprobTest,2);

% Plot

bar([distTrain,distTest])

xticks(1:26)

xticklabels(AZ)

xtickangle(0)

legend("Training","Test")

ylabel("Proportion of uses")

title("Total letter distributions")

Wow, something's going on with S. Sure enough, there's a big difference between the usage in the two word lists. There are also more Ds in the training set than in the actual Wordle set. Meanwhile, there are several letters that are used more in the Wordle set than the general dictionary - most notably, R, T, and Y.

Now I'm even more suspicious of words ending in -S. Let's look at the full picture of usage by letter and word location. If you do any data analysis, you will likely have encountered this situation of wanting to visualize values as a function of two discrete variables. In the past, you may have used imagesc for this, and that's certainly a reliable old workhorse. But if you don't religiously read the release notes, you may not be aware of some of the new data analysis charting functions, such as heatmap (introduced in R2017a).

heatmap(1:5,AZ,lprobTrain)

title("Letter distribution for TRAINING data")

heatmap(1:5,AZ,lprobTest)

title("Letter distribution for TEST data")

Ah, sweet vindication! Sure enough, the words we built our strategy on have a different distribution of where the common letters appear than the actual Wordle words do. In particular, note that Wordle is much more likely to start a word with S than end with one. The distribution of where Es show up is different, too. To make these kinds of comparisons, it might be easier to visualize the difference between the two heatmaps:

% Take the difference in distributions

delta = lprobTrain-lprobTest;

% Find the biggest, to set the color limits symmetrically

dl = max(abs(delta(:)));

% Visualize

hm = heatmap(1:5,AZ,delta,"ColorLimits",dl*[-1 1],"Colormap",turbo);

title(["Difference in letter distribution","Red = higher prevalence in training set","Blue = higher prevalence in test set"])

Notice that I've set the color limits so that 0 is a "neutral" green, while blue or red shows a difference in either direction. The lack of -S words shows up clearly, as does the shift from -E- to -E at the end of words. Together, this suggests fewer plurals (like WORDS) and fewer -ES verb forms (e.g., "Adam LOVES playing Wordle") in the Wordle list.

There are a few other details in there, but they're harder to see because everything is so dominated by the difference in -S words. Let's manually narrow the color range to see more details. This will show -S as less significant than it is, but that's OK - we know about that already.

hm.ColorLimits = [-0.1 0.1];

Now we can see some interesting trends: more -Y words (presumably adjectives like "this is a SILLY topic for a blog post"), as well as more -R and -T, fewer -D words (perhaps -ED past tenses like "until Adam's post, Matt LOVED playing Wordle"), fewer instances of vowels as the second letter, and R and S switching at positions 1 & 2.

Using this heatmap, you can also see that TARES is much more aligned with the training (dictionary) set than the test (Wordle) set: every letter has a positive value in the heatmap. So while the letters are all high probability overall, this specific arrangement is particularly good for the training set and bad for the test set. A simple rearrangement to STARE reverses the situation.

I noticed that my code jumps freely between strings, chars, and categoricals. You can see some of that in the code here, but my Wordle solver was even more liberal in its use of the different types. That might seem like evidence of bad programming - "pick a data type already!", you cry - but I'm claiming that this is actually a good practice: MATLAB gives you lots of great data types; use them! With the introduction of strings (R2016b), we get questions like "so should we just use strings now?" and "is there any point in char instead of string?". If you're confused about this, here's a simple principle: the unit of a char is a single character, the unit of a string is text of any length. Wordle is all about words... but also all about the letters! That's why it's useful to use both string (for studying words) and char (for letters).
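To make that principle concrete, here's a minimal sketch (the variable names are my own):

```matlab
w = "STARE";   % a 1x1 string: one unit of text, the whole word
c = char(w);   % a 1x5 char vector: five individual characters
numel(w)       % 1  -- the string counts as a single element
numel(c)       % 5  -- the char vector has one element per letter
c(2)           % 'T' -- indexing a char picks out a single letter
```

So when I'm reasoning about whole words, string is the natural fit, and when I'm poking at individual letters, char is.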

Also, our dedicated team of developers gave us a whole pile of handy text functions along with strings. But those functions aren't just for strings - like many MATLAB functions, they accept different kinds of inputs. They accept text in any form and allow you to do basic text processing without regular expressions. For example, I've hypothesized that Wordle doesn't use -ES and -ED verb forms. Let's see the words that have those endings in each list:

ESDendings = @(words) words(endsWith(words,["ES","ED"]));

ESDtrain = ESDendings(trainwords)'

ESDtest = ESDendings(testwords)'

100*numel(ESDtrain)/numel(trainwords)

100*numel(ESDtest)/numel(testwords)

The handy endsWith function does what its name suggests and finds the words with the given endings. Sure enough, the training data from the dictionary has many pairs of -ES and -ED verbs, like ACHED and ACHES (13.5% of the whole list). But Wordle has almost no words that are made by simply appending -ED or -ES to a 4-letter verb. Consequently, the prevalence of -ED and -ES words is much lower (only 1%).

Having confirmed that the letter distributions were indeed different, I was able to salvage my pride by building my various solution algorithms with the Wordle list and then testing them. Now I was able to successfully solve the puzzle 99% of the time. Great. But also a little unsatisfying. As any data scientist knows, training and testing with the same data set is cheating and not a good measure of how well your algorithm will perform on new data.

But... well, there is no new data. The Wordle word list is set. So it remains a valid question: given the official Wordle list, what is the best way to solve it?

Unfortunately, readers pointed out some issues with what Adam had done (which I had followed). That casts doubt on my "solutions". So for now, I'll need to keep tinkering. If the New York Times hasn't hidden Wordle away behind a paywall by the time I figure it out, I'll be back. I might even be brave enough to enter the internet-argument-of-the-day: what is the best starting word?

Adam's readers had some clever ideas on how they would go about beating this addictive game. Do any of you have a guaranteed opening word? A secret strategy that you will reveal to subscribers for only $19.95? How did you find yours and what makes it so great? Let us know in the comments.

function [word5,mystery_words] = getdictionaries

% Copied from Adam F

% read the list of words into a string array

r = readlines("https://gist.githubusercontent.com/wchargin/8927565/raw/d9783627c731268fb2935a731a618aa8e95cf465/words");

% replace diacritics using a custom function from the Appendix

rs = removediacritics(r);

% keep only the entries that start with a lower case letter

rs = rs(startsWith(rs,characterListPattern("a","z")));

% get rid of entries with apostrophes, like contractions

rs = rs(~contains(rs,"'"));

% Wordle uses all upper case letters

rs = upper(rs);

% get the list of unique five letter words

word5 = unique(rs(strlength(rs)==5));

mystery_id = "1-M0RIVVZqbeh0mZacdAsJyBrLuEmhKUhNaVAI-7pr2Y"; % taken from the sheet's URL linked above

mystery_url = sprintf("https://docs.google.com/spreadsheets/d/%s/gviz/tq?tqx=out:csv",mystery_id);

mystery_words = readlines(mystery_url);

% there's an extra set of double quotes included, so let's strip them out

mystery_words = erase(mystery_words,"""");

% also we're using upper case

mystery_words = upper(mystery_words);

end

function lprob = letterdistribution(words,AZ)

% split our words into their individual letters

letters = split(words,"");

% this also creates leading and trailing blank strings, drop them

letters = letters(:,2:end-1);

% Calculate the distribution of letters in each word position

lcount = zeros(numel(AZ),5); % preallocate the counts matrix

for k = 1:5

lcount(:,k) = histcounts(categorical(letters(:,k),AZ));

end

lprob = lcount./sum(lcount); % Normalize

end

% Also from Adam

% citation: Jim Goodall, 2020. Stack Overflow, available at: https://stackoverflow.com/a/60181033

function [clean_s] = removediacritics(s)

%REMOVEDIACRITICS Removes diacritics from text.

% This function removes many common diacritics from strings, such as

% á - the acute accent

% à - the grave accent

% â - the circumflex accent

% ü - the diaeresis, or trema, or umlaut

% ñ - the tilde

% ç - the cedilla

% å - the ring, or bolle

% ø - the slash, or solidus, or virgule

% uppercase

s = regexprep(s,'(?:Á|À|Â|Ã|Ä|Å)','A');

s = regexprep(s,'(?:Æ)','AE');

s = regexprep(s,'(?:ß)','ss');

s = regexprep(s,'(?:Ç)','C');

s = regexprep(s,'(?:Ð)','D');

s = regexprep(s,'(?:É|È|Ê|Ë)','E');

s = regexprep(s,'(?:Í|Ì|Î|Ï)','I');

s = regexprep(s,'(?:Ñ)','N');

s = regexprep(s,'(?:Ó|Ò|Ô|Ö|Õ|Ø)','O');

s = regexprep(s,'(?:Œ)','OE');

s = regexprep(s,'(?:Ú|Ù|Û|Ü)','U');

s = regexprep(s,'(?:Ý|Ÿ)','Y');

% lowercase

s = regexprep(s,'(?:á|à|â|ä|ã|å)','a');

s = regexprep(s,'(?:æ)','ae');

s = regexprep(s,'(?:ç)','c');

s = regexprep(s,'(?:ð)','d');

s = regexprep(s,'(?:é|è|ê|ë)','e');

s = regexprep(s,'(?:í|ì|î|ï)','i');

s = regexprep(s,'(?:ñ)','n');

s = regexprep(s,'(?:ó|ò|ô|ö|õ|ø)','o');

s = regexprep(s,'(?:œ)','oe');

s = regexprep(s,'(?:ú|ù|ü|û)','u');

s = regexprep(s,'(?:ý|ÿ)','y');

% return cleaned string

clean_s = s;

end

Today's guest blogger is Adam Filion, a Senior Data Scientist at MathWorks. Adam has worked on many areas of data science at MathWorks, including helping customers understand and implement data science techniques, managing and prioritizing our development efforts, building Coursera classes, and leading internal data science projects.

My wife recently introduced me to the addictive puzzle game Wordle. In the game, you make a series of guesses to figure out the day's secret answer word. The answer is always a five letter English word, and you have six attempts to guess the right answer. After each guess, the game gives you some information about how close you are to the answer.

Figure 1: Examples of how Wordle gives feedback.

I've always been more of a numbers guy, so after getting stuck on a daily Wordle I decided to see if I could make better guesses using MATLAB. In this post, I'll walk through a simple method of generating suggestions for the Wordle game that can get the right answer within six guesses 94% of the time without knowing Wordle's official word list. The example puzzle is from Jan 12, 2022.

Figure 2: A blank Wordle puzzle. Six guesses remaining!

Table of Contents

Generate our vocabulary
Find the most commonly used letters
Create a score for each word
Choose a word and make our first guess
Account for Wordle's feedback
Make our second guess
Make our third guess
Make our fourth guess
Make our fifth guess
Make our sixth and final guess
Play a random game of Wordle
Play all possible games of Wordle
Areas for improvement
Appendix

If we're going to play games of Wordle, we need a vocabulary list of five letter English words. Fans of the game have already scraped the Wordle source code and shared the list of 2,315 mystery words and 12,972 guessable words (thanks FiveThirtyEight!). We'll come back to the mystery words later to check our accuracy but using that in our solver feels a bit like cheating, so let's pretend we don't know what list Wordle uses. There isn't a single comprehensive list of English words, so let's pick a common source for coders looking for a list of English words, a list that unix systems provide under /usr/share/dict/words. If you're on Windows, you can find the same list on places like github. We can easily read text files like this directly off the web for text processing using readlines. This list includes acronyms and proper nouns, which we can remove by ignoring entries that start with a capital letter. While it doesn't contain the full English language, it gives us a list of 4,581 five letter words to play with. We'll probably be missing some of the words in Wordle's mystery list but it should still be close enough to make helpful suggestions.

% read the list of words into a string array

r = readlines("https://gist.githubusercontent.com/wchargin/8927565/raw/d9783627c731268fb2935a731a618aa8e95cf465/words");

% replace diacritics using a custom function from the Appendix

rs = removediacritics(r);

% keep only the entries that start with a lower case letter

rs = rs(startsWith(rs,characterListPattern("a","z")));

% get rid of entries with apostrophes, like contractions

rs = rs(~contains(rs,"'"));

% Wordle uses all upper case letters

rs = upper(rs);

% get the list of unique five letter words

word5 = unique(rs(strlength(rs)==5))

Now we have our list of five letter words, but how to pick which word to guess first? Our first guess is made blind, with no clues to the final answer. Since Wordle gives feedback by letter, an easy method is to pick the word that has the most commonly used letters.

Let's start by splitting each word into its letters and looking at the overall histogram of letters. We can see that some letters are used vastly more often than others.

% split our words into their individual letters

letters = split(word5,"");

% this also creates leading and trailing blank strings, drop them

letters = letters(:,2:end-1);

% view the counts of letter use

h = histogram(categorical(letters(:)));

ylabel("Number of uses in five letter words")

Let's put this in a table for use in creating word scores.

lt = table(h.Categories',h.Values','VariableNames',["letters","score"])

We can now create a word score based on the popularity of the letters it uses. Start by replacing each letter with its individual score, then adding up the letter scores to create word scores.

% for each letter, replace it with its corresponding letter score

letters_score = arrayfun(@(x) lt.score(lt.letters==x),letters);

% sum the letter scores to create word scores

word_score = sum(letters_score,2);

% find the top scores and their corresponding words

[top_scores,top_idx] = sort(word_score,1,"descend");

word_scores = table(word5(top_idx),top_scores,'VariableNames',["words","score"]);

While I'm no game theorist, it seems obvious our opening move should be one that uses five different and popular letters to maximize the chance we'll get useful feedback to narrow down our search. After removing words with repeated letters, we see AROSE is the top choice for first word so let's try that.

% find how many unique letters are in each word

word_scores.num_letters = arrayfun(@(x) numel(unique(char(x))),word_scores.words);

% keep only the words with no repeated letters

top_words_norep = word_scores(word_scores.num_letters==5,:);

head(top_words_norep)

Figure 3: Our Wordle puzzle after making our first guess.

After submitting our first guess, we can see that three of the letters, A, R, and O, are in the final answer but in different positions. The letters S and E are not in the word at all. This feedback eliminates a huge number of possible words.

Now that we have this feedback, how can we incorporate it? It's a fairly simple matter of representing the feedback received and then looping through those results, eliminating words that are no longer possible solutions. We do so in the filter_words helper function found in the Appendix. We pass in our table of words and their scores, the words we've guessed so far, and the encoded results of those guesses. The results are encoded as a matrix with one row per guess and one column per letter: if the letter is incorrect, it is encoded as 0; if the letter is in the answer but not in that position, as 1; and if it is in the correct position, as 2.

We're off to a good start! Passing this information to filter_words, we've narrowed our candidates down from 4,581 words to just 35.

% our previous guesses

guesses = "AROSE";

% encode the feedback

results = [1,1,1,0,0];

% filter down to the remaining candidates

top_words_filtered = filter_words(word_scores,guesses,results)

We can see the top score for the next word is TAROT, but at this point we're probably better off still using words with five unique letters, so let's try RATIO.

Figure 4: Our Wordle puzzle after making our second guess.

Now the "A" is in the right location, and we've eliminated two more popular letters. After adding in this information, there are only 10 candidates left and CAROL is the next top choice.

% our previous guesses

guesses = ["AROSE";"RATIO"];

% encode the feedback

results = [1,1,1,0,0;

1,2,0,0,1];

% filter down to the remaining candidates, no requirement on unique letters

top_words_filtered = filter_words(word_scores,guesses,results)

Figure 5: Our Wordle puzzle after making our third guess.

Now we've got two letters in the right spot, and by process of elimination we know "R" must come last. Adding this info, we see there's only five choices left and three of them start with M so let's go with MANOR.

% our previous guesses

guesses = ["AROSE";"RATIO";"CAROL"];

% encode the feedback

results = [1,1,1,0,0;

1,2,0,0,1;

0,2,1,2,0];

% filter down to the remaining candidates

top_words_filtered = filter_words(word_scores,guesses,results)

Figure 6: Our Wordle puzzle after making our fourth guess.

And now we're left with two choices for two guesses.

% our previous guesses

guesses = ["AROSE";"RATIO";"CAROL";"MANOR"];

% encode the feedback

results = [1,1,1,0,0;

1,2,0,0,1;

0,2,1,2,0;

0,2,0,2,2];

% filter down to the remaining candidates

top_words_filtered = filter_words(word_scores,guesses,results)

Figure 7: Our Wordle puzzle after our fifth guess.

One option left for our final guess. Fingers crossed!

% our previous guesses

guesses = ["AROSE";"RATIO";"CAROL";"MANOR";"VAPOR"];

% encode the feedback

results = [1,1,1,0,0;

1,2,0,0,1;

0,2,1,2,0;

0,2,0,2,2;

1,2,0,2,2];

% filter down to the remaining candidates

top_words_filtered = filter_words(word_scores,guesses,results)

Figure 8: Our Wordle puzzle after our sixth guess. Success!

So, it worked out with this Wordle puzzle, but it took all six guesses so we cut it close. How well will this work in general?

If MATLAB knows what the answer is, we can automate the process of playing a game of Wordle and see if our algorithm will correctly guess it. We'll start by creating another helper function wordle_feedback in the Appendix to encode the feedback we receive for each guess based on the correct answer.
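As a quick sanity check of that encoding, here's a hypothetical answer/guess pair run through wordle_feedback (defined in the Appendix):

```matlab
% Answer SIEGE, guess AROSE: A, R, and O are absent (0), S is present
% but misplaced (1), and the final E is in the correct position (2)
wordle_feedback("SIEGE","AROSE")   % returns [0 0 0 1 2]
```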

Now we can automatically play a game using our play_wordle helper function. This accepts our table of five letter words and their scores, along with a word to serve as the answer. It will return the answer we were trying to guess, whether or not we won while playing, and the guesses made along the way. As we play, we'll require that our first three guesses use no repeating letters (assuming such words are still possible), but from the fourth guess on letters can repeat.

Since we know where to find a list of the mystery words, we can read it from the Google sheet directly into MATLAB.

mystery_id = "1-M0RIVVZqbeh0mZacdAsJyBrLuEmhKUhNaVAI-7pr2Y"; % taken from the sheet's URL linked above

mystery_url = sprintf("https://docs.google.com/spreadsheets/d/%s/gviz/tq?tqx=out:csv",mystery_id);

mystery_words = readlines(mystery_url);

% there's an extra set of double quotes included, so let's strip them out

mystery_words = erase(mystery_words,"""");

% also we're using upper case

mystery_words = upper(mystery_words);

Our algorithm can only guess words from the vocabulary we gave it. About 4% of mystery words are missing from our vocabulary, so even if we play perfectly using the words we know, the best win rate we can expect is 96%.

num_missing = sum(~ismember(mystery_words,word_scores.words))

perc_missing = num_missing / numel(mystery_words) * 100

Now that we have the mystery list, we can play a game with a random answer to guess.

answer_idx = randi(numel(mystery_words));

[answer,win,played_words] = play_wordle(word_scores,mystery_words(answer_idx))

We can test our algorithm across the entire 2,315 mystery word vocabulary by running in a loop. We can see that this simple approach will get us the right answer within six guesses about 94% of the time, which is pretty close to the maximum possible of 96%! When we do win, we'll most commonly win in four guesses.

num_games = numel(mystery_words);

wins = nan(num_games,1);

guesses = strings(num_games,6);

answers = strings(num_games,1);

for ii = 1:num_games % for each word in our vocabulary

% play a game of Wordle where that word is the answer we're guessing

[answers(ii),wins(ii),guesses(ii,:)] = play_wordle(word_scores,mystery_words(ii));

end

fprintf("This strategy results in winning ~%0.1f%% of the time.\n",sum(wins)/numel(wins)*100)

num_guesses = sum(guesses(wins==1,:)~="",2);

histogram(num_guesses,"Normalization","probability")

xlabel("Number of guesses when winning Wordle")

ylabel("Fraction of victories")

Here's how the game went for one of the answers we didn't guess correctly.

missed_answers = answers(wins==0);

[answer,win,played_words] = play_wordle(word_scores,missed_answers(1))

There seem to be two patterns in the missed answers.

- As mentioned above, about 4% of answers aren't in our vocabulary, such as with RAMEN and ZESTY. You can tell when this happens because we lose the game without using all our guesses due to running out of allowable words.
- Some answers combine a common letter pattern with a rarely used letter, and we didn't have enough guesses to narrow it down. For example, when the answer is FIXER, there are 39 words in our vocabulary that use "I" in the second position and "ER" at the end. Out of all of them FIXER has the lowest word score due to F and X both being in the bottom seven least used letters. Our six guesses go AROSE, LITER, DINER, RIPER, HIKER, FIBER and we run out of guesses before getting to FIXER.

What are some other things we could try to get our win rate to 100%? Here's a few ideas:

- We identified the two main patterns to missed answers above. Clearly the first pattern could be resolved just by adding Wordle's mystery words to our vocabulary.
- A solution to the second pattern is less clear. One drawback of our current word scoring approach is that the scores are static, so if a word like FIXER starts with a lower score, that will never change. We could potentially get a few more correct guesses by updating our score as we play by removing the ineligible words and/or solved letter positions from the score computation.
- We could also try improving our scoring method by looking for common patterns, called n-grams. Most commonly, n-grams are used to find common word combinations, but they can also be used to find common letter combinations. We could extract the top letter n-grams and incorporate them into our score, since guessing a word with a common n-gram will get us feedback on many similar words.
- We're already requiring that our first three guesses use non-repeating letters, which is a strategy I picked through trial-and-error and may not be optimal. We could also use non-overlapping words on the first few guesses, even if we already got some letters correct. This would require us to always use 10 unique letters across our first two guesses, even if we have to make guesses we know can't be correct in order to do so. I experimented with using this universally and it actually decreases the overall win rate very slightly, but there may be a smarter way to use it situationally.
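As a rough sketch of the letter n-gram idea above (the variable names are hypothetical, and word5 is the vocabulary built earlier), we could tally every adjacent letter pair and later fold the counts into the word scores:

```matlab
% collect every bigram (adjacent letter pair) from the five letter words
bigrams = strings(0,1);
for k = 1:4
    bigrams = [bigrams; extractBetween(word5,k,k+1)]; %#ok<AGROW>
end

% count the occurrences of each distinct bigram
[counts,pairs] = groupcounts(bigrams);

% sort to surface the most common pairs
[counts,order] = sort(counts,"descend");
pairs = pairs(order);
```

A word's score could then be augmented by the counts of the four bigrams it contains, analogous to the single-letter scoring used here.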

function word_scores_filtered = filter_words(word_scores,words_guessed,results)

% remove words_guessed since those can't be the answer

word_scores_filtered = word_scores;

word_scores_filtered(matches(word_scores_filtered.words,words_guessed),:) = [];

% filter to words that have correct letters in correct positions (green letters)

[rlp,clp] = find(results==2);

if ~isempty(rlp)

for ii = 1:numel(rlp)

letter = extract(words_guessed(rlp(ii)),clp(ii));

% keep only words that have the correct letters in the correct locations

word_scores_filtered = word_scores_filtered(extract(word_scores_filtered.words,clp(ii))==letter,:);

end

end

% filter to words that also contain correct letters in other positions (yellow letters)

[rl,cl] = find(results==1);

if ~isempty(rl)

for jj = 1:numel(rl)

letter = extract(words_guessed(rl(jj)),cl(jj));

% remove words with letter in same location

word_scores_filtered(extract(word_scores_filtered.words,cl(jj))==letter,:) = [];

% remove words that don't contain letter

word_scores_filtered(~contains(word_scores_filtered.words,letter),:) = [];

end

end

% filter to words that also contain no incorrect letters (grey letters)

[ri,ci] = find(results==0);

if ~isempty(ri)

for kk = 1:numel(ri)

letter = extract(words_guessed(ri(kk)),ci(kk));

% remove words that contain incorrect letter

word_scores_filtered(contains(word_scores_filtered.words,letter),:) = [];

end

end

end % filter_words

function results = wordle_feedback(answer, guess)

results = nan(1,5);

for ii = 1:5 % for each letter in our guess

letter = extract(guess,ii); % extract that letter

if extract(answer,ii) == letter

% if answer has the letter in the same position

results(ii) = 2;

elseif contains(answer,letter)

% if answer has that letter in another position

results(ii) = 1;

else

% if answer does not contain that letter

results(ii) = 0;

end

end

end % wordle_feedback
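As a quick sanity check of the feedback encoding (2 = green, 1 = yellow, 0 = grey), here is how the function above scores a hypothetical answer/guess pair (both words chosen just for illustration):

```matlab
% Hypothetical answer and guess, just to show the encoding
results = wordle_feedback("crane", "slate")
% results is [0 0 2 0 2]: "a" and "e" are green, the rest grey
```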

function [word_to_guess,win,guesses] = play_wordle(word_scores, word_to_guess)

top_words = sortrows(word_scores,2,"descend"); % ensure scores are sorted

guesses = strings(1,6);

results = nan(6,5);

max_guesses = 6;

for ii = 1:max_guesses % for each of our guesses

% filter our total vocabulary to candidate guesses using progressively different strategies

if ii == 1 % for our first guess, filter down to words with five unique letters and take top score

top_words_filtered = top_words(top_words.num_letters==5,:);

elseif ii <= 3 % if we're generating our second or third guess

% filter out ineligible words and require five unique letters if possible

min_uniq = 5;

top_words_filtered = filter_words(top_words(top_words.num_letters==min_uniq,:),guesses(1:ii-1),results(1:ii-1,:));

% if filtering to five unique letters removes all words, allow more repeated letters

while height(top_words_filtered) == 0 && min_uniq > min(word_scores.num_letters)

min_uniq = min_uniq - 1;

top_words_filtered = filter_words(top_words(top_words.num_letters==min_uniq,:),guesses(1:ii-1),results(1:ii-1,:));

end

else % after third guess, set no restrictions on repeated letters

top_words_filtered = filter_words(top_words,guesses(1:ii-1),results(1:ii-1,:));

end

% generate our guess (if we have any)

if height(top_words_filtered) == 0 % if there are no eligible words in our vocabulary

win = 0; % we don't know the word and we've lost

return % make no more guesses

else % otherwise generate a new guess and get the results

guesses(1,ii) = top_words_filtered.words(1);

results(ii,:) = wordle_feedback(word_to_guess,guesses(1,ii));

end

% evaluate if we've won, lost, or should keep playing

if guesses(1,ii) == word_to_guess % if our guess is correct

win = 1; % set the win flag

return % make no more guesses

elseif ii == max_guesses % if we've already used all our guesses and they're all wrong

win = 0; % we've lost and the loop will end

else % otherwise we're still playing

end

end

end % play_wordle

% citation: Jim Goodall, 2020. Stack Overflow, available at: https://stackoverflow.com/a/60181033

function [clean_s] = removediacritics(s)

%REMOVEDIACRITICS Removes diacritics from text.

% This function removes many common diacritics from strings, such as

% á - the acute accent

% à - the grave accent

% â - the circumflex accent

% ü - the diaeresis, or trema, or umlaut

% ñ - the tilde

% ç - the cedilla

% å - the ring, or bolle

% ø - the slash, or solidus, or virgule

% uppercase

s = regexprep(s,'(?:Á|À|Â|Ã|Ä|Å)','A');

s = regexprep(s,'(?:Æ)','AE');

s = regexprep(s,'(?:ß)','ss');

s = regexprep(s,'(?:Ç)','C');

s = regexprep(s,'(?:Ð)','D');

s = regexprep(s,'(?:É|È|Ê|Ë)','E');

s = regexprep(s,'(?:Í|Ì|Î|Ï)','I');

s = regexprep(s,'(?:Ñ)','N');

s = regexprep(s,'(?:Ó|Ò|Ô|Ö|Õ|Ø)','O');

s = regexprep(s,'(?:Œ)','OE');

s = regexprep(s,'(?:Ú|Ù|Û|Ü)','U');

s = regexprep(s,'(?:Ý|Ÿ)','Y');

% lowercase

s = regexprep(s,'(?:á|à|â|ä|ã|å)','a');

s = regexprep(s,'(?:æ)','ae');

s = regexprep(s,'(?:ç)','c');

s = regexprep(s,'(?:ð)','d');

s = regexprep(s,'(?:é|è|ê|ë)','e');

s = regexprep(s,'(?:í|ì|î|ï)','i');

s = regexprep(s,'(?:ñ)','n');

s = regexprep(s,'(?:ó|ò|ô|ö|õ|ø)','o');

s = regexprep(s,'(?:œ)','oe');

s = regexprep(s,'(?:ú|ù|ü|û)','u');

s = regexprep(s,'(?:ý|ÿ)','y');

% return cleaned string

clean_s = s;

end
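For example, applying the function above to a couple of accented strings (inputs chosen just for illustration):

```matlab
% Hypothetical inputs for illustration
clean1 = removediacritics("Málaga")   % returns "Malaga"
clean2 = removediacritics("Œuvre")    % returns "OEuvre"
```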

In the early 1990s, to avoid eval and all of its quirks (if you don't know about this, DON'T look it up - it's totally discouraged), we recommended using feval for evaluating functions that might... read more >>

]]>In the early 1990s, to avoid eval and all of its quirks (if you don't know about this, DON'T look it up - it's totally discouraged), we recommended using feval for evaluating functions that might not be known until supplied by the user running the code. We used this, for example, for evaluating numeric integrals. We wanted to leave the integrand completely flexible and up to the user. Yet the integrator had to be able to evaluate the user function, an unknown at the time of creating the integrator.

function I=integ(fcn,fmin,fmax,tol)

if ~ischar(fcn)

error(...)

end

% figure out some initial points to evaluate

pts = linspace(fmin, fmax, 20);

fv = feval(fcn,pts);

I = ...

:

end

This had the advantage of not asking MATLAB to "poof" any variables into the workspace. It also helped avoid situations where a function and a variable could have the same name, thereby possibly not giving you the version of the name you expected. The way you used feval at that time was generally via a character array identifying the function to be called.

I am only considering the use of feval in the context of characters or strings in MATLAB, and not for some of the more specialized versions such as working with GPUs.

You would call the integration function like this.

area = integ('myfun', 0, pi);

Today, with function handles, we can bypass using feval and use the function handle directly.

function I=integ(fcn,fmin,fmax,tol)

if ~isa(fcn, 'function_handle')

% might still be nice to allow chars for backward compatibility - but not be permissive about allowing new "strings".

% if ~isa(fcn, 'function_handle') && ~ischar(fcn)

error(...)

end

% figure out some initial points to evaluate

pts = linspace(fmin, fmax, 20);

fv = fcn(pts);

I = ...

:

end

Call it like this.

area = integ(@myfun, 0, pi);

This is useful for at least a couple of reasons:

- It is generally faster, if only by a bit, because there is one less indirection of function calls.
- It evaluates the function for the handle you supply - and, as a result, can't get confused by other possible name conflicts. Inside the integrator, we have complete control over the name of the function (called fcn in the code), and since it's a function handle, it can't conflict with anything else we may have around in the function or environment.
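To make the contrast concrete, here is a minimal sketch (the integrand is a made-up example) showing the legacy feval styles next to the direct call through a handle:

```matlab
f = @(x) sin(x).^2;        % an anonymous function handle

v1 = f(pi/4);              % direct call through the handle - preferred today
v2 = feval(f, pi/4);       % feval also accepts handles; same result
v3 = feval('sin', pi/2);   % legacy style: a character array naming the function
```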

I know we use feval for some cases of working with GPUs, but I can't think of any typical MATLAB case where I still need to use feval instead of directly applying the function. Do you still use feval, perhaps where it's no longer needed? Let us know here.

Copyright 2022 The MathWorks, Inc.

]]>I have seen a lot of code in my life, including code from many different people written for many different purposes, and in many different "styles". These styles range from quick-and-dirty (I only need to do this once) to fully optimized, documented, and tested (I want this to last a long time while other people use it). I have found, a bit more often than I expected, that the quick-and-dirty code quickly morphs into something useful and used a lot, but without the thought and care of making sure the code is really up to the task.

Today I want to argue that, as soon as you pick your quick-and-dirty code back up to use again, it is time to refactor and apply some good engineering techniques to whip it into shape.

First, break the code into logical units that are small. Each unit is more concise and more focused, does less, and makes it clear what it does and does not do. Decide on edge cases and error conditions and deal with these in a way that is straightforward and likely to cause users of this module the least amount of trouble.

As I already said, each part does less and so it's easier to understand what each piece does. In fact, if you can get the piece of code to do one thing well, that often pays off. One technique for doing this is to reduce the branching (if-elseif-else) and instead have various branches relegated to separate functions. Another technique is to use an arguments block to check the input arguments. This generally takes up fewer lines of code and you are able to be concise and precise using it.
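For instance, here is a minimal sketch of an arguments block; the function name and validators are illustrative, not from any particular project:

```matlab
function out = scaleData(data, factor)
% Scale a matrix by a positive factor - a made-up example to show the syntax.
    arguments
        data (:,:) double {mustBeFinite}          % any finite 2-D double matrix
        factor (1,1) double {mustBePositive} = 1  % positive scalar, with a default
    end
    out = data * factor;
end
```

The validation is declarative and lives in one place, instead of a series of if/error checks at the top of the function.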

When you have smaller components, they are frequently easier to understand, and much easier to test - especially if there is a very limited number of code paths. You can check the complexity of your code using either of two options to checkcode.

checkcode(filename, "-cyc")

checkcode(filename, "-modcyc")

Complexity is reduced when you refactor the code so there are not so many nested statements like if and switch statements.

With smaller pieces, each part is easier to test and debug, and it's easier to check edge conditions. You can more readily be certain you are covering all the bases - especially, of course, if you use one of the testing frameworks.

Since we're working with smaller units, it's generally easier to reuse the pieces because, I hope, we've split things up in such a way that the interfaces are simpler to use. With each piece small, there's usually only a limited number of inputs required.

With smaller input argument lists, the calling syntax is shorter. If you call functions that take optional arguments from within your function, then the greater the code complexity, the more likely the code gets indented further and further to the right - either pushing code out of view in your window, or stretching the argument list over several lines, especially if you are using the new name=value syntax. In that case, each "input" is potentially long, possibly causing extra line continuations to fit inside the editor window margins you have set.

I have an example from the File Exchange, X Steam. If you look at the code, there are lots of switch/case statements with if-else statements inside. How can you easily debug something near a particular line? It's hard! And that's despite the code not being poorly written or structured, apart from this nesting.

checkcode Xsteam.m -cyc

With this many paths through the code - 322 - what are the chances that there are no issues? If I were using this for some work I wanted to publish, I would need to make sure that all the paths I used were correctly computing what I need. Since that's a hassle, I'd likely refactor the code.

Apart from going into the code and copying/pasting elsewhere (which could be in the same file), you can use the refactoring tools in the editor toolstrip or in the right-click context menu once you've made your selection.

Do you let the code rule you or do you rule your code? Please post any additional techniques or benefits I have not mentioned right here.

Copyright 2021 The MathWorks, Inc.

]]>
Seventeen? Why 17? Well, as a high school student, I attended HCSSIM, a summer program for students interested in math. There we learned all kinds of math you don't typically learn about until... read more >>

]]>As I went to college, the number 17 was a part of my life. Looking through the course catalogue before my first semester, I saw an offering something like "the seventeen regular tilings of the plane", and I signed up. And isn't it cool that all of these patterns are displayed in tiles within the Alhambra! I leave you to search the many sites with pictures and drawings of these.

I enjoy the artwork of Rafael Araujo. If you have watched any webinars I have delivered during 2020-2021, you may notice a piece of Araujo's hanging in the background. The basis for much of his work is the golden mean (or golden ratio). Here's a place where you can explore the influence of math on art.

It's defined as the positive solution to

$$x^2 = x + 1$$

And the value, typically denoted by the Greek letter $\varphi$, is

$$\varphi \text{\hspace{0.17em}}=\frac{1+\sqrt{5}}{2}\text{\hspace{0.17em}}$$

or approximately

phi = (1+sqrt(5))/2

And there are claims that this ratio is universally(?) pleasing. You can see approximations to it show up in everyday life. In the US, we use note cards that are 5-by-3 inches.

ratio5to3 = 5/3

So, close.

plot(0:(5/3):5,0:3,'.')

title("Not quite the golden ratio: " + ratio5to3)

axis equal

axis tight

I have written several blogs that show ways to compute Fibonacci numbers, also related to the golden mean. Why? Because ratios of successive Fibonacci numbers converge to the golden mean

$$\underset{\mathit{n}\to \infty}{\mathrm{lim}}\frac{{\mathit{F}}_{\mathit{n}+1}}{{\mathit{F}}_{\mathit{n}}}=\varphi \text{\hspace{0.17em}}$$
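A quick way to watch that convergence in MATLAB (a small sketch of my own, not from the original posts):

```matlab
F = zeros(1,20);
F(1:2) = [1 1];
for k = 3:20
    F(k) = F(k-1) + F(k-2);      % build the Fibonacci sequence
end
ratios = F(2:end)./F(1:end-1);   % ratios of successive terms
phi = (1+sqrt(5))/2;
abs(ratios(end) - phi)           % already agrees to better than 1e-7
```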

In the 1990s, we held several MATLAB User Conferences. In 1997, I gave a talk on Programming Patterns in MATLAB. I had 17 of them available, but time to discuss only 6. The regular tilings of the plane seemed like a cool way to categorize and clump together some of the programming patterns I wanted to talk about. I thought it would be interesting to revisit many of these and see how well they have held up over time. So that's my plan for some of the upcoming posts, though I feel no compulsion to do all of them, or in the order they showed up in my original talk.

Steve Eddins wrote two posts on this topic in 2016: one and two. And I wrote one as well, on performance implications.

When I first started at MathWorks (1987), MATLAB had only double matrices and no other data types or dimensions. If I wanted to remove the mean of each column of data in a matrix, I would do something like this.

A(4,4) = 0;

A(:) = randperm(16)

Here I'll calculate the mean of each column.

meanAc = mean(A)

and then I needed to create an array from meanAc that was the same size as A in order to subtract the means. Originally, we did this by matrix multiplication.

Ameans1 = ones(4,1)*meanAc

And now I can do the subtraction.

Ameanless1 = A-Ameans1

I then met a customer at my first ICASSP conference (in Phoenix, AZ), Tony, and he asked why I was not using indexing instead - because I never thought about it! This is cool because I didn't need to do arithmetic to get my expanded mean matrix.

Ameans2 = meanAc(ones(1,4),:)

isequal(Ameans1, Ameans2)

That was all well and good - but potentially not so easy to remember each time you might need it.

In 1996, we had heard plenty from customers that we were making something simple a little too difficult. And we were very close to introducing ND arrays, where we wanted to be able to do similar operations in any chosen dimension(s). So we introduced a new function, repmat.

Now I can find the matrix mean with easier to read code, in my opinion.

Ameanlessr = A - repmat(mean(A),[4,1])

isequal(Ameanless1, Ameanlessr)

By 2006, we had a lot of evidence that handling really large data was important for many of our customers, and likely to be an increasing demand. Up until then, we always created an intermediate matrix the same size as our original one, A, in order to calculate the result. But this wasn't strictly necessary -- we just needed some syntax -- a way to express that all the rows (or columns) would be the same. Of course, we need a matrix the same size as A for the answer. But how many more arrays of that size did we need along the way? Along came the function gloriously named bsxfun (standing for binary singleton expansion), and we could perform the computation without fully forming the m-by-n matrix to subtract from the original.

Ameanlessb = bsxfun(@minus, A, mean(A))

isequal(Ameanless1, Ameanlessb)

Finally, in 2016, we decided that the meaning was clear even if it wasn't strictly linear algebra, and we now allow many operations to take advantage of implicit expansion of singleton dimensions. What this means for us with this problem is now we can simply say

Ameanless2016 = A - mean(A)

isequal(Ameanless1, Ameanless2016)

I do not expect a Part 5 to come along in 2026, though of course I could be wrong!

Copyright 2021 The MathWorks, Inc.

]]>Today I want to introduce you to Jake Mitchell, a MATLAB user that I knew of and was recently reminded of again. Jake is a mechanical engineering major who is interested in data science. He uses MATLAB to explore strategies and positions in various games, and then writes about it. As he does, he shows the core code for the way pieces move and the game unfolds.

Jake has really nice commentary about possible strategies, based on simulating many, many plays of each game. In some cases, he also applies machine learning techniques to enable a machine to learn to play, such as tic-tac-toe. He's got an algorithm for playing Connect 4, and a fun post on Chutes and Ladders. And he explores everything from games that appear simple, or at least have simple governing rules, to ones that have much more nuance.

I learned to play Settlers of Catan probably 15 years ago. And I still play occasionally. Now I will be armed with more strategic knowledge after reading Jake on How I Built the Best Catan Board.

Perhaps my favorite is Jake's analysis of the value of Monopoly properties. He goes into all the different property types, adding houses and hotels, plus utilities and railroads. And don't forget about going to jail! I like the way Jake presents the results as well, sometimes in tables and sometimes in plots. Here's a plot Jake allowed me to copy, showing the effects of houses and hotels on reaching break-even on the investment. Plus, I really like that he uses the same colors as the Monopoly game, so you can easily tell which group of properties is which.

I also like that he delves into the ins and outs of Boardwalk and Park Place!

Have you made or analyzed games using MATLAB? Clearly some people have, when I check out the File Exchange, or with this search. If you have, please share with us here!

Copyright 2021 The MathWorks, Inc.

]]>

Today's guest blogger is Christine Tobler, who's a developer at MathWorks working on core numeric functions.Hi everyone! I'd like to tell you a story about round-off error, the algorithm used in sum,... read more >>

]]>Today's guest blogger is Christine Tobler, who's a developer at MathWorks working on core numeric functions.

Hi everyone! I'd like to tell you a story about round-off error, the algorithm used in sum, and compatibility issues. Over the last few years, my colleague Bobby Cheng has made changes to sum to make it both more accurate and faster, and I thought it would be an interesting story to tell here.


Even for a simple function like sum, with floating-point numbers the order in which we sum them up matters:

x = (1 + 1e-16) + 1e-16

y = 1 + (1e-16 + 1e-16)

x - y

So the result of the sum command depends on the order in which the inputs are added up. The sum command in MATLAB started out with the most straightforward implementation: start at the first number and add the elements one at a time. I have implemented that in sumOriginal at the bottom of this post.

But we ran into a problem: For single precision, the results were sometimes quite wrong:

sumOfOnes = sumOriginal(ones(1e8, 1, 'single'))

What's going on here? The issue is that for this number, the round-off error is larger than 1, as we can see by calling eps

eps(sumOfOnes)

Therefore, adding another 1 and then rounding to single precision returns the same number again, as the exact result is rounded down to fit into single precision:

sumOfOnes + 1

This was particularly noticeable when sum was used to compute the mean of a large array, since the mean computed could be smaller than all the elements of the input array.

Now we already had a threaded version of sum in MATLAB, where we would compute the sum of several chunks of an array in parallel, and then add those up in the end. It turned out that this version didn't have the same issues:

numBlocks = 8;

sumInBlocks(ones(1e8, 1, 'single'), numBlocks)

We have made this change in MATLAB's sum even for the non-threaded case, and it addresses this and other similar cases. But keep in mind that while this is an improvement for many cases, it's not a perfect algorithm and still won't always give the correct answer. We could modify the number of blocks that we split the input into, and not all choices work out great:

sumInBlocks(ones(1e8, 1, 'single'), 4)

sumInBlocks(ones(1e8, 1, 'single'), 128)

Let's look at another case where we're not just summing up the number one (though still summing up the same number every time, because that makes it easier to compare with the exact value). We'll be looking at the relative error of the result here, which is more relevant than whether we get the exact integer correct:

x = repmat(single(3155), 54194, 1);

exactSum = 3155*54194;

numBlocks = 2.^(0:9)

err = zeros(1, length(numBlocks));

for i=1:length(numBlocks)

err(i) = abs(sumInBlocks(x, numBlocks(i)) - exactSum) / exactSum;

end

loglog(numBlocks, err)

xlabel('Number of blocks')

ylabel('Relative error in sumInBlocks function')

So there's definitely a balancing act for choosing the exact number of blocks, with the plot above just representing one dataset.

There are other possible algorithms for computing the sum; I'm including some links at the bottom. Some more complicated ones would result in a slowdown in computation time, which would put a burden on cases that aren't experiencing the accuracy issues described above.
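One classic example of such an algorithm is Kahan (compensated) summation, which tracks the rounding error at each step. To be clear, this is not the algorithm MATLAB's sum uses - it's shown only as a point of comparison:

```matlab
function s = sumKahan(x)
% Kahan compensated summation: more accurate, but more work per element.
s = 0;
c = 0;                    % running compensation for lost low-order bits
for ii = 1:length(x)
    y = x(ii) - c;        % correct the next element by the accumulated error
    t = s + y;            % add it to the running sum (this may round)
    c = (t - s) - y;      % recover what was lost in that rounding
    s = t;
end
end
```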

We chose this "computing by blocks" algorithm because the change could be made in a such a way that sum became faster. This was done using loop-unrolling: Instead of adding one number at a time, several numbers are added to separate running totals at the same time. I've included an implementation in sumLoopUnrolled below.

By the way: I'm not explicitly giving the choice of numBlocks we made, because we wouldn't want anyone to rely on this - it might change again in the future, after all.

In R2017b, we first shipped this new version of sum, for the case of single-precision numbers only. All practical issues with results being off by orders of magnitude for large arrays of single data had been addressed, and the function even got faster at the same time. However, we also got quite a bit of feedback from people who were unhappy about the behavior change; while the new behavior gives as good or better results, their code was relying on the old behavior, and adjusting to the new behavior was painful.

While we don't want to get stuck never being able to improve our functions for fear of breaking a dependency, we definitely want to make the process of updating as pain-free as possible. Generally speaking, we aim for run-to-run reproducibility: if a function is called twice with the same inputs, the outputs will be the same. However, if the machine, OS, or MATLAB version (among other externals) changes, this can also change the exact value that's returned by MATLAB, within round-off error. It's common for the output of matrix multiplication to change for performance's sake, for example. But with sum, no changes had been made for a long time, and so more people had come to rely on its exact behavior than we would have expected.

So after making this change to sum for single values, we waited a few releases to evaluate customer feedback on the change. In R2020b, we added the same behavior for all other datatypes. This time, we added a Release Note that describes both the performance improvement and mentions the change as a "Compatibility Consideration". While we still had some feedback on this being an inconvenient change, it was less than in the first case.

Have you had changes in round-off error with a new MATLAB release break some of your code? Do you find the compatibility considerations in the release notes useful? Please let us know in the comments.

- Nick Higham, "What is stochastic rounding?", Blog post from 7/7/2020.
- Blanchard, Pierre, Nicholas J. Higham, and Theo Mary. "A class of fast and accurate summation algorithms." SIAM Journal on Scientific Computing 42, no. 3 (2020): A1541-A1557.

function s = sumOriginal(x)

s = 0;

for ii=1:length(x)

s = s + x(ii);

end

end

function s = sumInBlocks(x, numBlocks)

len = length(x);

blockLength = ceil(len / numBlocks);

s = sumOriginal(x(1:blockLength));

iter = blockLength;

while iter<len

s = s + sumOriginal(x(iter+1:min(iter+blockLength, len)));

iter = iter + blockLength;

end

end

function s = sumLoopUnrolled(x) %#ok<DEFNU>

% Example of loop unrolling for numBlocks == 4. For simplicity, we assume

% the length of x is divisible by 4.

%

% Note this technique is faster in built-in code compiled with the right

% compiler flags. It won't necessarily be faster in MATLAB code like this.

s1 = 0;

s2 = 0;

s3 = 0;

s4 = 0;

for ii=1:4:length(x)

s1 = s1 + x(ii);

s2 = s2 + x(ii+1);

s3 = s3 + x(ii+2);

s4 = s4 + x(ii+3);

end

s = s1 + s2 + s3 + s4;

end

Copyright 2021 The MathWorks, Inc.

]]>Today's guest blogger is Alan Weiss, who writes documentation for Optimization Toolbox™ and other mathematical toolboxes... read more >>

]]>Today's guest blogger is Alan Weiss, who writes documentation for Optimization Toolbox and other mathematical toolboxes.


Hi, folks. The subject for today is cone programming, and an application of cone programming to controlling a rocket optimally. Since R2020b the coneprog solver has been available to solve cone programming problems. What is cone programming? I think of it as a generalization of quadratic programming. All quadratic programming problems can be represented as cone programming problems. But there are cone programming problems that cannot be represented as quadratic programs.

So again, what is cone programming? It is a problem with a linear objective function and linear constraints, like a linear program or quadratic program. But it also incorporates cone constraints. In three dimensions [x, y, z], for example, you can represent a cone via the constraint that the radius of a circle in the x-y plane is less than or equal to z. In other words, the cone constraint is the inequality constraint

$ x^2+y^2\le z^2 $,

or equivalently

$ \|[x,y]\|\le z $ for nonnegative z.

Here is a picture of the boundary of the cone $ \|[x,y]\|\le z $ for nonnegative z.

[X,Y] = meshgrid(-2:0.1:2);

Z = sqrt(X.^2 + Y.^2);

surf(X,Y,Z)

view(8,2)

xlabel("x")

ylabel("y")

zlabel("z")

Of course, you can scale, translate, and rotate a cone constraint. The formal definition of a general cone constraint uses a matrix Asc, vectors bsc and d, and scalar gamma with the constraint in x represented as

norm(Asc*x - bsc) <= d'*x - gamma;

The coneprog solver in Optimization Toolbox requires you to use the secondordercone function to formulate cone constraints. For example,

Asc = diag([1,1/2,0]);

bsc = zeros(3,1);

d = [0;0;1];

gamma = 0;

socConstraints = secondordercone(Asc,bsc,d,gamma);

f = [-1,-2,0];

Aineq = [];

bineq = [];

Aeq = [];

beq = [];

lb = [-Inf,-Inf,0];

ub = [Inf,Inf,2];

[x,fval] = coneprog(f,socConstraints,Aineq,bineq,Aeq,beq,lb,ub)

It can be simpler to use the problem-based approach to access cone programming. This functionality was added in R2021a. For the previous example using the problem-based approach:

x = optimvar('x',3,"LowerBound",[-Inf,-Inf,0],"UpperBound",[Inf,Inf,2]);

Asc = diag([1,1/2,0]);

prob = optimproblem("Objective",-x(1)-2*x(2));

prob.Constraints = norm(Asc*x) <= x(3);

[sol,fval] = solve(prob)

Notice that, unlike most nonlinear solvers, you do not need to specify an initial point for coneprog. This comes in handy in the following example.

Suppose that you want to control a rocket to land gently at a particular location using minimal fuel. Suppose that the fuel used is proportional to the applied acceleration times time. Do not model the changing weight of the rocket as you burn fuel; we are supposing that this control is for a relatively short time, where the weight does not change appreciably. There is gravitational acceleration g = 9.81 in the negative z direction. There is also linear drag on the rocket that acts in the negative direction of velocity with coefficient 1/10. This means after time t, without any applied acceleration or gravity, the velocity changes from v to $ v\exp(-t/10) $.

In continuous time the equations of motion for position $ p(t) $, velocity $ v(t) $, and applied acceleration $ a(t) $ are

$ \frac{dp}{dt} = v(t) $

$ \frac{dv}{dt} = -v(t)/10 + a(t) + g*[0,0,-1] $.

Here are some approximate equations of motion, using discrete time with N equal steps of length $ t = T/N $:

$ p(i+1) = p(i) + t*(v(i) + v(i+1))/2 $ (trapezoidal rule)

$ v(i+1) = v(i)*\exp(-t/10) + t*(a(i) + g*[0, 0, -1]) $ (Euler integration).

Therefore,

$ p(i+1) = p(i) + t*v(i)*(1 + \exp(-t/10))/2 + t^2*(a(i) + g*[0, 0, -1])/2 $.

Now for the part that leads to cone programming. Suppose that the applied acceleration at each step is bounded by a constant Amax. These constraints are

$ \|a(i)\| \le {\rm Amax} $ for all i.

The cost to minimize should be the sum of the norms of the accelerations times t. Cone programming requires the objective function to be linear in optimization parameters. You can reformulate this cost to be linear by introducing new optimization variables s(i) that are subject to a new set of cone constraints:

$ {\rm cost} = \sum s(i)*t $

$ \|a(i)\| \le s(i) $.

Suppose that the rocket is traveling initially at velocity $ v0 = [100,50,-40] $ at position $ p0 = [-1000,-800,1200] $. Calculate the acceleration required to bring the rocket to position $ [0,0,0] $ with velocity $ [0,0,0] $ at time $ T = 40 $. Break up the calculation into 100 steps ($ t=40/100 $). Suppose that the maximum acceleration $ \rm{Amax} = 2g $.

The makeprob function at the end of this script accepts the time T, initial position p0, and initial velocity v0, and returns a problem that describes the discrete dynamics and cost.

p0 = [-1000,-800,1200];

v0 = [100,50,-40];

prob = makeprob(40,p0,v0)

Set options to solve the cone programming problem using an optimality tolerance 100 times smaller than the default. Use the "schur" linear solver, which can be more accurate for this problem.

opts = optimoptions("coneprog","OptimalityTolerance",1e-8,"LinearSolver","schur");

[sol,cost] = solve(prob,Options=opts)

The plottrajandaccel function at the end of this script plots both the trajectory and the norm of the acceleration as a function of time step.

plottrajandaccel(sol)

The optimal acceleration is nearly "bang-bang." The rocket accelerates at about $ 2g $ at first, then has close to zero acceleration until the near end. Near the end, the rocket accelerates at maximum to slow the descent and land with zero velocity. The total cost of this control is about 313.

Find the optimal time T for the rocket to land, meaning the time that causes the rocket to use the least possible fuel. The findT function at the end of this script calls fminbnd to locate the minimal-cost time. I experimented briefly to find that [20,60] is a reasonable range for times T for the minimum, and I used those bounds in the fminbnd call. If you take a time much less than 20 you get an infeasible problem:

badprob = makeprob(15,p0,v0);

badsol = solve(badprob,Options=opts)

(As an aside, if you try to make T an optimization variable then the problem is no longer a coneprog problem. Instead, it is a problem for fmincon, which takes much longer to solve in this case, and requires you to provide an initial point.)

Topt = findT(opts)

Plot the optimal trajectory and acceleration.

probopt = makeprob(Topt,p0,v0);

[solopt,costopt] = solve(probopt,Options=opts)

plottrajandaccel(solopt)

The optimal cost is about 171, which is roughly half of the cost for the original parameters. This time, the control is more nearly bang-bang. The rocket accelerates at maximum at first, then stops accelerating for some time. Again, during the final times the rocket accelerates at maximum to land with zero velocity.

Cone programming is a surprisingly versatile framework for solving many convex optimization problems. For another nontrivial example, see Minimize Energy of Piecewise Linear Mass-Spring System Using Cone Programming, Problem-Based. For other problems that can be put in the cone programming framework, see Lobo, Miguel Sousa, Lieven Vandenberghe, Stephen Boyd, and Hervé Lebret. “Applications of Second-Order Cone Programming.” Linear Algebra and Its Applications 284, no. 1–3 (November 1998): 193–228. https://doi.org/10.1016/S0024-3795(98)10032-0

Do you find cone programming or discrete dynamics useful? Do you have any examples of your own to share? Let us know here.

This code creates the makeprob function.

function trajectoryproblem = makeprob(T,p0,v0)
    N = 100;        % number of time steps
    g = 9.81;
    pF = [0 0 0];   % target final position
    Amax = 2*g;     % maximum acceleration magnitude
    p = optimvar("p",N,3);      % positions
    v = optimvar("v",N,3);      % velocities
    a = optimvar("a",N-1,3);    % accelerations
    s = optimvar("s",N-1,"LowerBound",0,"UpperBound",Amax);  % bounds norm(a)
    trajectoryproblem = optimproblem;
    t = T/N;                    % time step length
    trajectoryproblem.Objective = sum(s)*t;
    % Cone constraints: norm of acceleration at each step is at most s(i)
    scons = optimconstr(N-1);
    for i = 1:(N-1)
        scons(i) = norm(a(i,:)) <= s(i);
    end
    acons = optimconstr(N-1);
    for i = 1:(N-1)
        acons(i) = norm(a(i,:)) <= Amax;
    end
    % Velocity dynamics, including drag factor exp(-t/10) and gravity
    vcons = optimconstr(N+1,3);
    vcons(1,:) = v(1,:) == v0;
    vcons(2:N,:) = v(2:N,:) == v(1:(N-1),:)*exp(-t/10) + t*(a + repmat([0 0 -g],N-1,1));
    vcons(N+1,:) = v(N,:) == [0 0 0];
    % Position dynamics, with initial and final position constraints
    pcons = optimconstr(N+1,3);
    pcons(1,:) = p(1,:) == p0;
    pcons(2:N,:) = p(2:N,:) == p(1:(N-1),:) + (1+exp(-t/10))/2*t*v(1:(N-1),:) + t^2/2*(a + repmat([0 0 -g],N-1,1));
    pcons(N+1,:) = p(N,:) == pF;
    trajectoryproblem.Constraints.acons = acons;
    trajectoryproblem.Constraints.scons = scons;
    trajectoryproblem.Constraints.vcons = vcons;
    trajectoryproblem.Constraints.pcons = pcons;
end

This code creates the plottrajandaccel function.

function plottrajandaccel(sol)
    figure
    psol = sol.p;
    p0 = psol(1,:);
    pF = psol(end,:);
    plot3(psol(:,1),psol(:,2),psol(:,3),'rx')
    hold on
    plot3(p0(1),p0(2),p0(3),'ks')
    plot3(pF(1),pF(2),pF(3),'bo')
    hold off
    view([18 -10])
    xlabel("x")
    ylabel("y")
    zlabel("z")
    legend("Steps","Initial Point","Final Point")
    figure
    asolm = sol.a;
    nasolm = sqrt(sum(asolm.^2,2));
    plot(nasolm,"rx")
    xlabel("Time step")
    ylabel("Norm(acceleration)")
end

This code creates the fvalT function, which is used by findT.

function Fval = fvalT(T,opts)
    % Return the optimal cost for landing time T, using fixed initial conditions
    p0 = [-1000,-800,1200];
    v0 = [100,50,-40];
    tprob = makeprob(T,p0,v0);
    opts = optimoptions(opts,"Display","off");
    [~,Fval] = solve(tprob,Options=opts);
end

This code creates the findT function.

function Tmin = findT(opts)
    disp("Solving...")
    Tmin = fminbnd(@(T)fvalT(T,opts),20,60);
    disp("Done")
end

Copyright 2021 The MathWorks, Inc.

Have you ever needed to solve an optimization problem where there were local minima? What strategy do you use to solve it, trying to find the "best" answer? Today I'm going to talk about a simple strategy, readily available in the Global Optimization Toolbox.

Or at least let's try. I have some data and I want to fit a particular form of curve to it. First, let's look at the pharmacokinetic data. Here's the reference: William R. Esposito and Christodoulos A. Floudas, "Parameter estimation in nonlinear algebraic models via global optimization." Computers & Chemical Engineering, Volume 22, Supplement 1, 15 March 1998, Pages S213-S220.

The data are time vs. concentration:

t = [ 3.92, 7.93, 11.89, 23.90, 47.87, 71.91, 93.85, 117.84 ]
c = [0.163, 0.679, 0.679, 0.388, 0.183, 0.125, 0.086, 0.0624 ]

I like to see the data, partly to be sure I have no entry mistakes and partly to get a feel for the overall system. Let's visualize it.

plot(t,c,'o')
xlabel('Time')
ylabel('Concentration')

As in the reference, we fit a three-compartment model, a sum of three decaying exponentials:

$$c = b_1 e^{-b_4 t} + b_2 e^{-b_5 t} + b_3 e^{-b_6 t}$$

and we can express that model as an anonymous function of t (time) and the model parameters [b(1) b(2) ... b(6)].

model = @(b,t) b(1)*exp(-b(4)*t) + b(2)*exp(-b(5)*t) + b(3)*exp(-b(6)*t)

We next define the optimization problem to solve using createOptimProblem. This lets us choose the solver, supply the data and starting point, and express bounds and options in a single problem structure that MultiStart can run.

problem = createOptimProblem('lsqcurvefit', ...
    'objective', model, ...
    'xdata', t, 'ydata', c, ...
    'x0', ones(1,6), ...
    'lb', [-10 -10 -10 0 0 0 ], ...
    'ub', [ 10 10 10 0.5 0.5 0.5], ...
    'options', optimoptions('lsqcurvefit', ...
        'OutputFcn', @curvefittingPlotIterates, ...
        'Display', 'none'))

First solve the problem directly once.

b = lsqcurvefit(problem)

You'll notice that the model does not do a stellar job fitting the data or even following the shape of the data.

Let's see if we can do better by starting at a bunch of different points.

ms = MultiStart;
ms.Display = 'iter';
rng default
figure
tic
[~,fval,exitflag,output,solutions] = run(ms, problem, 50)
serialTime = toc;

The 50th solution, which is what is plotted above, is not necessarily the best one. Luckily for us, MultiStart orders the solutions from best to worst, so we need only look at the first one.

curvefittingPlotIterates(solutions)

You can see now that the 50th run did not produce the best solution: the mean squared error of the best fit displayed here is smaller by more than a factor of 10.
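Since run returns the solutions sorted best-first as a vector of GlobalOptimSolution objects, pulling out the best fit programmatically is straightforward. A small sketch, using the documented properties of GlobalOptimSolution:

```matlab
bestSolution = solutions(1);            % solutions are sorted best-first
bBest = bestSolution.X;                 % fitted coefficients b(1)..b(6)
sse   = bestSolution.Fval;              % sum-of-squares error at the best fit
nStarts = numel(bestSolution.X0);       % how many start points converged here
```

The X0 property is also a useful diagnostic: if many start points funnel into the same solution, that basin of attraction is large, which lends some confidence that the best solution found is a good one.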

Now let's see if we can improve performance by using all four of my cores as local parallel workers.

ms.UseParallel = true;
gcp;
tic;
rng default
run(ms, problem, 50);
parallelTime = toc;

Speedup may not be evident until the second run because of the pool start-up time. Since I started my pool earlier, I see a decent speedup.
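If you want the timing comparison to exclude that one-time start-up cost, make sure a pool is already running before you call tic. A minimal sketch:

```matlab
% Ensure a parallel pool is running before timing, so the one-time
% start-up cost does not count against the parallel run.
pool = gcp('nocreate');      % returns [] if no pool is currently running
if isempty(pool)
    pool = parpool;          % start the default local pool
end
```

After this, repeated timed runs of MultiStart with UseParallel set to true measure only the solve time.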

speedup = serialTime/parallelTime

Tell us about your exploration of the solution space to your problem. If the solution is sensitive to where you start, you might consider using MultiStart and other techniques from the Global Optimization Toolbox.


Here's the code for plotting the iterates.

dbtype curvefittingPlotIterates