Comments on: The Missing Link https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/?s_tid=feedtopost
Loren Shure is interested in the design of the MATLAB language. She is an application engineer and writes here about MATLAB programming and related topics.
Fri, 15 Mar 2019 00:44:20 +0000

By: Loren Shure https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47854 Sun, 24 Feb 2019 05:41:45 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47854 @Rob-

Have you looked into standardizeMissing? If you use timetables, you can use the retime function which may help.

–loren
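
A minimal sketch of that suggestion, assuming a fill value of -1 and a 10-minute target grid (both hypothetical, as are the variable names). standardizeMissing converts the fill value to NaN, and retime with the 'fillwithmissing' method puts the data on a regular time vector, leaving NaN wherever no sample exists.

```matlab
% Hypothetical samples: irregular times, with -1 as the instrument's fill value.
t = datetime(2019,2,20,0,0,0) + minutes([0 10 20 40 50]);
v = [1.0; -1; 3.0; 4.0; 5.0];
tt = timetable(t', v, 'VariableNames', {'Value'});

% Treat -1 as missing, so it becomes NaN in the numeric variable.
tt = standardizeMissing(tt, -1);

% Put the data on a regular 10-minute grid; times with no sample get NaN.
newTimes = (t(1):minutes(10):t(end))';
ttRegular = retime(tt, newTimes, 'fillwithmissing');
```

Other retime methods (for example 'nearest' or 'linear') fill the new rows instead of leaving them missing.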

By: Loren Shure https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47846 Sat, 23 Feb 2019 13:00:38 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47846 @Brad- Apparently the change in comment moderation is a consequence of our upgrade to the latest WordPress release. As a result, the “Your comment is awaiting moderation” confirmation message no longer appears after you post a comment.

@Hyori- glad to know this is helpful!

By: Hyori Pak https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47844 Sat, 23 Feb 2019 11:51:49 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47844 Thank you for introducing this feature. I only just learned that MATLAB has it. It will be helpful in my daily work. Thanks for sharing!

By: Loren Shure https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47840 Sat, 23 Feb 2019 11:32:10 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47840 @Rob-

If you know the value for missing, you can substitute it with NaNs after reading, and then proceed. You can’t simply declare a different value to be the missing one.

As for the gap filling, you could use ismissing, find how long each gap is, and do different things with different gap sizes. Not a 1-liner!

–loren
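
One way this might look, as a sketch rather than a definitive recipe. The sample data, the fill value -1, and the maxGap threshold are assumptions for illustration, and linear filling via fillmissing is just one possible treatment for the short gaps.

```matlab
% Hypothetical data: -1 is the fill value reported by the instrument.
x = [1 2 -1 -1 5 6 -1 -1 -1 -1 11];
fillValue = -1;   % assumed sentinel for missing data
maxGap = 2;       % assumed threshold: only fill gaps of at most 2 samples

% Replace the sentinel with NaN, then locate the missing entries.
x = standardizeMissing(x, fillValue);
tf = ismissing(x);

% Find the first and last index of each run of missing values.
d = diff([false tf false]);
gapStart = find(d == 1);
gapEnd = find(d == -1) - 1;
gapLen = gapEnd - gapStart + 1;

% Fill every gap, then put NaN back into the gaps that are too long.
y = fillmissing(x, 'linear');
for k = find(gapLen > maxGap)
    y(gapStart(k):gapEnd(k)) = NaN;
end
```

Here the two-sample gap is interpolated and the four-sample gap stays NaN.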

By: Loren Shure https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47838 Sat, 23 Feb 2019 11:27:36 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47838 @Brad– I'm checking on the commenting issue; I'm not sure what goes on behind the scenes.

By: Loren Shure https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47836 Sat, 23 Feb 2019 11:26:48 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47836 @Brad-

I don’t know Sean’s opinion, but sometimes it’s not a big deal to grow something if it’s convenient and not your bottleneck. A mixed strategy that generally works ok in MATLAB is to grow whatever you need in chunks, so you don’t do so much reallocation. Also keep a marker so that, when you are done, you know which final rows, if any, are ready to delete.

–loren
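
A rough sketch of the chunked approach, using a plain numeric array for simplicity. The chunk size, the loop length, and the variable names are all hypothetical; the counter n plays the role of the marker.

```matlab
chunk = 1000;               % assumed chunk size
data = nan(chunk, 2);       % preallocate the first chunk
n = 0;                      % marker: number of rows filled so far

for k = 1:2500              % stand-in for a loop whose length is not known up front
    if n == size(data, 1)
        data = [data; nan(chunk, 2)];   %#ok<AGROW> grow by a whole chunk at a time
    end
    n = n + 1;
    data(n, :) = [k, k^2];
end

data = data(1:n, :);        % delete the unused rows after the marker
```

With these numbers the array is reallocated only twice instead of on every iteration.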

By: Brad Stiritz https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47834 Sat, 23 Feb 2019 01:31:25 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47834 p.s. Loren, there may be an issue with the commenting software. Your blog used to acknowledge submitted comments with “Your post is awaiting moderation.” Lately, I haven’t been finding any acknowledgement at all. This is a bit disconcerting.

By: Brad Stiritz https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47832 Sat, 23 Feb 2019 01:28:00 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47832 Hi Sean, yes I have found as well that growing tabulars via indexing can be expensive in time (as of R2018b). I only mentioned this in my first comment b/c it’s occasionally convenient and affordable.

So are you suggesting preallocation in all cases, even when the tabular height is unknown at the outset? I have read discussion elsewhere on the site, suggesting to initially build out larger-scale tabular data as a struct vector or cell array, and then convert via xxx2table(). This performs nicely, at least for my needs (50K – 100K rows).
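
For illustration, a small sketch of that struct-then-convert pattern with struct2table. The field names and the loop are hypothetical stand-ins for real records.

```matlab
% Start from an empty struct array that has the fields we need.
s = struct('id', {}, 'name', {});

for k = 1:5                 % stand-in for reading records of unknown count
    s(end+1) = struct('id', k, 'name', sprintf('row%d', k)); %#ok<SAGROW>
end

% Convert once at the end; s(:) makes sure the struct array is a column.
T = struct2table(s(:));
```

Each field becomes a table variable, so T here has the variables id and name and one row per struct element.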

By: Rob W https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47830 Fri, 22 Feb 2019 17:51:30 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47830 The NaN is very useful (especially for plotting!), as is being able to omit NaNs from mean/median/stddev/min/max calculations (I think all those can be told to ignore NaNs).
However, when I read in my spacecraft data, the Missing_Constant (or Fill-Value) is often a number, which I often replace with NaNs. Could the above functions be told to treat, for example, -1 (or another user-defined value) as if it were a NaN, so that it is excluded from means, medians, etc.?
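
To make the current workflow concrete, here is a small sketch, assuming -1 is the fill value (the data are made up). The fill value is converted to NaN first, and the statistics are then asked to skip NaNs with the 'omitnan' flag.

```matlab
raw = [4 -1 6 8 -1];                 % hypothetical samples, -1 means "no data"
x = standardizeMissing(raw, -1);     % -1 becomes NaN

m  = mean(x, 'omitnan');             % mean of 4, 6, 8
md = median(x, 'omitnan');           % median of 4, 6, 8
sd = std(x, 'omitnan');              % standard deviation of 4, 6, 8
hi = max(x);                         % max and min ignore NaN by default
```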

The second thing that would be great: while knowing when data is missing is useful, for analysis I often have to fill in something so I can still analyze the data (e.g. with Fast Fourier Transforms), so I end up interpolating over the gap to get a regularly spaced dataset.

I usually interpolate to the nearest value, e.g. using interp1 with the ‘nearest’ option. This is normally great, but if I have data gaps of days, nearest is a poor choice. What I’d really want is to interpolate to ‘nearest if within a threshold of x units, else leave as NaN’, e.g. use the value from the nearest record in time, as long as that record is less than 1 hour away.
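
A sketch of one way to get that behavior today with two interp1 calls. The sample times (in hours), the values, and the threshold maxDist are assumptions for illustration; here a record exactly maxDist away is still accepted.

```matlab
tObs = [0 1 2 10 11];            % hypothetical observation times, in hours
vObs = [10 11 12 20 21];         % hypothetical observed values
tq = 0:11;                       % regular query grid, in hours
maxDist = 1;                     % assumed threshold: accept records up to 1 hour away

% Nearest-neighbor values, plus the time of the record each value came from.
vq = interp1(tObs, vObs, tq, 'nearest');
tNear = interp1(tObs, tObs, tq, 'nearest');

% Leave NaN wherever the nearest record is farther away than the threshold.
vq(abs(tNear - tq) > maxDist) = NaN;
```

Interpolating the observation times themselves gives the time of the nearest record, which is what the distance check needs.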

By: Sean de Wolski https://blogs.mathworks.com/loren/2019/02/20/the-missing-link/#comment-47820 Thu, 21 Feb 2019 20:12:36 +0000 https://blogs.mathworks.com/loren/?p=3230#comment-47820 I’d recommend against growing the table. In recent releases, you can preallocate it with the variable types you need and then use standardizeMissing to turn the default value into a missing one.

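t = table('Size',[3 2],'VariableTypes',{'double', 'string'});  % preallocate: double columns start at 0
standardizeMissing(t,0)                                        % treat the default 0 as missing (NaN)

  3×2 table
    Var1      Var2   
    ____    _________
    NaN     <missing>
    NaN     <missing>
    NaN     <missing>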