Comments on: The Missing Link

By: Loren Shure

Loren Shure — Sun, 24 Feb 2019 05:41:45 +0000

@Rob-

Have you looked into standardizeMissing? If you use timetables, you can also use the retime function, which may help.
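
As a rough sketch of the standardizeMissing idea, assuming the fill value in the raw data is -1 (the sample vector is made up for illustration):

```matlab
% Hypothetical readings where -1 is the instrument's fill value (assumed).
x = [4 -1 6 8 -1 10];

% Replace every -1 with the standard missing value for doubles (NaN).
xClean = standardizeMissing(x, -1);

% Statistics can then be told to skip the missing entries.
m = mean(xClean, 'omitnan');   % averages only 4, 6, 8 and 10
```

The same call accepts a table or timetable, so the substitution can be done once, right after reading the file.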

–loren

By: Loren Shure

Loren Shure — Sat, 23 Feb 2019 13:00:38 +0000

@Brad- apparently the change in moderated-comment behavior is a consequence of our upgrade to the latest WordPress release. As a result, the “Your comment is awaiting moderation” confirmation message no longer shows after posting a comment.

@Hyori- glad to know this is helpful!

By: Hyori Pak

Hyori Pak — Sat, 23 Feb 2019 11:51:49 +0000

Thank you for introducing the feature. I just learned that MATLAB has it. It will be helpful in my daily job. Thanks for sharing!

By: Loren Shure

Loren Shure — Sat, 23 Feb 2019 11:32:10 +0000

@Rob-

If you know the value for missing, you can substitute it with NaNs after reading, and then proceed. You can’t simply declare a different value to be the missing one.

As for the gap filling, you could use ismissing, find how long each gap is, and do different things with different gap sizes. Not a 1-liner!
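
One possible way to spell that out is sketched below. It assumes a regularly sampled row vector; the data, the 2-sample threshold, and the choice of fillmissing with 'nearest' for the short gaps are all illustrative assumptions, not a recommendation.

```matlab
% Hypothetical regularly sampled signal with one short and one long gap.
x = [1 2 NaN 4 5 NaN NaN NaN NaN 10];

gap = ismissing(x);                  % true where data is missing
d = diff([0 gap 0]);                 % +1 at each gap start, -1 just past each gap end
gapStart = find(d == 1);
gapEnd = find(d == -1) - 1;
gapLen = gapEnd - gapStart + 1;      % length of each gap, in samples

maxLen = 2;                          % assumed threshold: only fill gaps up to 2 samples
filled = fillmissing(x, 'nearest');  % candidate fill values for every gap
y = x;
for k = find(gapLen <= maxLen)
    idx = gapStart(k):gapEnd(k);
    y(idx) = filled(idx);            % accept the fill only for short gaps
end
```

Here the single-sample gap is filled and the four-sample gap is left as NaN; longer gaps could be handled by a different branch inside the loop.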

–loren

By: Loren Shure

Loren Shure — Sat, 23 Feb 2019 11:27:36 +0000

@Brad- checking on the commenting issue. Not sure what goes on behind the scenes.

By: Loren Shure

Loren Shure — Sat, 23 Feb 2019 11:26:48 +0000

@Brad-

I don’t know Sean’s opinion, but sometimes it’s not a big deal to grow something if it’s convenient and not your bottleneck. A mixed strategy that generally works OK in MATLAB is to grow whatever you need in chunks, so you do less reallocation. Use a marker so that, when you are done, you know which final rows, if any, are ready to delete.
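
A minimal sketch of the chunked approach, using a plain numeric array for simplicity. The chunk size, the loop length, and the use of a row counter as the marker are assumptions made for illustration.

```matlab
% Hypothetical stream of results whose total count is unknown up front.
chunk = 1000;                        % assumed chunk size
data = nan(chunk, 2);                % NaN marks rows not yet written
n = 0;                               % marker: number of rows filled so far

for k = 1:2500                       % stand-in for a loop of unknown length
    if n == size(data, 1)
        data = [data; nan(chunk, 2)];   % grow by a whole chunk, not one row
    end
    n = n + 1;
    data(n, :) = [k, k^2];
end

data = data(1:n, :);                 % drop the unused trailing rows
```

With these numbers the array is reallocated twice instead of 2500 times.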

–loren

By: Brad Stiritz

Brad Stiritz — Sat, 23 Feb 2019 01:31:25 +0000

p.s. Loren, there may be an issue with the commenting software. Your blog used to acknowledge submitted comments with “Your post is awaiting moderation.” Lately, I haven’t been finding any acknowledgement at all. This is a bit disconcerting.

By: Brad Stiritz

Brad Stiritz — Sat, 23 Feb 2019 01:28:00 +0000

Hi Sean, yes I have found as well that growing tabulars via indexing can be expensive in time (as of R2018b). I only mentioned this in my first comment b/c it’s occasionally convenient and affordable.

So are you suggesting preallocation in all cases, even when the tabular height is unknown at the outset? I have read discussion elsewhere on the site, suggesting to initially build out larger-scale tabular data as a struct vector or cell array, and then convert via xxx2table(). This performs nicely, at least for my needs (50K – 100K rows).
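
For illustration, the struct-vector route might look roughly like this, taking struct2table as one instance of the xxx2table() family. The field names and values are made up.

```matlab
% Hypothetical records collected one at a time; field names are made up.
s = struct('id', {}, 'label', {});   % empty struct array with fixed fields
for k = 1:5
    s(end+1).id = k;                 % append a new element and set its first field
    s(end).label = sprintf('row%d', k);
end

t = struct2table(s(:));              % convert once at the end; s(:) forces a column
```

Each field becomes a table variable, so the table itself is only built once.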

By: Rob W

Rob W — Fri, 22 Feb 2019 17:51:30 +0000

The NaN is very useful (especially for plotting!), as is being able to omit NaNs from mean/median/stddev/min/max calculations (I think all those can be told to ignore NaNs).
However, when I read in my spacecraft data, the Missing_Constant (or Fill-Value) is often a number, which I often replace with NaNs. Could the above functions be told to treat, for example, -1 (or another user-defined value) as if it were a NaN, so that it is excluded from means, medians, etc.?

The second thing that would be great: knowing when data is missing is useful, but for analysis (e.g. Fast Fourier Transforms) I often have to put something in, so I end up interpolating the data over the gap to get a regularly spaced dataset.

I usually interpolate to the nearest value, e.g. using interp1 with the ‘nearest’ option. This is normally great, but if I have data gaps of days, nearest is a bad choice. What I’d really want is to interpolate to ‘nearest if within a threshold of x units, else leave as NaN’, e.g. use the value from the nearest record in time, as long as that record is less than 1 hour away.
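
One possible way to get that behavior with interp1 alone is to interpolate the sample times as well as the values, then blank out anything too far from a real sample. This is only a sketch: the times (in hours), values, and 1-hour threshold are made up.

```matlab
% Hypothetical irregular sample times (hours) and values, with a long gap.
t = [0 0.5 1.0 6.0 6.5];
v = [10 11 12 20 21];
tq = 0:0.5:6.5;                      % regular grid wanted for analysis
maxDist = 1;                         % assumed threshold: 1 hour

vq = interp1(t, v, tq, 'nearest');   % nearest-neighbor value at each query time
tn = interp1(t, t, tq, 'nearest');   % time of the sample that supplied that value
vq(abs(tq - tn) > maxDist) = NaN;    % too far from any real sample: leave missing
```

Grid points within an hour of a record keep the nearest value, and the middle of the long gap stays NaN.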

By: Sean de Wolski

Sean de Wolski — Thu, 21 Feb 2019 20:12:36 +0000

I'd recommend against growing the table. In recent releases, you can preallocate it with the various types and then use standardizeMissing to turn the default fill value (0 for double) into missing.

t = table('size',[3 2],'VariableTypes',{'double', 'string'})
standardizeMissing(t,0)

  3×2 table
    Var1      Var2   
    ____    _________
    NaN     <missing>
    NaN     <missing>
    NaN     <missing>