# “Most Active/Interactive” File Exchange Entry

Jiro's pick this week is "Command-line peak fitter for time-series signals" by Tom O'Haver.

Continuing the celebration of MATLAB Central's 15th birthday and following last week's blog post by Sean, I'd like to focus on all of the great interactions people have had through File Exchange entries. Although you may not think of the File Exchange as the next big social network, people have collaborated and held conversations through the comments and ratings sections of the entries. When I see a File Exchange entry with a lot of comments, I tend to think that the file is getting a lot of interest from other users. If I also see a lot of responses from the author of the file, that tells me the author is actively involved in improving the file and helping people use it. If there are a lot of updates to the entry, that also tells me the author is actively maintaining the file.

So, I wanted to see which files had the most interaction between the author and the users. Disclaimer: Not all of the metrics used here are purely quantitative. I've introduced some qualitative fudge factors.

## The Data

I gathered my data the brute-force way, using MATLAB, of course. I went through all possible File Exchange IDs and scraped each webpage for comments and updates.

load FEX


Here's what the first few entries look like.

FEX(1:5,:)

ans =
_________________________    _____    ____________________    ___________    ___________
'central_diff.m'             12       'Robert Canfield'       [3x4 table]    [5x3 table]
'interpsinc.m'               13       'Michael Minardi'       [1x4 table]    [1x3 table]
'Polybase'                   15       'Giampiero Campa'       [8x4 table]    [7x3 table]
'Toolbox BOD Version 2.8'    16       'Gert-Helge Geitner'    [1x4 table]    [7x3 table]
'connectnames.m'             17       'Douglas Harriman'      [1x4 table]    [0x3 table]


Here are the comments from the first entry (central_diff.m, which was Picked a couple of weeks ago).

FEX.Comments{1}

ans =
Date              Name                                           Comment                                   Rating
__________    ___________________    _____________________________________________________________________    ______
2004-09-16    'godlove njie teku'    ' '                                                                      4
2006-08-09    'Shyang-Wen Tseng'     '<p>This is a very good and usefull add-on function.  Thank you.</p>'    4
2007-08-06    'Alvaro Valcarce'      '<p>I think that line 98 should be (notice the "=" sign)</p>…'           4


And the updates for that entry.

FEX.Updates{1}

ans =
Date       Version                                   Description
__________    _______    __________________________________________________________________________
NaT           ''         '<p>update description</p>'
NaT           ''         '<p>description</p>'
NaT           ''         '<p>updating description</p>'
2001-08-21    ''         '<p>updating</p>'
2015-10-01    '2.0'      '<p>Second-order accurate forward and backward difference formulae are u…'


## The Metric

To help me find the entries with the most "interactions", I first calculated the number of comments and updates from the data.

FEX.NumComments = cellfun(@height, FEX.Comments);
FEX.NumUpdates = cellfun(@height, FEX.Updates);


Next, I wanted to know how many of the comments on each entry were written by the author of that entry.

FEX.NumAuthorComments = cellfun(@(a,c) nnz(strcmp(a,c.Name)), ...
    FEX.Author, FEX.Comments);
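For readers who don't have the scraped data, the two counts can be tried on a small stand-in table. This is only a minimal sketch: the table `T` and its three rows are invented for illustration, and it assumes the same layout as above, namely an `Author` column plus a `Comments` column holding one nested table per entry with a `Name` variable.

```matlab
% Toy stand-in for FEX (all names and rows are made up for illustration)
c1 = table({'Ann'; 'Bob'; 'Ann'}, [5; 4; NaN], 'VariableNames', {'Name','Rating'});
c2 = table({'Cy'}, 3, 'VariableNames', {'Name','Rating'});
c3 = table(cell(0,1), zeros(0,1), 'VariableNames', {'Name','Rating'});  % no comments

T = table({'fileA'; 'fileB'; 'fileC'}, {'Ann'; 'Bob'; 'Cy'}, {c1; c2; c3}, ...
    'VariableNames', {'Name','Author','Comments'});

% Number of comments = number of rows in each nested table
T.NumComments = cellfun(@height, T.Comments);

% Number of comments whose commenter name matches the entry's author
T.NumAuthorComments = cellfun(@(a,c) nnz(strcmp(a,c.Name)), T.Author, T.Comments);

disp(T(:, {'Name','Author','NumComments','NumAuthorComments'}))
```

Note that the match is an exact string comparison on the name, so an author who comments under a different display name would not be counted.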


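% Top 10 entries by number of comments
FEX = sortrows(FEX,'NumComments','descend');

% Truncate the file names to the first 20 characters (for labeling)
fexNames = cellfun(@(x) x(1:min(20,length(x))),FEX.Name(10:-1:1),'UniformOutput',false);

% Horizontal bar chart with the largest value at the top
% (assumed plotting call; it is not shown in this listing)
barh(FEX.NumComments(10:-1:1))

% Axes properties
ax = gca;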
ax.YLim = [0 11];
ax.YTickLabel = fexNames;
ax.TickLabelInterpreter = 'none';
ax.YTickLabelRotation = 30;


Not surprisingly, export_fig.

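% Top 10 entries by number of updates
FEX = sortrows(FEX,'NumUpdates','descend');

% Truncate the file names to the first 20 characters (for labeling)
fexNames = cellfun(@(x) x(1:min(20,length(x))),FEX.Name(10:-1:1),'UniformOutput',false);

% Horizontal bar chart with the largest value at the top
% (assumed plotting call; it is not shown in this listing)
barh(FEX.NumUpdates(10:-1:1))

% Axes properties
ax = gca;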
ax.YLim = [0 11];
ax.YTickLabel = fexNames;
ax.TickLabelInterpreter = 'none';
ax.YTickLabelRotation = 30;


"DICOM to NIfTI converter" just beats export_fig.

## Highest percentage of comments by the original author

One way to see how involved the original author was with the user comments is to look at the percentage of comments written by the author. (Yes, an author can be heavily involved without actually responding to comments on the File Exchange. They can choose to respond via email or simply update the files.) To avoid a bias towards entries with only a few comments, I applied an arbitrary qualifying cutoff of 20 comments.

FEX.AuthorCommentRatio = FEX.NumAuthorComments ./ FEX.NumComments;

% Fix 0/0 (-> NaN) to 0
FEX.AuthorCommentRatio(isnan(FEX.AuthorCommentRatio)) = 0;

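% Only look at entries with 20 or more comments
FEX = FEX(FEX.NumComments >= 20,:);

FEX = sortrows(FEX,'AuthorCommentRatio','descend');
FEX(1:5,{'Name','Author','NumComments','NumAuthorComments','NumUpdates'})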

ans =
               Name                             Author              NumComments    NumAuthorComments    NumUpdates
___________________________________    _________________________    ___________    _________________    __________
'ipf(arg1,arg2,arg3,arg4)'             'Tom O'Haver'                23             14                   39
'nth_element'                          'Peter Li'                   26             14                    7
'Tree Controls for User Interfaces'    'Robyn Jackey'               29             15                    6
'Wavelet Based  Image Segmentation'    'Ashutosh Kumar Upadhyay'    23             11                   16
'iPeak'                                'Tom O'Haver'                36             17                   30


Great job folks!
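To see why the cutoff matters, here is a small sketch of the same ratio calculation on invented counts. The table `T` and its values are hypothetical and are not taken from the File Exchange; only the steps mirror the ones above.

```matlab
% Toy counts (invented values, not taken from the File Exchange)
T = table({'a'; 'b'; 'c'; 'd'}, [40; 25; 0; 5], [10; 20; 0; 5], ...
    'VariableNames', {'Name','NumComments','NumAuthorComments'});

% Fraction of comments written by the author; 0/0 gives NaN
T.AuthorCommentRatio = T.NumAuthorComments ./ T.NumComments;
T.AuthorCommentRatio(isnan(T.AuthorCommentRatio)) = 0;

% Qualification cutoff: drop entries with fewer than 20 comments, so that
% entry 'd' (5 of 5 comments by the author, ratio 1) cannot top the list
T = T(T.NumComments >= 20, :);

T = sortrows(T, 'AuthorCommentRatio', 'descend');
disp(T)
```

Without the cutoff, entry `d` would rank first on the strength of only five comments. With it, `b` leads with a ratio of 0.8.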

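% Only look at entries with 10 or more updates
FEX = FEX(FEX.NumUpdates >= 10,:);
FEX(1:5,{'Name','Author','NumComments','NumAuthorComments','NumUpdates'})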

ans =
                       Name                                        Author              NumComments    NumAuthorComments    NumUpdates
__________________________________________________    _________________________    ___________    _________________    __________
'ipf(arg1,arg2,arg3,arg4)'                            'Tom O'Haver'                 23            14                   39
'Wavelet Based  Image Segmentation'                   'Ashutosh Kumar Upadhyay'     23            11                   16
'iPeak'                                               'Tom O'Haver'                 36            17                   30
'Command-line peak fitter for time-series signals'    'Tom O'Haver'                120            54                   41
'Fast Bilateral Filter'                               'Kunal Chaudhury'             20             9                   14


Wow, Tom is up there 3 times!! I'm a little intrigued by the 4th one, which has 120 comments and 41 updates. Let's take a closer look at the timing of those comments and updates.

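% Process the 4th entry
plotID = 4;
comments = FEX.Comments{plotID};
updates = FEX.Updates{plotID};
isAuthor = strcmp(FEX.Author{plotID},comments.Name);

% Create plot. The plotting calls are an assumed reconstruction:
% user comments at y = 1, author comments at y = 1.5, updates at y = 0.5.
% Plotting datetime values with scatter requires R2016b or later.
scatter(comments.Date(~isAuthor),ones(nnz(~isAuthor),1),100, ...
    'MarkerFaceColor','b','MarkerEdgeColor','none','MarkerFaceAlpha',0.25);
hold on
scatter(comments.Date(isAuthor),1.5*ones(nnz(isAuthor),1),100, ...
    'MarkerFaceColor','r','MarkerEdgeColor','none','MarkerFaceAlpha',0.25);
plot(updates.Date,0.5*ones(height(updates),1),'kv', ...
    'DatetimeTickFormat','uuuu');
hold off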

% Axis properties
ax = gca;
ax.YLim = [0 2];
ax.YTick = [1 1.5];
ax.YTickLabel = {'Users','Author'};
ax.YTickLabelRotation = 60;
ax.YGrid = 'on';
title({FEX.Name{plotID},FEX.Author{plotID}})
xlabel('Date')


We can see that there is a nice balance of comments from users and from Tom. The updates seem to arrive at nice, regular intervals, with some happening recently. This is a sign that Tom has been heavily involved in interacting with users and keeping the file up to date.
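The "regular interval" impression can also be checked numerically by looking at the gaps between consecutive updates. The sketch below is an illustration only: the dates are made up rather than taken from Tom's entry, and the `NaT` value stands in for the undated updates seen in the scraped data.

```matlab
% Made-up update dates (illustration only); NaT marks an undated update
updateDates = [datetime(2013,2,1); NaT; datetime(2012,1,15); ...
               datetime(2012,7,20); datetime(2013,8,10)];

% Drop undated updates and put the rest in chronological order
d = sort(updateDates(~isnat(updateDates)));

% Gaps between consecutive updates, in days
gaps = days(diff(d));

fprintf('Median gap: %g days (min %g, max %g)\n', ...
    median(gaps), min(gaps), max(gaps));
```

A small spread between the minimum and maximum gap relative to the median suggests a steady update rhythm. Applying the same steps to `FEX.Updates{plotID}.Date` would give the figures for the entry plotted above.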

Thank you, Tom, for being a great citizen of MATLAB Central and the File Exchange! You are what makes this community thrive.