Top Files and Authors
Sean's going to take this week to celebrate the top files and authors of the File Exchange.
As you may know by now, MATLAB Central is celebrating its 15th birthday. Let's start by making it a File Exchange based birthday cake!
HappyBirthday({'MATLAB' 'Central'}, 15)
Top Files
I figured an interesting thing to look at would be the top files of all time and how the total number of downloads is distributed across files.
T = readtable('fx_downloads.xlsx');
T = sortrows(T,'total_downloads','descend');
And the 15 most downloaded files are:
barh(T.total_downloads(1:15));
ax = gca;
ax.YTickLabel = T.title(1:15);
ax.YDir = 'reverse';
ax.XAxis.Exponent = 0;
ax.YAxis.TickLabelInterpreter = 'none';
xlabel('Total Downloads')
title('Top 15 Files')
It's not a surprise to me at all to see export_fig at the top. We'll dig into it a bit more later. There are also three Arduino support packages up there. This isn't too surprising either given the popularity of Arduinos in recent years.
What about the distribution of downloads across all of the files? Let's look at a histogram of the number of files binned by number of downloads. Note the log scale.
histogram(T.total_downloads, [logspace(0,5,30) inf])
set(gca, 'XScale', 'log')
xlabel('Total Downloads')
ylabel('Number of Files')
title('Download Distribution')
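The bin edges do most of the work in that plot, so here is a minimal sketch of what they do to the data. The toyDownloads values are invented for illustration and are not taken from the File Exchange data; histcounts is used because it returns the same per-bin counts that histogram draws.

```matlab
% Toy download counts; these numbers are invented for illustration only.
toyDownloads = [1 3 12 150 150 2500 40000 250000];

% Same edges as in the histogram call: 30 log-spaced edges from 10^0 to
% 10^5, plus inf so anything beyond 1e5 downloads lands in one final bin.
edges = [logspace(0,5,30) inf];

% histcounts returns the per-bin counts that histogram would plot.
counts = histcounts(toyDownloads, edges);

fprintf('%d bins, %d files counted\n', numel(counts), sum(counts));
fprintf('Files in the open-ended last bin: %d\n', counts(end));
```

With linearly spaced edges, nearly every file would pile into the first bin; log-spaced edges give each order of magnitude roughly the same number of bins.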
Top Authors
So which authors have the most files and downloads?
Sum the total number of downloads grouping by author.
Author = varfun(@sum,T,'GroupingVariables','Creators_name','InputVariables','total_downloads');
summary(Author)
Variables:
    Creators_name: 10468x1 cell string
    GroupCount: 10468x1 double
        Values:
            min       1
            median    1
            max       189
    sum_total_downloads: 10468x1 double
        Values:
            min       1
            median    1154
            max       6.9059e+05
So it looks like there are 10468 unique authors. Most people submit only one file and one person has submitted 189 files. Who's that?
disp(Author(Author.GroupCount == 189,:))
         Creators_name          GroupCount    sum_total_downloads
    ________________________    __________    ___________________
    'Antonio Trujillo-Ortiz'    189           3.7709e+05
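If the grouped varfun call is new to you, here is a small sketch of the same pattern on a toy table. The author names and download counts are made up; only the variable names Creators_name and total_downloads match the real table.

```matlab
% Toy table; names and counts are invented for illustration only.
Creators_name   = {'Ann'; 'Bob'; 'Ann'; 'Cy'; 'Ann'};
total_downloads = [100; 40; 25; 7; 5];
Toy = table(Creators_name, total_downloads);

% One output row per author: GroupCount is the number of files for that
% author, and sum_total_downloads is the sum over those files.
ToyAuthor = varfun(@sum, Toy, ...
    'GroupingVariables', 'Creators_name', ...
    'InputVariables', 'total_downloads');

disp(ToyAuthor)
```

GroupCount comes for free with any grouped varfun call, which is why the file count per author shows up without being asked for.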
What does the distribution of submitted files per author look like?
histogram(Author.GroupCount)
set(gca,'XScale','log')
axis tight
xlabel('Number of Files')
ylabel('Number of Authors')
title('Number of Files per Author')
What about the most downloaded author?
Author = sortrows(Author,'sum_total_downloads','descend');
barh(Author.sum_total_downloads(1:15));
ax = gca;
ax.YTickLabel = Author.Creators_name(1:15);
ax.YDir = 'reverse';
ax.XAxis.Exponent = 0;
ax.YAxis.TickLabelInterpreter = 'none';
xlabel('Total Downloads')
title('Top 15 Authors')
So what about export_fig? It used to belong to Oliver Woodford, the original author. In August 2015, Yair Altman took over maintenance and ownership of it. It's only fair that we give Oliver credit for the years he owned it.
I have another file that has export_fig's history. Read it in and convert the date to datetime for logical indexing and plotting. The original format was 'yyyyMmm', e.g. 2016M07 for July, 2016.
HistoryExportFig = readtable('monthly-export_fig_Downloads.xlsx');
HistoryExportFig.MonthName_Download = datetime(HistoryExportFig.MonthName_Download,'InputFormat','yyyy''M''MM');
summary(HistoryExportFig)
Variables:
    MonthName_Download: 88x1 datetime
        Description:  Original column heading: 'Month Name - Download'
        Values:
            min       01-Apr-2009
            median    16-Nov-2012
            max       01-Jul-2016
    SourceFileId: 88x1 double
        Description:  Original column heading: 'Source File Id'
        Values:
            min       23629
            median    23629
            max       23629
    FileDownloadCount: 88x1 double
        Description:  Original column heading: 'File Download Count'
        Values:
            min       555
            median    2163.5
            max       4082
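The InputFormat in that datetime call is a little cryptic, so here is a minimal sketch of the conversion on two sample strings. The strings are written by hand in the same style as the file's month column; they are not read from the spreadsheet.

```matlab
% Hand-written sample strings in the original style, e.g. 2016M07 for July 2016.
raw = {'2009M04'; '2016M07'};

% In the format, 'M' inside single quotes is a literal character and MM is
% the two-digit month. The quotes are doubled because the format itself
% sits inside a MATLAB character vector.
d = datetime(raw, 'InputFormat', 'yyyy''M''MM');

% The text carries no day, so each value falls on the 1st of its month.
disp(d)
```

Once the column is a datetime, comparisons such as d < datetime(2015,8,18) return a logical vector that can be used directly for indexing.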
How has export_fig use changed over time?
plot(HistoryExportFig.MonthName_Download, HistoryExportFig.FileDownloadCount)
Aug15 = datetime(2015,8,18);
hold on
h = plot([Aug15 Aug15],ylim);
legend(h,'Yair Takes Over','location','northwest')
xlabel('Time')
ylabel('Monthly Downloads')
title('Monthly Export Fig Downloads')
So it looks like export_fig use is in decline. But don't worry, I don't think it's Yair's fault! MATLAB R2014b introduced a new graphics system. With it, printing has improved considerably, which has removed some of the use cases where export_fig really helped, antialiasing for example. As users migrate to newer releases, I'd expect to see this trend continue.
So what happens if we give Oliver credit for the export_fig downloads leading up to August, 2015?
% Sum the file downloads before ownership transferred
beforeAug15 = HistoryExportFig.MonthName_Download < datetime(2015,8,18);
export_fig_Oliver = sum(HistoryExportFig.FileDownloadCount(beforeAug15));
% Add it to Oliver's count
idxOliver = find(strcmp(Author.Creators_name,'Oliver Woodford'));
Author.sum_total_downloads(idxOliver) = Author.sum_total_downloads(idxOliver)+export_fig_Oliver;
% Re-sort
Author = sortrows(Author,'sum_total_downloads','descend');
Is it enough to bring Oliver into the top 15?
idxOliver = find(strcmp(Author.Creators_name,'Oliver Woodford'));
disp(['Oliver''s ranking: ' num2str(idxOliver)])
Oliver's ranking: 23
Not quite, but it moves him up from 129th to 23rd!
Comments
What has been your "Top File" ever? Is there any other way you'd like me to slice this data? Let us know here!