{"id":1291,"date":"2016-01-20T10:16:53","date_gmt":"2016-01-20T15:16:53","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=1291"},"modified":"2015-12-28T10:17:11","modified_gmt":"2015-12-28T15:17:11","slug":"mapping-uber-pickups-in-new-york-city","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2016\/01\/20\/mapping-uber-pickups-in-new-york-city\/","title":{"rendered":"Mapping Uber Pickups in New York City"},"content":{"rendered":"\r\n<div class=\"content\"><!--introduction--><p>I travel a lot and I use ridesharing services like Uber often when I am away. One of my guest bloggers, <a href=\"https:\/\/twitter.com\/toshi2fly\">Toshi<\/a>, just got his first experience with such a service when he visited New York, and that inspired a new post.<\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#d7c16365-b91d-4f5e-8213-44c3ce727db2\">FiveThirtyEight<\/a><\/li><li><a href=\"#9e27f6fa-cf1f-4162-9a65-89fc05d65484\">Raw data<\/a><\/li><li><a href=\"#cb14ad08-3a38-4904-82e3-147c6f323a2a\">Load data with datastore<\/a><\/li><li><a href=\"#5af7e6ad-301b-421a-b29f-2ce04d3fb1d5\">Get New York area map<\/a><\/li><li><a href=\"#8323bdae-7689-4bc2-9575-68b68ab5e365\">Visualize Uber pickup locations<\/a><\/li><li><a href=\"#97223420-fa3a-4caf-b23f-26dba210ad33\">Visualize pickup frequency with a heat map<\/a><\/li><li><a href=\"#2affbc23-6129-456a-83ce-1e1e51968584\">Pickups by month<\/a><\/li><li><a href=\"#4d7503d6-8204-4224-a4e4-d75803aed3f7\">Create GIF Animation<\/a><\/li><li><a href=\"#513d2512-1efa-4bc0-84ea-1963041238e2\">Pickups by day of week<\/a><\/li><li><a href=\"#3b9db5ab-f819-422c-a1fa-52c1084b4d9c\">Pickups by hour<\/a><\/li><li><a href=\"#760efbc0-3037-474a-a0c5-a4f59b4d5da6\">Fast forward to 2015<\/a><\/li><li><a href=\"#16b6f6de-0c2c-407b-9408-ae4ac7e94a3a\">Growth from 2014 to 2015 by month<\/a><\/li><li><a href=\"#dd5ef3f9-1a76-478d-86c8-445fcd4c0bf9\">Growth by day of week<\/a><\/li><li><a 
href=\"#61186ea7-4308-4991-9ca5-647f3976e407\">Growth by hour<\/a><\/li><li><a href=\"#76fdce35-f72e-46d7-bd25-9cd10af6acf7\">Mapping Hourly Pickups in 2015<\/a><\/li><li><a href=\"#f806b69b-d186-4a82-b4d4-fdf94123d434\">Summary<\/a><\/li><\/ul><\/div><h4>FiveThirtyEight<a name=\"d7c16365-b91d-4f5e-8213-44c3ce727db2\"><\/a><\/h4><p>I visited New York for Thanksgiving and I used Uber for the first time (yes, I am a technology laggard when it comes to transportation). Now I understand why ridesharing got so popular.<\/p><p>I noticed FiveThirtyEight has several <a href=\"http:\/\/fivethirtyeight.com\/features\/is-uber-making-nyc-rush-hour-traffic-worse\/\">articles<\/a> about Uber and they make their data available on <a href=\"https:\/\/github.com\/fivethirtyeight\/uber-tlc-foil-response\">GitHub<\/a> for the public. In my <a href=\"https:\/\/blogs.mathworks.com\/loren\/2014\/09\/06\/analyzing-uber-ride-sharing-gps-data\/\">earlier post<\/a> we looked at Uber data from San Francisco. It would be interesting to compare New York and San Francisco Uber usage. I will quickly summarize the San Francisco Uber usage pattern in that dataset (which is no longer available, unfortunately):<\/p><div><ul><li>More rides on weekends than on weekdays<\/li><li>More rides in early morning hours than during the daytime<\/li><\/ul><\/div><h4>Raw data<a name=\"9e27f6fa-cf1f-4162-9a65-89fc05d65484\"><\/a><\/h4><p>I placed the downloaded CSV files into an \"uber-trip-data\" folder in the current folder. The CSV files contain Uber pickup data from April through September 2014. Here is a snippet from a CSV file. 
You can see that it is tabular data with four columns: Date\/Time, Latitude, Longitude, and Base. Base is a company code, and all of the companies are affiliated with Uber in this case.<\/p><pre class=\"codeinput\">dbtype(<span class=\"string\">'uber-trip-data\/uber-raw-data-apr14.csv'<\/span>,<span class=\"string\">'1:8'<\/span>)\r\n<\/pre><pre class=\"codeoutput\">\r\n1     \"Date\/Time\",\"Lat\",\"Lon\",\"Base\"\r\n2     \"4\/1\/2014 0:11:00\",40.769,-73.9549,\"B02512\"\r\n3     \"4\/1\/2014 0:17:00\",40.7267,-74.0345,\"B02512\"\r\n4     \"4\/1\/2014 0:21:00\",40.7316,-73.9873,\"B02512\"\r\n5     \"4\/1\/2014 0:28:00\",40.7588,-73.9776,\"B02512\"\r\n6     \"4\/1\/2014 0:33:00\",40.7594,-73.9722,\"B02512\"\r\n7     \"4\/1\/2014 0:33:00\",40.7383,-74.0403,\"B02512\"\r\n8     \"4\/1\/2014 0:39:00\",40.7223,-73.9887,\"B02512\"\r\n<\/pre><h4>Load data with datastore<a name=\"cb14ad08-3a38-4904-82e3-147c6f323a2a\"><\/a><\/h4><p>When you have multiple tabular data files with the same format, you can use <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/datastore.html\">datastore<\/a> to load everything in one shot using a wildcard character to match multiple file names, instead of reading them one by one.<\/p><pre class=\"codeinput\">ds = datastore(<span class=\"keyword\">...<\/span>\r\n    <span class=\"string\">'uber-trip-data\/uber-raw-data-*14.csv'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">         % wild card char *<\/span>\r\n    <span class=\"string\">'ReadVariableNames'<\/span>,false, <span class=\"keyword\">...<\/span><span class=\"comment\">                      % ignore header<\/span>\r\n    <span class=\"string\">'VariableNames'<\/span>,{<span class=\"string\">'DateTime'<\/span>,<span class=\"string\">'Lat'<\/span>,<span class=\"string\">'Lon'<\/span>,<span class=\"string\">'Base'<\/span>});\r\nds.NumHeaderLines = 1;                                  <span class=\"comment\">% has header line<\/span>\r\nds.TextscanFormats = <span 
class=\"keyword\">...<\/span><span class=\"comment\">                                % set data formats<\/span>\r\n    {<span class=\"string\">'%{M\/d\/yyyy HH:mm:ss}D'<\/span>,<span class=\"string\">'%f'<\/span>,<span class=\"string\">'%f'<\/span>,<span class=\"string\">'%q'<\/span>};\r\npreview(ds)                                             <span class=\"comment\">% preview the data<\/span>\r\n<\/pre><pre class=\"codeoutput\">ans = \r\n        DateTime          Lat        Lon        Base  \r\n    _________________    ______    _______    ________\r\n    4\/1\/2014 00:11:00    40.769    -73.955    'B02512'\r\n    4\/1\/2014 00:17:00    40.727    -74.034    'B02512'\r\n    4\/1\/2014 00:21:00    40.732    -73.987    'B02512'\r\n    4\/1\/2014 00:28:00    40.759    -73.978    'B02512'\r\n    4\/1\/2014 00:33:00    40.759    -73.972    'B02512'\r\n    4\/1\/2014 00:33:00    40.738     -74.04    'B02512'\r\n    4\/1\/2014 00:39:00    40.722    -73.989    'B02512'\r\n    4\/1\/2014 00:45:00    40.762    -73.979    'B02512'\r\n<\/pre><p>When you use <tt>datastore<\/tt>, you don't actually load data. You are simply creating a reference to a data repository. You need to specify variables of interest and explicitly load the actual data in memory. This allows you to selectively read data too large to fit into memory. In our case, you can load everything and save the resulting table to disk. I commented out the following code because I have done this step.<\/p><pre class=\"codeinput\"><span class=\"comment\">% ds.SelectedVariableNames = {'DateTime', 'Lat', 'Lon'};  % select variables<\/span>\r\n<span class=\"comment\">% T = readall(ds);                                        % read all<\/span>\r\n<span class=\"comment\">% save('uber.mat', 'T');                                  % save to disk<\/span>\r\n<\/pre><p>I am going to reload the <a href=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uber.mat\">existing mat file<\/a> instead. 
Let's also load additional settings like latitude\/longitude ranges, image size and landmark coordinates with <a href=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/load_settings.m\">load_settings.m<\/a>.<\/p><pre class=\"codeinput\">load <span class=\"string\">uber<\/span>                                               <span class=\"comment\">% reload data<\/span>\r\nload_settings                                           <span class=\"comment\">% get settings<\/span>\r\n<\/pre><h4>Get New York area map<a name=\"5af7e6ad-301b-421a-b29f-2ce04d3fb1d5\"><\/a><\/h4><p>If you have <a href=\"https:\/\/www.mathworks.com\/products\/mapping\/index.html\">Mapping Toolbox<\/a>, you can download raster maps from a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Web_Map_Service\">Web Map Service<\/a> server. I used a raster map service but you can also use an <a href=\"https:\/\/en.wikipedia.org\/wiki\/OpenStreetMap\">OpenStreetMap<\/a> service. Get the <a href=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/wms.mat\">raster map data<\/a> if you don't have Mapping Toolbox.<\/p><pre class=\"codeinput\"><span class=\"comment\">% wms = wmsinfo(url1);                                    % url1 is for raster<\/span>\r\n<span class=\"comment\">%                                                         % url2 is for OSM<\/span>\r\n<span class=\"comment\">% layer = wms.Layer;                                      % get layer object<\/span>\r\n<span class=\"comment\">% [A,R] = wmsread(layer, 'ImageFormat', 'image\/png', ...  
% read raster image<\/span>\r\n<span class=\"comment\">%     'Lonlim', lim.lon, 'Latlim', lim.lat, ...<\/span>\r\n<span class=\"comment\">%     'ImageHeight', img.h, 'ImageWidth', img.w);<\/span>\r\n\r\nload <span class=\"string\">wms<\/span>\r\n<\/pre><h4>Visualize Uber pickup locations<a name=\"8323bdae-7689-4bc2-9575-68b68ab5e365\"><\/a><\/h4><p>Now we are ready to show the Uber data over the map.<\/p><pre class=\"codeinput\">figure                                                  <span class=\"comment\">% create a new figure<\/span>\r\nusamap(lim.lat, lim.lon);                               <span class=\"comment\">% limit to New York area<\/span>\r\ngeoshow(A, R)                                           <span class=\"comment\">% display raster map<\/span>\r\ngeoshow(T.Lat, T.Lon, <span class=\"keyword\">...<\/span><span class=\"comment\">                               % overlay data points<\/span>\r\n    <span class=\"string\">'DisplayType'<\/span>, <span class=\"string\">'point'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">                         % display as a point<\/span>\r\n    <span class=\"string\">'Marker'<\/span>, <span class=\"string\">'.'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">                                  % use dot<\/span>\r\n    <span class=\"string\">'MarkerSize'<\/span>, 1, <span class=\"keyword\">...<\/span><span class=\"comment\">                                % keep the size small<\/span>\r\n    <span class=\"string\">'MarkerEdgeColor'<\/span>, <span class=\"string\">'c'<\/span>)                             <span class=\"comment\">% set color to cyan<\/span>\r\ntitle({<span class=\"string\">'NYC Uber Pickup Locations'<\/span>; <span class=\"string\">'Apr - Sep 2014'<\/span>})  <span class=\"comment\">% add title<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uberNYC_01.png\" alt=\"\"> <h4>Visualize pickup 
frequency with a heat map<a name=\"97223420-fa3a-4caf-b23f-26dba210ad33\"><\/a><\/h4><p>Manhattan is almost completely blanketed by dense dots and it's hard to see any details. <a href=\"https:\/\/blogs.mathworks.com\/graphics\/\">Mike Garrity<\/a> showed me how to use <a title=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/histogram2.html (link no longer works)\">histogram2<\/a> instead. This function is in base MATLAB and not in Mapping Toolbox. Therefore geospatial coordinates like latitudes and longitudes are treated like ordinary points on a 2D surface. Since lines of longitude get closer together as we move away from the equator, we need to adjust for that with the data aspect ratio, which was loaded as <tt>dar<\/tt> earlier.<\/p><p>We also have to load the raster map as an image, and the x-y coordinates differ between the plot and the image. We need to flip the image and fix the orientation of the plot.<\/p><pre class=\"codeinput\">nbins = 150;                                            <span class=\"comment\">% number of bins<\/span>\r\nxbinedges = linspace(lim.lon(1),lim.lon(2),nbins);      <span class=\"comment\">% x-axis bin edges<\/span>\r\nybinedges = linspace(lim.lat(1),lim.lat(2),nbins);      <span class=\"comment\">% y-axis bin edges<\/span>\r\nmap = flipud(A);                                        <span class=\"comment\">% flip image<\/span>\r\n\r\nfigure\r\nimagesc(lim.lon, lim.lat, map)                          <span class=\"comment\">% show raster map<\/span>\r\nhold <span class=\"string\">on<\/span>                                                 <span class=\"comment\">% don't overwrite<\/span>\r\ncolormap <span class=\"string\">cool<\/span>                                           <span class=\"comment\">% set colormap<\/span>\r\nhistogram2(T.Lon, T.Lat, xbinedges, ybinedges, <span class=\"keyword\">...<\/span><span class=\"comment\">      % overlay histogram<\/span>\r\n    <span class=\"string\">'DisplayStyle'<\/span>, <span class=\"string\">'tile'<\/span>, 
<span class=\"keyword\">...<\/span><span class=\"comment\">                         % in 2D style<\/span>\r\n    <span class=\"string\">'FaceAlpha'<\/span>, 0.5)\r\nhold <span class=\"string\">off<\/span>                                                <span class=\"comment\">% restore default<\/span>\r\ndaspect(dar)                                            <span class=\"comment\">% adjust ratio<\/span>\r\nset(gca,<span class=\"string\">'ydir'<\/span>,<span class=\"string\">'normal'<\/span>);                               <span class=\"comment\">% fix y orientation<\/span>\r\ncaxis([0 5000])                                         <span class=\"comment\">% color axis scaling<\/span>\r\ntitle({<span class=\"string\">'NYC Uber Pickup Frequency'<\/span>; <span class=\"string\">'Apr - Sep 2014'<\/span>})  <span class=\"comment\">% add title<\/span>\r\ntext(lmk1.lon, lmk1.lat, lmk1.str, <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>);       <span class=\"comment\">% add landmarks<\/span>\r\ntext(lmk2.lon, lmk2.lat, lmk2.str, <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">    % add landmarks<\/span>\r\n    <span class=\"string\">'HorizontalAlignment'<\/span>, <span class=\"string\">'right'<\/span>);\r\ncolorbar                                                <span class=\"comment\">% add colorbar<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uberNYC_02.png\" alt=\"\"> <p>The plot shows that Uber is particularly popular along Fifth Avenue, around Grand Central Station, Penn Station, Chelsea, around the Empire State Building, and Soho. It seems New York Uber users are primarily getting picked up at transportation hubs and in shopping areas.<\/p><h4>Pickups by month<a name=\"2affbc23-6129-456a-83ce-1e1e51968584\"><\/a><\/h4><p>Did the number of pickups change over time? 
You can reload the whole dataset and plot a histogram. We see that the volume is increasing month by month.<\/p><pre class=\"codeinput\">months = {<span class=\"string\">'Apr'<\/span>,<span class=\"string\">'May'<\/span>,<span class=\"string\">'Jun'<\/span>,<span class=\"string\">'Jul'<\/span>,<span class=\"string\">'Aug'<\/span>,<span class=\"string\">'Sep'<\/span>};         <span class=\"comment\">% month names<\/span>\r\n\r\nfigure\r\nhistogram(T.DateTime.Month)                             <span class=\"comment\">% plot histogram<\/span>\r\nax = gca;                                               <span class=\"comment\">% get current axes<\/span>\r\nax.XTick = 4:9;                                         <span class=\"comment\">% change ticks<\/span>\r\nax.XTickLabel = months;                                 <span class=\"comment\">% change tick labels<\/span>\r\ntitle(<span class=\"string\">'Number of Uber Pickups by Month'<\/span>)                <span class=\"comment\">% add title<\/span>\r\nxlabel(<span class=\"string\">'Month'<\/span>)                                         <span class=\"comment\">% x axis label<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uberNYC_03.png\" alt=\"\"> <p>Let's plot this over the map and see if we see any variation by location. 
To speed it up, we will reduce the data size by drawing samples in equal proportion from each month.<\/p><pre class=\"codeinput\">c = cvpartition(T.DateTime.Month, <span class=\"string\">'Holdout'<\/span>, 1\/10);     <span class=\"comment\">% partition data<\/span>\r\nTs = T(test(c),:);                                      <span class=\"comment\">% get 1\/10<\/span>\r\n\r\nfigure\r\nimagesc(lim.lon, lim.lat, map)                          <span class=\"comment\">% show raster map<\/span>\r\nhold <span class=\"string\">on<\/span>                                                 <span class=\"comment\">% don't overwrite<\/span>\r\ncolormap <span class=\"string\">winter<\/span>                                         <span class=\"comment\">% set colormap<\/span>\r\ncols = Ts.DateTime.Month;                               <span class=\"comment\">% color by month<\/span>\r\nscatter(Ts.Lon, Ts.Lat, 1, cols, <span class=\"string\">'MarkerEdgeAlpha'<\/span>, .3) <span class=\"comment\">% plot data points<\/span>\r\nhold <span class=\"string\">off<\/span>                                                <span class=\"comment\">% restore default<\/span>\r\nxlim(lim.lon)                                           <span class=\"comment\">% limit x range<\/span>\r\nylim(lim.lat)                                           <span class=\"comment\">% limit y range<\/span>\r\ndaspect(dar)                                            <span class=\"comment\">% adjust ratio<\/span>\r\nset(gca,<span class=\"string\">'ydir'<\/span>,<span class=\"string\">'normal'<\/span>);                               <span class=\"comment\">% fix y orientation<\/span>\r\ntitle({<span class=\"string\">'NYC Uber Pickup Locations by Month'<\/span>; <span class=\"keyword\">...<\/span><span class=\"comment\">        % add title<\/span>\r\n    <span class=\"string\">'Apr - Sep 2014'<\/span>})\r\ncolorbar(<span class=\"string\">'Ticks'<\/span>, unique(cols), <span class=\"string\">'TickLabels'<\/span>, months)   
<span class=\"comment\">% add colorbar<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uberNYC_04.png\" alt=\"\"> <h4>Create GIF Animation<a name=\"4d7503d6-8204-4224-a4e4-d75803aed3f7\"><\/a><\/h4><p>Unfortunately, it is not easy to detect patterns in this plot. Mike Garrity also showed me how to use animation with <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/imwrite.html\">imwrite<\/a> to see the pattern more clearly.<\/p><pre class=\"codeinput\">first = true;                                           <span class=\"comment\">% flag<\/span>\r\n\r\nfigure(<span class=\"string\">'Visible'<\/span>, <span class=\"string\">'off'<\/span>)                                <span class=\"comment\">% make plot invisible<\/span>\r\n<span class=\"keyword\">for<\/span> i = 4:9                                             <span class=\"comment\">% loop over Apr to Sep<\/span>\r\n    imagesc(lim.lon, lim.lat, map)                      <span class=\"comment\">% show raster map<\/span>\r\n    hold <span class=\"string\">on<\/span>                                             <span class=\"comment\">% don't overwrite<\/span>\r\n    colormap <span class=\"string\">cool<\/span>                                       <span class=\"comment\">% set colormap<\/span>\r\n    idx = T.DateTime.Month == i;                        <span class=\"comment\">% pick data by month<\/span>\r\n    histogram2(T.Lon(idx), T.Lat(idx), xbinedges, <span class=\"keyword\">...<\/span><span class=\"comment\">   % overlay histogram<\/span>\r\n        ybinedges, <span class=\"string\">'DisplayStyle'<\/span>, <span class=\"string\">'tile'<\/span>)              <span class=\"comment\">% in 2D style<\/span>\r\n    hold <span class=\"string\">off<\/span>                                            <span class=\"comment\">% restore default<\/span>\r\n    xlim(lim.lon)                                       <span class=\"comment\">% 
limit x range<\/span>\r\n    ylim(lim.lat)                                       <span class=\"comment\">% limit y range<\/span>\r\n    daspect(dar)                                        <span class=\"comment\">% adjust ratio<\/span>\r\n    set(gca,<span class=\"string\">'ydir'<\/span>,<span class=\"string\">'normal'<\/span>);                           <span class=\"comment\">% fix y orientation<\/span>\r\n    title(<span class=\"string\">'NYC Uber Pickup Locations by Month'<\/span>)         <span class=\"comment\">% add title<\/span>\r\n    text(cir.ctr(2), cir.ctr(1), <span class=\"keyword\">...<\/span><span class=\"comment\">                    % add month<\/span>\r\n        {months{i-3};<span class=\"string\">'2014'<\/span>}, <span class=\"keyword\">...<\/span><span class=\"comment\">                       % at an upper left<\/span>\r\n        <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"string\">'FontSize'<\/span>, 20, <span class=\"keyword\">...<\/span><span class=\"comment\">               % corner of the map<\/span>\r\n        <span class=\"string\">'FontWeight'<\/span>, <span class=\"string\">'bold'<\/span>, <span class=\"keyword\">...<\/span>\r\n        <span class=\"string\">'HorizontalAlignment'<\/span>, <span class=\"string\">'center'<\/span>)\r\n    caxis([0 1500])                                     <span class=\"comment\">% color axis scaling<\/span>\r\n    colorbar                                            <span class=\"comment\">% add colorbar<\/span>\r\n    fname = getframe(gcf);                                  <span class=\"comment\">% get the frame<\/span>\r\n    [x,cmap] = rgb2ind(fname.cdata, 128);                   <span class=\"comment\">% get indexed image<\/span>\r\n    <span class=\"keyword\">if<\/span> first                                            <span class=\"comment\">% if first frame<\/span>\r\n        first = false;                                  <span class=\"comment\">% update 
flag<\/span>\r\n        imwrite(x,cmap, <span class=\"string\">'html\/monthly.gif'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">         % save as GIF<\/span>\r\n            <span class=\"string\">'Loopcount'<\/span>, Inf, <span class=\"keyword\">...<\/span><span class=\"comment\">                       % loop animation<\/span>\r\n            <span class=\"string\">'DelayTime'<\/span>, 1);                            <span class=\"comment\">% 1 frame per second<\/span>\r\n    <span class=\"keyword\">else<\/span>                                                <span class=\"comment\">% if image exists<\/span>\r\n        imwrite(x,cmap, <span class=\"string\">'html\/monthly.gif'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">         % append frame<\/span>\r\n            <span class=\"string\">'WriteMode'<\/span>, <span class=\"string\">'append'<\/span>, <span class=\"string\">'DelayTime'<\/span>, 1);     <span class=\"comment\">% to the image<\/span>\r\n    <span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>Now it is easier to see how Uber usage was spreading within Manhattan as well as in the surrounding areas.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/monthly.gif\" alt=\"\"> <\/p><h4>Pickups by day of week<a name=\"513d2512-1efa-4bc0-84ea-1963041238e2\"><\/a><\/h4><p>Let's now check the changes by day of week. 
San Franciscans used Uber more on weekends, but New Yorkers used it more on weekdays.<\/p><pre class=\"codeinput\">week = {<span class=\"string\">'Sun'<\/span>,<span class=\"string\">'Mon'<\/span>,<span class=\"string\">'Tue'<\/span>,<span class=\"string\">'Wed'<\/span>,<span class=\"string\">'Thu'<\/span>,<span class=\"string\">'Fri'<\/span>,<span class=\"string\">'Sat'<\/span>};     <span class=\"comment\">% days of week<\/span>\r\n\r\nfigure\r\nhistogram(weekday(T.DateTime))                          <span class=\"comment\">% plot histogram<\/span>\r\nax = gca;                                               <span class=\"comment\">% get current axes<\/span>\r\nax.XTick = 1:7;                                         <span class=\"comment\">% change ticks<\/span>\r\nax.XTickLabel = week;                                   <span class=\"comment\">% change tick labels<\/span>\r\ntitle(<span class=\"string\">'Number of Uber Pickups by Day of Week'<\/span>)          <span class=\"comment\">% add title<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uberNYC_05.png\" alt=\"\"> <p>Let's animate this over the map again.<\/p><pre class=\"codeinput\">first = true;                                           <span class=\"comment\">% flag<\/span>\r\n\r\nfigure(<span class=\"string\">'Visible'<\/span>, <span class=\"string\">'off'<\/span>)                                <span class=\"comment\">% make plot invisible<\/span>\r\n<span class=\"keyword\">for<\/span> i = 1:7                                             <span class=\"comment\">% loop over Sun to Sat<\/span>\r\n    imagesc(lim.lon, lim.lat, map)                      <span class=\"comment\">% show raster map<\/span>\r\n    hold <span class=\"string\">on<\/span>                                             <span class=\"comment\">% don't overwrite<\/span>\r\n    colormap <span class=\"string\">cool<\/span>                                   
    <span class=\"comment\">% set colormap<\/span>\r\n    idx = weekday(T.DateTime) == i;                     <span class=\"comment\">% pick data by day<\/span>\r\n    histogram2(T.Lon(idx), T.Lat(idx), xbinedges, <span class=\"keyword\">...<\/span><span class=\"comment\">   % overlay histogram<\/span>\r\n        ybinedges, <span class=\"string\">'DisplayStyle'<\/span>, <span class=\"string\">'tile'<\/span>)              <span class=\"comment\">% in 2D style<\/span>\r\n    hold <span class=\"string\">off<\/span>                                            <span class=\"comment\">% restore default<\/span>\r\n    xlim(lim.lon)                                       <span class=\"comment\">% limit x range<\/span>\r\n    ylim(lim.lat)                                       <span class=\"comment\">% limit y range<\/span>\r\n    daspect(dar)                                        <span class=\"comment\">% adjust ratio<\/span>\r\n    set(gca,<span class=\"string\">'ydir'<\/span>,<span class=\"string\">'normal'<\/span>);                           <span class=\"comment\">% fix y orientation<\/span>\r\n    title({<span class=\"string\">'NYC Uber Pickup Locations by Day of Week'<\/span>;  <span class=\"comment\">% add title<\/span>\r\n        <span class=\"string\">'Apr - Sep 2014'<\/span>})\r\n    text(cir.ctr(2), cir.ctr(1), week{i},<span class=\"keyword\">...<\/span><span class=\"comment\">            % add day of week<\/span>\r\n        <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"string\">'FontSize'<\/span>, 20, <span class=\"keyword\">...<\/span><span class=\"comment\">               % at an upper left<\/span>\r\n        <span class=\"string\">'FontWeight'<\/span>, <span class=\"string\">'bold'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">                       % corner of the map<\/span>\r\n        <span class=\"string\">'HorizontalAlignment'<\/span>, <span class=\"string\">'center'<\/span>)\r\n    
caxis([0 1500])                                     <span class=\"comment\">% color axis scaling<\/span>\r\n    colorbar                                            <span class=\"comment\">% add colorbar<\/span>\r\n    fname = getframe(gcf);                                  <span class=\"comment\">% get the frame<\/span>\r\n    [x,cmap] = rgb2ind(fname.cdata, 128);                   <span class=\"comment\">% get indexed image<\/span>\r\n    <span class=\"keyword\">if<\/span> first                                            <span class=\"comment\">% if first frame<\/span>\r\n        first = false;                                  <span class=\"comment\">% update flag<\/span>\r\n        imwrite(x,cmap, <span class=\"string\">'html\/daily.gif'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">           % save as GIF<\/span>\r\n            <span class=\"string\">'Loopcount'<\/span>, Inf, <span class=\"keyword\">...<\/span><span class=\"comment\">                       % loop animation<\/span>\r\n            <span class=\"string\">'DelayTime'<\/span>, 1);                            <span class=\"comment\">% 1 frame per second<\/span>\r\n    <span class=\"keyword\">else<\/span>                                                <span class=\"comment\">% if image exists<\/span>\r\n        imwrite(x,cmap, <span class=\"string\">'html\/daily.gif'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">           % append frame<\/span>\r\n            <span class=\"string\">'WriteMode'<\/span>, <span class=\"string\">'append'<\/span>, <span class=\"string\">'DelayTime'<\/span>, 1);     <span class=\"comment\">% to the image<\/span>\r\n    <span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>The frequency clearly drops off during the weekend across Manhattan.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/daily.gif\" alt=\"\"> <\/p><h4>Pickups by 
hour<a name=\"3b9db5ab-f819-422c-a1fa-52c1084b4d9c\"><\/a><\/h4><p>Uber users in San Francisco were more active during early morning hours. The histogram shows that New Yorkers actually don't stay out as late, and volume peaks during the evening rush hour.<\/p><pre class=\"codeinput\">figure\r\nhistogram(T.DateTime.Hour)                              <span class=\"comment\">% plot histogram<\/span>\r\nxlim([-1 24])                                           <span class=\"comment\">% set x-axis limits<\/span>\r\nax = gca;                                               <span class=\"comment\">% get current axes<\/span>\r\nax.XTick = 0:23;                                        <span class=\"comment\">% change ticks<\/span>\r\ntitle(<span class=\"string\">'Number of Uber Pickups by Hour'<\/span>)                 <span class=\"comment\">% add title<\/span>\r\nxlabel(<span class=\"string\">'Hour'<\/span>)                                          <span class=\"comment\">% x axis label<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uberNYC_06.png\" alt=\"\"> <p>Let's animate this as well.<\/p><pre class=\"codeinput\">first = true;                                           <span class=\"comment\">% flag<\/span>\r\nampm = <span class=\"string\">'AM'<\/span>;                                            <span class=\"comment\">% flag<\/span>\r\n\r\nfigure(<span class=\"string\">'Visible'<\/span>, <span class=\"string\">'off'<\/span>)                                <span class=\"comment\">% make plot invisible<\/span>\r\n<span class=\"keyword\">for<\/span> i = 1:24                                            <span class=\"comment\">% loop over 24 hours<\/span>\r\n    j = i - 1;                                          <span class=\"comment\">% hour starts with zero<\/span>\r\n    imagesc(lim.lon, lim.lat, map)                      <span class=\"comment\">% show raster map<\/span>\r\n    hold <span 
class=\"string\">on<\/span>                                             <span class=\"comment\">% don't overwrite<\/span>\r\n    colormap <span class=\"string\">cool<\/span>                                       <span class=\"comment\">% set colormap<\/span>\r\n    idx = T.DateTime.Hour == j;                         <span class=\"comment\">% pick data by hour<\/span>\r\n    histogram2(T.Lon(idx), T.Lat(idx), xbinedges, <span class=\"keyword\">...<\/span><span class=\"comment\">   % overlay histogram<\/span>\r\n        ybinedges, <span class=\"string\">'DisplayStyle'<\/span>, <span class=\"string\">'tile'<\/span>)              <span class=\"comment\">% in 2D style<\/span>\r\n    line(cir.lon, cir.lat, <span class=\"keyword\">...<\/span><span class=\"comment\">                          % draw clock face<\/span>\r\n        <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"string\">'LineWidth'<\/span>, 3)\r\n    line(hour.x(i,:), hour.y(i,:), <span class=\"keyword\">...<\/span><span class=\"comment\">                  % draw hour hand<\/span>\r\n        <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"string\">'LineWidth'<\/span>, 3)\r\n    line(min.x, min.y, <span class=\"keyword\">...<\/span><span class=\"comment\">                              % draw minute hand<\/span>\r\n        <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"string\">'LineWidth'<\/span>, 3)\r\n    <span class=\"keyword\">if<\/span> j &gt;= 12                                          <span class=\"comment\">% afternoon<\/span>\r\n        ampm = <span class=\"string\">'PM'<\/span>;\r\n    <span class=\"keyword\">end<\/span>\r\n    text(cir.ctr(2), cir.ctr(1) - .02, ampm, <span class=\"keyword\">...<\/span><span class=\"comment\">        % add AM\/PM<\/span>\r\n        <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span 
class=\"string\">'FontSize'<\/span>, 14, <span class=\"keyword\">...<\/span>\r\n        <span class=\"string\">'FontWeight'<\/span>, <span class=\"string\">'bold'<\/span>, <span class=\"keyword\">...<\/span>\r\n        <span class=\"string\">'HorizontalAlignment'<\/span>, <span class=\"string\">'center'<\/span>)\r\n    hold <span class=\"string\">off<\/span>                                            <span class=\"comment\">% restore default<\/span>\r\n    xlim(lim.lon)                                       <span class=\"comment\">% limit x range<\/span>\r\n    ylim(lim.lat)                                       <span class=\"comment\">% limit y range<\/span>\r\n    daspect(dar)                                        <span class=\"comment\">% adjust ratio<\/span>\r\n    set(gca,<span class=\"string\">'ydir'<\/span>,<span class=\"string\">'normal'<\/span>);                           <span class=\"comment\">% fix y orientation<\/span>\r\n    title({<span class=\"string\">'NYC Uber Pickup Locations by Hour'<\/span>; <span class=\"keyword\">...<\/span><span class=\"comment\">     % add title<\/span>\r\n       <span class=\"string\">'Apr - Sep 2014'<\/span>})\r\n    caxis([0 700])                                      <span class=\"comment\">% color axis scaling<\/span>\r\n    colorbar                                            <span class=\"comment\">% add colorbar<\/span>\r\n    fname = getframe(gcf);                              <span class=\"comment\">% get the frame<\/span>\r\n    [x,cmap] = rgb2ind(fname.cdata, 128);               <span class=\"comment\">% get indexed image<\/span>\r\n    <span class=\"keyword\">if<\/span> first                                            <span class=\"comment\">% if first frame<\/span>\r\n        first = false;                                  <span class=\"comment\">% update flag<\/span>\r\n        imwrite(x,cmap, <span class=\"string\">'html\/hourly.gif'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">         
 % save as GIF<\/span>\r\n            <span class=\"string\">'Loopcount'<\/span>, Inf, <span class=\"keyword\">...<\/span><span class=\"comment\">                       % loop animation<\/span>\r\n            <span class=\"string\">'DelayTime'<\/span>, 1);                            <span class=\"comment\">% 1 frame per second<\/span>\r\n    <span class=\"keyword\">else<\/span>                                                <span class=\"comment\">% if image exists<\/span>\r\n        imwrite(x,cmap, <span class=\"string\">'html\/hourly.gif'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">          % append frame<\/span>\r\n            <span class=\"string\">'WriteMode'<\/span>, <span class=\"string\">'append'<\/span>, <span class=\"string\">'DelayTime'<\/span>, 1);     <span class=\"comment\">% to the image<\/span>\r\n    <span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>You can see Midtown gets really busy during the evening rush hour and Soho and Chelsea get more active during the evening.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/hourly.gif\" alt=\"\"> <\/p><h4>Fast forward to 2015<a name=\"760efbc0-3037-474a-a0c5-a4f59b4d5da6\"><\/a><\/h4><p>We also have data from Jan through June 2015, but it is in a different format, and the file size is also much bigger. 
We can use <tt>datastore<\/tt> again.<\/p><pre class=\"codeinput\">csv2015 = <span class=\"string\">'uber-trip-data\/uber-raw-data-janjune-15.csv'<\/span>;<span class=\"comment\">% filename<\/span>\r\ndbtype(csv2015,<span class=\"string\">'1:8'<\/span>)                                   <span class=\"comment\">% show content<\/span>\r\n\r\nds = datastore(csv2015, <span class=\"string\">'ReadVariableNames'<\/span>,false, <span class=\"keyword\">...<\/span><span class=\"comment\">  % setup  datastore<\/span>\r\n    <span class=\"string\">'VariableNames'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">                                % set variable names<\/span>\r\n    {<span class=\"string\">'Dispatching'<\/span>,<span class=\"string\">'Date'<\/span>,<span class=\"string\">'Affiliated'<\/span>,<span class=\"string\">'LocID'<\/span>});\r\nds.NumHeaderLines = 1;                                  <span class=\"comment\">% has header line<\/span>\r\nds.TextscanFormats = <span class=\"keyword\">...<\/span><span class=\"comment\">                                % set data formats<\/span>\r\n    {<span class=\"string\">'%C'<\/span>,<span class=\"string\">'%{yyyy-M-d HH:mm:ss}D'<\/span>,<span class=\"string\">'%C'<\/span>,<span class=\"string\">'%f'<\/span>};\r\n<\/pre><pre class=\"codeoutput\">\r\n1     Dispatching_base_num,Pickup_date,Affiliated_base_num,locationID\r\n2     B02617,2015-05-17 09:47:00,B02617,141\r\n3     B02617,2015-05-17 09:47:00,B02617,65\r\n4     B02617,2015-05-17 09:47:00,B02617,100\r\n5     B02617,2015-05-17 09:47:00,B02774,80\r\n6     B02617,2015-05-17 09:47:00,B02617,90\r\n7     B02617,2015-05-17 09:47:00,B02617,228\r\n8     B02617,2015-05-17 09:47:00,B02617,7\r\n<\/pre><p>This time, we will load data sequentially, take what we need, and discard the rest in order to avoid filling up our computer memory. 
This process takes time and I instead reload <a href=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/nyc2015.mat\">data I saved earlier<\/a>.<\/p><pre class=\"codeinput\"><span class=\"comment\">% ds.SelectedVariableNames = {'Date','LocID'};            % select variables<\/span>\r\n<span class=\"comment\">% months = [];                                            % accumulator<\/span>\r\n<span class=\"comment\">% days = [];                                              % accumulator<\/span>\r\n<span class=\"comment\">% hours = [];                                             % accumulator<\/span>\r\n<span class=\"comment\">% locations = [];                                         % accumulator<\/span>\r\n<span class=\"comment\">% reset(ds)                                               % reset read point<\/span>\r\n<span class=\"comment\">% while hasdata(ds)                                       % loop until end<\/span>\r\n<span class=\"comment\">%     T = read(ds);                                       % read partial<\/span>\r\n<span class=\"comment\">%     months = vertcat(months, T.Date.Month);             % append months<\/span>\r\n<span class=\"comment\">%     days = vertcat(days, weekday(T.Date));              % append days<\/span>\r\n<span class=\"comment\">%     hours = vertcat(hours, T.Date.Hour);                % append hours<\/span>\r\n<span class=\"comment\">%     locations = vertcat(locations, T.LocID);            % append locations<\/span>\r\n<span class=\"comment\">% end<\/span>\r\n\r\nload <span class=\"string\">nyc2015.mat<\/span>\r\n<\/pre><h4>Growth from 2014 to 2015 by month<a name=\"16b6f6de-0c2c-407b-9408-ae4ac7e94a3a\"><\/a><\/h4><p>We can compare data from 2014 and 2015 to see how Uber is growing in New York. 
You can see a dramatic increase in the volume of pickups.<\/p><pre class=\"codeinput\">monthStr = {<span class=\"string\">'Jan'<\/span>,<span class=\"string\">'Feb'<\/span>,<span class=\"string\">'Mar'<\/span>,<span class=\"string\">'Apr'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">                % month names<\/span>\r\n    <span class=\"string\">'May'<\/span>,<span class=\"string\">'Jun'<\/span>,<span class=\"string\">'Jul'<\/span>,<span class=\"string\">'Aug'<\/span>,<span class=\"string\">'Sep'<\/span>};\r\n\r\nfigure\r\nhistogram(months)                                       <span class=\"comment\">% plot histogram<\/span>\r\nhold <span class=\"string\">on<\/span>\r\nhistogram(T.DateTime.Month)                             <span class=\"comment\">% plot histogram<\/span>\r\nhold <span class=\"string\">off<\/span>\r\nax = gca;                                               <span class=\"comment\">% get current axes<\/span>\r\nax.XTick = 1:9;                                         <span class=\"comment\">% change ticks<\/span>\r\nax.XTickLabel = monthStr;                               <span class=\"comment\">% change tick labels<\/span>\r\ntitle(<span class=\"string\">'Number of Uber Pickups by Month'<\/span>)                <span class=\"comment\">% add title<\/span>\r\nlegend(<span class=\"string\">'2015'<\/span>, <span class=\"string\">'2014'<\/span>)                                  <span class=\"comment\">% add legend<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uberNYC_07.png\" alt=\"\"> <h4>Growth by day of week<a name=\"dd5ef3f9-1a76-478d-86c8-445fcd4c0bf9\"><\/a><\/h4><p>When you look at the data by day of week, you see a usage shift: New Yorkers started to use Uber on weekends as well as weekdays.<\/p><pre class=\"codeinput\">figure\r\nhistogram(days)                                         <span class=\"comment\">% plot 
histogram<\/span>\r\nhold <span class=\"string\">on<\/span>\r\nhistogram(weekday(T.DateTime))                          <span class=\"comment\">% plot histogram<\/span>\r\nhold <span class=\"string\">off<\/span>\r\nax = gca;                                               <span class=\"comment\">% get current axes<\/span>\r\nax.XTick = 1:7;                                         <span class=\"comment\">% change ticks<\/span>\r\nax.XTickLabel = week;                                   <span class=\"comment\">% change tick labels<\/span>\r\ntitle(<span class=\"string\">'Number of Uber Pickups by Day of Week'<\/span>)          <span class=\"comment\">% add title<\/span>\r\nlegend(<span class=\"string\">'2015'<\/span>, <span class=\"string\">'2014'<\/span>, <span class=\"string\">'Location'<\/span>, <span class=\"string\">'NorthWest'<\/span>)         <span class=\"comment\">% add legend<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uberNYC_08.png\" alt=\"\"> <h4>Growth by hour<a name=\"61186ea7-4308-4991-9ca5-647f3976e407\"><\/a><\/h4><p>However, New Yorkers still don't use Uber a lot in early morning hours, and still use it heavily during the evening rush hour.<\/p><pre class=\"codeinput\">figure\r\nhistogram(hours)                                        <span class=\"comment\">% plot histogram<\/span>\r\nhold <span class=\"string\">on<\/span>\r\nhistogram(T.DateTime.Hour)                              <span class=\"comment\">% plot histogram<\/span>\r\nxlim([-1 24])                                           <span class=\"comment\">% set x-axis limits<\/span>\r\nax = gca;                                               <span class=\"comment\">% get current axes<\/span>\r\nax.XTick = 0:23;                                        <span class=\"comment\">% change ticks<\/span>\r\ntitle(<span class=\"string\">'Number of Uber Pickups by Hour'<\/span>)                 <span class=\"comment\">% add 
title<\/span>\r\nxlabel(<span class=\"string\">'Hour'<\/span>)                                          <span class=\"comment\">% x axis label<\/span>\r\nlegend(<span class=\"string\">'2015'<\/span>, <span class=\"string\">'2014'<\/span>, <span class=\"string\">'Location'<\/span>, <span class=\"string\">'NorthWest'<\/span>)         <span class=\"comment\">% add legend<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uberNYC_09.png\" alt=\"\"> <h4>Mapping Hourly Pickups in 2015<a name=\"76fdce35-f72e-46d7-bd25-9cd10af6acf7\"><\/a><\/h4><p>Do we see any change in geographic pattern along with the volume increase? Instead of latitudes and longitudes, we just have location Ids for pickups in \"taxi-zone-lookup.csv\". For mapping I added latitudes and longitudes in a <a href=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/latlon.xlsx\">separate file<\/a>.<\/p><pre class=\"codeinput\">latlon = readtable(<span class=\"string\">'uber-trip-data\/latlon.xlsx'<\/span>);       <span class=\"comment\">% load lat lon data<\/span>\r\nonMap = ismember(locations, latlon.LocationID);         <span class=\"comment\">% find points on map<\/span>\r\nlocations = locations(onMap);                           <span class=\"comment\">% points on map only<\/span>\r\nhours = hours(onMap);                                   <span class=\"comment\">% hours on map only<\/span>\r\n\r\nfirst = true;                                           <span class=\"comment\">% flag<\/span>\r\nampm = <span class=\"string\">'AM'<\/span>;                                            <span class=\"comment\">% flag<\/span>\r\n\r\nfigure(<span class=\"string\">'Visible'<\/span>, <span class=\"string\">'off'<\/span>)                                <span class=\"comment\">% make plot invisible<\/span>\r\n<span class=\"keyword\">for<\/span> i = 1:24                                            <span class=\"comment\">% loop over 24 
hours<\/span>\r\n    j = i - 1;                                          <span class=\"comment\">% hour starts with zero<\/span>\r\n    curHour = hours == j;                               <span class=\"comment\">% current hour<\/span>\r\n    [locId, ~, idx] = unique(locations(curHour));       <span class=\"comment\">% get unique loc ids<\/span>\r\n    count = accumarray(idx,1);                          <span class=\"comment\">% pickups by location<\/span>\r\n    rows = ismember(latlon.LocationID,locId);           <span class=\"comment\">% get matching rows<\/span>\r\n    imagesc(lim.lon, lim.lat, map)                      <span class=\"comment\">% show raster map<\/span>\r\n    hold <span class=\"string\">on<\/span>                                             <span class=\"comment\">% don't overwrite<\/span>\r\n    colormap <span class=\"string\">cool<\/span>                                       <span class=\"comment\">% set colormap<\/span>\r\n    scatter(latlon.Lon(rows), latlon.Lat(rows), 100, <span class=\"keyword\">...<\/span><span class=\"comment\">% plot data points<\/span>\r\n        count, <span class=\"string\">'filled'<\/span>, <span class=\"string\">'MarkerFaceAlpha'<\/span>, 0.7)        <span class=\"comment\">% color by count<\/span>\r\n    line(cir.lon, cir.lat, <span class=\"keyword\">...<\/span><span class=\"comment\">                          % draw clock face<\/span>\r\n        <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"string\">'LineWidth'<\/span>, 3)\r\n    line(hour.x(i,:), hour.y(i,:), <span class=\"keyword\">...<\/span><span class=\"comment\">                  % draw hour handle<\/span>\r\n        <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"string\">'LineWidth'<\/span>, 3)\r\n    line(min.x, min.y, <span class=\"keyword\">...<\/span><span class=\"comment\">                              % draw min handle<\/span>\r\n        <span 
class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"string\">'LineWidth'<\/span>, 3)\r\n    <span class=\"keyword\">if<\/span> j &gt;= 12                                          <span class=\"comment\">% afternoon<\/span>\r\n        ampm = <span class=\"string\">'PM'<\/span>;\r\n    <span class=\"keyword\">end<\/span>\r\n    text(cir.ctr(2), cir.ctr(1) - .02, ampm, <span class=\"keyword\">...<\/span><span class=\"comment\">        % add AM\/PM<\/span>\r\n        <span class=\"string\">'Color'<\/span>, <span class=\"string\">'w'<\/span>, <span class=\"string\">'FontSize'<\/span>, 14, <span class=\"keyword\">...<\/span>\r\n        <span class=\"string\">'FontWeight'<\/span>, <span class=\"string\">'bold'<\/span>, <span class=\"keyword\">...<\/span>\r\n        <span class=\"string\">'HorizontalAlignment'<\/span>, <span class=\"string\">'center'<\/span>)\r\n    hold <span class=\"string\">off<\/span>                                            <span class=\"comment\">% restore default<\/span>\r\n    xlim(lim.lon)                                       <span class=\"comment\">% limit x range<\/span>\r\n    ylim(lim.lat)                                       <span class=\"comment\">% limit y range<\/span>\r\n    daspect(dar)                                        <span class=\"comment\">% adjust ratio<\/span>\r\n    set(gca,<span class=\"string\">'ydir'<\/span>,<span class=\"string\">'normal'<\/span>);                           <span class=\"comment\">% fix y orientation<\/span>\r\n    caxis([0 20000])                                    <span class=\"comment\">% color axis scaling<\/span>\r\n    title({<span class=\"string\">'NYC Uber Pickups by Zone'<\/span>; <span class=\"string\">'Jan-Jun 2015'<\/span>}) <span class=\"comment\">% add title<\/span>\r\n    colorbar                                            <span class=\"comment\">% add colorbar<\/span>\r\n    fname = getframe(gcf);                              <span class=\"comment\">% 
get the frame<\/span>\r\n    [x,cmap] = rgb2ind(fname.cdata, 128);               <span class=\"comment\">% get indexed image<\/span>\r\n    <span class=\"keyword\">if<\/span> first                                            <span class=\"comment\">% if first frame<\/span>\r\n        first = false;                                  <span class=\"comment\">% update flag<\/span>\r\n        imwrite(x,cmap, <span class=\"string\">'html\/nyc2015.gif'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">         % save as GIF<\/span>\r\n            <span class=\"string\">'Loopcount'<\/span>, Inf, <span class=\"keyword\">...<\/span><span class=\"comment\">                       % loop animation<\/span>\r\n            <span class=\"string\">'DelayTime'<\/span>, 1);                            <span class=\"comment\">% 1 frame per second<\/span>\r\n    <span class=\"keyword\">else<\/span>                                                <span class=\"comment\">% if image exists<\/span>\r\n        imwrite(x,cmap, <span class=\"string\">'html\/nyc2015.gif'<\/span>, <span class=\"keyword\">...<\/span><span class=\"comment\">         % append frame<\/span>\r\n            <span class=\"string\">'WriteMode'<\/span>, <span class=\"string\">'append'<\/span>, <span class=\"string\">'DelayTime'<\/span>, 1);     <span class=\"comment\">% to the image<\/span>\r\n    <span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>The traffic pattern hasn't changed very much from 2014, but you can now see some hot spots in Brooklyn and Queens in the evening rush hour.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/nyc2015.gif\" alt=\"\"> <\/p><h4>Summary<a name=\"f806b69b-d186-4a82-b4d4-fdf94123d434\"><\/a><\/h4><p>It is very interesting to see such a difference in Uber usage between New York and San Francisco. 
New Yorkers seem to use Uber for commuting and shopping, but it doesn't seem to be a big part of night life, whereas we saw earlier that San Francisco users were more active in the early morning hours. What accounts for this difference? Share your thoughts <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=1291#respond\">here<\/a>!<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_2885e67620ec459f8fd9169b204ca5f7() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='2885e67620ec459f8fd9169b204ca5f7 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 2885e67620ec459f8fd9169b204ca5f7';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. 
\r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2016 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_2885e67620ec459f8fd9169b204ca5f7()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2015b<br><\/p><\/div><!--\r\n2885e67620ec459f8fd9169b204ca5f7 ##### SOURCE BEGIN #####\r\n%% Mapping Uber Pickups in New York City\r\n% I travel a lot and I use ridesharing services like Uber often when I am\r\n% away. One of my guest bloggers, <https:\/\/twitter.com\/toshi2fly Toshi>,\r\n% just got his first experience with such a service when he visited New\r\n% York, and that inspired a new post.\r\n%\r\n%% FiveThirtyEight\r\n% I visited New York for Thanksgiving and I used Uber for the first time\r\n% (Yes, I am a technology laggard when it comes to transportation). Now I\r\n% understand why ridesharing got so popular. 
\r\n% \r\n% I noticed FiveThirtyEight has several\r\n% <http:\/\/fivethirtyeight.com\/features\/is-uber-making-nyc-rush-hour-traffic-worse\/\r\n% articles> about Uber and they make their data available on\r\n% <https:\/\/github.com\/fivethirtyeight\/uber-tlc-foil-response GitHub> for\r\n% the public. In my\r\n% <https:\/\/blogs.mathworks.com\/loren\/2014\/09\/06\/analyzing-uber-ride-sharing-gps-data\/\r\n% earlier post> we looked at Uber data from San Francisco. It would be curious to compare New York\r\n% and San Francisco Uber usage. I will quickly summarize San Francisco Uber\r\n% usage pattern in that dataset (which is no longer available,\r\n% unfortunately): \r\n% \r\n% * More rides in the weekends than during the weekdays\r\n% * More rides in early morning hours than during the daytime\r\n% \r\n%% Raw data\r\n% I placed the downloaded CSV files into \"uber-trip-data\" folder in the\r\n% current folder. CSV files contain Uber pickup data from April through\r\n% September 2014. Here is a snippet from a CSV file. You can see that it is\r\n% a tabular data with four columns - Date\/Time, Latitude, Longitude, and\r\n% Base, which is a company code, all affiliated with Uber in this case.\r\n\r\ndbtype('uber-trip-data\/uber-raw-data-apr14.csv','1:8')\r\n\r\n%% Load data with datastore\r\n% When you have multiple tabular data files with the same format, you can\r\n% use <https:\/\/www.mathworks.com\/help\/matlab\/datastore.html datastore> to\r\n% load everything in one shot using a wild card character to match multiple\r\n% file names, instead of reading them one by one.\r\n                         \r\nds = datastore(...\r\n    'uber-trip-data\/uber-raw-data-*14.csv', ...         % wild card char *\r\n    'ReadVariableNames',false, ...                      % ignore header\r\n    'VariableNames',{'DateTime','Lat','Lon','Base'});\r\nds.NumHeaderLines = 1;                                  % has header line\r\nds.TextscanFormats = ...                                
% set data formats\r\n    {'%{M\/d\/yyyy HH:mm:ss}D','%f','%f','%q'};\r\npreview(ds)                                             % preview the data\r\n\r\n%%\r\n% When you use |datastore|, you don't actually load data. You are simply\r\n% creating a reference to a data repository. You need to specify variables\r\n% of interest and explicitly load the actual data in memory. This allows\r\n% you to selectively read data too large to fit into memory. In our case,\r\n% you can load everything and save the resulting table to disk. I commented\r\n% out the following code because I have done this step.\r\n\r\n% ds.SelectedVariableNames = {'DateTime', 'Lat', 'Lon'};  % select variables\r\n% T = readall(ds);                                        % read all\r\n% save('uber.mat', 'T');                                  % save to disk\r\n\r\n%%\r\n% I am going to reload the\r\n% <https:\/\/blogs.mathworks.com\/images\/loren\/2016\/uber.mat existing\r\n% mat file> instead. Let's also load additional settings like\r\n% latitude\/longitude ranges, image size and landmark coordinates with\r\n% <https:\/\/blogs.mathworks.com\/images\/loren\/2016\/load_settings.m\r\n% load_settings.m>.\r\n\r\nload uber                                               % reload data\r\nload_settings                                           % get settings  \r\n\r\n%% Get New York area map\r\n% If you have <https:\/\/www.mathworks.com\/products\/mapping\/index.html Mapping\r\n% Toolbox>, you can download raster maps from a\r\n% <https:\/\/en.wikipedia.org\/wiki\/Web_Map_Service Web Map Service> server. I\r\n% used a raster map service but you can also use an\r\n% <https:\/\/en.wikipedia.org\/wiki\/OpenStreetMap OpenStreetMap> service. 
Get\r\n% the <https:\/\/blogs.mathworks.com\/images\/loren\/2016\/wms.mat raster map\r\n% data> if you don't have Mapping Toolbox.\r\n\r\n% wms = wmsinfo(url1);                                    % url1 is for raster\r\n%                                                         % url2 is for OSM\r\n% layer = wms.Layer;                                      % get layer object\r\n% [A,R] = wmsread(layer, 'ImageFormat', 'image\/png', ...  % read raster image\r\n%     'Lonlim', lim.lon, 'Latlim', lim.lat, ...\r\n%     'ImageHeight', img.h, 'ImageWidth', img.w);\r\n\r\nload wms\r\n\r\n%% Visualize Uber pickup locations\r\n% Now we are ready to show the Uber data over the map. \r\n\r\nfigure                                                  % create a new figure\r\nusamap(lim.lat, lim.lon);                               % limit to New York area\r\ngeoshow(A, R)                                           % display raster map\r\ngeoshow(T.Lat, T.Lon, ...                               % overlay data points\r\n    'DisplayType', 'point', ...                         % display as a point\r\n    'Marker', '.', ...                                  % use dot\r\n    'MarkerSize', 1, ...                                % keep the size small\r\n    'MarkerEdgeColor', 'c')                             % set color to cyan\r\ntitle({'NYC Uber Pickup Locations'; 'Apr - Sep 2014'})  % add title\r\n\r\n%% Visualize pickup frequency with a heat map\r\n% Manhattan is almost completely blanketed by dense dots and it's hard to\r\n% see any details. <https:\/\/blogs.mathworks.com\/graphics\/ Mike Garrity>\r\n% showed me how to use\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/histogram2.html histogram2>\r\n% instead. This function is in base MATLAB and not in Mapping Toolbox.\r\n% Therefore geospatial coordinates like latitudes and longitudes are\r\n% treated like ordinary points on a 2D surface. 
Since longitudes get closer\r\n% as we move away from the equator, we need to adjust for that with data\r\n% aspect ratio, which was loaded as |dar| earlier.\r\n%\r\n% We also have to load the raster map as an image, and x-y coordinates are\r\n% different between the plot and image. We need to flip the image and fix\r\n% the orientation of the plot.\r\n\r\nnbins = 150;                                            % number of bins\r\nxbinedges = linspace(lim.lon(1),lim.lon(2),nbins);      % x-axis bin edges\r\nybinedges = linspace(lim.lat(1),lim.lat(2),nbins);      % y-axis bin edges\r\nmap = flipud(A);                                        % flip image\r\n\r\nfigure\r\nimagesc(lim.lon, lim.lat, map)                          % show raster map\r\nhold on                                                 % don't overwrite\r\ncolormap cool                                           % set colormap\r\nhistogram2(T.Lon, T.Lat, xbinedges, ybinedges, ...      % overlay histogram\r\n    'DisplayStyle', 'tile', ...                         % in 2D style\r\n    'FaceAlpha', 0.5)                             \r\nhold off                                                % restore default\r\ndaspect(dar)                                            % adjust ratio\r\nset(gca,'ydir','normal');                               % fix y orientation\r\ncaxis([0 5000])                                         % color axis scaling\r\ntitle({'NYC Uber Pickup Frequency'; 'Apr - Sep 2014'})  % add title\r\ntext(lmk1.lon, lmk1.lat, lmk1.str, 'Color', 'w');       % add landmarks\r\ntext(lmk2.lon, lmk2.lat, lmk2.str, 'Color', 'w', ...    % add landmarks\r\n    'HorizontalAlignment', 'right');\r\ncolorbar                                                % add colorbar\r\n\r\n%%\r\n% The plot shows that Uber is particularly popular along Fifth Avenue,\r\n% around Grand Central Station, Penn Station, Chelsea, around the Empire\r\n% State Building, and Soho. 
It seems New York Uber users are primarily\r\n% interested in getting around from transportation hubs and shopping areas?\r\n\r\n%% Pickups by month\r\n% Did the number of pickups change over time? You can reload the whole\r\n% dataset and plot a histogram. We see that the volume is increasing month\r\n% by month.\r\n\r\nmonths = {'Apr','May','Jun','Jul','Aug','Sep'};         % month names\r\n\r\nfigure\r\nhistogram(T.DateTime.Month)                             % plot histogram\r\nax = gca;                                               % get current axes\r\nax.XTick = 4:9;                                         % change ticks\r\nax.XTickLabel = months;                                 % change tick labels\r\ntitle('Number of Uber Pickups by Month')                % add title\r\nxlabel('Month')                                         % x axis label\r\n\r\n%%\r\n% Let's plot this over the map and see if we see any variation by location.\r\n% To speed it up, we will reduce the data size by drawing samples in equal\r\n% proportion from each month.\r\n\r\nc = cvpartition(T.DateTime.Month, 'Holdout', 1\/10);     % partition data\r\nTs = T(test(c),:);                                      % get 1\/10\r\n\r\nfigure\r\nimagesc(lim.lon, lim.lat, map)                          % show raster map\r\nhold on                                                 % don't overwrite\r\ncolormap winter                                         % set colormap\r\ncols = Ts.DateTime.Month;                               % color by month\r\nscatter(Ts.Lon, Ts.Lat, 1, cols, 'MarkerEdgeAlpha', .3) % plot data points\r\nhold off                                                % restore default\r\nxlim(lim.lon)                                           % limit x range\r\nylim(lim.lat)                                           % limit y range\r\ndaspect(dar)                                            % adjust ratio\r\nset(gca,'ydir','normal');                               % fix y 
orientation\r\ntitle({'NYC Uber Pickup Locations by Month'; ...        % add title\r\n    'Apr - Sep 2014'})\r\ncolorbar('Ticks', unique(cols), 'TickLabels', months)   % add colorbar\r\n\r\n%% Create GIF Animation\r\n% Unfortunately, it is not easy to detect patterns in this plot. Mike\r\n% Garrity also showed me how to use animation with\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/imwrite.html imwrite> to see\r\n% the pattern more clearly.\r\n\r\nfirst = true;                                           % flag\r\n\r\nfigure('Visible', 'off')                                % make plot invisible\r\nfor i = 4:9                                             % loop over Apr to Sep \r\n    imagesc(lim.lon, lim.lat, map)                      % show raster map\r\n    hold on                                             % don't overwrite\r\n    colormap cool                                       % set colormap\r\n    idx = T.DateTime.Month == i;                        % pick data by month\r\n    histogram2(T.Lon(idx), T.Lat(idx), xbinedges, ...   % overlay histogram\r\n        ybinedges, 'DisplayStyle', 'tile')              % in 2D style\r\n    hold off                                            % restore default\r\n    xlim(lim.lon)                                       % limit x range\r\n    ylim(lim.lat)                                       % limit y range\r\n    daspect(dar)                                        % adjust ratio\r\n    set(gca,'ydir','normal');                           % fix y orientation\r\n    title('NYC Uber Pickup Locations by Month')         % add title\r\n    text(cir.ctr(2), cir.ctr(1), ...                    % add month\r\n        {months{i-3};'2014'}, ...                       % at an upper left\r\n        'Color', 'w', 'FontSize', 20, ...               
% corner of the map\r\n        'FontWeight', 'bold', ...\r\n        'HorizontalAlignment', 'center')\r\n    caxis([0 1500])                                     % color axis scaling\r\n    colorbar                                            % add colorbar\r\n    fname = getframe(gcf);                              % get the frame\r\n    [x,cmap] = rgb2ind(fname.cdata, 128);               % get indexed image\r\n    if first                                            % if first frame\r\n        first = false;                                  % update flag\r\n        imwrite(x,cmap, 'html\/monthly.gif', ...         % save as GIF\r\n            'Loopcount', Inf, ...                       % loop animation\r\n            'DelayTime', 1);                            % 1 frame per second\r\n    else                                                % if image exists\r\n        imwrite(x,cmap, 'html\/monthly.gif', ...         % append frame\r\n            'WriteMode', 'append', 'DelayTime', 1);     % to the image \r\n    end\r\nend\r\n\r\n%%\r\n% Now it is easier to see how Uber usage was spreading within Manhattan as\r\n% well as in the surrounding areas.\r\n%\r\n% <<monthly.gif>>\r\n%\r\n%% Pickups by day of week\r\n% Let's now check the changes by day of week. San Franciscans used Uber\r\n% more on weekends, but New Yorkers used it more on weekdays.\r\n\r\nweek = {'Sun','Mon','Tue','Wed','Thu','Fri','Sat'};     % days of week\r\n\r\nfigure\r\nhistogram(weekday(T.DateTime))                          % plot histogram\r\nax = gca;                                               % get current axes\r\nax.XTick = 1:7;                                         % change ticks\r\nax.XTickLabel = week;                                   % change tick labels\r\ntitle('Number of Uber Pickups by Day of Week')          % add title\r\n\r\n%%\r\n% Let's animate this over the map again. 
\r\n\r\nfirst = true;                                           % flag\r\n\r\nfigure('Visible', 'off')                                % make plot invisible\r\nfor i = 1:7                                             % loop over Sun to Sat\r\n    imagesc(lim.lon, lim.lat, map)                      % show raster map\r\n    hold on                                             % don't overwrite\r\n    colormap cool                                       % set colormap\r\n    idx = weekday(T.DateTime) == i;                     % pick data by day\r\n    histogram2(T.Lon(idx), T.Lat(idx), xbinedges, ...   % overlay histogram\r\n        ybinedges, 'DisplayStyle', 'tile')              % in 2D style\r\n    hold off                                            % restore default\r\n    xlim(lim.lon)                                       % limit x range\r\n    ylim(lim.lat)                                       % limit y range\r\n    daspect(dar)                                        % adjust ratio\r\n    set(gca,'ydir','normal');                           % fix y orientation\r\n    title({'NYC Uber Pickup Locations by Day of Week';  % add title\r\n        'Apr - Sep 2014'})                             \r\n    text(cir.ctr(2), cir.ctr(1), week{i},...            % add day of week                          \r\n        'Color', 'w', 'FontSize', 20, ...               % at an upper left\r\n        'FontWeight', 'bold', ...                       
% corner of the map\r\n        'HorizontalAlignment', 'center')\r\n    caxis([0 1500])                                     % color axis scaling\r\n    colorbar                                            % add colorbar\r\n    fname = getframe(gcf);                                  % get the frame\r\n    [x,cmap] = rgb2ind(fname.cdata, 128);                   % get indexed image\r\n    if first                                            % if first frame\r\n        first = false;                                  % update flag\r\n        imwrite(x,cmap, 'html\/daily.gif', ...           % save as GIF\r\n            'Loopcount', Inf, ...                       % loop animation\r\n            'DelayTime', 1);                            % 1 frame per second\r\n    else                                                % if image exists\r\n        imwrite(x,cmap, 'html\/daily.gif', ...           % append frame\r\n            'WriteMode', 'append', 'DelayTime', 1);     % to the image \r\n    end\r\nend\r\n\r\n%%\r\n% The frequency clearly drops off during the weekend across Manhattan.\r\n% \r\n% <<daily.gif>>\r\n%\r\n\r\n%% Pickups by hour\r\n% Uber users in San Francisco were more active during earlier morning\r\n% hours. The histogram shows that New Yorkers actually don't stay out as\r\n% late, and volume peaks during the evening rush hour.\r\n\r\nfigure\r\nhistogram(T.DateTime.Hour)                              % plot histogram\r\nxlim([-1 24])                                           % set x-axis limits\r\nax = gca;                                               % get current axes\r\nax.XTick = 0:23;                                        % change ticks\r\ntitle('Number of Uber Pickups by Hour')                 % add title\r\nxlabel('Hour')                                          % x axis label\r\n\r\n%%\r\n% Let's animate this as well. 
\r\n\r\nfirst = true;                                           % flag\r\nampm = 'AM';                                            % flag\r\n\r\nfigure('Visible', 'off')                                % make plot invisible\r\nfor i = 1:24                                            % loop over 24 hours\r\n    j = i - 1;                                          % hour starts with zero\r\n    imagesc(lim.lon, lim.lat, map)                      % show raster map\r\n    hold on                                             % don't overwrite\r\n    colormap cool                                       % set colormap\r\n    idx = T.DateTime.Hour == j;                         % pick data by hour\r\n    histogram2(T.Lon(idx), T.Lat(idx), xbinedges, ...   % overlay histogram\r\n        ybinedges, 'DisplayStyle', 'tile')              % in 2D style\r\n    line(cir.lon, cir.lat, ...                          % draw clock face\r\n        'Color', 'w', 'LineWidth', 3)\r\n    line(hour.x(i,:), hour.y(i,:), ...                  % draw hour handle\r\n        'Color', 'w', 'LineWidth', 3) \r\n    line(min.x, min.y, ...                              % draw min handle\r\n        'Color', 'w', 'LineWidth', 3)\r\n    if j >= 12                                          % afternoon                   \r\n        ampm = 'PM';\r\n    end\r\n    text(cir.ctr(2), cir.ctr(1) - .02, ampm, ...        % add AM\/PM\r\n        'Color', 'w', 'FontSize', 14, ... \r\n        'FontWeight', 'bold', ... \r\n        'HorizontalAlignment', 'center')\r\n    hold off                                            % restore default\r\n    xlim(lim.lon)                                       % limit x range\r\n    ylim(lim.lat)                                       % limit y range\r\n    daspect(dar)                                        % adjust ratio\r\n    set(gca,'ydir','normal');                           % fix y orientation\r\n    title({'NYC Uber Pickup Locations by Hour'; ...     
% add title\r\n       'Apr - Sep 2014'})              \r\n    caxis([0 700])                                      % color axis scaling\r\n    colorbar                                            % add colorbar\r\n    fname = getframe(gcf);                              % get the frame\r\n    [x,cmap] = rgb2ind(fname.cdata, 128);               % get indexed image\r\n    if first                                            % if first frame\r\n        first = false;                                  % update flag\r\n        imwrite(x,cmap, 'html\/hourly.gif', ...          % save as GIF\r\n            'Loopcount', Inf, ...                       % loop animation\r\n            'DelayTime', 1);                            % 1 frame per second\r\n    else                                                % if image exists\r\n        imwrite(x,cmap, 'html\/hourly.gif', ...          % append frame\r\n            'WriteMode', 'append', 'DelayTime', 1);     % to the image \r\n    end\r\nend\r\n\r\n%%\r\n% You can see Midtown gets really busy during the evening rush hour and\r\n% Soho and Chelsea get more active during the evening.\r\n%\r\n% <<hourly.gif>>\r\n%\r\n%% Fast forward to 2015\r\n% We also have data from Jan through June 2015, but it is in a different\r\n% format, and the file size is also much bigger. We can use |datastore|\r\n% again.\r\n\r\ncsv2015 = 'uber-trip-data\/uber-raw-data-janjune-15.csv';% filename\r\ndbtype(csv2015,'1:8')                                   % show content\r\n\r\nds = datastore(csv2015, 'ReadVariableNames',false, ...  % setup  datastore\r\n    'VariableNames', ...                                % set variable names\r\n    {'Dispatching','Date','Affiliated','LocID'});\r\nds.NumHeaderLines = 1;                                  % has header line\r\nds.TextscanFormats = ...                                
% set data formats\r\n    {'%C','%{yyyy-M-d HH:mm:ss}D','%C','%f'};\r\n\r\n%%\r\n% This time, we will load data sequentially, take what we need, and discard\r\n% the rest in order to avoid filling up our computer memory. This process\r\n% takes time and I instead reload\r\n% <https:\/\/blogs.mathworks.com\/images\/loren\/2016\/nyc2015.mat data I saved\r\n% earlier>.\r\n\r\n% ds.SelectedVariableNames = {'Date','LocID'};            % select variables\r\n% months = [];                                            % accumulator\r\n% days = [];                                              % accumulator\r\n% hours = [];                                             % accumulator\r\n% locations = [];                                         % accumulator\r\n% reset(ds)                                               % reset read point\r\n% while hasdata(ds)                                       % loop until end\r\n%     T = read(ds);                                       % read partial\r\n%     months = vertcat(months, T.Date.Month);             % append months\r\n%     days = vertcat(days, weekday(T.Date));              % append days\r\n%     hours = vertcat(hours, T.Date.Hour);                % append hours\r\n%     locations = vertcat(locations, T.LocID);            % append locations\r\n% end\r\n\r\nload nyc2015.mat\r\n\r\n%% Growth from 2014 to 2015 by month\r\n% We can compare data from 2014 and 2015 to see how Uber is growing in New\r\n% York. You can see a dramatic increase in the volume of pickups. \r\n\r\nmonthStr = {'Jan','Feb','Mar','Apr', ...                
% month names\r\n    'May','Jun','Jul','Aug','Sep'};         \r\n\r\nfigure\r\nhistogram(months)                                       % plot 2015 data\r\nhold on\r\nhistogram(T.DateTime.Month)                             % plot 2014 data\r\nhold off\r\nax = gca;                                               % get current axes\r\nax.XTick = 1:9;                                         % change ticks\r\nax.XTickLabel = monthStr;                               % change tick labels\r\ntitle('Number of Uber Pickups by Month')                % add title\r\nlegend('2015', '2014')                                  % add legend\r\n\r\n%% Growth by day of week\r\n% When you look at the data by day of week, you see a usage shift - New\r\n% Yorkers started to use Uber on weekends as well as on weekdays. \r\n\r\nfigure\r\nhistogram(days)                                         % plot 2015 data\r\nhold on\r\nhistogram(weekday(T.DateTime))                          % plot 2014 data\r\nhold off\r\nax = gca;                                               % get current axes\r\nax.XTick = 1:7;                                         % change ticks\r\nax.XTickLabel = week;                                   % change tick labels\r\ntitle('Number of Uber Pickups by Day of Week')          % add title\r\nlegend('2015', '2014', 'Location', 'NorthWest')         % add legend\r\n\r\n%% Growth by hour\r\n% However, New Yorkers still don't use Uber a lot in early morning hours,\r\n% and still use it heavily during the evening rush hour. 
\r\n\r\nfigure\r\nhistogram(hours)                                        % plot 2015 data\r\nhold on\r\nhistogram(T.DateTime.Hour)                              % plot 2014 data\r\nhold off\r\nxlim([-1 24])                                           % set x-axis limits\r\nax = gca;                                               % get current axes\r\nax.XTick = 0:23;                                        % change ticks\r\ntitle('Number of Uber Pickups by Hour')                 % add title\r\nxlabel('Hour')                                          % x axis label\r\nlegend('2015', '2014', 'Location', 'NorthWest')         % add legend\r\n\r\n%% Mapping Hourly Pickups in 2015\r\n% Do we see any change in geographic pattern along with the volume\r\n% increase? Instead of latitudes and longitudes, we just have location IDs\r\n% for pickups, as listed in \"taxi-zone-lookup.csv\". For mapping I added\r\n% latitudes and longitudes in a\r\n% <https:\/\/blogs.mathworks.com\/images\/loren\/2016\/latlon.xlsx\r\n% separate file>.\r\n\r\nlatlon = readtable('uber-trip-data\/latlon.xlsx');       % load lat lon data\r\nonMap = ismember(locations, latlon.LocationID);         % find points on map\r\nlocations = locations(onMap);                           % points on map only\r\nhours = hours(onMap);                                   % hours on map only\r\n\r\nfirst = true;                                           % flag\r\nampm = 'AM';                                            % flag\r\n\r\nfigure('Visible', 'off')                                % make plot invisible\r\nfor i = 1:24                                            % loop over 24 hours\r\n    j = i - 1;                                          % hour starts with zero\r\n    curHour = hours == j;                               % current hour\r\n    [locId, ~, idx] = unique(locations(curHour));       % get unique loc ids\r\n    count = accumarray(idx,1);                          % pickups by location\r\n    rows = ismember(latlon.LocationID,locId);        
   % get matching rows\r\n    imagesc(lim.lon, lim.lat, map)                      % show raster map\r\n    hold on                                             % don't overwrite\r\n    colormap cool                                       % set colormap\r\n    scatter(latlon.Lon(rows), latlon.Lat(rows), 100, ...% plot data points\r\n        count, 'filled', 'MarkerFaceAlpha', 0.7)        % color by count\r\n    line(cir.lon, cir.lat, ...                          % draw clock face\r\n        'Color', 'w', 'LineWidth', 3)\r\n    line(hour.x(i,:), hour.y(i,:), ...                  % draw hour handle\r\n        'Color', 'w', 'LineWidth', 3) \r\n    line(min.x, min.y, ...                              % draw min handle\r\n        'Color', 'w', 'LineWidth', 3)\r\n    if j >= 12                                          % afternoon                   \r\n        ampm = 'PM';\r\n    end\r\n    text(cir.ctr(2), cir.ctr(1) - .02, ampm, ...        % add AM\/PM\r\n        'Color', 'w', 'FontSize', 14, ... \r\n        'FontWeight', 'bold', ... 
\r\n        'HorizontalAlignment', 'center')\r\n    hold off                                            % restore default\r\n    xlim(lim.lon)                                       % limit x range\r\n    ylim(lim.lat)                                       % limit y range\r\n    daspect(dar)                                        % adjust ratio\r\n    set(gca,'ydir','normal');                           % fix y orientation\r\n    caxis([0 20000])                                    % color axis scaling\r\n    title({'NYC Uber Pickups by Zone'; 'Jan-Jun 2015'}) % add title\r\n    colorbar                                            % add colorbar\r\n    fname = getframe(gcf);                              % get the frame\r\n    [x,cmap] = rgb2ind(fname.cdata, 128);               % get indexed image\r\n    if first                                            % if first frame\r\n        first = false;                                  % update flag\r\n        imwrite(x,cmap, 'html\/nyc2015.gif', ...         % save as GIF\r\n            'Loopcount', Inf, ...                       % loop animation\r\n            'DelayTime', 1);                            % 1 frame per second\r\n    else                                                % if image exists\r\n        imwrite(x,cmap, 'html\/nyc2015.gif', ...         % append frame\r\n            'WriteMode', 'append', 'DelayTime', 1);     % to the image \r\n    end\r\nend\r\n\r\n%%\r\n% The traffic pattern hasn't changed very much from 2014, but you can now\r\n% see some hot spots in Brooklyn and Queens during the evening rush hour. \r\n%\r\n% <<nyc2015.gif>>\r\n%\r\n\r\n%% Summary\r\n% It is very interesting to see such a difference in Uber usage between\r\n% New York and San Francisco. New Yorkers seem to use Uber for commuting\r\n% and shopping, but it doesn't seem to be a big part of nightlife, while we\r\n% saw earlier that San Francisco users got more active in the early morning\r\n% hours. What accounts for this difference? 
Share your thought\r\n% <https:\/\/blogs.mathworks.com\/loren\/?p=1291#respond here>!\r\n\r\n\r\n##### SOURCE END ##### 2885e67620ec459f8fd9169b204ca5f7\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2016\/nyc2015.gif\" onError=\"this.style.display ='none';\" \/><\/div><!--introduction--><p>I travel a lot and I use ridesharing services like Uber often when I am away. One of my guest bloggers, <a href=\"https:\/\/twitter.com\/toshi2fly\">Toshi<\/a>, just got his first experience with such a service when he visited New York, and that inspired a new post.... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2016\/01\/20\/mapping-uber-pickups-in-new-york-city\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[66,39,41],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/1291"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=1291"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/1291\/revisions"}],"predecessor-version":[{"id":1292,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/1291\/revisions\/1292"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=1291"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categori
es?post=1291"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=1291"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}