Doug's MATLAB Video Tutorials

March 3rd, 2009

Read data from the web with URLREAD

I am blogging a little early this week because of the holiday. What holiday would that be? Square Root Day: 03/03/09. What makes this a particularly special event is that today’s video is number 144 (12^2), and I recorded it just minutes ago. Posted at 1:44 local time, too! This kind of coincidence cannot be ignored!

When I first thought of this video, I figured it would take three videos to solve the problem. When I actually sat down to read data from the web and interactively filter it, I was happy to see it could be done so quickly and easily. Basically, the procedure is to read Census data from Many Eyes, plot it, and filter it by city size. We will use URLREAD and TEXTSCAN to bring the data in, then cell mode and PLOT to filter and visualize it interactively.
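
The read, parse, and filter steps can be tried without a network connection. The snippet below is a minimal sketch, not the code from the video: the `block` string, its three-column layout, the city names, and the threshold of 100000 are all made up for illustration and stand in for the text that URLREAD would return.

```matlab
% Made-up stand-in for the string returned by urlread: name, latitude and
% population separated by tabs (the real Many Eyes file has more columns).
block = sprintf('Alphaville\t42.1\t250000\nBetatown\t41.7\t12000\nGammaburg\t40.3\t180000\n');

% One text field and two numeric fields per row; char(9) is the tab character.
readData = textscan(block, '%s %f %f', 'delimiter', char(9));
city = readData{1};
lat  = readData{2};
pop  = readData{3};

% Logical index that keeps only the rows above the population threshold.
vi = (pop > 100000);
bigCities = city(vi);
disp(bigCities)
```

Changing the threshold and re-running only the last cell is what makes the filtering interactive in cell mode.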

17 Responses to “Read data from the web with URLREAD”

  1. Daniel Armyr replied on :

    You are a big nerd.

    But the video was nice. Few lines of code for a large wow-factor.

  2. dhull replied on :

    Daniel,

    Thanks for the kind words about the video.

    Doug

    PS. I prefer the term Geek! :)

  3. Tim Davis replied on :

    URLREAD is great. It’s been a great help. I have a MATLAB interface to the UF Sparse Matrix Collection called UFget, which downloads matrices from the collection into the MATLAB workspace. I used to have a Java component to it, but now I can do it all in M, with URLREAD, and my code is now much simpler (and more robust).

  4. Jim replied on :

    Can you post the code in a text file?

  5. dhull replied on :

    Jim,

    Sorry, I thought I did post it. Here it is.

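    clear
    clc

    % Download the tab-delimited ZIP code dataset as one long string
    block = urlread('http://manyeyes.alphaworks.ibm.com/manyeyes/datasets/us-zipcodes-with-city-state-fips-lat/versions/1.txt');

    %%
    % Split the string into columns; char(9) is the tab character
    readData = textscan(block, '%n %n %s %s %n %n %n %n', 'delimiter', char(9));

    statenum = readData{1};
    zip = readData{2};
    abrev = readData{3};
    city = readData{4};
    lat = readData{5};
    lon = readData{6};
    pop = readData{7};
    percent = readData{8};

    clear readData

    %%
    % Load the US outline used in the plot below (uslon, uslat)
    load usapolygon

    %%
    % Keep only the entries whose population exceeds the threshold
    vi = (pop > 53130.23903);

    plot(uslon, uslat, -lat(vi), lon(vi), '.')

    %%
    % Display the names of the selected cities
    city{vi}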

    -Doug

  6. Russell replied on :

    This will be really handy as soon as Wolfram|Alpha is up and running!

  7. dhull replied on :

    @Russell,

    I am not sure what you mean, how would they work together?

    Doug

  8. Magnus replied on :

    Hi Doug,

    I tried to use your method to import data from the web, but it did not work. I posted a link on MATLAB Central: http://www.mathworks.in/matlabcentral/newsreader/view_thread/265578. Do you have any suggestions for how I should read the data from the web and put it into a matrix?

    -Magnus

  9. dhull replied on :

    Looks like this was answered in the newsgroup.

  10. Need Help replied on :

    Hi Doug, I have a question. I want to know a way to convert the string you obtain from urlread() into a matrix. In the scenario described above, the data is in columns already, but I was wondering if there is a function which converts it into a string without the requirement of the data being in columns already. Please let me know. Thanks

  11. dhull replied on :

    @Need Help

    The command does return the data as one big string. When you say:

    “I was wondering if there is a function which converts it into a string”

    What do you mean?

    What would a typical return from URLREAD() look like so we know what you are getting and what would the desired output look like?

    Doug
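
    When the returned string contains nothing but whitespace-separated numbers, one possible approach is SSCANF. This is a minimal sketch under that assumption; the string `str` and the column count `nCols` are made up for illustration, and the number of columns has to be known in advance.

    ```matlab
    % Made-up text standing in for a urlread result: three numbers per row.
    str = sprintf('1 2 3\n4 5 6\n');
    nCols = 3;

    % sscanf fills its output column by column, so read an nCols-by-N array
    % and transpose it to get one matrix row per line of text.
    M = sscanf(str, '%f', [nCols, Inf])';
    ```

    For text that mixes words and numbers, TEXTSCAN with an explicit format string, as in the code posted above, is likely the better fit.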

  12. paul replied on :

    Thanks Doug, great tutorial. I’m trying to do something similar, which I know is possible but is just a bit beyond my MATLAB ability. I am trying to extract historical wind data from a website, and while it is comma separated, it is a mixture of words, numbers, and annotations. I can get the data into MATLAB, but I can’t figure out how to (1) save the data as a CSV file for archiving, (2) extract specific parts of the block (like hour), and (3) turn the data into variables (double?) on which I can do math. I have found several possible solutions but keep getting stuck on one part or another. I think I could do it in a way that would require a ton of loops and if/then statements, but since I want to mine data from many websites, this would likely get slow. Any help would be much appreciated. THANK YOU!! Here’s my start:

    block = urlread('http://www.wunderground.com/history/airport/KSVC/2010/4/1/DailyHistory.html?req_city=NA&req_state=NA&req_statename=NA&format=1');
    data = textscan(block, '%s %s %s %s %s %s %s %s %s %s %s %s', 'delimiter', ',');

    urlwrite('http://www.wunderground.com/history/airport/KSVC/2010/4/1/DailyHistory.html?req_city=NA&req_state=NA&req_statename=NA&format=1', 'data.csv');
    data = csvread('data.csv');

    data2 = textscan(block, '%s %s', 'delimiter', ':');

    clear block
    points=length(data{1});
    dataTime=data{1}(2:points-1);
    dataTemperature=data{2}(2:points-1);
    dataHumidity=data{3}(2:points-1);
    dataWindDirection=data{7}(2:points-1);
    dataWindSpeedAvg=data{8}(2:points-1);
    dataWindSpeedGust=data{9}(2:points-1);

    dataHour=data2{1}(2:points-1);

  13. paul replied on :

    To those asking about converting the string data to matrix data: str2double worked for me. Somebody somewhere recommends saving the string as a CSV file and then reading that file back into MATLAB, but this seems very roundabout to me.
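
    A minimal sketch of the str2double approach, assuming cell arrays of strings like the ones TEXTSCAN returns for %s fields. The sample values and the name `hourOfDay` are made up for illustration.

    ```matlab
    % Made-up cell arrays standing in for two columns read with %s.
    dataTemperature = {'54.0'; '53.1'; 'N/A'};
    dataTime = {'12:53 AM'; '1:53 AM'; '2:53 AM'};

    % str2double converts every cell; text that is not a number becomes NaN.
    temperature = str2double(dataTemperature);

    % strtok returns the part of each string before the first ':' (the hour).
    hourOfDay = str2double(strtok(dataTime, ':'));
    ```

    Both calls work on the whole cell array at once, so no loop or if/then logic is needed for these two steps.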

  14. dhull replied on :

    @Paul,

    I like the following:

    
    urlwrite('http://www.wunderground.com/history/airport/KSVC/2010/4/1/DailyHistory.html?req_city=NA&req_state=NA&req_statename=NA&format=1','data.csv');
    [num, txt] = xlsread('data.csv');
    

    You now have num and txt. These can be parsed pretty easily, especially if the format of the files is always the same.

    I actually like pulling the data locally into a file and then reading it in. Something about having the file close by, even if it is a temporary file that you toss later, feels right.

    -Doug
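
    The save-locally-then-read pattern can be sketched without a network connection. This is an illustration under stated assumptions, not the code above: the CSV text is made up, FPRINTF stands in for the download that URLWRITE would perform, and TEXTSCAN is swapped in for XLSREAD.

    ```matlab
    % Made-up CSV content standing in for a downloaded file.
    csvText = sprintf('Time,TemperatureF\n00:53,54.0\n01:53,53.1\n');

    % Write it to a temporary file (urlwrite would do this for a real URL).
    tmpFile = [tempname '.csv'];
    fid = fopen(tmpFile, 'w');
    fprintf(fid, '%s', csvText);
    fclose(fid);

    % Read the file back, skipping the header row.
    fid = fopen(tmpFile, 'r');
    data = textscan(fid, '%s %f', 'delimiter', ',', 'HeaderLines', 1);
    fclose(fid);

    txt = data{1};   % text column
    num = data{2};   % numeric column

    % Toss the temporary file once the data is in the workspace.
    delete(tmpFile);
    ```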

  15. Thisara replied on :

    Hi Doug,
    Thanks for the nice piece of work. I am trying to download some netCDF data from the web. I tried to use urlread, but without success: the file cannot be scanned the way you did it.
    However, once the netCDF file is downloaded, I can read it with MATLAB on my computer.

    Is there any other way to download the netCDF file through MATLAB?

    Cheers
    Thisara

    this is the location of data :
    http://opendap-tpac.arcs.org.au/thredds/catalog/library/argo_australia/aoml/5900607/profiles/catalog.html?dataset=library/argo_australia/aoml/5900607/profiles/D5900607_001.nc

  16. dhull replied on :

    Thisara,

    Could you read the files if they were local? Can you use the FTP command to bring them local?

    Doug

  17. Ritesh replied on :

    Hi,

    I want to read data (download an Excel file) from a website. The data is not openly accessible: I have to log in with a password before I can access it through a web browser.

    How can I download the same file with the urlread command?



MathWorks

Doug Hull is a proud MathWorker who is on a mission to help you with MATLAB.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.