Developer Zone

Advanced Software Development with MATLAB

MATLAB and Blob Storage 2

Posted by Arvind Hosagrahara

As a continuation from my previous post, this post discusses the use of the MATLAB interface for Azure™ Storage Blob.

The Windows Azure Blob service is scalable, cost-effective cloud storage for all your unstructured data.

MATLAB developers can already use these services with our shipping products: as documented, the datastore function allows easy read/write access to data stored in Blob Storage (among other forms of remote data).

In the same vein as the previous post, I will focus on additional interfaces that are available for developers. These interfaces are targeted at MATLAB developers who want finer control over the storage service. Usually this need for control is driven by requirements to configure security settings, access control, and other features, such as working with multiple blob types and table services, that are typical of most enterprise applications.

MathWorks has released the MATLAB Interface for Azure Storage Blob on GitHub to allow MATLAB developers to leverage the storage service on Azure.

To get started, clone this repository:

$ git clone --recursive https://github.com/mathworks-ref-arch/matlab-azure-blob.git

The repository above contains MATLAB code that leverages the Azure SDK for Java. This package uses certain third-party content which is licensed under separate license agreements. See the pom.xml file for third-party software downloaded at build time.

Build the underlying Java artifacts

You can build the underlying Java SDK using Maven and the process is straightforward:

$ cd matlab-azure-blob/Software/Java/
$ mvn clean package

On a successful build, a JAR archive is packaged. Running the startup.m file from within MATLAB makes it available to MATLAB:

cd matlab-azure-blob/Software/MATLAB
startup
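
To confirm that startup made the built JAR visible, one option is to inspect MATLAB's Java class path. This is a minimal sketch that uses only the built-in javaclasspath function. The 'azure' substring is an assumption about how the JAR (or the folder that holds it) is named, so adjust it to match the file Maven produced.

```matlab
% List every entry on MATLAB's static and dynamic Java class path
entries = javaclasspath('-all');

% Keep entries whose path mentions "azure" (assumed naming; adjust as needed)
isMatch = contains(entries, 'azure', 'IgnoreCase', true);
matches = entries(isMatch);

if isempty(matches)
    disp('No class path entry mentions "azure"; re-run startup or check the build.');
else
    disp(matches);
end
```

Because the match is made on the full path, a JAR that lives under the matlab-azure-blob folder will be reported even if its file name does not contain "azure".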

All access to the blob services originates with a cloud storage account, which allows the provisioning of one or more containers. These containers allow users to group a set of blobs. A blob can be a block blob, a page blob, or an append blob, and can be used to store artifacts such as text, binary, or media files.

A good introduction to the service can be found at: https://docs.microsoft.com/en-us/azure/storage/common/storage-introduction

The MATLAB interface to Azure Storage blobs

To create a MATLAB client to work with the service, developers can provision an access key in the Azure portal and use it with the MATLAB interface to create a handle to the cloud storage account.

The MATLAB interface mirrors the interface of the underlying SDK, so most of the solutions, documentation, and community knowledge on online forums like Stack Overflow should translate directly and easily.

In this case, the code to create a client looks like:

% Create a handle to the storage account
az = azure.storage.CloudStorageAccount;
az.AccountName = 'myaccountname';
az.AccountKey  = 'ABCDEFGH*****[REDACTED]****ABCDEFGH';
az.connect();

% Create a handle to the CloudBlobClient
azClient = azure.storage.blob.CloudBlobClient(az);
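
Hard-coding an account key in a script makes it easy to leak. As one possible alternative, the sketch below reads the same two values from environment variables with the built-in getenv function and only then makes the calls shown above. The variable names AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY and the warning identifier are illustrative assumptions, not something the interface requires.

```matlab
% Hypothetical variable names; any names work as long as they are set
% in the environment before MATLAB starts (or via setenv in the session)
accountName = getenv('AZURE_STORAGE_ACCOUNT');
accountKey  = getenv('AZURE_STORAGE_KEY');

% getenv returns an empty char array when a variable is not defined
haveCredentials = ~isempty(accountName) && ~isempty(accountKey);

if haveCredentials
    % Same calls as above, with the literals replaced by the values read
    az = azure.storage.CloudStorageAccount;
    az.AccountName = accountName;
    az.AccountKey  = accountKey;
    az.connect();
    azClient = azure.storage.blob.CloudBlobClient(az);
else
    warning('blobdemo:missingCredentials', ...
        'Set AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY before connecting.');
end
```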

Container operations

The interfaces to create, list, and configure containers are equally straightforward.

% Create a container and list all existing containers
azContainer = azure.storage.blob.CloudBlobContainer(azClient,'testcontainer');
azContainer.createIfNotExists();
containers = azClient.listContainers();

% Configure a container for public access
perm = azure.storage.blob.BlobContainerPermissions;
perm.AccessType = 'CONTAINER';  % Container-level public access
azContainer.uploadPermissions(perm);

Uploading a blob / data into a container

Blobs can hold content of any nature. To generate some data in MATLAB and save it to the newly created container:

% Create some random data
sampleData = rand(1000,1000);    % Approx 7MB
save SampleData sampleData;

% Uploading the data to a previously created container, create a blob handle (merely a reference) and upload.
blob = azContainer.getBlockBlobReference(which('SampleData.mat'));
blob.upload();
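
For a sense of how much data this upload involves, the size of the sample can be checked with built-in functions before anything is sent. This is a small sketch, independent of the Azure interface: a 1000-by-1000 matrix of doubles occupies 8 bytes per element in memory, while the size on disk depends on the MAT-file format and compression, so it is measured rather than assumed. The temporary file name is generated only for this illustration.

```matlab
% Same shape as the sample above: 1000 x 1000 doubles
sampleData = rand(1000, 1000);

% In memory: 8 bytes per double, so 8,000,000 bytes in total
info = whos('sampleData');
fprintf('In memory: %.2f MB (%.2f MiB)\n', info.bytes/1e6, info.bytes/2^20);

% On disk: save to a temporary MAT-file and ask the file system for its size
matFile = [tempname '.mat'];
save(matFile, 'sampleData');
listing = dir(matFile);
fprintf('On disk:   %.2f MB\n', listing.bytes/1e6);

delete(matFile);  % remove the temporary file
```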

And that is it: our first piece of data has been uploaded to the cloud and stored on the Azure Blob Storage service.

The interface is fully vectorized, so it is possible to upload an entire directory of files by invoking the upload method on a collection of configured blobs.
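
As a rough sketch of a directory upload, the code below collects the MAT-files in a folder and uploads them one at a time using only the getBlockBlobReference and upload calls shown earlier. It uses an explicit loop rather than the vectorized form, because how a collection of blobs is assembled is not shown here; see the package documentation for that. The folder (pwd) and the '*.mat' pattern are placeholders, and the upload step runs only if an azContainer handle from the earlier steps exists in the workspace.

```matlab
% Folder to upload; pwd is only a placeholder
dataDir = pwd;

% Collect the full path of every MAT-file in the folder
files = dir(fullfile(dataDir, '*.mat'));
paths = arrayfun(@(f) fullfile(f.folder, f.name), files, ...
    'UniformOutput', false);

fprintf('Found %d file(s) to upload.\n', numel(paths));

% Upload only if a container handle from the earlier steps exists
if exist('azContainer', 'var')
    for k = 1:numel(paths)
        % Create a blob reference for the file, then upload it
        blob = azContainer.getBlockBlobReference(paths{k});
        blob.upload();
    end
end
```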

Downloading a blob / data into MATLAB

Listing the contents of a container and downloading from the Blob storage service to MATLAB is equally straightforward.

% List all existing blobs
blobList = azContainer.listBlobs();

% Download a particular file from cloud service into the current directory
blob = azure.storage.blob.CloudBlockBlob(azContainer, 'mydir1/SampleData.mat');
blob.download(); % can accept an optional target directory argument

This basic example just scratches the surface of the possibilities of the interface. The package contains numerous features to work with common client workflows such as controlling access using Shared Access Signatures (SAS), etc.

And for clients that use other Azure Storage services, such as Table storage, the interface provides support as described in the relevant sections of the documentation.

These interfaces enable new and powerful ways to extend MATLAB to access cloud-based storage systems to store, partition, and analyze data. This extended functionality takes on special significance when used with our reference architectures for running MathWorks products on public cloud systems such as Azure to improve data access and analysis performance.

That concludes my brief blurb about block blob on the blog... try saying that aloud fast. Or as the old meme goes - On the internet, nobody knows you're a blob (with apologies to Peter Steiner).

In closing, I wanted to share this with you, the reader, to underline the fact that MATLAB developers can leverage these capabilities to build truly impressive data analytics systems that are cloud capable and run at scale. All of this becomes possible within the comfort of their time-tested MATLAB environments and workflows.


Published with MATLAB® R2019a

2 Comments

David Barry replied (1 of 2):
Another interesting post. I am holding out for the one that includes code for Google Cloud Storage (which is what we use at JLR).
Arvind Hosagrahara replied (2 of 2):
I am glad you liked it. It is good to hear about your interest in Google Cloud Storage (GCS). When used in MATLAB, usually with services like Google BigQuery (GBQ), it provides a compelling workflow, and you can read a bit more about how we enabled exactly such a stack through professional services with one of our clients: https://www.mathworks.com/company/user_stories/freightos-performs-big-data-analytics-for-online-freight-logistics-with-matlab-and-google-bigquery.html Thank you for your comment. It was duly noted and we/I will prioritize it for a future topic accordingly.