Path Management in Deployed Applications
This week, guest blogger Peter Webb continues his series of articles about the MATLAB Compiler. This week's topic: managing MATLAB's paths in a compiled application.
For an introduction to writing deployable code, please see the June 19th article.
One of the goals of the compilation process is to turn a flexible and easily modifiable MATLAB program into a robust software
component with consistently predictable behavior: a compiled application should not be able to change which functions it
calls or the way those functions work. As a result, certain aspects of the execution environment that are malleable in MATLAB
become fixed or constant in a deployed application. Perhaps the most important of these is the mechanism by which MATLAB locates
functions and data files: MATLAB's search paths.
A path consists of a list of directories that MATLAB searches to find files. MATLAB typically searches a path in list order, stopping
at the first file that matches the search criteria. MATLAB uses paths for two reasons: to determine which functions to execute;
and to locate data files in the file system.
MATLAB primarily interacts with four paths, two internal and two external: the MATLAB path; the MATLAB Java class path; the system path; and the system load library path.
Most familiar, perhaps, is the MATLAB path, which affects how MATLAB determines which MATLAB function files to run and which MAT-files to open. Though the MATLAB path is the most visible, of nearly equal importance is the MATLAB Java class
path, which specifies the directories in which MATLAB searches for Java functions. The two external paths list the locations
in which the operating system looks for executables and shared libraries, respectively. The Windows operating system uses
the same path to look for executables and shared libraries, but most Unix systems search separate paths for these two types
of files.
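To make the four paths concrete, the following sketch reads each of them from within MATLAB. It is only an illustration under a few assumptions: strsplit requires a newer MATLAB release than the one this article was published with, and LD_LIBRARY_PATH and DYLD_LIBRARY_PATH are the conventional load library variables on Linux and macOS, which may differ on other Unix systems.

```matlab
% 1. The MATLAB path: one string of directories joined by pathsep.
matlabPath = path;
matlabDirs = strsplit(matlabPath, pathsep);   % assumes a release that has strsplit
fprintf('MATLAB path has %d entries; the first is %s\n', ...
        numel(matlabDirs), matlabDirs{1});

% 2. The MATLAB Java class path (only available when the JVM is running).
if usejava('jvm')
    javaDirs = javaclasspath('-all');
    fprintf('Java class path has %d entries\n', numel(javaDirs));
end

% 3. The system path, used by the operating system to find executables.
systemPath = getenv('PATH');

% 4. The system load library path. Windows reuses the system path;
%    the Unix variable names below are conventional, not universal.
if ispc
    libraryPath = systemPath;
elseif ismac
    libraryPath = getenv('DYLD_LIBRARY_PATH');
else
    libraryPath = getenv('LD_LIBRARY_PATH');
end
fprintf('Load library path: %s\n', libraryPath);
```

Reading these values is harmless in both MATLAB and a deployed application; it is changing them that causes the problems described below.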
When writing an application intended for deployment via the MATLAB Compiler, keep in mind that MATLAB applications typically interact with paths in at least three independent ways:
- Executing MATLAB commands to change a path
- Passing paths as arguments to MATLAB functions
- Relying on the existence or structure of external paths
After briefly listing some general guidelines for safely interacting with paths in MATLAB, I provide more detailed techniques
to manage problems that can occur in each of these areas.
General Guidelines
A compiled application behaves more consistently if it:
- Does not change any path during execution.
- Uses relative paths (or anchors paths to a known root via the matlabroot or ctfroot functions).
- Avoids accessing or changing the current directory.
These guidelines become more important when the compiled application executes in an environment that differs from the one
in which it was developed: for example, when installed at a customer site.
In cases where it is convenient to keep path management commands in your M-files when they run in MATLAB, you can use the
isdeployed function to skip over calls to path management functions in deployed applications. For example, this M-code adds my "beta"-quality
M-files to the path when my applications run in MATLAB:
    if ~isdeployed
        addpath /home/pwebb/mfiles/beta
    end
MATLAB Commands that Directly Access the Path
All M-file and Java functions rely on the state of MATLAB's internal paths, but only a few MATLAB commands directly access
or modify these paths. Most of these functions should be avoided in deployed applications because they have the potential
to change the way the application works, either by changing the functions it calls or the data files it loads.
- addpath: Add a directory to the MATLAB search path.
- cd: Change the current directory.
- javaaddpath: Add a directory to MATLAB's Java search path.
- javaclasspath: Get or set MATLAB's Java search path.
- javarmpath: Remove a directory from MATLAB's Java search path.
- path: Get or set the MATLAB search path.
- rmpath: Remove a directory from MATLAB's search path.
- savepath: Save the current MATLAB path.
Path management commands often appear in startup or initialization code; when it comes time to deploy your application, you
may have forgotten that the application relies on these path settings. Problematic path settings sometimes occur in these
oft-forgotten files:
- pathdef: Used by MATLAB to establish the initial MATLAB search path. pathdef.m may be modified, in particular by the savepath command.
- startup: The MATLAB Compiler incorporates startup.m into every generated executable and shared library; any path management commands in startup.m therefore need to work in the deployed application. The MATLAB Compiler automatically adds directories that startup.m puts on the MATLAB path to the deployed application's MATLAB path whenever the application uses files from those directories, which makes the addpath calls in startup.m both redundant and dangerous.
Protect all calls to path management functions in these files by enclosing them in an ~isdeployed if-block.
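As an illustration of that advice, here is a minimal sketch of a startup.m whose path management runs only in MATLAB. The second directory name is a hypothetical placeholder, and the existence check is an extra precaution rather than something the MATLAB Compiler requires.

```matlab
% startup.m (sketch): development-only path setup.
% The directories listed here are examples; substitute your own.
if ~isdeployed
    devDirs = {'/home/pwebb/mfiles/beta', ...
               '/home/pwebb/mfiles/tools'};   % second entry is hypothetical
    for k = 1:numel(devDirs)
        % Only add directories that actually exist on this machine.
        if exist(devDirs{k}, 'dir') == 7
            addpath(devDirs{k});
        end
    end
end
```

In a deployed application isdeployed returns true, so the whole block is skipped and the path chosen at compile time stays untouched.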
addpath, path, rmpath
A deployed application expects its MATLAB path to remain constant. If a directory is on the MATLAB search path in a deployed application, there's
a code path through the application that uses at least one file in that directory. During program execution (as opposed to
MATLAB startup or initialization) addpath is often used to navigate through MATLAB's flat function namespace and to
change the order in which MATLAB searches directories for matching file names. Don't use rmpath in a deployed application,
as it will almost surely cause undefined function errors. All of the MATLAB Deployment products prune the MATLAB search path
to exclude unused directories.
Instead: Ask yourself why you're changing the path. Rename functions or use MATLAB Objects to manage your
function namespace. Change the path before running the MATLAB Compiler to add optional or conditional functionality to your
application.
cd
Changing the current directory is risky in deployed applications because it creates implicit dependencies between the application and the structure of the file system.
If the machine running the application does not have the directory structure the application expects, the application will fail in confusing ways.
Instead: Use paths relative to matlabroot or ctfroot.
Change code like this:
    cd my/data/directory
    fp = fopen('data.file', 'r');
To code like this:
fp = fopen(fullfile(ctfroot, 'my', 'data', 'directory', 'data.file'));
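Because fopen reports failure through its return value rather than by throwing an error, a rooted file name is most useful when paired with an explicit check. The following sketch shows one way to do that; the message identifier myapp:openFailed is a made-up name, and ctfroot is assumed to be available in your installation.

```matlab
% Build an absolute file name anchored at the CTF root (no cd required).
dataFile = fullfile(ctfroot, 'my', 'data', 'directory', 'data.file');

% fopen returns -1 and a message on failure, so check the result explicitly.
[fp, msg] = fopen(dataFile, 'r');
if fp < 0
    % 'myapp:openFailed' is a hypothetical identifier for this example.
    warning('myapp:openFailed', 'Could not open %s: %s', dataFile, msg);
else
    firstLine = fgetl(fp);   % read something, then release the handle
    fclose(fp);
end
```

Reporting the full file name in the message makes a missing file on a customer machine much easier to diagnose than a failure that depends on the current directory.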
javaaddpath, javaclasspath, javarmpath
Much as addpath does for the MATLAB search path, javaaddpath adds a directory to MATLAB's Java search path. javaclasspath
gets or sets MATLAB's Java search path. While getting the Java class path, e.g., jcp = javaclasspath, is completely
safe, setting it can lead to problems in deployed applications. Removing directories from the Java class path will likely
result in undefined function errors.
Instead: Set the Java class path before running the MATLAB Compiler. Use classpath.txt or javaaddpath.
savepath
savepath saves the current MATLAB path by overwriting pathdef.m. In a deployed application, this would create an unencrypted pathdef.m which
could not be run. Therefore savepath issues an error message when run from a deployed application.
Instead: If you must save the path, save it to a MAT-file; otherwise, enclose the call to savepath in an ~isdeployed
if-block.
    p = path;
    save savedpath p
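A slightly fuller sketch of the MAT-file approach, including reading the path back, might look like this. The use of tempdir and the file name savedpath.mat are assumptions made for the example; any location the application can write to would do.

```matlab
% Save the current MATLAB path to a MAT-file in a writable location.
pathFile = fullfile(tempdir, 'savedpath.mat');   % location is an assumption
p = path;
save(pathFile, 'p');

% Later, read the saved path back into a struct field.
saved = load(pathFile, 'p');

% Restoring changes the MATLAB path, so do it only when running in MATLAB.
if ~isdeployed
    path(saved.p);
end
```

Loading into a struct (saved.p) avoids silently overwriting a variable named p in the calling workspace.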
As further incentive to avoid these functions in deployed applications, note that the commands that change the Java class
path (javaaddpath, javarmpath) also clear the values of global variables.
For more details on how to manage the MATLAB function namespace with MATLAB objects, please see the polynomial example in the MATLAB documentation.
Paths as Arguments to MATLAB Functions
Many MATLAB programs assume that the directory structure of the filesystem cannot change. When this assumption leads to the
use of absolute paths in file names, e.g., load('c:/Work/MATLAB/data.mat'), or to an implicit dependency on MATLAB's ability to locate a file, e.g., load data.mat, it creates problems for deployed applications. A deployed application will very likely be installed on a machine
that has neither the same directory structure nor access to the same file system as the machine on which it was developed.
The use of absolute paths causes errors because of differences between the development and the deployment file system structures.
But relying entirely on relative paths establishes a fragile dependency on the location of the current directory. The solution:
base all file locations on a known root.
In MATLAB, there are typically two file system roots that matter: the root of the external file system, typically / or c:/ on Unix and Windows systems respectively; and the MATLAB root, the directory in which MATLAB has been installed. In a deployed
application there is an additional root: the CTF root, which is the location of the application's MATLAB content (M-files,
data files, JAR files -- everything that the MATLAB Compiler put into the application's CTF file). In MATLAB, ctfroot and matlabroot refer to the same directory: MATLAB's installation directory. In a deployed application, matlabroot refers to the directory in which the MATLAB Component Runtime (the MCR) is installed, and ctfroot refers to the CTF root.
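One quick way to see which roots your code is using is to print them in both environments. The sketch below does that and then builds a file name anchored at the CTF root; the config/settings.mat name is purely hypothetical.

```matlab
% Report the roots visible to the running code.
fprintf('matlabroot: %s\n', matlabroot);
fprintf('ctfroot:    %s\n', ctfroot);

if isdeployed
    fprintf('Running deployed: files are resolved against the CTF root.\n');
else
    fprintf('Running in MATLAB.\n');
end

% Anchor a file name at the CTF root instead of the current directory.
% 'config' and 'settings.mat' are hypothetical names for illustration.
settingsFile = fullfile(ctfroot, 'config', 'settings.mat');
fprintf('Settings file: %s\n', settingsFile);
```

Running the same lines in MATLAB and in the compiled application shows directly how the two roots differ between the environments.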
Deployed applications typically base their file names on the CTF root. For example, assume an application that reads and
writes MAT-files. To ensure that the MATLAB Compiler includes the input data files in the CTF archive, specify the data files
with mcc's -a (add) flag:
mcc fcn.m -a data.mat
In the application code, use which or ctfroot to construct a full path to the files being loaded or saved.
Use which when a known filename exists on the application's path:
    % Find the full path to the MAT-file on the application's path
    dataFile = which('data.mat');
    % Load x from the MAT-file
    load(dataFile, 'x');
    x = x + 17;
    % Save x back to the MAT-file (variable names are passed as strings)
    save(dataFile, 'x');
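Note that which returns an empty result when it cannot find the file, which then produces a confusing error further along. A small helper along the following lines can report the problem at its source. The function name locateDataFile and the message identifier are hypothetical, not part of MATLAB or the MATLAB Compiler.

```matlab
function dataFile = locateDataFile(name)
% LOCATEDATAFILE  Full path of a data file found on the MATLAB path.
%   Hypothetical helper: raises a descriptive error when the file is
%   missing instead of returning an empty string.
    dataFile = which(name);
    if isempty(dataFile)
        error('myapp:missingData', ...
              'Could not find %s on the MATLAB path. Was it added with mcc -a?', ...
              name);
    end
end
```

A call such as dataFile = locateDataFile('data.mat') can then replace the bare call to which before loading or saving.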
Use ctfroot when creating a new file:
    data = input('Enter data: ');
    % Variable names are passed to save as strings
    save(fullfile(ctfroot, 'more_data.mat'), 'data', '-append');
This use of ctfroot assumes that the application has permission to write to the file system on the deployment machine. This is not always true
by default, especially with network installations. Work with your IT department to secure the appropriate file system access.
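If you cannot guarantee write access, one defensive option is to test the directory before writing and fall back to another location. The sketch below probes a directory by creating and deleting a small file. Both function names are hypothetical, and falling back to tempdir is an assumption for illustration; whether that is acceptable depends on your application.

```matlab
function outDir = chooseOutputDir()
% CHOOSEOUTPUTDIR  Prefer the CTF root; fall back to tempdir if read-only.
%   Hypothetical helper; the tempdir fallback is an assumption.
    outDir = ctfroot;
    if ~isWritableDir(outDir)
        outDir = tempdir;
    end
end

function tf = isWritableDir(d)
% ISWRITABLEDIR  True if a file can be created in directory d.
    [~, stem] = fileparts(tempname);       % unique file name stem
    probe = fullfile(d, [stem '.tmp']);
    fid = fopen(probe, 'w');
    tf = fid >= 0;
    if tf
        fclose(fid);
        delete(probe);                     % remove the probe file
    end
end
```

The earlier example would then save to fullfile(chooseOutputDir(), 'more_data.mat') rather than assuming the CTF root is writable.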
Relying on the Structure of External Paths
A MATLAB application that uses the system or loadlibrary functions or calls MEX files linked to non-MathWorks shared libraries has a dependency on the external system paths. The
system command allows MATLAB programs to directly execute operating system commands and capture their output as text. This can be
very convenient, but it of course requires that the command be available in the current file system. loadlibrary allows MATLAB programs to load shared libraries and call their exported APIs; it similarly requires that the library in question
be available on the deployment machine's file system.
The techniques for managing external path dependencies are very similar to those used to manage internal path dependencies:
use full paths, favor explicit dependencies over implicit ones, and include all required files in the deployed application.
- Use full path names, rooted at ctfroot, to specify the arguments to the system and loadlibrary functions.
- Include non-standard applications and shared libraries in the deployed application by using mcc's -a switch.
- Ensure the necessary applications and libraries are present on the deployment machine or in a network location accessible
from the deployment machine.
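The guidelines above can be sketched as follows. Everything named here (the tools directory, mytool, mylib, and mylib.h) is a hypothetical stand-in for files you would bundle with mcc's -a switch, and the existence checks let the sketch run harmlessly when those files are absent.

```matlab
% Hypothetical bundled files, located relative to the CTF root.
toolDir  = fullfile(ctfroot, 'tools');
toolPath = fullfile(toolDir, 'mytool');

% Pick the shared library extension for the current platform.
if ispc
    libExt = '.dll';
elseif ismac
    libExt = '.dylib';
else
    libExt = '.so';
end
libPath = fullfile(toolDir, ['mylib' libExt]);
hdrPath = fullfile(toolDir, 'mylib.h');

% Run the external program by full path; quote it in case of spaces.
if exist(toolPath, 'file') == 2
    [status, output] = system(['"' toolPath '"']);
    fprintf('mytool exited with status %d\n', status);
end

% Load the shared library by full path under an explicit alias.
if exist(libPath, 'file') == 2 && ~libisloaded('mylib')
    loadlibrary(libPath, hdrPath, 'alias', 'mylib');
end
```

Using full paths for both calls means the application no longer depends on how the system path or the load library path is configured on the deployment machine.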
Next Up: Non-supported functions
The next post in this series will explore mechanisms for identifying and managing calls to non-supported functions. In the
meantime, you can refer to the documentation for the MATLAB Compiler or post follow-up questions here.
Published with MATLAB® 7.6