Loren on the Art of MATLAB

September 20th, 2006

Working with Low Level File I/O and Encodings

I'm pleased to introduce Vadim Teverovsky, our guest blogger this week, who gives us his take on MATLAB low level file I/O and how it works with encodings.

It is fairly common for users to write and read character data where the characters are not 7-bit ASCII (see ASCII wiki). Such characters include letters from languages other than English as well as symbols such as the pound sign. Unfortunately, users may run into trouble when such files are shared across platform or language boundaries. To read and write such data reliably, there are a few things you should know.

Common Problem #1: Platform Differences

Starting in R2006a, MATLAB's low-level file I/O consistently takes a "character" to mean an actual character rather than a single byte, which was often the case in the past. In today's multi-language environment, this is simply a necessity.

In order to understand what MATLAB writes out and reads in, we need to understand the concept of an "encoding". An encoding is a way of representing certain symbols, such as letters or numbers, in a language and locale specific way. An example of an encoding is 7-Bit ASCII, which many people are familiar with. Another example is Shift-JIS, which is commonly used in Japan. Yet others include windows-1252 and ISO-8859-1, which are both slightly different variants of what is commonly known as Extended ASCII. Each computer (user) will typically have a default locale. Thus, a user running Windows in the US will typically be running with the windows-1252 encoding. If the locale is changed, the encoding may change as well. Typically, all of the default encodings you may encounter will have the ASCII set of values in common, but after that, all bets are off.
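To make the idea of an encoding a little more concrete, here is a minimal sketch that asks MATLAB for the bytes of the same character under a few encodings. It uses unicode2native; the variable names are just for illustration, and I am assuming that the encoding names 'ISO-8859-1', 'windows-1252' and 'UTF-8' are available on your platform and release.

```matlab
% Unicode code point 163 is the pound sign.
poundSign = char(163);

% Encode the same character under three different encodings.
latin1Bytes = unicode2native(poundSign, 'ISO-8859-1');    % one byte
cp1252Bytes = unicode2native(poundSign, 'windows-1252');  % one byte
utf8Bytes   = unicode2native(poundSign, 'UTF-8');         % two bytes

disp(double(latin1Bytes))
disp(double(cp1252Bytes))
disp(double(utf8Bytes))

% Plain ASCII text, by contrast, encodes identically in all three.
asciiSame = isequal(unicode2native('abc', 'ISO-8859-1'), ...
                    unicode2native('abc', 'windows-1252'), ...
                    unicode2native('abc', 'UTF-8'));
disp(asciiSame)
```

The pound sign is a single byte (163) in the two single-byte encodings but two bytes (194 163) in UTF-8, while the ASCII letters come out the same everywhere. That is the "ASCII in common, all bets off after that" behavior in miniature.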

MATLAB, unless you specify a particular encoding (more on that later), will use the computer's (user's) default encoding. Thus, when you are working on your Windows computer in Natick, Massachusetts, you could write the following code:

fid = fopen('sample1.txt', 'w','l');
fwrite(fid, 'abcdefg', 'char');
fclose(fid);

The result is a file written in the windows-1252 encoding. When this file is read in again, assuming you are still running on the same machine in the same environment, MATLAB will know how to translate this encoded data into its internal representation, and will read the characters properly.

type('sample1.txt');
abcdefg

But what happens if you write the file out on Windows and try to read it on a Solaris machine? Well, it turns out that Solaris has, as its default, the 7-bit ASCII encoding, which cannot represent all of the symbols found in the windows-1252 encoding. As long as you stick to the set of ASCII characters (values <= 127), everything will look exactly the same. But what if you try to write out and read a pound sign (163 in windows-1252)?

char(163)
ans =

£

That value is not part of the 7-bit ASCII encoding, and will therefore not be read in successfully on your Solaris computer. It will likely manifest itself as the ASCII value 26 (the SUB, or "substitute", control character), which effectively stands for "I don't know". What would the file contain? For now, trust me that providing the last argument to fopen below will result in a file very similar to what you would see on Solaris:

fid = fopen('sample2.txt', 'w','l', 'US-ASCII');
fwrite(fid, 'abcdefg £¥§©', 'char');
fclose(fid);
type('sample2.txt');
abcdefg 

What happened? The odd-looking characters, which correspond to non-ASCII characters, did not get written out properly, because MATLAB tried to convert them to US-ASCII and could not do so.

Similarly, what if you are in Japan, working on a computer set to a Japanese environment, and write out a file that you wish to be able to read from a German environment machine? The same problem can occur, because the default encodings are different.
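One simple way to find out ahead of time whether a string is at risk is to look at its character codes. The sketch below is only an illustration (the variable names are mine): it flags every character whose code is above 127, which are exactly the ones that cannot be represented in 7-bit ASCII.

```matlab
% Sample text: plain ASCII followed by four symbols above 127.
txt = ['abcdefg ' char([163 165 167 169])];

% Characters with codes above 127 are outside 7-bit ASCII.
codes    = double(txt);
nonAscii = codes > 127;

if any(nonAscii)
    fprintf('%d character(s) are outside 7-bit ASCII, at positions %s\n', ...
            nnz(nonAscii), mat2str(find(nonAscii)));
else
    disp('The text is pure 7-bit ASCII.');
end
```

If the check reports any such characters, the text is only safe to exchange when both the writer and the reader agree on an encoding that can represent them.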

Common Problem #2: Files Coming From an Outside Source

Yet another manifestation of this kind of issue is data that looks like this:

fid = fopen('sample3.txt', 'r', 'l');
str = fscanf(fid, '%s')
abs(str)
fclose(fid);
str =

ÿþa b c d e f g h 


ans =

  Columns 1 through 14

   255   254    97     0    98     0    99     0   100     0   101     0   102     0

  Columns 15 through 18

   103     0   104     0

Looks odd, doesn't it? First there are some odd characters at the beginning, then there are extra zeros inserted everywhere. What happened? In this case, the sample file was saved from Notepad, using the Save As... menu item and choosing to save it using the "Unicode" encoding. (BTW, when Notepad says "Unicode", it really means the UTF-16 encoding.) Since the file was opened with the default encoding, MATLAB transformed the data using the windows-1252 encoding, and what you see was the result. The first two bytes (255 and 254) are a Byte Order Mark (BOM), which we simply need to skip, since they do not actually represent data.
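If you are not sure whether a file from an outside source starts with a Byte Order Mark, you can peek at its first few bytes without any character conversion. The following is a rough sketch rather than a complete detector: it builds a small sample file itself so that it is self-contained, it only knows about the UTF-8 and UTF-16 marks, and the names bomKind and bomLength are made up for this example.

```matlab
% Build a small sample file: UTF-16LE BOM followed by 'a' and 'b'.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, uint8([255 254 97 0 98 0]), 'uint8');
fclose(fid);

% Read the first three bytes as raw numbers (no encoding involved).
fid  = fopen(fname, 'r');
head = fread(fid, 3, 'uint8')';
fclose(fid);
delete(fname);

% Compare against the well-known BOM byte patterns.
% Note: 255 254 is also how a UTF-32LE BOM begins; that case is ignored here.
if numel(head) >= 3 && isequal(head(1:3), [239 187 191])
    bomKind = 'UTF-8';    bomLength = 3;
elseif numel(head) >= 2 && isequal(head(1:2), [255 254])
    bomKind = 'UTF-16LE'; bomLength = 2;
elseif numel(head) >= 2 && isequal(head(1:2), [254 255])
    bomKind = 'UTF-16BE'; bomLength = 2;
else
    bomKind = 'none';     bomLength = 0;
end

fprintf('BOM: %s (%d bytes to skip)\n', bomKind, bomLength);
```

The value of bomLength is the number of bytes you would pass to fseek before reading the text.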

A Possible Solution

So much for the problems; now what can you do about them? If you wish to be more robust to platform and language/locale differences, you can specify an encoding to use when you fopen a file. For example, for the first problem described above:

fid = fopen('sample4.txt', 'w','l', 'ISO-8859-1');
fwrite(fid, 'abcdefg £¥§©', 'char');
fclose(fid);
type('sample4.txt');
abcdefg £¥§©

For the second problem, we will skip the Byte Order Mark, open the file in a Unicode encoding, and also turn off a warning, which indicates that not all functionality is supported for this encoding. For our purposes, which are reading text, the warning can be ignored:

warning off MATLAB:iofun:UnsupportedEncoding;
fid = fopen('sample3.txt', 'r', 'l', 'UTF16-LE');
fseek(fid, 2, 0);
str = fscanf(fid, '%s')
abs(str)
fclose(fid);
str =

abcdefgh


ans =

    97    98    99   100   101   102   103   104

As you can see, the string is read correctly.
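Another way to get the same result, sketched here under some assumptions, is to read the file as raw bytes and do the conversion yourself with native2unicode. I am assuming that your MATLAB release accepts the encoding name 'UTF-16LE' in native2unicode; the sample file is generated in the code so that the example is self-contained.

```matlab
% Build a UTF-16LE file with a BOM that contains 'abcdefgh'.
body  = reshape([double('abcdefgh'); zeros(1, 8)], 1, []);  % 97 0 98 0 ...
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, uint8([255 254 body]), 'uint8');
fclose(fid);

% Read everything back as raw bytes, with no character conversion.
fid = fopen(fname, 'r');
raw = fread(fid, inf, 'uint8=>uint8')';
fclose(fid);
delete(fname);

% Drop the UTF-16LE Byte Order Mark if it is present.
if numel(raw) >= 2 && isequal(raw(1:2), uint8([255 254]))
    raw = raw(3:end);
end

% Convert the remaining bytes to MATLAB characters.
str = native2unicode(raw, 'UTF-16LE');
disp(str)
disp(double(str))
```

This keeps the file I/O completely independent of the default encoding, at the cost of holding the whole file in memory as bytes first.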

In general, if your program specifies the encoding for both reading and writing, then you don't have to worry about the default encoding, since you are specifying it explicitly. The character data is then saved in a file with the specified encoding, and MATLAB will read with that encoding as well, thus preserving all of the consistency. You just need to make sure that the character data you are saving is representable in the encoding you have chosen. For example, if you chose 'US-ASCII', you would not be able to write out a pound sign. If you are dealing with values in the range from 128 to 255, I would suggest using ISO-8859-1 as above. If you are writing out Japanese, Shift-JIS may be a good one to use.
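Here is a small round-trip sketch of that advice: write with an explicit encoding, read back with the same encoding, and check that nothing was lost. The file name comes from tempname and the variable names are only for illustration; 'n' simply means native byte order, which does not matter for a single-byte encoding.

```matlab
% Text with four characters in the 128-255 range.
txt = ['abcdefg ' char([163 165 167 169])];
enc = 'ISO-8859-1';

% Write the text, naming the encoding explicitly.
fname = [tempname '.txt'];
fid = fopen(fname, 'w', 'n', enc);
fwrite(fid, txt, 'char');
fclose(fid);

% ISO-8859-1 uses one byte per character, so the file size
% should match the number of characters.
info = dir(fname);
sizeOk = (info.bytes == numel(txt));

% Read it back, naming the same encoding.
fid = fopen(fname, 'r', 'n', enc);
back = fread(fid, inf, 'char=>char')';
fclose(fid);
delete(fname);

% The round trip preserved the text if the two strings match.
roundTripOk = isequal(txt, back);
disp(roundTripOk)
```

Because both fopen calls name the encoding, this check should behave the same way regardless of the default encoding of the machine that runs it.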

Some Helpful Links

You can find out much more about Unicode, encodings, language and locales at the following references:

What sorts of other file encoding issues do you run into? Post here.


Published with MATLAB® 7.3

45 Responses to “Working with Low Level File I/O and Encodings”

  1. Arno replied on :

    This is really nice but I have Matlab 7 SP3 (7.1.0.246 (R14) Service Pack 3) under windows (which I thought was the most recent one) and it does not seem that this option of fopen is available

    fid = fopen('test.reg', 'r','l', 'ISO-8859-1');
    ??? Error using ==> fopen
    Too many input arguments.

  2. Loren replied on :

    Arno-

    The encoding feature is new in R2006b. Here’s more information: http://www.mathworks.com/access/helpdesk/help/techdoc/rn/index.html?/access/helpdesk/help/techdoc/rn/f38-998197.html

    –Loren

  3. Petr Pošík replied on :

    Dear Loren,

    I ran into encoding issues when trying to use the publish feature which you described in one of your recent posts. It works great until it comes to locales.

    1) In this article, it is described what to do with external text files. But what encoding does MATLAB use e.g. for storing M-files? If I write an M-file (containing some local characters) on Windows in Czech Republic and then transfer it to Solaris in US, am I going to have problems reading that file?

    2) The publish function inserts a “Contents:” section which is a really nice feature. But it is a bit strange to see the string “Contents” in an article that is completely written in Czech. Is there a possibility to localize these automatically inserted texts? E.g. to pass a locale specification to the publish function?

    3) What about the text annotations and axis labels included in the graphs? The encoding of the m-file that generates them and the encoding of the font used to render them should be the same, but how can I ensure that?

    4) Anyway, is there a way to figure out what encoding is MATLAB actually using? Function like getcodepage() or getencoding()…

    Thanks for your answers and for your blog.
    Regards,

    Petr

  4. Yasuhiro Hara replied on :

    Dear Petr,

    I would like to answer your questions, 1) and 4).

    1) Basically, file encoding is determined by editors, such as MATLAB editor or Notepad, and most editors use the encoding specified by the user default locale setting. MATLAB itself also determines the file encoding based on the user default locale setting. If you write an M-file using MATLAB editor with Czech language setting on a Windows system, windows-1250 is used for the M-file. And MATLAB reads the M-file as a windows-1250 encoded file.

    As for Solaris systems in the U.S., there are four possible locale settings people may set on the Solaris platform: "en_US.ISO8859-1", "en_US.ISO8859-15", "en_US.UTF-8", or "C". Not all of them may be available on all versions of Solaris, by the way. If a windows-1250 encoded M-file is copied to a Solaris system with one of those locale settings, only 7-bit ASCII characters are properly handled on the Solaris system. Other characters may be garbled or may be displayed as different characters. It depends on the locale setting on the Solaris system.

    4) There is an unsupported command that displays the current default encoding in MATLAB: feature('DefaultCharacterSet').

    Regards,
    /yasu

  5. Matthew Simoneau replied on :

    Hi Petr,

    Your comment #2 about “Contents” is a good one. The only way to change this is to edit toolbox\matlab\codetools\private\mxdom2simplehtml.xsl, where this is hard-coded, or use a custom stylesheet. We should make this easier. I’ve put it on our wish list.

    Thanks,
    Matthew

  6. tls replied on :

    Dear Loren,
    I wanted to know if it is possible to read a .mat file and then write it to another file, say a .dat file. The .dat file will be used later to convert to another format.
    If it can be done, I would like to have the code for it.

    Regards,
    tls k

  7. Thomas replied on :

    Dear Loren,

    This blog already helped me with a number of problems I had for some time. Unfortunately, Petr Pošík's question #3 hasn't received much attention yet, though I think it's a rather important one.

    How is it possible to get a correct hardcopy of a figure on a PostScript printer when the figure contains Unicode text (like special Czech characters)?

    The only possibility seems to be to print it using a native Windows driver in verbose mode, but this is a very bad solution when printing many figures. Is there any more useful solution?

    Best regards
    Thomas

  8. Loren replied on :

    Hi Thomas-

    We’ve been upgrading MATLAB over time to be sensitive to different encodings, but have not completed all the work yet. The graphics is one area that currently doesn’t support encodings well. We know about it and it’s on our list to address. I don’t have a timeline for you.

    –Loren

  9. Geoff replied on :

    Thanks for your advice which helped me to read in unicode text. My next problem is that I would like to display non-roman text (e.g., Hebrew, Chinese) in a figure; however, all I get for characters above 255 is a series of arrows. Is displaying unicode above 255 supported in Matlab 2007a?

  10. Winston replied on :

    Could someone help me out with this problem.

    I keep getting this:

    fid = fopen('sample1.txt', 'r','l', 'utf8-le');
    ??? Error using ==> fopen
    Too many input arguments.

    Error in ==> justtest1 at 5
    fid = fopen('sample1.txt', 'r', 'l', 'utf8-le');

    I tried the same with 'iso-8859-1' but no luck there either.
    Is there an update for the fopen script from Matlab on this?

    Many thanks.
    I’m using Matlab release 14

  11. Loren replied on :

    Winston-

    That final argument is new to MATLAB starting in R2006a.

    –Loren

  12. Anthony replied on :

    Thanks for this introduction :-)
    I am working with people using Matlab on a Linux Box and thus editing their M-files in UTF-8.

    I /have to/ work on a Windows Box for now and I /have to/ use the matlab editor (not emacs for example) in order to use the (very interesting) Publish feature.

    The point is I can't manage to display and edit UTF-8 correctly in the editor/debugger: any suggestion, trick or workaround (except for using Linux ;-) )?

    Thanks a lot !

  13. karim raafat replied on :

    I need your help.
    I am trying to enter Arabic letters into Matlab.
    Although the letters are written, the program cannot transfer them into numbers to understand them. I checked the Unicode table and found that below 30 all the letters or symbols are only blocks. Also, when I use the DOUBLE function, I find that all the letters have the same number, i.e. 26, although they are different Arabic letters.

    thanks

  14. Loren replied on :

    Karim-

    Please contact technical support for your issue.

    –Loren

  15. Gustaf replied on :

    Hello Loren, I have the same problem as some people above, namely that I would like to import and display unicode characters in a form, specifically Chinese. Is that possible in any Matlab version? In case it is, how?

    Best,

    Gustaf

  16. Yasuhiro Hara replied on :

    Properly handling non-7-Bit ASCII characters is affected by several different factors. We have been working to improve character data handling features, but there are still several restrictions with MATLAB. Especially, the following restriction may be related to recently reported problems here:

    * MATLAB can properly handle characters supported by the current locale setting
    Suppose the current locale setting is en_US.windows-1252. You can type Hebrew or Chinese characters in command window, but MATLAB handles them as invalid characters with the character code, 26 (0x1A).

    The above restriction may not be applied to all MATLAB usages, but please use MATLAB with a proper locale setting for now if you have to handle any non-7-Bit ASCII characters.

    Regards,
    /yasu

  17. Aris S replied on :

    Well, I have a problem when reading files. I use a program written in C to process images and save them as binary files using fwrite(,'rb'); the data that I write are 4-byte floating-point values. When I try to read the same file with Matlab in order to process them further, I use the following code: fid=fopen('name','r');
    [B count]=fread(fid,'float32');
    The problem is that even though I read the file without a problem, the values that I get at the end are different from the ones that I had when I was writing the file.
    I would appreciate any help.

    Best,
    Aris

  18. Loren replied on :

    Aris-

    Are you writing and reading the file on the same kind of machine (w.r.t. ENDIANness)?

    –Loren

  19. Aris S replied on :

    yes it is the same machine(exactly the same computer, i have only one :) ) and my Matlab version is 2007b.

    Aris

  20. Aris S replied on :

    sorry but there was an error in my C file :(

    Aris

  21. Vlad replied on :

    I create a map of the world with the names of the capital cities displayed in the local scripts. That is Latin, Greek, Cyrillic, Arabic, Hebrew, Hindi, Chinese, etc. My data comes from an Excel file and I would like to read it with the 'xlsread' function. In R2007a 'xlsread' doesn't seem to recognize the Excel file as Unicode (UTF16?) encoded. Is there a way around this, or is 'fopen' as yet the only I/O function supporting the encoding feature? Thanks.

  22. Vlad replied on :

    Loren: Here are a number of quirky Matlab R2007a behaviors in regard to Unicode characters outside the ANSI range. / Thanks! / Vlad

    1. Editor: Write Unicode characters in Editor such as \u001a (Hebrew ‘sh’), save, reload: they are gone.

    2. Workspace: Type an Unicode character outside the ANSI range at the prompt. It shows ok. Press ‘Enter’: a box character appears.

    3. Workspace: The above sometimes freezes Matlab.

    4. Workspace: assign a variable to the above Unicode character: it shows as a box in the Workspace. Now write the character in a file with Notepad, save it in UTF16, read the file in Matlab and display the character: it shows ok in the Workspace.

    fid = fopen('unicode.txt', 'r', 'l', 'UTF16-LE');
    fseek(fid, 2, 0);
    str = fscanf(fid, '%c')
    fclose(fid);

    5. Array Editor: Write four tab-delimited words in Latin Extended script, Hebrew, Arabic, Chinese in Notepad UTF16. Read them with Matlab: the characters show ok with ‘fscanf’. When wanting to section the string with ‘textscan’, the Unicode characters higher than ANSI are gone.

    fid = fopen('unicode.txt', 'r', 'l', 'UTF16-LE');
    fseek(fid, 2, 0);
    str = fscanf(fid, '%c');
    C = textscan(str, '%s %s %s %s',...
    'delimiter', '\t', 'whitespace', '');
    s = char(C{1,1}(1))
    c = str(3)
    fclose(fid);

    6. Datatips in Editor: Seemingly all +ANSI Unicode characters are shown as being \u001a.

    7. Common Problem #2: The proposed solution is very helpful. A welcome addition would be to show how to read a tab/comma/… delimited file. As it is the result is one string.

    8. Other functions: Other functions are also not straightforward to use with +ANSI Unicode characters. For example, with 'strcmp', find(strcmp( {r.countries_names}', selection_string) ) doesn't find selection_string = 'Česká republika' (i.e. Czech Republic) in a cell array even if a check with abs( char( cellstr(selection_string(m)) ) ) & abs( char( cellstr( {r.countries_names}' ) ) ) shows it is there.

  23. Vlad replied on :

    Errata: Hebrew Letter Shin is character code 1513 in decimal (U+05E9), while \u001a is the unfortunate box character.

  24. Yasuhiro Hara replied on :

    Hi Vlad,

    MATLAB currently does not support the Unicode Standard. As I described in my reply # 16, MATLAB can properly handle characters supported by the current locale setting. In addition, each character has to be encoded in up to 2-byte character code. Otherwise, MATLAB may not properly handle characters. The 2nd restriction says that MATLAB cannot support the Unicode character set even if UTF-8 is specified as the process default encoding, by the way.

    You can use any encoding schemes for your data files, but only characters supported by your current locale setting are properly handled. If your current locale setting does not support Hebrew characters, unfortunately you see the results as you described.
    M-files must be encoded with the same encoding scheme as the one specified by your current locale setting.

    Note:
    There is a way to change the current encoding setting with M-functions, but it does not change entire encoding settings in the MATLAB process. I would not recommend doing that. Besides, it does not remove the up to 2-byte encoding restriction.

    The low-level file I/O M-functions allow to read/write files encoded with several different encoding schemes, but it does not mean that MATLAB can handle any characters with your current locale setting. The same rule is applied to several MATLAB features, and therefore, I described the restriction.

    The problems that you described are most likely caused by character code conversions between UTF-16 and the current encoding in the MATLAB process. But a couple of them, such as #3, may be caused by different reasons. If you see any of those problems with a proper locale setting, which means that all characters are supported by the current locale setting, please contact technical support.

    Best regards,
    /yasu

  25. Vlad replied on :

    Yasu, Thanks for your comments. Here some follow-up points. / Cheers, Vlad

    1. Common Problem #2B: The function 'textscanu' posted on FileExchange (*) reads Unicode strings from a file and outputs a cell array of strings. The input strings have to be tab-delimited and have carriage-returns at end of rows. Note that Matlab's 'textscan' apparently gives an ASCII 26 value to characters outside the ANSI range, while 'fscanf' doesn't have the option to recognize delimiters.
    (*) MATLAB Central > File Exchange > Utilities > Data Import/Export > Read Unicode Files

    2. Unicode strings stored in figures: Apparently Unicode strings attributed to 'String' in listboxes have non-ANSI values set to ASCII 26. I'm working to see if this is true, and if this behavior can be influenced in any way, such as setting a specific encoding and font for the listboxes and figures. Otherwise it would mean that listboxes and other figure objects can't be used to set and retrieve Unicode strings.

    3. Locale and file encodings: In my specific locale I can see on screen most Unicode characters I need: that is Latin Extended, Arabic, Hebrew, Japanese. I work with highly heterogeneous multi-script files where all these scripts coexist in the same document. The applications I develop are for Linguistics, Data Mining, GIS, for which multi-script is a basic requirement.

  26. Yasuhiro Hara replied on :

    Hi Vlad,

    Thank you very much for your input. We are aware of those character handling issues, and we are trying to resolve them.

    Thanks again.

  27. Assaf replied on :

    What Vlad wrote: “3. Workspace: The above sometimes freezes Matlab.”.

    I can also ascertain that this happens. Matlab Freezes if I accidentally type a Hebrew character at the prompt.

    This happens on both 2006 and 2007 versions. Computer is running CentOS with KDE.

  28. Loren replied on :

    Assaf and Vlad-

    Please contact technical support with details. This is not a low-level file i/o issue that can be dealt with on the blog. Thanks.

    –Loren

  29. A.N.Jyothi Swaroop replied on :

    Hi,
    I am trying to read a unicode encoded Tamil(an Indian language) text in MATLAB and want to display word by word. But unfortunately,some crazy characters are appearing on the window. As many said, many of the characters are lost and many boxes(ASCII-26) are appearing in between the characters. Any suggestions would be really helpful.
    p.s- I am using MATLAB 2006 version

  30. Yasuhiro Hara replied on :

    Hi A.N.Jyothi Swaroop,

    Several factors can break character codes. I would guess that the file was read with the wrong encoding setting or that the current default encoding does not support Tamil characters. But could you provide the following information?

    1) Platform (operating system)?
    2) MATLAB encoding setting?
    Please execute the following command:
    >> feature('DefaultCharacterSet')
    3) Which Unicode encoding is used with the file (UTF-8, UTF-16, UTF-32)?
    4) Sample code to read the file

    Regards,
    /yasu

  31. A.N.Jyothi Swaroop replied on :

    Hi Yasuhiro Hara,
    I am working on Windows platform and encoding setting is windows-1252.The file is of UTF-16 and while I was trying to read only one character namely ‘த’ which has unicode value of 2980 which is reading properly, but when I am trying to display it, I was using the text command as following but instead getting a different character.
    >>h=text(0.5,0.5,char(2980) ,'FontName','MS Arial Unicode','FontSize',50,'LineWidth',10);
    I tried converting it to ASCII and display it but still failed.
    Can you please help me out.

    Thanking you
    Regards
    A.N.Jyothi Swaroop

  32. Yasuhiro Hara replied on :

    Hi A.N.Jyothi Swaroop,

    MATLAB internally still converts character codes between the user default encoding and UTF-16 in many places. Because your default encoding is windows-1252, Tamil characters are not properly handled with those character conversions. The character code, 26 (0x1A), is the result of character code conversion failure. MATLAB can properly handle characters supported by the current locale setting. This is one of current MATLAB’s restrictions.

    Unfortunately, the Tamil locale is not supported by MATLAB, and therefore changing the locale setting does not resolve this issue. If another MATLAB-supported Windows locale supports Tamil characters, setting that locale may be a solution.

    Regards,
    /yasu

  33. A.N.Jyothi Swaroop replied on :

    Hi,
    Now i just intend to do some file operations with unicode encoded text(Tamil) in MATLAB. While reading the file, the values corresponding to each characters are stored in a file. When I am trying to convert this into a string using native2unicode or char function, some junk characters are stored into the file.The problem with native2unicode function is that it reads data as 8 bit values whereas all Tamil characters are 16 bit values(ex-2965,2980).Is there any possibility of native2unicode for 16 bit values.

    Regards,
    A.N.Jyothi Swaroop

  34. Yasuhiro Hara replied on :

    Hi A.N.Jyothi Swaroop,

    I assume your file is encoded in UTF-16LE since you mentioned it in your reply #31, and you are trying to handle character data with the MATLAB char data type. If you read your file as follows, your file content is stored in the "str" variable.

    >> fid = fopen('YourFile.txt', 'r', 'n', 'UTF-16LE');
    >> str = fread(fid, inf, 'char=>char')';
    >> fclose(fid);

    The data type of the “str” variable is MATLAB char data type, and is encoded in UTF-16. So, you don’t have to use native2unicode.

    But please be careful if your file has BOM (Byte Order Mark). Currently, MATLAB does not automatically skip BOM.

    Regards,
    /yasu

  35. Anibal Orozco Fuentes replied on :

    Hi everyone! I have problems with MATLAB 2009a on Mac OS X. The problem is that I want to write conditions with UTF-16 characters, like "á" or "ü", but MATLAB doesn't handle them. How can I make MATLAB use UTF-16 in my project? I am making a Spanish and Tarahumara synthesizer, and I need it. Please help me.

  36. Dirk replied on :

    Hi All

    Is there a way to configure MATLAB figures to display unicode characters correctly when plotted using the ‘text’ function. I am trying to display the latin small letter schwa (U+0259). I can display other things like the euro and pound correctly and the schwa character is displayed correctly in my workspace, but when I print it to the command window or a figure I get a default character. I need to plot it on a figure.

  37. Loren replied on :

    Dirk-

    It depends on too many things to answer unequivocally. Please contact technical support with as much detail as you can provide, e.g., locale, default encoding, etc. See link on the right to get support.

    –Loren

  38. Bob replied on :

    Hi Lori,

    I'm trying to process Unicode text files from more than one locale other than the standard Latin one.
    I'm able to verify that it works with one, or with the other, but never with more than one at a time.
    Do you think there is a way to overcome this?

     %
     fid1 = fopen(unicodeFileLocale_1, 'r', 'n', 'UTF-8');
     str1 = fread(fid1, inf, '*char')';
     fclose(fid1);
     %
     fid2 = fopen(unicodeFileLocale_2, 'r', 'n', 'UTF-8');
     str2 = fread(fid2, inf, '*char')';
     fclose(fid2);
    

    Best regards,

  39. Loren replied on :

    Bob-

    You don’t say what happens when you run your code. Can you please explain more. It looks like you are using one locale at a time in your code.

    Loren

  40. Terrance Nearey replied on :

    It's pretty astounding to me that these problems haven't been substantially solved after 3 years of this blog. Every Web browser in the world handles this kind of thing pretty seamlessly by now, as witnessed by some examples included in this blog.
    I realize that it gets pretty hairy out there wrt to alternate standards and even security issues. But aren’t there some minimal de facto standards that could be implemented? Say you could try to implement in a “coming real soon” matlab any character codings that will show up properly in this blog today?
    Cheers

  41. Bo Pedersen replied on :

    I concur with Terrance… I wasted a lot of time in Matlab trying to fix encoding problems – just to realize that they are not fixable..

  42. Robbe replied on :

    Hi Loren,

    Can you help me out, please? I am trying to read a very big data file (250 MB), and an out-of-memory error appears. Any ideas? I can't figure out a way to read sample by sample, because fread can't read only the 2nd value. It reads 1 and 2 (fread(fid,2,'uint16')).

    Robbe

  43. Loren replied on :

    Robbe-

    Please contact technical support, link on the right of the blog, with full code details, error message, and sample data.

    –loren

  44. Leilane replied on :

    Hi Loren,

    In the Matlab graphics window, the text in my plots can have accents. However, when I run Matlab from bash (matlab -nodisplay), even though I changed my system locale to pt_BR.ISO-8859-1 (user@pc$Vitória), I get (>> Vitria).

    Where do I tell Matlab that it must use this encoding (ISO-8859-1)?

    Thanks (sorry for english),

    Leilane

  45. Yasuhiro Hara replied on :

    Hi Leilane,

    The NoDisplay mode currently does not properly handle non-7-bit ASCII characters. The NoDisplay mode may display non-7-bit ASCII characters, but it cannot accept them as input.

    As for encoding setting, MATLAB automatically gets the user default encoding from the user default locale setting. If you use MATLAB on a Linux system, you can specify it using locale environment variable, such as LANG.

    Regards,
    /yasu


MathWorks
Loren Shure works on design of the MATLAB language at MathWorks. She writes here about once a week on MATLAB programming and related topics.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.