# MATLAB Speaks Python

MATLAB is a great computing environment for engineers and scientists. MATLAB also provides access to general-purpose languages including C/C++, Java, Fortran, .NET, and Python. Today's guest blogger, Toshi Takeuchi, would like to talk about using MATLAB with Python.

### Contents

- Why Not Use Both?
- Setting up Python in MATLAB
- Karate Club Dataset
- To Import or Not to Import
- Extracting Data from a Python Object
- Handling a Python List and Tuple
- Handling a Python Dict
- Visualizing the Graph in MATLAB
- Passing Data from MATLAB to Python
- Community Detection with NetworkX
- Streamlining the Code
- Summary

#### Why Not Use Both?

When we discuss languages, we often encounter a false choice where you feel you must choose one or the other. In reality, you can often use both. Most of us don't work alone. As part of a larger team, your work is often part of a larger workflow that involves multiple languages. That's why MATLAB provides interoperability with other languages including Python. Your colleagues may want to take advantage of your MATLAB code, or you need to access Python-based functionality from your IT systems. MATLAB supports your workflow in both directions.

Today I would like to focus on **calling Python from MATLAB** to take advantage of some existing Python functionality within a MATLAB-based workflow.

In this post, we will see:

- How to import data from Python into MATLAB
- How to pass data from MATLAB to Python
- How to use a Python package in MATLAB

#### Setting up Python in MATLAB

MATLAB supports Python 2.7, 3.6, and 3.7 as of this writing (R2019b).

I assume you already know how to install and manage Python environments and dependencies on your platform of choice, and I will not discuss it here because it is a complicated topic of its own.

Let's enable access to Python in MATLAB. You need to find the full path to your Python executable. Here is an example for Windows. On Mac and Linux, your operating system command may be different.

```
pe = pyenv;
if pe.Status == "NotLoaded"
    % "where" is a Windows command; strtrim removes the trailing newline
    [~,exepath] = system("where python");
    pe = pyenv('Version',strtrim(exepath));
end
```

If that doesn't work, you can also just pass the path to your Python executable as a string.

```
pe = pyenv('Version','C:\Users\username\AppData\Local\your\python\path\python.exe')
```

```
myPythonVersion = pe.Version
py.print("Hello, Python!")
```

```
myPythonVersion = "3.7"
Hello, Python!
```
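If you are unsure which path to hand to `pyenv`, one option is to ask the Python interpreter itself. The following is a minimal sketch meant to be run in the Python environment you intend to use, not in MATLAB; the variable names `exe_path` and `version` are only illustrative.

```python
import sys

# Full path of the interpreter running this script. This is the kind of
# path you could pass to pyenv('Version', ...) in MATLAB.
exe_path = sys.executable

# Major.minor version string, to compare against the Python versions
# that your MATLAB release supports.
version = "{}.{}".format(sys.version_info.major, sys.version_info.minor)

print(exe_path)
print(version)
```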

#### Karate Club Dataset

Wayne Zachary published a dataset that contains a social network of friendships between 34 members of a karate club at a US university in the 1970s. A dispute that erupted in this club eventually caused it to break up into two factions. We want to see if we can algorithmically predict how the club would break up based on its interpersonal relationships.

This dataset is included in NetworkX, a complex networks package for Python. We can easily get started by importing the dataset using this package.

I am using NetworkX 2.2. To check the package version in Python, you would typically use the `__version__` package attribute like this:

>>> networkx.__version__

MATLAB doesn't support class names or other identifiers starting with an underscore (`_`) character. Instead, use the following to get the help content on the package, including its current version.

```
>> py.help('networkx')
```

#### To Import or Not to Import

Typically, you do this at the start of your Python script.

```
import networkx as nx
G = nx.karate_club_graph()
```

However, this is not recommended in MATLAB because the behavior of the `import` function in MATLAB is different from Python's.

The MATLAB way to call Python is to use the `py.` prefix, followed by a package or method, like this:

nxG = py.networkx.karate_club_graph();

If you must use `import`, you can do it as follows:

```
import py.networkx.*
nxG = karate_club_graph();
```

As you can see, once you omit `py`, it is hard to tell that you are calling a Python method. That can be confusing when you start mixing MATLAB code and Python code within the same script.

#### Extracting Data from a Python Object

The call to `karate_club_graph` returns the karate club dataset as a NetworkX graph object.

myDataType = class(nxG)

myDataType = 'py.networkx.classes.graph.Graph'

You can see the methods available on this object like this:

methods(nxG)

You can also see the properties of this object.

properties(nxG)

A NetworkX graph contains an `edges` property that returns an object called `EdgeView`.

edgeL = nxG.edges; myDataType = class(edgeL)

myDataType = 'py.networkx.classes.reportviews.EdgeView'

To use this Python object in MATLAB, the first step is to convert the object into a core Python data type such as a Python `list`.

edgeL = py.list(edgeL); myDataType = class(edgeL)

myDataType = 'py.list'

Now `edgeL` contains a Python `list` of node pairs stored as Python `tuple` elements. Each node pair represents an edge in the graph. Let's see the first 5 `tuple` values.

listContent = edgeL(1:5)

listContent = Python list with no properties. [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)]

#### Handling a Python List and Tuple

The Python way for handling a `list` or `tuple` typically looks like this, where you process individual elements in a loop.

```
for i in l:            # l is the list
    print(i)
for u, v in t:         # t is the tuple
    print((u, v))
```

The MATLAB way is to use arrays instead. The Python `list` can be converted into a `cell` array.

edgeC = cell(edgeL); myDataType = class(edgeC)

myDataType = 'cell'

This `cell` array contains Python `tuple` elements.

myDataType = class(edgeC{1})

myDataType = 'py.tuple'

The Python `tuple` can also be converted to a `cell` array. To convert the inner `tuple` elements, we can use `cellfun`.

```
edgeC = cellfun(@cell, edgeC, 'UniformOutput', false);
myDataType = class(edgeC{1})
```

myDataType = 'cell'

The resulting nested `cell` array contains Python `int` values.

myDataType = class(edgeC{1}{1})

myDataType = 'py.int'

#### Handling a Python Dict

Now let's also extract the nodes from the dataset. We can follow the same steps as we did for the edges.

```
nodeL = py.list(nxG.nodes.data);
nodeC = cell(nodeL);
nodeC = cellfun(@cell, nodeC, 'UniformOutput', false);
```

An inner `cell` array contains both Python `int` and `dict` elements.

cellContent = nodeC{1}

cellContent = 1×2 cell array {1×1 py.int} {1×1 py.dict}

Python `dict` is a data type based on key-value pairs. In this case, the key is `'club'` and the value is `'Mr. Hi'`.

cellContent = nodeC{1}{2}

cellContent = Python dict with no properties. {'club': 'Mr. Hi'}

Mr. Hi was a karate instructor at the club. The other value in the Python `dict` is `'Officer'`, and the officer was a leader of the club. They were the key individuals of the respective factions. The node attribute indicates which faction an individual node belongs to. In this case, Node 1 belonged to Mr. Hi's faction.

The Python way for handling a `dict` typically looks like this, where you process individual elements in a loop.

for k, v in d.items(): print (k, v)

Again, the MATLAB way is to use an array. The Python `dict` can be converted to a `struct` array.

nodeAttrs = cellfun(@(x) struct(x{2}), nodeC); myDataType = class(nodeAttrs)

myDataType = 'struct'

We can extract the individual values into a `string` array. The club was evidently evenly divided between the factions.

nodeAttrs = arrayfun(@(x) string(x.club), nodeAttrs); tabulate(nodeAttrs)

```
    Value    Count    Percent
   Mr. Hi       17     50.00%
  Officer       17     50.00%
```

Let's extract the nodes that belong to Mr. Hi's faction.

```
group_hi = 1:length(nodeAttrs);
group_hi = group_hi(nodeAttrs == 'Mr. Hi');
```
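For comparison, here is roughly how the same counting and filtering could look on the Python side. This is a minimal sketch using only the standard library; `node_data` is an invented sample in the `(node, attribute dict)` shape shown above, not the real club assignments.

```python
from collections import Counter

# Invented sample in the (node, attribute dict) shape; not the real data.
node_data = [
    (0, {"club": "Mr. Hi"}),
    (1, {"club": "Mr. Hi"}),
    (2, {"club": "Officer"}),
    (3, {"club": "Officer"}),
]

# Pull out the 'club' value of every node, then count each faction.
clubs = [attrs["club"] for _, attrs in node_data]
counts = Counter(clubs)

# 0-based node ids that belong to Mr. Hi's faction.
group_hi = [node for node, attrs in node_data if attrs["club"] == "Mr. Hi"]

print(counts["Mr. Hi"], counts["Officer"])  # 2 2
print(group_hi)                             # [0, 1]
```

Note that the Python version keeps 0-based node ids, whereas `group_hi` in MATLAB holds 1-based indices.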

#### Visualizing the Graph in MATLAB

MATLAB also provides graph and network capabilities and we can use them to visualize the graph.

Let's convert Python `int` values in the edge list to `double` and extract the nodes in the edges into separate vectors.

s = cellfun(@(x) double(x{1}), edgeC); t = cellfun(@(x) double(x{2}), edgeC);

MATLAB `graph` expects column vectors of nodes. Let's transpose them.

s = s'; t = t';

Node indices in Python start at 0, but node indices in a MATLAB graph must be positive integers. Let's fix this issue by shifting every index by one.

s = s + 1; t = t + 1;
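The same reshaping can be sketched on the Python side with built-in types only. The `edges` list here is a hand-written stand-in for the first few entries of the edge list, not the full dataset.

```python
# Stand-in for the first few edges of the graph (0-based node pairs).
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)]

# Split the pairs into source and target sequences, like s and t in MATLAB.
sources, targets = zip(*edges)

# Shift to 1-based node numbers, as a MATLAB graph requires.
s = [u + 1 for u in sources]
t = [v + 1 for v in targets]

print(s)  # [1, 1, 1, 1, 1]
print(t)  # [2, 3, 4, 5, 6]
```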

Now, we are ready to create a MATLAB graph object and plot it, with Mr. Hi's faction highlighted.

```
G = graph(s,t);
G.Nodes.club = nodeAttrs';
figure
P1 = plot(G);
highlight(P1, group_hi,'NodeColor', '#D95319', 'EdgeColor', '#D95319')
title({'Zachary''s Karate Club','Orange represents Mr. Hi''s faction'})
```

#### Passing Data from MATLAB to Python

In this case, we already have the NetworkX graph object, but for the sake of completeness, let's see how we could create this Python object within MATLAB.

Let's create an empty NetworkX graph.

nxG2 = py.networkx.Graph();

You can add edges to this graph with the `add_edges_from` method. It accepts a Python `list` of `tuple` elements like this:

[(1,2),(2,3),(3,4)]

This is not valid syntax in MATLAB. Instead, we can use a 1xN `cell` array of node pairs like this:

myListofTuples = {{1,2},{2,3},{3,4}};

When we pass this nested `cell` array to `py.list`, MATLAB automatically converts it to a Python `list` of `tuple` elements.

myListofTuples = py.list(myListofTuples); myDataType = class(myListofTuples{1})

myDataType = 'py.tuple'

Let's extract the edge list from the MATLAB `graph`. It is a 78x2 matrix of `double` values. In MATLAB, `double` is the default numeric data type.

edgeL = G.Edges.EndNodes; myDataType = class(edgeL)

myDataType = 'double'

If we convert an array of `double` values to a Python `list`, the values are converted to Python `float`. The node labels in the NetworkX graph, however, are Python `int` values, the default type for whole numbers in Python, so we cannot pass `double` values as they are.

listContent = py.list(edgeL(1,:))

listContent = Python list with no properties. [1.0, 2.0]

Also, Python indexing is 0-based while MATLAB is 1-based. We need to convert the array of `double` elements to `int8` and change the variable elements to 0-based indexing.

edgeL = int8(edgeL) - 1; myDataType = class(edgeL)

myDataType = 'int8'

We can use `num2cell` to convert the matrix of `int8` values to a 78x2 `cell` array, where each element is in a separate cell.

edgeL = num2cell(edgeL); myDataType = class(edgeL)

myDataType = 'cell'

We can place the node pairs in the same cell by converting the 78x2 `cell` array to a 78x1 `cell` array using `num2cell`.

edgeL = num2cell(edgeL,2); [rows,cols] = size(edgeL)

```
rows = 78
cols = 1
```

The `add_edges_from` method expects a 1xN Python `list`. Now let's turn this into a 1xN `cell` array by transposing the Nx1 `cell` array, converting it to a Python `list` and adding it to the empty NetworkX graph object.

nxG2.add_edges_from(py.list(edgeL'));

The edges were added to the NetworkX graph object. Let's check the first 5 `tuple` values.

edgeL = py.list(nxG2.edges); listContent = edgeL(1:5)

listContent = Python list with no properties. [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)]
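To make the target format concrete, here is a small Python sketch of the same conversion: rows of 1-based floating-point values turned into 0-based `int` tuples. The `rows` list is a hand-written stand-in for the first rows of the MATLAB edge matrix, not data exported from MATLAB.

```python
# Stand-in for the first rows of a 1-based edge matrix holding floats,
# which is how MATLAB doubles would arrive in Python.
rows = [[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]

# Convert each row to a tuple of 0-based Python ints.
edge_list = [tuple(int(x) - 1 for x in row) for row in rows]

print(edge_list)  # [(0, 1), (0, 2), (0, 3)]
```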

The nodes were also added in the graph, but they currently don't have any attributes, as you can see below in the first 3 elements of the node list.

nodeL = py.list(nxG2.nodes.data); listContent = nodeL(1:3)

listContent = Python list with no properties. [(0, {}), (1, {}), (2, {})]

To add attributes, we need to use the `set_node_attributes` method. This method expects a nested Python `dict`. Here is how to create a `dict` in MATLAB.

myDict = py.dict(pyargs('key', 'value'))

myDict = Python dict with no properties. {'key': 'value'}

The keys of the outer `dict` are the nodes, and the values are inner `dict` objects that hold the attribute key-value pairs, like this:

{0: {'club': 'Mr. Hi'}, 1: {'club': 'Officer'}}

Unfortunately, this won't work, because `pyargs` expects only a `string` or `char` value as the key.

```
>> py.dict(pyargs(0, py.dict(pyargs('club', 'Mr. Hi'))))
Error using pyargs
Field names must be string scalars or character vectors.
```

Instead, we can create an empty `dict` and add each inner `dict` with the `update` method. The method accepts a `tuple` of key-value pairs, and we use 0-based node numbers as the keys:

```
attrsD = py.dict;
for ii = 1:length(nodeAttrs)
    attrD = py.dict(pyargs('club', G.Nodes.club(ii)));
    attrsD.update(py.tuple({{int8(ii - 1), attrD}}))
end
```

Then we can use the `set_node_attributes` to add attributes to the nodes.

py.networkx.set_node_attributes(nxG2, attrsD); nodeL = py.list(nxG2.nodes.data); listContent = nodeL(1:3)

listContent = Python list with no properties. [(0, {'club': 'Mr. Hi'}), (1, {'club': 'Mr. Hi'}), (2, {'club': 'Mr. Hi'})]
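The `update` call in the loop relies on the fact that Python's `dict.update` accepts an iterable of key-value pairs. Here is a minimal pure-Python sketch of the same idea; the `clubs` labels are made up for illustration.

```python
# Made-up club labels for three nodes, in 0-based node order.
clubs = ["Mr. Hi", "Mr. Hi", "Officer"]

attrs = {}
for node, club in enumerate(clubs):
    inner = {"club": club}
    # dict.update accepts an iterable of (key, value) pairs, so a
    # one-element tuple holding one pair adds exactly one entry.
    attrs.update(((node, inner),))

print(attrs)
# {0: {'club': 'Mr. Hi'}, 1: {'club': 'Mr. Hi'}, 2: {'club': 'Officer'}}
```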

#### Community Detection with NetworkX

NetworkX provides the `greedy_modularity_communities` method to find communities within a graph. Let's try this algorithm to see how well it can detect the factions!

Since this club split into two groups, we expect to see 2 communities.

communitiesL = py.networkx.algorithms.community.greedy_modularity_communities(nxG2); myDataType = class(communitiesL)

myDataType = 'py.list'

The returned Python `list` contains 3 elements. That means the algorithm detected 3 communities within this graph.

num_communities = length(communitiesL)

num_communities = 3

Each element of the `list` is a `frozenset`. A Python `frozenset` is like a Python `set`, except that it is immutable: you cannot add or remove elements after it is created. And a Python `set` is similar to a Python `list`, except that all its elements are unique and unordered, whereas a `list` can contain the same element multiple times.

listContent = communitiesL{1}

listContent = Python frozenset with no properties. frozenset({32, 33, 8, 14, 15, 18, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31})
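The differences between `list`, `set`, and `frozenset` can be seen in a few lines of plain Python. This is only an illustrative sketch with made-up values.

```python
members = [8, 14, 14, 15, 8]    # made-up list with repeated values

as_set = set(members)           # duplicates collapse
as_frozen = frozenset(members)  # same elements, but cannot be modified

as_set.add(99)                  # a set can grow after creation

try:
    as_frozen.add(99)           # frozenset has no add method
    frozen_is_mutable = True
except AttributeError:
    frozen_is_mutable = False

print(sorted(as_frozen))   # [8, 14, 15]
print(frozen_is_mutable)   # False
```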

Let's convert it into a nested `cell` array.

```
communitiesC = cell(communitiesL);
communitiesC = cellfun(@(x) cell(py.list(x)), communitiesC, 'UniformOutput', false);
myDataType = class(communitiesC{1}{1})
```

myDataType = 'py.int'

The innermost `cell` arrays contain Python `int` values. Let's convert them to `double`.

```
for ii = 1:length(communitiesC)
    communitiesC{ii} = cellfun(@double, communitiesC{ii});
end
myDataType = class(communitiesC{1}(1))
```

myDataType = 'double'

Since the nodes are 0-based indexed in Python, we need to change them to 1-based indexed in MATLAB.

```
communitiesC = cellfun(@(x) x + 1, communitiesC, 'UniformOutput', false);
```

Let's plot the communities within the graph.

```
tiledlayout(1,2)
nexttile
P1 = plot(G);
highlight(P1, group_hi,'NodeColor', '#D95319', 'EdgeColor', '#D95319')
title({'Zachary''s Karate Club','Orange represents Mr. Hi''s faction'})
nexttile
P2 = plot(G);
highlight(P2, communitiesC{1},'NodeColor', '#0072BD', 'EdgeColor', '#0072BD')
highlight(P2, communitiesC{2},'NodeColor', '#D95319', 'EdgeColor', '#D95319')
highlight(P2, communitiesC{3},'NodeColor', '#77AC30', 'EdgeColor', '#77AC30')
title({'Zachary''s Karate Club','Modularity-based Communities'})
```

If you compare these plots, you can see that the two communities on the right in orange and green, when combined, roughly overlap with Mr. Hi's faction.

We can also see that:

- Community 1 represents the 'Officer' faction
- Community 3 represents the devoted 'Mr. Hi' faction
- Community 2 represents the people who had connections with both factions

Interestingly, Community 2 ultimately ended up siding with Mr. Hi's faction.

Let's see if there is any difference between the output of the algorithm and the actual faction.

```
diff_elements = setdiff(group_hi, [communitiesC{2} communitiesC{3}]);
diff_elements = [diff_elements setdiff([communitiesC{2} communitiesC{3}], group_hi)]
```

diff_elements = 9 10

The community detection algorithm came very close to identifying the actual faction.
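Taking the set difference in both directions and concatenating the results amounts to a symmetric difference: the elements that appear in exactly one of the two sets. A small Python sketch of that idea follows; the node numbers are made up and are not the real karate club results.

```python
# Made-up 1-based node numbers; not the real karate club results.
actual_faction = {1, 2, 3, 4, 9}
detected = {1, 2, 3, 4, 10}

# Nodes in exactly one of the two sets: the symmetric difference.
mismatched = sorted(actual_faction ^ detected)

print(mismatched)  # [9, 10]
```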

#### Streamlining the Code

Up to this point we have been examining what data type is returned in each step. If you already know the data types, you can combine many of these steps into a few lines of code.

To get the karate club data and create a MATLAB graph, you can just do this:

```
nxG = py.networkx.karate_club_graph();
edgeC = cellfun(@cell, cell(py.list(nxG.edges)), 'UniformOutput', false);
nodeC = cellfun(@cell, cell(py.list(nxG.nodes.data)), 'UniformOutput', false);
nodeAttrs = cellfun(@(x) struct(x{2}), nodeC);
nodeAttrs = arrayfun(@(x) string(x.club), nodeAttrs);
s = cellfun(@(x) double(x{1}), edgeC)' + 1;
t = cellfun(@(x) double(x{2}), edgeC)' + 1;
G = graph(s,t);
G.Nodes.club = nodeAttrs';
```

To create a Python graph from the MATLAB data, you can do this:

```
nxG2 = py.networkx.Graph();
edgeL = num2cell(int8(G.Edges.EndNodes) - 1);
nxG2.add_edges_from(py.list(num2cell(edgeL, 2)'));
attrsD = py.dict;
for ii = 1:length(G.Nodes.club)
    attrD = py.dict(pyargs('club', G.Nodes.club(ii)));
    attrsD.update(py.tuple({{int8(ii - 1), attrD}}))
end
py.networkx.set_node_attributes(nxG2, attrsD);
```

And to detect the communities, you can do this:

```
communitiesC = cell(py.networkx.algorithms.community.greedy_modularity_communities(nxG2));
communitiesC = cellfun(@(x) cell(py.list(x)), communitiesC, 'UniformOutput', false);
for ii = 1:length(communitiesC)
    communitiesC{ii} = cellfun(@double, communitiesC{ii});
end
communitiesC = cellfun(@(x) x + 1, communitiesC, 'UniformOutput', false);
```

#### Summary

In this example, we saw how we can use Python within MATLAB. It is fairly straightforward once you understand how the data type conversion works. Things to remember:

- Python is 0-based indexed vs MATLAB is 1-based indexed
- Python's default numeric data type is `int` whereas it's `double` for MATLAB
- Instead of loops, convert Python data into suitable types of MATLAB arrays
- Use `cell` arrays for Python `list` and `tuple`
- Use `struct` arrays for Python `dict`

In this example, we used a Python library in our MATLAB workflow to get the data and detect communities. I could have coded everything in MATLAB, but it was easier to leverage existing Python code and I was able to complete my tasks within the familiar MATLAB environment where I can be most productive.

Are you a coding polyglot? Share how you use MATLAB and Python together here.