You’ve Got Mail

Posted by Tim Johns, March 17, 2022

Let’s send some emails!

Imagine you write a function that sends an email to a customer. Your initial function call might look like this:

  sendEmail(emailAddress,firstName,lastName,bodyText)

Having four inputs to our function is fine, but if we need to add more customer details, the argument list becomes difficult to manage. When working on software projects, we find that certain variables tend to be grouped together. In our case, “first name”, “last name”, and “email address” represent a customer. We might also have “name”, “description”, and “price” to represent a product, or “customer”, “product”, and “datetime” to represent an order.

So let’s group our first 3 inputs into a single variable. The simplest way to do this is with a struct:

  customer = struct("FirstName",firstName,"LastName",lastName,"Email",emailAddress);

Our function call then becomes:

  sendEmail(customer,bodyText)

The advantage of this is that we can now pass around all the information related to a customer as a single variable. If their postal address is added to the struct, it’s automatically available for us to use in our sendEmail function.

What’s in a customer?

A disadvantage arises when a new developer (or you several weeks from now!) looks at this function again. What exactly is in the “customer” struct? Is the field for their first name called “FirstName” or “Firstname” or “firstname”?

If you try to access a field that doesn’t exist, you’ll get an error, but if you set a field that didn’t previously exist, it will be automatically added to the struct. Maybe you can work out the correct names from the existing code, assuming it’s not too long or complex. Otherwise, you’ll have to run a test (because someone has written a test, right?) and see what is provided to the function at runtime.
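
A minimal sketch of that asymmetry, using made-up values for the fields (the misspelled field name "Firstname" is deliberate):

```matlab
% Illustrative values only; field names follow the struct defined earlier.
customer = struct("FirstName","Mitch","LastName","Docker","Email","mitch@foo.com");

% Reading a misspelled field fails straight away...
try
    firstName = customer.Firstname;
catch err
    disp(err.message)
end

% ...but assigning to a misspelled field succeeds and adds a new field.
customer.Firstname = "Michael";

disp(customer.FirstName)             % Mitch: the original field is untouched
disp(isfield(customer,"Firstname"))  % 1 (true): the typo now lives in the struct
```

The struct ends up with four fields, and nothing warns us that two of them were meant to be the same thing.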

With a customer, we can at least take an educated guess as to the contents. If a more generic name is adopted (“data”, “metadata”, “info”), it can be impossible.

This lack of clarity wastes developer time over and over again, and can cause subtle bugs that are hard to track down.

What’s a valid customer?

To address this, we could define validation on the customer input: we want to make sure that we have a scalar struct with the correct field names, that the data types are correct, and that any other validation rules are met (for example, that the email address always has the format “*@*.*”).
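
As a rough sketch, such validation might be collected in a helper like the one below. The function name validateCustomerStruct, the error identifiers, and the regular expression used for the “*@*.*” rule are assumptions made for illustration, not part of any existing code:

```matlab
function validateCustomerStruct(customer)
    % Hypothetical helper: errors unless the input is a valid "customer struct".

    % Must be a single struct, not an array or some other type.
    assert(isstruct(customer) && isscalar(customer), ...
        "Customer:NotScalarStruct","Customer must be a scalar struct")

    % All expected field names must be present (names are case-sensitive).
    required = ["FirstName" "LastName" "Email"];
    assert(all(isfield(customer,required)), ...
        "Customer:MissingField","Customer must have fields: %s",join(required,", "))

    % Every field must hold a string scalar.
    for name = required
        assert(isStringScalar(customer.(name)), ...
            "Customer:WrongType","%s must be a string scalar",name)
    end

    % Email must look like *@*.* (at least one character in each part).
    assert(~isempty(regexp(customer.Email,"^.+@.+\..+$","once")), ...
        "Customer:InvalidEmail","Email must have the form *@*.*")

end
```

Every function that accepts a customer struct would have to remember to call this helper, and nothing stops the struct from being modified afterwards.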

Managing this can be complex and laborious. If you’re not careful, it leads to scattered code that repeats the validation and that is difficult to read. Furthermore, can we guarantee that all such “customer structs” will always be valid? No – it’s possible for us, or another developer, to create an invalid customer struct and we won’t know about it until we run our validation again or the code errors.

This forms a second source of bugs – data that is not in the expected format. Common issues include variables that are empty but shouldn’t be, or a cellstr that should be a string array.

What do we need from a customer?

There are operations that we frequently need to perform on our customer struct. For example, constructing the customer’s full name from their first and last name:

  fullname = customer.FirstName + " " + customer.LastName;

Writing this code each time we need the full name either leads to code duplication (and a high likelihood of inconsistency), or a function that’s divorced from its data and potentially hard to find.

Define a class instead

Instead, we can define a class for our customer! Doing so will bring the following advantages:

Functions that use the customer class only need one line of validation in the arguments block – the required class. The validation line tells you exactly what the input is, and you can immediately hit Ctrl+D to go to the class definition. It forms a clear contract between the calling code and your class.

The properties block tells you exactly what properties all objects of that class will have. The validation for each property makes explicit what that property will contain and can set a default value, and it guarantees that all objects of the class will be valid.

Dependent properties can be added to give you derived information without having to poke around the internals of the object (hello encapsulation!); tell the object what you want it to do, don’t ask for the individual pieces of data. Other functionality related to the class can be added as methods.

Make your class work with arrays

Custom classes really begin to shine when you make them array compatible. Rather than having a customer or an order, you can have an array of customers or orders and perform operations on them in one go. This native array handling is one of the unique features of MATLAB and removes the need for a “collection” object like you might have in C.

An array-based method I almost always add is table. table transforms an array of objects into a standard MATLAB table. It allows you to easily see the entire contents of the object array and perhaps to write the data into a uitable for display or to a spreadsheet for reporting purposes.

So why go to all this trouble of creating a custom class just to turn it back into a generic data type? The crucial difference now is that the table is derived from our custom class which handles all the validation and calculations; the table is not the source of truth.

Code example

Below is example code showing what a customer might look like when implemented as a class in MATLAB. It has the following features:

Property names are fixed, always present, and cannot be changed at runtime.
Data types and sizes are fixed, and type conversion is performed automatically where possible (e.g. char to string).
Email is validated whenever it is changed.
The FullName dependent property gives calling code direct access to what it actually wants.
The table method lets us easily visualise the contents of a customer array and lets that data be consumed outside of our application.

classdef Customer
    
    properties
        FirstName (1,1) string
        LastName (1,1) string
        Email (1,1) string {mustBeValidEmail} = "undefined@domain.com"
    end
    
    properties (Dependent,SetAccess = private)
        FullName (1,1) string
    end

    methods
        
        function cust = Customer(first,last,email)
            
            cust.FirstName = first;
            cust.LastName = last;
            cust.Email = email;
            
        end
        
        function str = get.FullName(customer)
            
            str = customer.FirstName + " " + customer.LastName;
            
        end
        
        function tbl = table(customers)
            
            arguments
                customers (1,:) Customer
            end
            
            fn = [customers.FirstName]';
            ln = [customers.LastName]';
            email = [customers.Email]';
            
            tbl = table(fn,ln,email,'VariableNames',["FirstName" "LastName" "Email"]);
            
        end
        
    end
    
end

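function mustBeValidEmail(value)
    % Property validator: errors unless value looks like name@domain.tld.
    % Deliberately simple: each part may contain only letters and digits,
    % so addresses with dots, hyphens, or plus signs in a part are rejected.
    
    anyLetterOrNum = alphanumericsPattern();   % one or more letters/digits
    pat = anyLetterOrNum + "@" + anyLetterOrNum + "." + anyLetterOrNum;
    assert(matches(value,pat),"Customer:InvalidEmail","Invalid email")
    
end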

Here’s how we might use it:

c(1) = Customer("Mitch","Docker","mitch@foo.com")
c(2) = Customer("Lachlan","Morton","lachlan@bah.com");
c(3) = Customer("Rigoberto","Uran","rigo@uran.com");
table(c)

c = 

  Customer with properties:

    FirstName: "Mitch"
     LastName: "Docker"
        Email: "mitch@foo.com"
     FullName: "Mitch Docker"


ans =

  3×3 table

     FirstName     LastName          Email      
    ___________    ________    _________________

    "Mitch"        "Docker"    "mitch@foo.com"  
    "Lachlan"      "Morton"    "lachlan@bah.com"
    "Rigoberto"    "Uran"      "rigo@uran.com"

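A few more things that could be tried, assuming the Customer class above is saved as Customer.m on the MATLAB path and c is the three-element array just created. The extra customer, the invalid address, and the file name customers.csv are placeholders for illustration:

```matlab
% Dependent properties work on every element, so this collects all full names.
names = [c.FullName];            % 1x3 string array

% char inputs are converted to string by the property validation.
d = Customer('Ada','Lovelace','ada@example.com');
disp(class(d.FirstName))         % string

% Setting an invalid email is rejected by mustBeValidEmail; c is left unchanged.
try
    c(1).Email = "not-an-email";
catch err
    disp(err.message)
end

% The derived table can be handed to standard export functions.
writetable(table(c),"customers.csv")
```
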
We can validate the inputs to our sendEmail function with a simple arguments block:

function sendEmail(customer,bodyText)
    
    arguments
        customer (1,1) Customer
        bodyText (1,1) string
    end
    
    % Other code...
    
end
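
A possible call, assuming Customer.m and sendEmail.m are both on the path; the message text is a placeholder. Passing something that MATLAB cannot convert to a Customer, such as the struct from earlier, fails during argument validation before any of the function body runs:

```matlab
customer = Customer("Mitch","Docker","mitch@foo.com");
sendEmail(customer,"Your order has shipped.")

% A struct is not a Customer, so the arguments block throws an error.
s = struct("FirstName","Mitch","LastName","Docker","Email","mitch@foo.com");
try
    sendEmail(s,"Your order has shipped.")
catch err
    disp(err.message)
end
```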

When should I define a custom class?

You may be wondering at what point you should go to the effort and formality of creating custom classes. Much like making the decision to go from a script to a function or a function to a class, it’s when you hit the limits of what your current implementation can do. If you find that:

It’s difficult to understand what’s in your data structure.
You have problems with validation.
Code that’s closely related to the data is duplicated, scattered, or inconsistent.

It’s time to think about custom classes.

Published with MATLAB® R2022a