3  The “Hello World” of TinyML

3.1 Introduction to Microcontroller-Based Machine Learning

Machine learning at the edge represents a significant paradigm shift in computational intelligence, enabling sophisticated inferencing capabilities on resource-constrained embedded systems such as the EFR32MG24 Wireless Gecko microcontroller. This chapter explores the theoretical foundations and practical implementations of TinyML specifically tailored for microcontroller deployment, with particular focus on the sine wave prediction model as the canonical “Hello World” example of TinyML.

The concept of a “Hello World” example has long been a tradition in programming, where new technologies are introduced with simple code that demonstrates basic functionality. In the domain of TinyML, our sine wave prediction serves as an elegant introduction to the end-to-end process of building, training, and deploying models to microcontrollers.

3.2 Theoretical Foundations of TinyML for Microcontrollers

3.2.1 The Computational Constraints Paradigm

Traditional machine learning systems operate under the assumption of abundant computational resources, where model complexity and size are secondary concerns to performance metrics. TinyML, however, inverts this paradigm, placing primary emphasis on resource efficiency while maintaining acceptable inferencing quality.

For the EFR32MG24 platform, with its ARM Cortex-M33 core, limited memory footprint (1536 KB of flash and 256 KB of RAM), and power-sensitive applications, we must consider:

  1. Memory-Constrained Learning: Operating within a 256 KB RAM budget necessitates models with minimal memory footprints
  2. Computation-Constrained Inference: The 78 MHz Cortex-M33 processor requires algorithmic optimizations to achieve real-time performance
  3. Energy-Constrained Execution: Battery-powered applications demand power-aware ML implementations

These constraints fundamentally reshape our approach to machine learning model design, training methodologies, and deployment strategies.

3.2.2 Model Compression and Quantization

Central to TinyML is the concept of model compression, which can be formalized as an optimization problem:

\[\min_{\theta'} \mathcal{L}(f_{\theta'}(X), Y) \quad \text{s.t.} \quad |\theta'| \ll |\theta|\]

Where \(\theta\) represents the parameters of the original model, \(\theta'\) the compressed model parameters, \(\mathcal{L}\) the loss function, and \(f_{\theta'}(X)\) the model predictions on input \(X\) compared against ground truth \(Y\).

Quantization—a key technique in this domain—transforms floating-point weights and activations to reduced-precision integers:

\[Q(w) = \text{round}\left(\frac{w}{\Delta}\right) \cdot \Delta\]

Where \(\Delta\) represents the quantization step size. This transformation reduces both memory requirements and computational complexity at the cost of some precision.
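
To make the formula concrete, here is a minimal NumPy sketch of symmetric 8-bit quantization, assuming a single per-tensor step size \(\Delta = \max|w| / 127\) (real converters typically derive \(\Delta\) per tensor or per channel from calibration data):

import numpy as np

# Hypothetical weight values, for illustration only
w = np.array([-0.82, -0.11, 0.03, 0.47, 0.95], dtype=np.float32)
delta = np.abs(w).max() / 127.0          # quantization step size
q = np.round(w / delta)                  # integer codes in [-127, 127]
w_hat = q * delta                        # dequantized approximation Q(w)
print("integer codes:", q.astype(np.int8))
print("max abs error:", np.abs(w - w_hat).max())  # bounded by delta / 2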

3.3 Building Our Sine Wave Model in Google Colab

3.3.1 Generating and Processing the Dataset

For our introductory TinyML example, we’ll create a sine wave predictor that learns to approximate the sine function. This represents an ideal starting point for several reasons:

  1. The sine function is mathematically well-defined and bounded
  2. The input-output relationship exhibits nonlinearity that requires proper model architecture
  3. The implementation can produce visually verifiable results on a microcontroller

Let’s begin by creating a Google Colab notebook to build and train our model. Open a new notebook and start with the following code to generate our training data:

import numpy as np
import math
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers
import os

# Generate a uniformly distributed set of random numbers in the range from
# 0 to 2π, which covers a complete sine wave oscillation
SAMPLES = 1000
np.random.seed(1337)
x_values = np.random.uniform(low=0, high=2*math.pi, size=SAMPLES)
# Shuffle the values to guarantee they're not in order
np.random.shuffle(x_values)
# Calculate the corresponding sine values
y_values = np.sin(x_values)

# Add a small random number to each y value to simulate noise
y_values += 0.1 * np.random.randn(*y_values.shape)

# Split into train/validation/test sets (60%/20%/20%, i.e. 600/200/200 samples)
TRAIN_SPLIT = int(0.6 * SAMPLES)
TEST_SPLIT = int(0.2 * SAMPLES + TRAIN_SPLIT)
x_train, x_validate, x_test = np.split(x_values, [TRAIN_SPLIT, TEST_SPLIT])
y_train, y_validate, y_test = np.split(y_values, [TRAIN_SPLIT, TEST_SPLIT])

# Plot our data points
plt.figure(figsize=(10, 6))
plt.scatter(x_train, y_train, label='Training data')
plt.scatter(x_validate, y_validate, label='Validation data')
plt.scatter(x_test, y_test, label='Test data')
plt.legend()
plt.title('Sine Wave with Random Noise')
plt.xlabel('x values')
plt.ylabel('y values (sine of x + noise)')
plt.show()

3.3.2 Constructing and Training the Neural Network Model

Now we’ll construct a simple neural network to learn the sine function:

# Create a model with two hidden layers of 16 neurons each
model = tf.keras.Sequential()
# First layer takes a scalar input and feeds it through 16 "neurons"
model.add(layers.Dense(16, activation='relu', input_shape=(1,)))
# Second layer with 16 neurons to capture non-linear relationships
model.add(layers.Dense(16, activation='relu'))
# Final layer is a single neuron for our output value
model.add(layers.Dense(1))
# Compile the model using a standard optimizer and loss function for regression
model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])

# Display model summary to understand its structure
model.summary()

# Train the model on our data
history = model.fit(x_train, y_train, 
                    epochs=500, 
                    batch_size=16,
                    validation_data=(x_validate, y_validate),
                    verbose=1)

# Plot the training and validation loss
plt.figure(figsize=(10, 6))
plt.plot(history.history['loss'], label='Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

# Plot the training and validation mean absolute error
plt.figure(figsize=(10, 6))
plt.plot(history.history['mae'], label='MAE')
plt.plot(history.history['val_mae'], label='Validation MAE')
plt.title('Training and Validation Mean Absolute Error')
plt.xlabel('Epoch')
plt.ylabel('MAE')
plt.legend()
plt.show()

# Evaluate the model on our test data
test_loss, test_mae = model.evaluate(x_test, y_test)
print(f'Test Loss: {test_loss:.4f}')
print(f'Test MAE: {test_mae:.4f}')

# Generate predictions across the full range for visualization
x_dense = np.linspace(0, 2*math.pi, 200)
y_dense_true = np.sin(x_dense)
y_dense_pred = model.predict(x_dense)

# Plot the true sine curve against our model's predictions
plt.figure(figsize=(10, 6))
plt.plot(x_dense, y_dense_true, 'b-', label='True Sine')
plt.plot(x_dense, y_dense_pred, 'r-', label='Model Prediction')
plt.scatter(x_test, y_test, alpha=0.3, label='Test Data')
plt.title('Sine Wave Prediction')
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.legend()
plt.show()

This architecture, though simple, is well suited to capturing the nonlinear relationship of the sine function. It has only 321 trainable parameters (32 + 272 + 17 across the three Dense layers), roughly 1.3 KB of float32 weights, which fits comfortably within the EFR32MG24's memory budget. The ReLU (Rectified Linear Unit) activation function is particularly important, as it introduces the nonlinearity:

\[\text{ReLU}(x) = \max(0, x)\]

We train the model using the mean squared error loss function, which for a regression problem is defined as:

\[\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2\]

Where \(y_i\) represents the actual sine value and \(\hat{y}_i\) represents our model’s prediction.
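
As a quick sanity check, the metrics reported by model.evaluate() above can be reproduced by hand from the model's test-set predictions:

# Recompute the test metrics manually; the values should match the
# test_loss (MSE) and test_mae printed by model.evaluate()
y_pred = model.predict(x_test).flatten()
mse = np.mean((y_test - y_pred) ** 2)
mae = np.mean(np.abs(y_test - y_pred))
print(f'Manual MSE: {mse:.4f}, manual MAE: {mae:.4f}')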

3.4 Optimizing for Microcontroller Deployment

3.4.1 Model Conversion and Quantization for TensorFlow Lite

To deploy our trained model to the EFR32MG24 microcontroller, we must convert it into a format suitable for resource-constrained devices:

# Convert the model to the TensorFlow Lite format without quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model to disk
with open("sine_model.tflite", "wb") as f:
    f.write(tflite_model)

# Convert with quantization for further optimization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Define a generator function that provides our test data's x values
# as a representative dataset
def representative_dataset_generator():
  for value in x_test:
    yield [np.array(value, dtype=np.float32, ndmin=2)]
    
converter.representative_dataset = representative_dataset_generator
tflite_model_quantized = converter.convert()

# Save the quantized model to disk
with open("sine_model_quantized.tflite", "wb") as f:
    f.write(tflite_model_quantized)

# Print the size reduction achieved through quantization
print(f"Original model size: {len(tflite_model)} bytes")
print(f"Quantized model size: {len(tflite_model_quantized)} bytes")
print(f"Size reduction: {(1 - len(tflite_model_quantized) / len(tflite_model)) * 100:.2f}%")

3.4.2 Converting to C Code for Embedded Systems

For deployment on microcontrollers like the EFR32MG24, we need to convert our quantized model into a C header file that can be directly included in our firmware:

# Function to convert the model to a C array
def convert_tflite_to_c_array(tflite_model, array_name):
    hex_data = ["0x{:02x}".format(byte) for byte in tflite_model]
    c_array = f"const unsigned char {array_name}[] = {{\n"
    
    # Format the hex data into rows
    chunk_size = 12
    for i in range(0, len(hex_data), chunk_size):
        c_array += "  " + ", ".join(hex_data[i:i+chunk_size]) + ",\n"
    
    c_array = c_array[:-2] + "\n};\n"
    c_array += f"const unsigned int {array_name}_len = {len(tflite_model)};\n"
    
    return c_array

# Generate the C array for our model
c_array = convert_tflite_to_c_array(tflite_model_quantized, "g_sine_model_data")

# Save to a header file
with open("sine_model_data.h", "w") as f:
    f.write("#ifndef SINE_MODEL_DATA_H_\n")
    f.write("#define SINE_MODEL_DATA_H_\n\n")
    f.write("#include <stdint.h>\n\n")
    f.write(c_array)
    f.write("\n#endif  // SINE_MODEL_DATA_H_\n")

print("C header file generated: sine_model_data.h")

# Download the files
from google.colab import files
files.download("sine_model.tflite")
files.download("sine_model_quantized.tflite")
files.download("sine_model_data.h")

3.5 Deploying with Simplicity Studio and Gecko SDK

Now that we have our trained model in a format suitable for microcontrollers, we’ll implement the TinyML application using Simplicity Studio and the Gecko SDK. This approach simplifies development by providing a structured framework for EFR32 devices.

3.5.1 Creating a New Project in Simplicity Studio

  1. Launch Simplicity Studio and connect your EFR32MG24 development board
  2. Select your device in the “Debug Adapters” view
  3. Click on “Create New Project” in the Launcher perspective
  4. Select “Silicon Labs Project Wizard” and click “Next”
  5. Choose “Gecko SDK” as the project type
  6. Filter for “example” and select “TensorFlow Lite Micro Example” template
  7. Configure project settings:
    • Name: sine_wave_predictor
    • SDK version: Latest version
    • Click “Next” and then “Finish”

3.5.2 Project Structure and Important Files

Simplicity Studio creates a project with the following important files:

  • app.c: Main application entry point
  • sl_tflite_micro_model.{h,c}: TensorFlow Lite Micro integration
  • sl_pwm.{h,c}: PWM control for LED output
  • sine_model_data.h: Our model data (to be replaced with our trained model)

3.5.3 Adding Our Trained Model

  1. In Simplicity Studio, locate the project’s inc folder
  2. Right-click and select “Import” → “General” → “File System”
  3. Browse to the location where you saved sine_model_data.h
  4. Select the file and click “Finish”

3.5.4 Implementing the Application Logic

Now we’ll modify the application code to use our sine wave model. Open app.c and replace its contents with the following:

/***************************************************************************//**
 * @file app.c
 * @brief TinyML Sine Wave Predictor application
 *******************************************************************************
 * # License
 * <b>Copyright 2023 Silicon Laboratories Inc. www.silabs.com</b>
 *******************************************************************************
 *
 * SPDX-License-Identifier: Zlib
 *
 * The licensor of this software is Silicon Laboratories Inc.
 *
 * This software is provided 'as-is', without any express or implied
 * warranty. In no event will the authors be held liable for any damages
 * arising from the use of this software.
 *
 * Permission is granted to anyone to use this software for any purpose,
 * including commercial applications, and to alter it and redistribute it
 * freely, subject to the following restrictions:
 *
 * 1. The origin of this software must not be misrepresented; you must not
 *    claim that you wrote the original software. If you use this software
 *    in a product, an acknowledgment in the product documentation would be
 *    appreciated but is not required.
 * 2. Altered source versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.
 * 3. This notice may not be removed or altered from any source distribution.
 *
 ******************************************************************************/
#include "sl_component_catalog.h"
#include "sl_system_init.h"
#include "app.h"
#if defined(SL_CATALOG_POWER_MANAGER_PRESENT)
#include "sl_power_manager.h"
#endif
#include "sl_system_process_action.h"

#include <stdio.h>
#include <math.h>

#include "sl_tflite_micro_model.h"
#include "sl_led.h"
#include "sl_pwm.h"
#include "sl_sleeptimer.h"

// Constants for sine wave demonstration
#define INFERENCES_PER_CYCLE  32
#define X_RANGE               (2.0f * 3.14159265359f)  // 2π radians
#define PWM_FREQUENCY_HZ      10000
#define INFERENCE_INTERVAL_MS 50

// Global variables
static int inference_count = 0;

void app_init(void)
{
  // Initialize TFLite model
  sl_status_t status = sl_tflite_micro_init();
  if (status != SL_STATUS_OK) {
    printf("Failed to initialize TensorFlow Lite Micro\n");
    return;
  }
  
  // Initialize PWM for LED control
  sl_pwm_config_t pwm_config = {
    .frequency = PWM_FREQUENCY_HZ,
    .polarity = SL_PWM_ACTIVE_HIGH
  };
  
  sl_pwm_init(SL_PWM_LED0, &pwm_config);
  
  printf("Sine Wave Predictor initialized\n");
}

void app_process_action(void)
{
  // Calculate x value based on our position in the cycle
  float position = (float)inference_count / (float)INFERENCES_PER_CYCLE;
  float x_val = position * X_RANGE;
  
  // Prepare the input tensor with our x value
  float input_data[1] = { x_val };
  sl_tflite_micro_tensor_t input_tensor;
  sl_status_t status = sl_tflite_micro_get_input_tensor(0, &input_tensor);
  if (status != SL_STATUS_OK) {
    printf("Failed to get input tensor\n");
    return;
  }
  
  // Copy our input data to the input tensor
  status = sl_tflite_micro_set_tensor_data(&input_tensor, input_data, sizeof(input_data));
  if (status != SL_STATUS_OK) {
    printf("Failed to set input tensor data\n");
    return;
  }
  
  // Run inference
  status = sl_tflite_micro_invoke();
  if (status != SL_STATUS_OK) {
    printf("Inference failed\n");
    return;
  }
  
  // Get the output tensor
  sl_tflite_micro_tensor_t output_tensor;
  status = sl_tflite_micro_get_output_tensor(0, &output_tensor);
  if (status != SL_STATUS_OK) {
    printf("Failed to get output tensor\n");
    return;
  }
  
  // Get the predicted sine value
  float predicted_sine = 0.0f;
  status = sl_tflite_micro_get_tensor_data(&output_tensor, &predicted_sine, sizeof(predicted_sine));
  if (status != SL_STATUS_OK) {
    printf("Failed to get output tensor data\n");
    return;
  }
  
  // Map the sine value (-1 to 1) to PWM duty cycle (0 to 100%)
  uint8_t duty_cycle = (uint8_t)((predicted_sine + 1.0f) * 50.0f);
  
  // Set LED brightness using PWM
  sl_pwm_set_duty_cycle(SL_PWM_LED0, duty_cycle);
  
  // Log the values (only every 8th inference to reduce console traffic)
  if (inference_count % 8 == 0) {
    printf("x: %.3f, predicted sine: %.3f, duty cycle: %d%%\n", 
           x_val, predicted_sine, duty_cycle);
  }
  
  // Increment the inference counter
  inference_count++;
  if (inference_count >= INFERENCES_PER_CYCLE) {
    inference_count = 0;
  }
  
  // Add a delay before the next inference
  sl_sleeptimer_delay_millisecond(INFERENCE_INTERVAL_MS);
}
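
The PWM mapping near the end of app_process_action() linearly rescales the model output from \([-1, 1]\) to a duty cycle in \([0, 100]\):

\[\text{duty} = (\hat{y} + 1) \times 50\]

so \(\hat{y} = -1\) turns the LED fully off, \(\hat{y} = 0\) yields 50% brightness, and \(\hat{y} = 1\) drives it fully on.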

3.5.5 Creating the Model Integration File

Create a new file called sl_tflite_micro_model.cc in the src folder (replacing the template's version) with the following content. The .cc extension matters: the wrapper uses TensorFlow Lite Micro's C++ API, so the file must be compiled as C++:

#include "sl_tflite_micro_model.h"
#include "sine_model_data.h"
#include <stdio.h>

// TensorFlow Lite for Microcontrollers components
#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/version.h"

// Global variables
static tflite::MicroErrorReporter micro_error_reporter;
static tflite::ErrorReporter* error_reporter = &micro_error_reporter;
static const tflite::Model* model = nullptr;
static tflite::MicroInterpreter* interpreter = nullptr;
static TfLiteTensor* input_tensor = nullptr;
static TfLiteTensor* output_tensor = nullptr;

// Create an area of memory for input, output, and intermediate arrays
constexpr int kTensorArenaSize = 8 * 1024;
static uint8_t tensor_arena[kTensorArenaSize];

sl_status_t sl_tflite_micro_init(void)
{
  // Map the model into a usable data structure
  model = tflite::GetModel(g_sine_model_data);
  if (model->version() != TFLITE_SCHEMA_VERSION) {
    printf("Model version mismatch: %d vs %d\n", model->version(), TFLITE_SCHEMA_VERSION);
    return SL_STATUS_FAIL;
  }
  
  // Create an all operations resolver
  static tflite::AllOpsResolver resolver;
  
  // Build an interpreter to run the model
  static tflite::MicroInterpreter static_interpreter(
    model, resolver, tensor_arena, kTensorArenaSize, error_reporter);
  interpreter = &static_interpreter;
  
  // Allocate memory for all tensors
  TfLiteStatus allocate_status = interpreter->AllocateTensors();
  if (allocate_status != kTfLiteOk) {
    printf("AllocateTensors() failed\n");
    return SL_STATUS_ALLOCATION_FAILED;
  }
  
  // Get pointers to the model's input and output tensors
  input_tensor = interpreter->input(0);
  output_tensor = interpreter->output(0);
  
  // Check that input and output tensors are the expected size and type
  if (input_tensor->dims->size != 2 || input_tensor->dims->data[0] != 1 || 
      input_tensor->dims->data[1] != 1 || input_tensor->type != kTfLiteFloat32) {
    printf("Unexpected input tensor format\n");
    return SL_STATUS_INVALID_PARAMETER;
  }
  
  if (output_tensor->dims->size != 2 || output_tensor->dims->data[0] != 1 || 
      output_tensor->dims->data[1] != 1 || output_tensor->type != kTfLiteFloat32) {
    printf("Unexpected output tensor format\n");
    return SL_STATUS_INVALID_PARAMETER;
  }
  
  printf("TensorFlow Lite Micro initialized successfully\n");
  return SL_STATUS_OK;
}

sl_status_t sl_tflite_micro_get_input_tensor(uint8_t index, sl_tflite_micro_tensor_t* tensor)
{
  if (interpreter == nullptr || index >= interpreter->inputs_size()) {
    return SL_STATUS_INVALID_PARAMETER;
  }
  
  tensor->tensor = interpreter->input(index);
  return SL_STATUS_OK;
}

sl_status_t sl_tflite_micro_get_output_tensor(uint8_t index, sl_tflite_micro_tensor_t* tensor)
{
  if (interpreter == nullptr || index >= interpreter->outputs_size()) {
    return SL_STATUS_INVALID_PARAMETER;
  }
  
  tensor->tensor = interpreter->output(index);
  return SL_STATUS_OK;
}

sl_status_t sl_tflite_micro_set_tensor_data(sl_tflite_micro_tensor_t* tensor, 
                                            const void* data, 
                                            size_t size)
{
  if (tensor == nullptr || tensor->tensor == nullptr || data == nullptr) {
    return SL_STATUS_NULL_POINTER;
  }
  
  // Size check based on tensor type and dims
  size_t tensor_size = 1;
  for (int i = 0; i < tensor->tensor->dims->size; i++) {
    tensor_size *= tensor->tensor->dims->data[i];
  }
  
  if (tensor->tensor->type == kTfLiteFloat32) {
    tensor_size *= sizeof(float);
  } else if (tensor->tensor->type == kTfLiteInt8) {
    tensor_size *= sizeof(int8_t);
  } else if (tensor->tensor->type == kTfLiteUInt8) {
    tensor_size *= sizeof(uint8_t);
  } else {
    return SL_STATUS_NOT_SUPPORTED;
  }
  
  if (size > tensor_size) {
    return SL_STATUS_WOULD_OVERFLOW;
  }
  
  // Copy the data to the tensor
  memcpy(tensor->tensor->data.raw, data, size);
  return SL_STATUS_OK;
}

sl_status_t sl_tflite_micro_get_tensor_data(sl_tflite_micro_tensor_t* tensor, 
                                            void* data, 
                                            size_t size)
{
  if (tensor == nullptr || tensor->tensor == nullptr || data == nullptr) {
    return SL_STATUS_NULL_POINTER;
  }
  
  // Size check based on tensor type and dims
  size_t tensor_size = 1;
  for (int i = 0; i < tensor->tensor->dims->size; i++) {
    tensor_size *= tensor->tensor->dims->data[i];
  }
  
  if (tensor->tensor->type == kTfLiteFloat32) {
    tensor_size *= sizeof(float);
  } else if (tensor->tensor->type == kTfLiteInt8) {
    tensor_size *= sizeof(int8_t);
  } else if (tensor->tensor->type == kTfLiteUInt8) {
    tensor_size *= sizeof(uint8_t);
  } else {
    return SL_STATUS_NOT_SUPPORTED;
  }
  
  if (size > tensor_size) {
    return SL_STATUS_WOULD_OVERFLOW;
  }
  
  // Copy the data from the tensor
  memcpy(data, tensor->tensor->data.raw, size);
  return SL_STATUS_OK;
}

sl_status_t sl_tflite_micro_invoke(void)
{
  if (interpreter == nullptr) {
    return SL_STATUS_NOT_INITIALIZED;
  }
  
  TfLiteStatus status = interpreter->Invoke();
  if (status != kTfLiteOk) {
    return SL_STATUS_FAIL;
  }
  
  return SL_STATUS_OK;
}
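
A note on kTensorArenaSize: 8 KB is a deliberately generous guess for a model this small. If AllocateTensors() fails, increase it; once allocation succeeds, recent TFLM versions let you query the actual requirement with interpreter->arena_used_bytes() and trim the arena accordingly.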

Now, create the header file sl_tflite_micro_model.h in the inc folder:

#ifndef SL_TFLITE_MICRO_MODEL_H
#define SL_TFLITE_MICRO_MODEL_H

#include "sl_status.h"
#include <stdint.h>
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

// Forward declaration of TensorFlow Lite's tensor type. TfLiteTensor is a
// plain C struct (defined in tensorflow/lite/c/common.h), not a member of
// the tflite namespace, so one declaration serves both C and C++ callers.
typedef struct TfLiteTensor TfLiteTensor;

// Tensor structure
typedef struct {
  TfLiteTensor* tensor;
} sl_tflite_micro_tensor_t;

/**
 * @brief Initialize TensorFlow Lite Micro with the sine model
 * 
 * @return sl_status_t SL_STATUS_OK if successful
 */
sl_status_t sl_tflite_micro_init(void);

/**
 * @brief Get an input tensor by index
 * 
 * @param index Index of the input tensor
 * @param tensor Pointer to the tensor structure to fill
 * @return sl_status_t SL_STATUS_OK if successful
 */
sl_status_t sl_tflite_micro_get_input_tensor(uint8_t index, sl_tflite_micro_tensor_t* tensor);

/**
 * @brief Get an output tensor by index
 * 
 * @param index Index of the output tensor
 * @param tensor Pointer to the tensor structure to fill
 * @return sl_status_t SL_STATUS_OK if successful
 */
sl_status_t sl_tflite_micro_get_output_tensor(uint8_t index, sl_tflite_micro_tensor_t* tensor);

/**
 * @brief Set data to a tensor
 * 
 * @param tensor Pointer to the tensor
 * @param data Pointer to the data to copy
 * @param size Size of the data in bytes
 * @return sl_status_t SL_STATUS_OK if successful
 */
sl_status_t sl_tflite_micro_set_tensor_data(sl_tflite_micro_tensor_t* tensor, 
                                          const void* data, 
                                          size_t size);

/**
 * @brief Get data from a tensor
 * 
 * @param tensor Pointer to the tensor
 * @param data Pointer to the buffer to receive the data
 * @param size Size of the buffer in bytes
 * @return sl_status_t SL_STATUS_OK if successful
 */
sl_status_t sl_tflite_micro_get_tensor_data(sl_tflite_micro_tensor_t* tensor, 
                                          void* data, 
                                          size_t size);

/**
 * @brief Run inference using the TensorFlow Lite model
 * 
 * @return sl_status_t SL_STATUS_OK if successful
 */
sl_status_t sl_tflite_micro_invoke(void);

#ifdef __cplusplus
}
#endif

#endif // SL_TFLITE_MICRO_MODEL_H

3.5.6 Building and Flashing the Application

  1. In Simplicity Studio, right-click on the project and select “Build Project”
  2. After successful compilation, right-click again and select “Run As” → “Silicon Labs ARM Program”
  3. The application will be flashed to your EFR32MG24 device and start running

3.5.7 Observing the Results

Once your application is running on the EFR32MG24 device:

  1. The LED will pulse with brightness that follows the sine wave pattern
  2. Open the Serial Console in Simplicity Studio to view debug output
  3. You’ll see logs showing the input value, predicted sine value, and the corresponding LED duty cycle

3.6 How it Works: Understanding the Implementation

Our TinyML sine wave predictor consists of several key components:

  1. Model Training and Conversion: Using Google Colab, we trained a small neural network to approximate the sine function and converted it to TF Lite format, then to a C array.

  2. TensorFlow Lite Micro Integration: We’ve implemented a simple wrapper around TF Lite Micro’s C++ API, providing a clean C interface for our application.

  3. Application Logic: The main application loop:

    • Calculates an x value based on where we are in the cycle
    • Feeds this value into the model
    • Retrieves the predicted sine value
    • Maps the prediction to LED brightness via PWM
  4. Visual Output: The LED brightness follows a sine wave pattern, providing visual confirmation that our model is working correctly.

3.7 Extending the TinyML Application

Now that we have our basic “Hello World” TinyML application running, there are several ways we can extend and enhance it:

3.7.1 Adding Multiple LED Support

For devices with multiple LEDs, we can create more interesting visual patterns by controlling multiple LEDs based on different phases of the sine wave:

// In app.c, add phase offsets for each LED
#define LED_COUNT 4  // Assuming 4 available LEDs
const float phase_offsets[LED_COUNT] = {
  0.0f,                  // LED0: No phase offset
  0.5f * X_RANGE / 4.0f, // LED1: 45 degrees offset
  X_RANGE / 4.0f,        // LED2: 90 degrees offset
  1.5f * X_RANGE / 4.0f  // LED3: 135 degrees offset
};

// Then in app_process_action(), add a loop to control all LEDs
for (int i = 0; i < LED_COUNT; i++) {
  // Calculate offset x value for this LED
  float led_x_val = x_val + phase_offsets[i];
  if (led_x_val >= X_RANGE) {
    led_x_val -= X_RANGE;  // Wrap around to stay in range
  }
  
  // Prepare input tensor with our x value
  float input_data[1] = { led_x_val };
  sl_tflite_micro_tensor_t input_tensor;
  status = sl_tflite_micro_get_input_tensor(0, &input_tensor);
  if (status != SL_STATUS_OK) continue;
  
  // Set tensor data, invoke model, and get output as before...
  
  // Set corresponding LED brightness
  uint8_t duty_cycle = (uint8_t)((predicted_sine + 1.0f) * 50.0f);
  sl_pwm_set_duty_cycle(i, duty_cycle);  // Assuming LED PWM instances are indexed
}

3.7.2 Adding LCD Display Support

If your EFR32MG24 development board includes an LCD display, you can visualize the sine wave more directly:

// In app.c, add LCD-related includes
#include "sl_glib.h"
#include "sl_simple_lcd.h"

// Add LCD dimensions and buffers
#define LCD_WIDTH  128
#define LCD_HEIGHT 64
#define HISTORY_LENGTH LCD_WIDTH
static float sine_history[HISTORY_LENGTH];
static int history_index = 0;

// Initialize the LCD in app_init()
sl_simple_lcd_init();
sl_glib_init();
// Clear history buffer
for (int i = 0; i < HISTORY_LENGTH; i++) {
  sine_history[i] = 0.0f;
}

// In app_process_action(), after getting the predicted sine:
// Store in history buffer
sine_history[history_index] = predicted_sine;
history_index = (history_index + 1) % HISTORY_LENGTH;

// Every few inferences, update the display
if (inference_count % 4 == 0) {
  GLIB_Context_t context;
  sl_glib_get_context(&context);
  
  // Clear display
  GLIB_clear(&context);
  
  // Draw x and y axes
  GLIB_drawLineH(&context, 0, LCD_WIDTH-1, LCD_HEIGHT/2);
  GLIB_drawLineV(&context, 0, 0, LCD_HEIGHT-1);
  
  // Draw the sine wave
  for (int i = 0; i < HISTORY_LENGTH-1; i++) {
    int x1 = i;
    int y1 = (int)(LCD_HEIGHT/2 - (sine_history[(history_index + i) % HISTORY_LENGTH] * LCD_HEIGHT/4));
    int x2 = i + 1;
    int y2 = (int)(LCD_HEIGHT/2 - (sine_history[(history_index + i + 1) % HISTORY_LENGTH] * LCD_HEIGHT/4));
    GLIB_drawLine(&context, x1, y1, x2, y2);
  }
  
  // Update display
  sl_glib_update_display();
}

3.7.3 Implementing Power Optimization

To make our TinyML application more power-efficient for battery-powered operation, we can add sleep modes between inferences:

// Replace the fixed delay with sleep mode
// Instead of: sl_sleeptimer_delay_millisecond(INFERENCE_INTERVAL_MS);

#if defined(SL_CATALOG_POWER_MANAGER_PRESENT)
  // Schedule next wakeup
  sl_sleeptimer_tick_t ticks = sl_sleeptimer_ms_to_tick(INFERENCE_INTERVAL_MS);
  sl_power_manager_schedule_wakeup(ticks, NULL, NULL);
  
  // Enter sleep mode
  sl_power_manager_sleep();
#else
  // Fall back to delay if power manager isn't available
  sl_sleeptimer_delay_millisecond(INFERENCE_INTERVAL_MS);
#endif
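
One caveat, depending on how deep the device sleeps: on EFR32 parts the standard TIMER peripherals that drive the PWM are not clocked in EM2, so the LED fade will freeze while asleep. If the output must keep running, restrict sleep to EM1 (for example with sl_power_manager_add_em_requirement(SL_POWER_MANAGER_EM1)), or accept the stepped behavior in exchange for the lower power draw.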

3.7.4 Enhanced User Interface with Buttons

We can use the buttons on the development board to control aspects of the application:

// Include button support
#include "sl_button.h"
#include "sl_simple_button.h"
#include "sl_simple_button_btn0_config.h"

// Add state variables
static bool paused = false;
static float speed_factor = 1.0f;

// In app_init, initialize buttons
sl_button_init(&sl_button_btn0);

// Check button state in app_process_action (note: this polls the button
// level, not edges, so holding the button toggles once per loop iteration;
// a real application would track the previous state or use the button callback)
if (sl_button_get_state(&sl_button_btn0) == SL_SIMPLE_BUTTON_PRESSED) {
  // Toggle pause state
  paused = !paused;
  printf("Application %s\n", paused ? "paused" : "resumed");
}

// Only update inference_count if not paused
if (!paused) {
  inference_count += 1;
  if (inference_count >= INFERENCES_PER_CYCLE) inference_count = 0;
}

3.7.5 Performance Profiling and Optimization

To understand and optimize the performance of our TinyML application, we can add timing measurements:

// Add profiling includes
#include "em_cmu.h"
#include "em_timer.h"

// Setup timer for profiling in app_init()
CMU_ClockEnable(cmuClock_TIMER1, true);
TIMER_Init_TypeDef timerInit = TIMER_INIT_DEFAULT;
TIMER_Init(TIMER1, &timerInit);

// In app_process_action(), measure inference time
// Reset timer
TIMER_CounterSet(TIMER1, 0);

// Start timer
TIMER_Enable(TIMER1, true);

// Run inference
status = sl_tflite_micro_invoke();

// Stop timer and read value
TIMER_Enable(TIMER1, false);
uint32_t ticks = TIMER_CounterGet(TIMER1);

// Convert ticks to microseconds (depends on timer clock)
uint32_t us = ticks / (CMU_ClockFreqGet(cmuClock_TIMER1) / 1000000);

// Log every Nth inference
if (inference_count % 10 == 0) {
  printf("Inference time: %lu microseconds\n", us);
}
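
As a worked example, assuming TIMER1 is clocked at the full 78 MHz core frequency (prescaler of 1), 78 ticks correspond to one microsecond, so a reading of 3,900 ticks would translate to:

\[t = \frac{3900\ \text{ticks}}{78\ \text{MHz} / 10^6} = 50\ \mu\text{s}\]

The actual tick rate depends on the clock tree and prescaler configuration, which is why the code derives it from CMU_ClockFreqGet() rather than hard-coding it.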

3.8 Building TinyML Applications with the Gecko SDK

Compared with integrating TensorFlow Lite for Microcontrollers by hand, the Gecko SDK approach offers several advantages for EFR32 developers:

3.8.1 Simplified Project Setup

The Gecko SDK provides a structured approach to project creation with built-in TinyML support:

  1. Project Templates: Simplicity Studio’s project wizard includes TinyML templates that set up the necessary directory structure, build configuration, and initial code.

  2. Integrated Build System: The SDK handles compiler flags, library dependencies, and linking, eliminating the need for custom Makefiles.

  3. Hardware Abstraction Layer (HAL): The SDK provides hardware-specific drivers and APIs for peripherals like PWM, GPIOs, and timers, making it easier to integrate TinyML with device hardware.

3.8.2 Streamlined Development Workflow

The development workflow with Simplicity Studio and Gecko SDK is straightforward:

  1. Model Training: Use Google Colab or TensorFlow on your computer to train and convert models.

  2. Project Creation: Launch Simplicity Studio, select the TensorFlow Lite Micro example template, and create a new project.

  3. Model Integration: Import your model header file into the project.

  4. Application Logic: Write code in C to initialize the model, prepare inputs, run inference, and process outputs.

  5. Build and Flash: Use Simplicity Studio’s integrated tools to compile the code and flash it to your device.

  6. Debug and Monitor: The Serial Console and Energy Profiler tools help monitor application behavior and optimize performance.

3.8.3 Hardware-Specific Optimizations

The Gecko SDK includes optimizations specifically for the EFR32 platform:

  1. Memory Optimization: Memory management is tuned for the EFR32’s memory architecture.

  2. Power Management: Integration with the Energy Management Unit (EMU) allows for fine-grained control over active, sleep, and deep sleep states.

  3. Peripheral Control: Direct access to hardware accelerators and peripherals that can enhance TinyML performance.

3.9 Conclusion

The sine wave predictor represents an elegant “Hello World” example of TinyML deployment on the EFR32MG24 platform. While seemingly simple, this implementation encompasses all the key elements of machine learning on microcontrollers:

  1. Model design with consideration for resource constraints
  2. Training and evaluation on a synthetic, noisy sine dataset
  3. Quantization and optimization for embedded deployment
  4. Integration with Simplicity Studio and Gecko SDK
  5. Hardware output integration via GPIO and PWM capabilities

Through this foundation, developers can extend to more complex TinyML applications on the EFR32MG24, such as sensor fusion, predictive maintenance, anomaly detection, and keyword spotting—all within a power envelope suitable for long-term battery-powered operation.

The techniques demonstrated here—model quantization, C code generation, and deployment with Simplicity Studio—provide a template that can be adapted for more sophisticated machine learning tasks, enabling a new class of intelligent edge devices based on the EFR32MG24 platform.