Secure/Non-secure Part 2

lib/v3.0 Prototype: Secure/Non-secure programs – implementation STM32.

Overview, Purpose

In Secure/Non-secure Part 1 we explored some basics and mechanisms of Secure and Non-secure programs and library modules using Oberon and Astrobe for RP2350. Secure and Non-secure are terms related to ARM's TrustZone technology, which we find in the RP2350 as well as in other MCUs based on the Cortex-M33 processor from ST and NXP.

C/C++ programming tools, such as gcc, provide a special mode to create Secure programs. I am using Astrobe here beyond its purpose and specifications. Nonetheless, Astrobe has again proven to be a reliable and versatile tool that allows – and graciously endures – pushing its boundaries.

In Part 1, the basic mechanics were explored and implemented entirely in the Non-secure world. In this Part 2, we're now tackling the actual separation and isolation of Secure and Non-secure programs. The STM32U585 MCU comes from the factory with TrustZone disabled, that is, it works as if the corresponding features don't even exist. For the experiment described here, I have now enabled TrustZone.

Stay tuned for an RP2350 variant. Real Soon Now.

Motivation: Robustness

My interest in Secure/Non-secure separation of control programs lies in increasing the robustness of the system. Protection against attacks, or even of intellectual property, is not my focus for now. I assume that a programmer, or teams of programmers, have access to all source code, design and implement the system as a whole, and draw the line between Secure and Non-secure domains based on robustness criteria, for example to protect certain peripheral devices or data from unintended access due to programming errors or other defects.

Anatomy of a Secure and Non-Secure Program Combo

Pro memoria, here are the relationships and the control flow for a program consisting of both Non-secure and Secure parts.

Non-secure world        Secure world
                        +------------+
             reset ---> | Program S  |
                        +------------+
+------------+    start   |    | call
|            |<-----------+    V
| Program NS |          +------------+
|            |--------->| lib S0     |
+------------+    call  +------------+
       | call                  ^
       V                       |
+------------+     call        |
| Lib NS0    |-----------------+
+------------+

The program as a whole always starts with the Secure program, that is, after reset, the code must run as Secure privileged. The Secure code sets up all the memory separation and other resource isolation components and parameters.
The Secure program then passes control to the Non-secure program, which is usually the actual control program.
The Non-secure program and its library modules can make use of the Secure library modules.
The Secure software can access Non-secure memory, including calling procedures, and load/store operations in SRAM (not depicted above).
Program development is usually organised in two projects, one for Secure and one for the Non-secure world. The projects can be staffed by the same teams, of course, but the code of the two projects is completely separated on a technical level, with the Secure side only giving interface (gateway) code to the Non-secure side, but not the Secure modules, which are never directly imported into Non-secure modules.
In fact, both projects each result in separate program images that are loaded into the MCU program memory at address regions with different security attributes. The Non-secure project can update its modules and the resulting program image, with the same Secure code staying unchanged in place in the physical memory. The Secure project can change its code independently as well – under certain conditions, which should become clear below.
The Secure code usually provides services to the Non-secure program. For this, the Secure developers provide an Oberon Non-secure interface module – part of the aforementioned gateway – that is linked into the Non-secure program, which will then execute the protocol to invoke the corresponding procedures contained in the Secure image.
The Secure code is not directly accessible to the Non-secure code. TrustZone and other security components enforce a strict separation in hardware. The separation parameters and their corresponding implementation in hardware are either defined in non-volatile memory in the MCU, and loaded upon reset before the software even runs, or defined by the Secure program after reset.

TrustZone and Other Security Components

The complete security architecture of an MCU can be pretty complex. It's important to realise that TrustZone alone cannot provide the complete security – additional components are required. Usually, TrustZone only restricts the CPU and so-called TrustZone-aware peripheral devices. Apart from the processor, there can be other bus managers (controllers, "masters"), and the devices that are not natively TrustZone-aware may need to be isolated from Non-secure access as well. This is where the additional security components come into play.

As said, all security measures are enforced by hardware, either in the bus itself, or by security elements between the bus and the devices. TrustZone is enforced by the AHB itself, which carries "side-band" signals to designate transactions as either Secure or Non-secure.

We'll cover some of the security components below.

TrustZone Basics

Code is Secure if it fetches instructions from a Secure address, Non-secure if it does so from a Non-secure address. The address can be in flash memory or SRAM.

TrustZone defines two major hardware units to control Secure and Non-secure addresses:

the Implementation Defined Attribution Unit (IDAU), and
the Security Attribution Unit (SAU).

The IDAU is a controller that is usually completely defined and fixed in hardware, ie. without configuration options by the software, while the SAU must be configured by (Secure) software. IDAU and SAU work in tandem.

The IDAU divides the whole address space of the Cortex-M33 into segments: Non-secure, Secure, Non-secure Callable, and Exempt. Which security attribution is provided is up to the MCU designer/vendor. The IDAU of the STM32U585 only defines address ranges for Non-secure and Non-secure Callable, but not for the other security categories. Other MCU designers make different choices.

The SAU allows defining up to eight (STM32U585) regions that are overlaid on the IDAU-defined scheme. Note that SAU regions can only "downgrade" the security level, to either Non-secure or Non-secure Callable. Secure addresses are achieved by the base rule that with the SAU enabled, all addresses are Secure unless a region says otherwise.

There's a lot more to TrustZone, for example:

the stack pointers (MSP and PSP) are banked between security states, as are their limit check registers;
some configuration and status registers are banked between states, some are not; in some of these registers, only single bits or bit-fields are banked;
the NVIC is banked, each with its own vector table, and exceptions can run as Secure or Non-secure;
exception handling, and the related entry stacking, needs to take into account that a Non-secure exception handler can interrupt Secure code, and therefore protect the stacked values, and clear the CPU and FPU registers;

and so on.

Other Security Components

Since TrustZone has a limited range of protection, additional components are required to isolate other bus managers and certain peripheral devices.

The STM32U585 uses the following devices to extend TrustZone, that is IDAU and SAU:

the flash memory controller, and
two Global TrustZone controllers (GTZC1 and GTZC2).

The flash memory controller can define banks and pages as Secure. The basic set-up is achieved with two watermark regions, one for each bank. These watermarks are loaded from non-volatile flash memory at reset, and can only be changed by Secure software via a procedure based on memory-mapped register access – ie. not flash memory programming – that also requires a reset, after which the MCU always starts in Secure privileged mode. A common set-up is to define flash bank1 as Secure, and bank2 as Non-secure using these watermarks.

The two GTZCs are used to:

define peripheral devices that are not TrustZone-aware as either Secure or Non-secure. By default, after reset, they are Non-secure;
define SRAM regions as Secure or Non-secure. By default, after reset, they are Secure. A common set-up is to configure SRAM1 as Secure, SRAM3 as Non-secure.

The GTZCs also have functionality to flag security violations, including triggering corresponding interrupts.

Address Aliasing

Another aspect of Secure and Non-secure separation is address aliasing. The same hardware can be accessed via two different addresses, one Secure and one Non-secure.

Just as an aside, SRAM can be accessed via the instruction (code) bus of the Cortex-M33 as well as the system bus. This again is determined by another set of alias ranges.

Security Configuration

All the above – and all I didn't cover – components and configurations play together to define any specific address as Secure, Non-secure, or Non-secure Callable. From the above very cursory description you can infer that it's possible to configure all the different security components with "contradictions", ie. in a way that one component inhibits the intended effect of the other.

Here's the configuration chosen for this experimental program.

Flash Memory

The STM32U585 has two 1M banks of flash memory. Using the aforementioned watermarks, the flash controller is parametrised to configure the flash memory as follows upon reset:

Bank1: Secure, for the Secure address range starting at 0C000000H;
Bank2: Non-Secure, for the Non-secure address range starting at 08100000H.

Note that address 08100000H actually accesses a higher physical memory cell than 0C000000H due to aliasing. 0C100000H and 08100000H access the same physical memory cell, the former Secure, the latter Non-secure.
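The aliasing arithmetic can be checked on a host with a few lines of Python. This is only a sketch: the function name is made up for illustration, and the Non-secure alias base 08000000H is inferred from the statement above that 0C100000H and 08100000H address the same cell.

```python
# Sketch: map Secure and Non-secure flash alias addresses to a physical offset.
NS_FLASH_BASE = 0x08000000  # Non-secure alias base (inferred from the text)
S_FLASH_BASE = 0x0C000000   # Secure alias base, as used in the text
FLASH_SIZE = 0x200000       # two 1M banks


def flash_offset(addr: int) -> int:
    """Return the offset into physical flash for either alias address."""
    for base in (NS_FLASH_BASE, S_FLASH_BASE):
        if base <= addr < base + FLASH_SIZE:
            return addr - base
    raise ValueError(f"{addr:#010x} is not a flash alias address")


for addr in (0x0C000000, 0x0C100000, 0x08100000):
    print(f"{addr:#010x} -> flash offset {flash_offset(addr):#08x}")
```

Both 0C100000H and 08100000H map to offset 100000H, ie. the first cell of bank2, while 0C000000H maps to offset 0, the first cell of bank1.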

As outlined above, this flash memory configuration is stored in non-volatile memory and will survive a reset. That is, the Secure program does not need to (re-) configure this at each run.

Caution: you can lock yourself out of the MCU with certain parameter combinations. What we want to set here are just the two watermarks for flash bank1 and bank2.^[1]

GTZC1

Using GTZC1, the Secure program configures SRAM as follows (the GTZC config is volatile):

SRAM1: Secure, for the Secure address range starting at 030000000H;
SRAM3: Non-secure, for the Non-secure address range starting at 020040000H;

As with flash memory, 020040000H accesses a higher physical address than 030000000H due to aliasing.
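The GTZC SRAM configuration is block-based, and module S below configures 12 super-blocks for SRAM1 and 32 for SRAM3. The following sketch shows where these counts plausibly come from. The block size of 512 bytes and the 32 blocks per super-block are my assumptions about the GTZC granularity, to be verified against the reference manual; the address ranges are the ones used in this article.

```python
# Sketch: derive the number of GTZC super-blocks per SRAM from its address range.
BLOCK_SIZE = 512             # bytes per block (assumption, check the reference manual)
BLOCKS_PER_SUPER_BLOCK = 32  # blocks per super-block (assumption)

SRAM1 = (0x30000000, 0x3002FFFF)  # Secure alias range used in this article
SRAM3 = (0x20040000, 0x200BFFFF)  # Non-secure alias range used in this article


def super_blocks(first_addr: int, last_addr: int) -> int:
    """Return the number of super-blocks needed to cover the address range."""
    size = last_addr - first_addr + 1
    super_block_size = BLOCK_SIZE * BLOCKS_PER_SUPER_BLOCK
    if size % super_block_size != 0:
        raise ValueError("range is not a whole number of super-blocks")
    return size // super_block_size


print("SRAM1 super-blocks:", super_blocks(*SRAM1))
print("SRAM3 super-blocks:", super_blocks(*SRAM3))
```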

With the SAU enabled, all addresses are Secure, hence we need to consistently overlay the above address ranges with corresponding SAU regions that downgrade the "all Secure" to what we want to achieve.

The Secure program configures the following SAU regions (the SAU config is volatile):

Region 0: 0C0FE000H to 0C0FFFFFH as Non-secure Callable flash
Region 1: 08100000H to 081FFFFFH as Non-secure flash
Region 2: 020040000H to 0200BFFFFH as Non-secure SRAM
Region 3: 040000000H to 04FFFFFFFH as Non-secure peripherals

The Non-secure Callable (NSC) flash memory will contain the gateway code from the Non-secure to the Secure world. We put NSC in the uppermost 8k of the Secure flash memory (0C0FE000H to 0C0FFFFFH spans 2000H bytes). I'll come back to NSC memory below.
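To see how the SAU regions and the IDAU combine, here is a small host-side model. It is a sketch under stated assumptions: the IDAU function is a simplified stand-in that only knows the two Secure alias ranges used in this article, and all names are hypothetical. The combination rule used is the Armv8-M one: the more secure of the IDAU and SAU attributions wins.

```python
# Sketch: resolve the security attribution of an address from SAU and IDAU.
from enum import IntEnum


class Attr(IntEnum):
    NS = 0   # Non-secure
    NSC = 1  # Non-secure Callable (Secure memory with gateway entry points)
    S = 2    # Secure


# SAU regions as configured by the Secure program (base, limit, attribution)
SAU_REGIONS = [
    (0x0C0FE000, 0x0C0FFFFF, Attr.NSC),  # region 0: NSC flash
    (0x08100000, 0x081FFFFF, Attr.NS),   # region 1: Non-secure flash
    (0x20040000, 0x200BFFFF, Attr.NS),   # region 2: Non-secure SRAM
    (0x40000000, 0x4FFFFFFF, Attr.NS),   # region 3: Non-secure peripherals
]

# Simplified IDAU stand-in (assumption): Secure alias ranges report NSC
IDAU_NSC_RANGES = [
    (0x0C000000, 0x0C0FFFFF),  # Secure flash alias, bank1
    (0x30000000, 0x3002FFFF),  # Secure SRAM1 alias
]


def sau_attr(addr: int) -> Attr:
    """With the SAU enabled, an address without a matching region is Secure."""
    for base, limit, attr in SAU_REGIONS:
        if base <= addr <= limit:
            return attr
    return Attr.S


def idau_attr(addr: int) -> Attr:
    for base, limit in IDAU_NSC_RANGES:
        if base <= addr <= limit:
            return Attr.NSC
    return Attr.NS


def attribute(addr: int) -> Attr:
    """The more secure of the two attributions is the effective one."""
    return max(idau_attr(addr), sau_attr(addr))


for addr in (0x0C000000, 0x0C0FE000, 0x08100000, 0x20040000, 0x30000000):
    print(f"{addr:#010x}: {attribute(addr).name}")
```

The model ignores the flash watermarks and the GTZC, which must be configured consistently with these attributions on the real MCU.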

Memory Secure Configuration Overview

flash memory                                    SRAM
+------------------+ -                    SRAM3 +------------------+ -
| 0081FFFFFH       | ^                          | 0200BFFFFH       | ^
|                  | |                          |                  | |
| Non-secure       | |                          | Non-secure       | |
| SAU: region 1    | | FLASH controller         | SAU region 2     | | GTZC
| IDAU: Non-secure | | watermark-based          | IDAU: Non-secure | | Non-secure
~                  ~ | non-volatile cfg         ~                  ~ | volatile cfg
|                  | |                          |                  | |
| 008100000H       | V                          | 020040000H       | V
+------------------+ -                          +------------------+ -
+------------------+ -                          +------------------+
| 00C0FFFFFH       | ^                    SRAM2 | unused           |
|                  | |                          +------------------+
| NSC              | |                          +------------------+ -
| SAU: region 0    | |                    SRAM1 | 03002FFFFH       | ^
| IDAU: NSC        | |                          |                  | |
|                  | |                          | Secure           | | GTZC
| 00C0FE000H       | |                          | SAU: no region   | | Secure
+------------------+ | FLASH controller         | IDAU: NSC        | | block-based
| 00C0FDFFFH       | | Secure                   ~                  ~ | volatile cfg
|                  | | watermark-based          |                  | |
| Secure           | | non-volatile cfg         |  030000000H      | V
| SAU: no region   | |                          +------------------+ -
| IDAU: NSC        | |
~                  ~ |
|                  | |
| 00C000000H       | V
+------------------+ -

Lower physical addresses are at the bottom of the figure.
No SAU region means Secure.

Test Program Overview

This test program builds upon the one used in Secure/Non-secure Part 1. It comprises:

module S: the Secure program, which runs after a reset;
module NS: the Non-secure program, which will be started by S;
module S0: a Secure module, which is used by S and NS.

Module S0 implements procedures, slightly artificially defined to explore and demonstrate the Secure/Non-secure separation. Notably, Non-secure program NS uses Secure module S0 across this separation boundary.

Tool gen-secure

gen-secure is a Python program in the tools directory. It is used to generate

an Oberon interface module NS_S0 for Secure module S0 in the Non-secure world. The Non-secure program NS can import NS_S0 to call procedures in S0 across the security boundary;
a binary file that contains the gateway code from Non-secure NS_S0 to Secure S0. This is the code to be loaded in the Non-secure Callable program region as outlined above.

Note the substantial changes to the previous version of gen-secure described and used in Secure/Non-secure Part 1:

there's no more Oberon module NSC_S0 that needs to be compiled to get the gateway binary, since this new version generates that binary image directly;
in fact, we now only need to run gen-secure once to get all the necessary results to enable NS to call procedures in S0. No more make and fixup steps.

I'll show the use of gen-secure below.

Directory and Module Set-up

The test program directory Secure1 contains two directories ns and s. Directory ns contains the Non-secure program module NS (and later also NS_S0), directory s the Secure program module S and library module S0. Note: no directory nsc anymore.

gen-secure relies on this directory structure. The directories correspond to the separation of Secure and Non-secure development projects.

Module Main is empty in both directories. In a real application, they will contain the usual set-up code for clocks, run-time error handling, serial terminal output, and probably also a standard SAU and GTZC configuration for the Secure program.

Modules S (Secure Program), NS (Non-secure Program), and S0 (Secure library)

I have artificially extracted two procedures from the original MVP test program into module S0, so that NS can call these procedures across the image boundaries.

(*! SEC *)
MODULE S;
  IMPORT SYSTEM, Main, MCU := MCU2, S0, GTZC, SAU, Secure;

  CONST
    (* Secure GPIOH registers *)
    GPIOH_BSSR = MCU.GPIOH_BASE + MCU.GPIO_BSRR_Offset + MCU.PERI_S_Offset;
    GPIOH_MODER = MCU.GPIOH_BASE + MCU.GPIO_MODER_Offset + MCU.PERI_S_Offset;
    GPIOH_OSPEEDR = MCU.GPIOH_BASE + MCU.GPIO_OSPEEDR_Offset + MCU.PERI_S_Offset;
    GPIOH_SECCFGR = MCU.PORTH + MCU.GPIO_SECCFGR_Offset + MCU.PERI_S_Offset;

    (* Non-secure program flash address *)
    NSimageAddr = 08100000H;

    LEDred   = 6; (* GPIOH *)
    MODER_Out = 1;
    OSPEED_High = 2;

  PROCEDURE cfgSRAM;
  BEGIN
    (* set all super-blocks of SRAM1 to Secure *)
    GTZC.ConfigSRAMsecRange(GTZC.SRAM1, 0, 12, GTZC.AllBlocksSecure);
    (* set all super-blocks of SRAM3 to Non-secure *)
    GTZC.ConfigSRAMsecRange(GTZC.SRAM3, 0, 32, GTZC.AllBlocksNonSecure)
  END cfgSRAM;

  PROCEDURE cfgGPIO; (* Secure part *)
    VAR val: SET; reg, devNo: INTEGER;
  BEGIN
    (* enable GPIOH clock, module CLK style *)
    reg := MCU.DEV_GPIOH DIV 32;
    reg := MCU.RCC_AHB1ENR + MCU.PERI_S_Offset + (reg * 4);
    devNo := MCU.DEV_GPIOH MOD 32;
    SYSTEM.GET(reg, val);
    val := val + {devNo};
    SYSTEM.PUT(reg, val);

    (* set all GPIOH pins to Non-secure, apart from LEDred *)
    SYSTEM.PUT(GPIOH_SECCFGR, {LEDred});

    (* config Secure LEDred pin *)
    S0.SetBits2(LEDred, GPIOH_MODER, MODER_Out);
    S0.SetBits2(LEDred, GPIOH_OSPEEDR, OSPEED_High);

    (* LEDred off *)
    SYSTEM.PUT(GPIOH_BSSR, {LEDred});
  END cfgGPIO;

  PROCEDURE configSAU;
    CONST Enabled = 1; Disabled = 0;
    VAR cfg: SAU.RegionCfg; r: INTEGER;
  BEGIN
    SAU.Enable;
    r := 0;
    (* flash NSC, top 8k of bank1 *)
    cfg.baseAddr := 0C0FE000H;
    cfg.limitAddr := 0C0FFFFFH;
    cfg.nsc := Enabled;
    SAU.ConfigRegion(r, cfg);
    INC(r);
    (* flash NS, all of bank2 *)
    cfg.baseAddr := 08100000H;
    cfg.limitAddr := 081FFFFFH;
    cfg.nsc := Disabled;
    SAU.ConfigRegion(r, cfg);
    INC(r);
    (* sram3 NS, all blocks *)
    cfg.baseAddr := 020040000H;
    cfg.limitAddr := 0200BFFFFH;
    cfg.nsc := Disabled;
    SAU.ConfigRegion(r, cfg);
    INC(r);
    (* peripheral devices NS, all devices *)
    cfg.baseAddr := 40000000H;
    cfg.limitAddr := 4FFFFFFFH;
    cfg.nsc := Disabled;
    SAU.ConfigRegion(r, cfg);
    INC(r);
    WHILE r < SAU.NumRegions DO
      SAU.DisableRegion(r);
      INC(r)
    END
  END configSAU;

BEGIN
  configSAU;
  cfgSRAM;
  cfgGPIO;
  Secure.StartNonSecure(NSimageAddr)
END S.

GTZC, SAU, and Secure are preliminary, nascent modules to make use of the functionality of these MCU components.
The code structure and the comments above should be self-explanatory.
GPIOH is a TrustZone-aware device, which is Secure after reset. Its security state needs to be configured specifically, pin by pin. To demonstrate how Secure and Non-secure code can use pins in the same GPIO port, LEDred (GPIOH pin 6) will be operated by Secure code via S0, LEDgreen (GPIOH pin 7) by Non-secure code in program module NS. Therefore, S sets all GPIOH pins to Non-secure, apart from LEDred, and configures LEDred using Secure GPIOH register addresses. LEDgreen will be configured by NS using Non-secure GPIOH register addresses.

MODULE NS;
  IMPORT SYSTEM, MCU := MCU2, S0 := NS_S0, Main;

  CONST
    (* Non-secure GPIOH registers *)
    GPIOH_BSSR = MCU.GPIOH_BASE + MCU.GPIO_BSRR_Offset;
    GPIOH_MODER = MCU.GPIOH_BASE + MCU.GPIO_MODER_Offset;
    GPIOH_OSPEEDR = MCU.GPIOH_BASE + MCU.GPIO_OSPEEDR_Offset;

    LEDgreen = 7; (* GPIOH *)
    LEDred   = 6;
    MODER_Out = 1;
    OSPEED_High = 2;

  PROCEDURE cfgGPIO; (* Non-secure part *)
  (* set-up GPIOH for Non-secure pin LEDgreen *)
  BEGIN
    (* bus clock has been enabled by Secure program *)
    (* GPIOH pins have been set Non-secure by Secure program *)
    S0.SetBits2(LEDgreen, GPIOH_MODER, MODER_Out);
    S0.SetBits2(LEDgreen, GPIOH_OSPEEDR, OSPEED_High);
    SYSTEM.PUT(GPIOH_BSSR, {LEDgreen})
  END cfgGPIO;

  PROCEDURE toggleLED(VAR led: SET);
    CONST Mask = {LEDgreen, LEDgreen + 16};
  BEGIN
    SYSTEM.PUT(GPIOH_BSSR, led);
    led := led / Mask
  END toggleLED;

  PROCEDURE run;
    VAR i: INTEGER; ledRed, ledGreen: SET;
  BEGIN
    ledRed := {LEDred};
    ledGreen := {LEDgreen + 16};
    REPEAT
      S0.ToggleLED(ledRed);
      toggleLED(ledGreen);
      i := 0;
      WHILE i < 100000 DO INC(i) END
    UNTIL FALSE
  END run;

BEGIN
  cfgGPIO;
  run
END NS.

NS configures GPIOH for LEDgreen through Non-secure register addresses, as enabled by the Secure program;
NS contains the Non-secure pin-toggling procedure toggleLED, see module S0 below for the Secure version for LEDred;
the program simply toggles both LEDs in run with an ugly busy waiting loop, but hey, simplicity was important for this experiment.
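The toggling trick in toggleLED relies on the GPIO bit set/reset register: bits 0 to 15 set a pin, bits 16 to 31 reset it, so flipping both bits of the mask after each write alternates between "set" and "reset". The following host-side sketch simulates that; the function names and the modelled output register are illustrative only.

```python
# Sketch: simulate the BSRR-based LED toggling of procedure toggleLED.
LED_GREEN = 7  # GPIOH pin number, as in module NS


def write_bsrr(odr: int, value: int) -> int:
    """Apply a BSRR write to a modelled output data register value."""
    set_bits = value & 0xFFFF
    reset_bits = (value >> 16) & 0xFFFF
    # set bits take priority if both are written for the same pin
    return (odr & ~reset_bits) | set_bits


def toggle_led(odr: int, led: int) -> tuple[int, int]:
    """Write the current set/reset pattern, then flip it for the next call."""
    mask = (1 << LED_GREEN) | (1 << (LED_GREEN + 16))
    odr = write_bsrr(odr, led)
    return odr, led ^ mask  # Oberon: led := led / Mask


odr = 1 << LED_GREEN         # pin set by cfgGPIO
led = 1 << (LED_GREEN + 16)  # first write resets the pin
for _ in range(4):
    odr, led = toggle_led(odr, led)
    print(f"pin {LED_GREEN}: {(odr >> LED_GREEN) & 1}")
```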

(*! SEC *)
MODULE S0;
  IMPORT SYSTEM, MCU := MCU2;

  CONST
    GPIOH_BSSR = MCU.GPIOH_BASE + MCU.GPIO_BSRR_Offset + MCU.PERI_S_Offset;

    LEDred   = 6; (* GPIOH *)

  PROCEDURE testLR(x: INTEGER);
  BEGIN
    x := 42
  END testLR;

  PROCEDURE* ToggleLED*(VAR led: SET);
    CONST Mask = {LEDred, LEDred + 16};
  BEGIN
    SYSTEM.PUT(GPIOH_BSSR, led);
    led := led / Mask;

    (* manually inserted Secure epilogue *)
    (* no add sp,#n as leaf procedure *)
    SYSTEM.EMIT(MCU.POP_LR);
    SYSTEM.EMITH(MCU.BXNS_LR) (* ok within Secure code *)
  END ToggleLED;

  PROCEDURE SetBits2*(pin, addr, twoBitValue: INTEGER);
  (* could be leaf, but is not for testing purposes *)
    VAR val, mask: SET;
  BEGIN
    twoBitValue := twoBitValue MOD 04H;
    twoBitValue := LSL(twoBitValue, pin * 2);
    SYSTEM.GET(addr, val);
    mask := BITS(LSL(03H, pin * 2));
    val := val * (-mask);
    val := val + BITS(twoBitValue);
    SYSTEM.PUT(addr, val);
    testLR(42);

    (* manually inserted Secure epilogue *)
    SYSTEM.EMITH(MCU.ADD_SP + 05H); (* add sp,#20 *)
    SYSTEM.EMIT(MCU.POP_LR);
    SYSTEM.EMITH(MCU.BXNS_LR)
  END SetBits2;

  PROCEDURE Test*(x, v: INTEGER);
    VAR z: INTEGER;
  BEGIN
    (* ... stuff ... *)
    (* manually inserted Secure epilogue *)
    SYSTEM.EMITH(MCU.ADD_SP + 03H); (* add sp,#12 *)
    SYSTEM.EMIT(MCU.POP_LR);
    SYSTEM.EMITH(MCU.BXNS_LR)
  END Test;

END S0.

Note the manually inserted Secure procedure epilogues. A Secure procedure must return with bxns lr. These epilogues would be generated by a Secure compiler in lieu of the current standard ones (which are generated here, too, of course, but ignored).
Program module S also uses S0, with the bxns lr returns. The processor interprets this correctly, basically as bx lr, since bit 0 has not been set to 0 by the secure gateway instruction due to the direct Secure-to-Secure call, see below.
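The role of bit 0 of the link register can be modelled in a few lines. This is a sketch of the architectural behaviour as I understand it, with made-up function names and example return addresses: SG clears bit 0 of LR only when entered from the Non-secure state, and bxns switches to the Non-secure state only if bit 0 of its target is clear.

```python
# Sketch: model how SG and BXNS cooperate via bit 0 of LR.
def sg(lr: int, from_non_secure: bool) -> int:
    """SG clears LR bit 0 on entry from Non-secure, else leaves LR alone."""
    return lr & ~1 if from_non_secure else lr


def bxns_state(target: int) -> str:
    """Return the security state after 'bxns' to the given target value."""
    return "Non-secure" if target & 1 == 0 else "Secure"


# Direct Secure-to-Secure call: BL sets bit 0 (Thumb), no SG is executed
lr_direct = 0x0C000125  # hypothetical return address in Secure flash
print("direct call returns to:", bxns_state(lr_direct))

# Call from Non-secure code through the gateway: SG clears bit 0
lr_gateway = sg(0x08100235, from_non_secure=True)  # hypothetical NS return address
print("gateway call returns to:", bxns_state(lr_gateway))
```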

Starting the Non-secure Program

Secure program S starts the Non-secure program NS by calling Secure.StartNonSecure(NSimageAddr), with NSimageAddr being the flash memory address where the Non-secure program image will be loaded:

  PROCEDURE* StartNonSecure*(imageAddr: INTEGER);
    CONST R11 = 11;
    VAR val: INTEGER;
  BEGIN
    (* VTOR *)
    SYSTEM.PUT(MCU.PPB_VTOR + MCU.PPB_NS_Offset, imageAddr);
    (* stack pointer *)
    SYSTEM.GET(imageAddr, val);
    SYSTEM.LDREG(R11, val);
    SYSTEM.EMIT(MCU.MSR_MSPns_R11);
    (* branch to NS entry *)
    SYSTEM.GET(imageAddr + 04H, val);
    EXCL(SYSTEM.VAL(SET, val), 0);
    SYSTEM.LDREG(R11, val);
    SYSTEM.EMITH(MCU.BLXNS_R11)
  END StartNonSecure;

The initial stack pointer is read from relative address 0H in the program binary, and put into the Non-secure stack pointer register (MSP_NS), using an MSR instruction. This stack pointer will automatically be used as soon as the Non-secure program starts, ie. as soon as instructions are read from Non-secure addresses, and the processor is in the Non-secure state.
The entry point into the Non-secure program is read from relative address 04H in the program binary. Bit 0 must be cleared, so that blxns r11 can initiate the impending change to the Non-secure state.
This procedure represents the absolute minimum. In a real Secure set-up (framework, compiler), there's more to do to avoid leaking any Secure data into the Non-secure world. In particular, the processor registers must be cleared (or set to another arbitrary value: for example, gcc sets the registers to the Non-secure target address), the FPU registers must be cleared, and the flags in the APSR must be cleared as well.
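What StartNonSecure reads from the image can be reproduced on a host, for example to inspect a Non-secure binary before loading it. A minimal sketch, assuming the image starts with the vector table as described above; the function name and the example values are hypothetical.

```python
# Sketch: extract initial stack pointer and entry address from an NS image.
import struct


def ns_start_values(image: bytes) -> tuple[int, int]:
    """Return (initial MSP_NS value, branch target with bit 0 cleared)."""
    initial_sp, entry = struct.unpack_from("<II", image, 0)
    return initial_sp, entry & 0xFFFFFFFE  # bit 0 cleared for blxns


# Hypothetical first two words of a Non-secure image
example = struct.pack("<II", 0x200C0000, 0x08100201)
sp, target = ns_start_values(example)
print(f"MSP_NS = {sp:#010x}, blxns target = {target:#010x}")
```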

Calling Secure Code from the Non-secure Program

As we've seen in Part 1, calling a procedure in Secure code from Non-secure programs or library modules entails:

a branch to a fixed address in NSC memory, which contains the gateway code (often called veneer);
from there, a branch to the Secure procedure prologue address;
a return directly back to the Non-secure code, right to the address after the procedure call, so that from a Non-secure code's perspective, a Secure call is indistinguishable from a Non-secure one.

NS world                  Secure world
NS memory                 NSC memory                S memory
+--------------+                                    +--------------+
| Module NS    |                                    | Module S0    |
|              |                     return         |              |
|              |<-----------------------------------|              |
|              |                     BXNS           |              |
|              |                                    |              |
+--------------+                                    +--------------+
    | call                +--------------+              ^
    V BL, BLX             | gateways     |       invoke |
+--------------+ invoke   | (veneers)    |       B, BX  |
| Module NS_S0 |--------->|              |--------------+
|              | B, BX    |              |
|              |          +--------------+
|              |
+--------------+

Module NS_S0 represents Secure module S0 in the Non-secure world. Each procedure exported from S0 is also defined in NS_S0, with exactly the same signature.
The procedures in NS_S0, however, implement the branch to the gateway code.

Module NS_S0 and the gateway code (the veneers) are generated by running gen-secure, after building Secure program S:

> python -m gen-secure make s0.lst C000250 C0FE000

Arguments:
- s0.lst: the listing file for Secure module S0 as produced by the Astrobe compiler;
- C000250: the load address of module S0, found in the .map file for S, as produced by the Astrobe linker;
- C0FE000: the load address for the gateway binary, defined by the SAU configuration: region 0 for the NSC code.
Results:
- Oberon module NS_S0 in directory ns;
- binary file NSC_S0.bin in directory s.

MODULE NS_S0;
(* generated, do not edit *)
IMPORT SYSTEM;

PROCEDURE* ToggleLED*(VAR led: SET);
BEGIN
SYSTEM.EMITH(0B001H); (* add sp,#4, fix stack *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
SYSTEM.ALIGN; (* word alignment *)
SYSTEM.DATA(00C0FE001H); (* nsc target address *)
END ToggleLED;

PROCEDURE SetBits2*(pin, addr, twoBitValue: INTEGER);
BEGIN
SYSTEM.EMITH(0B004H); (* add sp,#16, fix stack *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
SYSTEM.ALIGN; (* word alignment *)
SYSTEM.DATA(00C0FE011H); (* nsc target address *)
END SetBits2;

PROCEDURE Test*(x, v: INTEGER);
BEGIN
SYSTEM.EMITH(0B003H); (* add sp,#12, fix stack *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
SYSTEM.ALIGN; (* word alignment *)
SYSTEM.DATA(00C0FE021H); (* nsc target address *)
END Test;

END NS_S0.

As explained in Part 1, we need to keep the NS stack balanced by adding to the stack pointer value the memory space used for stacking the parameters by the prologue, which we don't need here;
The SYSTEM.DATA(00C0FE001H) and friends lines represent the target addresses in the veneer code in NSC memory (+1 for thumb code). Each veneer is 16 bytes.
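The two values that differ between the generated procedures, ie. the stack fix-up instruction and the NSC target address, follow a simple pattern. The sketch below recomputes them; it is not the code of gen-secure, just an illustration with hypothetical names, using the Thumb encoding of add sp,#imm (0B000H plus the number of words) and the 16-byte veneer size stated above.

```python
# Sketch: recompute the per-procedure constants found in module NS_S0.
VENEER_SIZE = 16       # bytes per veneer, as stated in the text
ADD_SP_BASE = 0xB000   # Thumb 'add sp,#imm', immediate counted in words
NSC_BASE = 0x0C0FE000  # start of NSC memory (SAU region 0)


def add_sp(num_bytes: int) -> int:
    """Encode 'add sp,#num_bytes' as a 16-bit Thumb instruction."""
    if num_bytes % 4 != 0 or not 0 <= num_bytes <= 508:
        raise ValueError("offset must be a multiple of 4 in 0..508")
    return ADD_SP_BASE | (num_bytes // 4)


def nsc_target(index: int) -> int:
    """Address of veneer number 'index', with bit 0 set for Thumb code."""
    return (NSC_BASE + index * VENEER_SIZE) | 1


# (procedure name, bytes pushed by the standard prologue) as in the listing
procs = [("ToggleLED", 4), ("SetBits2", 16), ("Test", 12)]
for i, (name, pushed) in enumerate(procs):
    print(f"{name}: EMITH {add_sp(pushed):#06x}, DATA {nsc_target(i):#010x}")
```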

Here's the disassembly listing of the veneer binary created by gen-secure:

0C0FE000   E97F E97F   SG
0C0FE004   F8DF B004   LDR.W  R11, =0x0C000261    ; [PC, #4] [0x0C0FE00C]
0C0FE008   4758        BX     R11
0C0FE00A   46C0        NOP
0C0FE00C   0C00 0261

0C0FE010   E97F E97F   SG
0C0FE014   F8DF B004   LDR.W  R11, =0x0C000281    ; [PC, #4] [0x0C0FE01C]
0C0FE018   4758        BX     R11
0C0FE01A   46C0        NOP
0C0FE01C   0C00 0281

0C0FE020   E97F E97F   SG
0C0FE024   F8DF B004   LDR.W  R11, =0x0C0002DD    ; [PC, #4] [0x0C0FE02C]
0C0FE028   4758        BX     R11
0C0FE02A   46C0        NOP
0C0FE02C   0C00 02DD

SG is the Secure Gateway instruction, which must be the first instruction at the entry into the veneer. Among other things, it clears bit 0 of the link register value. Before SG, the processor is still in the Non-secure state, thereafter in the Secure state.
Each exported procedure in S0 is represented by one veneer (gateway) block of 16 bytes.
The addresses read into r11 by LDR.W R11, [PC, #4] are the absolute addresses of the corresponding procedure prologues in S0.
You recognise the first address 0C0FE000H as the start of the NSC memory, as defined by SAU region 0, and used in gen-secure.

With this one gen-secure run, we have a fully functioning and consistent configuration of Oberon modules and binaries to build Non-secure program NS.
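The veneer layout shown in the disassembly above is regular enough to be assembled by hand. The following sketch builds the same 16-byte blocks from the instruction encodings in the listing. It is my reconstruction for illustration, not the actual gen-secure code, and the function names are hypothetical.

```python
# Sketch: build the veneer binary from the Secure procedure addresses.
import struct

SG = (0xE97F, 0xE97F)             # secure gateway, two halfwords
LDR_W_R11_PC_4 = (0xF8DF, 0xB004)  # ldr.w r11,[pc,#4]
BX_R11 = 0x4758                    # bx r11
NOP = 0x46C0                       # padding for word alignment


def veneer(target: int) -> bytes:
    """One 16-byte veneer: SG, load target, branch, pad, target word."""
    return struct.pack("<6HI", *SG, *LDR_W_R11_PC_4, BX_R11, NOP, target | 1)


def veneer_image(targets: list[int]) -> bytes:
    """Concatenate one veneer per exported Secure procedure."""
    return b"".join(veneer(t) for t in targets)


# Procedure addresses in S0 as they appear in the disassembly listing
image = veneer_image([0x0C000261, 0x0C000281, 0x0C0002DD])
print(len(image), "bytes")
print(image[:16].hex())
```

The 32-bit Thumb instructions are stored as two little-endian halfwords, which is why SG appears as the byte sequence 7F E9 7F E9 in the binary.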

Call Chain

Let's have a look at the target addresses and the call chain for a Non-secure to Secure procedure call.

NS        +---------------------------------------------------+
          | ...                                               |
          | set up procedure args in registers                |
          | call procedure in S0 via BL                       |-----+
          | return address (will be in LR via BL)             |<----------+
          | ....                                              |     |     |
          +---------------------------------------------------+     |     |
                                                                    |     |
NS_S0     +---------------------------------------------------+     |     |
          | prologue:                                         |     |     |
          | push args in regs and LR onto stack               |<----+     |
          | note: no local variables                          |           |
          +---------------------------------------------------+           |
          | add sp,#n to reverse the push operations          |           |
          | branch to address in NSC_S0 via bx r11            |-----+     |
          | note: LR still contains return address in NS      |     |     |
          +---------------------------------------------------+     |     |
          | epilogue:                                         |     |     |
          | add sp,#m, leaving only LR value on stack         |     |     |
          | pop LR value into PC                              |     |     |
          +---------------------------------------------------+     |     |
                                                                    |     |
NSC_S0    +---------------------------------------------------+     |     |
          | secure gateway instruction (SG), modifies LR      |<----+     |
          | branch to address in S0 via bx r11                |-----+     |
          | note: LR contains return address, modified by SG  |     |     |
          +---------------------------------------------------+     |     |
                                                                    |     |
S0        +---------------------------------------------------+     |     |
          | prologue:                                         |     |     |
          | push args in regs and LR onto stack               |<----+     |
          | sub sp,#n to make space for local vars on stack   |           |
          +---------------------------------------------------+           |
          | procedure body                                    |           |
          +---------------------------------------------------+           |
          | manually inserted epilogue:                       |           |
          | add sp,#m, leaving only LR on stack               |           |
          | pop LR value into LR                              |           |
          | bxns lr: uses LR value as modified by SG          |-----------+
          +---------------------------------------------------+
          | epilogue:                                         |
          | add sp,#m, leaving only LR on stack               |
          | pop LR value into PC                              |
          +---------------------------------------------------+

NS0 sets up the procedure arguments in registers as usual.
NS0 calls the procedure via BL (or BLX) as usual.
In NS_S0, the procedure executes its standard prologue.
However, the arguments and LR are now on the Non-secure stack, but we want the procedure in S0 to access them on the Secure stack. Hence, we reverse the stack actions of the Non-secure prologue by adding number-of-pushed-registers * 4 to the stack pointer. A Secure compiler could simply not insert the standard prologue here.
The procedure in NS_S0 then branches to the entry point in NSC_S0 in NSC memory.
Important: the LR still contains the return address in NS.
As required by the TrustZone rules, the first instruction in NSC_S0 is Secure Gateway SG, which finds the return address in LR.
After SG has done its magic, the procedure branches to its equivalent in S0.
Here, the registers still contain the procedure arguments, and they are pushed onto the Secure stack, together with LR, so the Secure procedure can execute without any modification. That is, the branch target address in S0 is the procedure's prologue.
The procedure in S0 executes its body. As usual, the return address in NS (LR modified by SG) is safe on the stack, ie. the procedure in S0 can do procedure calls (see procedure call to testLR in module S0, which verifies that).
The procedure in S0 executes its manually inserted epilogue. A Secure compiler would use that epilogue in the first place, so no need for any manual adjustment.
The Secure epilogue pops the return address (LR value) back into LR.
bxns lr returns to NS, undoing the modifications by SG; hence, from the NS point of view, we have a normal procedure return.
Any RETURN value is stored in register R0, and thus available to NS. Also, since Secure code can access Non-secure memory, any VAR parameters can be modified from S0.
The secure procedure (or its epilogue) would also clear the processor registers and other registers to avoid leaking Secure data (leaving any return parameter in R0 intact, of course).
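
The two mechanisms the steps above depend on, reversing the Non-secure push and the LR handling by SG and bxns, can be illustrated with a small host-side model. This is only a sketch in Python, not the Oberon or assembly code of the test program: it mirrors just two architectural effects as I understand them (SG entered from Non-secure state clears bit 0 of LR, and bxns switches to Non-secure state when bit 0 of the target is clear). The stack pointer and return address values are made up for illustration.

```python
# Toy model of the LR and stack-pointer handling in the walkthrough above.
# It mirrors only two effects and is not an instruction-set simulator.

WORD = 4  # bytes per pushed 32-bit register


def prologue_push(sp: int, num_regs: int) -> int:
    """Full-descending stack: pushing num_regs registers lowers SP."""
    return sp - WORD * num_regs


def undo_push(sp: int, num_regs: int) -> int:
    """The 'add sp,#n' in NS_S0, with n = number-of-pushed-registers * 4."""
    return sp + WORD * num_regs


def sg(lr: int) -> int:
    """SG entered from Non-secure state clears bit 0 of LR."""
    return lr & ~1


def bxns(target: int) -> tuple[int, str]:
    """BXNS: bit 0 clear selects Non-secure state at the branch target."""
    state = "Secure" if target & 1 else "Non-secure"
    return target & ~1, state


# Hypothetical values: two arguments plus LR pushed, return address in NS flash.
ns_sp = 0x2003_0000
ns_lr = 0x0810_0235  # Thumb return address, bit 0 set by BL

sp_after_push = prologue_push(ns_sp, 3)     # standard prologue in NS_S0
sp_restored = undo_push(sp_after_push, 3)   # manual 'add sp,#n'
lr_in_secure = sg(ns_lr)                    # first instruction in NSC_S0
return_addr, return_state = bxns(lr_in_secure)

print(hex(sp_after_push), hex(sp_restored))
print(hex(lr_in_secure), hex(return_addr), return_state)
```

The model shows why LR must survive untouched from NS_S0 through NSC_S0 into S0: the value that bxns consumes at the end is exactly the one SG produced on entry.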

Final Images, ELF File

After running gen-secure, we can also build the Non-secure program NS using Astrobe. To load the test set-up into the MCU's flash memory, we create an ELF file by running

> python -m make-elf s.bin:C000000 nsc_s0.bin:0C0FE000 ../ns/ns.bin:8100000

in directory s. Note that this creates one ELF file with the Secure, Non-secure, and Non-secure Callable code images, which is useful for testing here. If the Secure and the Non-secure projects were separated, possibly even between organisations, two ELF files could be created and loaded independently.
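
To make the file:address arguments concrete, here is a minimal sketch of how such specifications could be parsed and checked for overlapping load ranges. It is not the make-elf source. The function names are hypothetical, and it assumes the addresses are hexadecimal without a prefix, as the values in the command suggest.

```python
from itertools import combinations


def parse_image_spec(spec):
    """Split 'path:hexaddr' at the last colon and parse the address as hex."""
    path, sep, addr = spec.rpartition(":")
    if not sep or not path or not addr:
        raise ValueError(f"expected 'file:address', got {spec!r}")
    return path, int(addr, 16)


def overlapping(images):
    """Return name pairs whose [start, start + size) ranges intersect.

    images is a list of (name, start, size) tuples.
    """
    clashes = []
    for (name_a, start_a, size_a), (name_b, start_b, size_b) in combinations(images, 2):
        if start_a < start_b + size_b and start_b < start_a + size_a:
            clashes.append((name_a, name_b))
    return clashes


specs = ["s.bin:C000000", "nsc_s0.bin:0C0FE000", "../ns/ns.bin:8100000"]
parsed = [parse_image_spec(spec) for spec in specs]
for path, addr in parsed:
    print(f"{path:16} -> 0x{addr:08X}")
```

Under the hex assumption, C000000 and 0C000000 denote the same address. The overlap check compares numeric addresses only, so it would not notice two images that collide through different aliases of the same flash.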

Memory Layout

After loading the ELF file, we have the following flash memory layout (see Part 1 for a description of the different segments).

+-----------------+
| resources       |
+-----------------+
| init sequence   |
+-----------------+
| NS              |
+-----------------+
| NS_S0           |
+-----------------+
| MCU2            |
+-----------------+
| Main (NS)       |
+-----------------+
| link parameters |
+-----------------+
| unused          |
+-----------------+
| entry address   |
| initial SP      |
+-----------------+ 08100000H

+-----------------+
| NSC_S0          |
+-----------------+ 0C0FE000H

+-----------------+
| resources       |
+-----------------+
| init sequence   |
+-----------------+
| S               |
+-----------------+
| S0              |
+-----------------+
| MCU2            |
+-----------------+
| Main (S)        |
+-----------------+
| link parameters |
+-----------------+
| unused          |
+-----------------+
| entry address   |
| initial SP      |
+-----------------+ 0C000000H

Lower physical addresses are at the bottom of the figure.
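
The physical ordering follows from the fact that the Secure and Non-secure addresses are two aliases of the same flash. The following sketch maps the three load addresses from the make-elf command to physical offsets, banks, and pages. The flash geometry (2 MB, two banks, 8 KB pages, aliases at 08000000H and 0C000000H) is my understanding of the STM32U585 and should be checked against the reference manual. The function name is made up.

```python
FLASH_NS_ALIAS = 0x0800_0000  # Non-secure alias of the flash
FLASH_S_ALIAS = 0x0C00_0000   # Secure alias of the same flash
FLASH_SIZE = 0x0020_0000      # 2 MB in total (assumed)
BANK_SIZE = 0x0010_0000       # 1 MB per bank (assumed)
PAGE_SIZE = 0x2000            # 8 KB per page (assumed)


def locate(addr):
    """Return (alias, physical offset, bank number, page number) for addr."""
    for alias, base in (("S", FLASH_S_ALIAS), ("NS", FLASH_NS_ALIAS)):
        if base <= addr < base + FLASH_SIZE:
            offset = addr - base
            bank = offset // BANK_SIZE + 1
            page = (offset % BANK_SIZE) // PAGE_SIZE
            return alias, offset, bank, page
    raise ValueError(f"0x{addr:08X} is not a flash address")


images = {"S": 0x0C00_0000, "NSC_S0": 0x0C0F_E000, "NS": 0x0810_0000}
# Sort by physical offset, lowest first.
for name, addr in sorted(images.items(), key=lambda item: locate(item[1])[1]):
    alias, offset, bank, page = locate(addr)
    print(f"{name:7} {alias:2} alias, offset 0x{offset:06X}, bank {bank}, page {page}")
```

With these assumptions, the Secure image starts at the bottom of bank 1, NSC_S0 occupies the last page of bank 1, and the Non-secure image starts at the bottom of bank 2.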

Summary

Secure Memory Configuration

Once we have figured out and defined the required memory layout for a project, setting up a Secure memory configuration is straight-forward.

Here are the steps with the STM32U585 used.

Preparation: program the non-volatile parts of the configuration, either using a dedicated separate program, or the tool provided by the vendor for that purpose. With the STM32U585, this is done using the so-called option bytes in the flash controller.

In the Secure program:

Enable the SAU; the SAU is a standard component of each Cortex-M33 MCU, accessed via the Private Peripheral Bus (PPB).
Configure the Non-secure and the NSC address SAU regions. Memory zones without an SAU region are Secure.
Configure the SRAM blocks as Secure and Non-secure, consistent with the SAU regions, using GTZC.
Configure the security attributes of the peripheral devices. If the device itself is TrustZone-aware, this is done directly in the device's registers (eg. GPIO, see test program), if not, using GTZC (the test program does not use such devices).
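
As an illustration of the SAU region step, the following sketch computes the values that would be written to the SAU_RBAR and SAU_RLAR registers for two regions. It is a host-side model in Python, not the test program's Oberon code. The region bounds are assumptions derived from the load addresses (all of bank 2 as Non-secure, one 8 KB page as NSC). A complete configuration would also need regions for Non-secure SRAM and peripherals.

```python
SAU_RLAR_ENABLE = 1 << 0      # region enable bit
SAU_RLAR_NSC = 1 << 1         # region is Secure, Non-secure Callable
SAU_ADDR_MASK = 0xFFFF_FFE0   # address fields use bits [31:5]


def sau_region(base, limit, nsc=False):
    """Return (RBAR, RLAR) for the inclusive address range base..limit.

    SAU regions have 32-byte granularity: base must be 32-byte aligned,
    and limit must be the last byte of a 32-byte block.
    """
    if base & 0x1F:
        raise ValueError("base must be 32-byte aligned")
    if limit & 0x1F != 0x1F:
        raise ValueError("limit must end on a 32-byte block boundary")
    if limit < base:
        raise ValueError("limit is below base")
    rbar = base & SAU_ADDR_MASK
    rlar = (limit & SAU_ADDR_MASK) | SAU_RLAR_ENABLE
    if nsc:
        rlar |= SAU_RLAR_NSC
    return rbar, rlar


# Assumed region bounds, for illustration only.
ns_flash = sau_region(0x0810_0000, 0x081F_FFFF)
nsc_flash = sau_region(0x0C0F_E000, 0x0C0F_FFFF, nsc=True)

for name, (rbar, rlar) in (("NS flash", ns_flash), ("NSC", nsc_flash)):
    print(f"{name:9} RBAR=0x{rbar:08X} RLAR=0x{rlar:08X}")
```

On the MCU, each pair would be written after selecting the region number in SAU_RNR, and the SAU would then be enabled via SAU_CTRL.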

Compilation and Linking

Once the memory and peripheral address space is configured, these are the steps to build the three binary images to be loaded into memory for execution:

Build the Secure program.
Run gen-secure on each Secure module that needs to be called from the Non-secure program or its modules.
Build the Non-secure program.

Compared to the build steps necessary and used in Part 1, this is now much simplified.

Astrobe Configuration Files

We need two Astrobe configuration files, one for the Secure and one for the Non-secure program, just as we would for any two projects. Obviously, the code and data range parameters must be consistent with the Secure memory configuration.
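
The consistency requirement can be checked mechanically before building. The sketch below tests whether the code and data ranges of a program lie inside the address regions assigned to its world. It does not read Astrobe configuration files. All ranges and region bounds are hypothetical placeholders to be replaced with a project's actual values.

```python
def contains(region, rng):
    """True if the inclusive range rng lies fully inside region."""
    return region[0] <= rng[0] and rng[1] <= region[1]


def misplaced(ranges, regions):
    """Return the names of ranges not fully inside any allowed region."""
    return [
        name
        for name, rng in ranges.items()
        if not any(contains(region, rng) for region in regions)
    ]


# Hypothetical Non-secure regions: flash bank 2 and one SRAM block.
ns_regions = [(0x0810_0000, 0x081F_FFFF), (0x2004_0000, 0x200B_FFFF)]

# Hypothetical code and data ranges as they might appear in a config file.
ns_config = {"code": (0x0810_0000, 0x0811_FFFF), "data": (0x2004_0000, 0x2004_FFFF)}
bad_config = {"code": (0x0810_0000, 0x0811_FFFF), "data": (0x2003_0000, 0x2003_FFFF)}

print(misplaced(ns_config, ns_regions))
print(misplaced(bad_config, ns_regions))
```

Running the same check for the Secure program against the Secure ranges catches a mismatch between the configuration files and the SAU and GTZC set-up before it shows up as a fault at run time.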

Conclusions

Designing and implementing Secure and Non-secure programs with Oberon and Astrobe is, on this very basic level, possible and straight-forward, with the help of a simple extra tool, and some manual tweaking. Using Astrobe this way is beyond its current purpose and specification, but then again, so is writing dual-core programs for the RPs. Thanks to the conceptual clarity, and dare I say it, simplicity – in the best Wirthian sense – of Oberon and its implementation in Astrobe we can tinker.^[2]

ARM TrustZone is an effective way to divide flash memory and SRAM, as well as the rest of the address space, into the required Secure, Non-secure, and Non-secure Callable address ranges. However, since TrustZone does not secure all bus managers and all peripheral devices, additional security components are required in an MCU. While TrustZone is standardised for the Armv8-M architecture, each MCU vendor uses different concepts and implementations for these additional components.

Getting the overall configuration correct requires finding and understanding many different components, and figuring out how they interact. Actually writing, or extending and adapting, the test code as well as gen-secure was the smallest part of this two-week exercise and experiment. I have seen a lot of BusFaults and SecureFaults. :)

What's Missing

This test program is an utterly simple, bare-bones proof-of-concept implementation.

Among the aspects not evaluated and tested yet:

calling Non-secure code from the Secure world;
TYPEs defined in the Secure world to be used in the Non-secure program (including type tests);
procedure parameters other than basic ones;
exception handling, including run-time fault and error handling.

Also, I should check out TrustZone and related concepts as implemented in the RP2350.

Repository

lib/v3.0 <repo>/examples/v3.0/stm/u585i-iot

[1] This is probably not obvious, at least it wasn't for me: to set a bank to Non-secure, set the starting page index to the last page number (127), and the ending page index to the first page number (0).
[2] Yes, speaking of simplicity in the context of a compiler and linker can be easily misunderstood. These programs are never actually simple. I use simplicity here as a compliment.

Last updated: 1 February 2026