Secure/Non-secure Part 1

lib/v3.0 Exploratory: Secure/Non-secure programs – basics.

Overview, Purpose

This document describes, together with the test code, an attempt to evaluate how Secure and Non-secure programs, based on ARM's TrustZone technology, could be structured and implemented with Oberon and the Astrobe compiler and linker. C/C++ programming tools, such as gcc, provide a special mode to create the Secure part of this kind of program. Astrobe does not yet provide such support, hence this evaluation is a proof-of-concept experiment, which may provide some insights regarding what this could entail.

This is part 1 of the experiment and description, focusing on the basics, that is, concepts and implementing the corresponding mechanics in the Non-secure world. Part 2 will address actually getting things running with a proper Secure/Non-secure separation.

Caveat lector: this is a long-ish document – there's a lot of ground to cover. It's nerdy stuff, with diving into the arcane details of program binary structures, procedure call mechanics, and all this Good Stuff.™ Furthermore, it's based on my current understanding of the concepts and their implementation, that is, possibly not completely correct and complete.

Part 2 is here.

Introduction

Cortex-M33-based MCUs (among other core architectures) can be equipped with what ARM calls TrustZone components, for example the RP2350, or the STM32U585. This technology enables the separation of programs on the MCU into Secure and Non-secure parts. Non-secure programs cannot read the Secure parts, but can call services provided by the latter. This serves to better protect the MCU's software from run-time errors, as well as malicious attacks. As a side effect, it can also be used to protect the intellectual property contained in the Secure software.

To achieve these objectives, TrustZone is usually complemented by additional security and resource isolation components, since TrustZone itself only constrains the CPU, while there can be other bus masters, such as DMA. Also, access control to peripheral devices requires more granularity than TrustZone alone can provide.

On a very basic level, the system is in a Secure state when the currently executing code resides in Secure memory, and in a Non-secure state when the code runs from Non-secure memory. Secure code can access Secure and Non-secure memory (flash, SRAM) and peripherals; Non-secure code can only access Non-secure memory and peripherals.

This memory separation is key, and the MCUs provide the corresponding controllers and bus-level logic to enforce it: Implementation Defined Attribution Unit (IDAU), Security Attribution Unit (SAU), and other components such as Global TrustZone (GTZC) in the STM, or the ACCESSCTRL registers in the RP2350.

The separation requires two binary images of linked programs, to be loaded into the respective Secure and Non-secure memories. In fact, as we'll see, there is also a third image, which contains the interface code between the two worlds, allowing their secure interaction.

This test program attempts to evaluate and assess how this set-up could be implemented using Oberon and the current Astrobe compiler and linker. Astrobe's purpose is to create programs compiled into a single binary, where the linker resolves all the interactions between all modules, hence the set-up of the modules and the tools, described below, cannot be more than a proof-of-concept experiment with the goal to gain insights – it's not meant for actual programming work at this stage.

Concepts

Starting Point: One Program, One Binary

Let's have a module NS that imports module S, and NS calls procedures defined and implemented in S.

+------------+          +------------+
|            | IMPORT   |            |
|            |<---------|            |
| Module NS  |          | Module S   |
|            |--------->|            |
|            | call     |            |
+------------+          +------------+

Assuming NS is a program module (ie. importing Main), if we compile and link NS, we get this binary image (lower addresses at the bottom):

+-----------------+
| resources       |
+-----------------+
| init sequence   |
+-----------------+
| NS              |
+-----------------+
| S               |
+-----------------+
| Main            |
+-----------------+
| link parameters |
+-----------------+
| unused          |
+-----------------+
| entry address   |
| initial SP      |
+-----------------+

The initial stack pointer value and the program entry address are at addresses 0H and 4H, respectively. The MCU hardware relies on these values at the specified addresses to get off the ground.
Above these values, there's the vector table, but it is not used with Astrobe (apart from the entry address, which is technically part of the vector table, at the vector address for the reset handler).
link parameters is a section with the linker parameters as defined in the Astrobe config file, from where they can be read at run-time via module LinkOptions, which is neither required nor used for this test program.
init sequence is the code section that initialises all modules.
resources include, apart from any data provided by the programmer, meta data about the program, eg. to determine module and procedure names for run-time error logging.
The linker resolves all references in the compiled modules to absolute addresses within the above single binary image.
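
The role of the first two words of the image can be illustrated with a small host-side sketch. It assumes a raw little-endian binary image whose offset 0H holds the initial stack pointer and offset 4H the entry address, as described above; the function name and the example values are hypothetical and not taken from the test program.

```python
import struct


def read_boot_words(image: bytes) -> tuple[int, int]:
    """Return (initial SP, entry address) from the first two words of an image."""
    if len(image) < 8:
        raise ValueError("image too short to hold the first two words")
    # Cortex-M images are little-endian: the word at offset 0H is the
    # initial SP, the word at offset 4H is the reset (entry) address.
    initial_sp, entry = struct.unpack_from("<II", image, 0)
    return initial_sp, entry


# Hypothetical example values, padded with a few zero bytes.
demo_image = struct.pack("<II", 0x20030000, 0x08000401) + bytes(16)
sp, entry = read_boot_words(demo_image)
print(f"initial SP = 0{sp:08X}H, entry address = 0{entry:08X}H")
```

With two separate images, as introduced in the next section, the same check could be run on each binary to confirm that both carry their own stack pointer and entry address.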

Separation of NS and S

Now let's say we want to run module S as Secure software. As outlined above, NS and S must then reside in separate binary images, which can be loaded at different addresses for the Secure and Non-secure memory. With Astrobe, this is easily achievable: create two Astrobe config files with different code and data addresses, and run the compiler and linker on both modules separately.

+-----------------+     +-----------------+
| resources       |     | resources       |
+-----------------+     +-----------------+
| init sequence   |     | init sequence   |
+-----------------+     +-----------------+
| NS              |     | S               |
+-----------------+     +-----------------+
| Main (NS)       |     | Main (S)        |
+-----------------+     +-----------------+
| link parameters |     | link parameters |
+-----------------+     +-----------------+
| unused          |     | unused          |
+-----------------+     +-----------------+
| entry address   |     | entry address   |
| initial SP      |     | initial SP      |
+-----------------+     +-----------------+

As an aside, S also needs to import Main now, else the linker is not happy, but we can use an empty Main in the same directory as S for this basic experiment. With a real program, the start of execution would be with the Secure part, hence Main will be useful there for setting things up from the Secure world, as is a specific Main for NS to do the same for the Non-secure world, as soon as the Secure software has transferred control to the Non-secure one (see below).

A challenge arises regarding the interaction between the code in the two separate images. The linker cannot resolve the corresponding references, since the two linking processes are separate and independent. Furthermore, TrustZone requires a defined protocol for this interaction in order to ensure the overall security, eg. to inhibit attacks of the Secure world from the Non-secure side.

Since Non-secure code NS cannot access Secure memory, it cannot call any procedure in S directly. Here's where a third type of memory comes into play: Non-secure Callable (NSC).

Let's first have a look at the overall structure of a Secure/Non-secure program.

Anatomy of a Secure/Non-Secure Program

Here are the relationships and the control flow for a program consisting of both Non-secure and Secure parts.

Non-secure world        Secure world
                        +------------+
             reset ---> | Program S  |
                        +------------+
+------------+    start   |    | call
|            |<-----------+    V
| Program NS |          +------------+
|            |--------->| lib S      |
+------------+    call  +------------+
       | call                  ^
       V                       |
+------------+     call        |
| Lib NS     |-----------------+
+------------+

The program as a whole starts with the Secure program. After reset, the code runs as Secure privileged. The Secure code sets up all the memory separation and other resource isolation components and parameters.
The Secure program then passes control to the Non-secure program, which is usually the actual control program.
The Non-secure program and its library modules can make use of the Secure library modules as outlined below.
The Secure software can access Non-secure memory, including calling procedures, and load/store operations in SRAM (not depicted above).
Program development is usually organised in two projects, one each for the Secure and the Non-secure world.

Procedure Calls

Here's the set-up and the flow of activations for the Non-secure to Secure procedure calls as required for TrustZone:

NS world                  Secure world
NS memory                 NSC memory                S memory
+--------------+                                    +--------------+
| Module NS    |                                    | Module S     |
|              |                     return         |              |
|              |<-----------------------------------|              |
|              |                     BXNS           |              |
|              |                                    |              |
+--------------+                                    +--------------+
    | call                +--------------+              ^
    V BL, BLX             | Module NSC_S |       invoke |
+--------------+ invoke   |              |       B, BX  |
| Module NS_S  |--------->|              |--------------+
|              | B, BX    |              |
|              |          +--------------+
|              |
+--------------+

Module NS_S represents S in the Non-secure world. NS_S exports exactly the same interface as S. It gets linked into the Non-secure image, which is loaded into Non-secure memory (NS memory).
Module NSC_S represents the entry points from the Non-secure to the Secure world. Entry points are basically exposed absolute addresses within Secure module S. NS (via NS_S) only "sees" these addresses, but nothing else, since the actual Secure code is in module S in Secure memory. NSC_S gets loaded into Non-secure Callable memory (NSC memory).
Module S contains the Secure code, which is loaded into Secure memory (S memory).
Compared to directly importing and calling S from NS, outlined above as starting point, no code changes are required in NS and S, apart from NS importing NS_S in lieu of S (IMPORT S := NS_S does the trick).
Note the terminology used above: invoke means b or bx instructions to hard coded addresses, while call means "normal" procedure calls via bl or blx.
Also note the direct return from S to NS, skipping NSC_S and NS_S.
Obviously, all modules must be valid Oberon modules for this set-up to be used with Astrobe. As we'll see, this results in some dead code, which is OK for a proof-of-concept experiment, but could be avoided with specific compiler and linker support for Secure programs.

Implementation

Modules `NS_S` and `NSC_S`

Since both modules NS_S and NSC_S represent module S, with the same exposed interface, they could be automatically generated when compiling S. With the current version of Astrobe, there's a separate tool that does this for the test set-up and program, see below. This tool extracts NS_S and NSC_S in source form, which then needs to be compiled and linked using Astrobe.

Following ARM's interaction protocol between the Non-secure and Secure world, the call of a procedure from NS employs branch instructions to fixed absolute addresses in the separate binary images in the NSC and S memory, respectively. Which means that first S must be compiled and linked, then NSC_S, then NS together with NS_S, resulting in three images with branching in-between. Depending on how S and NSC memory can be defined via IDAU and SAU, modules S and NSC_S could be included in the same image. This test program uses three separate images.

Module `S`

Secure code must be compiled in a specific way according to the specifications for TrustZone.

The first instruction in NSC_S upon calling a procedure via NS_S and branching to NSC_S must be SG, Secure Gateway. This instruction will take care of setting up the transition from Non-secure to Secure memory and state. One important action is to modify bit 0 of the value in the link register LR, so that when using LR for the procedure return from the Secure to Non-secure world via bxns, the CPU can adjust its state accordingly.

For the code in S, ie. running from Secure memory, it is mandatory to return from procedures using this instruction:

bxns  lr

bxns lr will take care of interpreting the LR contents, as set up by the SG instruction, and return correctly to the Non-secure state.
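
The interplay of SG and bxns lr can be modelled in a few lines. This is only a sketch of the bit 0 handling, assuming the Armv8-M behaviour where SG clears bit 0 of LR when entered from the Non-secure state, and where bxns switches to the Non-secure state if bit 0 of its target is clear. The function names and the return address are hypothetical, and everything else these instructions do is left out.

```python
MASK32 = 0xFFFFFFFF


def sg(lr: int, caller_is_secure: bool) -> int:
    """Model SG's effect on LR: clear bit 0 if entered from Non-secure state."""
    if caller_is_secure:
        return lr & MASK32
    return lr & MASK32 & ~1


def bxns_target_state(target: int) -> str:
    """Model bxns: a target with bit 0 clear selects the Non-secure state."""
    return "Secure" if target & 1 else "Non-secure"


# Hypothetical return address in NS, with bit 0 set as left by BL.
lr_after_bl = 0x08000123
lr_after_sg = sg(lr_after_bl, caller_is_secure=False)
print(f"LR after SG: 0{lr_after_sg:08X}H, bxns lr returns to {bxns_target_state(lr_after_sg)} state")
```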

There's more to Secure compilation. For example, the CPU registers must be cleared before returning to the Non-secure world (apart from a possible return result in R0), in order to avoid leaking Secure data. This test program does not implement that.

Tool makesec0

makesec0 is a Python program in the tools directory.

Quick overview:

> python -m makesec0 -h
usage: makesec0 [-h] [-v] {make,fixup} ...
options:
  -h, --help    show this help message and exit
  -v            verbose, print feedback
commands:
  {make,fixup}
    make
    fixup

> python -m makesec0 make -h
usage: makesec0 make [-h] mod_file
positional arguments:
  mod_file    secure module file (.mod)
options:
  -h, --help  show this help message and exit

> python -m makesec0 fixup -h
usage: makesec0 fixup [-h] mod_file s_addr nsc_addr
positional arguments:
  mod_file    secure module file (.mod)
  s_addr      absolute address of S module in hexadecimal
  nsc_addr    absolute address of NSC module in hexadecimal
options:
  -h, --help  show this help message and exit

The tool serves to generate NSC_S and NS_S type modules from S (running makesec0 make), and then to update the absolute branch addresses once we have a compiled and linked secure program (running makesec0 fixup).

I'll show the tool's usage below.
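
For orientation, a command-line interface with the shape of the help output above could be declared with Python's argparse as follows. This is a sketch that mirrors the printed usage lines, not the actual source of makesec0; the helper names hex_int and build_parser and the attribute name verbose are assumptions.

```python
import argparse


def hex_int(text: str) -> int:
    """Parse a hexadecimal address given without prefix, eg. 008000C50."""
    return int(text, 16)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="makesec0")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="verbose, print feedback")
    commands = parser.add_subparsers(title="commands", dest="command",
                                     required=True)

    make = commands.add_parser("make")
    make.add_argument("mod_file", help="secure module file (.mod)")

    fixup = commands.add_parser("fixup")
    fixup.add_argument("mod_file", help="secure module file (.mod)")
    fixup.add_argument("s_addr", type=hex_int,
                       help="absolute address of S module in hexadecimal")
    fixup.add_argument("nsc_addr", type=hex_int,
                       help="absolute address of NSC module in hexadecimal")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["fixup", "s0.mod", "008000C50", "008000848"])
print(args.command, args.mod_file, hex(args.s_addr), hex(args.nsc_addr))
```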

Test Program Description

Overview

The test program is simple, just my usual MVP – minimal viable program – that I use to get a new MCU up and running.^[1] It blinks an LED or two. I have used the version for the STM32U585 MCU, since it is relatively easy to load different binary images to different flash addresses by using a corresponding ELF file. In addition, the MCU allows running with TrustZone completely disabled, which is ideal for this initial experiment.

This ELF file is created using make-elf in the tools directory, which I have extended to accept more than one binary file as generated by Astrobe, to include several PROGBITS sections. It seemed easier than to muck with the RP2350's .uf2 files with their meta data blocks. The Ozone debugger takes care of programming the flash memory accordingly.

For this test program, as a starting point, I have left TrustZone disabled in the STM32U585, loading and running all images and code in Non-secure memory and CPU state. This saves me from having to set up the Secure world correctly, which is another can of worms, but allows me to experiment anyway with the interaction between separate images as if they were loaded into Secure and Non-secure memory, following ARM's corresponding interaction rules and requirements.

I'll tackle implementing the Secure world in Part 2, extending this test program.

Directory and Module Set-up

The test program directory Secure0 contains three directories ns, nsc, and s. Directory ns contains the Non-secure module NS0 (and later also NS_S0), directory s the Secure module S0. I have artificially extracted two procedures from the original MVP test program into module S0, so that NS0 can call these procedures across the image boundaries.

makesec0 relies on this directory structure.

Module Main is empty in all directories.

Initial Modules NS0 (Non-secure Program) and S0 (Secure Program)

Initially, we have the two modules NS0 and S0, in their respective directories.

MODULE NS0;
  IMPORT SYSTEM, MCU := MCU2, S0, Main;
  CONST
    LEDgreen = 7; (* GPIOH *)
    LEDred   = 6;
    MODER_Out = 1;
    OSPEED_High = 2;

  PROCEDURE init;
    VAR val: SET; reg, devNo: INTEGER;
  BEGIN
    (* enable GPIOH clock *)
    reg := MCU.DEV_GPIOH DIV 32;
    reg := MCU.RCC_AHB1ENR + (reg * 4);
    devNo := MCU.DEV_GPIOH MOD 32;
    SYSTEM.GET(reg, val);
    val := val + {devNo};
    SYSTEM.PUT(reg, val);

    (* set-up GPIOH for the leds *)
    reg := MCU.GPIOH_BASE + MCU.GPIO_MODER_Offset;
    S0.SetBits2(LEDred, reg, MODER_Out);
    S0.SetBits2(LEDgreen, reg, MODER_Out);
    reg := MCU.GPIOH_BASE + MCU.GPIO_OSPEEDR_Offset;
    S0.SetBits2(LEDred, reg, OSPEED_High);
    S0.SetBits2(LEDgreen, reg, OSPEED_High)
  END init;

  PROCEDURE run;
    VAR i: INTEGER; leds: SET;
  BEGIN
    leds := {LEDred, LEDgreen + 16};
    REPEAT
      S0.ToggleLED(leds);
      i := 0;
      WHILE i < 100000 DO INC(i) END;
    UNTIL FALSE
  END run;

BEGIN
  init;
  run
END NS0.

(*! SEC *)
MODULE S0;

  IMPORT SYSTEM, MCU := MCU2, Main;
  CONST
    GPIOH_BSSR = MCU.GPIOH_BASE + MCU.GPIO_BSRR_Offset;
    LEDgreen = 7; (* GPIOH *)
    LEDred   = 6;
    POP_LR = 0E8BD4000H;
    BX_LR = 04770H;

  PROCEDURE testLR(x: INTEGER);
  END testLR;

  PROCEDURE* ToggleLED*(VAR leds: SET);
    CONST Mask = {LEDred, LEDgreen, LEDred + 16, LEDgreen + 16};
  BEGIN
    SYSTEM.PUT(GPIOH_BSSR, leds);
    leds := leds / Mask;

    (* manually inserted Secure epilogue *)
    (* no add sp,#n as leaf procedure *)
    SYSTEM.EMIT(POP_LR);
    SYSTEM.EMITH(BX_LR)
  END ToggleLED;

  PROCEDURE SetBits2*(pin, addr, twoBitValue: INTEGER);
    VAR val, mask: SET;
  BEGIN
    twoBitValue := twoBitValue MOD 04H;
    twoBitValue := LSL(twoBitValue, pin * 2);
    SYSTEM.GET(addr, val);
    mask := BITS(LSL(03H, pin * 2));
    val := val * (-mask);
    val := val + BITS(twoBitValue);
    SYSTEM.PUT(addr, val);
    testLR(42);

    (* manually inserted Secure epilogue *)
    SYSTEM.EMITH(0B005H); (* add sp,#20 *)
    SYSTEM.EMIT(POP_LR);
    SYSTEM.EMITH(BX_LR)
  END SetBits2;

  PROCEDURE Test*(x, v: INTEGER);
    VAR z: INTEGER;
  BEGIN
    (* stuff *)
    (* manually inserted Secure epilogue *)
    SYSTEM.EMITH(0B003H); (* add sp,#12 *)
    SYSTEM.EMIT(POP_LR);
    SYSTEM.EMITH(BX_LR)
  END Test;
END S0.

With this initial set-up, module NS0 can be compiled and linked into a single binary, just to confirm that we have a valid and functioning starting point.
Note the (*! SEC *) annotation at the beginning of S0, which could instruct a Secure compiler to generate Secure code. All exported procedures in a Secure module will become part of their corresponding NSC and NS modules.
Furthermore, note the manually inserted Secure epilogues at the end of these procedures to return. Astrobe's standard epilogue add sp,#n; pop { pc } will be ignored, obviously.^[2] A Secure compiler would simply generate this epilogue in lieu of the standard one.

Generate NS_S0 and NSC_S0

As outlined above, NS_S0 and NSC_S0 can be derived from S0. A Secure compiler could do this directly for us, but for now, we need to use makesec0. First we compile S0, then we run, inside directory s:

> python -m makesec0 make s0.mod

Which creates NSC_S0 and NS_S0 in their respective directories, using listing file S0.lst as produced by the compiler.

MODULE NSC_S0;
(* generated, do not edit *)
IMPORT SYSTEM, Main;

PROCEDURE* ToggleLED*(VAR leds: SET);
BEGIN
(* SYSTEM.EMIT(0E97FE97FH); *) (* SG *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
(*!addr_s 12 *) SYSTEM.DATA(0H); (* target address *)
END ToggleLED;

PROCEDURE SetBits2*(pin, addr, twoBitValue: INTEGER);
BEGIN
(* SYSTEM.EMIT(0E97FE97FH); *) (* SG *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
(*!addr_s 44 *) SYSTEM.DATA(0H); (* target address *)
END SetBits2;

PROCEDURE Test*(x, v: INTEGER);
BEGIN
(* SYSTEM.EMIT(0E97FE97FH); *) (* SG *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
(*!addr_s 136 *) SYSTEM.DATA(0H); (* target address *)
END Test;

END NSC_S0.

MODULE NS_S0;
(* generated, do not edit *)
IMPORT SYSTEM;

PROCEDURE* ToggleLED*(VAR leds: SET);
BEGIN
SYSTEM.EMITH(0B001H); (* add sp,#4, fix stack *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
SYSTEM.ALIGN; (* word alignment *)
(*!addr_nsc 6 *) SYSTEM.DATA(0H); (* target address *)
END ToggleLED;

PROCEDURE SetBits2*(pin, addr, twoBitValue: INTEGER);
BEGIN
SYSTEM.EMITH(0B004H); (* add sp,#16, fix stack *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
SYSTEM.ALIGN; (* word alignment *)
(*!addr_nsc 22 *) SYSTEM.DATA(0H); (* target address *)
END SetBits2;

PROCEDURE Test*(x, v: INTEGER);
BEGIN
SYSTEM.EMITH(0B003H); (* add sp,#12, fix stack *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
SYSTEM.ALIGN; (* word alignment *)
(*!addr_nsc 38 *) SYSTEM.DATA(0H); (* target address *)
END Test;

END NS_S0.

As explained above, these modules each
- replicate the NSC-relevant procedure signatures from S0;
- will redirect the calls to absolute addresses.
Note the add sp,#n instructions in NS_S0, which we will cover below.
The secure gateway instruction SG is commented out for now, since we will run the program from Non-secure memory.
The branches are implemented using bx r11 instructions, with a value loaded into r11 from flash memory. This allows hard-coding the absolute branch addresses into NSC_S0 (in NS_S0) and into S0 (in NSC_S0), without the need to know the address from where the branches are executed to calculate an offset. The advantage of this mechanism is that as long as the Secure software does not change, the Non-secure side can be compiled and linked independently, and no new fix-up is needed. A disadvantage is that we lose a register for argument passing.
The SYSTEM.DATA code lines that insert the target addresses are annotated, and indicate the relative address within S0 (in NSC_S0) and NSC_S0 (in NS_S0). As we'll see, the address fix-up process keeps these annotations, so that we can re-run the fix-up without having to re-generate the above base modules again.
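
The instruction constants in the generated modules can be cross-checked against the Thumb encodings. The following sketch derives the values passed to SYSTEM.EMIT and SYSTEM.EMITH from the register number and offset; the helper names are hypothetical, and it says nothing about how makesec0 itself produces these lines.

```python
SG = 0xE97FE97F  # Secure Gateway has a single fixed encoding


def ldr_w_literal(rt: int, offset: int) -> int:
    """Encode ldr.w rt,[pc,#offset] (32-bit literal load, positive offset)."""
    if not (0 <= rt <= 14 and 0 <= offset <= 4095):
        raise ValueError("register or offset out of range")
    return 0xF8DF0000 | (rt << 12) | offset


def bx(rm: int) -> int:
    """Encode bx rm (16-bit)."""
    if not 0 <= rm <= 14:
        raise ValueError("register out of range")
    return 0x4700 | (rm << 3)


def oberon_hex(value: int, digits: int) -> str:
    """Format as an Oberon hex literal with a leading 0, eg. 04758H."""
    return f"0{value:0{digits}X}H"


print(f"SYSTEM.EMIT({oberon_hex(ldr_w_literal(11, 4), 8)}); (* ldr.w r11,[pc,#4] *)")
print(f"SYSTEM.EMITH({oberon_hex(bx(11), 4)}); (* bx r11 *)")
```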

Let's look at these addresses.

Branch Target Addresses and Call Chain

From the Non-secure program's point of view, its Secure procedure calls into the other images must be exactly the same as Non-secure calls inside the same image. With the re-directions required, and the direct return from procedures in S0 to NS0, skipping NSC_S0 and NS_S0, we need to evaluate what the target addresses of these redirections are. Remember that the Secure software exposes its procedures as entry points in NSC_S0 in the form of absolute addresses in the binary image in NSC memory, and we don't want any other (or at least not too much) code than the branch instructions into the code in S memory there for security reasons, apart from the mandatory SG instruction.

However, all modules must be valid Oberon modules to be compiled with Astrobe – which they are, if you check out modules NSC_S0 and NS_S0 above.

Let's look at the standard way of calling a procedure first, eg. when compiling NS0 and S0 into a single binary.

NS0       +---------------------------------------------------+
          | ...                                               |
          | set up procedure args in registers                |
          | call procedure in S0 via BL                       |-----+
          | return address (will be in LR via BL)             |<----------+
          | ....                                              |     |     |
          +---------------------------------------------------+     |     |
                                                                    |     |
S0        +---------------------------------------------------+     |     |
          | prologue (for non-leaf procedures):               |     |     |
          | push args in regs and LR onto stack               |<----+     |
          | sub sp,#n to make space for local vars on stack   |           |
          +---------------------------------------------------+           |
          | procedure body                                    |           |
          +---------------------------------------------------+           |
          | epilogue:                                         |           |
          | add sp,#m, leaving only LR value on stack         |           |
          | pop LR value into PC                              |-----------+
          +---------------------------------------------------+

With the Non-secure to Secure call, ie. with NS_S0 in Non-secure memory and linked with NS0, and with NSC_S0 in Non-secure Callable memory, and with the compiler not yet supporting Secure code compilation, ie. without modifications to procedure prologues and epilogues, we need the following behaviour:

NS0       +---------------------------------------------------+
          | ...                                               |
          | set up procedure args in registers                |
          | call procedure in S0 via BL                       |-----+
          | return address (will be in LR via BL)             |<----------+
          | ....                                              |     |     |
          +---------------------------------------------------+     |     |
                                                                    |     |
NS_S0     +---------------------------------------------------+     |     |
          | prologue:                                         |     |     |
          | push args in regs and LR onto stack               |<----+     |
          | note: no local variables                          |           |
          +---------------------------------------------------+           |
          | add sp,#n to reverse the push operations          |           |
          | branch to address in NSC_S0 via bx r11            |-----+     |
          | note: LR still contains return address in NS0     |     |     |
          +---------------------------------------------------+     |     |
          | epilogue:                                         |     |     |
          | add sp,#m, leaving only LR value on stack         |     |     |
          | pop LR value into PC                              |     |     |
          +---------------------------------------------------+     |     |
                                                                    |     |
NSC_S0    +---------------------------------------------------+     |     |
          | prologue:                                         |     |     |
          | push args in regs and LR onto stack               |     |     |
          | note: no local variables                          |     |     |
          +---------------------------------------------------+     |     |
          | secure gateway instruction (SG), modifies LR      |<----+     |
          | branch to address in S0 via bx r11                |-----+     |
          | note: LR contains return address, modified by SG  |     |     |
          +---------------------------------------------------+     |     |
          | epilogue:                                         |     |     |
          | pop LR value into PC                              |     |     |
          +---------------------------------------------------+     |     |
                                                                    |     |
S0        +---------------------------------------------------+     |     |
          | prologue:                                         |     |     |
          | push args in regs and LR onto stack               |<----+     |
          | sub sp,#n to make space for local vars on stack   |           |
          +---------------------------------------------------+           |
          | procedure body                                    |           |
          +---------------------------------------------------+           |
          | manually inserted epilogue:                       |           |
          | add sp,#m, leaving only LR on stack               |           |
          | pop LR value into LR                              |           |
          | bxns lr: uses LR value as modified by SG          |-----------+
          +---------------------------------------------------+
          | epilogue:                                         |
          | add sp,#m, leaving only LR on stack               |
          | pop LR value into PC                              |
          +---------------------------------------------------+

Call chain:

NS0 sets up the procedure arguments in registers as usual.
NS0 calls the procedure via BL (or BLX) as usual.
In NS_S0, the procedure executes its standard prologue.
However, the arguments and LR are now on the Non-secure stack, but we want the procedure in S0 to access them on the Secure stack. Hence, we reverse the stack actions of the Non-secure prologue by adding number-of-pushed-registers * 4 to the stack pointer. A Secure compiler could simply not insert the standard prologue here.
The procedure in NS_S0 then branches to the entry point in NSC_S0 in NSC memory.
Important: the LR still contains the return address in NS0.
In NSC_S0, the prologue is not executed, that is, the procedure in NS_S0 branches to the body of the procedure in NSC_S0, indicated by the relative address aa in the annotation (*!addr_nsc aa *).
As required by the TrustZone rules, the first instruction in NSC_S0 is Secure Gateway SG, which finds the return address in LR. Since we're testing in Non-secure memory for now, this instruction is commented out, otherwise we get an MCU fault.
After SG has done its magic, the procedure branches to its equivalent in S0.
Here, the registers still contain the procedure arguments, and they are pushed onto the Secure stack, together with LR, so the Secure procedure can execute without any modification. That is, the branch target address in S0 is the procedure's prologue. Hence, the address aa in the annotation (*!addr_s aa *) in NSC_S0 is the relative address of the procedure prologue in S0.
The procedure in S0 executes its body. As usual, the return address in NS0 (LR modified by SG) is safe on the stack, ie. the procedure in S0 can do procedure calls (see procedure call to testLR in module S0, which verifies that).
The procedure in S0 executes its manually inserted epilogue. A Secure compiler would use that epilogue in the first place, so no need for any manual adjustment.
The Secure epilogue pops the return address (LR value) back into LR.
bxns lr returns to NS0, undoing the modifications by SG, hence from NS0's point of view, we have a normal procedure return. In the current Non-secure testing set-up, we use bx lr, since bxns must only be executed from Secure memory.
Any RETURN value is stored in register R0, and thus available to NS0. Also, since Secure code can access Non-secure memory, any VAR parameters can be modified from S0.
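
The stack fix in NS_S0 described in the call chain above (number-of-pushed-registers * 4) can be reproduced with a short sketch. It assumes that a non-leaf procedure pushes its argument registers plus LR, while a leaf procedure (declared with PROCEDURE*) pushes only its argument registers; this rule is inferred from the generated add sp,#n values, not from compiler documentation, and the function names are hypothetical.

```python
def add_sp(n_bytes: int) -> int:
    """Encode the 16-bit Thumb instruction add sp,#n_bytes."""
    if n_bytes % 4 != 0 or not 0 <= n_bytes <= 508:
        raise ValueError("immediate must be a multiple of 4 in 0..508")
    return 0xB000 | (n_bytes // 4)


def ns_stack_fix(n_arg_regs: int, is_leaf: bool) -> int:
    """Return the add sp,#n encoding that undoes the prologue pushes in NS_S0."""
    # Assumption: leaf procedures do not push LR.
    pushed = n_arg_regs + (0 if is_leaf else 1)
    return add_sp(4 * pushed)


# (procedure, argument registers, leaf?) as declared in module S0
for name, n_args, leaf in (("ToggleLED", 1, True),
                           ("SetBits2", 3, False),
                           ("Test", 2, False)):
    print(f"{name}: SYSTEM.EMITH(0{ns_stack_fix(n_args, leaf):04X}H)")
```

The results agree with the add sp,#4, add sp,#16, and add sp,#12 instructions in the generated NS_S0 listing.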

We have some dead code in this experimental set-up:

epilogue in NS_S0;
prologue and epilogue in NSC_S0;
epilogue in S0

The prologue in NS_S0 is technically not dead code, since it is executed, but its effects need to be undone, as described above.

A Secure compiler could avoid generating code for these segments, or adjust them (epilogue in S0).

Call Branch Addresses Fix-up

Remember that we only have relative addresses in the modules that are being branched to at this moment, but obviously we need absolute ones. makesec0 make has determined these relative addresses, based on the schema above.

For the absolute addresses, we need to know the load addresses of modules NSC_S0 and S0, and for this we need to compile and link the two images, after which we can run a fix-up routine to insert the correct branch addresses in the code of NS_S0 and NSC_S0.

Here the experiment gets a bit tedious, since of course we need three different Astrobe config files for the three images for NS0 plus NS_S0, for NSC_S0, and for S0, and we need to switch to the correct configuration for each build run. This process could be further automated.

The three config files used are in the repository's project directory.

So, as the next step, we need to build the two Secure images, using the correct config file for each. To also build the Non-secure image, the import of S0 in module NS0, originally used above to build a single binary, now needs to refer to NS_S0: S0 := NS_S0.

Then we can run

> python -m makesec0 fixup s0.mod 008000C50 008000848

where:

008000C50: load address of module S0,
008000848: load address of module NSC_S0,

as found from the corresponding .map files. Again, reading these values from the .map file could be automated in makesec0 fixup.

After running the above command, modules NS_S0 and NSC_S0 have been patched with the correct addresses in the SYSTEM.DATA statements.

MODULE NSC_S0;
(* generated, do not edit *)
IMPORT SYSTEM, Main;

PROCEDURE* ToggleLED*(VAR leds: SET);
BEGIN
(* SYSTEM.EMIT(0E97FE97FH); *) (* SG *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
(*!addr_s 12 *) SYSTEM.DATA(008000C5DH); (* target address *)
END ToggleLED;

PROCEDURE SetBits2*(pin, addr, twoBitValue: INTEGER);
BEGIN
(* SYSTEM.EMIT(0E97FE97FH); *) (* SG *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
(*!addr_s 44 *) SYSTEM.DATA(008000C7DH); (* target address *)
END SetBits2;

PROCEDURE Test*(x, v: INTEGER);
BEGIN
(* SYSTEM.EMIT(0E97FE97FH); *) (* SG *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
(*!addr_s 136 *) SYSTEM.DATA(008000CD9H); (* target address *)
END Test;

END NSC_S0.
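
The patched values in the listing above are consistent with a simple rule: load address of the target module, plus the relative address from the annotation, with bit 0 set so that bx stays in Thumb state. The following is a minimal Python sketch of that arithmetic only. The names branch_target and oberon_hex are hypothetical, and this is not the actual code of makesec0 fixup.

```python
THUMB_BIT = 1  # bit 0 set marks the bx target as Thumb code


def branch_target(load_addr: int, rel_addr: int) -> int:
    """Absolute value for a SYSTEM.DATA slot (hypothetical helper).

    load_addr: load address of the target module, from its .map file
    rel_addr:  relative address from the (*!addr_s ... *) annotation
    """
    return (load_addr + rel_addr) | THUMB_BIT


def oberon_hex(value: int) -> str:
    """Format a value as a nine-digit Oberon hex literal, as in the listing."""
    return f"{value:09X}H"


S0_LOAD = 0x08000C50  # load address of S0, as passed to the fixup command

# Relative addresses taken from the annotations in NSC_S0.
for name, rel in (("ToggleLED", 12), ("SetBits2", 44), ("Test", 136)):
    print(name, oberon_hex(branch_target(S0_LOAD, rel)))
```

Running this reproduces the three SYSTEM.DATA values shown in NSC_S0, which is a quick way to cross-check a fix-up run against the .map files.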

MODULE NS_S0;
(* generated, do not edit *)
IMPORT SYSTEM;

PROCEDURE* ToggleLED*(VAR leds: SET);
BEGIN
SYSTEM.EMITH(0B001H); (* add sp,#4, fix stack *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
SYSTEM.ALIGN; (* word alignment *)
(*!addr_nsc 6 *) SYSTEM.DATA(00800084FH); (* target address *)
END ToggleLED;

PROCEDURE SetBits2*(pin, addr, twoBitValue: INTEGER);
BEGIN
SYSTEM.EMITH(0B004H); (* add sp,#16, fix stack *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
SYSTEM.ALIGN; (* word alignment *)
(*!addr_nsc 22 *) SYSTEM.DATA(00800085FH); (* target address *)
END SetBits2;

PROCEDURE Test*(x, v: INTEGER);
BEGIN
SYSTEM.EMITH(0B003H); (* add sp,#12, fix stack *)
SYSTEM.EMIT(0F8DFB004H); (* ldr.w r11,[pc,#4] *)
SYSTEM.EMITH(04758H); (* bx r11 *)
SYSTEM.ALIGN; (* word alignment *)
(*!addr_nsc 38 *) SYSTEM.DATA(00800086FH); (* target address *)
END Test;

END NS_S0.
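
The first halfword emitted in each NS_S0 procedure is the 16-bit Thumb instruction add sp,#imm, which encodes the byte count divided by four in its low seven bits. As a small illustration, the following Python sketch encodes and decodes that instruction. The helper names are hypothetical, and the byte counts are simply those that appear in the listing above.

```python
ADD_SP_BASE = 0xB000  # Thumb 'add sp, #imm': 1011 0000 0 imm7


def encode_add_sp(nbytes: int) -> int:
    """Encode 'add sp, #nbytes' as a 16-bit Thumb opcode (hypothetical helper)."""
    if nbytes % 4 != 0 or not 0 <= nbytes <= 508:
        raise ValueError("immediate must be a multiple of 4 in 0..508")
    return ADD_SP_BASE | (nbytes >> 2)


def decode_add_sp(opcode: int) -> int:
    """Return the byte count encoded in an 'add sp, #imm' opcode."""
    if opcode & 0xFF80 != ADD_SP_BASE:
        raise ValueError("not an 'add sp, #imm' opcode")
    return (opcode & 0x7F) << 2


# Byte counts as used in the NS_S0 listing.
for nbytes in (4, 16, 12):
    print(f"add sp,#{nbytes}: {encode_add_sp(nbytes):05X}H")
```

The printed opcodes match the SYSTEM.EMITH arguments of ToggleLED, SetBits2, and Test, in that order.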

Final Images, ELF File

Since we do all this at the source code level, we need to compile and link the images again after the fix-up, using the correct Astrobe config for each build. We could patch the binaries directly in makesec0 fixup, but remember, this is an experiment at this stage. :)

The ELF file to load is created running

> python -m make-elf ns0.bin:8000000 ../nsc/nsc_s0.bin:8000600 ../s/s0.bin:8000A00

in directory ns. When determining the binary sizes and load addresses, be aware that the .map file does not list the code that initialises all modules, which sits between the end of the module code and the resource data.

To check the ELF file we can use the standard utility readelf, and we recognise the three binary images:

> readelf -S -l ns0.elf

Partial output:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .NS0              PROGBITS        08000000 0001ac 0004dc 00  AX  0   0  4
  [ 2] .NSC_S0           PROGBITS        08000600 000688 00034c 00  AX  0   0  4
  [ 3] .S0               PROGBITS        08000a00 0009d4 000404 00  AX  0   0  4
  [ 4] .strtab           STRTAB          00000000 000e05 000002 00      0   0  0
  [ 5] .symtab           SYMTAB          00000000 000e07 000010 10      4   2  0
  [ 6] .shstrtab         STRTAB          00000000 000dd8 00002d 00      0   0  0

Elf file type is EXEC (Executable file)
Entry point 0x8000379
There are 3 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x0001ac 0x08000000 0x08000000 0x004dc 0x004dc R E 0x1000
  LOAD           0x000688 0x08000600 0x08000600 0x0034c 0x0034c R E 0x1000
  LOAD           0x0009d4 0x08000a00 0x08000a00 0x00404 0x00404 R E 0x1000

Note the entry point in the Non-secure program NS0 at address 8000379H.

With real Secure/Non-secure separation – stay tuned for part 2 – this will need to be the Secure program S0.
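
The section headers above can also be cross-checked mechanically: the three images must not overlap, and the entry point must lie within one of them. Here is a minimal Python sketch using the addresses and sizes from the readelf output. The names Image, overlaps, and containing are made up for this illustration and are not part of make-elf.

```python
from typing import NamedTuple, Optional


class Image(NamedTuple):
    name: str
    addr: int  # load address, from the Addr column
    size: int  # size in bytes, from the Size column

    @property
    def end(self) -> int:
        """First address after the image."""
        return self.addr + self.size


# Values copied from the section headers in the readelf output.
IMAGES = [
    Image(".NS0", 0x08000000, 0x4DC),
    Image(".NSC_S0", 0x08000600, 0x34C),
    Image(".S0", 0x08000A00, 0x404),
]
ENTRY = 0x08000379  # entry point reported by readelf


def overlaps(images):
    """Return name pairs of address-adjacent images that overlap."""
    ordered = sorted(images, key=lambda im: im.addr)
    return [(a.name, b.name) for a, b in zip(ordered, ordered[1:]) if a.end > b.addr]


def containing(images, address: int) -> Optional[str]:
    """Return the name of the image holding address, ignoring the Thumb bit."""
    address &= ~1
    for im in images:
        if im.addr <= address < im.end:
            return im.name
    return None


print(overlaps(IMAGES))           # empty list: no overlaps
print(containing(IMAGES, ENTRY))  # .NS0
```

A check like this catches a load address chosen too low for the preceding image, for example when the initialisation code not listed in the .map file was not accounted for.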

Memory Layout

After loading the ELF file, we have the following memory layout. Lower addresses are at the bottom of the figure. The Non-secure flash memory starts at 08000000H.

+-----------------+ 08000E20H
| resources       |
+-----------------+
| init sequence   |
+-----------------+
| S0              |
+-----------------+
| MCU2            |
+-----------------+
| Main (S)        |
+-----------------+
| link parameters |
+-----------------+
| unused          |
+-----------------+
| entry address   |
| initial SP      |
+-----------------+ 08000A00H

+-----------------+ 08000934H
| resources       |
+-----------------+
| init sequence   |
+-----------------+
| NSC_S0          |
+-----------------+
| Main (NSC)      |
+-----------------+
| link parameters |
+-----------------+
| unused          |
+-----------------+
| entry address   |
| initial SP      |
+-----------------+ 08000600H

+-----------------+ 080004C0H
| resources       |
+-----------------+
| init sequence   |
+-----------------+
| NS0             |
+-----------------+
| NS_S0           |
+-----------------+
| MCU2            |
+-----------------+
| Main (NS)       |
+-----------------+
| link parameters |
+-----------------+
| unused          |
+-----------------+
| entry address   |
| initial SP      |
+-----------------+ 08000000H

It should be evident that the NSC image does not actually require any of the segments other than NSC_S0.

SRAM Allocation for Data

If you check the Astrobe config files used, you'll realise that all images have the same SRAM allocation (data range). With S0 (and NSC_S0) actually running in the Secure state, we would allocate the data ranges in Secure SRAM, and the MCU would take care of switching the stack pointer accordingly as soon as we enter NSC_S0. While this could be emulated in this experiment using explicit code, I have omitted this complication, since it will be a non-issue going forward.

Conclusions

This proof-of-concept experiment suggests that the separation of programs into Secure and Non-secure parts, with the interaction in between as required by ARM's TrustZone, can be implemented using Oberon and the Astrobe tools, albeit using a slightly convoluted process and external tools, simply because Astrobe does not yet support Secure compilation and linking.

That is, I am using Astrobe beyond its purpose and specifications. :) Nonetheless, Astrobe has again proven to be a reliable and versatile tool that allows – and endures – pushing its boundaries. As simple as they may appear on the surface, SYSTEM.EMIT and SYSTEM.DATA are utterly powerful concepts.

The tests have been run in a Non-secure environment so far, hence the implementation in a proper Secure environment still needs to be demonstrated and verified.

I'll try to make some observations regarding possible changes or extensions to the Astrobe tools in part 2.

Repository

lib/v3.0 <repo>/examples/v3.0/stm/u585i-iot

It's the "luxury" version of the MVP that already uses definitions in module MCU2. :)
In case you're eagle-eyed, you note that it's bx lr, not bxns lr as explained before. As our experiment still takes place in Non-secure memory, bxns lr would be unpredictable, since it must only be executed from Secure memory.

Last updated: 14 January 2026