What do you mean Unsafe isn't safe?

What are the risks and benefits of implementing unsafe rust code?

One of a billion possible answers

Implementing unsafe Rust code involves directly bypassing the safety guarantees enforced by the Rust compiler.
These safety guarantees are designed to prevent a wide range of common bugs and security vulnerabilities, such as null pointer dereferencing, buffer overflows, and data races in concurrent code.
However, there are times when using unsafe is necessary.

Time to lay out the risks and benefits of unsafe Rust code:

Benefits

1. Performance Optimization:

Unsafe code can be used to implement low-level optimizations that are not possible with safe Rust.
For example, you can manage memory manually to control data layout or build efficient low-level data structures, potentially leading to significant performance improvements.
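As a small illustration of the kind of optimization meant here, the sketch below sums a slice while skipping the per-element bounds check. The function name sum_unchecked is made up for this example, and the optimizer often removes the check from the safe version anyway, so treat this as a sketch to measure rather than a guaranteed win.

```rust
/// Sums a slice without per-element bounds checks.
/// `sum_unchecked` is a hypothetical name used only for this sketch.
pub fn sum_unchecked(values: &[u64]) -> u64 {
    let mut total = 0u64;
    for i in 0..values.len() {
        // SAFETY: `i` comes from `0..values.len()`, so it is always in bounds.
        total += unsafe { *values.get_unchecked(i) };
    }
    total
}
```

The unsafe block is one line long and the function itself stays safe to call, because the loop range proves the index is valid.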

2. Interoperability with Other Languages:

FFI is where it is at. Unsafe is often necessary when interfacing Rust code with other programming languages, such as C or C++.
This includes calling functions from C libraries, working with data structures expected by these libraries, or implementing callbacks.
This interoperability is essential for using existing libraries that are not available in Rust.
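One common FFI chore is turning a C string pointer handed over by a library into something Rust can own. The sketch below assumes the pointer is either null or points at a valid NUL-terminated string; the function name c_string_to_owned is hypothetical and not part of any library.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Copies a NUL-terminated C string into an owned Rust `String`.
/// Returns `None` for a null pointer or for bytes that are not valid UTF-8.
/// `c_string_to_owned` is a hypothetical helper for this sketch.
///
/// # Safety
/// If `ptr` is non-null, it must point to a valid NUL-terminated string
/// that stays alive and unmodified for the duration of the call.
pub unsafe fn c_string_to_owned(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // SAFETY: `ptr` is non-null and the caller guarantees it is a valid C string.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_str().ok().map(str::to_owned)
}
```

The compiler cannot verify anything about memory that came from the C side, which is why the contract lives in the Safety comment instead of in the type system.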

3. Access to Hardware and OS Features:

Some low-level system functionalities or hardware instructions are not directly accessible through safe Rust.
Unsafe code is required to call these system APIs or perform certain operations, such as inline assembly, that are necessary for systems programming, embedded systems development, or writing operating system kernels.
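Inline assembly is architecture-specific, so here is a more portable sketch of the same idea: a volatile read-modify-write of the kind used for memory-mapped device registers. The function name set_bit is hypothetical, and in the usage an ordinary variable stands in for the register; real hardware would supply a fixed address from its datasheet.

```rust
use std::ptr;

/// Sets bit number `bit` in a device-style register using volatile accesses,
/// so the compiler cannot reorder or drop the read and the write.
/// `set_bit` is a hypothetical helper for this sketch.
///
/// # Safety
/// `reg` must be non-null, aligned, and valid for reads and writes of a `u32`.
/// `bit` must be less than 32.
pub unsafe fn set_bit(reg: *mut u32, bit: u32) {
    // SAFETY: the caller guarantees `reg` is valid for a volatile read.
    let current = unsafe { ptr::read_volatile(reg) };
    // SAFETY: the caller guarantees `reg` is valid for a volatile write.
    unsafe { ptr::write_volatile(reg, current | (1 << bit)) };
}
```

Calling it on a local variable, for example `let mut fake = 0b0001u32; unsafe { set_bit(&mut fake, 3) };`, leaves fake equal to 0b1001.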

Risks

1. Memory Safety Violations:

The primary risk of using unsafe code is the potential for memory safety violations.
This includes dereferencing null or dangling pointers, violating the borrowing rules (leading to data races in concurrent code), and buffer overflows. Such issues can lead to crashes, undefined behavior, and security vulnerabilities.

2. Increased Complexity and Maintenance Hurdles:

Unsafe code is a royal PITA and requires a deep understanding of Rust's ownership and borrowing rules, as well as how the compiler enforces these rules.
It can make codebases more difficult to understand, reason about, and maintain, especially for those who are not as familiar with these concepts.

3. Security Risks:

Security vulnerabilities are a significant concern with unsafe code.
You are now back in the place you wanted to avoid by using Rust in the first place.
Memory safety violations can be exploited by attackers to execute arbitrary code, leading to potential security breaches.
This is particularly critical in software that processes untrusted input or runs in security-sensitive contexts.
In summary, the memory issue land mine is still waiting for you.

4. Compromises Rust's Guarantees:

One of Rust's key selling points is its promise of memory safety without sacrificing performance.
Unsafe code suspends that guarantee wherever you use it, putting the onus on you to ensure it won't blow up when you least expect it to.

Best Practices

Minimize Use of Unsafe Code (if you can):

Limit the use of unsafe to cases where it is absolutely necessary.
For everything else, prefer safe Rust abstractions if possible. After all, using unsafe code is deliberate.
No one ever said "I accidentally wrote it that way. I thought unsafe meant something else."

Isolate Unsafe Code (if you can):

Encapsulate unsafe code in small, well-defined modules or functions with a safe interface.
This containment strategy helps limit the potential impact of bugs.
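The classic illustration of this containment idea is splitting one mutable slice into two non-overlapping halves, which the borrow checker rejects in safe code. The sketch below follows the well-known pattern behind the standard library's split_at_mut; the name split_at_mut_sketch is made up to avoid suggesting this is the library's actual source.

```rust
use std::slice;

/// Splits a mutable slice into two non-overlapping mutable halves.
/// The function is safe to call: the assertion enforces the only
/// precondition the unsafe block relies on.
pub fn split_at_mut_sketch<T>(values: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();
    assert!(mid <= len, "mid out of bounds");
    // SAFETY: `mid <= len`, so both ranges lie inside the original slice,
    // and they do not overlap, so no element is aliased mutably.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

Callers never see a raw pointer. If there is a bug, it is in these few lines and nowhere else.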

Document and Review as if your life depended upon it:

Explain the rationale for using unsafe, the invariants that are being relied upon for safety, and any preconditions that must be met.
Code reviews are particularly important for unsafe blocks to ensure that the safety invariants are correctly understood and upheld.
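A common convention for this, sketched below, is a "Safety" section in the doc comment that states the caller's obligations and a SAFETY comment on each unsafe block that says why the operation is sound. The function read_u32_prefix is a hypothetical example, and the debug_assert is an optional extra that catches contract violations in debug builds.

```rust
/// Reads a native-endian `u32` from the first four bytes of `bytes`.
/// `read_u32_prefix` is a hypothetical helper for this sketch.
///
/// # Safety
/// `bytes.len()` must be at least 4.
pub unsafe fn read_u32_prefix(bytes: &[u8]) -> u32 {
    // Cheap check of the documented precondition in debug builds only.
    debug_assert!(bytes.len() >= 4, "precondition violated: need at least 4 bytes");
    // SAFETY: the caller guarantees at least 4 readable bytes, and
    // `read_unaligned` places no alignment requirement on the pointer.
    unsafe { bytes.as_ptr().cast::<u32>().read_unaligned() }
}
```

A reviewer can now check two things separately: that the SAFETY comment really follows from the documented contract, and that every call site really meets that contract.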

Wrapping it up:

Unsafe can allow you to do everything you wanted to do in C or C++ but allows you to claim (to the impressionable): "Rust is safe. See, the website says so."
Unsafe code is essentially tickling a dragon's tail: you can do it, but you had better know what you are doing or you could end up in a world of trouble.

CODE

An unsafe example (that I did not write) of something you can't do in safe Rust


use std::mem;
use std::ptr;

struct UnsafeFixedArray<T> {
    data: *mut T, // Pointer to the data
    capacity: usize, // Maximum number of elements
}

impl<T> UnsafeFixedArray<T> {
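    // Creates an empty UnsafeFixedArray with the specified capacity. Elements are uninitialized.
    //
    // Safety: every slot must be written with `set` before it is read with `get`
    // and before the array is dropped, because `drop` runs the destructor of all
    // `capacity` slots.
    unsafe fn new(capacity: usize) -> Self {
        // The global allocator must not be asked for a zero-sized allocation
        assert!(mem::size_of::<T>() != 0, "zero-sized types are not supported");
        let data = if capacity == 0 {
            ptr::null_mut()
        } else {
            // Allocate memory without initializing it
            let layout = std::alloc::Layout::array::<T>(capacity).unwrap();
            let raw = std::alloc::alloc(layout) as *mut T;
            if raw.is_null() {
                // Allocation failed: abort through the standard handler
                std::alloc::handle_alloc_error(layout);
            }
            raw
        };

        UnsafeFixedArray { data, capacity }
    }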

    // Writes a value into the array at the specified index. The index is bounds-checked;
    // the slot's previous contents are not dropped.
    unsafe fn set(&mut self, index: usize, value: T) {
        if index >= self.capacity {
            panic!("Index out of bounds");
        }
        // Place the value into the memory location directly, without running the destructor of the old value
        // (overwriting an already initialized slot therefore leaks the old value)
        ptr::write(self.data.add(index), value);
    }

    // Reads a value from the array at the specified index. The index is bounds-checked,
    // but the slot must already have been initialized with `set`.
    unsafe fn get(&self, index: usize) -> &T {
        if index >= self.capacity {
            panic!("Index out of bounds");
        }
        // Dereference the pointer to get the value (undefined behavior if the slot is uninitialized)
        &*self.data.add(index)
    }
}

impl<T> Drop for UnsafeFixedArray<T> {
    fn drop(&mut self) {
        // SAFETY: assumes the caller initialized all `capacity` slots with `set`;
        // dropping an uninitialized slot is undefined behavior.
        unsafe {
            if !self.data.is_null() {
                // Manually drop the elements of the array
                for i in 0..self.capacity {
                    ptr::drop_in_place(self.data.add(i));
                }
                // Deallocate the memory with the same layout used to allocate it
                let layout = std::alloc::Layout::array::<T>(self.capacity).unwrap();
                std::alloc::dealloc(self.data as *mut u8, layout);
            }
        }
    }
}

This code and the remaining text hopefully illustrate an UnsafeFixedArray struct that uses raw pointers for data storage, allowing manual control over memory allocation, initialization, and deallocation. ChatGPT wrote this for me as I didn't have the time.
The new function allocates memory without initializing it, which can lead to performance gains in scenarios where initialization is not needed or can be deferred.
The set and get methods check the index against the capacity and panic when it is out of range, so a buffer overflow is not the danger here. What nothing checks is whether a slot has been initialized: calling get on a slot that was never set reads uninitialized memory, which is undefined behavior, and that is what makes the code unsafe.
The Drop trait implementation deallocates the memory when the UnsafeFixedArray goes out of scope, preventing memory leaks. It also drops all capacity slots, so it is only sound if every one of them was initialized with set first.
This example (Thank you ChatGPT!) is a demonstration of how unsafe Rust can be used to implement low-level data structures with manual memory management for optimization purposes. Using such unsafe code requires a deep understanding of Rust's ownership and borrowing rules, as well as careful management of memory safety and lifetime invariants to avoid introducing bugs and vulnerabilities into the codebase.
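The Drop implementation in the listing runs destructors for every slot, initialized or not. One way around that hazard, sketched here under the assumption that elements are only ever appended, is to track how many slots have been written and keep the unsafe parts behind a safe interface. The name TrackedFixedArray and its methods are hypothetical; this is a sketch, not a replacement for Vec.

```rust
use std::mem::MaybeUninit;

/// Fixed-capacity array that remembers how many slots are initialized.
/// `TrackedFixedArray` is a hypothetical type for this sketch.
pub struct TrackedFixedArray<T> {
    slots: Box<[MaybeUninit<T>]>,
    len: usize, // invariant: slots[..len] are initialized
}

impl<T> TrackedFixedArray<T> {
    /// Allocates `capacity` uninitialized slots.
    pub fn new(capacity: usize) -> Self {
        let slots: Box<[MaybeUninit<T>]> =
            (0..capacity).map(|_| MaybeUninit::uninit()).collect();
        TrackedFixedArray { slots, len: 0 }
    }

    /// Appends a value, handing it back as `Err` when the array is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == self.slots.len() {
            return Err(value);
        }
        self.slots[self.len].write(value);
        self.len += 1;
        Ok(())
    }

    /// Returns the element at `index`, or `None` if that slot was never written.
    pub fn get(&self, index: usize) -> Option<&T> {
        if index < self.len {
            // SAFETY: slots[..len] were initialized by `push`.
            Some(unsafe { self.slots[index].assume_init_ref() })
        } else {
            None
        }
    }
}

impl<T> Drop for TrackedFixedArray<T> {
    fn drop(&mut self) {
        for slot in &mut self.slots[..self.len] {
            // SAFETY: only the first `len` slots are initialized, and each
            // one is dropped exactly once here.
            unsafe { slot.assume_init_drop() };
        }
        // The Box frees the backing memory after this returns.
    }
}
```

The trade-off is one extra usize per array and a length comparison on each access, in exchange for methods that no longer need the unsafe keyword in their signatures.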

The C Version (also not written by me)


#include <stdlib.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    void* data; // Pointer to the data
    size_t capacity; // Maximum number of elements
    size_t element_size; // Size of each element
} UnsafeFixedArray;

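// Initializes the array with a specified capacity and size of each element.
// Returns NULL if allocation fails or capacity * element_size would overflow.
UnsafeFixedArray* unsafe_fixed_array_new(size_t capacity, size_t element_size) {
    // (size_t)-1 is the largest size_t value; reject sizes that would wrap around
    if (element_size != 0 && capacity > (size_t)-1 / element_size) {
        return NULL;
    }
    UnsafeFixedArray* array = (UnsafeFixedArray*)malloc(sizeof(UnsafeFixedArray));
    if (array == NULL) {
        return NULL;
    }
    if (capacity == 0) {
        array->data = NULL;
    } else {
        array->data = malloc(capacity * element_size);
        if (array->data == NULL) {
            free(array);
            return NULL;
        }
    }
    array->capacity = capacity;
    array->element_size = element_size;
    return array;
}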

// Sets a value in the array at the specified index
void unsafe_fixed_array_set(UnsafeFixedArray* array, size_t index, void* value) {
    if (index >= array->capacity) {
        fprintf(stderr, "Index out of bounds\n");
        exit(EXIT_FAILURE);
    }
    memcpy((char*)array->data + index * array->element_size, value, array->element_size);
}

// Gets a pointer to the value in the array at the specified index
void* unsafe_fixed_array_get(UnsafeFixedArray* array, size_t index) {
    if (index >= array->capacity) {
        fprintf(stderr, "Index out of bounds\n");
        exit(EXIT_FAILURE);
    }
    return (char*)array->data + index * array->element_size;
}

// Frees the memory allocated for the array (a NULL array is ignored)
void unsafe_fixed_array_free(UnsafeFixedArray* array) {
    if (array == NULL) {
        return;
    }
    free(array->data);
    free(array);
}

Dwight J. Browne