Rust for IoT - Simple Input Output

by Alexandru Radovici  (3 years ago)

Simple input and output code is one of the main entry points attackers use to gain access to IoT systems.

Our first lesson in Rust is how to make input and output secure by default. While C/C++ tries to mitigate risks by emitting warnings to developers and performing run-time checks, Rust performs most of these checks at compile time and does not allow developers to make such mistakes.

While modern C/C++ compilers have most of the tools available to avoid these kinds of errors, they still have to be backward compatible. As there is a large amount of source code that does not follow the best practices, C/C++ compilers are not able to enforce them by default.

On the other hand, Rust, being a new language with a clear focus on security, is able to enforce them.

Print your name

The simple example that we will go through is an application that asks the user to enter their name and prints it back. In C, it would look like this:

#include <stdio.h>
 
#define LEN 100
 
int main () {
   char s[LEN];
   printf ("Enter your name: ");
   gets (s); // Security flaw
   printf (s); // Security flaw
   return 0;
}

Building this example will result in the following:

$ gcc read_write.c -o read_write
read_write.c: In function ‘main’:
read_write.c:8:5: warning: implicit declaration of function ‘gets’; did you mean ‘fgets’? [-Wimplicit-function-declaration]
   8 |     gets (s); // Security flaw
     |     ^~~~
     |     fgets
read_write.c:9:13: warning: format not a string literal and no format arguments [-Wformat-security]
   9 |     printf (s); // Security flaw
     |             ^
/usr/bin/ld: /tmp/cc6Lj3lX.o: in function `main':
read_write.c:(.text+0x39): warning: the `gets' function is dangerous and should not be used.

Now, why is this wrong? Both reading the name with the gets function and passing it to printf as the format string are security flaws. The C compiler warns us about this, but still builds the executable.

gets

This function is so dangerous that its documentation states "Do not use this function" and the linker reports warning: the `gets' function is dangerous and should not be used.

This function reads characters from the console, usually what the user types, until it reaches an end of line character (\n, pressing enter) or an end of file (EOF, there are no more characters to read) and copies them to the s variable. It never checks if those characters fit into the variable.

In our example, s has a capacity of 100 bytes: 99 useful characters plus a '\0' terminator.

Why doesn't gets check this? It can't. It only receives a pointer to where it should copy the characters it reads. A C/C++ pointer stores only an address, not the size of the buffer it points to, so gets has no way of checking this.

So what happens if the read characters do not fit? gets simply overwrites whatever is stored in memory after s without reporting any error. This leads to undefined behavior in the software, meaning it might fail sometimes or might misbehave, depending on what has been overwritten. This paves the way to attacks like buffer overflow, stack smashing, and return-oriented programming.

This function is so dangerous that it is not even declared anymore in the stdio.h file.

So why does the C/C++ compiler still build it? For backward compatibility. Specifying -Werror on the command line turns warnings into errors, but imagine this code included in a much larger project that already produces hundreds of warnings. Most applications will not use this flag.

printf

While gets is marked as dangerous and the C/C++ toolchain does its best to discourage developers from using it, printf is one of the most commonly used functions. Its declaration looks like this:

int printf(const char *format, ...);

The first parameter is a format string, followed by a variable number of parameters, depending on how many %... items are found in the format at runtime. This format is a constant string. We can identify two problems here:

  1. The way C/C++ sends a variable number of parameters means that printf has no idea how many parameters it has actually received;
  2. The first parameter, even though it is marked as constant, can be any string variable, even one read from the keyboard.

So what are the implications of 1 and 2? First, if the number of %... items does not match the number of arguments received, printf has two possible behaviors. If the format has fewer items than arguments, it prints only some of the provided values. If it has more, printf assumes the missing arguments were passed and prints whatever values it finds where they should have been, typically on the stack. The bottom line is that this is incorrect and dangerous, as it leaks information.

The second point is so dangerous that some compilers even emit a warning. Our example reads a string from the keyboard and provides it as the format. If this format contains only normal characters and no %... items, that is fine, it just prints the string. But what if the user enters %... items? The function will start behaving as if it received parameters and print data from the stack. Take a look at the run example below:

$ ./read_write
Enter your name: stack values %p %p %p
stack values 0x562bb0b076b1 (nil) 0x7f31b99e9980

Instead of a name, we entered format items and printf has just printed some information for us.

This is how attackers will use it.

The correct way of writing our example is:

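#include <stdio.h>

#define LEN 100

int main () {
   char s[LEN];
   printf ("Enter your name: ");
   // fgets stores at most LEN - 1 characters plus the '\0' terminator
   if (fgets (s, LEN, stdin) == NULL) {
      return 1; // nothing was read, so s must not be printed
   }
   printf ("%s\n", s); // the format is a literal, s is only data
   return 0;
}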

How is this safe?

The fgets function receives an additional argument: the size of the buffer. This instructs fgets to store no more than LEN - 1 characters, followed by the '\0' terminator, so it never writes past the end of s.

The format is specified as a literal constant that cannot be modified at runtime and the actual string variable is supplied as the first argument to be formatted.

Rust

This is what the same example looks like in Rust:

use std::io;

fn main () {
    let mut s = String::new();
    print! ("Enter your name: ");
    io::stdin().read_line(&mut s).unwrap();
    println! (s);
}

Building this example will result in the following:

$ rustc read_write.rs
error: format argument must be a string literal
 --> read_write.rs:7:15
  |
7 |     println! (s);
  |               ^
  |
help: you might be missing a string literal to format with
  |
7 |     println! ("{}", s);
  |               ^^^^^

error: aborting due to previous error

We made the same mistake as in the previous example. We just asked println! to print a string directly. In Rust, this is an error and the compiler even suggests how to do it properly.

The correct implementation of the print example is this:

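use std::io::{self, Write};

fn main () {
    let mut s = String::new();
    print! ("Enter your name: ");
    // stdout is line-buffered, so flush it to show the prompt before reading
    io::stdout().flush().unwrap();
    io::stdin().read_line(&mut s).unwrap();
    println! ("{}", s);
}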

How is this safer in Rust?

Rust uses references and slices instead of raw pointers. A slice is a reference to a portion of a string or an array but, unlike a raw pointer, it stores both the address of that portion and its length. A function receiving a slice as an argument is able to find out how long it is. This is one of the reasons why Rust's input functions know how much space they are allowed to write to.
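
To make the difference concrete, here is a minimal sketch of a copy routine that takes slices. The helper name copy_bounded is only an illustration, not a standard library function. Because the destination slice carries its length, the helper can stop before it would write past the end of the buffer.

```rust
/// Hypothetical helper: copies as many bytes of `input` as fit into `buf`
/// and returns how many were copied. It cannot write past the end of
/// `buf`, because the slice carries its own length.
fn copy_bounded(buf: &mut [u8], input: &[u8]) -> usize {
    let n = input.len().min(buf.len());
    buf[..n].copy_from_slice(&input[..n]);
    n
}

fn main() {
    let mut buf = [0u8; 8];
    let copied = copy_bounded(&mut buf, b"a very long name");
    println!("copied {} of {} bytes", copied, buf.len());

    // Out-of-range access is checked as well: `get` returns None
    // instead of reading neighbouring memory.
    println!("{:?}", buf.get(100));
}
```

Indexing with buf[100] instead of buf.get(100) would also be checked: the program panics rather than reading memory that does not belong to the buffer.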

The second reason is that Rust has a String type. This is not just a simple character array; it also stores additional data like capacity and length. This allows read_line to allocate as much space as needed for the input string. Now pay attention to the unwrap part. We will discuss this in a further post, but for now, it behaves in the following way: if for some reason read_line encounters a problem and is not able to read a string (for example, an I/O error or input that is not valid UTF-8), unwrap will panic the program. This means it stops the program and prints an error. This is very important: if the program continued to run, the value of s would not be what the developer expects and the program might misbehave.
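
As a rough sketch of what handling the result explicitly could look like instead of calling unwrap, the example below matches on the value returned by read_line. The helper read_name and the 99-byte limit are assumptions made for illustration. The limit mirrors the C buffer and is applied with the standard take adapter, which may be useful on a small device where an unbounded String is not desirable.

```rust
use std::io::{self, BufRead, Read};

/// Hypothetical helper: reads one line from `input`, consuming at most
/// `max_bytes` bytes. Returns Ok(None) when the input is already at
/// end of file.
fn read_name<R: BufRead>(input: R, max_bytes: u64) -> io::Result<Option<String>> {
    let mut s = String::new();
    // `take` caps how many bytes read_line may pull in, so the String
    // cannot grow without limit.
    let read = input.take(max_bytes).read_line(&mut s)?;
    if read == 0 {
        return Ok(None);
    }
    // read_line keeps the trailing newline; strip it.
    Ok(Some(s.trim_end().to_string()))
}

fn main() {
    match read_name(io::stdin().lock(), 99) {
        Ok(Some(name)) => println!("Hello, {}", name),
        Ok(None) => eprintln!("no input"),
        Err(e) => eprintln!("could not read the name: {}", e),
    }
}
```

In this sketch, end of input is reported as Ok(None) rather than as an error, and a limit that cuts a multi-byte character in half surfaces as an error instead of a silently truncated string.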

Printing in Rust uses the println! macro. A Rust macro extends the language's syntax and is expanded at compile time. The way println! is defined instructs the compiler to expect a string literal as its first argument. If we pass a String or a string slice variable instead, the compiler reports an error and refuses to produce the executable. This is not something that we can turn off from the command line. The program is simply wrong: it does not respect the language's rules.
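
A small sketch of the consequence: text that looks like format items is treated as plain data when it is passed as an argument. The greet helper is a made-up name used only for this illustration.

```rust
/// Hypothetical helper that builds the greeting; `name` is plain data.
fn greet(name: &str) -> String {
    // The format string is the literal "Hello, {}!"; whatever `name`
    // contains is copied into the output and never parsed as a format.
    format!("Hello, {}!", name)
}

fn main() {
    // Input that looks like format items, similar to the C run shown earlier.
    let hostile = "%p %p {} {:?}";
    println!("{}", greet(hostile));
}
```

The hostile input comes out character for character, with no stack values attached.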

Moreover, due to the same println! definition, the compiler will count the {} items in the format and the number of arguments supplied. If they do not match, it refuses to compile. This is possible because Rust's formatting machinery parses the format string at compile time, rather than at run time.
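
The following sketch shows the check from the developer's side. The describe helper and the sample values are assumptions for illustration. The two commented-out lines are examples of calls that the compiler is expected to reject if they are uncommented.

```rust
/// Hypothetical helper: two placeholders, two arguments, so it compiles.
fn describe(name: &str, age: u32) -> String {
    format!("{} is {} years old", name, age)
}

fn main() {
    let name = "Ada";
    let age = 36;
    println!("{}", describe(name, age));

    // Rejected at compile time: two placeholders but only one argument.
    // println!("{} is {} years old", name);

    // Rejected at compile time: the second argument is never used.
    // println!("{}", name, age);
}
```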

Why Rust?

We have shown in this post that a simple action like printing a name can be a security issue. Even though some of the newer C/C++ compilers might be able to mitigate these issues by warning the developer or performing some run-time checks, they allow the creation of the executable.

On the other hand, Rust will perform most of the checks at compile time and refuse to produce an executable.