This morning, while I was having breakfast with my delightful morning coffee, I started to wonder how different programming languages handled stack overflows. I was more curious on how interpreted or more "self-aware" languages handled such cases.
This is my collection of what I found out and learned in the process.
No, today I won't be talking about Stack Overflow. But rather about stack overflows.
First off, the "stack" can be explained as a special region of memory that handles "simple memory", which can be automatically created and freed in a scope. Temporary variables from functions are stored there.
The stack has a limited amount of memory it can store defined by the operating system. For example, if I run ulimit -a | grep 'stack size'
on my system, it prints out stack size (kbytes, -s) 2032
. This means that my system reserves approximately 2MB for each program's stack.
So, if we somehow use too much temporary memory, which is stored in the stack, and do not free it, the stack of our program will run out of space to store stuff. The program won't know where to store its temporary memory anymore. Then it just gives up and crashes, causing a stack overflow.
The most common way to cause a stack overflow is through tail recursions. Sure, it may be possible to cause an overflow through other ways, such as initializing a local variable bigger than the stack itself. But that can be easily avoidable and, in my opinion, doesn't serve as a good example.
function x(data) {
var some_other_data = data + 1;
x(some_other_data);
}
Here we have a pseudocode with function x. Function x calls itself to do a repetitive task that accumulates itself. Every time a call is made to the next function x a new fixed space on the stack is created to store memory needed by the next function x. But the problem is that the previous function x memory is still not freed from the stack, since it hasn't ended its execution. It just started the next function x within itself before its end.
This will happen with the next function x, which will not be freed when calling the next next function x, that will also not be freed when the next next next function x gets called before its end. This will happen until the amount of memory created for the multiple function xes adds up and overflow the stack.
C is a great language to demonstrate stack overflows. It doesn't give you any protections or abstractions. It's very easy to do a stack overflow, but not so easy to detect it.
#include <stdio.h>
void overflow(int data) {
int some_other_data = data + 1;
overflow(some_other_data);
}
int main() {
overflow(1);
printf("returned from overflow\n");
return 0;
}
When compiled and run, the program does nothing. It doesn't even prints returned from overflow
. It crashes without finishing its execution.
You can even see how many times overflow
is called adding printf("%d, ", some_other_data);
inside the overflow
function.
Note that when Clang is called with -Winfinite-recursion
or -Wall
it warns you about the possible infinite recursion.
overflow.c:3:25: warning: all paths through this function will call itself
[-Winfinite-recursion]
void overflow(int data) {
^
While GCC doesn't gives out any warnings or errors with neither -Wall
nor other flags for stack protection I found.
Let's use this opportunity to check out the better C (only half joking)!
package main
import (
"fmt"
)
func overflow(data int) {
some_other_data := data + 1
overflow(some_other_data)
}
func main() {
overflow(1)
fmt.Println("returned from overflow.")
}
I expect Go's GC to at least gracefully handle overflows...
runtime: goroutine stack exceeds 1000000000-byte limit
runtime: sp=0xc0200e1388 stack=[0xc0200e0000, 0xc0400e0000]
fatal error: stack overflow
runtime stack:
runtime.throw({0x10a59a9?, 0x112ae60?})
...
So it does handle it. It crashes with an understandable stack overflow error message and a long stack trace.
In Java we can have the privillege of some protection the JVM gives us. It's still easy to cause a stack overflow, though.
public class overflow {
public static void main(String[] args) {
overflow(1);
System.out.println("returned from overflow");
}
public static void overflow(int data) {
int some_other_data = data + 1;
overflow(some_other_data);
}
}
While returned from overflow
still isn't shown, the stack overflow is detected and an exception is thrown at runtime.
Exception in thread "main" java.lang.StackOverflowError
at overflow.overflow(overflow.java:11)
at overflow.overflow(overflow.java:11)
at overflow.overflow(overflow.java:11)
...
Fortunately, we can prevent the overflow with a try
block that catches StackOverflowError
.
public class overflow {
public static void main(String[] args) {
try {
overflow(1);
} catch (StackOverflowError e) {
System.out.println("a stack overflow was detected!");
}
System.out.println("returned from overflow");
}
public static void overflow(int data) {
int some_other_data = data + 1;
overflow(some_other_data);
}
}
Now we can stop and return from the overflow! It even prints out returned from overflow
!
It's also possible to see the amount of times overflow
is called before StackOverflowError
is thrown by adding System.out.print(Integer.toString(some_other_data) + ", ");
to the overflow
function. Note that this amount is way smaller than C. Probably because the JVM has an additional overhead that uses the stack frame.
If we are talking about Java, we can't not talk about Microsoft's Java, C#. Powered by the CLR, the language ironically, for our program, is faster and safer than Java.
namespace overflow;
class Program
{
static void Main(string[] args)
{
overflow(1);
Console.WriteLine("returned from overflow");
}
static void overflow(int data)
{
int some_other_data = data + 1;
overflow(some_other_data);
}
}
This program will quickly detect the stack overflow on runtime, report the function, and how many times it repeated. Very neat!
Stack overflow.
Repeat 24100 times:
--------------------------------
at overflow.Program.overflow(Int32)
--------------------------------
at overflow.Program.Main(System.String[])
The .NET documentation even tells you to keep track of recursions yourself.
Let's jump ahead to interpreted languages. Beginning with the such hated JavaScript.
function overflow(data) {
let some_other_data = data + 1;
overflow(some_other_data);
}
overflow(1);
console.log("returned from overflow");
On Node.js, the stack overflow is detected as a RangeError
and the program execution is stopped.
overflow.js:2
let some_other_data = data + 1;
^
RangeError: Maximum call stack size exceeded
at overflow (.../overflow.js:2:27)
at overflow (.../overflow.js:4:5)
at overflow (.../overflow.js:4:5)
...
On Deno it's the same thing, but with pretty colors (which you can't see here).
error: Uncaught RangeError: Maximum call stack size exceeded
let some_other_data = data + 1;
^
at overflow (file:///.../overflow.js:2:27)
at overflow (file:///.../overflow.js:4:5)
at overflow (file:///.../overflow.js:4:5)
...
The browser will also throw an error, but simpler.
Uncaught InternalError: too much recursion
overflow
Error handling on JavaScript can be poor at minimum. You can get out of the overflow
function by using a try/catch
error, although the browser thrown exception is different.
function overflow(data) {
let some_other_data = data + 1;
overflow(some_other_data);
}
try {
overflow(1);
} catch (e) {
// InternalError in the browser
if (e instanceof RangeError) {
console.log("a stack overflow was detected!");
}
}
console.log("returned from overflow");
Since JavaScript is weakly typed, the error is not guaranteed to be about a stack overflow. Both RangeError
and InternalError
can represent other errors.
Now for the simplest and most famous one: Python!
def overflow(data):
some_other_data = data + 1
overflow(some_other_data)
overflow(1)
print("returned from overflow")
Python throws an error and a traceback, stopping the program execution and showing how many times the recursive function repeated.
Traceback (most recent call last):
File ".../overflow.py", line 6, in <module>
overflow(1)
File ".../overflow.py", line 4, in overflow
overflow(some_other_data)
File ".../overflow.py", line 4, in overflow
overflow(some_other_data)
File ".../overflow.py", line 4, in overflow
overflow(some_other_data)
[Previous line repeated 996 more times]
RecursionError: maximum recursion depth exceeded
And then, we can also detect the overflow at runtime with a try/except
block, catching RecursionError
.
def overflow(data):
some_other_data = data + 1
overflow(some_other_data)
try:
overflow(1)
except RecursionError:
print("a stack overflow was detected!")
print("returned from overflow")
I couldn't let the beautiful language that powers this website escape.
function overflow(data::Int)
some_other_data = data + 1
overflow(some_other_data)
end
overflow(1)
println("returned from overflow")
Julia gracefully handle a trace, throws a StackOverflowError and shows the amount of times the overflow
function repeats.
ERROR: LoadError: StackOverflowError:
Stacktrace:
[1] overflow(data::Int64) (repeats 79984 times)
@ Main .../overflow.jl:4
in expression starting at .../overflow.jl:7
It's interesting to notice that Julia repeats the function way more than C. When I added a print statement for some_other_data
Julia completely froze on 130,000 repeats. And in the error message (without a print statement) the times repeated amount was kept as a fixed amount on consequent runs. I have no idea why, though. I wonder if this can be attributed to Julia's incredible optimization (pre-compilation?).
The recursion code can then be returned through yet another try/catch
block that catches the previous StackOverflowError
.
function overflow(data::Int)
some_other_data = data + 1
overflow(some_other_data)
end
try
overflow(1)
catch e
if isa(e, StackOverflowError)
println("a stack overflow was detected!")
end
end
println("returned from overflow")
Now, coming back to statically compiled languages, there's Zig. Zig is quite simple, and I must admit I haven't got any experience with it past a "hello world", but it's certainly a language that's getting its properly deserved traction.
const print = @import("std").debug.print;
fn overflow(data: i32) void {
var some_other_data: i32 = data + 1;
overflow(some_other_data);
}
pub fn main() void {
overflow(0);
print("returned from overflow\n", .{});
}
Though it doesn't give you any warnings, when built and run the program eventually overflows and spits out a crash message mentioning a stack overflow and the segmentation fault memory address, before actually crashing.
Stack Overflow
Segmentation fault at address 0x2dbb600fe0
Segmentation fault at address 0xSegmentation fault
Interesting to notice that when print("{},", .{some_other_data})
is added to the overflow
function, we can visualize that Zig somehow reaches even more function repeats before crashing! The program crashes at roughly 260,000, the double from Julia.
Is C really getting beat up that hard? I must say that Zig peaked my interest after this experiment.
Nim was brand new for me. The syntax doesn't quite fit my style, but it's still a pretty impressive language, as we'll see.
proc overflow(data: int) =
var some_other_data = data + 1
overflow(some_other_data)
overflow(0)
echo "returned from overflow"
The program compiles fine with no warnings. Though when I ran it I ended up being impressed by Nim for giving me a traceback and an error message about a call depth limit with a helpful suggestion.
Traceback (most recent call last)
C:\...\overflow.nim(6) overflow
C:\...\overflow.nim(4) overflow
C:\...\overflow.nim(4) overflow
...
(1874 calls omitted) ...
...
C:\...\overflow.nim(4) overflow
C:\...\overflow.nim(4) overflow
C:\...\overflow.nim overflow
Error: call depth limit reached in a debug build (2000 function calls). You can change it with -d:nimCallDepthLimit=<int> but really try to avoid deep recursions instead.
It has a default limit for recursion calls on debug builds so we can investigate possible crashes!
Wait, the error message mentions a -d:nimCallDepthLimit
flag. What would happen if we set it to a big number? Well, it actually complains that it can't convert our input to int16. If it's a 16-bit number we should be able to set it to the maximum of 32,767.
If we do that, it just... crashes? I don't know, let's put a echo some_other_data
before the recursion to see where we can go.
...
16230
16231
16232
16233
Huh? It goes half way from our call depth limit and then silently crashes. At least it works for a more than big enough limit, I guess.
Well, wasn't this limit for debug build only? If we compile the program with -d:release
, we can see it gets pretty far (again, even further than C...). I could reach at least 130,000 function calls until it eventually crashes.
I kinda liked this way of warning a possible crash. Nim certainly has a better reputation in my book.
Yes! We finally reached the oh-so loved ultimate "safe" language of them all. Let's see how safe Rust is handling stack overflows as well.
fn overflow(data: i32) {
let some_other_data = data + 1;
overflow(some_other_data);
}
fn main() {
overflow(1);
println!("returned from overflow.");
}
So, how does Rust manages itself? Well, it doesn't consider it as an error, as in other more common memory-related mistakes. It emits a warning about the unconditional recursion on overflow
.
warning: function cannot return without recursing
--> overflow.rs:1:1
|
1 | fn overflow(data: i32) {
| ^^^^^^^^^^^^^^^^^^^^^^ cannot return without recursing
...
4 | overflow(some_other_data);
| ------------------------- recursive call site
|
= help: a `loop` may express intention better if this is on purpose
= note: `#[warn(unconditional_recursion)]` on by default
warning: 1 warning emitted
And if the program runs, it crashes with a stack overflow error!
thread 'main' has overflowed its stack
Not too bad.
We can see that the majority of the big languages nowadays do have some kind of protection, warning, exception or runtime error for stack overflows. And even with help or not, you should be able to easily avoid deep recursions and giant stack allocations. A problem that causes a stack overflow wil probably be noticeable before it's too late.
It's worth mentioning languages that caught me by surprise: C# for its clean error message, Zig and Julia for its speed (is it really speed?), and Nim for sensibly detecting a crash.
The most disappointing is... C. It's at least understandable, it does what it is asked to and nothing more. The small ammount of funtion calls is still a mistery though. Given that, I won't ever stop loving and using C.
At the end, I loved testing this out on different languages, especially those I didn't had much experience in. I do think I could add more details and other ways to explore and cause stack overflows, but I think this is enough for a start. Otherwise, I'd lose interest and wouldn't finish this post anytime soon.
This was a fun and great experience for my proper first blog post. And of course, thank you for even taking your time to open this post. Later! :)
[1] | Amount of calls is merely illustrative. I made this in a couple hours on GIMP, please don't judge :^P |