This caught me by surprise:
> Contrary to popular belief, unwrapping the IO constructor is deeply unsafe and can violate memory safety, even if State# tokens are never duplicated or dropped.
Does ANYONE believe that unwrapping the IO constructor is normal and safe? I must live in a very sheltered bubble. Isn't it extremely obvious that once you get that state function out of the IO constructor, you can build your own unsafePerformIO?
I mean, you definitely can't build your own unsafePerformIO without dropping the state token. I have never thought deeply about it, but if you had asked me I would probably have guessed that using the token linearly would be enough to ensure safety. That's not the same as it being "normal", of course.
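For anyone who hasn't seen what "unwrapping IO" looks like, here is a minimal sketch of the two cases discussed above. The names `myUnsafePerformIO` and `andThen` are made up for illustration; the imports are the GHC-internal modules from `base`. This is only a sketch of the idea, not what `base` itself does, and per the article even the linear version is not guaranteed to be safe in general.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import GHC.Exts (realWorld#)
import GHC.IO (IO (..))

-- Hypothetical name: a hand-rolled unsafePerformIO.
-- It conjures a token out of thin air (realWorld#) and then drops
-- the token that comes back, so nothing orders this against other effects.
myUnsafePerformIO :: IO a -> a
myUnsafePerformIO (IO f) =
  case f realWorld# of
    (# _droppedToken, x #) -> x

-- Hypothetical name: sequencing two actions by hand.
-- Every token is used exactly once (s0 goes into f, s1 goes into g),
-- which is the "linear" usage mentioned above.
andThen :: IO a -> IO b -> IO b
andThen (IO f) (IO g) =
  IO (\s0 -> case f s0 of
               (# s1, _ #) -> g s1)

main :: IO ()
main = do
  putStrLn "first" `andThen` putStrLn "second"
  -- Harmless here only because the wrapped action has no real effects.
  print (myUnsafePerformIO (pure (42 :: Int)))
```

The first function cannot be written without discarding a token; the second never discards or duplicates one, which is the usage one would have guessed to be safe.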
In Rust, it's considered a bug for any code which isn't using `unsafe` to encounter a memory error (e.g., to segfault). That bug might be in some underlying library (which is itself using `unsafe`), or more rarely in the compiler, but it's a bug and not how Rust is supposed to work.
Does Haskell have any similar line? What is the property that code must have in order for it to be a bug to segfault? Must not call `unsafePerformIO`? Must not call `unsafeCoerce`? (Must not call any function with the `unsafe` prefix?)
In other words, is the segfault here to be considered a bug in the language -- or is unwrapping IO one of the things that, if you do it, you're on your own and may segfault? (Is part of the point of the article that it is currently considered safe but should not be? Is that a bug in the language or in people's expectations?)
Or is a clear line like this not a notion that Haskell has? It's been a long time since I've done any Haskell, though I don't recall any clear guideline like this!
> is unwrapping IO one of the things that, if you do it, you're on your own
To be able to do it in the first place, I think you need to import libraries that expose compiler internals, so I would say it belongs in the "you're on your own" category, yes.
Also if you try to Google how to do it, every hit says "don't do it".
To a certain extent, the line in Haskell is: don't use unsafePerformIO and unsafeCoerce. The tricky bit is that this line is not enforced by syntax or by the type system (unlike Rust, where you have a syntactic label `unsafe`). One generally puts "unsafe" before function names that have preconditions that are not expressed in their type, but this practice is not quite always adhered to -- though the worst offenders are reliably marked "unsafe".
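As one illustration of a precondition that lives in the name rather than the type, here is a small sketch using `unsafeIndex` from `Data.ByteString.Unsafe` (the `bytestring` package, which ships with GHC). The wrapper `safeIndex` is a hypothetical helper written for this example; it just performs the bounds check that `unsafeIndex` leaves to the caller.

```haskell
module Main (main) where

import qualified Data.ByteString as BS
import Data.ByteString.Unsafe (unsafeIndex)
import Data.Word (Word8)

-- Hypothetical helper: makes the unstated precondition of unsafeIndex
-- (0 <= i < length) explicit by checking it before the call.
safeIndex :: BS.ByteString -> Int -> Maybe Word8
safeIndex bs i
  | i >= 0 && i < BS.length bs = Just (unsafeIndex bs i)
  | otherwise                  = Nothing

main :: IO ()
main = do
  let bs = BS.pack [10, 20, 30]
  print (safeIndex bs 1)  -- in bounds: Just 20
  print (safeIndex bs 7)  -- out of bounds: Nothing
```

Nothing in the type `ByteString -> Int -> Word8` tells you the index must be in range; only the prefix and the documentation do.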
There's also this classic:
https://hackage.haskell.org/package/bytestring-0.11.4.0/docs...
> In Rust, it's considered a bug for any code which isn't using `unsafe` to encounter a memory error (e.g., to segfault)
Teeechnically, that's not true. Unfortunately, you can trigger a memory error in safe code by overflowing the stack: allocating big objects on the stack, running poorly written recursive code, or spawning a thread with a small stack. In older Rust versions you literally got a segfault in such cases.
Isn't stack overflow made safe via guard pages and probes (on sufficiently high-tier target platforms)? That is, you should get a guaranteed error, even if that is a segfault, and not memory corruption.
Haskell has a definition of Safe/Unsafe/Trustworthy. I would think/hope you can't import that RealWorld type from safe code.
https://wiki.haskell.org/Safe_Haskell
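A minimal sketch of what that looks like in practice, assuming a stock GHC with the default package trust settings. Under `{-# LANGUAGE Safe #-}` only modules that are safe or trusted can be imported; the two commented-out imports are ones that, as far as I know, GHC rejects because those modules are marked Unsafe.

```haskell
{-# LANGUAGE Safe #-}
module Main (main) where

import Data.List (sort)
-- import System.IO.Unsafe (unsafePerformIO)  -- rejected in a Safe module
-- import Unsafe.Coerce (unsafeCoerce)        -- rejected in a Safe module

main :: IO ()
main = print (sort [3, 1, 2 :: Int])  -- [1,2,3]
```

Uncommenting either import should turn this into a compile error rather than a runtime surprise.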
Unfortunately, this is as far as that goes.
> is unwrapping IO one of the things that, if you do it, you're on your own and may segfault?
It's just not something you do; I don't think there is any specific reason to do it. And the article itself says:
> Using this constructor directly can be unsafe
Interesting and slightly scary :-) I'll have to keep in mind that whenever I see `State#` I should think `unsafe`
It's not unboxed types that are unsafe; it is the unwrapping of IO.
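To make that distinction concrete, here is a small sketch of ordinary code over an unboxed type. `sumOfSquares` is a made-up example; it uses `Int#` and the primitive operators from `GHC.Exts`, and nothing in it touches `IO` or a state token.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Int (..), Int#, (*#), (+#))

-- Hypothetical example: plain arithmetic on unboxed Int# values.
-- Unpacking Int with I# is routine; no effects are involved.
sumOfSquares :: Int -> Int -> Int
sumOfSquares (I# x) (I# y) = I# (square x +# square y)
  where
    square :: Int# -> Int#
    square n = n *# n

main :: IO ()
main = print (sumOfSquares 3 4)  -- 25
```

The `#` suffix here only signals "unboxed"; the risk discussed in the article comes from pattern matching on the `IO` constructor and handling its token yourself.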
Very cool insight. Been a while since I haskelled but that behaviour wrt IO just feels intuitive but I wouldn't have been able to explain why like this.