Violating memory safety with Haskell's value restriction

tomsmeding 5 days ago

This caught me by surprise:

> Contrary to popular belief, unwrapping the IO constructor is deeply unsafe and can violate memory safety, even if State# tokens are never duplicated or dropped.

Does ANYONE believe that unwrapping the IO constructor is normal and safe? I must live in a very sheltered bubble. Isn't it extremely obvious that once you get that state function out of the IO constructor, you can build your own unsafePerformIO?

vilhelm_s 5 days ago

I mean, you definitely can't build your own unsafePerformIO without dropping the state token. I have never thought deeply about it, but if you had asked me I would probably have guessed that using the token linearly would be enough to ensure safety. That's not the same as it being "normal", of course.
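
For illustration, here is a minimal sketch of what a home-made `unsafePerformIO` might look like once the `IO` constructor is unwrapped. The name `myUnsafePerformIO` is hypothetical, and this is not how `base` defines the real function (which, as far as I know, goes through `runRW#` and `noDuplicate`). The sketch only shows that the token has to be conjured from `realWorld#` and the resulting token thrown away, which is exactly the non-linear use described above.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import GHC.Exts (realWorld#)
import GHC.IO (IO (..))

-- Hypothetical helper, for illustration only: run an IO action "purely".
-- It fabricates a state token with realWorld# and then discards the
-- token the action returns, so the token is not used linearly.
myUnsafePerformIO :: IO a -> a
myUnsafePerformIO (IO f) =
  case f realWorld# of
    (# _droppedToken, result #) -> result
{-# NOINLINE myUnsafePerformIO #-}

main :: IO ()
main = print (myUnsafePerformIO (pure (21 * 2 :: Int)))  -- prints 42
```

Nothing in the types stops this from compiling; the only hint that something is off is the import of `GHC.Exts` and `GHC.IO`.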

jwatzman 5 days ago

In Rust, it's considered a bug for any code which isn't using `unsafe` to encounter a memory error (e.g., to segfault). That bug might be in some underlying library (which is itself using `unsafe`), or more rarely in the compiler, but it's a bug and not how Rust is supposed to work.

Does Haskell have any similar line? What is the property that code must have in order for it to be a bug to segfault? Must not call `unsafePerformIO`? Must not call `unsafeCoerce`? (Must not call any function with the `unsafe` prefix?)

In other words, is the segfault here to be considered a bug in the language -- or is unwrapping IO one of the things that, if you do it, you're on your own and may segfault? (Is part of the point of the article that it is currently considered safe but should not be? Is that a bug in the language or in people's expectations?)

Or is a clear line like this not a notion that Haskell has? It's been a long time since I've done any Haskell, though I don't recall any clear guideline like this!

kqr 5 days ago

> is unwrapping IO one of the things that, if you do it, you're on your own
To be able to do it in the first place, I think you need to import libraries that expose compiler internals, so I would say it belongs in the "you're on your own" category, yes.
Also if you try to Google how to do it, every hit says "don't do it".
tomsmeding 5 days ago

To a certain extent, the line in Haskell is: don't use unsafePerformIO and unsafeCoerce. The tricky bit is that this line is not enforced by syntax or by the type system (unlike Rust, where you have a syntactic label `unsafe`). One generally puts "unsafe" before function names that have preconditions that are not expressed in their type, but this practice is not quite always adhered to -- though the worst offenders are reliably marked "unsafe".
- itishappy 5 days ago
  There's also this classic:
  accursedUnutterablePerformIO
  https://hackage.haskell.org/package/bytestring-0.11.4.0/docs...
- thesz 5 days ago
  > The tricky bit is that this line is not enforced by syntax or by the type system (unlike Rust, where you have a syntactic label `unsafe`).
  Safe Haskell: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/safe...
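
As a rough sketch of what Safe Haskell buys you: a module compiled with the `Safe` pragma may only import modules that GHC considers safe or trustworthy. The module below is an assumed toy example, not taken from the article. The commented-out imports are the ones GHC should reject, since `System.IO.Unsafe` and `Unsafe.Coerce` are marked unsafe.

```haskell
{-# LANGUAGE Safe #-}
module Main (main) where

import Data.List (sort)
-- Uncommenting either import below should make compilation fail,
-- because both modules are marked Unsafe:
-- import System.IO.Unsafe (unsafePerformIO)
-- import Unsafe.Coerce (unsafeCoerce)

-- Ordinary pure code is unaffected by the Safe pragma.
sortedSample :: [Int]
sortedSample = sort [3, 1, 2]

main :: IO ()
main = print sortedSample  -- prints [1,2,3]
```

So the line is enforced per module by an opt-in pragma rather than by a keyword at the use site, which is a different design from Rust's `unsafe` blocks.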
newpavlov 5 days ago

> In Rust, it's considered a bug for any code which isn't using `unsafe` to encounter a memory error (e.g., to segfault)
Teeechnically, it's not true. Unfortunately, you can trigger a memory error in safe code by overflowing the stack: by allocating big objects on the stack, executing poorly written recursive code, or spawning a thread with a small stack. In older Rust versions you literally got a segfault in such cases.
- Rusky 5 days ago
  
  Isn't stack overflow made safe via guard pages and probes (on sufficiently high-tier target platforms)? That is, you should get a guaranteed error, even if that is a segfault, and not memory corruption.
lmm 5 days ago

Haskell has a definition of Safe/Unsafe/Trustworthy. I would think/hope you can't import that RealWorld type from safe code.
- jrvieira 5 days ago
  
  https://wiki.haskell.org/Safe_Haskell
  unfortunately this is as far as that goes
IsTom 5 days ago

> is unwrapping IO one of the things that, if you do it, you're on your own and may segfault?
It's just not something you do; I don't think there is any specific reason to do it. And the article itself says:
> Using this constructor directly can be unsafe

internet_points 5 days ago

Interesting and slightly scary :-) I'll have to keep in mind that whenever I see `State#` I should think `unsafe`

ysangkok 5 days ago

It's not unboxed types that are unsafe, it is the unpacking of IO.

nssnsjsjsjs 5 days ago

Very cool insight. Been a while since I haskelled, but that behaviour wrt IO just feels intuitive, though I wouldn't have been able to explain why like this.