The Pre-Scheme Restoration

201 points by nickmain a year ago

Fantastic news! This is a really interesting place in the design space and has come so close to being lost to history.

I believe the idea is essentially to write C semantics in scheme notation. Variables get marked with 'u32' or similar instead of being implicit sum types of anything the language can represent, memory allocation is explicit instead of garbage collected. In itself that essentially means writing C syntax trees in prefix notation, which is probably an acquired taste.

However scheme also comes with the compile time macro layer and that lot runs just fine in pre-scheme, garbage collected and all, because it's burned off before runtime anyway. Specifically, it's wholly macro-expanded before compilation to C (or similar), which is the obvious lowering to use for execution.

Also scheme has tooling, so if you're careful, the type annotated Cish syntax trees execute correctly as scheme, so you can debug the thing there, unit test it from scheme and so forth.

I really like it as a path to writing lisp runtimes in something that isn't C since an alarming fraction of them turn out to have a C runtime library at the bottom of the stack. Also for writing other things that I tend to write in C, where it's really the semantics I want and the syntax getting in the way.

jaccarmac a year ago

Smalltalk can use a similar bootstrapping method, which blew my mind the first time I read and understood (see https://ftp.squeak.org/docs/OOPSLA.Squeak.html, "Smalltalk to C Translation"). In that world, images can easily obscure some of the lineage. Scheme has the advantage of having a standard to hold dialects to. I am watching this project with interest.
fire_lake a year ago

This is (potentially) so much better than complex C preprocessor or C++ template complexity.
chubot a year ago

This is similar to how https://www.oilshell.org/ is written
There are two complete implementations
1. one that runs under a stock Python interpreter (which doesn't use static types)
2. one that's pure C++, translated from statically typed Python code, and from data structures generated in Python
In the second case, everything before main() is "burned off" at build time -- e.g. there is metaprogramming on lexers, flag parsers, dicts, etc. that gets run and then turned into static C data -- i.e. data that incurs zero startup cost
Comparison to Pre-Scheme: https://lobste.rs/s/tjiwrd/revival_pre_scheme_systems_progra... (types, compiler output, and GC)
Brief Descriptions of a Python to C++ Translator - https://www.oilshell.org/blog/2022/05/mycpp.html
...
And related to your other point, I remember looking at Racket's implementation around the time it started the Chez Scheme conversion. For some reason, I was surprised that it was over 100K lines of hand-written C in the runtime -- it looked similar to CPython in many ways (which is at least 250K lines of C in the core).

paroneayea a year ago

Pre-Scheme is an incredible piece of history, largely forgotten and lost to time outside of a very small group that knew about it. Live hackable at the REPL, and yet with static type inference (Hindley-Milner!), compiles to C, no GC? It's something I've always wanted, and it existed, but it felt like one of those lost pieces of technology that was at risk of fading into the dustbin of history.

But no more! It's so exciting that Andrew Whatson has begun reviving the project with such great enthusiasm and making it so that Pre-Scheme can run on top of a variety of Schemes. And it's wonderful that NLnet has recognized how important this effort is. I think Pre-Scheme could play an interesting role alongside Zed and Rust, and indeed I know that Andrew plans to incorporate many of the newer ideas explored in those languages on top of Pre-Scheme eventually.

Go Pre-Scheme revival... I'm cheering it on, and can't wait to use this stuff myself!

voidhorse a year ago

https://github.com/carp-lang/Carp is a recent attempt to create a similar language, but with a Rust-inspired borrow checker. though it looks like prescheme would end up being less dependent on C ultimately, this is another option in the space.

dleslie a year ago

Without a new maintainer and perhaps a group of dedicated supporters, Carp may as well be considered abandoned. The existing head maintainer and major contributor has moved on to other things.
https://github.com/carp-lang/Carp/issues/1460#issuecomment-2...
dualogy a year ago

Stumbled upon this a while ago while looking for "a systems Scheme", but abandonware in my book given the most recent commit date (and GH Issue).
Settled on Gerbil Scheme instead, lively community & been actively developed to this day for over 15 years now. Although fair warning, still GC'd and (for now) only type-annotated, not (100%) statically typed. But stdlib-wise and compilation-wise still way more "systems-bent" than most Schemes out there.
- davexunit a year ago
  
  Guile is quite systems focused, as well, what with all the POSIX stuff it exposes. But neither Guile nor Gerbil can be used to implement their own runtimes. You need to write the GC, somehow. This is why Pre-Scheme exists.

davexunit a year ago

I am so excited for a Lispy systems language. Existing languages just don't do it for me, though I think Zig is the closest to being what I'm into. So much good stuff in Scheme48. Glad the good ideas are being revived.

dualogy a year ago

> I am so excited for a Lispy systems language.
Can recommend Gerbil Scheme. Although fair warning, still GC'd and (for now) only type-annotated, not (100%) statically typed. But stdlib-wise and compilation-wise still way more "systems-bent" than most Schemes out there.
- dang a year ago
  
  Related:
  Gerbil Scheme – A Lisp for the 21st Century - https://news.ycombinator.com/item?id=39809323 - March 2024 (126 comments)
  Gerbil – A meta-dialect of Scheme - https://news.ycombinator.com/item?id=20585637 - Aug 2019 (17 comments)
  Gerbil Scheme - https://news.ycombinator.com/item?id=17707622 - Aug 2018 (9 comments)
  Gerbil – An opinionated dialect of Scheme designed for systems programming - https://news.ycombinator.com/item?id=15394603 - Oct 2017 (78 comments)
- nerdponx a year ago
  
  Gerbil is based on Gambit, right? Have you tried Gambit itself? I'm curious how they compare.
  - dualogy a year ago
    
    Haven't, so no hard "comparison" results to offer — but Gambits libs are included in Gerbil and readily importable or some of them (maybe all? dunno) auto-imported, and Gambit is Gerbil's compilation foundation essentially AFAIK. Where it differs from or expands upon the Gambit basis is "our own macro expander" (haven't particular gotten into that area of understanding yet tho) and the extremely-modern-real-worldish stdlib (http, json, actors, db driver bindings etc), perhaps other aspects too (ie. might cover further SRFIs beyond what Gambit does, dunno for sure though; ie. the FFI might be beyond Gambit's or not, again I wouldn't know).
samatman a year ago

I welcome any and all experiments in the low-level programming space, and writing a good one with Lisp syntax is an obvious approach. Carp (mentioned in the article) was a good start, but seems to have stalled out, and while it's easy to see the advantages of bootstrapping from an ML language, I view it as essential to the philosophy that the compiler itself be in a broadly-compatible Lisp syntax, compatible, that is, to the resulting language or sub-language.
But I also think it will exacerbate an existing problem with C, namely, macros. Low-level programming is all about knowing exactly what's going on, and since C has a preprocessor, that's more difficult than it otherwise would be. Just because something looks like a function call, doesn't mean it actually is one.
Schemes have a much better macro system, and that will simultaneously make the core issue both better, and worse. But it's very much worthwhile to try it, imho, and see if good tooling can ameliorate the downsides, while still enjoying the power, and freedom from tedium, which macros bring to the table.

giraffe_lady a year ago

Scheme with HM type system sounds fun. I've used ocaml a fair bit and I really find that the sweet spot for effectiveness of types vs arguing with the compiler. Racket and common lisp both have optional type systems but neither ever really clicked with me.

Y_Y a year ago

Have you tried Alexis King's Hackett? It was an experiment in coercing Haskell semantics into lisp syntax and it came out really nicely.
https://lexi-lambda.github.io/hackett/
- giraffe_lady a year ago
  
  Totally new to me, thanks! It looks really fun.
brabel a year ago

Common Lisp has Coalton [1]. It's basically a functional Lisp embedded within Common Lisp which has HM types and a bit more modern constructs than CL.
[1] https://coalton-lang.github.io/

dang a year ago

Pre-scheme: A Scheme dialect for systems programming (1997) [pdf] - https://news.ycombinator.com/item?id=29725313 - Dec 2021 (12 comments)

(surprised there hasn't been more)

roleks a year ago

few days I've spent a few hours with prescheme, but was stopped in the end because there where gcc errors. I've felt a little guilty to have have spent some quite some time, but not have achieved anything. But thats the thing with Prescheme, so fascinating I could not resist. I mean look at its history, at all the cool and unique features. Anyway, very cool to read this news, I'm also a little relieved not to have burnt all this hours into nothings. Very glad to see the story continues.

ethagnawl a year ago

> On another front, the Guix project is a major force bringing new users to Scheme, providing an unparalleled foundation for free and reproducible computing.

The Nix/OS folks might take exception. I'm guessing this is tongue-in-cheek but it belies the tone of the rest of the post.

In all seriousness, though, this is exciting from a modern, end-user's vantage point and fascinating from an historical perspective.

ryukafalz a year ago

It doesn't read as tongue-in-cheek to me. NixOS does not have an equivalent to Guix's full-source bootstrap mentioned in the next sentence: https://guix.gnu.org/blog/2023/the-full-source-bootstrap-bui...
Nixpkgs also doesn't seem to require that all packages be built from source - which, if you're really looking for reproducibility, is a downside. I recognize that there are practical reasons for this, and it's part of why Nix has so many more packages available than Guix, but IMO it makes Guix a better foundation to build on if you want as much of your system as possible to be reproducible.
- Y_Y a year ago
  
  There is also the "secret" nonguix channel which packages nonfree things for Guix: https://gitlab.com/nonguix/nonguix
  It's a funny problem but because it's antithetical to the original project's spirit you won't hear about it from any official Guix sources and so it's relatively unknown.
  - sham1 a year ago
    
    Nonfree software is a major part of nonguix, both because of the ethical problems which are the raisons d'être of the GNU project, there's also the more practical consequence drom the nonfree nature, that you can't bootstrap the binaries any you can't know their provinence beyond "it's from the vendor".
    However just to clarify for others, it's not the only thing there of course. There is free software in nonguix, maybe because it's PITA to bootstrap, like for example Leinigen and other parts of the Clojure ecosystem, as well as everything and anything written using Electron. And of course notable free software things there are also the blobbed Linux kernel (probably obvious reasons), as well as Firefox, since Mozilla has some interesting trademark opinions, so you can't have it on the main Guix channel.
  - ryukafalz a year ago
    
    I wouldn't say it's relatively unknown, I see it come up just about every time Guix comes up in discussions here. And I'm glad nonguix exists, for what it's worth.
    But it's helpful to have Guix itself aim for reproducibility even if nonguix exists, so you can install upstream Guix alone if you're looking for reproducibility.
- rssoconnor a year ago
  
  > Nixpkgs also doesn't seem to require that all packages be built from source - which, if you're really looking for reproducibility, is a downside.
  Does Guix not have GHC (Glasgow Haskell Compiler) or did it somehow bootstrap GHC? Last time I checked bootstrapping GHC on today's hardware is effectively an unsolved problem. [1]
  > NixOS does not have an equivalent to Guix's full-source bootstrap
  While you are not wrong, there is nothing fundamentally stopping Nixpkgs from being bootstrapped in a similar way to Guix. emilytrau has already done a lot of the work. [2]
  [1] https://elephly.net/posts/2017-01-09-bootstrapping-haskell-p... [2] https://github.com/NixOS/nixpkgs/pull/227914
  - ryukafalz a year ago
    
    > Does Guix not have GHC (Glasgow Haskell Compiler) or did it somehow bootstrap GHC? Last time I checked bootstrapping GHC on today's hardware is effectively an unsolved problem.
    I think you're right, it looks like they've gotten a little further now than in that post but there's still a gap in the bootstrap chain. So maybe not every package is fully bootstrapped, but they do seem to take it more seriously.
    > While you are not wrong, there is nothing fundamentally stopping Nixpkgs from being bootstrapped in a similar way to Guix. emilytrau has already done a lot of the work.
    Yes, I agree, and I hope they get there! I just also think that acknowledging the places where Guix is currently ahead isn't wrong. Nix isn't the only game in town anymore.
    
    rssoconnor a year ago
    
    > it looks like they've gotten a little further now than in that post
    Can you say more, or provide any references? I would be interested in the state of the art here.
    
    ryukafalz a year ago
    
    There's a comment in the Guix source (added by the author of that first post you linked, Ricardo Wurmus) that seems to indicate that they found a way around the segfault problem described in the post by using a registerised version of GHC: https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages...
    I had to look up exactly what this means, not being very familiar with the Haskell ecosystem myself. It looks like it's not the raw source form and is architecture-specific, but it's also not the compiled binary form. So that's not perfect, but better than relying on the compiled binaries I guess. (Unfortunate for me since my laptop is ARM and I'd like to be able to use git-annex, haha.) But this seems to work for older versions of GHC.
    This post by Simon Tournier from last year describes the current situation near the bottom, and from what I can tell this is still correct: https://simon.tournier.info/posts/2023-10-01-bootstrapping.h...
    > The bootstrapping problem for Haskell is not solved. And Ricardo works hard on it. Currently, from the older GHC around (4.08.2), which relies on gcc-2.95 – part of the Bootstrapping story above – it is possible to chain until version 6.10.4. Then versions 6.12.3 and 7.4.2 are not packaged yet for completing the Haskell chain from version 4.08.2 to modern version as 9.2.5; fully connecting the dots with bootstrap-seeds and dropping these 450MiB of binaries. The solution of this chicken-or-the-egg is not yet complete.
    
    rekado a year ago
    
    The technical details of the quote are correct, but FWIW I'm no longer working on the GHC bootstrap. It's fun for a while but the lack of interest in the Haskell community and the general high level of ridicule and hostility from the rest of the software world towards all things GNU / free software / bootstrapping have kinda turned me off the whole computer thing.
    
    ryukafalz a year ago
    
    That's unfortunate to hear, I'm sorry you've had to deal with that :(
  - rekado a year ago
    
    I've built up to GHC 6 (I've stopped after reaching 6) from GHC 4. GHC 4 does use some generated C files, so it's not a pure bootstrap, but it's still much better than taking a binary of GHC 6 or later.
    (I'm the author of the 2017 blog post. I had planned a follow-up but since I didn't have much to show I scrapped it.)
    
    rssoconnor a year ago
    
    Is this committed to Guix and/or otherwise available somewhere?
    
    rekado a year ago
    
    It's part of Guix. It starts here: https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages...
- 3836293648 a year ago
  
  Nix doesn't require everything to be built from source, sure, but everything downloaded must match a provided hash. What's the difference between downloading source code and binaries at that point?
  - ryukafalz a year ago
    
    It's easier to audit source code than binaries, and easier to audit it once than once for each architecture.
    
    3836293648 a year ago
    
    Auditing is irrelevant to whether or not it's reproducable, which was the question here.
    You also forgo any improvements to compiler improvements

troad a year ago

This is very cool. I've added it to my RSS reader, can't wait to see what comes of it.

Genuine question: would there be any advantages in targeting LLVM IR, rather than transpiling to C? With C being notoriously implementation dependent (down to things like the sizes of integer types), it seems like a messy target for something intended to be a sane systems language.

gergo_barany a year ago

C has had fixed-size integers since C99: https://en.m.wikibooks.org/wiki/C_Programming/inttypes.h
Targeting LLVM IR has the drawback that it is not platform independent: Details of calling conventions must be modeled in the IR, so the compiler must know what ABI it is targeting and emit the appropriate code. Compiling to C doesn't have this problem, since the C compiler will handle calling conventions for you.
That said, LLVM would indeed have some advantages. Scheme has guaranteed tail call optimization, which you cannot guarantee with C. But LLVM does allow you to annotate calls as tail calls, and it can transform tail self-recursion into a loop for you.
- troad a year ago
  
  > C has had fixed-size integers since C99: https://en.m.wikibooks.org/wiki/C_Programming/inttypes.h
  Good point! You don't tend to see them too much in the wild, but they're available, which is good enough for present purposes.
  > Targeting LLVM IR has the drawback that it is not platform independent: Details of calling conventions must be modeled in the IR, so the compiler must know what ABI it is targeting and emit the appropriate code. Compiling to C doesn't have this problem, since the C compiler will handle calling conventions for you.
  Ooft, that would be a rough one. It kind of seems like there'd be some benefit to a low-level IR that's neither platform / implementation-specific, nor has the warts of C, but I appreciate that'd be well outside the scope of this project.
  > That said, LLVM would indeed have some advantages. Scheme has guaranteed tail call optimization, which you cannot guarantee with C. But LLVM does allow you to annotate calls as tail calls, and it can transform tail self-recursion into a loop for you.
  I suppose this will need be handled manually in the Pre-Scheme transpiler itself. Losing TCO seems like it ought to be a non-starter for anything Scheme-like.
  - 392 a year ago
    
    I wonder if a subset of Rust would be a good candidate. It would certainly involve hardcoding the templates for some `unsafe` incantations into the compiler in order to treat Rust as remotely equivalent to a C target, which is beyond me, but _if_ it could be done the opportunity to reuse all of the Rust ecosystem would be killer. Heck, even generating only safe Rust with runtime overhead might be a fast enough of a runtime for an experiment with this.
  - gergo_barany a year ago
    
    WebAssembly fans might say that it's the kind of universal low-level but portable IR that you envision. It would certainly make sense for a Scheme compiler to consider a WebAssembly target.

trealira a year ago

According to the article: thanks to a grant from the NLnet foundation under the NGI Zero Core program, Pre-Scheme can continue to be developed. It's supposed to be a C alternative. Currently, it compiles to C, has a Hindley Milner type system, macros, and it can run in a Scheme REPL. And they have a roadmap of features now.

This is pretty cool, and it's generous of them to grant them funding, but (and I'm not trying to be rude) I wonder why they chose to give a grant for Pre-Scheme specifically. This seems only loosely related to the goals of the NGI Zero Core program (linked in the article):

"The next generation internet initiative envisions the information age will be an era that brings out the best in all of us. We want to enable human potential, mobility and creativity at the largest possible scale – while dealing responsibly with our natural resources. In order to preserve and expand the European way of life, the programme helps shape a value-centric, human and inclusive Internet for all."

...

"We want a more resilient, trustworthy and open internet. We want to empower end-users. Given the speed at which the 'twin transition' is taking place, we need a greener internet and more sustainable services sooner rather than later. Neither will happen at global scale without protocol evolution, which — as the case of three decades of IPv6 introduction demonstrates — is extremely challenging. NGI0 Core is designed to push beyond the status quo and create a virtuous cycle of innovation through free and open source software, libre hardware and open standards. If we want everyone to use and benefit from the internet to its full potential without holding back, the internet must be built on strong and transparent technologies that allow for permissionless innovation and are equally accessible to all."

paroneayea a year ago

Pre-Scheme is one more path to moving away from low level programming being done on top of directly programmed C for one thing, and the revival effort directly ties in by moving Pre-Scheme to be on top of r7rs, an open standard. This opens up Pre-Scheme to a variety of other ecosystems that NLnet already invests in, including Guix, Mes, and Guile, which have put a lot of efforts into secure and highly reproducible (and indeed bootstrappable) computing. There's definitely some ties in with security and security-oriented communities NLnet already funds, and this project directly works towards moving towards a more standardized approach, leading to hopefully broader adoption.
- trealira a year ago
  
  I see, thanks for contextualizing it for me; I hadn't known about the ecosystems NLnet is involved in.
- dg_meme a year ago
  
  [dead]
392 a year ago

Another way to look at it is a long shot on a new higher level language to entice the folks currently reaching for Java, Python, or other runtime interpreters. An easy and safe language that shifts more work to compile time and offers high performance with ergonomic use will use less power, reduce the education requirements to writing fast code, and potentially change the world like another fast memory safe language governments are increasingly endorsing. Or if nothing else it may serve as a language for interpreters of those languages, or to make extensions to them more ergonomic. With enough scale the improvements can come very incrementally.

Y_Y a year ago

NLNet is doing god's work funding really cool projects that would have a very hard time justifying their existence to some mainstream donors.

I dream of some day soon running Emacs/Guix/Hurd on an open RISC-V chip and not having it be some flossy novelty but a genuine spiritual successor to Genera and the Lisp Machines.

a2code a year ago

Tail-call optimization is very important when writing Scheme programs. By removing those, you loose the power of recursion.

Also when it comes to macros, does that include `syntax-rules` or `syntax-case` style macros, where the latter are much more powerful?

While an embedded Scheme-like language is incredibly useful, at some point I feel as if you would simply have to include these features, and to that end it would just be Scheme reinvented.

dg_meme a year ago

Please don't write "thanks to a generous grant from the NLnet foundation under the NGI Zero Core program". The most of money comes from the European Commission through the Horizon Europe / NGI funding schemes. NLNet is mainly the operator of the call.

paroneayea a year ago

The European Commission deserves thanks for funding the commons, a thing that rarely happens by governments, but should! NGI Zero was thanked though?
NLnet being the operator of the call is no small thing though, having been through the process they are very thoughtful, knowledgeable, thorough in how they run things. They even run the software they fund and verify it's working and check that the overall ideas are sensible, which is something I can't say of many other grant programs I've interacted with. So NLnet does deserve thanks.

tmtvl a year ago

See also (kinda, maybe) c-mera: https://github.com/kiselgra/c-mera

hayley-patton a year ago

The hell of the systems language is the systems, not that it has infix syntax.

neilv a year ago

When they write this:
> Scheme syntax, with full support for macros,
you can read that not as that Scheme prefix syntax alone is a big selling point, but the fact that it then supports Scheme macros (which are much better than in most other languages that support some kind of macros, partly due to the syntax making this easier).
Then you can read the rest of the sentence, for a bonus:
> and a compatibility library to run Pre-Scheme code in a Scheme interpreter.
Which means that you can do things like develop using this language within a normal Scheme development environment, possibly share code between developing for the PreScheme compiler target and non-PreScheme targets, etc.
- hayley-patton a year ago
  
  > but the fact that it then supports Scheme macros
  Good for them.
  > possibly share code between developing for the PreScheme compiler target and non-PreScheme targets
  "possibly" is a strong word, seeing that Pre-Scheme is a statically typed, explicitly memory managed subset and all. There's a very large and coarse-grained semantic leap.
  Then you can read the rest of <https://www.steveblackburn.org/pubs/papers/vmmagic-vee-2009....>, for a bonus.
bitwize a year ago

Pre-Scheme can be run in a Scheme system, then when it is found to be correct there, compiled VERY straightforwardly to C. This is a huge win in terms of productivity. Plus, at the top level, at compile time, you have all of Scheme available.
kagevf a year ago

It looks like with this, you get macros and a repl, though.
Zambyte a year ago

Depends on if your "systems programming" activities consist of compiling Scheme or not.

pjc50 a year ago

A Hindley-Milner typed language with rigorous semantics for targeting native platforms? Amazing! Pity about the syntax.