Ask HN: Why is Cursor IDE accessing all my env vars?
Recently I tried playing with https://www.cursor.com/ but got spooked by the LuLu alert (https://objective-see.org/products/lulu.html) when launching the app, where the process args included "JSON.stringify(process.env)" part, see screenshot here: https://imgur.com/a/DmDuGTz
Is this... normal? I don't understand why they might want to serialize/access all of my env vars. Does anyone have a suggestion for that behaviour? I'm probably missing some reasonable explanation, happy to learn more.
I've been running a lot of stuff in VMs lately anyway, but don't want to end up having to spin up a VM for the core app like a code editor. How do you all deal with untrusted (but not really malware-level untrusted) software?
> Is this... normal? I don't understand why they might want to serialize/access all of my env vars. Does anyone have a suggestion for that behaviour?
All processes get a copy of all environment variables [edit for clarity: all environment variables, from the global environment].
Unless one goes out of one's way to prevent this from happening.
> the process args included "JSON.stringify(process.env)" part
And this app chooses to receive the env vars in a JSON format. NBD really, in light of the above points.
Environment variables are not secret at all. Quite the opposite, because all processes get a copy of them. They're just variables that are associated with, or stored in, the environment, instead of e.g. in the code itself. They absolutely should not be considered secure in any way.
Managing secrets is always tricky. Even a naive attempt at trying to avoid using env vars generally leaks stuff in some way - shell command history will record secrets passed-in at launch time, plus any running process (with sufficient permissions) can get a list of running processes, and can see the command line used to invoke a process.
And once one gets past the naive solutions, it usually adds some friction somewhere along the line. There's no easy, transparent, way to do things, as far as I am aware. They all have some cost.
There are quite a few articles on the web about this topic as a whole. I don't think anything particularly new will come from HN users here, it'll mostly be repeating the same already known/discussed stuff. As I myself am doing here, really.
You might find it helpful to consider something like Hashicorp's Vault, or similar, for proper management of secrets.
Using env vars for secrets has become semi-normalised because of container-based development and deployment. It's okay-ish in the limited context and scope of a container, but it's not good at all in a host OS or VM context. Some dev practices have leaked through, possibly because it's an approach that works in all environments even if it's not best practice.
Thanks, I appreciate the detailed explanation.
I'm familiar with Vault and have been using it at work, but we tend to fetch values from Vault and export them as env variables in the end anyway. Obviously we don't want to hardcode these values in the code either. So env vars are not good for secrets, hardcoding is terrible. What's good/secure then?
Env vars are fine for secrets, as long as you provide the right env vars to the right processes. You can unset them before launching a new process, or better still, not "export" the sensitive ones to all processes.
This ^^
Just avoid putting secrets in the global environment if it is a concern, and instead just pass necessary secrets locally in the environment when launching a specific app.
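To make the export/unset distinction concrete, here is a minimal sketch in plain POSIX sh. DEMO_TOKEN and its value are made up for illustration; each line spawns a child `sh` and records whether the child can see the variable.

```shell
#!/bin/sh
# DEMO_TOKEN is a hypothetical name used only for this demonstration.
unset DEMO_TOKEN

# 1. Set but NOT exported: visible in this shell, invisible to children.
DEMO_TOKEN="s3cret"
not_exported=$(sh -c 'printf "%s" "${DEMO_TOKEN-unset}"')

# 2. Passed to one specific command only, via a prefix assignment.
one_off=$(DEMO_TOKEN="$DEMO_TOKEN" sh -c 'printf "%s" "${DEMO_TOKEN-unset}"')

# 3. Exported: every child started from here on inherits it.
export DEMO_TOKEN
exported=$(sh -c 'printf "%s" "${DEMO_TOKEN-unset}"')

# 4. Unset before launching: children no longer receive it.
unset DEMO_TOKEN
after_unset=$(sh -c 'printf "%s" "${DEMO_TOKEN-unset}"')

printf 'not exported: %s\none-off: %s\nexported: %s\nafter unset: %s\n' \
    "$not_exported" "$one_off" "$exported" "$after_unset"
```

Cases 1 and 4 print "unset" and cases 2 and 3 print the value, so the one-off form in case 2 is the one that gives the secret to exactly one process.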
Ideally you would fetch values directly from Vault, e.g. using the REST API, ideally with SSL (but that depends on the environment your app is running in, etc.) or using the vault command.
One can either access the Vault REST API directly inside the app itself, or one can pull data from it in a script file that launches the app, etc. and set any necessary environment vars dynamically before launching the app.
e.g. in a launch script you might do something like (sorry, no idea how to do preformatted text on HN) :
SOME_KEY=$(curl [access-your-vault-appropriately-here-using-access-tokens-etc] | jq whatever)
Or, in wrapper launch scripts, instead of using the REST API directly with curl, instead use the vault command directly, if it's installed, e.g.
SOME_KEY=$(vault kv get -field=some_key foo/whatever)  # -field prints only that one value; some_key is a placeholder field name
Although you'd also need to do some calls upfront first, to authenticate and get an access token, before querying for data/secrets.
But doing these kinds of calls in the global environment gives those secrets to, well, everything in the global environment.
If you need to pass a vault secret to some specific app, then you want to read from the vault as close to that app's launch as possible, e.g. in a wrapper script that launches that app (instead of launching it 'naked', and leaving it to read from global environment) - or by actually accessing the vault directly from within the app (which isn't gonna be possible with third-party stuff, unless it already supports your vault natively)
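A rough sketch of such a wrapper, under some assumptions: `fetch_secret` and `run_with_secret` are hypothetical helper names, the `vault` CLI is assumed to be installed and already authenticated, and the path and field names are placeholders. The secret is exported only inside a subshell that then becomes the app, so the calling shell's environment never contains it.

```shell
#!/bin/sh
# Hypothetical helper: swap in whatever your own vault setup needs.
# $1 = secret path, $2 = field name (both placeholders).
fetch_secret() {
    vault kv get -field="$2" "$1"
}

# Run a command with one secret in its environment, and nowhere else.
# $1 = env var name, $2 = secret path, $3 = field, rest = command to run.
run_with_secret() {
    var_name=$1
    secret_path=$2
    field=$3
    shift 3
    secret=$(fetch_secret "$secret_path" "$field") || return 1
    (
        # The export happens in a subshell, so it never reaches the caller.
        export "$var_name=$secret"
        exec "$@"
    )
    status=$?
    unset secret
    return "$status"
}

# Example (placeholders):
#   run_with_secret SOME_KEY foo/whatever some_key ./my-app
```

Exporting in a subshell rather than running `env NAME=value cmd` also keeps the secret out of any process's command line, which ties back to the earlier point about process listings.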
Environment variables exist to share information with processes spawned in that environment. If you don’t want the process to know something, you could look into using something like “env” to spawn the process with a reduced environment, but in general it’s good hygiene not to have anything in the environment that you wouldn’t feel comfortable with a process reading.
> you could look into using something like “env” to spawn the process with a reduced environment
Can you elaborate/show an example?
I agree in general to limit sensitive data stored in env vars, but often it's not feasible. One might have e.g. AWS keys loaded when using Terraform. The problem in this particular case is that Cursor is not selective in what env vars are loaded, it serializes _everything_. That seems like an odd choice to me.
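On the `env` suggestion, a minimal sketch: `env -i` starts the command with an empty environment, and only the variables listed after it are passed on. `run_clean` is a made-up helper name, and which variables a given app really needs (PATH and HOME here) is an assumption you would have to adjust.

```shell
#!/bin/sh
# Run a command with an almost empty environment.
# `env -i` ignores the inherited environment; only the variables
# named explicitly (PATH and HOME here) are handed to the command.
run_clean() {
    env -i PATH="$PATH" HOME="$HOME" "$@"
}

# Example (placeholder command):
#   run_clean some-editor
```

Anything exported in the calling shell, such as AWS keys loaded for Terraform, is simply absent in the launched process. GUI apps often want a few more variables than this, so expect some trial and error.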
> The problem in this particular case is that Cursor is not selective in what env vars are loaded, it serializes _everything_. That seems like an odd choice to me.
If you want to stop it having access to your secrets, that is done by not providing them, not by hoping that it won't access the secrets that you give it. The focus on what Cursor is doing after it already has access that you didn't mean to give it, instead of on what you are doing to provide that access to it, seems the wrong way around.
I'm very confused here, the argument is just JavaScript code... What is passing that argument?
Assuming that JavaScript is evaluated by the launched process, that will serialise the entire environment -- which is passed to every process anyway.
I'm confused too :). The LuLu alert says [0]:
The alert showed up right after I installed the app and clicked on the Cursor icon [1]. I understand that processes can access the entire environment by default (which was always a bit too "open"/strange for my taste, but I get the reason) but then I don't understand why you need to use `JSON.stringify(process.env)`.
[0] <some id> was some 12 character hex string
[1] Note that this was like a month ago; I have a cold, so finally have time to catch up on such questions
I know nothing about how Cursor is implemented, so this may be wildly off base, but...
Perhaps it is written using some kind of JavaScript framework that doesn't allow access to the process environment by default, but this lets them work around that to access the environment like a native app?
One reason you'd want an IDE to have access to your environment is to enable any tools/compilers/whatever you launch from the IDE to inherit the environment (say, to access SSH_AUTH_SOCK, or whatever).
Ok, this sounds like a plausible direction — thanks!
It seems that Cursor is using Electron. I don't know about its internals (tried it a couple of times years ago), but after a quick look at the docs I've found this: https://www.electronjs.org/docs/latest/api/utility-process#u... which is a section describing `utilityProcess.fork(modulePath[, args][, options])`:
I don't get why some unique-looking prefix/suffix would be needed there, though.
Why does that seem like an odd choice? An experienced developer knows that making a list of which vars are relevant and only serializing those means extra work in the future to upkeep the list. Seems odder to pick just the one you want in that instant.
It seems like the command is from this line of the VSCode source (Cursor is a fork of VSCode): https://github.com/microsoft/vscode/blob/f8b29f8da2c9bfda029...
GitHub Copilot thinks it does this to capture shell-specific environment variables (like those set up in .zshrc) that you wouldn't necessarily get unless you open the app from a shell yourself. Given it's been like this for at least 4 years, I don't think it's necessarily anything nefarious, and it's likely unchanged in Cursor.
Seems like the correct answer to me. Let's assume henceforth in this post that the code still does what the original VS Code authors claim it was intended to do, and nothing more. If you launch the IDE from a shell, or launch ANY program from that same shell, it will automatically have access to the environment that you're concerned about.
Here's where they introduced wrapping the environment output in "random" numbers:
https://github.com/microsoft/vscode/commit/1336b3d3c0d4338fb...
The associated issue explains that they needed to be able to ignore extraneous info returned by the shell itself, so they make the command return a token to delimit the actual environment info they want.
The very idea of spawning a shell to grab its environment has been there since the beginning of VS Code:
https://github.com/microsoft/vscode/commit/8f35cc4768393b254...
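The delimiter idea can be sketched roughly like this. It is a simplified illustration, not VS Code's actual code: it uses plain `sh` and `env` instead of the user's login shell and `JSON.stringify(process.env)`, the function names are made up, and the marker is built from the PID as a stand-in for a random token.

```shell
#!/bin/sh
# Return the text between the first two occurrences of a marker.
# $1 = raw output, $2 = marker
extract_between() {
    rest=${1#*"$2"}          # drop everything up to and including the first marker
    payload=${rest%%"$2"*}   # drop the second marker and everything after it
    printf '%s' "$payload"
}

# Spawn a child shell that prints some noise (as an rc file might), then
# marker + environment + marker. Only the part between the markers is kept.
capture_env() {
    mark="mark$$"   # stand-in for a random token; $$ is this shell's PID
    raw=$(sh -c 'echo "banner noise from an rc file"
                 printf "%s\n" "$1"; env; printf "%s" "$1"' sh "$mark")
    extract_between "$raw" "$mark"
}

# Example:
#   capture_env | grep '^HOME='
```

The marker is passed as a positional argument rather than as an environment variable, otherwise it would show up inside the `env` output and confuse the extraction.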
Thanks, I appreciate the additional context. Now all of it makes sense :)
Thank you, this is it!
The serialization of environment variables when Cursor starts might be for configuring its running environment, like determining plugin loading or server connections, or for diagnostic and debugging purposes to help locate issues. However, this has risks as environment variables may contain sensitive info such as API keys, and could be leaked if the software has vulnerabilities. To handle such software, check the official documentation or community forums for the reason of accessing env vars, use tools to monitor its usage of the vars, and download from official channels. So, what do you think will be the biggest obstacle when implementing these actions?
There's not nearly enough context for what that code is doing to say whether this is a typical usage or not. The environment is there to be used. It's a bit weird that it's stringified, but again, there's not enough information.
That's the thing, though — trying to understand the reason with this limited info. I imagine that I could perform some reverse engineering but I wondered if maybe someone has some ideas/suggestions. Serializing the env vars is one thing, the other is the fact that they prefix/suffix it with (what appears like) a random identifier.
Clickable link to the screenshot: https://i.imgur.com/47UeNAw.png
Use an IP restricted key vault.
If you’re just trusting everything to .env, someone will hack you eventually.