Ask HN: Why is Cursor IDE accessing all my env vars?

15 points by iyn 5 months ago

Recently I tried playing with https://www.cursor.com/ but got spooked by the LuLu alert (https://objective-see.org/products/lulu.html) when launching the app, where the process args included "JSON.stringify(process.env)" part, see screenshot here: https://imgur.com/a/DmDuGTz

Is this... normal? I don't understand why they might want to serialize/access all of my env vars. Does anyone have a suggestion for that behaviour? I'm probably missing some reasonable explanation, happy to learn more.

I've been running a lot of stuff in VMs lately anyway, but don't want to end up having to spin up a VM for the core app like a code editor. How do you all deal with untrusted (but not really malware-level untrusted) software?

jimsmart 5 months ago

> Is this... normal? I don't understand why they might want to serialize/access all of my env vars. Does anyone have a suggestion for that behaviour?

All processes get a copy of all environment variables [edit for clarity: all environment variables, from the global environment].

Unless one goes out of one's way to prevent this from happening.
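
A minimal sketch of that inheritance, assuming a POSIX sh is on the PATH. DEMO_TOKEN is a made-up name used only for illustration:

```shell
# DEMO_TOKEN is a hypothetical variable, not something Cursor or any tool uses.
export DEMO_TOKEN="not-a-real-secret"

# The child shell is never handed the value as an argument, yet it can read it,
# because exported variables are copied into every child process.
child_view=$(sh -c 'printf "%s" "$DEMO_TOKEN"')
echo "child sees: $child_view"
```

Any program launched from that shell, an editor included, gets the same copy.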

> the process args included "JSON.stringify(process.env)" part

And this app chooses to receive the env vars in JSON format. NBD really, in light of the above points.

Environment variables are not secret at all. Quite the opposite: because all processes get a copy of them. They're just variables that are associated with / stored in the environment, instead of e.g. in code itself. They absolutely should not be considered to be secure in any way.

Managing secrets is always tricky. Even a naive attempt at trying to avoid using env vars generally leaks stuff in some way - shell command history will record secrets passed-in at launch time, plus any running process (with sufficient permissions) can get a list of running processes, and can see the command line used to invoke a process.

And once one gets past the naive solutions, it usually adds some friction somewhere along the line. There's no easy, transparent, way to do things, as far as I am aware. They all have some cost.

There are quite a few articles on the web about this topic as a whole. I don't think anything particularly new will come from HN users here, it'll mostly be repeating the same already known/discussed stuff. As I myself am doing here, really.

You might find it helpful to consider something like Hashicorp's Vault, or similar, for proper management of secrets.

Two4 5 months ago

Using env vars for secrets has become semi-normalised because of container-based development and deployment. It's okay-ish in the limited context and scope of a container, but it's not good at all in a host OS or VM context. Some dev practices have leaked through, possibly because it's an approach that works in all environments even if it's not best practice.
- jimsmart 5 months ago
  
  I think it was actually normalised long before container-based development was even a thing. It's always just been standard common practice — both in development and for live deployment.
  With the assumption being that it's safe, if the box itself is safe (is secure and is running trusted processes).
  You have to store the secrets somewhere, and at point of usage they are no longer secret. So one has to assume that any truly determined adversary will undoubtedly get hold of all secrets anyhow.
  Anything else is all about minimising risk. And, as with all security practices, there is always a cost/benefit analysis that has to be made, and there will be some kind of cost/benefit tradeoffs made throughout the system / system design, as a result.
  But regarding your original point: I would actually think that container-based development makes it easier to provide secrets to only the containers that need them, because e.g. with Docker, environmental variables can easily be specified in separate env files that are passed only to specific containers.
iyn 5 months ago

Thanks, I appreciate the detailed explanation.
I'm familiar with Vault and have been using it at work, but we tend to fetch values from Vault and export them as env variables in the end anyway. Obviously we don't want to hardcode these values in the code either. So env vars are not good for secrets, hardcoding is terrible; what's good/secure then?
- cjbprime 5 months ago
  
  Env vars are fine for secrets, as long as you provide the right env vars to the right processes. You can unset them before launching a new process, or better still, not "export" the sensitive ones to all processes.
  - jimsmart 5 months ago
    
    This ^^
    Just avoid putting secrets in the global environment if it is a concern, and instead just pass necessary secrets locally in the environment when launching a specific app.
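
The difference between those options can be sketched like this, assuming a POSIX sh. All variable names here (GLOBAL_SECRET, LOCAL_ONLY, APP_KEY) are hypothetical:

```shell
# Hypothetical variable names, for illustration only.
GLOBAL_SECRET="exported-everywhere"
export GLOBAL_SECRET
LOCAL_ONLY="kept-in-this-shell"   # set in this shell, but not exported

# A plain (non-exported) shell variable is not copied into child processes.
seen_local=$(sh -c 'printf "%s" "${LOCAL_ONLY:-not-visible}"')

# Prefixing a command with VAR=value puts it in that one child's environment only.
seen_scoped=$(APP_KEY="for-this-app-only" sh -c 'printf "%s" "$APP_KEY"')
after_scoped="${APP_KEY:-not-visible}"

# Unsetting inside a subshell hides an exported variable from that one launch.
seen_unset=$( (unset GLOBAL_SECRET; sh -c 'printf "%s" "${GLOBAL_SECRET:-not-visible}"') )

echo "non-exported: $seen_local"
echo "scoped to one command: $seen_scoped (parent afterwards: $after_scoped)"
echo "after unset in subshell: $seen_unset"
```

The per-command form is usually the least intrusive: nothing needs to be cleaned up afterwards, because the parent shell never holds the value in its own environment.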
- jimsmart 5 months ago
  
  Ideally you would fetch values directly from Vault, e.g. using the REST API, ideally with SSL (but that depends on the environment your app is running in, etc.) or using the vault command.
  One can either access the Vault REST API directly inside the app itself, or one can pull data from it in a script file that launches the app, etc. and set any necessary environment vars dynamically before launching the app.
  e.g. in a launch script you might do something like (sorry, no idea how to do preformatted text on HN) :
  SOME_KEY=$(curl [access-your-vault-appropriately-here-using-access-tokens-etc] | jq whatever)
  Or, in wrapper launch scripts, instead of using the REST API directly with curl, instead use the vault command directly, if it's installed, e.g.
  SOME_KEY=$(vault kv get foo/whatever)
  Although you'd also need to do some calls upfront first, to authenticate and get an access token, before querying for data/secrets.
  But doing these kinds of calls, in the global environment gives those secrets to, well, everything in the global environment.
  If you need to pass a vault secret to some specific app, then you want to read from the vault as close to that app's launch as possible, e.g. in a wrapper script that launches that app (instead of launching it 'naked', and leaving it to read from global environment) - or by actually accessing the vault directly from within the app (which isn't gonna be possible with third-party stuff, unless it already supports your vault natively)
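
A rough sketch of the wrapper-script idea, under loud assumptions: fetch_secret is a hypothetical stand-in for the real lookup (the curl or vault call above), and a small sh -c command plays the part of the app being launched:

```shell
# fetch_secret is a hypothetical placeholder for the real Vault lookup;
# here it only returns a dummy value so the sketch runs anywhere.
fetch_secret() {
  printf 'dummy-value-for-%s' "$1"
}

# Fetch at launch time and hand the value to this one process only.
# The sh -c command stands in for the real application.
app_view=$(SOME_KEY="$(fetch_secret foo/whatever)" \
  sh -c 'printf "%s" "${SOME_KEY:-not-visible}"')

# The wrapper's own environment, and anything else it launches, never holds the value.
shell_view="${SOME_KEY:-not-visible}"
other_view=$(sh -c 'printf "%s" "${SOME_KEY:-not-visible}"')

echo "app: $app_view | wrapper: $shell_view | other process: $other_view"
```

The point of the shape is that the secret lives only in the environment of the one process that needs it, rather than in the shell that everything else is launched from.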

seanhunter 5 months ago

Environment variables exist to share information with processes spawned in that environment. If you don’t want the process to know something, you could look into using something like “env” to spawn the process with a reduced environment, but in general it’s good hygiene not to have anything in the environment that you wouldn’t feel comfortable with a process reading.

iyn 5 months ago

> you could look into using something like “env” to spawn the process with a reduced environment
Can you elaborate/show an example?
I agree in general to limit sensitive data stored in env vars, but often it's not feasible. One might have e.g. AWS keys loaded when using Terraform. The problem in this particular case is that Cursor is not selective in what env vars are loaded, it serializes _everything_. That seems like an odd choice to me.
- cjbprime 5 months ago
  
  > The problem in this particular case is that Cursor is not selective in what env vars are loaded, it serializes _everything_. That seems like an odd choice to me.
  If you want to stop it having access to your secrets, that is done by not providing them, not by hoping that it won't access the secrets that you give it. The focus on what Cursor is doing after it already has access that you didn't mean to give it, instead of on what you are doing to provide that access to it, seems the wrong way around.
- seanhunter 5 months ago
  
  The basic pattern is instead of launching your binary directly, you launch it using something like /usr/bin/env -i PATH=/bin:/usr/sbin TMPDIR=/var/tmp/thisprocess/tmpdir SOMEOTHER_ENV_VAR1=foo SOMEOTHER_ENV_VAR2=bar mybinary args
  ie you whitelist just the env vars you know your process needs to operate and set a specific path. This prevents a lot of problems caused by weird LD_LIBRARY_PATH exploits, and also more prosaically prevents things like api keys from being passed into processes that don't need them and ending up in debug messages, logfiles etc. It's also good when you're root to do this so you don't accidentally start long-running processes using your user environment which is probably full of stuff that while not actively harmful, the process doesn't need and could cause problems...
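
To see the effect of that pattern, here is a small sketch, assuming env and /bin/sh are available. DEMO_API_KEY and APP_MODE are hypothetical names chosen for the demo:

```shell
# DEMO_API_KEY is a hypothetical variable standing in for a real credential.
export DEMO_API_KEY="should-not-leak"

# env -i starts from an empty environment; only the variables listed are passed on.
clean_view=$(env -i PATH=/bin:/usr/bin APP_MODE=demo \
  /bin/sh -c 'printf "%s|%s" "${DEMO_API_KEY:-not-visible}" "$APP_MODE"')

# For comparison, a normal launch inherits everything that was exported.
normal_view=$(/bin/sh -c 'printf "%s" "${DEMO_API_KEY:-not-visible}"')

echo "with env -i: $clean_view"
echo "normal launch: $normal_view"
```

The whitelisted process sees APP_MODE but not the key, while the ordinary launch sees the key.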
- davidt84 5 months ago
  
  I'm very confused here, the argument is just JavaScript code... What is passing that argument?
  Assuming that JavaScript is evaluated by the launched process, that will serialise the entire environment -- which is passed to every process anyway.
  - iyn 5 months ago
    
    I'm confused too :). The LuLu alert says [0]:
    process id: 54279 process args: -p "<some id>" + JSON.stringify(process.env) + "<some id>" process path: /Applications/Cursor.app
    The alert showed up right after I installed the app and clicked on the Cursor icon [1]. I understand that processes can access the entire environment by default (which was always a bit too "open"/strange for my taste, but I get the reason) but then I don't understand why you need to use `JSON.stringify(process.env)`
    [0] <some id> was some 12 character hex string
    [1] Note that this was like a month ago; I have a cold, so finally have time to catch up on such questions
    
    davidt84 5 months ago
    
    I know nothing about how Cursor is implemented, so this may be wildly off base, but...
    Perhaps it is written using some kind of JavaScript framework that doesn't allow access to the process environment by default, but this lets them work around that to access the environment like a native app?
    One reason you'd want an IDE to have access to your environment is to enable any tools/compilers/whatever you launch from the IDE to inherit the environment (say, to access SSH_AUTH_SOCK, or whatever).
    
    iyn 5 months ago
    
    Ok, this sounds like a plausible direction — thanks!
    It seems that Cursor is using Electron. I don't know about its internals (tried it a couple of times years ago), but after a quick look at the docs I've found this: https://www.electronjs.org/docs/latest/api/utility-process#u... which is a section describing `utilityProcess.fork(modulePath[, args][, options])`:
    env Object (optional) - Environment key-value pairs. Default is process.env.
    I don't get why some unique-looking prefix/suffix would be needed there, though.
- fragmede 5 months ago
  
  Why does that seem like an odd choice? An experienced developer knows that making a list of which vars are relevant and only serializing those means extra work in the future to upkeep the list. Seems odder to pick just the one you want in that instant.

tsunitsuni 5 months ago

It seems like the command is from this line of the VSCode source (Cursor is a fork of VSCode): https://github.com/microsoft/vscode/blob/f8b29f8da2c9bfda029...

GitHub Copilot thinks it does this to capture shell-specific environment variables (like those set up in .zshrc) that you wouldn't necessarily get unless you open the app from a shell yourself. Given it's been like this for at least 4 years, I don't think it's necessarily anything nefarious, and it's likely unchanged in Cursor.

jrootabega 5 months ago

Seems like the correct answer to me. Let's assume henceforth in this post that the code still does what the original vs code authors claim it was intended to do, and nothing more. If you launch the IDE from a shell, or launch ANY program from that same shell, it will automatically have access to the environment that you're concerned about.
Here's where they introduced wrapping the environment output in "random" numbers:
https://github.com/microsoft/vscode/commit/1336b3d3c0d4338fb...
The associated issue explains that they needed to be able to ignore extraneous info returned by the shell itself, so they make the command return a token to delimit the actual environment info they want.
The very idea of spawning a shell to grab its environment has been there since the beginning of vs code:
https://github.com/microsoft/vscode/commit/8f35cc4768393b254...
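
The delimiter trick can be illustrated with a simplified shell-only sketch. This is not the VS Code code (which runs a JavaScript snippet and emits JSON); the marker format, the banner text, and DEMO_VAR are all assumptions made for the demo:

```shell
# A unique-ish marker; the real marker format is not reproduced here.
mark="ENVMARK$$"
export DEMO_VAR="hello"   # hypothetical variable, for illustration

# Simulate a shell whose startup files print extra text around the useful output.
# The marker travels in the command string, so it is printed right before and
# right after the environment dump.
raw=$(sh -c "echo 'Welcome banner from a profile script'; printf '%s' '$mark'; env; printf '%s' '$mark'; echo 'trailing noise'")

# Keep only what lies between the first and last marker.
body=${raw#*"$mark"}
body=${body%"$mark"*}

printf '%s\n' "$body" | grep '^DEMO_VAR='
```

Everything the shell printed outside the markers is dropped, which is why the prefix and suffix in the process args look random: they only need to be unlikely to occur in the surrounding noise.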
- iyn 5 months ago
  
  Thanks, I appreciate the additional context. Now all of it makes sense :)
iyn 5 months ago

Thank you, this is it!

marshughes 5 months ago

The serialization of environment variables when Cursor starts might be for configuring its running environment, like determining plugin loading or server connections, or for diagnostic and debugging purposes to help locate issues. However, this has risks as environment variables may contain sensitive info such as API keys, and could be leaked if the software has vulnerabilities. To handle such software, check the official documentation or community forums for the reason of accessing env vars, use tools to monitor its usage of the vars, and download from official channels. So, what do you think will be the biggest obstacle when implementing these actions?

viraptor 5 months ago

There's not nearly enough context for what that code is doing to say whether this is a typical usage or not. The environment is there to be used. It's a bit weird that it's stringified, but again - not enough information.

iyn 5 months ago

That's the thing, though — trying to understand the reason with this limited info. I imagine that I could perform some reverse engineering but I wondered if maybe someone has some ideas/suggestions. Serializing the env vars is one thing, the other is the fact that they prefix/suffix it with (what appears like) a random identifier.

iyn 5 months ago

Clickable link to the screenshot: https://i.imgur.com/47UeNAw.png

ratg13 5 months ago

Use an IP restricted key vault.

If you’re just trusting everything to .env, someone will hack you eventually.