Vs code extensions have been terrifying for a long time. Such a wild and obvious attack vector. I'm constantly getting pop ups in vscode to install an extension because it recognizes a certain file type. It's 50-50 whether that extension is owned by a company or some random dev. Some of these have millions of installs and on first glance appear to be official company owned extensions. I'm at a point in my life where I only installed official company owned extensions and even that is hard to be sure I'm not getting suckered. Sad state.
The problem extends far beyond VS code. All extensions and executable code has the same problem. There was a case where Disney was hacked because an employee installed a BeamNG mod that had bundled malware.
A company that wants to remain secure would have to employ strict restrictions on installing software. Only installing npm packages and plugins from an internal preapproved repo for example.
The problem is not VS code itself. It's the fact extensions can access things outside of the editor. As far as I am aware, no editor sandboxes extensions.
I won't say "you can take my VS Code from cold dead hands" or anything, but it is a very good tool, and Microsoft hasn't yet fucked it up the way they have so many other things.
I guess I'd say "you take my VS Code ... willingly ... but only after M$ fucks it up and makes me not want it anymore (like they've done to everything else they acquired)".
Zed is the closest thing I've found to meet my needs, and I do plan to try it. However it's dev container support looks to be lacking in some important ways so we'll see.
Emacs has been a viable option for going on a half century now. The GNU Emacs 31 branch[0] was cut recently and we're barreling towards a new release. It might be time to give it another look.
I'm not saying its package ecosystem isn't vulnerable to these kind of attacks, it is, but it is at least developed by folks who have very different goals and ambitions than Microsoft.
The root(-ish) cause here is the ease of publishing and installing extension code, and in particular the fact that there's no independent validation/verification step between the upstream author and armageddon. And upstream authors aren't set up with the needed precautions themselves, they're just hackers.
Basically if you phish Just One Account with write access to an extension you wan pwn everyone who's running it.
Yeah, the only thing that gives me hope is the optics of this happening to GitHub. Though it seems possible VS Code team could double down on the opinion that this isn't a permission/sandboxing problem, and is instead a scanning/threat detection problem.
Note that VS Code is built on Electron and it is a pain to sandbox because Electron has (had?) SUID sandbox helper, and you cannot run SUID binaries in sandbox easily. Sandboxing on Linux is extremely difficult task.
It feels so bad to see the "You need go give Chrome SUID Root for the sandbox to work". Setting a Web Browser SUID Root was an old joke about clueless users. It was the worst security screwup someone could imagine.
The forum listing for the stolen source code (per the screenshot in article) says 1 buyer or they leak for free. Is GitHub about to become open source?
As others have said it's just a fraction. I'm in a medium size tech-related company and we have 7500+ in one Github org. We have two orgs, so altogether easily 10K+. Of course most of it is stale, obsolete, sandbox, personal tools, etc. I wouldn't be surprised if Github would have 100K+ internal repos or even more.
No OP but I used to work at a large company with a similar number of repos.
When I left about a year ago, we had just started (after being on Github for almost 8 years) an ongoing project of first archiving old/outdated repos in place, and then moving them to an "archived" sub-org, and waiting to see if anyone complained.
Previously no one wanted to outright delete or remove repos because of the risk that someone somewhere was relying on it, and also there was no actual downside to just leaving them there (no cost savings, no imminent danger other than clutter, etc), so resources were never allocated to do it. There was always something more important to work on.
In an org with a higher floor of engineering management, a proactive program for removing unused or outdated repos would absolutely be expected though I think.
This is a continual fight for me. At nearly every company I've had to compromise on using a graveyard repo for packages within a monorepo, even though git has the whole history already.
I worked for a food retail store once. I remember going in the first day wondering, how hard can it really be... From the outside, it looks like they have a simple website. The website to order things on was an amalgamation of 300+ repo's. GitHub lost less in this breach. It takes a lot of effort to keep things simple as you grow.
Something cool that I've always liked about working at GitHub is how much of the company _runs on GitHub_ -- A lot of teams, even non-technical teams, have their own repos just to organize docs/SOP's/designs/etc like a traditional knowledge work company might use a Sharepoint
Personally I have over a hundred, especially from quick prototypes, studies or instances of templates so I can easily see how over 18 years and many hundreds of employees you end up with thousands.
No, there's no joke, you might have just misread the article (the 3,800 number is the number of internal GitHub repos the employee had downloaded on their personal computer / had access to on their own GitHub account)
In my personal experience, give it a decade or two, and any corporation will accumulate hundreds (or even thousands) of abandoned internal repos containing discontinued services, POCs/prototypes that never went anywhere, etc – people forget to archive them, or aren't sure whether something is still in use or not so err on the safe side.
AI is making this even worse. With coding agents, anyone can throw together a quick internal prototype of any idea they have, even if it has no hope of ever making it to production.
Maybe though AI will make it better, assign agents to monitor, maintain and keep repos up to date or via A2A refer them to an agent to dispose of them in accordance with company requirements. I actually think AI will greatly help this type of problem.
Autoarchiving repos which nobody has used in X years doesn’t require any AI - you can just write a bot to do it. People don’t, because it isn’t a priority. AI can make writing such a bot a bit easier, but can’t help much with getting approval from the powers that be to run it.
really? I mean these are internal repos. Probably most of them are random one-off experiments or a place to park code. Google has 2,900 "public" repos on github. Microsoft has ~8k "public" on github too. Can't even imagine how many they have on their internal systems.
i'd love to be able to use fine grained tokens with gh and not expose every repo and org that i am connected to on github, but you can't see the results of a github actions check that way (no 'Checks' permission available). hoping these breaches push things in the direction of access being less annoying to manage.
The problem is that the main target for these repos are the internal IaaS type repos that contain much of the juicy information.
A fine grained token is likely to have read access to the IaaS repo as that is likely the very repo they are operating on when the malware compromises them.
3800 repos up for blackmail may make a good headline but it's likely that Github don't really care about 3798 of those repos being made public. It'd be annoying for those 3798 to be made public but they can deal with that. It's the 2 repos that contain really important stuff that they really don't want to be made public. You can't rely on fine grained tokens to limit the leak of these things as, at some point, someone with that very access will get compromised.
Limiting TTL on tokens/auth isn't a perfect solution either. If the token is leaked via some malware it can be used to clone repos within minutes (even seconds) of being leaked. No-one wants to have to perform 2FA every few seconds in order to get on with their day.
IP based restrictions may help, but then the malware would probably evolve to include a tailscale/wireguard key so that the clone/exfiltration is done from the existing IP address and then the data is proxied away separately.
Future dev environments are going to be heavily sandboxed in terms of "do github stuff in this sandbox, copy files to another sandbox to do package updates, vet everything coming back, etc"
i was more thinking like, if i am working on project ABC for org XYZ it's understandable that if my dev vm gets owned that ABC is leaked. it's not that acceptable if all of org XYZ's repos that i have access to get leaked. and especially not acceptable if everything i have access to, including other orgs, and the admin ability to do destructive operations on them, gets exposed. but status quo is that that's absolutely the case, and you basically need org specific github accounts to reduce the risk of that. or use the knee-capped fine grained PATs that github offers but don't work for common things like seeing if your PR is green.
agree generally with what your getting at though: doesn't solve this problem. but even just a basic reduction in blast radius would be nice.
I was noodling around with personal access tokens today on GitHub and found out that you actually can restrict tokens to specific repositories, orgs, etc. Not sure if actions is a scope that is available or not.
I'd have thought that by now, most would have been swapping to WebAssembly. It's really nicely sandboxed, you expose it to only what you want, and you can compile a lot of languages into a WASM form meaning you're not stuck with only Javascript or similar. Am I naive for thinking that?
The (lack of) security of VSCode has always been astounding. People have asked for sandboxing extensions for years [0] with little to no progress, and issues have been discussed a lot (e.g. [1][2]). I guess it hasn't been a big issue, likely because most developers are not complete idiots. But it only takes one developer and one bad extension to consequences like this.
I mean, I understand that it is hard to sandbox Node.js applications, but apparently Microsoft has put way more effort into their Copilot slop than security.
I am so, so stressed about Sublime Text... It feels like a massive disaster just waiting to happen. They don't even run their own package marketplace :(
Even if you don't install crap, the latest strategy is attacking the developer of one of the extensions or their build process so you can push a malware update to an otherwise legitimate extension.
At what point did/does it start feeling naive to trust the integrity and output of Github Actions on general? Does it feel unlikely that an attacker would be able to get a foothold in that infrastructure?
If you own a GitHub organization and are looking for what changes/controls you can apply to reduce the risk/impact of PAT token exfiltration (and subsequent abuse) like what occurred here, I listed a few at the end of https://blog.bored.engineer/github-canarytokens-5c9e36ad7ecf...
- Enable audit log streaming[1] on your enterprise including source IPs and API requests, even if it’s just going to an S3 bucket nobody looks at it, your incident response team will thank you later.
- Enforce the use of SSO on your GitHub organization[2], not just because SSO is good but because it forces an explicit authorization action[3] by users to grant an SSH key/PAT access to your organization resources, instead of granting access implicitly. That way the PAT created for someone’s weekend project won’t have access to your organization resources.
- Enforce an IP allowlist[4] for your organization from a set of known trusted VPN/corporate IPs. This is by-far the strongest control (and the most painful to rollout) as it will prevent stolen credentials (even if still valid) from being used by an attacker except on the intended systems where you (hopefully) have other visibility/alerting via EDR or related tooling.
- If you can, restrict access from personal access tokens[5] to your organization resources. Blocking classic PATs and enforcing a maximum expiration (ex: 3 months) on fine-grained PATs is a great way to reduce risk if you can’t eliminate PATs altogether[6].
- If you use GitHub enterprise (on-prem), configure collection of the raw HTTP access logs[7] in addition to native GitHub audit logs, it may prove critical during incident response.
Has there been any confirmation of this from a source other than X? It's weird that that's the only source, and therefore makes me distrust the entire story.
A few days ago I saw I had an update to the Twig extension. The UI flagged it as having new executable code in the update bundle, so I didn't install the update, disabled the extension as I wasn't working on Drupal views that day, and went about my work. I didn't have time to investigate the new update's contents. When I went back to the extension page, it was taken down: https://open-vsx.org/extension/whatwedo/twig
I'm not saying it was whatwedo.twig, but I'm not saying it wasn't, either.
Edit: If anyone's got a good recommendation for a twig formatter for Cursor / VS Code, please let me know.
The security measure that the developer didn't use was completely refusing to use vscode.
vscode has no security model. It's not like swiss cheese where there are holes and some of the go all the way through. vscode is all hole with some cheese on the side. There is absolutely no isolation between the front-end process, the backend size (the thing that runs in the remote or the devcontainer), and any extensions or anything that might be in a repository whose authors you "trust".
Or you can just refuse to use random extensions. I built my own extensions if I needed them. You're a programmer, right? The whole point of extensibility is that you, or your company, can program what you need from your IDE, without having to make a whole IDE from scratch. I have since moved on to making my own IDE, mostly because I hate Electron and its >1gb memory footprint, but vscode served me so much better than anything else for years, without installing a single rando's extension.
A vscode workspace can trivially execute code on the machine that runs the server end of vscode. (This is how building works -- there is no sandbox unless the workspace config explicitly uses some kind of sandbox.) So the workspace can usually trivially elevate permissions to take over the vscode server, including installing extensions on it without asking you.
In principle, there is a teeny tiny bit of isolation between the local and remote sides, so the remote side cannot trivially execute code on the local machine. But I recommend reading this rather long-standing ticket:
It would be nice if there was an easy way to prevent people from installing vscode remotes on a shared server... Probably can run an ebpf routine to disallow creation of folders named . vscode*
This is my position as well, but it's rarely received well. Usually, a response like "why would I rewrite something that's already been written and available?" By writing the code, I know how it works. I know it is not infected with crap. I know it will not in the future be infected with crap from a down stream dependency. It seems to me this really took off with node to the point that it's laughable at what people will include with no thought at all. I know component libraries have existed for many other languages before, but node just stands out to me
Most bosses look poorly upon spending their budget on rewriting software that already exists and simultaneously most bosses(although not the exact same set) don’t care about security until a disaster has already occurred.
And it’s also not like you’re going to literally write every piece of software you use, unless you’ve started all the way down at machine code you’re drawing the line somewhere on using code written by other people.
> I built my own extensions if I needed them. You're a programmer, right? The whole point of extensibility is that you, or your company, can program what you need from your IDE,
Dude, get real. We don't all have the luxury of being able to engage in endless IDE extension programming side quests just to do our day jobs. And even if we did, there's the reality that whatever you produce is probably not nearly as feature complete or bug free as the extension someone spent years writing. Hence why people want to reach for off the shelf solutions.
Ah, there it is. The root of most problems in the software industry: people who hate programming and avoid doing it as much as possible, because they only got into it for the money.
I have no problem writing extensions in my spare time because programming is fun. Because I know how to program, like, actually program and not just copypaste stuff off StackOverflow, it doesn't take years to write a vscode extension, either.
> people who hate programming and avoid doing it as much as possible, because they only got into it for the money.
Yeah, not the case at all. I love programming, I've been doing it since I was a kid, for over 30 years. But I DO have to earn a living, and I'd rather spend free time programming things that interest me. Writing IDE extensions and tooling all the way down to the bare metal because I can't be absolutely sure at all times that node.js code doesn't contain a virus is not one of those things.
The 3800 repos weren't exfiltrated from the compromised machine.
The malware (be it a VSCode plugin, an npm package, or whatever is next) simply slurps up all of the users private keys/tokens/env-vars it can find and sends this off somewhere covertly.
It's trivial to do this in a way to avoid detection. The small payload can be encrypted (so it can't be pattern matched) and then the destination can be one of millions of already compromised websites found via a google search and made to look like a small upload (it could even be chunked and uploaded via query parameters in a HTTP GET request).
The hackers receive the bundle of compromised tokens/keys and go look at what they give access to. Most of the time it's going to be someone's boring home network and a couple of public or private github repos. But every once in a while it's a developer who works at a big organisation (e.g. Github) with access to lots of private repos.
The hackers can then use the keys to clone all of the internal/private repos for that organisation that the compromised keys have access to. Some organisations may have alerts setup for this, but by the time they fire or are actioned upon the data will probably be downloaded. There's no re-auth or 2FA required for "git clone" in most organisations.
With this data the hackers have further options:
a) attempt to extort the company to pay a ransom on the promise of deleting the data
b) look for more access/keys/etc buried somewhere in the downloaded repos and see what else they can find with those
c) publish it for shits and giggles
d) try and make changes to further propagate the malware via similar or new attack vectors
e) analyse what has been downloaded to work out future attack vectors on the product itself
Right now Github (and others recently compromised in similar ways) will be thinking about what information is in those internal repos and what damage would it cause if that information became public, or what that information could be used to find out further down the line.
"Customer data should not be in a github repo" is all well and good, but if the customer data is actually stored in a database somewhere in AWS and there's even just one read-only access token stored somewhere in one private github repo, then there's a chance that the hackers will find that and exfiltrate the customer data that way.
Preventing the breach is hard. There will always be someone in an org who downloads and installs something on their dev machine that they shouldn't, or uses their dev machine for personal browsing, or playing games, or the company dev infra relies on something that is a known attack vector (like npm).
Preventing the exfiltration is virtually impossible. If you have a machine with access to the Internet and allow people to use a browser to google things then small payloads of data can be exfiltrated trivially. (I used to work somewhere where the dev network was air-gapped. The only way to get things onto it was typing it in, floppy or QIC-150 tape - in the days before USB memory sticks.)
Detecting the breach is nigh on impossible if the keys are not used egregiously. Sure some companies can limit access to things like Github to specific IPs, but it wouldn't take much for the malware to do something to work around this. (I can see things like a wireguard/tailscale client being embedded in malware to allow the compromised machine to be used as a proxy in such cases.)
Alerting that requires manual response is nigh on useless as by the time someone has been paged about something the horse has already bolted.
Knowing what has been taken is also a huge burden. 3800 repos that people now have to think about and decide what the implications are. Having been through something like this in the past there are plenty of times people go "I know that repo, it's fine, we can ignore that one" only for it to contain something they don't realise could be important.
These kind of attacks are going to become increasingly common as they're proven to work well and the mitigations for them are HARD. It doesn't need to be targeted at all either, you just infect a bunch of different things and see what gets sent in.
If companies continue to not pay the ransom then we're going to get a lot more things published and many companies having to apologise for all manner of things that end up being leaked.
> It's trivial to do this in a way to avoid detection
I'd love to see a real example/PoC.
Anyway, we discussed this issue in the other thread. For me, unrestricted outbound requests to any url, whether it's well known domains like api.github.com or any other domain, are a red flag.
Why does VS need to establish outbound requests to any domain, without authorization?
There's no magic solution, and these attacks will evolve, but I still think that restricting outbound requests is a good measure to mitigate these attacks.
> slurps up all of the users private keys/tokens/env-vars it can find and sends this off somewhere covertly.
Isolating applications can also mitigate the impact of these attacks. For example, you can restrict VS code to only share with the host .vscode/, .git/ and other directories. Even by project.
Again, it's not bulletproof, but helps.
> Why does VS need to establish outbound requests to any domain, without authorization?
I don't know but it's very standard practice in most applications, because telemetry. But VS code is one of the worst: just check open snitch when running VS code, it's constantly phoning to a bunch of IPs.
> but I still think that restricting outbound requests is a good measure
It is 100% necessary, but doesn't stop most attacks quick enough.
If you're posting to github.com/acmecompany then attackers love to do things like add their own user github.com/acemcompany and just upload your data to that. Generally it doesn't last very long, but with CI/CD they can get thousands of keys in a minute and be gone seconds later.
There are plenty of exfiltration examples out there that could go through known, commonly-greenlit domains. Even exfil via DNS requests has been demonstrated.
But at least in that case, there’s a chance that the outbound requests are blocked. Malware isn’t perfect. Simple measures can block a significant proportion of attacks.
Ah yes, sandboxing/limiting a VSCode plugin is not impossible. I was thinking in more general terms (such as post install scripts within npm/python packages). Random test code in golang packages. There's an awful lot that people don't vet because keeping up with the vetting is a huge burden which seems pointless until you're the one that gets hacked.
The trick is to infect a plugin that has a legitimate reason for accessing the internet or running certain commands, and then coming up with ways to abuse that to exfiltrate the data. Or exfiltrating via DNS queries, or some other vector that isn't so obvious as "allow TCP/UDP connections to the whole world".
That or just repeatedly pester a user for permissions until one user (and you only need one within the organisation) relents and grants it.
the pop-ups fatigue is already an issue, and not an easy one to solve. Pretty much like SIEM/SOC alerts.
> The trick is to infect a plugin that has a legitimate reason for accessing the internet or running certain commands, and then coming up with ways to abuse that to exfiltrate the data. Or exfiltrating via DNS queries, or some other vector that isn't so obvious as "allow TCP/UDP connections to the whole world".
They'll get there, maybe. But the reality is that right now, everyone allows outbound requests blindly.
Instead of speculating, I suggest to actually investigate current IOCs and common tactics of malicious npm/pip/plugins/VS extensions. Something like this:
Or use OpenSnitch (or Lulu, Glasswire, ZoneAlarm anyone?:D etc) to actually analyze real VS malicious extensions or npm packages and see if it stops the exfiltration, and if not, suggest ways to improve it. For example:
If paying the ransom doesn't stop your data getting leaked, nobody will pay the ransom. There is a rational basis for the ransomers to follow through with the deletion. Even the mob did provide "protection" when they coerced you into paying for it.
This sounds like a naive presumption. Are ransomware distributors well known for operating within strict hierarchies bound by culturally-ingrained traditions, or acting in the best interests of their own “greater good”?
Last I heard, teenagers can deploy ransomware with minimal technical knowledge or skill.
The data has been stolen by a criminal group. Paying for "restoring" the data does not guarantee they will delete all copies. There is no way of proving they actually did and they have in fact very little incentive to actually delete it.
You have to take their words for it but how can you trust crooks?
At some point, these people will come up with a ransom-as-a-service that you can subscribe to make monthly payments. It's no different than having to pay criminals monthly for security to prevent them from harming your themselves.
If only the company behind VSCode, the company behind NPM and the company behind GitHub could get together and figure out a solution to this.
Perfectly demonstrating the truth of the "Microsoft org chart" cartoon.
https://bonkersworld.net/organizational-charts
It is also company behind NuGet.
Guess what they did a year ago.
They removed 700 or so packages from NuGet proactively but those turned out to be false positives.
It is hard to do the right things.
i mean, then you say it like that…
Microsoft is the inverse hand of Midas, turns everything into shit.
Everything Microsoft makes sucks. If they decided to make vacuum cleaners though, they wouldn’t suck, they would blow.
Mierdas, as they say.
With $101 billion in profit last year I wish I could turn things into $hit as well as they do.
these days it's just Microslop
Vs code extensions have been terrifying for a long time. Such a wild and obvious attack vector. I'm constantly getting pop ups in vscode to install an extension because it recognizes a certain file type. It's 50-50 whether that extension is owned by a company or some random dev. Some of these have millions of installs and on first glance appear to be official company owned extensions. I'm at a point in my life where I only installed official company owned extensions and even that is hard to be sure I'm not getting suckered. Sad state.
The problem extends far beyond VS code. All extensions and executable code has the same problem. There was a case where Disney was hacked because an employee installed a BeamNG mod that had bundled malware.
A company that wants to remain secure would have to employ strict restrictions on installing software. Only installing npm packages and plugins from an internal preapproved repo for example.
I'm more surprised hackers found enough of a window of uptime to do this.
For those not getting the joke, GitHub has had an increasingly difficult keeping itself up since Microsoft acquired them.
It's gotten a lot worse (and made news) more recently, as the downtime as increased.
No, it's had an increasingly difficult time keeping itself ever since they fixed their uptime metric collection, added Actions, and exploded in users.
Previous thread in sequence:
GitHub is investigating unauthorized access to their internal repositories - https://news.ycombinator.com/item?id=48201316 - May 2026 (321 comments)
I really hope this pushes Microsoft to add a explicit permission system to VS Code extensions, and improve security of dev containers.
I really hope this pushes users (here: devs and maintainers) to decrease their reliance on Microsoft and especially stop outsourcing security to them.
Migrate off vscode already.
The problem is not VS code itself. It's the fact extensions can access things outside of the editor. As far as I am aware, no editor sandboxes extensions.
I won't say "you can take my VS Code from cold dead hands" or anything, but it is a very good tool, and Microsoft hasn't yet fucked it up the way they have so many other things.
I guess I'd say "you take my VS Code ... willingly ... but only after M$ fucks it up and makes me not want it anymore (like they've done to everything else they acquired)".
Vs code is a weapon, designed to fracture. It being “good” is a weapon as well. https://ghuntley.com/fracture/
> Migrate off vscode already.
Zed is the closest thing I've found to meet my needs, and I do plan to try it. However it's dev container support looks to be lacking in some important ways so we'll see.
Emacs has been a viable option for going on a half century now. The GNU Emacs 31 branch[0] was cut recently and we're barreling towards a new release. It might be time to give it another look.
I'm not saying its package ecosystem isn't vulnerable to these kind of attacks, it is, but it is at least developed by folks who have very different goals and ambitions than Microsoft.
[0]: https://github.com/emacs-mirror/emacs/blob/master/etc/NEWS
> Migrate off vscode already.
It's not the IDE, though. Any extensible, customizable display editor can be coerced into behaving badly by installing external code. Even this one: https://www.gnu.org/software/emacs/emacs-paper.html
The root(-ish) cause here is the ease of publishing and installing extension code, and in particular the fact that there's no independent validation/verification step between the upstream author and armageddon. And upstream authors aren't set up with the needed precautions themselves, they're just hackers.
Basically if you phish Just One Account with write access to an extension you wan pwn everyone who's running it.
Not holding my breath. This issue has been open since 2018 https://github.com/microsoft/vscode/issues/52116
Yeah, the only thing that gives me hope is the optics of this happening to GitHub. Though it seems possible VS Code team could double down on the opinion that this isn't a permission/sandboxing problem, and is instead a scanning/threat detection problem.
I wonder if this was the compromised nx console extension that bit me yesterday. The timing seems identical. See https://github.com/nrwl/nx-console/security/advisories/GHSA-...
GitHub confirmed that it's indeed the nx console extension, in their blog post: https://github.blog/security/investigating-unauthorized-acce...
Note that VS Code is built on Electron and it is a pain to sandbox because Electron has (had?) SUID sandbox helper, and you cannot run SUID binaries in sandbox easily. Sandboxing on Linux is extremely difficult task.
It feels so bad to see the "You need go give Chrome SUID Root for the sandbox to work". Setting a Web Browser SUID Root was an old joke about clueless users. It was the worst security screwup someone could imagine.
Don't build your ide on electron then.
podman seems to handle rootless namespaces just fine, minor caveat for some perf overhead but it's not the end of the world.
And volumes. Volumes are not fun with podman. Ironically my team tried GitHub Codespaces and never looked back. Super cheap and uses DevContainers.
What's the difference between Podman and docker for volumes? Other than needing to add Z to get volumes to mount with SELinux
The forum listing for the stolen source code (per the screenshot in article) says 1 buyer or they leak for free. Is GitHub about to become open source?
That's one way to make things open source.
Maybe I'm missing something really obvious, but... 3,800 repos? I guess I find it kind of surprising they have that many!
As others have said it's just a fraction. I'm in a medium size tech-related company and we have 7500+ in one Github org. We have two orgs, so altogether easily 10K+. Of course most of it is stale, obsolete, sandbox, personal tools, etc. I wouldn't be surprised if Github would have 100K+ internal repos or even more.
no pruning of repos?
No OP but I used to work at a large company with a similar number of repos.
When I left about a year ago, we had just started (after being on Github for almost 8 years) an ongoing project of first archiving old/outdated repos in place, and then moving them to an "archived" sub-org, and waiting to see if anyone complained.
Previously no one wanted to outright delete or remove repos because of the risk that someone somewhere was relying on it, and also there was no actual downside to just leaving them there (no cost savings, no imminent danger other than clutter, etc), so resources were never allocated to do it. There was always something more important to work on.
In an org with a higher floor of engineering management, a proactive program for removing unused or outdated repos would absolutely be expected though I think.
This is a continual fight for me. At nearly every company I've had to compromise on using a graveyard repo for packages within a monorepo, even though git has the whole history already.
Gitlab is so nice for this. You can group repos together so it is harder to lose track of stale projects.
Breaks old stuff
Uber had 8000 repos at one point with 2000 engineers - https://highscalability.com/lessons-learned-from-scaling-ube...
Probably most of them are forks of some public repo with some patch applied and half of those are probably not even used internally anymore.
I worked for a food retail store once. I remember going in the first day wondering, how hard can it really be... From the outside, it looks like they have a simple website. The website to order things on was an amalgamation of 300+ repo's. GitHub lost less in this breach. It takes a lot of effort to keep things simple as you grow.
Can confirm as someone working in the same field, we have a ton of repos
Something cool that I've always liked about working at GitHub is how much of the company _runs on GitHub_ -- A lot of teams, even non-technical teams, have their own repos just to organize docs/SOP's/designs/etc like a traditional knowledge work company might use a Sharepoint
Personally I have over a hundred, especially from quick prototypes, studies or instances of templates so I can easily see how over 18 years and many hundreds of employees you end up with thousands.
3800 is low for an org like GitHub. Glad it’s highly likely not all their repos are compromised.
Given the attack vector, it's possible that the impacted repos were ones that see more activity.
I was part of an org with more than 15k repos
It sounds low to me, I worked at a Fortune high number a few years ago and they had more.
Am I missing the joke here... they have hundreds of millions of repos.
I think they mean that these are internal github-org repos.
The ones used for running the site itself.
Though, its so many that i think there are some customer ones in there too.
No, there's no joke, you might have just misread the article (the 3,800 number is the number of internal GitHub repos the employee had downloaded on their personal computer / had access to on their own GitHub account)
The breach is about internal repositories, not user repositories.
In my personal experience, give it a decade or two, and any corporation will accumulate hundreds (or even thousands) of abandoned internal repos containing discontinued services, POCs/prototypes that never went anywhere, etc – people forget to archive them, or aren't sure whether something is still in use or not so err on the safe side.
AI is making this even worse. With coding agents, anyone can throw together a quick internal prototype of any idea they have, even if it has no hope of ever making it to production.
Maybe though AI will make it better, assign agents to monitor, maintain and keep repos up to date or via A2A refer them to an agent to dispose of them in accordance with company requirements. I actually think AI will greatly help this type of problem.
Autoarchiving repos which nobody has used in X years doesn’t require any AI - you can just write a bot to do it. People don’t, because it isn’t a priority. AI can make writing such a bot a bit easier, but can’t help much with getting approval from the powers that be to run it.
really? I mean these are internal repos. Probably most of them are random one-off experiments or a place to park code. Google has 2,900 "public" repos on github. Microsoft has ~8k "public" on github too. Can't even imagine how many they have on their internal systems.
They have 800 engineers. So 3,800 repos is high, but not crazy.
Some of those could be forks.
i'd love to be able to use fine grained tokens with gh and not expose every repo and org that i am connected to on github, but you can't see the results of a github actions check that way (no 'Checks' permission available). hoping these breaches push things in the direction of access being less annoying to manage.
The problem is that the main target for these repos are the internal IaaS type repos that contain much of the juicy information.
A fine grained token is likely to have read access to the IaaS repo as that is likely the very repo they are operating on when the malware compromises them.
3800 repos up for blackmail may make a good headline but it's likely that Github don't really care about 3798 of those repos being made public. It'd be annoying for those 3798 to be made public but they can deal with that. It's the 2 repos that contain really important stuff that they really don't want to be made public. You can't rely on fine grained tokens to limit the leak of these things as, at some point, someone with that very access will get compromised.
Limiting TTL on tokens/auth isn't a perfect solution either. If the token is leaked via some malware it can be used to clone repos within minutes (even seconds) of being leaked. No-one wants to have to perform 2FA every few seconds in order to get on with their day.
IP based restrictions may help, but then the malware would probably evolve to include a tailscale/wireguard key so that the clone/exfiltration is done from the existing IP address and then the data is proxied away separately.
Future dev environments are going to be heavily sandboxed in terms of "do github stuff in this sandbox, copy files to another sandbox to do package updates, vet everything coming back, etc"
i was more thinking like, if i am working on project ABC for org XYZ it's understandable that if my dev vm gets owned that ABC is leaked. it's not that acceptable if all of org XYZ's repos that i have access to get leaked. and especially not acceptable if everything i have access to, including other orgs, and the admin ability to do destructive operations on them, gets exposed. but status quo is that that's absolutely the case, and you basically need org specific github accounts to reduce the risk of that. or use the knee-capped fine grained PATs that github offers but don't work for common things like seeing if your PR is green.
agree generally with what your getting at though: doesn't solve this problem. but even just a basic reduction in blast radius would be nice.
I was noodling around with personal access tokens today on GitHub and found out that you actually can restrict tokens to specific repositories, orgs, etc. Not sure if actions is a scope that is available or not.
But, it did not go down! Progress!
It's working, we just don't know what it's doing.
GitHub confirmed skynet
No time to deploy, nobody must move a muscle during the forensic investigation!
Don't jinx it!
I'd have thought that by now, most would have been swapping to WebAssembly. It's really nicely sandboxed, you expose it to only what you want, and you can compile a lot of languages into a WASM form meaning you're not stuck with only Javascript or similar. Am I naive for thinking that?
The (lack of) security of VSCode has always been astounding. People have asked for sandboxing extensions for years [0] with little to no progress, and issues have been discussed a lot (e.g. [1][2]). I guess it hasn't been a big issue, likely because most developers are not complete idiots. But it only takes one developer and one bad extension to consequences like this.
I mean, I understand that it is hard to sandbox Node.js applications, but apparently Microsoft has put way more effort into their Copilot slop than security.
[0] https://github.com/microsoft/vscode/issues/52116
[1] https://news.ycombinator.com/item?id=42979994
[2] https://news.ycombinator.com/item?id=46855527
I am so, so stressed about Sublime Text... It feels like a massive disaster just waiting to happen. They don't even run their own package marketplace :(
> but apparently Microsoft has put way more effort into their Copilot slop than security.
Your security or their money (selling Copilot to enterprise customers): what would they choose, hmm? Surprise!
Why would you sandbox extension?
Just don’t install crap maybe.
Even if you don't install crap, the latest strategy is attacking the developer of one of the extensions or their build process so you can push a malware update to an otherwise legitimate extension.
Any good, benign extension can be taken over and weaponized with malware.
This mans security onion has one layer.
At what point did/does it start feeling naive to trust the integrity and output of Github Actions on general? Does it feel unlikely that an attacker would be able to get a foothold in that infrastructure?
What's inside the canada.tar.gz one?
If you own a GitHub organization and are looking for what changes/controls you can apply to reduce the risk/impact of PAT token exfiltration (and subsequent abuse) like what occurred here, I listed a few at the end of https://blog.bored.engineer/github-canarytokens-5c9e36ad7ecf...
- Enable audit log streaming[1] on your enterprise including source IPs and API requests, even if it’s just going to an S3 bucket nobody looks at it, your incident response team will thank you later.
- Enforce the use of SSO on your GitHub organization[2], not just because SSO is good but because it forces an explicit authorization action[3] by users to grant an SSH key/PAT access to your organization resources, instead of granting access implicitly. That way the PAT created for someone’s weekend project won’t have access to your organization resources.
- Enforce an IP allowlist[4] for your organization from a set of known trusted VPN/corporate IPs. This is by-far the strongest control (and the most painful to rollout) as it will prevent stolen credentials (even if still valid) from being used by an attacker except on the intended systems where you (hopefully) have other visibility/alerting via EDR or related tooling.
- If you can, restrict access from personal access tokens[5] to your organization resources. Blocking classic PATs and enforcing a maximum expiration (ex: 3 months) on fine-grained PATs is a great way to reduce risk if you can’t eliminate PATs altogether[6].
- If you use GitHub enterprise (on-prem), configure collection of the raw HTTP access logs[7] in addition to native GitHub audit logs, it may prove critical during incident response.
[1]: https://docs.github.com/en/enterprise-cloud@latest/admin/mon... [2]: https://docs.github.com/en/enterprise-cloud@latest/authentic... [3]: https://docs.github.com/en/enterprise-cloud@latest/authentic... [4]: https://docs.github.com/en/enterprise-cloud@latest/organizat... [5]: https://docs.github.com/en/enterprise-cloud@latest/organizat... [6]: https://edu.chainguard.dev/open-source/octo-sts/overview/ [7]: https://docs.github.com/en/enterprise-server@3.16/admin/moni...
Has there been any confirmation of this from a source other than X? It's weird that that's the only source, and therefore makes me distrust the entire story.
someone above mentioned it's been confirmed as nx console here: https://github.blog/security/investigating-unauthorized-acce...
So which extension? Why don't they tell us?
A few days ago I saw I had an update to the Twig extension. The UI flagged it as having new executable code in the update bundle, so I didn't install the update, disabled the extension as I wasn't working on Drupal views that day, and went about my work. I didn't have time to investigate the new update's contents. When I went back to the extension page, it was taken down: https://open-vsx.org/extension/whatwedo/twig
I'm not saying it was whatwedo.twig, but I'm not saying it wasn't, either.
Edit: If anyone's got a good recommendation for a twig formatter for Cursor / VS Code, please let me know.
I’ve used djlint on a liquid project and it worked well. It supports twig too: https://djlint.com/docs/languages/twig/
They also have an online demo/playground so you can at least give it a shot to see if it works.
I’ve used the twiggy LSP before and there seems to be a few VS code extensions for it: https://marketplace.visualstudio.com/items?itemName=moetelo.... and https://marketplace.visualstudio.com/items?itemName=Stanisla...
I'm not seeing anything on the official marketplace: https://marketplace.visualstudio.com/items?itemName=whatwedo...
I wonder if it was open-vsx specific?
That’s very possible. I switch between Cursor and VS Code, don’t remember which it was that day.
There are rumours that was NX Console VS code extension
https://github.com/nrwl/nx-console/security/advisories/GHSA-...
https://www.stepsecurity.io/blog/nx-console-vs-code-extensio...
Sounds like another "why even bother" extension, made to automate things that shouldn't be automated
Curious timing that I've started moving private repos to SourceHut a couple of weeks ago. It's pretty good and fast!
I'm also mirroring public ones to Codeberg.
I'll write about it when I'm done.
so how did they exfiltrate the information without noticing? what OS was the developer using? what security measures were they using?
yesterday discussion https://news.ycombinator.com/item?id=48191680
The security measure that the developer didn't use was completely refusing to use vscode.
vscode has no security model. It's not like swiss cheese where there are holes and some of the go all the way through. vscode is all hole with some cheese on the side. There is absolutely no isolation between the front-end process, the backend size (the thing that runs in the remote or the devcontainer), and any extensions or anything that might be in a repository whose authors you "trust".
Or you can just refuse to use random extensions. I built my own extensions if I needed them. You're a programmer, right? The whole point of extensibility is that you, or your company, can program what you need from your IDE, without having to make a whole IDE from scratch. I have since moved on to making my own IDE, mostly because I hate Electron and its >1gb memory footprint, but vscode served me so much better than anything else for years, without installing a single rando's extension.
Kind of.
A vscode workspace can trivially execute code on the machine that runs the server end of vscode. (This is how building works -- there is no sandbox unless the workspace config explicitly uses some kind of sandbox.) So the workspace can usually trivially elevate permissions to take over the vscode server, including installing extensions on it without asking you.
In principle, there is a teeny tiny bit of isolation between the local and remote sides, so the remote side cannot trivially execute code on the local machine. But I recommend reading this rather long-standing ticket:
https://github.com/microsoft/vscode-remote-release/issues/66...
It would be nice if there was an easy way to prevent people from installing vscode remotes on a shared server... Probably can run an ebpf routine to disallow creation of folders named . vscode*
The shared part has almost nothing to do with this, IMO.
> You're a programmer, right?
This is my position as well, but it's rarely received well. Usually, a response like "why would I rewrite something that's already been written and available?" By writing the code, I know how it works. I know it is not infected with crap. I know it will not in the future be infected with crap from a down stream dependency. It seems to me this really took off with node to the point that it's laughable at what people will include with no thought at all. I know component libraries have existed for many other languages before, but node just stands out to me
Most bosses look poorly upon spending their budget on rewriting software that already exists and simultaneously most bosses(although not the exact same set) don’t care about security until a disaster has already occurred.
And it’s also not like you’re going to literally write every piece of software you use, unless you’ve started all the way down at machine code you’re drawing the line somewhere on using code written by other people.
> I built my own extensions if I needed them. You're a programmer, right? The whole point of extensibility is that you, or your company, can program what you need from your IDE,
Dude, get real. We don't all have the luxury of being able to engage in endless IDE extension programming side quests just to do our day jobs. And even if we did, there's the reality that whatever you produce is probably not nearly as feature complete or bug free as the extension someone spent years writing. Hence why people want to reach for off the shelf solutions.
> just to do our day jobs.
Ah, there it is. The root of most problems in the software industry: people who hate programming and avoid doing it as much as possible, because they only got into it for the money.
I have no problem writing extensions in my spare time because programming is fun. Because I know how to program, like, actually program and not just copypaste stuff off StackOverflow, it doesn't take years to write a vscode extension, either.
> people who hate programming and avoid doing it as much as possible, because they only got into it for the money.
Yeah, not the case at all. I love programming, I've been doing it since I was a kid, for over 30 years. But I DO have to earn a living, and I'd rather spend free time programming things that interest me. Writing IDE extensions and tooling all the way down to the bare metal because I can't be absolutely sure at all times that node.js code doesn't contain a virus is not one of those things.
bait
(We merged this thread hither - it was originally in https://news.ycombinator.com/item?id=48201316)
The 3800 repos weren't exfiltrated from the compromised machine.
The malware (be it a VSCode plugin, an npm package, or whatever is next) simply slurps up all of the users private keys/tokens/env-vars it can find and sends this off somewhere covertly.
It's trivial to do this in a way to avoid detection. The small payload can be encrypted (so it can't be pattern matched) and then the destination can be one of millions of already compromised websites found via a google search and made to look like a small upload (it could even be chunked and uploaded via query parameters in a HTTP GET request).
The hackers receive the bundle of compromised tokens/keys and go look at what they give access to. Most of the time it's going to be someone's boring home network and a couple of public or private github repos. But every once in a while it's a developer who works at a big organisation (e.g. Github) with access to lots of private repos.
The hackers can then use the keys to clone all of the internal/private repos for that organisation that the compromised keys have access to. Some organisations may have alerts setup for this, but by the time they fire or are actioned upon the data will probably be downloaded. There's no re-auth or 2FA required for "git clone" in most organisations.
With this data the hackers have further options:
a) attempt to extort the company to pay a ransom on the promise of deleting the data
b) look for more access/keys/etc buried somewhere in the downloaded repos and see what else they can find with those
c) publish it for shits and giggles
d) try and make changes to further propagate the malware via similar or new attack vectors
e) analyse what has been downloaded to work out future attack vectors on the product itself
Right now Github (and others recently compromised in similar ways) will be thinking about what information is in those internal repos and what damage would it cause if that information became public, or what that information could be used to find out further down the line.
"Customer data should not be in a github repo" is all well and good, but if the customer data is actually stored in a database somewhere in AWS and there's even just one read-only access token stored somewhere in one private github repo, then there's a chance that the hackers will find that and exfiltrate the customer data that way.
Preventing the breach is hard. There will always be someone in an org who downloads and installs something on their dev machine that they shouldn't, or uses their dev machine for personal browsing, or playing games, or the company dev infra relies on something that is a known attack vector (like npm).
Preventing the exfiltration is virtually impossible. If you have a machine with access to the Internet and allow people to use a browser to google things then small payloads of data can be exfiltrated trivially. (I used to work somewhere where the dev network was air-gapped. The only way to get things onto it was typing it in, floppy or QIC-150 tape - in the days before USB memory sticks.)
Detecting the breach is nigh on impossible if the keys are not used egregiously. Sure some companies can limit access to things like Github to specific IPs, but it wouldn't take much for the malware to do something to work around this. (I can see things like a wireguard/tailscale client being embedded in malware to allow the compromised machine to be used as a proxy in such cases.)
Alerting that requires manual response is nigh on useless as by the time someone has been paged about something the horse has already bolted.
Knowing what has been taken is also a huge burden. 3800 repos that people now have to think about and decide what the implications are. Having been through something like this in the past there are plenty of times people go "I know that repo, it's fine, we can ignore that one" only for it to contain something they don't realise could be important.
These kind of attacks are going to become increasingly common as they're proven to work well and the mitigations for them are HARD. It doesn't need to be targeted at all either, you just infect a bunch of different things and see what gets sent in.
If companies continue to not pay the ransom then we're going to get a lot more things published and many companies having to apologise for all manner of things that end up being leaked.
> It's trivial to do this in a way to avoid detection
I'd love to see a real example/PoC.
Anyway, we discussed this issue in the other thread. For me, unrestricted outbound requests to any url, whether it's well known domains like api.github.com or any other domain, are a red flag.
Why does VS need to establish outbound requests to any domain, without authorization?
There's no magic solution, and these attacks will evolve, but I still think that restricting outbound requests is a good measure to mitigate these attacks.
> slurps up all of the users private keys/tokens/env-vars it can find and sends this off somewhere covertly.
Isolating applications can also mitigate the impact of these attacks. For example, you can restrict VS code to only share with the host .vscode/, .git/ and other directories. Even by project. Again, it's not bulletproof, but helps.
> Why does VS need to establish outbound requests to any domain, without authorization?
I don't know but it's very standard practice in most applications, because telemetry. But VS code is one of the worst: just check open snitch when running VS code, it's constantly phoning to a bunch of IPs.
> but I still think that restricting outbound requests is a good measure
It is 100% necessary, but doesn't stop most attacks quick enough.
If you're posting to github.com/acmecompany then attackers love to do things like add their own user github.com/acemcompany and just upload your data to that. Generally it doesn't last very long, but with CI/CD they can get thousands of keys in a minute and be gone seconds later.
There are plenty of exfiltration examples out there that could go through known, commonly-greenlit domains. Even exfil via DNS requests has been demonstrated.
But at least in that case, there’s a chance that the outbound requests are blocked. Malware isn’t perfect. Simple measures can block a significant proportion of attacks.
Ah yes, sandboxing/limiting a VSCode plugin is not impossible. I was thinking in more general terms (such as post install scripts within npm/python packages). Random test code in golang packages. There's an awful lot that people don't vet because keeping up with the vetting is a huge burden which seems pointless until you're the one that gets hacked.
The trick is to infect a plugin that has a legitimate reason for accessing the internet or running certain commands, and then coming up with ways to abuse that to exfiltrate the data. Or exfiltrating via DNS queries, or some other vector that isn't so obvious as "allow TCP/UDP connections to the whole world".
That or just repeatedly pester a user for permissions until one user (and you only need one within the organisation) relents and grants it.
the pop-ups fatigue is already an issue, and not an easy one to solve. Pretty much like SIEM/SOC alerts.
> The trick is to infect a plugin that has a legitimate reason for accessing the internet or running certain commands, and then coming up with ways to abuse that to exfiltrate the data. Or exfiltrating via DNS queries, or some other vector that isn't so obvious as "allow TCP/UDP connections to the whole world".
They'll get there, maybe. But the reality is that right now, everyone allows outbound requests blindly.
Instead of speculating, I suggest to actually investigate current IOCs and common tactics of malicious npm/pip/plugins/VS extensions. Something like this:
https://github.com/evilsocket/opensnitch/discussions/1119
Or use OpenSnitch (or Lulu, Glasswire, ZoneAlarm anyone?:D etc) to actually analyze real VS malicious extensions or npm packages and see if it stops the exfiltration, and if not, suggest ways to improve it. For example:
https://markdownpastebin.com/?id=9c294c75f09349d2977a4ccd250...
> If companies continue to not pay the ransom then we're going to get a lot more things published
Paying the ransom means your data still gets leaked and now you're out of money and embarrassed.
Why would they ever, ever, delete the data?
If paying the ransom doesn't stop your data getting leaked, nobody will pay the ransom. There is a rational basis for the ransomers to follow through with the deletion. Even the mob did provide "protection" when they coerced you into paying for it.
This sounds like a naive presumption. Are ransomware distributors well known for operating within strict hierarchies bound by culturally-ingrained traditions, or acting in the best interests of their own “greater good”?
Last I heard, teenagers can deploy ransomware with minimal technical knowledge or skill.
Because if they leak then nobody will pay the ransom in the future?
(responded to a similar response above)
> The malware (be it a VSCode plugin, an npm package, or whatever is next)
Not the first time we've seen a developer get popped thanks to a malicious game mod either...
question is why are people still using vscode or coding by hand?
A good day not to be using Electronjs trash.
Electron has nothing to do with the exploit here. A Vim plugin would have just as much ability to run malware.
friendly reminder:
- disable auto-updates for extensions in VS Code/Cursor
- use static analysis for GitHub Actions to catch security issues in pre-commit hook and on ci: https://github.com/zizmorcore/zizmor
- set locally: pnpm config set minimum-release-age 4320 # 3 days in minutes https://pnpm.io/supply-chain-security
- for other package managers check: https://gist.github.com/mcollina/b294a6c39ee700d24073c0e5a4e...
- add Socket Free Firewall when installing npm packages on CI to catch malware https://docs.socket.dev/docs/socket-firewall-free#github-act...
friendly reminder: use vim :)
It honestly surprises me we don't hear news about vim/neovim plugin supply chain attacks.
probably a much smaller dependency graph (lesser usage of transitive dependencies)
=)
And now, the hackers will be able to scan the repos for more vulnerabilities. Vulnerabilities all the way down.
Isn't 50k a bargain for what could potentially be in those files?
Maybe they looked it up and there wasn't anything interesting but then why take the risk for this kind of money?
Something doesn't make sense.
The data has been stolen by a criminal group. Paying for "restoring" the data does not guarantee they will delete all copies. There is no way of proving they actually did and they have in fact very little incentive to actually delete it.
You have to take their words for it but how can you trust crooks?
> You have to take their words for it but how can you trust crooks?
Because these are repeat actors. If they take a ransom and then re-sell it, no company will pay them ever again.
Don't think of experienced criminal enterprises as "groups of irrational scoundrels." They are companies, with employees, who understand game theory.
At some point, these people will come up with a ransom-as-a-service that you can subscribe to make monthly payments. It's no different than having to pay criminals monthly for security to prevent them from harming your themselves.
> this is not a ransom … Send your offers … we are not interested in under 50k…
It is a blind auction with a $50k minimum bid.
Sure but I meant I do find the minimum bid very low for such a high profile hack.
Is it premature to blame AI Microslop?
It's definitely AI slop. So tired of pushing AI-generated crap to production at my company
Another day another issue with Microsoft products, what else can be said :( At least they are being upfront these days.
Who uses GitHub in 2026