Overall it feels like UV is the best thing to happen to Python packaging in two decades, by circumventing the endless non-productive discussions on PEPs and instead just building something that works and is fast. In Rust, naturally.
UV is great but also builds on existing PEPs. While they have the ability to experiment (which is great), they also benefit from those "endless non-productive discussions on PEPs", as you called it.
I think UV proves that dedicated funding can make a huge impact on a project and benefit the community. They are doing a damn good job.
They mostly took inspiration from other languages for UV. Cargo (Rust) was a huge inspiration, but they got stuff from Ruby as well, I believe. There was an episode of "The Changelog" about it. I don't remember them saying anything about PEPs, although that might just be me not having listened to the entire thing. However, Charlie Marsh was extremely insistent on the advantages of being a polyglot and hiring people with diverse programming experiences. So I think it's quite safe to assume that played a bigger role than just PEPs.
I can't speak for the UV team. My 2C on how I would treat the PEPs: If there is an accepted one, and implementing it doesn't go too strongly against your competing design goals, do it for compatibility. This does not imply that the PEP is driving your design, or required to make your software. It is a way to improve compatibility.
> dedicated funding can make a huge impact on a project
Where does Astral's funding come from, anyway?
Rye was already pretty good before it was donated to Astral and renamed to uv, though...
I wrote an earlier one (Rust, inspired by Cargo; managed deps, scripts, and Python versions) called PyFlow that I abandoned, because nobody used it. "Why should I use this when we have pip, pipenv, and poetry?"
Programming is 80% marketing, eh?
Yet more proof that "the best way to do anything in python is to not do it in python."
It's true that where Python offers critical performance it's typically by providing a nice interface to existing compiled code. But people who work through those interfaces are still fundamentally "doing it in Python"; the most important "it" is that which makes a useful system on top of the number-crunching.
But putting that aside, a big part of uv's performance is due to things that are not the implementation language. Most of the actually necessary parts of the installation process are I/O bound, and work through C system calls even in pip. The crunchy bits are package resolution (only needed in rare cases, since lock files cache the result entirely) and pre-compiling Python to .pyc bytecode files (which is embarrassingly parallel if you don't need byte-for-byte reproducibility, and normally optional unless you're installing something with admin rights to be used by unprivileged users).
Uv simply has better internal design. You know that original performance chart "installing the Trio dependencies with a warm cache"?
It turns out that uv at the time defaulted to not pre-compiling while pip defaulted to doing it; an apples-to-apples comparison is not as drastic, but uv also does the compilation in parallel which pip hasn't been doing. I understand this functionality is coming to pip soon, but it just hasn't been a priority even though it's really not that hard (there's even higher-level support for it via the standard library `compileall` module!).
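For reference, the standard-library route mentioned above can be as small as the sketch below; the site-packages path is just an illustrative assumption, and `workers=0` asks `compileall` for one worker process per CPU core.

```python
# Parallel byte-compilation using the standard library's compileall module.
# The path below is a hypothetical venv location; adjust for your environment.
import compileall

compileall.compile_dir(
    ".venv/lib/python3.12/site-packages",  # hypothetical install target
    workers=0,   # 0 = use one worker process per CPU core
    quiet=1,     # print only errors
)
```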
More strikingly, though, uv's cache actually caches the unpacked files from a wheel. It doesn't have to unzip anything; it just hard-links the files. Pip's cache, on the other hand, is really an HTTPS cache; it basically simulates an Internet connection locally, "downloading" a wheel by copying (the cached artifact has a few bytes of metadata prepended) and unpacking it anew. And the files are organized and named according to a hash of the original URL, so you can't even trivially reach in there and directly grab a wheel. I guess this setup is a little better for code reuse given that it was originally designed without caching and with the assumption of always downloading from PyPI. But it's worse for, like, everything else.
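A minimal sketch of the hard-link idea, not uv's actual code (the cache and site-packages paths are made up): the point is that installation becomes creating directory entries rather than copying and unzipping anything.

```python
# Sketch: "install" an already-unpacked wheel by hard-linking its files from a
# cache directory into site-packages, copying only when hard links aren't possible.
import os
import shutil
from pathlib import Path

def link_tree(cache_dir: Path, target_dir: Path) -> None:
    for src in cache_dir.rglob("*"):
        dst = target_dir / src.relative_to(cache_dir)
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        try:
            os.link(src, dst)       # no data copied: just another name for the same inode
        except OSError:
            shutil.copy2(src, dst)  # e.g. cache on a different filesystem: fall back to a copy

# Hypothetical usage:
# link_tree(Path.home() / ".cache/mytool/unpacked/trio-0.24.0",
#           Path(".venv/lib/python3.12/site-packages"))
```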
Yet more proof that confirmation bias exists.
Find something better than Django.
Or matplotlib.
Or PyTorch.
PyTorch is a C++ project with a Python wrapper.
> Find something better than Django
Rails. QED.
Rails is a good project no doubt.
Django comes batteries included for basic apps, including an admin.
Can someone explain why UV is so praised when Poetry achieved a lot of the same several years earlier? Maybe I missed the train, but I've been using Poetry since its first version and all the benefits people praise UV for have long been in my builds.
Ruff was the gateway drug for me, much better than the black/isort/etc combo.
Led me to try uv, which fixed a couple of egregious bugs in pip. Add speed and it’s a no brainer.
I don’t think poetry has these advantages, and heard about bugs early on. Is that completely fair? Probably not. But it’s obvious astral tools have funding and a competent team.
Speed and simplicity. Now I can fetch one binary on a system and in seconds fetch everything needed to run a Python tool or work on a code base.
I can do all that without having to even worry about virtualenvs, or Python versions too.
Poetry was in some senses before its time; people were frustrated with pip, but didn't really understand the problems they were encountering, meanwhile the ecosystem had started trying to move away from doing everything with Setuptools for everyone. There's a ton that I'd love to explain here about the history of pyproject.toml etc. but the point is that Poetry had its own idea about what it would mean to be an all-in-one tool... and so did everyone else. Meanwhile, the main packaging people had been designing standards with the expectation of making a UNIX-style tool ecosystem work.
Everyone seems to like uv's answer better, but I'm still a believer in composable toolchains, since I've already been using those forever. I actually was an early Poetry adopter for a few reasons. In particular, I would have been fine sticking with Setuptools for building projects if it had supported PEPs 517/518/621 promptly. 621 came later, but Poetry's workaround was nicer than Setuptools' to me. And it was easier to use with the new pyproject.toml setup, and I really wanted to get away from the expectation of using setup.py even for pure-Python projects.
But that was really it. The main selling point of Poetry was (and is) that they offered a lockfile and dependency resolution, but these weren't really things I personally needed. So there was nothing really to outweigh the downsides:
* The way Poetry does the actual installation is, as far as I can tell, not much different from what pip does. And there are a ton of problems with that model.
* The early days of Poetry were very inconsistent in terms of installation and upgrade procedures. There was at least once that it seemed that the only thing that would work was a complete manual uninstall and reinstall, and I had to do research to figure out what I had to remove for the uninstallation as there was nothing provided to automate that.
* In the end, Poetry didn't have PEP 621 support for about four years (https://github.com/python-poetry/roadmap/issues/3 ; the OP was already almost a year after PEP acceptance in https://discuss.python.org/t/_/5472/109); there was this whole thing about how you were supposed to use pyproject.toml to describe the basic metadata of your project for packaging purposes, but if you used Poetry then you used Masonry to build, and that meant using a whole separate metadata configuration. Setuptools was slow in getting PEP 621 support off the ground (and really, PEP 621 itself was slow! It's hard to justify expecting anyone to edit pyproject.toml manually without PEP 621!), but Poetry was far slower still. I had already long given up on it at that point.
So for me, Poetry was basically there to provide Masonry, and Masonry was still sub-par. I was still creating venvs manually, using `twine` to upload to PyPI etc. because that's just how I think. Writing something like `poetry shell` (or `uv run`) makes about as much sense to me as `git run-unit-tests` would.
Unfortunately:
> Some of uv's functionality cannot be expressed in the pylock.toml format; as such, uv will continue to use the uv.lock format within the project interface.
> However, uv supports pylock.toml as an export target and in the uv pip CLI.
— https://docs.astral.sh/uv/concepts/projects/layout/#the-lock...
That is pretty unfortunate. Would have been cool if all of them could have used the same file.
Of course there must have been really good reasons for this decision. I hope no one will hold it against the maintainers of any of the projects. Especially because it looks like it is easy to move between the two lock files.
Which functionality of uv's cannot be expressed in the pylock.toml format?
More information here: https://github.com/astral-sh/uv/issues/12584
> The biggest limitation is that there's no support for arbitrary entrypoints to the graph, because pylock.toml includes a fixed marker for each package entry rather than recording a graph of dependencies.
I've exited countless Python/Django threads discussing future plans.
Year -1: The community has a problem.
Year 0: Proposal to fix the problem.
Year 1: A small but vocal subset of the Python/Django community pops up in every thread: "It's not actually a problem." or "It's not an issue that my project would ever encounter so limited resources shouldn't be expended on it."
Year 2: People are choosing other solutions because Python/Django isn't addressing the problem.
Year 3: We'll form a committee. The committee doesn't think it's a problem.
Year 4: The community has a problem. Fine. Why doesn't the community write a Python Enhancement Proposal/Django Enhancement Proposal (PEP/DEP)?
Years 5-10: PEP/DEP ignored.
Year 11: The community has a problem. PEP/DEP implemented and released.
Year 12-22: Major packages refuse to support the change.
Year 23: Last version not supporting the change deprecated.
Year 23+1 day: Fork of the last deprecated Python version not supporting the change created and released.
I have 15 years of code in Python still running but spend a little more than 50% of my time in other stacks. I don't notice as many people arguing against basic features, like a REST API package in Django, in non-Python/Django communities. The precursor to a Django REST API package, content negotiation, has been a draft DEP since early 2014 (https://github.com/django/deps/blob/main/draft/content-negot...). That's going on 12 years of stalled progress for a feature that needed to be released in 2010.
With Python/Django you learn not to wait because nothing is going to change.
And yes, Python/Django are open source. And yes again, I donate over $1,000/year to support F/OSS projects that I depend on.
My uninformed impression of the Python steering committee has always been that it's like the C/C++ one: ponderously bureaucratic, trying to find solutions that work for every competing interest; by the time they get to an agreement, the real world has already moved on and solved things in its own way, which makes fragmentation and intercompatibility worse.
I know that Guido isn’t around any more, but this is what a BDFL is useful for. To have the final say when, inevitably, the squabbling parties are unable to find a compromise.
No worries; it is taking even longer to be able to rely on C++ modules for portable code, to have Valhalla available on the JVM, for Android to support anything beyond Java 17, or to get a JIT in CPython. Some things take their time to finally become widespread, for various kinds of reasons.
Interesting to see the seemingly canonical meaning of lockfile (semaphore vs package version lock) change over the years. I at least was curious how one could specify a format for typically empty files.
So… what is a good example of a consensus driven culture on something popular with a lot of opinions, some legacy use cases, that can get these things done quickly?
This is a systems problem. Successful examples wanted.
The Postgres community is one example I can think of. Linux may be another, but I'm not intimately aware of its inner workings.
https://www.postgresql.org/community/ is a good start to get a feel of all things related to Postgres community.
Angular turned things around quickly. Corporate, sure, but if you know anything about how Angular is used within Google, that was a massive job.
You need a consensus-driven culture with a final decision maker for when that culture fails to reach a decision.
In the general case what you say is true. But look at the specific example of PEP 751 (the thread we're in). Normally you'd designate some person who has to gain consensus and add the feature to Python or to the standard library. Even if everyone isn't on board, they'll get on board when they upgrade Python.
PEP 751 isn't a Python feature; it's a feature that will be implemented by 3 projects - PDM, pip and uv. Consensus isn't optional or nice to have, it's necessary. If any of the 3 maintainers felt their needs weren't met, they wouldn't have implemented it.
Some projects wait too long for consensus because they prioritise not rushing into a suboptimal solution or hurting people's feelings. Sometimes it's ok to just go ahead and implement something. PEP 751 is not one of those cases.
> A lock file is meant to record all the dependencies your code needs to work along with how to install those dependencies.
It's about dependency locking in Python packaging.
The post didn't answer why it took over 4 years.
Why couldn't everyone be flown to the same place and have it all figured out in a week, instead of having the process drag on for years?
An endless multiplication of veto points in a consensus culture is a failure mode. Funny thing is people embedded in such a culture will see its slowness as a feature and sneer at the world as it leaves them behind.
Python was better with a BDFL.
The best thing that could happen to Python right now would be for someone to fork it. Maybe just have Astral run away with the whole language. This lock file format should have taken a weekend, not four damn years.
> Python was better with a BDFL.
The problem is that the BDFL also didn't want to think about packaging, certainly not the issues that inspired Conda.
I wouldn't mind seeing a PyPy-like fork done in Rust. Maybe take the opportunity to redesign the standard library, too.
A governance model where "time-based decision thresholds" steadily erode veto power as time passes could be used to harness this sort of BDFL-only power in a consensus culture.
This is why having a benevolent dictator sometimes results in better progress than committees. It's a double-edged sword, obviously, if the dictator has limited skill, but having someone like a Steve Jobs or Linus clears the way for progress when things like "consensus" cause decisions to take years or die from inertia. I've seen this first-hand at FAANGs, where bureaucracy kills great ideas because bureaucrats in key areas don't want to lift a finger to make changes.
The big counterexample is C++, which I feel is too productive and should slow down its decisions by a factor of 3.
Why did we have to call them “lock files?” There is an existing thing known as a lock file for actual file locking.
Call them literally anything else. Freeze file, version spec, dependency pin…
There really are only two hard problems in computer science, as the saying goes. Cache invalidation and epithet manufacturing (cough).
Python is following the precedent of many other "language ecosystems" here.
we should have called the other thing mutex files
Locks don't actually work in POSIX in real life anyway.
doesn't opening a file with O_CREAT|O_EXCL work in posix?
POSIX only guarantees advisory locks; mandatory locks are an optional feature and are not supported in Linux. See for example <https://stackoverflow.com/questions/77931997/linux-mandatory...>.
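To the question above: `O_CREAT|O_EXCL` does give you atomic creation of a sentinel lock file, but the locks POSIX guarantees beyond that are advisory, i.e. they only constrain processes that also ask for them. A small POSIX-only sketch of both (the paths are arbitrary):

```python
# POSIX-only sketch: an atomic sentinel lock file plus an advisory flock() lock.
import fcntl
import os

# 1) Atomic "create if absent": fails if another process created the file first.
try:
    fd = os.open("/tmp/myapp.lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
except FileExistsError:
    raise SystemExit("another instance appears to be running")

# 2) Advisory lock: only processes that also call flock() are excluded;
#    nothing stops an unaware process from writing to the file anyway.
with open("/tmp/myapp.data", "a") as f:
    fcntl.flock(f, fcntl.LOCK_EX)   # blocks until we hold the exclusive lock
    f.write("one writer at a time, among processes that opt in\n")
    fcntl.flock(f, fcntl.LOCK_UN)

os.close(fd)
os.unlink("/tmp/myapp.lock")        # release the sentinel
```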
I was hoping part of this delay was due to people arguing lock files are poor engineering to begin with, but alas, no mention of that. I guess we've just given up on any kind of package version flexibility.
That would be because package version flexibility is an entirely orthogonal concept to lock files, and to conflate them shows a lack of understanding.
pyproject.toml describes the supported dependency versions. Those dependencies are then resolved to some specific versions, and the output of that resolution is the lock file. This allows someone else to install the same dependencies in a reproducible way. It doesn't prevent someone resolving pyproject.toml to a different set of dependency versions.
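A small illustration of that split using the third-party `packaging` library (the specifier and version numbers are made up): the project declares a range, resolution picks one concrete version, and the lock file records that pin.

```python
# Range in pyproject.toml vs. the concrete pin a lock file records.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

declared = SpecifierSet(">=2.0,<3")                # what pyproject.toml declares
available = ["1.9.0", "2.0.0", "2.8.1", "3.0.0"]   # what the index happens to offer

candidates = [v for v in available if Version(v) in declared]
locked = max(candidates, key=Version)              # one resolver's choice today
print(locked)  # 2.8.1: the pin the lock file records so others can reproduce it
```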
If you are building a library, downstream users of your library won't use your lockfile. Lockfiles can still be useful for a library: one can use multiple lockfiles to try to validate its dependency specifications. For example you might generate a lockfile using minimum-supported-versions of all dependencies and then run your test suite against that, in addition to running the test suite against the default set of resolved dependencies.
> I guess we've just given up on any kind of package version flexibility.
Presumably because decades of experience have demonstrated that humans are extremely bad at maintaining compatibility between releases, and dealing with the fallout from badly specified package versions is probably second only to NULL in terms of engineering time wasted?
Or possibly it's just because a lot of the Python ecosystem doesn't even try and follow semver and you have no guarantee that any two versions are compatible with each other without checking the changelog and sacrificing a small chicken...
> Or possibly it's just because a lot of the Python ecosystem doesn't even try and follow semver
Even if they try, semver can only ever be a suggestion of the possibility of compatibility at best because people are not oracles and they misjudge the effects of changes all the time.
The PHP ecosystem is almost universally semver and has been going strong for years now, without any major outages from accidental breaking changes.
A little discipline and commitment to backwards compatibility and it isn’t too hard, really?
PHP compatibility and its commitment (across the ecosystem) to backwards compatibility is actually pretty cool. If there is one thing PHP does right, it’s this.
One opinion that gets me flamed all the things is this: I hate semver. Just use linear version numbers. Incompatibility? That's a new package with a new name.
In ecosystems with hundreds of dependencies, that requires you to review the changelog and source code of all of them all the time, especially if you want to avoid security issues fixed in more recent versions. I’d rather have a clear indication of something I need to take care of (a new major version), something new I might benefit from (a new minor version), or something that improved the library in some way (a new patch version). That alone slices required effort on my part down considerably.
That would be ideal, but in practice, SemVer is frequently broken with technicalities or outright disregard for standards. Look at K8s: still on v1, but there have been countless breaking changes. Their argument is that the core application hasn’t changed its API, it’s all of the parts that go into it to make it useful (Ingress, Service, etc.) that have had breaking changes. This is an absurd argument IMO.
Then there’s Python - which I dearly love - that uses something that looks exactly like SemVer, but isn’t. This is often confusing for newcomers, especially because most libraries use SemVer.
If a cure was effective on 99.9999% of all patients, would you say that it’s frequently failing? Millions of projects use SemVer just fine.
Singling out two behemoths designed by committee to demonstrate SemVer's shortcomings seems misleading.
> Look at K8s
Maybe I'm being too pedantic here, but semver for applications is always going to be broken and driven by marketing. SemVer, for my money, is only applicable realistically to libraries.
What do you mean, "have to take care of something"? You don't have to upgrade to a new major version. The problem with major versions is that they make it too easy to break other people and cause work for them.
Software is churn. Stick to outdated versions for too long and the rest of the world evolves without you, until other things start breaking. For example, a new dependency A you need may depend on another package B you already have, but require a newer version of B than the one you use. At that point, you have a huge undertaking ahead of you that blocks productivity and comes with a lot of risk of inadvertently breaking something.
Whether someone else or I am the problem doesn’t matter to my customers at the end of the day, if I’m unable to ship a feature I’m at fault.
Sometimes you do have to upgrade. We were using a package that was two years old and the Google APIs it called were renamed one day. I’m sure there was an announcement or something to give us warning, but for whatever reason, we didn’t get them. So that day, everything came crashing to a halt. We spent the afternoon upgrading and then we were done.
To say that you don’t have to upgrade is true, but it always comes at a price.
I have to upgrade if I want security fixes. Even if they patch old majors for a time, that’s not perpetual.
> That's a new package with a new name.
Well, yeah, it's reasonable that people flame you there. What is the difference between
- zlib-1 v 23
- zlib 1.2.3
except that automatic updates and correlation are harder for the first approach? It also will likely make typosquatting so much more fun and require namespacing at the very least (to avoid someone publishing, e.g., zlib-3 when official projects only cover zlib-{1,2}).
I can already typosquat a "zlib2", so what's the difference?
Sure, but when bumping an existing zlib from 1 -> 2, you would increase the version number (in a package manager) instead of removing & adding separate dependencies.
At least that'll allow you to install both in parallel. Which is an absolutely essential requirement IMHO, and there not being a solution for this for semver'd Python packages is a root cause of all this I'd say.
That's why other spaces have machine tools for this. There seems to be an overall drift in Python to use more type annotations anyway; making a tool that compares 2 versions of a package isn't rocket science (maybe it already exists?)
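As a rough sketch of what such a comparison could do (this is not an existing tool, and the module paths in the usage comment are hypothetical): load two versions of a module and diff the public names and call signatures. It would only catch accidental, surface-level breakage, not behavioural changes.

```python
# Sketch: compare the public API surface of two versions of a module.
import importlib.util
import inspect

def load_module(name, path):
    """Load a module from an explicit file path under a unique name."""
    spec = importlib.util.spec_from_file_location(name, path)
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod

def public_api(mod):
    """Map public callable names to their signatures (as strings)."""
    api = {}
    for name, obj in vars(mod).items():
        if name.startswith("_") or not callable(obj):
            continue
        try:
            api[name] = str(inspect.signature(obj))
        except (ValueError, TypeError):
            api[name] = "<signature unavailable>"
    return api

# Hypothetical usage with two unpacked versions of the same package:
# old = public_api(load_module("pkg_old", "old/mypkg/__init__.py"))
# new = public_api(load_module("pkg_new", "new/mypkg/__init__.py"))
# print("removed:", sorted(old.keys() - new.keys()))
# print("signature changed:", sorted(n for n in old.keys() & new.keys() if old[n] != new[n]))
```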
Literally everyone posting here is using a system built on compatible interfaces: stable DLLs on Windows, dylib framework versions on OSX, ELF SO versioning on Linux.
It's clearly not impossible, just a question of priorities and effort, and that makes it a policy decision to do it or not. And I lean towards thinking we've been shifting that policy too far in the wrong direction.
The only reason DLLs are stable on Windows is that every application now ships all the DLLs they need to avoid the DLL Hell caused by this exact thing not working.
I look forward to you demonstrating your tool that can check if two python packages are compatible with each other, maybe you can solve the halting problem when you're done with that?
I'm not a Windows developer, you'll have to excuse my ignorance on that. Pretty sure you're not shipping user32.dll with your applications though.
Also, I didn't claim any tool would give a perfect answer on compatibility. They don't for ELF libraries either, they just catch most problems, especially accidental ones. The goal is 99.9%, not 100%. Just being unable to solve a problem perfectly doesn't mean you should give up without trying.
Windows kept system libraries stable and modern software does a lot to avoid them with abstractions on top because they are unpleasant to use.
You could call that success but I think it’s just an extra layer of cruft.
> I'm not a Windows developer, you'll have to excuse my ignorance on that. Pretty sure you're not shipping user32.dll with your applications though.
Microsoft's famously terrifyingly large amount of engineering work that goes into maintaining backwards compatibility to allow you to run Windows 3.1 software on Windows 11 is certainly impressive, but maybe also is the exception that proves the rule.
> Just being unable to solve a problem perfectly doesn't mean you should give up without trying.
Currently no one can solve that problem at all, let alone imperfectly. If you can, I'd gladly sponsor your project, since it would make my life a lot easier.
I think it'd be near impossible to guarantee API compatibility, regardless of type hinting. E.g., if a function returns a list, in a new version I can add/remove items from that list such that it's a breaking change to users, without any API compatibility issues.
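For instance, a hypothetical library function whose name, signature, and type hints never change between versions, while its behaviour still breaks a caller:

```python
# Hypothetical v1 and v2 of the same function: identical name, signature, and
# return type, yet v2 breaks any caller that relied on the third element.
def get_default_headers_v1() -> list[str]:
    return ["Accept", "User-Agent", "Connection"]

def get_default_headers_v2() -> list[str]:
    return ["Accept", "User-Agent"]  # "Connection" dropped; still a valid list[str]

headers = get_default_headers_v2()
print(headers[2])  # IndexError at runtime; nothing in the API surface warned us
```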
Lock files are about what version the application needs installed, not what a library depends on. They don’t prevent package version flexibility.