[HN Gopher] PyPI.org is running a survey on the state of Python ...
___________________________________________________________________
PyPI.org is running a survey on the state of Python packaging
Author : zbentley
Score  : 163 points
Date   : 2022-09-07 15:05 UTC (7 hours ago)
(HTM) web link (pypi.org)
(TXT) w3m dump (pypi.org)
| sanshugoel wrote:
| Pick up poetry and fix it. I thought it would be fun to use
| poetry, but it trips over itself here and there.
| clintonb wrote:
| The survey is at https://www.surveymonkey.co.uk/r/M5XKQCT.
| MonkeyMalarky wrote:
| Seems more oriented to (potential) contributors than end users
| of the packaging system. Who cares about mission statements and
| inclusivity? Secure funding and pay developers to make the
| tools.
| woodruffw wrote:
| > Who cares about mission statements and inclusivity? Secure
| funding and pay developers to make the tools.
|
| These are connected things.
|
| I maintain a PyPA member project (and contribute to many
| others), and the latter is aided by the former: the mission
| statement keeps the community organized around shared goals
| (such as standardizing Python's packaging tooling), and
| inclusivity ensures a healthy and steady flow of new
| contributors (and potential corporate funding sources).
| nomdep wrote:
| The PSF are not engineers looking for a better developer
| experience, but politicians looking for power. That's why the
| pipenv fiasco happened a few years ago.
| zo1 wrote:
| What was the pipenv fiasco?
| crazytalk wrote:
| This survey is the literal definition of a leading question. I
| found about two boxes I could tick before being forced to order
| a list of the designer's preferences according to how much I
| agree with them. The only data that can be generated from a
| survey like this is the data you wanted to find (see also the
| Boston Consulting Group article earlier today). I cannot
| honestly respond to it.
|
| The only question I have is: what grant application(s) is the
| survey data being used to support?
| KingEllis wrote:
| The absence of the go binary as a tool (i.e. "go get ...",
| "go install ...", etc.) is odd, considering that is what has
| been eating Python's lunch lately.
| erwincoumans wrote:
| I am pretty happy with PyPI/pip; it is an easy way to distribute
| Python and C++ code wrapped in a Python extension to others. For
| a C++ developer it is becoming harder to distribute native
| executables, since macOS and Windows require signing binaries.
| Python package version conflicts and backwards incompatibility
| can be an issue.
| mdmglr wrote:
| My wishlist:
|
| We need a way to configure an ordered list of indexes pip
| searches for packages. --extra-index-url or using a proxy index
| is not the solution.
|
| Also namespaces, not based on a domain. So, for example: pip
| install apache:parquet
|
| Also some logic, either in the pip client or the index server,
| to minimize typosquatting.
|
| Also, pip should adopt a lock file similar to npm/yarn, instead
| of requirements.txt.
|
| And also "pip list" should output a dependency tree like "npm
| list".
|
| I should not have to compile source when I install. Every
| package should have wheels available for the most common
| arch+OS combos.
|
| Also we need a way to download only what you need. Why does
| installing scipy or numpy install more dependencies than the
| conda version? For example pywin and scipy.
| ciupicri wrote:
| > apache:parquet
|
| How are you going to name the file storing the wheel for that
| package? Using ":" on Windows is going to be problematic.
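| For reference, wheel filenames today normalize distribution
| names with PEP 427-style escaping: any run of characters
| outside [A-Za-z0-9._] collapses to a single underscore. A
| namespaced name would presumably get the same treatment; a
| minimal sketch (the function name is invented, and note the
| collision a ":" would create with real underscores):
|
|       import re
|
|       def escape_filename_part(name: str) -> str:
|           # PEP 427 escaping: collapse runs of characters that
|           # are invalid in wheel filenames to a single "_"
|           return re.sub(r"[^\w\d.]+", "_", name, flags=re.UNICODE)
|
|       escape_filename_part("apache:parquet")  # -> 'apache_parquet'
|       escape_filename_part("apache_parquet")  # -> 'apache_parquet' too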
| tempest_ wrote:
| If you are using poetry you can add something to the
| pyproject.toml to handle the indexes, though I am not sure if
| they are ordered or not:
|
|       [[tool.poetry.source]]
|       name = "my-pypi"
|       url = "https://my-pypi-index.wherever"
|       secondary = true
| orf wrote:
| For improvements I commented: Remove setup.py files and mandate
| wheels. This is the root cause of a lot of the evil in the
| ecosystem.
|
| Next on the list would be PyPI namespaces, but there are good
| reasons why that is very hard.
|
| The mission statement they are proposing, "a packaging ecosystem
| for all", completely misses the mark. How about a "packaging
| ecosystem that works" first?
|
| I spent a bunch of time recently fixing our internal packaging
| repo (Nexus) because the switch from md5 hashes to sha256 hashes
| broke everything, and re-locking a bajillion lock files would
| take literally months of engineering time.
|
| I've been a Python user for the last 17 years, so I'm
| sympathetic to how we got to the current situation and aware
| that we've actually come quite far.
|
| But every time I use Cargo I am insanely jealous, impressed, and
| sad that we don't have something like it. Poetry is closest, but
| it's a far cry.
| baggiponte wrote:
| I recommend PDM over poetry!
| blibble wrote:
| > The mission statement they are proposing, "a packaging
| ecosystem for all", completely misses the mark. How about a
| "packaging ecosystem that works" first?
|
| I think at the point a programming language is going on about
| "mission statements" for a packaging tool, you know they've
| lost the plot.
|
| Copy Maven from 2004 (possibly with less XML).
|
| That's it, problem solved.
| dalke wrote:
| > Remove setup.py files and mandate wheels
|
| What alternative is there for me?
|
| My package has a combination of hand-built C extensions and
| Cython extensions, as well as a code generation step during
| compilation. These are handled through a subclass of
| setuptools.command.build_ext.build_ext.
|
| Furthermore, I have compile-time options to enable/disable
| certain configuration options, like enabling/disabling support
| for OpenMP, via environment variables so they can be passed
| through from pip.
|
| OpenMP is a compile-time option because the default C compiler
| on macOS doesn't include OpenMP. You need to install it, using
| one of various approaches. Which is why I only have a source
| distribution for macOS, along with a description of the
| approaches.
|
| I have not found a non-setup.py way to handle my configuration,
| nor to provide macOS wheels.
|
| Even for the Linux wheels, I have to patch the manylinux Docker
| container to whitelist libomp (the OpenMP library), using
| something like this:
|
|       RUN perl -i -pe \
|           's/"libresolv.so.2"/"libresolv.so.2", "libgomp.so.1"/' \
|           /opt/_internal/pipx/venvs/auditwheel/lib/python3.9/site-packages/auditwheel/policy/manylinux-policy.json
|
| Oh, and if compiling where platform.machine() == "arm64" then I
| need to not add the AVX2 compiler flag.
|
| The non-setup.py packaging systems I've looked at are for
| Python-only code bases. Or, if I understand things correctly,
| I'm supposed to make a new specialized package which implements
| PEP 518, which I can then use to bootstrap my code.
|
| Except, that's still going to use effectively arbitrary code
| during the compilation step (to run the codegen) and still use
| setup.py to build the extension. So it's not like the evil
| disappears.
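| For readers unfamiliar with the pattern, the env-var-driven
| compile-time option described above looks roughly like this in
| practice. A minimal sketch, not the actual project's code; the
| USE_OPENMP name and the file names are invented for
| illustration:
|
|       import os
|       from setuptools import setup, Extension
|       from setuptools.command.build_ext import build_ext
|
|       class ConfigurableBuildExt(build_ext):
|           def build_extensions(self):
|               # pip passes the environment through, so an env var
|               # is the practical channel for a build-time flag
|               if os.environ.get("USE_OPENMP") == "1":
|                   for ext in self.extensions:
|                       ext.extra_compile_args.append("-fopenmp")
|                       ext.extra_link_args.append("-fopenmp")
|               super().build_extensions()
|
|       setup(
|           name="example",
|           ext_modules=[Extension("example._ext", ["src/ext.c"])],
|           cmdclass={"build_ext": ConfigurableBuildExt},
|       )
|
| Running USE_OPENMP=1 pip install . then toggles the flag; none
| of this is expressible in today's static metadata, which is the
| point.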
| [deleted]
| korijn wrote:
| I empathize with your situation and it's a great example. As
| crazy as this may sound, I think you would have to build
| every possible permutation of your library and make all of
| them available on PyPI. You'd need some new mechanism based
| on metadata to represent all the options and figure out how
| to resolve against available system libraries. Especially
| that last part seems very complicated. But I do think it's
| possible.
| orf wrote:
| To be clear, I'm not suggesting we remove the ability to
| compile native extensions.
|
| I'm suggesting we find a better way to build them, something
| a bit more structured, and decouple that specific use case
| from setup.py.
|
| It would be cool to be able to structure this in a way that
| means I can describe what system libraries I may need without
| having to execute setup.py and find out, and express
| compile-time flags or options in a structured way.
|
| Think of it like Cargo.toml vs build.rs.
| dalke wrote:
| I agree it would be cool and useful.
|
| But it appears to be such a hard problem that modern
| packaging tools ignore it, preferring to take on other
| challenges instead.
|
| My own attempts at extracting Python configuration
| information to generate a Makefile for personal use
| (because Makefiles understand dependencies better than
| setup.py) are a mess, caused by my failure to understand what
| all the configuration options do.
|
| Given that's the case, when do you think we'll be able to
| "Remove setup.py files and mandate wheels"?
|
| I'm curious which evils you're thinking of. I assume the
| need to run arbitrary Python code just to find metadata is
| one of them. But can't that be resolved with a
| pyproject.toml which uses setuptools only for the build
| backend? So you don't need to remove setup.py, only
| restrict when it's used, yes?
| infogulch wrote:
| The closest thing I've seen to a solution in this space
| is Riff, discussed yesterday [1], which solves the
| external dependency problem for Rust projects.
|
| [1]: https://news.ycombinator.com/item?id=32739954
| dec0dedab0de wrote:
| The ability to create a custom package that can run any
| custom code you want at install time is very powerful. I
| think a decent solution would be to have a way to mark a
| package as trusted, and only allow pre/post scripts if they
| are indeed trusted. Maybe even have specific permissions
| that can be granted, but that seems like a ton of work to
| get right across operating systems.
|
| My specific use cases are adding custom CA certs to certifi
| after it is installed, and modifying the maximum version of
| a requirement listed for an abandoned library that works
| fine with a newer version.
|
| I think the best solutions would be an official way to
| ignore dependencies for a specific package, and to specify
| replacement packages in a project's dependencies. Something
| like this, if it were a Pipfile:
|
|       public-package = {version = "~=1.0", replace_with = 'path/to/local-package'}
|       abandoned-package = {version = "~=*", ignore_dependencies = True}
|
| But the specific problem doesn't matter; what matters is
| that there will always be exceptions. This is Python, we're
| all adults here, and we should be able to easily modify
| things to get them to work the way we want them to. Any
| protections added should include a way to be dangerous.
|
| I know your point is more about requiring static metadata
| than using wheels per se.
| I just believe that all things Python should be flexible and
| hack-able. There are other more rigid languages if you're
| into that sort of thing.
|
| edit:
|
| Before anyone starts getting angry, I know there are other
| ways to solve the problems I mentioned.
|
| Forking/vendoring is a bit of overkill for such a small
| change, and doesn't solve for when a dependency of a
| dependency needs to be modified.
|
| Monkeypatching works fine; however, it would need to be done
| at all the launch points of the project, and even then if I
| open a REPL and import a specific module to try something
| it won't have my modifications.
|
| Modifying an installed package at runtime works reasonably
| well, but it can cause a performance hit at launch, and
| while it only needs to be run once, it still needs to be
| run once. So if the first thing you do after recreating a
| virtualenv is to try something with an existing module, we
| have the same problem as monkeypatching.
|
| 'Just use docker', or maybe the more toned-down version,
| 'create a real setup script for developers', are both valid
| solutions, and where I'll probably end up. It was just very
| useful to be able to modify things in a pinch.
| kevin_thibedeau wrote:
| Setup.py can do things wheels can't. Most notably it's the
| _only_ installation method that can invoke 2to3 at runtime
| without requiring a dev to create multiple packages.
| orf wrote:
| It's lucky Python 2 isn't supported anymore then, and
| everyone has had like a decade to run 2to3 once and publish a
| package for Python 3, so that use case becomes meaningless.
| mistrial9 wrote:
| Very unfortunately, the direct burden of Python 2 is placed
| on the packagers... users of Python 2 (like me) like their
| libs and have no horse in this demonization campaign.
| orf wrote:
| Pay for support for Python 2 then? At which point it's a
| burden on the person you are paying.
| coldtea wrote:
| You'd be surprised at how many billions of lines of production
| code are still at 2 (and could not care less whether it's
| end-of-lifed).
| orf wrote:
| I'm not surprised at all, but regardless, they also should
| not be similarly surprised if people could not care less
| about that use case.
| ziml77 wrote:
| I tend to just give up on a package if it requires a C
| toolchain to install. Even if I do end up getting things set up
| in a way that the library's build script is happy with, I'll be
| inflicting pain on anyone else who then tries to work with my
| code.
| [deleted]
| tux3 wrote:
| It feels so suboptimal to need the C toolchain to do things,
| but having no solid way to depend on it as a non-C library
| (especially annoying in Rust, which insists on building
| everything from source and never installing libraries
| globally).
|
| I make a tool/library that requires the C toolchain at
| runtime. That's even worse than build time: I need end users
| to have things like lld, objdump, ranlib, etc. installed
| anywhere they use it. My options are essentially:
|
| - Requiring users to just figure it out with their system
| package manager
|
| - Building the C toolchain from source at build time and
| statically linking it (so you get to spend an hour or two
| recompiling _all of LLVM_ each time you update or clear your
| package cache! Awesome!)
|
| - Building just LLD/objdump/... at build time (but users still
| need to install LLVM.
| So you get both slow installs AND have to deal with finding a
| compatible copy of libLLVM)
|
| - Pre-compiling all the C tools and putting them in a storage
| bucket somewhere, for all architectures and all OS versions.
| But then not having support for things like the M1 or new OS
| versions right away, or for people on uncommon OSes. And now I
| need to maintain a build machine for all of these myself.
|
| - Pre-compiling the whole C toolchain to WASM, building
| Wasmtime from source instead, and just eating the cost of
| Cranelift running LLVM 5-10x slower than natively...
|
| I keep trying to work around the C toolchain, but I still
| can't see any very good solution that doesn't make my users
| have extra problems one way or another.
|
| Hey RiiR evangelism people, anyone want to tackle all of
| LLVM? .. no? No one? :)
| cycomanic wrote:
| I know this is an unpopular opinion on here, but I believe all
| this packaging madness is forced on us by languages because
| Windows (and to a lesser degree macOS) has essentially no
| package management.
|
| In particular, installing a toolchain to compile C code for
| Python is no issue on Linux, but such a pain on Windows.
| humanrebar wrote:
| C tends to work in those cases because there aren't a
| significant number of interesting C dependencies to add...
| because there is no standard C build system, packaging
| format, or packaging tools.
|
| When juggling as many transitive dependencies in C as folks
| do with node, python, etc., there's plenty of pain to deal
| with.
| progval wrote:
| > For improvements I commented: Remove setup.py files and
| mandate wheels.
|
| This would make most C extensions impossible to install on
| anything other than x86_64-pc-linux-gnu (or
| arm-linux-gnueabihf/aarch64-linux-gnu if you are lucky)
| because developers don't want to bother building wheels for
| them.
| urschrei wrote:
| cibuildwheel (which is an official, supported tool) has made
| this enormously easier. I test and generate wheels with a
| compiled (Rust! Because of course) extension using a Cython
| bridge for all supported Python versions for 32-bit and
| 64-bit Windows, macOS x86_64 and arm64, and whatever
| manylinux is calling itself this week. No user compilation
| required. It took about half a day to set up, and is
| extremely well documented.
| mathstuf wrote:
| I think it'd make other things impossible too. One project I
| help maintain is C++, and is mainly so. It optionally has
| Python bindings. It also has something like 150 options to
| the build that affect things. There is zero chance of me ever
| attempting to make `setup.py` any kind of sensible "entry
| point" to the build. Instead, the build detects "oh, you want
| a wheel" and _generates_ `setup.py` to just grab what the C++
| build then drops into a place where `build_ext` or whatever
| expects them to be, using some fun globs. It also fills in
| "features" or whatever the post-name `[name]` stuff is
| called, so you can do some kind of post-build "ok, it has a
| feature I need" inspection.
| korijn wrote:
| ...and ensure _all_ package metadata required to perform
| dependency resolution can be retrieved through an API (in other
| words, without downloading wheels).
| orf wrote:
| Yeah, that's sort of what I meant by my suggestion.
| Requirements that can only be resolved by downloading and
| executing code are a huge burden on tooling.
| LukeShu wrote:
| If the package is available as a wheel, you don't need to
| execute code to see what the requirements are; you just
| need to parse the "METADATA" file. However, the only way to
| get the METADATA for a wheel (using PyPA standard APIs,
| anyway) is to download the whole wheel.
|
| For comparison, pacman (the Arch Linux package manager)
| packages have a fairly similar ".PKGINFO" file in them; but
| in order to support resolving dependencies without
| downloading the packages, the server's repository index
| includes not just a listing of the (name, version) tuple
| for each package, it also includes each package's full
| .PKGINFO.
|
| Enhancing the PyPA "Simple repository API" to allow
| fetching the METADATA independently of the wheel would be a
| relatively simple enhancement that would make a big
| difference.
|
| ----
|
| As I was writing this comment, I discovered that PyPA did
| this; they adopted PEP 658 in March of this year!
| https://github.com/pypa/packaging.python.org/commit/1ebb57b7...
| korijn wrote:
| Yeah. Well, mandating wheels and getting rid of setup.py at
| least avoids having to run scripts, and indeed enables the
| next step, which would be indexing all the metadata and
| exposing it through an API. I just thought it wouldn't
| necessarily be obvious to all readers of your comment.
| orf wrote:
| Just to be clear, package metadata already is sort of
| available through the PyPI JSON API. I've got the entire
| set of all package metadata here:
| https://github.com/orf/pypi-data
|
|       $ gzcat release_data/c/d/cdklabs.cdk-hyperledger-fabric-network.json.gz \
|           | jq '. | to_entries | .[].value.info.requires_dist' | head
|       [
|         "typeguard (~=2.13.3)",
|         "publication (>=0.0.3)",
|         "jsii (<2.0.0,>=1.63.2)",
|         "constructs (<11.0.0,>=10.0.5)",
|         "aws-cdk-lib (<3.0.0,>=2.33.0)"
|       ]
|
| It's just that not everything has it, and there isn't a way
| to differentiate between "missing" and "no dependencies".
| And it's also only for the `dist` releases. But anyway,
| poetry uses this information during dependency resolution.
| dalke wrote:
| What if I have a dependency on a commercial third-party
| Python package which is on Conda but not on PyPI?
| mistrial9 wrote:
| you are placing open code in a vendor lock-in, to start
| dalke wrote:
| Yes, I understand that.
|
| I see I misunderstood korijn's comment. My earlier reply
| is off-topic, so I won't continue further off the track.
| IceHegel wrote:
| If I see some JS, Go, or Rust code online I know I can probably
| get it running on my machine in less than 5 min. Most of the
| time, it's a 'git clone' and a 'yarn' / 'go install' / 'cargo
| run', and it just works.
|
| With Python, it feels like half the time I don't even have the
| right version of Python installed, or it's somehow not on the
| right path. And once I actually get to installing dependencies,
| there are often very opaque errors. (The last 2 years on M1
| were really rough.)
|
| Setting up PyTorch or TensorFlow + CUDA is a nightmare I've
| experienced many times.
|
| Having so many ways to manage packages is especially harmful
| for Python because many of those writing Python are not
| professional software engineers, but academics and researchers.
| If they write something that needs, for example, CUDA 10.2,
| Python 3.6, and a bunch of C audio drivers - good luck getting
| that code to work in less than a week.
| They aren't writing install scripts, or testing their code on
| different platforms, and the Python ecosystem makes the whole
| process worse by providing 15 ways of doing basically the same
| thing.
|
| My proposal:
|
| - Make poetry part of pip
|
| - Make local installation the default (must pass -g for global)
|
| - Provide official first-party tooling for starting a new
| package
|
| - Provide official first-party tooling for migrating old
| dependency setups to the new standard
|
| edit: fmt
| bno1 wrote:
| I wish pip had some package deduplication implemented. Even
| some basic local environments have >100MB of dependencies. ML
| environments go into the gigabytes range from what I remember.
| mcdermott wrote:
| Installing packages, creating a manifest of dependencies,
| managing virtual environments, packaging, checking/formatting
| code, etc. should be built into the Python toolchain (the
| python binary itself). Needing to choose a bunch of third-party
| tools to make it work... makes Python, well... un-pythonic.
| siproprio wrote:
| State-of-the-art Python packaging must include support for
| common use cases such as conda + machine learning.
|
| It's incredible how even Julia's Pkg.jl supports Python
| packaging in combination with conda better than the official
| Python packaging tools do.
|
| This is very clearly a question of the culture of the core
| Python developers (such as Brett Cannon), who seem to think the
| machine learning people with their compilers and JITs are not
| an important part of the community.
| bzxcvbn wrote:
| I just wish they'd change their name so that my students stop
| snickering. (The name is pronounced like the French word for
| "piss".)
| d0mine wrote:
| Isn't PyPI pronounced like: pie (the food) + pea + eye?
|
| That is different from the French "pipi": pea + pea.
| bzxcvbn wrote:
| Sure. Now tell that to a bunch of immature college juniors.
| If you read the letters "pypi" in French, it sounds exactly
| like "pipi".
|
| And wait until you learn what "bit" sounds like in French.
| di wrote:
| Yep: https://pypi.org/help/#pronunciation
| yewenjie wrote:
| Is there any hope that, even if there emerges a consensus
| package management solution going forward, old packages will be
| easily portable to it?
| black3r wrote:
| Is there a problem with the package format itself, though?
| There are lots of serious problems tied to distribution rather
| than the package format; fixing those would make the experience
| way easier, especially for beginners and people used to other
| package managers...
|
| Lacking binary wheels on PyPI, problems with shipping a project
| with its dependencies, confusion about there being multiple
| "package managers" (pip, pip-tools, poetry, pipenv, conda) and
| multiple formats of dependency lists (setup.py, setup.cfg,
| requirements.txt, pyproject.toml, ...), sys.path-associated
| confusion (global packages, user-level packages, and anything
| specified in PYTHONPATH, ...).
| ferdowsi wrote:
| Nothing will improve as long as the Python powers insist on
| packaging being an exercise for the community.
| at_a_remove wrote:
| I have a terrible admission to make: one of the reasons I like
| Python is its huge standard library, and I like that because I
| just ... despise looking for libraries, trying to install them,
| evaluating their fitness, and so on.
|
| I view dependencies outside of the standard library as a kind
| of technical debt, not because I suffer from Not Invented Here
| and want to code it myself; no, I look and think, "Why isn't
| this in the standard library with a working set of idioms
| around it?"
|
| I haven't developed anything with more than five digits of code
| to it, which is fine for me, but part of it is just ...
| avoidance of having to screw with libraries. I ran into a pip
| issue I won't go into (it requires a lot of justification to
| see how I got there) and just ... slumped over.
|
| This has been a bad spot in Python for a long, long time. While
| people are busy cramming their favorite feature from their last
| language into Python, this sort of thing has languished.
|
| Sadly, I have nothing to offer but encouragement. I don't know
| the complexities of packaging; it seems like a huge topic that
| perhaps nobody really dreamed Python would have to seriously
| deal with twenty years ago.
| itake wrote:
| More packages in the standard library means it can run on fewer
| machines and more junk needs to be installed.
|
| Minimal-standard-library languages let you pick and choose what
| needs to be run. Golang is a nice happy medium since it's
| compiled.
| samwillis wrote:
| > despise looking for libraries, trying to install them,
| evaluating their fitness, and so on.
|
| This is exactly why I prefer the larger opinionated web
| frameworks (Django, Vue.js) to the smaller, more composable
| frameworks (Flask, React). I don't want to make decisions every
| time I need a new feature, I want something that "just works".
|
| Python and Django just work, and brilliantly at that!
| [deleted]
| 5d8767c68926 wrote:
| Currently dealing with Flask, and the endless decision fatigue
| makes me sad. Enormous variations in quality of code,
| documentation, SO answers, etc. And that's without even
| considering the potential for supply-chain attacks.
|
| With Django there is a happy-path answer for most everything.
| If I run into a problem, I know I'm not the first.
| manuelabeledo wrote:
| This is one of the reasons why I don't quite like Node. It
| feels like _everything_ is a dependency.
|
| It seems ridiculous to me that there isn't a native method for
| something as simple and ubiquitous as putting a thread to
| sleep, or that there is an external library (underscore) that
| provides 100+ methods that seem to be staples in any modern
| language.
|
| Python is nice in that way. It is also opinionated in a
| cohesive and community-driven manner, e.g. PEP 8.
| mrweasel wrote:
| If requests and a basic web framework were in the standard
| library, you'd effectively eliminate the majority of my
| dependencies.
|
| Honestly, I don't see package management being an issue for
| most end users. Between the built-in venv, conda, and Docker, I
| feel that the use cases for most are well covered.
|
| The only focus area I really see is better documentation.
| Easier-to-read documentation, more precisely. Perhaps a set of
| templates to help people get started with something like
| pyproject.
|
| It feels like the survey is looking for a specific answer, or
| maybe it's just that surveys are really hard to do. In any case
| I find my responses to be mostly: I have no opinion one way or
| the other.
| rpcope1 wrote:
| Something like bottle.py would be an excellent candidate for
| inclusion.
| The real reason to avoid putting anything into the standard
| library is that it seems to often be the place where code goes
| to stagnate and die for Python.
| at_a_remove wrote:
| I am not sure why that has turned into a truism.
|
| Really good code in the standard library should reach a
| level of near perfection, then eventually transition into
| hopeful speed gains, after which you're really only
| changing that code because the language has changed or the
| specification has updated.
| qayxc wrote:
| > I view dependencies outside of the standard library as a kind
| of technical debt
|
| That's an interesting position. So are you suggesting that very
| specialised packages such as graph plotting, ML packages, file
| formats, and image processing should be part of the standard
| library? What about very OS/hardware-specific packages, such as
| libraries for microcontrollers?
|
| There are many areas that don't have a common agreed-upon set
| of idioms or functionality and that are way too specialised to
| be useful for most users. I really don't think putting those
| into the standard library would be a good idea.
| at_a_remove wrote:
| Hrm. Graph plotting ... yes. File formats ... yes, as many as
| possible. Image processing, given the success of ImageMagick,
| I'd say yes there as well. I don't know enough about ML to say.
|
| OS-specific packages, quite possibly.
|
| The thing about the standard library is that it is like high
| school: there's a lot of stuff you think you will never need,
| and you're right about _most_ of it, but for the stuff you do
| need you're glad you had something going, at least.
| qayxc wrote:
| ImageMagick is actually a good example: I use Python as my
| primary tool for shell scripting (I don't like
| "traditional" shell scripts for various reasons) - if I can
| use Python to control external tools such as ImageMagick,
| why would I want to include all its functionality, codecs,
| effects, etc. in the standard library?
|
| Including too much leads to a huge burden for the
| maintainers and consequently results in this:
| https://peps.python.org/pep-0594/
|
| Quote:
|
| > Times have changed. With the introduction of PyPI (nee
| Cheeseshop), setuptools, and later pip, it became simple
| and straightforward to download and install packages.
| Nowadays Python has a rich and vibrant ecosystem of third-
| party packages. It's pretty much standard to either install
| packages from PyPI or use one of the many Python or Linux
| distributions.
|
| > On the other hand, Python's standard library is piling up
| with cruft, unnecessary duplication of functionality, and
| dispensable features.
| shrimpx wrote:
| I like poetry for its simplicity, but I can't tell how
| "official" it is in the Python ecosystem. I hope it doesn't die
| out. I think it's the simplest possible way to maintain deps
| and publish to PyPI if you don't have any weird edge cases.
| wlkr wrote:
| Couldn't agree more. Poetry is fantastic and provides that
| 'just works' experience for most cases. It's not official
| (although possibly should be adopted) but has gained ground by
| virtue of its quality. Fortunately it's very actively developed
| so will hopefully stick around.
| tempest_ wrote:
| I like poetry. It still has a way to go though, since it is
| slow as all hell doing almost anything, and its error messages
| are closer to a stack trace than something actionable.
| shrimpx wrote:
| I agree that the stack trace error messages are weird.
| That aspect feels uncharacteristically hacky for an otherwise
| pretty polished tool.
| IceHegel wrote:
| It should be made the official package manager, IMO.
| fsniper wrote:
| Python lacks the flexible universal binary distribution
| solution that nearly all of the newcomers have. Consider
| golang, rust, or docker images. Most probably docker image
| distribution is the only available solution for now, and volume
| management is the worst problem on that front.
| black3r wrote:
| I just wish that PyPI would enforce binary wheels going forward
| (at least for Linux x64/arm64 for people who use Docker, but
| ideally for all common platforms). They already supply the
| cibuildwheel tool to automate their builds, so it shouldn't be
| that hard for library developers...
|
| Software developers shouldn't need to figure out what
| build-time dependencies their libraries need...
| JonathonW wrote:
| I haven't had too much trouble with packages missing binary
| wheels lately. Occasionally pip doesn't find them, which looks
| the same as if the binary wheel were missing entirely (looking
| at you, Anaconda -- update your pip already), but they're
| usually there.
|
| But I'm usually on Windows if I need binary wheels; maybe the
| coverage is a bit different on Linux.
| verst wrote:
| Try installing grpcio-tools on Darwin/arm64 with Python 3.10.
| More often than not I run into problems where low-level
| headers required by some cryptography libraries cannot be
| found, and as a result compilation fails.
| dalke wrote:
| Should PyPI kick my project off because it doesn't support MS
| Windows?
|
| My package uses C and Cython extensions. While I support macOS
| and Linux-based OSes, I don't know how to develop on or support
| MS Windows.
|
| I've tried to be careful about the ILP32 vs LP64 differences,
| but I suspect there are going to be many places where I messed
| up.
|
| I also use "/dev/stdin" to work around my use of a third-party
| library that has no way to read from stdin. As far as I can
| tell, there's no equivalent in Windows, so I'll have to raise
| an exception for that case, and modify my test cases.
| ciupicri wrote:
| Can't you use "CON" instead of "/dev/stdin" on Windows?
| dalke wrote:
| I asked that question nearly 11 years ago on StackOverflow, at
| https://stackoverflow.com/questions/7395157/windows-equivale...
| . ;)
|
| Quoting the best comment, "echo test | type CON or echo
| test | type CONIN$ will read from the console, not from
| stdin."
| Kim_Bruning wrote:
| XKCD's opinion:
|
| https://xkcd.com/1987/
| 7373737373 wrote:
| And that's just installing packages; _creating_ packages is
| another hellscape altogether.
| YetAnotherNick wrote:
| I wish there were some package manager in between conda and
| pip. Conda is too strict and often gets stuck in SAT solving.
| pip doesn't even ask when reinstalling a version currently
| being used.
|
| Edit: Typo: reinstalling a version of a package currently being
| used
| woodruffw wrote:
| > pip doesn't even ask when reinstalling a version currently
| being used.
|
| Just as an explanation: a "version" in Python packaging can
| come from one of many potential distributions, including a
| local distribution (such as a path on disk) that might differ
| from a canonical released distribution on PyPI. Having `pip
| install ...` always re-install based on its candidate selection
| rules is _generally_ good (IMO), since an explicit run of `pip
| install` implies user intent to search for a potentially new or
| changed distribution.
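| The check a user might expect instead is cheap to express. A
| minimal sketch (the helper is hypothetical; it uses the
| third-party `packaging` library) of "skip the reinstall if the
| environment already satisfies the specifier":
|
|       from importlib.metadata import version, PackageNotFoundError
|       from packaging.specifiers import SpecifierSet
|
|       def needs_install(name: str, spec: str) -> bool:
|           try:
|               installed = version(name)  # version already on disk
|           except PackageNotFoundError:
|               return True                # not installed at all
|           return installed not in SpecifierSet(spec)
|
|       needs_install("requests", ">=2.28")  # False if 2.28+ present
|
| pip deliberately does more than this, for the reason above: the
| same version string can correspond to different distributions.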
| YetAnotherNick wrote:
| I meant this for a dependency, not the package I am
| installing.
| woodruffw wrote:
| I'm not sure I understand what you mean -- `pip` should not
| be reinstalling transitive dependencies. If you install A
| and B and both depend on C, C should only be installed
| once.
| Izkata wrote:
| I think they're referring to requirements files. I've
| seen the same behavior - on day 1, pip installs packages
| A and B, then on day 2, when someone else has modified the
| requirements file, it installs C and reinstalls B even
| though B hasn't changed.
|
| This one I've seen when the dependency doesn't specify an
| exact version and you included "--upgrade".
|
| There's a second case that I think was fixed in pip 21 or
| 22, where two transitive dependencies overlap - A and B
| depend on C, but with different version ranges. If A
| allows a newer version of C than B allows, C can get
| installed twice.
| 5d8767c68926 wrote:
| My ask would be to get rid of the need for conda altogether.
|
| Conda obviously offers a lot of value in sharing hairy compiled
| packages, but it does not play well with anything else. None of
| the available tooling really works with both conda and pip. It
| fragments the already lousy packaging story.
| lmeyerov wrote:
| We find mamba solves dependency solving for conda (we fail to
| do GPU dependencies without it), and I think it's getting
| integrated.
|
| My main thing w/ conda is it's bananas figuring out how to make
| a new recipe, which is pretty surprising.
| kylebarron wrote:
| Agreed, I've found packaging for conda to be so much harder
| than packaging for pip.
| thefinaluser wrote:
| Try poetry. It wraps pip and fixes a lot of its issues.
| lightspot21 wrote:
| Seconding Poetry. IMO it should have been the standard
| package manager - it just works (TM)
| Smaug123 wrote:
| Although it consumed 70GB of RAM before I killed it, when I
| tried to use it to `poetry install` stable-diffusion.
| dwagnerkc wrote:
| Try using mamba (https://github.com/mamba-org/mamba)
|
| We ran into many unsolvable or 30m+ solvable envs with conda
| that mamba handled quickly.
|
| The underlying solver can be used with conda directly as well,
| but I have not done that
| (https://www.anaconda.com/blog/a-faster-conda-for-a-growing-c...)
| ryan29 wrote:
| I didn't take the survey because I've never packaged anything
| for PyPI, but I wish all of the package managers would have an
| option for domain-validated namespaces.
|
| If I own example.com, I should be able to have
| 'pypi.org/example.com/package'. The domain can be tied back to
| my (domain-verified) GitHub profile, and it opens up the
| possibility of using something like
| 'example.com/.well-known/pypi/' for self-managed signing keys,
| etc.
|
| I could be using the same namespace for every package manager
| in existence if domain-validated namespaces were common.
|
| Then, in my perfect world, something like Sigstore could
| support code signing with domain-validated identities.
| Domain-validated signatures make a lot of sense. Domains are
| relatively inexpensive, inexhaustible, and globally unique.
|
| For code signing, I recognize a lot of project names and
| developer handles while knowing zero real names for the
| companies / developers involved. If those were sitting under a
| recognizable organizational domain name (example.com/ryan29), I
| can do a significantly better job of judging something's
| trustworthiness than if it's attributed to 'Ryan Smith Inc.',
| right?
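| A minimal sketch of the validation flow being proposed here,
| assuming a hypothetical /.well-known/pypi document that lists
| the accounts allowed to publish under a domain's namespace (no
| such endpoint exists today, and the "publishers" key is
| invented):
|
|       import json
|       from urllib.request import urlopen
|
|       def namespace_claim_valid(domain: str, account: str) -> bool:
|           # fetch the domain's self-published claim document
|           url = f"https://{domain}/.well-known/pypi"
|           with urlopen(url, timeout=10) as resp:
|               claim = json.load(resp)
|           return account in claim.get("publishers", [])
|
|       # e.g. namespace_claim_valid("example.com", "ryan29")
|
| The registry would re-run this check periodically and on every
| publish, which is where the domain-expiry problem raised below
| comes in.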
| simonw wrote:
| That's a really interesting idea, but I worry about what
| happens when a domain name expires and is re-registered
| (potentially even maliciously) by someone else.
| ryan29 wrote:
| I think you'd probably need some buy-in from the domain
| registries and ICANN to make it really solid. Ideally,
| domains would have something similar to public certificate
| transparency logs, where domain expirations would be recorded.
| I even think it would be reasonable to log registrant changes
| (legal registrant, not contact info). In both cases, it
| wouldn't need to include any identifiable info, just a simple
| expired/ownership-changed trigger so others would know they
| need to revalidate related identities.
|
| I don't know if registries would play ball with something
| like that, but it would be useful and should probably exist
| anyway. I would even argue that once a domain rolls through
| grace, redemption, etc. and gets dropped / re-registered,
| that should invalidate it as an account recovery method
| everywhere it's in use.
|
| There's a bit of complexity when it comes to the actual
| validation because of stuff like that. I think you'd need
| buy-in from at least one large company that could do the
| actual verification and attest to interested parties via
| something like OAuth. Think along the lines of "verify your
| domain by logging in with GitHub", where at GitHub an
| organization owner that's validated their domain would be
| allowed to grant OAuth permission to read the verified domain
| name.
| dane-pgp wrote:
| You've already talked about Sigstore (which is an excellent
| technology for this space), so we can consider developers
| holding keys that are stored in an append-only log. Then it
| doesn't matter if the domain expires and someone re-registers
| it, since they don't have the developer's private keys.
|
| Of course there are going to be complexities involving
| key-rollover and migrating to a different domain, but a
| sufficiently intelligent Sigstore client could handle the
| various messages and cryptographic proofs needed to secure
| that. The hard part is how to issue a new key if you lose
| the old one, since that probably requires social vouching
| and a reputation system.
|
| [0] https://docs.sigstore.dev/
| jacques_chester wrote:
| > _Then it doesn't matter if the domain expires and
| someone re-registers it, since they don't have the
| developer's private keys._
|
| A principal reason to use Sigstore is to get out of the
| business of handling private keys entirely. It turns a
| key management problem into an identity problem, the
| latter being much easier to solve at scale.
| ryan29 wrote:
| > Then it doesn't matter if the domain expires and
| someone re-registers it, since they don't have the
| developer's private keys.
|
| That's a good point in terms of invalidation, but a new
| domain registrant should be able to claim the namespace
| and start using it.
|
| I think one possible solution to that would be to assume
| namespaces can have their ownership changed and build
| something that works with that assumption.
|
| Think along the lines of having 'pypi.org/example.com' be
| a redirect to an immutable organization:
| 'pypi.org/abcd1234'. If a new domain owner wants to take
| over the namespace, they won't have access to the existing
| account, and re-validating to take ownership would force
| them to use a different immutable organization:
| 'pypi.org/ef567890'.
|
| If you have a package locking system (like NPM), it would
| lock to the immutable organization, and any updates that
| resolve to a new organization could throw a warning and
| require explicit approval. Think of it like an organization
| lock:
|
|       v1: pypi.org/example.com --> pypi.org/abcd1234
|       v2: pypi.org/example.com --> pypi.org/ef123456
|
| If you go from v1 to v2 you _know_ there was an ownership
| change or, at the very least, an event that you need to
| investigate.
|
| Losing control of a domain would be recoverable because
| existing artifacts wouldn't be impacted, and you could use
| the immutable organization to publish the change, since
| that's technically the source of truth for the artifacts.
| Put another way, the immutable organization has a pointer
| back to the current domain-validated namespace:
|
|       v1: pypi.org/abcd1234 --> example.com
|       v2: pypi.org/abcd1234 --> example.net
|
| If you go from v1 to v2 you _know_ the owner of the
| artifacts you want has moved from the domain example.com
| to example.net. The package manager could give a warning
| about this and let an artifact consumer approve it, but
| it's less risky than the change above because the owner
| of 'abcd1234' hasn't changed and you're already trusting
| them.
|
| I think that's a reasonably effective way of solving
| attacks that rely on registering expired domains to take
| over a namespace, and it also makes it fairly trivial for
| namespace owners to point artifact consumers to a new
| domain if needed.
|
| Think of the validated domain as more of a vanity pointer
| than an actual artifact repository. In fact, thinking
| about it like that, you don't actually need any
| cooperation or buy-in from the domain registries.
|
| > The hard part is how to issue a new key if you lose the
| old one, since that probably requires social vouching and
| a reputation system.
|
| It's actually really hard, because as you increase the
| value of a key, I think you decrease the security
| practices around handling it. For example, some people
| will simply drop their keys into OneDrive if there's any
| inconvenience associated with losing them.
|
| I would really like to have something where I can use a
| key generated on a tamper-proof device like a YubiKey and
| not have to worry about losing it. Ideally, I could
| register a new key without any friction.
| westurner wrote:
| CT: Certificate Transparency logs log creation and revocation
| events.
|
| The Google/trillian database which supports Google's CT logs
| uses Merkle trees but stores the records in a centralized
| data store - meaning there's _at least one_ SPOF (Single
| Point of Failure) which one party has root on and sole backup
| privileges for.
|
| Keybase, for example, stores their root keys - at least - in
| a distributed, redundantly-backed-up blockchain that nobody
| has root on; and key creation and revocation events are
| publicly logged similarly to now-called "CT logs".
|
| You can link your Keybase identity with your other online
| identities by proving control by posting a cryptographic
| proof, thus adding an edge to a WoT (Web of Trust).
|
| While you can add DNS record types like CERT, OPENPGPKEY,
| SSHFP, CAA, RRSIG, NSEC3; DNSSEC and DoH/DoT/DoQ cannot be
| considered to be universally deployed across all TLDs.
| Should/do e.g. ACME DNS challenges fail when a TLD doesn't
| support DNSSEC, or hasn't secured root nameservers to a
| sufficient baseline, or? DNS is not a trustless system.
|
| EDNS (Ethereum DNS) is a trustless system. Reading EDNS
| records does not cost EDNS clients any
| gas/particles/opcodes/ops/money.
|
| Blockcerts is designed to issue any sort of credential, and
| allow for signing of any RDF graph like JSON-LD.
|
| List_of_DNS_record_types:
| https://en.wikipedia.org/wiki/List_of_DNS_record_types
|
| Blockcerts: https://www.blockcerts.org/
| https://github.com/blockchain-certificates :
|
| > _Blockcerts is an open standard for creating, issuing,
| viewing, and verifying blockchain-based certificates_
|
| W3C VC-DATA-MODEL: https://w3c.github.io/vc-data-model/ :
|
| > _Credentials are a part of our daily lives; driver's
| licenses are used to assert that we are capable of operating
| a motor vehicle, university degrees can be used to assert our
| level of education, and government-issued passports enable us
| to travel between countries. This specification provides a
| mechanism to express these sorts of credentials on the Web in
| a way that is cryptographically secure, privacy respecting,
| and machine-verifiable_
|
| W3C VC-DATA-INTEGRITY: "Verifiable Credential Data Integrity
| 1.0" https://w3c.github.io/vc-data-integrity/#introduction :
|
| > _This specification describes mechanisms for ensuring the
| authenticity and integrity of Verifiable Credentials and
| similar types of constrained digital documents using
| cryptography, especially through the use of digital
| signatures and related mathematical proofs. Cryptographic
| proofs enable functionality that is useful to implementors of
| distributed systems. For example, proofs can be used to: Make
| statements that can be shared without loss of trust,_
|
| W3C _TR_ DID (Decentralized Identifiers):
| https://www.w3.org/TR/did-core/ :
|
| > _Decentralized identifiers (DIDs) are a new type of
| identifier that enables verifiable, decentralized digital
| identity. A DID refers to any subject (e.g., a person,
| organization, thing, data model, abstract entity, etc.) as
| determined by the controller of the DID. In contrast to
| typical, federated identifiers, DIDs have been designed so
| that they may be decoupled from centralized registries,
| identity providers, and certificate authorities.
| Specifically, while other parties might be used to help
| enable the discovery of information related to a DID, the
| design enables the controller of a DID to prove control over
| it without requiring permission from any other party. DIDs
| are URIs that associate a DID subject with a DID document
| allowing trustable interactions associated with that
| subject._
|
| > _Each DID document can express cryptographic material,
| verification methods, or services, which provide a set of
| mechanisms enabling a DID controller to prove control of the
| DID. Services enable trusted interactions associated with the
| DID subject. A DID might provide the means to return the DID
| subject itself, if the DID subject is an information resource
| such as a data model._
| jacques_chester wrote:
| Sigstore uses Trillian for its transparency log, Rekor.
| dane-pgp wrote:
| For another example of how Ethereum might be useful for
| certificate transparency, there's a fascinating paper from
| 2016 called "EthIKS: Using Ethereum to audit a CONIKS key
| transparency log", which is probably way ahead of its time.
|
| Abstract:
| https://link.springer.com/chapter/10.1007/978-3-662-53357-4_...
|
| PDF: https://jbonneau.com/doc/B16b-BITCOIN-ethiks.pdf
| rhyselsmore wrote:
| Needs compensating controls to get it right.
|
| * Dependencies are managed in a similar way to Go - where
| hashes of installed packages are stored and compared
| client-side. This means that a hijacker could only serve up
| the valid versions of packages that I've already installed.
|
| * This is still a "centralized" model where a certain level
| of trust is placed in PyPI - a mode of operation where the
| "fingerprint" of the TLS key is validated would assist here.
| However, it comes with a few constraints.
|
| Of course the above still comes with the caveat that you have
| to trust PyPI. I'm not saying that this is an unreasonable
| ask. It's just how it is.
| jacques_chester wrote:
| Maven Central requires validation of a domain name in order to
| use a reverse-domain package [0].
|
| It's not without problems. One is that folks often don't
| control the domain (consider Go's charming habit of conflating
| version control with package namespacing). Another is what was
| noted below: resurrection attacks on domains can be quite
| trivial and already happen in other forms (e.g. registering
| lapsed domains for user accounts and performing a reset).
|
| [0] https://central.sonatype.org/faq/how-to-set-txt-record/
| shireboy wrote:
| I've been tinkering with stable diffusion lately, and this has
| been a rude introduction to Python. Coming from .NET (NuGet)
| and JavaScript (npm), it's baffling that there isn't an
| established solution for Python. It looks to me like people are
| trying, but different libraries use different techniques. To a
| newcomer this is confusing.
| timtom39 wrote:
| ML/AI is the deep end of Python dependencies. Lots of
| hardware-specific requirements (e.g. CUDA, cuDNN, AVX2
| TensorFlow binaries, etc.). A typical Python web application is
| a lot simpler.
| antod wrote:
| _> I've been tinkering with stable diffusion lately, and this
| has been a rude introduction to Python. Coming from .NET
| (NuGet) and JavaScript (npm), it's baffling that there isn't an
| established solution for Python._
|
| Python has had multiple legacy solutions going back a long time
| before NuGet and npm existed, and before central registries of
| dependencies. Every new solution has to cope with all that
| compatibility/transitional baggage. Also a bunch of use cases
| .NET or JS never really had to deal much with - e.g. being a
| core system language for Linux distros, and supporting
| cross-platform installs back in the download-and-run-something
| days. The scope of areas Python gets used in means its
| packaging is pulled in more directions than most other
| languages, which mostly stick to a main niche.
|
| So the history and surface area of problems to solve in Python
| packaging is larger than what most other languages have had to
| deal with. It also takes years for the many third-party tools
| to try out new approaches, gain traction, and then slowly get
| their best ideas synthesized and adapted into the much more
| conservative core Python stdlib.
|
| Not saying it is great, just laying out some of the reasons it
| is what it is.
| zbentley wrote:
| I imagine many of you have feedback that could be useful to
| folks making decisions about the future of Python packaging, a
| common subject of complaint in many discussions here.
|
| Remember not to just complain, but to offer specific
| problems/solutions -- i.e. avoid statements like "virtualenvs
| suck, why can't it be like NPM?" and prefer instead feedback
| like "the difference between the Python interpreter version and
| what virtualenv is being used causes confusion".
| Kwpolska wrote:
| "virtualenvs suck, why can't it be like NPM?" is a specific
| problem and a specific solution. The problem being having to
| manage venvs (which have many gotchas and pitfalls, and no
| standardization), and the solution is to replace those with
| packages being installed into the project folder with
| standardized and well-known tools.
| qbasic_forever wrote:
| Keep an eye on https://peps.python.org/pep-0582/ - it's a
| proposal to add local-directory/node_modules-like behavior to
| package installs. It stalled out a few years ago, but I heard
| there is a lot more discussion and push to get it in now.
|
| I think if this PEP makes it in, then like 90% of people's
| pain with pip just completely goes away almost overnight. Love
| it or hate it, the NPM/node_modules style of all dependencies
| dumped in a local directory solves a _ton_ of problems in the
| packaging world. It would go a long way towards making the
| experience much smoother for most Python users.
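| The effect the PEP describes can be approximated by hand
| today. A rough sketch of the sys.path shim (the __pypackages__
| layout follows the draft PEP and may change):
|
|       import sys
|       import pathlib
|
|       ver = f"{sys.version_info.major}.{sys.version_info.minor}"
|       pkgs = pathlib.Path(__file__).parent / "__pypackages__" / ver / "lib"
|       if pkgs.is_dir():
|           # project-local installs win over global site-packages
|           sys.path.insert(0, str(pkgs))
|
| After `pip install --target __pypackages__/3.10/lib requests`,
| `import requests` resolves locally, with no virtualenv
| involved.
___________________________________________________________________
(page generated 2022-09-07 23:00 UTC)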