[HN Gopher] Python: Overlooked core functionalities
___________________________________________________________________
Python: Overlooked core functionalities

Author : erikvdven
Score  : 79 points
Date   : 2023-07-24 20:06 UTC (2 hours ago)

(HTM) web link (erikvandeven.medium.com)
(TXT) w3m dump (erikvandeven.medium.com)

| robertlagrant wrote:
| This is a bit of bikeshedding, but I think
|
|     if not n in memo:
|
| is more naturally written as
|
|     if n not in memo:

| progmetaldev wrote:
| As someone learning Python, but having worked with other
| languages, I think your second example is better as it reads
| more like English. I think that simplicity actually ends up
| much more rewarding when it comes to reading code.

| qsort wrote:
| The big missing item from the list: generators!
|
| Using "yield" instead of "return" turns the function into a
| coroutine. This is useful in all sorts of cases and works _very_
| well with the itertools module of the standard library.
|
| One of my favorite examples: a very concise snippet of code that
| generates all primes:
|
|     from collections import defaultdict
|     from itertools import count
|
|     def primes():
|         # ps maps each upcoming composite to the primes that divide it
|         ps = defaultdict(list)
|         for i in count(2):
|             if i not in ps:
|                 yield i
|                 ps[i**2].append(i)
|             else:
|                 for n in ps[i]:
|                     ps[i + (n if n == 2 else 2*n)].append(n)
|                 del ps[i]

| atoav wrote:
| And this is a presentation explaining _why_ generators may be
| extremely useful for all kinds of data pipelines:
| https://www.dabeaz.com/generators/Generators.pdf
|
| If you don't know it already, it is really worth looking into.
| I am a Python dev with nearly a decade of experience and I knew
| generators, and yet this was still an eye opener.

| thenberlin wrote:
| Wow, thanks for that -- that's an excellent slide deck.

| gurchik wrote:
| I like using generators when querying APIs that paginate
| results. It's an easy way to abstract away the pagination for
| your caller.
|
|     import requests
|
|     # URL, QUERY and process_result are placeholders
|     def get_api_results(query):
|         params = { "next_token": None }
|         while True:
|             response = requests.get(URL, params=params)
|             data = response.json()
|             yield from data["results"]
|             if data["next_token"] is None:
|                 return
|             params["next_token"] = data["next_token"]
|
|     for result in get_api_results(QUERY):
|         process_result(result)  # No need to worry about pagination

| TrianguloY wrote:
| But wait, there's more: you can send data back to the function!
| (It will be returned as the yield output.)
|
| https://stackoverflow.com/questions/20579756/passing-value-t...
|
| And don't forget "yield from" (same as yielding all values in a
| list, but keeps the original generator! You can send data back
| to the list if it is itself another generator!)

| erikvdven wrote:
| Thanks! I tried to add mostly the stuff I don't encounter that
| often in blogs/tutorials etc. But I guess you are right.
| Generators, or at least the 'yield' keyword, are often
| misunderstood, and we can't emphasize them enough.

| qsort wrote:
| Just to clarify, I don't mean your article is bad or
| incomplete -- quite the contrary, I enjoyed it a lot.
| Generators are one of my favorite Python features and they're
| kind of underused, mostly because people simply don't know
| about them.
|
| A couple more along the same lines:
|
| - Metaclasses and type. (This is admittedly dark magic, but
| useful in library code, less so in application code)
|
| - Magic methods! Everyone knows about __init__, but you can
| override all sorts of behaviors (see:
| https://docs.python.org/3/reference/datamodel.html)
|
| My favorite example (I have a lot of favorite examples :)) is
| __call__, which emulates function calling and is the
| equivalent of C++'s operator().
|
| Why is it my favorite? Because as the old adage goes, "a
| class is a poor man's closure, a closure is a poor man's
| class":
|
|     class C:
|         def __init__(self, x):
|             self.x = x
|         def __call__(self, y):
|             return self.x + y
|
|     >>> a = C(2)
|     >>> a(3)
|     5

| erikvdven wrote:
| Thanks a lot!
| Really appreciate it. Love the example!
| Haven't used the dunder __call__ yet (like many magic
| methods I guess), but that's a nice one!
|
| I didn't have to use Metaclasses, either, though I have
| read about them, especially in Fluent Python. But I guess I
| belong to the 99% who haven't had to worry about them, yet
| :P

| slt2021 wrote:
| can you explain how generators work with multiprocess (Thread
| based pool)?
|
| is the _ps_ internal variable unique for each Thread or the same?
|
| is it safe to execute your primes() from different threads?

| TrianguloY wrote:
| Calling a function that contains a yield simply returns a
| generator object, which holds the information about the next
| value to produce and how to continue the function execution.
| That's why you need to use functions that yield things inside
| loops or list(...).
|
| If you run it from different threads I guess it will be the
| same as calling the function multiple times: each call returns
| a new, started-from-the-top generator.
|
|     def sum():
|         yield 1
|         yield 2
|
|     print(repr(sum()))
|     print(next(sum()))
|     print(next(sum()))
|
| Prints
|
|     <generator object sum at 0x7fc6f14823c0>
|     1
|     1

| slt2021 wrote:
| so a Thread based pool will have the same instance of the
| generator, while a Process based pool will have a unique
| instance of the generator?

| TrianguloY wrote:
| I don't know if a generator can be shared across threads,
| but in that case ... I have no idea :/
|
| You'll need to search, or try!

| qsort wrote:
| > can you explain how generators work with multiprocess
|
| The best way to think of a generator is as an object
| implementing the iteration protocol. They don't really
| interact with concurrency; as far as multiprocess is
| concerned, they're just regular objects. So the answer is
| that it depends on how you plan to share memory between the
| processes.
|
| > is ps internal variable unique for each Thread or same?
|
| ps is local to the generator _instance_.
|
|     def f():
|         x = 0
|         while True:
|             yield (x := x + 1)
|
|     >>> f()
|     <generator object f at 0x10412e500>
|     >>> x = f()
|     >>> y = f()
|     >>> next(x)
|     1
|     >>> next(x)
|     2
|     >>> next(y)
|     1
|
| > is it safe to execute your primes() from different threads?
|
| For this specific generator, you would run into the GIL. More
| generally, if you're talking about non CPU-bound operations,
| you need to synchronize the threads. It's worth looking into
| asyncio for those use cases.

| MayeulC wrote:
| Hmm, I encountered or used all of these somewhere, but 4 days ago
| I learned something else: Python natively supports complex
| numbers.
|
|     a=1+3j
|     b=a+4j
|
| I encountered this when a friend noticed some weird syntax for a
| numpy meshgrid (via mgrid):
|
|     np.mgrid[-1:1:5j]

| version_five wrote:
| Re unpacking with *: one I use often is when you have a list of
| tuples of coordinates you want to plot, i.e.
|
|     # z = [(x0,y0), (x1,y1) ...]
|
| You can do
|
|     import matplotlib.pyplot as plt
|     plt.plot(*zip(*z))
|
| I spent years doing
|
|     x = [t[0] for t in z]  # etc
|
| before I realized this.

| jacurtis wrote:
| I'd argue that your original approach is actually better than
| your new approach.
|
| Using a list comprehension, as in your original approach, is
| pretty easily understood by anyone writing Python and is easy
| to follow; it is also quite terse.
|
| Your recursive unpacking zip thing is much harder to understand
| and read. This reminds me of the type of stuff you find in the
| codebase years later when the person who wrote it is long gone
| and you find a comment next to it that says:
|
|     # No idea why this works, but don't touch it
|
| One of the problems I have with Python is that there are a
| million super creative ways to do stuff, especially using less
| known parts of the language. People love to get super creative
| with it, but usually the simplest solution is actually the best
| one, especially when working on a team.
|
| In your example above, you aren't even saving any real space.
| Both approaches can be done inline; the list comprehension is
| maybe a few extra characters. You're not really saving
| anything, just making it harder to read and maintain by others.
|
| When I moved from a company that wrote in Python to one that
| wrote in Golang, I found that the restrictions that Golang
| imposes are a huge benefit in a team. Because you don't have
| access to all these crazy language components that Python has,
| the code written in Go would be almost identical regardless of
| who wrote it. Of course everything in Golang is far far more
| verbose than Python, but I actually found it 100x more
| maintainable.
|
| In the Python codebase it was very easy to tell who wrote
| different parts of a codebase without looking at the git blame,
| because there was almost a "voice" with the style of writing
| Python. But in Golang it was more restrictive, which meant that
| the entire codebase was more cohesive and easy to jump
| around.

| version_five wrote:
| Yes I completely agree that Python has lots of "too clever
| for its own good" ways of doing things, and that my example
| could be seen that way.
|
| Using it as a scripting language though I still like the
| shortcut.

| ziedaniel1 wrote:
| A couple details worth noting:
|
| - `repr` often outputs valid source code that evaluates to the
| object, including in the post's example: running
| `datetime.datetime(2023, 7, 20, 15, 30, 0, 123456)` would give
| you a `datetime.datetime` object equivalent to `today`.
|
| - Using `_` for throwaway variables is merely a convention and
| not built into the language in any way (unlike in Haskell, say).

| zwieback wrote:
| Great list, thanks, I'll be sure to use some of these.
|
| Here's the obvious question: how many more unknown-but-useful
| features are hidden away in other similar articles?
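
A minimal sketch of ziedaniel1's two points about `repr` and `_`. It assumes the article's `today` held the value quoted in that comment; the names `source`, `round_tripped` and `last_underscore` are only illustrative, and `eval` is used purely for demonstration.

```python
import datetime

# Assumed value of the article's `today`, taken from the repr quoted in the thread.
today = datetime.datetime(2023, 7, 20, 15, 30, 0, 123456)

# repr() of a datetime is valid Python source for an equal object.
source = repr(today)
round_tripped = eval(source)  # demonstration only; never eval untrusted text

# `_` is an ordinary variable name: after the loop it still holds the last value.
for _ in range(3):
    pass
last_underscore = _

print(source)
print(round_tripped == today, last_underscore)
```

The round trip works here because `datetime` is imported under the same name that the `repr` string refers to; many objects (plain class instances, for example) do not have an evaluable `repr` unless their author writes one.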

| atxbcp wrote:
| - none of these functionalities are "overlooked", this is pretty
| basic Python
| - for fibonacci you have a decorator for
| memoization (functools cache / lru_cache)
| - you don't need
| to use parentheses for a single line "if"

| agumonkey wrote:
| You consider these 'basic' Python? Just curious, I'd say it's
| a bit below intermediate.

| OJFord wrote:
| At the point we're disagreeing about 'basic' vs. 'bit below
| intermediate'.. idk we at least have to agree how many levels
| the model has.
|
| Fwiw I also thought it was pretty regular stuff, and then
| arcane library functions you've either needed or you haven't.
| Also, that's a generator, not a list comprehension.

| erikvdven wrote:
| You are very much right, a lot of it is pretty basic knowledge.
| From my experience though, a lot of Python developers don't
| take the Python docs or tutorial as a first resource, and quite
| a few developers I met lacked much of the knowledge I mentioned
| in the article.
|
| You are right about the fibonacci decorator, I thought I did
| refer to another article where I mention the lru_cache as well
| :) But I'll double check.
|
| Good one about the parentheses! I'll post an update soon.

| barrenko wrote:
| Use all this and you've got yourself a poor man's Ruby.

| carabiner wrote:
| Would add:
|
| * For dicts, learn .setdefault() vs. .get() vs. defaultdict()
|
| * .sort(key=sortingkey)
|
| * itertools groupby, chain
|
| * map, filter, reduce

| was_a_dev wrote:
| The walrus operator isn't overlooked imo. It's more that many
| still haven't updated to >=3.8

| ivalm wrote:
| Multiple context managers in a single with statement is something
| I didn't know!

| stabbles wrote:
| > Python arguments are evaluated when the function definition is
| encountered
|
| This is a giant pain. Easy to miss. Sometimes forces you to deal
| with Optional[Something] instead of just Something.
|
| Compare with Julia where default arguments are evaluated ...
| very late:
|
|     julia> f(a, b, c, d = a * b * c) = d
|     f (generic function with 2 methods)
|
|     julia> f("hello", " ", "world")
|     "hello world"
|
| that's really neat.

| progmetaldev wrote:
| Probably one of the benefits I gained from writing JavaScript
| before ES5 (although I have worked with many languages, I've only
| used a few that were dynamic - PHP, JS, and old VB). I write my
| functions as early as possible, having remembered hoisting
| rules from JavaScript (and trying to only rely on OOP with
| Python where it naturally makes sense).
|
| Looking at your Julia example, this seems much more friendly
| and less surprising and error-prone.

| IshKebab wrote:
| "Overlooked core functionality" is an interesting way to spell
| "massive footgun".

| TrianguloY wrote:
|     first, _, last = [1, 2, 3, 4, 5]
|
| I guess this is a typo, it should be
|
|     first, *_, last = [1, 2, 3, 4, 5]
|
| (As explained above!)
|
| Other than that, nice list of Python tricks, I love not-so-known
| features because they can make code shorter and prettier!

| erikvdven wrote:
| Sharp! Updated that line. And thank you for the compliment :)

| rowanseymour wrote:
| Since Python 3.7
|
|     import pdb
|     pdb.set_trace()
|
| can be written as just
|
|     breakpoint()

| agumonkey wrote:
| I was told that at my job, but my fingers are so used to typing
| `pdb` and emacs template-replacing it that I can't change.

| erikvdven wrote:
| Thanks for the tip! :)

| IshKebab wrote:
| And this also works with Debugpy so you can actually use a
| proper debugger and not pdb which is frankly terrible.

| VWWHFSfQ wrote:
| > Python arguments are evaluated when the function definition is
| encountered. Good to remember that!
|
| I would never try to exploit this behavior to achieve some kind
| of benefit (avoiding max recursion). Any tricks you try to do
| with this are almost definitely going to cause bugs that are
| very difficult to track down. So don't be too clever here.
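
The behavior quoted above, default arguments evaluated once when the `def` statement runs, can be made visible with a small sketch. The `_ticket` counter and the `eager`/`lazy` names are hypothetical helpers introduced only for this illustration; they are not from the article.

```python
import itertools

# Hypothetical counter, used only to make the moment of evaluation visible.
_ticket = itertools.count(1)


def eager(value=next(_ticket)):
    # The default expression ran exactly once, when `def` was executed.
    return value


def lazy(value=None):
    # A None sentinel defers the evaluation to each individual call.
    if value is None:
        value = next(_ticket)
    return value


eager_results = [eager(), eager(), eager()]
lazy_results = [lazy(), lazy(), lazy()]
print(eager_results, lazy_results)
```

`eager` returns the same ticket on every call because its default was computed at definition time, while `lazy` draws a fresh one per call. The sentinel form is the usual way to get per-call defaults, at the cost of the `Optional[...]` annotation stabbles mentions.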

| progmetaldev wrote:
| Before I saw your comment, I had "overlooked" that these were
| presented as beneficial features, rather than just curiosities.
| As someone just learning Python, but familiar with other
| languages, I can only hope that if I start using Python in
| production with other developers they take the most obvious
| route (or use a comment as to why they would be relying on this
| type of behavior).
|
| I chose to learn Python because it seemed to be the easiest to
| read, which to my mind meant working in a team would lead to
| easier discovery and understanding. Then I see articles like
| this, and wonder if I'll have a lot of footguns to watch out
| for where the code isn't as clear as it seems.

| hangonhn wrote:
| Yeah. I was really surprised to see this as a feature to be
| used rather than a gotcha. I've seen it more as gotchas, as in
| actual bugs introduced because of this behavior, and never as a
| feature until now. I can see why he thinks it's useful though
| and, maybe within his specific context, it is. That said, even
| for his example, I think he would have been better off using
| https://docs.python.org/3/library/functools.html#functools.c...

| BoppreH wrote:
| Pretty good list. Two corrections:
|
| The `first, *middle, last` trick doesn't work if your list only
| has one element:
|
|     first, *middle, last = [1]
|     ValueError: not enough values to unpack (expected at least 2, got 1)
|
| And the last title has a typo:
|
| > Separater for Large Numbers

| m4r71n wrote:
| I would not recommend the default arguments hack. Any decent
| linter or IDE will flag that as an error and complain about the
| default argument being mutable (in fact, mutable default
| arguments are the target of many beginner-level interview
| questions). It's much easier to decorate a function with
| `functools.cache` to achieve the same result.
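
A minimal sketch of the decorator-based memoization suggested above. The article's own Fibonacci function is not reproduced in this thread, so the `fib` below is an assumed stand-in. It uses `functools.lru_cache(maxsize=None)`, which behaves like `functools.cache` but is also available on Python versions older than 3.9.

```python
import functools


@functools.lru_cache(maxsize=None)  # same effect as @functools.cache on Python 3.9+
def fib(n):
    """Return the n-th Fibonacci number, with fib(0) == 0 and fib(1) == 1."""
    if n < 2:
        return n
    # Each distinct n is computed once; repeated calls are served from the cache.
    return fib(n - 1) + fib(n - 2)


print(fib(30))
print(fib.cache_info())
```

The cache lives on the decorated function rather than in a mutable default argument, so linters have nothing to flag and `fib.cache_clear()` can reset it explicitly.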

| Smaug123 wrote:
| More concretely, one of the classic Python bugs is to use `[]`
| as a default argument and then mutate what "is obviously" a
| local variable.

| nighthawk454 wrote:
| I think it's even more safe/preferable to use non-mutable
| `None`s as a default and do:
|
| ```
| def myfunc(x=None):
|     x = x if x is not None else []
|     ...
| ```

| TrianguloY wrote:
| Or, if you need a "static" variable for other purposes, the
| usual alternative is to just use a global variable, but if for
| some reason you can't (or you don't want to) you can use the
| function itself!
|
|     def f():
|         if not hasattr(f, "counter"):
|             f.counter = 0
|         f.counter += 1
|         return f.counter
|
|     print(f(),f(),f())
|     > 1 2 3

| data-ottawa wrote:
| I didn't realize that the function was available in its own
| scope. This information is going to help me do horrible
| things with pandas.

| kevincox wrote:
| This is very important for self-recursion.

| Spivak wrote:
| I would hate to get an interview question where the very
| premise of it is wrong. Python does have mutable arguments, but
| so does Ruby.
|
|     def func(arr=[])
|       # Look ma we mutated it.
|       arr.append 1
|       puts arr
|     end
|
| Why calling this function a few times outputs [1], [1],...
| instead of [1], [1, 1],... isn't because Ruby somehow made the
| array immutable and hid it with copy-on-write or anything like
| that. It's because Ruby, unlike Python, has default
| _expressions_ instead of default _values_. Whenever the default
| is needed Ruby reevaluates the expression in the scope of the
| function definition and assigns the result to the argument. If
| your default expression always returned the same object you
| would fall into the same trap as Python.
|
| The sibling comment is wrong too -- it _is_ a local variable,
| or as much one as Python can have since all variables, local or
| not, are names.

| sanderjd wrote:
| Agreed, I found that example very confusing.
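
Spivak's distinction between default values and default expressions can be checked directly in Python. The sketch below is a Python counterpart of the Ruby snippet above, returning the list instead of printing it; the names `stored_default`, `first` and `second` are only illustrative.

```python
def func(arr=[]):
    # Same shape as the Ruby snippet, but returning the list instead of printing it.
    arr.append(1)
    return arr


# Python stores the already-evaluated default objects on the function itself.
stored_default = func.__defaults__[0]

first = func()
second = func()

print(first is second, first is stored_default, second)
```

Both calls hand back the very list stored in `func.__defaults__`, which is why the second call sees the element appended by the first. The parameter `arr` is still an ordinary local name; it is merely bound to the same object on every call that omits the argument.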

| klyrs wrote:
| functools.cache is pretty new; py3.8 is still supported for
| another year and a bit.

| mkl95 wrote:
| Some of these features lie in the border of the uncanny valley
| where languages like Ruby and "vanilla Javascript" live, and are
| not compatible with the principle of least surprise or even the
| Zen of Python. I don't write too much Python anymore, but when I
| do I keep it simple and explicit.
___________________________________________________________________
(page generated 2023-07-24 23:00 UTC)