[HN Gopher] Python: Overlooked core functionalities
       ___________________________________________________________________
        
       Python: Overlooked core functionalities
        
       Author : erikvdven
       Score  : 79 points
       Date   : 2023-07-24 20:06 UTC (2 hours ago)
        
 (HTM) web link (erikvandeven.medium.com)
 (TXT) w3m dump (erikvandeven.medium.com)
        
       | robertlagrant wrote:
       | This is a bit of bikeshedding, but I think
       | 
       |     if not n in memo:
       | 
       | is more naturally written as
       | 
       |     if n not in memo:
        
         | progmetaldev wrote:
         | As someone learning Python, but having worked with other
         | languages, I think your second example is better as it reads
         | more like English. I think that simplicity actually ends up
         | much more rewarding when it comes to reading code.
        
       | qsort wrote:
       | The big missing item from the list: generators!
       | 
       | Using "yield" instead of "return" turns the function into a
       | coroutine. This is useful in all sorts of cases and works _very_
       | well with the itertools module of the standard library.
       | 
       | One of my favorite examples: a very concise snippet of code that
       | generates all primes:
       | 
       |     from collections import defaultdict
       |     from itertools import count
       | 
       |     def primes():
       |         ps = defaultdict(list)
       |         for i in count(2):
       |             if i not in ps:
       |                 yield i
       |                 ps[i**2].append(i)
       |             else:
       |                 for n in ps[i]:
       |                     ps[i + (n if n == 2 else 2*n)].append(n)
       |                 del ps[i]
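       | 
       | A minimal sketch of how a generator pairs with itertools. The
       | squares() generator is an illustrative stand-in, not something
       | from the article: islice and takewhile let the caller decide how
       | much of an infinite stream to consume.

```python
from itertools import count, islice, takewhile


def squares():
    """Infinite generator of perfect squares: 1, 4, 9, ..."""
    for n in count(1):
        yield n * n


# Take a fixed number of items from the infinite stream.
first_five = list(islice(squares(), 5))

# Take items until a condition stops holding.
below_fifty = list(takewhile(lambda s: s < 50, squares()))
```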
        
         | atoav wrote:
         | And this is a presentation explaining _why_ generators may be
         | extremely useful for all kinds of data pipelines:
         | https://www.dabeaz.com/generators/Generators.pdf
         | 
         | If you don't know it already, it is really worth looking into.
         | I am a python dev with nearly a decade of experience and I knew
         | generators, and yet this was still an eye opener.
        
           | thenberlin wrote:
           | Wow, thanks for that -- that's an excellent slide deck.
        
         | gurchik wrote:
         | I like using generators when querying APIs that paginate
         | results. It's an easy way to abstract away the pagination for
         | your caller.
         | 
         |     def get_api_results(query):
         |         params = { "next_token": None }
         |         while True:
         |             response = requests.get(URL, params=params)
         |             json = response.json()
         |             yield from json["results"]
         |             if json["next_token"] is None:
         |                 return
         |             params["next_token"] = json["next_token"]
         | 
         |     for result in get_api_results(QUERY):
         |         process_result(result)  # No need to worry about pagination
        
         | TrianguloY wrote:
         | But wait, there's more, you can send data back to the function!
         | (Will be returned as the yield output)
         | 
         | https://stackoverflow.com/questions/20579756/passing-value-t...
         | 
         | And don't forget "yield from" (same as yielding all values of an
         | iterable one by one, but it delegates to the original object! If
         | that object is itself another generator, data you send is passed
         | through to it!)
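         | 
         | A small sketch of both ideas, with made-up names
         | (running_total, delegating): send() delivers a value as the
         | result of the paused yield expression, and "yield from"
         | forwards those sent values to the inner generator.

```python
def running_total():
    """Receive numbers via send() and yield the running sum."""
    total = 0
    while True:
        # The value passed to send() becomes the result of this yield.
        value = yield total
        total += value


def delegating():
    # "yield from" forwards next() and send() calls to the inner generator.
    yield from running_total()


gen = delegating()
initial = next(gen)    # prime the generator: run up to the first yield
first = gen.send(5)    # inner generator receives 5
second = gen.send(10)  # inner generator receives 10
```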
        
         | erikvdven wrote:
         | Thanks! I tried to add mostly the stuff I don't encounter that
         | often in blogs/tutorials etc. But guess you are right.
         | Generators, or at least the 'yield' keyword, are often
         | misunderstood, and we can't emphasize them enough
        
           | qsort wrote:
           | Just to clarify, I don't mean your article is bad or
           | incomplete -- quite the contrary, I enjoyed it a lot.
           | Generators are one of my favorite Python features and they're
           | kind of underused, mostly because people simply don't know
           | about them.
           | 
           | A couple more along the same lines:
           | 
           | - Metaclasses and type. (This is admittedly dark magic, but
           | useful in library code, less so in application code)
           | 
           | - Magic methods! Everyone knows about __init__, but you can
           | override all sorts of behaviors (see:
           | https://docs.python.org/3/reference/datamodel.html)
           | 
           | My favorite example (I have a lot of favorite examples :)) is
           | __call__, which emulates function calling and is the
           | equivalent of C++'s operator().
           | 
           | Why is it my favorite? Because as the old adage goes, "a
           | class is a poor man's closure, a closure is a poor man's
           | class":
           | 
           |     class C:
           |         def __init__(self, x):
           |             self.x = x
           |         def __call__(self, y):
           |             return self.x + y
           | 
           |     >>> a = C(2)
           |     >>> a(3)
           |     5
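           | 
           | To make the adage concrete, here is a sketch of the same
           | adder written both ways. The names make_adder and Adder are
           | illustrative only.

```python
def make_adder(x):
    """Closure version: x is captured from the enclosing scope."""
    def add(y):
        return x + y
    return add


class Adder:
    """Class version: x is stored on the instance, __call__ makes it callable."""

    def __init__(self, x):
        self.x = x

    def __call__(self, y):
        return self.x + y


closure_result = make_adder(2)(3)
class_result = Adder(2)(3)
```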
        
             | erikvdven wrote:
             | Thanks a lot! Really appreciate it. Love the example!
             | Haven't used the dunder __call__ yet (like many magic
             | methods I guess), but that's a nice one!
             | 
             | I haven't had to use metaclasses either, though I have
             | read about them, especially in Fluent Python. But I guess I
             | belong to the 99% who haven't had to worry about them, yet
             | :P
        
         | slt2021 wrote:
         | can you explain how generators work with multiprocess (Thread
         | based pool) ?
         | 
         | is _ps_ internal variable unique for each Thread or same?
         | 
         | is it safe to execute your primes() from different threads?
        
           | TrianguloY wrote:
           | Calling a function that contains a yield simply returns a
           | generator object, which holds the information needed to produce
           | the next value and to resume the function's execution. That's
           | why you need to consume such functions in a loop or with
           | list(...).
           | 
           | If you run it from different threads I guess it will be the
           | same as calling the function multiple times: each call returns
           | a new started-from-the-top generator.
           | 
           |     def sum():
           |         yield 1
           |         yield 2
           | 
           |     print(repr(sum()))
           |     print(next(sum()))
           |     print(next(sum()))
           | 
           | Prints
           | 
           |     <generator object sum at 0x7fc6f14823c0>
           |     1
           |     1
        
             | slt2021 wrote:
             | so a thread-based pool will have the same instance of the
             | generator, while a process-based pool will have a unique
             | instance of the generator?
        
               | TrianguloY wrote:
               | I don't know if a generator can be shared across threads,
               | but in that case ... I have no idea :/
               | 
               | You'll need to search, or try!
        
           | qsort wrote:
           | > can you explain how generators work with multiprocess
           | 
           | The best way to think of a generator is as an object
           | implementing the iteration protocol. They don't really
           | interact with concurrency; as far as multiprocess is
           | concerned, they're just regular objects. So the answer is
           | that it depends on how you plan to share memory between the
           | processes.
           | 
           | > is ps internal variable unique for each Thread or same?
           | 
           | ps is local to the generator _instance_.
           | 
           |     def f():
           |         x = 0
           |         while True:
           |             yield (x := x + 1)
           | 
           |     >>> f()
           |     <generator object f at 0x10412e500>
           |     >>> x = f()
           |     >>> y = f()
           |     >>> next(x)
           |     1
           |     >>> next(x)
           |     2
           |     >>> next(y)
           |     1
           | 
           | > is it safe to execute your primes() from different threads?
           | 
           | For this specific generator, you would run into the GIL. More
           | generally, if you're talking about non CPU-bound operations,
           | you need to synchronize the threads. It's worth looking into
           | asyncio for those use cases.
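           | 
           | A sketch of what "synchronize the threads" could look like
           | if one generator instance is deliberately shared. The
           | LockedIterator wrapper and the squares() generator are
           | hypothetical helpers, not standard library classes. Without
           | a lock, two threads calling next() on the same generator at
           | once can fail with "ValueError: generator already executing".

```python
import threading
from itertools import count


class LockedIterator:
    """Serialize next() calls on a shared iterator with a lock."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        with self._lock:
            return next(self._it)


def squares():
    for i in count(1):
        yield i * i


shared = LockedIterator(squares())
results = []


def worker():
    # Each thread pulls 100 values from the single shared generator.
    for _ in range(100):
        results.append(next(shared))


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```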
        
       | MayeulC wrote:
       | Hmm, I encountered or used all of these somewhere, but 4 days ago
       | I learned something else: python natively supports complex
       | numbers.
       | 
       |     a=1+3j
       |     b=a+4j
       | 
       | I encountered this when a friend noticed some weird syntax for a
       | numpy meshgrid (via mgrid):
       | 
       |     np.mgrid[-1:1:5j]
        
       | version_five wrote:
       | Re unpacking with *: one I use often is when you have a list of
       | tuples of coordinates you want to plot, i.e.
       | 
       |     # z = [(x0,y0), (x1,y1) ...]
       | 
       | You can do
       | 
       |     import matplotlib.pyplot as plt
       |     plt.plot(*zip(*z))
       | 
       | I spent years doing
       | 
       |     x = [t[0] for t in z] # etc
       | 
       | before I realized this.
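       | 
       | For anyone puzzling over what zip(*z) does, a sketch without
       | matplotlib (the sample points are made up): *z spreads the pairs
       | out as separate arguments, and zip regroups them by position.

```python
z = [(0, 0), (1, 1), (2, 4), (3, 9)]

# zip(*z) is zip((0, 0), (1, 1), (2, 4), (3, 9)): it pairs up all the
# first elements, then all the second elements.
xs, ys = zip(*z)

# The list-comprehension spelling of the same split.
xs_lc = [t[0] for t in z]
ys_lc = [t[1] for t in z]
```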
        
         | jacurtis wrote:
         | I'd argue that your original approach is actually better than
         | your new approach.
         | 
         | Using a list comprehension, such as your original approach, is
         | pretty easily understood by anyone writing python and is easy
         | to follow, it is also quite terse.
         | 
         | Your recursive unpacking zip thing is much harder to understand
         | and read. This reminds me of the type of stuff you find in the
         | codebase years later when the person who wrote it is long gone
         | and you find a comment next to it that says:
         | 
         | # No idea why this works, but don't touch it
         | 
         | One of the problems I have with python is that there are a
         | million super creative ways to do stuff, especially using less
         | known parts of the language. People love to get super creative
         | with it, but usually the simplest solution is actually the best
         | one, especially when working on a team.
         | 
         | In your example above, you aren't even saving any real space.
         | Both approaches can be done inline, the list comprehension is
         | maybe a few extra characters. You're not really saving
         | anything, just making it harder to read and maintain by others.
         | 
         | When I moved from a company that wrote in Python to one that
         | wrote in Golang, I found that the restrictions that Golang
         | offers are a huge benefit in a team. Because you don't have
         | access to all these crazy language components that python has,
         | the code written in Go would be almost identical regardless of
         | who wrote it. Of course everything in Golang is far far more
         | verbose than Python, but I actually found it 100x more
         | maintainable.
         | 
         | In the python codebase it was very easy to tell who wrote
         | different parts of a codebase without looking at the git blame,
         | because there was almost a "voice" with the style of writing
         | python. But in Golang it was more restrictive which meant that
         | the entire codebase was more cohesive and easier to jump
         | around in.
        
           | version_five wrote:
           | Yes I completely agree that python has lots of "too clever
           | for its own good" ways of doing things, and that my example
           | could be seen that way.
           | 
           | Using it as a scripting language though I still like the
           | shortcut.
        
       | ziedaniel1 wrote:
       | A couple details worth noting:
       | 
       | - `repr` often outputs valid source code that evaluates to the
       | object, including in the post's example: running
       | `datetime.datetime(2023, 7, 20, 15, 30, 0, 123456)` would give
       | you a `datetime.datetime` object equivalent to `today`.
       | 
       | - Using `_` for throwaway variables is merely a convention and
       | not built into the language in any way (unlike in Haskell, say).
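       | 
       | A quick sketch of the first point, reusing the datetime value
       | from above. Note that eval is only reasonable here because the
       | string comes from our own repr call, never from untrusted input.

```python
import datetime

today = datetime.datetime(2023, 7, 20, 15, 30, 0, 123456)

# repr() gives a string that is itself a valid Python expression.
text = repr(today)

# Evaluating that expression rebuilds an equal object.
rebuilt = eval(text)
```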
        
       | zwieback wrote:
       | Great list, thanks, I'll be sure to use some of these.
       | 
       | Here's the obvious question: how many more unknown-but-useful
       | features are hidden away in other similar articles?
        
       | atxbcp wrote:
       | - none of these functionalities are "overlooked", this is pretty
       | basic python
       | 
       | - for fibonacci you have a decorator for memoization (functools
       | cache / lru_cache)
       | 
       | - you don't need to use parentheses for a single line "if"
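       | 
       | A minimal sketch of the memoization decorator, assuming a plain
       | recursive fib (this is not the article's version). lru_cache
       | with maxsize=None works on older Python 3 versions too, while
       | functools.cache needs 3.9+.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; the cache makes it linear in n."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = fib(50)
# Each distinct n from 0 to 50 is computed exactly once.
info = fib.cache_info()
```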
        
         | agumonkey wrote:
         | You consider these 'basic' python ? just curious, I'd say it's
         | a bit below intermediate.
        
           | OJFord wrote:
           | At the point we're disagreeing about 'basic' vs. 'bit below
           | intermediate'.. idk we at least have to agree how many levels
           | the model has.
           | 
           | Fwiw I also thought it was pretty regular stuff, and then
           | arcane library functions you've either needed or you haven't.
           | Also, that's a generator, not a list comprehension.
        
         | erikvdven wrote:
         | You are very much right that a lot of it is pretty basic
         | knowledge. From my experience though, a lot of python developers
         | don't take the python docs or tutorial as their first resource,
         | and quite a few developers I met lacked much of the knowledge I
         | mention in the article.
         | 
         | You are right about the fibonacci decorator, I thought I did
         | refer to another article where I mention the lru_cache as well
         | :) But I'll double check.
         | 
         | Good one about the parentheses! I'll post an update soon
        
       | barrenko wrote:
       | Use all this and you've got yourself a poor man's Ruby.
        
       | carabiner wrote:
       | Would add:
       | 
       | * For dicts, learn .setdefault() vs. .get() vs. defaultdict()
       | 
       | * .sort(key=sortingkey)
       | 
       | * itertools groupby, chain
       | 
       | * map, filter, reduce
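       | 
       | A sketch contrasting the three dict idioms from the first
       | bullet, assuming the task is grouping words by first letter (the
       | word list is made up). All three end up with the same mapping.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# .get(): read with a fallback; the dict is only changed by the assignment.
by_get = {}
for w in words:
    by_get[w[0]] = by_get.get(w[0], []) + [w]

# .setdefault(): insert the default if the key is missing, then return it.
by_setdefault = {}
for w in words:
    by_setdefault.setdefault(w[0], []).append(w)

# defaultdict: missing keys are created on first access via the factory.
by_defaultdict = defaultdict(list)
for w in words:
    by_defaultdict[w[0]].append(w)
```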
        
       | was_a_dev wrote:
       | The walrus operator isn't overlooked imo. It's more that many
       | still haven't updated to >=3.8
        
       | ivalm wrote:
       | Multiple context managers in a single with statement is something
       | I didn't know!
        
       | stabbles wrote:
       | > Python arguments are evaluated when the function definition is
       | encountered
       | 
       | This is a giant pain. Easy to miss. Sometimes forces you to deal
       | with Optional[Something] instead of just Something.
       | 
       | Compare with Julia where default arguments are evaluated ... very
       | late:
       | 
       |     julia> f(a, b, c, d = a * b * c) = d
       |     f (generic function with 2 methods)
       | 
       |     julia> f("hello", " ", "world")
       |     "hello world"
       | 
       | that's really neat.
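       | 
       | For comparison, a sketch of the Python side (function names are
       | illustrative): the default expression runs once, when the def
       | statement executes, so a default that depends on other arguments
       | has to go through an Optional sentinel.

```python
from typing import Optional

evaluations = []


def make_default():
    evaluations.append("evaluated")
    return 0


# make_default() runs here, once, when the def statement executes.
def g(x=make_default()):
    return x


g()
g()


def join_words(a: str, b: str, c: str, d: Optional[str] = None) -> str:
    # `d=a + b + c` in the signature is not possible: a, b and c do not
    # exist yet when the default would be evaluated.
    if d is None:
        d = a + b + c
    return d


joined = join_words("hello", " ", "world")
```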
        
         | progmetaldev wrote:
         | Probably one of the benefits I gained from writing JavaScript
         | before ES5 (although have worked with many languages, I've only
         | used a few that were dynamic - PHP, JS, and old VB). I write my
         | functions as early as possible, having remembered hoisting
         | rules from JavaScript (and trying to only rely on OOP with
         | Python where it naturally makes sense).
         | 
         | Looking at your Julia example, this seems much more friendly
         | and less surprising and error-prone.
        
       | IshKebab wrote:
       | "Overlooked core functionality" is an interesting way to spell
       | "massive footgun".
        
       | TrianguloY wrote:
       | first, _, last = [1, 2, 3, 4, 5]
       | 
       | I guess this is a typo, it should be
       | 
       |     first, *_, last = [1, 2, 3, 4, 5]
       | 
       | (As explained above!)
       | 
       | Other than that, nice list of python tricks, I love not-so-known
       | features because it can make code shorter and prettier!
        
         | erikvdven wrote:
         | Sharp! Updated that line. And thank you for the compliment :)
        
       | rowanseymour wrote:
       | Since Python 3.7
       | 
       |     import pdb
       |     pdb.set_trace()
       | 
       | can be written as just
       | 
       |     breakpoint()
        
         | agumonkey wrote:
         | I was told that at my job, but my fingers are so used to typing
         | `pdb` and emacs template-replacing it that I can't change.
        
         | erikvdven wrote:
         | Thanks for the tip! :)
        
         | IshKebab wrote:
         | And this also works with Debugpy so you can actually use a
         | proper debugger and not pdb which is frankly terrible.
        
       | VWWHFSfQ wrote:
       | > Python arguments are evaluated when the function definition is
       | encountered. Good to remember that!
       | 
       | I would never try to exploit this behavior to achieve some kind
       | of benefit (avoiding max recursion). Any tricks you try to do
         | with this are almost definitely going to cause bugs that are
       | very difficult to track down. So don't be too clever here.
        
         | progmetaldev wrote:
         | Before I saw your comment, I had "overlooked" that these were
         | presented as beneficial features, rather than just curiosities.
         | As someone just learning Python, but familiar with other
         | languages, I can only hope that if I start using Python in
         | production with other developers they take the most obvious
         | route (or use a comment as to why they would be relying on this
         | type of behavior).
         | 
         | I chose to learn Python because it seemed to be the easiest to
         | read, which to my mind meant working in a team would lead to
         | easier discovery and understanding. Then I see articles like
         | this, and wonder if I'll have a lot of footguns to watch out
         | for where the code isn't as clear as it seems.
        
         | hangonhn wrote:
         | Yeah. I was really surprised to see this as a feature to be
         | used rather than a gotcha. I've seen it more as gotchas, as in
         | actual bugs introduced because of this behavior, and never as a
         | feature until now. I can see why he thinks it's useful though
         | and, maybe within his specific context, it is. That said, even
         | for his example, I think he would have been better off using
         | https://docs.python.org/3/library/functools.html#functools.c...
        
       | BoppreH wrote:
       | Pretty good list. Two corrections:
       | 
       | The `first, *middle, last` trick doesn't work if your list only
       | has one element:
       | 
       |     first, *middle, last = [1]
       |     ValueError: not enough values to unpack (expected at least 2, got 1)
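       | 
       | A short sketch of the boundary cases: two elements is the
       | minimum (the starred name then gets an empty list), and the
       | one-element case can be caught as a ValueError.

```python
# Two elements is the minimum: the starred name gets an empty list.
first, *middle, last = [1, 2]
two_element_case = (first, middle, last)

# One element raises ValueError, which the caller can handle.
try:
    first, *middle, last = [1]
    message = None
except ValueError as exc:
    message = str(exc)
```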
       | 
       | And the last title has a typo:
       | 
       | > Separater for Large Numbers
        
       | m4r71n wrote:
       | I would not recommend the default arguments hack. Any decent
       | linter or IDE will flag that as an error and complain about the
       | default argument being mutable (in fact, mutable default
       | arguments are the target of many beginner-level interview
       | questions). It's much easier to decorate a function with
       | `functools.cache` to achieve the same result.
        
         | Smaug123 wrote:
         | More concretely, one of the classic Python bugs is to use `[]`
         | as a default argument and then mutate what "is obviously" a
         | local variable.
        
           | nighthawk454 wrote:
           | I think it's even more safe/preferable to use non-mutable
           | `None`s as a default and do:
           | 
           | ```
           | def myfunc(x=None):
           |     x = x if x is not None else []
           |     ...
           | ```
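           | 
           | A sketch of the bug next to the None idiom (function names
           | are made up): the buggy version shares one list across all
           | calls, while the fixed version builds a fresh one each time.

```python
def append_buggy(item, bucket=[]):
    # The same list object is reused on every call without `bucket`.
    bucket.append(item)
    return bucket


def append_fixed(item, bucket=None):
    # A fresh list is created whenever the caller does not pass one.
    bucket = bucket if bucket is not None else []
    bucket.append(item)
    return bucket


# Copy the buggy results so later calls cannot change what we recorded.
buggy_first = list(append_buggy(1))
buggy_second = list(append_buggy(2))

fixed_first = append_fixed(1)
fixed_second = append_fixed(2)
```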
        
         | TrianguloY wrote:
         | Or, if you need a "static" variable for other purposes, the
         | usual alternative is to just use a global variable, but if for
         | some reason you can't (or you don't want to) you can use the
         | function itself!
         | 
         |     def f():
         |         if not hasattr(f, "counter"):
         |             f.counter = 0
         |         f.counter += 1
         |         return f.counter
         | 
         |     print(f(),f(),f())
         | 
         |     > 1 2 3
        
           | data-ottawa wrote:
           | I didn't realize that the function was available in its own
           | scope. This information is going to help me do horrible
           | things with pandas.
        
             | kevincox wrote:
             | This is very important for self-recursion.
        
         | Spivak wrote:
         | I would hate to get an interview question where the very
         | premise of it is wrong. Python does have mutable arguments, but
         | so does Ruby.
         | 
         |     def func(arr=[])
         |       # Look ma we mutated it.
         |       arr.append 1
         |       puts arr
         |     end
         | 
         | Why calling this function a few times outputs [1], [1],...
         | instead of [1], [1, 1],... isn't because Ruby somehow made the
         | array immutable and hid it with copy-on-write or anything like
         | that. It's because Ruby, unlike Python, has default
         | _expressions_ instead of default _values_. Whenever the default
         | is needed, Ruby reevaluates the expression in the scope of the
         | function definition and assigns the result to the argument. If
         | your default expression always returned the same object you
         | would fall into the same trap as Python.
         | 
         | The sibling comment is wrong too -- it _is_ a local variable,
         | or as much one as Python can have since all variables, local or
         | not, are names.
        
         | sanderjd wrote:
         | Agreed, I found that example very confusing.
        
         | klyrs wrote:
         | functools.cache is pretty new; py3.8 is still supported for
         | another year and a bit.
        
       | mkl95 wrote:
       | Some of these features lie on the border of the uncanny valley
       | where languages like Ruby and "vanilla Javascript" live, and are
       | not compatible with the principle of least surprise or even the
       | Zen of Python. I don't write too much Python anymore, but when I
       | do I keep it simple and explicit.
        
       ___________________________________________________________________
       (page generated 2023-07-24 23:00 UTC)