• thebestaquaman@lemmy.world
    English
    arrow-up
    115
    arrow-down
    5
    ·
    7 months ago

    I write a lot of Python. I hate it when people use “X is more pythonic” as some kind of argument for what is a better solution to a problem. I also have a hang-up with people acting like Python has any form of type safety, instead of just embracing duck typing. This lands us at the following:

    The article states that “you can check a list for emptiness in two ways: if not mylist or if len(mylist) == 0”. Already here, a fundamental mistake has been made: You don’t know (and shouldn’t care) whether mylist is a list. These two checks are not different ways of doing the same thing, but two different checks altogether. The first checks whether the object is “falsy” and the second checks whether the object has a well-defined length that is zero. These are two completely different checks, which often (but far from always) overlap. Embrace duck typing: type-safe Python is a myth.
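To make the distinction concrete, here is a minimal sketch showing inputs where the two checks agree and where they diverge. The helper names `is_falsy` and `has_zero_length` are made up for illustration and do not come from the article.

```python
def is_falsy(obj) -> bool:
    """Truthiness check: works on any object."""
    return not obj


def has_zero_length(obj) -> bool:
    """Length check: only works on objects that define __len__."""
    return len(obj) == 0


# For empty built-in containers the two checks agree.
for empty in ([], {}, "", ()):
    assert is_falsy(empty) and has_zero_length(empty)

# For objects without a length they do not: None, 0 and 0.0 are falsy,
# but asking for their length raises TypeError.
results = {}
for value in (None, 0, 0.0):
    try:
        results[repr(value)] = has_zero_length(value)
    except TypeError:
        results[repr(value)] = "TypeError"

print(results)
```

Which check is appropriate depends on whether the code cares about "empty or missing" or specifically about "a sized thing with nothing in it".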

    • Avicenna@lemmy.world
      English
      arrow-up
      12
      arrow-down
      1
      ·
      edit-2
      7 months ago

      Isn’t the expected behaviour exactly identical on any object that has len defined?

      “By default, an object is considered true unless its class defines either a __bool__() method that returns False or a __len__() method that returns zero, when called with the object.”

      PS: I guess your objection is that we can’t know in advance whether the object has len defined (for example, by being a collection), so this question doesn’t really apply to your post.

      • CompassRed@discuss.tchncs.de
        English
        arrow-up
        17
        ·
        7 months ago

        It’s not the same, and you kinda answered your own question with that quote. Consider what happens when an object defines both dunder bool and dunder len. It’s possible for dunder len to return 0 while dunder bool returns True, in which case the falsiness of the instance would not depend at all on the value of len.
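A small sketch of that situation, using two hypothetical classes (`Inbox` and `Queue` are invented names): one defines both dunder methods, the other only `__len__`.

```python
class Inbox:
    """Hypothetical container that reports zero length but is still truthy."""

    def __len__(self) -> int:
        return 0

    def __bool__(self) -> bool:
        # When both are defined, truth testing uses __bool__ and never
        # consults __len__.
        return True


class Queue:
    """Hypothetical container that defines only __len__."""

    def __len__(self) -> int:
        return 0


inbox, queue = Inbox(), Queue()
print(len(inbox) == 0, not inbox)  # True False
print(len(queue) == 0, not queue)  # True True
```

So for `Queue` the two checks coincide, while for `Inbox` they give opposite answers.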

      • thebestaquaman@lemmy.world
        English
        arrow-up
        5
        ·
        7 months ago

        Exactly as you said yourself: Checking falsiness does not guarantee that the object has a length. There is considerable overlap between the two, and if it turns out that this check is a performance bottleneck (which I have a hard time imagining) it can be appropriate to check for falsiness instead of zero length. But in that case, don’t be surprised if you suddenly get an obscure bug because of some custom object not behaving the way you assumed it would.

        I guess my primary point is that we should be checking for what we actually care about, because that makes intent clear and reduces the chance for obscure bugs.

    • sugar_in_your_tea@sh.itjust.works
      English
      arrow-up
      6
      ·
      7 months ago

      type safe python is a myth

      Sure, but type hints provide a ton of value in documenting for your users what the code expects. I use type hints everywhere, and it’s fantastic! Yes, there’s no guarantee that the types are correct, but with static analysis and the assumption that your users want their code to work correctly, there’s a very high chance that the types are correct.

      That said, I lie about types all the time. For example, if my function accepts a class instance as an argument, the intention is that the code accept any class that implements the same methods as the one I’ve defined in the parameter list, and you don’t necessarily have to pass an instance of that class in (or one of its sub-classes). But I feel like putting something reasonable in there makes a lot more sense than nothing, and I can clarify in the docstring that I really just need something that looks like that object. One of these days I’ll get around to switching that to Protocol classes to reduce type errors.
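For reference, the Protocol approach mentioned above might look roughly like this. It is only a sketch: `SupportsSave`, `Document`, `Spreadsheet` and `persist` are hypothetical names, not anything from the commenter's code base.

```python
from typing import Protocol


class SupportsSave(Protocol):
    """Anything with a save() method that returns a string."""

    def save(self) -> str: ...


class Document:
    def save(self) -> str:
        return "document saved"


class Spreadsheet:  # no shared base class with Document
    def save(self) -> str:
        return "spreadsheet saved"


def persist(item: SupportsSave) -> str:
    # A static checker accepts any object whose shape matches the protocol.
    return item.save()


print(persist(Document()))
print(persist(Spreadsheet()))
```

The hint now says what the function needs (a `save()` method) instead of naming one concrete class, so the annotation no longer "lies" about the accepted types.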

      That said, I don’t type hint everything. A lot of private methods and private functions don’t have types, because they’re usually short and aren’t used outside the class/file anyway, so what’s the point?

      • thebestaquaman@lemmy.world
        English
        arrow-up
        2
        ·
        6 months ago

        Type hints are usually great, as long as they’re kept up to date and the IDE interprets them correctly. Recently I’ve had some problems with PyCharm acting up and insisting that matplotlib doesn’t accept numpy arrays, leading me to just disable the type checker altogether.

        All in all, I’m a bit divided on type hints, because I’m unsure whether I think the (huge) value added from correct type hints outweighs the frustration I’ve experienced from incorrect type hints. For now I’m leaning towards “type hints are good, as long as you never blindly trust them and only treat them as a coarse indicator of what some dev thought at some point.”

        • sugar_in_your_tea@sh.itjust.works
          English
          arrow-up
          3
          ·
          6 months ago

          leading me to just disable the type checker altogether.

          The better option is to just put # type: ignore on the statements where it gets confused, and add hints for your code. I’ve done that for SQLAlchemy before they got proper type hinting, and it worked pretty well.

          That said, a type hint is just that, a hint. It shouldn’t be relied on to be 100% accurate (i.e. lots of foo: list should actually be foo: list | None), but if you use a decent static analysis tool, you should catch the worst of it. We use pyright, which is built in to the VSCode extension pylance. It works incredibly well, though it’s a bit too strict in many cases (e.g. when things can be None but generally aren’t).

          So yeah, never blindly trust type hints, but do use them everywhere. The more hints you have, the more the static analysis can help, and disabling them on a case-by-case basis is incredibly easy. You’ll probably still get some runtime exceptions that correct type checking could have caught, but it’s a lot better than having a bunch of verbose checks everywhere that make no sense. A good companion to type checks is robust unit test cases with reasonable data (i.e. try to exercise the boundaries of what users can input).

          As it stands, we very rarely get runtime exceptions due to poor typing because our type hints are generally pretty good and our unit test cases back that up. Don’t blindly trust it, and absolutely read the docs for anything you plan to use, but as long as you are pretty consistent, you can start making some assumptions about what your data looks like.

          • thebestaquaman@lemmy.world
            English
            arrow-up
            1
            ·
            edit-2
            6 months ago

            I really do agree on all your points, so at the end of the day I think a lot comes down to use-case and personal preference.

            My primary use cases for Python are prototyping and as a frontend/scripting tool for software written in C/C++/Fortran. In those scenarios, spending significant time on type hinting and unit tests defeats the purpose of using Python (blazing fast development).

            I’ve written/worked on only one larger code base in pure Python, and my personal opinion became that I heavily prefer strictly typed languages once the code base exceeds a certain size. It just feels so much smoother to work with when I have actual guarantees that are enforced by the language.

            With that said, we were a bunch of people that are used to using Python for prototyping that developed this larger library, and it would probably have gone a lot better if we actually enforced use of proper type hinting from the start (which we were not used to).

            • sugar_in_your_tea@sh.itjust.works
              English
              arrow-up
              2
              ·
              6 months ago

              I heavily prefer strictly typed languages once the code base exceeds a certain size

              As do I, but we don’t all get to pick our stack.

              I use Rust for all my personal projects unless I have a good reason to pick something else. I like pretty much everything about it, from the lack of classes (I hate massive class hierarchies) to the borrow checker to everything being an expression. It feels like I’m getting most of the benefits of functional programming, without being tied down to FP to solve problems.

              That said, I think Python is a reasonable choice for large codebases. For simple scripts, I generally don’t bother with type hints. At my current company, our largest codebase is well over 100k lines of Python, so the type hints are absolutely welcome since they help document code I haven’t touched in over a year (if ever). If things get slow, there’s always the option of a native module. But for most things, Python is fast enough, so it’s no big deal. Because of this, I use type hints for anything that might become a larger project. After the initial POC, I’ll go through and update types, fix a bunch of linting warnings/errors, and flesh out the unit tests. That way I have something to build from when I inevitably come back to it in a year or so.

              So yeah, I definitely recommend using type hinting. The best time to add type hints is at the start of development, the next best time is now.

              • thebestaquaman@lemmy.world
                English
                arrow-up
                3
                ·
                6 months ago

                The next best time is now

                If my Easter break gets boring I might just start cleaning up that Python library… It’s the prime example of something that developed from a POC to a fully functional code base, was left largely unused for about a year, and just the past weeks has suddenly seen a lot of use again. Luckily we’re strict about good docstrings, but type hints would have been nice too.

                • sugar_in_your_tea@sh.itjust.works
                  English
                  arrow-up
                  2
                  ·
                  6 months ago

                  Woo, do it! And add some tests while you’re at it in case those don’t exist.

                  I found a few bugs just going through and cleaning up missing code coverage. Maybe you’ll find the same!

  • PattyMcB@lemmy.world
    English
    arrow-up
    63
    arrow-down
    10
    ·
    7 months ago

    I know I’m gonna get downvoted to oblivion for this, but… Serious question: why use Python if you’re concerned about performance?

    • lengau@midwest.social
      English
      arrow-up
      54
      ·
      7 months ago

      It’s all about trade-offs. Here are a few reasons why one might care about performance in their Python code:

      1. Performance is often more tied to the code than to the interpreter - an O(n³) algorithm in blazing fast C won’t necessarily perform any better than an O(n log n) algorithm in Python.
      2. Just because this particular Python code isn’t particularly performance constrained doesn’t mean you’re okay with it taking twice as long.
      3. Rewriting a large code base can be very expensive and error-prone. Converting small, very performance-sensitive parts of the code to a compiled language while keeping the bulk of the business logic in Python is often a much better value proposition.

      These are also performance benefits one can get essentially for free with linter rules.

      Anecdotally: in my final year of university I took a computational physics class. Many of my classmates wrote their simulations in C or C++. I would rotate between Matlab, Octave and Python. During one of our labs where we wrote particle simulations, I wrote and ran Octave and Python simulations in the time it took my classmates to write their C/C++ versions, and the two fastest simulations in the class were my Octave and Python ones, respectively. (The professor’s own sim came in third place). The overhead my classmates had dealing with poorly optimised code that caused constant cache misses was far greater than the interpreter overhead in my code (though at the time I don’t think I could have explained why their code was so slow compared to mine).

      • PattyMcB@lemmy.world
        English
        arrow-up
        5
        ·
        7 months ago

        I appreciate the large amount of info. Great answer. It just doesn’t make sense to me, all things being equal (including performant algorithms), why choose Python and then make a small performance tweak like in the article? I understand preferring the faster implementation, but it seems to me like waxing your car to reduce wind resistance to make it go faster, when installing a turbo-charger would be much more effective.

        • Teanut@lemmy.world
          English
          arrow-up
          16
          ·
          7 months ago

          If you use the profiler and see that the slower operation is being used frequently, and is taking up a chunk of time deemed significant, why not swap it to the faster version?

          In a simulation I’m working on that goes through 42 million rounds I spent some time profiling and going through the code that was eating up a lot of time (especially things executed all 42 million times) and trying to find some optimizations. Brought the run time down from about 10 minutes to 5 minutes.

          I certainly wasn’t going to start over in C++ or Rust, and if I’d started with either of those languages I would have missed out on a lot of really strong Python libraries and probably spent more time coding rather than refining the simulation.
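As a rough illustration of that workflow, the standard library's `cProfile` and `pstats` modules can show where the time goes. The `simulate` and `step` functions below are stand-ins invented for this sketch, not the simulation described above.

```python
import cProfile
import io
import pstats


def step(state: int) -> int:
    """One round of a toy simulation (placeholder arithmetic)."""
    return (state * 31 + 7) % 1000


def simulate(rounds: int) -> int:
    state = 0
    for _ in range(rounds):
        state = step(state)
    return state


profiler = cProfile.Profile()
profiler.enable()
final_state = simulate(100_000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 5 entries.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists call counts and time per function, which is what tells you whether a micro-optimization sits on a hot path at all. Timings vary by machine, so none are shown here.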

        • lengau@midwest.social
          English
          arrow-up
          9
          ·
          7 months ago

          I think a better analogy would be that you’re tuning your bike for better performance because the trade-offs of switching to a car are worse than keeping the bike.

        • 0ops@lemm.ee
          English
          arrow-up
          1
          ·
          6 months ago

          If anything, to me it seems more important for a slower language to be optimized. Ideally everything would be perfectly optimized, but over-optimization is a thing: making optimizations that aren’t economical. Even though C is many times faster than Python, for many projects Python is fast enough that it makes no practical difference to the user. They’re not going to bitch about a function taking 0.1 seconds to execute instead of 0.001, but they might start to care when that becomes 100 seconds vs 1. As the program becomes more time-intensive to run, the Python code is going to hit the threshold where the user starts to notice before the C code does, so economically, the Python would need to be optimized first.

      • uis@lemm.ee
        English
        arrow-up
        1
        arrow-down
        4
        ·
        edit-2
        7 months ago
        1. Performance is often more tied to the code than to the interpreter - an O(n³) algorithm in blazing fast C won’t necessarily perform any better than an O(n log n) algorithm in Python.

        An O(n³) algorithm in Python won’t necessarily perform any better than an O(n log n) algorithm in C. Ever heard of galactic algorithms?

        The overhead my classmates had dealing with poorly optimised code that caused constant cache misses was far greater than the interpreter overhead in my code (though at the time I don’t think I could have explained why their code was so slow compared to mine).

        Did they write naive linear algebra operators?

    • JustAnotherKay@lemmy.world
      English
      arrow-up
      14
      ·
      edit-2
      7 months ago

      Honestly most people use Python because it has fantastic libraries. They optimize it because the language is middling, but the libraries are gorgeous.

      ETA: This might double post because my Internet sucks right now, will fix when I have a chance

          • sugar_in_your_tea@sh.itjust.works
            English
            arrow-up
            2
            ·
            6 months ago

            Exactly! Most of the important libraries in other languages have Python support. So I can use Python and not care if they wrote the library in C++, Rust, or something more exotic, provided it works in Python.

            Python is more about gluing stuff together than writing a bunch of greenfield logic. If you run into performance problems and you’re using Python, the best solution is to start rewriting the slow parts in something faster, not to rewrite the whole thing in something different.

            That said, most of my personal projects are in Rust, for a variety of reasons. But I still use a lot of Python as well, and it’s what I use 99% of the time at work.

          • ThirdConsul@lemmy.ml
            English
            arrow-up
            1
            ·
            6 months ago

            My point is that the libraries themselves are not in Python and thus most likely not exclusive to it. This is not an attack on Python, I just find it a bit funny :)

    • pastermil@sh.itjust.works
      English
      arrow-up
      11
      ·
      edit-2
      7 months ago

      This is my two cents as someone in the industry.

      Because, while you don’t want to nitpick on each instruction cycle, sometimes the code runs millions of times and each microsecond adds up.

      Keep in mind that people use this kind of thing for work, serving real-world customers who are doing their work.

      Yes, the language itself is not optimal, even by design, but it’s easy to work with, which makes it worthwhile. There’s no shortage of people who can work with it. It is easy to develop and maintain stuff with it, cutting development cost. Yes, we’re talking real businesses with real resource constraints.

      • sugar_in_your_tea@sh.itjust.works
        English
        arrow-up
        1
        ·
        6 months ago

        Exactly. We picked it for the reasons you mentioned, and I still think it’s a good choice.

        That said, some of our heavier logic is in a lower-level language. We had some Fortran code until recently (rewrote in Python and just ate the perf cost to lower barrier to other devs fixing stuff), and we’re introducing some C++ code in the next month or two. But the bulk of our code is in Python, because that’s what glues everything together, and the code is fast enough for our needs.

        • pastermil@sh.itjust.works
          English
          arrow-up
          2
          ·
          6 months ago

          People seem to be unaware that python has bindings for lower-level languages like C. In fact, people have been heavily using resource intensive libraries implemented in C (e.g. numpy, scipy, pandas, uwsgi).

          Also, Python interpreter performance has come a long way.

    • Takapapatapaka@lemmy.world
      English
      arrow-up
      8
      ·
      7 months ago

      You may want to benefit from a little performance boost even though you mostly don’t need it and still need Python’s advantages. Being interested in performance isn’t always about looking for the very best performance available in any language; it can also mean using little tips to go a tiny bit faster when you can.

    • sugar_in_your_tea@sh.itjust.works
      English
      arrow-up
      6
      ·
      edit-2
      7 months ago

      Yes, Python is the wrong choice if performance is your top priority.

      But here’s another perspective: why leave easy performance wins on the table? Especially if the cost is simpler code that works as you probably wanted anyway with both None and []?

      Python is great if you want a really fast development cycle, because the code is generally quite simple and it’s “fast enough.” Any wins for “fast enough” are appreciated, because they delay me needing to actually look into little performance issues. It’s pretty easy for me to write a simple regex to fix this code (s/if len\((\w+)\) == 0:/if not \1:/), and my codebase will be slightly faster. That’s awesome! I could even write up a quick pylint or ruff rule to catch these cases for developers going forward (if there isn’t one already).
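The same substitution can be sketched with Python's `re` module. The `rewrite` helper is a made-up name, and the pattern is deliberately the simple one from the sed-style expression: it only matches plain names such as `items`, not attribute access like `self.items`.

```python
import re

# Same pattern as the sed-style expression: `if len(name) == 0:`.
PATTERN = re.compile(r"if len\((\w+)\) == 0:")


def rewrite(source: str) -> str:
    """Replace `if len(name) == 0:` with `if not name:`."""
    return PATTERN.sub(r"if not \1:", source)


before = "if len(items) == 0:\n    return\n"
after = rewrite(before)
print(after)
```

Keep in mind that this rewrite changes behaviour for objects whose truthiness is not tied to their length, so it is worth reviewing each hit rather than applying it blindly.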

      If I’m actively tweaking things in my Python code to get a little better performance, you’re right, I should probably just use something else (writing a native module is probably a better use of time). But the author isn’t arguing that you should do that, just that, in this case, if not foo is preferred over if len(foo) == 0 for technical reasons, and I’ll add that it makes a ton of sense for readability reasons as well.

      Here are some other simple wins:

      • [] and {} instead of list() and dict() - the literals compile to a single build instruction, whereas the calls need a name lookup plus a function call; oh, and you save a few chars
      • use list comprehensions instead of regular loops - list comprehensions seem to be faster due to not needing to call append (and less code)
      • use built-ins when you can - they’re often implemented in native code

      I consider each of those cleaner Python code anyway, because they’re less code, just as explicit, and use built-in language features instead of reinventing the wheel.
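One way to see the difference behind the first bullet is to look at the bytecode with the standard `dis` module. This is a sketch with a hypothetical helper name (`opnames`); exact instruction names differ between CPython versions, so it prints them instead of assuming a fixed listing.

```python
import dis


def opnames(expression: str) -> list[str]:
    """Return the bytecode instruction names for a single expression."""
    code = compile(expression, "<example>", "eval")
    return [instr.opname for instr in dis.get_instructions(code)]


literal_ops = opnames("[]")
call_ops = opnames("list()")

# The literal is built directly; the call looks up a name and invokes it.
print("[]     ->", literal_ops)
print("list() ->", call_ops)
```

On CPython the literal shows a `BUILD_LIST` instruction, while `list()` shows a name load followed by some form of call instruction.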

    • Randelung@lemmy.world
      English
      arrow-up
      4
      ·
      7 months ago

      It comes down to the question “Is YOUR C++ code faster than Python?” (and of course the reverse).

      I’ve built a SCADA from scratch and performance requirements are low to begin with, seeing as it’s all network bound and real world objects take time to react, but I’m finding everything is very timely.

      A colleague used SQLAlchemy for a similar task and got abysmal performance. No wonder, it’s constantly querying the DB for single results.

      • sugar_in_your_tea@sh.itjust.works
        English
        arrow-up
        4
        ·
        7 months ago

        Exactly!

        We rewrote some Fortran code (known for fast perf) into Python and the net result was faster. Why? They used bubble sort in a hot loop, whereas we used Python’s built-in sort (Timsort, which is O(n log n)). So despite Python being “slower” on average, good architecture matters a lot more.
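To illustrate the algorithmic gap (not the actual Fortran code, which isn't shown in this thread), here is a hand-written bubble sort next to the built-in `sorted`. The `bubble_sort` function and the test data are assumptions made for this sketch.

```python
import random


def bubble_sort(values: list[int]) -> list[int]:
    """O(n^2) sort, written out only for comparison."""
    result = list(values)
    n = len(result)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if result[j] > result[j + 1]:
                result[j], result[j + 1] = result[j + 1], result[j]
                swapped = True
        if not swapped:
            break  # no swaps in this pass, so the list is sorted
    return result


random.seed(0)  # fixed seed so the run is repeatable
data = [random.randint(0, 10_000) for _ in range(500)]

# Both produce the same ordering; the built-in does it in O(n log n).
assert bubble_sort(data) == sorted(data)
```

With a few hundred elements the difference is small, but the quadratic version falls behind quickly as the input grows, regardless of which language it is written in.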

        And your Python code doesn’t have to be 100% Python, you can write performance-critical code in something else, like C++ or Rust. This is very common, and it’s why popular Python libraries like numpy and scipy are written in a more performant language with a Python wrapper.

    • Reptorian@lemmy.zip
      English
      arrow-up
      2
      arrow-down
      1
      ·
      7 months ago

      I have the same question. I prefer other languages. I use G’MIC for image processing over Python and C++.

    • WolfLink@sh.itjust.works
      English
      arrow-up
      1
      ·
      edit-2
      6 months ago

      I’ve worked on a library that’s Python because the users of said library are used to Python.

      The original version of the project made heavy use of numpy, so the actual performance-sensitive code was effectively C++ and Fortran, which is what numpy is under the hood.

      We eventually replaced the performance-sensitive part of the code with Rust (and still some Fortran because BLAS), which ended up being about 10x faster.

      The outermost layer of code is still Python though.

    • sugar_in_your_tea@sh.itjust.works
      English
      arrow-up
      8
      ·
      7 months ago

      I think there’s a good chance of that:

      • -2x instead of ~2x - a human is unlikely to make that mistake
      • no space here: ==0 - there’s a space every other time it’s done, including the screenshot
      • the numbers are wrong - the screenshot has different data than the image
      • why are there three bars? A naive approach would have two.
    • gerryflap@feddit.nl
      English
      arrow-up
      5
      ·
      6 months ago

      Looks like it. It’s a complete fever dream graph. I really don’t get how someone can use an image like that. Personally I don’t really like AI art anyways, but I could somewhat understand it as a sort of “filler” image to make your article a bit more interesting. But a graph that is supposed to convey actual information? No idea why anyone would AI gen that without checking.

  • Avicenna@lemmy.world
    English
    arrow-up
    35
    arrow-down
    7
    ·
    edit-2
    7 months ago

    Yea and then you use “not” with a variable name that does not make it obvious that it is a list, and another person who reads the code thinks it is a bool. Hell, a couple of months later you yourself won’t even understand that it is a list. Moreover, “not” will not throw an error if you don’t use a sequence/collection there as you should, but len will.

    You should not sacrifice code readability and safety for over-optimization. This is Python after all; I don’t think list lengths will be your bottleneck.

      • taladar@sh.itjust.works
        English
        arrow-up
        8
        arrow-down
        1
        ·
        7 months ago

        It does if you are used to sane languages instead of the implicit conversion nonsense C and the “dynamic” languages are doing

      • Avicenna@lemmy.world
        English
        arrow-up
        8
        arrow-down
        2
        ·
        7 months ago

        Well, it does not imply it directly per se, since you can “not” many things, but I feel like my first assumption would be that it is used in a bool context.

        • thebestaquaman@lemmy.world
          English
          arrow-up
          8
          ·
          7 months ago

          I would say it depends heavily on the language. In Python, it’s very common that different objects have some kind of Boolean interpretation, so assuming that an object is a bool because it is used in a Boolean context is a bit silly.

          • Avicenna@lemmy.world
            English
            arrow-up
            4
            ·
            edit-2
            7 months ago

            Well, fair enough, but I still like the fact that len makes the aim and the object more transparent on a quick look through the code, which is what I am trying to get at. The supporting argument on bools wasn’t very on point, I agree.

            That being said, is there an application of “not” on other classes which cannot be replaced by some other, more transparent operator (I confess I only know the bool and length contexts)? I would rather have transparently named operators than have to remember what “not” does on ten different types. I like duck typing as much as the next person, but when it is as opaque (name-wise) as in the case of “not”, I prefer alternatives.

            For instance, open or read on different objects really does open or read some data, whereas with “not” on some object, god knows what it does; I have to memorise each case.

            • Jerkface (any/all)@lemmy.ca
              English
              arrow-up
              5
              ·
              edit-2
              7 months ago

              Truthiness is so fundamental that, in most languages, all values have a truthiness, whether they are bool or not. Even in C, int x = value(); if (!x) x_is_zero(); is valid and idiomatic.

              I appreciate the point that calling a method gives more context cues and potentially aids readability, but in this case I feel like not is the python idiom people expect and reads just fine.

              • Avicenna@lemmy.world
                English
                arrow-up
                2
                ·
                7 months ago

                I don’t know, it throws me off but perhaps because I always use len in this context. Is there any generally applicable practical reason why one would prefer “not” over len? Is it just compactness and being pythonic?

                • Jerkface (any/all)@lemmy.ca
                  English
                  arrow-up
                  1
                  ·
                  edit-2
                  7 months ago

                  It’s very convenient not to have to remember a bunch of different means/methods for performing the same conceptual operation. You might call len(x) == 0 on a list, but next time it’s a dict. Time after that it’s a complex number. The next time it’s an instance. not works in all cases.
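A quick sketch of that point: `not` gives an answer for every object, while `len()` only works for objects that define `__len__`. The `Account` class and the `supports_len` helper are hypothetical, added just for this illustration.

```python
class Account:
    """Hypothetical class with its own idea of emptiness."""

    def __init__(self, balance: float) -> None:
        self.balance = balance

    def __bool__(self) -> bool:
        return self.balance != 0


def supports_len(obj) -> bool:
    """Report whether len() can be called on obj at all."""
    try:
        len(obj)
    except TypeError:
        return False
    return True


samples = [[], {}, "", 0, 0j, None, Account(0.0)]

# `not` gives an answer for every one of these objects.
negated = [not s for s in samples]

# len() only answers for the list, the dict and the string.
sized = [supports_len(s) for s in samples]

print(negated)
print(sized)
```

Whether that uniformity is a convenience or a source of ambiguity is exactly what this subthread is debating.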

            • thebestaquaman@lemmy.world
              English
              arrow-up
              1
              ·
              7 months ago

              I definitely agree that len is the preferred choice for checking the emptiness of an object, for the reasons you mention. I’m just pointing out that assuming a variable is a bool because it’s used in a Boolean context is a bit silly, especially in Python or other languages where any object can have a truthiness value, and where this is commonly utilised.

              • Avicenna@lemmy.world
                English
                arrow-up
                1
                ·
                7 months ago

                It is not “assume” as in a conscious “this is probably a bool I will assume so” but more like a slip of attention by someone who is more used to the bool context of not. Is “not integer” or “not list” really that commonly used that it is even comparable to its usage in bool context?

                • thebestaquaman@lemmy.world
                  English
                  arrow-up
                  1
                  ·
                  edit-2
                  7 months ago

                  Then I absolutely understand you :)

                  How common it is depends 100 % on the code base and what practices are preferred. In Python code bases where I have a say in decisions, all Boolean checks should be x is True or x is False if x should be a Boolean. In that sense, if I read if x or if not x, it’s an indicator that x does not need to be a Boolean.

                  In that sense, I could say that my preference is to flip it (in Python): Explicitly indicate/check for a Boolean if you expect/need a Boolean, otherwise use a “truthiness” check.
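A minimal sketch of that convention, with invented function names (`should_retry`, `has_content`): an identity check when a real Boolean is required, a truthiness check otherwise.

```python
def should_retry(flag) -> bool:
    """Identity check: only the bool True passes, not merely truthy values."""
    return flag is True


def has_content(value) -> bool:
    """Truthiness check: any non-empty / non-zero object passes."""
    return bool(value)


print(should_retry(True), should_retry(1), should_retry([0]))  # True False False
print(has_content(True), has_content(1), has_content([0]))     # True True True
```

Reading `flag is True` then signals "this must be a bool", while a bare `if value:` signals "anything with a truth value is fine".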

          • Glitchvid@lemmy.world
            English
            arrow-up
            2
            ·
            7 months ago

            if not x then ... end is very common in Lua for similar purposes, very rarely do you see hard nil comparisons or calls to typeof (last time I did was for a serializer).

      • acosmichippo@lemmy.world
        English
        arrow-up
        2
        arrow-down
        1
        ·
        edit-2
        7 months ago

        i haven’t programmed since college 15 years ago and even i know that 0 == false for non bool variables. what kind of professional programmers wouldn’t know that?

      • jj4211@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        ·
        6 months ago

        In context, one can consider it a bool.

        Besides, I see C code all the time that treats pointers as bool for the purposes of an if statement. !pointer is very common, and no one thinks that means a pointer is exclusively a Boolean concept.

      • JustAnotherKay@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        ·
        7 months ago

        Doesn’t matter what it implies. The entire purpose of programming is to make it so a human doesn’t have to go do something manually.

        not x tells me I need to go manually check what type x is in Python.

        len(x) == 0 tells me that it’s being type-checked automatically

        • sugar_in_your_tea@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          1
          ·
          7 months ago

          That’s just not true:

          • not x - has an empty value (None, False, [], {}, etc)
          • len(x) == 0 - has a length (list, dict, tuple, etc, or even a custom type implementing __len__)

          You can probably assume it’s iterable, but that’s about it.

          But why assume? You can easily just document the type with a type-hint:

          def do_work(foo: list | None):
              if not foo:
                  return
              ...
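
          To make the difference between the two checks concrete, here is a small sketch. The helper name classify is made up for illustration, and the try/except is only there so the demo can report values that have no length.

```python
def classify(x):
    """Return (is_falsy, has_zero_length); the second is None if len() fails."""
    try:
        zero_length = len(x) == 0
    except TypeError:  # None, ints and bools have no len()
        zero_length = None
    return (not x, zero_length)


for value in (None, False, 0, [], {}, "", [0], "a"):
    print(repr(value), classify(value))
```

          None, False and 0 are all falsy but have no length at all, which is why the two checks are not interchangeable.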
          
      • sugar_in_your_tea@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        1
        ·
        6 months ago

        Maybe, but that serves as a very valuable teaching opportunity about what the concept of “empty” is in Python. It’s pretty intuitive IMO, and it can make a lot of things more clear once you understand that.

        That said, larger projects should be using type hints everywhere, and that should make the intention here painfully obvious:

        def do_work(foo: list | None):
            if not foo:
                ...  # handle empty list (or None)
            ...
        

        That’s obviously not a boolean, but it’s being treated as one. If the meaning there isn’t obvious, then look it up/ask someone about Python semantics.

        I’m generally not a fan of learning a ton of jargon/big frameworks to get the benefits of more productivity (e.g. many design patterns are a bit obtuse IMO), but learning language semantics that are used pretty much everywhere seems pretty reasonable to me. And it’s a lot nicer than doing something like this everywhere:

        if foo is None or len(foo) == 0:
        
    • sugar_in_your_tea@sh.itjust.works
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      1
      ·
      edit-2
      7 months ago

      That’s why we use type-hinting at my company:

      def do_work(foo: list | None):
          if not foo:
              return
          ...
      

      Boom, self-documenting, faster, and very simple.

      len(foo) == 0 also doesn’t imply it’s a list; it could be a dict or any other type that implements __len__. That matters a lot in most cases, so I highly recommend using type hints instead of relying on assumptions like “len(foo) == 0 is probably a list operation”.
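
      As a rough illustration of that point, the sketch below runs the length check over several empty objects. The Inbox class is a hypothetical example type, not something from a real library.

```python
class Inbox:
    """Hypothetical container that only implements __len__."""

    def __init__(self, messages=()):
        self._messages = list(messages)

    def __len__(self):
        return len(self._messages)


candidates = [[], {}, (), "", set(), range(0), Inbox()]
# Every one of these passes the length check, and only one is a list.
print([type(c).__name__ for c in candidates if len(c) == 0])
```

      The check succeeds for all seven objects, so it tells the reader nothing about which type was intended; a type hint does.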

      • LegoBrickOnFire@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        ·
        edit-2
        6 months ago

        Well, in your case it is not clear whether you intended to branch on the variable foo being None, or on the list being empty, which is semantically very different…

        That’s why it’s better to explicitly express whether you want an empty collection (len = 0) or a None value.

        • sugar_in_your_tea@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          2
          ·
          6 months ago

          Well yeah, because I’m explicitly not defining a difference between None and []. In most cases, the difference doesn’t matter.

          If I did want to differentiate, I’d use another if block:

          if foo is None:
              ...
          if not foo:
              ...
          

          Explicit is better than implicit. I hate relying on exceptions like len(foo) == 0 raising a TypeError because that’s very much not explicit.

          Exceptions should be for exceptional cases, as in, things that aren’t expected. If it is expected, make an explicit check for it.

          • LegoBrickOnFire@lemmy.world
            link
            fedilink
            English
            arrow-up
            1
            ·
            6 months ago

            I don’t really understand the point about exceptions. Yeah, “not foo” cannot raise an exception. But the program should crash if an invalid input is provided. If the function expects an Optional[list], it should be provided with either a list or None, nothing else.

            • sugar_in_your_tea@sh.itjust.works
              link
              fedilink
              English
              arrow-up
              1
              ·
              edit-2
              6 months ago

              Sure. But is None invalid input in your case, whereas [] is valid? If so, make that check explicit, don’t rely on an implicit check that len(...) does.

              When I see TypeError in the logs, I assume the developer screwed up. When I see ValueError in the logs, I assume the user screwed up. Ideally, TypeError should never happen, and every case where it could happen should transform it to another type of exception that indicates where the error actually lies.

              The only exceptions I want to see in my code are:

              • exceptions from libraries, such as databases and whatnot, when I do something invalid
              • explicitly raised exceptions

              Implicit ones like accessing attributes on None or calling methods that don’t exist shouldn’t be happening in production code.
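
              A minimal sketch of what I mean by making the checks explicit, assuming a hypothetical function total_length that expects a list of strings:

```python
def total_length(items):
    """Sum the lengths of the strings in `items` (hypothetical example)."""
    # Explicit checks up front: bad input becomes a ValueError with a clear
    # message instead of a TypeError leaking out of len() further down.
    if items is None:
        raise ValueError("items must not be None")
    if not isinstance(items, list):
        raise ValueError(f"items must be a list, got {type(items).__name__}")
    return sum(len(item) for item in items)


print(total_length(["ab", "c"]))  # 3
print(total_length([]))           # 0
try:
    total_length(None)
except ValueError as exc:
    print("rejected:", exc)
```

              Whether ValueError is the right exception here is a judgment call for the project; the point is only that the failure is raised deliberately, with a message that says where the error lies.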

              • LegoBrickOnFire@lemmy.world
                link
                fedilink
                English
                arrow-up
                1
                ·
                6 months ago

                I agree. So if None is a valid input we should check it first, and then check if the length is zero. In this situation, we see a type error only if the programmer screwed up and everything is explicit

                • sugar_in_your_tea@sh.itjust.works
                  link
                  fedilink
                  English
                  arrow-up
                  1
                  ·
                  edit-2
                  6 months ago

                  Yes. If None is just as valid and has the same meaning as [] for the function (true more often than not), just do if not foo. If None should be handled separately from [] for some reason, treat them both separately so it’s absolutely clear.

                  Explicit is better than implicit.
                  Errors should never pass silently.

                  And I especially like this one:

                  There should be one-- and preferably only one --obvious way to do it.

                  The one obvious way to check if you have data is if foo. That works for pretty much everything as you’d expect. Explicitly deviating from that is a cue to the reader that they should pay attention. In this case, that means None is semantically different than empty data, and that’s something the reader should be aware of because that’s usually not the case.

                  Edit: Oops, horrendous copy buffer issue from another thread. Read stuff before you post kids, don’t be like me. 😆

    • LegoBrickOnFire@lemmy.world
      link
      fedilink
      English
      arrow-up
      1
      ·
      6 months ago

      I really dislike using boolean operators on anything that is not a boolean. I recently made an exception to my rule and got punished… Yeah, it is a skill issue on my part that I tried to check that a variable equal to 0 was not None using “if variable…”. But many programming rules are there to avoid bugs caused by this kind of inattention.
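
      The bug looks roughly like this. The function format_offset is a made-up example, not my actual code:

```python
def format_offset(offset=None):
    """Hypothetical example: `offset` may be None (not set) or a number."""
    if offset:  # buggy: treats 0 like "not set"
        buggy = f"offset={offset}"
    else:
        buggy = "no offset"

    if offset is not None:  # explicit: only None means "not set"
        explicit = f"offset={offset}"
    else:
        explicit = "no offset"
    return buggy, explicit


print(format_offset(5))     # ('offset=5', 'offset=5')
print(format_offset(0))     # ('no offset', 'offset=0')
print(format_offset(None))  # ('no offset', 'no offset')
```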

    • Artyom@lemm.ee
      link
      fedilink
      English
      arrow-up
      1
      ·
      7 months ago

      In my experience, if you didn’t write the function that creates the list, there’s a solid chance it could be None too, and if you try to check the length of None, you get an error. This is also why returning None when a function fails is bad practice IMO, but that doesn’t seem to stop my coworkers.

      • Avicenna@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        ·
        edit-2
        7 months ago

        Good point. I try to initialize None collections to empty collections at the beginning, but that is not always guaranteed, and len would catch it.

        • sugar_in_your_tea@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          1
          ·
          6 months ago

          Sometimes there’s an important difference between None and []. That’s by far not the most common use, but it does exist (e.g. None could mean “user didn’t supply any data” and [] could mean “user explicitly supplied empty data”).

          If the distinction matters, make it explicit:

          if foo is None:
              raise ValueError("foo must be defined for this operation")
          if not foo:
              return None
          
          for bar in foo:
              ...
          
          return some_other_value
          

          This way you’re explicit about what constitutes an error vs no data, and the caller can differentiate as well. In most cases, though, you don’t need that first check: if not foo can probably just return None or use some default value or whatever, and whether it’s None or [] doesn’t matter.

          if len(foo) == 0: is bad for a few reasons:

          • TypeError will be raised if it’s None, which is probably unexpected
          • it’s slower
          • it’s longer

          If you don’t care about the distinction, handle both the same way. If you do care, handle them separately.
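
          If you want to check the “slower” claim yourself, a rough sketch with the standard timeit module could look like this. The variable names are arbitrary, and the absolute numbers depend on the machine and the Python version, so only compare them relative to each other.

```python
import timeit

mylist = []

# Time each spelling of the emptiness check on the same empty list.
truthiness = timeit.timeit(
    "not mylist", globals={"mylist": mylist}, number=200_000
)
length = timeit.timeit(
    "len(mylist) == 0", globals={"mylist": mylist}, number=200_000
)

print(f"not mylist:       {truthiness:.4f} s for 200,000 checks")
print(f"len(mylist) == 0: {length:.4f} s for 200,000 checks")
```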

      • chunkystyles@sopuli.xyz
        link
        fedilink
        English
        arrow-up
        2
        ·
        7 months ago

        Comments shouldn’t explain code. Code should explain code by being readable.

        Comments are for whys. Why is the code doing the things it’s doing. Why is the code doing this strange thing here. Why does a thing need to be in this order. Why do I need to store this value here.

        Stuff like that.

      • thebestaquaman@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        arrow-down
        1
        ·
        7 months ago

        There is no guarantee that the comment is kept up to date with the code. “Self documenting code” is a meme, but clearly written code is pretty much always preferable to unclear code with a comment, largely because you can actually be sure that the code does what it says it does.

        Note: You still need to comment your code kids.

      • Avicenna@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        arrow-down
        1
        ·
        7 months ago

        If there is an alternative through which I can achieve the same intended effect and which is a bit safer (because it will verify that len is implemented), I would prefer that to commenting. Also, having to comment every use of not that stands in for a len check sounds quite redundant, as len checks are very common.

  • uis@lemm.ee
    link
    fedilink
    English
    arrow-up
    16
    ·
    7 months ago

    There are decades of articles on C++ optimizations that say “use empty() instead of size()”, which is the same as here.

    • dreugeworst@lemmy.ml
      link
      fedilink
      English
      arrow-up
      5
      ·
      7 months ago

      Except for C++ it was just to avoid a single function call, not extra indirection. Also, on modern compilers size() will get inlined and the final instructions generated by the compiler will likely be the same.

    • sugar_in_your_tea@sh.itjust.works
      link
      fedilink
      English
      arrow-up
      1
      ·
      edit-2
      6 months ago

      Oh, there are plenty of other terrible ways:

      for _ in mylist:
          break
      else:
          # whatever you'd do if mylist was empty
          ...

      if not any(True for _ in mylist):
          ...

      try:
          def do_raise(): raise ValueError

          _ = [do_raise() for _ in mylist]
      except ValueError:
          pass
      else:
          # whatever you'd do if mylist was empty
          ...
      

      I could probably come up with a few others as well.

      Please note that none of these handles the TypeError if mylist is None.

  • antlion@lemmy.dbzer0.com
    link
    fedilink
    English
    arrow-up
    8
    arrow-down
    1
    ·
    7 months ago

    Could also compare against:

    if not len(mylist):
    

    That way this version isn’t evaluating two functions. The bool evaluation of an integer is false when zero, otherwise true.
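
    A quick sketch to confirm that the two spellings agree for lists (the sample lists are arbitrary):

```python
for mylist in ([], [1, 2, 3]):
    n = len(mylist)
    # bool(n) is False only for 0, so `not len(mylist)` matches `len(mylist) == 0`.
    print(n, bool(n), not len(mylist), len(mylist) == 0)
```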

      • antlion@lemmy.dbzer0.com
        link
        fedilink
        English
        arrow-up
        1
        ·
        7 months ago

        But the first example does the same thing for an empty list. I guess the lesson is that if you’re measuring the speed of arbitrary stylistic syntax choices, maybe Python isn’t the best language for you.

        • FooBarrington@lemmy.world
          link
          fedilink
          English
          arrow-up
          1
          arrow-down
          1
          ·
          edit-2
          7 months ago

          Yes, the first example does the same thing, but there’s still less to mentally parse. Ideally you should just use if len(mylist) == 0:.

    • sugar_in_your_tea@sh.itjust.works
      link
      fedilink
      English
      arrow-up
      2
      ·
      7 months ago

      That’s worse. IMO, solve this problem with two things:

      • type hint mylist as list | None or just list
      • use if not mylist:

      The first documents intent and gives your static analysis tools some context to check for type consistency/compatibility, and the second shows that None vs empty isn’t an important distinction here.

    • iknowitwheniseeit@lemmynsfw.com
      link
      fedilink
      English
      arrow-up
      3
      ·
      7 months ago

      You’d need to explicitly check for None if using the len() construct as well, so this doesn’t change the point of the article.

      • gigachad@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        6
        ·
        edit-2
        7 months ago

        But None has no len

        if not foo:  
        

        -> foo could be an empty list or None, it is ambiguous.

        len(foo) will lead to an exception TypeError if foo is None, I can cleanly catch that.

        if not foo suggests I am dealing with a boolean when that is not the case. Explicit is better than implicit, and using if not foo to check for an empty list may be pythonic, but it’s still implicit af

        • iknowitwheniseeit@lemmynsfw.com
          link
          fedilink
          English
          arrow-up
          2
          ·
          7 months ago

          My point is that if your variable can be None then you need the same pattern for the length check.

          So for the Pythonic version:

          if (foo is not None) and not foo:
              ...
          

          For the explicit length check:

          if (foo is not None) and (len(foo) == 0):
              ...
          

          Honestly you’re probably better off using type hints and catching such things with static checks and not adding the None check.

          • gigachad@sh.itjust.works
            link
            fedilink
            English
            arrow-up
            3
            ·
            edit-2
            7 months ago

            This is what I would come up with:

            try:
                if len(foo) == 0:
                    ...
            except TypeError:
                ...
            

            There is no need to add a None check, as foo being None should be considered faulty input. Avoiding the possibility of foo being None from the beginning using static checks or testing is of course the preferred solution. But in reality we do not work in such optimal environments, at least I can say that from the perspective of data science, where often procedural, untested code is produced that runs only a few times. But I get your point and I think both paths are viable, but I am also okay with being in the wrong here.

            • sugar_in_your_tea@sh.itjust.works
              link
              fedilink
              English
              arrow-up
              1
              ·
              6 months ago

              That’s terrible, and I would block that PR in a heartbeat, unless there was a very good reason for it (given context). I would instead prefer:

              if foo is None:
                  ...
              

              Exceptions are useful for bubbling up errors, they’re a massive code smell if you’re catching something thrown by local logic. Just like you shouldn’t catch IndexError right after indexing a list, you shouldn’t catch TypeError right after checking the length. If you need to check parameters, check them at the start of your function and return early.

                • sugar_in_your_tea@sh.itjust.works
                  link
                  fedilink
                  English
                  arrow-up
                  1
                  ·
                  edit-2
                  6 months ago

                  Rejecting a PR shouldn’t be offensive, it should be a learning opportunity, both for the reviewer and the submitter. If I reject it, I’ll give a clear reason why, and suggestions on how to fix it. I’ll also engage in conversation if you’re not clear on why I made a given comment, as well as a defense for why your code should be accepted as-is (i.e. that context I’m talking about).

                  So please bother me with terrible, terrible code. I want to take time out of my day to help contributors learn, and I like pointing out areas where I learn something as well (like, “hey, this is really clever and also really easy to read, good job!”). I’m not always right, but I do have a lot of experience that I think others could benefit from. I know I was deeply appreciative of constructive criticism as a new dev, and I hope that’s true for the people I provide reviews for.

        • mint_tamas@lemmy.world
          link
          fedilink
          English
          arrow-up
          2
          ·
          7 months ago

          Apart from the quote from the zen of python, does this really make your code better though? You will end up writing 4-5 lines with an extra level of indentation. The code does the same, but has worse performance and communicates the intent poorly (compared to the “pythonic” version).

          • gigachad@sh.itjust.works
            link
            fedilink
            English
            arrow-up
            1
            ·
            7 months ago

            I am not saying it’s better, just that I don’t like the proposed way :) I would argue that being “pythonic” has even less value than the Zen, which I quoted because it’s true, not because it is some strict rule (which it isn’t anyway).

            You could argue I also need to write that extra code for the if not case, as I explicitly have to check if it is None if my program somewhere further down expects only lists.

            Hunting for those sweet milliseconds is a popular game in the Python community ;) if this mechanism is that important for your program, you should definitely use it, I would do as well!

            • mint_tamas@lemmy.world
              link
              fedilink
              English
              arrow-up
              2
              ·
              6 months ago

              I think pythonic is more important than performance and I would still choose that version over a try-catch block, were it slower. Being pythonic means it represents a commonly understood pattern in Python code, therefore it is more efficient in communicating intent.

              • sugar_in_your_tea@sh.itjust.works
                link
                fedilink
                English
                arrow-up
                2
                ·
                6 months ago

                Exactly. The point of following a code style is to make obvious patterns easy to spot and deviations stand out. That’s why code style guidelines say your priorities should be:

                1. follow whatever style the code around it uses
                2. follow project style guidelines
                3. do the technically optimal option

                3 should only be prioritized if the win is big enough, and there should probably be a comment right there explaining why the deviation was made.

        • sugar_in_your_tea@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          2
          ·
          7 months ago

          it’s still implicit

          I don’t see it that way. If you’re doing if len(foo) == 0, you’re implying that foo is expected to not be None, and expecting an exception should not be the default assumption, because exceptions should be… exceptional.

          Here’s what I assume:

          • if foo is not None - empty values are explicitly acceptable
          • if not foo - the difference between an empty and None value isn’t important
          • if len(foo) == 0 - implicit assumption that foo is not None (I frequently forget that len(...) raises on None)

          If an exception was intended by the last bullet point, I prefer an explicit raise:

          if foo is None:
              raise ValueError("foo may not be None")
          

          I actually use schema validation to enforce this at the edge so the rest of my code can make reasonable assumptions, and I’m explicit about whether each field may or may not be None.

  • palordrolap@fedia.io
    link
    fedilink
    arrow-up
    5
    ·
    7 months ago

    As a Perl fossil I recognise this syntax as equivalent to if(not @myarray) which does the same thing. And here I was thinking Guido had deliberately aimed to avoid Perlisms in Python.

    That said, the Perlism in question is the right* way to do it in Perl. The length operator does not do the expected thing on an array variable. (You get the length of the stringified length of the array. And a warning if those are enabled.)

    * You can start a fight with modern Perl hackers with whether unless(@myarray) is better or just plain wrong, even if it works and is equivalent.

      • palordrolap@fedia.io
        link
        fedilink
        arrow-up
        1
        ·
        6 months ago

        Well, you see, Perl’s length is only for strings and if you want the length of an array, you use @arrayname itself in scalar context.

        Now, length happens to provide scalar context to its right hand side, so @arrayname already returns the required length. Unfortunately, at that point it hasn’t been processed by length yet, and length requires a string. And so, the length of the array is coerced to be a string and then the length of that string is returned.

        A case of “don’t order fries if your meal already comes with them or you’ll end up with too many fries”.

        • sugar_in_your_tea@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          1
          ·
          6 months ago

          the length of the array is coerced to be a string and then the length of that string is returned.

          It’s the string coercion that I have a problem with. I’d much rather have an error than have things silently be coerced to different types.

          • palordrolap@fedia.io
            link
            fedilink
            arrow-up
            2
            ·
            6 months ago

            Perl was originally designed to carry on regardless, and that remains its blessing and curse, a bit like JavaScript which came later.

            Unlike JavaScript, if you really want it to throw a warning or even bail out completely at compiling such constructs (at least some of the time, like this one) it’s pretty easy to turn that on rather than resort to an entirely different language.

            use warnings; at the top of a program and it will punt a warning to STDERR as it carries merrily along.

            Make that use warnings FATAL => "syntax"; and things that are technically valid but semantically weird like this will throw the error early and also prevent the program from running in the first place.

    • tiredofsametab@fedia.io
      link
      fedilink
      arrow-up
      1
      ·
      7 months ago

      I really liked unless in perl; especially as I get older !length or something makes that bang really easy to miss. I use !(length) or something instead to visually set it aside. unless made this much more visually clear.

    • Womble@lemmy.world
      link
      fedilink
      English
      arrow-up
      1
      ·
      7 months ago

      Empty sequences being false goes back a lot further than Perl; it was already a thing in the first Lisp (in fact, the empty list was the canonical false).

  • Harvey656@lemmy.world
    link
    fedilink
    English
    arrow-up
    3
    ·
    6 months ago

    I could have tripped, knocked over my keyboard, cried for 13 straight minutes on the floor, picked my keyboard back up, accidentally hit the enter key making a graph and it would have made more sense than this thing.

    -2x faster. What does that even mean?

  • AnUnusualRelic@lemmy.world
    link
    fedilink
    English
    arrow-up
    2
    ·
    edit-2
    6 months ago

    From that little image, they’re happy it takes a tenth of a fucking second to check if a list is empty?

    What kind of dorito chip is that code even running on?

  • Archr@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    6 months ago

    I haven’t read the article. But I’d assume this is for the same reason that not not string is faster than bool(string). Which is to say that it has to do with having to look up a global function rather than a known keyword.
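
    A rough way to see the name lookup I mean, as a sketch for CPython (the function names via_bool and via_not are made up for the demo):

```python
import dis


def via_bool(s):
    return bool(s)  # has to look up the name `bool` at call time


def via_not(s):
    return not not s  # uses only operators, no name lookup


# co_names lists the global and attribute names a function's bytecode refers to.
print(via_bool.__code__.co_names)  # ('bool',)
print(via_not.__code__.co_names)   # ()

# Full disassembly; the exact opcodes vary between CPython versions.
dis.dis(via_bool)
dis.dis(via_not)
```

    This only shows that bool(s) involves a name lookup plus a call while not not s does not; whether that is the reason for the article's numbers is still my assumption.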

  • borokov@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    arrow-down
    7
    ·
    7 months ago

    Isn’t it because list is a linked list, so to get the len it has to iterate over the whole list, whereas to get emptiness it just has to check if there is a 1st element?

    I’m too lazy to read the article BTW.

    • dreugeworst@lemmy.ml
      link
      fedilink
      English
      arrow-up
      9
      arrow-down
      1
      ·
      7 months ago

      why comment if you don’t even want to read the article? python lists are not linked lists, they’re contiguous with a smart growth strategy.

      • sugar_in_your_tea@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        3
        ·
        edit-2
        7 months ago

        Like in most reasonable languages. Linked lists would be a terrible implementation for a list where grabbing arbitrary indices is explicitly supported.

        And even then, many linked list implementations maintain an updated size or length because checking that is a pretty common operation. So even if that is the implementation, it would still be fast because len(list) is a very common operation so they’d definitely optimize it.
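
        A minimal sketch of that idea, using a hypothetical singly linked list (this is not how Python's built-in list works, it is only meant to show the cached size):

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


class LinkedList:
    """Hypothetical singly linked list that caches its size."""

    def __init__(self):
        self.first = None
        self._size = 0

    def push_front(self, value):
        self.first = Node(value, self.first)
        self._size += 1  # keep the cached size in sync on every insert

    def __len__(self):
        return self._size  # O(1): no traversal needed

    def __bool__(self):
        return self.first is not None  # O(1) emptiness check


items = LinkedList()
print(len(items), bool(items))  # 0 False
items.push_front("a")
items.push_front("b")
print(len(items), bool(items))  # 2 True
```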

      • borokov@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        arrow-down
        1
        ·
        6 months ago

        I comment because this is how a social network works, and this is how you keep Lemmy alive. My comment has generated a dozen other comments, so it achieved its goal.

        There is not a single question that hasn’t already been answered on the internet, so there is no point in asking anything on social platforms except for the sake of interacting with other people.

        Lemmy is not stackoverflow 😉

        • dreugeworst@lemmy.ml
          link
          fedilink
          English
          arrow-up
          2
          ·
          6 months ago

          If the point of Lemmy is just to generate as many comments as possible with everyone just assuming whatever they want about linked articles without reading them I’ll quickly leave again. I’m here for informed discussion, not for a competition in generating engagement

    • riodoro1@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      arrow-down
      1
      ·
      edit-2
      7 months ago

      So… it has to iterate over the whole empty list is what you’re saying? like once for every of the zero items in the list?

      • borokov@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        ·
        7 months ago

        I don’t know how lists are implemented in Python. But in a dumb linked list implementation (like C++ std::list), each element has a “next” member that points to the next element. So, to get the list length, you have to do (pseudo code, not actual Python code):

        len = 0
        elt = list.first
        while exist(elt):
            elt = elt.next
            len++
        return len
        

        Whereas to test if list is empty, you just have to:

        return not exist(list.first)