[HN Gopher] Pysa: An open source tool to detect and prevent secu...
       ___________________________________________________________________
        
       Pysa: An open source tool to detect and prevent security issues in
       Python code
        
       Author : jimarcey
       Score  : 159 points
       Date   : 2020-08-07 16:07 UTC (6 hours ago)
        
 (HTM) web link (engineering.fb.com)
 (TXT) w3m dump (engineering.fb.com)
        
       | sinancepel wrote:
       | One of the authors of the blog post and software engineer working
       | on Pysa here - happy to answer any questions you may have :)
        
         | devy wrote:
         | Hi,
         | 
         | I have 2 questions:
         | 
         | 1. The installation doc specifies `watchman` as a dependency.
         | Why is that used? Would Pysa not work without watchman?
         | 
         | 2. Also, why can't Pysa become a separate standalone tool
         | instead of living in the Pyre GitHub repo?
        
           | sinancepel wrote:
           | 1. Pysa should work without watchman - it shares some code
           | and infrastructure with Pyre, but doesn't need Watchman to
           | complete its analysis.
           | 
           | 2. Hopefully the answer to (1) helps here. Pysa shares some
           | code with Pyre, including the parallelization infrastructure
           | - the same infrastructure that makes Pyre fast interactively
           | makes Pysa fast on large codebases. Living on the Pyre GitHub
           | repo allows Pysa to use the parallelization infra, in
           | addition to the type checking APIs of Pyre as necessary.
           | 
           | See also our original post introducing Pyre - our goal from
           | the outset was to build a platform for deeper static
           | analyses: https://www.facebook.com/notes/protect-the-
           | graph/pyre-fast-t...
        
         | therealcamino wrote:
         | This is very interesting work. I've been looking for something
         | exactly like this to use on a large C application --
         | specifically to be able to annotate various APIs as sources of
         | different kinds of data, checks on how the data types are
         | permitted to be used together, and operations that transform
         | one kind to another. Compared to taint analysis we want to
         | allow more categories than tainted/untainted, and transforming
         | items between categories. Do you have any recommendations for
         | similar tools that work with C?
        
         | cubes wrote:
         | How fast does Pysa typically run? If I want to run it as part
         | of my CI system, how much additional time might I expect it to
         | add? Obviously this varies from code base to code base, but I'm
         | curious what the experience at Instagram is like?
        
           | sinancepel wrote:
           | For Instagram (millions of LOC), the analysis gives feedback
           | to engineers in about 65 minutes on average - note that this
           | is in the context of a diff run: We compare the results of a
           | run on the base revision to the proposed changes, running the
           | tool once or twice depending on whether we hit the cache.
           | It's hard to say how long it'll take on your repository as it
           | depends on a lot of factors, but hopefully that provides some
           | intuition.
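
A minimal sketch of the diff-run idea described above: analyze the base revision and the proposed change, then report only the issues that appear in the second run. The `Issue` record and `new_issues` helper are hypothetical names for illustration; this is not Pysa's actual output format or API.

```python
from typing import NamedTuple, Set


class Issue(NamedTuple):
    # Hypothetical issue record; real analyzer output carries more fields.
    rule: str
    function: str


def new_issues(base: Set[Issue], proposed: Set[Issue]) -> Set[Issue]:
    """Return issues present after the change but absent on the base revision."""
    return proposed - base


base_run = {Issue("sql-injection", "views.search")}
proposed_run = {
    Issue("sql-injection", "views.search"),    # pre-existing, not reported again
    Issue("shell-injection", "tasks.export"),  # introduced by the diff
}
introduced = new_issues(base_run, proposed_run)
```

If the base-revision results are already cached, only the proposed revision needs a fresh run, which would match the "once or twice" remark above.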
        
             | cubes wrote:
             | That's super helpful. I'm currently at Eventbrite, and
             | we're probably in the same order of magnitude.
        
         | yingw787 wrote:
         | Not sure if you can answer this, but what are some classes of
         | security bugs you can find with Pysa? I've only worked on
         | smaller codebases so security I've dealt with is mostly
         | AuthN/AuthZ.
        
           | the_storm wrote:
           | You can find most security issues with Pysa that you can
           | model as a taint-flow problem. Examples include flows into
           | functions that enable code execution or shell injection, SQL
           | injection, SSRF, XSS and many others. As long as you can
           | express the security issue as a taint flow, Pysa should be
           | able to detect it. The configuration we ship with Pysa has
           | examples of the bug categories we detect:
           | https://github.com/facebook/pyre-check/blob/master/stubs/tai...
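
To make "model it as a taint flow" concrete, here is a toy sketch that treats taint analysis as reachability: data enters at a source, moves along data-flow edges, and is reported if it reaches a sink. It is a deliberate simplification and not how Pysa is implemented; the function `find_taint_flow` and the node names in the example graph are made up for illustration.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def find_taint_flow(
    flows: Dict[str, List[str]], sources: Set[str], sinks: Set[str]
) -> Optional[List[str]]:
    """Breadth-first search for a path from any source to any sink."""
    queue = deque([s] for s in sorted(sources))
    seen = set(sources)
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in sinks:
            return path
        for nxt in flows.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical data-flow edges: "data from the key can reach each listed value".
example_flows = {
    "request.GET": ["build_query"],
    "build_query": ["cursor.execute"],
    "config.DEBUG": ["logger.info"],
}
sqli_path = find_taint_flow(
    example_flows, sources={"request.GET"}, sinks={"cursor.execute", "os.system"}
)
safe_path = find_taint_flow(
    example_flows, sources={"config.DEBUG"}, sinks={"cursor.execute", "os.system"}
)
```

Here `sqli_path` is the chain from user input to the SQL call, while `safe_path` is `None` because nothing user-controlled reaches a sink.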
        
           | gbleaney wrote:
           | Pysa can find any bug that you can model as a flow of data
           | from one place to another. That includes your standard web
           | app bugs like SQLi, RCE, etc., also some AuthN/AuthZ bugs
           | depending on how you do your checks. Concretely, this is a
           | list of the vulnerabilities Pysa is able to catch out of the box
           | without any customization: https://github.com/facebook/pyre-
           | check/blob/6975ff55fc59b7b9...
        
         | cubes wrote:
         | Python 2 and 3 or 3 only? I looked for this on the linked site,
         | and it wasn't immediately obvious.
        
           | sinancepel wrote:
           | Pyre & Pysa try to do a best-effort analysis of Python 2, and
           | support Python 2 style taint annotations, but most of the
           | code we analyze at Facebook is Python 3.6+.
        
         | dkarp wrote:
         | Does Pysa require the code to be fully type-hinted, or will it
         | also work on code without type hints?
        
           | sinancepel wrote:
           | Pysa will try to analyze all functions regardless of whether
           | they have type hints, but it works better if the function
           | under consideration is typed. Namely, without type hints, it
           | won't be able to pick up on tainted method calls or attribute
           | accesses. However, regular function calls, etc. and standard
           | data structures like dicts and lists should still be tracked
           | normally.
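
A small illustration of why type hints matter for method calls, under the assumption that `Request.get_param` is the thing you would want to treat as a taint source. All names here are hypothetical. In the typed handler, the annotation tells a static analyzer which class's `get_param` is being called; in the untyped one, that information is simply not available.

```python
import typing


class Request:
    def get_param(self, name: str) -> str:
        # Imagine this method is modeled as returning user-controlled data.
        return "value-of-" + name


def typed_handler(request: Request) -> str:
    # The annotation lets an analyzer resolve this call to Request.get_param.
    return request.get_param("q")


def untyped_handler(request):
    # No annotation: statically, this could be any object's get_param.
    return request.get_param("q")


def receiver_type(func, param: str):
    """Return the annotated type of a parameter, or None if it is unannotated."""
    return typing.get_type_hints(func).get(param)
```

Both handlers behave identically at runtime; the difference is only in what can be known about `request` without running the code.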
        
           | gbleaney wrote:
           | In addition to what Sinan said, we've had success running on
           | fully untyped codebases. Here are some strategies you can
           | use to get results on untyped codebases:
           | https://pyre-check.org/docs/pysa-coverage.html
           | 
           | Types will definitely make Pysa find more issues, but you
           | don't need 100% coverage, or really more than the minimal
           | coverage described in that doc I linked, to start finding
           | some issues.
        
       | prepend wrote:
       | This seems like a good idea and the more open source static
       | analyzers the better. (It really tempts me to eventually pay for
       | GitLab's higher tiers.)
       | 
       | Pysa is part of pyre-check, and judging by the documentation [0]
       | it seems like a lot of work to set up. I hope it gets better.
       | 
       | I'm used to using safety [1] and bandit [2], and they are
       | one-line drop-ins to my builds.
       | 
       | Pysa isn't the same thing and seems much more powerful but I hope
       | they get to a "Just give me something useful out of the box and
       | I'll customize my taint scans later."
       | 
       | [0] https://pyre-check.org/docs/pysa-running
       | [1] https://pypi.org/project/safety/
       | [2] https://pypi.org/project/bandit/
        
       ___________________________________________________________________
       (page generated 2020-08-07 23:00 UTC)