[HN Gopher] Pysa: An open source tool to detect and prevent secu...
___________________________________________________________________
Pysa: An open source tool to detect and prevent security issues in
Python code

Author : jimarcey
Score  : 159 points
Date   : 2020-08-07 16:07 UTC (6 hours ago)

(HTM) web link (engineering.fb.com)
(TXT) w3m dump (engineering.fb.com)

| [deleted]
| sinancepel wrote:
| One of the authors of the blog post and software engineer working
| on Pysa here - happy to answer any questions you may have :)
| devy wrote:
| Hi,
|
| I have 2 questions:
|
| 1. The installation doc specifies `watchman` as a dependency.
| Why is that used? Without watchman, would Pysa not work?
|
| 2. Also, why can't Pysa become a separate standalone tool
| instead of living in the Pyre GitHub repo?
| sinancepel wrote:
| 1. Pysa should work without watchman - it shares some code
| and infrastructure with Pyre, but doesn't need Watchman to
| complete its analysis.
|
| 2. Hopefully the answer to (1) helps here. Pysa shares some
| code with Pyre, including the parallelization infrastructure
| - the same infrastructure that makes Pyre fast interactively
| makes Pysa fast on large codebases. Living on the Pyre GitHub
| repo allows Pysa to use the parallelization infra, in
| addition to the type checking APIs of Pyre as necessary.
|
| See also our original post introducing Pyre - our goal from
| the outset was to build a platform for deeper static
| analyses:
| https://www.facebook.com/notes/protect-the-graph/pyre-fast-t...
| therealcamino wrote:
| This is very interesting work. I've been looking for something
| exactly like this to use on a large C application --
| specifically to be able to annotate various APIs as sources of
| different kinds of data, checks on how the data types are
| permitted to be used together, and operations that transform
| one kind to another.
| Compared to taint analysis, we want to allow more categories than
| tainted/untainted, and transforming items between categories.
| Do you have any recommendations for similar tools that work
| with C?
| cubes wrote:
| How fast does Pysa typically run? If I want to run it as part
| of my CI system, how much additional time might I expect it to
| add? Obviously this varies from code base to code base, but I'm
| curious what the experience at Instagram is like.
| sinancepel wrote:
| For Instagram (millions of LOC), the analysis gives feedback
| to engineers in about 65 minutes on average - note that this
| is in the context of a diff run: we compare the results of a
| run on the base revision to the proposed changes, running the
| tool once or twice depending on whether we hit the cache.
| It's hard to say how long it'll take on your repository as it
| depends on a lot of factors, but hopefully that provides some
| intuition.
| cubes wrote:
| That's super helpful. I'm currently at Eventbrite, and
| we're probably in the same order of magnitude.
| yingw787 wrote:
| Not sure if you can answer this, but what are some classes of
| security bugs you can find with Pysa? I've only worked on
| smaller codebases, so the security I've dealt with is mostly
| AuthN/AuthZ.
| the_storm wrote:
| You can find most security issues with Pysa that you can
| model as a taint flow problem. Examples could be flows to
| functions that enable code execution or shell injection, SQL
| injection, SSRF, XSS and many others. As long as you can
| model the security issue in a taint-flow model, Pysa
| should be able to detect it. This is the configuration we
| share with Pysa, where you can find examples of the bug
| categories we detect:
| https://github.com/facebook/pyre-check/blob/master/stubs/tai...
| gbleaney wrote:
| Pysa can find any bug that you can model as a flow of data
| from one place to another.
| That includes your standard web app bugs like SQLi, RCE, etc.,
| and also some AuthN/AuthZ bugs, depending on how you do your
| checks. Concretely, this is a list of the vulnerabilities Pysa
| is able to catch out of the box without any customization:
| https://github.com/facebook/pyre-check/blob/6975ff55fc59b7b9...
| cubes wrote:
| Python 2 and 3 or 3 only? I looked for this on the linked site,
| and it wasn't immediately obvious.
| sinancepel wrote:
| Pyre & Pysa try to do a best-effort analysis of Python 2, and
| support Python 2 style taint annotations, but most of the
| code we analyze at Facebook is Python 3.6+.
| dkarp wrote:
| Does Pysa require the code to be fully type hinted? Or will it
| work on non-type hinted code also?
| sinancepel wrote:
| Pysa will try to analyze all functions regardless of whether
| they have type hints, but it works better if the function
| under consideration is typed. Namely, without type hints, it
| won't be able to pick up on tainted method calls or attribute
| accesses. However, regular function calls, etc. and standard
| data structures like dicts and lists should still be tracked
| normally.
| gbleaney wrote:
| In addition to what Sinan said, we've had success running on
| fully untyped codebases. These are some strategies that
| you can use to get results on untyped codebases:
| https://pyre-check.org/docs/pysa-coverage.html
|
| Types will definitely make Pysa find more issues, but you
| don't need 100% coverage, or really more than the minimal
| coverage described in that doc I linked, to start finding
| some issues.
| prepend wrote:
| This seems like a good idea and the more open source static
| analyzers the better. (It really tempts me to eventually pay for
| GitLab's higher tiers.)
|
| Pysa is part of pyre-check and, going by the documentation [0],
| it seems like a lot of work to set up; I hope it gets better.
|
| I'm used to using safety [1] and bandit [2] and they are
| one-line drop-ins to my builds.
|
| Pysa isn't the same thing and seems much more powerful, but I hope
| they get to a "Just give me something useful out of the box and
| I'll customize my taint scans later" experience.
|
| [0] https://pyre-check.org/docs/pysa-running
| [1] https://pypi.org/project/safety/
| [2] https://pypi.org/project/bandit/
___________________________________________________________________
(page generated 2020-08-07 23:00 UTC)