[HN Gopher] Show HN: Semgrep App
       ___________________________________________________________________
        
       Show HN: Semgrep App
        
       https://semgrep.dev/products/semgrep-app  Hi! I work on Semgrep, an
       open-source project (discussed on HN previously [0][1]). We're one
       of those companies that maintain an OSS tool and a web app, and
       then monetize by selling enterprise features on said web app. Our
       free web app just went through a major revamp (sort of like a v1.0
       release) so this feels like the perfect time to share and hear what
       the HN crowd thinks!  Let me start with some backstory on Semgrep.
       Our team, r2c, has been experimenting with various ways to help
       organizations step up their application security game. One of our
       earliest experiments was Bento, a wrapper around multiple existing
       linters to help people configure various tools like ESLint and
       Bandit in one go. The bottleneck with a tool like this was, of
       course, interfacing with more and more tools. I had previously
       worked on a similar project called coala[2] which got all the way
       up to 78 analyzers covering 54 languages, until the project ground
       to a halt over the maintenance burden of all that. One of our team
       members at r2c came up with a novel approach to this problem: he
       suggested reusing some of his old work on Coccinelle[3] and later
       Sgrep[4], which were tools to search parsed syntax trees of various
       languages. Conceptually this meant that while Bento and coala could
       standardize the command-line interface, the configuration syntax,
       and file targeting logic of linters, now we could also standardize
       the core linting logic. Extending Bento with linting rules using
       this pattern language proved so easy that we simply reimplemented
       the existing linters with it. And thus, Semgrep was
       born specifically to scan code with these pattern definitions, and
       there was no longer a need for Bento. Our rule registry[5] now
       contains over 1,500 rule definitions in this standardized linter
       rule definition language, across 20 languages.  And this leads us
       to our web app. Early adopters of Semgrep encountered problems
       rolling out the CLI tool across their organization. Their key
       needs: scanning hundreds of repos, reviewing all their scan
       results, deploying custom organization-internal rules across them,
       and avoiding backlash from developers during all that. We also made
       the unorthodox decision to start with a ground rule that we never
       ever want to have access to the source code of our customers. These
       needs and rules guided our web app's feature set, which ended up
       being: provisioning CI jobs on repositories, centrally configuring
       which rules should block builds or notify people, sending
       notifications via PR comments/Slack/email, and displaying the list
       of all findings, along with some analytics.  As for today, we just
       launched a major release of Semgrep App, which cuts down on the
       complexity that built up in our original implementation, and we
       also tried to expand the problem space our app tackles all the way
       through remediating issues on the web UI. You can read more about
       these recent changes at
       https://r2c.dev/blog/2021/semgrep-app-fall-2021-updates/  And as
       for the future, two main areas of
       interest are 1) intelligently selecting all the right Semgrep
       Registry rules for a given project and 2) creating a smooth
       workflow for organizations to collaboratively maintain their own
       set of internal Semgrep rules.  Please check out the app we built
       at https://semgrep.dev/products/semgrep-app, and let us know what
       you think! I'll be hanging out in the comments as one of the
       engineers who built the app, but our CEO (ievans) is also ready to
       answer questions, and the rest of the team will surely be lurking
       here as well.  [0]: https://news.ycombinator.com/item?id=24931985
       [1]: https://news.ycombinator.com/item?id=26904951  [2]:
       https://github.com/coala/coala/  [3]:
       https://en.wikipedia.org/wiki/Coccinelle_(software)  [4]:
       https://github.com/facebookarchive/pfff/wiki/Sgrep  [5]:
       https://semgrep.dev/r/
        
       Author : underyx
       Score  : 51 points
       Date   : 2021-10-22 16:24 UTC (6 hours ago)
        
       | smoldesu wrote:
       | Is the "Enforce Security" dialogue on your website supposed to
       | overlap with the infographic? I'm browsing on Chromium/Linux
        
         | underyx wrote:
         | Could you check again? We released a fix for this exactly as
         | you commented :D
        
       | Fervicus wrote:
       | Congrats on the launch! Just a heads up that the website seems to
       | have some issues on Firefox. The green check marks show up over
       | the copy making it unreadable.
        
         | pedalpete wrote:
         | Not just Firefox; Brave (Chromium) too.
        
         | underyx wrote:
         | Thank you! We're releasing a fix for this right now.
        
       | T3RMINATED wrote:
       | sonar qube
        
       | frellus wrote:
       | Most of it written in OCaml, cool! What made you pick OCaml as
       | the primary language to use for the business logic?
        
         | underyx wrote:
         | Technically, OCaml only applies to Semgrep, as the app which is
         | the subject of this post uses a more neo-traditional Python &
         | TypeScript stack :)
         | 
         | I don't have full context on the parser core, but I do know
         | that a major thing we've got going for OCaml is a translation
         | layer we wrote for generating OCaml code from tree-sitter
         | grammars:
         | https://github.com/returntocorp/ocaml-tree-sitter-semgrep
        
         | padator wrote:
         | Why OCaml? It's a great language to write programs that work
         | on complex data structures, e.g. ASTs. This choice was actually
         | not very original: people in academia at Stanford, Berkeley,
         | and Microsoft Research used OCaml for program analysis (CCured,
         | Saturn, CIL, SLAM). And now the industry is also using
         | it (Facebook Infer, Facebook Hack/Flow/Pyre, MS Static Device
         | Verifier, etc.)
        
           | underyx wrote:
           | To add some context, padator is on the Semgrep team; he's the
           | person I referenced as
           | 
           | > One of our team members at r2c came up with a novel
           | approach to this problem: he suggested reusing some of his
           | old work on Coccinelle[3] and later Sgrep[4]
        
       | underyx wrote:
       | Clickable links:
       | 
       | https://semgrep.dev/products/semgrep-app
       | 
       | [0]: https://news.ycombinator.com/item?id=24931985
       | 
       | [1]: https://news.ycombinator.com/item?id=26904951
       | 
       | [2]: https://github.com/coala/coala/
       | 
       | [3]: https://en.wikipedia.org/wiki/Coccinelle_(software)
       | 
       | [4]: https://github.com/facebookarchive/pfff/wiki/Sgrep
       | 
       | [5]: https://semgrep.dev/r/
        
       | losvedir wrote:
       | This looks neat, but I'm still not sure I quite get it. Do I
       | understand correctly that earlier tools helped you _use_ , e.g.,
       | ESLint, but now it _replaces_ ESLint and does the linting itself?
       | Or is it still something of an orchestrator of different
       | underlying linters?
        
         | underyx wrote:
         | Semgrep replaces ESLint's security rules. We have a ruleset[0]
         | which shows you how we reimplemented the eslint security
         | plugin's rules with our pattern matching language. I'm not sure
         | why there's a mismatch in the number of rules between the
         | original and our implementation; perhaps a more eye-catching
         | example is GitLab's re-implementation of Bandit's rules[1].
         | GitLab used to bundle Bandit in their SAST analyzer, but they
         | recently switched over to generating the same results via
         | Semgrep[2], as our tool is faster and they can replace many of
         | their linter integrations with it.
         | 
         | [0]: https://semgrep.dev/p/eslint-plugin-security
         | 
         | [1]: https://semgrep.dev/p/gitlab-bandit
         | 
         | [2]: https://r2c.dev/blog/2021/introducing-semgrep-for-
         | gitlab/#se...
        
         | underyx wrote:
         | Oh, silly me. I totally forgot that the GitLab team also
         | reimplemented a set of ESLint rules in Semgrep[0], just
         | like I mentioned they did with Bandit. We published an in-depth
         | comparison with ESLint[1] that might clear things up even more.
         | 
         | [0]: https://semgrep.dev/p/gitlab-eslint
         | 
         | [1]: https://r2c.dev/blog/2021/javascript-static-analysis-
         | compari...
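
       For readers unsure what "searching parsed syntax trees" means in
       the post and replies above, here is a minimal sketch in Python.
       It is not Semgrep and does not use Semgrep's pattern language or
       its OCaml core; it only uses Python's standard `ast` module to
       illustrate the general idea. The function name `find_dynamic_eval`
       and the sample input are hypothetical, chosen for illustration. The
       check flags calls to eval() whose first argument is not a literal,
       loosely in the spirit of a security lint rule.

```python
import ast
from typing import List


def find_dynamic_eval(source: str) -> List[int]:
    """Return line numbers of eval(...) calls whose first argument is not a literal.

    Illustrative only: this walks Python's own AST and is not how
    Semgrep is implemented.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)                 # some call expression
            and isinstance(node.func, ast.Name)        # called by bare name
            and node.func.id == "eval"                 # ...and that name is eval
            and node.args                              # has a positional argument
            and not isinstance(node.args[0], ast.Constant)  # argument is not a literal
        ):
            hits.append(node.lineno)
    return sorted(hits)


sample = '''
x = eval(user_input)
y = eval("1 + 1")
# eval(commented_out)
z = "eval(not_code)"
'''

print(find_dynamic_eval(sample))  # prints [2]
```

       A plain-text grep for "eval(" would match four lines of the sample,
       including the comment and the string literal. Matching on the
       parsed tree reports only the one call with a non-literal argument,
       which is the kind of precision that syntax-aware rules aim for.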
        
       ___________________________________________________________________
       (page generated 2021-10-22 23:00 UTC)