[HN Gopher] GitHub's database of security advisories is now open...
       ___________________________________________________________________
        
       GitHub's database of security advisories is now open source
        
       Author : greysteil
       Score  : 145 points
       Date   : 2022-02-26 19:22 UTC (3 hours ago)
        
 (HTM) web link (github.blog)
 (TXT) w3m dump (github.blog)
        
       | myroon5 wrote:
       | Is it possible to submit new security advisories? Have an
       | advisory for a repository I don't have permissions for
        
         | greysteil wrote:
         | For anything that already has a CVE, yes. You can add
         | information about CVEs that are currently "unreviewed" by the
         | GitHub curation team. By doing so, you'll bump those to the top
         | of the stack for our curators to review (and help them review
         | them). Once reviewed, they'll trigger Dependabot alerts, show
         | up in npm audit, and be more usable by anyone else consuming
         | the data.
         | 
         | For anything that doesn't already have a CVE, no. We don't want
         | that disclosure process to happen in public - we recommend you
         | reach out to the maintainer privately. (Currently we don't have
         | an on-platform way to do that, but we're planning one.)
        
       | greysteil wrote:
       | PM from GitHub here. I've been wanting to do this since I joined
       | three years ago! Happy to answer any questions about where we're
       | going with open source security.
        
         | lol768 wrote:
         | What can we do to try and reduce "alert fatigue"? I've lost
         | track of the number of super-scary-looking regex DoS "high"
         | vulnerabilities I've had to review for an app that only uses
         | client-side JS and is incredibly unlikely to be exploitable in
         | practice (or particularly where the vulnerable dependencies are
         | build-time only).
         | 
         | One of the problems I've also had with Snyk is low-quality
         | duplicative entries (for example, cataloguing each
         | deserialisation blacklist bypass in Jackson as a separate "new"
         | vulnerability because "yay CVE numbers to put on CVs") which
         | then wastes the time of folks triaging vulnerabilities who may
         | have already concluded there's no exploitation risk (due to
         | e.g. not deserialising user input, or not using polymorphic
         | deserialisation anywhere) and have to review issues again.
        
           | smurda wrote:
           | There are a couple of early startups trying to address this:
           | 
           | https://www.tromzo.com/ - early but very strong vision
           | 
           | https://www.dazz.io/ - dumb name but decent vision
        
           | charcircuit wrote:
           | >What can we do to try and reduce "alert fatigue"?
           | 
           | The more you do something the easier it is to do. There is
           | nothing wrong with it no longer feeling like an alert.
           | Patching security vulnerabilities is just a normal part of
           | software development and the easier and more comfortable
           | people are with it the better.
        
           | greysteil wrote:
           | A lot. Honestly, GitHub dropped the ball for a while here.
           | (The inside story is that we bought a SAST company, shifted a
           | lot of focus into making that acquisition successful, and
           | didn't give enough attention to our open source security
           | offerings for a couple of years.)
           | 
           | On the alerting side, we have a couple of things coming.
           | Neither is a magic bullet, but both will help.
           | 
           | - Better handling of vulnerabilities in dev dependencies.
           | Some vulnerabilities matter if they're in a dev dependency -
           | anything that exfiltrates your local filesystem, for example.
           | Others don't - DoS vulnerabilities, for example. At the
           | moment, GitHub doesn't even tell you whether the dependency a
           | vulnerability affects is a runtime or development dependency.
           | We can and will get better there.
           | 
           | - Analysis of whether the vulnerable code in a dependency is
           | called. You almost certainly want to react faster to
           | vulnerabilities in your code that your application is
           | actually exposed to than to ones that it may be exposed to in
           | future. (You probably want to respond to the unreachable
           | ones, too, especially if you can get an auto-generated PR to
           | do so, but there's much less urgency.) We have this in
           | private beta for Python right now, and expect to have it in
           | public beta in the next few months.
           | 
           | Beyond alerting, the other big thing is that GitHub's
           | incentives for this database and the experiences it triggers
           | are fundamentally different from other vendors. We aren't
           | selling its contents, so don't have an incentive to inflate
           | it. Open source maintainers are at the heart of our platform,
           | and we really don't want low-quality advisories to go out
           | about their software. And developers are our core customers, and we
           | want to deliver experiences they love above all else. That
           | difference in incentives will likely manifest in lots of
           | little differences, but at a high level, we're aligned on
           | wanting to reduce the alert fatigue.
           | 
           | Sorry we dropped the ball on this for the last couple of
           | years. You're going to see steady improvements from here on.
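
A rough sketch of the triage rule greysteil describes above: a DoS-style
issue in a development-only dependency is low priority, and otherwise
reachable vulnerable code is handled before unreachable code. Everything
here (the `Alert` fields, the class labels, the priority names) is a
hypothetical illustration, not how GitHub or Dependabot models alerts.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    RUNTIME = "runtime"
    DEVELOPMENT = "development"


@dataclass(frozen=True)
class Alert:
    package: str
    vuln_class: str   # illustrative labels such as "dos" or "file-exfiltration"
    scope: Scope      # is the dependency used at runtime or only in development?
    reachable: bool   # is the vulnerable code actually called by the application?


# Assumption: vulnerability classes that only matter when the code serves traffic.
RUNTIME_ONLY_CLASSES = {"dos", "redos"}


def priority(alert: Alert) -> str:
    """Return a coarse triage bucket: "low", "routine", or "urgent"."""
    # DoS-style issues in a dev-only dependency cannot hurt production.
    if alert.scope is Scope.DEVELOPMENT and alert.vuln_class in RUNTIME_ONLY_CLASSES:
        return "low"
    # Reachable vulnerable code deserves the fastest response.
    if alert.reachable:
        return "urgent"
    # Unreachable code should still be patched, but with less urgency.
    return "routine"
```

Note that a dev dependency with a file-exfiltration issue still lands in
"urgent" when reachable, matching the point that some vulnerabilities
matter even in development dependencies.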
        
         | mmsbdjjkvjj wrote:
         | Where are you going with open source security?
        
           | greysteil wrote:
           | Ha! Well, there's a lot.
           | 
           | One major strand is more work like this to make it easy for
           | the community to collaborate. I expect we'll make a lot of
           | iterative improvements to the database over the next few
           | months, aimed at making it easier to contribute to, maintain
           | and use. We need to improve our APIs for this data, for
           | example (currently only available via GraphQL).
           | 
           | Another big one that we're starting to think about is the
           | security vulnerability disclosure process. Our goal there is
           | to support maintainers as much as possible, and there's more
           | we can do. Recent articles on loguru, beg bounties, and the
           | way log4j initially reached public attention all point to
           | problems GitHub can and should help with. In the next 12
           | months we'd like to give maintainers the option to receive
           | vulnerability disclosures privately on GitHub, and for us to
           | be able to support them through that process. (GitHub already
           | does a bit here - through maintainer security advisories we
           | issued about 30% of the CVEs in the JavaScript ecosystem last
           | year, for example. But we can and will do more.)
           | 
           | Loguru CVE article: https://tomforb.es/cve-2022-0329-and-the-
           | problems-with-autom...
           | 
           | Beg bounties: https://www.troyhunt.com/beg-bounties/
           | 
           | Log4j PR: https://github.com/apache/logging-
           | log4j2/pull/608#issuecomme...
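
For readers wondering what "only available via GraphQL" looks like in
practice, here is a minimal sketch that builds a request body for
GitHub's GraphQL API. The query and field names (`securityVulnerabilities`,
`vulnerableVersionRange`, `firstPatchedVersion`) are written from memory
of the public schema and should be checked against the current schema
before use; `build_request_body` is a hypothetical helper name. No
request is sent here.

```python
import json

# Assumption: these fields match GitHub's public GraphQL schema; verify before use.
QUERY = """
query($ecosystem: SecurityAdvisoryEcosystem, $package: String, $first: Int!) {
  securityVulnerabilities(ecosystem: $ecosystem, package: $package, first: $first) {
    nodes {
      advisory { ghsaId summary }
      severity
      vulnerableVersionRange
      firstPatchedVersion { identifier }
    }
  }
}
"""


def build_request_body(ecosystem: str, package: str, first: int = 10) -> str:
    """Serialize the query and its variables as a JSON body for a GraphQL POST."""
    variables = {"ecosystem": ecosystem, "package": package, "first": first}
    return json.dumps({"query": QUERY, "variables": variables})


if __name__ == "__main__":
    # The body would be POSTed to the GraphQL endpoint with an access token.
    print(build_request_body("NPM", "lodash", first=5))
```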
        
         | leereeves wrote:
         | What's the thinking there about the pros and cons?
         | Specifically, is there any concern that this might help people
         | who would exploit vulnerabilities rather than fix them?
        
           | greysteil wrote:
           | We believe that, on balance, the pros significantly outweigh
           | the cons here.
           | 
           | One big reason is that the alternative to this structured
           | data being open source is that it lives in proprietary
           | databases. In that world, attackers still have knowledge
           | about these vulnerabilities - they don't need the structured
           | data as much as defenders, and the licenses on those
           | proprietary databases aren't going to deter them anyway (most
           | are public for SEO reasons). Defenders, on the other hand,
           | often won't have as much, or as high-quality, information.
        
           | chews wrote:
           | I don't see very many cons with more information.
           | 
           | The world is safer with this info in the public domain. Will
           | there be new exploits based on additional info? Sure, but
           | that will get mitigated.
           | 
           | Software, like law or medicine, is a practice, meaning we
           | aren't experts... we're just learning better ways to do
           | things.
           | 
           | This just opens the world to formal verification... for
           | goodness sakes we're just getting to fully reproducible
           | deterministic software builds.
        
           | totony wrote:
           | You can probably already create a repo and use GitHub as an
           | oracle for security vulns. This seems like it'd be very
           | beneficial to people for whom security is a second priority
           | (so most developers).
           | 
           | EDIT: Although your concerns might apply to unconfirmed
           | public PRs
        
         | akkartik wrote:
         | In the wake of Log4shell I've spent some time thinking about
         | how we can streamline the recovery from such large bugs. I
         | suspect a lot of eyes are on this area now. Do y'all have any
         | plans here? Figuring out what services are impacted by tracking
         | the container images they use, the language runtimes in those
         | images, the packages installed in each language runtime, that
         | sort of thing. Currently this is all a huge, manual, often
         | spreadsheet-driven process.
        
           | greysteil wrote:
           | We do a bit here already, and we've got plans to do more.
           | 
           | For repositories using a language the GitHub Dependency Graph
           | supports, we automatically create an inventory of the
           | dependencies the repository uses and create alerts if/when
           | any have a vulnerability (via Dependabot alerts and, as a
           | sibling comment has already mentioned, Dependabot update
           | PRs).
           | 
           | The next improvement we'd like to ship is an API that lets
           | you upload a list of dependencies to us for repositories in
           | which we can't automatically detect them. A good example is
           | repositories using Gradle for dependency management - it's
           | hard for us to understand the dependency tree there without
           | running a build. With the new API you'll be able to upload a
           | list of dependencies (generated using a Gradle command) to
           | GitHub in CI, and GitHub will then be able to send alerts
           | if/when there's a vulnerability in one of those dependencies,
           | just like we do for repos using other package managers.
           | 
           | Your comment specifically mentions containers. That's one
           | area that's a little further off for native GitHub support,
           | but where the open source advisory database should help.
           | Whilst we're currently focussed on scanning source code and
           | surfacing results on repos (not containers), the structured
           | data in the advisory database is just as usable with the
           | results of a container scan. Indeed, I believe all the open
           | source container scanning solutions already use it as a data
           | source.
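
To illustrate the "upload a dependency list, get alerts" idea above, here
is a minimal sketch that matches a name-to-version map against advisories
in the OSV JSON format, which the open source advisory database uses. The
sample advisory, its ID, and the package name are placeholders, and
`parse_version` is a deliberate simplification that only handles dotted
numeric versions; real ecosystems need their own version ordering.

```python
from __future__ import annotations

from typing import Iterable


def parse_version(text: str) -> tuple[int, ...]:
    # Simplification: dotted numeric versions only (no pre-release tags).
    return tuple(int(part) for part in text.split("."))


def is_affected(version: str, events: list[dict]) -> bool:
    """Walk OSV range events (assumed sorted ascending) for one version."""
    current = parse_version(version)
    affected = False
    for event in events:
        if "introduced" in event and current >= parse_version(event["introduced"]):
            affected = True
        elif "fixed" in event and current >= parse_version(event["fixed"]):
            affected = False
    return affected


def find_alerts(
    dependencies: dict[str, str], advisories: Iterable[dict], ecosystem: str
) -> list[tuple[str, str]]:
    """Return (package, advisory id) pairs for dependencies in an affected range."""
    alerts = []
    for advisory in advisories:
        for affected in advisory.get("affected", []):
            package = affected.get("package", {})
            name = package.get("name")
            if package.get("ecosystem") != ecosystem or name not in dependencies:
                continue
            for version_range in affected.get("ranges", []):
                if version_range.get("type") != "ECOSYSTEM":
                    continue
                if is_affected(dependencies[name], version_range.get("events", [])):
                    alerts.append((name, advisory["id"]))
                    break
    return alerts


# Placeholder advisory in OSV shape; not a real entry from the database.
SAMPLE_ADVISORY = {
    "id": "GHSA-xxxx-xxxx-xxxx",
    "affected": [{
        "package": {"ecosystem": "Maven", "name": "org.example:example-lib"},
        "ranges": [{"type": "ECOSYSTEM",
                    "events": [{"introduced": "0"}, {"fixed": "1.4.2"}]}],
    }],
}
```

Calling `find_alerts({"org.example:example-lib": "1.3.0"}, [SAMPLE_ADVISORY], "Maven")`
reports the placeholder advisory, while version `1.4.2` or later does not.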
        
           | coredog64 wrote:
           | Isn't that what Dependabot is? GitHub will already scan known
           | package managers for CVEs for reporting purposes, and if you
           | have the right kind of testing, you can allow Dependabot to
           | manage the toil here.
           | 
           | I worked at an i-bank that had their own version of
           | Dependabot and it was great: New version(s) come out and once
           | a week I get a PR to approve that shows that my code still
           | passes tests after the update.
        
         | vcdimension wrote:
         | How big is the entire dataset? How many files? I'd like to know
         | that (approximately) before I click download and try to rustle
         | up some command line tooling scripts to query it. Perhaps you
         | can publish that info in the README?
        
           | greysteil wrote:
           | You can see some of that metadata in the UI for the database:
           | https://github.com/advisories
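
For sizing a local clone before writing query tooling, a small sketch
like the following may help. It assumes the clone contains one OSV-format
JSON file per advisory somewhere under a root directory; the path in the
usage example is hypothetical, and the exact directory layout is not
assumed.

```python
import json
from collections import Counter
from pathlib import Path


def summarize(root: Path) -> tuple[int, Counter]:
    """Count JSON files under root and tally affected-package entries by ecosystem."""
    total = 0
    by_ecosystem: Counter = Counter()
    for path in root.rglob("*.json"):
        total += 1
        try:
            advisory = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # Count the file, but skip content we cannot parse.
        if not isinstance(advisory, dict):
            continue
        for affected in advisory.get("affected", []):
            ecosystem = affected.get("package", {}).get("ecosystem")
            if ecosystem:
                by_ecosystem[ecosystem] += 1
    return total, by_ecosystem


if __name__ == "__main__":
    # Hypothetical path to a local clone of the advisory repository.
    count, ecosystems = summarize(Path("advisory-database/advisories"))
    print(count, ecosystems.most_common(5))
```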
        
       | thenerdhead wrote:
       | How does this scale? I assume with all the unreviewed advisories
       | today and with the oncoming PRs, it will require a full team
       | firing on all cylinders.
       | 
       | Will the team add more members to triage these, or bring in
       | better automation to ensure no exploitation happens during the
       | process, such as incentivizing trusted members of various
       | ecosystems to help?
       | 
       | I love the idea of a public ledger using GitHub & PRs, but could
       | more be done here to instill trust beyond a single GitHub
       | account? Perhaps GitHub organizations from these known
       | ecosystems could help out further.
       | 
       | With security advisories, it seems a bit worrying to see
       | unreviewed advisories that have yet to be categorized, or PRs
       | with updated details that stay open for more than a few days.
        
         | greysteil wrote:
         | We have a full-time team of curators on staff, as part of the
         | GitHub Security Lab, and we're committed to scaling that team
         | to meet the demand here. That team is already responsible for
         | reviewing all new entries on the NVD for inclusion in the
         | database, and for reviewing all requests for GitHub to issue
         | CVEs from maintainers.
         | 
         | We have some work to do on the tooling to make it really slick,
         | and a couple of those PRs have taken longer to get reviewed
         | than we'd like, but we're working on it!
         | 
         | On trusted members of language ecosystems - we'd be super
         | interested to explore that. It will require some work on the
         | tooling on our side, so I don't expect progress there
         | overnight, but in the long term it's a model I think we could
         | make work really well.
        
       ___________________________________________________________________
       (page generated 2022-02-26 23:00 UTC)