[HN Gopher] AI-Exploits: Repo of multiple unauthenticated RCEs i...
___________________________________________________________________

AI-Exploits: Repo of multiple unauthenticated RCEs in AI tools

Author : DanMcInerney
Score  : 43 points
Date   : 2023-11-16 16:48 UTC (6 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| aftbit wrote:
| Is anyone using any of these services? The only one I actually
| recognize from their list[1] is Triton Inference Server.
|
| 1: https://github.com/protectai/ai-exploits/tree/main/nmap-nse
|
| swatcoder wrote:
| The purpose of the repo seems to be to collect an archive of
| what real-world vulnerabilities look like, to inform service
| implementors and security researchers in their future work.
|
| I suppose I'm idly curious about the answer to your question
| too, but paying too much attention to the specific targets
| feels like it's missing the point and purpose of the
| collection.
|
| spmurrayzzz wrote:
| h2o is definitely somewhat popular, specifically for LLMs, but
| ray is certainly widely used for distributed training
| workloads.
|
| ianbutler wrote:
| I recognize most of them; they're all pretty common
| orchestration, distributed computation, or experiment
| management tools. Maybe you're just not as integrated into the
| operations portion of the ML space?
|
| wolftickets wrote:
| [I work at Protect AI] - The goal here was initially to cover
| relatively common tooling around MLOps/Data Science work. All
| ears here if you have some ideas for other projects to
| explore.
|
| gumballindie wrote:
| No wonder people working in AI think AI will replace
| programmers, given the prevalent lack of experience with
| actual programming among them.
|
| Having said that, the Achilles heel of AI is data. The lower
| the quality, the more powerful the attack.
|
| I imagine if someone wanted to mess about with it on a serious
| scale they'd go for the jugular - the data. Write content and
| create hundreds or thousands of code repositories with subtle
| issues and bang, you've compromised thousands and thousands of
| unsuspecting folks relying on AI to create code, or any other
| type of content.
|
| wolftickets wrote:
| [I work at Protect AI] You're spot on about data being the
| jugular. Interestingly, with exploits like these, an attacker
| could quickly go after model content, but would in many cases
| also hold credentials granting access to data.
|
| These tools can serve as the first opening - but a sizable one
| - when looking to attack an enterprise more broadly.
|
| swyx wrote:
| > Protect AI is the first company focused on the security of
| AI and ML Systems creating a new category we call MLSecOps.
|
| alright i looked you up, congrats on your fundraising. is
| there like an OWASP top 10 vuln list for MLSecOps? does it
| differ between traditional ML apps and LLM apps?
|
| byt3bl33d3r wrote:
| (I work for ProtectAI) There isn't an OWASP Top 10 for
| MLSecOps at the moment. There is a general OWASP Top 10 for
| Machine Learning [1] and MITRE ATLAS [2], however.
|
| [1] https://owasp.org/www-project-machine-learning-security-top-...
| [2] https://atlas.mitre.org/
|
| gumballindie wrote:
| Indeed. I am thinking that one way to protect data and ensure
| its integrity is to use agents trained on trusted sources to
| validate that the content is secure - for instance, to detect
| "injections" of malicious or ill-written code. Same for other
| types of content, but difficult.
|
| Suppose someone magically creates thousands of repositories
| that write about a specific way of doing C pointers but all
| allow for buffer overflows, or SQL queries with subtle ways to
| inject strings.
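|
| To make that concrete, here is a minimal sketch (Python with
| sqlite3; the table, data, and function names are hypothetical)
| of the kind of subtly injectable query such repositories could
| seed, next to its safe form:
|
|     import sqlite3
|
|     def find_user_unsafe(conn, name):
|         # Looks plausible, but splices input into the SQL string.
|         query = "SELECT id, name FROM users WHERE name = '%s'" % name
|         return conn.execute(query).fetchall()
|
|     def find_user_safe(conn, name):
|         # Parameterized query: the driver escapes the value.
|         query = "SELECT id, name FROM users WHERE name = ?"
|         return conn.execute(query, (name,)).fetchall()
|
|     conn = sqlite3.connect(":memory:")
|     conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
|     conn.execute("INSERT INTO users VALUES (1, 'alice')")
|
|     print(find_user_unsafe(conn, "x' OR '1'='1"))  # [(1, 'alice')]
|     print(find_user_safe(conn, "x' OR '1'='1"))    # []
|
| A model trained mostly on the first form will happily
| reproduce it, and the two read almost identically unless you
| know what to look for.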
|
| One way to defend is to have an AI agent assess each data
| source that goes into training.
|
| But even so it's extremely difficult to catch convoluted
| attacks (i.e. when an exploit only manifests upon meeting
| certain criteria).
|
| Until then I'd consider any code written by an AI and
| unsupervised by a competent person as potentially tainted.
|
| dwringer wrote:
| I'm not sure... hundreds or thousands of code repositories
| with subtle issues sounds like... the real world of code
| repositories. And I'd think that, through the analogy and
| redundancy of some common algorithms, the LLM trained that way
| might conceivably be able to _FIX_ many of those errors.
|
| gumballindie wrote:
| Someone should build a PoC. AI doesn't know things other than
| what it's ingested, so for such an attack to be successful
| you'd need to tilt the statistics towards problematic code.
| You'd need loads and loads of repositories, but it's
| definitely doable.
|
| RomanPushkin wrote:
| How does it work? I can't understand it from the description.
|
| waihtis wrote:
| Nice work - just saw these pop up on the official CVE feed.
___________________________________________________________________
(page generated 2023-11-16 23:00 UTC)