[HN Gopher] AI for AWS Documentation
       ___________________________________________________________________
        
       AI for AWS Documentation
        
       Author : whatsthenews
       Score  : 90 points
       Date   : 2023-07-06 17:22 UTC (5 hours ago)
        
 (HTM) web link (www.awsdocsgpt.com)
 (TXT) w3m dump (www.awsdocsgpt.com)
        
       | nextworddev wrote:
       | Just use Phind.com for searching developer docs for most cases.
        
         | lukebbutton wrote:
         | This is cool, thanks for linking
        
       | jtokoph wrote:
       | Prompt: What is glacier?
       | 
       | Glacier is a term that is not directly mentioned in the provided
       | sources.
       | 
       | Prompt: What is a glacier?
       | 
       | A glacier is a large mass of ice that moves slowly over time due
       | to the accumulation of snow, ice, and other forms of frozen
       | precipitation.
       | 
       | Seems like it's just using a general model?
        
         | manojlds wrote:
         | What is Glacier works for me but What is a Glacier doesn't.
        
         | serjester wrote:
         | It's definitely just using standard semantic search (otherwise
         | you wouldn't be getting links). "What is glacier storage" gives
         | you a great response.
        
           | redox99 wrote:
           | If they finetuned the model on AWS docs, would the
           | embeddings, and thus the vector search improve?
        
             | alsima wrote:
             | Most likely, the model would be less inclined to answer
             | questions/hallucinate for prompts not related to AWS--this
             | is definitely a future path for improvement
        
       | coder543 wrote:
       | This answer about Graviton was not correct:
       | https://i.imgur.com/3D9WokF.jpg
        
       | 71a54xd wrote:
       | I've been using GPT4 for this since the beginning - ironically a
       | large majority of AWS documentation has been machine generated
       | since 2018. Circa 2019 the entire Elixir API for AWS was machine
       | generated.
       | 
       | Asking GPT4 is also consistently less of a headache than asking
       | the devops guy and getting a 20min explanation for a simple
       | question.
        
         | istjohn wrote:
         | You need to add something like "You give clear and succinct
         | answers to questions" to the beginning of your prompts to the
         | devops guy.
        
       | alexy201 wrote:
       | Hey everyone, I am the creator of AWS Docs GPT and it's been
       | extremely useful to gather all of your feedback for the site--
       | thank you guys so much! We are constantly improving and updating
       | the GPT, including fewer hallucinations, more accurate responses,
       | chat context, and much more. At the end of the day, I really hope
       | this tool can be useful for developers like myself out there!!!
        
       | flaminHotSpeedo wrote:
       | The problem is garbage in -> garbage out.
       | 
       | When the docs are wrong or misleading you'll still get burned,
       | even if the model doesn't hallucinate responses
        
         | lukebbutton wrote:
         | Agreed, that's the problem eod. Was trying to set up an
         | instance a few days ago and the docs for it hadn't been updated
         | since '21
        
       | jgalt212 wrote:
       | is AWS designed to take all my money?
       | 
       | No, AWS is not designed to take all your money. AWS offers a
       | variety of payment options and cost-saving measures to help you
       | manage your expenses effectively. ...
        
       | bjt wrote:
       | It invented an answer for something that AWS actually can't do
       | right now.
       | 
       | https://imgur.com/a/0IYZ2WV
        
         | macksd wrote:
         | Maybe it could help them when designing APIs for new products
         | to be consistent with previous design choices. But perhaps it's
         | too late for that.
        
         | jrvarela56 wrote:
         | phind.com did not hallucinate in this case:
         | https://capture.dropbox.com/4gIUDuAxr14bnNIt
         | 
         | https://www.phind.com/search?cache=d0b3a85b-17f9-4def-b8d0-b...
        
       | coding123 wrote:
       | seems super useful, try this prompt:
       | 
       | how do you use wrangler and glue to make athena tables using
       | terraform
        
         | alsima wrote:
         | [dead]
        
       | ilc wrote:
       | I asked it to write some basic terraform:
       | 
       | - Make a VPC. - Add an Instance. - Abstract the region and AZ,
       | into vars.
       | 
       | etc... every time I wanted to change the code, I asked the bot to
       | do the refactor, and it did.
       | 
       | Overall, I'm impressed. It wasn't the most complicated thing, but
       | it didn't dive off the deep end.
        
       | ghomem wrote:
       | Please I beg you: ask it how to take a snapshot of an EC2
       | instance and then how to restore it :)
        
       | mkl95 wrote:
       | The thought that a bunch of people will trust this tool and make
       | some terrible decisions is unsettling. On the other hand it could
       | be pretty powerful if you know what you are doing.
        
       | zgluck wrote:
       | I asked it:
       | 
       | "how do I avoid high NAT gateway bills when an ECS service keeps
       | downloading the same image over and over?"
       | 
       | It offered three replies. The first and third were outright
       | incorrect, the second was (technically) correct:
       | 
       | https://i.imgur.com/la98cxC.png
       | 
       | Also: I'm assuming you haven't actually secured a license to use
       | the AWS logo.
        
       | scrum-treats wrote:
       | Using ChatGPT for AWS service questions is actually pretty good.
       | For instance, I asked it for a Cloud Practitioner study guide
       | (using a small set of crafted prompts), and GPT performed quite
       | well. While I have yet to query GPT about Solutions Architect or
       | DevOps material, I know I can feed a set of URLs and GPT will
       | "learn" the material and summarize it in ways meaningful and
       | relevant to my prompts. In this way, ChatGPT is quite a powerful
       | assistant on its own.
        
         | aradox66 wrote:
         | Agree, I've had great results asking chatgpt questions about
         | AWS services. The interactivity is very helpful, and chatgpt
         | will draft scripts too, although that's hit or miss. But for
         | understanding concepts and services, it's great.
        
       | underlines wrote:
       | RAG is very difficult to do right. I am experimenting with
       | various RAG projects from [1]. The main problems are:
       | 
       | - Chunking can interfere with context boundaries
       | 
       | - Content vectors can differ vastly from question vectors, for
       | this you have to use hypothetical embeddings (they generate
       | artificial questions and store them)
       | 
       | - Instead of saving just one embedding per text chunk you should
       | store several (text chunk, hypothetical embedding questions, meta
       | data)
       | 
       | - RAG will miserably fail with requests like "summarize the whole
       | document"
       | 
       | - to my knowledge, OpenAI embeddings aren't performing well, use
       | an embedding that is optimized for question answering or
       | information retrieval and supports multi language. SOTA textual
       | embedding models can be found on the MTEB Leaderboard [2]. Also
       | look into instructorEmbeddings
       | 
       | - the LLM used for the Q&A using your context should be fine-
       | tuned for this task. There are several open (source?) LLMs based
       | on openllama and others, that are fine tuned for information
       | retrieval. They hallucinate less and are sticking to the context
       | given.
       | 
       | 1 https://github.com/underlines/awesome-marketing-datascience/...
       | 
       | 2 https://github.com/embeddings-benchmark/mteb
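
The point above about storing several embeddings per text chunk (the chunk itself plus hypothetical questions) can be sketched roughly as follows. This is a minimal illustration under assumptions: `embed` is a toy bag-of-words stand-in rather than a real embedding model, and the chunk texts, questions, and `meta` sources are made up for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Store several vectors per chunk: the text itself plus each hypothetical question."""
    index = []
    for chunk in chunks:
        for key_text in [chunk["text"], *chunk["questions"]]:
            index.append((embed(key_text), chunk))
    return index


def search(index, query, k=1):
    """Rank chunks by their best-matching vector, returning each chunk once."""
    q = embed(query)
    best = {}
    for vec, chunk in index:
        score = cosine(q, vec)
        if score > best.get(chunk["id"], (-1.0, None))[0]:
            best[chunk["id"]] = (score, chunk)
    ranked = sorted(best.values(), key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:k]]


# Made-up chunks; the questions would normally be generated ahead of time.
CHUNKS = [
    {
        "id": "glacier",
        "text": "S3 Glacier storage classes are built for long-term data archiving.",
        "questions": ["What is Glacier?", "How do I archive data cheaply?"],
        "meta": {"source": "hypothetical-docs/glacier"},
    },
    {
        "id": "lambda",
        "text": "Lambda runs code without provisioning servers.",
        "questions": ["What is Lambda?", "How do I run code without servers?"],
        "meta": {"source": "hypothetical-docs/lambda"},
    },
]

INDEX = build_index(CHUNKS)
```

Here a query such as "archive data cheaply" shares only one word with the Glacier chunk text but matches one of its stored questions closely, which is the gap between content vectors and question vectors that the list describes.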
        
         | darkteflon wrote:
         | This comment was very helpful for me, thanks.
         | 
         | I've been working with RAG for months, too, and it's
         | vanishingly rare to see anything but toy examples in the wild.
         | This is a solid, concise list of where the dragons are.
         | 
         | Any idea where all the RAG practitioners hang out and trade war
         | stories? Is there a forum or Discord or something?
        
       | zoomzoom wrote:
       | We did something similar for all the cloud whitepapers from AWS,
       | Azure, GCP, CloudFlare, and CNCF at
       | https://cloudwhitepapers.withcoherence.com/
       | 
       | These are fun projects!
        
         | victor106 wrote:
         | This looks cool.
         | 
         | How does this work?
        
       | jamestimmins wrote:
       | What's the best current way to make a bunch of documents
       | searchable via LLMs like this?
       | 
       | I've tried using OpenAI with embeddings (iirc), but this was
       | slow, got expensive quickly, and it struggled to answer questions
       | about the text accurately. Curious if there are better standard
       | approaches now.
        
         | Jianghong94 wrote:
         | A couple of things come to mind: 1. embedding methods: there
         | are a couple of ways to do that; the most used one is OpenAI's
         | text-davinci-002, although in my use case (short sentence
         | descriptions of an API) it didn't work very well; 2. how you
         | split documents into pieces: for this langchain has some
         | implementations and helpful pointers.
         | 
         | I think you have to do lots of experiments on this until you
         | find your best information retrieval strategy.
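
As a rough idea of what splitting documents into pieces can look like, here is a minimal sketch of fixed-size word windows with overlap. It does not use langchain; the function name `chunk_words` and the default sizes are assumptions chosen for illustration, and real pipelines often split on headings or sentences instead.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into word windows of `size`, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap is one simple way to reduce the chance that a sentence needed for an answer is cut in half at a chunk boundary; the right `size` and `overlap` are the kind of thing that has to be found by experiment.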
        
         | linguistbreaker wrote:
         | I just came across this project which seems to be aiming at
         | streamlining exactly that :
         | 
         | https://github.com/Mintplex-Labs/anything-llm
        
       | sovietmudkipz wrote:
       | This is exactly the kind of software that people should be
       | making. Even if this one is bad, the concept is very sound.
       | 
       | I want to have a specialist AI that is trained to help me learn
       | how to use the software. 100% what should be happening.
       | 
       | General AI should know how to do stuff too but having a
       | specialist AI implies that the company/group of people/person is
       | making sure to tune the model.
       | 
       | Just an IMO.
        
         | scarface_74 wrote:
         | Why?
         | 
         | You can ask the same questions to ChatGPT and get the same or
         | better answers.
         | 
         | I also know from personal experience with ChatGPT, that you can
         | use it to:
         | 
         | - convert Python/boto3 to any language that has an AWS SDK
         | 
         | - convert CloudFormation to Terraform or the CDK
         | 
         | - write scripts that use the SDK
         | 
         | You will get the occasional hallucination.
        
           | politician wrote:
           | Ideally, specialist models are small enough to run locally.
        
           | cle wrote:
           | Here's one example question that ChatGPT utterly fails at,
           | but that this answers fine: "What is Amazon CodeCatalyst?"
           | 
           | ChatGPT: "I'm sorry, but as of my knowledge cut-off in
           | September 2021, there was no service, tool, or product known
           | as Amazon CodeCatalyst offered by Amazon Web Services (AWS).
           | [...]"
        
           | KRAKRISMOTT wrote:
           | ChatGPT is not up to date unless you start using the plugins.
           | This sort of indexing is based on vector databases and
           | various intermediate prompting. If you want to get technical,
           | the academic term is "Retrieval Augmented Generation".
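
The "intermediate prompting" step of Retrieval Augmented Generation can be sketched as follows: passages returned by the vector search are pasted into the prompt together with their source links. This is only an illustration; the function name `build_prompt`, the instruction wording, the passage fields, and the `max_chars` budget are all assumptions, not how any particular site does it.

```python
def build_prompt(question, passages, max_chars=2000):
    """Assemble a grounded prompt from retrieved passages, numbering each source."""
    lines = [
        "Answer using only the numbered sources below.",
        "If they do not contain the answer, say so.",
        "",
    ]
    used = 0
    for number, passage in enumerate(passages, start=1):
        entry = f"[{number}] {passage['text']} (source: {passage['url']})"
        if used + len(entry) > max_chars:
            break  # stay inside the context budget
        lines.append(entry)
        used += len(entry)
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)
```

Because the sources are numbered in the prompt, the model's answer can cite them, which is one plausible way a tool like this ends up showing links next to its responses.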
        
             | fzliu wrote:
             | Hallucination is unfortunately inevitable when it comes to
             | any autoregressive model, even with RAG. You can minimize
             | hallucination by prompting, but you'll still see some
             | factually incorrect responses here and there
             | (https://zilliz.com/blog/ChatGPT-VectorDB-Prompt-as-code).
             | 
             | I unfortunately don't think we'll be able to solve
             | hallucination anytime soon. Maybe with the successor to the
             | transformer architecture?
        
       | CSDude wrote:
       | Well, I'll pass.
       | 
       | Prompt: how to connect a lambda function to an RDS instance and
       | query it safely?
       | 
       | 1. Configure your RDS instance for outbound connections to AWS
       | Lambda. This step involves ensuring that your DB instance is in a
       | public subnet on your VPC and that the instance's
       | "PubliclyAccessible" property is set to true.
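
For anyone who wants to check whether existing instances ended up with the setting quoted above, here is a minimal sketch. It works on a dictionary shaped like the RDS DescribeDBInstances response; the function name and the sample data are made up for illustration.

```python
def publicly_accessible_instances(response):
    """Return identifiers of DB instances flagged PubliclyAccessible in the response."""
    return [
        db["DBInstanceIdentifier"]
        for db in response.get("DBInstances", [])
        if db.get("PubliclyAccessible")
    ]


# Hand-written sample shaped like a DescribeDBInstances response (identifiers are made up).
SAMPLE = {
    "DBInstances": [
        {"DBInstanceIdentifier": "orders-db", "PubliclyAccessible": False},
        {"DBInstanceIdentifier": "scratch-db", "PubliclyAccessible": True},
    ]
}

print(publicly_accessible_instances(SAMPLE))  # ['scratch-db']
```

With boto3 installed, the input would typically come from `boto3.client("rds").describe_db_instances()`; accounts with many instances would also need to follow pagination, which this sketch leaves out.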
        
         | klysm wrote:
         | Lol yeah no thanks. This is one of the things that makes me nervous
         | about using LLMs. If the internet contains shitty solutions for
         | solving a problem, it's going to recommend shitty solutions.
         | Inexperienced folks won't know better.
         | 
         | There's also the thing where that's a positive feedback loop
         | and we overflow the world's corpus with garbage.
        
           | WrtCdEvrydy wrote:
           | > Inexperienced folks won't know better.
           | 
           | I've seen answers like this in Stack Overflow before.
           | Technically correct, but basically broken in every other way:
           | cybersecurity's future is gonna be interesting.
        
             | scarface_74 wrote:
             | One of my pet peeves about almost every answer on Stack
             | Overflow is that the answer always starts with
             | client=boto3.client(<service>, access-key,secret-key)
             | 
             | There is never a reason to include those last two
             | parameters. It's always a security risk no matter how you
             | do it.
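
A minimal sketch of the alternative being hinted at: create the client without passing keys and let boto3 resolve credentials on its own. The wrapper name `make_client` is an assumption for illustration; `boto3.client` and its `region_name` parameter are the real API.

```python
def make_client(service_name, region_name=None):
    """Create a boto3 client without passing access keys explicitly."""
    import boto3  # third-party; imported lazily so this module loads without it

    # No aws_access_key_id / aws_secret_access_key arguments: boto3 looks up
    # credentials itself (environment variables, shared config files, or an
    # attached IAM role).
    return boto3.client(service_name, region_name=region_name)
```

Usage would look like `s3 = make_client("s3", region_name="us-east-1")`. On Lambda or EC2 the credentials then come from the attached role, so nothing secret needs to live in the code.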
        
               | klysm wrote:
               | This is a great example, and I think it's a smaller version
               | of the same problem.
        
         | vasili111 wrote:
         | I find the links to where the AI got its information more
         | useful than the AI's answer.
        
           | Jianghong94 wrote:
           | AWS has some great documentation
        
         | scarface_74 wrote:
         | Asking the same question to ChatGPT:
         | 
         | To connect a Lambda function to an RDS (Relational Database
         | Service) instance and query it safely, you can follow these
         | general steps:
         | 
         | 1. Configure your Lambda function's execution role: Ensure that
         | your Lambda function has the necessary permissions to access
         | the RDS instance. The role should include the
         | `AWSLambdaVPCAccessExecutionRole` and the appropriate RDS
         | permissions.
         | 
         | 2. Place your Lambda function and RDS instance in the same VPC:
         | Ensure that both the Lambda function and the RDS instance are
         | in the same Virtual Private Cloud (VPC) to enable their
         | communication. You can configure the VPC settings in the AWS
         | Management Console.
         | 
         | 3. Configure the security group settings: Set up the security
         | groups for your Lambda function and RDS instance to allow the
         | necessary inbound and outbound connections. The Lambda
         | function's security group should allow outbound access to the
         | RDS instance's security group on the RDS port (usually port
         | 3306 for MySQL or 5432 for PostgreSQL).
         | 
         | 4. Create a subnet group: If your RDS instance is in a private
         | subnet, create a subnet group that includes the appropriate
         | subnets for your Lambda function to access the RDS instance
         | securely.
         | 
         | 5. Connect to the RDS instance from the Lambda function: In
         | your Lambda function code, establish a database connection
         | using the appropriate database client library, such as pymysql
         | for MySQL or psycopg2 for PostgreSQL. Use the connection
         | details (endpoint, username, password, etc.) of your RDS
         | instance.
         | 
         | 6. Execute queries safely: To query the RDS instance safely,
         | ensure that your Lambda function's code incorporates secure
         | coding practices. Use parameterized queries or prepared
         | statements to prevent SQL injection attacks. Avoid storing
         | sensitive information (such as database credentials) in your
         | Lambda function code and instead use environment variables or
         | AWS Secrets Manager for secure storage and retrieval.
         | 
         | Remember to regularly update and patch your Lambda function,
         | RDS instance, and associated dependencies to maintain security
         | and stay protected against potential vulnerabilities.
         | 
         | It's worth noting that the exact implementation details may
         | vary depending on your specific use case, programming language,
         | and AWS setup. Consulting the AWS documentation and resources
         | related to Lambda, RDS, and VPC networking can provide more
         | detailed and up-to-date instructions for your specific
         | requirements.
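
Step 6 above mentions parameterized queries. Here is a minimal sketch of the idea using Python's built-in `sqlite3` as a stand-in for an RDS database; the table, data, and `find_user` helper are made up for illustration. With pymysql or psycopg2 the placeholder is `%s` instead of `?`, but the principle is the same: the value is passed separately from the SQL text.

```python
import sqlite3


def find_user(conn, name):
    """Look up a user with a parameterized query; `name` is never spliced into the SQL."""
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


# In-memory database with made-up rows, standing in for a real RDS instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

print(find_user(conn, "alice"))             # [(1, 'alice')]
print(find_user(conn, "alice' OR '1'='1"))  # []
```

The second call shows the injection attempt being treated as a literal (and nonexistent) name rather than as SQL.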
        
           | 8note wrote:
           | This all looks about right, but how would somebody without
           | aws experience be able to validate it without making a mess?
        
             | scarface_74 wrote:
             | Keep going down the rabbit hole...
             | 
             | "I'm a beginner. Walk me through step $n"
        
       | tomrod wrote:
       | Meh. It doesn't actually pull a valid response. We just upgraded
       | a database that required an updated EC2 instance, and it failed
       | to connect those dots.
        
       | JimtheCoder wrote:
       | Are you allowed to use the AWS logo on a site that is not owned
       | by Amazon?
       | 
       | I originally thought this was an official Amazon website...their
       | lawyers would probably say the same thing...
        
         | nextworddev wrote:
         | Definitely not
        
         | jborden13 wrote:
         | I thought I heard openai was sending cease and desists for *gpt
         | domain names as well
        
         | QuinnyPig wrote:
         | "Allowed" is a funny thing.
         | 
         | I launched "Last Week in AWS" with AWS in the domain name seven
         | years ago. AWS has never made an issue of it, though they
         | obviously have that option.
         | 
         | I also have the option (and ownership) to migrate to "Last Week
         | in the Cloud" and talk about their competitors, so it's likely
         | everyone is happier this way--but I confess to not kicking the
         | bear hard enough to find out.
        
           | scarface_74 wrote:
           | It's probably because no one at AWS has heard of your little
           | podcast or website /s
           | 
           | I'm sure you know that your name is brought up frequently
           | inside AWS.
        
       | jdlyga wrote:
       | I really love this concept. While I do get better results from
       | GPT-4 for AWS questions right now, AI as the "interpreter" for
       | documentation works really well.
        
       | JimmyRuska wrote:
       | I wonder if people will make DSLs specifically for LLMs.
       | 
       | For example the terseness / symbols of APL, Perl, or even set
       | notation.
       | 
       | LLMs could train and output the shorter symbolic notation, and it
       | could be expanded for human readability by another program at
       | export.
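
As a toy illustration of the "expanded by another program" step, here is a sketch that rewrites terse set notation into words. The symbol table and the `expand` function are hypothetical; nothing here reflects an existing DSL for LLMs.

```python
import re

# Hypothetical shorthand-to-English table; the symbols are ordinary set notation.
EXPANSIONS = {
    "∪": "union",
    "∩": "intersect",
    "∈": "in",
    "∉": "not in",
    "⊆": "subset of",
}

_PATTERN = re.compile("|".join(re.escape(symbol) for symbol in EXPANSIONS))


def expand(terse):
    """Rewrite terse set notation into a wordier form for human readers."""
    spaced = _PATTERN.sub(lambda m: f" {EXPANSIONS[m.group(0)]} ", terse)
    return " ".join(spaced.split())  # collapse the extra spaces added around words
```

For example, `expand("x∈A∩B")` gives `"x in A intersect B"`. The model would only ever see and emit the short form, and the expansion would run at export time.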
        
       | yayitswei wrote:
       | Nice work! Would be even more useful to be able to have a
       | conversation with it.
        
         | alsima wrote:
         | [dead]
        
       | stan_kirdey wrote:
       | I am building something similar, it has documentation from azure,
       | aws, and lots of slack/discord threads of software projects are
       | also searchable, check it out https://www.kwq.ai
       | 
       | it even gets real time indexing from slack of aws deep java
       | library, and from discord of deepset haystack project
        
       | social_quotient wrote:
       | It seems to know more than just AWS. I was thinking it was just
       | embeddings, but then I asked it "Is AWS better than Azure?" and
       | it gave an answer which didn't seem to be derived from the
       | source documents.
        
       | ghomem wrote:
       | Simple AWS snapshot:
       | 
       | https://imgur.com/a/IGu1syf
        
       | scarface_74 wrote:
       | I hate to be that guy. But what's the purpose of this? What does
       | this do that I can't just do with ChatGPT?
        
       ___________________________________________________________________
       (page generated 2023-07-06 23:00 UTC)