> If you’re interested in being safer, it’s worth learning the security features built in to your database.
The problem isn't that there's no way to be safe, the problem is that OpenAI's documentation does not do anything to discourage developers from implementing this in the most dangerous way possible. Like you suggest, the most common way this will be implemented is via a db user with full access to do anything.
Developers would be far more likely to implement this safely if they were discouraged from using direct SQL queries. Developers who know how to safely expose SQL queries will still know how to do that -- but developers who are copying and pasting code, or naively thinking "can't I just feed my schema into GPT?", should be pushed toward an implementation that's harder to mess up.
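To make the "harder to mess up" idea concrete, here's a minimal sketch of the wrapper approach. All table, column, and function names are hypothetical; the point is that the app exposes a few narrow, parameterized functions rather than letting generated text become SQL:

```python
import sqlite3

# Hypothetical schema for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER, user_id INTEGER, body TEXT)")
conn.execute("INSERT INTO posts VALUES (1, 1, 'hello'), (2, 2, 'other')")

def get_posts_for_user(conn, user_id):
    # The query shape is fixed by the developer, and the caller's input
    # is bound as a parameter -- it never gets interpolated into SQL text.
    return conn.execute(
        "SELECT id, body FROM posts WHERE user_id = ?", (user_id,)
    ).fetchall()

get_posts_for_user(conn, 1)  # only user 1's rows can come back
```

A caller (or a model) passing something like `"1 OR 1=1"` just gets an empty result, because the value is compared as data, not executed as SQL.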
It's hard for me to believe OpenAI's documentation will have any effect on developers who write or copy-and-paste data-access code without regard for security, no matter what it says.
If you provide an API or other external access to app data, and the app data contains anything not everyone should be able to access freely, then your API has to implement some kind of access control. It really doesn't matter if your API is SQL-based, REST-based, or whatever.
A SQL-based API isn't inherently less secure than a non-SQL-based one if you implement access control, and a non-SQL-based API isn't inherently more secure than a SQL-based one if you don't implement access control. The SQL-ness of an API doesn't change the security picture.
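To sketch that point: the access-control decision looks the same no matter how the API is shaped. Schema and function names here are hypothetical:

```python
import sqlite3

# Hypothetical schema for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER, owner_id INTEGER, body TEXT)")
conn.execute("INSERT INTO posts VALUES (1, 1, 'mine'), (2, 2, 'theirs')")

def get_post(conn, requesting_user_id, post_id):
    row = conn.execute(
        "SELECT owner_id, body FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    # The actual access control: whether this sits behind a REST route
    # or a SQL-generating layer, some code has to make this check.
    if row is None or row[0] != requesting_user_id:
        raise PermissionError("post not found or not yours")
    return row[1]
```

If nothing in the stack makes that check, a REST endpoint leaks exactly as badly as a raw SQL one.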
> If you provide an API or other external access to app data, and the app data contains anything not everyone should be able to access freely, then your API has to implement some kind of access control. It really doesn't matter if your API is SQL-based, REST-based, or whatever.
I don't think that's the way developers are going to interact with GPT at all; I don't think they're looking at this as external access. OpenAI's documentation makes it feel like a system library or dependency, even though it's clearly not.
I'll go out on a limb: I suspect a pretty sizable chunk (if not an outright majority) of the devs who try to build on this won't be thinking about the fact that they need access controls at all.
> A SQL-based API isn't inherently less secure than a non-SQL-based one if you implement access control, and a non-SQL-based API isn't inherently more secure than a SQL-based one if you don't implement access control. The SQL-ness of an API doesn't change the security picture.
I'm not sure I agree with this either. If I see a dev exposing direct query access to a database, my reaction is going to depend heavily on whether I think they're already an experienced programmer. If I know them well enough to trust them, fine. Otherwise, my assumption is that they're probably doing something dangerous. I think the access controls built into SQL are a lot easier to foot-gun, so I generally advise devs to build wrappers, which I think are harder to mess up. Opinion me :shrug:
Regardless, I do think the way OpenAI talks about this matters: their documentation will influence how developers use the product. So if they're going to talk about SQL, they should be showing in-code examples of how to implement those access controls. "We're just providing the API; if developers mess it up, it's their fault" -- I don't know, good APIs and good documentation should, when possible, provide a "pit of success"[0] for naive developers. That matters in particular in a market segment that's getting a lot of naive VC money thrown at it, sometimes without a lot of diligence, and where those security risks may end up impacting regular people.
I know it’s pretty common to have apps connect to a database with a db user with full access to do anything, but that’s definitely not the only way.
If you’re interested in being safer, it’s worth learning the security features built in to your database.
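As one example of those built-in features: most databases let you restrict what a connection is even allowed to do, before any application code runs. In Postgres that's roles and `GRANT`; in SQLite (used here because it's in the Python stdlib) it's an authorizer callback. The schema below is hypothetical:

```python
import sqlite3

# Hypothetical schema: a sensitive column alongside ordinary app data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, email TEXT, password_hash TEXT);
    CREATE TABLE posts (id INTEGER, user_id INTEGER, body TEXT);
    INSERT INTO users VALUES (1, 'a@example.com', 'secret');
    INSERT INTO posts VALUES (1, 1, 'hello');
""")

def authorizer(action, arg1, arg2, dbname, source):
    # Allow SELECT statements and column reads -- except the
    # sensitive column. Deny everything else (writes, DDL, etc.).
    if action == sqlite3.SQLITE_SELECT:
        return sqlite3.SQLITE_OK
    if action == sqlite3.SQLITE_READ:
        if (arg1, arg2) == ("users", "password_hash"):
            return sqlite3.SQLITE_DENY
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY

conn.set_authorizer(authorizer)

# Reads of allowed data still work:
rows = conn.execute("SELECT body FROM posts").fetchall()

# But "SELECT password_hash FROM users" -- or any INSERT, UPDATE,
# DELETE, or DROP -- now raises sqlite3.DatabaseError before it runs.
```

With a restriction like this in place at the database layer, even sloppy or generated SQL can't write data or read the protected column.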