DEV Community

Dhruv B Shetty
Dhruv B Shetty

Posted on

From MySQL to PostgreSQL: Enforcing Row-Level Security

Recently, I was looking to add a generative AI feature to one of my older applications and then I came upon a glaring security risk from doing so, based on the current software architecture I had.

I wanted to feed the data from my MySQL database to the LLM, so that users could ask questions like “How many futuristic, dystopian movies have I watched?” so that it could then search their list of finished movies.

However, a user may also ask about what other users watched or aggregations of what others watched but they shouldn’t be able to access this data.

The LLM has access to all the users data and MySQL doesn’t support native row level security i.e. users only having access to their own data which means having to maintain intelligent prompts and abstractions to handle this.

In my case I think it was pretty clear that I should migrate from MySQL to PostgreSQL because of the above, complex queries, data types and low difficulty of migration for my own project.

Here is a solution that I came up with:

Choosing PostgreSQL as my database, I first create a new user with only the required permissions

moviedb=# CREATE USER llmuser WITH PASSWORD 'yourpassword';
GRANT USAGE ON SCHEMA public TO llmuser;
GRANT SELECT ON USER_MOVIES TO llmuser;
Enter fullscreen mode Exit fullscreen mode

Now let's have a look at the data

Description

We want llmuser to access data only for whichever user makes the request from the frontend. To set this, we can get the id from the app and then set it with a PostgreSQL custom config parameter.

For this we need to enable row level security and create a policy on user_id

moviedb=# ALTER TABLE user_movies ENABLE ROW LEVEL SECURITY;
moviedb=# CREATE POLICY USER_DATA ON user_movies USING (user_id = current_setting('my.user_id')::int);

Enter fullscreen mode Exit fullscreen mode

Finally when a user from the frontend makes a request to the Gen AI feature, before fetching any data to feed to the LLM we first set my.user_id

moviedb=# SET my.user_id=2
Enter fullscreen mode Exit fullscreen mode

Now finally when we retrieve data from the table with a select statement we only get data for user_id =2

moviedb=> select * from user_movies;
 user_id | movie_id | watched | done
---------+----------+---------+------
       2 |      101 | f       | f
       2 |      103 | t       | t
(2 rows)
Enter fullscreen mode Exit fullscreen mode

This is just one approach I came up with, and I’d love to hear your thoughts or any alternative solutions you might suggest to tackle this issue. Feel free to share your ideas in the comments!

Top comments (0)