
I have a PHP script that lets you upload very large files (up to 500 MB), with the file's content stored in a MySQL database. Currently I do something like this:

mysql_query("INSERT INTO table VALUES('')");

$uploadedfile = fopen($_FILES['file']['tmp_name'], 'rb');
while (!feof($uploadedfile)) {
    $line = mysql_escape_string(fgets($uploadedfile, 4096));
    mysql_query("UPDATE table SET file = CONCAT(file, '$line') WHERE something = something");
}
fclose($uploadedfile);

This of course does a bloody lot of SQL queries.

I did that rather than something like

$file = mysql_real_escape_string(file_get_contents($_FILES['file']['tmp_name']));
mysql_query("INSERT INTO table VALUES('$file')");

because that would use up however much memory the file takes, and it seemed better to do more SQL queries than to use 500 MB of memory.
However, there must be a better way. Should I go ahead and do it the file_get_contents way, or is there a better way than CONCAT, or is the way I'm doing it now the lesser of all evils?

2 Comments
  • I find it simultaneously interesting and frustrating that whenever a question is asked on Stack Overflow, much more is said about how someone should do something than about how to do what they're asking. Answerers should present alternatives, and if they are rejected, try to answer the question asked. It doesn't matter why I have to do it this way. I came to Stack Overflow to get an answer to a question, but mostly I just get comments about how I shouldn't do it this way. Commented Oct 10, 2010 at 17:02
  • If you say, "I want to drive my car down the freeway with just rims, no tires. Any tips?" the answer most people will give you is, "Don't." If you ask how to do a dumb thing with code, most devs who know their stuff will answer likewise. It's a kindness, not an insult. Commented Apr 14, 2015 at 16:02

6 Answers


I always store my files on the server, and store their location in the database.
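A minimal sketch of that approach, using the same old mysql_* API as the question (the files table and the upload directory here are invented for illustration):

// Move the upload out of PHP's temp area to permanent storage
$name = basename($_FILES['file']['name']);
$dest = '/var/uploads/' . uniqid() . '_' . $name;   // hypothetical upload directory

if (move_uploaded_file($_FILES['file']['tmp_name'], $dest)) {
    // Store only the location (plus any metadata), never the content
    mysql_query("INSERT INTO files (name, path) VALUES ('"
        . mysql_real_escape_string($name) . "', '"
        . mysql_real_escape_string($dest) . "')");
}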


4 Comments

I chose to store files in an SQL database because that option doesn't work for me. I wouldn't go to all this trouble if I could just save them on disk.
@nickolas Could you explain why? My first thought was that file systems are good for files...
Good answer. Storing the whole file in the table is like parking your car in your pocket: better to park the car in the garage and put the key in your pocket. Same with files: better to store the file in a folder and its location in the table.
A lot of times, storing files on the file system becomes an issue because one may have load-balanced servers. A file stored at \files\1.jpg may exist on one server and not on the others. That brings us down to syncing files, etc., or having a dedicated file server!

You are right: in some cases a filesystem cannot do the job, because databases have features such as locking, replication, integrity constraints, no limitation on the number of rows, and so on, which do not exist in a filesystem.

Also, backing up, restoring, or migrating the system becomes more complicated and cannot be done safely on a running server (risk of inconsistency and data loss). Or at least guaranteeing this is very difficult in a DB+FS configuration.

What about migrating from an OS that uses "/" as the path separator to one that uses "\"? You would need to update all your paths.

Your method seems to be correct, but the 4096-byte slicing is way too small. MySQL will have no trouble working with 256 KB slices, for instance.

Also, I would not concatenate, but rather store each slice as its own record. The database may have trouble storing huge files in a single record, and this may hit the limitations mentioned in other answers.

Keeping the data sliced allows streaming the content without ever holding the whole file in memory. This way, there is virtually no limit to the stored file size.
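A rough sketch of that idea, assuming a hypothetical file_chunks table with (file_id, seq, data) columns:

// Read the upload in 256 KB slices and store each slice as its own row
$fileId = 123;   // hypothetical id of the parent file record
$fh  = fopen($_FILES['file']['tmp_name'], 'rb');
$seq = 0;
while (!feof($fh)) {
    $chunk = fread($fh, 262144);              // 256 KB per slice
    if ($chunk === false || $chunk === '') break;
    $data = mysql_real_escape_string($chunk);
    mysql_query("INSERT INTO file_chunks (file_id, seq, data)
                 VALUES ($fileId, $seq, '$data')");
    $seq++;
}
fclose($fh);

To stream the file back out, SELECT the rows ordered by seq and echo each chunk as it arrives, so the whole file never sits in memory in that direction either.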



This wouldn't actually work (by default) with MySQL, because it would produce one 500 MB query:

$file = mysql_real_escape_string(file_get_contents($_FILES['file']['tmp_name']));
mysql_query("INSERT INTO table VALUES('$file')");

The max_allowed_packet setting is set to 16777216, so you would either need to increase it or split the file into chunks smaller than 16 MB (minus roughly 500-1000 bytes of headroom for the query string itself).

You can find out the max_allowed_packet of your MySQL server by running:

SELECT @@global.max_allowed_packet
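If you go the chunked route, you could even ask the server for its limit first and size the chunks to fit; a sketch:

// Ask the server for its packet limit and leave headroom for the query text
$res = mysql_query("SELECT @@global.max_allowed_packet");
$row = mysql_fetch_row($res);
$maxPacket = (int)$row[0];

// Escaping can nearly double the payload in the worst case, so halve the
// limit and keep ~1 KB of headroom for the rest of the query string
$chunkSize = (int)(($maxPacket - 1024) / 2);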

1 Comment

If you have a large set of images, it will slow down performance.

I have yet to see an application that actually needs to store files in a relational database.

There are a significant number of freely available, powerful databases out there that are designed and optimized specifically for storing and retrieving files. They're called filesystems.

Store your files in your filesystem, and your metadata in the RDBMS.

You're worried about using up 500 MB of memory while inserting, but you're eventually going to want to get those files back out of the database, and I don't think you'll find a way to read the file data back out in chunks.

2 Comments

It's data. It goes in the database. Instead, you advocate storing some of the data in one system and the rest in another. You are going to store data in two parallel systems. So, you have the potential for these two systems to get out of sync. You have dual points of failure. You also have the potential for exploitation since your application must necessarily be able to write to the file system. So, there is more potential for writing an executable that somehow gets executed.
@Charles - Those are all valid concerns. However, in my experience, trying to store any significant amount of blob data in the RDBMS inevitably leads to pain. Filesystems are easy to partition, easy to replicate/back up (rsync), and performant for the use case. These days, you can use something like S3, and have someone else worry about resiliency and availability. Put another way, there's a reason Amazon didn't implement S3 with an RDBMS on the back-end for blob storage.

You can use MySQL's LOAD_FILE() function:

http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_load-file

The manual page also gives a straightforward query example.
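Note that LOAD_FILE() reads a file sitting on the database server's own filesystem (it requires the FILE privilege, and the file must be smaller than max_allowed_packet), so something along these lines only works when MySQL runs on the same machine that received the upload (the path here is a placeholder):

// The path is resolved on the MySQL server, not by PHP
mysql_query("UPDATE table SET file = LOAD_FILE('/tmp/uploaded_file')
             WHERE something = something");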



I would imagine that the most effective way to do this would be to do all the validation in the script up to the point of the insert, then shell out and pipe the uploaded $_FILES temp file into a MySQL command-line insert query. You'd want someone better at bash than me to validate that, but it seems it would pretty much remove the memory issue?
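A very rough sketch of that idea, assuming the database runs on the same host (the credentials, database, and table names are placeholders, and this is untested):

// Hand the work to the mysql command-line client so PHP never holds the
// file contents in memory; LOAD_FILE() makes the server read the temp
// file straight from disk
$tmp = $_FILES['file']['tmp_name'];
$sql = "INSERT INTO table (file) VALUES (LOAD_FILE('" . addslashes($tmp) . "'))";
shell_exec('mysql -u user -ppass dbname -e ' . escapeshellarg($sql));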

