Timeline for Database vs Flat files for rarely accessed data
Current License: CC BY-SA 3.0
15 events
| when | what | by | license | comment | |
|---|---|---|---|---|---|
| Nov 4, 2020 at 14:52 | comment | added | Stack Exchange Broke The Law | Sanity check: you have an index on data_timestamp, right? | |
| Nov 21, 2017 at 23:28 | comment | added | Frank Hileman | Having written similar tools in the past, I would never use a database to store such data. Files are the way to go. | |
| Nov 21, 2017 at 22:37 | answer | added | Vince | timeline score: 1 | |
| Nov 21, 2017 at 10:28 | history | tweeted | twitter.com/StackSoftEng/status/932918538318336001 | ||
| Nov 20, 2017 at 18:55 | comment | added | kdgregory | MySQL provides partitioning for just this type of use-case. While there are definitely cases where it makes sense to split the data into different repositories, I suspect this isn't one of them. | |
| Nov 19, 2017 at 15:17 | comment | added | Doc Brown | Before guessing what will happen when you reach 2 billion rows, I would recommend running a test: generate that much data artificially and profile it. I would not be astonished if the test shows no notable performance degradation for those queries. However, you should also keep backup/recovery times in mind; those will definitely increase linearly with the database size. | |
| Nov 19, 2017 at 14:45 | answer | added | Ralf Kleberhoff | timeline score: 4 | |
| Nov 19, 2017 at 13:59 | answer | added | Christophe | timeline score: 8 | |
| Nov 19, 2017 at 11:11 | comment | added | Patrick | Have you tried using a time-series database for your time-series data? E.g. blog.timescale.com/timescaledb-vs-6a696248104e | |
| Nov 19, 2017 at 9:34 | answer | added | Martin Maat | timeline score: 1 | |
| Nov 19, 2017 at 9:31 | comment | added | GrandmasterB | Database size shouldn't impact your query performance if the tables are indexed properly. | |
| Nov 19, 2017 at 9:09 | answer | added | Martin Maat | timeline score: 2 | |
| Nov 19, 2017 at 7:07 | comment | added | Phil Helix | I wouldn't store it all in the same database, let alone the same table. Could you set up two database catalogues with identical structures, one for querying current data and another for historical data? You could have a job that moves current data into historical data after 30 days or so. This should improve query performance for your most queried data. | |
| Nov 19, 2017 at 5:13 | review | First posts | |||
| Nov 20, 2017 at 9:24 | |||||
| Nov 19, 2017 at 5:11 | history | asked | Ananth | CC BY-SA 3.0 | |