added 214 characters in body
elliot42

My service generates a large, ongoing stream of user events, and we would like to answer questions like "count occurrences of event type T since date D."

We are trying to make two basic decisions:

  1. What to store? Storing every event vs. only storing aggregates

    • (Event log style) log every event and count them later, vs.
    • (Time-series style) store a single aggregated "count of event type T for date D" for every day
  2. Where to store the data

    • In a relational database (particularly MySQL)
    • In a non-relational (NoSQL) database
    • In flat log files (collected centrally over the network via syslog-ng)
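To make the first decision concrete, here is a minimal sketch of the two styles answering the same "count of type T since date D" question. The sample events and the function names are hypothetical, and everything is held in memory rather than in any particular store.

```python
from collections import Counter
from datetime import date

# Hypothetical sample events: (day, event_type) pairs.
events = [
    (date(2012, 7, 1), "login"),
    (date(2012, 7, 1), "signup"),
    (date(2012, 7, 2), "login"),
    (date(2012, 7, 3), "login"),
]

def count_from_event_log(log, event_type, since):
    """Event log style: keep every event, filter and count at query time."""
    return sum(1 for day, etype in log if etype == event_type and day >= since)

def build_daily_aggregates(log):
    """Time-series style: keep only one counter per (day, event_type)."""
    return Counter(log)

def count_from_aggregates(daily, event_type, since):
    """Sum the stored daily counters for one event type from `since` onward."""
    return sum(n for (day, etype), n in daily.items()
               if etype == event_type and day >= since)

daily = build_daily_aggregates(events)
print(count_from_event_log(events, "login", date(2012, 7, 2)))
print(count_from_aggregates(daily, "login", date(2012, 7, 2)))
```

Both calls give the same answer here. The difference is that the aggregate form can no longer answer questions that need per-event detail, while the event log form has to rescan the raw events on every query.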

What is standard practice / where can I read more about comparing the different types of systems?


Additional details:

  • The total event stream is large, potentially hundreds of thousands of entries per day
  • But our current need is only to count certain types of events within it
  • We don't necessarily need real-time access to the raw data or aggregation results

IMHO, "log all events to files, crawl them at a later time to filter and aggregate the stream" is a pretty standard UNIX Way, but my Rails-y compatriots seem to think that nothing is real unless it's in MySQL.
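As a rough sketch of what I mean by crawling the logs later: the line format below (an ISO timestamp, then the event type, then free-form fields) is an assumption for illustration, not what syslog-ng emits by default, and `count_events` is a hypothetical helper.

```python
import io
from datetime import date, datetime

def count_events(lines, event_type, since):
    """Count lines whose event type matches and whose date is on or after `since`."""
    count = 0
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or too-short lines
        try:
            day = datetime.strptime(parts[0], "%Y-%m-%dT%H:%M:%S").date()
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if parts[1] == event_type and day >= since:
            count += 1
    return count

# Assumed log format: "<timestamp> <event_type> <details>"
sample_log = io.StringIO(
    "2012-07-01T09:15:00 login user=1\n"
    "2012-07-02T10:00:00 signup user=2\n"
    "2012-07-02T11:30:00 login user=2\n"
    "not a valid line\n"
    "2012-07-03T08:45:00 login user=3\n"
)
print(count_events(sample_log, "login", date(2012, 7, 2)))
```

In practice the same function could be fed an open file handle per log file, since it only needs an iterable of lines.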

Data architecture for event log metrics?
