I've got a huge monster of a database (okay, that's not quite true, but there are over 8 million records in one product table). This table is fed by 13 suppliers. Even with the best indexing I could come up with, searching for the top 10,000 records that are ready for supplier 8 is crazy slow. What I'd like to do is create a product table for each supplier and split the big table into smaller tables.

Now in C++ or what have you, I'd just switch the table that I'm working with inside the class.

In Ruby, it seems I'll have to create a new class for each table, and do a migration. Also, as I plan to have some in-session (#) tables, I'd be interested in getting Ruby to work with them.


Oh.. 8 million and set to grow to 20 million in the next 6 months.

One question posed was what my DB engine is. Right now it's SQL Server, but I'm open to moving my DB to another engine if it means I can use temp tables and "partitioned" tables.


One additional point to indexing.. Indexing on fields that change frequently isn't practical. Like price and quantity.. I'd have to re-index the changed items, each time I made a change.

  • Could you post the SQL query you're using? 8 million is a lot, but with proper indexing and an optimized query, response times should still be reasonable. Commented Nov 30, 2011 at 6:22
  • As to the DB: we're migrating to whatever. Currently it's SQL Server, but we'll likely go with the DB Ruby "comes with". Commented Nov 30, 2011 at 21:21
  • select top(10000) * from tableX where supplier=8 and status=1 order by priorityID, batchID, marketID, sku desc Commented Nov 30, 2011 at 21:24
  • That's sort of what it looks like.. It's the order by that kills us. But both the comments miss the point. I want to dynamically tell the collection of records to be sourced from a table that didn't exist when I defined the class. The structure of the new table is known, and unchanging.. just the table name will change. Commented Nov 30, 2011 at 21:39
  • def self.table_name; "table_name"; end (try that in your model). Commented Nov 30, 2011 at 21:58

1 Answer

By Ruby, I am assuming you mean inheriting from the ActiveRecord::Base class in a Ruby on Rails application. By convention, you are correct in that each class is meant to represent a separate table.

You can easily execute arbitrary SQL using the "ActiveRecord::Base.connection.execute" method, passing a string that is your SQL query. This bypasses having to create separate Ruby classes to represent transient tables. This is not the "Rails approach"; however, it does address your question of switching tables inside a class file.
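As a rough sketch of that idea, the query string can be built at runtime against a table chosen per supplier. The table naming scheme (products_supplier_N) and both helper names below are my own assumptions, not anything in your schema; the column names and the TOP(...) syntax come from the SQL Server query you posted in the comments. The table name cannot be passed as a bind parameter, so the sketch only accepts supplier ids from a known range.

```ruby
# Hypothetical naming scheme: one table per supplier, e.g. products_supplier_8.
# The question mentions 13 suppliers, so only ids 1..13 are accepted here.
SUPPLIER_IDS = (1..13).freeze

# Returns the (assumed) table name for a supplier, rejecting anything that
# is not a known integer id so arbitrary text never reaches the SQL string.
def supplier_table(supplier_id)
  id = Integer(supplier_id)
  raise ArgumentError, "unknown supplier: #{id}" unless SUPPLIER_IDS.cover?(id)
  "products_supplier_#{id}"
end

# Builds the "top N ready records" query against the per-supplier table.
# No supplier filter is needed because the table itself is per supplier.
def ready_products_sql(supplier_id, limit: 10_000)
  table = supplier_table(supplier_id)
  "SELECT TOP(#{Integer(limit)}) * FROM #{table} WHERE status = 1 " \
    "ORDER BY priorityID, batchID, marketID, sku DESC"
end

puts ready_products_sql(8)
```

Inside a Rails application, the resulting string would then be handed to ActiveRecord::Base.connection.execute(ready_products_sql(8)). What comes back depends on the database adapter, so treat the result handling as something to check against your adapter.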

More information on ActiveRecord database statements can be found here: http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/DatabaseStatements.html

However, as other people have pointed out, you should be able to optimize your query such that splitting across multiple tables is not necessary. You may want to analyze your SQL query's execution plan to see where the time goes. If you are using MySQL, check out their query execution planning functionality: http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html

By introducing indexes, changing join methods between tables, etc., you should be able to reduce your query execution time.

4 Comments

I know I mentioned that I was new to Ruby/Rails. I am by no means new to SQL. I've optimized the snot out of my queries with every tool I could find. The problem is one of size: when I started this thread I had 8 million records. Now I've got 10. I expect to crack 20 million in the next 3 to 6 months. I've normalized and de-normalized, changed my table structure, and the only thing that has worked is a table split. Perfectly fine for C++. Ruby has been thrust upon me, and now I hurt for answers to questions I've already answered.
Question: Would that connection be for the current object and all the iterations across its "collection"? Or would that be like the static issue I've mentioned before? Note: this product will be multi-tenanted, so across the class isn't practical.
This connection is to the database you have defined in your config/database.yml file for a given environment. If you are planning on having multiple tenants, you can use schemas, or scope through a tenant relationship, such as account_id. Read this for more information on making your Rails app compatible with various tenant methods: blog.jerodsanto.net/2011/07/…
Not really on topic mate. I'm not talking about multi-tenant. I was talking about dynamic tables.. In several db's you can have tables that only exist for a small amount of time. Or tables that are really just a partitioned table. Nothing to do with multi-tenant or db connection... Nice of you to try though.