nested_select allows the partial selection of the relations attributes during preloading process, leading to less RAM and CPU usage. Here is a benchmark output for a gist I've created to run real-life example: displaying a course with its structure.
Given:
- Models are: Course, Topic, Lesson.
- Their relations has a following structure: course has_many topics, each topic has_many lessons.
- To display a single course you need its structure, minimum data needed: topic and lessons titles and ordering.
Single course, a real example against production data and a real flow (~ x33 times less RAM):
irb(main):216:0>compare_nested_select(ids, 1, silence_ar_logger_for_memory_profiling: false)
------- CPU comparison, for root_collection_size: 1 ----
user system total real
nested_select 0.096008 0.002876 0.098884 ( 0.466985)
simple includes 0.209188 0.058340 0.267528 ( 0.903893)
----------------- Memory comparison, for root_collection_size: 1 ---------
# partial selection
D, [2025-01-12T19:08:36.163282 #503] DEBUG -- : Topic Load (4.1ms) SELECT "topics"."id", "topics"."position", "topics"."title", "topics"."course_id" FROM "topics" WHERE "topics"."deleted_at" IS NULL AND "topics"."course_id" = $1 [["course_id", 1624]]
D, [2025-01-12T19:08:36.168803 #503] DEBUG -- : Lesson Load (3.9ms) SELECT "lessons"."id", "lessons"."title", "lessons"."topic_id", "lessons"."position", "lessons"."topic_id" FROM "lessons" WHERE "lessons"."deleted_at" IS NULL AND "lessons"."topic_id" = $1 [["topic_id", 7297]]
# selects in full
D, [2025-01-12T19:08:37.220379 #503] DEBUG -- : Topic Load (4.2ms) SELECT "topics"."id", "topics"."position", "topics"."title", "topics"."course_id" FROM "topics" WHERE "topics"."deleted_at" IS NULL AND "topics"."course_id" = $1 [["course_id", 1624]]
D, [2025-01-12T19:08:37.247484 #503] DEBUG -- : Lesson Load (25.7ms) SELECT "lessons".* FROM "lessons" WHERE "lessons"."deleted_at" IS NULL AND "lessons"."topic_id" = $1 [["topic_id", 7297]]
------ Nested Select memory consumption for root_collection_size: 1 ------
Total allocated: 80.84 kB (972 objects)
Total retained: 34.67 kB (288 objects)
------ Full preloading memory consumption for root_collection_size: 1 ----
Total allocated: 1.21 MB (1105 objects)
Total retained: 1.16 MB (432 objects)
RAM ratio improvements x33.54678126442086 on retain objects
RAM ratio improvements x15.002820281285949 on total_allocated objects
100 courses, this is kinda a synthetic example since there is no UI for multiple courses display together with their structures. It executed against the real production data. (nested select serves x7 faster):
irb(main):280:0> compare_nested_select(ids, 100)
------- CPU comparison, for root_collection_size: 100 ----
user system total real
nested_select 1.571095 0.021778 1.592873 ( 2.263369)
simple includes 5.374909 1.704284 7.079193 ( 15.488579)
----------------- Memory comparison, for root_collection_size: 100 ---------
------ Nested Select memory consumption for root_collection_size: 100 ------
Total allocated: 2.79 MB (30702 objects)
Total retained: 2.05 MB (16431 objects)
------ Full preloading memory consumption for root_collection_size: 100 ----
Total allocated: 33.05 MB (38332 objects)
Total retained: 32.00 MB (24057 objects)
RAM ratio improvements x15.57707431190517 on retain objects
RAM ratio improvements x11.836000856510193 on total_allocated objects
Summary: if you have CPU/RAM bottlenecks, heavy relations instantiation for heavy views or reports generation, and you want it to be less demanding in RAM and CPU -- you should try nested_select.
Install the gem and add to the application's Gemfile by executing:
$ bundle add nested_select
If bundler is not being used to manage dependencies, install the gem by executing:
$ gem install nested_select
Assume you have a relation users <- profile, and you want to preview users in a paginated feed, and you need only :photo_url attribute of a profile, with nested_select you can do it like this:
class User
has_one :profile
end
class Profile
belongs_to :user
end
# this will preload profile with exact attributes:
# :id -- since its a primary key,
# :user_id -- since its a foreign_key
# and the :photo_url as requested
User.includes(:profile).select(profile: :photo_url).limit(10)
Whenever you are using through
relations between models and running preload,
then rails will fully load all intermediate objects under the hood!
That is definitely wastes lots of RAM, CPU including those on the DB side.
With nested_select you can apply selections to through
relations.
Ex:
class User
has_one :user_profile, inverse_of: :user
has_many :avatars, through: :user_profile, inverse_of: :user
end
# pay attention user_profile relation, wasn't included explicitly,
# but still rails needed them to be preloaded to be able to match and preload avatars
user = User.includes(:avatars)
.select(avatars: [:img_url, { user_profile: [:zip_code] }]).first
# user - loaded fully
# avatars - foreign and primary keys needed to establish relations + img_url
# user_profile - foreign and primary keys + zip_code
REM: Through preloading happens in reverse, so to nest their selection you must start from the latest, in this case avatar, and go to the previous ones in this case its a user_profile
If you want intermediate models to be completely skinny, you should select like this:
class User
has_one :user_profile, inverse_of: :user
has_many :avatars, through: :user_profile, inverse_of: :user
has_many :through_avatar_images, through: :avatars, class_name: :Image, source: :images
end
# only through_avatar_images is matter here, and we want everything else to be as small as possible
user = User.includes(:through_avatar_images)
.select(through_avatar_images: [avatars: [:id, user_profile: [:id]]]).first
# through_avatar_images -- loaded in full
# avatars, user_profile -- only relations columns id, user_profile_id e.t.c
REM There was an idea for through relations use a skinny approach: no nested attributes means, only relation keys should be loaded:
user = User.includes(:through_avatar_images)
.select(through_avatar_images: [avatars: :user_profile]).first
but that could be easily confused with normal flow behaviour, so I stick to basic default: no nested attributes, means default behaviour, i.e. all attributes.
How safe is the partial model loading? Earlier version of rails and activerecord would return nil in the case, when attribute wasn't selected from a DB, but rails 6 started to raise a ActiveModel::MissingAttributeError. So the major problem is already solved -- your code will not operate based on falsy blank values, it will raise an exception.
But if you are working with attributes directly ( which you should not btw ), you will see nil, without any exception. Using as_json on such models will also deliver json without exception and without skipped attributes.
If you are doing some strange or narrow cases whenever you preloading same objects via different preloading branches, including the most common case through relations, which rails preloads in full, then you must be very accurate with nested selection, cause rails loads and attach associations only once, if it was partial than you might get yourself into trouble. BUT nested_select will check and raise an exception if you are trying to re-instantiate with a different set of attributes. Ex:
ActiveModel::MissingAttributeError: Reflection 'avatars' already loaded with a different set of basic attributes.
expected: ["img_url", "user_profile_id", "id"], already loaded with: ["created_at", "user_profile_id", "id"]
Hint: ensure that you are using same set of attributes for entrance of same relation
on nesting selection tree including reverse through relations
Rails preloading happens from loaded records to their reflections step by step. That's makes it pretty easy to include foreign keys for has_* relations, and very hard for belongs_to, to work this out you need to analyze includes based on the already loaded records, analyze and traverse their relations. This needs a lot of monkey patching, and for now I decided not to go this way. That means in case when nesting selects based on belongs_to reflections, you'll need to select their foreign keys EXPLICITLY!
class Avatar < ApplicationRecord
belongs_to user
has_one :image
end
class Image < ApplicationRecord
belongs_to :avatar
end
Image.includes(avatar: :user).select(avatar: [:size, { user: [:email] }]).load # <--- will raise a Missing Attribute exception
#> ActiveModel::MissingAttributeError: Parent reflection avatar was missing foreign key user_id in nested selection
#> while trying to preload belongs_to reflection named user.
#> Hint: didn't you forgot to add user_id inside [:id, :size]?
Image.includes(avatar: :user).select(avatar: [:size, :user_id, { user: [:email] }]).load
Right now it will not work with ar_lazy_preload gem. nested_select relies on the includes_values definition of a relation. If you are doing it in a lazy way, there weren't any explicit includes, that means it will not extract any nested selection.
docker compose run test
- Cover all relation combinations and add missing functionality
- Ensure relations foreign keys are present on the selection
- Ensure primary key will be added
- [-] Ensure belongs_to will add a foreign_key column (Too hard to manage :(, its definitely not a low hanging fruit)
- Optimize through relations ( since they loading a whole set of attributes )
- Separated rails version testing
- Merge multiple nested selections
- Don't apply any selection if blank ( allows to limit only part of subselection tree)
- Allows to use custom attributes
- Eager loading?
Bug reports and pull requests are welcome on GitHub at https://github.com/alekseyl/nested_select. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.
The gem is available as open source under the terms of the MIT License.
Everyone interacting in the NestedSelect project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.