I'm trying to optimize a portion of a query that joins two related tables, and I'm getting odd results compared to other queries in the project with similar structures. Here is a heavily simplified example where I still see the issue:
SELECT
    `j`.`job_date` AS `VOUCHERDATE`,
    `labor_equipment`.`time_entry` AS `HOURS`
FROM `uat_portal`.`jobs` `j`
LEFT JOIN (
    SELECT
        `uat_portal`.`jobs_employees`.`job_id`,
        `uat_portal`.`jobs_employees`.`time_entry`
    FROM
        `uat_portal`.`jobs_employees`
    UNION ALL
    SELECT
        `uat_portal`.`jobs_equipment`.`job_id`,
        `uat_portal`.`jobs_equipment`.`time_entry`
    FROM
        `uat_portal`.`jobs_equipment`
) `labor_equipment`
ON
    `j`.`id` = `labor_equipment`.`job_id`
This yields the following EXPLAIN output. As you can see, all rows are fetched for jobs_employees and jobs_equipment:
id  select_type  table           type   possible_keys  key                key_len  ref              rows    Extra
1   PRIMARY      j               index  NULL           idx_jobs_job_date  3        NULL             218110  Using index
1   PRIMARY      <derived2>      ref    key0           key0               5        uat_portal.j.id  10
2   DERIVED      jobs_employees  index  NULL           job_id_index       4        NULL             953371  Using index
3   UNION        jobs_equipment  index  NULL           job_id_index       4        NULL             391702  Using index
Removing the UNION ALL and just joining one or the other table yields the expected results where only a row or two are fetched:
SELECT
    `j`.`job_date` AS `VOUCHERDATE`,
    `labor_equipment`.`time_entry` AS `HOURS`
FROM `uat_portal`.`jobs` `j`
LEFT JOIN (
    SELECT
        `uat_portal`.`jobs_employees`.`job_id`,
        `uat_portal`.`jobs_employees`.`time_entry`
    FROM
        `uat_portal`.`jobs_employees`
) `labor_equipment`
ON
    `j`.`id` = `labor_equipment`.`job_id`
id  select_type  table           type   possible_keys  key                key_len  ref              rows    Extra
1   SIMPLE       j               index  NULL           idx_jobs_job_date  3        NULL             218110  Using index
1   SIMPLE       jobs_equipment  ref    job_id_index   job_id_index       4        uat_portal.j.id  1       Using index
SELECT
    `j`.`job_date` AS `VOUCHERDATE`,
    `labor_equipment`.`time_entry` AS `HOURS`
FROM `uat_portal`.`jobs` `j`
LEFT JOIN (
    SELECT
        `uat_portal`.`jobs_equipment`.`job_id`,
        `uat_portal`.`jobs_equipment`.`time_entry`
    FROM
        `uat_portal`.`jobs_equipment`
) `labor_equipment`
ON
    `j`.`id` = `labor_equipment`.`job_id`
id  select_type  table           type   possible_keys  key                key_len  ref              rows    Extra
1   SIMPLE       j               index  NULL           idx_jobs_job_date  3        NULL             218110  Using index
1   SIMPLE       jobs_equipment  ref    job_id_index   job_id_index       4        uat_portal.j.id  1       Using index
With the WHERE clauses back in place, just this very small and heavily simplified portion of the larger query takes about 3 seconds with the UNION ALL in place. That isn't terrible on its own, but it compounds with the complexity of the rest of the query (phpMyAdmin currently cannot load the associated View at all).
I saw something similar with UNION ALLs in other portions of the project, and that came down to the columns in the subquery's SELECT lists having different types (I had three tables combined with UNION ALL, one with an INT primary key and two with UNSIGNED BIGINT primary keys).
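For reference, a mismatch of that kind can be spotted by listing the declared column types side by side. The following is a minimal sketch against information_schema; it uses the schema, table, and column names from this question, so the lists would need adjusting for other tables:

```sql
-- List the declared types of the join and payload columns so that
-- differences between UNION ALL branches (e.g. int vs. bigint unsigned)
-- stand out at a glance.
SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'uat_portal'
  AND TABLE_NAME IN ('jobs', 'jobs_employees', 'jobs_equipment')
  AND COLUMN_NAME IN ('id', 'job_id', 'time_entry')
ORDER BY COLUMN_NAME, TABLE_NAME;
```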
In this case though, everything seems to line up for all three tables involved in the query. Here are my SHOW CREATE TABLE statements for all three tables:
CREATE TABLE `jobs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_bigint_test` bigint(20) unsigned NOT NULL,
`parent_job` int(11) NOT NULL,
`workorder_id` int(11) DEFAULT NULL,
`wo_day_id` int(11) NOT NULL,
`quote_id` int(11) NOT NULL,
`job_type` varchar(100) NOT NULL,
`job_name` varchar(255) NOT NULL,
`job_number` varchar(25) NOT NULL,
`job_date` date NOT NULL,
`job_color` varchar(10) NOT NULL,
`onsite_time` time NOT NULL,
`sales_person` varchar(25) NOT NULL,
`badging_needed` tinyint(1) NOT NULL,
`badging_completed` tinyint(1) NOT NULL,
`notes` text NOT NULL,
`location` varchar(25) NOT NULL,
`added_on` datetime NOT NULL,
`added_by` varchar(25) NOT NULL,
`updated_on` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
`updated_by` varchar(25) NOT NULL,
`removed` tinyint(1) NOT NULL,
`removed_on` datetime NOT NULL,
`removed_by` varchar(25) NOT NULL,
`required_crew_size` varchar(255) DEFAULT NULL,
`min_skill_level` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_jobs_job_date` (`job_date`),
KEY `idx_jobs_job_number` (`job_number`),
KEY `indx_jobs_parent_job` (`parent_job`),
KEY `id_bigint_test` (`id_bigint_test`)
) ENGINE=InnoDB AUTO_INCREMENT=209896 DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
CREATE TABLE `jobs_equipment` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`rate_card_owned_equipment_id` int(11) NOT NULL,
`job_id` int(11) NOT NULL,
`job_id_bigint_test` bigint(20) unsigned NOT NULL,
`equipment_id` int(11) NOT NULL,
`owned_equipment_id` varchar(255) DEFAULT NULL,
`start_time` varchar(25) NOT NULL,
`end_time` varchar(25) NOT NULL,
`override_time` tinyint(1) NOT NULL,
`time_entry` float(8,2) NOT NULL,
`billable` tinyint(1) NOT NULL,
`removed` tinyint(1) NOT NULL,
`removed_by` varchar(25) NOT NULL,
`removed_on` datetime NOT NULL,
`added_on` datetime NOT NULL,
`added_by` varchar(25) NOT NULL,
`updated_on` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
`updated_by` varchar(25) NOT NULL,
PRIMARY KEY (`id`),
KEY `job_id_index` (`job_id`),
KEY `job_id_bigint_index` (`job_id_bigint_test`)
) ENGINE=InnoDB AUTO_INCREMENT=392211 DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
CREATE TABLE `jobs_employees` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`rate_card_labor_id` int(11) NOT NULL,
`job_id` int(11) NOT NULL,
`job_id_bigint_test` bigint(20) unsigned NOT NULL,
`employee_id` int(11) NOT NULL,
`title` varchar(255) NOT NULL,
`labor_id` varchar(255) DEFAULT NULL,
`category` varchar(50) NOT NULL,
`start_time` varchar(25) NOT NULL,
`end_time` varchar(25) NOT NULL,
`override_time` tinyint(1) NOT NULL,
`truck` varchar(100) NOT NULL,
`trailer` varchar(100) NOT NULL,
`job_action` varchar(100) NOT NULL,
`time_entry` float(8,2) NOT NULL,
`billable` tinyint(1) NOT NULL,
`incl_break` tinyint(1) NOT NULL,
`added_on` datetime NOT NULL,
`added_by` varchar(25) NOT NULL,
`updated_on` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
`updated_by` varchar(25) NOT NULL,
`removed` tinyint(1) NOT NULL,
`removed_on` datetime NOT NULL,
`removed_by` varchar(25) NOT NULL,
PRIMARY KEY (`id`),
KEY `job_id_index` (`job_id`),
KEY `job_id_bigint_index` (`job_id_bigint_test`)
) ENGINE=InnoDB AUTO_INCREMENT=1018745 DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
Thinking it was still a data type issue like before, I've tried alternate column types for the jobs 'id' and the jobs_equipment/jobs_employees 'job_id' columns; you'll notice the 'id_bigint_test' and 'job_id_bigint_test' columns in the SHOW CREATE TABLE statements.
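Such a test can be sketched as follows. This is an illustration rather than the exact statement that was run: it is the simplified query with the join moved to the bigint test columns from the table definitions, and it assumes those columns hold the same values as 'id' and 'job_id'.

```sql
-- Same shape as the simplified query, but joined on the
-- bigint(20) unsigned test columns instead of the int(11) ones.
EXPLAIN
SELECT
    j.job_date AS VOUCHERDATE,
    labor_equipment.time_entry AS HOURS
FROM uat_portal.jobs j
LEFT JOIN (
    SELECT job_id_bigint_test, time_entry
    FROM uat_portal.jobs_employees
    UNION ALL
    SELECT job_id_bigint_test, time_entry
    FROM uat_portal.jobs_equipment
) labor_equipment
    ON j.id_bigint_test = labor_equipment.job_id_bigint_test;
```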
Thinking it was an indexing issue (though the individual tables perform fine without the UNION ALL), I've tried dropping and recreating the indexes on the 'job_id' columns, and creating and removing foreign key constraints.
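For one of the two tables, those steps look roughly like the sketch below. The index name comes from the SHOW CREATE TABLE output; the constraint name fk_jobs_employees_job is hypothetical and only used for illustration.

```sql
-- Drop and recreate the existing secondary index on job_id in one statement.
ALTER TABLE uat_portal.jobs_employees
    DROP INDEX job_id_index,
    ADD INDEX job_id_index (job_id);

-- Add a foreign key to jobs.id (hypothetical constraint name) ...
ALTER TABLE uat_portal.jobs_employees
    ADD CONSTRAINT fk_jobs_employees_job
    FOREIGN KEY (job_id) REFERENCES uat_portal.jobs (id);

-- ... and remove it again. The index on job_id stays in place.
ALTER TABLE uat_portal.jobs_employees
    DROP FOREIGN KEY fk_jobs_employees_job;
```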
To try to force the optimizer to produce the plan I'm expecting, I've tried fetching different columns, joining on different columns, and haphazardly adding GROUP BY and WHERE clauses.
UPDATE 9/15: added some actual examples of the SELECTs used by the View.
… j.id with no column from the LEFT JOINed labor_equipment, we could just answer that SELECT id FROM jobs would return the same results…
… IN (<subquery>), and the SO question "MariaDB using PK for inner query result" looks exactly like your question (still with no answer); there's also "How to improve UNION query", but it's not as near to your question.
(SELECT UNION SELECT) JOIN is not an equivalent for SELECT JOIN UNION SELECT JOIN. For example, when any non-stable construction (variable, function) is used.
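To make that last comment concrete, here is one way the join could be pushed into each UNION ALL branch for the simplified query in the question. This is an untested sketch under stated assumptions, not a confirmed fix: the aliases e and q are introduced here, and the rewrite only returns the same rows as the original when the branches contain nothing non-deterministic. Because the original uses a LEFT JOIN, jobs with no matching rows in either table have to be added back by a third branch.

```sql
-- Branch 1: jobs joined directly to jobs_employees, so the optimizer
-- has the chance to use job_id_index instead of a materialized derived table.
SELECT j.job_date AS VOUCHERDATE, e.time_entry AS HOURS
FROM uat_portal.jobs j
JOIN uat_portal.jobs_employees e ON e.job_id = j.id

UNION ALL

-- Branch 2: the same join against jobs_equipment.
SELECT j.job_date, q.time_entry
FROM uat_portal.jobs j
JOIN uat_portal.jobs_equipment q ON q.job_id = j.id

UNION ALL

-- Branch 3: jobs with no row in either table, which the original
-- LEFT JOIN returns once with HOURS = NULL.
SELECT j.job_date, NULL
FROM uat_portal.jobs j
WHERE NOT EXISTS (SELECT 1 FROM uat_portal.jobs_employees e WHERE e.job_id = j.id)
  AND NOT EXISTS (SELECT 1 FROM uat_portal.jobs_equipment q WHERE q.job_id = j.id);
```

If jobs without labor or equipment rows are not needed in the result, the third branch can be dropped, which changes the semantics from LEFT JOIN to INNER JOIN.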