4

Consider this simple example:

We have the apartments table below:

ID  Address
0   555 Maple St
1   123 Oak St
2   999 Pine St

We also have the tenants table below:

ID  Name
0   Bob Foo
1   Jane Bar

If I want to map tenants to apartments (assuming that one tenant could rent multiple apartments for this example and "Jane" is renting two apartments) I could create a junction table like this:

TenantID  ApartmentID
0         0
1         1
1         2
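For concreteness, here is a minimal sketch of the junction-table layout, assuming SQLite through Python's built-in sqlite3 module. The table and column names (tenant_apartments, tenant_id, apartment_id) are illustrative choices, not part of any real schema.

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE apartments (id INTEGER PRIMARY KEY, address TEXT NOT NULL);
    CREATE TABLE tenants (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Junction table: one row per (tenant, apartment) pairing.
    CREATE TABLE tenant_apartments (
        tenant_id    INTEGER NOT NULL REFERENCES tenants(id),
        apartment_id INTEGER NOT NULL REFERENCES apartments(id),
        PRIMARY KEY (tenant_id, apartment_id)
    );

    INSERT INTO apartments VALUES (0, '555 Maple St'), (1, '123 Oak St'), (2, '999 Pine St');
    INSERT INTO tenants VALUES (0, 'Bob Foo'), (1, 'Jane Bar');
    INSERT INTO tenant_apartments VALUES (0, 0), (1, 1), (1, 2);
""")

# Resolving tenant -> apartment goes through the junction table (two joins).
junction_rows = conn.execute("""
    SELECT t.name, a.address
    FROM tenant_apartments AS ta
    JOIN tenants    AS t ON t.id = ta.tenant_id
    JOIN apartments AS a ON a.id = ta.apartment_id
    ORDER BY a.id
""").fetchall()

for name, address in junction_rows:
    print(name, "rents", address)
```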

It seems to me that I could accomplish the same mapping if instead I just added a column to the apartments table like this and didn't use a junction table at all:

ID  Address       TenantId
0   555 Maple St  0
1   123 Oak St    1
2   999 Pine St   1
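A minimal sketch of this foreign-key-column layout, again assuming SQLite through Python's sqlite3 module with illustrative names. The tenant_id column is left nullable here on the assumption that an apartment may be vacant.

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE tenants (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- The relationship lives in the apartments table itself:
    -- each apartment row can point at no more than one tenant.
    CREATE TABLE apartments (
        id        INTEGER PRIMARY KEY,
        address   TEXT NOT NULL,
        tenant_id INTEGER REFERENCES tenants(id)  -- NULL would mean "vacant" (assumption)
    );

    INSERT INTO tenants VALUES (0, 'Bob Foo'), (1, 'Jane Bar');
    INSERT INTO apartments VALUES
        (0, '555 Maple St', 0),
        (1, '123 Oak St',   1),
        (2, '999 Pine St',  1);
""")

# A single join recovers the tenant -> apartment mapping.
fk_rows = conn.execute("""
    SELECT t.name, a.address
    FROM apartments AS a
    JOIN tenants AS t ON t.id = a.tenant_id
    ORDER BY a.id
""").fetchall()

for name, address in fk_rows:
    print(name, "rents", address)
```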

Is there a reason to avoid creating the relationship this way? I realize that not using a junction table would create a constraint that each apartment can have only one tenant (you wouldn't be able to have two people that share the cost of the rent), but if that is acceptable is there still a reason to use the junction table? Thanks for any input, just trying to learn more about SQL database design principles.

3 Answers

8

As you already noted yourself, if there is an N:M relationship between two entities, a single foreign key field is not sufficient, so a junction table is the natural choice. Hence the interesting question is: "If a 1:N relationship is all that's needed, why would someone use a junction table at all?"

I can spot three possible reasons:

  1. The data model for tenants and apartments - without any relationship between those two entities - is provided by someone else (a third-party vendor) and the tables must not be touched. However, you want or need to extend the model, and you are allowed to add new tables. Then a junction table might be used.

  2. You want to model rentals of an apartment as an entity of its own, because you are going to add further attributes to the rental, like the rental fee or contractual information, and you don't want to mix these attributes with those that clearly belong to the apartment.

  3. You want to assign different access rights to the entities and the junction.
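The second reason can be sketched as follows, assuming SQLite through Python's sqlite3 module. The rentals table, its monthly_rent_cents and lease_start columns, and all rent and date values are hypothetical, made up purely for illustration. The point is that rental attributes sit on the junction row while the apartments and tenants tables stay untouched.

```python
import sqlite3

# In-memory SQLite database; all names and values below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE apartments (id INTEGER PRIMARY KEY, address TEXT NOT NULL);
    CREATE TABLE tenants (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- The junction table becomes a "rental" entity carrying its own attributes.
    CREATE TABLE rentals (
        id                 INTEGER PRIMARY KEY,
        tenant_id          INTEGER NOT NULL REFERENCES tenants(id),
        apartment_id       INTEGER NOT NULL REFERENCES apartments(id),
        monthly_rent_cents INTEGER NOT NULL,  -- made-up attribute
        lease_start        TEXT NOT NULL      -- made-up attribute, ISO date string
    );

    INSERT INTO apartments VALUES (0, '555 Maple St'), (1, '123 Oak St'), (2, '999 Pine St');
    INSERT INTO tenants VALUES (0, 'Bob Foo'), (1, 'Jane Bar');
    INSERT INTO rentals VALUES
        (0, 0, 0, 120000, '2024-01-01'),
        (1, 1, 1,  95000, '2024-03-01'),
        (2, 1, 2, 110000, '2024-03-01');
""")

# Rental-level attributes can be queried without touching apartment columns.
rent_by_tenant = conn.execute("""
    SELECT t.name, SUM(r.monthly_rent_cents)
    FROM rentals AS r
    JOIN tenants AS t ON t.id = r.tenant_id
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

for name, total_cents in rent_by_tenant:
    print(name, "pays", total_cents / 100, "per month in total")
```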

I have seen these scenarios in real-world systems, so these are not just hypothetical reasons.

0

By adding a foreign key to the apartments table you make the assumption that there is only one tenant per apartment. If anyone wants roommates, your data model will need to be changed, or data would need to be duplicated, with one apartment row per tenant.

This is specific to the problem domain. I would say it is reasonable to assume some tenants will want roommates, since this is a common living situation, especially amongst younger people or people with lower income. As such, the junction table is the right solution, even if 90% of apartments are rented by a single person. The junction table captures the others that want roommates.
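As a rough sketch of the roommate case, assuming SQLite through Python's sqlite3 module with illustrative names: with a junction table, a shared apartment is just a second row for the same apartment, and no schema change is needed. The pairing of Bob and Jane in the first apartment is a hypothetical scenario, not something from the question.

```python
import sqlite3

# In-memory SQLite database; names are illustrative, the roommate pairing is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE apartments (id INTEGER PRIMARY KEY, address TEXT NOT NULL);
    CREATE TABLE tenants (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tenant_apartments (
        tenant_id    INTEGER NOT NULL REFERENCES tenants(id),
        apartment_id INTEGER NOT NULL REFERENCES apartments(id),
        PRIMARY KEY (tenant_id, apartment_id)
    );

    INSERT INTO apartments VALUES (0, '555 Maple St'), (1, '123 Oak St'), (2, '999 Pine St');
    INSERT INTO tenants VALUES (0, 'Bob Foo'), (1, 'Jane Bar');

    -- Bob and Jane share apartment 0; Jane also rents apartments 1 and 2.
    INSERT INTO tenant_apartments VALUES (0, 0), (1, 0), (1, 1), (1, 2);
""")

# Count tenants per apartment; a single tenant_id column on apartments
# could not represent the two tenants of apartment 0.
tenants_per_apartment = conn.execute("""
    SELECT a.address, COUNT(ta.tenant_id)
    FROM apartments AS a
    LEFT JOIN tenant_apartments AS ta ON ta.apartment_id = a.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

for address, tenant_count in tenants_per_apartment:
    print(address, "has", tenant_count, "tenant(s)")
```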

0

In terms of design principles, it's important to recognise that an explicit junction table is the most general solution for expressing relationships between tables.

But under certain constraints, what would be the contents of the junction table can be folded into one of the base tables as additional columns. This may bring performance advantages and space savings, though probably too small to mention.

The real saving is probably for the programmer. Jettisoning a separate junction table saves one name in the namespace, saves maybe one join per query, and saves analysis on coordinating access to (and integrity between) three tables instead of two.

For tables storing master-and-detail or header-and-lines data, in which one logical record has to be split over two (or more) tables to fit the purely tabular structure of the relational model, adding an extra column to the subordinate table to link it back to the superior table is the lightest possible way to create the links between the tables. An explicit junction table expressing these links, with its separate name and existence, would look very heavyweight by contrast.

But in many other cases, a pair of tables may not be seen as containing one logical record, but as containing two separate records, any links between which are external to both. And there may be no natural hierarchy between them either, as there is with header-and-lines tables.

For example, data recording which personnel are married or partnered would normally be seen as a kind of link that is separate from the data about the individuals, and more deserving of a "MarriedCouples" or "Marriages" junction table than a nullable "WifeId" column. To make the distinction concrete, it's the difference between keeping a separate marriage register and having a birth certificate with empty fields in which the details of a marriage partner are later recorded.
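A rough sketch of the separate "Marriages" table, assuming SQLite through Python's sqlite3 module. The personnel and marriages tables, their columns, and the sample people are all hypothetical. The CHECK constraint is one possible way to store each couple in a single canonical order; it is a design choice of this sketch, not something the relational model requires.

```python
import sqlite3

# In-memory SQLite database; every name and row below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    -- Data about individuals carries no marriage-related column at all.
    CREATE TABLE personnel (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- The link is a first-class record of its own, like a marriage register.
    CREATE TABLE marriages (
        person_a_id INTEGER NOT NULL REFERENCES personnel(id),
        person_b_id INTEGER NOT NULL REFERENCES personnel(id),
        PRIMARY KEY (person_a_id, person_b_id),
        CHECK (person_a_id < person_b_id)  -- store each pair once, lower id first
    );

    INSERT INTO personnel VALUES (0, 'Alice Example'), (1, 'Bob Example'), (2, 'Carol Example');
    INSERT INTO marriages VALUES (0, 1);
""")

# Couples come from the link table, joined to personnel twice.
couples = conn.execute("""
    SELECT a.name, b.name
    FROM marriages AS m
    JOIN personnel AS a ON a.id = m.person_a_id
    JOIN personnel AS b ON b.id = m.person_b_id
    ORDER BY a.id
""").fetchall()

# People with no entry in the register simply have no row there (no NULL columns).
unmarried = conn.execute("""
    SELECT name FROM personnel
    WHERE id NOT IN (
        SELECT person_a_id FROM marriages
        UNION
        SELECT person_b_id FROM marriages
    )
    ORDER BY id
""").fetchall()

print("couples:", couples)
print("unmarried:", unmarried)
```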

So my advice on this question, where the two alternatives for database design exist, would be to look at which alternative seems to accord more closely with the underlying conceptual model, and whether the links appear to be a first-class concept separate from the things being linked, or whether the links are purely a product of the structural principle of the relational model in which everything must be a table.
