I am in the process of designing a database layout to run on PostgreSQL. In this database I need to handle a lot of addresses, so I naturally want to use an address table and have my other data simply point at its primary key.
I want addresses to be unique, so I plan to use a simple surrogate primary key and a unique constraint (backed by a unique index) over all of the data columns.
Something along the lines of this:
CREATE TABLE address (
    id       serial      NOT NULL,
    name     VARCHAR(32),
    address1 VARCHAR(32),
    address2 VARCHAR(32),
    postal   VARCHAR(12),
    city     VARCHAR(64),
    CONSTRAINT pkey PRIMARY KEY (id),
    CONSTRAINT unq UNIQUE (name, address1, address2, postal, city)
);
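My other tables would then just carry a foreign key to this one. As a sketch, a referencing table might look like the following (the "customer" table and its columns are purely hypothetical, just to illustrate the pointer):

-- Hypothetical referencing table; only address_id matters here, it
-- points at address.id instead of duplicating the address columns.
CREATE TABLE customer (
    id         serial       NOT NULL,
    email      VARCHAR(64),
    address_id integer      NOT NULL REFERENCES address (id),
    CONSTRAINT customer_pkey PRIMARY KEY (id)
);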
I anticipate that there will be well over 2 million entries in this table from day one.
The table will be hit by a lot of simultaneous SELECTs and a few INSERTs, but for data-integrity reasons it will never see UPDATEs or DELETEs.
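For the inserts I am currently thinking of something along these lines (just a sketch; it assumes a PostgreSQL version that supports ON CONFLICT, and the values are placeholders):

-- Insert the address unless it already exists; on a duplicate, the
-- unique constraint "unq" turns the insert into a no-op.
INSERT INTO address (name, address1, address2, postal, city)
VALUES ('ACME Corp', '1 Main St', 'Suite 2', '12345', 'Springfield')
ON CONFLICT ON CONSTRAINT unq DO NOTHING
RETURNING id;

-- RETURNING yields no row when the address already existed, so a
-- follow-up lookup fetches the existing id in that case.
SELECT id FROM address
WHERE name = 'ACME Corp' AND address1 = '1 Main St'
  AND address2 = 'Suite 2' AND postal = '12345' AND city = 'Springfield';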
What sort of performance can I expect from this kind of table? Is there any performance penalty to using one large shared table rather than copying the address data to where it is needed?
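Once the table is populated, my plan for checking this myself is to run EXPLAIN ANALYZE on a typical lookup, hoping to see an index scan on the unique index rather than a sequential scan, e.g.:

-- Check whether a full-match lookup uses the unique index.
EXPLAIN ANALYZE
SELECT id FROM address
WHERE name = 'ACME Corp' AND address1 = '1 Main St'
  AND address2 = 'Suite 2' AND postal = '12345' AND city = 'Springfield';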
For testing, I plan to generate a few million random address lines, deduplicate them (e.g. with sort -u), and load them into the database; it doesn't matter whether the test data looks like a real address.
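Alternatively, the dummy rows could be generated directly inside PostgreSQL, which avoids the external file entirely (the patterns below are arbitrary, chosen only to keep every row unique):

-- Generate ~2 million unique dummy addresses; only the row count and
-- uniqueness matter for the test, not the content.
INSERT INTO address (name, address1, address2, postal, city)
SELECT 'name_'  || g,
       'addr1_' || g,
       'addr2_' || g,
       lpad((g % 100000)::text, 5, '0'),
       'city_'  || (g % 1000)
FROM generate_series(1, 2000000) AS g;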