How do you know when to use varchar and when to use text in sql?

Question

It seems like a very arbitrary decision. Both can accomplish the same thing in most cases. Limiting the varchar length seems to me like shooting yourself in the foot, because you never know how long a field you will need.

Is there any specific guideline for choosing VARCHAR or TEXT for your string fields?

I will be using PostgreSQL with the SQLAlchemy ORM for Python.

As Quassnoi is to SQL what Jon Skeet is to... well, everything else, you can't top the answer he gave.
– Lieven Keersmaekers, Commented Jan 16, 2011 at 13:14
I was referring to the duplicate posted in the comments, but that was before you mentioned PostgreSQL.
– Lieven Keersmaekers, Commented Jan 16, 2011 at 15:27

user330315 · Accepted Answer · 2011-01-16 13:17:01Z

10

In PostgreSQL there is no technical difference between varchar and text.

You can see a varchar(nnn) as a text column with a check constraint that prohibits storing larger values.

So each time you want to have a length constraint, use varchar(nnn).

If you don't want to restrict the length of the data, use text.
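To make the "text column with a check constraint" picture concrete, here is a minimal sketch. It is only an illustration under stated assumptions: it uses SQLite from Python's standard library as a stand-in (not PostgreSQL), and the table `users`, its columns and the helper `try_insert` are hypothetical names. SQLite does not enforce the n in varchar(n), so the limit is spelled out as an explicit CHECK on a text column.

```python
import sqlite3

# Stand-in database: SQLite from the standard library, NOT PostgreSQL.
# The length limit is written as an explicit CHECK constraint on a text
# column, which is the mental model for varchar(n) described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "  username TEXT CHECK (length(username) <= 10),"  # plays the role of varchar(10)
    "  bio TEXT"                                        # no length restriction
    ")"
)


def try_insert(username, bio):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO users (username, bio) VALUES (?, ?)", (username, bio)
        )
        return True
    except sqlite3.IntegrityError:
        return False


short_ok = try_insert("alice", "x" * 5000)    # long bio is fine, username fits
too_long = try_insert("a" * 11, "short bio")  # 11 characters exceeds the limit
row_count = conn.execute("SELECT count(*) FROM users").fetchone()[0]
```

Only the first row is stored; the second is rejected by the constraint, which is the behaviour you would want from a length-limited column.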

answered Jan 16, 2011 at 13:17

user330315


2 Comments

Frank Heikens Over a year ago

You can also use VARCHAR when you have no restriction. VARCHAR and VARCHAR(n) are two different things.

user330315 Over a year ago

Ah, right. I always forget that varchar can be used without a length restriction as well. I usually use TEXT then.

orlp · Accepted Answer · 2011-01-16 13:13:47Z

2

This sentence is wrong:

Limiting the varchar length seems to me like shooting yourself in the foot, because you never know how long a field you will need.

If you are saving, for example, MD5 hashes, you do know how large the field you're storing is, and your storage becomes more efficient. Other examples are:

Usernames (64 max)
Passwords (128 max)
Zip codes
Addresses
Tags
Many more!
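The hash case is easy to check for yourself. The sketch below uses Python's standard `hashlib`; the helper name `hex_digest_length` and the sample inputs are made up for illustration. It shows that the hex digest length depends only on the algorithm, not on the input, so the column width is known up front.

```python
import hashlib


def hex_digest_length(algorithm, data):
    """Length in characters of the hex digest of `data` under `algorithm`."""
    return len(hashlib.new(algorithm, data).hexdigest())


# Inputs of very different sizes, including an empty one and a 1000-byte one.
inputs = [b"", b"short", b"x" * 1000]

md5_lengths = {hex_digest_length("md5", d) for d in inputs}
sha1_lengths = {hex_digest_length("sha1", d) for d in inputs}

print(md5_lengths)   # {32}
print(sha1_lengths)  # {40}
```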

edited Jan 16, 2011 at 13:13

answered Jan 16, 2011 at 13:07

orlp


13 Comments

BoltClock Over a year ago

MD5 hashes would even be more efficiently stored in CHAR(32) columns.

orlp Over a year ago

True, but I'm giving an example where limiting lengths is not shooting yourself in the foot. But once again, true.

TheOne Over a year ago

who said passwords should be 64 max?

BoltClock Over a year ago

@Absolute0: Passwords with 1000 chars will still be hashed to 40-char SHA1 checksums.

orlp Over a year ago

@SolomonUcko Zip codes in my country are most certainly not integers.

davin · Accepted Answer · 2011-01-16 13:12:38Z

1

In brief:

Variable-length fields save space, but because each field can have a different length, table operations are slower.
Fixed-length fields make table operations faster, but they must be large enough for the maximum expected input, so they can use more space.

Think of an analogy to arrays and linked lists, where arrays are fixed length fields, and linked lists are like varchars. Which is better, arrays or linked lists? Lucky we have both, because they are both useful in different situations, so too here.

answered Jan 16, 2011 at 13:12

davin


4 Comments

TheOne Over a year ago

Most of the time we use some variation of a vector :) but I get your point.

davin Over a year ago

@Absolute0, and how do you think a vector is typically implemented internally? Arrays. It's the exact same principle: if you want random access to any element, you need to know the size of every element (fixed size); otherwise you can save space, but access requires you to move element by element, like in a linked list.

intgr Over a year ago

This advice is not true in PostgreSQL: the char, varchar and text types have exactly the same representation on disk. There's no efficiency to be gained by using char.

davin Over a year ago

@intgr, good point, I wonder why that's the case. Will have to dig up some PostgreSQL code one day...

BvdVen · Accepted Answer · 2011-01-16 13:07:43Z

0

In most cases you do know the maximum length of a string in a field. A first or last name, for example, doesn't need more than 255 characters. So you choose which type to use by design; if you always use text, you're wasting resources.

answered Jan 16, 2011 at 13:07

BvdVen


1 Comment

davin Over a year ago

That's a blanket statement that isn't quite accurate. Each choice wastes resources, because variable length fields make searching slower, which results in wasted CPU resources.

Frank Heikens · Accepted Answer · 2011-01-16 15:07:58Z

0

Check this article on PostgresOnline; it also links to two other useful articles.

Most problems with TEXT in PostgreSQL occur when you're using tools, applications and drivers that treat TEXT very differently from VARCHAR, because other databases behave very differently with these two datatypes.

answered Jan 16, 2011 at 15:07

Frank Heikens


Mike Sherrill 'Cat Recall' · Accepted Answer · 2011-01-16 13:19:02Z

-1

Database designers almost always know how many characters a column needs to hold. US delivery addresses need to hold up to 64 characters. (The US Postal Service publishes addressing guidelines that say so.) US ZIP codes are 5 characters long.

A database designer will look at representative sample data from her clients when she's specifying columns. She'll ask herself questions like "What's the longest product name?" And when the answer is "70 characters", she won't make the column 3000 characters wide.

VARCHAR has a limit of 8k in SQL Server (I think). Most applications don't require nearly that much storage for a single column.

answered Jan 16, 2011 at 13:19

Mike Sherrill 'Cat Recall'


3 Comments

mu is too short Over a year ago

You shouldn't be limiting US zip codes to five characters, they expanded them to ZIP+4 (ten characters) about thirty years ago.

Mike Sherrill 'Cat Recall' Over a year ago

At the risk of stating the obvious, ZIP codes are still five characters. ZIP+4 codes are five characters plus four more characters. Two columns simplifies the constraints, and it simplifies grouping addresses by ZIP code. Grouping by ZIP code is required for price breaks on bulk mail.

Mike Sherrill 'Cat Recall' Over a year ago

Only if you expect Canadian addresses to conform to USPS specifications. Professionals know better. (Canada? Preferably less than 30 characters, but no more than 40 characters per address line for machineable mail.)
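For anyone wanting to try the two-column ZIP / ZIP+4 layout discussed in these comments, here is a rough sketch of the split on the application side. The function name `split_zip` and the accepted input formats ("12345" or "12345-6789") are assumptions made for this example, not anything mandated by USPS or by a library.

```python
import re

# Assumed input formats: "12345" or "12345-6789". Anything else is rejected.
_ZIP_RE = re.compile(r"([0-9]{5})(?:-([0-9]{4}))?")


def split_zip(value):
    """Split a US ZIP or ZIP+4 string into (zip5, plus4).

    plus4 is None when the input has no +4 part. Both parts stay strings,
    so leading zeros are preserved.
    """
    match = _ZIP_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a US ZIP or ZIP+4 code: {value!r}")
    return match.group(1), match.group(2)


print(split_zip("02134"))       # ('02134', None)
print(split_zip("90210-1234"))  # ('90210', '1234')
```

The five-character part can then go into a fixed-width column used for grouping, with the optional four-character part in a second, nullable column.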
