Showing posts with label Differences between varchar and nvarchar in SQL Server. Show all posts
Showing posts with label Differences between varchar and nvarchar in SQL Server. Show all posts

Thursday, October 15, 2009

Differences between varchar and nvarchar in SQL Server

The broad range of data types in SQL Server can sometimes throw people through a loop, especially when the data types seem to be highly interchangeable. Two in particular that constantly spark questions are VARCHAR and NVARCHAR: what's the difference between the two, and how important is the difference?

VARCHAR is an abbreviation for variable-length character string. It's a string of text characters that can be as large as the page size for the database table holding the column in question. The size for a table page is 8,196 bytes, and no one row in a table can be more than 8,060 characters. This in turn limits the maximum size of a VARCHAR to 8,000 bytes.

The "N" in NVARCHAR means uNicode. Essentially, NVARCHAR is nothing more than a VARCHAR that supports two-byte characters. The most common use for this sort of thing is to store character data that is a mixture of English and non-English symbols — in my case, English and Japanese.

The key difference between the two data types is how they're stored. VARCHAR is stored as regular 8-bit data. But NVARCHAR strings are stored in the database as UTF-16 — 16 bits or two bytes per character, all the time — and converted to whatever codepage is being used by the database connection on output (typically UTF-8). That said, NVARCHAR strings have the same length restrictions as their VARCHAR cousins — 8,000 bytes. However, since NVARCHARs use two bytes for each character, that means a given NVARCHAR can only hold 4,000 characters (not bytes) maximum. So, the amount of storage needed for NVARCHAR entities is going to be twice whatever you'd allocate for a plain old VARCHAR.

Because of this, some people may not want to use NVARCHAR universally, and may want to fall back on VARCHAR — which takes up less space per row — whenever possible.