Friday, 7 February 2014

Data Type Mapping Between SQL Server and Teradata

The other day I was loading data from SQL Server into Teradata, so I needed to check the corresponding data types between the two systems. I put together a small comparison table with both data definitions.

I referred to these resources: 
  • MS SQL Server data types / http://www.w3schools.com/sql/sql_datatypes.asp 
  • Teradata data types
    • http://mant.upol.cz/sashtml/accdb/z1250040.htm 
    • https://www.aquaclusters.com/app/home/project/public/aquadatastudio/wikibook/Teradata-1-34/page/6/Data-Types

Data Type Conversion from SQL Server to Teradata

nvarchar => nvarchar

  • nvarchar [SQL] – Variable width Unicode string. Maximum 4,000 characters.
  • VARCHAR(n) [TD] specifies a varying-length column of length n for character string data. The maximum for n is 64,000. VARCHAR is also known as CHARACTER VARYING.

nvarchar(max) => varchar(4000)

  • nvarchar(max) [SQL] Variable width Unicode string. Maximum 536,870,912 characters 
  • VARCHAR(4000) [TD]


tinyint => byteint

  • tinyint [SQL] Allows whole numbers from 0 to 255. 1 byte.
  • BYTEINT [TD] specifies a single-byte signed binary integer. Values can range from -128 to +127.

smallint => smallint

  • smallint [SQL] Allows whole numbers between -32,768 and 32,767. 2 bytes.
  • SMALLINT [TD] specifies a small integer. Values can range from -32,768 through +32,767.

int => int / integer

  • int [SQL] Allows whole numbers between -2,147,483,648 and 2,147,483,647. 4 bytes.
  • INT / INTEGER [TD] specifies a large integer. Values can range from -2,147,483,648 through +2,147,483,647.

numeric => decimal

  • numeric(18, 3) [SQL] Fixed precision and scale numbers.
    Allows numbers from -10^38 + 1 to 10^38 - 1.
    The p parameter indicates the maximum total number of digits that can be stored (both to the left and to the right of the decimal point). p must be a value from 1 to 38. Default is 18.
    The s parameter indicates the maximum number of digits stored to the right of the decimal point. s must be a value from 0 to p. Default value is 0.
  • DECIMAL(n,m) or NUMERIC(n,m) [TD] specifies a packed-decimal number. n is the total number of digits (precision). m is the number of digits to the right of the decimal point (scale). The range for precision is 1 through 18. The range for scale is 0 through n. If m is omitted, 0 is assigned and n can also be omitted. Omitting both n and m results in the default DECIMAL(5,0). DECIMAL is also known as NUMERIC.
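
For example, a numeric(18, 3) column can keep the same precision and scale on the Teradata side, since 18 digits is within the DECIMAL range described above. The table and column names below are made up for illustration:

    -- SQL Server (source)
    CREATE TABLE dbo.sales_src (
        amount numeric(18, 3)   -- up to 18 digits, 3 of them after the decimal point
    );

    -- Teradata (target)
    CREATE TABLE sales_tgt (
        amount DECIMAL(18, 3)   -- same precision and scale
    );

According to the reference quoted above, precision tops out at 18, so a SQL Server column declared with a higher precision would need extra attention.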

ntext => long varchar

  • ntext [SQL] Variable width Unicode string. Maximum 2 GB of text data.
  • LONG VARCHAR [TD] (mapping not verified)


bigint => bigint

  • bigint [SQL] Allows whole numbers between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807
  • BIGINT [TD] -9,223,372,036,854,775,808 <= BIGINT <= 9,223,372,036,854,775,807

smalldatetime => timestamp

  • smalldatetime [SQL] From January 1, 1900 to June 6, 2079 with an accuracy of 1 minute. 4 bytes.
  • TIMESTAMP [TD] (I hope)

datetime2 => timestamp / http://wiki.ispirer.com/sqlways/sql-server/data-types/datetime2
datetime => timestamp / http://wiki.ispirer.com/sqlways/sql-server/data-types/datetime
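
Putting the whole list together, a table using these types might be re-declared in Teradata roughly as follows. This is only a sketch: the table and column names are made up, the CHARACTER SET UNICODE clause and the TIMESTAMP precisions are my own assumptions rather than part of the mappings above, and very long nvarchar(max)/ntext values will not fit into VARCHAR(4000)/LONG VARCHAR without truncation.

    -- SQL Server (source)
    CREATE TABLE dbo.customer_src (
        id        bigint,
        status    tinyint,
        qty       smallint,
        batch_no  int,
        name      nvarchar(200),
        note      nvarchar(max),
        descr     ntext,
        amount    numeric(18, 3),
        created   smalldatetime,
        modified  datetime2
    );

    -- Teradata (target), following the mappings above
    CREATE TABLE customer_tgt (
        id        BIGINT,
        status    BYTEINT,
        qty       SMALLINT,
        batch_no  INTEGER,
        name      VARCHAR(200) CHARACTER SET UNICODE,   -- assumption: keep Unicode data
        note      VARCHAR(4000) CHARACTER SET UNICODE,  -- nvarchar(max) cut down to 4,000 characters
        descr     LONG VARCHAR,                         -- ntext, per the note above
        amount    DECIMAL(18, 3),
        created   TIMESTAMP(0),                         -- smalldatetime has 1-minute accuracy
        modified  TIMESTAMP(6)                          -- datetime2 keeps fractional seconds
    );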






Friday, 3 January 2014

MySQL data insert test

Recently I needed to duplicate a huge portion of data (about 8 GB) in MySQL for testing purposes, so I was wondering how to do it in the fastest possible way using the mysqldump utility.
In mysqldump there are two ways to export data (illustrated just below):
  • normal inserts = one INSERT command contains only one record
  • extended inserts = one INSERT command combines many records (the default mysqldump behaviour)
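
To make the difference concrete, the dump of a hypothetical table t looks roughly like this in the two modes (real dumps also contain LOCK TABLES and other statements, omitted here):

    -- normal inserts: one record per INSERT command
    INSERT INTO `t` VALUES (1,'a');
    INSERT INTO `t` VALUES (2,'b');
    INSERT INTO `t` VALUES (3,'c');

    -- extended inserts: many records combined into one INSERT command
    INSERT INTO `t` VALUES (1,'a'),(2,'b'),(3,'c');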
I tested importing 2M records (approx. 800 MB) and here are the results:

Normal INSERTs

  • Export
    • /usr/local/mysql/bin/mysqldump --extended-insert=FALSE --user=user --password=password -P3906 -hserver db table | gzip -9 -c > dump_table_no-extended.gz
  • Import
    • zcat dump_table_no-extended.gz | /usr/local/mysql/bin/mysql -h server --user=user --password=password -P3906 db
  • Result
    • 19 min

Extended INSERTs

  • Export
    • /usr/local/mysql/bin/mysqldump --extended-insert --user=user --password=password -P3906 -hserver db table | gzip -9 -c > dump_table_extended.gz
  • Import
    • zcat dump_table_extended.gz | /usr/local/mysql/bin/mysql -h server --user=user --password=password -P3906 db
  • Result
    • 1:26 min

Disable keys

Further, I tested whether there is any improvement when keys are disabled during the import, since MySQL rebuilds the index tree every time it inserts a new record.

Extended INSERTs + DISABLE KEYS

  • Export
    • /usr/local/mysql/bin/mysqldump --extended-insert --no-data --user=user --password=password -P3906 -hserver db table | gzip -9 -c > dump_table_extended-structure.gz
    • /usr/local/mysql/bin/mysqldump --extended-insert --no-create-info --user=user --password=password -P3906 -hserver db table | gzip -9 -c > dump_table_extended-data.gz
  • Import
    • zcat dump_table_extended-structure.gz | /usr/local/mysql/bin/mysql -h server --user=user --password=password -P3906 db
    • ALTER TABLE table DISABLE KEYS;
    • zcat dump_table_extended-data.gz | /usr/local/mysql/bin/mysql -h server --user=user --password=password -P3906 db
    • ALTER TABLE table ENABLE KEYS;
  • Result
    • 1:20 min

Note

By default, mysqldump generates only one INSERT command per table, resulting in one (very long) line of inserted data for each table that gets dumped. So I wondered whether it is OK to import a huge data set using extended INSERTs, because of possible memory issues. Fortunately, mysqldump has a limit on the size of one INSERT, set by the net_buffer_length option, which is 16,384 bytes. This ensures that a big table is actually dumped into many extended INSERT commands.
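
If you want to check the related limits yourself, the server-side values can be read from the mysql console. Note that this shows the server variables, while mysqldump has its own --net_buffer_length option, so treat this only as a rough sanity check:

    -- server-side communication buffer (default 16384 bytes)
    SHOW VARIABLES LIKE 'net_buffer_length';

    -- hard upper limit for the size of a single statement the server accepts
    SHOW VARIABLES LIKE 'max_allowed_packet';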

Conclusion

The best and simplest way is to use the default mysqldump settings, i.e. with extended inserts.
Further, I found that it is better to uncompress the file, open the mysql console and use "source dump_table.sql". It's faster than using "zcat dump_table.gz | mysql .....".