Information About

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Thursday, 24 August 2006

VARCHAR versus Compression

Posted on 10:55 by Unknown
A couple of days ago I posted a blurb giving advice on using variable character columns in DB2. After thinking about the topic a little bit more, I decided to post a follow-on topic: namely, comparing the use of VARCHAR to DB2 compression.

Even though these are two entirely different "things," they are each probably done for similar reasons - to save disk storage. VARCHAR does this by adjusting the size of the column to fit the actual length of text being stored; compression does this by sending rows of data through an algorithm to minimize its length. For those interested in the details of compression I refer you to Willie Favero's excellent blog where he has written a several-part series on compression -- here are the links to it: part one, part two, and part three.

So, what advice can I give on comparing the two? Well, you might want to consider forgoing the use of variable columns and instead turn on compression. With variable columns you always add overhead: there is a two-byte prefix for every VARCHAR column to store the length of the VARCHAR. If instead you use CHAR and turn on compression you no longer need the extra two bytes per row per variable column.

Also, keep in mind that compression reduces the size of the entire row. So not only will you be compressing the CHAR column (that used to be VARCHAR), but you will also give DB2 the opportunity to compress every other column in that row.

All in all, that means that comrpession can return better disk storage savings than variable columns, and all without the programmatic overhead of having to calculate and store the two-byte prefix for each previously variable column.

Of course, I don't want to give the impression that this should always be done... (remember the DBA's maxim: Almost never say "always or never.") And there are additional things to consider, such as:
  • Compression adds a compression dictionary to the table space so a compressed table space can actually be larger than a non-compressed table space (if it is very small to begin with).
  • Compression requires additional CPU cycles to compress and de-compress the data as it is inserted, modified, and read (of course, I/O can decrease because smaller rows will fit more on each page, so degraded CPU performance can be offset by improved I/O)

This is just an additional "thing to consider" when you are building your DB2 databases and trying to decide whether you should use VARCHAR or CHAR...

Email ThisBlogThis!Share to XShare to Facebook
Posted in | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • DB2 for z/OS Version 9 Beta Announcement
    On May 2, 2006 IBM announced the beta for the next version of mainframe DB2: namely, DB2 V9.1 for z/OS. You can view the announcement here ....
  • Index on Expressions [DB2 9 for z/OS]
    DB2 9 for z/OS offers, for the first time, the ability to create an index on data that is not technically in the table. At this point, you m...
  • DB2 Locking, Part 8: LOBs and Locking
    When a row is read or modified in a table containing LOB columns, the application will obtain a normal transaction lock on the base table. T...
  • DBA Rules of Thumb - Part 8 (Being Business Savvy)
    Understand the Business, Not Just the Technology Remember that being technologically adept is just a part of being a good DBA. Although tech...
  • When Not to Index
    Answering a question I got via e-mail on indexing... Every now and then I take the opportunity to blog about a question I get through e-...
  • DBA Rules of Thumb - Part 1
    Over the years I have gathered, written, and assimilated multiple collections of general rules of the road that apply to the management disc...
  • DB2 Locking, Part 15: Tackling Timeout Troubles
    Many shops battle with locking issues and frequently, the cause of performance issues can be traced back to locking issues, more specificall...
  • Adding Column Names to an Unload File
    I received an e-mail from a reader asking an interesting question. She wanted to know if any of the DB2 unload utilities are able to include...
  • DB2 11 for z/OS: Coming Soon!
    Today, October 3, 2012, IBM announced the Early Support Program for the next version of DB2. Heretofore code-named Sequoia, the DB2 11 ESP i...
  • Can You Write a Redbook?
    If you've been working with mainframes for any period of time you have almost certainly become familiar with the IBM redbook. These are ...

Categories

  • .NET
  • ACID
  • ALTER
  • analytics
  • articles
  • automation
  • award
  • backup
  • best practices
  • BETWEEN
  • BI
  • Big Data
  • BIND
  • blogging
  • book review
  • bufferpool
  • buffers
  • CASE
  • change management
  • claim
  • Cognos
  • COMMIT
  • compliance
  • compression
  • conference
  • constraints
  • COPY
  • data
  • data breaches
  • data quality
  • data security
  • Data Sharing
  • data types
  • data warehouse
  • database archiving
  • database auditing
  • database design
  • date
  • DB2
  • DB2 10
  • DB2 11
  • DB2 9
  • DB2 Analystics Accelerator
  • DB2 Catalog
  • DB2 conversion
  • DB2 Developer's Guide
  • DB2 X
  • DB2-L
  • DBA
  • DDL
  • developerWorks
  • dirty read
  • DISPLAY
  • DL/1
  • drain
  • DSNZPARM
  • Dynamic SQL
  • eBook
  • education
  • enclave SRB
  • encryption
  • ERP
  • FETCH FIRST
  • Freakonomics
  • functions
  • generosity factor
  • Happy Holidays
  • Happy New Year
  • Hibernate
  • HIPAA
  • history
  • IBM
  • ICF
  • IDUG
  • IFL
  • IMS
  • index
  • Information Agenda
  • Informix
  • InfoSphere
  • infrastructure
  • integrity
  • IOD
  • IOD11
  • IOD2009
  • IOD2011
  • IODGC
  • IRLM
  • ISOLATION
  • Java
  • JDBC
  • load balancing
  • LOBs
  • locking
  • LUW
  • mainframe
  • Malcolm Gladwell
  • manuals
  • memory
  • middleware
  • migration
  • misc
  • monitoring
  • natural key
  • Netezza
  • new blog location
  • NoSQL
  • nulls
  • OLAP
  • optimization
  • Oracle versus DB2
  • packages
  • PCI-DSS
  • performance
  • PIECESIZE
  • poll
  • primary key
  • production data
  • programming
  • Q+A
  • QMF
  • REBIND
  • recovery
  • RedBook
  • regulatory compliance
  • reliability
  • REORG
  • research
  • RI
  • RTO
  • salaries
  • SAP
  • scalability
  • security
  • smarter planet
  • SoftwareOnZ
  • sort
  • SOX
  • specialty processors
  • SPUFI
  • SQL
  • Stage 1
  • Stage 2
  • standards
  • Steelers
  • storage
  • stored procedures
  • stream computing
  • surrogate key
  • SYSADM
  • Sysadmin
  • table expressions
  • table space
  • TechDoc
  • tips and tricks
  • Top Ten
  • trace
  • training
  • triggers
  • Twitter
  • UDFs
  • UNION
  • unstructured data
  • user groups
  • utilities
  • V1
  • V10
  • V2
  • V3
  • V4
  • V5
  • V6
  • V7
  • V8
  • V9
  • variables
  • views
  • VOLATILE
  • Web 2.0
  • webinar
  • Wordle
  • XML
  • z/OS
  • zAAP
  • zIIP

Blog Archive

  • ►  2014 (2)
    • ►  January (2)
  • ►  2013 (50)
    • ►  December (6)
    • ►  November (6)
    • ►  October (5)
    • ►  September (5)
    • ►  August (3)
    • ►  July (7)
    • ►  June (4)
    • ►  May (4)
    • ►  April (5)
    • ►  March (1)
    • ►  February (2)
    • ►  January (2)
  • ►  2012 (17)
    • ►  December (1)
    • ►  November (2)
    • ►  October (3)
    • ►  August (2)
    • ►  July (1)
    • ►  May (1)
    • ►  April (1)
    • ►  March (2)
    • ►  February (2)
    • ►  January (2)
  • ►  2011 (27)
    • ►  December (1)
    • ►  November (1)
    • ►  October (6)
    • ►  September (2)
    • ►  August (3)
    • ►  July (2)
    • ►  June (3)
    • ►  May (2)
    • ►  April (3)
    • ►  March (1)
    • ►  February (3)
  • ►  2010 (29)
    • ►  December (1)
    • ►  October (6)
    • ►  September (1)
    • ►  August (2)
    • ►  July (2)
    • ►  June (1)
    • ►  May (3)
    • ►  April (3)
    • ►  March (3)
    • ►  February (4)
    • ►  January (3)
  • ►  2009 (43)
    • ►  December (5)
    • ►  November (4)
    • ►  October (6)
    • ►  September (2)
    • ►  August (1)
    • ►  July (3)
    • ►  June (2)
    • ►  May (3)
    • ►  April (2)
    • ►  March (4)
    • ►  February (5)
    • ►  January (6)
  • ►  2008 (44)
    • ►  December (1)
    • ►  November (4)
    • ►  October (4)
    • ►  September (6)
    • ►  August (1)
    • ►  July (4)
    • ►  June (3)
    • ►  May (5)
    • ►  April (4)
    • ►  March (4)
    • ►  February (2)
    • ►  January (6)
  • ►  2007 (51)
    • ►  December (2)
    • ►  November (3)
    • ►  October (5)
    • ►  September (3)
    • ►  August (6)
    • ►  July (4)
    • ►  June (4)
    • ►  May (5)
    • ►  April (8)
    • ►  March (5)
    • ►  February (4)
    • ►  January (2)
  • ▼  2006 (60)
    • ►  November (4)
    • ►  October (8)
    • ►  September (4)
    • ▼  August (11)
      • Accessing Partitioned Data
      • How are Indexes Being Used?
      • VARCHAR versus Compression
      • IBM Mainframes - Not Just for Big Shops Any More
      • Advice on Using Variable Character Columns in DB2
      • Greatest Software Ever?
      • SHARE Travelers Take Heed
      • Where exactly is a DB2 plan stored?
      • Mainframe Weekly: A new mainframe-focused blog
      • Upcoming SHARE Conference
      • New Mainframe Redbook
    • ►  July (7)
    • ►  June (2)
    • ►  May (7)
    • ►  April (3)
    • ►  March (6)
    • ►  February (4)
    • ►  January (4)
  • ►  2005 (11)
    • ►  December (3)
    • ►  November (6)
    • ►  October (2)
Powered by Blogger.

About Me

Unknown
View my complete profile