Information About

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Friday, 29 April 2011

I'll Be Tweeting Live From IDUG

Posted on 14:15 by Unknown
For those of you who use Twitter, make sure you are following me next week (http://www.twitter.com/craigmullins) as I will be tweeting my experiences from the IDUG conference in Anaheim.

If you aren't planning to go, you can follow my Tweets to hear what is going on... and if you are attending the show, you can follow my Tweets to hear my perspective on things...

I arrive in Anaheim Tuesday afternoon, so I will miss the kickoff, but I'll be there the rest of the week.
Read More
Posted in IDUG | No comments

Tuesday, 26 April 2011

100 Years of IBM

Posted on 15:14 by Unknown

If you have anything at all to do with computers or information technology, you have something to thank IBM for. Watch this video to find out what!
Read More
Posted in IBM | No comments

Monday, 4 April 2011

What About Surrogate Keys?

Posted on 09:30 by Unknown

As is so often the case with my blog, today's topic came about as the result of an e-mail question I received from a DBA I know. His question was this:

"A great debate rages here about the use of ‘synthetic’ keys. We read all sorts of articles on the wild wild web but none seem to address the database performance impacts of designs using synthetic keys. I wondered if you could point me to any information on this…"

If you've ever Googled the term "surrogate key" you know the hornet's nest of opinions that swirls around "out there" about the topic. For those who haven't heard the term, here is my attempt at a quick summary: a surrogate key is a generated unique value that is used as the primary key of a database table; database designers tend to consider surrogate keys when the natural key consists of many columns, is very long, or may need to change.

And here is the response I sent to my e-mail inquisitor:

I doubt that there is any “final word” on this topic. It has been raging on for years and years; so folks pro, others con. This Wikipedia article offers up a nice start: http://en.wikipedia.org/wiki/Surrogate_key

However, when I get to the performance area of this article I don’t think I agree. The article puts a lot of emphasis on there being fewer columns to join and therefore better performance.. If you’ve got an index on those multiple columns how much “worse” will the performance be, really? Sure, the SQL is more difficult to write, but will a join over 4 or 5 indexed columns perform that much worse than a join on one indexed column? I suppose as the number of columns required for the natural key increases the impact could be greater (e.g. 10 columns???)

I guess I can see the argument if you are swapping a variable length key with a surrogate having a fixed length key – that should improve things!

Furthermore consider this: the natural key columns are still going to be there, after all, they are naturally part of the data, right? So the surrogate (synthetic) key gets added to each row. This will likely reduce the number of rows per page (maybe not, but probably). And that, in turn, will negatively impact the performance of sequential access because more I/O will be required to read the “same” number of rows.

And what about the impact of adding data? If there are a significant number of new rows being added at the same time by different processes there will be locking issues as they all try to put the new data on the same page (unless, of course, your surrogate key is not a sequential number and is, instead, something like the microseconds portion of the current timestamp [that must be tested to avoid duplicates]).

The one thing that usually causes me to tend to favor natural keys is just that – they are natural. If the data is naturally occurring it becomes easier for end users to remember it and use it. If it is a randomly generated surrogate nobody will actually know the data. Yes, this can be masked to a great deal based on the manner in which you build your applications to access the data, but ad hoc access becomes quite difficult.

I guess the bottom line is that “it depends” on a lot of different things! No surprise there, I suppose.

Here are a few other resources with information (not so much on performance though) that you may or may not have reviewed already:

  • http://www.agiledata.org/essays/keys.html
  • http://www.techrepublic.com/forum/discussions/10-85582-3172364
  • http://www.infobright.org/forums/viewthread/348/
  • http://www.bcarter.com/intsurr1.htm
  • http://weblogs.sqlteam.com/mladenp/archive/2009/10/06/Why-I-prefer-surrogate-keys-instead-of-natural-keys-in.aspx
  • http://www.sqlservercentral.com/articles/Primary+key/70747/ - this one talks about SQL Server performance

What do you think about natural keys versus surrogate keys? Surely some readers here have an opinion on this topic! If so, post them as comments...

Read More
Posted in database design, natural key, primary key, surrogate key | No comments
Newer Posts Older Posts Home
Subscribe to: Posts (Atom)

Popular Posts

  • DB2 for z/OS Version 9 Beta Announcement
    On May 2, 2006 IBM announced the beta for the next version of mainframe DB2: namely, DB2 V9.1 for z/OS. You can view the announcement here ....
  • Index on Expressions [DB2 9 for z/OS]
    DB2 9 for z/OS offers, for the first time, the ability to create an index on data that is not technically in the table. At this point, you m...
  • When Not to Index
    Answering a question I got via e-mail on indexing... Every now and then I take the opportunity to blog about a question I get through e-...
  • DB2 Locking, Part 8: LOBs and Locking
    When a row is read or modified in a table containing LOB columns, the application will obtain a normal transaction lock on the base table. T...
  • DBA Rules of Thumb - Part 8 (Being Business Savvy)
    Understand the Business, Not Just the Technology Remember that being technologically adept is just a part of being a good DBA. Although tech...
  • DBA Rules of Thumb - Part 1
    Over the years I have gathered, written, and assimilated multiple collections of general rules of the road that apply to the management disc...
  • Can You Write a Redbook?
    If you've been working with mainframes for any period of time you have almost certainly become familiar with the IBM redbook. These are ...
  • Adding Column Names to an Unload File
    I received an e-mail from a reader asking an interesting question. She wanted to know if any of the DB2 unload utilities are able to include...
  • DB2 Locking, Part 15: Tackling Timeout Troubles
    Many shops battle with locking issues and frequently, the cause of performance issues can be traced back to locking issues, more specificall...
  • UPDATE SCHEMA and CATMAINT [DB2 9 for z/OS]
    Welcome back to my blog as I continue our examination of the new features of DB2 9 for z/OS. Today we will look at the new UPDATE SCHEMA cap...

Categories

  • .NET
  • ACID
  • ALTER
  • analytics
  • articles
  • automation
  • award
  • backup
  • best practices
  • BETWEEN
  • BI
  • Big Data
  • BIND
  • blogging
  • book review
  • bufferpool
  • buffers
  • CASE
  • change management
  • claim
  • Cognos
  • COMMIT
  • compliance
  • compression
  • conference
  • constraints
  • COPY
  • data
  • data breaches
  • data quality
  • data security
  • Data Sharing
  • data types
  • data warehouse
  • database archiving
  • database auditing
  • database design
  • date
  • DB2
  • DB2 10
  • DB2 11
  • DB2 9
  • DB2 Analystics Accelerator
  • DB2 Catalog
  • DB2 conversion
  • DB2 Developer's Guide
  • DB2 X
  • DB2-L
  • DBA
  • DDL
  • developerWorks
  • dirty read
  • DISPLAY
  • DL/1
  • drain
  • DSNZPARM
  • Dynamic SQL
  • eBook
  • education
  • enclave SRB
  • encryption
  • ERP
  • FETCH FIRST
  • Freakonomics
  • functions
  • generosity factor
  • Happy Holidays
  • Happy New Year
  • Hibernate
  • HIPAA
  • history
  • IBM
  • ICF
  • IDUG
  • IFL
  • IMS
  • index
  • Information Agenda
  • Informix
  • InfoSphere
  • infrastructure
  • integrity
  • IOD
  • IOD11
  • IOD2009
  • IOD2011
  • IODGC
  • IRLM
  • ISOLATION
  • Java
  • JDBC
  • load balancing
  • LOBs
  • locking
  • LUW
  • mainframe
  • Malcolm Gladwell
  • manuals
  • memory
  • middleware
  • migration
  • misc
  • monitoring
  • natural key
  • Netezza
  • new blog location
  • NoSQL
  • nulls
  • OLAP
  • optimization
  • Oracle versus DB2
  • packages
  • PCI-DSS
  • performance
  • PIECESIZE
  • poll
  • primary key
  • production data
  • programming
  • Q+A
  • QMF
  • REBIND
  • recovery
  • RedBook
  • regulatory compliance
  • reliability
  • REORG
  • research
  • RI
  • RTO
  • salaries
  • SAP
  • scalability
  • security
  • smarter planet
  • SoftwareOnZ
  • sort
  • SOX
  • specialty processors
  • SPUFI
  • SQL
  • Stage 1
  • Stage 2
  • standards
  • Steelers
  • storage
  • stored procedures
  • stream computing
  • surrogate key
  • SYSADM
  • Sysadmin
  • table expressions
  • table space
  • TechDoc
  • tips and tricks
  • Top Ten
  • trace
  • training
  • triggers
  • Twitter
  • UDFs
  • UNION
  • unstructured data
  • user groups
  • utilities
  • V1
  • V10
  • V2
  • V3
  • V4
  • V5
  • V6
  • V7
  • V8
  • V9
  • variables
  • views
  • VOLATILE
  • Web 2.0
  • webinar
  • Wordle
  • XML
  • z/OS
  • zAAP
  • zIIP

Blog Archive

  • ►  2014 (2)
    • ►  January (2)
  • ►  2013 (50)
    • ►  December (6)
    • ►  November (6)
    • ►  October (5)
    • ►  September (5)
    • ►  August (3)
    • ►  July (7)
    • ►  June (4)
    • ►  May (4)
    • ►  April (5)
    • ►  March (1)
    • ►  February (2)
    • ►  January (2)
  • ►  2012 (17)
    • ►  December (1)
    • ►  November (2)
    • ►  October (3)
    • ►  August (2)
    • ►  July (1)
    • ►  May (1)
    • ►  April (1)
    • ►  March (2)
    • ►  February (2)
    • ►  January (2)
  • ▼  2011 (27)
    • ►  December (1)
    • ►  November (1)
    • ►  October (6)
    • ►  September (2)
    • ►  August (3)
    • ►  July (2)
    • ►  June (3)
    • ►  May (2)
    • ▼  April (3)
      • I'll Be Tweeting Live From IDUG
      • 100 Years of IBM
      • What About Surrogate Keys?
    • ►  March (1)
    • ►  February (3)
  • ►  2010 (29)
    • ►  December (1)
    • ►  October (6)
    • ►  September (1)
    • ►  August (2)
    • ►  July (2)
    • ►  June (1)
    • ►  May (3)
    • ►  April (3)
    • ►  March (3)
    • ►  February (4)
    • ►  January (3)
  • ►  2009 (43)
    • ►  December (5)
    • ►  November (4)
    • ►  October (6)
    • ►  September (2)
    • ►  August (1)
    • ►  July (3)
    • ►  June (2)
    • ►  May (3)
    • ►  April (2)
    • ►  March (4)
    • ►  February (5)
    • ►  January (6)
  • ►  2008 (44)
    • ►  December (1)
    • ►  November (4)
    • ►  October (4)
    • ►  September (6)
    • ►  August (1)
    • ►  July (4)
    • ►  June (3)
    • ►  May (5)
    • ►  April (4)
    • ►  March (4)
    • ►  February (2)
    • ►  January (6)
  • ►  2007 (51)
    • ►  December (2)
    • ►  November (3)
    • ►  October (5)
    • ►  September (3)
    • ►  August (6)
    • ►  July (4)
    • ►  June (4)
    • ►  May (5)
    • ►  April (8)
    • ►  March (5)
    • ►  February (4)
    • ►  January (2)
  • ►  2006 (60)
    • ►  November (4)
    • ►  October (8)
    • ►  September (4)
    • ►  August (11)
    • ►  July (7)
    • ►  June (2)
    • ►  May (7)
    • ►  April (3)
    • ►  March (6)
    • ►  February (4)
    • ►  January (4)
  • ►  2005 (11)
    • ►  December (3)
    • ►  November (6)
    • ►  October (2)
Powered by Blogger.

About Me

Unknown
View my complete profile