Information About

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Thursday, 31 August 2006

Accessing Partitioned Data

Posted on 08:12 by Unknown
One area that tends to confuse some DB2 developers until they gain experience is how DB2 partitioning works. A common question I get goes something like this: “If a table is in a partitioned table space, say four partitions, for example, then to process the table in batch can I run four instances of the batch program in parallel, one against each partition. What do I code to do this?”

Well, the short and sweet answer to this question is “Yes, you can run four instances of a batch program in parallel if you so desire.” But there is a nuance to this misconception that might be missed here. The question lurking beneath the question is this: “How can I make sure I am accessing only data in one partition in each of the batch programs?”

To do this requires some programming work. The program will need to have a means of identifying which partition it should run against. So, you might code the program to accept an input parameter of 1, 2, 3, or 4. The program would read the parameter and translate it into the key range of values that should be read by the program. This is the LIMITKEY value for the particular partition as found in the partitioning index. You can retrieve this value from the DB2 catalog using the following query:

SELECT PARTITION, LIMITKEY
FROM SYSIBM.SYSINDEXPART
WHERE IXNAME = ?
AND IXCREATOR = ?
ORDER BY PARTITION;

Supply the index name and creator and this query will return the partition number and LIMITKEY for that partition. (If you include this query in the program you probably will want to include the PARTITION column in the WHERE clause and return only a single row.) The LIMITKEY is the high key value for that partition. Using this information you will need to write the queries in the program such that only values from the partition being processes will be retrieved. As long as the program adheres to that key range you should only process data from the one partition that holds that data.

Of course, none of this is necessary to get DB2 to operate in parallel. The best approach uses DB2 query parallelism because it minimizes the amount of programming work and has the potential to maximize performance. To signal DB2 to turn on parallelism you will need to BIND your program specifying the DEGREE(ANY) parameter. Then DB2 will choose the degree of parallelism for the batch program. You will only need to run the program once (instead of 4 times as in our example); DB2 will figure out how many tasks it can run in parallel for each query in the program. And there is no need to modify the program at all! This is far simpler than any alternate approach because DB2 handles chunking up the work into parallel tasks for you.
Email ThisBlogThis!Share to XShare to Facebook
Posted in | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • DB2 for z/OS Version 9 Beta Announcement
    On May 2, 2006 IBM announced the beta for the next version of mainframe DB2: namely, DB2 V9.1 for z/OS. You can view the announcement here ....
  • Index on Expressions [DB2 9 for z/OS]
    DB2 9 for z/OS offers, for the first time, the ability to create an index on data that is not technically in the table. At this point, you m...
  • DB2 Locking, Part 8: LOBs and Locking
    When a row is read or modified in a table containing LOB columns, the application will obtain a normal transaction lock on the base table. T...
  • DBA Rules of Thumb - Part 8 (Being Business Savvy)
    Understand the Business, Not Just the Technology Remember that being technologically adept is just a part of being a good DBA. Although tech...
  • When Not to Index
    Answering a question I got via e-mail on indexing... Every now and then I take the opportunity to blog about a question I get through e-...
  • DBA Rules of Thumb - Part 1
    Over the years I have gathered, written, and assimilated multiple collections of general rules of the road that apply to the management disc...
  • DB2 Locking, Part 15: Tackling Timeout Troubles
    Many shops battle with locking issues and frequently, the cause of performance issues can be traced back to locking issues, more specificall...
  • Adding Column Names to an Unload File
    I received an e-mail from a reader asking an interesting question. She wanted to know if any of the DB2 unload utilities are able to include...
  • DB2 11 for z/OS: Coming Soon!
    Today, October 3, 2012, IBM announced the Early Support Program for the next version of DB2. Heretofore code-named Sequoia, the DB2 11 ESP i...
  • Can You Write a Redbook?
    If you've been working with mainframes for any period of time you have almost certainly become familiar with the IBM redbook. These are ...

Categories

  • .NET
  • ACID
  • ALTER
  • analytics
  • articles
  • automation
  • award
  • backup
  • best practices
  • BETWEEN
  • BI
  • Big Data
  • BIND
  • blogging
  • book review
  • bufferpool
  • buffers
  • CASE
  • change management
  • claim
  • Cognos
  • COMMIT
  • compliance
  • compression
  • conference
  • constraints
  • COPY
  • data
  • data breaches
  • data quality
  • data security
  • Data Sharing
  • data types
  • data warehouse
  • database archiving
  • database auditing
  • database design
  • date
  • DB2
  • DB2 10
  • DB2 11
  • DB2 9
  • DB2 Analystics Accelerator
  • DB2 Catalog
  • DB2 conversion
  • DB2 Developer's Guide
  • DB2 X
  • DB2-L
  • DBA
  • DDL
  • developerWorks
  • dirty read
  • DISPLAY
  • DL/1
  • drain
  • DSNZPARM
  • Dynamic SQL
  • eBook
  • education
  • enclave SRB
  • encryption
  • ERP
  • FETCH FIRST
  • Freakonomics
  • functions
  • generosity factor
  • Happy Holidays
  • Happy New Year
  • Hibernate
  • HIPAA
  • history
  • IBM
  • ICF
  • IDUG
  • IFL
  • IMS
  • index
  • Information Agenda
  • Informix
  • InfoSphere
  • infrastructure
  • integrity
  • IOD
  • IOD11
  • IOD2009
  • IOD2011
  • IODGC
  • IRLM
  • ISOLATION
  • Java
  • JDBC
  • load balancing
  • LOBs
  • locking
  • LUW
  • mainframe
  • Malcolm Gladwell
  • manuals
  • memory
  • middleware
  • migration
  • misc
  • monitoring
  • natural key
  • Netezza
  • new blog location
  • NoSQL
  • nulls
  • OLAP
  • optimization
  • Oracle versus DB2
  • packages
  • PCI-DSS
  • performance
  • PIECESIZE
  • poll
  • primary key
  • production data
  • programming
  • Q+A
  • QMF
  • REBIND
  • recovery
  • RedBook
  • regulatory compliance
  • reliability
  • REORG
  • research
  • RI
  • RTO
  • salaries
  • SAP
  • scalability
  • security
  • smarter planet
  • SoftwareOnZ
  • sort
  • SOX
  • specialty processors
  • SPUFI
  • SQL
  • Stage 1
  • Stage 2
  • standards
  • Steelers
  • storage
  • stored procedures
  • stream computing
  • surrogate key
  • SYSADM
  • Sysadmin
  • table expressions
  • table space
  • TechDoc
  • tips and tricks
  • Top Ten
  • trace
  • training
  • triggers
  • Twitter
  • UDFs
  • UNION
  • unstructured data
  • user groups
  • utilities
  • V1
  • V10
  • V2
  • V3
  • V4
  • V5
  • V6
  • V7
  • V8
  • V9
  • variables
  • views
  • VOLATILE
  • Web 2.0
  • webinar
  • Wordle
  • XML
  • z/OS
  • zAAP
  • zIIP

Blog Archive

  • ►  2014 (2)
    • ►  January (2)
  • ►  2013 (50)
    • ►  December (6)
    • ►  November (6)
    • ►  October (5)
    • ►  September (5)
    • ►  August (3)
    • ►  July (7)
    • ►  June (4)
    • ►  May (4)
    • ►  April (5)
    • ►  March (1)
    • ►  February (2)
    • ►  January (2)
  • ►  2012 (17)
    • ►  December (1)
    • ►  November (2)
    • ►  October (3)
    • ►  August (2)
    • ►  July (1)
    • ►  May (1)
    • ►  April (1)
    • ►  March (2)
    • ►  February (2)
    • ►  January (2)
  • ►  2011 (27)
    • ►  December (1)
    • ►  November (1)
    • ►  October (6)
    • ►  September (2)
    • ►  August (3)
    • ►  July (2)
    • ►  June (3)
    • ►  May (2)
    • ►  April (3)
    • ►  March (1)
    • ►  February (3)
  • ►  2010 (29)
    • ►  December (1)
    • ►  October (6)
    • ►  September (1)
    • ►  August (2)
    • ►  July (2)
    • ►  June (1)
    • ►  May (3)
    • ►  April (3)
    • ►  March (3)
    • ►  February (4)
    • ►  January (3)
  • ►  2009 (43)
    • ►  December (5)
    • ►  November (4)
    • ►  October (6)
    • ►  September (2)
    • ►  August (1)
    • ►  July (3)
    • ►  June (2)
    • ►  May (3)
    • ►  April (2)
    • ►  March (4)
    • ►  February (5)
    • ►  January (6)
  • ►  2008 (44)
    • ►  December (1)
    • ►  November (4)
    • ►  October (4)
    • ►  September (6)
    • ►  August (1)
    • ►  July (4)
    • ►  June (3)
    • ►  May (5)
    • ►  April (4)
    • ►  March (4)
    • ►  February (2)
    • ►  January (6)
  • ►  2007 (51)
    • ►  December (2)
    • ►  November (3)
    • ►  October (5)
    • ►  September (3)
    • ►  August (6)
    • ►  July (4)
    • ►  June (4)
    • ►  May (5)
    • ►  April (8)
    • ►  March (5)
    • ►  February (4)
    • ►  January (2)
  • ▼  2006 (60)
    • ►  November (4)
    • ►  October (8)
    • ►  September (4)
    • ▼  August (11)
      • Accessing Partitioned Data
      • How are Indexes Being Used?
      • VARCHAR versus Compression
      • IBM Mainframes - Not Just for Big Shops Any More
      • Advice on Using Variable Character Columns in DB2
      • Greatest Software Ever?
      • SHARE Travelers Take Heed
      • Where exactly is a DB2 plan stored?
      • Mainframe Weekly: A new mainframe-focused blog
      • Upcoming SHARE Conference
      • New Mainframe Redbook
    • ►  July (7)
    • ►  June (2)
    • ►  May (7)
    • ►  April (3)
    • ►  March (6)
    • ►  February (4)
    • ►  January (4)
  • ►  2005 (11)
    • ►  December (3)
    • ►  November (6)
    • ►  October (2)
Powered by Blogger.

About Me

Unknown
View my complete profile