Information About

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Thursday, 31 August 2006

Accessing Partitioned Data

Posted on 08:12 by Unknown
One area that tends to confuse some DB2 developers until they gain experience is how DB2 partitioning works. A common question I get goes something like this: “If a table is in a partitioned table space, say four partitions, for example, then to process the table in batch can I run four instances of the batch program in parallel, one against each partition. What do I code to do this?”

Well, the short and sweet answer to this question is “Yes, you can run four instances of a batch program in parallel if you so desire.” But there is a nuance to this misconception that might be missed here. The question lurking beneath the question is this: “How can I make sure I am accessing only data in one partition in each of the batch programs?”

To do this requires some programming work. The program will need to have a means of identifying which partition it should run against. So, you might code the program to accept an input parameter of 1, 2, 3, or 4. The program would read the parameter and translate it into the key range of values that should be read by the program. This is the LIMITKEY value for the particular partition as found in the partitioning index. You can retrieve this value from the DB2 catalog using the following query:

SELECT PARTITION, LIMITKEY
FROM SYSIBM.SYSINDEXPART
WHERE IXNAME = ?
AND IXCREATOR = ?
ORDER BY PARTITION;

Supply the index name and creator and this query will return the partition number and LIMITKEY for that partition. (If you include this query in the program you probably will want to include the PARTITION column in the WHERE clause and return only a single row.) The LIMITKEY is the high key value for that partition. Using this information you will need to write the queries in the program such that only values from the partition being processes will be retrieved. As long as the program adheres to that key range you should only process data from the one partition that holds that data.

Of course, none of this is necessary to get DB2 to operate in parallel. The best approach uses DB2 query parallelism because it minimizes the amount of programming work and has the potential to maximize performance. To signal DB2 to turn on parallelism you will need to BIND your program specifying the DEGREE(ANY) parameter. Then DB2 will choose the degree of parallelism for the batch program. You will only need to run the program once (instead of 4 times as in our example); DB2 will figure out how many tasks it can run in parallel for each query in the program. And there is no need to modify the program at all! This is far simpler than any alternate approach because DB2 handles chunking up the work into parallel tasks for you.
Email ThisBlogThis!Share to XShare to Facebook
Posted in | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • DB2 Locking, Part 8: LOBs and Locking
    When a row is read or modified in a table containing LOB columns, the application will obtain a normal transaction lock on the base table. T...
  • DB2 for z/OS Version 9 Beta Announcement
    On May 2, 2006 IBM announced the beta for the next version of mainframe DB2: namely, DB2 V9.1 for z/OS. You can view the announcement here ....
  • Index on Expressions [DB2 9 for z/OS]
    DB2 9 for z/OS offers, for the first time, the ability to create an index on data that is not technically in the table. At this point, you m...
  • When Not to Index
    Answering a question I got via e-mail on indexing... Every now and then I take the opportunity to blog about a question I get through e-...
  • CLONE Tables [DB2 9 for z/OS]
    This new feature in DB2 V9 might sound like an old monster movie ( Invasion of the Clone Tables!!! ), but it is actually a nifty new capabil...
  • DB2 Locking, Part 2: Table Space and Table Locks
    Today's post is the second in our DB2 locking series and it covers the topic of table space and table locks. Table Space Locks A table s...
  • IBM Manages the Data Lifecycle
    Data lifecycle is a somewhat new-ish term, at least in terms of what I plan to talk about in this blog posting. The data lifecycle – and dat...
  • IBM Announces DB2 10 for z/OS Beta Program
    IBM announced the beta program for the next version of DB2 today, now "officially" known as DB2 10 (no more DB2 X). It is a closed...
  • DB2 Locking, Part 5: Lock Suspensions, Timeouts, and Deadlocks
    The longer a lock is held, the greater the potential impact to other applications. When an application requests a lock that is already held ...
  • DB2 Locking, Part 15: Tackling Timeout Troubles
    Many shops battle with locking issues and frequently, the cause of performance issues can be traced back to locking issues, more specificall...

Categories

  • .NET
  • ACID
  • ALTER
  • analytics
  • articles
  • automation
  • award
  • backup
  • best practices
  • BETWEEN
  • BI
  • Big Data
  • BIND
  • blogging
  • book review
  • bufferpool
  • buffers
  • CASE
  • change management
  • claim
  • Cognos
  • COMMIT
  • compliance
  • compression
  • conference
  • constraints
  • COPY
  • data
  • data breaches
  • data quality
  • data security
  • Data Sharing
  • data types
  • data warehouse
  • database archiving
  • database auditing
  • database design
  • date
  • DB2
  • DB2 10
  • DB2 11
  • DB2 9
  • DB2 Analystics Accelerator
  • DB2 Catalog
  • DB2 conversion
  • DB2 Developer's Guide
  • DB2 X
  • DB2-L
  • DBA
  • DDL
  • developerWorks
  • dirty read
  • DISPLAY
  • DL/1
  • drain
  • DSNZPARM
  • Dynamic SQL
  • eBook
  • education
  • enclave SRB
  • encryption
  • ERP
  • FETCH FIRST
  • Freakonomics
  • functions
  • generosity factor
  • Happy Holidays
  • Happy New Year
  • Hibernate
  • HIPAA
  • history
  • IBM
  • ICF
  • IDUG
  • IFL
  • IMS
  • index
  • Information Agenda
  • Informix
  • InfoSphere
  • infrastructure
  • integrity
  • IOD
  • IOD11
  • IOD2009
  • IOD2011
  • IODGC
  • IRLM
  • ISOLATION
  • Java
  • JDBC
  • load balancing
  • LOBs
  • locking
  • LUW
  • mainframe
  • Malcolm Gladwell
  • manuals
  • memory
  • middleware
  • migration
  • misc
  • monitoring
  • natural key
  • Netezza
  • new blog location
  • NoSQL
  • nulls
  • OLAP
  • optimization
  • Oracle versus DB2
  • packages
  • PCI-DSS
  • performance
  • PIECESIZE
  • poll
  • primary key
  • production data
  • programming
  • Q+A
  • QMF
  • REBIND
  • recovery
  • RedBook
  • regulatory compliance
  • reliability
  • REORG
  • research
  • RI
  • RTO
  • salaries
  • SAP
  • scalability
  • security
  • smarter planet
  • SoftwareOnZ
  • sort
  • SOX
  • specialty processors
  • SPUFI
  • SQL
  • Stage 1
  • Stage 2
  • standards
  • Steelers
  • storage
  • stored procedures
  • stream computing
  • surrogate key
  • SYSADM
  • Sysadmin
  • table expressions
  • table space
  • TechDoc
  • tips and tricks
  • Top Ten
  • trace
  • training
  • triggers
  • Twitter
  • UDFs
  • UNION
  • unstructured data
  • user groups
  • utilities
  • V1
  • V10
  • V2
  • V3
  • V4
  • V5
  • V6
  • V7
  • V8
  • V9
  • variables
  • views
  • VOLATILE
  • Web 2.0
  • webinar
  • Wordle
  • XML
  • z/OS
  • zAAP
  • zIIP

Blog Archive

  • ►  2014 (2)
    • ►  January (2)
  • ►  2013 (50)
    • ►  December (6)
    • ►  November (6)
    • ►  October (5)
    • ►  September (5)
    • ►  August (3)
    • ►  July (7)
    • ►  June (4)
    • ►  May (4)
    • ►  April (5)
    • ►  March (1)
    • ►  February (2)
    • ►  January (2)
  • ►  2012 (17)
    • ►  December (1)
    • ►  November (2)
    • ►  October (3)
    • ►  August (2)
    • ►  July (1)
    • ►  May (1)
    • ►  April (1)
    • ►  March (2)
    • ►  February (2)
    • ►  January (2)
  • ►  2011 (27)
    • ►  December (1)
    • ►  November (1)
    • ►  October (6)
    • ►  September (2)
    • ►  August (3)
    • ►  July (2)
    • ►  June (3)
    • ►  May (2)
    • ►  April (3)
    • ►  March (1)
    • ►  February (3)
  • ►  2010 (29)
    • ►  December (1)
    • ►  October (6)
    • ►  September (1)
    • ►  August (2)
    • ►  July (2)
    • ►  June (1)
    • ►  May (3)
    • ►  April (3)
    • ►  March (3)
    • ►  February (4)
    • ►  January (3)
  • ►  2009 (43)
    • ►  December (5)
    • ►  November (4)
    • ►  October (6)
    • ►  September (2)
    • ►  August (1)
    • ►  July (3)
    • ►  June (2)
    • ►  May (3)
    • ►  April (2)
    • ►  March (4)
    • ►  February (5)
    • ►  January (6)
  • ►  2008 (44)
    • ►  December (1)
    • ►  November (4)
    • ►  October (4)
    • ►  September (6)
    • ►  August (1)
    • ►  July (4)
    • ►  June (3)
    • ►  May (5)
    • ►  April (4)
    • ►  March (4)
    • ►  February (2)
    • ►  January (6)
  • ►  2007 (51)
    • ►  December (2)
    • ►  November (3)
    • ►  October (5)
    • ►  September (3)
    • ►  August (6)
    • ►  July (4)
    • ►  June (4)
    • ►  May (5)
    • ►  April (8)
    • ►  March (5)
    • ►  February (4)
    • ►  January (2)
  • ▼  2006 (60)
    • ►  November (4)
    • ►  October (8)
    • ►  September (4)
    • ▼  August (11)
      • Accessing Partitioned Data
      • How are Indexes Being Used?
      • VARCHAR versus Compression
      • IBM Mainframes - Not Just for Big Shops Any More
      • Advice on Using Variable Character Columns in DB2
      • Greatest Software Ever?
      • SHARE Travelers Take Heed
      • Where exactly is a DB2 plan stored?
      • Mainframe Weekly: A new mainframe-focused blog
      • Upcoming SHARE Conference
      • New Mainframe Redbook
    • ►  July (7)
    • ►  June (2)
    • ►  May (7)
    • ►  April (3)
    • ►  March (6)
    • ►  February (4)
    • ►  January (4)
  • ►  2005 (11)
    • ►  December (3)
    • ►  November (6)
    • ►  October (2)
Powered by Blogger.

About Me

Unknown
View my complete profile