2.5. Detailed check of unit of work partitioning (DBMS_PARALLEL_EXECUTE)

This is related to checking the details of work unit division. Examine the degree to which the result of dividing work units (chunks) by ROWID is evenly divided, whether the sum of work units is the same as the whole, and whether there are no omissions, and the correlation between the number of work units and the number of jobs.

This is a continuation of the previous article.

2. Parallel processing case of ROWID division method

2.5. Work unit division details check

Let's take a look at the following to see if the unit of work is well-divided.

  • Uniformity of work units partitioned by ROWID
  • Check that there are no missing units of work divided by ROWID
  • Correlation between the number of work units (chunks) and the PARALLEL_LEVEL (the number of jobs to be executed)

2.5.1. Uniformity of work units partitioned by ROWID

The following division means to divide the Z_DPE_TEST_TAB table into 10,000 work units based on ROW Count.

-- 2단계: 작업 단위 분할
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(TASK_NAME   => 'DPE_TEST(BY ROWID)',
                                               TABLE_OWNER => USER,
                                               TABLE_NAME  => 'Z_DPE_TEST_TAB',
                                               BY_ROW      => TRUE,
                                               CHUNK_SIZE  => 10000);
END;
/

Let's see if it really splits by 10,000.

-- 작업단위 개수 확인 --> 115
SELECT COUNT(*)
  FROM USER_PARALLEL_EXECUTE_CHUNKS
WHERE TASK_NAME = 'DPE_TEST(BY ROWID)';

Since the total number of rows is 1,000,000, it seems that it should be divided into 100 chunks if it is divided into 10,000 rows, but in reality it is divided into 115 chunks.

If you calculate the row count for each chunk,

SELECT  B.CHUNK_ID, COUNT(A.ID), MIN(A.ID) MIN_ID, MAX(A.ID) MAX_ID
  FROM  Z_DPE_TEST_TAB A,
        (
        SELECT  CHUNK_ID, START_ROWID, END_ROWID
          FROM  USER_PARALLEL_EXECUTE_CHUNKS
         WHERE  TASK_NAME = 'DPE_TEST(BY ROWID)'
        ) B
 WHERE  A.ROWID BETWEEN B.START_ROWID AND B.END_ROWID
 GROUP  BY B.CHUNK_ID
 ORDER  BY B.CHUNK_ID;

You can see that it is divided into various Counts, such as 1334, 2064, and 1886, not 10,000.

chunk별 row count
row count per chunk

The number of rows contained in the chunk is 14 in the 1,000-2,000 section, 2 in the 2,000 row section, 33 in the 6,000 row section, 64 in the 11,000 row section, and 2 in the 12,000 row section. (It may not give the same result every time.)

Row수 구간별 chunk수
Number of Rows Number of chunks per section

As such, it is not divided into a completely equal number of rows. This seems to be because the ROWID partitioning method is based on DBA_EXTENTS, and the number of blocks included in each EXTENT and the ROW Count of each block can be different. For reference, if there are unused BLOCKs in the table by DELETE, etc., a ROWID section where no data actually exists may be created as a chunk. This is also expected to occur when data is deleted after being allocated to DBA_EXTENT, or when there are EXTENTs that have been allocated but do not contain actual data.

2.5.2. Check that there are no missing units of work divided by ROWID

Could there be missing data when the START_ROWID and END_ROWID sections of each chunk are expanded?

First, let's check the row count.

-- Chunk의 RowID range로 추출한 Row Count
SELECT  COUNT(A.ID) ROW_COUNT
  FROM  Z_DPE_TEST_TAB A,
        (
        SELECT  CHUNK_ID, START_ROWID, END_ROWID
          FROM  USER_PARALLEL_EXECUTE_CHUNKS
         WHERE  TASK_NAME = 'DPE_TEST(BY ROWID)'
        ) B
 WHERE  A.ROWID BETWEEN B.START_ROWID AND B.END_ROWID;
row count 일치 확인
Check row count matches

Matching the total row count of 1,000,000, once confirmed that the count was not missing, and then, based on the data extracted in the range of START_ROWID to END_ROWID of Chunk, when the original data was LEFT OUTER JOINed, missing (missing) ) to see if the data exists.

-- Chunk의 RowID로 추출한 데이터의 누락없음 확인
SELECT  *
  FROM  Z_DPE_TEST_TAB A LEFT OUTER JOIN
        (
        SELECT  A.ID  -- Chunk의 RowID range로 추출한 데이터
          FROM  Z_DPE_TEST_TAB A,
                (
                SELECT  CHUNK_ID, START_ROWID, END_ROWID
                  FROM  USER_PARALLEL_EXECUTE_CHUNKS
                 WHERE  TASK_NAME = 'DPE_TEST(BY ROWID)'
                ) B
         WHERE  A.ROWID BETWEEN B.START_ROWID AND B.END_ROWID
        ) B ON (A.ID = B.ID)
 WHERE  B.ID IS NULL;
데이터 누락 없음 확인
Make sure no data is missing

From the above results, it can be seen that there is no missing data.

2.5.3. Correlation between the number of work units (chunks) and the PARALLEL_LEVEL (the number of jobs to be executed)

<2.3. In Task Execution>, there was the following content.

PARALLEL_LEVEL means the number of jobs to be executed simultaneously, that is, the degree of parallelism (DOP), and may be equal to or smaller than the number of chunks, which are work units. In the same case, one job processes one chunk, and in a small case, one job processes several chunks.

* Reference: 2. Parallel processing case of ROWID division method_2.3. job run

Consider the following cases regarding the number of chunks and the number of jobs.

▼ When the number of chunks and the number of jobs are the same (Chunk = Job)

When one job completes the execution of the assigned chunk, it ends because there are no more jobs to run.

Chunk수와 Job수가 동일한 경우
When the number of chunks and the number of jobs are the same

▼ When the number of jobs is less than the number of chunks (Chunk > Job)

When the job of the chunk to which the job is assigned is completed, the chunk that has not yet been executed is allocated and executed continuously. Below is an example when there are 3 jobs.

Chunk수보다 Job수가 적은 경우 (Chunk > Job)
When the number of jobs is less than the number of chunks (Chunk > Job)

In addition, if the number of jobs is greater than the number of chunks (Chunk < Job), it is also possible, but it is better not to apply it because chunks are not allocated and unnecessary resources can be used by maintaining the running state.

* Reference: DBMS_PARALLEL_EXECUTE Chunk by ROWID Example


So far, we have looked at the ROWID partitioning method. Next, look at the NUMBER column partitioning method.

Leave a Reply

Your email address will not be published. Required fields are marked *

en_USEnglish