Oracle Character Set Conversion (4): 4.Configuring the Test Environment

Continuing from the previous article, we look at the configuration of the Oracle Character Set conversion test environment. You can check Oracle server environment configuration (three instances, DB Link creation), test table creation, and test data creation scripts. This test environment is used in the character set conversion process later.

This is a continuation of the previous article.

Oracle Character Set Conversion (3): 3. Client Environment Configuration (2)

4. Configure Oracle Character Set Conversion Test Environment

4.1. Oracle DB Server Environment Configuration

It is assumed that the test environment for Oracle Character Set conversion is configured as follows on the Local PC (Windows).

Oracle DB 서버 환경 구성
Oracle DB server environment configuration

Three Oracle DB instances were installed on one PC, and the configuration of each is as follows.

EnvironmentSIDServer Character SetOracle VersionTest User (Schema)
US7ASCII Current DBORAUS7US7ASCII11.2.0.1.0LEG
KO16MSWIN949 Current DBORAMSWIN949KO16MSWIN94911.2.0.1.0LEG
AL32UTF8 Target DBORAUTFAL32UTF811.2.0.1.0LEG

DB Link is configured as follows.

Generated SIDTarget SIDConnection User (Schema)purpose
ORAMSWIN949ORAUS7LEGUS7ASCII access and conversion test on KO16MSWIN949
ORAUTFORAUS7LEGAL32UTF8 to US7ASCII access and conversion test

The DB Link creation script is as follows.

CREATE DATABASE LINK DL_US7
CONNECT TO leg IDENTIFIED BY leg
USING '(DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ORAUS7)
    )
  )'
;

4.2. test table configuration

The data model for testing is as follows.

Oracle Character Set 변환 테스트 테이블 구성
Oracle Character Set Conversion Test Table Configuration

(For testing purposes, all additional details such as normalization and identifier configuration are omitted. The same length is applied to US7ASCII and MSWIN949, and the length is increased to AL32UTF8.)

The data model above is a structure in which two items (reference description, reference image) for testing are added to the data set of <Information on the number of people getting on and off by subway line and station in Seoul> disclosed by the Seoul Metropolitan Government.

source: https://data.seoul.go.kr/dataList/OA-12252/S/1/datasetView.do

The DDL that creates the schema of the test data model above is as follows.

-- 한글 2 byte 체계 스키마 생성(US7ASCII, MSWIN949)
CREATE TABLE SUB_MON_STAT 
(
      USE_MON            VARCHAR2(8),
      LINE_NUM           VARCHAR2(14), -- AL32UTF8: 50
      SUB_STA_ID         VARCHAR2(4),
      SUB_STA_NM         VARCHAR2(20), -- AL32UTF8: 100
      RIDE_PASGR_NUM     NUMBER(10),
      ALIGHT_PASGR_NUM   NUMBER(10),
      WORK_DT            VARCHAR2(8),
      COMMT              VARCHAR2(4000),
      REF_DES            CLOB,
      REF_IMG            BLOB
);

-- 한글 3 byte 체계 스키마 생성(AL32UTF8)
CREATE TABLE SUB_MON_STAT 
(
      USE_MON            VARCHAR2(8),
      LINE_NUM           VARCHAR2(50),
      SUB_STA_ID         VARCHAR2(4),
      SUB_STA_NM         VARCHAR2(100),
      RIDE_PASGR_NUM     NUMBER(10),
      ALIGHT_PASGR_NUM   NUMBER(10),
      WORK_DT            VARCHAR2(8),
      COMMT              VARCHAR2(4000),
      REF_DES            CLOB,
      REF_IMG            BLOB
);

In AL32UTF8, 3 bytes are required for 1 Korean character, so the length of the two columns is increased sufficiently.

Data files in Excel format can be downloaded from the github path below.

* Excel data file: https://github.com/DAToolset/Oracle-CharacterSet-Conversion/raw/main/Number of passengers by subway station_Seoul Open Data_2013.xlsx

The DML for entering data into the table can be downloaded from the github path below.

* DML: https://github.com/DAToolset/Oracle-CharacterSet-Conversion/raw/main/Number of passengers by subway station_Seoul Open Data_2013.sql

Some excerpts from the above sql file are pasted below.

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT, COMMT, REF_DES, REF_IMG)
VALUES('201301', '1호선', '0150', '서울역', 2199181, 1855268, '20130723', RPAD('서울역', 3000, '서울역'), RPAD('서울역', 3000, '서울역'), UTL_RAW.CAST_TO_RAW('12345'));

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '1호선', '0151', '시청', 737005, 726892, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '1호선', '0152', '종각', 1542710, 1446464, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '1호선', '0153', '종로3가', 1159253, 1126995, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '1호선', '0154', '종로5가', 816307, 820769, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '1호선', '0155', '동대문', 446527, 512163, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '1호선', '0156', '신설동', 496150, 484583, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '1호선', '0157', '제기동', 587740, 601913, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '1호선', '0158', '청량리(지하)', 966341, 985741, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '1호선', '0159', '동묘앞', 240864, 252523, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '2호선', '0202', '을지로입구', 1610744, 1607133, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '2호선', '0203', '을지로3가', 539622, 559866, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '2호선', '0204', '을지로4가', 364217, 365994, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '2호선', '0205', '동대문역사문화공원', 435358, 508405, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '2호선', '0206', '신당', 407110, 438543, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '2호선', '0207', '상왕십리', 288629, 274519, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '2호선', '0208', '왕십리', 583444, 498091, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '2호선', '0209', '한양대', 262492, 286399, '20130723');

INSERT INTO SUB_MON_STAT(USE_MON, LINE_NUM, SUB_STA_ID, SUB_STA_NM, RIDE_PASGR_NUM, ALIGHT_PASGR_NUM, WORK_DT)
VALUES('201301', '2호선', '0210', '뚝섬', 402631, 420166, '20130723');

...

So far, we have looked at test environment configuration and configuration scripts for Oracle Character Set conversion. Next, we'll look at different ways to transform the Character Set in this test environment. We will look into the method recommended by Oracle and the method that requires a separate implementation.

Leave a Reply

Your email address will not be published. Required fields are marked *

en_USEnglish