Data standard inspection tool_2.1. Screen composition, 2.2. Standard inspection function
Examine the standard data inspection tool screen composition, standard inspection function, and option input/output items. You can check the overall function of the data standard check tool.
2. Standard inspection function and standard dictionary configuration
2.1. Composition of standard inspection tool screen
The data standard checking tool is Excel VBA (Note: Excel VBA Course(1): Overview of Excel VBA), and has the following screen composition.
Here is a brief description of each area.
- Features & Options: Set and run standard inspection options
- List of attributes subject to inspection: name, data type, length (input) of attributes subject to standard inspection
- Check result: standard logical name, physical name, data type, length, stop word, domain, and additional candidate words for each of the properties to be checked (output)
- Standard Dictionary: Reference data for standard inspection (input)
The following sheet is for reference only. (Please read the terms of the license carefully.)
- Config: Connection and SQL settings to update standard dictionary from database
- # License, History#: Author of this tool, e-mail, blog, contents of license notice, development history of tool
※ License
division | Contents |
writer | Lee Young Min |
leeyoungmin@gmail.com | |
blog | https://prodskill.com/ |
Github | https://github.com/DAToolset/ToolsForDataStandard |
license content | By default, it follows the BSD-2 license as specified in the github repository. If not specified, follow the instructions below. —————————————————————————– 1. Representation of the copyright holder: The name of the copyright holder, email address, blog address, and github repository address must be used without change. In case of changing the source code and using it, you can add content related to the secondary author. 2. Warranty Disclaimer: The copyright holder cannot be held liable for any problems caused by using this source code and tools. —————————————————————————– If the above conditions are satisfied, it is permitted for all purposes, such as personal use such as learning/utilization, use of this tool in business at work, or use of the tool to be included in a commercial product by changing the source code. This license content cannot be removed. |
tell me | You do not have to disclose the source code of the derivative work. However, if it is not a problem, I would like you to let me know which secondary work you used it for. You can send it by email or leave it as a blog comment. Please also notify us of errors or improvement requests in the tool's functionality, errors in the guide documentation, or requests for improvement by e-mail or blog comments. |
2.2. Standard inspection functions and options, input/output items
Let's take a closer look at the functions, options, and input/output items.
2.2.1. standard check function
It provides the following check functions.
- Standard inspection: Based on the standard word dictionary, standard term dictionary, and standard domain dictionary, standard inspection of property names and data types/lengths subject to standard inspection of the list is performed
- Optional attribute Only
- Option to select the standard inspection target (if checked: standard inspection of selected properties only, unchecked: all properties standard inspection)
- Even if it is not checked, if you click the “Standard Check” button while holding down the Shift key, it operates the same as when it is checked.
- Clear: Clear standard check results
- Standard dictionary refresh
- Download and refresh word/term/domain dictionary from standard dictionary repository
- You can select the “Refresh standard dictionary” checkbox when running standard check, but it is used when only standard dictionary is refreshed without running standard check.
- When refreshing the standard dictionary, the previous dictionary is backed up as “_Bak” sheet, and the standard date and time of the new standard dictionary and the standard date and time of the backup dictionary are displayed in the upper right corner.
- Config: Move to “Config” sheet for setting SQL to obtain connection string for accessing standard dictionary repository and list of standard words/terms/domains
- About... : Display tool description and license contents
- Add to Dictionary: Add additional candidate words to the dictionary list.
2.2.2. Standard inspection option
You can control the function with the following options.
- Word Combination Direction: Set the direction to find the word composition of the attribute name (Default setting: Left->Right, Right->Left are all checked)
- left -> right
- Search the word structure of the attribute name from left to right
- right -> left
- Find the word structure of the attribute name from right to left
- Since the attribute classifier (used for domain designation) is found first, it is more accurate
- Note
- As for the combination direction, it is recommended to select both by default, and if only one is selected, “right->left” is recommended.
- There are many cases that can be combined in one direction, but the longest matching word is selected to form the logical/physical name combination result.
- When standard check is performed by selecting both “Left->Right” and “Right->Left”, word combinations may be different, and the result in this case is that the background color of the cell is automatically set to orange (conditional formatting used) and can be identified
- left -> right
- Find Standard Dictionary: Set which standard dictionary to combine words with (Default Setting: Word & Term)
- Words & Terms: First look for standard terms matching standard terms in the standard term dictionary, use the corresponding standard terms, if not, combine them with standard words
- Word: Combined with standard words only (even if there is a standard term, it is combined with only words)
- Terminology: Finds only matching standard terms in the standard glossary
- Standard Dictionary Refresh: Set whether to renew standard words/standard terms/standard domains from the standard repository during standard inspection
- Duplicate logical names of standard words (disambiguation) allowed
- Select to allow duplicates in logical names of standard words, i.e. homophones, and deselect if not to allow homophones.
- When deselected and executed, if there is a homonym, the corresponding duplicate history is shown and the execution is stopped
- Duplicate standard word physical name (allophone synonyms) allowed
- Select if duplicate synonyms, that is, physical names of standard words, are allowed, and deselect if allosynonyms are not allowed.
- When deselected and executed, if there is an allophone synonym, the corresponding duplicate history is shown and the execution is stopped.
- It is good in terms of standard quality not to register allophone synonyms in the standard word dictionary, but limited use only when unavoidable
- physical name maximum length
- If the set physical name exceeds the maximum length, the color is changed with conditional formatting (red background)
- Display column: physical name Length (column M)
2.2.3. standard check entry
Next, let's look at input items and output (check result) items.
- attribute name
- Property name subject to standard inspection
- By default, the property name does not use spaces, but if spaces are used, it is recognized as a single word in units of spaces and checked.
- Example: country code
- Data Type/Len
- Type/Size of the data to be assigned to the property
- If not entered (value is empty), do not check
- Example: VARCHAR2(20)
2.2.4. Standard inspection output (check result) items
The output (check result) items are as follows.
- standard word logical name combination
- The result of combining the logical names of standard words constituting the property name with the separator '_'
- If different results are obtained depending on the direction of word combination, left->right, right->left, both are displayed.
- Example: country_code
- standard word physical name combination
- The result of combining the physical names of the standard words constituting the property name with the separator '_'
- If different results are obtained depending on the direction of word combination, left->right, right->left, both are displayed.
- Example: CNTRY_CD
- Attribute Name Check Result
- Word Criteria: standard word combination, standard word combination (user defined), no standard word, no standard word (user defined)
(Here, 'custom' means that the word combination is intentionally designated by the user as there is a space in the property name.) - Terminology standard: Standard term matching, no standard term
- Example: standard term matching
- Word Criteria: standard word combination, standard word combination (user defined), no standard word, no standard word (user defined)
- Standard terms Type/Size
- Data Type/Size of the standard term if the property name check result is 'consistent with standard term'
- Example: VARCHAR2(20)
- attribute name terminator
- Last word in property name (property classifier used in most domains)
- Example: code
- Property name terminator check result
- Classification word: When the ending word is a classification word
- Basic word: If the ending word is the basic word –> It is necessary to change the ending word to another classification word or change the ending word to a classification word
- No word: If the ending word is not in the standard word dictionary –> The ending word needs to be registered in the standard dictionary
- Example: classification word
- Domain, Data Type Check Result
- Result of checking whether the data Type/Size of the input property name matches the standard term, or if the word terminator is an attribute classifier, whether it exists in the list of domains connected to the domain classification, and whether the type and size match.
- Example: Domain Type/Size Match
- additional candidate words
- Among the words that make up the property name, words that are not in the standard word dictionary
- In some cases, it is not necessary as a standard word, so check it and register it as a standard word.
2.2.5. Standard dictionary Repository setting
Settings related to the standard dictionary Repository are as follows.
- Connection String
- String for standard dictionary repository connection
- Create by clicking “Make Connection String” button
- If an error occurs when clicking the button, delete the contents of the connection string and click again to execute.
- Standard Word Dictionary Query: SQL to get standard word dictionary list
- Standard Glossary Query: SQL to get standard glossary list
- Standard domain dictionary Query: SQL to get a list of standard domain dictionaries
These are notes related to Config.
- Regardless of the type of standard management system (metadata management system), it can be used if the list can be obtained through DB connection and SQL.
- In the case of sharing an Excel file and using it as a standard dictionary, you can use it by selecting “Microsoft.Jet.OLEDB.4.0” as the Provider. (Not recommended) However, “Microsoft.Jet.OLEDB.4.0” does not provide a 64-bit provider, so it can only be used with 32-bit Excel.
- If an error (see below) occurs when clicking the “Make Connection String” button, simply delete the Connection String value (cell B3 in the image above) and try again. (Error message: Could not retrieve initialization string from IDataInitialize. Class not registered.)
Up to this point, we have looked at the screen composition of the data standard inspection tool and the standard inspection function.
<< List of related articles >>
- Data standard inspection tool_1.Overview
- Data standard inspection tool_2.1. Screen composition, 2.2. Standard inspection function
- Data standard inspection tool_2.3. Data standard dictionary composition
- Data Standard Check Tool_3. Standard Check Case Results
- Data standard check tool_4.Attachment
- How to Fix Data Standard Check Tool_v1.33 Error “Not enough memory”
- Data Standard Check Tool Description Contents , Download
Based on SQL Server, VARCHAR(MAX) data type/length is used.
If the TypeSize of the standard domain dictionary is VARCHAR(MAX), a debug error occurs when the property standard check is performed.
Guessing the reason, it seems to be a problem caused by treating the length as an Integer variable in VB.
Your guess is correct.
A patch is needed so that the MAX of VARCHAR(MAX) can be compared without converting it to an integer type.
Thank you for reporting.
I'll patch it up when I have time and post it on my blog.
Uploaded a version that patched an error when using VARCHAR(MAX).
https://prodskill.com/ko/data-standard-checker-v1-34/
Please try it out and let me know if there are any other errors.