Word Extraction Tool

Word Extraction Tool v0.42 Released: Bug Fix

Published February 24, 2023 · Updated February 24, 2023

Time for change
Image source: https://pixabay.com/images/id-3842467/

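The previously released Word Extraction Tool v0.41 had a bug. I am releasing Word Extraction Tool v0.42, which fixes the bug that caused the error KeyError: "Column(s) ['DBSchema'] do not exist".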

Kim Ki-young reported the bug in the following comment.

Word Extraction Tool v0.41 bug report: KeyError: "Column(s) ['DBSchema'] do not exist"

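Hello!

Of the three ways to run the tool, when I use the one that extracts words from files without a DB comment file
(python word_extractor.py --in_path .\in --out_path .\out)

for txt, Word, and PPT files alike, the tool exits with this error:

miniconda3\envs\wordextr\lib\site-packages\pandas\core\apply.py", line 601, in normalize_dictlike_arg raise KeyError(f"Column(s) {cols_sorted} do not exist")

KeyError: "Column(s) ['DBSchema'] do not exist"

Run methods 2 and 3, which take a DB comment file as input, work without errors.

I tried adding 'DBSchema': [db_schema] at line 97, but then a different error appeared:

in get_grouper raise KeyError(gpr) KeyError: 'Word'

Thank you.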
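The first error in the report can be reproduced outside the tool. The following is a minimal sketch, not code from word_extractor.py: the sample rows and the function name `aggregate_always_with_dbschema` are made up for illustration, and it assumes the failing code always asked pandas to aggregate a 'DBSchema' column, even when no DB comment file was supplied.

```python
import pandas as pd

# Hypothetical rows standing in for df_result when no DB comment file is given,
# so the frame has no 'DBSchema' column.
df_result = pd.DataFrame({
    "Word": ["data", "data", "model"],
    "Source": ["a.txt", "b.pptx", "a.txt"],
})


def aggregate_always_with_dbschema(df):
    """Aggregate per word and always request 'DBSchema' (assumed pre-fix behavior)."""
    return df.groupby("Word").agg({
        "Word": "count",
        "Source": lambda x: "\n".join(list(x)[:10]),
        "DBSchema": "nunique",  # this column does not exist in the sample frame
    })


try:
    aggregate_always_with_dbschema(df_result)
    error_message = None
except KeyError as exc:
    # pandas rejects aggregation keys that are not columns of the frame.
    error_message = str(exc)

print(error_message)
```

The message names the missing column, which matches the report: the aggregation dictionary refers to a column that only exists when DB comment data has been loaded.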

The modified code is as follows.

    # DB columns are present: also count distinct DBSchema values per word.
    if 'DB' in df_result.columns:
        df_group = df_result.groupby('Word').agg({
            'Word': 'count',
            'Source': lambda x: '\n'.join(list(x)[:10]),
            'DBSchema': 'nunique'
        }).rename(columns={
            'Word': 'Freq',
            'Source': 'Source',
            'DBSchema': 'DBSchema_Freq'
        })
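    else:
        # No DB columns: add them as empty strings.
        df_result['DB'] = ''
        df_result['Schema'] = ''
        df_result['Table'] = ''
        df_result['Column'] = ''
        df_result['DBSchema'] = ''

        # Aggregate without 'DBSchema', since there is nothing to count.
        df_group = df_result.groupby('Word').agg({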
            'Word': 'count',
            'Source': lambda x: '\n'.join(list(x)[:10])
        }).rename(columns={
            'Word': 'Freq',
            'Source': 'Source'
        })

The aggregation is now split into two cases, depending on whether 'DB' is present in the column list.
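The same branching can also be written by building the aggregation dictionary conditionally. This is only a sketch of an alternative, not the code shipped in v0.42: the function name `summarize_words` and the sample frames are hypothetical, it tests for 'DBSchema' directly instead of 'DB', and it does not add the empty DB columns that the v0.42 code adds.

```python
import pandas as pd


def summarize_words(df_result: pd.DataFrame) -> pd.DataFrame:
    """Group by word, adding the DBSchema count only when that column exists."""
    agg_spec = {
        "Word": "count",
        # Keep at most the first 10 sources per word, one per line.
        "Source": lambda x: "\n".join(list(x)[:10]),
    }
    rename_map = {"Word": "Freq"}

    if "DBSchema" in df_result.columns:
        agg_spec["DBSchema"] = "nunique"
        rename_map["DBSchema"] = "DBSchema_Freq"

    return df_result.groupby("Word").agg(agg_spec).rename(columns=rename_map)


# Hypothetical input without DB comment data.
no_db = pd.DataFrame({
    "Word": ["data", "data", "model"],
    "Source": ["a.txt", "b.pptx", "a.txt"],
})

# The same rows with a made-up DBSchema column.
with_db = no_db.assign(DBSchema=["db1.s1", "db2.s1", "db1.s1"])

summary_no_db = summarize_words(no_db)
summary_with_db = summarize_words(with_db)
print(summary_no_db)
print(summary_with_db)
```

Keeping a single `agg` call avoids duplicating the 'Word' and 'Source' entries, at the cost of being less explicit than the two separate branches in the released code.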

The complete source code for Word Extraction Tool v0.42 is available at the following URL.

https://github.com/DAToolset/ToolsForDataStandard/blob/main/WordExtractor/word_extractor.py

Tags: Python, word extraction, word extractor

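KSM says:

July 4, 2025 at 3:20 PM

As of the installation date, 2025.07.05, I confirmed that word extraction works with the following versions:
– Anaconda3-2025.06-0-Windows-x86_64
– Microsoft Build Tools 2022 installed beforehand
– Python: 3.9.6
– numpy: 1.20.3 -> 1.23 (version upgrade required)
– pandas: 1.3.1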

- Zerom says:
  
  July 10, 2025 at 8:00 PM
  
  I'm glad it works.
  Thank you for leaving a comment.
  
