Word Extraction Tool (2): Configure the Word Extraction Tool Execution Environment
The word extraction tool is written in Python, so before running it you need to set up an execution environment: install Python and the required packages. This article walks through that setup.
This is a continuation of the previous article.
Word Extraction Tool (1): Overview of Word Extraction Tool
2. Configuration of the word extraction tool execution environment
2.1. Environment configuration overview
2.1.1. Recommendations
Installing Miniconda rather than Anaconda is recommended. Anaconda installs a large number of packages into the default environment, which makes the installation large, whereas Miniconda starts out small and lightweight.
If Miniconda is not installed, using virtualenv is recommended. Installing packages in a separate environment, isolated from the base environment, avoids problems such as package version conflicts.
If you do not expect such problems, or if the word extraction tool is the only thing you will use, the default environment is fine. This article explains how to use Miniconda on 64-bit Windows 10.
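For readers who have neither Miniconda nor virtualenv, the same kind of isolation can be sketched with the standard library's venv module, used here in place of virtualenv. This is only an illustration: the function name create_env and the folder name mentioned afterwards are assumptions made for this example, not part of the word extraction tool.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create an isolated Python environment in env_dir and return its path.

    create_env is a hypothetical helper name used only in this sketch.
    """
    env_path = Path(env_dir)
    # with_pip=True bootstraps pip so packages can be installed afterwards
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling create_env("wordextr_env") creates a folder of that (hypothetical) name in the current directory. On Windows the environment is then activated with wordextr_env\Scripts\activate.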
2.1.2. Morpheme analyzer selection: Mecab
Mecab was chosen because it ran fastest among the openly available natural-language morpheme analyzers and was the best fit for word extraction. To use a morpheme analyzer other than Mecab, rewrite the get_word_list() function.
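The idea of swapping the analyzer can be sketched as follows. The real signature of get_word_list() is described with the source code in a later article, so the version below is a hypothetical stand-in that takes the analyzer as a parameter. It also assumes that nouns are what should be extracted, and that eunjeon exposes a Mecab class with a nouns() method in the KoNLPy style.

```python
from typing import Callable, List


def load_mecab_nouns() -> Callable[[str], List[str]]:
    """Return a noun extractor backed by eunjeon's Mecab.

    The import is deferred so the rest of this sketch runs without eunjeon.
    """
    from eunjeon import Mecab  # assumes the eunjeon package is installed
    tagger = Mecab()
    return tagger.nouns


def get_word_list(text: str, extract: Callable[[str], List[str]]) -> List[str]:
    """Hypothetical stand-in for the tool's get_word_list().

    All analysis is delegated to `extract`, so another analyzer can be
    plugged in without touching the rest of the code.
    """
    words = extract(text)
    # Strip whitespace and drop empty results
    return [w.strip() for w in words if w and w.strip()]


if __name__ == "__main__":
    # str.split is a trivial stand-in analyzer; with eunjeon installed,
    # load_mecab_nouns() could be passed instead.
    print(get_word_list("word extraction tool", str.split))
```

With this shape, switching analyzers means passing a different callable rather than editing the function body.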
2.1.3. Overall order of environment configuration
- Install Miniconda
- Creating and activating a virtual environment
- Install Python in virtual environment
- Install the required packages in the virtual environment (or in the base environment if no virtual environment is used)
2.2. Install Miniconda
Select and download the installer for the Python version you want from https://conda.io/en/latest/miniconda.html#windows-installers. The word extraction tool was developed on Python 3.8 and also works well on 3.9. Here we download and install the 3.9 version.
Execute the downloaded file (Miniconda3-py39_4.10.3-Windows-x86_64.exe) to proceed with the installation. Click the Next button a few times to complete the installation.
Subsequent tasks are executed from the Miniconda Prompt. You can run it from the following path.
Start Menu > Anaconda3 (64bit) > Anaconda Prompt (miniconda3)
2.3. Creating and activating a virtual environment
When you run Miniconda Prompt for the first time, the base environment (base) is activated. (see image above)
Create a separate virtual environment for the word extraction tool.
(base) C:\Users\ymlee>conda create -n wordextr
Activate the new virtual environment with the following command. If the environment name (wordextr) appears at the start of the prompt after the command runs, the environment was activated successfully.
(base) C:\Users\ymlee>conda activate wordextr
(wordextr) C:\Users\ymlee>
2.4. Install Python in virtual environment
Run the following command.
(wordextr) C:\Users\ymlee>conda install python
Something like the following is output:
(wordextr) C:\Users\ymlee>conda install python
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: C:\Users\ymlee\miniconda3\envs\wordextr

  added / updated specs:
    - python

The following NEW packages will be INSTALLED:

  ca-certificates    pkgs/main/win-64::ca-certificates-2021.7.5-haa95532_1
  certifi            pkgs/main/win-64::certifi-2021.5.30-py39haa95532_0
  openssl            pkgs/main/win-64::openssl-1.1.1l-h2bbff1b_0
  pip                pkgs/main/win-64::pip-21.2.4-py38haa95532_0
  python             pkgs/main/win-64::python-3.9.7-h6244533_1
  setuptools         pkgs/main/win-64::setuptools-58.0.4-py39haa95532_0
  sqlite             pkgs/main/win-64::sqlite-3.36.0-h2bbff1b_0
  tzdata             pkgs/main/noarch::tzdata-2021a-h5d7bf9c_0
  vc                 pkgs/main/win-64::vc-14.2-h21ff451_1
  vs2015_runtime     pkgs/main/win-64::vs2015_runtime-14.27.29016-h5e58377_2
  wheel              pkgs/main/noarch::wheel-0.37.0-pyhd3eb1b0_1
  wincertstore       pkgs/main/win-64::wincertstore-0.2-py39h2bbff1b_0

Proceed ([y]/n)?
Press Enter, or type y and press Enter, to start the installation. To cancel instead, type n and press Enter.
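Once Python is installed, a short script can confirm which environment and interpreter are actually in use. The sketch below relies on the CONDA_DEFAULT_ENV environment variable that conda sets on activation. The function name describe_interpreter and the TESTED_VERSIONS set are assumptions made for this example, with the versions taken from the ones this article reports as working.

```python
import os
import sys

# Versions this article reports: developed on 3.8, also works on 3.9
TESTED_VERSIONS = {(3, 8), (3, 9)}


def describe_interpreter(version_info=None, environ=None):
    """Return (conda env name, 'major.minor', whether that version was tested).

    describe_interpreter is a hypothetical helper; both arguments default to
    the values of the running interpreter.
    """
    vi = sys.version_info if version_info is None else version_info
    env = os.environ if environ is None else environ
    return (
        env.get("CONDA_DEFAULT_ENV"),  # set by `conda activate`, else None
        f"{vi[0]}.{vi[1]}",
        (vi[0], vi[1]) in TESTED_VERSIONS,
    )


if __name__ == "__main__":
    env_name, version, tested = describe_interpreter()
    print(f"conda environment: {env_name}")
    print(f"Python version:    {version}")
    if not tested:
        print("Note: the article only reports Python 3.8 and 3.9 as working.")
```

Because conda install python without a version installs whatever release is current, the reported version may be newer than the ones the tool was tested with.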
2.5. Install required packages
Install the required packages with the following commands. wordcloud and eunjeon are not provided by conda, so they must be installed with pip.
conda install pywin32
conda install pandas
conda install Jinja2
conda install xlsxwriter
pip install wordcloud
pip install eunjeon
The purpose of each package is as follows.
- pywin32: used to open and read MS Word, PowerPoint, and Excel files through OLE automation
- pandas: used to hold the word extraction results in memory and save them to an Excel file at the end
- Jinja2, xlsxwriter: used by ExcelWriter in pandas
- wordcloud: used to visualize the word extraction results
- eunjeon: provides the Korean morpheme analyzer Mecab
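After installation, the following sketch checks whether each package can be found by the interpreter without actually importing it. The mapping from package name to import name (for example pywin32 to win32com) is an assumption of this example, and find_missing is a hypothetical helper name.

```python
import importlib.util

# Package name as installed -> top-level module name used when importing.
# The module names are assumptions made for this sketch.
REQUIRED_MODULES = {
    "pywin32": "win32com",
    "pandas": "pandas",
    "Jinja2": "jinja2",
    "xlsxwriter": "xlsxwriter",
    "wordcloud": "wordcloud",
    "eunjeon": "eunjeon",
}


def find_missing(modules=None):
    """Return the package names whose import module cannot be located."""
    modules = REQUIRED_MODULES if modules is None else modules
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it inside the activated wordextr environment so that the check applies to the interpreter the tool will use.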
If the error “Microsoft Visual C++ 14.0 or greater is required.” occurs while installing eunjeon, download and install 'Microsoft Build Tools 2015 Update 3' from the 'Redistributable Packages and Build Tools' section at the URL below, then try again.
https://visualstudio.microsoft.com/ko/vs/older-downloads/#microsoft-build-tools-2015-update-3
During installation, select “Desktop development using C++”. (The screenshot below was captured after installation, so it differs slightly from what you see while installing.)
After installing “Microsoft Build Tools 2015 Update 3”, install eunjeon with the following command.
pip install eunjeon
Once eunjeon is installed, you can remove “Microsoft Build Tools 2015 Update 3”.
To remove it, run 'Visual Studio Installer' from the Start menu, deselect “Desktop development using C++”, and click the “Modify” button at the bottom right.
At this point, the configuration of the environment is complete. Next, we will look at how to run the word extraction tool and check the results.
<< List of related articles >>
- Word Extraction Tool (1): Overview of Word Extraction Tool
- Word Extraction Tool (2): Configure the Word Extraction Tool Execution Environment
- Word Extraction Tool (3): How to Run the Word Extraction Tool and Check the Results
- Word Extraction Tool (4): Word Extraction Tool Source Code Description (1)
- Word Extraction Tool (5): Word Extraction Tool Source Code Description (2)
- Word Extraction Tool (6): Additional Description of Word Extraction Tool
- Full Contents of Word Extraction Tool Description, Download
(wordextr) E:\WordExtractor>python word_extractor.py --in_path .\in --out_path .\out
I am a beginner using Python for the first time. I ran the command above and got the following result. Something seems to be wrong with the path specification, but as a novice I cannot work out how to fix it. I would appreciate your help (the in and out folders have been created correctly).
E:\WordExtractor\word_extractor.py:382: SyntaxWarning: invalid escape sequence '\o'
  usage_description = """— Description —
E:\WordExtractor\word_extractor.py:406: SyntaxWarning: invalid escape sequence '\i'
  parser.add_argument('--in_path', required=False, help='Input file (ppt, doc, txt) path name (e.g. .\in) ')
E:\WordExtractor\word_extractor.py:407: SyntaxWarning: invalid escape sequence '\o'
  parser.add_argument('--out_path', required=True, help='Output file (xlsx, png) path name (e.g. .\out)')
————————————————————
Word Extractor v0.41 start — 2023-11-20 03:13:07.584787
##### arguments #####
multi_process_count: 32
db_comment_file: None
in_path: .\in
out_path: .\out
————————————————————
[2023-11-20 03:13:07.586789] Start Get File List…
[2023-11-20 03:13:07.586789] Finish Get File List.
— File List —
E:\WordExtractor\in\test.txt
[2023-11-20 03:13:07.588790] Start Get File Text…
E:\WordExtractor\word_extractor.py:382: SyntaxWarning: invalid escape sequence '\o'
  usage_description = """— Description —
E:\WordExtractor\word_extractor.py:406: SyntaxWarning: invalid escape sequence '\i'
  parser.add_argument('--in_path', required=False, help='Input file (ppt, doc, txt) path name (e.g. .\in) ')
E:\WordExtractor\word_extractor.py:407: SyntaxWarning: invalid escape sequence '\o'
  parser.add_argument('--out_path', required=True, help='Output file (xlsx, png) path name (e.g. .\out)')
[The same three SyntaxWarning messages repeat 31 more times.]
get_txt_text: E:\WordExtractor\in\test.txt
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\ProgramData\miniconda3\envs\wordextr\Lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\envs\wordextr\Lib\multiprocessing\pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "E:\WordExtractor\word_extractor.py", line 367, in get_file_text
    df_text = get_txt_text(file_name)
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\WordExtractor\word_extractor.py", line 238, in get_txt_text
    df_text = df_text.append(sr_text, ignore_index=True)
              ^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\envs\wordextr\Lib\site-packages\pandas\core\generic.py", line 6204, in __getattr__
    return object.__getattribute__(self, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "E:\WordExtractor\word_extractor.py", line 559, in <module>
    main()
  File "E:\WordExtractor\word_extractor.py", line 460, in main
    mp_text_result = pool.map(get_file_text, file_list)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\envs\wordextr\Lib\multiprocessing\pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\envs\wordextr\Lib\multiprocessing\pool.py", line 774, in get
    raise self._value
AttributeError: 'DataFrame' object has no attribute 'append'
(wordextr) E:\WordExtractor>
Hello, nice to meet you.
I have not run into this error myself, so it is hard to say right away how to fix it.
Could you check your Python, numpy, and pandas versions and let me know?
I suspect the versions differ from mine, so that is worth checking.
For reference, the version of the environment I implemented and tested is as follows.
- Python: 3.9.6 (how to check: python --version)
- numpy: 1.20.3 (how to check: pip list, which also shows the pandas version below)
- pandas: 1.3.1
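The same version information can also be gathered from inside Python. The sketch below uses the standard library's importlib.metadata, and the helper names installed_version and major_version are assumptions made for this example. It also flags pandas 2.0 or later, because DataFrame.append, the call that fails in the traceback above, was removed in pandas 2.0.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version):
    """Extract the leading integer from a version string such as '1.3.1'."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    print(f"Python: {sys.version.split()[0]}")
    for name in ("numpy", "pandas"):
        print(f"{name}: {installed_version(name)}")
    pandas_version = installed_version("pandas")
    if pandas_version and major_version(pandas_version) >= 2:
        # DataFrame.append was removed in pandas 2.0
        print("pandas 2.x detected: DataFrame.append is not available.")
```

If the script reports pandas 2.x, matching the versions listed above is one way to get the tool running unchanged.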
I had the same error too. After matching the package versions you shared, it ran successfully.
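The SyntaxWarning lines in the log above are a separate matter from the AttributeError. From Python 3.12 onward, a backslash followed by a character that does not form a recognised escape (such as \i or \o) in an ordinary string literal is reported as a SyntaxWarning. One way to avoid it is a raw string, sketched below with the two arguments quoted in the log. The function name build_parser and the description text are assumptions of this example, not the tool's actual code.

```python
import argparse


def build_parser():
    """Build a parser whose help texts contain Windows-style paths."""
    parser = argparse.ArgumentParser(description="Word extraction (illustrative)")
    # The r prefix makes these raw strings, so \i and \o are kept as a
    # literal backslash plus letter and no invalid-escape warning is raised.
    parser.add_argument(
        "--in_path", required=False,
        help=r"Input file (ppt, doc, txt) path name (e.g. .\in)",
    )
    parser.add_argument(
        "--out_path", required=True,
        help=r"Output file (xlsx, png) path name (e.g. .\out)",
    )
    return parser


if __name__ == "__main__":
    # Parse a fixed argument list so the example runs without a command line
    args = build_parser().parse_args(["--in_path", r".\in", "--out_path", r".\out"])
    print(args.in_path, args.out_path)
```

Doubling the backslash ('.\\in') has the same effect. Either way the help text shown to the user is unchanged.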
Hello. I have a question about the Anaconda installation. I would like to use the word extraction tool at my company, but because Anaconda is paid, the company recommends miniforge. Will the tool behave any differently if I install miniforge instead?
I haven't used miniforge, so I don't know if there will be a functional difference.
The purpose of installing miniconda was to easily create and manage a virtual environment rather than to facilitate package installation.
Try this:
- Use venv or virtualenv instead of miniconda (see: https://richwind.co.kr/193)
- In “2.5. Install required packages”, change “conda install” to “pip install”.
I hope it goes well.
First, I installed miniforge and went through the steps above at the Miniforge Prompt without any problems.
The 'Microsoft Build Tools 2015 Update 3' you mentioned did not install properly, so I installed Microsoft Build Tools 2022 instead and was then able to install eunjeon.
Now I will try the extraction tool and give you feedback 🙂
I hope it runs well ^^