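Naver Korean/English Dictionary Search Tool: How It Works and Source Code
This post explains how the Naver Korean/English dictionary search tool works and walks through its source code.
It is a continuation of the previous post.
1. How the Naver Korean/English dictionary search tool works, and cautions
The user sends a search term to the Naver service through a web browser, and the Naver server responds with the result of processing the request.
(For more on how the web works, see the articles in the Google search results below.)
https://www.google.co.kr/search?q=web+action+method
Let's take a closer look at the search request and response of the Naver dictionary service.
1.1. Naver dictionary search request and response
There are several ways to inspect what a web browser exchanges with a server; here we use Fiddler Web Debugger.
Below is what Fiddler shows for the request and response when searching for the word "가입" (to join, to sign up) in the Naver Korean dictionary.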
- URL, Content-Type: the following can be checked.
  - Protocol: HTTPS
  - Host: ko.dict.naver.com
  - URL: /api3/koko/search?query=%EA%B0%80%EC%9E%85&m=pc&hid=162470754628591300
    - Here, "%EA%B0%80%EC%9E%85" is the URL-encoded form of "가입".
- Request headers
  - You can check User-Agent, Cookie, and so on.
- Response
  - Content-Type: application/json;charset=UTF-8
    - The response body is in JSON format and the character set is UTF-8.
  - Content-Length: 50814
    - The response body is 50,814 bytes, about 50 KB.
  - Body: {"searchResultMap":{"searchResultListMap":{"WORD":{"query":"가입",...
    - It is a JSON string; viewed in Fiddler's "JSON" tab, it shows a nested, hierarchical structure.
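The pieces of the captured request can be checked outside Fiddler as well. The following is a minimal Python sketch (Python is used here only for illustration; the tool itself is written in VBA) that splits the URL listed above into its parts and confirms that the query parameter is the UTF-8 percent-encoding of "가입". The full `https://` URL is assembled from the protocol, host, and path shown in the list.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Request URL as captured in Fiddler (protocol + host + path + query string).
url = ("https://ko.dict.naver.com/api3/koko/search"
       "?query=%EA%B0%80%EC%9E%85&m=pc&hid=162470754628591300")

parts = urlsplit(url)
params = parse_qs(parts.query)  # percent-decoding uses UTF-8 by default

print(parts.scheme)        # https
print(parts.netloc)        # ko.dict.naver.com
print(parts.path)          # /api3/koko/search
print(params["query"][0])  # 가입

# Encoding the word again reproduces the string seen in the request.
encoded = quote("가입")
print(encoded)             # %EA%B0%80%EC%9E%85
```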
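1.2. Change in the response format (HTML -> JSON)
This tool does not use the Naver Open API; it uses plain web requests and responses.
I am not sure of the exact date, but the response format changed around December 2018. It used to be HTML; around that time I happened to inspect the traffic with Fiddler and found that the response had become JSON.
The first version of the tool was written when responses were in HTML. It extracted the needed items from the HTML, but every time Naver changed the HTML structure the tool broke, and the source code had to be changed to match the new structure. Since the response format changed to JSON, the tool has kept working without any source changes.
1.3. Cautions for use
I could not confirm whether Naver has officially announced that it provides dictionary search results in JSON. Documentation of the JSON structure does not appear to have been published either.
(If you know of any announcement or documentation, please let me know in the comments.)
So be aware that the tool may suddenly stop working one day.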
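Because the JSON layout is undocumented, code that reads it benefits from failing softly when a key disappears. The sketch below, in Python for illustration, shows one way to do that. `dig` is a hypothetical helper (it is not part of the tool), and the `changed` dictionary is an invented example of a layout change, not something Naver has announced.

```python
def dig(data, *keys, default=None):
    """Follow keys/indexes into nested dicts and lists; return default if any step is missing."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Path to the result items as observed in the current responses.
ITEMS_PATH = ("searchResultMap", "searchResultListMap", "WORD", "items")

expected = {"searchResultMap": {"searchResultListMap": {"WORD": {"items": [{"rank": "1"}]}}}}
changed = {"result": {"words": []}}  # hypothetical future layout

print(dig(expected, *ITEMS_PATH, default=[]))  # [{'rank': '1'}]
print(dig(changed, *ITEMS_PATH, default=[]))   # []
```

With a helper like this, a layout change produces an empty result that can be reported to the user, rather than a runtime error in the middle of a batch search.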
2. Implementation
2.1. Overall flow
URL-encode the word to search for, call the GetDataFromURL function, parse the returned JSON search result, and extract the required items.
Dim aWord As String, sBaseURL As String, sWord As String
aWord = "가입"
sBaseURL = "https://ko.dict.naver.com/api3/koko/search?query=%s" ' base URL
sWord = URLEncodeUTF8(aWord)                                     ' URL-encode the search word

Dim sURL As String, sURLData As String, oParsedDic As Dictionary
sURL = Replace(sBaseURL, "%s", sWord)                ' substitute the search word into the base URL
sURLData = GetDataFromURL(sURL, "GET", "", "utf-8")  ' fetch the result from the URL
Set oParsedDic = JsonConverter.ParseJson(sURLData)   ' convert the JSON result to a Dictionary

' Extract the items that hold the search results from the parsed Dictionary
' Starting path: oParsedDic("searchResultMap")("searchResultListMap")("WORD")("items")
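The same encode, fetch, parse, extract sequence can be sketched in Python for readers who want to try the flow without Excel. This is an illustration under assumptions, not the tool's code: `search_items` and `fake_fetch` are hypothetical names, and the network call is replaced by a canned response so the example runs offline.

```python
import json
from urllib.parse import quote

BASE_URL = "https://ko.dict.naver.com/api3/koko/search?query=%s"


def search_items(word, fetch):
    """Encode the word, fetch the JSON text, parse it, and return the result items."""
    url = BASE_URL.replace("%s", quote(word, safe=""))
    parsed = json.loads(fetch(url))
    return parsed["searchResultMap"]["searchResultListMap"]["WORD"]["items"]


def fake_fetch(url):
    """Stand-in for the network call; returns a tiny canned response."""
    assert url.endswith("query=%EA%B0%80%EC%9E%85")
    return json.dumps({"searchResultMap": {"searchResultListMap": {
        "WORD": {"query": "가입",
                 "items": [{"rank": "1", "matchType": "exact:entry"}]}}}})


items = search_items("가입", fake_fetch)
print(items[0]["matchType"])  # exact:entry
```

Passing the fetch function in as a parameter keeps the parsing logic testable without touching the live service.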
Let's look at the main functions.
2.2. URL encoding (URLEncodeUTF8 source code)
Returns the search word as a URL-encoded string. It uses the ADODB.Stream class.
Public Function URLEncodeUTF8( _
    StringVal As String, _
    Optional SpaceAsPlus As Boolean = False _
) As String
    ' i is a Long so that long inputs cannot overflow the loop counter
    Dim bytes() As Byte, b As Byte, i As Long, space As String

    If SpaceAsPlus Then space = "+" Else space = "%20"

    If Len(StringVal) > 0 Then
        ' Convert the string to UTF-8 bytes through an ADODB.Stream
        With New ADODB.Stream
            .Mode = adModeReadWrite
            .Type = adTypeText
            .CharSet = "UTF-8"
            .Open
            .WriteText StringVal
            .Position = 0
            .Type = adTypeBinary
            .Position = 3 ' skip BOM
            bytes = .Read
        End With

        ReDim Result(UBound(bytes)) As String

        For i = UBound(bytes) To 0 Step -1
            b = bytes(i)
            Select Case b
                ' a-z, A-Z, 0-9, "-", ".", "_", "~" are left as they are
                Case 97 To 122, 65 To 90, 48 To 57, 45, 46, 95, 126
                    Result(i) = Chr(b)
                Case 32
                    Result(i) = space
                ' single hex digit: pad with a leading zero
                Case 0 To 15
                    Result(i) = "%0" & Hex(b)
                Case Else
                    Result(i) = "%" & Hex(b)
            End Select
        Next i

        URLEncodeUTF8 = Join(Result, "")
    End If
End Function
To use the ADODB library, you must add a reference to "Microsoft ActiveX Data Objects 6.1 Library". In Excel, press Alt + F11 to switch to the VBA editor, then add the reference under Tools > References.
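The rule URLEncodeUTF8 applies is easier to see without the stream handling: convert the text to UTF-8 bytes, keep the unreserved characters, and write every other byte as "%" plus two hex digits. The Python sketch below mirrors that rule for illustration. `url_encode_utf8` is a hypothetical name, and the comparison with `urllib.parse.quote` is only a sanity check on the sample inputs.

```python
from urllib.parse import quote

# Bytes that are passed through unchanged: A-Z, a-z, 0-9, "-", ".", "_", "~".
UNRESERVED = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")


def url_encode_utf8(text, space_as_plus=False):
    """Percent-encode every UTF-8 byte except the unreserved characters."""
    out = []
    for b in text.encode("utf-8"):
        if b in UNRESERVED:
            out.append(chr(b))
        elif b == 0x20:
            out.append("+" if space_as_plus else "%20")
        else:
            out.append("%{:02X}".format(b))  # always two hex digits
    return "".join(out)


print(url_encode_utf8("가입"))                               # %EA%B0%80%EC%9E%85
print(url_encode_utf8("a b", space_as_plus=True))            # a+b
print(url_encode_utf8("a/b?c") == quote("a/b?c", safe=""))   # True
```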
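2.3. Sending the request and getting the response (GetDataFromURL source code)
This function uses the "WinHttp.WinHttpRequest" class to set the request headers and options, then accesses the search URL and retrieves the result. Because the object is created with CreateObject (late binding), no library reference needs to be added.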
Function GetDataFromURL(strURL, strMethod, strPostData, Optional strCharSet = "UTF-8")
    Dim lngTimeout
    Dim strUserAgentString
    Dim intSslErrorIgnoreFlags
    Dim blnEnableRedirects
    Dim blnEnableHttpsToHttpRedirects
    Dim strHostOverride
    Dim strLogin
    Dim strPassword
    Dim strResponseText
    Dim objWinHttp

    lngTimeout = 59000 ' milliseconds
    strUserAgentString = "http_requester/0.1"
    intSslErrorIgnoreFlags = 13056 ' 13056: ignore all err, 0: accept no err
    blnEnableRedirects = True
    blnEnableHttpsToHttpRedirects = True
    strHostOverride = ""
    strLogin = ""
    strPassword = ""

    Set objWinHttp = CreateObject("WinHttp.WinHttpRequest.5.1")
    '--------------------------------------------------------------------
    'objWinHttp.SetProxy 2, "xxx.xxx.xxx.xxx:xxxx", ""   ' set this in environments that use a proxy
    '--------------------------------------------------------------------
    objWinHttp.SetTimeouts lngTimeout, lngTimeout, lngTimeout, lngTimeout
    objWinHttp.Open strMethod, strURL

    If strMethod = "POST" Then
        objWinHttp.SetRequestHeader "Content-type", "application/x-www-form-urlencoded; charset=UTF-8"
    Else
        objWinHttp.SetRequestHeader "Content-type", "text/html; charset=euc-kr"
    End If
    If strHostOverride <> "" Then
        objWinHttp.SetRequestHeader "Host", strHostOverride
    End If

    ' WinHttpRequest options: 0 = user agent, 4 = SSL error ignore flags,
    ' 6 = enable redirects, 12 = enable HTTPS-to-HTTP redirects
    objWinHttp.Option(0) = strUserAgentString
    objWinHttp.Option(4) = intSslErrorIgnoreFlags
    objWinHttp.Option(6) = blnEnableRedirects
    objWinHttp.Option(12) = blnEnableHttpsToHttpRedirects

    If (strLogin <> "") And (strPassword <> "") Then
        objWinHttp.SetCredentials strLogin, strPassword, 0
    End If

    On Error Resume Next
    objWinHttp.Send (strPostData)
    objWinHttp.WaitForResponse
    If Err.Number = 0 Then
        If objWinHttp.Status = "200" Then
            'GetDataFromURL = objWinHttp.ResponseText
            ' BinaryToText is a helper not listed in this post; it converts the raw response bytes to text using strCharSet
            GetDataFromURL = BinaryToText(objWinHttp.ResponseBody, strCharSet)
        Else
            GetDataFromURL = "HTTP " & objWinHttp.Status & " " & _
                objWinHttp.StatusText
        End If
    Else
        GetDataFromURL = "Error " & Err.Number & " " & Err.Source & " " & _
            Err.Description
    End If
    On Error GoTo 0

    Set objWinHttp = Nothing
End Function
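For comparison, the request step might look like the following in Python's standard library. This is a sketch under assumptions rather than a port: the function names are hypothetical, certificate verification is left on (unlike the 13056 ignore-all flag above), and `decode_body` only plays the role that BinaryToText appears to play in the VBA. The sample calls at the bottom do not touch the network.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent="http_requester/0.1"):
    """Create a GET request carrying the same User-Agent string as the VBA code."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent}, method="GET")


def decode_body(raw, charset="utf-8"):
    """Decode response bytes with an explicit charset."""
    return raw.decode(charset, errors="replace")


def get_data_from_url(url, timeout=59.0):
    """Return the decoded body on HTTP 200, otherwise a short error string."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            if resp.status == 200:
                return decode_body(resp.read())
            return "HTTP {} {}".format(resp.status, resp.reason)
    except urllib.error.HTTPError as err:  # non-2xx status codes
        return "HTTP {} {}".format(err.code, err.reason)
    except OSError as err:                 # DNS failures, timeouts, refused connections
        return "Error {}".format(err)


req = build_request("https://ko.dict.naver.com/api3/koko/search?query=%EA%B0%80%EC%9E%85")
print(req.get_method(), req.host)          # GET ko.dict.naver.com
print(decode_body("가입".encode("utf-8")))  # 가입
```

As in the VBA version, failures come back as strings starting with "HTTP" or "Error", so a caller should check for those before handing the text to a JSON parser.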
2.4. Response (search result) JSON string
The response (search result) JSON string contains quite a lot of information. As received it has no indentation or line breaks, so it is hard to read, but formatted it looks like this. (Excerpt only; apart from the search word 가입, Korean string values are shown in English translation.)
{
  "searchResultMap": {
    "searchResultListMap": {
      "WORD": {
        "query": "가입",
        "queryRevert": "",
        "items": [
          {
            "rank": "1",
            "gdid": "8800000f_4002c436c93d4bb38d3e58632fe00af0",
            "matchType": "exact:entry",
            "entryId": "4002c436c93d4bb38d3e58632fe00af0",
            "serviceCode": "1",
            "languageCode": "KOKO",
            "expDictTypeForm": "word",
            "dictTypeForm": "2",
            "sourceDictnameKO": "Standard Korean Language Dictionary",
            "sourceDictnameOri": "Standard Korean Language Dictionary",
            "sourceDictnameLink": "https://stdict.korean.go.kr/main/main.do",
            ...
            "expEntry": "<strong>가입</strong>",
            ...
            "destinationLink": "#/entry/koko/4002c436c93d4bb38d3e58632fe00af0",
            ...
            "meansCollector": [
              {
                "partOfSpeech": "noun",
                "partOfSpeech2": "noun",
                "means": [
                  {
                    "order": "1",
                    "value": "Joining an organization or group, or applying for a product that provides a service.",
                    ...
                    "exampleOri": "<strong>가입</strong> application form.",
                    ...
                  },
                  {
                    "order": "2",
                    "value": "Inserting something additional.",
                    ...
                    "exampleOri": "A <strong>가입</strong> of revised content was found in the middle of the manuscript.",
                    ...
                  },
                  {
                    "order": "3",
                    "value": "The act of concluding a treaty without going through the authentication of the treaty text. A state becomes a party simply by expressing its intent, legally...",
                    ...
                    "languageGroup": "Law",
                    ...
                    "exampleTrans": null,
                    ...
                  }
                ]
              }
            ],
            "similarWordList": [],
            "antonymWordList": [
              {
                "antonymWordName": "withdrawal",
                "antonymWordLink": "#/entry/koko/14e89175152b46569c2a2b6360e835ad"
              }
            ],
            "expAliasEntryAlwaysList": [],
            "expAliasGeneralAlwaysList": [
              { "originLanguageValue": "加入" }
            ],
            ...
          },
          {
            "rank": "2",
            "gdid": "881857e6_e12c4e3432cf458c929bd49c929fd80b",
            "matchType": "exact:entry",
            "entryId": "e12c4e3432cf458c929bd49c929fd80b",
            "serviceCode": "1",
            "languageCode": "KOKO",
            "expDictTypeForm": "word",
            "dictTypeForm": "2",
            "sourceDictnameKO": "Urimalsaem",
            "sourceDictnameOri": "Urimalsaem",
            "sourceDictnameLink": "https://opendict.korean.go.kr/main",
            ...
            "expEntry": "<strong>가입</strong>",
            ...
            "destinationLink": "#/entry/koko/e12c4e3432cf458c929bd49c929fd80b",
            ...
            "meansCollector": [
              {
                "partOfSpeech": "noun",
                "partOfSpeech2": "noun",
                "means": [
                  {
                    "order": "",
                    "value": "The addition of new individuals to a population. This applies only to individuals that have reached a certain stage of development.",
                    ...
                  }
                ]
              }
            ],
            "similarWordList": [],
            "antonymWordList": [],
            ...
          }
        ],
        "total": 96,
        "sectionType": "WORD",
        "revert": "",
        "orKEquery": null
      }
    }
  }
}
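2.5. JSON parser
You could extract the needed items from the JSON string with string functions (Mid, InStr, and so on), but the searching gets complicated and the code becomes very messy.
In Python you would simply import the json module and use it. VBA does not have many libraries available, but fortunately there is a JSON parser on GitHub, which I put to good use.
https://github.com/VBA-tools/VBA-JSON
The source code of this JSON parser is 1,123 lines long, so it is not reproduced in this post. If you need it, see the source at the URL above. A simple example of using the parser follows (taken from the example published on GitHub).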
Dim Json As Object
Set Json = JsonConverter.ParseJson("{""a"":123,""b"":[1,2,3,4],""c"":{""d"":456}}")

' Json("a") -> 123
' Json("b")(2) -> 2
' Json("c")("d") -> 456
Json("c")("e") = 789

Debug.Print JsonConverter.ConvertToJson(Json)
' -> "{"a":123,"b":[1,2,3,4],"c":{"d":456,"e":789}}"

Debug.Print JsonConverter.ConvertToJson(Json, Whitespace:=2)
' -> "{
'  "a": 123,
'  "b": [
'    1,
'    2,
'    3,
'    4
'  ],
'  "c": {
'    "d": 456,
'    "e": 789
'  }
' }"
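For readers more familiar with Python, the same example with the standard json module is sketched below for comparison. One difference is worth noting: VBA-JSON returns JSON arrays as 1-based Collections, so Json("b")(2) above and data["b"][1] below read the same element.

```python
import json

data = json.loads('{"a":123,"b":[1,2,3,4],"c":{"d":456}}')

print(data["a"])       # 123
print(data["b"][1])    # 2 (Python lists are 0-based)
print(data["c"]["d"])  # 456

data["c"]["e"] = 789

# Compact output, matching ConvertToJson without the Whitespace argument.
compact = json.dumps(data, separators=(",", ":"))
print(compact)         # {"a":123,"b":[1,2,3,4],"c":{"d":456,"e":789}}

# Indented output, the counterpart of Whitespace:=2.
print(json.dumps(data, indent=2))
```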
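2.6. Search button click event source code
This is the code that runs when the "Naver dictionary search" button on the "Dictionary search" sheet is clicked. It does the following:
- Checks that the options are set correctly.
- Runs the dictionary search for each search word in turn and writes the results to the sheet.
  - The displayed results are matchType, searchEntry, meaning, link, synonyms, and antonyms.
- If the "Stop search" button is pressed during execution, the loop stops.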
Private Sub cmdRunDicSearch_Click()
    Range("A1").Select
    DoEvents

    ' At least one target dictionary must be selected
    Dim bIsKorDicSearch As Boolean, bIsEngDicSearch As Boolean, sTargetDic As String
    bIsKorDicSearch = chkKorDic.Value: bIsEngDicSearch = chkEngDic.Value
    If (Not bIsKorDicSearch) And (Not bIsEngDicSearch) Then
        MsgBox "Select at least one dictionary to search.", vbExclamation + vbOKOnly, "Check target dictionaries"
        Exit Sub
    End If

    ' Result display settings: at least one match type must be selected
    Dim bIsMatchTypeExact As Boolean, bIsMatchTypeTermOr As Boolean, bIsMatchTypeAllTerm As Boolean
    bIsMatchTypeExact = chkMatchTypeExact.Value: bIsMatchTypeTermOr = chkMatchTypeTermOr.Value: bIsMatchTypeAllTerm = chkMatchTypeAllTerm.Value
    If (bIsMatchTypeExact Or bIsMatchTypeTermOr Or bIsMatchTypeAllTerm) = False Then
        MsgBox "Select at least one result display setting.", vbExclamation + vbOKOnly, "Confirm"
        Exit Sub
    End If

    If bIsKorDicSearch And Not bIsEngDicSearch Then sTargetDic = "Korean dictionary"
    If Not bIsKorDicSearch And bIsEngDicSearch Then sTargetDic = "English dictionary"
    If bIsKorDicSearch And bIsEngDicSearch Then sTargetDic = "Korean dictionary, English dictionary"

    Dim lMaxResultCount As Long
    lMaxResultCount = CLng(txtMaxResultCount.Value) ' CLng matches the Long variable (the original used CInt)

    If MsgBox("Start the dictionary search?" + vbLf + _
              "Target dictionaries: " + sTargetDic + vbLf + _
              "Result limit: " + CStr(lMaxResultCount) _
              , vbQuestion + vbYesNoCancel, "Confirm") <> vbYes Then Exit Sub

    Dim i As Long, iResultOffset As Long
    bIsWantToStop = False ' stop flag, declared outside this procedure
    DoEvents

    Dim sWord As String, oKorDicSearchResult As TDicSearchResult, oEngDicSearchResult As TDicSearchResult
    Dim oBaseRange As Range
    Set oBaseRange = Range("검색결과Header").Offset(1, 0) ' named range on the sheet (search result header); name kept as in the workbook
    oBaseRange.Select

    For i = 0 To 100000
        If bIsWantToStop Then
            MsgBox "Search stopped at the user's request.", vbInformation + vbOKOnly, "Confirm"
            Exit For
        End If

        If chkSkipIfResultExists.Value = True And _
            oBaseRange.Offset(i, 1) <> "" Then GoTo Continue_For ' skip rows that already have a result

        sWord = oBaseRange.Offset(i)
        If sWord = "" Then Exit For ' the first empty word cell ends the run
        oBaseRange.Offset(i).Select

        Application.ScreenUpdating = False

        If bIsKorDicSearch Then ' show the Korean dictionary results
            oKorDicSearchResult = DoDicSearch(dtsKorean, sWord, bIsMatchTypeExact, bIsMatchTypeTermOr, bIsMatchTypeAllTerm, lMaxResultCount)
            oBaseRange.Offset(i, 1).Select
            With oKorDicSearchResult
                oBaseRange.Offset(i, 1) = .sMatchType
                oBaseRange.Offset(i, 2) = .sSearchEntry
                oBaseRange.Offset(i, 3) = .sMeaning
                If oKorDicSearchResult.sLinkURL <> "" Then
                    With ActiveSheet.Hyperlinks.Add(Anchor:=oBaseRange.Offset(i, 4), Address:=.sLinkURL, TextToDisplay:="Open Naver Korean dictionary: " & .sLinkWord)
                        .Range.Font.Size = 8
                    End With
                End If
                oBaseRange.Offset(i, 5) = .sSynonymList
                oBaseRange.Offset(i, 6) = .sAntonymList
            End With
        End If

        If bIsEngDicSearch Then ' show the English dictionary results
            oEngDicSearchResult = DoDicSearch(dtsEnglish, sWord, bIsMatchTypeExact, bIsMatchTypeTermOr, bIsMatchTypeAllTerm, lMaxResultCount)
            'oBaseRange.Offset(i, 7).Select
            With oEngDicSearchResult
                oBaseRange.Offset(i, 7) = .sMatchType
                oBaseRange.Offset(i, 8) = .sSearchEntry
                oBaseRange.Offset(i, 9) = .sMeaning
                ' Fixed: test the English result's link (the original tested oKorDicSearchResult here)
                If oEngDicSearchResult.sLinkURL <> "" Then
                    With ActiveSheet.Hyperlinks.Add(Anchor:=oBaseRange.Offset(i, 10), Address:=.sLinkURL, TextToDisplay:="Open Naver English dictionary: " & .sLinkWord)
                        .Range.Font.Size = 8
                    End With
                End If
                oBaseRange.Offset(i, 11) = .sSynonymList
                oBaseRange.Offset(i, 12) = .sAntonymList
            End With
        End If

        Application.ScreenUpdating = True

Continue_For:
        DoEvents
    Next i

    MsgBox "Dictionary search completed.", vbOKOnly + vbInformation
End Sub
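Stripped of the worksheet details, the loop in the click handler has three exits or skips: a stop flag, rows that already hold a result, and the first empty word cell. The Python sketch below isolates that control flow for illustration. `run_search`, `lookup`, and `should_stop` are hypothetical names, and a list of [word, result] pairs stands in for the sheet columns.

```python
def run_search(rows, lookup, skip_existing=True, should_stop=lambda: False):
    """rows: list of [word, result] pairs standing in for the sheet's word and result columns."""
    for row in rows:
        if should_stop():
            break        # the user pressed "Stop search"
        word, result = row[0], row[1]
        if skip_existing and result:
            continue     # a result is already there
        if not word:
            break        # the first empty word cell ends the run
        row[1] = lookup(word)
    return rows


sheet = [["가입", ""], ["탈퇴", "done"], ["", ""], ["ignored", ""]]
run_search(sheet, lambda w: "meaning of " + w)
print(sheet[0][1])  # meaning of 가입
print(sheet[1][1])  # done
print(sheet[3][1])  # (empty: the run ended at the blank row)
```

As in the VBA, the skip check comes before the empty-cell check, so a row with no word but an existing result is skipped rather than treated as the end of the list.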
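2.7. Dictionary search (DoDicSearch source code)
This function sends a search request for a single search word, receives the result, and extracts and returns the required items.
- Parsing the JSON string into a Dictionary: the JsonConverter.ParseJson call
- Extracting the matchType, searchEntry, meaning, link, synonym, and antonym items: the For Each oItem loop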
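Before the full VBA listing, the extraction step can be sketched compactly in Python. This is an illustration under assumptions, not the tool's code: `collect_meanings` and `remove_html` are hypothetical names, the regular expression is a simple stand-in for the RemoveHTML helper (whose source is not shown in this post), and only the meaning column is built. Items whose matchType is in `hidden_match_types` are skipped, which corresponds to a cleared checkbox in the tool.

```python
import re


def remove_html(text):
    """Strip tags such as <strong>...</strong>."""
    return re.sub(r"<[^>]+>", "", text)


def collect_meanings(items, hidden_match_types=frozenset(), max_results=0):
    """Return numbered meaning lines, one per definition, or "#NOT FOUND#" if there are none."""
    lines = []
    for index, item in enumerate(items, start=1):
        if max_results and index > max_results:
            break                                  # result limit reached
        if item.get("matchType") in hidden_match_types:
            continue                               # this match type is switched off
        for group in item.get("meansCollector") or []:
            pos = group.get("partOfSpeech") or ""  # may be missing or null
            for mean in group.get("means") or []:
                value = mean.get("value")
                if value is None:
                    continue
                prefix = "[{}] ".format(pos) if pos else ""
                lines.append("{}. {}{}".format(index, prefix, remove_html(value)))
    return "\n".join(lines) if lines else "#NOT FOUND#"


sample = [
    {"matchType": "exact:entry", "meansCollector": [{"partOfSpeech": "noun", "means": [
        {"order": "1", "value": "Joining an <strong>organization</strong>."},
        {"order": "2", "value": None}]}]},
    {"matchType": "term:or", "meansCollector": [{"partOfSpeech": None, "means": [
        {"value": "Another entry."}]}]},
]

print(collect_meanings(sample, hidden_match_types={"term:or"}))
# 1. [noun] Joining an organization.
```

The number in front of each line is the item's position in the response, so it stays aligned with the matchType and searchEntry columns even when some items are hidden.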
Const DICT_ROOT_URL_KO As String = "https://ko.dict.naver.com/"
Const DICT_BASE_URL_KO As String = "https://ko.dict.naver.com/api3/koko/search?query=%s"
Const DICT_ROOT_URL_EN As String = "https://en.dict.naver.com/"
Const DICT_BASE_URL_EN As String = "https://en.dict.naver.com/api3/enko/search?query=%s"

Public Enum DicToSearch
    dtsKorean = 1
    dtsEnglish = 2
    dtsAll = 10
End Enum

Public Type TDicSearchResult
    sWord As String
    sMatchType As String
    sSearchEntry As String
    sMeaning As String
    sLinkURL As String
    sLinkWord As String
    sSynonymList As String
    sAntonymList As String
End Type

Public Function DoDicSearch(aDicToSearch As DicToSearch, aWord As String, _
    bIsMatchTypeExact As Boolean, bIsMatchTypeTermOr As Boolean, bIsMatchTypeAllTerm As Boolean, _
    aMaxResultCount As Long) As TDicSearchResult

    Dim sDicRootURL As String, sBaseURL As String, sURL As String, sURLData As String, sWord As String, oDicSearchResult As TDicSearchResult
    Dim oParsedDic As Dictionary
    Dim oItem As Dictionary, oMeansCollector As Dictionary, oMeans As Dictionary
    Dim oSimWords As Dictionary, oAntWord As Dictionary
    Dim sPOS As String, sMeaning As String, sLinkURL As String, sLinkWord As String
    ' s유의어 / s유의어목록: synonyms / synonym list; s반의어 / s반의어목록: antonyms / antonym list
    Dim s유의어 As String, s유의어목록 As String, s반의어 As String, s반의어목록 As String
    Dim sMatchType As String, sSearchEntry As String, sHandleEntry As String

    Select Case aDicToSearch
        Case dtsKorean
            sDicRootURL = DICT_ROOT_URL_KO
            sBaseURL = DICT_BASE_URL_KO
        Case dtsEnglish
            sDicRootURL = DICT_ROOT_URL_EN
            sBaseURL = DICT_BASE_URL_EN
    End Select

    If aWord = "" Then Exit Function
    sWord = URLEncodeUTF8(aWord)
    sURL = Replace(sBaseURL, "%s", sWord)
    sURLData = GetDataFromURL(sURL, "GET", "", "utf-8")  ' fetch the result from the URL
    Set oParsedDic = JsonConverter.ParseJson(sURLData)   ' convert the JSON result to a Dictionary

    Dim lMatchIdx As Long: lMatchIdx = 0
    Dim lResultCount As Long: lResultCount = 0
    For Each oItem In oParsedDic("searchResultMap")("searchResultListMap")("WORD")("items")
        lResultCount = lResultCount + 1
        If (aMaxResultCount <> 0) And (lResultCount > aMaxResultCount) Then Exit For ' leave the loop once the result limit is exceeded
        s유의어 = "": s반의어 = ""
        lMatchIdx = lMatchIdx + 1
        'If oItem("matchType") <> "exact:entry" Then Exit For
        sHandleEntry = oItem("handleEntry")

        ' Skip items whose match type is switched off in the display settings
        Select Case oItem("matchType")
            Case "exact:entry"
                sLinkWord = sHandleEntry
                sLinkURL = sDicRootURL + oItem("destinationLink")
                If Not bIsMatchTypeExact Then GoTo Continue_InnerFor
            Case "term:or"
                If Not bIsMatchTypeTermOr Then GoTo Continue_InnerFor
            Case "allterm:proximity:1.000000"
                If Not bIsMatchTypeAllTerm Then GoTo Continue_InnerFor
            Case Else
        End Select

        sMatchType = sMatchType + IIf(sMatchType = "", "", vbLf) & CStr(lMatchIdx) & ". " & oItem("matchType")
        sSearchEntry = sSearchEntry + IIf(sSearchEntry = "", "", vbLf) & CStr(lMatchIdx) & ". " & sHandleEntry

        ' Meanings, prefixed with the part of speech when present
        For Each oMeansCollector In oItem("meansCollector")
            'Debug.Print "POS: " & oMeansCollector("partOfSpeech")
            sPOS = ""
            If oMeansCollector.Exists("partOfSpeech") Then
                If Not IsNull(oMeansCollector("partOfSpeech")) Then sPOS = oMeansCollector("partOfSpeech")
            End If
            For Each oMeans In oMeansCollector("means")
                'Debug.Print "Meaning: " & oMeans("value")
                If oMeans.Exists("value") Then
                    If Not IsNull(oMeans("value")) Then _
                        sMeaning = sMeaning + IIf(sMeaning = "", "", vbLf) & CStr(lMatchIdx) & ". " & IIf(sPOS = "", "", "[" & sPOS & "] ") & RemoveHTML(oMeans("value"))
                End If
            Next oMeans
        Next oMeansCollector

        ' Synonyms
        For Each oSimWords In oItem("similarWordList")
            If oSimWords.Exists("similarWordName") Then _
                s유의어 = s유의어 + IIf(s유의어 = "", "", ", ") & RemoveHTML(oSimWords("similarWordName"))
        Next oSimWords
        If s유의어 <> "" Then _
            s유의어목록 = s유의어목록 & IIf(s유의어목록 = "", "", vbLf) & CStr(lMatchIdx) & ". " & sHandleEntry & ": " & s유의어

        ' Antonyms
        For Each oAntWord In oItem("antonymWordList")
            If oAntWord.Exists("antonymWordName") Then _
                s반의어 = s반의어 + IIf(s반의어 = "", "", ", ") & RemoveHTML(oAntWord("antonymWordName"))
        Next oAntWord
        If s반의어 <> "" Then _
            s반의어목록 = s반의어목록 & IIf(s반의어목록 = "", "", vbLf) & CStr(lMatchIdx) & ". " & sHandleEntry & ": " & s반의어

Continue_InnerFor:
    Next oItem

    If sMeaning = "" Then
        sMeaning = "#NOT FOUND#": sMatchType = sMeaning: sSearchEntry = sMeaning
    End If

    ' Return the result
    With oDicSearchResult
        .sWord = aWord
        .sMatchType = sMatchType
        .sSearchEntry = sSearchEntry
        .sMeaning = sMeaning
        .sLinkWord = sLinkWord
        .sLinkURL = Replace(sLinkURL, "#", "%23") ' keeps Excel from internally replacing the # symbol with %20-%20
        .sSynonymList = s유의어목록
        .sAntonymList = s반의어목록
    End With
    DoDicSearch = oDicSearchResult
End Function
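That covers how the tool works, the cautions for using it, and its source code. Please leave a comment if you have used the tool, have a question, or need a feature.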