This page shows the differences between the revision you selected and the current version.
zh:notes:techemergence [2020/10/17 15:24] pzczxs [Fetch Full-Record and Cited References] |
zh:notes:techemergence [2020/10/17 16:19] (current version) pzczxs [Rum the DIM Model] |
||
---|---|---|---|
Line 52: | Line 52: | ||
===== Merge Authors ===== | ===== Merge Authors ===== | ||
Run <color red>AuthorMerger.java</color> in the project <color red>TechEmergence</color>. | Run <color red>AuthorMerger.java</color> in the project <color red>TechEmergence</color>. | ||
+ | |||
+ | ===== Detect and Tokenize Sentences, and Recognize Entities ===== | ||
+ | Run <color red>Converter2Genia.java</color> in the package <color red>cn.edu.bjut.genia</color> of the project <color red>EmergingTopicsConverter</color>. The articles are then saved in the directories <color red>data/contest-Genia/DIM</color> and <color red>data/contest-Genia/CIM</color>. Each article file is named after its resulting id. | ||
+ | |||
+ | <code bash> | ||
+ | > ./run_geniass.sh geniass data/contest-Genia/DIM & | ||
+ | > ./run_geniatagger.sh geniatagger data/contest-Genia/DIM & | ||
+ | > ./run_geniass.sh geniass data/contest-Genia/CIM & | ||
+ | > ./run_geniatagger.sh geniatagger data/contest-Genia/CIM & | ||
+ | </code> | ||
+ | |||
+ | For each document, two files are generated, with the extensions <color red>.txt.ss</color> and <color red>.txt.ss.tag</color>. Save all <color red>.txt.ss</color> and <color red>.txt.ss.tag</color> files in the directories <color red>data/contest-Genia/DIM</color> and <color red>data/contest-Genia/CIM</color>. | ||
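Because the four jobs above run in the background, it can be useful to confirm afterwards that every article received both output files. The following is only a minimal sketch: the function name `check_genia_outputs` is hypothetical, and it assumes each input article is stored as `<id>.txt`, which is inferred from the `.txt.ss` and `.txt.ss.tag` extensions rather than stated on this page.

```bash
#!/usr/bin/env bash
# Sketch: list articles whose GENIA outputs are missing.
# Assumption: each input article is stored as <id>.txt, so its outputs are
# <id>.txt.ss and <id>.txt.ss.tag in the same directory.

check_genia_outputs() {
    local dir="$1"
    local missing=0
    local txt
    for txt in "$dir"/*.txt; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$txt" ] || continue
        if [ ! -f "$txt.ss" ]; then
            echo "missing: $txt.ss"
            missing=$((missing + 1))
        fi
        if [ ! -f "$txt.ss.tag" ]; then
            echo "missing: $txt.ss.tag"
            missing=$((missing + 1))
        fi
    done
    # Return non-zero when at least one output file is absent.
    [ "$missing" -eq 0 ]
}
```

Once the background jobs have finished, the check could be run as `check_genia_outputs data/contest-Genia/DIM` and `check_genia_outputs data/contest-Genia/CIM`; a non-zero exit status means at least one file is still missing.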
+ | |||
+ | ===== Run the CIM Model ===== | ||
+ | Run <color red>Converter2DIM.java</color> in the package <color red>cn.edu.bjut.genia</color> of the project <color red>EmergingTopicsConverter</color>. This generates several files for the DIM model in the directory <color red>data/contest-DIM/emergence</color>. | ||
+ | |||
+ | ===== Run the DIM Model ===== |