Running JATE2.0 (Embedded mode)

source: https://github.com/ziqizhang/jate

Download the JATE2.0 release jar

https://github.com/ziqizhang/jate/releases

jate-2.0-beta.1-jar-with-dependencies.jar

Clone (or download) the JATE2.0 source repository

Copy solr-testbed into the same directory as the jar above, create a folder and put your own data in it (astro in the command below), then run the following command.

java -Xmx8g -XX:-UseGCOverheadLimit -cp jate-2.0-beta.1-jar-with-dependencies.jar uk.ac.shef.dcs.jate.app.AppCValue -corpusDir astro -c true -pf.mttf 2 -o cvalue-terms.json solr-testbed ACLRDTEC
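The setup steps above can be sketched as a short shell script. This is only a sketch under assumptions: JATE_REPO is a hypothetical variable pointing at a local clone of the repository, and the testdata/solr-testbed location inside the clone is inferred from the example path given for the -r option. The folder name astro is taken from the command above.

```shell
#!/bin/sh
# Sketch only: prepares the working directory layout described above.
# JATE_REPO is an assumed path to a local clone of https://github.com/ziqizhang/jate
JATE_REPO="${JATE_REPO:-./jate}"
JAR="jate-2.0-beta.1-jar-with-dependencies.jar"
CORPUS_DIR="astro"   # folder that will hold your own documents

prepare_workdir() {
    # Copy the Solr test bed next to the jar (assumed to live under testdata/ in the clone).
    if [ -d "$JATE_REPO/testdata/solr-testbed" ] && [ ! -d solr-testbed ]; then
        cp -r "$JATE_REPO/testdata/solr-testbed" .
    fi
    # Create the corpus folder; put your own data files into it afterwards.
    mkdir -p "$CORPUS_DIR"
}

prepare_workdir

# Warn if the release jar has not been downloaded into this directory yet.
[ -f "$JAR" ] || echo "warning: $JAR not found in $(pwd)" >&2
```

Run it from the directory that holds the jar. Once solr-testbed and the data folder are in place, the java command above can be run unchanged.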

Algorithms

Algorithm	APP_ALGORITHM
TTF	uk.ac.shef.dcs.jate.app.AppTTF
ATTF	uk.ac.shef.dcs.jate.app.AppATTF
TTF-IDF	uk.ac.shef.dcs.jate.app.AppTFIDF
RIDF	uk.ac.shef.dcs.jate.app.AppRIDF
CValue	uk.ac.shef.dcs.jate.app.AppCValue
ChiSquare	uk.ac.shef.dcs.jate.app.AppChiSquare
RAKE	uk.ac.shef.dcs.jate.app.AppRAKE
Weirdness	uk.ac.shef.dcs.jate.app.AppWeirdness
GlossEx	uk.ac.shef.dcs.jate.app.AppGlossEx
TermEx	uk.ac.shef.dcs.jate.app.AppTermEx
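Because each algorithm is just a different main class, several of them can be run over the same corpus in a loop. The sketch below is hedged: DRY_RUN and run_algorithms are hypothetical names introduced here, the output file naming is an arbitrary choice, and it assumes these five applications accept the same options as the AppCValue command above. The applications that need a reference corpus file (AppGlossEx, AppTermEx, AppWeirdness) are left out on purpose.

```shell
#!/bin/sh
# Sketch only: build (and optionally run) one JATE command per algorithm class.
JAR="jate-2.0-beta.1-jar-with-dependencies.jar"
PKG="uk.ac.shef.dcs.jate.app"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print commands, anything else = execute them

run_algorithms() {
    for app in AppTTF AppATTF AppTFIDF AppRIDF AppCValue; do
        # Derive an output name such as cvalue-terms.json from the class name.
        out="$(echo "$app" | sed 's/^App//' | tr 'A-Z' 'a-z')-terms.json"
        set -- java -Xmx8g -XX:-UseGCOverheadLimit -cp "$JAR" "$PKG.$app" \
            -corpusDir astro -o "$out" solr-testbed ACLRDTEC
        if [ "$DRY_RUN" = "1" ]; then
            echo "$*"    # show the command without running it
        else
            "$@"         # run the command
        fi
    done
}

run_algorithms
```

With the default DRY_RUN=1 the script only prints the five command lines, which makes it easy to check them before starting a long run.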

Options

Option	Expected type	Description
-corpusDir	string	The directory of the corpus that will be processed.
-prop	string	Path to the jate.properties file that configures the Solr schema.
-c	boolean	Expects 'true' or 'false'. Specifies whether to collect term information for exporting, e.g., offsets in documents. Default is false. Setting it to true significantly increases the post-processing time needed to query the Solr index for this information.
-r	string	Path to the reference corpus frequency file. Required by AppGlossEx, AppTermEx and AppWeirdness. An example is provided in '/testdata/solr-testbed/ACLRDTEC/conf/bnc_unifrqs.normal'.
-cf.t	number	Post-filtering setting. Score threshold: only terms that pass this cutoff score are selected. If multiple -cf.* parameters are set, the order of preference is cf.t, cf.k, cf.kp.
-cf.k	number	Post-filtering setting. Only the top-ranked K terms are selected. If multiple -cf.* parameters are set, the order of preference is cf.t, cf.k, cf.kp.
-cf.kp	number	Post-filtering setting. Only the top-ranked K% of terms are selected. If multiple -cf.* parameters are set, the order of preference is cf.t, cf.k, cf.kp.
-pf.mttf	number	Pre-filter minimum total term frequency. Any candidate term whose total frequency in the corpus is less than this value will not be considered for ranking.
-pf.mtcf	number	Pre-filter minimum context frequency of a term (used by co-occurrence based methods). This is the number of context objects in which a term appears. Any candidate term whose context frequency is lower than this value will not be considered for ranking.
-o	string	File (path) to save the output. Only JSON output is currently supported.
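The preference order among the -cf.* options can be illustrated with a small helper. This is an illustration of the documented rule only, not JATE's implementation: effective_cutoff is a hypothetical function, and the values passed to it are arbitrary placeholders.

```shell
#!/bin/sh
# Illustration only: mirrors the documented preference order cf.t, cf.k, cf.kp.
# effective_cutoff is a hypothetical helper, not part of JATE.
effective_cutoff() {
    cf_t="$1"; cf_k="$2"; cf_kp="$3"   # pass "" for an option that is not set
    if [ -n "$cf_t" ]; then
        echo "cf.t=$cf_t"              # score threshold wins whenever it is set
    elif [ -n "$cf_k" ]; then
        echo "cf.k=$cf_k"              # otherwise the top-K cutoff applies
    elif [ -n "$cf_kp" ]; then
        echo "cf.kp=$cf_kp"            # otherwise the top-K% cutoff applies
    else
        echo "none"                    # no post-filtering requested
    fi
}

# cf.t is unset, so cf.k takes precedence over cf.kp.
effective_cutoff "" 100 20   # prints "cf.k=100"
```

In practice this means that passing both -cf.k and -cf.kp on one command line has the same effect as passing -cf.k alone, so it is simpler to set only one cutoff option.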