Training data is generated with the function
printTrainingData(text,outputStream,printNumClasses=true), where text
is the annotated text read from a corpus file. The function tokenises
the given text, extracts the feature values for each token, and writes
the resulting training data to the given output stream. The optional
argument printNumClasses defaults to true; with the default value, the
function prints the first line of the training data (the number of
classes). When the function is called for a batch of training files,
the number of classes should be printed only for the first file, so
every subsequent call should set printNumClasses to false. See
Classifier for information about training data.
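
The batch pattern described here can be sketched as follows. This is only
an illustration, assuming a C++-style signature: the printTrainingData
below is a hypothetical stand-in that mimics the documented argument list
but does no real feature extraction (it writes a placeholder class count
and one line per whitespace-separated token), and printBatch is a helper
name invented for this example.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// HYPOTHETICAL stand-in that only mimics the documented signature.
// The real function extracts feature values; this stub writes a
// placeholder class count and one line per whitespace-separated token.
void printTrainingData(const std::string& text, std::ostream& out,
                       bool printNumClasses = true) {
    if (printNumClasses) {
        out << "2\n";  // placeholder class count, not a real value
    }
    std::istringstream tokens(text);
    std::string token;
    while (tokens >> token) {
        out << token << '\n';
    }
}

// Hypothetical helper: writes training data for several corpus texts to
// one stream, printing the class-count line only for the first text.
void printBatch(const std::vector<std::string>& texts, std::ostream& out) {
    assert(out.good() && "output stream must be writable");
    bool first = true;
    for (const std::string& text : texts) {
        printTrainingData(text, out, first);
        first = false;  // all later calls pass printNumClasses = false
    }
}
```

Calling printBatch with the texts of several corpus files and a single
output stream yields one class-count line followed by the data for every
file, which is the layout the batch instructions ask for.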