I've read this paper about constructing feature vectors from comments, in which single words, bigrams and trigrams are used as features, and n-grams are kept only if their pointwise mutual information (PMI) value is higher than a certain threshold.

I am new to this type of problem, and everything I have read so far has left me more confused about the implementation, so I was hoping someone could answer the two questions I have, or at least point me in the right direction.

When using CountVectorizer, how would one use PMI to select the most relevant n-grams, for example bigrams? I know PMI is available in the collocations package, but I could not find any way to combine it with CountVectorizer effectively.

From what I understood from the paper, single words, bigrams and trigrams are constructed as features separately and later used in combination. I have read about vector concatenation, but did not understand how one would, for example, concatenate the vector from bigrams with the vector from trigrams and then use the result for classification with something like regression.
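To make the two questions concrete, here is a minimal standard-library sketch of what I think the pipeline looks like. It is not the paper's method: it assumes the standard definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))), and the toy corpus, the threshold of 2.5, the `min_count` filter and all function names are my own illustrative choices.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return all n-grams in a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bigram_pmi(docs):
    """Return (PMI score per bigram, bigram counts) for tokenized docs."""
    unigram_counts = Counter(tok for doc in docs for tok in doc)
    bigram_counts = Counter(bg for doc in docs for bg in ngrams(doc, 2))
    n_tokens = sum(unigram_counts.values())
    n_bigrams = sum(bigram_counts.values())
    scores = {}
    for (x, y), count in bigram_counts.items():
        p_xy = count / n_bigrams
        p_x = unigram_counts[x] / n_tokens
        p_y = unigram_counts[y] / n_tokens
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores, bigram_counts

def select_bigrams(docs, threshold, min_count=1):
    """Keep bigrams whose PMI exceeds the threshold.

    min_count is an assumption of this sketch: PMI tends to favour
    very rare pairs, so a frequency floor is a common extra filter.
    """
    scores, counts = bigram_pmi(docs)
    return sorted(bg for bg, s in scores.items()
                  if s > threshold and counts[bg] >= min_count)

def build_features(doc, unigram_vocab, bigram_vocab):
    """Concatenate the unigram count vector with the bigram count vector."""
    uni = Counter(doc)
    bi = Counter(ngrams(doc, 2))
    return [uni[w] for w in unigram_vocab] + [bi[b] for b in bigram_vocab]

# Toy corpus, already tokenized (illustrative only).
docs = [
    ["new", "york", "is", "big"],
    ["i", "love", "new", "york"],
    ["the", "dog", "is", "big"],
]

scores, _ = bigram_pmi(docs)
selected = select_bigrams(docs, threshold=2.5, min_count=2)
unigram_vocab = sorted({tok for doc in docs for tok in doc})

# One row per document: unigram columns first, then selected bigram columns.
features = [build_features(doc, unigram_vocab, selected) for doc in docs]
```

Concatenation here just means placing the bigram columns next to the unigram columns, so each document remains a single row that a classifier can consume. If this is the right idea, I assume the scikit-learn version would pass the selected bigrams (joined with a space) as the `vocabulary` argument of `CountVectorizer(ngram_range=(2, 2))` and join the separate matrices with `scipy.sparse.hstack`, provided the tokenization used for the PMI step matches the vectorizer's. Trigrams would be handled the same way with their own selected vocabulary.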