Modeling and Prediction

Develop predictive models using topic models and word embeddings

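To find clusters and extract features from high-dimensional text datasets, you can use machine learning techniques and models such as LSA, LDA, and word embeddings. You can combine features created with Text Analytics Toolbox™ with features from other data sources. With these features, you can build machine learning models that take advantage of textual, numeric, and other types of data.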

Functions

bagOfWords Bag-of-words model
bagOfNgrams Bag-of-n-grams model
addDocument Add documents to bag-of-words or bag-of-n-grams model
removeDocument Remove documents from bag-of-words or bag-of-n-grams model
removeInfrequentWords Remove words with low counts from bag-of-words model
removeInfrequentNgrams Remove infrequently seen n-grams from bag-of-n-grams model
removeWords Remove selected words from documents or bag-of-words model
removeNgrams Remove n-grams from bag-of-n-grams model
removeEmptyDocuments Remove empty documents from tokenized document array, bag-of-words model, or bag-of-n-grams model
topkwords Most important words in bag-of-words model or LDA topic
topkngrams Most frequent n-grams
encode Encode documents as matrix of word or n-gram counts
tfidf Term Frequency–Inverse Document Frequency (tf-idf) matrix
join Combine multiple bag-of-words or bag-of-n-grams models
vaderSentimentScores Sentiment scores with VADER algorithm
ratioSentimentScores Sentiment scores with ratio rule
fastTextWordEmbedding Pretrained fastText word embedding
wordEncoding Word encoding model to map words to indices and back
doc2sequence Convert documents to sequences for deep learning
wordEmbeddingLayer Word embedding layer for deep learning networks
word2vec Map word to embedding vector
word2ind Map word to encoding index
vec2word Map embedding vector to word
ind2word Map encoding index to word
isVocabularyWord Test if word is member of word embedding or encoding
readWordEmbedding Read word embedding from file
trainWordEmbedding Train word embedding
writeWordEmbedding Write word embedding file
wordEmbedding Word embedding model to map words to vectors and back
extractSummary Extract summary from documents
rakeKeywords Extract keywords using RAKE
textrankKeywords Extract keywords using TextRank
bleuEvaluationScore Evaluate translation or summarization with BLEU similarity score
rougeEvaluationScore Evaluate translation or summarization with ROUGE similarity score
bm25Similarity Document similarities with BM25 algorithm
cosineSimilarity Document similarities with cosine similarity
textrankScores Document scoring with TextRank algorithm
lexrankScores Document scoring with LexRank algorithm
mmrScores Document scoring with Maximal Marginal Relevance (MMR) algorithm
fitlda Fit latent Dirichlet allocation (LDA) model
fitlsa Fit LSA model
resume Resume fitting LDA model
logp Document log-probabilities and goodness of fit of LDA model
predict Predict top LDA topics of documents
transform Transform documents into lower-dimensional space
ldaModel Latent Dirichlet allocation (LDA) model
lsaModel Latent semantic analysis (LSA) model
wordcloud Create word cloud chart from text, bag-of-words model, bag-of-n-grams model, or LDA model
textscatter 2-D scatter plot of text
textscatter3 3-D scatter plot of text

Topics

Classification and Modeling

Sentiment Analysis and Keyword Extraction

Deep Learning

Language Support