Using Textstat

Description

Using Textstat In Python

textstat.smog_index(text) returns the SMOG index of the given text. This is a grade formula: a score of 9.3 means that a ninth grader would be able to read the document.

Textstat is an easy-to-use library for calculating statistics from text. It helps determine readability, complexity, and grade level. One variant was modified from the original by Jonathan Pyle to remove the Pyphen dependency, because Pyphen is a GPL library and textstat is MIT licensed.

To compare the frequency of a single term across different texts, you can also use textstat_frequency, group the frequency by speech, and extract the term.
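To make the idea of a grade formula concrete, here is a minimal sketch of the published SMOG formula in plain Python. It is not the textstat library's implementation: the helper names (count_syllables, smog_grade, approx_smog) are hypothetical, and the syllable counter is a crude vowel-run heuristic assumed only for illustration, so scores will likely differ from what the library reports.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def smog_grade(polysyllable_count: int, sentence_count: int) -> float:
    """SMOG formula, with the polysyllable count scaled to 30 sentences."""
    if sentence_count <= 0:
        raise ValueError("need at least one sentence")
    scaled = polysyllable_count * 30 / sentence_count
    return 1.043 * math.sqrt(scaled) + 3.1291


def approx_smog(text: str) -> float:
    """Approximate SMOG grade for a text using the heuristics above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    # Polysyllables are words with three or more syllables.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return smog_grade(polysyllables, len(sentences))
```

A result near 9 would be read as roughly ninth-grade material, in line with the interpretation given above. The formula was designed for samples of about 30 sentences, so very short texts give unreliable grades.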

Produces counts and document frequency summaries of the features in a dfm, optionally grouped by a docvars variable or other supplied grouping variable.

Usage

Arguments

x

a dfm object

n

(optional) integer specifying the top n features to be returned, within group if groups is specified

groups

either: a character vector containing the names of document variables to be used for grouping; or a factor or object that can be coerced into a factor equal in length or rows to the number of documents. NA values of the grouping value are dropped. See groups for details.

ties_method

character string specifying how ties are treated. See data.table::frank() for details. Unlike that function, however, the default is 'min', so that frequencies of 10, 10, 11 would be ranked 1, 1, 3.

...

additional arguments passed to dfm_group(). This can be useful in passing force = TRUE, for instance, if you are grouping a dfm that has been weighted.
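The 'min' tie rule described in the arguments above can be sketched in a few lines of Python. This is an illustration only, not the data.table::frank() or quanteda code: the function name min_rank_desc is hypothetical, and the sketch assumes ranks run from the greatest count downward, so that rank 1 is the most frequent feature.

```python
def min_rank_desc(frequencies):
    """Rank counts so that 1 is the greatest; ties share the smallest rank."""
    ordered = sorted(frequencies, reverse=True)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        # Keep only the first (smallest) position seen for each count.
        first_position.setdefault(value, position)
    return [first_position[value] for value in frequencies]


print(min_rank_desc([12, 10, 10, 9]))  # [1, 2, 2, 4]
```

The two tied counts both receive rank 2, and the next rank skips to 4, which is the same gap-leaving behaviour as the 1, 1, 3 pattern mentioned above.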

Value

a data.frame containing the following variables:

feature

(character) the feature

frequency

count of the feature

rank

rank of the feature, where 1 indicates the greatest frequency

docfreq

document frequency of the feature, as a count (the number of documents in which this feature occurred at least once)

group

(only if groups is specified) the label of the group. If the features have been grouped, then all counts, ranks, and document frequencies are within group. If groups is not specified, the group column is omitted from the returned data.frame.

textstat_frequency returns a data.frame of features and their term and document frequencies within groups.
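The returned columns can be illustrated with a small plain-Python sketch. This is not the quanteda implementation or its API: the function feature_frequency, the toy docs and groups, and the list-of-dicts output are all assumptions made for illustration. Documents are assumed to be pre-tokenised lists, ties use the 'min' rule, and document frequency is counted over the original documents within each group.

```python
from collections import Counter, defaultdict


def feature_frequency(docs, groups=None, n=None):
    """Return one row per feature: frequency, rank, docfreq (and group)."""
    labels = groups if groups is not None else [None] * len(docs)
    if len(labels) != len(docs):
        raise ValueError("groups must have one label per document")
    freq = defaultdict(Counter)
    docfreq = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        if groups is not None and label is None:
            continue  # documents with a missing group label are dropped
        freq[label].update(tokens)          # total occurrences
        docfreq[label].update(set(tokens))  # one count per document
    rows = []
    for label, counts in freq.items():
        ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        first_rank = {}
        for position, (_, count) in enumerate(ordered, start=1):
            first_rank.setdefault(count, position)  # 'min' tie rule
        for feature, count in ordered[:n]:  # top n within each group
            row = {"feature": feature, "frequency": count,
                   "rank": first_rank[count],
                   "docfreq": docfreq[label][feature]}
            if groups is not None:
                row["group"] = label
            rows.append(row)
    return rows


docs = [["a", "b", "a"], ["b", "c"], ["a", "c", "c"]]
groups = ["x", "x", "y"]
for row in feature_frequency(docs, groups=groups):
    print(row)
```

Filtering the grouped rows on a single feature, for example keeping only rows where feature is "a", gives that term's frequency in each group, which is the comparison across texts described earlier.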

Examples