How to Index all words that are in caps in a group of documents

superhans · Jan 13, 2011

Hi, I have a group of 20k documents. I would like to have a list of all of the words that they contain which are in capital letters, ordered by frequency. What is the simplest way of doing that?

threadmark · Jan 14, 2011

Start from the *.txt files. Replace each space with a delimiter character such as %, because some languages can't use a space as a delimiter. Go through the text and, for each word whose characters are upper case, store it in a variable. If that variable already contains the word, add 1 to the word's integer count and store it as word %integer%.

Then you will have a list of all the words in capitals, each with a number next to it, enclosed in %%, to indicate how many times it was found.

Should take about 2 minutes to compile, depending on the language.
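
For what it's worth, the counting idea above could be sketched in Python roughly like this. It is only a sketch under some assumptions that are not from the thread: a "word in caps" is taken to mean a run of two or more ASCII upper-case letters, the documents are UTF-8 .txt files in one folder, and the names count_caps_words, read_documents and report are made up for illustration. It also uses a regular expression instead of the % delimiter trick.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: a "word in caps" is two or more ASCII upper-case letters
# standing alone as a word (so "NASA" counts, "I" and "Hello" do not).
CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")


def count_caps_words(texts):
    """Count every all-caps word across an iterable of strings."""
    counts = Counter()
    for text in texts:
        counts.update(CAPS_WORD.findall(text))
    return counts


def read_documents(folder, pattern="*.txt"):
    """Yield the text of each matching file in folder, one at a time."""
    for path in sorted(Path(folder).glob(pattern)):
        # Undecodable bytes are replaced rather than aborting the whole run.
        yield path.read_text(encoding="utf-8", errors="replace")


def report(folder):
    """Print 'count<TAB>word' lines, most frequent word first."""
    for word, n in count_caps_words(read_documents(folder)).most_common():
        print(f"{n}\t{word}")
```

Calling report("path/to/docs") would print the list ordered by frequency. Files are read one at a time, so 20k documents should not need to fit in memory at once. If "in caps" was meant as "starts with a capital letter", the pattern would need to change.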

kmote · Jan 14, 2011

What platform are you on, and what tools do you have available? If you are on a Unix-like platform, you will probably have awk installed.
