removing punctuation

Status
Not open for further replies.

Allan87

The following was meant to remove punctuation from a list of imported words. Is there a correct way of doing this?

Code:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>


int main(void)

char * word_list = "greg.john,steve!julie";
char * punc_mark = "!@#$%^&*(),.<>[]{};':\"";

for ( int wlidx = 0; wlidx < strlen( word_list ); wlidx++ )
{
    for ( int pmidx = 0; pmidx < strlen( pmidx ); pmidx++)
      if ( word_list[wlidx]==punc_list[pmidx])
        word_list[wlidx]=' ';
}

{
  FILE *fp = fopen("./testl.txt", "r");
  char result[128];
  
  while (!feof(fp))
  {
    if (fscanf(fp, "%128s", result)<1)
    {
      break;
    }

    printf("Found word bit: %s \n", result);
  }
  fclose(fp);
}
 
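Your algorithm for replacing the characters is very close; only a few small things stop it from compiling and running.

Code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* An array, not a pointer to a string literal: literals must not be modified. */
    char word_list[] = "greg.john,steve!julie";
    const char *punc_mark = "!@#$%^&*(),.<>[]{};':\"";

    for (size_t wlidx = 0; wlidx < strlen(word_list); wlidx++)
    {
        for (size_t pmidx = 0; pmidx < strlen(punc_mark); pmidx++)
            if (word_list[wlidx] == punc_mark[pmidx])
                word_list[wlidx] = ' ';
    }

    printf("%s\n", word_list);
    return 0;
}

I changed strlen( pmidx ) to strlen( punc_mark ), so the inner loop now traverses the whole punctuation list. I also replaced punc_list, which is never declared, with punc_mark; added the braces around the body of main; and made word_list an array, because writing to a string literal is undefined behavior.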
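
As a side note, the inner loop over the punctuation list can be handed to the standard strchr function, which reports whether a character occurs in a string. The sketch below is only one way to package that idea; the function name blank_punctuation is made up for illustration, and it assumes the text lives in a writable buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: replace every byte of `text` that appears in
   `punct` with a space. `text` must be writable (an array or heap
   buffer, not a string literal). */
void blank_punctuation(char *text, const char *punct)
{
    for (size_t i = 0; text[i] != '\0'; i++) {
        /* strchr returns non-NULL when text[i] occurs in punct. */
        if (strchr(punct, text[i]) != NULL)
            text[i] = ' ';
    }
}
```

Calling it on a char array holding "greg.john,steve!julie" with the same punctuation string as above should leave the words separated by single spaces.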

As far as getting it in and out of a text file, I'll let you figure out that part. Think about reading into an array and then writing from an array.
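
If you get stuck, here is a rough sketch of the "read into an array, write from an array" idea. It is not the only way to do it: the function name clean_file, the 256-byte buffer, and the choice of fgets over fscanf are all my own assumptions. Lines longer than the buffer are simply processed in several pieces, which is harmless here because each byte is handled on its own.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy in_path to out_path, replacing punctuation
   with spaces. Returns 0 on success, -1 if a file cannot be opened. */
int clean_file(const char *in_path, const char *out_path)
{
    const char *punct = "!@#$%^&*(),.<>[]{};':\"";
    char line[256];   /* the array we read into and write from */

    FILE *in = fopen(in_path, "r");
    if (in == NULL)
        return -1;

    FILE *out = fopen(out_path, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* fgets returns NULL at end of input, so no feof() test is needed. */
    while (fgets(line, sizeof line, in) != NULL) {
        for (size_t i = 0; line[i] != '\0'; i++) {
            if (strchr(punct, line[i]) != NULL)
                line[i] = ' ';
        }
        fputs(line, out);
    }

    fclose(in);
    fclose(out);
    return 0;
}
```

Checking the result of fopen matters here: the code in the question would crash inside feof if ./testl.txt did not exist.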
 
You could use strtok to get pointers to each of the substrings, and construct a new string from them.
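
A minimal sketch of that approach, assuming the caller supplies an output buffer. The name join_tokens and its return convention are invented for this example. Unlike replacing each mark with a space, this collapses runs of punctuation into one space and drops leading and trailing marks.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split `src` on any byte in `delims` and join the
   tokens with single spaces into `dst`. `src` is modified by strtok.
   Returns 0 on success, -1 if `dst` is too small. */
int join_tokens(char *src, const char *delims, char *dst, size_t dst_size)
{
    size_t used = 0;

    if (dst_size == 0)
        return -1;
    dst[0] = '\0';

    for (char *tok = strtok(src, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
        size_t len = strlen(tok);
        size_t sep = (used > 0) ? 1 : 0;  /* space before all but the first */

        if (used + sep + len + 1 > dst_size)
            return -1;
        if (sep)
            dst[used++] = ' ';
        memcpy(dst + used, tok, len);
        used += len;
        dst[used] = '\0';
    }
    return 0;
}
```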

This is from the glibc manual 0.10, but it explains strtok well:
- Function: char * strtok (char *restrict NEWSTRING, const char *restrict DELIMITERS)
A string can be split into tokens by making a series of calls to
the function `strtok'.

The string to be split up is passed as the NEWSTRING argument on
the first call only. The `strtok' function uses this to set up
some internal state information. Subsequent calls to get
additional tokens from the same string are indicated by passing a
null pointer as the NEWSTRING argument. Calling `strtok' with
another non-null NEWSTRING argument reinitializes the state
information. It is guaranteed that no other library function ever
calls `strtok' behind your back (which would mess up this internal
state information).

The DELIMITERS argument is a string that specifies a set of
delimiters that may surround the token being extracted. All the
initial characters that are members of this set are discarded.
The first character that is _not_ a member of this set of
delimiters marks the beginning of the next token. The end of the
token is found by looking for the next character that is a member
of the delimiter set. This character in the original string
NEWSTRING is overwritten by a null character, and the pointer to
the beginning of the token in NEWSTRING is returned.

On the next call to `strtok', the searching begins at the next
character beyond the one that marked the end of the previous token.
Note that the set of delimiters DELIMITERS do not have to be the
same on every call in a series of calls to `strtok'.

If the end of the string NEWSTRING is reached, or if the remainder
of string consists only of delimiter characters, `strtok' returns
a null pointer.

Note that "character" is here used in the sense of byte. In a
string using a multibyte character encoding (abstract) character
consisting of more than one byte are not treated as an entity.
Each byte is treated separately. The function is not
locale-dependent.
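
To make a couple of those points concrete (a null pointer on later calls, a different delimiter set between calls, empty fields being skipped), here is a small sketch. The "key=item,item" input format and the name parse_list are assumptions chosen purely for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: parse "key=item,item,..." in place. Stores the key
   in *key and up to max_items item pointers in items[]; returns the
   number of items stored. The pointers refer into `line` itself. */
size_t parse_list(char *line, char **key, char **items, size_t max_items)
{
    size_t n = 0;

    /* First call: pass the string itself; '=' ends the key and is
       overwritten with a null character. */
    *key = strtok(line, "=");
    if (*key == NULL)
        return 0;

    /* Later calls: pass NULL to continue in the same string, here with
       a different delimiter set. */
    while (n < max_items) {
        char *item = strtok(NULL, ",");
        if (item == NULL)   /* nothing but delimiters is left */
            break;
        items[n++] = item;
    }
    return n;
}
```

Because strtok keeps its position in hidden internal state, a helper like this must not be interleaved with another strtok sequence on a different string.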
 