Problems with PHP regular expressions?

Question :

I’m reading a .txt file and removing the letters and the blank lines. I’m having a problem with the special characters \t and \s: they are not recognized.

The code below:

<?php

function pass1() {
    $treat = fopen ("C:\Users\Bridge\Downloads\D_lotfac\lott.txt", "r+w+");
    $treat1 = fopen ("C:\Users\Bridge\Downloads\D_lotfac\lott1.txt", "r+w+");

    while (!feof ($treat)) {
        $linha = fgets($treat,4096);
        $patterns = array();
        $patterns [0] = '/[(A-Z)i]*/';
        $patterns [1] = '/Â|Ã|Á|À|É|Ê|Í|Î|Ç|Ó|Õ|Ô|Ö|Ú|Û|Ü/';
        $patterns [2] ='/ã|â|à|á|é|ê|í|î|ç|ó|ô|ô|ö|ú|û|ü/';
        $patterns [3] = '/\t/';
        $patterns [4] = '/[(a-z)i]*/';
        $patterns [5] = '   ';

        $replacements = array();
        $replacements[] = '';
        $linha = preg_replace($patterns, $replacements, $linha);
        fwrite ($treat1, $linha); 

        printf($linha . "<br>");
        }
}

It generates the file lott1.txt correctly, except that the tabs are not removed, and neither are the runs of spaces (2x, 3x, etc.). I have already tried putting a literal tab character in the pattern and putting \t inside the $patterns[] array. Neither removes them.

What’s the problem?

    

Answer :

First

\s is not just “space”!

\s matches any whitespace character, which includes spaces, tabs, and line breaks.
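To make the difference concrete, here is a minimal sketch comparing \s with an explicit tab-and-space class. The sample line is made up for illustration.

```php
<?php
// Hypothetical sample line: letters, a tab, a run of spaces, and a line break.
$line = "ab\t12   34\n";

// \s also matches the line break, so the newline is removed too.
$withS = preg_replace('/\s+/', '', $line);

// [\t ]+ matches only tabs and spaces, so the newline survives.
$withClass = preg_replace('/[\t ]+/', '', $line);

// json_encode is used only to make the remaining whitespace visible.
echo json_encode($withS), "\n";      // "ab1234"
echo json_encode($withClass), "\n";  // "ab1234\n"
```

If the output file should keep one record per line, the explicit class is the safer choice, because \s would join all lines together.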

Problem

  • From what I noticed, you also want to capture accented characters. For this I use the u modifier, which makes the pattern and the subject be treated as UTF-8, so an accented letter counts as a single character.
  • To capture both upper and lower case you can use [a-zA-Z], [[:alpha:]], or [a-z] with the i modifier.
  • If you want to remove all tabs and spaces you can use [\t ]+.
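The points above can be checked with a short sketch. The sample line is invented for illustration; it contains only unaccented letters, digits, tabs, and spaces.

```php
<?php
// Hypothetical sample line with unaccented letters, tabs and spaces.
$line = "Lot\tFacil  01 02\t03";

// Three ways to match unaccented letters in both cases.
$variants = ['/[a-zA-Z]+/', '/[[:alpha:]]+/', '/[a-z]+/i'];

$results = [];
foreach ($variants as $pattern) {
    // Each variant removes the same letters from this input.
    $results[$pattern] = preg_replace($pattern, '', $line);
    echo $pattern, ' => ', json_encode($results[$pattern]), "\n";
}

// A class with a tab and a space, repeated, removes every run of blanks.
$noBlanks = preg_replace('/[\t ]+/', '', $line);
echo json_encode($noBlanks), "\n";  // "LotFacil010203"
```

All three letter patterns leave "\t  01 02\t03" for this input. Because the patterns are in single quotes, PHP passes \t through unchanged and the regex engine itself interprets it as a tab.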

Solution

In summary your pattern would be:

~[a-z\t ]+~iu
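As a rough sketch, the pattern could be applied to each line like this. The helper name stripLettersAndBlanks and the sample line are hypothetical, and the code assumes PHP 7 or later and a UTF-8 source file. One caveat worth checking against your data: [a-z] with the i modifier covers only the 26 unaccented letters, so if accented letters must be removed as well, the Unicode property \p{L} (any letter, requires the u modifier) is one option.

```php
<?php
// Hypothetical helper: removes letters, tabs and spaces from a single line.
function stripLettersAndBlanks(string $line, string $pattern = '~[a-z\t ]+~iu'): string
{
    // preg_replace returns null on failure (for example, invalid UTF-8
    // under the u modifier); in that case the line is returned unchanged.
    return preg_replace($pattern, '', $line) ?? $line;
}

// Hypothetical sample line with accented letters, a tab and a run of spaces.
$line = "Ação\t12  Índice 07";

// Pattern from the answer: accented letters are left in place.
echo stripLettersAndBlanks($line), PHP_EOL;                    // çã12Í07

// Variant with \p{L}: accented letters are removed as well.
echo stripLettersAndBlanks($line, '~[\p{L}\t ]+~u'), PHP_EOL;  // 1207
```

Inside the loop from the question, the call would replace the whole $patterns array: read a line with fgets, pass it through the helper, and write the result with fwrite.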

Note

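  • [(A-Z)i] – if your intention was to create a group with A-Z, that does not happen inside []: the parentheses are interpreted literally. The same goes for the i, which inside the brackets is just the letter i; a modifier has to come after the closing delimiter.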
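A small sketch of what this means in practice, using an invented input string:

```php
<?php
// Hypothetical input containing parentheses, upper and lower case letters.
$text = "f(X) = (i)";

// Inside [], the characters ( ) and i are literals: this class matches
// "(", ")", "i" and the range A-Z, and nothing else.
$literal = preg_replace('/[(A-Z)i]/', '', $text);

// With the range alone and i as a real modifier, all letters match
// and the parentheses are kept.
$modifier = preg_replace('/[A-Z]/i', '', $text);

echo $literal, "\n";   // f = 
echo $modifier, "\n";  // () = ()
```

The first result still contains the lowercase f, which shows that the i inside the brackets did not make the match case-insensitive.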
